How are visual object representations organized and used to perform tasks?
Abstract
We rely heavily on vision for our daily activities, and around 40% of our brain is dedicated to vision. It is known that during a visual task, the visual information falling on the retina is processed by a hierarchy of cortical regions, starting from simple edge detectors in the primary visual cortex to complex shape representations in the higher visual cortex and decision-making in the prefrontal cortex. Yet we understand little about the underlying neural representations and computations that facilitate decision-making. The goal of this thesis is to understand visual representations in the brain and to uncover basic computations on these representations that might support a variety of visual tasks. We performed three main studies.
In the first study, we sought to uncover qualitative similarities and differences between brains and deep networks trained for object classification by comparing their object representations. The main findings are: (1) perceptual phenomena such as the Thatcher effect, mirror confusion, and Weber's law emerge when deep networks are trained for object recognition; and (2) other perceptual phenomena, such as 3D shape processing, surface invariance, and the global advantage, are absent. These results clarify when deep networks can be considered good models of human vision and where they can be improved.
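To give a concrete sense of how such comparisons can be made, the sketch below probes mirror confusion in a pretrained convolutional network. It is a minimal illustration rather than the analysis pipeline used in the thesis: the choice of network (ResNet-50), the penultimate-layer features, the Euclidean distance, and the stimulus file name are all assumptions. Mirror confusion would appear as a smaller feature distance between an image and its left-right mirror than between the image and its up-down mirror.

```python
# Minimal sketch: probing mirror confusion in a pretrained CNN.
# Assumptions (not from the thesis): ResNet-50 penultimate-layer
# features, Euclidean distance, and a hypothetical stimulus file.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image, ImageOps

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
# Drop the classification layer to read out penultimate-layer features.
feature_extractor = torch.nn.Sequential(*list(model.children())[:-1])

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def features(img):
    """Return a flat feature vector for one image."""
    with torch.no_grad():
        return feature_extractor(preprocess(img).unsqueeze(0)).flatten()

img = Image.open("object.png").convert("RGB")   # hypothetical stimulus
f_orig = features(img)
f_lateral = features(ImageOps.mirror(img))      # left-right mirror
f_vertical = features(ImageOps.flip(img))       # up-down mirror

d_lateral = torch.dist(f_orig, f_lateral).item()
d_vertical = torch.dist(f_orig, f_vertical).item()
# Mirror confusion: lateral mirror pairs look more alike (smaller
# distance) than vertical mirror pairs.
print(f"lateral distance: {d_lateral:.3f}  vertical distance: {d_vertical:.3f}")
```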
In the second study, we investigated how humans perceive global and local shapes. Two classical phenomena have been reported: the global advantage effect (we identify the global shape before the local shape) and the interference effect (we identify shapes more slowly when the global and local shapes are incongruent). Because these phenomena have been observed during shape categorization tasks, it is unclear whether they reflect the categorical judgement or the underlying shape representation. We performed two behavioural experiments (oddball visual search and a same-different task) on the same set of hierarchical shapes to test whether these phenomena arise from the shape representations themselves. The main findings are: (1) the global advantage effect is observed in visual search, where participants make no categorical judgements; and (2) response times for any image pair are systematic and can be predicted using two factors: dissimilarity and distinctiveness. These results show that global and local shapes combine according to systematic rules in perception.
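The sketch below illustrates, on made-up numbers, how such a two-factor account can be fit: the reciprocal of the response time for each image pair is modelled as a weighted sum of that pair's dissimilarity and distinctiveness. The specific linearization and the example values are assumptions for illustration, not the fitting procedure or data from the thesis.

```python
# Minimal sketch: predicting pairwise response times from two factors.
# The numbers below are made up for illustration; the thesis uses
# measured dissimilarity and distinctiveness values.
import numpy as np

# One row per image pair: [dissimilarity, distinctiveness]
factors = np.array([
    [0.2, 0.5],
    [0.8, 0.3],
    [0.5, 0.9],
    [0.9, 0.7],
])
rt = np.array([1.6, 1.1, 1.0, 0.8])   # observed response times (s)

# Model the reciprocal of RT (a rate-like quantity) as a weighted sum
# of the two factors plus a constant, and fit it by least squares.
X = np.column_stack([factors, np.ones(len(factors))])
weights, *_ = np.linalg.lstsq(X, 1.0 / rt, rcond=None)
predicted_rt = 1.0 / (X @ weights)

print("weights (dissimilarity, distinctiveness, constant):", np.round(weights, 2))
print("predicted response times:", np.round(predicted_rt, 2))
```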
In the third study, we investigated the neural correlates of task-specific computations. One possibility is that we perform distinct computations for each visual task. Alternatively, generic computations could drive a variety of tasks. We provide evidence for the latter by performing behavioural and brain-imaging experiments on two distinct visual tasks: a present-absent visual search task and a symmetry judgement task. We hypothesized that the decision in both tasks depends on how distinctive an item's activation is relative to a reference point in perceptual space; we refer to this as a distinctiveness computation. The main findings are: (1) the distinctiveness computation shows the properties of a decision variable in both tasks; and (2) neural activation in a region anterior to the lateral occipital (LO) region correlated with distinctiveness consistently across both the present-absent search task and the symmetry judgement task. These findings show that generic computations can support multiple visual tasks.
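The sketch below illustrates one way a distinctiveness computation could serve as a generic decision variable. It assumes that distinctiveness is the distance of an item's activation vector from a reference point in perceptual space (here taken to be the centroid of a set of baseline activations) and that different tasks read out the same quantity against a threshold; the exact reference point and readout used in the thesis may differ.

```python
# Minimal sketch: a distinctiveness computation as a generic decision
# variable. Assumption (not from the thesis): distinctiveness is the
# distance of an activation vector from the centroid of a baseline set.
import numpy as np

def distinctiveness(activation, reference):
    """Distance of an activation vector from the reference point."""
    return np.linalg.norm(activation - reference)

rng = np.random.default_rng(0)
# Activations of baseline items (e.g. homogeneous arrays or symmetric shapes).
baseline = rng.normal(size=(50, 8))
reference = baseline.mean(axis=0)          # reference point in perceptual space

probe = rng.normal(loc=0.8, size=8)        # activation evoked by a probe display
d = distinctiveness(probe, reference)

# The same decision variable can drive different tasks: only the
# response mapping changes (present/absent vs. asymmetric/symmetric).
threshold = 1.5                            # illustrative criterion
decision = "target present / asymmetric" if d > threshold else "target absent / symmetric"
print(f"distinctiveness = {d:.2f} -> {decision}")
```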