dc.contributor.advisor | Arun, S P | |
dc.contributor.author | Pramod, R T | |
dc.date.accessioned | 2021-09-23T08:49:00Z | |
dc.date.available | 2021-09-23T08:49:00Z | |
dc.date.submitted | 2018 | |
dc.identifier.uri | https://etd.iisc.ac.in/handle/2005/5340 | |
dc.description.abstract | Compositionality in object vision can be defined as the principles governing the
relationship between whole objects and their constituent attributes. It is known that object
information falling on the retina is processed in a hierarchy of cortical regions starting from
simple edge-detectors in the primary visual cortex to complex shape representations in the
higher visual cortex, yet we still do not understand how whole objects are represented in terms
of their attributes. With recent advances in computer vision, we have, for the first time in
history, a very good machine vision system in the form of convolutional neural networks. How
do these systems compare with human vision? We argue that understanding vision in the brain
and making machines see the way we do form two sides of the same coin – understanding one
will give us insights into the other. With this in mind, the goal of my thesis is twofold: to
study compositionality in object representations in the brain, and to compare compositionality
in brains and machines with the aim of improving machine vision.
I will present results from a series of studies where we investigate object representations
in brains and machines. In the first set of studies, we investigated whether whole object
responses in perception and in single neurons could be understood in terms of their parts. The
main findings are: (1) Object attributes combine linearly in visual search (Pramod & Arun,
2016); (2) Although symmetry is a salient holistic property, responses to symmetric objects, like
those to asymmetric objects, are explained as a sum of their parts (Pramod & Arun, 2018).
Taken together, these findings confirm the compositionality of object representations in
perception and in high-level visual cortex.
In the second set of studies, we compared the compositionality of object representations
in brains and machines. The main findings are: (1) Object representations in virtually all
computer vision models (including deep neural networks) deviate systematically from human
perception (Pramod & Arun, 2016); (2) Symmetric objects are more salient in perception than
in deep neural networks, and fixing this bias leads to significant improvements in object
detection performance; and (3) Under-sampling of the periphery in the
biological retina is computationally optimal for object recognition in natural scenes, pointing
to dissociable roles for object and context. Taken together, these findings show that machine
vision can be understood and improved by studying biological vision. | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartofseries | ;G29330 | |
dc.rights | I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation | en_US |
dc.subject | convolutional neural networks | en_US |
dc.subject | computer vision | en_US |
dc.subject | machine vision | en_US |
dc.subject | biological vision | en_US |
dc.subject.classification | Research Subject Categories::TECHNOLOGY::Other technology | en_US |
dc.title | Compositionality of Object Representations in Brains and Machines | en_US |
dc.type | Thesis | en_US |
dc.degree.name | PhD | en_US |
dc.degree.level | Doctoral | en_US |
dc.degree.grantor | Indian Institute of Science | en_US |
dc.degree.discipline | Engineering | en_US |