Compositionality of letter shape in word recognition
Abstract
As you read this sentence, your brain performs the remarkable task of converting collections of letter shapes into meaning. Reading is a cultural invention that is thought to exploit the intrinsic recognition abilities of our visual system, but it also leads to widespread changes in the brain. How do visual representations change to enable efficient reading, i.e., our ability to read words at a glance? It is widely believed that learning to read leads to the formation of novel detectors for letter combinations, thereby creating word responses that are not predictable from single letters. Alternatively, reading could lead to separable or compositional word responses that are predictable from familiar letters or scripts. The existing literature provides insufficient evidence to resolve this fundamental question.
In my thesis, I performed three main studies to address this question. In the first study, I explored the changes in visual representation associated with reading expertise by comparing readers and non-readers of two Indian languages, Telugu and Malayalam. I found a subtle change in visual representation with reading expertise, but surprisingly, expertise decreased the interactions between the letters of a word, thereby making word representations more compositional. Using fMRI, I localized this effect to higher visual areas.
In the second study, I explored the nature of the visual representations that enable us to read words with spelling mistakes (jumbled words). To address this, I built computational models to predict the visual similarity between any two strings. These models are compositional in nature, i.e., the response to a word can be predicted from the responses to its individual letters. Interestingly, the time taken to identify a jumbled word, or to classify it as a nonword, depends solely on this visual similarity. This result extends the known intrinsic capabilities of our visual system in word recognition.
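The idea of a compositional similarity model can be illustrated with a minimal sketch. This is not the thesis's actual fitted model: the single-letter dissimilarity values, the weights, and the equal-length restriction below are all illustrative assumptions, chosen only to show how whole-string dissimilarity can be assembled from letter-level terms.

```python
# Hypothetical single-letter visual dissimilarities (in a real model these
# would be estimated from behavioural or neural data, not hand-picked).
letter_dissim = {
    ("a", "a"): 0.0, ("e", "e"): 0.0, ("t", "t"): 0.0,
    ("a", "e"): 0.4, ("e", "a"): 0.4,
    ("t", "e"): 0.9, ("e", "t"): 0.9,
    ("a", "t"): 0.8, ("t", "a"): 0.8,
}

def string_dissimilarity(s1, s2, w_same=1.0, w_cross=0.3):
    """Compositional dissimilarity between two equal-length strings:
    a strongly weighted sum of letter dissimilarities at corresponding
    positions, plus weakly weighted cross-position terms (so that
    transposed letters still count as partially similar)."""
    assert len(s1) == len(s2), "sketch assumes equal-length strings"
    total = 0.0
    for i, c1 in enumerate(s1):
        for j, c2 in enumerate(s2):
            weight = w_same if i == j else w_cross
            # Unlisted letter pairs default to maximally dissimilar (1.0).
            total += weight * letter_dissim.get((c1, c2), 1.0)
    return total
```

Under these assumed values, a jumbled word such as "tae" comes out visually closer to "tea" than an unrelated string does, which is the kind of graded prediction a compositional model makes without any dedicated letter-combination detectors.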
In the third study, I investigated the underlying neural correlates of word recognition in a lexical decision task. Using fMRI, I recorded brain activity while subjects performed a lexical decision task
inside the scanner. Consistent with the first study, I found the neural correlates of visual representations in the lateral occipital area, while the neural correlates of the semantic space were found in temporal regions. Based on these observations, I theorize that the visual word form area (VWFA) receives bottom-up perceptual representations from higher visual areas and stored word representations from temporal regions, which together aid decision making.
In a series of additional follow-up studies, I found that (1) the strength of compositional word representations predicts reading fluency in children; (2) compositional representations are key to the improved accuracy of neural networks trained on scene text recognition; and (3) reading with peripheral vision is harder due to increased letter interactions.
Taken together, these studies demonstrate how reading increases the compositionality of visual word representations, and how this compositional representation enables fast, efficient reading.