Language learning and speech perception are remarkable feats of the human brain, involving complex neural mechanisms that allow us to understand and communicate with one another. Unravelling these mechanisms has far-reaching implications, from theories of human cognition to the design of effective language learning strategies and advances in speech technology. By employing a multidisciplinary approach encompassing neural investigations using EEG signals, behavioural analyses, and machine learning perspectives, this thesis seeks to shed light on the underlying processes involved in word learning and speech perception.
The thesis is divided into three parts. The first part examines how imitation-based learning of foreign sounds is captured in EEG signals. In this listen-and-reproduce setting, subjects were presented with words from a foreign language (Japanese) as well as English, and were asked to articulate the words they heard. The results show that time-frequency features and phase in the EEG signal carry information for language discrimination. Further analysis showed that speech production improved over time and that frontal brain regions were involved in language learning. These findings suggest the potential of EEG for personalized language exercises and for assessing learners' abilities.
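To make the discrimination analysis concrete, the following is a minimal sketch of extracting time-frequency (spectrogram) and phase features from single-trial EEG and feeding them to a linear classifier. The sampling rate, epoch shapes, feature choices, and classifier are illustrative assumptions on synthetic data, not the thesis pipeline.

```python
# Minimal sketch (assumptions throughout): time-frequency and phase
# features from single-trial EEG epochs, fed to a binary language classifier.
import numpy as np
from scipy.signal import spectrogram, hilbert
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

fs = 128                                             # sampling rate (Hz), assumed
n_trials, n_channels, n_samples = 200, 32, 2 * fs    # synthetic 2 s epochs
rng = np.random.default_rng(0)
eeg = rng.standard_normal((n_trials, n_channels, n_samples))
labels = rng.integers(0, 2, n_trials)                # 0 = English, 1 = Japanese

def trial_features(x):
    """Concatenate log-power time-frequency features and mean phase per channel."""
    feats = []
    for ch in x:
        _, _, Sxx = spectrogram(ch, fs=fs, nperseg=64, noverlap=32)
        phase = np.angle(hilbert(ch))                # instantaneous phase
        feats.append(np.log(Sxx + 1e-12).mean(axis=1))        # mean log power per bin
        feats.append([np.cos(phase).mean(), np.sin(phase).mean()])  # circular mean
    return np.concatenate([np.ravel(f) for f in feats])

X = np.stack([trial_features(t) for t in eeg])
clf = SVC(kernel="linear")
print("CV accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```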
The next part of the thesis investigates how neural patterns change when semantics are introduced and words are presented in a sentence context. Participants listened to Japanese words embedded in English sentences, first before understanding the semantics of these words and again after semantic exposure, and we quantified the resulting learning patterns in the EEG signal. Notably, a delayed P600 component emerges for the Japanese words, suggesting short-term memory processing, unlike the N400 typically seen for a semantic anomaly in a known language. We also show that the P600 amplitude is associated with the similarity of the newly learned words to the known language. The brain regions associated with semantic learning are likewise identified from the EEG data. These findings demonstrate differences in the underlying cognitive processes involved in rapid versus long-term language learning.
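For illustration, the sketch below measures a P600-style amplitude as the mean ERP voltage in a 600-800 ms post-onset window at a centro-parietal channel, comparing epochs recorded before and after semantic exposure. The epoch shapes, time window, and channel index are assumptions for demonstration, not the thesis analysis.

```python
# Minimal sketch (illustrative assumptions): P600 amplitude as the mean
# ERP voltage in a 600-800 ms post-onset window at one channel.
import numpy as np

fs = 128                                           # sampling rate (Hz), assumed
rng = np.random.default_rng(1)
# synthetic epochs: (trials, channels, samples), 1 s after word onset
pre_epochs  = rng.standard_normal((60, 32, fs))    # before semantic exposure
post_epochs = rng.standard_normal((60, 32, fs))    # after semantic exposure

def p600_amplitude(epochs, channel=12):
    """Average over trials, then mean voltage in the 600-800 ms window."""
    erp = epochs.mean(axis=0)                      # (channels, samples)
    lo, hi = int(0.6 * fs), int(0.8 * fs)
    return erp[channel, lo:hi].mean()

print("P600 before exposure:", p600_amplitude(pre_epochs))
print("P600 after  exposure:", p600_amplitude(post_epochs))
```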
In the final part of the thesis, we analyze the neural mechanisms of human speech comprehension using a match-mismatch classification of the continuous speech stimulus and the neural response (EEG). We make three contributions on this front: i) we illustrate, for the first time, the role of word boundaries in continuous speech comprehension; ii) we elicit the encoding of both the speech acoustics and the text semantics in the EEG signal; and iii) we show an increased signature of semantic content (text) in the EEG data in the acoustically challenging setting of dichotic listening. Previous studies focused on fixed-duration segments without considering the variable-length processing of speech in the brain. Our approach processes the speech and EEG signals with convolutional layers, pools representations at word boundaries, and captures inter-word context through a recurrent layer. We also introduce a novel loss function based on Manhattan similarity. The findings have potential applications in understanding speech recognition in noise, brain-computer interfaces, and attention studies.
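The sketch below shows one way such a match-mismatch model can be wired together: convolutional encoders for EEG and the speech envelope, average pooling of frames within word boundaries, a GRU for inter-word context, and a Manhattan (L1) similarity score. The layer sizes, the envelope input, the hard-coded boundaries, and the softmax-style loss are illustrative assumptions, not the thesis architecture.

```python
# Minimal sketch (assumptions throughout): match-mismatch scoring of
# EEG against matched vs. mismatched speech, with word-boundary pooling,
# a recurrent layer for inter-word context, and Manhattan similarity.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_ch, dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_ch, dim, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=9, padding=4), nn.ReLU(),
        )
        self.gru = nn.GRU(dim, dim, batch_first=True)

    def forward(self, x, boundaries):
        # x: (batch, channels, time); boundaries: (start, end) sample spans
        h = self.conv(x)                                      # (batch, dim, time)
        words = torch.stack([h[:, :, s:e].mean(dim=2)         # pool within each word
                             for s, e in boundaries], dim=1)  # (batch, words, dim)
        out, _ = self.gru(words)                              # inter-word context
        return out[:, -1]

def manhattan_similarity(a, b):
    return torch.exp(-torch.abs(a - b).sum(dim=1))            # in (0, 1]

eeg_enc, sp_enc = Encoder(in_ch=32), Encoder(in_ch=1)
eeg = torch.randn(8, 32, 640)                # 5 s of EEG at 128 Hz, assumed
speech = torch.randn(8, 1, 640)              # matched speech envelope
mismatch = torch.randn(8, 1, 640)            # envelope from elsewhere in the stimulus
bounds = [(0, 120), (120, 300), (300, 640)]  # word boundaries (samples), assumed

e = eeg_enc(eeg, bounds)
s_match = manhattan_similarity(e, sp_enc(speech, bounds))
s_mis = manhattan_similarity(e, sp_enc(mismatch, bounds))
# drive the matched pair's similarity above the mismatched pair's
loss = -torch.log(s_match / (s_match + s_mis) + 1e-12).mean()
loss.backward()
print("loss:", loss.item())
```

Pooling at word boundaries, rather than over fixed-duration segments, is what lets the model respect the variable-length processing of speech discussed above.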
Overall, this thesis contributes to our understanding of language learning, speech comprehension, and the underlying neural mechanisms. Through the analysis of EEG signals, it provides insights into the processing of familiar and unfamiliar languages, the effects of semantic dissimilarity, and the role of word boundaries in sentence comprehension. These findings have implications both for human language learning and for the development of machine systems aimed at understanding and processing speech.