Relating Representations in Deep Learning and the Brain
Abstract
Deep neural networks (DNNs), inspired by the human brain, have redefined state-of-the-art performance in AI over the past decade, yet much research is still devoted to understanding and explaining how these networks function. In this thesis, we leverage knowledge from the neuroscience literature to evaluate the representations learned by state-of-the-art language models. We use sentences with simple syntax and semantics (e.g., “The bone was eaten by the dog.”) and train multiple neural networks to predict the part of speech and the next word. We present other sentences of the same simple form, word by word, to humans in a magnetoencephalography (MEG) scanner for silent reading and comprehension. We then train a linear regression model to predict the observed brain recordings from the hidden layers of the trained neural networks and of popular pre-trained networks such as BERT and ELMo.
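The mapping from hidden-layer activations to brain recordings can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the data are synthetic, the names `fit_ridge` and `pearson_per_sensor` are hypothetical, and a ridge penalty `alpha` is added as an assumption (setting `alpha=0` gives ordinary least squares when the features are full rank). Scoring by per-sensor Pearson correlation on held-out words is likewise an assumed choice.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def pearson_per_sensor(Y_true, Y_pred):
    """Pearson correlation between observed and predicted signal, per column."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins (assumptions): one row per word presented.
rng = np.random.default_rng(0)
n_words, n_hidden, n_sensors = 200, 16, 5
H = rng.standard_normal((n_words, n_hidden))         # "hidden layer" activations
W_true = rng.standard_normal((n_hidden, n_sensors))  # unknown linear map
B = H @ W_true + 0.1 * rng.standard_normal((n_words, n_sensors))  # "brain" data

# Fit on the first 150 words, evaluate on the remaining 50.
train, test = slice(0, 150), slice(150, 200)
W = fit_ridge(H[train], B[train], alpha=1.0)
scores = pearson_per_sensor(B[test], H[test] @ W)
```

Repeating the fit once per network layer and comparing the resulting `scores` is one way a layer-wise comparison of predictive power could be set up.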
We find that the middle layers of these networks are the most predictive of the recorded brain activity. However, a more fine-grained evaluation shows that different types of stimuli (determiner, adjective, noun, verb) are represented more dominantly in different layers of the language model. Further, we test the semantic composition capabilities of these networks with respect to the human brain. Semantic composition is defined as the rule-based combination of the parts that constitutes the meaning of the whole. We collect new data and develop a new framework to perform this evaluation incrementally, as each word in the sentence is processed by the brain and by the DNN. As a result, we are able to analyze the effect of the composition function on the representation of the same word as more of the sentence context becomes available. Our experiments show that DNN models are effective in encoding the sentence being read and are able to predict words that occurred earlier in the sentence, indicating good composition. In these tests, we find that the right frontal and right temporal brain regions are predicted with the best accuracy. Previous research has suggested that these brain regions are responsible for executive and memory functions.
As an additional contribution, we propose a new distance metric based on dynamic time warping to evaluate the alignment between predicted and observed brain activity. The new metric helps tackle the variability observed in a single subject’s recorded brain activity.
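To show why a warping-based distance can absorb timing variability, here is a minimal sketch of the classic dynamic time warping recurrence. It is the textbook algorithm, not the metric proposed in the thesis, whose details the abstract does not give. The function name `dtw_distance` and the toy signals are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences using absolute difference."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy signals (assumptions): the same peak, shifted by one time step.
observed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
predicted = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]

pointwise = sum(abs(x - y) for x, y in zip(observed, predicted))
warped = dtw_distance(observed, predicted)
```

On these toy signals the point-by-point distance penalizes the one-step shift, while the warped distance does not, which is the kind of temporal jitter a warping-based metric is meant to tolerate.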