
dc.contributor.advisor: Ganapathy, Sriram
dc.contributor.author: Katthi, Jaswanth Reddy
dc.date.accessioned: 2022-05-17T11:10:28Z
dc.date.available: 2022-05-17T11:10:28Z
dc.date.submitted: 2021
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/5734
dc.description.abstract: The perception of speech and audio is one of the defining features of humans. Much of the brain's underlying processing as we listen to acoustic signals is unknown, and significant research efforts are needed to unravel it. Non-invasive recordings of brain activity, such as the electroencephalogram (EEG) and the magnetoencephalogram (MEG), are commonly deployed to capture brain responses to auditory stimuli. However, these non-invasive techniques also capture artifacts and signals unrelated to the stimuli, which distort the stimulus-response analysis. The effect of the artifacts becomes more evident for naturalistic stimuli. To suppress the subject-specific variability and amplify the components related to the stimuli, the EEG responses from multiple subjects listening to a common naturalistic stimulus need to be normalized. The currently used normalization and pre-processing methods are the canonical correlation analysis (CCA) models and the temporal response function based forward/backward models. However, these methods assume a simplistic linear relationship between the audio features and the EEG responses and therefore may not alleviate the recording artifacts and interfering signals in the EEG. This thesis proposes novel methods that use advances in machine learning to improve audio-EEG analysis. We propose a deep learning framework for audio-EEG analysis in intra-subject and inter-subject settings. The deep learning based intra-subject analysis methods are trained with a Pearson correlation-based cost function between the stimuli and the EEG responses. This model learns transformations of the audio and EEG features that are maximally correlated. The correlation-based cost function is optimized over the learnable parameters of the model using standard gradient descent-based methods. This model is referred to as the deep CCA (DCCA) model. Several experiments are performed on EEG data recorded while the subjects listened to naturalistic speech and music stimuli. We show that the deep methods obtain better representations than the linear methods and result in statistically significant improvements in correlation values. Further, we propose a neural network model with shared encoders that aligns the EEG responses from multiple subjects listening to the same audio stimuli. This inter-subject model boosts the signals common across the subjects and suppresses the subject-specific artifacts. The impact of improving stimulus-response correlations is highlighted using multi-subject EEG data from speech and music tasks. This model is referred to as the deep multi-way canonical correlation analysis (DMCCA) model. The combination of inter-subject analysis using DMCCA and intra-subject analysis using DCCA is shown to provide the best stimulus-response correlations in audio-EEG experiments. We highlight how much of the audio signal can be recovered purely from the non-invasive EEG recordings with modern machine learning methods, and conclude with a discussion of future challenges in audio-EEG analysis. (An illustrative sketch of the correlation-based training objective appears after this record.) [en_US]
dc.language.iso: en_US [en_US]
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. [en_US]
dc.subject: Speech perception [en_US]
dc.subject: EEG [en_US]
dc.subject: Canonical correlation analysis (CCA) [en_US]
dc.subject: Multiway CCA [en_US]
dc.subject: Deep CCA [en_US]
dc.subject: Deep MCCA [en_US]
dc.subject: Audio-EEG analysis [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Speech processing [en_US]
dc.subject.classification: Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electrical engineering [en_US]
dc.title: Deep Learning Methods For Audio EEG Analysis [en_US]
dc.type: Thesis [en_US]
dc.degree.name: MTech (Res) [en_US]
dc.degree.level: Masters [en_US]
dc.degree.grantor: Indian Institute of Science [en_US]
dc.degree.discipline: Engineering [en_US]
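
The abstract's intra-subject DCCA model is trained with a Pearson correlation-based cost between transformed audio features and transformed EEG responses. Below is a minimal, hypothetical PyTorch sketch of such a correlation-based objective; the encoder architectures, feature dimensions (21-dimensional audio features, 128-channel EEG), and all names are illustrative assumptions and are not taken from the thesis code.

# Hypothetical sketch: a Pearson-correlation cost for deep CCA-style training.
# Encoders, dimensions, and data below are illustrative assumptions only.
import torch
import torch.nn as nn

def pearson_correlation_loss(x, y, eps=1e-8):
    # Negative Pearson correlation between two 1-D projections over time.
    # Minimizing this loss maximizes the stimulus-response correlation.
    xc = x - x.mean()
    yc = y - y.mean()
    return -(xc * yc).sum() / (xc.norm() * yc.norm() + eps)

# Simple feed-forward encoders standing in for the learned audio/EEG transforms.
audio_encoder = nn.Sequential(nn.Linear(21, 32), nn.Tanh(), nn.Linear(32, 1))
eeg_encoder = nn.Sequential(nn.Linear(128, 32), nn.Tanh(), nn.Linear(32, 1))

optimizer = torch.optim.Adam(
    list(audio_encoder.parameters()) + list(eeg_encoder.parameters()), lr=1e-3)

# Toy data: 1000 time frames, 21-dim audio features, 128-channel EEG responses.
audio_feats = torch.randn(1000, 21)
eeg_resp = torch.randn(1000, 128)

for step in range(100):
    optimizer.zero_grad()
    a = audio_encoder(audio_feats).squeeze(-1)  # transformed audio, shape (time,)
    e = eeg_encoder(eeg_resp).squeeze(-1)       # transformed EEG, shape (time,)
    loss = pearson_correlation_loss(a, e)
    loss.backward()
    optimizer.step()

Training with the negative correlation as the loss mirrors the stated goal of learning audio and EEG representations that are maximally correlated; in the inter-subject DMCCA setting described in the abstract, encoders would additionally be shared across subjects to boost the common, stimulus-driven components.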

