dc.contributor.advisor | Ganapathy, Sriram | |
dc.contributor.author | Purushothaman, Anurenjan | |
dc.date.accessioned | 2023-10-03T10:11:56Z | |
dc.date.available | 2023-10-03T10:11:56Z | |
dc.date.submitted | 2023 | |
dc.identifier.uri | https://etd.iisc.ac.in/handle/2005/6236 | |
dc.description.abstract | Automatic speech recognition (ASR) based technologies are radically changing the way we interact with digital services and information. Most of these applications rely on hands-free speech, where talkers are able to speak at a distance from the microphones without the nuisance of a handheld or body-worn device. Applications such as meeting annotation, speech-to-text transcription in teleconferencing, and hands-free interfaces for controlling consumer products (interactive TVs, virtual assistants in mobile phones, smart speakers, etc.) benefit from the distant-talking mode of operation. The main issue in distant-talking speech recognition is the corruption of speech signals by noise and reverberation. This thesis focuses on developing dereverberation methods for speech processing using sub-band temporal envelopes.
This thesis pursues two broad directions for addressing issues in far-field ASR. In the first part of the thesis, two methods for dereverberation are proposed. In the second part of the thesis, we develop a speech enhancement model, where the audio signal is re-synthesized using dereverberated temporal envelopes and corresponding carrier components.
In the first part of the thesis, two methods to address reverberation are developed. The first method develops a 3-D acoustic modeling framework for far-field ASR, in which spatio-spectral features from all the available channels are extracted. The input features to the 3-D convolutional neural network (CNN) are extracted by modeling the signal peaks in the spatio-spectral domain using a multivariate autoregressive (AR) modeling approach. This AR model is efficient in capturing the channel correlations in the frequency domain of the multi-channel signal. In the second method, a neural model for speech dereverberation using the long-term sub-band envelopes of speech is developed. The neural dereverberation model estimates the envelope gain, which, when applied to reverberant signals, suppresses the late reflection components in the far-field signal. The dereverberated envelopes are used for feature extraction in speech recognition.
The second part of the thesis deals with envelope-carrier based speech enhancement. Here, we investigate the effect of far-field artifacts on temporal envelopes and the corresponding carrier components. A dual-path recurrent neural model is used to learn, in parallel, the mappings for the reverberant envelopes and the carrier signals. Further, joint learning of the speech enhancement model with the end-to-end ASR model as a single neural model is proposed.
Both parts of the thesis use the frequency domain linear prediction (FDLP) model to extract the envelopes of the sub-band signals in long analysis windows. We report several ASR and speech quality experiments that highlight the benefits of the proposed techniques. | en_US |
dc.description.sponsorship | Samsung Research India Bangalore, College of Engineering Trivandrum | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartofseries | ;ET00249 | |
dc.rights | I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation | en_US |
dc.subject | Speech processing | en_US |
dc.subject | Automatic speech recognition | en_US |
dc.subject | Frequency domain linear prediction | en_US |
dc.subject.classification | Research Subject Categories::TECHNOLOGY | en_US |
dc.title | Dereverberation of Speech Using Autoregressive Models of Sub-band Envelopes | en_US |
dc.type | Thesis | en_US |
dc.degree.name | PhD | en_US |
dc.degree.level | Doctoral | en_US |
dc.degree.grantor | Indian Institute of Science | en_US |
dc.degree.discipline | Engineering | en_US |