Risk Estimation Strategies for Speech Signal Denoising

Sadasivan, Jishnu

Abstract

In automatic speech recognition (ASR) systems, recognition performance is severely affected by noise in the input speech signal [2]. The performance of ASR systems in noisy environments can be improved by denoising the speech signal before it is input to the ASR system [3].
Another important application of denoising algorithms is in hearing aids. In noisy environments, the ability of a hearing-impaired listener to understand speech suffers more than that of a normal-hearing listener. This is because normal-hearing listeners are able to take advantage of the redundancy in the noisy speech signal, which helps in understanding speech, whereas hearing-impaired listeners are not able to do so [4, 5]. Hearing-impaired listeners may have a higher speech hearing threshold; to compensate for it, a hearing aid must amplify the signal energy in a frequency-dependent fashion. The noise present in the input speech is also amplified by the hearing aid, which causes further difficulty in hearing. Hence, it is important to provide denoised speech at the input of the hearing aid to improve speech intelligibility and listening comfort [4–7].
Depending on the number of microphones (channels) available for signal recording, we have either a single-channel denoising problem, where only one microphone signal is available, or a multichannel denoising problem, where a microphone array is used for signal acquisition. In this thesis, we develop techniques for single-channel speech denoising. The results presented herein could be suitably extended to the multichannel case. A rather simplistic approach would be to process each channel independently of the others, but this would be sub-optimal compared with processing all channels together while taking inter-channel correlations into account.
For the case of additive noise, with reverberation effects ignored, several enhancement algorithms have been developed [1]. They can be broadly classified into four types: (i) spectral subtraction algorithms; (ii) Wiener filter techniques; (iii) subspace methods; and (iv) stochastic model-based techniques.

URI

https://etd.iisc.ac.in/handle/2005/5414

Collections

Electrical Communication Engineering (ECE)