Risk Estimation Strategies for Speech Signal Denoising
Abstract
In automatic speech recognition (ASR) systems, the recognition performance is severely
affected by noise in the input speech signal [2]. One can improve the performance of ASR
systems in noisy environments by denoising the speech signal that is input to the ASR
system [3]. Another important application of denoising algorithms is in hearing aids. In
noisy environments, the ability of a hearing-impaired listener to understand speech suffers
more than that of a normal-hearing listener. This is because normal-hearing listeners are able
to take advantage of the redundancy in the noisy speech signal, which helps in understanding
speech, whereas hearing-impaired listeners are not able to do so [4, 5]. Hearing-impaired
listeners may have a higher speech hearing threshold; to compensate, a hearing aid
must amplify the signal energy in a frequency-dependent fashion. The noise present
in the input speech is amplified along with the speech and causes further difficulty in
hearing. Hence, it is important to provide denoised speech at the input of the hearing aid to
improve speech intelligibility and listening comfort [4–7].
Depending on the number of microphones (channels) available for recording the signal,
we have either a single-channel denoising problem, where only one microphone
signal is available, or a multichannel denoising problem, where a microphone array is
used for signal acquisition. In this thesis, we develop techniques for single-channel speech
denoising. The results presented herein could be suitably extended to the multichannel
case. A simplistic extension would be to process each channel independently of
the others, but this would be suboptimal compared with processing all channels jointly
while taking inter-channel correlations into account. For additive noise, and with
reverberation effects ignored, several enhancement algorithms have been developed [1]. They can
be broadly classified into four types: (i) spectral subtraction algorithms; (ii) Wiener filter
techniques; (iii) subspace methods; and (iv) stochastic model-based techniques.
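
To make the additive-noise setting concrete, the following is a minimal sketch of the first category, power spectral subtraction, applied to a single frame. It is an illustration under stated assumptions, not the method developed in this thesis: the noise is assumed white with a known per-bin power, the frame is processed without windowing or overlap, and the names `dft`, `idft`, `spectral_subtraction`, and `mse` are hypothetical helpers introduced here.

```python
import cmath
import math
import random


def dft(x):
    """Unnormalized discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse DFT; returns the real part, which suffices for real-valued frames."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def spectral_subtraction(noisy, noise_power):
    """Subtract an assumed per-bin noise power from the noisy power spectrum.

    noisy       : one frame of the observed signal y = s + w (additive noise)
    noise_power : assumed value of E|W(k)|^2, taken to be equal for all bins
    The noisy phase is kept; negative power estimates are clipped to zero.
    """
    Y = dft(noisy)
    S_hat = []
    for Yk in Y:
        p = abs(Yk) ** 2
        clean_p = max(p - noise_power, 0.0)
        gain = math.sqrt(clean_p / p) if p > 0 else 0.0
        S_hat.append(gain * Yk)
    return idft(S_hat)


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)
```

For white noise of variance sigma^2 and a frame of length N, the unnormalized DFT used here gives an expected per-bin noise power of N * sigma^2, so a call would look like `spectral_subtraction(noisy, N * sigma ** 2)`. In practice the noise power is not known and must be estimated, for example from speech-free segments.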