Speech enhancement using deep mixture of experts

Karjol, Pavan Subhaschandra

dc.contributor.advisor	Ghosh, Prasanta Kumar
dc.contributor.author	Karjol, Pavan Subhaschandra
dc.date.accessioned	2021-07-08T10:13:47Z
dc.date.available	2021-07-08T10:13:47Z
dc.date.submitted	2018
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/5190
dc.description.abstract	Speech enhancement is at the heart of many applications such as speech communication, automatic speech recognition, and hearing aids. In this work, we consider speech enhancement under the framework of a multiple deep neural network (DNN) system. DNNs have been extensively used in speech enhancement due to their ability to capture complex variations in the input data. As a natural extension, researchers have used variants of a network with multiple DNNs for speech enhancement. The input data can be clustered to train each DNN separately, or all the DNNs can be trained jointly without any clustering. In this work, we propose clustering methods for training multiple DNN systems and their variants for speech enhancement. One of the proposed methods involves grouping phonemes into broad classes and training a separate DNN for each class. Such an approach is found to perform better than single DNN based speech enhancement. However, it relies on phoneme information, which may not be available for all corpora. Hence, we propose a hard expectation-maximization (EM) based task-specific clustering method, which automatically determines clusters without relying on knowledge of speech units. The idea is to redistribute the data points among multiple DNNs in a way that enables better speech enhancement. The experimental results show that the hard EM based clustering performs better than single DNN based speech enhancement and provides results similar to those of the broad phoneme class based approach.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	;G29477
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.	en_US
dc.subject	deep neural network	en_US
dc.subject	Speech enhancement	en_US
dc.subject.classification	Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electrical engineering	en_US
dc.title	Speech enhancement using deep mixture of experts	en_US
dc.type	Thesis	en_US
dc.degree.name	MS	en_US
dc.degree.level	Masters	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US

Files in this item

Name: G29477.pdf
Size: 1.254Mb
Format: PDF
Description: Thesis full text

This item appears in the following Collection(s)

Electrical Engineering (EE) [342]