dc.contributor.advisor: Balakrishnan, N
dc.contributor.author: Patra, Sabyasachi
dc.date.accessioned: 2025-10-30T10:39:54Z
dc.date.available: 2025-10-30T10:39:54Z
dc.date.submitted: 2007
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/7257
dc.description.abstract: Recent advances in speaker recognition and identification have made speaker identification one of the most trusted methods for authorization and forensic applications. Field deployment, however, requires such systems to function effectively in noisy environments. Designing robust speaker identification systems for these conditions has attracted significant attention from the research community and is the focus of this thesis. In this work, we explore various dimensionality reduction techniques and their application to speaker identification. Principal Component Analysis (PCA), a coordinate-based dimensionality reduction method, plays a dominant role in this domain. By projecting the original feature set onto a smaller subspace through a linear orthogonal transformation, PCA reduces both the dimensionality of the feature vectors and the correlation among their components. This transformation lowers the computational overhead of subsequent processing stages and reduces the effect of noise, thereby improving accuracy. This thesis applies a feature-dependent dimensionality reduction technique known as Weighted Principal Component Analysis (WPCA). The key advantage of WPCA is its ability to merge coordinate-based and weight-based methods into a unified framework. Experimental results show an improvement of up to 3% in speaker identification accuracy using WPCA over PCA across various Signal-to-Noise Ratios (SNRs). Selecting the optimal set of parameters is critical in dimensionality reduction. In speaker identification, the most significant components are those that most clearly distinguish the speech of one individual from that of another. We conducted experiments to identify the optimal parameters in the transformed feature vectors. This thesis uses 24-dimensional Mel-Frequency Cepstral Coefficient (MFCC) feature vectors, which are transformed into either PCA or WPCA space.
In PCA space, each feature vector is divided into two parts: coefficients 1 to 12, which correspond to the higher eigenvalues, form the Principal Component Features (PCF), and coefficients 13 to 24, which correspond to the lower eigenvalues, form the Minor Component Features (MCF). Experimental and analytical results show that MCFs have greater discriminative power than PCFs. Another significant contribution of this thesis is the extraction of latent features from the speech spectrum to enable automatic noise filtration. The proposed method applies Latent Variable Decomposition (LVD) to the magnitude spectral vectors of the speech signal. In this method, the distribution of spectral vectors is modeled as a mixture of multinomial distributions, defined by the a priori probabilities of a fixed number of hidden classes and the conditional frequency-bin distribution of each class. These distributions form the transformation matrix used to generate new feature vectors, and the number of hidden classes determines the dimensionality of the new feature vector. Since these features are inherently frequency-independent, noise effects are absorbed during this process. The features are used in the candidate selection stage, where decisions are based on the Bhattacharyya distance between speakers. Subsequently, Gaussian Mixture Models (GMM) are applied to the selected candidates using MFCC feature vectors. Results show that the proposed features yield up to a 400% improvement in speaker identification rate over MFCC features at 10 dB SNR, demonstrating high effectiveness in noisy environments.
dc.language.iso: en_US
dc.relation.ispartofseries: T06533
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
dc.subject: Principal Component Analysis
dc.subject: Weighted PCA
dc.subject: Latent Variable Decomposition
dc.title: Robust Speaker Identification System
dc.degree.name: MSc Engg
dc.degree.level: Masters
dc.degree.grantor: Indian Institute of Science
dc.degree.discipline: Engineering
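
The abstract describes a candidate selection stage that compares speakers by Bhattacharyya distance before GMM scoring. The record does not say how the thesis computes that distance, so the following is only a minimal sketch under stated assumptions: each speaker is summarized by a discrete probability vector over the hidden classes (as latent-variable features might be), and the distance takes the discrete form D_B(p, q) = -ln(sum_i sqrt(p_i * q_i)). The function names, speaker labels, and numbers are all hypothetical illustrations, not values from the thesis.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete probability vectors.

    Assumes p and q are non-negative and each sums to 1 (an assumption of
    this sketch, not something the abstract specifies).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if bc <= 0.0:
        return math.inf  # no overlap at all
    # Clamp to 1.0 so rounding error cannot produce a negative distance.
    return -math.log(min(bc, 1.0))


def select_candidates(test_profile, speaker_profiles, k):
    """Return the k speaker labels closest to test_profile."""
    ranked = sorted(
        speaker_profiles,
        key=lambda name: bhattacharyya_distance(test_profile, speaker_profiles[name]),
    )
    return ranked[:k]


# Hypothetical toy profiles over three hidden classes.
speakers = {
    "spk_a": [0.7, 0.2, 0.1],
    "spk_b": [0.1, 0.8, 0.1],
    "spk_c": [0.3, 0.3, 0.4],
}
test_profile = [0.6, 0.3, 0.1]

shortlist = select_candidates(test_profile, speakers, k=2)
print(shortlist)  # ['spk_a', 'spk_c']
```

In a two-stage design like the one the abstract outlines, only the shortlisted speakers would then be scored by the more expensive GMM stage on MFCC features; that second stage is omitted here.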

