Browsing by Advisor "Ghosh, Prasanta Kumar"

Acoustic-Articulatory Mapping: Analysis and Improvements with Neural Network Learning Paradigms

Illa, Aravind

Human speech is one of many acoustic signals we perceive, which carries linguistic and paralinguistic (e.g., speaker identity, emotional state) information. Speech acoustics are produced as a result of different temporally ...

Analysis of vocal sounds in asthmatic patients

Yadav, Shivani

Around 334 million people have asthma worldwide. Asthma is an inflammatory disease of the airways which causes cough, breathlessness, chest tightness, and other peculiar sounds during breathing. The gold standard test ...

Binaural Source Localization using subband reliability and interaural time difference patterns

Karthik, Girija Ramesan

Machine localization of sound sources is necessary for a wide range of applications, including human-robot interaction, surveillance and hearing aids. Robot sound localization algorithms have been proposed using ...

Improved air-tissue boundary segmentation in real-time magnetic resonance imaging videos using speech articulator specific error criterion

Roy, Anwesha

Real-time Magnetic Resonance Imaging (rtMRI) is a tool used extensively in speech science and linguistics to understand the dynamics of the speech production process across languages and health conditions. rtMRI has two ...

On the Optimality of Generative Adversarial Networks — A Variational Perspective

Asokan, Siddarth

Generative adversarial networks (GANs) are a popular learning framework to model the underlying distribution of images. GANs comprise a min-max game between the generator and the discriminator. While the generator transforms ...

Probabilistic source-filter model of speech

Achuth Rao, M V

The human respiratory system plays a crucial role in breathing and swallowing. However, it also plays an essential role in speech production, which is unique to humans. Speech production involves expelling air from the ...

Pronunciation assessment and semi-supervised feedback prediction for spoken English tutoring

Yarra, Chiranjeevi

Spoken English pronunciation quality is often influenced by the nativity of a learner, for whom English is the second language. Typically, the pronunciation quality of a learner depends on the degree of the following four ...

Speaker verification using whispered speech

Naini, Abinay Reddy

Like neutral speech, whispered speech is one of the natural modes of speech production, and it is often used by speakers in their day-to-day life. For some people, such as laryngectomees, whispered speech is the only ...

Speech enhancement using deep mixture of experts

Karjol, Pavan Subhaschandra

Speech enhancement is at the heart of many applications such as speech communication, automatic speech recognition, and hearing aids. In this work, we consider speech enhancement under the framework of multiple ...

Speech task-specific representation learning using acoustic-articulatory data

Mannem, Renuka

Human speech production involves modulation of the air stream by the vocal tract shape determined by the articulatory configuration. Articulatory gestures are often used to represent the speech units. It has been shown ...