Browsing by Advisor "Ganapathy, Sriram"
Now showing items 1-9 of 9
-
Audio-Visual association learning in Humans and Multimodal Networks
We easily learn audiovisual associations when we give visual objects their names. While humans easily learn the names of new objects while retaining previously learned information, deep neural networks forget old ... -
Deep Learning Methods For Audio EEG Analysis
The perception of speech and audio is one of the defining features of humans. Many of the brain’s underlying processes as we listen to acoustic signals are unknown, and significant research efforts are needed to unravel ... -
Dereverberation of Speech Using Autoregressive Models of Sub-band Envelopes
Automatic speech recognition (ASR) based technologies are radically changing the way we interact with digital services and information. Most of these applications rely on hands-free speech, where talkers are able to ... -
Graph Clustering Approaches for Speaker Diarization of Conversational Speech
In this era of advanced machine intelligence, real-world speech applications need to be equipped to deal with conversations involving multiple speakers. An essential first step in speech information extraction from ... -
Investigating Neural Mechanisms of Word Learning and Speech Perception
Language learning and speech perception are remarkable feats performed by the human brain, involving complex neural mechanisms that allow us to understand and communicate with one another. Unravelling the mysteries of these ... -
A Learnable Distillation Approach For Model-agnostic Explainability With Multimodal Applications
Deep neural networks are the most widely used examples of sophisticated mapping functions from feature space to class labels. In recent years, several high-impact decisions in domains such as finance, healthcare, law ... -
Mask Estimator Approaches For Audio Beamforming
Beamforming is a family of algorithms that perform a spatial filtering operation, making it possible to map the distribution of the sources at a certain distance from the microphones and therefore locate the ... -
Neural Representation Learning for Speech and Audio Signals
Representation learning is the branch of machine learning consisting of techniques that are capable of automatically discovering meaningful representations from raw data for efficient information extraction. In recent ... -
Supervised Learning Approaches for Language and Speaker Recognition
In the age of artificial intelligence, it is important for machines to automatically figure out who is speaking and in what language, a task humans are naturally capable of. Developing algorithms that automatically infer ...