Visual Speech Recognition

Jain, Abhilash

dc.contributor.advisor	Rathna, G N
dc.contributor.author	Jain, Abhilash
dc.date.accessioned	2020-12-17T11:11:03Z
dc.date.available	2020-12-17T11:11:03Z
dc.date.submitted	2018
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/4767
dc.description.abstract	Visual speech recognition (VSR), or automatic lip-reading, is the task of extracting speech information from visual input. The addition of visual speech has been shown to improve the performance of traditional audio speech recognition (ASR) systems, and VSR has hence been an active area of research since its inception. This thesis proposes a new VSR system for isolated word recognition tasks, with a focus on the feature extraction methodology. A novel two-stage feature extraction technique is proposed. Image transform based features, namely discrete cosine transform (DCT) and local binary patterns (LBP), are used. The use of difference images for temporal feature extraction is also proposed. A new region of interest (ROI), which consists of the throat and lower jaw along with the mouth, is also introduced. For ROI extraction, the Viola-Jones algorithm is used. Classification is done using a multi-class Support Vector Machine (SVM) model. The system provides a simple yet effective way to extract features from the video input, and performs comparably to some recent VSR systems that employ more complicated techniques, such as lip modelling or deep learning, to extract visual features.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	;G29657
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.	en_US
dc.subject	Speech recognition	en_US
dc.subject	audio speech recognition	en_US
dc.subject.classification	Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electrical engineering	en_US
dc.title	Visual Speech Recognition	en_US
dc.type	Thesis	en_US
dc.degree.name	MS	en_US
dc.degree.level	Masters	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US

Files in this item

Name: G29657.pdf
Size: 7.665Mb
Format: PDF
Description: Thesis full text

This item appears in the following Collection(s)

Electrical Engineering (EE) [423]
