dc.contributor.advisor: Ghosh, Prasanta Kumar
dc.contributor.author: Roy, Anwesha
dc.date.accessioned: 2022-10-26T07:06:42Z
dc.date.available: 2022-10-26T07:06:42Z
dc.date.submitted: 2022
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/5882
dc.description.abstract: Real-time Magnetic Resonance Imaging (rtMRI) is a tool used extensively in speech science and linguistics to understand the dynamics of the speech production process across languages and health conditions. rtMRI has two advantages over other methods that capture articulatory movement, such as X-ray, ultrasound, and electromagnetic articulography: it is non-invasive, and it captures a complete view of the vocal tract, including pharyngeal structures. An rtMRI video provides spatio-temporal information about speech articulatory movements, which helps in modeling speech production. For this purpose, a common step is to obtain the air-tissue boundary (ATB) segmentation in every frame of the rtMRI video. Accurate estimation of the ATBs of the upper airway of the vocal tract is essential for many speech processing applications, such as speaker verification, text-to-speech synthesis, visual augmentation of synthesized articulatory videos, and analysis of vocal tract movement.

The best performance in ATB segmentation of rtMRI videos in speech production, in unseen-subject conditions, is known to be achieved by a 3-dimensional convolutional neural network (3D-CNN) model. In seen-subject conditions, both the 3D-CNN and a 2-dimensional deep convolutional encoder-decoder network (SegNet) show similar performance. However, the evaluation of these models, as well as of other ATB segmentation techniques reported in the literature, has been done using the Dynamic Time Warping (DTW) distance between the entire original and predicted boundaries or contours. Such an evaluation measure may not capture local errors in the predicted contour. Careful analysis of predicted contours reveals errors in regions such as the velum and the tongue base, which are not captured by a global evaluation metric like the DTW distance. In this thesis, such errors are automatically detected and a novel correction scheme is proposed for them. Two new evaluation metrics are also proposed for ATB segmentation, separately for each contour, to explicitly capture errors in these contours.

Moreover, the state-of-the-art models use overall binary cross-entropy as the loss function during model training. Such a global loss function does not place enough emphasis on the regions that are more prone to errors. In this thesis, the use of regional loss functions together with the global loss is explored, focusing on the areas of the contours identified as error-prone in the analysis. Two different losses are considered in the regions around the velum and the tongue base: binary cross-entropy (BCE) loss and dice loss. It is observed that the dice-loss-based models perform better than their BCE-loss-based counterparts.
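The combination of a global loss with a regional loss described in the abstract can be sketched briefly. The following is a minimal illustrative sketch in PyTorch, not the thesis code: it adds a dice loss computed inside a binary mask marking error-prone areas (such as the velum and tongue base) to a global binary cross-entropy term over the whole frame. The region mask, the weighting factor region_weight, and the smoothing constant eps are assumptions made for illustration only.

import torch
import torch.nn.functional as F

def dice_loss(pred, target, eps=1e-6):
    # Soft dice loss over the given (already masked) pixels.
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def combined_loss(pred, target, region_mask, region_weight=0.5):
    # pred, target : (B, H, W) tensors with values in [0, 1]
    # region_mask  : (B, H, W) binary tensor marking velum / tongue-base areas
    # Global BCE over the whole frame plus dice loss restricted to region_mask.
    global_bce = F.binary_cross_entropy(pred, target)
    regional_dice = dice_loss(pred * region_mask, target * region_mask)
    return global_bce + region_weight * regional_dice

if __name__ == "__main__":
    # Random tensors stand in for network output and ground-truth masks.
    pred = torch.rand(2, 68, 68)
    target = (torch.rand(2, 68, 68) > 0.5).float()
    region = torch.zeros(2, 68, 68)
    region[:, 20:40, 30:50] = 1.0  # hypothetical velum / tongue-base window
    print(combined_loss(pred, target, region).item())

Swapping dice_loss for a masked BCE term in combined_loss would give the BCE-loss regional variant that the abstract compares against.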
dc.language.iso: en_US
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
dc.subject: segmentation
dc.subject: rtMRI
dc.subject.classification: Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electrical engineering
dc.title: Improved air-tissue boundary segmentation in real-time magnetic resonance imaging videos using speech articulator specific error criterion
dc.type: Thesis
dc.degree.name: MTech (Res)
dc.degree.level: Masters
dc.degree.grantor: Indian Institute of Science
dc.degree.discipline: Engineering

