Show simple item record

dc.contributor.advisor: Sreenivas, T V
dc.contributor.author: Stalin, Suryan.
dc.date.accessioned: 2025-10-30T11:06:53Z
dc.date.available: 2025-10-30T11:06:53Z
dc.date.submitted: 1994
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/7292
dc.description.abstract: Among the various neural network architectures and learning algorithms that have emerged recently, the multilayer perceptron (MLP) network with backpropagation learning is found to be the most effective for speech recognition, owing to its ability to form arbitrarily complex decision regions in the feature space and to the overall versatility of the backpropagation learning algorithm. This thesis addresses the issues of applying the MLP to phoneme recognition in continuous speech and proposes modifications to the network architecture and learning algorithms that can give improved performance. Feature representation of speech units, such as phonemes, is complex due to talker variability and contextual variability in continuous speech, which leads to difficulties for a pattern classifier. A typical MLP network classifier would require a large number of learning iterations, and poor learning would result in poor network performance. Choosing an optimum size/structure of the MLP network for a given task remains an open problem. This thesis proposes (i) a new MLP network architecture that provides faster convergence in the learning phase, (ii) a network pruning algorithm that provides an optimized network structure along with network convergence, and (iii) a method of improving network generalization. These features are shown to be effective in a neural-network-based speech recognizer for vowel and semivowel classification in speaker-independent continuous speech.

An analysis of the slow convergence of the backpropagation algorithm reveals the significance of presenting the output error in vector form and propagating it back vectorially. It is shown that such a mode of propagation, in addition to providing faster convergence, also converts the M-class recognition problem into M independent two-class recognition (dichotomizer) problems. For speech recognition and character recognition tasks, the new scheme is shown to converge within 20–50% of the iterations required by the MLP, with no degradation in recognition performance.

The convergence and generalization properties of the network are also related to its size/structure. Network pruning is a method in which a larger network is iteratively reduced to an optimum size. A dynamic link pruning algorithm has been developed and incorporated into the backpropagation algorithm so that an optimized network is obtained along with network convergence. In experiments with speech recognition and character recognition tasks, this optimization is shown to provide a 2–7% improvement in recognition performance.

Network generalization is determined by the feature-space partitions formed after learning. A generalization algorithm is proposed which, under suitable conditions, maximizes the volume of the decision regions formed by the network in the feature space. Such decision-region expansion leads to improved test-set performance.

Incorporating all the features described above, an optimized neural network dichotomizer is developed for vowel and semivowel classification in continuous speech. The results obtained are encouraging compared with those available in the literature for talker-independent semivowel classification in continuous speech using the TIMIT database.
dc.language.iso: en_US
dc.relation.ispartofseries: T03532
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
dc.subject: Multilayer Perceptron
dc.subject: Backpropagation Learning
dc.subject: Generalization Improvement
dc.title: Optimized neural network dichotomizer for speech recognition
dc.type: Thesis
dc.degree.name: MSc Engg
dc.degree.level: Masters
dc.degree.grantor: Indian Institute of Science
dc.degree.discipline: Engineering
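
The abstract above describes converting the M-class recognition problem into M independent two-class (dichotomizer) problems. The sketch below is only a minimal illustration of that one-vs-rest decomposition, written in plain NumPy: each class gets its own small MLP with a single sigmoid output, trained with ordinary squared-error backpropagation. The class name Dichotomizer, the hidden-layer size, the learning rate, and the epoch count are assumptions made for illustration; this is not the thesis's actual architecture or its vectorial error-propagation scheme.

# One-vs-rest decomposition sketch: an M-class problem handled by M independent
# two-class MLPs ("dichotomizers"), each trained with plain backpropagation on a
# single sigmoid output.  Hypothetical illustration only; sizes, learning rate
# and epoch count are placeholders, not the thesis's settings.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Dichotomizer:
    """One-hidden-layer MLP with a single output unit (class vs. rest)."""

    def __init__(self, n_in, n_hidden, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)
        self.lr = lr

    def forward(self, x):
        self.h = sigmoid(x @ self.W1 + self.b1)       # hidden activations
        self.y = sigmoid(self.h @ self.W2 + self.b2)  # class-membership score
        return self.y

    def backprop(self, x, target):
        y = self.forward(x)
        # Deltas for squared error with sigmoid units.
        d_out = (y - target) * y * (1.0 - y)
        d_hid = (d_out @ self.W2.T) * self.h * (1.0 - self.h)
        # Gradient-descent weight updates.
        self.W2 -= self.lr * np.outer(self.h, d_out)
        self.b2 -= self.lr * d_out
        self.W1 -= self.lr * np.outer(x, d_hid)
        self.b1 -= self.lr * d_hid

def train_one_vs_rest(X, labels, n_classes, n_hidden=8, epochs=50):
    """Train one independent dichotomizer per class (one-vs-rest targets)."""
    nets = [Dichotomizer(X.shape[1], n_hidden, seed=c) for c in range(n_classes)]
    for c, net in enumerate(nets):
        targets = (labels == c).astype(float)  # 1 for class c, 0 for the rest
        for _ in range(epochs):
            for x, t in zip(X, targets):
                net.backprop(x, t)
    return nets

def classify(nets, x):
    """Pick the class whose dichotomizer responds most strongly."""
    return int(np.argmax([net.forward(x)[0] for net in nets]))

Because the M networks are trained independently, each one is driven only by its own two-class target, which mirrors the decomposition described in the abstract; at test time the class whose dichotomizer responds most strongly is chosen.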


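The abstract also mentions a dynamic link pruning algorithm incorporated into backpropagation so that an optimized structure is obtained along with convergence. The fragment below sketches only the general idea of pruning interleaved with training, reusing the hypothetical Dichotomizer above: a simple weight-magnitude criterion with a placeholder threshold stands in for the thesis's pruning rule, which this record does not specify.

# Sketch of pruning interleaved with training: every few epochs, links whose
# weight magnitude stays below a threshold are masked out and held at zero for
# the rest of training, so the surviving structure and the final weights are
# found together.  The magnitude criterion and threshold are placeholders; the
# thesis's dynamic link-pruning rule is not described in this record.
import numpy as np

def prune_step(W, mask, threshold=0.05):
    """Permanently mask links whose weight magnitude falls below the threshold."""
    mask &= np.abs(W) >= threshold
    return W * mask, mask

def train_with_pruning(net, X, targets, epochs=100, prune_every=10):
    """Run backpropagation on net and apply link pruning periodically."""
    masks = {"W1": np.ones_like(net.W1, dtype=bool),
             "W2": np.ones_like(net.W2, dtype=bool)}
    for epoch in range(epochs):
        for x, t in zip(X, targets):
            net.backprop(x, t)
        # Re-apply the masks so previously pruned links stay removed.
        net.W1 *= masks["W1"]
        net.W2 *= masks["W2"]
        if (epoch + 1) % prune_every == 0:
            net.W1, masks["W1"] = prune_step(net.W1, masks["W1"])
            net.W2, masks["W2"] = prune_step(net.W2, masks["W2"])
    return net

Any of the one-vs-rest networks from the previous sketch can be passed to train_with_pruning; the prune_every interval and the 0.05 threshold are arbitrary illustration values.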