Efficient Algorithms for Structured Output Learning
Structured output learning is the machine learning task of building a classifier to predict structured outputs. Structured outputs arise in diverse applications such as natural language processing, computer vision, bioinformatics and social networks. Unlike simple two-class or multi-class outputs, which belong to a set of distinct or univariate categories, structured outputs are composed of multiple components with complex interdependencies amongst them. As an illustrative example, consider the natural language processing task of tagging a sentence with its corresponding part-of-speech tags. The part-of-speech tag sequence is an example of a structured output, as it is made up of multiple components, the interactions among them being governed by the underlying properties of the language. This thesis provides efficient solutions for different problems pertaining to structured output learning. The classifier for structured outputs is generally built by learning a suitable model from a set of training examples labeled with their associated structured outputs. Discriminative techniques like Structural Support Vector Machines (Structural SVMs) and Conditional Random Fields (CRFs) are popular alternatives developed for structured output learning. The thesis contributes towards developing efficient training strategies for structural SVMs. In particular, an efficient sequential optimization method is proposed for structural SVMs, which is faster than several competing methods. An extension of the sequential method to CRFs is also developed. The sequential method is adapted to a variant of structural SVM with linear cumulative loss. The thesis also presents a systematic empirical evaluation of various training methods available for structured output learning, which will be useful to the practitioner.
To train structural SVMs in the presence of a vast number of training examples without labels, the thesis develops a simple semi-supervised technique based on switching the labels of the components of the structured output. The proposed technique is general and its efficacy is demonstrated using experiments on different benchmark applications. Another contribution of the thesis is towards the design of fast algorithms for sparse structured output learning. Efficient alternating optimization algorithms are developed for sparse classifier design. These algorithms are shown to achieve sparse models faster than existing methods.