Learning Across Domains: Applications to Text-based Person Search and Multi-Source Domain Adaptation

Aggarwal, Surbhi

dc.contributor.advisor	Chakraborty, Anirban
dc.contributor.author	Aggarwal, Surbhi
dc.date.accessioned	2020-09-07T08:54:12Z
dc.date.available	2020-09-07T08:54:12Z
dc.date.submitted	2020
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/4579
dc.description.abstract	With the rapid development of technology and the ubiquitous presence of diverse types of sensors, a large amount of data from different modalities (e.g., text, audio, images, etc.) describing the same person/object/event has become easily available. Similarly, multiple datasets targeted towards the same task but exhibiting different data distributions are often available. Being able to learn and utilize the complementary information present across diverse domains can be immensely valuable towards building more intelligent models. Cross-modal learning and domain adaptation techniques are closely related to learning under such scenarios. In this thesis, we investigate and provide novel algorithms for two applications of learning across domains - namely Text-based Person Search and Multi-Source Domain Adaptation. Person search in a camera network is an important problem in the field of intelligent video surveillance. Often the search query comes in the form of an unstructured textual description of the target of interest, and the goal is to retrieve the pedestrian images that best match this description. In the first part of the thesis, we investigate methods for this cross-modal retrieval problem of Text-based Person Search. Existing methods utilize class-id information to get discriminative and identity-preserving features. However, it is not well-explored whether it is beneficial to explicitly ensure that the semantics of the data are also retained. In the proposed work, we aim to create semantics-preserving embeddings through an additional task of attribute prediction. Since attribute annotations are typically unavailable in text-based person search, we first mine them from the text corpus. These attributes are then used as a means to bridge the modality gap between the image-text inputs, as well as to improve the representation learning.
In summary, we propose an approach for text-based person search by learning an attribute-driven space along with a class-information driven space, and utilize both for obtaining the retrieval results. Our experiments show that learning the attribute space not only helps in improving performance but also yields human-interpretable features. In the second part of the thesis, we work on Multi-Source Domain Adaptation, a problem involving multiple data sources, which are of the same modality but follow different distributions. Domain adaptation is a field of machine learning that aims at learning a model from a labelled source dataset, such that the model performs well on samples drawn from an unlabelled target domain which has a related but different distribution. The problem of single-source unsupervised domain adaptation has been explored quite extensively. However, in practice, labelled data is often available from multiple, differently distributed sources - giving rise to the problem of multi-source domain adaptation (MSDA). Recent works in MSDA propose to learn a domain-invariant space for the sources and the target. However, such methods treat each source as equally relevant and are not sensitive to the intrinsic relations amongst domains. In this work, we provide a novel algorithm for multi-source domain adaptation which utilizes the multiple sources based on their relative importance to the target. Our objective is to dynamically explore the relevance of sources, and then to perform weighted alignment of domains. We experimentally validate the performance of our method on benchmark datasets, and achieve state-of-the-art results on Office-Home and Office-Caltech.	en_US
dc.language.iso	en_US	en_US
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation	en_US
dc.subject	Deep Learning	en_US
dc.subject	Text-based Person Search	en_US
dc.subject	Domain Adaptation	en_US
dc.subject.classification	Research Subject Categories::TECHNOLOGY::Information technology::Computer science	en_US
dc.title	Learning Across Domains: Applications to Text-based Person Search and Multi-Source Domain Adaptation	en_US
dc.type	Thesis	en_US
dc.degree.name	MTech (Res)	en_US
dc.degree.level	Masters	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US
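
The abstract's idea of weighting each source by its relevance to the target before aligning domains can be illustrated with a small sketch. This is not the thesis's algorithm: the discrepancy measure (squared distance between feature means), the softmax weighting, the `temperature` parameter, and all function names below are assumptions chosen only to make the idea concrete.

```python
import math

def mean_vector(features):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(row[j] for row in features) / n for j in range(dim)]

def mean_discrepancy(source, target):
    """Squared distance between feature means (an assumed, simple discrepancy)."""
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def source_weights(discrepancies, temperature=1.0):
    """Softmax over negative discrepancies: closer sources get larger weights."""
    scores = [-d / temperature for d in discrepancies]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_alignment_loss(sources, target, temperature=1.0):
    """Relevance-weighted sum of per-source discrepancies to the target."""
    discrepancies = [mean_discrepancy(s, target) for s in sources]
    weights = source_weights(discrepancies, temperature)
    loss = sum(w * d for w, d in zip(weights, discrepancies))
    return weights, loss

# Toy features (hypothetical): one source close to the target, one far away.
target = [[0.0, 0.0], [0.2, 0.0]]
near_source = [[0.1, 0.1], [0.3, 0.1]]
far_source = [[2.0, 2.0], [2.2, 2.0]]

weights, loss = weighted_alignment_loss([near_source, far_source], target)
print(weights, loss)
```

In this toy setup the nearer source receives the larger weight, so it dominates the alignment term; a uniform weighting would instead treat both sources as equally relevant.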

Files in this item

Name: final_revised_thesis_surbhi_ag ...
Size: 9.871Mb
Format: PDF
Description: Thesis full text

This item appears in the following Collection(s)

Department of Computational and Data Sciences (CDS) [118]
