Show simple item record

dc.contributor.advisor: Chakraborty, Anirban
dc.contributor.author: Aggarwal, Surbhi
dc.date.accessioned: 2020-09-07T08:54:12Z
dc.date.available: 2020-09-07T08:54:12Z
dc.date.submitted: 2020
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/4579
dc.description.abstract: With rapid development in technology and the ubiquitous presence of diverse types of sensors, a large amount of data from different modalities (e.g., text, audio, images, etc.) describing the same person/object/event has become easily available. Similarly, multiple datasets targeted towards the same task but exhibiting different data distributions are often available. Being able to learn and utilize the complementary information present across diverse domains can be immensely valuable for building more intelligent models. Cross-modal learning and domain adaptation techniques are closely related to learning under such scenarios. In this thesis, we investigate and provide novel algorithms for two applications of learning across domains, namely Text-based Person Search and Multi-Source Domain Adaptation.

Person search in a camera network is an important problem in the field of intelligent video surveillance. Often the search query comes in the form of an unstructured textual description of the target of interest, and the goal is to retrieve the pedestrian images that best match this description. In the first part of the thesis, we investigate methods for this cross-modal retrieval problem of Text-based Person Search. Existing methods utilize class-id information to get discriminative and identity-preserving features. However, it is not well explored whether it is beneficial to explicitly ensure that the semantics of the data are also retained. In the proposed work, we aim to create semantics-preserving embeddings through an additional task of attribute prediction. Since attribute annotations are typically unavailable in text-based person search, we first mine them from the text corpus. These attributes are then used as a means to bridge the modality gap between the image and text inputs, as well as to improve the representation learning.

In summary, we propose an approach for text-based person search that learns an attribute-driven space along with a class-information-driven space, and utilizes both to obtain the retrieval results. Our experiments show that learning the attribute space not only helps in improving performance but also yields human-interpretable features.

In the second part of the thesis, we work on Multi-Source Domain Adaptation, a problem involving multiple data sources which are of the same modality but follow different distributions. Domain adaptation is a field of machine learning that aims at learning a model from a labelled source dataset, such that the model performs well on samples drawn from an unlabelled target domain which has a related but different distribution. The problem of single-source unsupervised domain adaptation has been explored quite extensively. However, in practice, labelled data is often available from multiple, differently distributed sources, giving rise to the problem of multi-source domain adaptation (MSDA). Recent works in MSDA propose to learn a domain-invariant space for the sources and the target. However, such methods treat each source as equally relevant and are not sensitive to the intrinsic relations amongst domains. In this work, we provide a novel algorithm for multi-source domain adaptation which utilizes the multiple sources based on their relative importance to the target. Our objective is to dynamically explore the relevance of sources, and then to perform weighted alignment of domains. We experimentally validate the performance of our method on benchmark datasets, and achieve state-of-the-art results on Office-Home and Office-Caltech. (en_US)
dc.language.iso: en_US (en_US)
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. (en_US)
dc.subject: Deep Learning (en_US)
dc.subject: Text-based Person Search (en_US)
dc.subject: Domain Adaptation (en_US)
dc.subject.classification: Research Subject Categories::TECHNOLOGY::Information technology::Computer science (en_US)
dc.title: Learning Across Domains: Applications to Text-based Person Search and Multi-Source Domain Adaptation (en_US)
dc.type: Thesis (en_US)
dc.degree.name: MTech (Res) (en_US)
dc.degree.level: Masters (en_US)
dc.degree.grantor: Indian Institute of Science (en_US)
dc.degree.discipline: Engineering (en_US)


