Show simple item record

dc.contributor.advisor: Venkatesh Babu, R
dc.contributor.author: Sarvadevabhatla, Ravi Kiran
dc.date.accessioned: 2021-09-28T05:25:05Z
dc.date.available: 2021-09-28T05:25:05Z
dc.date.submitted: 2018
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/5351
dc.description.abstract: Deep Learning-based object category understanding is an important and active area of research in Computer Vision. Most work in this area has predominantly focused on the portion of the depiction spectrum consisting of photographic images. However, depictions at the other end of the spectrum, freehand sketches, are a fascinating visual representation and worthy of study in themselves. In this thesis, we present deep-learning approaches for sketch analysis, sketch synthesis and modelling sketch-driven cognitive processes.

On the analysis front, we first focus on the problem of recognizing hand-drawn line sketches of objects. We propose a deep Recurrent Neural Network architecture with a novel loss formulation for sketch object recognition. Our approach achieves state-of-the-art results on a large-scale sketch dataset. We also show that the inherently online nature of our framework is especially suitable for on-the-fly recognition of objects as they are being drawn. We then move beyond object-level label prediction to the relatively harder problem of parsing sketched objects, i.e. given a freehand object sketch, determine its salient attributes (e.g. category, semantic parts, pose). To this end, we propose SketchParse, the first deep-network architecture for fully automatic parsing of freehand object sketches. We subsequently demonstrate SketchParse's abilities (i) on two challenging large-scale sketch datasets, (ii) in parsing unseen, semantically related object categories and (iii) in improving fine-grained sketch-based image retrieval. As a novel application, we also illustrate how SketchParse's output can be used to generate caption-style descriptions for hand-drawn sketches.

On the synthesis front, we design generative models for sketches via Generative Adversarial Networks (GANs). Keeping the limited size of sketch datasets in mind, we propose DeLiGAN, a novel architecture for diverse and limited training data scenarios. In our approach, we reparameterize the latent generative space as a mixture model and learn the mixture model's parameters along with those of the GAN. This seemingly simple modification to the vanilla GAN framework is surprisingly effective and results in models which generate diverse samples even when trained with limited data. We show that DeLiGAN generates diverse samples not just for hand-drawn sketches but for other image modalities as well. To quantitatively characterize intra-class diversity of generated samples, we also introduce a modified version of "inception-score", a measure which has been found to correlate well with human assessment of generated samples. We subsequently present an approach for synthesizing minimally discriminative sketch-based object representations which we term category-epitomes. The synthesis procedure concurrently provides a natural measure for quantifying the sparseness underlying the original sketch, which we term epitome-score. We show that the category-level distribution of epitome-scores can be used to characterize the level of detail generally required for recognizing object categories.

On the cognitive process modelling front, we analyze the results of a free-viewing eye fixation study conducted on freehand sketches. The analysis reveals that eye fixation sequences exhibit marked consistency within a sketch, across sketches of a category and even across suitably grouped sets of categories. This multi-level consistency is remarkable given the variability in depiction and extreme image content sparsity that characterize hand-drawn object sketches. We show that the multi-level consistency in the fixation data can be exploited to predict a sketch's category given only its fixation sequence and to build a computational model which predicts part-labels underlying the eye fixations on objects.

The ability of machine-based agents to play games in human-like fashion is considered a benchmark of progress in AI. Motivated by this observation, we introduce the first computational model aimed at Pictionary, the popular word-guessing social game. We first introduce Sketch-QA, an elementary version of the Visual Question Answering task. Styled after Pictionary, Sketch-QA uses incrementally accumulated sketch stroke sequences as visual data and gathers open-ended guess-words from human guessers. To mimic humans playing Pictionary, we propose a deep neural model which generates guess-words in response to temporally evolving human-drawn sketches. The model even makes human-like mistakes while guessing, thus amplifying the human mimicry factor. We evaluate the model on the large-scale guess-word dataset generated via the Sketch-QA task and compare with various baselines. We also conduct a Visual Turing Test to obtain human impressions of the guess-words generated by humans and our model. The promising experimental results demonstrate the challenges and opportunities in building computational models for Pictionary and similarly themed games.
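The latent-space reparameterization that the abstract attributes to DeLiGAN (sampling from a learned mixture model instead of a single fixed Gaussian) can be sketched as follows. This is a minimal illustrative sketch, not the thesis implementation: the component count N, latent dimension d, and the parameter initialization are assumptions chosen for demonstration, and in an actual DeLiGAN-style setup mu and sigma would be trained jointly with the GAN's own parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the thesis): N mixture components,
# d-dimensional latent space.
N, d = 50, 100

# Mixture parameters. In a DeLiGAN-style setup these are trainable and
# updated jointly with the generator's weights; here they are fixed.
mu = rng.uniform(-1.0, 1.0, size=(N, d))
sigma = np.full((N, d), 0.2)

def sample_latent(batch_size):
    """Reparameterized draw: pick a component uniformly at random,
    then return z = mu_i + sigma_i * eps with eps ~ N(0, I)."""
    idx = rng.integers(0, N, size=batch_size)
    eps = rng.standard_normal((batch_size, d))
    return mu[idx] + sigma[idx] * eps

z = sample_latent(8)
print(z.shape)  # (8, 100)
```

Because the draw is expressed as mu + sigma * eps rather than a direct sample, gradients can flow back into mu and sigma during training, which is what allows the mixture parameters to be learned alongside the GAN.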
dc.language.iso: en_US
dc.relation.ispartofseries: ;G29364
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
dc.subject: Computer Vision
dc.subject: Deep Learning
dc.subject: Recurrent Neural Network architecture
dc.subject: sketch object recognition
dc.subject: SketchParse
dc.subject: Generative Adversarial Networks
dc.subject: Sketch-QA
dc.subject.classification: Research Subject Categories::TECHNOLOGY::Information technology::Computer science
dc.title: Deep Learning for Hand-drawn Sketches: Analysis, Synthesis and Cognitive Process Models
dc.type: Thesis
dc.degree.name: PhD
dc.degree.level: Doctoral
dc.degree.grantor: Indian Institute of Science
dc.degree.discipline: Engineering

