Show simple item record

dc.contributor.advisorTalukdar, Partha
dc.contributor.authorOjha, Prakhar
dc.date.accessioned2019-08-08T08:52:35Z
dc.date.available2019-08-08T08:52:35Z
dc.date.submitted2017
dc.identifier.urihttps://etd.iisc.ac.in/handle/2005/4265
dc.description.abstractCrowdsourcing has emerged as a convenient mechanism to collect human judgments on a variety of tasks, ranging from document and image classification to scientific experimentation. However, in recent times crowdsourcing has evolved from solving simpler tasks, like recognizing objects in images, to more complex tasks such as collaborative journalism, language translation, product designing etc. Unlike simpler micro-tasks performed by a single worker, these complex tasks require a group of workers and greater resources. In such scenarios, where groups of participants are the atomic units, it is a non-trivial task to distinguish workers (who contribute positively) from idlers (who do not contribute to group task) among the participants using only group's performance. The first part of this thesis studies the problem of distinguishing workers from idlers, without assuming any prior knowledge of individual skills and considers \groups" as the smallest observable unit for evaluation. We draw upon literature from group-testing and give bounds over minimum number of groups required to identify quality of subsets of individuals with high confidence. We validate our theory experimentally and report insights for the number of workers and idlers that can be identified for a given number of group-tasks with significant probability. In most crowdsourcing applications, there exist dependencies among the pool of Human Intelligence Tasks (HITs) and often in practical scenarios there are far too many HITs available than what can realistically be covered by limited available budget. Estimating the accuracy of automatically constructed Knowledge Graphs (KG) is one such important application. Automatic construction of large knowledge graphs has gained wide popularity in recent times. These KGs, such as NELL, Google Knowledge Vault, etc., consist of thousands of predicate-relations (e.g., is Person, is Mayor Of) and millions of their instances (e.g., (Bill de Blasio, is Mayor Of, New York City)). Estimating accuracy of such KGs is a challenging problem due to their size and diversity. In the second part of this study, we show that standard single-task crowdsourc- ing is sub-optimal and very expensive as it ignores dependencies among various predicates and instances. We propose Relational Crowdsourcing (RelCrowd) to overcome this challenge, where the tasks are created while taking dependencies among predicates and instances into account. We apply this framework in the context of large-scale Knowledge Graph Evaluation (KGEval) and demonstrate its effectiveness through extensive experiments on real-world datasets.en_US
dc.language.isoen_USen_US
dc.relation.ispartofseriesG28302;
dc.rightsI grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertationen_US
dc.subjectCrowdsourcingen_US
dc.subjectRelational Crowdsourcing (RelCrowd)en_US
dc.subjectKnowledge Graph Evaluationen_US
dc.subjectKGEvalen_US
dc.subjectKnowledge Graphs (KG)en_US
dc.subjectKG-Evaluationen_US
dc.subjectEvaluation Coupling Graph (ECG)en_US
dc.subject.classificationComputer Science and Automationen_US
dc.titleUtilizing Worker Groups And Task Dependencies in Crowdsourcingen_US
dc.typeThesisen_US
dc.degree.nameMSc Enggen_US
dc.degree.levelMastersen_US
dc.degree.grantorIndian Institute of Scienceen_US
dc.degree.disciplineEngineeringen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record