Utilizing Worker Groups And Task Dependencies in Crowdsourcing

Ojha, Prakhar

dc.contributor.advisor	Talukdar, Partha
dc.contributor.author	Ojha, Prakhar
dc.date.accessioned	2019-08-08T08:52:35Z
dc.date.available	2019-08-08T08:52:35Z
dc.date.submitted	2017
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/4265
dc.description.abstract	Crowdsourcing has emerged as a convenient mechanism to collect human judgments on a variety of tasks, ranging from document and image classification to scientific experimentation. In recent times, crowdsourcing has evolved from solving simpler tasks, like recognizing objects in images, to more complex tasks such as collaborative journalism, language translation, and product design. Unlike simpler micro-tasks performed by a single worker, these complex tasks require a group of workers and greater resources. In such scenarios, where groups of participants are the atomic units, it is non-trivial to distinguish workers (who contribute positively) from idlers (who do not contribute to the group task) among the participants using only the group's performance. The first part of this thesis studies the problem of distinguishing workers from idlers without assuming any prior knowledge of individual skills, and considers "groups" as the smallest observable unit for evaluation. We draw upon literature from group testing and give bounds on the minimum number of groups required to identify the quality of subsets of individuals with high confidence. We validate our theory experimentally and report insights on the number of workers and idlers that can be identified, with significant probability, for a given number of group tasks. In most crowdsourcing applications, there exist dependencies among the pool of Human Intelligence Tasks (HITs), and in practical scenarios there are often far more HITs available than can realistically be covered by the limited available budget. Estimating the accuracy of automatically constructed Knowledge Graphs (KGs) is one such important application. Automatic construction of large knowledge graphs has gained wide popularity in recent times. 
These KGs, such as NELL and Google Knowledge Vault, consist of thousands of predicate-relations (e.g., is Person, is Mayor Of) and millions of their instances (e.g., (Bill de Blasio, is Mayor Of, New York City)). Estimating the accuracy of such KGs is a challenging problem due to their size and diversity. In the second part of this study, we show that standard single-task crowdsourcing is sub-optimal and very expensive, as it ignores dependencies among various predicates and instances. We propose Relational Crowdsourcing (RelCrowd) to overcome this challenge, where the tasks are created while taking dependencies among predicates and instances into account. We apply this framework in the context of large-scale Knowledge Graph Evaluation (KGEval) and demonstrate its effectiveness through extensive experiments on real-world datasets.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G28302;
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.	en_US
dc.subject	Crowdsourcing	en_US
dc.subject	Relational Crowdsourcing (RelCrowd)	en_US
dc.subject	Knowledge Graph Evaluation	en_US
dc.subject	KGEval	en_US
dc.subject	Knowledge Graphs (KG)	en_US
dc.subject	KG-Evaluation	en_US
dc.subject	Evaluation Coupling Graph (ECG)	en_US
dc.subject.classification	Computer Science and Automation	en_US
dc.title	Utilizing Worker Groups And Task Dependencies in Crowdsourcing	en_US
dc.type	Thesis	en_US
dc.degree.name	MSc Engg	en_US
dc.degree.level	Masters	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US

Files in this item

Name:: G28302-Abs.pdf
Size:: 79.15Kb
Format:: PDF
Description:: Thesis-Abstract

Name:: G28302.pdf
Size:: 1.592Mb
Format:: PDF
Description:: Thesis-Full Text

This item appears in the following Collection(s)

Computer Science and Automation (CSA) [542]
