Open access
Date
2012-09
Type
Report
ETH Bibliography
yes
Abstract
Several computational tasks benefit from human help. One such task is entity resolution: given two customer profiles, for example, human experts can help decide whether they describe the same customer. Since crowdsourcing is expensive, the goal is to ask as few questions as possible. At the same time, high-quality results can only be achieved if several experts are asked for their opinion and for confirmation. This paper shows how to address this cost/quality trade-off and how to tolerate and resolve errors from the crowd. Specifically, it shows how to exploit mathematical properties such as symmetry, transitivity, and anti-transitivity of the is-same-entity-as relation to improve both cost and quality. The results of extensive experiments provide surprising insights into how best to crowdsource for entity resolution and other classification problems.
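The way these relation properties can save crowd questions may be illustrated with a minimal sketch. It is not the report's algorithm: the class `EntityRelation` and its methods are hypothetical names. It assumes that anti-transitivity means "if a is the same as b and b differs from c, then a differs from c", and it assumes error-free answers, so it does not cover the error tolerance the abstract mentions.

```python
class EntityRelation:
    """Stores crowd answers and infers further same/different facts.

    Hypothetical illustration only; assumes every recorded answer is correct.
    """

    def __init__(self):
        self.parent = {}     # union-find parent pointers (clusters of same entities)
        self.different = {}  # cluster root -> set of roots known to be different

    def find(self, x):
        """Return the representative (root) of x's cluster."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def infer(self, a, b):
        """Return True (same), False (different), or None (the crowd must be asked)."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True   # follows from symmetry and transitivity
        if rb in self.different.get(ra, set()):
            return False  # follows from anti-transitivity
        return None

    def record(self, a, b, same):
        """Store one crowd answer for the pair (a, b)."""
        ra, rb = self.find(a), self.find(b)
        if same:
            if ra == rb:
                return
            if rb in self.different.get(ra, set()):
                raise ValueError("answer contradicts earlier answers")
            self.parent[rb] = ra  # merge cluster rb into cluster ra
            for r in self.different.pop(rb, set()):
                # Re-point "different" edges from the old root to the new root.
                self.different[r].discard(rb)
                self.different[r].add(ra)
                self.different.setdefault(ra, set()).add(r)
        else:
            if ra == rb:
                raise ValueError("answer contradicts earlier answers")
            self.different.setdefault(ra, set()).add(rb)
            self.different.setdefault(rb, set()).add(ra)


rel = EntityRelation()
rel.record("A", "B", True)    # crowd: A and B are the same entity
rel.record("B", "C", False)   # crowd: B and C are different entities
print(rel.infer("A", "C"))    # inferred without asking the crowd
print(rel.infer("A", "D"))    # unknown, so a question would be needed
```

In this sketch, a question is sent to the crowd only when `infer` returns `None`; every other pair is answered from earlier responses.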
Permanent link
https://doi.org/10.3929/ethz-a-009761323
Publication status
published
Journal / series
Technical Report / ETH Zurich, Department of Computer Science
Volume
Publisher
ETH, Department of Computer Science, Systems Group
Subject
DATABASE MANAGEMENT + DATABASE ADMINISTRATION (INFORMATION SYSTEMS); INFORMATION MANAGEMENT (MANAGEMENT OF COMPUTER SYSTEMS); SPECIAL PROGRAMMING METHODS
Organisational unit
03689 - Kossmann, Donald (former)
02150 - Dep. Informatik / Dep. of Computer Science