Research tasks on data collections are hampered, or even made impossible, by data quality issues that vary in type (such as incompleteness or inconsistency) and in severity. We identify research tasks carried out by professional users of data collections that are hampered by inherent quality issues. We investigate what types of issues exist and how they influence these research tasks. To measure the quality perceived by professional users, we develop a quality metric. This allows us to measure the suitability of the data quality for a chosen user task. For a chosen task, we study how the data quality can be improved using crowdsourcing. We validate our quality metric by investigating whether professionals perform better on the chosen research task.
ACM New York, NY, USA
COMMIT: Socially Enriched Access to Linked Cultural Media (P06)
Information Interaction in Context
Human-Centered Data Analytics

Traub, M. (2014). Measuring and improving data quality of media collections for professional tasks. In Proceedings of Information Interaction in Context 2014 (IIiX 2014). ACM New York, NY, USA.