Crowdsourcing Inclusivity with CrowdTruth Tutorial @ WebConf 2019

Dealing with diversity of opinions, perspectives and ambiguity in annotated data

The second edition of the CrowdTruth tutorial will be held during the Web Conference 2019, on Monday, May 13th 2019, at the Hyatt Regency in San Francisco, California, USA. Follow updates on Twitter and Facebook via #CrowdTruth, #WebConf2019 and @thewebconf.

In this tutorial, we introduce the CrowdTruth methodology for crowdsourcing ground truth by harnessing and interpreting inter-annotator disagreement. The central characteristic of CrowdTruth is that it uses the diversity in human interpretation to capture the wide range of opinions and perspectives, and thus provides more reliable and realistic annotated data for training and evaluating machine learning components. Unlike other methods, we do not discard dissenting votes, but incorporate them into a richer and more continuous representation of truth. Creating this more complex notion of truth contributes directly to the larger discussion on how to make the Web more reliable, diverse and inclusive.
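To make the contrast between discarding and keeping dissenting votes concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the function names, labels and votes are hypothetical, and the per-label fraction shown is not the CrowdTruth metrics, which are covered in the tutorial itself.

```python
from collections import Counter


def majority_vote(votes):
    """Collapse all votes into the single most frequent label.

    Dissenting votes are discarded; ties go to the label seen first.
    """
    return Counter(votes).most_common(1)[0][0]


def label_scores(votes):
    """Keep every vote: return the fraction of annotators per label.

    This is a simplified stand-in for a continuous representation of
    truth, not the actual CrowdTruth quality metrics.
    """
    counts = Counter(votes)
    total = len(votes)
    return {label: count / total for label, count in counts.items()}


# Hypothetical votes from five annotators on one ambiguous sentence pair.
votes = ["entails", "entails", "neutral", "entails", "neutral"]

print(majority_vote(votes))  # a single label; the two "neutral" votes vanish
print(label_scores(votes))   # one score per label; the disagreement is kept
```

With majority voting, the example looks as clear-cut as a unanimous one. The per-label scores show that two of the five annotators read it differently, which is the kind of signal the methodology treats as information rather than noise.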

The goal of this tutorial is to introduce the methodology and provide guided exercises on how to apply it in specific cases. As dealing with disagreement and diversity in crowdsourcing is becoming increasingly popular, this tutorial provides a timely solution. All the materials of the tutorial will be publicly available. We will provide slides, handouts and Python notebooks (both Jupyter and Colab). Understanding of and skills in Python can be helpful for getting the most out of this tutorial. However, we envision work in small groups to allow people with various prior experiences to group together.

The CrowdTruth methodology and framework are widely used for crowdsourcing and have been adopted by industrial partners and public organizations, e.g. Google, IBM, New York Times, The Cleveland Clinic, Crowdynews, The Netherlands Institute for Sound and Vision, Rijksmuseum, in a multitude of domains, e.g. AI, news, medicine, social media, cultural heritage, social sciences. The data from our experiments, an extensive list of papers and more can be found on the CrowdTruth website.

The first edition of the CrowdTruth tutorial was held during the ISWC 2018 conference.

Tutorial Schedule

Content Time
Introduction [slides] 14:00 – 14:35
CrowdTruth Metrics [slides] 14:35 – 15:00
Hands-on: Video annotation [Colab] [Github] 15:00 – 15:30
Coffee Break 15:30 – 16:00
Hands-on: Textual entailment [Colab] [Github] 16:00 – 16:30
CrowdTruth Task Design [slides] 16:30 – 17:00
Wrap-up & Discussion 17:00 – 17:30

Tutorial Organizers

Lora Aroyo

Anca Dumitrache

Oana Inel

Chris Welty

Resources

CrowdTruth Papers