Amsterdam Data Science – Coffee & Data: Controversy in Web Data

On the 9th of June we are organising a Coffee & Data event with the Amsterdam Data Science community. The topic is “How to deal with controversy, bias, quality and opinions on the Web”, and the event is organised in the context of the COMMIT ControCurator project. In this project, VU and UvA computer scientists and humanities researchers jointly investigate the computational modeling of controversial issues on the Web, and explore its application in real use cases within existing organisational pipelines, e.g. at Crowdynews and the Netherlands Institute for Sound and Vision.

The agenda is as follows:

09:00 – 09:10: Coffee and introduction by the chair, Lora Aroyo, Full Professor at the Web & Media group (VU, Computer Science)

09:10 – 09:25: Gerben van Eerten – Crowdynews deploying ControCurator

09:25 – 09:40: Kaspar Beelen – Detecting Controversies in Online News Media (UvA, Faculty of Humanities)

09:40 – 09:50: Benjamin Timmermans – Understanding Controversy Using Collective Intelligence (VU, Computer Science)

09:50 – 10:00: Davide Ceolin – (VU, Computer Science)

10:00 – 10:15: Damian Trilling – (UvA, Faculty of Social and Behavioural Sciences)

10:15 – 10:30: Daan Odijk – (Blendle)

10:30 – 10:45: Andy Tanenbaum – “Unskewed polls” in 2012

10:45 – 11:00: Q&A and coffee

The event takes place at the Kerkzaal (HG-16A00) on the top floor of the VU Amsterdam main building.

IBM Ph.D. Fellowship 2017-2018

Oana Inel has received the IBM Ph.D. Fellowship for the second time. Her research focuses on data enrichment with events and event-related entities, combining the power of machines with the potential of the crowd to identify their relevant dimension, granularity and perspective. She carries out her research and experiments in the context of the CrowdTruth project, a collaboration with the IBM Benelux Centre for Advanced Studies.

Relation Extraction at Collective Intelligence 2017

We are happy to announce that our project exploring relation extraction from natural language has had two extended abstracts accepted at the Collective Intelligence conference this summer! Here are the papers:

  • Crowdsourcing Ambiguity-Aware Ground Truth: we apply the CrowdTruth methodology to collect data over a set of diverse tasks: medical relation extraction, Twitter event identification, news event extraction and sound interpretation. We show that capturing disagreement is essential for acquiring a high-quality ground truth. We achieve this by comparing the quality of data aggregated with the CrowdTruth metrics against majority vote, a method that enforces consensus among annotators (a minimal sketch of this comparison appears after this list). By applying our analysis over a set of diverse tasks we show that, even though ambiguity manifests differently depending on the task, our theory of inter-annotator disagreement as a property of ambiguity is generalizable.
  • Disagreement in Crowdsourcing and Active Learning for Better Distant Supervision Quality: we present ongoing work on combining active learning with the CrowdTruth methodology to further improve the quality of distant supervision (DS) training data. We report the results of a crowdsourcing experiment run on 2,500 sentences from the open domain. We show that modeling disagreement can be used to identify interesting types of errors caused by ambiguity in the TAC-KBP knowledge base, and we discuss how an active learning approach can incorporate these observations to use the crowd more efficiently (a second sketch after the list illustrates this selection step).
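To make the comparison in the first abstract concrete, here is a minimal, hypothetical sketch contrasting majority-vote aggregation with a simple disagreement-aware score in the spirit of the CrowdTruth metrics. The sentences, relation labels and scoring function below are invented for illustration and are not the project's actual data or implementation.

```python
# Hypothetical example data: each unit (sentence) maps to the relation labels
# chosen by its annotators. Not the project's real data.
from collections import Counter

annotations = {
    "sent-1": ["treats", "treats", "treats", "causes", "treats"],
    "sent-2": ["treats", "causes", "prevents", "causes", "none"],
}

def majority_vote(labels):
    """Pick the single most frequent label, discarding all disagreement."""
    return Counter(labels).most_common(1)[0][0]

def label_scores(labels):
    """Score each label by the fraction of annotators who chose it,
    keeping the disagreement visible instead of collapsing it."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}

for unit, labels in annotations.items():
    print(unit, "| majority:", majority_vote(labels), "| scores:", label_scores(labels))
# sent-1 is a clear unit (treats scores 0.8); sent-2 is ambiguous, and majority vote
# would hide the fact that the annotators genuinely disagree about its relation.
```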
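The second abstract's combination of disagreement and active learning can be sketched in a similarly hedged way: rank distantly supervised sentences by how much the crowd disagrees on them, and send the most ambiguous ones for annotation first. The judgments, relation labels and selection budget below are assumptions made for illustration, not the experiment's actual setup.

```python
# Hypothetical sketch: prioritise the most ambiguous distantly supervised sentences
# for the next crowdsourcing round. Data and labels are invented for illustration.
from collections import Counter

judgments = {
    "sent-A": ["employee_of"] * 9 + ["member_of"],              # near-unanimous
    "sent-B": ["employee_of", "member_of", "founded_by",
               "member_of", "none", "employee_of"],             # highly ambiguous
    "sent-C": ["none"] * 7 + ["employee_of"] * 3,               # moderate disagreement
}

def disagreement(labels):
    """1 minus the relative frequency of the most popular label; 0 means full consensus."""
    counts = Counter(labels)
    return 1.0 - counts.most_common(1)[0][1] / len(labels)

def next_batch(judgments, budget):
    """Select the `budget` most ambiguous sentences as the next active-learning batch."""
    ranked = sorted(judgments, key=lambda s: disagreement(judgments[s]), reverse=True)
    return ranked[:budget]

print(next_batch(judgments, budget=2))  # ['sent-B', 'sent-C']
```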