Oana Inel has received the IBM Ph.D. Fellowship for the second time. Her research focuses on enriching data with events and event-related entities, combining computing power with the potential of the crowd to identify their relevant dimension, granularity and perspective. She carries out her research and experiments in the context of the CrowdTruth project, a collaboration with the IBM Benelux Centre for Advanced Studies.
We are happy to announce that our project exploring relation extraction from natural language has had two extended abstracts accepted at the Collective Intelligence conference this summer! Here are the papers:
- Crowdsourcing Ambiguity-Aware Ground Truth: we apply the CrowdTruth methodology to collect data over a set of diverse tasks: medical relation extraction, Twitter event identification, news event extraction and sound interpretation. We prove that capturing disagreement is essential for acquiring a high-quality ground truth. We achieve this by comparing the quality of the data aggregated with CrowdTruth metrics against that of data aggregated with majority vote, a method that enforces consensus among annotators. By applying our analysis over a set of diverse tasks, we show that, even though ambiguity manifests differently depending on the task, our theory of inter-annotator disagreement as a property of ambiguity is generalizable.
- Disagreement in Crowdsourcing and Active Learning for Better Distant Supervision Quality: we present ongoing work on combining active learning with the CrowdTruth methodology to further improve the quality of distant supervision (DS) training data. We report the results of a crowdsourcing experiment run on 2,500 sentences from the open domain. We show that modeling disagreement can be used to identify interesting types of errors caused by ambiguity in the TAC-KBP knowledge base, and we discuss how an active learning approach can incorporate these observations to use the crowd more efficiently.
We present our latest work on the CrowdTruth framework, titled “Human Computing for the Real World”, at the ICT Open 2017 conference on the 21st and 22nd of March 2017. I made a new video that demonstrates how the framework deals with ambiguity in data, crowdsources human interpretations, and evaluates disagreement between annotations.
We present our latest work on the CrowdTruth framework, titled “Human Computing for the Real World”, at the ICT Open 2017 conference on the 21st and 22nd of March 2017. The demonstration video shows how the framework deals with ambiguity in data, crowdsources human interpretations, and evaluates disagreement between annotations.
Our demo, titled “ControCurator: Human-Machine Framework For Identifying Controversy”, will be shown at ICT Open 2017. It presents the ControCurator human-machine framework for identifying controversy in multimodal data. The goal of ControCurator is to enable modern information access systems to discover and understand controversial topics and events by bringing together crowds and machines in a joint active learning workflow that creates adequate training data. This workflow allows a user to identify and understand controversy in ongoing issues, regardless of whether there is existing knowledge on the topic.