We present our latest work on the CrowdTruth framework, titled “Human Computing for the Real World”, at the ICT Open 2017 conference on 21st and 22nd of March 2017. The demonstrative video shows the different aspects of the framework for dealing with ambiguity in data, crowdsourcing of human interpretations, and evaluating disagreement between annotations.
Our demo of ControCurator titled “ControCurator: Human-Machine Framework For Identifying Controversy” will be shown at ICT Open 2017. In this demo the ControCurator human-machine framework for identifying controversy in multimodal data is shown. The goal of ControCurator is to enable modern information access systems to discover and understand controversial topics and events by bringing together crowds and machines in a joint active learning workflow for the creation of adequate training data. This active learning workflow allows a user to identify and understand controversy in ongoing issues, regardless of whether there is existing knowledge on the topic.
On 7th of March the DIVE+ project will be presented at Cross Media Café: Uit het Lab. DIVE+ is result of a true inter-disciplinary collaboration between computer scientists, humanities scholars, cultural heritage professionals and interaction designers. In this project, we use the CrowdTruth methodology and framework in order to crowdsource events for the news broadcasts from The Netherlands Institute for Sound and Vision (NISV) that are published under open licenses in the OpenImages platform. As part of the digital humanities effort, DIVE+ is also integrated in the CLARIAH (Common Lab Research Infrastructure for the Arts and Humanities) research infrastructure, next to other media studies research tools, that aims at supporting the media studies researchers and scholars by providing access to digital data and tools. In order to develop this project we work together with eScience Center, which is also funding the DIVE+ project.
Our paper “Harnessing Diversity in Crowds and Machines for Better NER Performance” (Oana Inel and Lora Aroyo) has been accepted for the ESWC 2017 Research Track. The paper is to be published together with the proceedings of the conference.
Over the last years, information extraction tools have gained a great popularity and brought significant improvement in performance in extracting meaning from structured or unstructured data. For example, named entity recognition (NER) tools identify types such as people, organizations or places in text. However, despite their high F1 performance, NER tools are still prone to brittleness due to their highly specialized and constrained input and training data. Thus, each tool is able to extract only a subset of the named entities (NE) mentioned in a given text. In order to improve NE Coverage, we propose a hybrid approach, where we first aggregate the output of various NER tools and then validate and extend it through crowdsourcing. The results from our experiments show that this approach performs significantly better than the individual state-of-the-art tools (including existing tools that integrate individual outputs already). Furthermore, we show that the crowd is quite effective in (1) identifying mistakes, inconsistencies and ambiguities in currently used ground truth, as well as in (2) a promising approach to gather ground truth annotations for NER that capture a multitude of opinions.
Yesterday at the Computable Awards the Vrije Universiteit, University of Amsterdam and IBM won the prize for “ICT project of the year in education” with the Watson Innovation Course. Furthermore, the project was highest rated across all nominees of all prize categories. The course is ongoing at the moment for the second time, with an improved setup and new state of the art tools for the students.
The course is run by Lora Aroyo, Anca Dumitrache, Benjamin Timmermans and Oana Inel from the VU, and Robert-Jan Sips and Zoltan Szlavik from IBM. In the course the students were challenged by Amsterdam Marketing to solve the issue of the increasing overcrowdedness of tourists in the city center of Amsterdam. The city is culturally rich with many places to visit, yet most visitors cluster around a limited set of popular locations. The students came up with ideas to motivate visitors to spread in the city and provide them with relevant information for their visit.