Our demo of ControCurator, titled “ControCurator: Human-Machine Framework For Identifying Controversy”, will be shown at ICT Open 2017. The demo presents the ControCurator human-machine framework for identifying controversy in multimodal data. The goal of ControCurator is to enable modern information access systems to discover and understand controversial topics and events by bringing together crowds and machines in a joint active learning workflow for the creation of adequate training data. This active learning workflow allows a user to identify and understand controversy in ongoing issues, regardless of whether there is existing knowledge on the topic.
On the 7th of March the DIVE+ project will be presented at Cross Media Café: Uit het Lab. DIVE+ is the result of a true interdisciplinary collaboration between computer scientists, humanities scholars, cultural heritage professionals and interaction designers. In this project, we use the CrowdTruth methodology and framework to crowdsource events for the news broadcasts from The Netherlands Institute for Sound and Vision (NISV) that are published under open licenses in the OpenImages platform. As part of the digital humanities effort, DIVE+ is also integrated, alongside other media studies research tools, in the CLARIAH (Common Lab Research Infrastructure for the Arts and Humanities) research infrastructure, which aims to support media studies researchers and scholars by providing access to digital data and tools. To develop this project we work together with the eScience Center, which also funds the DIVE+ project.
Our paper “Harnessing Diversity in Crowds and Machines for Better NER Performance” (Oana Inel and Lora Aroyo) has been accepted for the ESWC 2017 Research Track. The paper is to be published together with the proceedings of the conference.
Over the last few years, information extraction tools have gained great popularity and brought significant improvements in extracting meaning from structured or unstructured data. For example, named entity recognition (NER) tools identify types such as people, organizations or places in text. However, despite their high F1 performance, NER tools are still prone to brittleness due to their highly specialized and constrained input and training data. Thus, each tool is able to extract only a subset of the named entities (NE) mentioned in a given text. In order to improve NE coverage, we propose a hybrid approach, where we first aggregate the output of various NER tools and then validate and extend it through crowdsourcing. The results from our experiments show that this approach performs significantly better than the individual state-of-the-art tools (including existing tools that already integrate individual outputs). Furthermore, we show that the crowd is quite effective in identifying mistakes, inconsistencies and ambiguities in currently used ground truth, and that crowdsourcing is a promising approach to gathering ground truth annotations for NER that capture a multitude of opinions.
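The two-step idea described above (aggregate the output of several NER tools, then let the crowd validate and extend it) can be illustrated with a minimal sketch. This is not the implementation from the paper: the tool names, the example entities, the binary votes and the simple vote-share threshold are all assumptions made for illustration, and the paper's own crowd aggregation may differ.

```python
from collections import defaultdict


def aggregate_tool_outputs(tool_outputs):
    """Union the entities found by all tools, recording which tools found each."""
    candidates = defaultdict(set)
    for tool, entities in tool_outputs.items():
        for entity in entities:
            candidates[entity].add(tool)
    return dict(candidates)


def crowd_validate(candidates, crowd_votes, threshold=0.5):
    """Keep entities whose share of positive crowd votes meets the threshold.

    Entities that appear only in crowd_votes are crowd additions, so the
    crowd can extend the tool output as well as validate it.
    The vote-share score is an assumed, simplified measure.
    """
    accepted = {}
    for entity in set(candidates) | set(crowd_votes):
        votes = crowd_votes.get(entity, [])
        if not votes:
            continue  # no crowd judgment available, leave undecided
        score = sum(votes) / len(votes)
        if score >= threshold:
            accepted[entity] = score
    return accepted


# Hypothetical output of two NER tools on one sentence.
tool_outputs = {
    "tool_a": {"Lora Aroyo", "Amsterdam"},
    "tool_b": {"Lora Aroyo", "Vrije"},  # partial span, a typical tool error
}

# Hypothetical crowd judgments: 1 = valid entity, 0 = not valid.
crowd_votes = {
    "Lora Aroyo": [1, 1, 1],
    "Amsterdam": [1, 1, 0],
    "Vrije": [0, 0, 1],
    "Vrije Universiteit": [1, 1, 1],  # added by the crowd
}

candidates = aggregate_tool_outputs(tool_outputs)
accepted = crowd_validate(candidates, crowd_votes)
print(sorted(accepted))  # ['Amsterdam', 'Lora Aroyo', 'Vrije Universiteit']
```

In this toy example the union covers more entities than either tool alone, the crowd rejects the partial span "Vrije", and it adds the full span "Vrije Universiteit" that neither tool produced.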
Yesterday at the Computable Awards the Vrije Universiteit, University of Amsterdam and IBM won the prize for “ICT project of the year in education” with the Watson Innovation Course. Furthermore, the project was the highest rated across all nominees of all prize categories. The course is currently running for the second time, with an improved setup and new state-of-the-art tools for the students.
The course is run by Lora Aroyo, Anca Dumitrache, Benjamin Timmermans and Oana Inel from the VU, and Robert-Jan Sips and Zoltan Szlavik from IBM. In the course the students were challenged by Amsterdam Marketing to solve the issue of increasing overcrowding by tourists in the city center of Amsterdam. The city is culturally rich with many places to visit, yet most visitors cluster around a limited set of popular locations. The students came up with ideas to motivate visitors to spread out across the city and to provide them with relevant information for their visit.
Although the mortgage application process and the regulations surrounding it are clearly mapped, institutionalised and supported by automated systems, the process of orientation in the housing market is not. When looking for an appropriate place to live or open a business, clients of the bank are confronted with questions and considerations such as the location, image, facilities and safety of the neighbourhood, energy labels, average price levels, future development plans, etc., surrounding one of the largest decisions of their life: the purchase of a house. At present, it is unclear to the bank which steps customers take in the orientation process and with which extra services, answers or information it could support them.
Within this 3-month project, IBM, VU and UvA will join forces and use the IBM Open Innovation approach to come up with a new data-driven service concept for one of the leading banks in the Netherlands. The team will consist of IBM Consultants (design / business strategy), an IBM programmer, 2 IBM researchers and researchers from VU Amsterdam and UvA.
To strengthen our team we are looking for 3 students for a 3-month internship at IBM, potentially followed by an MSc thesis project that deepens or continues their work, in the following disciplines:
(1) Business/Service Innovation: Students with an entrepreneurial / service innovation background, looking to gain experience in the development of a real-life business case. The student should ideally have experience with focus groups and qualitative interviews, to help gain initial insights into the house orientation process.
(2) Crowdsourcing: Using crowdsourcing and the social web to get a clearer picture of the demands, questions and uncertainties surrounding the purchase of a house. This project will complement the work done by student (1) with quantitative results and a larger scope.
(3) Open Data / Information Retrieval: Finding (open) datasets and retrieving data sources that could provide insights into the questions identified by the work of students (1) and (2).
If you are interested in an internship with IBM / ABNAMRO within the context of this project, please contact Lora Aroyo via firstname.lastname@example.org with a short motivation explaining why you would like to work on it, your CV and your availability in the coming 3-4 months.