Open Student Projects

Crowdsourcing English grammar ambiguity

Crowdsourcing, the practice of obtaining information by enlisting the services of a large number of people, is becoming a popular method for data collection in many fields, such as machine translation and natural language processing. But with a crowd of opinions also comes disagreement. In this project, we will look at how to use disagreement between crowd workers to detect interesting properties of language, such as the varying degrees to which a sentence can sound “correct”. We will build a crowdsourcing task that asks the crowd how correct a translated sentence appears to them, and then examine the disagreement to find out which properties of English grammar are ambiguous.
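As a rough illustration of what "looking at the disagreement" could mean in practice, the sketch below scores each sentence by the Shannon entropy of its ratings. This is only one possible measure and not necessarily the one the project will use; the 1-4 rating scale, the example sentences and the function name `rating_entropy` are assumptions made for the example.

```python
from collections import Counter
from math import log2


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the ratings given to one sentence.

    0.0 means all workers agreed; higher values mean more disagreement.
    """
    counts = Counter(ratings)
    total = len(ratings)
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical judgements: five workers rate each sentence on a 1-4 scale.
judgements = {
    "The cat sat on the mat.": [4, 4, 4, 4, 4],
    "The horse raced past the barn fell.": [1, 4, 2, 4, 1],
}

# Rank sentences from most to least disagreement.
ranked = sorted(judgements, key=lambda s: rating_entropy(judgements[s]), reverse=True)
for sentence in ranked:
    print(f"{rating_entropy(judgements[sentence]):.2f}  {sentence}")
```

Sentences at the top of this ranking are the ones the crowd disagrees on most, and so are candidates for closer inspection.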

Further reading: Lau, Jey Han, Alexander Clark, and Shalom Lappin. “Measuring Gradience in Speakers’ Grammaticality Judgements.” CogSci. 2014.

Contact: Anca Dumitrache and/or Lora Aroyo

Visualizing the structure of controversial issues

There are many issues in the world that people do not agree on, for instance global warming, gun control, or even whether Trump should be president of the United States. Opposing opinions on such topics can lead to heated discussions, which is why these topics are considered controversial. Understanding controversies from a temporal (when), textual (what), and social (who) angle is a necessary prerequisite for effective intervention. As part of the ControCurator project [1] we ask you to investigate how best to visualize the structure of controversial issues in a web-based tool. This should allow journalists and media professionals to gain a better understanding of such issues and to monitor ongoing events in real time.


Contact: Benjamin Timmermans and/or Lora Aroyo

Modeling events through their sentiment

Many events, actions and issues now emerge on social media platforms such as Twitter. People use Twitter to express their opinions about things happening at the moment. In this project we will work with and analyze crowdsourced data on the relevance, sentiment interpretation and information novelty of tweets (and possibly news) with regard to different events.
The aim of the project is to perform an in-depth analysis of the crowdsourced data, in comparison with machine-generated data (and possibly expert data), and to see how they complement or differ from each other. One example of such an analysis is a syntactic and semantic analysis of the event word cloud (nouns, adjectives). We are also interested in visualizing the event on a timeline: Are there new sub-events linked to the main event? Can we see the changes in sentiment along the event timeline?
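One simple way to approach the sentiment timeline question is to average the sentiment scores per day. The sketch below is only an illustration: the records, the score range of -1 to 1 and the function name `sentiment_timeline` are hypothetical, and the real crowdsourced data may be structured quite differently.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records: (ISO timestamp, crowd sentiment score in [-1, 1]).
tweets = [
    ("2016-11-08T09:15:00", 0.6),
    ("2016-11-08T21:40:00", -0.2),
    ("2016-11-09T08:05:00", -0.8),
    ("2016-11-09T13:30:00", -0.4),
]


def sentiment_timeline(records):
    """Return (day, mean sentiment) pairs in chronological order."""
    by_day = defaultdict(list)
    for timestamp, score in records:
        day = datetime.fromisoformat(timestamp).date()
        by_day[day].append(score)
    return [(day.isoformat(), mean(scores)) for day, scores in sorted(by_day.items())]


timeline = sentiment_timeline(tweets)
for day, score in timeline:
    print(day, round(score, 2))
```

The same grouping could be done per hour or per sub-event, and the crowd-based timeline could be set beside a machine-generated one to see where the two diverge.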

Contact: Oana Inel and/or Lora Aroyo

Storytelling for news

News articles are rich sources of information that present events, facts and their evolution over time. In this project we will define a (series of) crowdsourcing task(s) to gather information about how events and sub-events are linked within a news article or across news articles. We want to see whether linking events through cause-effect or temporal relations lets us draw conclusions about the relevance or importance of events.

Relevant paper:

Contact: Oana Inel and/or Lora Aroyo

Running Student Projects

Past Student Projects