Open Student Projects
Crowdsourcing English grammar ambiguity
Crowdsourcing, the practice of obtaining information by enlisting the services of a large number of people, is becoming a popular method for data collection in many fields, such as machine translation and natural language processing. But with a crowd of opinions also comes disagreement. In this project, we will look at how disagreement between crowd workers can be used to detect interesting properties of language, such as the varying degrees to which a sentence can sound “correct”. We will build a crowdsourcing task that asks the crowd how correct a translated sentence appears to them, and then examine the disagreement to find out which properties of English grammar are ambiguous.
Further reading: Lau, Jey Han, Alexander Clark, and Shalom Lappin. “Measuring Gradience in Speakers’ Grammaticality Judgements.” CogSci. 2014.
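One way to quantify the disagreement described above is to compute the normalized entropy of the ratings each sentence receives. The sketch below is only an illustration under assumed conditions: the 1-4 rating scale, the example sentences, and the function name `rating_entropy` are hypothetical and not part of the project specification.

```python
import math
from collections import Counter

def rating_entropy(ratings, num_levels):
    """Normalized Shannon entropy of one sentence's ratings.

    0.0 means all workers agree; 1.0 means the ratings are spread
    evenly over all `num_levels` possible values.
    """
    counts = Counter(ratings)
    total = len(ratings)
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    return entropy / math.log2(num_levels)

# Hypothetical crowd judgements: sentence -> ratings on an assumed 1-4 scale
judgements = {
    "The cat sat on the mat.": [4, 4, 4, 4],
    "The horse raced past the barn fell.": [1, 2, 3, 4],
    "Him went to store.": [1, 1, 1, 2],
}

scores = {s: rating_entropy(r, num_levels=4) for s, r in judgements.items()}

# Sentences with the highest disagreement come first
ranked = sorted(scores, key=scores.get, reverse=True)
for sentence in ranked:
    print(f"{scores[sentence]:.2f}  {sentence}")
```

Sentences near the top of such a ranking would be candidates for closer inspection; entropy is just one possible disagreement measure, and others may suit the task better.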
Visualizing the structure of controversial issues
There are many issues in the world that people do not agree on, for instance global warming, gun control, or even whether Trump should be president of the United States. Opposing opinions on such topics can lead to heated discussions, which is why these topics are called controversial. Understanding controversies from a temporal (when), textual (what), and social (who) angle is a necessary prerequisite for effective intervention. As part of the ControCurator project, we ask you to investigate how best to visualize the structure of controversial issues in a web-based tool. This should allow journalists and media professionals to gain a better understanding of such issues and to monitor ongoing events in real time.
Modeling events through their sentiment
Many recent events, actions, and issues have emerged on social media such as Twitter. People use Twitter as a medium to express their opinions about things that are happening at the moment. In this project we will work with and analyze crowdsourced data on the relevance, sentiment interpretation, and information novelty of tweets (and possibly news) with regard to different events.
The aim of the project is to perform an in-depth analysis of the crowdsourced data in comparison with machine-generated data (and possibly expert data), and to see how they complement or differ from each other. One possible analysis is a syntactic and semantic analysis of the event word cloud (nouns, adjectives). We are also interested in visualizing the event in a timeline: Are there new sub-events linked to the main event? Can we see the changes in sentiment as an event timeline?
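As a starting point for the sentiment timeline mentioned above, annotated tweets could be grouped by day and their sentiment averaged. This is a minimal sketch under assumptions: the timestamps, the sentiment scale of -1 to 1, and the function name `sentiment_timeline` are hypothetical placeholders for whatever the crowdsourced data actually contains.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical crowd annotations: (ISO timestamp, sentiment in [-1, 1])
annotations = [
    ("2017-01-01T09:00:00", 0.5),
    ("2017-01-01T21:30:00", -0.5),
    ("2017-01-02T08:15:00", -1.0),
    ("2017-01-02T12:00:00", -0.5),
    ("2017-01-03T10:00:00", 0.25),
]

def sentiment_timeline(rows):
    """Return (day, mean sentiment) pairs in chronological order."""
    by_day = defaultdict(list)
    for timestamp, score in rows:
        by_day[datetime.fromisoformat(timestamp).date()].append(score)
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]

timeline = sentiment_timeline(annotations)
for day, average in timeline:
    print(day.isoformat(), f"{average:+.2f}")
```

The same grouping could be run separately on crowd, machine, and expert labels to compare the resulting timelines side by side.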
Storytelling for news
News articles are rich sources of information that present events, facts, and their evolution over time. In this project we will define one or more crowdsourcing tasks to gather information about how events and sub-events are linked within a news article or across news articles. We want to see whether linking events through cause-effect or temporal relations lets us draw conclusions about the relevance or importance of events.
Relevant paper: http://aclweb.org/anthology/W/W16/W16-5708.pdf
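To illustrate how linked events might hint at importance, the sketch below counts how many cause-effect or temporal links each event takes part in. Everything here is an assumption for illustration: the example events, the relation labels, and the idea of using link counts as a proxy for importance are not taken from the project description or the paper.

```python
from collections import Counter

# Hypothetical crowd-annotated links: (source event, relation, target event)
links = [
    ("earthquake", "causes", "tsunami"),
    ("tsunami", "causes", "flooding"),
    ("tsunami", "causes", "evacuation"),
    ("earthquake", "before", "evacuation"),
]

def link_counts(event_links):
    """Count how many links each event participates in, as source or target."""
    counts = Counter()
    for source, _relation, target in event_links:
        counts[source] += 1
        counts[target] += 1
    return counts

counts = link_counts(links)
most_connected = counts.most_common(1)[0]
print(most_connected)
```

A simple count like this treats all relations equally; weighting cause-effect links differently from temporal ones would be one direction to explore.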