About

Understanding of Language by Machines – an escape from the world of language is the umbrella title of four NWO Spinoza Prize 2013 projects led by Prof. dr. Piek Vossen: SPI 30-673 (2014-2019).

NWO-Spinoza laureates during the announcement of the NWO-Spinoza Prizes 2013 on Monday, 10 June

NWO awards the Spinoza Prizes each year to researchers employed in the Netherlands who according to international standards belong to the absolute top of science. The NWO Spinoza laureates perform outstanding and groundbreaking research that attracts widespread interest and are a source of inspiration to young researchers. The NWO Spinoza Prize has been awarded since 1995. The awards are made on the basis of nominations. An international committee evaluates the nominated candidates.

Goal
The goal of the Spinoza project “Understanding of language by machines” (ULM) is to develop computer models that can assign deeper meaning to language that approximates human understanding and to use these models to automatically read and understand text. Current approaches to natural language understanding consider language as a closed world of relations between words. Words and text are, however, highly ambiguous and vague. People do not notice this ambiguity when using language within their social communicative context. This project tries to get a better understanding of the scope and complexity of this ambiguity and of how to model the social communicative contexts to help resolve it.

The project is divided into four subprojects (ULM-1, ULM-2, ULM-3 and ULM-4), each investigating a different aspect of assigning meaning:

ULM-1: The borders of ambiguity: ULM-1 will explore the closed world of language as a system of word relations. The goal is to more properly define the problem and find the optimal solution given the vast volumes of textual data that are available. This project starts from the results obtained in the DutchSemCor project.

ULM-2: Word, concept and the perception of images and sounds: ULM-2 will cross the borders of language and relate words and their meanings to perceptual data.

ULM-3: Storylines and perspectives: ULM-3 will consider the interpretation of text built up from words as a function of our ways of interacting with the changing world around us. We interpret changes from our world-views on the here and now and the future. Furthermore, we structure these changes as stories along explanatory motivations. This project builds on the results of the European project NewsReader.

ULM-4: A quantum model of text understanding: ULM-4 is a technical project that investigates a new model of natural language processing. Current approaches are based on a pipeline architecture, in which the complete problem is divided into a series of smaller isolated tasks, e.g. tokenization, part-of-speech tagging, lemmatisation, syntactic parsing, recognition of entities and detection of word meanings. In this new model, none of these tasks is decisive and the final interpretation is left to higher-order semantic and contextual models. This project also builds on the findings of previous European (KYOTO), ongoing European (OpeNER and NewsReader) and national (BiographyNet) projects carried out at the VU University Amsterdam. The goal is to develop a new model of natural language processing in which text is interpreted in a combined top-down and bottom-up process.
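
The contrast between a pipeline and a model in which no single task is decisive can be illustrated with a toy sketch. This is not the ULM-4 model itself: the task names, candidate labels, scores and the `CONTEXT_FIT` table below are all invented for illustration, and the "contextual model" is just a lookup table standing in for a higher-order semantic model.

```python
from itertools import product

# Hypothetical local candidates for the word "bank"; all scores are invented.
LOCAL = {
    "pos": {"noun": 0.7, "verb": 0.3},
    "sense": {"river_side": 0.6, "financial_institution": 0.4},
}

# Toy stand-in for a contextual model: how well a sense fits the text's topic.
CONTEXT_FIT = {
    ("finance", "financial_institution"): 0.9,
    ("finance", "river_side"): 0.1,
    ("nature", "financial_institution"): 0.2,
    ("nature", "river_side"): 0.8,
}


def pipeline(local):
    """Each task commits to its locally best label; context is never consulted."""
    return {task: max(options, key=options.get) for task, options in local.items()}


def deferred(local, topic):
    """Keep every alternative and let the context model choose the joint best."""
    tasks = list(local)
    best, best_score = None, float("-inf")
    # Enumerate every combination of candidate labels across tasks.
    for combo in product(*(local[task].items() for task in tasks)):
        analysis = {task: label for task, (label, _) in zip(tasks, combo)}
        score = CONTEXT_FIT[(topic, analysis["sense"])]
        for _, local_score in combo:
            score *= local_score
        if score > best_score:
            best, best_score = analysis, score
    return best


print(pipeline(LOCAL))
print(deferred(LOCAL, "finance"))
```

In this sketch the pipeline always selects the locally preferred sense, while the deferred variant can overturn it when the context favours another reading. Enumerating all combinations is only feasible for a toy case; a real system would need a more efficient search.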

Piek Vossen sketching the ‘Reference Machine’ – An Escape from the World of Language, July 30, 2014