Introducing Understanding Language by Machines
Spinoza Prize projects by Prof. dr. Piek Vossen
Spinoza Prize projects Understanding Language by Machines
The goal of the Spinoza Prize projects “Understanding Language by Machines” is to develop computer models that can assign deeper meaning to language, approximating human understanding, and to use these models to automatically read text and understand language in relation to the world. Current approaches to natural language understanding treat language as a closed world of relations between words. Words and texts are, however, highly ambiguous and vague, while at the same time there is large variation in the way we express similar things. People do not notice this ambiguity and variation when using language within their social communicative context. This project tries to get a better understanding of the scope and complexity of ambiguity and variation, and to model the social communicative contexts needed to resolve them.
The Reference Machine is an abstract framework: a conceptual machine that can map natural language to the extra-linguistic world as we perceive it and represent it in our brain. The Reference Machine models different aspects of understanding through novel computer programs. It brings together three key concepts, identity, reference and perspective, and studies them in relation to each other. In total, 6 PhDs and 3 PostDocs have researched various aspects of the problems described above.
The Reference Machine is not only a conceptual model that illustrates the complexity of the relations between identity, reference and perspective, but also a real machine in the physical world that reasons and communicates from its own perspective. This is embodied by our robot project. The Pepper robot Leolani shares the world with us but perceives it differently. In order to communicate adequately about the world, a robot needs to relate its perception of the world to the way we perceive it.
Three fundamental concepts overarch the research:
Identity: If only all things in the world were given and identified by a number. This is not how it works, however. There are infinitely many ways to define what is there, so how do we identify things?
Perspective: Perspective is the point of view of the source on the world. It can be spatial, temporal, emotional or motivational, and it determines how we make reference to things.
Five projects, each investigating a different aspect of assigning meaning:
Project 1 — Word Sense Disambiguation
Project 2 — Perception & Description of Images
Project 3 — Storylines & Perspectives
Project 4 — Context & Background Knowledge
Project 5 — Make Robots talk
One theory to bind them all (identity, reference and perspective):
Identity, reference and perspective are strongly interwoven. We identify what we refer to, and our perspective determines how we make reference. We therefore combine all three notions in the Theory of Identity, Reference and Perspective (TIRP). TIRP claims that our communication is optimised in relation to the properties of the direct context. So whereas ambiguity and variation are abundant in large collections of text data, they vanish or are resolved within specific contexts, given the shared knowledge and experience of the communication partners (Ilievski et al. 2016, Postma et al. 2016, Vossen et al. 2018). Read more at: http://www.understandinglanguagebymachines.org/tirp/
From the Blog
Language, Knowledge and People in Perspective: Finding knowledge and information is becoming an almost trivial task on the web. Understanding its value, the credibility and perspective of its source, on the …
Looking at the Long Tail of Language: The distribution of symbols in natural language and their meanings are no exception to Zipf’s law: a small number of observations are very frequent and …
Can Machines Understand Language? According to John Searle, this is fundamentally impossible. He used the Chinese Room thought-experiment to demonstrate that computers follow instructions to manipulate symbols without understanding …