SLT-1: Semantics of the Long Tail

Descending into the Long Tail: from abundant to sparse details in the semantic processing of language

Setting the stage

Natural Language Processing relies heavily on data for supervised and unsupervised learning. The frequency of forms and relations plays a major role both in the derived models and in the evaluation of these models. But event and entity instances in the world have no frequency; they just exist for some time. Frequency in data comes from our communication about these instances. Due to our biased interest, the expressions we use to refer to events and entities have a strong frequency profile, following a roughly Zipfian distribution (Zipf, 1949), with a small number of very frequent observations and a very long tail of less frequent observations. Since our NLP datasets sample texts but do not sample the world, they are no exception to Zipf’s law. Thus, the salient interpretations are very prominent in our test data, which leads NLP methods to exploit this redundancy by ignoring hard cases and relying only on straightforward ones. The same is true for large commonsense knowledge repositories that encode explicit semantics, such as DBpedia, Wikidata, Freebase, BabelNet, and WordNet. Research and practice have shown these knowledge bases to be of immense value in NLP (IBM Watson; Moro et al., 2014, inter alia), but they similarly tend to focus on the most “popular” classes, instances, and relations between them.

This bias towards the head favors models of language that rely on statistically obvious cases that do not require very deep understanding or reasoning. Typically, their performance is high when the test cases match the most frequent cases, and very low when they belong to the long tail (Postma et al., 2016). Interestingly enough, humans do not suffer from overfitting in the same way as machines do: they handle long tail phenomena perfectly well. Little attention has been devoted to how systems should solve interpretation tasks for local and perhaps unknown event and entity instances, which are described in sparse documents but might not be observed in any training set or knowledge base. This may require new representations and deeper processing than those that work well on the head, including reading between the lines, e.g. through textual entailment and robust (common sense) reasoning. How can systems gather the necessary knowledge to correctly interpret low-frequency long tail entities and events? How can systems establish identity and reference across text sources that are poorly represented in, or absent from, knowledge resources? How should systems exploit popular data to correctly interpret low-frequency data?

Various concrete aspects of the long tail have become of interest in recent years. Vagueness and ambiguity, while long recognized as features of natural language, are long tail phenomena for which we lack meaningful data for training and evaluation. Time, location, and other modalities, as well as overly strict semantic targets, introduce issues of granularity into the NLP problem that are typically ignored in evaluation. Many researchers believe relational semantics to be important, yet the amount of data for the less frequent relations is staggeringly small. Notably, the data scarcity aspect of the long tail has been addressed for several NLP tasks, such as relation detection, entity typing, document filtering, enrichment, discovery of emerging entities, and open information extraction.

Aim of the workshop

Through this workshop, we want to gather NLP and Knowledge Representation researchers to share ideas about the long tail for the semantic processing of text with a special focus on the task of disambiguation and reference. We aim at a critical discussion of the complexity and relevance of long tail phenomena. We hope to find an incentive for the community to consider the long tail as a first-class citizen in NLP tasks. We hope to encourage new approaches, which would ideally be able to deal with knowledge and data sparseness, as well as contextual ambiguity with respect to aspects of time, location, and topic.

We have been engaged in several complementary efforts to make the community aware of the relevance and complexity of this problem. In June 2016, we organized the workshop Looking at the Long Tail at Vrije Universiteit Amsterdam, which brought together experts from various fields: NLP, Information Retrieval, Machine Learning, Knowledge Representation and Reasoning. In addition, we have quantified the semantic overfitting towards the head in disambiguation and reference datasets (Ilievski et al., 2016). Based on these observations, we proposed an approach to move away from overfitting to the head towards interpretation of long tail meanings (Postma et al., 2016). We are currently organizing a SemEval-2018 task based on this proposal; for more information, please check the task website.

Long tail phenomena are novel and very challenging AI- and NLP-wide problems that should be the focus of a global audience interested in semantic NLP. We consider the IJCNLP workshop a valuable venue for this purpose.


We want to gain insights into how to address the semantic long tail in NLP systems, with the eventual goal of extracting detailed knowledge on event and entity instances from unstructured text. The following topics can be used as a guide for submissions.

We believe that these topics are crucial to improve the state-of-the-art in NLP with respect to long tail phenomena, which in turn should have a major impact on overall language understanding. We are interested in systems that reveal interesting insights for addressing long tail aspects, even if their overall performance is lower than the state-of-the-art.




The SLT-1 workshop is co-located with the IJCNLP conference, which in 2017 will be held from November 27th until December 1st in Taipei, Taiwan.


Call for papers

Important Dates

Workshop Website and First Call for Papers Ready: May 1, 2017
Second Call for Papers Sent Out: July 5, 2017
Third Call for Papers Sent Out: August 5, 2017
Paper Submission Deadline: September 5, 2017
Notification of Acceptance: September 30, 2017
Camera-Ready Deadline: October 10, 2017
Workshop day: December 1, 2017

Workshop Chairs


Piek Vossen (VU Amsterdam)
Filip Ilievski (VU Amsterdam)
Marten Postma (VU Amsterdam)


Martha Palmer (University of Colorado Boulder)
Chris Welty (Google)
Eduard Hovy (Carnegie Mellon University)
Ivan Titov (University of Edinburgh)
Philipp Cimiano (University of Bielefeld)
Frank van Harmelen (VU Amsterdam)
Eneko Agirre (University of the Basque Country)
Key-Sun Choi (Korea Advanced Institute of Science and Technology)

Program Committee

Agata Cybulska (Oracle)
Anders Søgaard (University of Copenhagen)
Andre Freitas (University of Passau)
Anselmo Peñas (UNED Madrid)
Antske Fokkens (VU Amsterdam)
Barbara Plank (University of Groningen)
Brian Davis (National University of Ireland Galway)
Dirk Hovy (University of Copenhagen)
Giuseppe Rizzo (ISMB, Turin)
Jacopo Urbani (VU Amsterdam/Max Planck Institute for Informatics)
Johan Bos (University of Groningen)
Lea Frermann (University of Edinburgh)
Leon Derczynski (University of Sheffield)
Karthik Narasimhan (Massachusetts Institute of Technology)
Marco Rospocher (Fondazione Bruno Kessler, Trento)
Marieke van Erp (VU Amsterdam)
Pradeep Dasigi (Carnegie Mellon University)
Ridho Reinanda (University of Amsterdam)
Sabine Schulte im Walde (University of Stuttgart)
Sara Tonelli (Fondazione Bruno Kessler, Trento)
Sebastian Pado (University of Stuttgart)
Stephan Oepen (University of Oslo)
Sujay Kumar Jauhar (Carnegie Mellon University)
Tim Baldwin (University of Melbourne)
Tommaso Caselli (VU Amsterdam)


Introduction and the perspective of the organizers
Presentation of accepted submissions
Plenary discussion

System Performance

How do systems perform on the long tail versus the head? Answering this requires a discussion of how head and tail are defined for each semantic NLP task.
Which evaluation metrics are needed to gain insight into system performance on the tail?
Do existing datasets suffice to gain insight into tail performance? What kind of benchmarks are needed to better track progress in processing the tail?
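As one possible illustration of the questions above, the sketch below reports accuracy separately for head and tail instances instead of a single aggregate score. It is only a minimal example under stated assumptions: the head is defined here as the most frequent training labels that together cover a chosen share of the training instances (the `head_mass` threshold), and the function names `split_head_tail` and `accuracy_by_bucket` as well as the toy data are hypothetical. Other definitions of head and tail may be more suitable for a given task.

```python
from collections import Counter


def split_head_tail(train_labels, head_mass=0.75):
    """Return the set of 'head' labels: the most frequent labels that
    together cover at least `head_mass` of the training instances.
    The threshold is an illustrative assumption, not a standard."""
    counts = Counter(train_labels)
    total = sum(counts.values())
    head, covered = set(), 0
    for label, count in counts.most_common():
        if covered >= head_mass * total:
            break
        head.add(label)
        covered += count
    return head


def accuracy_by_bucket(gold, predicted, head):
    """Compute accuracy separately for head and tail gold labels.
    A bucket with no instances gets None instead of a score."""
    hits = {"head": 0, "tail": 0}
    sizes = {"head": 0, "tail": 0}
    for g, p in zip(gold, predicted):
        bucket = "head" if g in head else "tail"
        sizes[bucket] += 1
        hits[bucket] += int(g == p)
    return {b: (hits[b] / sizes[b] if sizes[b] else None) for b in hits}


# Toy data (made up): a system that always predicts the most frequent label.
train = ["a"] * 8 + ["b"] + ["c"]
gold = ["a", "a", "a", "a", "b", "c"]
pred = ["a", "a", "a", "a", "a", "a"]

head = split_head_tail(train, head_mass=0.75)
scores = accuracy_by_bucket(gold, pred, head)
print(head, scores)
```

On this toy data, the aggregate accuracy of four out of six hides the fact that every tail instance is misclassified, which the per-bucket scores make visible.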

Contextual Adaptation

How can we build systems that can switch between contexts of time, topic, and location (e.g. systems that can adapt to new or past long tail realities)?

Data and Knowledge Requirements

What are the data and knowledge requirements to perform well on the head and the long tail?
What kind and amount of (annotated) data is needed? Do customary knowledge sources (e.g. DBpedia, BabelNet, and WordNet) suffice?
Do we need massive local knowledge resources to represent the world and all its contexts?

Methods and Linguistic Representations

Which methods and linguistic representations are necessary for accurately modeling the tail?
Are they different from the ones developed for the head?
How can we transfer models developed for the head to make them appropriate for modeling the tail? How can the recent advances in deep neural networks and matrix factorization be directed to accomplish this?