News outlets continuously report on many different events, but can NLP systems actually tell one event from another within large streams of data? Especially for long-tail events of the same type, on which little information is available, NLP systems do a poor job of determining the number of participants in an event, and there are essentially no NLP systems that can determine how many times a given type of event has taken place in a corpus.
We are hosting a “referential quantification” task that requires systems to provide answers to questions about the number of incidents of an event type (subtasks S1 and S2) or participants in roles (subtask S3).
Given a set of questions and corresponding documents, the participating systems need to provide a numeric answer together with the supporting documents and the text mentions of the answer events in those documents. To answer each question correctly, participating systems must be able to establish the meaning, reference, and identity (i.e., coreference) of events and participants in news articles. A schematic example of the S2 challenge is given below:
The data (texts and answers) are prepared in such a way that the task deliberately exhibits large ambiguity and variation, and covers long-tail phenomena by including a substantial number of low-frequency, local events and entities.
Domains – Our data covers three domains: gun violence, fire disasters, and business.
Document representation – For each document, we provide its title, content (tokenized), and creation time.
Question representation – We will provide the participants with a structured representation of each question. Example:
Answer representation – Based on our tokenized document input, systems will generate a CoNLL-like format, in which they indicate which mentions refer to which incidents. The number of unique incident identifiers determines the system answer.
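As a rough illustration of how a numeric answer follows from such a file, the sketch below counts the distinct incident identifiers in a CoNLL-like string. The exact column layout is not specified here, so everything about the format is an assumption made for illustration: tab-separated columns, the incident identifier in the last column, "-" for tokens that refer to no incident, and lines starting with "#" as document delimiters. The function name count_incidents and the sample identifiers are hypothetical.

```python
def count_incidents(conll_text: str) -> int:
    """Count distinct incident identifiers in a CoNLL-like string.

    Assumed layout (not the official task format): tab-separated
    columns, incident identifier in the last column, "-" when the
    token does not refer to any incident, "#" lines as delimiters.
    """
    incident_ids = set()
    for line in conll_text.splitlines():
        line = line.strip()
        # Skip blank lines and "#begin document" / "#end document" markers.
        if not line or line.startswith("#"):
            continue
        incident_id = line.split("\t")[-1]
        if incident_id != "-":
            incident_ids.add(incident_id)
    return len(incident_ids)


# Hypothetical sample: two documents, three mentions, two incidents.
sample = "\n".join([
    "#begin document (doc1);",
    "doc1.t1\tTwo\t-",
    "doc1.t2\tpeople\t-",
    "doc1.t3\tshot\tinc_A",
    "#end document",
    "#begin document (doc2);",
    "doc2.t1\tThe\t-",
    "doc2.t2\tshooting\tinc_A",
    "doc2.t3\tfollowed\t-",
    "doc2.t4\tanother\t-",
    "doc2.t5\tkilling\tinc_B",
    "#end document",
])

print(count_incidents(sample))
```

In this sample, the mentions "shot" and "shooting" share the identifier inc_A and so count as one incident, which is why coreference across documents directly affects the numeric answer.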
This SemEval-2018 task is designed as three incrementally harder subtasks:
For each question, systems submit a CoNLL file, consisting of a set of incidents with their corresponding documents and event mentions. We perform both extrinsic and intrinsic evaluation over the system output:
We will provide the following baselines:
Marten Postma (firstname.lastname@example.org)
Filip Ilievski (email@example.com)
Piek Vossen (firstname.lastname@example.org)
August 14, 2017 – Trial data ready
December 1, 2017 – Test data ready
January 8-29, 2018 – Evaluation
February 28, 2018 – Paper submission due
April 30, 2018 – Camera ready version
Summer 2018 – SemEval workshop
Please fill in this Google form to stay in touch!