Fact Extraction and VERification

Schedule

09:00-09:15	Welcome talk Organizers
09:15-10:00	Inducing Fake, and Real, Information from NLP Models Sameer Singh
10:00-10:30	Research Talks 1
10:00-10:15	Fact Checking or Psycholinguistics: How to Distinguish Fake and True Claims? Aleksander Wawer, Grzegorz Wojdyga and Justyna Sarzyńska-Wawer
10:15-10:30	Neural Multi-Task Learning for Stance Prediction Wei Fang, Moin Nadeem, Mitra Mohtarami and James Glass
10:30-11:00	Coffee Break
11:00-11:45	Fact Checking Using Stance Detection and User Replies Emine Yilmaz
11:45-12:00	Research Talks 2
11:45-12:00	Towards a Positive Feedback between the Wikimedia Ecosystem and Machine Learning Fact Verification Diego Saez-Trumper and Jonathan Morgan
12:00-12:30	FEVER 2.0 Shared Task Talks
12:00-12:10	The FEVER 2.0 Shared Task James Thorne, Andreas Vlachos, Oana Cocarascu, Christos Christodoulopoulos and Arpit Mittal
12:10-12:20	GEM: Generative Enhanced Model for adversarial attacks Piotr Niewinski, Maria Pszona and Maria Janicka
12:20-12:30	Cure My FEVER : Building, Breaking and Fixing Models for Fact-Checking Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab and Smaranda Muresan
12:30-14:00	Lunch Break
14:00-14:45	Fact Verification with Semi-Structured Knowledge William Wang
14:45-15:30	The use and abuse of automated fact verification David Corney
15:30-16:30	Research Posters + Coffee
	Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task Alireza Mohammadshahi, Rémi Lebret and Karl Aberer
	Unsupervised Natural Question Answering with a Small Model Martin Andrews and Sam Witteveen
	Scalable Knowledge Graph Construction from Text Collections Ryan Clancy, Ihab F. Ilyas and Jimmy Lin
	Relation Extraction among Multiple Entities Using a Dual Pointer Network with a Multi-Head Attention Mechanism Seong Sik Park and Harksoo Kim
	Question Answering for Fact-Checking Mayank Jobanputra
	Improving Evidence Detection by Leveraging Warrants Keshav Singh, Paul Reisert, Naoya Inoue, Pride Kavumba and Kentaro Inui
	Hybrid Models for Aspects Extraction without Labelled Dataset Wai-Howe Khong, Lay-Ki Soon and Hui-Ngo Goh
	Extract and Aggregate: A Novel Domain-Independent Approach to Factual Data Verification Anton Chernyavskiy and Dmitry Ilvovsky
	Interactive Evidence Detection: train state-of-the-art model out-of-domain or simple model interactively? Chris Stahlhut
	Veritas Annotator: Discovering the Origin of a Rumour Lucas Azevedo and Mohamed Moustafa
15:30-16:30	Shared Task Posters + Coffee
	FEVER Breaker’s Run of Team NbAuzDrLqg Youngwoo Kim and James Allan
	Team DOMLIN: Exploiting Evidence Enhancement for the FEVER Shared Task Dominik Stammbach and Guenter Neumann
	Team GPLSI. Approach for automated fact checking Aimée Alonso-Reina, Robiert Sepúlveda-Torres, Estela Saquete and Manuel Palomar
16:30-17:15	Fact Extraction and Verification for Precision Medicine Hoifung Poon
17:15-17:30	Closing Remarks Organizers

Invited Talks

	Inducing Fake, and Real, Information from NLP Models Sameer Singh
	As machine learning models become better at generating factual-looking information, they will increasingly become part of deployed, practical systems, with their output directly presented to users. In this talk, I will present some of our work demonstrating that current models are far from ready for such a use case: even if they look accurate, it is easy to manipulate them to generate false information, often using changes to the input that look unrelated and innocuous. I will present examples of such “adversarial attacks” on knowledge graph completion (produces false facts), reading comprehension (produces wrong answers), and text generation (produces fake text). I will also present some of our recent work on a language model that uses an external knowledge graph to generate more accurate text, as a step towards generating factually correct information by an NLP model.
	Fact Checking Using Stance Detection and User Replies Emine Yilmaz
	Social media platforms host a plethora of misinformation, and its potential negative influence on the public is a growing concern. This concern has drawn the attention of the research community to developing mechanisms to detect misinformation. The task of misinformation detection consists of classifying whether a claim is True or False. One of the primary problems studied as part of misinformation detection is stance detection, where the goal is to categorize the overall position of a subject towards an object, such as agree, disagree, unrelated, etc. One of the major problems faced by current machine learning models used for stance detection is a severe imbalance among these classes. Hence, most models fail to correctly classify instances that fall into minority classes. In this talk, I will first present a model that addresses this problem by proposing a hierarchical representation of these classes and show how such a model could achieve significant performance improvement, especially in the classification of minority classes. In addition to stance detection, the way people respond to a claim is also quite informative regarding the truthfulness of the claim. In the second part of this talk, I will present a model that uses information from people's replies to a claim to predict the truthfulness of the claim, together with its uncertainty.
	Fact Verification with Semi-Structured Knowledge William Wang
	Our society is struggling with an unprecedented amount of falsehood, hyperbole, and half-truths. Politicians and organizations repeatedly make false claims that jeopardize the integrity of journalism. Disinformation now floods cyberspace and influences many events online and offline. To fight false information, the need for automatic fact verification has never been so urgent. Existing studies primarily focus on free-form text as evidence crawled from Wikipedia or news websites. The direction of using semi-structured knowledge, such as relational tables, as evidence has yet to be explored systematically. In this talk, we will mainly focus on introducing a new benchmark dataset called TabFact, which allows us to systematically study the fact verification problem with semi-structured tables as evidence.
	The use and abuse of automated fact verification David Corney
	The volume of unstructured text online continues to grow unabated, including digital news, TV subtitles and social media. Many people around the world now rely on online sources for their news. However, not all claims made online are equally reliable, leading to a demand for tools that can guide people towards trustworthy, verified content. New methods in AI and NLP are increasingly being used to extract structured information from text and one natural application is the fully-automated verification of claims made online. In parallel to this, fact checking organisations like Full Fact continue to work hard to verify a wide range of important claims and improve the quality of information in the public sphere. However, manual fact checking is a very labour-intensive process. Can NLP, machine learning and related tools help? In this talk, I will describe the fact checking process and the motivation behind it. I'll describe the tools that fact checkers currently use at Full Fact, including a fully-automated fact verification tool. I will also discuss the limitations of such tools, and how their misuse may lead to more harm than good.
	Fact Extraction and Verification for Precision Medicine Hoifung Poon
	The advent of big data promises to revolutionize medicine by making it more personalized and effective, but big data also presents a grand challenge of information overload. For example, tumor sequencing has become routine in cancer treatment, yet interpreting the genomic data requires painstakingly curating facts from a vast biomedical literature, which grows by thousands of papers every day. Machine reading can play a key role in precision medicine by substantially accelerating knowledge curation, so that we "leave no fact behind". However, standard supervised methods require labeled examples, which are expensive and time-consuming to produce at scale. In this talk, I'll present Project Hanover, where we overcome the annotation bottleneck by combining deep learning with probabilistic logic, and by exploiting self-supervision from readily available resources such as ontologies and databases. This enables us to train accurate machine readers without requiring labeled examples, and extract knowledge from millions of publications, which can be quickly verified by medical experts to support precision oncology.

Call For Papers

With billions of individual pages on the web providing information on almost every conceivable topic, we should have the ability to collect facts that answer almost every conceivable question. However, only a small fraction of this information is contained in structured sources (Wikidata, Freebase, etc.) – we are therefore limited by our ability to transform free-form text to structured knowledge. There is, however, another problem that has become the focus of a lot of recent research and media coverage: false information coming from unreliable sources.1,2

To address both problems jointly, we are organizing a workshop promoting research in joint Fact Extraction and VERification (FEVER). We aim for FEVER to be a long-term venue for work in verifiable knowledge extraction. To stimulate progress in this direction, we will also host the FEVER shared task, an information verification task based on a recently released dataset consisting of 220K claims verified against Wikipedia (Thorne et al., NAACL 2018).

The workshop will consist of oral and poster presentations of submitted papers, including papers from the shared task participants, panel discussions, and presentations by the invited speakers listed under Invited Talks above.

Submissions

We invite long and short papers on all topics related to fact extraction and verification, including:

Information Extraction
Semantic Parsing
Knowledge Base Population
Natural Language Inference
Textual Entailment Recognition
Argumentation Mining
Machine Reading and Comprehension
Claim Validation/Fact checking
Question Answering
Theorem Proving
Stance detection
Adversarial learning
Computational journalism
System demonstrations on the FEVER 2.0 Shared Task

Long/short papers should consist of eight/four pages of content plus unlimited pages for bibliography. Submissions must be in PDF format, anonymized for review, and follow the EMNLP 2019 two-column format, using the LaTeX style files or Word templates to be provided on the official EMNLP-IJCNLP 2019 website.

Each long paper submission consists of a paper of up to eight (8) pages of content, plus unlimited pages for references; final versions of long papers will be given one additional page (up to nine pages with unlimited pages for references) so that reviewers’ comments can be taken into account.

Each short paper submission consists of up to four (4) pages of content, plus unlimited pages for references; final versions of short papers will be given one additional page (up to five pages in the proceedings and unlimited pages for references) so that reviewers’ comments can be taken into account.

Papers can be submitted as non-archival, so that their content can be reused for other venues. Add "(NON-ARCHIVAL)" to the title of the submission. Non-archival papers will be linked from this webpage.

Authors can also submit extended abstracts of up to eight pages of content. Add "(EXTENDED ABSTRACT)" to the title of an extended abstract submission. Extended abstracts will be presented as talks or posters if selected by the program committee, but not included in the proceedings. Thus, your work will retain the status of being unpublished and later submission at another venue is not precluded.

Previously published work can also be submitted as an extended abstract in the same way, with the additional requirement that the original publication be stated on the first page.

Softconf submission link: http://softconf.com/emnlp2019/ws-FEVER

FEVER Shared task

For more information on the shared task please visit the following page: Shared Task

Important dates

First call for papers: 10 May 2019
Second call for papers: 14 June 2019
Submission deadline: 30 August 2019
Notification: 20 September 2019
Camera-ready deadline: 30 September 2019
Workshop: 3 November 2019 (EMNLP-IJCNLP)

All deadlines are at 11:59pm Pacific Daylight Time (UTC-7).

Workshop Organising Committee

James Thorne

KAIST AI

Andreas Vlachos

University of Cambridge

Oana Cocarascu

King's College London

Christos Christodoulopoulos

Amazon