Analyzing and interpreting neural networks for NLP

Revealing the content of the neural black box: workshop on the analysis and interpretation of neural networks for Natural Language Processing.

The BlackboxNLP 2020 Shared Interpretation Mission

The 2020 BlackboxNLP workshop on Analyzing and Interpreting Neural Networks for NLP will include, for the first time, a shared interpretation mission. The event is modeled on shared tasks. However, its goal is not to find the best-performing system for a well-defined problem, but rather to encourage the development of useful, creative analysis techniques that help us better understand existing models.

Description

Participants in the shared mission are asked to develop a method which, given a large pre-trained (masked) language model (such as GPT, BERT, or XLNet), addresses one or both items on the interpretation research agenda listed below. In the first phase of the mission, which we call the “development (dev) phase” for convenience, participants will receive instructions for obtaining 2-3 pre-trained models, on which they may develop their interpretation system. Later, 1-2 “test” models will be unveiled, and participants will be required to run the same analysis on them. This tests both the system’s technical adaptability to the new setting and the validity of any conclusions about the agenda drawn from the analysis of the “dev” models.

Evaluation of the systems will be qualitative, based on a report produced by each participating team. It will be performed manually by a committee, aiming to reward systems proven to be relevant (to the research agenda), generalizable (from dev to test), and simple (to understand and to replicate).
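
For illustration only, the dev-to-test procedure described above can be sketched as a small harness that applies one unchanged analysis to every model in a phase. Everything here is an assumption made for the sketch and not part of the mission's instructions: the Encoder interface, the toy layer-norm analysis, the stand-in encoders, and all model names are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical interface (an assumption, not defined by the mission):
# a wrapped model maps a sentence to one feature vector per layer.
Encoder = Callable[[str], List[List[float]]]


def layer_norm_profile(encode: Encoder, sentences: List[str]) -> List[float]:
    """Toy analysis: mean L2 norm of each layer's representation."""
    per_sentence = [encode(s) for s in sentences]
    num_layers = len(per_sentence[0])
    return [
        mean(sum(x * x for x in layers[i]) ** 0.5 for layers in per_sentence)
        for i in range(num_layers)
    ]


def run_phase(models: Dict[str, Encoder], sentences: List[str]) -> Dict[str, List[float]]:
    """Apply the same analysis, unchanged, to every model in a phase."""
    return {name: layer_norm_profile(enc, sentences) for name, enc in models.items()}


def toy_encoder(scale: float) -> Encoder:
    """Stand-in for a real pre-trained model so the sketch runs offline."""
    def encode(sentence: str) -> List[List[float]]:
        n = float(len(sentence.split()))
        # Two fake "layers", each a 2-dimensional vector.
        return [[scale * n, 0.0], [0.0, 2 * scale * n]]
    return encode


sentences = ["the cat sat", "dogs bark"]

# Dev phase: develop the analysis on the supplied models.
dev_results = run_phase({"dev_a": toy_encoder(1.0), "dev_b": toy_encoder(0.5)}, sentences)

# Test phase: rerun the identical analysis on newly unveiled models.
test_results = run_phase({"test_a": toy_encoder(2.0)}, sentences)

# Example of a conclusion drawn on dev and then checked on test:
# "norms grow from the first layer to the last".
def norms_increase(profile: List[float]) -> bool:
    return profile[-1] > profile[0]

holds_on_dev = all(norms_increase(p) for p in dev_results.values())
holds_on_test = all(norms_increase(p) for p in test_results.values())
```

Keeping the analysis behind a single model-agnostic interface is one way to make rerunning it on the test models a matter of wrapping each new model, without changing the analysis itself.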

Dev Models

The models for the development phase are:

Test Models

The models for the test phase are:

Research Agenda

  1. What factors determine the extent to which models learn different language properties?
  2. How does the interpretation of a model reflect its performance on a task, its robustness, and its generalizability?

Guidelines

An FAQ with further details is available here.

Submission details

Timeline