Analyzing and interpreting neural networks for NLP

Revealing the content of the neural black box: workshop on the analysis and interpretation of neural networks for Natural Language Processing.

Shared Interpretation Mission – Frequently Asked Questions (probably)

Dear Shared Interpreters,

Your mission is to provide an interpretation scheme for large-scale transformer models that generalizes as well as possible across architectures, hyperparameters, and underlying data. Concretely, you are tasked with explaining the following three models in a “development” phase:

On July 21, you will be required to run the same interpretation on two “test” models and report your findings within two weeks, to demonstrate how generalizable the interpretation methods really are. We recommend having a full description and analysis of the dev models ready before then, as the timeline is tight.

As a reminder, we repeat the research agenda which is available on the shared mission page:

  1. What factors determine the extent to which models learn different language properties?
  2. How does the interpretation of a model reflect its performance on a task, its robustness, and its generalizability?

How Do We Start?

The models are available for download from the Hugging Face (HF) repository. HF also provides a vast array of tutorials for using the library, including a specific script demonstrating how to examine the inner states of models (“BERTology”). Another code sample, due to Yonatan Belinkov and Michael Wu, is here.
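For orientation, here is a minimal sketch, assuming a recent version of the transformers library, of loading a hub model and exposing its inner states in the spirit of the BERTology example script. The checkpoint name is illustrative only; substitute one of the dev models.

```python
# Minimal sketch (not official starter code): load a hub model and expose its
# hidden states and attention weights for analysis.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "roberta-base"  # illustrative only; substitute one of the dev models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

inputs = tokenizer("The workshop takes place in November.", return_tensors="pt")
with torch.no_grad():
    outputs = model(
        **inputs,
        output_hidden_states=True,
        output_attentions=True,
        return_dict=True,
    )

# outputs.hidden_states: tuple of (num_layers + 1) tensors, each of shape
# [batch, seq_len, hidden_size] (embedding output plus one per layer).
# outputs.attentions: tuple of num_layers tensors, each of shape
# [batch, num_heads, seq_len, seq_len].
print(len(outputs.hidden_states), outputs.hidden_states[-1].shape)
```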

Why not BERT?

We have purposely not selected the vanilla BERT models, which we feel have been studied extensively over the past year; that saturation is part of what raised the original concerns about interpretation generalizability and prompted us to announce this mission in the first place. Regardless, we recommend reading the (very) recent survey by Anna Rogers, Olga Kovaleva, and Anna Rumshisky, which summarizes much of the BERTology landscape.

What Difficulties Might We Expect?

Well, we can’t anticipate every issue, but there are two you might want to keep an eye out for:

  1. XLM is a multilingual model; it would be wise not only to acquaint yourselves with the various languages it is trained on, but also with how to use it in different languages. Inference may require accompanying language-ID tensors, and generation calls for a global configuration parameter specifying the language (see the first sketch after this list).
  2. Tokenization rules may differ across models. This can affect interpretation methods that rely on word-level annotations, for example. Consider wrapping your implementations accordingly (see the second sketch after this list).
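
For the first point, the sketch below shows how per-token language IDs can be passed to an XLM checkpoint. The checkpoint name (“xlm-clm-enfr-1024”) is a public example chosen purely for illustration and is not necessarily the mission’s XLM model; check the XLM documentation for the exact usage of your checkpoint.

```python
import torch
from transformers import XLMTokenizer, XLMWithLMHeadModel

# Public English-French XLM checkpoint, used here only for illustration.
tokenizer = XLMTokenizer.from_pretrained("xlm-clm-enfr-1024")
model = XLMWithLMHeadModel.from_pretrained("xlm-clm-enfr-1024")

input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])
# Every token is tagged with a language ID; here the whole input is English.
language_id = tokenizer.lang2id["en"]
langs = torch.full_like(input_ids, language_id)

outputs = model(input_ids, langs=langs)

# For generation, the language is instead set globally on the configuration
# (check the XLM documentation for your checkpoint), e.g.:
# model.config.lang_id = language_id
```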
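
For the second point, one possible wrapper is sketched below with a hypothetical helper, align_word_labels, that projects word-level annotations onto each model’s subword tokens so the surrounding interpretation code stays tokenizer-agnostic. The checkpoint names are illustrative only.

```python
from transformers import AutoTokenizer

def align_word_labels(words, labels, tokenizer):
    """Map word-level labels onto subword tokens by repeating each word's
    label over all of its pieces. A hypothetical helper, not part of any
    official API; whitespace-sensitive BPE tokenizers (e.g. RoBERTa's) may
    split a word differently inside a sentence, so the offset mappings of
    fast tokenizers are a more robust alternative."""
    subword_tokens, subword_labels = [], []
    for word, label in zip(words, labels):
        pieces = tokenizer.tokenize(word)
        subword_tokens.extend(pieces)
        subword_labels.extend([label] * len(pieces))
    return subword_tokens, subword_labels

words, labels = ["The", "interpretability", "workshop"], ["DET", "NOUN", "NOUN"]
# The same words split differently under different models' tokenizers:
for name in ["roberta-base", "xlm-clm-enfr-1024"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    print(name, align_word_labels(words, labels, tokenizer))
```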

If you encounter more issues, or find solutions to existing ones that you are willing to share with other participants, please consider contributing by opening a pull request against this very markdown file in the workshop GitHub repo.

What is the Generalizability Credit and How Do We Get It?

In the Transformers repository from Hugging Face, one can experiment with a different model by making minor changes to a command, e.g. changing the model name parameter from BERT to RoBERTa. We want to achieve similar generalizability for interpretation methods.

We expect participants to submit their interpretation repository (built on top of the transformers repository, and possibly other well-established projects). The generalizability credit will be based on how seamlessly one can choose another model and get its interpretation and analysis. We will try all submitted APIs ourselves on the test models before announcing them, ensuring de-facto generalizability and validating reported results. Said APIs must specify all necessary input parameters and output an artifact that can be matched to a clear equivalent in the participant-submitted report.
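
To make this expectation concrete, here is a rough sketch, not a prescribed interface, of a hypothetical interpret entry point in which the model name is an ordinary parameter and the output is a self-describing JSON artifact that can be matched against the submitted report. The analysis itself is a toy placeholder.

```python
import json
import torch
from transformers import AutoModel, AutoTokenizer

def interpret(model_name: str, sentences: list, output_path: str) -> None:
    """Run a (toy) analysis on any hub model and write a JSON artifact whose
    fields can be matched against the corresponding report."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    per_sentence_layer_norms = []
    for sentence in sentences:
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs, output_hidden_states=True, return_dict=True)
        # Placeholder "analysis": mean hidden-state norm at every layer.
        per_sentence_layer_norms.append(
            [layer.norm(dim=-1).mean().item() for layer in outputs.hidden_states]
        )

    artifact = {"model": model_name, "layer_norms": per_sentence_layer_norms}
    with open(output_path, "w") as f:
        json.dump(artifact, f, indent=2)

# Switching models should require nothing more than changing the name:
# interpret("roberta-base", ["A test sentence."], "roberta.json")
# interpret("xlm-clm-enfr-1024", ["A test sentence."], "xlm.json")
```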

How long should the analysis report be?

We encourage papers up to 8 pages of content including references. However, shorter papers are welcome too.

Do I need to replicate my results exactly on the test models? If I fail to generalize, can I still report that and submit, or do I no longer qualify?

Ideally, you should replicate on the test models the analysis performed on the dev models. However, both positive and negative results are appreciated. For example, if a technique does not generalize to all models, that is itself a worthwhile finding to report.

Can I submit an analysis of only one test model?

Yes, it is acceptable to analyze a single test model.

Happy analyzing!

BlackboxNLP 2020 organizers