Dear Shared Interpreters,
Your mission is to provide an interpretation scheme for large-scale transformer models that generalizes as well as possible across architectures, hyperparameters, and underlying data. Concretely, you are tasked with explaining the following three models in a “development” phase:
On July 21, you will be required to run the same interpretation on two “test” models and report within two weeks, to demonstrate how generalizable the interpretation methods really are. We recommend having a full description and analysis of the dev models ready before this time, as the timeline is tight.
As a reminder, we repeat the research agenda which is available on the shared mission page:
The models are available for download from the Hugging Face (HF) repository. HF also provides a vast array of tutorials for using the library, including a script specifically demonstrating how to examine the inner states of models (“BERTology”). Another code sample, due to Yonatan Belinkov and Michael Wu, is here.
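As a minimal sketch of examining inner states with the Transformers library, the snippet below asks a model for its per-layer hidden states and attention weights. The checkpoint name is an illustrative small public BERT of our choosing, not one of the mission's dev models; substitute the actual model names from the mission page.

```python
# Sketch: expose per-layer hidden states and attention weights.
# The checkpoint below is an illustrative tiny BERT, NOT a dev model.
import torch
from transformers import AutoModel, AutoTokenizer

name = "google/bert_uncased_L-2_H-128_A-2"  # illustrative small checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(
    name, output_hidden_states=True, output_attentions=True
)
model.eval()

inputs = tokenizer("Interpretation should generalize.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states: one tensor per layer, plus one for the embedding layer.
# attentions: one (batch, heads, seq_len, seq_len) tensor per layer.
hidden_states = outputs.hidden_states
attentions = outputs.attentions
print(len(hidden_states), len(attentions))
```

Because the loading code is architecture-agnostic (the `Auto*` classes), the same snippet should work unchanged on the dev and test models.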
We have purposely not selected the vanilla BERT models, which we feel have been studied extensively in the past year; indeed, that body of work raised some of the original concerns about interpretation generalizability that led us to announce this mission in the first place. Regardless, we recommend reading the (very) recent survey by Anna Rogers, Olga Kovaleva, and Anna Rumshisky, which summarizes much of the BERTology landscape.
Well, we can’t anticipate all issues, but there are two you might want to keep an eye out for:
If you encounter more issues, or find ways that can help other participants overcome existing ones which you are willing to share, please consider contributing by creating a pull request on this very markdown file on the workshop github repo.
Just as the Transformers repository from HuggingFace lets one experiment with a different model via a minor change to the command (e.g., changing the model-name parameter from BERT to RoBERTa), we want to achieve similar generalizability for interpretation methods.
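For instance, with the `Auto*` classes the architecture switch is a one-string change; the loop below reads the configuration of two standard public checkpoints (used purely for illustration) through the identical code path:

```python
# Illustration: the same code path handles different architectures;
# only the checkpoint name string changes between iterations.
from transformers import AutoConfig

for name in ["bert-base-uncased", "roberta-base"]:
    cfg = AutoConfig.from_pretrained(name)
    print(name, cfg.model_type, cfg.num_hidden_layers, cfg.hidden_size)
```

An interpretation method structured the same way, with the model name as its only architecture-specific input, is what we mean by a generalizable method.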
We expect participants to submit their interpretation repository (built on top of the Transformers repository, and possibly other well-established projects). Generalizability credit will be based on how seamlessly one can choose another model and obtain its interpretation and analysis. We will run all submitted APIs ourselves on the test models before announcing them, to verify de-facto generalizability and to validate reported results. Each API must specify all necessary input parameters and output an artifact that can be matched to a clear equivalent in the participant-submitted report.
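One way to make that contract concrete is a small artifact record that pins down the inputs and outputs of a run. Everything below (the class name, its fields, the example values) is a hypothetical sketch of ours, not a prescribed format; the point is only that inputs and outputs are explicit and report-matchable.

```python
# Hypothetical artifact schema -- every field here is our own assumption,
# not a required format. Inputs (model name, method, parameters) and
# outputs (scores) are explicit, so the artifact can be matched against
# a table or figure in the accompanying report.
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List

@dataclass
class InterpretationArtifact:
    model_name: str                 # HF checkpoint the analysis ran on
    method: str                     # e.g. "probing", "attention analysis"
    parameters: Dict[str, Any] = field(default_factory=dict)
    per_layer_scores: List[float] = field(default_factory=list)

    def to_record(self) -> Dict[str, Any]:
        """Serialize for inclusion in (and matching against) the report."""
        return asdict(self)

# Example usage with made-up numbers:
artifact = InterpretationArtifact(
    model_name="roberta-base",
    method="layerwise probing (hypothetical)",
    parameters={"task": "POS", "probe": "linear"},
    per_layer_scores=[0.71, 0.78, 0.83],
)
print(artifact.to_record()["model_name"])
```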
We encourage papers up to 8 pages of content including references. However, shorter papers are welcome too.
Ideally, you should replicate the analysis performed on the dev models on the test models. However, both positive and negative results are appreciated. For example, if a technique does not generalize to all models, this itself is a good finding to report.
Yes, it is acceptable to analyze a single test model.
BlackboxNLP 2020 organizers