NLP News Cypher 02.16.20

And… We’re back! How was your week?

Last week was intensely fun and adventurous: many new datasets, studies, and NLP research papers were shot out of a cannon!

Also, the AAAI conference happened in NYC, and there the Turing Award winners (LeCun, Bengio, and Hinton) came together for some exciting talks.

Yoshua was so excited he even started a blog… his first words:

Yoshua Bengio’s blog – first words

I often write comments and posts on social media but these tend to be only temporarily visible, so I thought I needed a place to couch some of my thoughts that would be more permanent…

Yann gave a talk and shared slides:

Self-Supervised Learning

Yann LeCun
NYU – Courant Institute & Center for Data Science
Facebook AI Research

Yann’s self-supervised learning talk shows how much NLP has brought us in the past few years. You may have heard of “masking” from BERT and other transformers. This key concept, hiding part of the input so the model has to learn to fill in the missing information, is the crux of the slides, and its consequences are far-reaching. Just ask the peeps in Computer Vision.
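If masking is new to you, here is a minimal sketch of the idea in plain Python. It is a toy illustration and not BERT’s actual recipe: the `mask_tokens` helper, the `[MASK]` string, and the probabilities are all assumptions made up for this example (real BERT pre-training uses a more involved scheme over subword tokens).

```python
import random

MASK = "[MASK]"  # placeholder symbol, chosen for illustration


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns (masked_tokens, labels). labels[i] holds the original token
    where position i was hidden, and None where it stayed visible.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # the model sees only the placeholder
            labels.append(tok)    # ...and is trained to recover this
        else:
            masked.append(tok)
            labels.append(None)   # visible tokens carry no training target
    return masked, labels


sentence = "the cat sat on the mat".split()
masked, labels = mask_tokens(sentence, mask_prob=0.3, seed=42)
print(masked)
print(labels)
```

The point is that the supervision comes for free: the labels are just the pieces of the input we chose to hide, so no human annotation is needed.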

Last but not least, at the AAAI conference, the outstanding paper award went to the authors of the WinoGrande dataset from the Allen Institute. Special thanks to Chandra for forwarding us the dataset a couple of weeks ago. It’s been added to the vault of the Big Bad NLP Database. 👍

This Week:

– The Galkin Graphs Cometh
– Compressing BERT
– DeepMind Keeps It PG-19
– DeepSpeed and ZeRO Cool
– Open Domain QA Strikes Back!
– Why Models Fail
– Kaggle, Colab and Mr. Burns
– Dataset of the Week: WinoGrande

The Galkin Graphs Cometh

“This year AAAI got 1591 accepted papers among which about 140 are graph-related.”

Can’t have a conference without mentioning Galkin’s coverage of knowledge graphs!

What’s popular:

Dumping knowledge graphs on Language Models…

Entity matching over knowledge graphs with different schemas…

Temporal knowledge graphs aka dynamic graphs…

And for those building goal-oriented bots 👇, check out the Schema-Guided Dialogue State Tracking workshop paper:

Corpus of News Headlines

Compressing BERT

Peeps dropped a new BERT on Hugging Face’s community library. It’s a compressed version of BERT that outperforms the distilled version on 6 GLUE tasks (it’s actually comparable to the base model)! This is great for those looking to save money on compute! (like me 😁)

Model: canwenxu/BERT-of-Theseus-MNLI

BERT-of-Theseus is a new compressed BERT by progressively replacing the components of the original BERT…
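To give a feel for what “progressively replacing the components” could look like, here is a toy sketch in plain Python. It is my own illustration and not the authors’ code: the names `theseus_forward` and `linear_schedule`, the starting probability, and the string-tagging “modules” are all assumptions. The idea sketched is that each original (predecessor) module has a smaller (successor) stand-in, and during training each one is swapped in at random with a probability that grows over time.

```python
import random


def theseus_forward(x, predecessors, successors, replace_prob, rng):
    """Run x through the modules, swapping each predecessor for its
    successor with probability replace_prob (decided per module)."""
    for pred, succ in zip(predecessors, successors):
        module = succ if rng.random() < replace_prob else pred
        x = module(x)
    return x


def linear_schedule(step, total_steps, start=0.5):
    """Hypothetical curriculum: ramp the replacement probability
    linearly from `start` up to 1.0, then hold it there."""
    return min(1.0, start + (1.0 - start) * step / total_steps)


def make_module(tag):
    # Stand-in for a real layer: it just records which module ran.
    return lambda trace: trace + [tag]


predecessors = [make_module(f"P{i}") for i in range(3)]  # original modules
successors = [make_module(f"S{i}") for i in range(3)]    # compact stand-ins

rng = random.Random(0)
for step in (0, 5, 10):
    p = linear_schedule(step, total_steps=10)
    print(step, p, theseus_forward([], predecessors, successors, p, rng))
```

Once the probability reaches 1.0, every predecessor has been swapped out and only the successor modules run, which is the compressed model you would keep.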
