And… We’re back! How was your week?
Last week was intensely fun and adventurous: new datasets, studies, and NLP research were shot out of a cannon!
Also, the AAAI conference happened in NYC, where the Turing Award winners (LeCun, Bengio, and Hinton) came together for some exciting talks.
Yoshua was so excited he even started a blog… his first words:
I often write comments and posts on social media but these tend to be only temporarily visible, so I thought I needed a place to couch some of my thoughts that would be more permanent…
Yann gave a talk and shared slides:
Yann LeCun, NYU – Courant Institute & Center for Data Science, Facebook AI Research
Yann’s self-supervised learning talk shows how far NLP has come in the past few years. You may have heard of “masking” from BERT and other transformers: hiding parts of the input so the model learns to predict what’s missing from the surrounding context. This key concept is the crux of the slides, and its consequences are far-reaching. Just ask the peeps in Computer Vision:
Unsupervised learning of representations is beginning to work quite well without requiring reconstruction. https://t.co/EgZbX3tYD7
— Geoffrey Hinton (@geoffreyhinton) February 14, 2020
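To make “masking” concrete, here’s a minimal sketch using the fill-mask pipeline from Hugging Face’s `transformers` library (the model name and example sentence are illustrative, not taken from Yann’s talk):

```python
# Minimal fill-mask sketch: BERT was pre-trained to recover tokens
# hidden behind [MASK], which forces it to model the surrounding context.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Print the model's top guesses for the masked token and their scores.
for prediction in fill_mask("The AAAI conference was held in [MASK] this year."):
    print(prediction["token_str"], round(prediction["score"], 3))
```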
Last but not least, at the AAAI conference, the Outstanding Paper Award went to the authors of the WinoGrande dataset from the Allen Institute. Special thanks to Chandra for forwarding us the dataset a couple of weeks ago. It’s been added to the vault of the Big Bad NLP Database. 👍
Congratulations to this year’s winners of the Outstanding Paper Award! #AAAI20 pic.twitter.com/Gbd3biH8z0
— AAAI (@RealAAAI) February 11, 2020
This Week:
– The Galkin Graphs Cometh
– Compressing BERT
– DeepMind Keeps It PG-19
– DeepSpeed and ZeRO Cool
– Open Domain QA Strikes Back!
– Why Models Fail
– Kaggle, Colab and Mr. Burns
– Dataset of the Week: WinoGrande
The Galkin Graphs Cometh
“This year AAAI got 1591 accepted papers among which about 140 are graph-related.”
Can’t have a conference without mentioning Galkin’s coverage of knowledge graphs!
What’s popular:
– Dumping knowledge graphs on Language Models…
– Entity matching over knowledge graphs with different schemas…
– Temporal knowledge graphs aka dynamic graphs…
And for those building goal-oriented bots 👇, check out the Schema-Guided Dialogue State Tracking workshop paper:
Compressing BERT
Peeps dropped a new BERT on Hugging Face’s community library. It’s a compressed version of BERT that outperforms the distilled version on 6 GLUE tasks (it’s actually comparable to the base model)! This is great for those looking to save money on compute! (like me 😁)
BERT-of-Theseus is a new compressed BERT by progressively replacing the components of the original BERT…
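If you want to kick the tires, it should load like any other hub model; here’s a minimal sketch, assuming the checkpoint is published under a hub id like `canwenxu/BERT-of-Theseus-MNLI` (check the hub for the exact name):

```python
# Minimal sketch of loading a compressed BERT from the Hugging Face hub.
# The model id below is an assumption -- verify the exact name on the hub.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "canwenxu/BERT-of-Theseus-MNLI"  # assumed hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Drop-in replacement for bert-base: same tokenizer interface, smaller model.
inputs = tokenizer("A cheaper BERT is a happier BERT.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)
```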