Echoes from EMNLP and GPT-2 Strikes Back
Wow, what a week it was. The EMNLP conference gave us many treats to chew on, such as the growing popularity of cross-lingual learning and the continued adoption of knowledge graphs in language models.
Because of all this action, this week’s Cypher will be a bit longer than usual.
🤯 EMNLP 2019 🤯
New QA Leaderboard Attempting to Mitigate SQuAD Problems
GPT-2 Doesn’t Bring Armageddon
Chollet’s New Formulation of Intelligence
Unsupervised Cross-lingual Representation Learning
Compute Growth Goes Hyperbolic
What were some of the top keywords in EMNLP papers?
Barbara Plank on the rise of cross-lingual NLP papers #deeplo19 #emnlp2019 pic.twitter.com/2W724R1mvl
— Stephen Mayhew (@mayhewsw) November 3, 2019
To help the community quickly catch up on the work presented in this conference, Paper Digest Team…
EMNLP 2019
Stephen Mayhew et al. were live-tweeting during the conference (thank you) and sharing all the action. Here are a few threads that caught our eye:
1. Chris Manning discusses the GQA dataset, which takes natural language questions generated from scene graphs (based on the Visual Genome project), and reports new leaderboard results from his neural state machine paper, to be presented at NeurIPS next month. Full thread and link to GQA below:
Question Answering on Image Scene Graphs…
Manning: Lehnert (1977) says that NLU can be measured by asking questions. #conll2019 #emnlp2019 Manning: SQuAD leaderboard…
2. The Allen Institute’s Matt Gardner shares his slides on the limitations of the reading comprehension task in NLP. He proposes an open reading benchmark that can evaluate multiple reading comprehension problems (e.g. Sentence-level linguistic structure, Discrete Reasoning Over Paragraphs, Question-based coreference resolution, Reasoning Over Paragraph Effects in Situations, time, grounding, and others) all at once. Slides:
3. If you haven’t heard of GNNs (Graph Neural Networks), you should get familiar. Below is the presentation Graph Neural Networks for Natural Language Processing by Shikhar Vashishth, Naganand Yadati and Partha Talukdar. Warning: it’s 315 slides long. Great work! For a quick taste of the core idea, see the sketch after the repo link below.
The repository contains code examples for the GNN-for-NLP tutorial at EMNLP 2019.
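Before you commit to 315 slides, here’s a minimal sketch of the core idea (ours, not from the tutorial): a single graph-convolution layer in the style of Kipf & Welling, where each node updates its representation by aggregating over its neighbors. The toy adjacency matrix and dimensions are illustrative assumptions.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: H = ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(d ** -0.5)           # degree normalization
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(0, A_norm @ X @ W)      # aggregate, transform, ReLU

# Toy 4-node graph (think: a tiny dependency graph over 4 tokens).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))  # 8-dim input features per node (e.g. embeddings)
W = rng.normal(size=(8, 4))  # learned projection (random here for the demo)

print(gcn_layer(A, X, W).shape)  # (4, 4): new 4-dim representation per node
```

Stacking a few such layers lets information propagate along multi-hop paths in the graph, which is the basic trick most of the NLP applications in the tutorial build on.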
4. My prayers were answered when Michael Galkin summarized all the knowledge graph insights from EMNLP. I won’t even bother discussing it since he did such a great job in the column below; part 2 is dropping soon!
The review post of the papers from ACL 2019 on knowledge graphs (KGs) in NLP was well-received so I thought …
New QA Leaderboard Attempting to Mitigate SQuAD Problems
IBM Research introduced TechQA, a new leaderboard for enterprise question answering systems, based on questions posted on IBM DeveloperWorks. A sobering insight we already knew:
“Natural Questions was created by harvesting users’ questions of Google’s search engine and then finding answers by using turkers. When a SQuAD system is tested on the Natural Questions leaderboard the F measure drops dramatically to 6% (on short answers — it is 2% for a SQuAD v1.1 system) illustrating the brittleness of SQuAD trained systems.”
Answering users’ questions in an enterprise domain remains a challenging proposition. Businesses are increasingly …
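For context, the F measure quoted above is the SQuAD-style token-overlap F1 between a predicted and a gold answer. A simplified sketch (the official SQuAD script also normalizes answers by stripping articles and punctuation; this toy version only lowercases and splits):

```python
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    """SQuAD-style token-overlap F1 between predicted and gold answer spans."""
    pred_toks, gold_toks = prediction.lower().split(), gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)  # shared tokens
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(token_f1("november 5 2019", "november 5"))  # 0.8
```

A drop to single-digit F1 on Natural Questions means SQuAD-trained systems barely overlap with the gold answers at all once the question distribution changes.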
GPT-2 Doesn’t Bring Armageddon
OpenAI finally unveiled their 1.5 billion parameter transformer to the world. They also released a detection model for AI-written text, which thinks all my stuff is written by AI. 😂😂
We're releasing the 1.5 billion parameter GPT-2 model as part of our staged release publication strategy.
– GPT-2 output detection model: https://t.co/PX3tbOOOTy
– Research from partners on potential malicious uses: https://t.co/om28yMULL5
– More details: https://t.co/d2JzaENiks pic.twitter.com/O3k28rrE5l
— OpenAI (@OpenAI) November 5, 2019
… 55 minutes later, Adam King, the creator of Talktotransformer.com, put GPT-2 up. 🧐🧐
… 80 minutes later, Hugging Face put it up… 🧐🧐
The 1.5 billion parameter GPT-2 (aka gpt2-xl) is up:
✅ in the transformers repo: https://t.co/KvUK5V7owl
✅ try it out live in Write With Transformer 🦄 https://t.co/R0WHn2WMQt
Coming next:
🔘 Detector model based on RoBERTa
Thanks @OpenAI @Miles_Brundage @jackclarkSF and all
— Hugging Face (@huggingface) November 5, 2019
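If you want to try gpt2-xl yourself, here’s a minimal sketch using a recent version of the transformers library (the prompt and sampling parameters are our own illustrative choices, not anything Hugging Face prescribes):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the released 1.5B checkpoint (several GB; downloads on first use).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl")

# Encode a prompt and sample a continuation.
input_ids = tokenizer.encode("Wow, what a week it was.", return_tensors="pt")
output_ids = model.generate(
    input_ids,
    max_length=60,   # total length, prompt included
    do_sample=True,  # sample instead of greedy decoding
    top_k=40,        # illustrative sampling choice
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```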
Chollet’s New Formulation of Intelligence
François Chollet (of Keras fame) dropped his thesis on defining and measuring intelligence, along with a new eval dataset called ARC (Abstraction and Reasoning Corpus). Apparently he had been working on this for the past two years.
I’ve just released a fairly lengthy paper on defining & measuring intelligence, as well as a new AI evaluation dataset, the “Abstraction and Reasoning Corpus”…
Unsupervised Cross-lingual Representation Learning
Last week we showed a picture we took during Sebastian Ruder’s talk at NYU. In his blog post, he shares some of the slides he used and more:
This post expands on the ACL 2019 tutorial on Unsupervised Cross-lingual Representation Learning…
Compute Growth Goes Hyperbolic
Nothing to see here, move along…
This column is a weekly round-up of NLP news and code drops from researchers worldwide.
Follow us on Twitter for more Code & Demos: @Quantum_Stat