This Week’s Content
- T5, Google’s New Transformer
- Facebook’s RoBERTa Distilled by Hugging Face
- Multiprocessing vs. Threading
- Fine-Tuning BERT, a Tutorial
- Microsoft’s UniLM AI Improves Summarization
T5 | The New SOTA Transformer from Google
A new entrant in the transformer school of hard knocks was unveiled by Google yesterday: T5. The model set a new SOTA on the SuperGLUE leaderboard with an overall score of 88.9, just 0.9 points shy of human performance.
The model comes in 5 sizes:
- T5-Small (60 million params)
- T5-Base (220 million params)
- T5-Large (770 million params)
- T5-3B (3 billion params)
- T5-11B (11 billion params)
![](https://miro.medium.com/max/3280/1*tQLr-s_QtEthozDYG4sfig.png)
T5 serves primarily as code for reproducing the experiments in Exploring the Limits of Transfer Learning with a Unified…
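For the curious, here is a minimal sketch of running the model, assuming the `t5-small` checkpoint is available through the Hugging Face transformers library (the official release lives in the repo above):

```python
# Minimal sketch: assumes T5 checkpoints are served via Hugging Face transformers.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 frames every task as text-to-text, so the task is selected with a plain-text prefix.
text = "translate English to German: The house is wonderful."
input_ids = tokenizer(text, return_tensors="pt").input_ids

outputs = model.generate(input_ids, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```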
Facebook AI’s RoBERTa Distilled by Hugging Face
Smaller models are easier to deploy and cost less in cloud compute.
“95% of RoBERTa-base’s performance on GLUE, twice as fast as RoBERTa while being 35% smaller.” — Hugging Face
Below are the results on the GLUE dev sets:
![](https://miro.medium.com/max/1880/1*2-eFWMN_6x3MXcEnueY0FQ.png)
This folder contains the original code used to train Distil* as well as examples showcasing how to use DistilBERT…
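A minimal sketch of loading the distilled checkpoint, assuming the weights are published as `distilroberta-base` on the Hugging Face model hub:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumes the distilled checkpoint is published as "distilroberta-base".
tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModel.from_pretrained("distilroberta-base")

inputs = tokenizer("Distillation keeps most of the accuracy at a fraction of the size.",
                   return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs)[0]  # (batch, sequence_length, hidden_size)
print(hidden_states.shape)
```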
Multiprocessing vs. Threading
Understanding the difference between multiprocessing and threading is important when deploying machine learning models; FloydHub’s new article goes in-depth:
Sooner or later, every data science project faces an inevitable challenge: speed. Working with larger data sets leads…
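The short version: Python threads share a single GIL, so CPU-bound work (tokenization, feature extraction) scales with processes, while I/O-bound work (downloads, API calls) does fine on threads. A toy benchmark (not FloydHub’s code) that makes the difference visible:

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_bound(n):
    # CPU-bound work: held back by the GIL when run in threads.
    return sum(i * i for i in range(n))

def time_executor(executor_cls, n_tasks=4, n=2_000_000):
    start = time.perf_counter()
    with executor_cls(max_workers=n_tasks) as pool:
        list(pool.map(cpu_bound, [n] * n_tasks))
    return time.perf_counter() - start

if __name__ == "__main__":  # guard required for multiprocessing with spawn start method
    print(f"threads:   {time_executor(ThreadPoolExecutor):.2f}s")
    print(f"processes: {time_executor(ProcessPoolExecutor):.2f}s")
```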
Fine-Tuning BERT, a Tutorial
Chris McCormick’s blog shows us how to use Hugging Face’s PyTorch library to fine-tune BERT for sentence classification:
In this tutorial I’ll show you how to use BERT with the huggingface PyTorch library to quickly and efficiently…
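As a taste of what the tutorial walks through, here is a single-batch sketch of the fine-tuning step, assuming a recent transformers release and the `bert-base-uncased` checkpoint; the full tutorial covers data loading, learning-rate scheduling, and evaluation:

```python
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Hypothetical two-example batch for binary sentence classification.
texts = ["a gripping, well-acted thriller", "flat characters and a predictable plot"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # returns loss when labels are provided
outputs.loss.backward()
optimizer.step()
```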
Microsoft’s UniLM AI Improves Summarization
Microsoft’s new model, UniLM, is pre-trained on unidirectional, sequence-to-sequence, and bidirectional prediction, which helps improve performance on several NLP tasks. Code and pre-trained models can be found here:
New: October 1st, 2019, UniLM v1 release. UniLM v1 (September 30th, 2019): the code and pre-trained models for the…
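The three objectives differ only in which positions each token is allowed to attend to. Below is a toy illustration of that masking idea (not the official implementation; the helper name is hypothetical):

```python
import torch

def unilm_style_masks(src_len=3, tgt_len=2):
    """Toy self-attention masks for the three pre-training objectives.
    1 = position may be attended to, 0 = blocked."""
    n = src_len + tgt_len
    bidirectional = torch.ones(n, n)               # every token sees every token (BERT-style)
    unidirectional = torch.tril(torch.ones(n, n))  # each token sees only the left context (GPT-style)
    seq2seq = torch.ones(n, n)
    seq2seq[:src_len, src_len:] = 0                # source tokens cannot peek at the target
    seq2seq[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len))  # target decodes left to right
    return bidirectional, unidirectional, seq2seq

for name, mask in zip(["bidirectional", "unidirectional", "seq2seq"], unilm_style_masks()):
    print(name, mask, sep="\n")
```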
This is a weekly round-up of NLP News and Code drops from Techies worldwide.
Follow us on Twitter for more NLP News, Code & Demos: @Quantum_Stat