NLP News Cypher | 01.26.20

Big Bad NLP Database Update

And… we’re back! How was your week??

We’re continuing our hard work on the NLP Database! Midweek, we added 21 datasets, and user contributions have helped tremendously. The contributors are mentioned here:

Since the release of the Big Bad NLP Database, we have received overwhelming support from across the globe, including visitors from 98 countries in the past week alone. We still have many more datasets to add, and hopefully we’ll capture more international variants. More updates coming soon!

This Week:

– The Dutch RoBERTa
– Hacking GitHub for Blog Posting
– Serverless, VMs and Containers
– Mellon’s Twitter NLP Library
– A New Dataset for Visual Question Answering
– AI Content Made Simple
– mmmmmBART
– Know Your Hardware
– Dataset of the Week: DailyDialog

The Dutch RoBERTa

Looks like we have a new language-focused transformer, and it’s straight out of the Netherlands. World, meet RobBERT. It achieves SOTA results on several Dutch downstream tasks. You can read more about it here:


GitHub (silhouette head shot, props):


A Dutch language model based on RoBERTa with some tasks specific to Dutch. Read more on our blog post or on the paper…



Hacking GitHub for Blog Posting

A fellow figured out how to embed interactive Jupyter notebooks in GitHub Pages 😎😎. Pretty cool if you want to show your work with a little more twinkle.

Way to Export Notebook as HTML in Jekyll for Blog Posts

Ok, folks, I figured it out. It turns out I don’t have to write any custom exporter or change any templates. Here is how you can embed interactive Jupyter notebooks with Altair in…
