NLP News Cypher | 11.03.19

This Week’s Content

-Google Creates BERT, Google Adopts BERT

-Summarization Meets Fact Checking

-Batch Inference vs. Online Inference

-Summary of Machine Learning Evaluation Metrics

-Goal-Oriented Dialogue + Knowledge Base

-Chitchat Dialogue is Hard!

Quick note: Today the EMNLP conference gets underway and runs through Nov. 7th. Quantum Stat will be dishing out research news and other highlights from top NLP researchers on our Twitter feed HERE. Personally, I’m excited to see what comes out of the recent trend of coupling language models with knowledge graphs, and further advancements in distilling large transformers.

Google Creates BERT, Google Adopts BERT

BERT, Google’s transformer model, is now being executed on its creator’s search engine, which may impact 10% of all queries. BERT will be utilized to better serve longer search queries that require more contextual understanding of natural language. Wonder how this will impact the way users conduct search going forward? If Google sees an uptick in question answering, BERT is a win for search. Time will tell.

Google brings in BERT to improve its search results

Google today announced one of the biggest updates to its search algorithm in recent years.

Summarization Meets Fact Checking

In this @Salesforce Research paper, Richard Socher introduces a weakly-supervised model for fact-checking summarized sentences against the source corpus.

The chart below shows the fragility of summarization and how it doesn’t take much to change the semantics of a transformed sentence.
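The underlying task framing can be sketched as sentence-pair classification: given a source document and a summary sentence, decide whether the claim is consistent with the source. The snippet below is only a crude word-overlap heuristic to illustrate that framing, not the paper’s weakly-supervised model; the threshold and stop-word list are arbitrary assumptions.

```python
# Toy sketch of factual-consistency checking as sentence-pair
# classification. NOT the Salesforce model -- a word-overlap heuristic
# used only to show the input/output shape of the task.

def consistency_score(source: str, claim: str) -> float:
    """Fraction of the claim's content words that appear in the source."""
    stop = {"the", "a", "an", "is", "was", "in", "on", "of", "to"}
    src_words = set(source.lower().split())
    claim_words = [w for w in claim.lower().split() if w not in stop]
    if not claim_words:
        return 0.0
    return sum(w in src_words for w in claim_words) / len(claim_words)

def label(source: str, claim: str, threshold: float = 0.8) -> str:
    # Threshold is an arbitrary illustrative choice.
    return "CONSISTENT" if consistency_score(source, claim) >= threshold else "INCONSISTENT"

source = "the company reported revenue of 10 million dollars in 2019"
print(label(source, "revenue was 10 million dollars"))  # CONSISTENT
print(label(source, "revenue was 20 million dollars"))  # INCONSISTENT
```

Note how flipping a single number in the summary flips the label, which is exactly the fragility the chart below illustrates.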

Batch Inference vs. Online Inference

When shipping your next machine learning model, you must understand the consequences of choosing batch (static) inference vs. online (dynamic) inference. The latter is harder: you must maintain high-quality predictions for your users in real time. So when do you choose one over the other?

“If the predictions do not need to be served immediately, you may opt for the simplicity of batch inference. If predictions need to be served on an individual basis and within the time of a single web request, online inference is the way to go.”
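The distinction above can be sketched in a few lines. The model, feature values, and function names below are illustrative assumptions, not from the article; the point is only the serving pattern around the same `predict` call.

```python
# Hedged sketch contrasting the two serving patterns around one model.

def predict(features):
    """Stand-in for a trained model's prediction."""
    return 1 if sum(features) > 1.0 else 0

# --- Batch (static) inference: score a whole table on a schedule ---
def batch_inference(rows):
    # In practice: read rows from a warehouse, score them all at once,
    # and write the predictions back for later lookup.
    return [predict(r) for r in rows]

# --- Online (dynamic) inference: score one request at a time ---
def online_inference(request_features):
    # In practice: this sits behind a web endpoint and must respond
    # within a single request's latency budget.
    return predict(request_features)

nightly = batch_inference([[0.2, 0.3], [0.9, 0.8]])  # [0, 1]
live = online_inference([0.9, 0.8])                   # 1
print(nightly, live)
```

Same model, very different operational demands: the batch path tolerates latency, while the online path must answer inside a single request.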

Learn more here:

Batch Inference vs Online Inference

Batch Inference vs Online Inference Introduction You’ve spent the last few weeks training a new machine learning model. After working with the product team to define the business objectives

Summary of Machine Learning Evaluation Metrics

This article from FloydHub discusses the most popular machine learning metrics for model evaluation. Below are the metrics in question:

  • Confusion Matrix
  • Accuracy
  • Precision
  • Recall
  • Precision-Recall Curve
  • F1-Score
  • Area Under the Curve (AUC)
A Pirate’s Guide to Accuracy, Precision, Recall, and Other Scores.

Whether you’re inventing a new classification algorithm or investigating the efficacy of a new drug, getting results is not the end of the process.
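As a quick refresher, most of the listed metrics fall out of the four confusion-matrix counts. Below is a minimal, library-free sketch; the label vectors are made-up examples, not from the FloydHub article.

```python
# Compute confusion-matrix counts and the derived metrics by hand.
# y_true / y_pred are illustrative binary labels.

def confusion(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)          # of predicted positives, how many were right
recall = tp / (tp + fn)             # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn)               # 3 3 1 1
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

The precision-recall curve and AUC extend the same counts across every classification threshold rather than a single one.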

Goal-Oriented Dialogue + Knowledge Base Research

Meet the Neural Assistant: the AI takes a user utterance alongside knowledge base triples to help generate a KB-guided response from the assistant. Below is an example conversation for restaurant search:

User: “Find me an inexpensive Italian restaurant in San Francisco”

(KB Triple: The Great Italian, cuisine, Italian)

Agent Response: “How about The Great Italian?”

At the moment, knowledge bases with more than 2,000 triples negatively impact the AI’s performance.

“The model is able to incorporate external knowledge effectively as long as the KB size is 2000 triples or smaller.”
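The grounding idea in the example above can be sketched as a lookup: retrieve a (subject, relation, object) triple that matches the utterance and use it to fill a response template. To be clear, the actual Neural Assistant attends over triples inside a neural model; this hand-written lookup, with its toy KB, is only an illustration of the data flow.

```python
# Toy sketch of KB-triple-grounded response generation.
# The KB entries and matching rule are illustrative assumptions.

KB = [
    ("The Great Italian", "cuisine", "Italian"),
    ("Sushi Ko", "cuisine", "Japanese"),
]

def respond(utterance: str) -> str:
    # Match the requested cuisine against the triple's object slot,
    # then slot the subject into a response template.
    for subject, relation, obj in KB:
        if relation == "cuisine" and obj.lower() in utterance.lower():
            return f"How about {subject}?"
    return "Sorry, I couldn't find a match."

print(respond("Find me an inexpensive Italian restaurant in San Francisco"))
# How about The Great Italian?
```

The paper's reported 2,000-triple ceiling makes sense in this light: the neural model must softly "look up" the right triple among all candidates, and that gets harder as the KB grows.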

Chitchat Dialogue is Hard!

Rasa, which open-sources its own chatbot framework, recently released the video archive of talks from its recent dev conference. Below, Nouha Dziri from Google AI shares the difficulty of benchmarking dialogue quality and offers options for workarounds. Stanford & Facebook AI’s work, which we recently deployed, was also discussed!

Finally, this past Friday we got a chance to check out Sebastian Ruder’s (DeepMind) presentation at NYU.