NLP News Cypher | 11.03.19

This Week’s Content

-Google Creates BERT, Google Adopts BERT

-Summarization Meets Fact Checking

-Batch Inference vs. Online Inference

-Summary of Machine Learning Evaluation Metrics

-Goal-Oriented Dialogue + Knowledge Base

-Chitchat Dialogue is Hard!

Quick note: Today the EMNLP conference gets underway and runs through Nov. 7th. Quantum Stat will be dishing out research news and other highlights from top NLP researchers on our Twitter feed HERE. Personally, I’m excited to see what comes out of the recent trend of coupling language models with knowledge graphs, and further advancements in distilling large transformers.

Google Creates BERT, Google Adopts BERT

BERT, Google’s transformer model, is now being executed on its creator’s search engine, which may impact 10% of all queries. BERT will be utilized to better serve longer search queries that require more contextual understanding of natural language. Wonder how this will impact the way users conduct search going forward? If Google sees an uptick in question answering, BERT is a win for search. Time will tell.

Google brings in BERT to improve its search results

Google today announced one of the biggest updates to its search algorithm in recent years.

Summarization Meets Fact Checking

In this @Salesforce Research paper, Richard Socher introduces a weakly-supervised model for fact-checking summarized sentences against the source corpus.

The chart below shows the fragility of summarization and how it doesn’t take much to change the semantics of a transformed sentence.
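The underlying task framing can be sketched as sentence-pair classification: given a source document and a summary sentence, decide whether the claim is consistent with the source. The snippet below is only a crude word-overlap heuristic to illustrate that framing, not the paper’s weakly-supervised model; the threshold and stop-word list are arbitrary assumptions.

```python
# Toy sketch of factual-consistency checking as sentence-pair
# classification. NOT the Salesforce model -- a word-overlap heuristic
# used only to show the input/output shape of the task.

def consistency_score(source: str, claim: str) -> float:
    """Fraction of the claim's content words that appear in the source."""
    stop = {"the", "a", "an", "is", "was", "in", "on", "of", "to"}
    src_words = set(source.lower().split())
    claim_words = [w for w in claim.lower().split() if w not in stop]
    if not claim_words:
        return 0.0
    return sum(w in src_words for w in claim_words) / len(claim_words)

def label(source: str, claim: str, threshold: float = 0.8) -> str:
    # Threshold is an arbitrary illustrative choice.
    return "CONSISTENT" if consistency_score(source, claim) >= threshold else "INCONSISTENT"

source = "the company reported revenue of 10 million dollars in 2019"
print(label(source, "revenue was 10 million dollars"))  # CONSISTENT
print(label(source, "revenue was 20 million dollars"))  # INCONSISTENT
```

Note how flipping a single number in the summary flips the label, which is exactly the fragility the chart below illustrates.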

Batch Inference vs. Online Inference

When shipping your next machine learning model, you must understand the consequences of choosing batch (static) inference vs. online (dynamic) inference. The latter is harder: you must maintain high-quality predictions for your users in real time. So when do you choose one over the other?

“If the predictions do not need to be served immediately, you may opt for the simplicity of batch inference. If predictions need to be served on an individual basis and within the time of a single web request, online inference is the way to go.”
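The distinction above can be sketched in a few lines. The model, feature values, and function names below are illustrative assumptions, not from the article; the point is only the serving pattern around the same `predict` call.

```python
# Hedged sketch contrasting the two serving patterns around one model.

def predict(features):
    """Stand-in for a trained model's prediction."""
    return 1 if sum(features) > 1.0 else 0

# --- Batch (static) inference: score a whole table on a schedule ---
def batch_inference(rows):
    # In practice: read rows from a warehouse, score them all at once,
    # and write the predictions back for later lookup.
    return [predict(r) for r in rows]

# --- Online (dynamic) inference: score one request at a time ---
def online_inference(request_features):
    # In practice: this sits behind a web endpoint and must respond
    # within a single request's latency budget.
    return predict(request_features)

nightly = batch_inference([[0.2, 0.3], [0.9, 0.8]])  # [0, 1]
live = online_inference([0.9, 0.8])                   # 1
print(nightly, live)
```

Same model, very different operational demands: the batch path tolerates latency, while the online path must answer inside a single request.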

Learn more here:

Batch Inference vs Online Inference

Batch Inference vs Online Inference Introduction You’ve spent the last few weeks training a new machine learning model. After working with the product team to define the business objectives

Summary of Machine Learning Evaluation Metrics

This article from FloydHub discusses the most popular machine learning metrics for model evaluation. Below are the metrics in question:

  • Confusion Matrix
  • Accuracy
  • Precision
  • Recall
  • Precision-Recall Curve
  • F1-Score
  • Area Under the Curve (AUC)
A Pirate’s Guide to Accuracy, Precision, Recall, and Other Scores.

Whether you’re inventing a new classification algorithm or investigating the efficacy of a new drug, getting results is not the end of the process.
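As a quick refresher, most of the listed metrics fall out of the four confusion-matrix counts. Below is a minimal, library-free sketch; the label vectors are made-up examples, not from the FloydHub article.

```python
# Compute confusion-matrix counts and the derived metrics by hand.
# y_true / y_pred are illustrative binary labels.

def confusion(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)          # of predicted positives, how many were right
recall = tp / (tp + fn)             # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn)               # 3 3 1 1
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

The precision-recall curve and AUC extend the same counts across every classification threshold rather than a single one.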

Goal-Oriented Dialogue + Knowledge Base Research

Meet the Neural Assistant: the AI takes a user utterance alongside knowledge base triples to help generate a KB-guided response from the assistant. Below is an example conversation for restaurant search:

User: “Find me an inexpensive Italian restaurant in San Francisco”

(KB Triple: The Great Italian, cuisine, Italian)

Agent Response: “How about The Great Italian?”

At the moment, knowledge bases with more than 2,000 triples negatively impact the AI’s performance.

“The model is able to incorporate external knowledge effectively as long as the KB size is 2000 triples or smaller.”
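The grounding idea in the example above can be sketched as a lookup: retrieve a (subject, relation, object) triple that matches the utterance and use it to fill a response template. To be clear, the actual Neural Assistant attends over triples inside a neural model; this hand-written lookup, with its toy KB, is only an illustration of the data flow.

```python
# Toy sketch of KB-triple-grounded response generation.
# The KB entries and matching rule are illustrative assumptions.

KB = [
    ("The Great Italian", "cuisine", "Italian"),
    ("Sushi Ko", "cuisine", "Japanese"),
]

def respond(utterance: str) -> str:
    # Match the requested cuisine against the triple's object slot,
    # then slot the subject into a response template.
    for subject, relation, obj in KB:
        if relation == "cuisine" and obj.lower() in utterance.lower():
            return f"How about {subject}?"
    return "Sorry, I couldn't find a match."

print(respond("Find me an inexpensive Italian restaurant in San Francisco"))
# How about The Great Italian?
```

The paper's reported 2,000-triple ceiling makes sense in this light: the neural model must softly "look up" the right triple among all candidates, and that gets harder as the KB grows.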

Chitchat Dialogue is Hard!

Rasa, which open-sources its own chatbot framework, recently released the video archive of talks from its recent dev conference. Below, Nouha Dziri from Google AI shares the difficulty of benchmarking dialogue quality and offers options for workarounds. Stanford & Facebook AI’s work, which we recently deployed, was also discussed!

Finally, this past Friday we got a chance to check out Sebastian Ruder’s (DeepMind) presentation at NYU.