It Rages that Way
Onward, we go. In last week’s column, I posed a question about the problem of complexity in sentiment analysis, using this example:
Gold is up 6% in the pre-market as the downward pressure of the coronavirus outbreak weighs on equity stocks.
The consequence of dealing with complex systems, especially in an example like this, is that a generalized approach is difficult: it is very hard to reduce this complexity down to a single sentiment vector. But if we go bottom-up and instead localize the problem to the client, we see light at the end of the tunnel. To know the ground truth, we must localize sentiment to the holdings of the user. If the user holds gold, the statement is bullish; if the client is betting against gold, it’s bearish; if there is no position, it’s neutral. In other words, for this example, personalization isn’t just a marketing gimmick, it’s a functional requirement. There is no perfect solution, and every domain will require its own local rules for ground-truth interpretation.
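That localization rule is simple enough to sketch in code. This is a minimal, hypothetical illustration: the function name, the `direction` argument, and the shape of the holdings data are assumptions for this sketch, not a real API.

```python
# Hypothetical sketch of position-localized sentiment labeling.
# The function name and the holdings format are illustrative assumptions.

def localize_sentiment(asset: str, direction: str, holdings: dict) -> str:
    """Map a market move on `asset` to a client-specific label.

    direction: "up" or "down" for the asset mentioned in the headline.
    holdings:  maps asset name -> signed position size
               (positive = long, negative = short, absent or 0 = no position).
    """
    position = holdings.get(asset, 0)
    if position == 0:
        return "neutral"  # no exposure, so the headline is a neutral statement
    # A long position benefits from an upward move; a short from a downward one.
    benefits = (position > 0) == (direction == "up")
    return "bullish" if benefits else "bearish"

# The gold example: gold is up 6% in the pre-market.
print(localize_sentiment("gold", "up", {"gold": 100}))   # long gold  -> bullish
print(localize_sentiment("gold", "up", {"gold": -50}))   # short gold -> bearish
print(localize_sentiment("gold", "up", {"AAPL": 10}))    # no gold    -> neutral
```

The point of the sketch is that the same headline yields three different ground-truth labels depending on the client, which is exactly why a single global label can't work here.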
Admittedly, this statement was one of the most difficult to analyze and is usually an edge case. But in deep learning, outliers in datasets are exactly how models get smoked.
(There are other complex bottlenecks that we may encounter, n-order logic, ambiguous ground-truth, domain-shift etc. I will discuss these and other factors in an upcoming white paper. Stay tuned!)
How was your week?
BTW, we updated the BBN database, thank you to all contributors!
[UPDATE] Big Bad NLP Database: We've updated the database with 28 new datasets! Thanks again for contributing: @pasini_t, @bouscarrat_leo, @nikita_moghe, @LaxmanSTomar
c/c @seb_ruder #NLProc #Datasets #AI #ArtificialIntelligence #DataScience https://t.co/iDKvXXwZqT
— Quantum Stat (@Quantum_Stat) March 6, 2020
– Walking Wikipedia
– Learning From Unlabeled Data
– A Jiant Among Us
– Hugging Face’s Notebooks
– Composing Semantics
– Transformer Size, Training and Compression
– Graphing Knowledge
– Dataset of the Week: DVQA
The ongoing research in localizing graph structures with transformers continues to march forward. A new paper shows how a model can follow a reasoning path through English Wikipedia to answer multi-hop questions like those found in HotpotQA, and it is designed to work at open-domain scale.
(Last week’s column had a similar paper; looks like open-domain, multi-hop QA is really gaining steam among researchers.✨😎)