Intelligent Agents: Machine Reading Comprehension
On the 5th of January this year, an AI model outperformed humans at reading comprehension for the first time. The SLQA+ (ensemble) model from Alibaba recorded an Exact Match (EM) score of 82.44 against the human score of 82.304 on the SQuAD dataset.
It turns out that Microsoft's r-net+ (ensemble) model had achieved 82.650 two days prior, and since then two other models have also gone on to beat the human EM score. While none of the models have yet beaten the human F1 score (which balances precision and recall) of 91.21, these events underline the frantic pace at which RC models are evolving. That is great news, because Reading Comprehension (RC) is a key element of intelligent agent systems.
Intelligent Agents & Machine Reading Comprehension
Building intelligent agents that can answer open-domain (or even closed-domain) questions with high accuracy has been a key goal of most AI labs. Intelligent agents with RC and Question-Answering (QA) abilities can help AI personal assistants like Alexa, Google Assistant, Siri and Cortana perform better, and can help enterprises supplement human agents with intelligent agent bots that directly process chat & messaging traffic, and maybe even voice to some extent.
Machine Comprehension / Machine Reading Comprehension / Machine Reading models enable computers to read a document and answer general questions about it. While this is a relatively elementary task for a human, it's not that straightforward for AI models. There are multiple NTM (Neural Turing Machine), Memory Network and Attention models available for Reading Comprehension. The list of SQuAD models can be accessed here.
As a first step towards building our intelligent agent system (humanly.ai), we are also building a Machine Reading system. Our implementation is based on the BiDAF (Bi-Directional Attention Flow) ensemble model combined with Textual Entailment. It's still work in progress (EM 67%, F1 77%), and it sometimes gives funny answers, but you can try it out here.
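For readers new to BiDAF, here is a minimal NumPy sketch of its attention-flow layer, which computes attention in both directions (context-to-query and query-to-context) and fuses them into a query-aware representation of each context word. The dot-product similarity below is a stand-in for the trainable similarity function the actual model learns over LSTM-encoded vectors, so treat it as illustrative rather than our implementation.

```python
# Minimal, illustrative sketch of BiDAF's attention-flow layer (not our production code).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U):
    """H: context encodings (T, d); U: query encodings (J, d)."""
    S = H @ U.T                                  # similarity matrix (T, J); BiDAF learns a trainable form
    a = softmax(S, axis=1)                       # context-to-query attention weights
    U_tilde = a @ U                              # attended query vector per context word (T, d)
    b = softmax(S.max(axis=1))                   # query-to-context attention over context words (T,)
    h_tilde = np.tile(b @ H, (H.shape[0], 1))    # attended context vector, tiled to (T, d)
    # Fuse context with both attention directions, as in the BiDAF paper
    G = np.concatenate([H, U_tilde, H * U_tilde, H * h_tilde], axis=1)
    return G                                     # (T, 4d), fed to the modelling layer

# toy usage
T, J, d = 5, 3, 4
G = bidaf_attention(np.random.randn(T, d), np.random.randn(J, d))
print(G.shape)  # (5, 16)
```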
Key Challenges
One of the basic challenges we faced was handling questions that require a yes/no answer (which calls for a further inference between the question, the answer and the document) - hence the implementation of the Textual Entailment module. The other observation was the need to respond in full sentences ("Yes, Narendra Modi is the Prime Minister of India" rather than just "Yes" to the question "Is Narendra Modi the Prime Minister of India?"); as the next product increment, we are planning to implement a Seq2Seq model to format our responses.
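As a toy illustration of that planned flow (the helpers below are hypothetical, not our production pipeline; in the eventual design a Seq2Seq model would generate the sentence rather than a template):

```python
# Detect a yes/no question, take the verdict from the entailment module,
# and phrase a full-sentence reply instead of a bare "Yes"/"No".
YES_NO_STARTERS = ("is", "are", "was", "were", "do", "does", "did", "can", "has", "have")

def is_yes_no_question(question: str) -> bool:
    """Crude check: yes/no questions usually start with an auxiliary verb."""
    return question.lower().split()[0] in YES_NO_STARTERS

def format_answer(entailed: bool, evidence: str) -> str:
    """Compose a full-sentence reply; here we simply reuse the supporting evidence."""
    return f"{'Yes' if entailed else 'No'}, {evidence}"

question = "Is Narendra Modi the Prime Minister of India?"
if is_yes_no_question(question):
    # `entailed` would come from the Textual Entailment module in practice
    print(format_answer(entailed=True, evidence="Narendra Modi is the Prime Minister of India."))
    # -> "Yes, Narendra Modi is the Prime Minister of India."
```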
But one major challenge that all Machine Reading systems face, especially in practical implementations for specific domains or verticals, is the absence of supervised learning data (labelled data) for that domain. All contemporary Reading Comprehension models are built on supervised training data: labelled questions paired with answers, or with the paragraph containing the answer. So when it comes to new domains, while enterprises have artefacts & data, the absence of labelled data presents a challenge.
We are currently experimenting with an ensemble of Machine Reading Comprehension models, each trained on a specific dataset, so that the learning is incremental. While the scores for the ensemble are improving, the need for labelled domain data to train the MRC model in the first place still persists. Towards this problem, I came across two very neat solutions from Microsoft that attempt domain transfer - SynNet and ReasoNet - which we intend to explore further.
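As a rough illustration of the ensembling step, the snippet below combines each model's predicted answer with a simple confidence-weighted vote; the `(answer, confidence)` interface is hypothetical, and real ensembles are considerably more involved.

```python
# Combine span predictions from several MRC models with a confidence-weighted vote.
from collections import defaultdict

def ensemble_answer(predictions):
    """predictions: list of (answer_text, confidence) pairs, one per model."""
    scores = defaultdict(float)
    for answer, confidence in predictions:
        scores[answer.strip().lower()] += confidence   # pool identical answers
    return max(scores.items(), key=lambda kv: kv[1])   # best (answer, pooled score)

print(ensemble_answer([("Paris", 0.81), ("paris", 0.74), ("Lyon", 0.42)]))
# 'paris' wins with a pooled confidence of ~1.55
```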
The 'Two-Stage Synthesis Networks' or SynNet model is first trained on supervised data for a given vertical, where it learns to identify critical information likely to be an answer (named entities, knowledge points, etc.) and then to generate questions around those answers. Once trained, it can generate pseudo question & answer pairs against artefacts from a new domain, which can then be used to train an MRC model on that domain.
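To make the two-stage idea concrete, here is a schematic sketch of the data-generation loop; `answer_tagger` and `question_generator` are hypothetical placeholders for the two trained networks described in the paper, not the actual Microsoft implementation.

```python
# Schematic of SynNet's two-stage synthesis pipeline (interfaces are hypothetical).
def synthesize_training_data(passages, answer_tagger, question_generator):
    """Stage 1: tag likely answer spans; Stage 2: generate questions for them."""
    synthetic_qa = []
    for passage in passages:
        for answer_span in answer_tagger(passage):          # IOB-style span tagging
            question = question_generator(passage, answer_span)
            synthetic_qa.append({"context": passage,
                                 "question": question,
                                 "answer": answer_span})
    return synthetic_qa   # pseudo (question, answer) pairs to fine-tune an MRC model
```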
The Reasoning Network, or ReasoNet, essentially uses reinforcement learning to dynamically figure out when it has gathered enough information to answer a question and should stop reading. This is a departure from the usual approach of using a fixed number of turns while inferring the relationship between the question, the artefacts and the answer. It has also performed exceptionally well on the SQuAD dataset.
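A simplified sketch of that dynamic-termination loop is below; the real ReasoNet learns the termination gate jointly with the reader via reinforcement learning, so treat this purely as a schematic of the control flow.

```python
# Schematic of ReasoNet-style inference with a learned stopping decision.
def reasonet_inference(state, memory, step_fn, termination_gate, answer_head, max_turns=10):
    """Keep attending over the memory until the gate decides enough has been read."""
    for _ in range(max_turns):
        if termination_gate(state) > 0.5:   # learned stop probability
            break
        state = step_fn(state, memory)      # one more reasoning turn over the passage
    return answer_head(state)               # predict the answer from the final state
```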
We shall overcome
As various models continue to emerge, it's a reasonable guess that sooner rather than later (especially catalysed by the availability of so many datasets that are themselves growing rapidly; MS MARCO V2 becomes available on 01/03/2018, by the way) Machine Comprehension models will be able to overcome these key challenges and get us closer to the goal of intelligent agents that can be trained on standard documents and answer general questions, as humans do (which also happens to be the byline for humanly.ai, btw :P).
I do hope you were able to look past the blatant plugs for humanly.ai :P and found the post useful in gaining a basic understanding of Machine Comprehension. As always, do leave your comments & thoughts, including any aspects I might have missed; I will be more than happy to incorporate them.
Disclaimers: The above post in no way claims any copyright to any of the images or literature presented.