Generative models in NLP

Breakthroughs in AI have resulted in systems that seem to understand natural language far better than even their most recent predecessors. Translating from one language to another has, for example, improved significantly. But there is more. Remarkably powerful language models underpin the ability to generate language, for instance, to finalize an email, create a relevant summary of a complex document, or even produce a full article or marketing text from scratch.

The current pandemic has made it much more important – if not downright essential – for enterprises to shift their business focus to virtual, digital channels. There is a sharp increase in online commerce and consumption of online information across the globe, as enterprises struggle to engage employees and customers alike. A study by the Capgemini Research Institute on conversational commerce clearly shows the benefits of adopting voice assistants and the ascendance of conversational, intent-driven systems in a context that becomes more digital every day.

AI-driven processing, understanding, and, increasingly, generation of natural language are critical components of this evolution.

NLP

Natural language processing (NLP) has been around for some time: computers have achieved human-like performance in many standard natural-language tasks, such as part-of-speech tagging, query expansion, and more.

The introduction of the Transformer neural network architecture in 2017 has affected the field of NLP dramatically. Very large datasets are being used to train models, and powerful models including Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) have become commonplace for typical NLP tasks.

Notably, GPT-3, from the OpenAI research laboratory, has shown compelling improvements in areas such as text summarization and text generation. The surprising fact is that the model has learned language without using any labeled data. Instead, it learns from raw text by predicting the next word, picking up along the way how words are formed and how they relate to other words in the same language (morphology). Using this radically different approach, GPT-3 can translate between languages from only a handful of examples, in some cases rivaling dedicated translation systems, and it thus looks to be highly influential on the future of NLP.
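To make the idea of learning without labeled data more concrete, here is a minimal sketch of a next-word predictor trained on raw text. It is an illustration under stated assumptions, not how GPT-3 works internally: GPT-3 is a very large Transformer network, whereas this toy example uses simple counts of which word follows which. The corpus and all function names are hypothetical. What the two share is the training signal: the "label" is simply the next word in the text itself.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the raw text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(model, start, max_words=5):
    """Greedily extend `start` one predicted word at a time."""
    output = [start.lower()]
    for _ in range(max_words - 1):
        next_word = predict_next(model, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)


# Hypothetical toy corpus; no labels are attached to any sentence.
corpus = (
    "language models predict the next word . "
    "language models generate text . "
    "language models predict the next sentence ."
)
model = train_bigram_model(corpus)
print(generate(model, "language", max_words=4))
```

Scaling this idea up, with a neural network in place of the counts and a vastly larger corpus, is roughly what allows modern language models to produce fluent text.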

A recent publication by the Capgemini Research Institute on customer-centric AI shows that customers’ trust is increasing when they are dealing with AI. This is because the content used to train the AI is created and curated by humans.

However, new language AI systems (particularly those based on the Transformer models mentioned above) are becoming more and more capable of performing natural language generation (NLG). They move towards further understanding the phonology, pragmatics, morphology, syntax, and semantics of languages by training deep learning models with a high-volume corpus and by using clever innovation in network architectures. The result is a set of language-generation capabilities that would have sounded like science fiction only a short while ago.

Illustration

The following text is written by an AI NLG algorithm, provided only with the topic “impact of natural language generation”:

The first thing that we noticed is that it does not take many words to generate the same effect as a human. We also tried to create an algorithm that would generate the same effect as a human, but with more accuracy and with a smaller amount of words.

We quickly found that it is hard to get the same effect with several words.

The solution we found was to use a “system” that generates simple sentences that are likely to be used in a natural language conversation. That system is called a “sentence-generation engine.”

We created an algorithm that predicts the next sentence without reading it. This algorithm has many disadvantages: it is not very fast, and it is hard for an algorithm to understand the context in which a sentence is being generated. But, we found that it is not that difficult to make our system as fast as possible. We used the word-based approach in order to get a good performance. We tried to make it as fast as possible.

We created a system that generates sentences in natural language.

As can be observed, AI is getting quite convincing. A recent, much-publicized article in The Guardian was generated by GPT-3, and eloquently made the case for involving more AI bots in creative areas.

Advanced NLG capabilities potentially have a big impact, for example in the areas of social media and marketing, where AI-generated text could be competing with human-curated text used by companies about their products and services.

Ethics

However, ethical considerations quickly come into play. Having AI systems generate textual content without humans in the loop presents risks, for example because of inherent bias, wrong assumptions, or factual errors in the training data. Currently, when using any search engine, for every 5 pages of content an organization creates, there are 95 pages of content not created by that organization. This 5:95 ratio will become even more skewed once “clickbait” players generate huge amounts of text with AI.

Next-generation natural language AI systems allow companies to implement more helpful chatbots that understand the context and intentions of questions much better and, as a result, provide a much more satisfying user experience. Text summarization allows companies to comprehend vast amounts of data from various sources and to provide their knowledge workers with the most relevant content available. R&D departments are able to follow trends and updates using sentiment tracking and topic modeling. Producers of textual content, whether in legal, life sciences, marketing, or other contexts, find their productivity drastically improved, allowing more time to be spent on creativity and other non-routine activities.
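As a rough illustration of text summarization, the sketch below scores each sentence by how frequent its words are across the document and keeps the top-scoring ones. This frequency-based, extractive method is an assumption chosen for brevity; the Transformer-based systems discussed in this article are far more capable and can write new sentences rather than only selecting existing ones. The stop-word list and all function names are hypothetical.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "for", "it", "on"}


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    frequencies = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average word frequency, so long sentences are not favored unfairly.
        return sum(frequencies[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


document = (
    "Chatbots answer customer questions. "
    "Chatbots help customers and chatbots reduce cost. "
    "The weather was nice."
)
print(summarize(document, max_sentences=1))
```

Here the sentence that mentions the document's dominant topic most often is selected, while the off-topic sentence about the weather is dropped.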

Executive challenges

The challenge that enterprise executives will face is to embrace the new technology without putting their brand name at risk, while also keeping the enterprise's emotional intelligence (“EQ”) at a healthy level. They have to ask key questions about the adoption of AI for natural language tasks, such as:

- How can we augment AI with humans in the loop, or vice versa, to get the best of both?

- With so many quick improvements in natural language models, how can we benefit more from (self-service) conversational systems and improve customer and employee satisfaction?

- What are the most impactful use cases that address our key business objectives?

- How can we ensure compliance with ethical guidelines?

- How will human role descriptions shift as generative AI covers more work previously done by humans?
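One way to think about keeping humans in the loop is a simple review gate: generated drafts are published automatically only when they pass basic checks, and everything else is routed to a person. The sketch below is a hypothetical illustration, not a recommended policy. The confidence score, the threshold, and the list of sensitive terms are all assumptions that an enterprise would need to define for itself.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    text: str
    confidence: float  # hypothetical model-reported score in [0, 1]


# Hypothetical terms that should always trigger a human check.
SENSITIVE_TERMS = {"guarantee", "cure", "lawsuit"}


def needs_human_review(draft, min_confidence=0.8):
    """Flag drafts that are low-confidence or mention sensitive terms."""
    words = {w.strip(".,!?").lower() for w in draft.text.split()}
    return draft.confidence < min_confidence or bool(words & SENSITIVE_TERMS)


def route(drafts, min_confidence=0.8):
    """Split drafts into an auto-publish queue and a human-review queue."""
    auto, review = [], []
    for draft in drafts:
        (review if needs_human_review(draft, min_confidence) else auto).append(draft)
    return auto, review


drafts = [
    Draft("Our new app is available today.", 0.95),
    Draft("We guarantee results!", 0.99),
    Draft("Thanks for asking.", 0.40),
]
auto, review = route(drafts)
print(len(auto), "auto-published;", len(review), "sent to human review")
```

In this example, only the first draft is published without review: the second mentions a sensitive term, and the third falls below the assumed confidence threshold.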

Above all, with language being our prime and preferred way of communicating, the potential impact of generative, natural language AI systems cannot be overstated. Take our word for it.

For more information, please contact the author, who heads up the AI Centre of Excellence in India – driving innovation for Capgemini Insights & Data. A real human will respond!

Data-powered Innovation Takeaways

Deep language: Breakthroughs in applying deep learning – as an alternative to more established approaches – have significantly improved the abilities of AI systems to understand and process natural language.

No loss in translation: Natural language AI systems can thus be applied to build better, more empathetic conversational systems, such as chatbots and voice assistants, and to power more effective language translation applications.

Search you right: Natural language AI systems enable more intelligent, highly personalized search and knowledge provisioning.

Content generator: Natural language generation (NLG) systems now enable the fully automated creation of increasingly passable textual content, from simple tweets to full-fledged documents, brochures, and articles.

Ethical conundrum: NLG systems pose crucial ethical challenges for enterprises, especially as the creation of natural language has always been considered the exclusive forte of humans, who were in full control of the content.

---------------------------------------------------------------------------------------------

Like what you read?

Our latest Data-powered Innovation Review | Wave 1 contains more powerful tech stories like this one, written by Capgemini experts, technology partners, and analysts. Download your copy now: https://bit.ly/2Mc71jp
