New Architectures are Driving Progress in Natural Language Processing
TCS Digital Software & Solutions
DS&S is a Strategic Growth Business within TCS, helping large businesses navigate critical digital transformations.
J. R. Firth, the famous English linguist, is known for the succinct quote "You shall know a word by the company it keeps," which describes one of the fundamental properties of language: context-dependence. His point is that the meaning of an individual word cannot be interpreted in isolation; it is highly dependent on the other words in a given sentence. Context-dependence gives language its complexity, nuance, and richness. It also creates some of the most difficult problems in natural language processing. The following two sentences illustrate the importance of context-dependence in communicating accurately:
a) I swam across the river to get to the other bank.
b) I drove across the street to the bank.
It is obvious that the word "bank" in the above two examples has different meanings. In sentence a), "bank" refers to the land alongside a river and is tied to the more distant words "swam" and "river," yet is less dependent on words in closer proximity such as "other." In sentence b), "bank" refers to a financial institution and is associated with the preceding words "drove" and "street."
These examples illustrate two of the most difficult problems in natural language processing: 1) capturing the influence of neighboring words, and 2) long-term memorization of neighboring words in both directions (before and after) a particular word. In practice the number of words in a sentence varies widely, but on average a sentence contains 20-25 words. As a result, building a context-aware mechanism for correctly deciphering the meanings and nuances of words in a sentence requires long-term memorization of the neighboring words, which poses significant technical challenges.
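This context-dependence can be seen empirically. The short sketch below (our own illustration, not from the original discussion) feeds the two example sentences through a pretrained bidirectional language model and compares the resulting vectors for the word "bank." It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint, both chosen here purely for demonstration.

# Compare contextual vectors for "bank" in the two example sentences.
# Assumes: pip install torch transformers (bert-base-uncased is illustrative).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]    # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("bank")]                  # vector for "bank"

v_river = bank_vector("I swam across the river to get to the other bank")
v_money = bank_vector("I drove across the street to the bank")
similarity = torch.cosine_similarity(v_river, v_money, dim=0)
print(f"cosine similarity of the two 'bank' vectors: {similarity:.3f}")

A static word embedding would assign "bank" the identical vector in both sentences; a context-aware model produces measurably different ones.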
Extensive research has been conducted on context-awareness, much of it based on recurrent neural networks (RNNs), a well-known neural network architecture for natural language processing. However, RNNs suffer from two disadvantages: 1) lack of long-term memorization, which impairs language processing effectiveness; and 2) sequential processing, which precludes parallel computation during model training. The toy sketch below makes both limitations concrete.
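The following NumPy sketch of a simple Elman-style RNN cell (dimensions are arbitrary, chosen only for illustration) shows the sequential dependency explicitly: each hidden state needs the previous one, and everything the network remembers about the sentence must fit into a single fixed-size vector.

# Toy Elman-style RNN step: step t cannot start until step t-1 finishes,
# and the whole sentence is squeezed into one fixed-size hidden state.
import numpy as np

rng = np.random.default_rng(0)
T, d_in, d_h = 20, 16, 32                     # ~20-word sentence, toy dimensions
W_x = rng.normal(scale=0.1, size=(d_h, d_in))
W_h = rng.normal(scale=0.1, size=(d_h, d_h))
x = rng.normal(size=(T, d_in))                # one sentence of word embeddings

h = np.zeros(d_h)
for t in range(T):                            # strictly sequential loop
    h = np.tanh(W_x @ x[t] + W_h @ h)         # h_t depends on h_{t-1}
print(h.shape)                                # (32,) -- the sentence's entire "memory"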
In 2017, Vaswani et al. [1] introduced a novel neural network architecture called the "Transformer," shown in Figure 1. This architecture is revolutionizing natural language processing. The three key innovative aspects of the Transformer architecture are:
1) A self-attention mechanism that relates every word directly to every other word in the sequence, in both directions and regardless of distance;
2) Positional encodings that inject word-order information without recurrence;
3) Fully parallel computation over all positions, removing the sequential training bottleneck of RNNs.
A minimal sketch of the first of these, scaled dot-product self-attention, appears below.
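The sketch is a bare-bones NumPy rendering of the core operation described in [1]; the dimensions and random inputs are placeholders. Because the pairwise scores come from a single matrix product, all positions are processed at once, and a word attends to neighbors before and after it alike.

# Scaled dot-product self-attention, the heart of the Transformer.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Every position attends to every other position in one matrix multiply;
    # there is no sequential dependency, so positions are processed in parallel.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (T, T) pairwise affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # context-mixed representations

rng = np.random.default_rng(0)
T, d = 20, 64                                         # a 20-token sentence, toy dims
X = rng.normal(size=(T, d))
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
out = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(out.shape)                                      # (20, 64): each token now carries context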
The Transformer architecture has seen great success in a variety of machine learning language tasks, including machine translation, question-answering, text generation, and chatbots. It has outperformed many previously reported language models on industry benchmarks. As a result, the once-dominant RNN architecture in natural language processing is gradually giving way to the Transformer architecture.
Figure 1: High-level Transformer Neural Network Architecture
The Transformer architecture has proved invaluable not only in natural language processing but also in computer vision. Dosovitskiy et al. [2] developed a Transformer-based vision network called ViT. This network splits an image into 16x16 patches and generates a sequence of linear image-patch embeddings as input to a Transformer. Image patches are treated the same way as word tokens in natural language processing, as shown in Figure 2.
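The patch-to-token step is easy to sketch. Below is a NumPy illustration (our own, with an arbitrary 64-dimensional projection): a 224x224 RGB image becomes a sequence of 196 flattened 16x16 patches, each projected linearly, which is exactly the shape of input a Transformer expects. ViT additionally prepends a class token and adds position embeddings, omitted here for brevity.

# Turn an image into a sequence of patch tokens, as in ViT [2].
import numpy as np

def image_to_patch_tokens(img, patch=16, d_model=64, seed=0):
    # Split (H, W, C) into non-overlapping patch x patch tiles, flatten each
    # tile, and project it linearly: one "token" per patch.
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0
    tiles = img.reshape(H // patch, patch, W // patch, patch, C).swapaxes(1, 2)
    seq = tiles.reshape(-1, patch * patch * C)         # (num_patches, p*p*C)
    E = np.random.default_rng(seed).normal(scale=0.02,
                                           size=(patch * patch * C, d_model))
    return seq @ E                                     # (num_patches, d_model)

tokens = image_to_patch_tokens(np.random.rand(224, 224, 3))
print(tokens.shape)                                    # (196, 64): 14 x 14 patch tokens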
When pre-trained on a large dataset, ViT approaches or beats the performance of state-of-the-art convolutional neural networks (CNNs) on multiple image recognition benchmarks. Just as RNNs are giving way in natural language processing, the CNN architecture in computer vision may gradually be replaced by the Transformer architecture. As unlikely as it may seem, these two fields, natural language processing and computer vision, have a fascinating relationship, one that certainly underscores the prophetic quote by J. R. Firth.
Figure 2: Image Patch and Position Embedding of Transformer
TCS Digital Software & Solutions (DS&S) is the team behind TCS Customer Intelligence & Insights™ and TCS Intelligent Urban Exchange™, AI-driven CX and sustainability analytics software. Its customers are increasingly finding value in both computer vision and language modeling. Transformers increase this value and open new possibilities for enterprises. The R&D team is further pushing cutting-edge developments in Transformer technology to supply vision and language modeling offerings that anticipate, meet, and exceed customer needs. The team is always interested to hear how its verticalized analytics solutions in retail, finance, sustainability, smart cities, and more can better meet your needs. Schedule a meeting with our product teams and help drive the roadmap.
References:
1. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, "Attention Is All You Need," Proceedings of the 31st International Conference on Neural Information Processing Systems, December 2017, pp. 4768-4777
2. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, N. Houlsby, "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale," Proceedings of the 9th International Conference on Learning Representations, 2021
Learn More:
Visit the https://www.tcs.com/what-we-do/products-platforms/tcs-intelligent-urban-exchange page on https://www.tcs.com
Email Us: [email protected]
About the Authors:
Dr. Arup Acharya, PhD
Arup is the Head of Research & Innovation at TCS Digital Software and Solutions and leads both the Architecture & Design and Research teams in DS&S. He received his PhD from Rutgers University and his B.Tech from IIT Kharagpur, both in Computer Science. He has 40+ issued patents and is well published in conferences & journals on leading-edge technology topics. Prior to TCS, Arup worked at IBM Research and NEC Research.
Dr. Yibei Ling, PhD
Yibei Ling is a Senior Data Scientist at TCS Digital Software and Solutions and works on energy-aware, no-code AutoML frameworks and machine learning models for sentiment analysis, face recognition, and time-series analysis. Prior to joining TCS, Yibei was with the research labs at Bellcore (Telcordia), working on DARPA projects including sensor and distributed networks. He has published more than 30 papers in IEEE and ACM Transactions covering fault-tolerant and distributed computing and network security. Yibei has been granted 21 US patents and is a senior IEEE member and a reviewer for Mathematical Reviews and IEEE Transactions publications.
Dr. Guillermo Rangel, PhD
Guillermo is a Senior Data Scientist at TCS Digital Software and Solutions with expertise in text analytics built over a decade of working on natural language modeling (NLP/G/U). Guillermo previously worked in verticals such as banking, retail, telco, and gaming, serving as a solution advisor and data science consultant for companies including Bloomberg LP, Vodafone, Blizzard Entertainment, and The Home Depot, among others. Guillermo holds a PhD in Physics from the University of California, Davis.