Evolution of Language Models and Their Impact on Search

Welcome to this edition of our newsletter, where we explore the world of large language models (LLMs) and their impact on search. In recent years, LLMs have revolutionized natural language processing thanks to their ability to learn from vast amounts of data, making them a game-changer in search technology. So we invited Jayant Kumar to discuss how LLMs are used to improve search accuracy, enhance the user experience, and transform the way we interact with information.

Whether you are a seasoned researcher, a student learning about language models for the first time, or simply curious about how machines process language, we hope this newsletter will provide valuable insights and inspiration. So, let's dive into the world of language models!


About the host:

Manisha Arora is a Data Scientist at Google with 10 years' experience in driving business impact through data-driven decision making. She is currently leading ad measurement and experimentation for Ads across Search, YouTube, Shopping, and Display. She works with top Google advertisers to support their marketing and business objectives through data insights, machine learning, and experimentation solutions.

About the speaker:

Jayant Kumar is an applied ML scientist with 16+ years of experience in the development of AI-based products and services. At Adobe, he is responsible for ML-based search, discovery, and recommendation pipelines in creative products. He has over 25 peer-reviewed papers in Machine Learning and Artificial Intelligence.


A Quick Overview of Language Models:

Language models are computer programs that learn to predict the probability of the next word given the words that came before it. They are a key component of natural language processing (NLP), the field of computer science focused on enabling computers to understand, interpret, and generate human language. Language models are trained on large amounts of text data, such as books, articles, and social media posts, and they learn to predict the likelihood of different words and phrases based on the context in which they appear. These models have come a long way in recent years, with the most advanced models incorporating billions of parameters and demonstrating impressive capabilities in tasks such as language generation, question answering, and more.
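The core idea — estimating the probability of the next word from counts over text — can be sketched in a few lines. The corpus and the `next_word_probs` helper below are illustrative inventions; real LLMs learn these probabilities with neural networks over much longer contexts, but the training objective is the same.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "large amounts of text data".
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate P(next word | prev) from raw counts."""
    counts = following[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

probs = next_word_probs("the")
# "the" is followed by "cat" twice and "mat"/"fish" once each,
# so the model predicts cat: 0.5, mat: 0.25, fish: 0.25.
```

A neural language model replaces the count table with learned parameters, which is what lets it generalize to contexts it has never seen verbatim.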


Language models have come a long way in the past five years. Exponential growth in the number of parameters has contributed to their effectiveness in learning knowledge from large compilations of data. The largest models today have close to half a trillion parameters, a significant increase from the 94 million parameters of the ELMo model in 2018. ELMo was the first widely adopted model to use context-dependent word embeddings: it used bidirectional LSTM networks to learn from context and produce embeddings that depended on the context in which a word appeared. This was a significant breakthrough over previous approaches such as GloVe, where each word had a single static vector regardless of the context in which it appeared.


In 2017, the "Attention Is All You Need" paper introduced the Transformer architecture, which led to the creation of models such as BERT and GPT. OpenAI trained GPT-2 on a much larger dataset and increased its parameter count to 1.5 billion, leading to significant improvements in language modeling. Models such as Megatron, which focused on the efficient training of ever larger models, and DALL-E, which generates images from text, also emerged during this period.


GPT-3 marked a significant breakthrough in language models' ability to perform few-shot learning, where the model can carry out a task given only a few examples, or even none at all. Building on GPT-3, OpenAI later released Codex, a model that can generate code. Language models have come a long way in the past five years, and it is an exciting time to be part of this field.


The latest development in AI that has been making waves in the news is the release of GPT-4. Microsoft researchers evaluated GPT-4's capabilities and found that it was able to reason, solve problems, think abstractly, and comprehend complex ideas consistently. The main areas where it fell short were planning and learning from experience, although it showed some early signs of improvement.

One of the main breakthroughs that enabled these developments was the Transformer architecture, which made it possible to model long-range dependencies in text. This improved upon the previously popular LSTM, which struggled with such dependencies. Multi-headed attention, introduced in the Transformer, enables the model to attend to complex interactions between the current context and the previous context.
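The attention mechanism at the heart of the Transformer can be sketched compactly. This is a minimal single-head version of scaled dot-product attention with toy random inputs; in a real Transformer, multi-headed attention runs several such computations in parallel over learned projections of the input, and Q, K, V come from separate learned linear layers rather than the raw input reused here.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Three token positions with 4-dimensional embeddings (toy numbers).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, weights = scaled_dot_product_attention(x, x, x)
# `weights` rows sum to 1: each position mixes information from every
# other position, which is how long-range dependencies are captured.
```

Because every position attends to every other position in one step, no information has to be carried through a long recurrent chain as in an LSTM.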

LLMs have come a long way from just predicting the next word to now being able to understand the full context and reason beyond just simple tasks. These models have been able to incorporate the rich information about object properties and interactions present in language, leading to a significant improvement in their ability to generate human-like text.

Large-scale data has played a critical role in this advancement, with models now being trained on trillions of tokens of internet-scale data, which encompass everything that humans have written on the internet, including media, discussions, and more. However, there could be biases or noise in this data, leading to problems such as factual errors or incorrect reasoning.

Recent work has focused on biasing the model to generate more balanced and reasonable text. To address potential biases or noise in the data, reinforcement learning with human feedback is essential. By training the models to generate text that would look and sound good to humans, researchers hope to improve the quality of the text generated by these models.

Overall, these developments in LLMs are exciting and demonstrate the potential of AI to reason and understand complex concepts in natural language.

Read more about it here:

  1. Generative AI for Enterprises
  2. LLM Intro and Tutorials
  3. Early Experiments with GPT-4


Trends in LLMs:

Although we have seen models with over half a trillion parameters, many researchers have debated whether such massive models are necessary. Some researchers have explored the idea of using a mix of smaller language models and other technologies like search and retrieval.

One interesting example is Meta's LLaMA (Large Language Model Meta AI) project, a series of smaller models ranging from 7 billion to 65 billion parameters. The researchers demonstrated that the number of parameters is not the only factor that affects performance: even smaller models can come close to the same performance if trained for longer on more data.


Stanford researchers built on the LLaMA project to create Alpaca, which showed that a 7 billion parameter model fine-tuned on instruction-response demonstrations could behave similarly to much larger GPT-3-based models.

Looking to the future, the potential applications for large language models are vast. Google's PaLM (Pathways Language Model) is capable of handling various tasks, including text completion, question answering, summarization, and translation.

There is also active work on assistive experiences, such as ChatGPT and Bard, which are fine-tuned for conversational generation using reinforcement learning from human feedback to provide human-like responses tailored to the user's needs and context.


LLM Architecture:

Two main architectures are worth considering when using LLMs:

  1. The first involves using very large language models that have been trained on all relevant documents and can generate responses from that internalized knowledge. However, this approach becomes less efficient and harder to update as the number of documents grows or when new events arise. It can also be difficult to ensure that the model's generated responses accurately reflect the contents of the documents. This is where the concept of "provenance" becomes relevant: attributing generated content back to its source documents so that the accuracy of the model's responses can be verified.
  2. The second approach involves encoding the documents into vectors and allowing the user to input their question. From there, the model can generate a response based on the vectors and the question. This approach is more efficient and easily updatable since the documents can be easily re-encoded as needed. In addition, it provides more transparency in terms of how the model generates its responses, which is important for ensuring accuracy and trustworthiness.
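The second approach above — encode documents as vectors once, then match an incoming question against them — can be sketched as follows. The documents, the `embed` function, and the similarity measure are all illustrative stand-ins: production systems use learned dense embeddings and an approximate-nearest-neighbor index rather than bag-of-words vectors and a linear scan.

```python
import math
from collections import Counter

documents = [
    "the transformer architecture models long range dependencies",
    "reinforcement learning from human feedback reduces harmful outputs",
    "retrieval systems encode documents into vectors for search",
]

def embed(text):
    """Toy embedding: a bag-of-words count vector stored as a Counter."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

# Documents are encoded once; only changed documents need re-encoding,
# which is what makes this approach easy to keep up to date.
doc_vectors = [embed(d) for d in documents]

def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(range(len(documents)),
                    key=lambda i: cosine(q, doc_vectors[i]), reverse=True)
    return [documents[i] for i in ranked[:k]]

top = retrieve("how do vectors help document search")
# Returns the retrieval/vectors document as the best match.
```

Because the response is grounded in specific retrieved documents, the system can also report which documents it used, which is exactly the provenance property discussed above.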

There is ongoing research into using LLMs for search and retrieval in various ways, including fine-tuning the models for specific domains or tasks and integrating them with other technologies like retrieval and search algorithms. LLMs have the potential to greatly improve the search and retrieval experience, but there are still challenges to overcome in terms of efficiency, updatability, and accuracy.


LLMs for Search and Conversational AI:

The traditional approach to question-answering tasks uses a large language model (LLM) that is pre-trained on a large corpus of text and fine-tuned for the specific task. Recently, researchers have focused on building more efficient and updatable models that incorporate document retrieval.

To achieve this, the model retrieves relevant documents based on the user's query. A smaller LLM then generates a response by understanding the context of these documents. This approach is more efficient and can even show the provenance of the documents used to generate the response.

Research has also demonstrated that it is possible to improve the attention of transformers to retrieved documents. Retrieval-enhanced transformers, for instance, have modified the transformer architecture to attend more to retrieved documents. Another study focused on reducing the size of the LLM while still being able to effectively generate responses based on retrieved documents. Recent work has also demonstrated that it is possible to create a context vector out of retrieved documents and generate responses based on this context without changing the LLM parameters.

The key takeaway here is that there is a trade-off between the size of the LLM and the efficiency and effectiveness of the document retrieval system. Studies have shown that incorporating retrieved documents can lead to better performance in question-answering tasks. In particular, the in-context approach has consistently outperformed other approaches for GPT-2 models.
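The in-context approach mentioned above is mechanically simple: retrieved documents are placed directly into the prompt, so the LLM's parameters never change. The prompt template below is a hypothetical sketch, not a prescribed format from any particular paper.

```python
def build_rag_prompt(question, retrieved_docs):
    """Prepend retrieved passages to the question.

    The LLM itself is untouched: all retrieval evidence enters through
    the prompt, so the document collection can be updated at any time
    without retraining, and the cited document numbers give provenance.
    """
    context = "\n\n".join(
        f"[Document {i + 1}] {doc}" for i, doc in enumerate(retrieved_docs)
    )
    return (
        "Answer the question using only the documents below, and cite "
        "the document numbers you relied on.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "When was the Transformer architecture introduced?",
    ["The Transformer architecture was introduced in 2017."],
)
```

The trade-off noted above shows up here directly: a smaller LLM can answer well only if the retriever surfaces the right passages, since the prompt is its entire view of the document collection.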

Read more about Information Retrieval here:

  1. Few Shot Learning with Language Models
  2. Illustrated Retrieval Transformer


Quantifying the Quality of Language:

A common question is how to measure and monitor the quality of a language model. Although we can use a sizable pre-trained model or benefit from open-source models, we must first determine whether they are appropriate for our particular application and how to ensure high quality. Perplexity is one such measure: the exponentiated average negative log-likelihood the model assigns to held-out text, so lower values mean the model predicts the text better. Evaluating the model's performance on benchmark datasets for specific tasks, and monitoring inputs and outputs to ensure they align with the desired distribution, are also important. We can also compute accuracy on downstream tasks such as translation or classification, and measure F1 scores and BLEU/ROUGE scores for text generation. Ongoing observability work in this area ensures that deployed models generate output from the desired distribution and perform well.
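Perplexity is simple enough to compute directly from per-token log-probabilities. The log-probability values below are hypothetical inputs; in practice they would come from scoring held-out text with the model under evaluation.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(-mean log-probability per token).

    Lower is better: it can be read as the model's average effective
    branching factor, i.e. how many equally likely choices it is
    "hesitating" between at each token.
    """
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Hypothetical natural-log probabilities for a 4-token sentence.
# If the model assigns each token probability 0.5, perplexity is 2.0;
# a model certain of every token (log p = 0) would score a perfect 1.0.
ppl = perplexity([math.log(0.5)] * 4)
```

Tracking this number on a fixed held-out set over time is a simple way to catch regressions in a deployed model.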


Beyond the AI Hype:

AI is a powerful tool that has the potential to improve many aspects of our lives, but we must also ensure that the models we build are safe, fair, and respectful of user privacy. One of our key areas of focus is privacy. As countries around the world implement stricter privacy laws, providing transparency and allowing users to decide what their data is used for is essential. Addressing the complex task of defining and quantifying fairness in AI models is necessary. Ensuring that our models respond to cultural differences and provide fair outcomes for different demographics is crucial.


Since many of these models are black box models, we may never see the full dataset they are trained on. Putting security practices in place to avoid unintended harm that AI models may cause is crucial. To sum up, while AI is a powerful technology, we must ensure that it is deployed ethically, with a focus on user privacy, fairness, and safety.

Read more about AI Ethics and Responsible Practices:

  1. Google's AI Principles
  2. LinkedIn's AI Principles in Practice



It has been an absolute pleasure to have Jayant Kumar with us for this Fireside Chat on Language Models.

If you are looking to hone your skills in Data Science, PrepVector offers a comprehensive course led by experienced professionals. You will gain skills in Product Sense, A/B Testing, Machine Learning, and more through a series of live coaching sessions, industry mentors, and personalized career coaching. You will also compound your skills by learning with like-minded professionals and sharing what you learn with the larger community along the way.

The next cohort will kick off May 15, 2023. Book a free consultation to learn more!

Check out our previous newsletters:

  1. Causal Inference Fundamentals
  2. Trends and Career Paths in Data Science
  3. Search Rankings and Recommendations
  4. Skills and Growth as a Product Data Scientist


Kudos to Sujithra Gunasekar for helping draft this article. Subscribe to this newsletter to stay tuned for more such events!
