On ChatGPT, LLMs, and ML Infra
Recently there have been many discussions about ChatGPT, LLMs (Large Language Models), and how they will change the way we use data, models, and infrastructure in the industry. There have been both great insights and confusion. I want to share some of my thoughts on this topic based on my experience.
I’ll put the conclusions out first: LLMs won’t replace models built for other tasks such as recommendation, but they can help those models work better, and the infrastructure supporting both will continue to be needed.
The ML task and model landscape
Let’s begin with the lay of the land of ML tasks and typical models. I’ll use the diagram below to illustrate the general landscape; it makes it easy to see where things like Deep Learning and language models fit on the chart.
As seen in the diagram, the majority of ML tasks in the industry fall into two main categories: behavior prediction and content understanding/generation. For example, recommending content to a user (by predicting whether the user will like it) belongs to the former, while generating a picture from a prompt belongs to the latter. This categorization is high-level and not absolute.
In both of these categories, models have evolved over multiple generations, from shallow algorithms to deep ones, over the last couple of years. The entire upper half of the plane uses Deep Learning models. With this context, let’s talk about a few interesting topics.
LLMs aren’t the only models that use Deep Learning; other models can be deep too
Language and vision models are frequently discussed in connection with Deep Learning, which has created the false impression that they (especially LLMs) are the only, or primary, models that use DL. In fact, DL is already widely used in many models outside the language/vision domain.
While language models have been getting bigger over the last couple of years, recommendation models have been evolving with DL at the same time. Today the recommendation models built by tech companies are also large and deep, and they are the primary models being monetized by the industry.
LLMs don’t replace other models
Language models are created to perform a specific set of language-related ML tasks. They are not designed to perform other tasks such as recognizing pictures, recommending videos, or detecting fraudulent accounts.
LLMs are more powerful versions of language models, but that doesn’t change the fact that they are not designed for every ML task. The majority of ML tasks outside the language domain will continue to be handled by models designed for those purposes, and the industry’s need for these tasks will certainly continue to grow. And like language models, these non-language models will evolve as well.
What LLMs will replace are language models with lower ROI. Note that ROI is the key here: it only makes sense to use a model when the return outweighs the cost, and LLMs are very expensive to train and run today.
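To make the ROI point concrete, here is a back-of-envelope sketch. Every number in it is a purely illustrative assumption (not a measured or published cost); the point is only the structure of the comparison: the LLM is worth deploying only if the extra value it produces exceeds the cost gap.

```python
# Back-of-envelope ROI comparison. All dollar figures below are
# purely illustrative assumptions, not real measured costs.
requests_per_day = 1_000_000

llm_cost_per_1k = 0.50    # assumed $/1k requests for a large LLM
small_cost_per_1k = 0.01  # assumed $/1k requests for a task-specific model

llm_daily = requests_per_day / 1000 * llm_cost_per_1k      # $500/day
small_daily = requests_per_day / 1000 * small_cost_per_1k  # $10/day

# The LLM only wins if its extra daily value exceeds this cost gap.
cost_gap = llm_daily - small_daily
print(f"Extra value needed per day to justify the LLM: ${cost_gap:.0f}")
```

Plugging in your own traffic and pricing numbers is what turns this from a sketch into an actual deployment decision.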
However, one thing LLMs can potentially do is work with other models on complex tasks and outperform legacy systems that don’t use LLMs. Read on.
LLMs can help other models work better
Language and vision models are not only capable of working standalone; they are also increasingly used to enhance other models in recommendation tasks, for example by pre-processing data for downstream models to consume. Given their enhanced ability to interpret and generate information, LLMs could unlock new potential to boost the performance of other models in these tasks by working upstream or downstream of them.
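As a rough sketch of this upstream pattern, the snippet below uses a toy hashing function as a stand-in for a real LLM text encoder (a production system would call an actual embedding model; the stand-in just keeps the example self-contained and runnable), then feeds the resulting text features, together with behavioral signals, into a toy linear scorer standing in for a large recommendation model:

```python
import numpy as np

def llm_embed(texts, dim=8):
    """Stand-in for a real LLM text encoder. A deterministic
    token-hashing trick replaces the actual model here."""
    out = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            out[i, sum(tok.encode()) % dim] += 1.0
    # L2-normalize rows, as typical embedding models do
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-9)

# Upstream: the "LLM" turns unstructured item text into dense features.
item_texts = ["funny cat video", "in-depth ML infra tutorial"]
text_feats = llm_embed(item_texts)

# Downstream: the recommendation model consumes those features
# alongside behavioral signals (e.g. past clicks, dwell time).
behavior_feats = np.array([[12.0, 0.3], [3.0, 0.9]])
features = np.concatenate([text_feats, behavior_feats], axis=1)

# A random linear scorer stands in for the large recommendation model.
rng = np.random.default_rng(0)
weights = rng.normal(size=features.shape[1])
scores = 1.0 / (1.0 + np.exp(-features @ weights))  # shape (2,), one per item
print(scores)
```

The design point is the interface: the LLM's output is just another feature column from the recommendation model's perspective, so the two models can be trained, scaled, and served independently.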
Although I don’t have information on the internal design of the new Bing search experience, I suspect it’s one such example, where a large recommendation model works in conjunction with an LLM to perform the complex task of search recommendation plus conversation, covering both behavior prediction and content understanding/generation. Based on the preview results, it has certainly brought the search experience to a new level. Such architectures will become more prevalent going forward.
What all this means for ML infra
As discussed above, LLMs are advanced versions of smaller language models, but they won’t replace models built for other tasks such as recommendation. In addition, LLMs can enhance how other models are used and help them perform more complex tasks. As LLMs evolve, non-language/vision models such as recommendation models are evolving too.
This means the infrastructure that supports both language models and non-language models (such as recommendation models) will continue to be needed. Furthermore, with the rapid increase in both model size and hybrid usage of models, there will be growing demand for infrastructure that scales well and supports different model types working together seamlessly.
What are your thoughts? Let me know and I'm happy to discuss.