Don't Boil the Ocean: Unlocking LLM Power with Smarter Context, Not Fine-Tuning

I spent a few hours on Friday with a customer who was about to fine-tune a large language model to embed their policies, processes, and some aspects of legal regulation. The plan was then to generate sample cases and ask the model to make decisions, give reasons, and cite the relevant policies.

They are reconsidering.

Fine-tuning, for the uninitiated, is painful. This client was breaking each policy down into tiny, granular data points like:

  • "Job 1 falls under Process A"
  • "Job 1 does not qualify for Process B"
  • "If X condition is met, follow section 3.2 of Policy Y"

And that's just the rules for Job 1. Once we took a cartesian product of all the options, there were around 320 job types to cover. The work just to write the training examples was huge, let alone testing them.
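To make the combinatorics concrete, here is a minimal sketch. The job attributes and policy names below are invented for illustration; the client's real taxonomy produced around 320 combinations, but the explosion works the same way:

```python
from itertools import product

# Hypothetical job attributes -- each axis multiplies the number of
# labelled training examples a fine-tuning approach would need.
job_types = ["permanent", "fixed-term", "casual", "contractor"]
locations = ["onshore", "offshore"]
seniority = ["junior", "mid", "senior", "lead", "principal"]
policies = ["Process A", "Process B", "Policy Y s3.2", "Policy Z"]

# Every combination needs its own labelled example, before you even
# get to edge cases and exceptions.
combinations = list(product(job_types, locations, seniority, policies))
print(len(combinations))  # 4 * 2 * 5 * 4 = 160
```

Add one more attribute with a handful of values and the example count multiplies again, which is exactly how a modest policy set turns into hundreds of data points.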

Then you have more pain points:

  • You need to label each data point so the LLM knows what it's looking at. This can be a tedious and error-prone process, especially for large datasets.
  • The quality of your data directly impacts the LLM's performance. Inaccurate or incomplete data can lead to biased or nonsensical outputs.
  • Fine-tuning often requires significant computing power, which translates to real dollars.

While the idea of a custom-built language model sounds appealing, I urged them to consider a different approach: leveraging the power of ever-widening context windows offered by many models.

Now, before your eyes glaze over with technical jargon, let's break this down. You give a model like ChatGPT context by providing a prompt. As the conversation continues, the model is re-fed the earlier turns alongside each new prompt. The context window is the limit on how much of this it can hold at once: in effect, how much it can remember.
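A rough way to picture this is a conversation history that gets trimmed once it exceeds a token budget. This is a sketch, not any vendor's API; the token counter is a crude heuristic and the limit is made up:

```python
MAX_TOKENS = 1000  # illustrative context limit, far smaller than real models

def count_tokens(text):
    # Crude heuristic: roughly 4 characters per token for English text
    return max(1, len(text) // 4)

history = []

def send(user_message):
    history.append({"role": "user", "content": user_message})
    # Drop the oldest turns once the conversation exceeds the window;
    # whatever is trimmed is simply forgotten by the model.
    while sum(count_tokens(m["content"]) for m in history) > MAX_TOKENS:
        history.pop(0)
    # ...a real implementation would call the model with `history` here
    return history
```

The point of the sketch is the trimming: anything pushed out of the window might as well never have been said.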

Even Google's Gemini, which offers one of the largest context windows available, tops out at around 1 million tokens (roughly 4 million characters, at about four characters per token) at a time. And even the most advanced LLMs can miss crucial nuances if the information they need is spread across a larger dataset.

You may have discovered this yourself when prompting ChatGPT: more is often less. You frequently get better results by simplifying the prompt and adding a few-shot examples instead.
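As an illustration of that few-shot style, a short prompt with a couple of worked examples often outperforms a long instruction dump. Everything here, policies included, is invented:

```python
# A focused few-shot prompt: two worked examples, then the real case.
# The model infers the pattern instead of being fine-tuned on it.
few_shot_prompt = """You are a policy assistant. Decide which process applies and cite the policy.

Example 1:
Job: permanent, onshore
Decision: Process A (Policy Y, section 3.2)

Example 2:
Job: contractor, offshore
Decision: Process B (Policy Z, section 1.4)

Job: casual, onshore
Decision:"""
```

The prompt ends mid-pattern on purpose: the model's most natural continuation is a decision in the same format as the examples.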

So, here's where smarter use of the context window comes in. Instead of cramming everything into the LLM and hoping it makes sense, we can focus on feeding it the most relevant information at the right time. This is where a vector database, essentially a super-powered semantic search engine, can be a game-changer. Pinecone, Weaviate, and Milvus are examples.

Imagine this: your company has a wealth of policies, procedures, and historical data. By storing this information in a vector database, the system can efficiently retrieve the most relevant snippets whenever the LLM needs them. This allows the LLM to understand the bigger picture, even if the information is spread across a vast dataset.
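The retrieval step can be sketched end to end. A real deployment would use learned embeddings and one of the vector databases named above; this toy version uses bag-of-words cosine similarity so it runs standalone, and the policy snippets are invented:

```python
import math
from collections import Counter

# Invented policy snippets standing in for a company knowledge base
snippets = [
    "Process A applies to permanent onshore roles; see Policy Y section 3.2.",
    "Process B covers contractor engagements longer than six months.",
    "Travel expenses are reimbursed under Policy Z with prior approval.",
]

def vectorize(text):
    # Toy embedding: word counts. Real systems use learned embedding models.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k=1):
    # Rank snippets by similarity to the query and keep the top k
    qv = vectorize(query)
    ranked = sorted(snippets, key=lambda s: cosine(qv, vectorize(s)), reverse=True)
    return ranked[:k]

# Only the retrieved snippet goes into the prompt, not the whole corpus
context = retrieve("Which process applies to a permanent onshore role?")
prompt = f"Using only this policy text:\n{context[0]}\n\nAnswer the question."
```

The design point is that the prompt carries one relevant snippet instead of the entire policy library, which is what keeps you inside the context window.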

In simpler terms, we're not trying to teach the LLM a new language; we're giving it a better way to access and understand the language it already knows. You feed it what it needs to know when it needs to know it. And if you are using Azure or AWS, it is easy to chain all this together.

This approach offers several advantages:

Reduced Costs: Fine-tuning LLMs can be a resource-intensive process. You can achieve similar results without the hefty price tag using context windows and vector databases. And I'm not just talking about OpenAI token costs - you will soon find yourself in the burstable zone on your cloud platform.

Faster Implementation: Fine-tuning takes time. Focusing on context windows allows for a quicker turnaround and gets your LLM working for you sooner. Most cases I have seen can be handled through the API alone - no need for a vector database in every scenario.

Greater Accuracy: A wider context window means the LLM has a better chance of understanding the nuances of your specific needs, leading to more accurate and relevant outputs.

I am not saying that fine-tuning is always the wrong call. However, for many businesses, focusing on context windows and vector databases can be a more cost-effective and efficient way to unlock the power of large language models.
