GenAI Optimization Techniques - Part 1
David Jitendranath
AI Strategy Developer, Advisor, CxO -- Architect of the absurdly brilliant
Techniques used in the Store and Structure Phase
There is a view of LLMs as some sort of database that contains tons of data: you can ask it anything and it will respond with accuracy. If the response is not accurate, then we have to employ "optimization" to make the response accurate.
That view operates on the assumption that LLMs are fact-based responders. In the field of Generative AI, that is largely a misdirected view, considering that most LLMs are closed (we do not know what data they are trained on).
A better assumption is to imagine an LLM primarily as a vocabulary provider that offers relevance, not accuracy. In other words, you can ask an LLM to examine the relevance of a response.
To illustrate this, imagine you were to go to a bookstore that carries English books. Though all the books are written in the same language, the books in the Religion section will use a certain vocabulary that is quite different from the books in the Travel section.
Let's say the section tags were missing and all the books had blank covers. You, the reader, would still be able to figure out whether you were in the Travel section or the Religion section. To do so, you would draw on your background knowledge (aka your LLM) to evaluate what category the book you are reading belongs to. This is what I mean by relevance and context awareness.
If we view LLMs this way, it makes more sense to apply optimization at different stages within the Gen AI process.
To briefly define optimization: it is a set of techniques applied at different stages within the GPT pipeline so that the response to a user's question will be accurate, relevant, and safe.
I primarily like to segment the GPT pipeline into two phases, though they may not execute sequentially in time.
Phase One Optimization
In this blog we will cover optimization techniques that are primarily used in Phase 1. In my next blog we will cover optimization techniques used in Phase 2, so stay tuned by clicking the subscribe button.
Let's quickly understand the store and structure phase. Two steps happen in this phase (refer to the diagram below).
Ingestion is the process of creating the necessary data pipelines to define and store the model. The key components that create the data pipelines are:
The diagram below sheds some light on the steps that happen during this phase.
Chunking algorithms are one way to employ optimization at the ingestion step. Choosing the right chunking algorithm is an essential optimization technique. Here are a few chunking techniques.
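As a concrete reference point, here is a minimal sketch of one of the most common baseline techniques: fixed-size chunking with overlap. The chunk size and overlap values below are illustrative, not recommendations; tuning them for your documents is exactly the kind of optimization choice discussed above.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters.

    The overlap preserves some context across chunk boundaries so that a
    sentence cut in half at a boundary still appears whole in one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Advance by less than a full chunk so consecutive chunks share text.
        start += chunk_size - overlap
    return chunks


# Tiny demonstration with character-level sizes:
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Other strategies (sentence-based, paragraph-based, semantic chunking) follow the same shape but choose boundaries by meaning rather than by character count.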
Indexing, or model embedding, utilizes a model to add meaning to the chunks of data. This meaning is essentially a numerical representation of each chunk's context and relative relevance. Refer to the earlier blog link here to learn more about how this numerical representation is derived. During this step, the numerical representations are stored as indices. These indices act more like the table of contents of a book.
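To make the "numerical representation" idea concrete, here is a deliberately toy stand-in for an embedding model: a bag-of-words vector over a tiny fixed vocabulary, compared with cosine similarity. Real pipelines use a trained embedding model; this sketch (the vocabulary and example sentences are my own invention) only shows that each chunk becomes a vector, and that chunks sharing vocabulary, like the Religion vs. Travel bookstore sections above, end up geometrically close.

```python
import math
from collections import Counter

# Hypothetical mini-vocabulary for the bookstore analogy.
VOCAB = ["temple", "prayer", "faith", "flight", "hotel", "passport"]

def toy_embed(chunk: str) -> list[float]:
    """Toy 'embedding': count how often each vocabulary word appears."""
    counts = Counter(chunk.lower().split())
    return [float(counts[word]) for word in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for same direction, 0.0 for no overlap."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

religion_chunk = toy_embed("the temple of prayer and faith")
travel_chunk = toy_embed("book a flight and a hotel with your passport")
query = toy_embed("prayer and faith")

# The query vector sits closer to the religion chunk than the travel chunk.
print(cosine(query, religion_chunk), cosine(query, travel_chunk))
```

A real embedding model does the same thing with thousands of learned dimensions instead of six hand-picked words, which is why model choice matters in the next step.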
In this step, one may want to determine which embedding model is the best choice in order to optimize for performance.
Here is a link to the Leaderboard to find embedding models that may be best suited for embedding/encoding. How to determine which model to choose is beyond the scope of this blog. The important point is that this is another optimization technique that can be applied in this phase.
Based on that choice, the embedding data can be added to the vector database as indices. Again, the table of contents analogy helps here.
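To close the loop on the analogy, here is a minimal in-memory sketch of the "table of contents" idea: store (vector, chunk) pairs, then look up the chunk whose vector lies closest to a query's vector. This is not a real vector database (production systems use purpose-built stores with approximate nearest-neighbor indexing); the class name and two-dimensional example vectors are illustrative only.

```python
import math

class ToyVectorStore:
    """A minimal in-memory index mapping embedding vectors to text chunks."""

    def __init__(self):
        self.entries: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], chunk: str) -> None:
        """Index a chunk under its embedding vector."""
        self.entries.append((vector, chunk))

    def nearest(self, query_vec: list[float]) -> str:
        """Return the stored chunk whose vector is most similar to the query."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        return max(self.entries, key=lambda entry: cos(entry[0], query_vec))[1]


store = ToyVectorStore()
store.add([1.0, 0.0], "a chunk from the Religion section")
store.add([0.0, 1.0], "a chunk from the Travel section")
print(store.nearest([0.9, 0.1]))
```

Like a table of contents, the index does not contain the answer itself; it tells you where to look, which is what makes retrieval fast at query time.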
In my next blog, we will cover optimization techniques used in Phase 2. So stay tuned by clicking the subscribe button.