Decoding GenAI Terminologies: A short intro into Architectures, Models and Prompt Engineering
Sudhanva Dixit
Data science and GenAI for Electric Vehicles @ Bosch Software | Ex-Benz
#LearnWithDixit - Module 3: Decoding GenAI Terminologies: A short intro into Architectures, Models and Prompt Engineering
The GenAI field is vast, and with its ever-changing landscape it is no easy task to understand its steps and concepts. Since the field is still in a transient form, our picture of the landscape should keep adapting as new inventions happen.
We have heard many new terms in the recent past: Foundation models, LLMs, GPTs, Prompt Engineering, RAG, Fine-tuning, Transformers and many more. What are these, and how do they relate to each other?
To understand this, let's sit in the development team's seat and follow the journey from development to the user phase. Let's say there's a team developing such models, and they are asked to build a GenAI model for a chatbot. While there are many other applications of GenAI, let's take this as a simple example.
What is the first thing they decide/do?
The GenAI development steps look similar to those of conventional AI: data collection and pre-processing are also part of the GenAI development cycle.
For the scope of this article, I am skipping the data acquisition and pre-processing steps, the so-called 'boring part' (for some developers) of the pipeline.
Next comes choosing the architecture.
As we get into this step, there is one more word that is closely used along with 'architecture', and that is 'model'.
So, what is the difference between ‘architecture’ and a 'Model'?
These two words sit close together and are confused quite often, and even after some digging the difference can seem a bit blurry.
I feel that " 'data' given to an 'architecture' results in a 'model' " is a simple way to explain it. A model is an instance of an architecture. For example, BERT, GPT and T5 are 'models' built on the Transformer architecture.
Basically, 'architecture' is the framework/structure/skeleton, and we pass data through it to get a 'model'.
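The distinction can be made concrete with a toy sketch. All names below are made up for illustration, and real pre-training uses gradient descent on billions of parameters, not word counts; the point is only that the architecture is the untrained skeleton and the model is what you get after passing data through it:

```python
class TinyArchitecture:
    """The skeleton: the structure is fixed, but nothing is learned yet."""

    def __init__(self, vocab_size: int):
        self.vocab_size = vocab_size
        self.weights = None  # no knowledge yet

    def train(self, corpus: list[str]) -> "TinyModel":
        # 'Training' here is just counting word frequencies -- a crude
        # stand-in for real gradient-based pre-training.
        counts: dict[str, int] = {}
        for sentence in corpus:
            for word in sentence.split():
                counts[word] = counts.get(word, 0) + 1
        return TinyModel(counts)


class TinyModel:
    """An instance of the architecture: structure plus learned weights."""

    def __init__(self, weights: dict[str, int]):
        self.weights = weights

    def most_likely_word(self) -> str:
        return max(self.weights, key=self.weights.get)


arch = TinyArchitecture(vocab_size=100)
model = arch.train(["the cat sat", "the dog ran", "the cat ran"])
print(model.most_likely_word())  # 'the' appears most often (3 times)
```

Note how the same `TinyArchitecture` could produce many different models, one per training corpus, just as the Transformer architecture underlies BERT, GPT and T5.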
Are we done? Is the model ready? Technically, yes.
Does it give the best results? No!
The pre-trained model is trained on a large corpus of general public data, typically the internet and books, and sometimes also on curated and filtered data. In the final stages of getting a usable model, it may still need to be instructed how to answer: how to avoid bias, which phrases to use, what ethics to follow, etc. So now, I will instruct it through 'prompts' on many fronts to make it a usable model. Here come the concepts of Prompt Engineering, Fine-tuning and similar techniques. More on them later in the article.
So maybe we can simplify it into an equation:
'Architecture + Training Corpus (General Data) + Pre-training Process = Pre-trained GenAI Model'.
In short, if you are interacting as an end user, then you are interacting with a 'model' and not an 'architecture'.
Wait, we just said that the pre-trained model is trained on a large general corpus. Doesn't that conflict with the word 'pre-trained' if it's already 'trained on a large corpus of data'?
Well, with respect to GenAI models, there are two stages of training.
One happens while building the general model, the other while adapting it for a specific use-case. An example of the first is how GPT is trained on publicly available data; without some form of training, a model cannot exist at all. A model obtained after training on large, generalized data is called a 'pre-trained' model.
Then, when is a model actually 'trained'?
The training referred to here is application-specific training. Let's say you want to use a model for a team-specific or domain-specific purpose (finance, medical, automotive, etc.). Then you need more application-specific data to get accurate answers, avoid hallucination, and give answers based on team or domain internal information. Once you take the general model and train it further for that specific purpose, it's called fully 'trained'.
In short,
Pre-trained GenAI Model + Fine-tuning Data + Fine-tuning Process = Fine-tuned GenAI Chatbot Model
Here,
Fine-tuning Data = Domain/Team specific data
Fine-tuning Process = Prompt Engineering (PE), Fine-tuning (FT)
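The fine-tuning equation above can be sketched with a toy example. The function names and frequency-count 'weights' below are illustrative stand-ins for real gradient-based fine-tuning; the idea is only that fine-tuning starts from the general weights and updates them with domain data rather than training from scratch:

```python
def pretrain(corpus: list[str]) -> dict[str, int]:
    # Stage 1: build general 'weights' from a broad corpus.
    weights: dict[str, int] = {}
    for sentence in corpus:
        for word in sentence.split():
            weights[word] = weights.get(word, 0) + 1
    return weights


def fine_tune(pretrained: dict[str, int],
              domain_corpus: list[str],
              domain_boost: int = 3) -> dict[str, int]:
    # Stage 2: start from the general weights and emphasise domain
    # terms -- a crude stand-in for gradient updates on domain data.
    weights = dict(pretrained)
    for sentence in domain_corpus:
        for word in sentence.split():
            weights[word] = weights.get(word, 0) + domain_boost
    return weights


general = pretrain(["the market moves", "the weather changes"])
finance_model = fine_tune(general, ["market risk hedging", "market volatility"])
top = max(finance_model, key=finance_model.get)
print(top)  # 'market': the domain term now outweighs the general ones
```

Note that the general knowledge is preserved (`the` is still in the weights); fine-tuning adjusts the model rather than replacing it.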
After getting the model ready, though not yet in its best form, the team now thinks about how to make it better: how to tell it to behave, respond, structure its answers, show manners while answering, etc. One of the methods applied in the final stages, before locking in the pre-trained model, is Prompt Engineering.
Prompt Engineering is not the only process between the training of a model and its deployment. There are others too, like Fine-tuning and more. Depending on the need, one, some or all of them may be applied to improve the model's initial version.
In a simplified manner, Prompt Engineering can be described as the 'behavioural calibration' of the model's initial version.
So, when do we do Prompt Engineering? What is "engineering" about it?
Is it for building the general model or on the 'pre-trained' model to make it more application specific?
To sum it up: Prompt Engineering is not used for building the general model; it is applied on the 'pre-trained' model to steer its behaviour and make it more application specific.
Note that PE is an iterative process, and a model might go through PE multiple times. Calling this 'engineering' has created quite a debate, with some finding the term a bit grandiose, while others defend it by stating that 'the prompts are designed, made specific and optimized by understanding NLP concepts and how the model works'.
Though Prompt Engineering is mainly about optimizing the model's behaviour before release, some also call the end-user interaction Prompt Engineering, though I'd prefer that to be called just 'prompting'.
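As a concrete sketch, behavioural calibration often takes the form of a system prompt that is prepended to every user message. The template, roles and chatbot persona below are illustrative assumptions, not a specific vendor's API, though most chat APIs follow a similar message format:

```python
# An engineered instruction that shapes tone, scope and behaviour.
SYSTEM_PROMPT = (
    "You are a polite automotive-support chatbot. "
    "Answer concisely, and politely decline questions "
    "outside the automotive domain."
)


def build_messages(user_question: str) -> list[dict]:
    """Prepend the engineered system prompt to every user turn."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]


messages = build_messages("How do I reset my EV's charging schedule?")
print(messages[0]["role"])  # system
```

Iterating on the wording of `SYSTEM_PROMPT` and measuring how the model's answers change is, in essence, what the iterative PE loop described above looks like in practice.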
Some considerations can help improve the responses of the model. In short, prompting is like learning how to Google: all of us have learnt how to interact with the Google search engine over time, and in the same way we will all learn to prompt better.
The process is not complete yet. RAG, Model evaluation, deployment and monitoring are some more important concepts in the process. More on them in another article.
Well, that's it for now. I hope you got a better and deeper insight into the world of GenAI. Even with this much info, it's only the tip of the iceberg; I learnt much more about this field in the process of writing this article.
I would love to hear opinions and improvements :) Until next time.