What's the hype surrounding Gen AI, LLMs, RAG and so on? It's crucial not to overlook this [Beginner]
Gaurav Kumar Singh
Software Engineer | JavaScript Pro | React Pro | Node Pro | Databases Expert | xQL expert | Python | Java | K8s | AWS | Bigdata | ES | AI-ML
The purpose of this article is to provide you with a foundational understanding of Gen AI and its associated terminology. By doing so, you'll be equipped to explore solutions independently and contribute to this groundbreaking transformation.
Before I go into the specifics of Gen AI, LLMs, embeddings, vector search, RAG etc., let's first understand a bit about AI-ML and a brief history so that you don't feel alienated.
Background:
AI:
Artificial intelligence is a general field with a very broad scope, including language processing, computer vision, summarisation etc.
Machine Learning: (check this course)
Machine learning is the branch of AI that covers the statistical part of artificial intelligence. It teaches computers to solve problems by looking at thousands of examples, learning from them and then using that experience to solve the same problem in new situations.
Deep Learning and Neural Network: (check this coursera course)
Deep learning is a special field of ML where computers can actually learn and make intelligent decisions on their own. It involves a deeper level of automation in comparison with most ML algorithms.
Neural Network: It is a network of neurons, as the name suggests, but here the neurons are mathematical functions represented as nodes. All a neuron does is take an input, compute its function and then output the result (a prediction, an estimate etc.).
Let's try to understand this in simple terms. If y = {1, 2, 3, 4} for x = {1, 2, 3, 4}, then the function is y = x (a linear equation). If y = {0, 3, 8, 15, ...} for x = {1, 2, 3, 4, ...}, then the function is y = x^2 - 1 (a quadratic equation). In general, the input and output can be related by an equation of any order.
One good thing about neural networks is that, given enough data about the input (x) and output (y), they are remarkably good at figuring out the function that accurately maps x -> y. In other words, if you train on enough data, the NN can find the best fit, whether the relationship is linear, quadratic or a polynomial of some particular order.
Real problems are not as simple as that, but this should give you a layman's understanding.
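To make that concrete, here is a minimal sketch of a tiny neural network learning y = x^2 - 1 purely from examples. The library (scikit-learn) and network size are my own choices for illustration, not something prescribed here:

import numpy as np
from sklearn.neural_network import MLPRegressor

# Training examples that follow y = x^2 - 1; the network is never told the formula.
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = (x ** 2 - 1).ravel()

# A small feed-forward neural network (two hidden layers of 32 neurons each).
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
model.fit(x, y)

# Prediction for x = 2 should come out close to 2^2 - 1 = 3.
print(model.predict(np.array([[2.0]])))

The network only ever sees (x, y) pairs, yet its predictions approximate the underlying quadratic, which is exactly the "figure out the mapping from data" idea above.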
Evolution of AI
Generative AI
Before Gen AI, we had ML models, as shown in the earlier paragraph, that help you make predictions or forecasts based on existing data (the historical data you trained your model on). Generative AI, on the other hand, learns the patterns in the training data and generates new content that is close to it but is not an exact replica.
Generative AI is used in applications such as image generation, video description, text generation, music composition, and content creation etc.
Now you have AI power in your hands; instead of Googling things, try these out to get yourself familiar on the go.
Let's now talk about the elephant in the room: LLMs. What are they? What can they do? Check out the course Startup-School-AI. It will get you familiar with Gen AI and all the terminology I am listing here. The hands-on parts are based on Google's Gemini model, but you can still get the theoretical knowledge and watch the demos to understand things in a bit more detail.
LLMs:
LLMs are models trained on huge amounts of data, enabling them to learn the statistical patterns and semantic relationships within language; once trained, tested and fine-tuned, they are released for the outside world to use. Given a sequence (a sentence, a document etc.), what an LLM does is predict the next token in that sequence. LLMs are typically based on deep learning architectures. Examples include OpenAI's GPT-3 and GPT-4, Google's BERT, LaMDA, PaLM and Gemini, Anthropic's Claude etc.
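As a concrete illustration of "predict the next token", here is a minimal sketch using the Hugging Face transformers library with the small GPT-2 model. Both the library and the model are my choices for the example, not something this article prescribes:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The model sees a sequence of tokens and scores every possible next token.
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Pick the most likely next token (greedy decoding).
next_token_id = logits[0, -1].argmax().item()
print(tokenizer.decode(next_token_id))  # typically " Paris"

Chat assistants are essentially doing this repeatedly: predict a token, append it, predict the next one, and so on.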
These LLMs can now be used to create different solutions. From the outside they look like separate apps, but under the hood they may use the same LLMs, e.g. GitHub Copilot, Microsoft 365 Copilot etc.
Multi-modality: Here models can process and understand data from various sources, such as text, images, audio, video and sensor data, instead of data of a single type. Traditional language models, like the GPT models, primarily focus on processing text. Multi-modal LLMs extend this capability to incorporate and understand other forms of data, such as images, audio, video or structured data.
Prompt Engineering:
Prompt: The text you feed to the model is called the prompt.
Prompt engineering is the art and science of figuring out what text to feed your language model to nudge it to behave in the desired way. You might have an amazing model, but if your prompts are not good enough, you won't get its full potential out of it.
A prompt can be of the following types:
Context: On top of the input, we can also provide some context around it.
Add contextual information to your prompt when you need to give the model extra information or restrict the boundaries of the response to only what is within the prompt.
Input and Context can be used interchangeably.
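Here is a rough sketch of a prompt that carries both an input (the question) and context. The OpenAI Python client is just one possible choice, and the model name is only a placeholder, neither is specified by this article:

from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Context restricts the model to the information we provide; the input is the question.
context = (
    "Return policy: items can be returned within 30 days of delivery "
    "with the original receipt. Refunds are issued to the original payment method."
)
question = "Can I return a jacket I bought three weeks ago?"

prompt = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say you don't know.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

Notice that the instruction, the context and the input are all just text, which is why prompt engineering is largely about how you arrange that text.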
Embedding & Vector Search:
Embeddings are used on the internet every day; you may not be aware of it, but they are part of your daily internet usage, e.g. Google, Spotify, Facebook, Instagram and YouTube recommendations. Almost all the big tech companies use embeddings at their core, yet they have little penetration in IT services, where more than 90% of systems still rely on tabular data. That's why knowing about embeddings is super important, even for business people.
Technically, an embedding is an array of real numbers (a vector) generated by an AI model or a deep learning model (such as an LLM). It is like the meaning of an entire text represented as a single point, a vector of n dimensions.
E.g. the text "What are the books related to Human Behaviour?" can be represented by a vector [0.3, 0.1, 0.25, ...] in vector space.
Vector/embedding space: You can break the text from your file into chunks and find the embedding for each chunk. When you map these, it creates an embedding space in which the embedding model puts chunks (texts) with similar meanings close together. Basically, it is a map of meaning.
You can go to https://www.nomic.ai/ and play around with real visualisation of your data.
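A minimal sketch of turning text chunks into embeddings, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model (my choices for illustration, not this article's):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "What are the books related to Human Behaviour?",
    "Thinking, Fast and Slow explores how people make decisions.",
    "A recipe for a quick weeknight pasta dinner.",
]

# Each chunk becomes a vector; similar meanings end up close together in this space.
embeddings = model.encode(chunks)
print(embeddings.shape)  # (3, 384) - three chunks, 384 dimensions each

The first two chunks are about human behaviour and will land near each other in the embedding space, while the recipe will sit far away.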
Use cases for embeddings include semantic search, recommendations, clustering and classification.
Vector Search:
Embeddings are vectors, so you can calculate the similarity between two embeddings using many metrics; popular ones include cosine similarity, dot product and Euclidean distance.
This is how search works: when you query, your query is first converted into a vector and then searched against the vector DB where you have stored all the source information.
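Here is a rough sketch of that flow with plain NumPy, using cosine similarity as the metric and a brute-force scan standing in for a real vector DB (both are my simplifications for illustration):

import numpy as np

def cosine_similarity(a, b):
    # 1.0 means the vectors point the same way (very similar meaning); near 0 means unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Pretend these came from an embedding model and are stored in a vector DB.
stored_vectors = {
    "book recommendations on human behaviour": np.array([0.30, 0.10, 0.25]),
    "pasta recipe for dinner": np.array([0.05, 0.90, 0.02]),
}

query_vector = np.array([0.28, 0.12, 0.22])  # the embedded user query

# Brute-force nearest-neighbour search: score every stored vector against the query.
scores = {text: cosine_similarity(query_vector, vec) for text, vec in stored_vectors.items()}
best_match = max(scores, key=scores.get)
print(best_match, scores[best_match])

Real vector databases use approximate nearest-neighbour indexes instead of scanning everything, but the idea is the same: find the stored vectors closest to the query vector.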
RAG - Retrieval Augmented Generation:
Let's first understand why we need RAG.
Grounding: Connecting abstract concepts or language representations to real-world knowledge or experiences, i.e. how to integrate LLMs or AI chatbots with existing IT systems, databases and business data.
The LLMs' problem is hallucination (aka the grounding problem):
Hallucination means the model gives results that are not relevant to your query at all (or are plainly wrong), yet it is 100% confident that they are the correct answer.
LLMs are phenomenal for knowledge generation and reasoning. They are pre-trained on large amounts of public data. But LLMs can only work with the information they were trained on or the information provided in the prompt.
LLMs don't have the capability to ask for more information, yet they potentially need some outside input. Let's consider some naive solutions for this.
So we need a different approach then. Here comes RAG.
RAG allows users to provide their own data as context so that query searches are specific to their use case rather than running over the entire world's data. E.g. Company A (an aviation company) and Company B (e-commerce) choose the same LLM (say GPT-4) to build a Q&A app for their users. With RAG, each can bring its own data into a vector DB and augment its users' queries with this additional information before sending the final query to the LLM (the generator) to come up with the correct answers.
Given the prompt, the relevant information is extracted from it and sent to the retriever, which searches the data source you provide (your internal data stored in a vector DB), limiting the search scope instead of searching the whole world's data. The retrieved results are then combined with the original question and passed to the generator, as sketched below.
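Putting the pieces together, here is a highly simplified RAG sketch; the embedding model, the in-memory "vector DB" and the generator call are all stand-ins I chose for illustration:

import numpy as np
from sentence_transformers import SentenceTransformer
from openai import OpenAI

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = OpenAI()

# 1. Index: embed your own documents and keep them in a (toy, in-memory) vector store.
documents = [
    "Flight AV-201 allows one cabin bag up to 7 kg and one checked bag up to 23 kg.",
    "Checked baggage above 23 kg incurs an excess fee of 15 USD per extra kilogram.",
    "Pets are allowed in the cabin only on domestic routes and must be pre-booked.",
]
doc_vectors = embedder.encode(documents)

# 2. Retrieve: embed the user question and pick the most similar chunks (cosine similarity).
question = "How heavy can my checked bag be on AV-201?"
q_vector = embedder.encode([question])[0]
similarities = doc_vectors @ q_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vector)
)
top_chunks = [documents[i] for i in np.argsort(similarities)[::-1][:2]]

# 3. Augment + generate: send the question plus retrieved context to the LLM (the generator).
prompt = (
    "Answer using only the context below.\n\n"
    "Context:\n" + "\n".join(top_chunks) + f"\n\nQuestion: {question}"
)
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

The airline policy lines are made-up sample data; the point is that the generator only ever sees your retrieved chunks plus the question, which is what keeps its answer grounded in your data.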
Here is a quick and nice explanation of RAG, embeddings and more by John Savill.
I trust this overview provides you with a broad understanding of numerous new terminologies in the AI and Gen AI domains. Your feedback or comments are greatly appreciated.
Happy Reading - Gaurav
#GenAI #ArtificialIntelligence #MachineLearning #DeepLearning #Multimodality #Vectorsearch #NaturalLanguageProcessing #NLP #LanguageModel #LLM #RAG #Embedding #WordEmbedding #TextEmbedding #NeuralNetworks #TechTrends #Innovation #FutureTech #DataScience #Automation #EmergingTech #SmartTechnology #AIResearch #AIApplications #AIinBusiness #AIinHealthcare #AIinFinance #AIinEducation #EthicalAI #ResponsibleAI #TechEthics #DigitalTransformation