Weekly Update: New LLM Models and the Basics of RAG
Kashif Manzoor
Enabling Customers for a Successful AI Adoption | AI Tech Evangelist | AI Solutions Architect
Stay ahead in AI with the Weekly AI Roundup; read and listen on AITechCircle:
Welcome to the weekly AI Newsletter, your go-to source for practical and actionable ideas. I'm here to give you tips you can apply to your job and business immediately.
Before we start, share this week's updates with a friend or a colleague:
Today at a Glance:
RAG Basics: A Beginner’s Guide to Retrieval-Augmented Generation
If you are new to this topic, I suggest going through these two earlier editions of the newsletter before you start reading:
This week, we will start with a very basic RAG built from scratch, based on the reference repository available from Mistral AI. The goal is to clarify your understanding of RAG's internal workings and equip you with the foundational knowledge needed to construct a RAG pipeline with minimal dependencies.
Let's start with installing the required packages:
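A minimal setup sketch follows; the exact package list is my assumption based on the steps below (the mistralai client for embeddings and chat, faiss-cpu as a lightweight vector store, requests to fetch the source text, and numpy for arrays):

pip install mistralai faiss-cpu requests numpy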
Now, let's fetch the data from an article, document, or other web source:
import requests

# Download the full text of Romeo and Juliet from Project Gutenberg
response = requests.get('https://www.gutenberg.org/cache/epub/1513/pg1513.txt')
text = response.text
Then, split the data into Chunks: In a Retrieval-Augmented Generation (RAG) system, breaking the document into smaller chunks is essential for efficiently identifying and retrieving the most relevant information during the retrieval process. In this example, we split the text by characters and grouped 2048 characters into each chunk.
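Here is a minimal sketch of that character-based split, assuming the downloaded text is still in the text variable from the previous step:

# Split the full text into fixed-size chunks of 2048 characters each
chunk_size = 2048
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
print(f"{len(chunks)} chunks created")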
Key points:
Chunk size: To achieve optimal performance in RAG, we may need to customize or experiment with different chunk sizes and overlaps based on the specific use case. Smaller chunks can be more beneficial for retrieval processes, as larger chunks often contain filler text that can obscure semantic representation. Using smaller chunks allows the RAG system to identify and extract relevant information more effectively and accurately. However, be mindful of the trade-offs, such as increased processing time and computational resources, that come with using smaller chunks.
How to split: The simplest method is to split the text by character, but other options are based on the use case and document structure. To avoid exceeding token limits in API calls, you might need to split the text by tokens. Consider splitting the text into sentences, paragraphs, or HTML headers to maintain chunk cohesiveness. When working with code, it’s often best to split by meaningful code chunks, such as using an Abstract Syntax Tree (AST) parser.
Creation of embeddings for each text chunk:
Text embeddings convert text into numeric representations in a vector, enabling the model to understand semantic relationships between words. Words with similar meanings will be closer in this space, which is crucial for tasks like information retrieval and semantic search.
To generate these embeddings, we use Mistral AI’s embeddings API endpoint with the mistral-embed model. We create a function called get_text_embedding to retrieve the embedding for a single text chunk. Then, we use list comprehension to apply this function to all text chunks and obtain their embeddings efficiently.
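As a sketch, assuming the mistralai Python client (v1.x) and an API key stored in the MISTRAL_API_KEY environment variable:

import os
import numpy as np
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

def get_text_embedding(chunk):
    # Call the embeddings endpoint with the mistral-embed model
    response = client.embeddings.create(model="mistral-embed", inputs=chunk)
    return response.data[0].embedding

# Embed every chunk; the result is a (num_chunks, embedding_dim) array
text_embeddings = np.array([get_text_embedding(chunk) for chunk in chunks])

Embedding one chunk per call keeps the sketch simple; the endpoint also accepts a list of inputs per request, which is faster for large documents.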
Loading into Vector Database: after getting the embeddings in place, we need to store them in the Vector Database.
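A minimal sketch using Faiss, the in-memory vector store that the index.search call described below assumes:

import faiss

# Create a flat (exact-search) index sized to the embedding dimension
# and load all chunk embeddings into it
dimension = text_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(text_embeddings.astype("float32"))  # Faiss expects float32 vectors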
Next, the user's question also needs to be converted into embeddings with the same model, which are then used to retrieve similar chunks from the Vector DB.
To search the vector database, we use the index.search method, which takes two arguments: the vector of the question embedding and the number of similar vectors to retrieve. It returns the distances and indices of the vectors in the database most similar to the question vector. Using these indices, we can then retrieve the corresponding relevant text chunks.
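Continuing the sketch (the question below is just an example for the Romeo and Juliet text we downloaded):

# Embed the question with the same model used for the chunks
question = "What were the names of Romeo and Juliet's families?"
question_embedding = np.array([get_text_embedding(question)]).astype("float32")

# Retrieve the distances and indices of the 2 most similar chunks
distances, indices = index.search(question_embedding, 2)
retrieved_chunks = [chunks[i] for i in indices[0]]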
Beyond this simple similarity search, there are other common retrieval methods that may fit your use case better.
Combine Context and Question in a Prompt to Generate a Response:
Lastly, we can use the retrieved text chunks as a context within the prompt to generate a response.
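A sketch of that final step, again assuming the v1 mistralai client (the model name is my choice; any Mistral chat model works):

# Insert the retrieved chunks and the question into a single prompt
prompt = f"""
Context information is below.
---------------------
{' '.join(retrieved_chunks)}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {question}
Answer:
"""

# Ask a chat model to answer using only the supplied context
chat_response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": prompt}],
)
print(chat_response.choices[0].message.content)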
Prompting Techniques for Developing a RAG System: In developing a Retrieval-Augmented Generation (RAG) system, various prompting techniques can significantly enhance the model’s performance and the quality of its responses.
Here are some key techniques that can be applied:
Head over to this link, and you can try building your first simple RAG.
Weekly News & Updates...
Last week's AI breakthroughs marked another leap forward in the tech revolution.
The Cloud: the backbone of the AI revolution
Gen AI Use Case of the Week:
Generative AI use cases in the Government and Public Sector:
Utilizing large language models (LLMs) to simulate urban planning scenarios (Urban Planning / Future of Cities); this use case is derived from Deloitte.
Business Challenges
AI Solution Description
Using large language models (LLMs), generative AI can simulate urban planning scenarios by processing vast data and generating multiple design concepts.
Here’s how it can be done:
Data Integration: The AI model ingests various data sources, including demographic data, environmental reports, infrastructure details, and economic statistics.
Scenario Generation: The LLM processes this data to generate multiple urban planning scenarios. It can create detailed descriptions, visualizations, and potential outcomes for each scenario, as sketched in the code example after this list.
Simulation and Optimization: The generated scenarios are then simulated to predict their impacts. The AI model optimizes these scenarios based on predefined goals, such as sustainability, economic growth, and livability.
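As a purely illustrative sketch of step 2 (everything below, including the city_data fields, the prompt wording, and the model name, is hypothetical and not from Deloitte's write-up), scenario generation could be prototyped with the same kind of chat call used in the RAG walkthrough above:

import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical integrated city data (step 1 above)
city_data = {
    "population": 1_200_000,
    "annual_growth_rate": "2.1%",
    "flood_risk_zones": ["riverfront district", "eastern lowlands"],
    "transit_coverage": "48% of residents within 500 m of a stop",
}

# Ask the LLM to draft candidate planning scenarios (step 2 above)
scenario_prompt = f"""You are an urban planning assistant. Using the data below,
propose three distinct 10-year development scenarios. For each, describe
land use, transport changes, expected trade-offs, and risks.

City data: {city_data}"""

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": scenario_prompt}],
)
print(response.choices[0].message.content)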
Expected Impact/Business Outcome
Required Data Sources
Strategic Fit and Impact
Implementing generative AI in urban planning aligns well with the strategic goals of modernizing infrastructure, improving public services, and fostering sustainable development. The high impact rating reflects its potential to transform urban planning processes, leading to more efficient and effective development outcomes.
Rating: High impact and strategic fit
Favorite Tip Of The Week:
Here's my favorite resource of the week.
Potential of AI
Things to Know...
This week, I liked the AI and Generative AI resources from Georgetown University on how to use Gen AI and cite it in your articles, papers, and research.
"To cite the informational product generated by ChatGPT or other AI, the recommendation is for the Methodology and/or Introduction of your paper to specify the following:
Please remember that if AI connects you to another resource, you need to cite that resource, just as you would in a literature review."
The Opportunity...
Podcast:
Courses to attend:
Events:
Tech and Tools...
Data Sets...
Other Technology News
Want to stay on the cutting edge?
Here's what else is happening in Information Technology you should know about:
Join a mini email course on Generative AI ...
Earlier week's Post:
That's it!
As always, thanks for reading.
Hit reply and let me know what you found most helpful this week - I'd love to hear from you!
Until next week,
Kashif Manzoor
The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.