Creating RagaaS: With AWS Bedrock & LangChain + the Consequential Death of SQL
Nikunj J Parekh
Agentic AI Executive | CTO @ EV Platform | Principal DMTS | Board Advisor | IEEE | Speaker | President, IIT Tech Clubs | Author | Angel Investor
No click-baiting! RagaaS stands for "Retrieval Augmented Generation as a Service" and has nothing to do with my background in music (although the capital S at the end was hint enough ;)).
The focus of my articles is to underline the point that AI is not just for the bold (or italic, irk..), but is easy.
In this post, you'll learn how you can set up and integrate Amazon Bedrock with your LangChain app for an end-to-end RAG pipeline.
A passing comment: LangChain is a good name, though something like GPTProjects or LLMProjects would describe its job just as well. Then again, it is a chain: a one-way flow of generated information (usually text) from LLMs to humans or other consuming systems. If you're familiar with IFTTT, it's like IFTTT for LLMs, except that LangChain is free.
Amazon Bedrock is a fully managed AWS service that gives you access to popular foundation models from leading AI companies like Anthropic and Mistral AI via a single API.
Since it is a fully managed service, it can handle the complete RAG pipeline for your application.
For this tutorial, we're going to work with the Amazon Titan Text G1 - Lite model, specifically amazon.titan-text-lite-v1. We could just as well use other models: for embeddings, OpenAI's "ada" (text-embedding-ada-002) or one of the many that Hugging Face offers; for generation, a model like Claude.
You can use "ada" just like I show below, but let's proceed with Titan Text G1 - Lite for convenience on AWS.
from langchain.embeddings import OpenAIEmbeddings
# Requires OPENAI_API_KEY to be set in your environment
emb = OpenAIEmbeddings(model="text-embedding-ada-002")  # "ada"
vector = emb.embed_query("any text / token / sentence...")
print(len(vector))  # 1536
You can generate and observe embeddings just like that. The vector has 1536 dimensions (this is considered small; Ada is small).
Why choose Amazon Bedrock?
Whether you're an existing AWS customer or new to the platform, Amazon Bedrock is a solid choice for the following reasons:
- Fine-tuning and RAG: Easily fine-tune your choice of foundation models (FMs) and use Bedrock as a RAG-as-a-service
- Serverless and scalable: Scale to production without worrying about infrastructure; AWS scales for you based on your setup and usage - serverless!
- Model evaluation via a single API: Switch between FMs without heavily rewriting code; all FMs integrate with the same Bedrock API (see the short sketch below).
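To make that last point concrete, here's a minimal sketch using the LangChain Bedrock wrapper and the "bedrock" credentials profile we set up later in this article: swapping foundation models is just a different model_id string, while the calling code stays the same. (Both model IDs are examples; you can only call models you've requested access to, as covered below.)
from langchain_community.llms import Bedrock
# Same wrapper, different model_id - the calling code does not change
titan = Bedrock(model_id="amazon.titan-text-lite-v1", credentials_profile_name="bedrock")
claude = Bedrock(model_id="anthropic.claude-v2:1", credentials_profile_name="bedrock")
print(titan.invoke("What is the capital city of Canada?"))
print(claude.invoke("What is the capital city of Canada?"))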
Retrieval-Augmented Generation (RAG)
Retrieval-augmented generation (RAG) is a technique that uses knowledge that wasn't part of a model's initial training data. This helps the model get additional relevant context from specific data sources so its output is enhanced.
Knowledge such as private company documents can be used to improve an LLM's response to a specific user prompt or query.
RAG Options with Amazon Bedrock
We have two options for building a retrieval-augmented generation (RAG) pipeline with Amazon Bedrock.
- Integrate with a data framework: Use Bedrock's foundation models (FMs) for the NLP tasks and handle the RAG pipeline outside of Bedrock with a data framework such as LangChain.
- Knowledge Bases for Amazon Bedrock: Let Bedrock fully handle the RAG pipeline using Knowledge Bases. This could be referred to as: "RAG-as-a-service". Bedrock handles ingestion, embedding, querying, and vector stores and can also provide source attribution from your private documents and data sources.
We're going to implement both options in this tutorial.
Access to Amazon Bedrock
By default, you do not have access to the Amazon Bedrock Foundation Models (FMs). You'll need to:
- Step 1: Add the required permissions to the user
- Step 2: Request access to the foundation model
Step 1: Grant user Bedrock permissions in IAM
Sign into your AWS Console and navigate to the Identity and Access Management (IAM) page. You should then click on the User name that will be used to access the Amazon Bedrock foundation models.
After selecting the user, scroll down to the Permissions tab and choose Add permissions from the dropdown options as shown below:
The last step is to search the list for the AmazonBedrockFullAccess policy. Select it and save the updated settings.
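If you prefer the command line, the same policy can be attached with the AWS CLI (the user name below is a placeholder for your own IAM user):
aws iam attach-user-policy \
  --user-name YOUR_IAM_USER \
  --policy-arn arn:aws:iam::aws:policy/AmazonBedrockFullAccess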
Step 2: Request access to Bedrock's FMs
Now we need to navigate to Amazon Bedrock to request access to the foundation model, in our case the Amazon Titan. Type Bedrock in the AWS console search bar then click on the Amazon Bedrock service, as shown below:
Now, let's go ahead and request access to the Amazon Titan Text G1 - Lite model. You can choose whichever foundation model you like from the available ones and request access, as I've mentioned before. On the top right, click on the Manage model access button.
This will enable selecting which models you want to request access to. For this tutorial, let's go ahead and choose Titan Text G1 - Lite then click on Request access at the bottom of the screen.
You should see the text Access granted in the Access status column next to the Titan Text G1 - Lite once you're done (As shown in the screenshot above).
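You can also double-check which models are available to you from the terminal, once your AWS CLI is configured (we do that in the next section); the region here is just an example:
aws bedrock list-foundation-models --region us-west-1 --query "modelSummaries[].modelId"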
LangChain + Bedrock
Let's set up our environment and write some code that will let our Python LangChain app interact with the foundation model.
To do this, we're going to:
- Set up the work directory
- Create AWS access keys
- Configure the access keys for Boto3 (using the AWS CLI or manually)
- Integrate LangChain and Bedrock
Step 1: Set up the work directory
Create a new directory and a new Python file, say lang.py. The remainder of the article assumes you're working inside a virtual environment (venv) and that your Python module is lang.py; the venv is optional but recommended.
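If you'd like the recommended venv, a minimal setup looks like this (directory and file names match the rest of the article):
mkdir dir-langchain-on-bedrock && cd dir-langchain-on-bedrock
python3 -m venv venv
source venv/bin/activate
touch lang.py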
Now, we'll install Boto3 and LangChain using pip. Here's my simple requirements.txt and the way to install that in my venv.
boto3
langchain
(venv) ~/dir-langchain-on-bedrock $ pip3 install -r requirements.txt
Step 2: Create AWS access keys
To give the application access to our AWS resources, including Bedrock, we'll need to set up AWS authentication credentials for the IAM user. In your AWS console, go to the IAM service, choose your user, navigate to the Security credentials tab, and scroll down until the Access keys card is visible. On the top right, click on the Create access key button and choose Local code from the list (that page just reminds you of the best practices for using access keys). Make sure to copy your keys and keep them handy for the next step below.
Step 3: Configure the AWS access keys for Boto3
We have two options for configuring the AWS access keys: using the AWS CLI, or manual configuration.
Option 1: Using the AWS CLI
This is the fastest way, but it requires the AWS CLI. Opt for this option if you already have the CLI installed on your system.
In your terminal window, type aws configure. This instructs the AWS CLI to generate a credentials file. The CLI will ask you for the AWS Access Key ID, AWS Secret Access Key, Default region name (for example, us-west-1), and Default output format (JSON is used if you leave it blank). The AWS CLI will write your credentials file to ~/.aws/credentials. Boto3 and LangChain will use this file to communicate with Bedrock - or any other AWS service.
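Note that plain aws configure writes to the default profile; since the code later in this article references a profile named "bedrock", you can create it directly with the --profile flag. A sample session (the key values are placeholders):
$ aws configure --profile bedrock
AWS Access Key ID [None]: YOUR_ACCESS_KEY
AWS Secret Access Key [None]: YOUR_SECRET_KEY
Default region name [None]: us-west-1
Default output format [None]: json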
Option 2: Manual configuration
Manually create the credentials file at ~/.aws/credentials, as below:
[bedrock]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
Also create the config file (~/.aws/config) in the same directory. This file stores the resource region; note that in the config file, a named profile uses the "profile" prefix:
[profile bedrock]
region=us-west-1
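Before wiring up LangChain, you can sanity-check the profile with a few lines of Boto3 (a minimal sketch; it only verifies that the profile and its region are picked up):
import boto3
# Uses the [bedrock] profile from ~/.aws/credentials and its region from ~/.aws/config
session = boto3.Session(profile_name="bedrock")
bedrock_runtime = session.client("bedrock-runtime")
print(bedrock_runtime.meta.region_name)  # should print us-west-1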
Step 4: Integrating Bedrock with LangChain
Add the following to your lang.py file:
from langchain.chains import LLMChain
from langchain_community.llms import Bedrock
from langchain_core.prompts import PromptTemplate
# This object will let LangChain know the model to communicate with and the AWS credentials to use
llm = Bedrock(
model_id="amazon.titan-text-lite-v1",
credentials_profile_name="bedrock"
)
# Now, create PromptTemplate
prompt_template = "What is the capital city of {country}?"
prompt = PromptTemplate(
input_variables=["country"], template=prompt_template
)
# Finally, create an LLMChain to prompt the model
chain = LLMChain(llm=llm, prompt=prompt)
response = chain.invoke({"country": "Canada"})
print(response['text'])
Run this Python module at the command prompt (python3 lang.py). If all went well, you'll see a response like this:
Foundation model response: The capital city of Canada is Ottawa.
Was that eaassy?
How do we get a production-quality, managed, end-to-end RAG pipeline as a service (RagaaS)? We implement Knowledge Bases for Amazon Bedrock.
Knowledge Bases
Knowledge Bases for Amazon Bedrock takes care of the full RAG pipeline: Bedrock handles data ingestion, embeddings, storage, and querying. It acts as RAG-as-a-service, fully managed by AWS.
Currently, Knowledge Bases only works with an S3 bucket as the data source.
It will set up the embedding model (of your choice) that converts the contents of the files in the S3 bucket to vector embeddings which it will store in a vector database (of your choice). The RagaaS data flow is shown below:
Limited support for foundation models
At the time of writing, Knowledge Base querying supports only Anthropic (Claude) models for generation. If you want to use those, you'll have to Request model access for them as well!
Creating a new Knowledge base (KB)
A Knowledge base is simply a data / knowledge source. Click on the Knowledge Base item from the sidebar on the Bedrock console then click on the Create Knowledge Base button. Fill in -
- Knowledge base details: Enter any name for your Knowledge base. Then, Choose Create and use a new service role and click on the Next button.
- Set up data source: The only option is an S3 bucket. This bucket must contain the files that will be converted to embeddings and stored in a vector store for future querying. (Tip: you can save this write-up itself as a PDF and use it as a test file in your S3 bucket.) Are you also missing Kinesis, SQS, Kafka, ElastiCache, RDS, Redshift, Athena, DynamoDB, DocumentDB, and Aurora as data sources? No rush, AWS :)
- Select embeddings model and configure vector store: I'm going for the default settings for both. Titan Embeddings G1 - Text v1.2 for embeddings and a new Amazon OpenSearch Serverless vector store.
- Next, click on the Create knowledge base button. Creation takes a few minutes.
- Click on the Sync button to finalize the data ingestion (you can also trigger the sync from code, as sketched below).
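If you prefer automation over clicking Sync, the same ingestion can be started from code. A hedged sketch using Boto3's bedrock-agent client; both IDs are placeholders you'd copy from the Knowledge base page in the console:
import boto3
session = boto3.Session(profile_name="bedrock")
bedrock_agent = session.client("bedrock-agent")  # control-plane client for Knowledge Bases
# Placeholder IDs - copy them from the Knowledge base and data source pages in the console
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId="KNOWLEDGE_BASE_ID",
    dataSourceId="DATA_SOURCE_ID",
)
print(job["ingestionJob"]["status"])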
Testing the Knowledge Base
Hit the Test Knowledge Base tab on the right. Enter your prompt in the text field and click Run.
It should take a few seconds but as you can see, the model's response is accurate and based on the provided PDF file in the S3 bucket. The knowledge base does the similarity search for us, prompts the model, and returns the response.
The complete RAG pipeline is all set up and handled by Amazon Bedrock.
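Under the hood, the console test is a retrieve-and-generate call. Here's a hedged sketch of the same flow with Boto3's bedrock-agent-runtime client; the knowledge base ID and model ARN are placeholders, and the model must be one you've been granted access to:
import boto3
session = boto3.Session(profile_name="bedrock")
kb_runtime = session.client("bedrock-agent-runtime")
# Retrieve relevant chunks from the Knowledge Base and generate an answer in one call
response = kb_runtime.retrieve_and_generate(
    input={"text": "Why is LangChain a good choice?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KNOWLEDGE_BASE_ID",
            "modelArn": "arn:aws:bedrock:us-west-1::foundation-model/anthropic.claude-v2:1",
        },
    },
)
print(response["output"]["text"])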
Integrating with LangChain using the Knowledge Bases Retriever
Now it's time to query our Knowledge base using LangChain. For this example, I am going to use the RetrievalQA chain.
In the same lang.py file, let's import the following packages:
from langchain.chains import RetrievalQA
from langchain_community.retrievers import AmazonKnowledgeBasesRetriever
# Instantiate AmazonKnowledgeBasesRetriever
retriever = AmazonKnowledgeBasesRetriever(
knowledge_base_id="KNOWLEDGE_BASE_ID",
credentials_profile_name="bedrock",
retrieval_config={"vectorSearchConfiguration": {"numberOfResults": 3}},
)
The following fields are required:
- knowledge_base_id: Grab the ID from the Knowledge base page in the AWS console.
- credentials_profile_name: This is the profile that has access to the Amazon Bedrock service, in our case [bedrock].
- retrieval_config: I usually like to return the top 3 similar results from the vector store. Feel free to adjust as you like.
Let's set up our RetrievalQA chain. We'll need to provide it with the llm and retriever objects:
model_kwargs_claude = {"temperature": 0, "top_k": 10, "max_tokens_to_sample": 3000}
llm = Bedrock(
model_id="anthropic.claude-v2:1",
credentials_profile_name="bedrock",
model_kwargs=model_kwargs_claude
)
qa = RetrievalQA.from_chain_type(
llm=llm,
retriever=retriever
)
# Finally, query the FM
query = "Why is LangChain is good choice?"
response = qa.invoke(query)
print(response['result'])
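If you also want to see which chunks the retriever pulled from your S3 documents (the source attribution mentioned earlier), RetrievalQA can return them; a small variation on the chain above:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,  # include the retrieved chunks in the response
)
response = qa.invoke("Why is LangChain a good choice?")
print(response["result"])
for doc in response["source_documents"]:
    print(doc.metadata)  # e.g. the S3 location and relevance score of each chunk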
Share the response you get in the comments! Hopefully, it is the same as the one you got in the AWS Console.
Claude's response is based on the files in the S3 bucket and nothing else. I only had a PDF version of this post in the S3 bucket for the model to reference through the vector database we chose. In the cloud, horizontal scalability is a given: the pipeline is serverless and stateless, and it will keep delivering as a highly performant RagaaS product no matter how many documents you add.
The term RAG-as-a-service becomes evident now: the user does not have to think about any part of the RAG pipeline. We set up the KB and let Amazon Bedrock do the rest.
Summary
If you're looking to manage your own RAG pipeline, you can get up and running in no time by just setting up the required permissions and using the LangChain Bedrock class to connect with one of the foundation models.
Otherwise, it is EASY for anyone to use the Knowledge Bases feature for Amazon Bedrock which will handle creating all the components in your RAG pipeline for you. As we've seen, this includes storage, embedding, querying, data ingestion, and everything in between.
By the way, SQL is dead, you know, right? See the clip below to confirm it for yourself. The rise of RagaaS will bring data and compute power closer to humans than ever before, and with AI, there will finally be more work for the computers and less for humans in getting to an explanation of any documentable topic.
Thanks!