Build a RAG App With Nvidia NIM and Milvus Running Locally

In the previous post, we built an application that consumes Nvidia NIM APIs and a hosted Zilliz vector database. In this tutorial, we will switch to self-hosted, local deployments of the same components while keeping the codebase unchanged.
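The client-side change amounts to pointing the same code at local endpoints. Below is a minimal sketch, assuming the NIM container exposes its OpenAI-compatible API on port 8000 and Milvus standalone listens on its default port 19530; the model name is illustrative and should match the NIM you actually deploy.

```python
from openai import OpenAI
from pymilvus import MilvusClient

# Previously pointed at Nvidia's hosted endpoint (https://integrate.api.nvidia.com/v1);
# now it targets the NIM container running on this machine (assumed port 8000).
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

# Previously a Zilliz Cloud URI and API key; now the local Milvus standalone instance.
milvus = MilvusClient(uri="http://localhost:19530")

response = llm.chat.completions.create(
    model="meta/llama3-8b-instruct",  # illustrative; use the model served by your NIM
    messages=[{"role": "user", "content": "Say hello from a local NIM."}],
)
print(response.choices[0].message.content)
```

The rest of the RAG pipeline (chunking, embedding, retrieval, prompting) stays as it was; only the connection details differ.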

Nvidia NIM is available both as APIs hosted within Nvidia’s infrastructure and as containers that can be deployed in an on-premises environment. Similarly, we can deploy Milvus as a standalone vector database running in containers. Since Milvus is one of the first open source vector databases to take advantage of GPU acceleration, we can leverage the available GPUs to run the entire stack on an accelerated computing infrastructure.
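Because Milvus offers a GPU build, the same Python client can request a GPU-resident index when the collection is created. Here is a hedged sketch, assuming the standalone instance was started from the GPU image on the default port; the collection name, vector dimension, and index parameters are illustrative.

```python
from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="http://localhost:19530")

# Simple schema: an auto-generated primary key plus a dense embedding field.
schema = client.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=1024)  # dim is illustrative

# GPU_IVF_FLAT is one of Milvus's GPU index types; it requires the GPU build of Milvus.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="GPU_IVF_FLAT",
    metric_type="L2",
    params={"nlist": 1024},
)

client.create_collection("docs", schema=schema, index_params=index_params)
```

Searches against this collection are then executed on the GPU, while the application code remains identical to what we wrote against the hosted service.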

Let’s start by exploring the environment where we deploy this stack. For my generative AI testbed, I installed two Nvidia GeForce RTX 4090 GPUs. Having two GPUs lets us dedicate one to the LLM while scheduling the embedding model and the vector database on the other.

Read the entire article at The New Stack.

Janakiram MSV is an analyst, advisor, and architect. Follow him on Twitter, Facebook, and LinkedIn.
