Build a RAG App With Nvidia NIM and Milvus Running Locally

In the previous post, we built an application that consumes Nvidia NIM APIs and a hosted Zilliz vector database. In this tutorial, we will switch to self-hosted, local deployments of the same components while keeping the codebase unchanged.
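The client-side change amounts to pointing the same code at local endpoints. Below is a minimal sketch, assuming the NIM container exposes its OpenAI-compatible API on port 8000 and Milvus standalone listens on its default port 19530; the model name is illustrative and should match the NIM you actually deploy.

```python
from openai import OpenAI
from pymilvus import MilvusClient

# Previously pointed at Nvidia's hosted endpoint (https://integrate.api.nvidia.com/v1);
# now it targets the NIM container running on this machine (assumed port 8000).
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

# Previously a Zilliz Cloud URI and API key; now the local Milvus standalone instance.
milvus = MilvusClient(uri="http://localhost:19530")

response = llm.chat.completions.create(
    model="meta/llama3-8b-instruct",  # illustrative; use the model served by your NIM
    messages=[{"role": "user", "content": "Say hello from a local NIM."}],
)
print(response.choices[0].message.content)
```

The rest of the RAG pipeline (chunking, embedding, retrieval, prompting) stays as it was; only the connection details differ.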

Nvidia NIM is available both as APIs hosted within Nvidia’s infrastructure and as containers that can be deployed in an on-premises environment. Similarly, we can deploy Milvus as a standalone vector database running in containers. Since Milvus is one of the first open source vector databases to take advantage of GPU acceleration, we can leverage the available GPUs to run the entire stack on an accelerated computing infrastructure.
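Because Milvus offers a GPU build, the same Python client can request a GPU-resident index when the collection is created. Here is a hedged sketch, assuming the standalone instance was started from the GPU image on the default port; the collection name, vector dimension, and index parameters are illustrative.

```python
from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="http://localhost:19530")

# Simple schema: an auto-generated primary key plus a dense embedding field.
schema = client.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=1024)  # dim is illustrative

# GPU_IVF_FLAT is one of Milvus's GPU index types; it requires the GPU build of Milvus.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="GPU_IVF_FLAT",
    metric_type="L2",
    params={"nlist": 1024},
)

client.create_collection("docs", schema=schema, index_params=index_params)
```

Searches against this collection are then executed on the GPU, while the application code remains identical to what we wrote against the hosted service.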

Let’s start by exploring the environment where we deploy this stack. For my generative AI testbed, I installed two Nvidia GeForce RTX 4090 GPUs. Having two GPUs lets us dedicate one to the LLM while scheduling the embedding model and the vector database on the other.

Read the entire article at The New Stack.

Janakiram MSV is an analyst, advisor, and architect. Follow him on Twitter, Facebook, and LinkedIn.
