Vector Insights: Milvus News, RAG Developments & AI/ML Terms

In this issue:

  • Building a RAG Application with Milvus and Databricks DBRX
  • AI/ML Terms You May Not Know
  • Connect with us: Milvus Office Hour & Discord
  • Upcoming Events

Building a RAG Application with Milvus and Databricks DBRX

DBRX is Databricks’s open-source model with a fine-grained mixture-of-experts (MoE) architecture. In this tutorial, we will explore how to build a robust RAG application by combining the capabilities of Milvus, a scalable vector database optimized for similarity search, and DBRX.

  • Milvus: enables efficient handling and querying of large-scale embeddings
  • DBRX: provides cutting-edge natural language processing (NLP) capabilities
  • Milvus + DBRX: ideal for applications such as knowledge management, customer support, and personalized recommendations
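To make the retrieval-then-generate flow concrete, here is a minimal sketch of the RAG retrieval step in pure Python. A stub character-count "embedding" and an in-memory list stand in for a real embedding model and for Milvus; all names and data are illustrative, not the tutorial's actual code.

```python
import math

def embed(text: str) -> list[float]:
    # Stub embedding: counts of a few letters, normalized to unit length.
    # A real system would call an embedding model here.
    vec = [float(text.count(c)) for c in "abcdefgh"]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# In-memory stand-in for a Milvus collection of (text, vector) pairs.
docs = ["dbrx handles generation", "milvus handles vector search", "cabbage recipe"]
index = [(d, embed(d)) for d in docs]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Similarity search: rank stored vectors against the query vector.
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

# The retrieved passages become the context that an LLM (e.g. DBRX)
# would receive alongside the user's question.
context = retrieve("vector search with milvus")
prompt = "Answer using context:\n" + "\n".join(context) + "\nQ: vector search with milvus"
```

In the full tutorial, Milvus replaces the in-memory list and DBRX consumes the assembled prompt; the shape of the pipeline (embed, search, assemble context, generate) stays the same.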

Get Started: Build RAG with Milvus and Databricks DBRX

AI/ML Terms You May Not Know

Mixture-of-Agents (MoA)

Some LLMs excel at solving mathematical problems, while others are better suited for coding tasks. This diversity makes it challenging to select the most suitable LLM for our needs, especially when dealing with multi-domain use cases. Mixture-of-Agents (MoA) is a framework where multiple specialized LLMs, or "agents," collaborate to solve tasks by leveraging their unique strengths.

MoA combines several LLMs with different strengths and capabilities into a single system. When a user submits a query, each LLM in the system generates a response; a designated aggregator LLM then synthesizes all these responses into one coherent answer for the user, as shown in the visualization below:

Figure: Mixture-of-Agents concept.
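The flow above can be sketched in a few lines. The "agents" here are plain Python stand-ins for calls to specialized LLMs, and the synthesizer just concatenates drafts to show the data flow; in practice each would be an API call to a different model. All names are illustrative.

```python
def math_agent(query: str) -> str:
    # Stand-in for an LLM that is strong at mathematical reasoning.
    return f"[math view] {query}"

def code_agent(query: str) -> str:
    # Stand-in for an LLM that is strong at coding tasks.
    return f"[code view] {query}"

def synthesize(query: str, drafts: list[str]) -> str:
    # A designated aggregator LLM would receive the query plus every
    # draft answer and produce one coherent response; here we simply
    # join the drafts to make the pattern visible.
    return f"Answer to '{query}' based on: " + " | ".join(drafts)

def mixture_of_agents(query: str) -> str:
    # Fan the query out to every specialized agent, then aggregate.
    drafts = [agent(query) for agent in (math_agent, code_agent)]
    return synthesize(query, drafts)
```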

Learn More: Mixture-of-Agents (MoA): How Collective Intelligence Elevates LLM Performance

Matryoshka Embeddings: Detail at Multiple Scales

Named after Russian nesting dolls (see visual below), Matryoshka embeddings are a clever approach to creating more efficient vector embeddings. Unlike traditional methods that compress vectors after creation, these embeddings are trained to contain multiple scales of representation within a single vector – like nested dolls of increasing detail. The key innovation is that each prefix of the vector (first half, first quarter, etc.) provides a complete, coherent representation of decreasing precision. This allows systems to choose the precision level needed dynamically.

Figure: Visualization of Matryoshka embeddings with multiple layers of detail
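The "choose your precision" idea can be sketched as prefix truncation plus re-normalization. The vector below is made up for illustration; real Matryoshka models are trained so that each prefix remains a usable embedding on its own.

```python
import math

def normalize(v: list[float]) -> list[float]:
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def truncate(embedding: list[float], dim: int) -> list[float]:
    # Keep only the first `dim` components, then re-normalize so that
    # cosine similarity stays well defined at the smaller scale.
    return normalize(embedding[:dim])

# An 8-dimensional "full" embedding (illustrative values): with a
# Matryoshka-trained model, the leading components carry the coarsest
# information, so prefixes degrade gracefully.
full = normalize([0.9, 0.4, 0.1, 0.05, 0.02, 0.01, 0.005, 0.001])
half = truncate(full, 4)      # cheaper, lower-precision representation
quarter = truncate(full, 2)   # coarser still
```

A system can index the short prefixes for fast candidate retrieval and re-rank with the full vectors, trading precision for speed dynamically.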

Learn More:

LoRA: Low-Rank Adaptation for Fine-Tuning LLMs

LoRA is a breakthrough technique that makes fine-tuning large language models practical and efficient. Instead of updating all model weights during fine-tuning (which is computationally expensive), LoRA introduces small, trainable matrices into specific layers of the model. This clever approach dramatically reduces memory requirements and training costs while maintaining model quality. It's become a go-to method for developers and researchers who need to adapt massive language models (like GPT or BERT) to specific tasks without access to extensive computing resources.
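The core trick can be shown with a toy weight update in pure Python: instead of training the full weight matrix W (d_out × d_in), LoRA trains two small matrices B (d_out × r) and A (r × d_in) with r much smaller than the full dimensions, and uses W + (alpha / r) · B·A at inference. Dimensions and values below are illustrative.

```python
def matmul(X: list[list[float]], Y: list[list[float]]) -> list[list[float]]:
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_weights(W, A, B, alpha: float, r: int):
    # The adapted weights: frozen W plus a scaled low-rank update B @ A.
    delta = matmul(B, A)          # rank <= r
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# 4x4 frozen weight matrix with a rank-1 adapter: only 8 trainable
# numbers (B and A) instead of 16, and the savings grow with size.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
B = [[0.1], [0.2], [0.0], [0.0]]   # d_out x r
A = [[1.0, 0.0, 1.0, 0.0]]         # r x d_in
W_adapted = lora_weights(W, A, B, alpha=1.0, r=1)
```

For a real d × d layer, the trainable parameter count drops from d² to 2·d·r, which is why LoRA fine-tuning fits on modest hardware.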

Learn More: Efficient LLM Fine-tuning with LoRA (Low-Rank Adaptation)

Milvus Office Hours

We are here to help! Join our expert team every week for a 20-minute one-on-one session to get insights, guidance, and answers to your Milvus implementation questions. Get expert help with:

  • Performance optimization
  • Schema & index design
  • Scaling strategies
  • Troubleshooting
  • Framework integrations
  • Latest features

Join Milvus Office Hours to Get Support from Vector DB Experts!

Milvus Discord

Join our Discord channel to engage with our engineers and community members. Come learn about our latest webinars & announcements and ask us your Milvus questions! We have a chatbot powered by Inkeep and Zilliz in the #ask-ai-for-help channel. If you think other people might have the same question, ask it there so they can benefit from the answer too.

Join us: milvus.io/discord

Upcoming Events

Jan 14: Data for AI Meetup Presents: The Open Source Afterparty (in-person)

Join Stefan Webb and speakers from exciting companies in SF at the Data for AI Meetup on January 14! He will give a talk on how multimodal foundation models understand tabular data and demonstrate practical ways to unlock hidden value in your organization's data assets.

Save Your Spot

Jan 16: How to Optimize Your Embedding Model Selection and Development through TDA Clustering (virtual)

Join us on January 16 at 9:00 AM PT for a deep dive into optimizing embedding model selection using Topological Data Analysis (TDA).

Learn how TDA clustering can revolutionize your vector database applications through efficient model evaluation and performance measurement. Gunnar Carlsson, Co-Founder of BluelightAI, alongside Sr. Data Scientist Gabriel A., will share practical insights on:

  • Evaluating embedding models using navigable TDA clusters
  • Real-world ML lifecycle case studies in e-commerce
  • Overcoming limitations of traditional evaluation methods

Register now to transform how you select and fine-tune embedding models for your specific use cases.

Save Your Spot

Jan 23: A Table is Worth 1000 Words (virtual)

In case you can’t make it in person in SF, Stefan will be doing an online webinar on the same topic! In this talk, Stefan Webb will explore how new multimodal foundation models are trained to understand tabular data and demonstrate practical ways to unlock hidden value in your organization's data assets.

Save Your Spot

Jan 22-Jan 23: RAG talks at Open Data Science Conference (ODSC) AI Builders Summit (virtual)

AI Builders Summit is a 4-week virtual training event designed to equip data scientists, ML and AI engineers, and innovators with the latest advancements in large language models (LLMs), AI agents, and Retrieval-Augmented Generation (RAG). In week 2, Stefan Webb will speak about “Evaluating Retrieval-Augmented Generation and LLM-as-a-Judge Methodologies.”

The session covers how to quantify the performance of both retrieval and generation using open-source tools.

Attendees will learn how to implement a complete RAG evaluation pipeline to explore design choices in a principled way and leave with practical code examples and best practices that can be applied in real-world scenarios.

Save Your Spot

Jan 25: Women in AI RAG Hackathon (in-person)

Join us for an exciting one-day hackathon at Stanford that celebrates and empowers women in technology! Hosted in partnership with Zilliz, Women Who Do Data (W2D2), The GenAI Collective, TwelveLabs, StreamNative, OmniStack, and Arize AI.

Palo Alto, California

Saturday, Jan 25

8:30 AM – 8:00 PM

Choose your AI adventure and develop a Retrieval-Augmented Generation (RAG) system using Milvus Lite vector database technology.

Application Note: Space is limited! While we can't guarantee acceptance, we encourage continued interest and applications for future events.

Apply Now

Feb 26: San Francisco Unstructured Data Meetup (in-person)

Join us at the AWS Loft in San Francisco for our first Bay Area Unstructured Data Meetup of 2025! We look forward to exciting talks about the latest AI innovations and more. More details coming soon.

Sneak peek: Stefan Webb will be speaking on Combining Lexical and Semantic Search with Milvus 2.5.

Save Your Spot

