Vector Insights: Milvus News, RAG Developments & AI/ML Terms
In this issue:
Building a RAG Application with Milvus and Databricks DBRX
DBRX is Databricks' open-source large language model built on a fine-grained mixture-of-experts (MoE) architecture. In this tutorial, we explore how to build a robust RAG application by combining Milvus, a scalable vector database optimized for similarity search, with DBRX.
- Milvus: enables efficient handling and querying of large-scale embeddings
- DBRX: provides cutting-edge natural language processing (NLP) capabilities
- Milvus + DBRX: ideal for applications such as knowledge management, customer support, and personalized recommendations
Get Started: Build RAG with Milvus and Databricks DBRX
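The retrieve-then-generate loop behind this kind of RAG app can be sketched in a few lines. This is a minimal, self-contained illustration, not the tutorial's code: `embed` is a toy bag-of-words embedder standing in for a real embedding model, the in-memory `index` stands in for a Milvus collection, and `generate` stands in for a call to DBRX.

```python
import math
import re
from collections import Counter

def embed(text, dim=64):
    """Toy embedding: hashed bag-of-words, unit-normalized.
    (A real app would call an embedding model here.)"""
    vec = [0.0] * dim
    for word, count in Counter(re.findall(r"[a-z]+", text.lower())).items():
        vec[hash(word) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

# 1. Index documents (Milvus would store and search these vectors at scale).
docs = [
    "Milvus is a vector database built for similarity search.",
    "DBRX is an open mixture-of-experts language model.",
    "RAG grounds LLM answers in retrieved documents.",
]
index = [(embed(d), d) for d in docs]

# 2. Retrieve the top-k most similar documents for a query.
def retrieve(query, k=2):
    qv = embed(query)
    return [d for _, d in sorted(index, key=lambda p: -cosine(p[0], qv))[:k]]

# 3. Generate: in the real app, this prompt is sent to DBRX.
def generate(query, context):
    return f"Answer '{query}' using context: {' '.join(context)}"

context = retrieve("What is a vector database?")
answer = generate("What is a vector database?", context)
```

The design choice to keep retrieval (Milvus) and generation (DBRX) as separate stages is what lets each scale independently.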
AI/ML Terms You May Not Know
Mixture-of-Agents (MoA)
Some LLMs excel at solving mathematical problems, while others are better suited for coding tasks. This diversity makes it challenging to select the most suitable LLM for our needs, especially when dealing with multi-domain use cases. Mixture-of-Agents (MoA) is a framework where multiple specialized LLMs, or "agents," collaborate to solve tasks by leveraging their unique strengths.
MoA combines several LLMs with different strengths and capabilities into a single system. When a user submits a query, each LLM in the system generates a response; a designated aggregator LLM then synthesizes all these responses into one coherent answer for the user, as shown in the visualization below:
Figure: Mixture-of-Agents concept.
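The flow in the figure reduces to two steps: fan out the query to every agent, then have an aggregator merge the drafts. In this sketch the agents are hypothetical lambdas standing in for real LLM API calls, and the aggregator simply joins proposals into one prompt-like string where a real system would ask another LLM to synthesize them.

```python
def ask_agents(query, agents):
    """Layer 1: each specialized agent proposes its own answer."""
    return [agent(query) for agent in agents]

def aggregate(query, proposals):
    """Layer 2: an aggregator LLM would synthesize the proposals;
    here we just collect them into one string (stand-in)."""
    joined = "\n".join(f"- {p}" for p in proposals)
    return f"Synthesized answer to '{query}' from:\n{joined}"

# Hypothetical specialized agents (stand-ins for real LLM calls).
agents = [
    lambda q: f"[code-agent] draft for: {q}",
    lambda q: f"[math-agent] draft for: {q}",
]
proposals = ask_agents("Sort a list in O(n log n)", agents)
final = aggregate("Sort a list in O(n log n)", proposals)
```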
Matryoshka Embeddings: Detail at Multiple Scales
Named after Russian nesting dolls (see visual below), Matryoshka embeddings are a clever approach to creating more efficient vector embeddings. Unlike traditional methods that compress vectors after creation, these embeddings are trained to contain multiple scales of representation within a single vector – like nested dolls of increasing detail. The key innovation is that each prefix of the vector (first half, first quarter, etc.) provides a complete, coherent representation of decreasing precision. This allows systems to choose the precision level needed dynamically.
Figure: Visualization of Matryoshka embeddings with multiple layers of detail
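The prefix property described above is what a system actually uses at query time: keep the first `dim` entries of the full vector and re-normalize, and you have a coherent lower-precision embedding. A minimal sketch with a toy 8-dimensional vector:

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` entries and re-normalize to unit length."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(v * v for v in prefix)) or 1.0
    return [v / norm for v in prefix]

full = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]  # toy 8-d embedding
half = truncate_embedding(full, 4)     # coarser, but still a valid embedding
quarter = truncate_embedding(full, 2)  # coarser still
```

A common pattern is to search a large collection with short prefixes for speed, then re-rank the top candidates with the full vectors.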
Learn More:
LoRA: Low-Rank Adaptation for Fine-Tuning LLMs
LoRA is a breakthrough technique that makes fine-tuning large language models practical and efficient. Instead of updating all model weights during fine-tuning (which is computationally expensive), LoRA introduces small, trainable matrices into specific layers of the model. This clever approach dramatically reduces memory requirements and training costs while maintaining model quality. It's become a go-to method for developers and researchers who need to adapt massive language models (like GPT or BERT) to specific tasks without access to extensive computing resources.
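The arithmetic that makes LoRA cheap: instead of learning a full d×d weight update ΔW, it learns two small matrices A (d×r) and B (r×d) with rank r much smaller than d, so the adapted layer uses W + A·B while W stays frozen. A pure-Python sketch with toy sizes (real fine-tuning would use a library such as Hugging Face PEFT):

```python
def matmul(X, Y):
    """Plain list-of-lists matrix multiply."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

d, r = 4, 1  # hidden size 4, rank-1 update: 2*d*r = 8 params vs d*d = 16
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
A = [[0.1] for _ in range(d)]   # d x r, trainable
B = [[0.2, 0.0, 0.0, 0.0]]      # r x d, trainable

delta = matmul(A, B)            # low-rank update Delta-W = A @ B
adapted = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]
```

At realistic sizes (d in the thousands, r of 8-64) the trainable parameter count drops by orders of magnitude, which is the memory saving the paragraph describes.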
Milvus Office Hours
We are here to help! Join our expert team every week for a 20-minute one-on-one session to get insights, guidance, and answers to your Milvus implementation questions. Get expert help with:
Milvus Discord
Join our Discord channel to engage with our engineers and community members. Come learn about our latest webinars and announcements, and ask us your Milvus questions! We have a chatbot powered by Inkeep and Zilliz in the #ask-ai-for-help channel. If you think other people might have the same question, ask it here.
Join us: milvus.io/discord
Upcoming Events
Jan 14: Data for AI Meetup Presents: The Open Source Afterparty (in-person)
Join Stefan Webb and speakers from exciting companies in SF at the Data for AI Meetup on January 14! He will give a talk on how foundation models understand tabular data and demonstrate practical ways to unlock hidden value in your organization's data assets.
Jan 16: How to Optimize Your Embedding Model Selection and Development through TDA Clustering (virtual)
Join us on January 16 at 9:00 AM PT for a deep dive into optimizing embedding model selection using Topological Data Analysis (TDA).
Learn how TDA clustering can revolutionize your vector database applications through efficient model evaluation and performance measurement. Gunnar Carlsson, Co-Founder of BluelightAI, alongside Sr. Data Scientist Gabriel A., will share practical insights on:
Register now to transform how you select and fine-tune embedding models for your specific use cases.
Jan 23: A Table is Worth 1000 Words (virtual)
In case you can’t make it in person in SF, Stefan will be doing an online webinar on the same topic! In this talk, Stefan Webb will explore how new multimodal foundation models are trained to understand tabular data and demonstrate practical ways to unlock hidden value in your organization's data assets.
Jan 22-Jan 23: RAG talks at Open Data Science Conference (ODSC) AI Builders Summit (virtual)
AI Builders Summit is a 4-week virtual training event designed to equip data scientists, ML and AI engineers, and innovators with the latest advancements in large language models (LLMs), AI agents, and Retrieval-Augmented Generation (RAG). In week 2, Stefan Webb will speak about “Evaluating Retrieval-Augmented Generation and LLM-as-a-Judge Methodologies.”
Quantify the performance of retrieval and generation with these open-source tools:
Attendees will learn how to implement a complete RAG evaluation pipeline to explore design choices in a principled way and leave with practical code examples and best practices that can be applied in real-world scenarios.
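One of the simplest metrics in a retrieval-side evaluation pipeline like the one described is recall@k (hit rate): did the ground-truth document appear in the top-k retrieved results? This is a generic sketch of that metric, not code from the talk; open-source RAG evaluation frameworks wrap this kind of retrieval metric together with generation-side LLM-as-a-judge scoring.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of queries whose relevant doc appears in the top-k results.
    `retrieved` is a list of ranked result lists, one per query;
    `relevant` is the ground-truth doc id for each query."""
    hits = sum(1 for got, want in zip(retrieved, relevant) if want in got[:k])
    return hits / len(relevant)

# Toy evaluation set: 3 queries, each with ranked retrieval results.
retrieved = [["doc3", "doc1"], ["doc2", "doc7"], ["doc9", "doc4"]]
relevant = ["doc1", "doc2", "doc5"]
score = recall_at_k(retrieved, relevant, k=2)  # 2 of 3 queries hit
```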
Jan 25: Women in AI RAG Hackathon (in-person)
Join us for an exciting one-day hackathon at Stanford that celebrates and empowers women in technology! Hosted in partnership with Zilliz, Women Who Do Data (W2D2) , The GenAI Collective , TwelveLabs , StreamNative , OmniStack , and Arize AI .?
Palo Alto, California
Saturday, Jan 25
8:30 AM - 8:00 PM
Choose your AI adventure and develop a Retrieval-Augmented Generation (RAG) system using Milvus Lite vector database technology.
Application Note: Space is limited! While we can't guarantee acceptance, we encourage continued interest and applications for future events.
Feb 26: San Francisco Unstructured Data Meetup (in-person)
Join us at the AWS Loft in San Francisco for our first Bay Area Unstructured Data Meetup of 2025! We look forward to exciting talks about the latest AI innovations and more. More details coming soon.
Sneak peek: Stefan Webb will be speaking on “Combining Lexical and Semantic Search with Milvus 2.5.”