Tutorials on Building Deep Research with Open source, Multimodal Semantic Search, and More!
In this issue:
Build Deep Research with Open Source
OpenAI's Deep Research just dropped, promising AI-powered synthesis of complex topics. But what if you could build your own version that runs locally and is fully open source?
In this tutorial, we explore a DIY research agent that can:
- Reason & plan: break complex questions down into subtopics
- Search Wikipedia: retrieve relevant info, using Milvus for vector storage
- Synthesize reports: use DeepSeek R1 + LangChain to produce structured summaries
Example: asked “How has The Simpsons changed over time?”, the agent refines the query, retrieves insights from Wikipedia, and compiles a structured research report.
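The plan → retrieve → synthesize loop above can be sketched in plain Python. Note that every function body here is a stand-in: in the actual tutorial, planning and synthesis are DeepSeek R1 calls made through LangChain, and retrieval is a vector search over Wikipedia passages stored in Milvus. The corpus and subtopics below are made up for illustration.

```python
# Toy sketch of the plan -> retrieve -> synthesize control flow.
# Stand-in corpus; the real tutorial embeds Wikipedia passages into Milvus.
CORPUS = {
    "early seasons": "The Simpsons debuted in 1989 with satirical family stories.",
    "animation style": "The animation moved from hand-drawn cels to digital ink and paint.",
}

def plan(question):
    """Stand-in for an LLM call that breaks a question into subtopics."""
    return ["early seasons", "animation style"]

def retrieve(subtopic):
    """Stand-in for a Milvus vector search over stored passages."""
    return CORPUS.get(subtopic, "")

def synthesize(question, notes):
    """Stand-in for an LLM call that writes the structured report."""
    body = "\n".join(f"- {topic}: {text}" for topic, text in notes)
    return f"# {question}\n{body}"

def research(question):
    notes = [(topic, retrieve(topic)) for topic in plan(question)]
    return synthesize(question, notes)

report = research("How has The Simpsons changed over time?")
print(report)
```

Swapping the stand-ins for real LLM and vector-search calls preserves this exact structure; that separation is also what makes it easy to later add reflection or multi-step reasoning between the retrieve and synthesize stages.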
Why does this matter? Open-source AI gives you flexibility and control, whether for academia, content creation, or next-gen assistants. Future iterations could integrate real-time web search, reflection, and multi-step reasoning.
Building on this idea, we are open-sourcing an implementation of self-reflection and search that works with any LLM and data source! It can search both the public web and your private knowledge base in a vector database like Milvus / Zilliz Cloud.
Run an agent like Deep Research on your own laptop: GitHub - zilliztech/deep-searcher: Deep Research Alternative to Reasoning About Private Data
The tool works with:
- Embedding models: OpenAI & Voyage AI (part of MongoDB)
- LLM services: OpenAI, DeepSeek, SiliconFlow, Together AI
- PDF parsing: unstructured.io
Multimodal Semantic Search with Images and Text
Humans interpret the world through multiple senses. Why shouldn’t AI do the same? To truly match human understanding, AI must process text, images, and context together.
In this tutorial, we explore multimodal semantic search, showing how AI can connect words and visuals to improve search accuracy. We’ll build a retrieve-and-rerank search app that goes beyond keywords, using:
- Milvus for vector storage
- Visualized BGE for text-image embeddings
- Phi-3 Vision for reranking results
Example: searching for a leopard print phone case using both text (“a phone case with”) and an image of a leopard.
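The two-stage retrieve-and-rerank pattern can be sketched as follows. Both stages are stubbed here: in the tutorial, stage one searches Visualized BGE text-image embeddings stored in Milvus, and stage two asks Phi-3 Vision to score each candidate. The product names, toy 3-dimensional vectors, and the word-overlap "reranker" below are made up for illustration.

```python
import math

# Toy retrieve-and-rerank sketch. Real embeddings come from Visualized BGE
# and live in Milvus; the real rerank score comes from Phi-3 Vision.
PRODUCTS = {
    "leopard print phone case": [0.9, 0.1, 0.4],
    "plain black phone case":   [0.2, 0.8, 0.1],
    "leopard print blanket":    [0.8, 0.2, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_vec, k=2):
    """Stage 1: nearest-neighbour search (here exact cosine, there Milvus ANN)."""
    ranked = sorted(PRODUCTS, key=lambda n: cosine(query_vec, PRODUCTS[n]), reverse=True)
    return ranked[:k]

def rerank(query_text, candidates):
    """Stage 2: stand-in for a vision-language model scoring each candidate."""
    def overlap(name):
        return len(set(query_text.split()) & set(name.split()))
    return sorted(candidates, key=overlap, reverse=True)

# A fused text+image query vector ("a phone case with" plus a leopard image).
hits = retrieve([0.85, 0.15, 0.35])
best = rerank("a phone case with leopard print", hits)[0]
print(best)  # leopard print phone case
```

The point of the split: stage one is cheap and recall-oriented (cast a wide net over the whole catalogue), while stage two is expensive and precision-oriented, which is why the heavyweight vision model only ever sees the top-k candidates.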
What's next? Multimodal AI is unlocking new possibilities, from e-commerce to scientific research. And with open-source tools, anyone can build and experiment.
Check out the video walkthrough to see it in action!
Get Started: Multimodal Semantic Search with Images and Text
Why DeepSeek V3 is Taking the AI World by Storm
Big news in AI! DeepSeek V3 is taking the AI world by storm, delivering GPT-4-level performance at a fraction of the cost. Here’s why everyone’s talking about it:
Smarter, faster, more efficient: DeepSeek V3 introduces Multi-Head Latent Attention (MLA), which compresses the attention key-value cache to speed up processing and reduce memory use. Faster responses, lower compute costs.
Massive power, minimal waste: thanks to Mixture of Experts (MoE), the model only activates the parameters it actually needs for each token. Think of it as AI with a built-in efficiency mode!
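The "only activates what it needs" idea is top-k gating: a small gate network scores every expert, but only the k highest-scoring experts actually run for a given token. A minimal sketch, with made-up expert functions and gate logits (real MoE layers use learned feed-forward experts and route per token inside each transformer block):

```python
import math

# Toy Mixture-of-Experts layer. Four "experts" (stand-in functions here);
# the gate picks the top-k per input, so the rest never execute.
EXPERTS = {0: lambda x: x * 2, 1: lambda x: x + 10,
           2: lambda x: x - 1, 3: lambda x: x * 0}

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_logits, k=2):
    # Keep only the k highest-scoring experts for this input.
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Renormalise the gate over just the chosen experts.
    weights = softmax([gate_logits[i] for i in top])
    # Weighted sum of the selected experts' outputs; unselected experts idle.
    return sum(w * EXPERTS[i](x) for w, i in zip(weights, top))

# The gate prefers experts 1 and 0 here; experts 2 and 3 never run.
y = moe_forward(3.0, gate_logits=[1.0, 2.0, -1.0, 0.5], k=2)
print(round(y, 3))  # 11.117
```

This is why an MoE model can have a huge total parameter count but a much smaller active parameter count per token: compute scales with k, not with the number of experts.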
Predicting the future (well, almost): DeepSeek V3 doesn't just predict one token at a time; it uses Multi-Token Prediction (MTP) to generate multiple tokens in parallel. That means smoother, more natural responses.
It's cheaper: while GPT-4 reportedly cost an estimated $100M+ to train, DeepSeek V3 pulled it off for just $5.6M.
Fully open-source and ready to use: unlike closed models, DeepSeek V3 is MIT-licensed, meaning developers can tinker, test, and build freely.
Milvus x DeepSeek: Build RAG with Milvus and DeepSeek
Unstructured Data Podcast
We have a podcast! Listen to our first episode, all about the AI Agent Revolution, with Zilliz Developer Advocate Stephen Batifol and host Stefan Webb. They dive deep into the evolving world of AI agents: what they are, how they work, and their growing impact on technology and society.
Upcoming Events
Feb 20: Deploying a Multimodal RAG System Using Open Source Milvus, LlamaIndex, and vLLM (virtual)
Discover how to build multimodal RAG systems using open-source tools! Learn to process images, audio, and text together using Milvus, LlamaIndex, and vLLM.
Join our live demo to see it in action.
Feb 26: San Francisco Unstructured Data Meetup (in-person)
Join us at the AWS GenAI Loft in San Francisco for our first Bay Area Unstructured Data Meetup of 2025! We look forward to exciting talks about the latest AI innovations and more.
- Arnab Sinha & Jean Malha will speak about Bedrock's latest feature, Bedrock Data Automation (preview), which simplifies processing unstructured data
- Stefan Webb will be speaking on Combining Lexical and Semantic Search with Milvus 2.5
- Anushrut Gupta from Hasura will speak about pushing AI's accuracy on unstructured data to 100% with PromptQL
March 11: Product Demo: Discover the Power of Zilliz Cloud (virtual)
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications.
This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
Stay Connected
- Join a community of developers on Discord
- Check out our latest articles on the Zilliz blog
- Subscribe to this Milvus LinkedIn Newsletter for weekly updates