LLMs and RAG are Great, But Don’t Throw Away Your Inverted Index Yet

Vectors, embeddings, large language models (LLMs), and retrieval-augmented generation (RAG) represent the cutting edge of search architecture, and it is very tempting to believe we can dispense with the traditional inverted index architecture entirely. You should be excited about this brave new world, but you should also proceed with caution.

It is true that embedding-based retrieval addresses many pain points that challenge a traditional inverted index. Embeddings are less susceptible to polysemy (words having multiple meanings) and synonymy (multiple words having the same meaning). And embedding-based retrieval can be especially useful for handling long queries, particularly compared to traditional methods like query expansion and query relaxation.
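To make the synonymy point concrete, here is a toy sketch. The three-dimensional "embeddings" below are made up for illustration (real models produce hundreds of dimensions), but they show how exact token matching scores a synonym pair as zero overlap while cosine similarity in embedding space recognizes that near-synonyms land close together.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings, invented for this example.
embedding = {
    "couch": [0.90, 0.10, 0.20],
    "sofa":  [0.88, 0.12, 0.19],  # near-synonym lands close to "couch"
    "bank":  [0.10, 0.90, 0.30],
}

def token_overlap(q, d):
    # Exact-token matching: synonyms share no tokens, so they score zero.
    return len(set(q.split()) & set(d.split()))

print(token_overlap("couch", "sofa"))                 # 0: no shared tokens
print(cosine(embedding["couch"], embedding["sofa"]))  # close to 1.0
```

Token matching would need an explicit synonym dictionary to bridge that gap; the embedding gets it "for free" from training.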

These sound like great arguments in favor of embedding-based retrieval. So what is the catch? Why are most companies still using a traditional — or at least a hybrid — architecture? Here are some of the main reasons.

Embedding-based retrieval is powerful, but it gains that power at the price of explainability. Vectors from embeddings tend to be less explainable than token-based representations. While a bag of words may not be a perfect representation of content, it is at least simple and understandable. In contrast, embeddings are a black box, making it hard to understand how they affect retrieval and ranking, and even harder to debug.
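The explainability contrast fits in a few lines. With a bag of words, every dimension is a human-readable term, so we can say exactly which tokens caused a match; a dense vector offers no such account.

```python
from collections import Counter

doc = "red running shoes for trail running"
query = "running shoes"

# A bag of words is just token counts: every dimension is a readable term.
bag = Counter(doc.split())

# The match is self-explaining: these query tokens appear in the document.
matched = set(query.split()) & set(bag)
print(matched)  # {'running', 'shoes'}
```

Try explaining a cosine score of 0.83 between two 768-dimensional vectors to a merchandiser, and the difference becomes vivid.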

Embeddings also tend to be task-dependent. A single embedding model may not capture everything about a document or query. For example, in an e-commerce setting, an embedding might be more or less sensitive to variations in product type, brand, or size. Since embedding-based retrieval reduces relevance to a single similarity metric, there is a risk that a single vector representation will not address all search use cases. In contrast, token-based representations, despite being simplistic, are more flexible.
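One way to see that flexibility: token fields can be reweighted per use case, while a single similarity score collapses everything into one number. The product record, fields, and weights below are hypothetical, just to illustrate the mechanism.

```python
# Hypothetical product record with per-field token sets.
product = {"brand": {"acme"}, "type": {"running", "shoes"}, "size": {"10"}}

def field_score(query_tokens, weights):
    # Weighted sum of per-field token overlaps; weights are tuned per application.
    return sum(w * len(query_tokens & product[f]) for f, w in weights.items())

q = {"acme", "shoes"}

# Same query, different use cases: a brand-focused search vs. a type-focused one.
print(field_score(q, {"brand": 3.0, "type": 1.0, "size": 0.5}))  # 4.0
print(field_score(q, {"brand": 0.5, "type": 2.0, "size": 0.5}))  # 2.5
```

A single embedding bakes one implicit weighting into the vector space; changing it means retraining or fine-tuning the model rather than adjusting a few parameters.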

There are also computational challenges. Embeddings tend to be vectors with hundreds of densely populated dimensions. That is not necessarily a showstopper, especially if the documents they represent are large. Also, there are techniques to make the representations more compact. Still, index size matters, especially when vectors need to be kept in memory to minimize the latency of accessing them. Aside from scale concerns, exact nearest-neighbor search is not practical for most latency-sensitive applications, and even approximate nearest-neighbor (ANN) search is slower than performing simple set operations on an inverted index.
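For comparison, here is how cheap conjunctive retrieval is on an inverted index: a term-to-postings map and a set intersection. This is a deliberately minimal sketch, not a production index, but it shows why these "simple set operations" are so fast relative to nearest-neighbor search.

```python
# Minimal inverted index: term -> set of document ids.
index = {
    "running": {1, 2, 5},
    "shoes":   {2, 3, 5},
    "red":     {1, 2},
}

def retrieve(query):
    # AND semantics: intersect the postings for each query token.
    postings = [index.get(t, set()) for t in query.split()]
    return set.intersection(*postings) if postings else set()

print(retrieve("red running shoes"))  # {2}
```

Production engines refine this with sorted postings, skip lists, and compression, but the core operation stays this simple; there is no index structure for dense vectors with a comparably cheap exact lookup.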

And then there is ranking. It is not clear how to combine the query-dependent similarity score with other ranking factors, particularly query-independent desirability factors. Ranking is never easy, but embedding-based retrieval introduces additional complexity.
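A common first attempt is a weighted blend of the two kinds of signal. The sketch below assumes both scores have been normalized to a comparable range; in practice that calibration, and the choice of weight, is exactly where the added complexity lives.

```python
def blended_score(similarity, desirability, alpha=0.7):
    # Query-dependent similarity blended with query-independent desirability
    # (e.g., popularity or quality). alpha is illustrative and needs tuning;
    # the blend is only meaningful if both inputs are on comparable scales.
    return alpha * similarity + (1 - alpha) * desirability

print(blended_score(0.9, 0.2))   # similarity-dominated result
print(blended_score(0.5, 0.95))  # desirability partly compensates
```

With token-based scoring functions like BM25, practitioners have decades of experience with such blends; embedding similarities have different distributions per model and per query, which makes the calibration problem harder.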

Finally, there is the challenge of supporting operations that depend on retrieval, including result counts, filters or facets, and explicit sorts. As we discussed in the previous section, these are hard to implement well when we lack a principled way to manage the precision-recall tradeoff.

These are serious challenges! So it is important to go into embedding-based retrieval cautiously, recognizing that, for many applications, the costs of moving to embedding-based retrieval do not justify giving up the benefits of a traditional inverted index architecture. Or at least not yet.



Rupesh Gupta

AI at LinkedIn

11 months

> Embeddings also tend to be task-dependent. A single embedding model may not capture everything about a document or query.

This is very true. We have observed this for several queries. That's why we decided to keep both keyword-based and embedding-based retrievers.

Abhimanyu Lad

Director of Engineering, LinkedIn Search

11 months

Aw shucks, we just threw away our inverted index! But seriously – I'm pleasantly surprised by how well even off-the-shelf embeddings (like e5-small) are able to match keyword-based relevance for some of our use cases. But yes, we're not doing away with our inv idx just yet, for many of the reasons you mentioned. cc anand, Rupesh, Birjodh.
