The Future of Retrieval Systems & LLMs

Here’s a reality check: nobody with a ton of data, spread across multiple silos, is copying all of it into some new silo.

So the future of LLMs is very simple: they require integrated, sophisticated search capabilities to survive... especially in the enterprise.

Search is the key to Retrieval-Augmented Generation (RAG) - which, despite the many efforts to re-label and re-brand it, remains among the best mechanisms for saving time and money using LLMs.

Putting the “R” in RAG

SWIRL is a metasearch engine designed to RAG across silos, or augment most any RAG with data from one or more silos. There's no need to move all the data into a new vector database – assuming it’s already searchable through some system, vector “aware” or not.

SWIRL adapts the user’s query as necessary – into SQL, for example – then sends it out to one or more endpoints: search engines, databases, enterprise applications and information services. Asynchronously, of course.
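
To make that fan-out concrete, here’s a minimal sketch of the pattern in Python – illustrative only, with hypothetical connectors standing in for SWIRL’s real connector API:

```python
import asyncio

# Hypothetical connectors -- illustrative only, not SWIRL's actual API.
async def search_engine(query: str) -> list[dict]:
    # A real connector would make an HTTP call to e.g. Solr or Elasticsearch.
    await asyncio.sleep(0.1)  # simulate network latency
    return [{"source": "search_engine", "title": f"Hit for: {query}"}]

async def sql_database(query: str) -> list[dict]:
    # Adapt the keyword query into the endpoint's native form, e.g. SQL.
    sql = f"SELECT title FROM docs WHERE body LIKE '%{query}%'"
    await asyncio.sleep(0.1)  # simulate the database round trip
    return [{"source": "sql_database", "title": f"Row via: {sql}"}]

async def federate(query: str) -> list[dict]:
    # Fan out to every endpoint concurrently; no silo blocks another.
    per_source = await asyncio.gather(search_engine(query), sql_database(query))
    return [hit for hits in per_source for hit in hits]

print(asyncio.run(federate("quarterly revenue")))
```

The asyncio.gather call is the point: the slowest silo never serializes the others.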

Then it re-ranks the results using a Reader LLM.

The Selection Problem

If you're trying to build a retriever system using something like LangChain, it’s very cool until you get to the part where you have to figure out which of your data – out of the large amount you might have – is actually relevant to your RAG.

This is the first problem that SWIRL solves.

[Diagram: SWIRL re-ranking]

SWIRL vectorizes the user’s query, as well as each result from each source, using any configured embeddings model. (SWIRL ships with spaCy’s large English model by default.)
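
As a rough illustration of that step – this is not SWIRL’s internal code, and it assumes spaCy’s large English model is installed:

```python
import numpy as np
import spacy

# Assumes the large English model is installed:
#   python -m spacy download en_core_web_lg
nlp = spacy.load("en_core_web_lg")

def embed(text: str) -> np.ndarray:
    # spaCy's doc.vector is the average of the token word vectors
    return nlp(text).vector

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

query_vec = embed("quarterly revenue forecast")
for snippet in ["Q3 revenue guidance was raised", "Photos from the office picnic"]:
    print(round(cosine(query_vec, embed(snippet)), 3), snippet)
```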

It then re-ranks using an algorithm that combines Search, NLP and vector techniques – including term frequency, term surprise factor, source rank, recency, proximity, “aboutness”, entity analysis and soft cosine similarity.
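
As a toy illustration of blending signals like these – the signal names and weights below are hypothetical, and SWIRL’s real algorithm is considerably richer:

```python
# Hypothetical signals and weights -- SWIRL's real algorithm blends many
# more factors (term surprise, proximity, "aboutness", entities, etc.).
WEIGHTS = {"similarity": 0.5, "term_overlap": 0.3, "recency": 0.1, "source_rank": 0.1}

def blended_score(signals: dict[str, float]) -> float:
    # Each signal is assumed to be pre-normalized to the range [0, 1].
    return sum(WEIGHTS[name] * value for name, value in signals.items())

results = [
    {"title": "Q3 revenue guidance",
     "signals": {"similarity": 0.91, "term_overlap": 0.8, "recency": 0.9, "source_rank": 0.7}},
    {"title": "2019 office picnic recap",
     "signals": {"similarity": 0.22, "term_overlap": 0.1, "recency": 0.2, "source_rank": 0.7}},
]
results.sort(key=lambda r: blended_score(r["signals"]), reverse=True)
print([r["title"] for r in results])
```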

Ultimately it seeks results that evidence the user’s query and intent. And it draws a line where the relevancy falls off. (In the Galaxy UI, these are shown as star ratings.)
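
One simple way to “draw the line” – purely illustrative, not SWIRL’s actual cutoff rule – is to cut the ranked list at the largest drop between consecutive relevancy scores:

```python
# Purely illustrative: cut the ranked list at the largest drop between
# consecutive relevancy scores (not SWIRL's actual cutoff rule).
def cutoff(scores: list[float]) -> int:
    drops = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    return drops.index(max(drops)) + 1  # keep everything above the biggest drop

scores = [0.92, 0.88, 0.85, 0.41, 0.38]  # sorted, highest first
print(scores[:cutoff(scores)])  # -> [0.92, 0.88, 0.85]
```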

The XetHub study (https://about.xethub.com/blog/you-dont-need-a-vector-database) shows very clearly that re-ranking can outperform a “full vector” approach. And it definitely costs a fraction of the time, effort and ongoing expense.

Passage Detection and Token Counts

But that’s the first part of RAG: retrieval. The second part, often overlooked, is the “augment” step.

In a nutshell, augment means “putting relevant data into the prompt”. Search result snippets are indicative of the information they link to, but to perform a proper augmentation, the full text of the document needs to be retrieved.

That’s actually the easy part.

Much more important than retrieving any single document is analyzing the set of most relevant documents to remove duplicates, select the most recent, and, perhaps most importantly, pare them down to the relevant portions.

Do you really want to pay to summarize 7,000 tokens of boilerplate PowerPoint when the answer you’re seeking is on slide 17?

That’s the second problem SWIRL solves for you.

SWIRL’s Reader LLM can de-dupe quite effectively using vectors; it can also crack open 1,500 file formats, find the most relevant portions and chunk them in less than 1 second (per X MB) before sending them out to your choice of GAI for summarization, question answering, comparison and/or translation – among others.
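
Here’s an illustrative sketch of the vector de-dupe and passage-selection idea – the embed callable could be any embeddings model (for instance, the spaCy sketch above); none of this is SWIRL’s actual implementation:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def dedupe(passages: list[str], embed, threshold: float = 0.95) -> list[str]:
    # Keep a passage only if it is not a near-duplicate of one already kept.
    kept, kept_vecs = [], []
    for passage in passages:
        vec = embed(passage)
        if any(cosine(vec, kv) > threshold for kv in kept_vecs):
            continue
        kept.append(passage)
        kept_vecs.append(vec)
    return kept

def select_relevant(passages: list[str], query: str, embed, top_k: int = 3) -> list[str]:
    # Rank chunks against the query so only the best few go into the prompt.
    query_vec = embed(query)
    return sorted(passages, key=lambda p: cosine(embed(p), query_vec), reverse=True)[:top_k]
```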

The Future of RAG is ... Search

Want to supercharge your existing RAG systems while also avoiding the overhead of copying and re-indexing? Put SWIRL into your stack!

Check out the video below showing how quickly you can install and configure the Community Edition of SWIRL!

[Video: RAG with SWIRL AI Search]


Sid Probstein

CEO at SWIRL | 10x CTO | AI & Search Pioneer | ex-Attivio

3 months

Predictions are hard. Especially about the future. But after a week at KMWorld I am more confident than ever that search is *the* key to making AI work in the enterprise.

Dave Voutila

Building better AI, ex-Attivio

6 months

It's pretty wild how the dream we tried multiple times to build at Attivio, with semantic understanding of an NLP query, is becoming more possible now with LLMs being generally available.

Robert Yelle

Global Client & Partner Enablement Director

6 months

Stealing the term “aboutness” … good stuff
