Production-Grade LLM Applications that React to Your Data


One of the greatest challenges of applications built on Large Language Models is enabling them to adapt to their evolving environments through real-time pipelines that continuously process real-world data.

If you are building an agent today that depends on retrieval-augmented generation (RAG), where the knowledge is external and non-parametric, the model needs to see updated information all the time.

This is because the underlying documents on which the application relies for its knowledge base are constantly changing. Similarly, if someone is building a chat application where new chat messages are coming in continuously, the agent or model needs to be aware of these updates.

In such cases, you need an index or a set of indexes that are continuously updated.

So the challenge is: after my documents get embedded and put into a vector store, how do I reindex? How do I add a new capability to my pipeline? How do I keep the whole system online, serving the indexes while reindexing, and so on?
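The incremental-update idea can be sketched in a few lines of plain Python (a toy model, not Indexify's API; `embed`, `upsert`, and the in-memory index are stand-ins I made up): hash each document's content and only re-embed when the hash changes, so unchanged documents are never reprocessed and the index stays online.

```python
import hashlib

# Toy in-memory "vector store": doc_id -> (content_hash, embedding).
index: dict[str, tuple[str, list[float]]] = {}

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model.
    return [float(sum(text.encode()))]

def upsert(doc_id: str, text: str) -> bool:
    """Re-embed a document only when its content actually changed.

    Returns True if the index was updated, False if the document was
    unchanged and re-embedding was skipped.
    """
    content_hash = hashlib.sha256(text.encode()).hexdigest()
    if doc_id in index and index[doc_id][0] == content_hash:
        return False  # unchanged: skip reprocessing, index stays as-is
    index[doc_id] = (content_hash, embed(text))
    return True
```

A real system would shard this across workers and persist to an actual vector store, but change detection like this is the heart of staying online while reindexing.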

So we need a solution for building production-grade LLM applications that react to your external, multi-modal data.

You also need a scalable compute engine so that whenever new data comes in, you can index it appropriately and keep the indexes up to date.

To deal with all of that, let me introduce a newly launched tool: Tensorlake's open-source, real-time data framework, Indexify.

It changes the game by simplifying the transition from prototype to production for data-intensive LLM applications at any scale.

I have just started hacking around with it, and so far I find it quite impressive.

Let's look at some of its standout features.

Indexify is a long-running service, like any other microservice in your environment, and it exposes an API for uploading any form of unstructured data.

It makes unstructured data queryable with SQL and semantic search.

You create an Extraction Graph to define multi-step workflows for data transformation, embedding, and structured extraction. It works with many blob stores, vector stores, and structured databases.
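As a rough mental model (plain Python, not Indexify's Extraction Graph API; the step names here are invented), a multi-step extraction workflow is just a chain of transformations where each stage consumes the previous stage's output:

```python
from typing import Callable

Step = Callable[[object], object]

def run_graph(steps: list[Step], data: object) -> object:
    """Run a linear extraction graph: each step transforms the
    previous step's output (e.g. parse -> chunk -> embed)."""
    for step in steps:
        data = step(data)
    return data

def parse(raw: str) -> str:
    return raw.strip()

def chunk(text: str, size: int = 16) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunks: list[str]) -> list[list[float]]:
    # Toy embedding: one number (the chunk length) per chunk.
    return [[float(len(c))] for c in chunks]

vectors = run_graph([parse, chunk, embed], "  some raw document text  ")
```

A real graph can also branch (one input feeding several extractors), but the chaining idea is the same.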

They have even open-sourced automation for deploying to Kubernetes in production.

After the upload, a set of extractors runs in parallel on the cluster and extracts information out of this unstructured data, then continuously updates indexes in something like a Qdrant vector store, or a Postgres data warehouse for semi-structured data.
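Conceptually, that fan-out looks like the following toy sketch (hypothetical extractors and standard-library threads standing in for a real cluster): the same document is handed to several extractors at once, and their results are collected for writing into the relevant indexes.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy extractors standing in for real ones running on a cluster.
def extract_entities(text: str) -> dict:
    # Naive "entity" detection: capitalized words.
    return {"kind": "entities", "value": [w for w in text.split() if w.istitle()]}

def extract_summary(text: str) -> dict:
    # Naive "summary": the first 20 characters.
    return {"kind": "summary", "value": text[:20]}

def run_extractors(text: str) -> list[dict]:
    """Fan one document out to several extractors in parallel and
    collect the results to write into downstream indexes."""
    extractors = [extract_entities, extract_summary]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda f: f(text), extractors))
```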

Local experience: Indexify runs locally without any dependencies, right on your laptop during prototyping. Pipelines developed and tested on a laptop run unchanged in production and scale to thousands of machines in the cloud.


What can you do with Indexify today?

You can create pipelines to parse, summarize, embed, classify, and detect entities in videos, images, and documents, ideal for environments where data constantly evolves and LLMs make crucial decisions on the latest information.

You should use Indexify if:

- You are working with a non-trivial amount of data: thousands of documents, audio files, videos, or images.

- Your data volume grows over time, and LLMs need access to the updated data as quickly as possible.

- You care about reliability and availability of your ingestion pipelines.

- You are working with multi-modal data, or you combine multiple models into a single pipeline for data extraction.

- The user experience of your application degrades if your LLM application reads stale data after the data sources are updated.


Now, some of Indexify's most important specialties:

1. Real-time Processing: Optimized for tasks like summarization, extraction, embedding, and parsing, Indexify excels with frequently updated data. It can ingest any data modality at scale, with incremental updates that don't require re-processing entire documents.

2. Reliability, Multi-Cloud, and Hardware Acceleration:

3. Observability:

4. Tested & Trusted:

5. Extensible & Versatile:

6. Multi-Modality

7. Integrates with All Your Favorite Databases (Qdrant, Pinecone, PgVector, and LanceDB)


Indexify vs LangChain

Indexify complements LangChain by providing a robust platform for indexing large volumes of multi-modal content such as PDFs, raw text, audio, and video. It provides a retriever API to fetch context for LLMs.

Install the Indexify Langchain retriever package -

pip install indexify-langchain        
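The retriever pattern itself is easy to sketch in plain Python (a toy illustration, not the indexify-langchain API; all names here are invented): score stored chunk embeddings against the query embedding and return the top-k chunks as context for the LLM.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float],
             store: list[tuple[str, list[float]]],
             k: int = 2) -> list[str]:
    """Rank (chunk_text, embedding) pairs by similarity to the query
    and return the top-k chunk texts."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```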

Indexify vs LlamaIndex

LlamaIndex and Indexify are complementary: you can use LlamaIndex's query engine and other components, such as data loaders, to ingest content for transformation and extraction with Indexify.

Indexify is a distributed data framework and compute engine; your extraction and data-processing workflows run asynchronously and reliably in Indexify. LlamaIndex is an LLM application framework for querying data from vector stores and synthesizing responses with LLMs. Its open-source library doesn't include a fault-tolerant, reliable distributed orchestration engine, nor a deletion framework or a robust incremental compute engine for when data sources are updated or deleted.

Indexify vs Spark

Spark works well with tabular data and with compute functions written in Java. Indexify is faster than Spark because it doesn't rely on an external scheduler like Kubernetes or Mesos for task scheduling. Spark, being only a compute engine, doesn't remember where extracted features are written, so you would also have to build a control plane to track data if deleting or updating it is necessary for your use case. Indexify tracks data lineage and updates extracted content when the source changes.
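The control-plane idea can be illustrated with a toy lineage tracker (hypothetical code, not Indexify's implementation): record where each source's derived artifacts were written, so that deleting the source cascades to every artifact it produced.

```python
# Lineage: source_id -> list of "store/key" references it produced.
lineage: dict[str, list[str]] = {}
# Toy downstream stores (e.g. a vector store and a structured table).
stores: dict[str, dict[str, str]] = {"vectors": {}, "tables": {}}

def write_artifact(source_id: str, store: str, key: str, value: str) -> None:
    """Write a derived artifact and remember where it went."""
    stores[store][key] = value
    lineage.setdefault(source_id, []).append(f"{store}/{key}")

def delete_source(source_id: str) -> None:
    """Cascade-delete every artifact derived from a source."""
    for ref in lineage.pop(source_id, []):
        store, key = ref.split("/", 1)
        stores[store].pop(key, None)
```

Without a record like `lineage`, a pure compute engine has no way to know what to clean up when a source document is deleted or updated.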


All the important links are below:

Indexify official site - https://getindexify.ai/

GitHub - https://github.com/tensorlakeai/indexify

Read their announcement blog - https://medium.com/tensorlake-ai/announcing-indexify-a36f69967884

Official documentation - docs.getindexify.ai

Link to a notebook demonstrating how Indexify can quickly extract insights from SEC 10-K filings:

https://docs.getindexify.ai/examples/SEC_10_K_docs/
