LLMs Are Transforming AI Apps: Here's How.
Mahdad Kiyani
AWS Partner (APN-Software Solutions) | AWS SA Professional | Azure AZ-305 | ML & Data Engineering | IT Governance & SAFe Agilist | ITIL Leader | MBA (expected June 2025) | ISO 27001 Lead Auditor
Generative AI and large language models (LLMs) are revolutionizing the way applications work and becoming an essential part of the application stack. Their transformative impact is comparable to the internet's rise in 1994. As a result, corporations are now racing to integrate AI into their business operations.
Generative AI, particularly through LLMs like ChatGPT, goes beyond simple tasks such as writing blog posts or assisting with code development. These models are increasingly being used to build dynamic and interactive AI interfaces known as "agents." These agents, which pair a database of comprehensive data with the language capabilities of an LLM, represent the present and future of mobile apps.
The integration of LLMs into applications enables powerful and engaging user experiences. These applications offer dynamic interactions, access to vast amounts of public and proprietary data, and adaptability to specific situations. Such capabilities were not readily available until recently.
Moreover, the technology has evolved to the point that almost anyone with the appropriate database and APIs can build these experiences. Let's delve into what that entails.
How Generative AI Transforms Applications
When people hear "agent" and "AI" together, they might envision a simple chatbot that appears as a popup window on an e-commerce website, offering basic conversational prompts and FAQ-based responses.
However, LLMs can do much more when they have access to the right data. Applications built on LLMs can provide advanced interaction methods that deliver expertly curated, specific, and prescient information.
Consider the following example: Imagine wanting to construct a deck in your backyard and using a mobile application from a home improvement store to create a shopping list.
By leveraging an LLM like GPT-4 and multiple data sources (e.g., the store's product catalog, inventory, customer information, order history, and other relevant data), the application can easily inform you of the materials needed for your DIY project.
But the application doesn't stop there.
If you describe the dimensions and features you desire for your deck, it can provide visualization tools and design aids. It can identify nearby stores with the required items in stock based on your postal code.
Furthermore, drawing from your purchase history, the application might suggest hiring a contractor and provide contact information for local professionals.
Considering factors like drying time for deck stain and seasonal climate trends, the application can estimate when you'll be able to host that planned birthday party on your deck.
The application can also assist you with related areas, such as project permit requirements and the impact of construction on your property value. If you have additional questions, the application acts as a helpful assistant, guiding you every step of the way.
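To make the data side of this example concrete, here is a minimal sketch of how an application might gather context from several store systems before ever calling the LLM. Every function name and data value here is a hypothetical stand-in for a real database query or internal API call.

```python
# Hypothetical data-gathering step behind the deck example: merge catalog
# and inventory lookups into plain text an LLM can reason over.

def fetch_project_materials(project: str) -> list[dict]:
    """Hypothetical catalog lookup: products needed for a project type."""
    return [
        {"sku": "LUM-2x6-12", "name": "2x6 pressure-treated board, 12 ft"},
        {"sku": "HW-DECKSCREW", "name": "Exterior deck screws, 5 lb box"},
    ]

def fetch_local_inventory(skus: list[str], postal_code: str) -> dict:
    """Hypothetical inventory check: which nearby stores stock these SKUs."""
    return {sku: {"store": "Store #1042", "in_stock": True} for sku in skus}

def build_shopping_context(project: str, postal_code: str) -> str:
    """Merge the two data sources into a text block for the LLM prompt."""
    materials = fetch_project_materials(project)
    inventory = fetch_local_inventory([m["sku"] for m in materials], postal_code)
    lines = [
        f"- {m['name']} ({m['sku']}): "
        f"{'in stock' if inventory[m['sku']]['in_stock'] else 'out of stock'} "
        f"at {inventory[m['sku']]['store']}"
        for m in materials
    ]
    return "Available materials near " + postal_code + ":\n" + "\n".join(lines)

print(build_shopping_context("backyard deck", "94103"))
```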
Utilizing LLMs in Your Application Is Easier Than You Think
Incorporating generative AI into projects is not confined to big established enterprises. Many organizations, including some of the largest DataStax customers, are currently working on multiple projects involving generative AI.
Building LLM-based applications doesn't require extensive knowledge of machine learning, data science, or ML model training. In fact, all it takes is a developer who can make a database call and an API call.
Creating applications that provide unprecedented levels of personalized context is a reality accessible to anyone with the right database, a few lines of code, and an LLM like GPT-4.
LLMs are user-friendly tools that take context (often called a "prompt") and generate a response accordingly. To build an agent, developers must consider how to provide the appropriate context to elicit the desired response from the LLM.
Generally, this context originates from three sources: the user's question, pre-defined prompts created by the agent's developer, and data sourced from a database or other relevant sources (refer to the diagram below).
[Diagram: user question + pre-defined prompts + database/other data sources -> LLM context -> generated response]
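Here is a minimal sketch of how those three context sources come together in a single LLM call, assuming the OpenAI Python client (openai >= 1.0) with an OPENAI_API_KEY set; the prompt wording is illustrative, not prescriptive.

```python
# Assembling the three context sources into one chat-completion call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(user_question: str, retrieved_data: str) -> str:
    # Source 2: the agent's role, defined by product manager and developer.
    system_prompt = (
        "You are a helpful sales assistant for a home improvement store. "
        "Help customers plan projects and recommend relevant products."
    )
    # Source 3: data pulled from the database or other sources.
    data_context = f"Relevant store data:\n{retrieved_data}"
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "system", "content": data_context},
            # Source 1: the user's own question.
            {"role": "user", "content": user_question},
        ],
    )
    return response.choices[0].message.content
```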
Overall, generative AI and LLMs empower developers to create sophisticated applications that cater to individual users' needs, preferences, and contexts.
The context provided by the user typically consists of their input question within the application.
The second piece of context can be defined by a product manager collaborating with a developer to specify the agent's role. For example, they might describe the agent as a helpful sales assistant that helps customers with project planning and includes relevant product recommendations in its responses.
The third aspect of context involves incorporating external data from databases and other sources into constructing the agent's response.
In some cases, agent applications may make multiple calls to the LLM to generate more detailed responses before delivering them to the user.
Technologies like ChatGPT Plug-ins and LangChain facilitate these processes, enhancing the agent's capabilities.
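A hand-rolled sketch of that multi-call pattern, which frameworks like LangChain automate, might look like the following; the two-step plan-then-answer flow is just one illustrative design, under the same OpenAI client assumptions as above.

```python
# Two chained LLM calls: the first plans, the second answers.
from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def agent_answer(question: str) -> str:
    # Call 1: ask the model what information it needs to answer well.
    plan = chat(f"List the data needed to answer this question: {question}")
    # (In a real agent, this plan would drive database lookups here.)
    # Call 2: answer using the question plus the intermediate result.
    return chat(f"Question: {question}\nUseful data to consider:\n{plan}")
```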
Endowing LLMs with Memory...
AI agents require a knowledge source that LLMs can comprehend. To understand how LLMs work, it's important to consider their context window, or memory capacity. When you interact with ChatGPT over an extended conversation, it retains your previous queries and corresponding responses. However, once the conversation outgrows that context window, the model starts to "forget" earlier context.
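As an illustration, here is a rough sketch of what that forgetting looks like mechanically: an application can only send the model as much history as fits the context window, so older turns get dropped. The four-characters-per-token figure below is a crude rule of thumb, not a real tokenizer.

```python
# Why long conversations "forget": only the newest turns that fit the
# context window are sent to the model; everything older is dropped.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough approximation, not a tokenizer

def trim_history(turns: list[str], max_tokens: int = 8000) -> list[str]:
    """Keep the most recent turns whose combined size fits the window."""
    kept, used = [], 0
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if used + cost > max_tokens:
            break  # older turns are discarded: this is the "forgetting"
        kept.append(turn)
        used += cost
    return list(reversed(kept))
```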
This is why connecting an agent to a database is crucial for companies aiming to build agent-based applications on LLMs. The database needs to store information in a format understandable by LLMs, such as vectors.
In simple terms, vectors allow you to represent a sentence, concept, or image across multiple dimensions. For instance, a product description can be transformed into a vector representation.
Recording these dimensions enables vector search: searching based on multidimensional concepts rather than just keywords. This facilitates more accurate and contextually appropriate responses from LLMs and acts as a form of long-term memory for the models. Vector search serves as a vital link between LLMs and the extensive knowledge bases that live outside their training data.
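A minimal sketch of the idea, assuming OpenAI's embeddings endpoint and a small in-memory list; a vector database performs the same nearest-neighbor search at scale.

```python
# Vector search in miniature: embed texts, then rank by cosine similarity.
import math
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> list[float]:
    result = client.embeddings.create(
        model="text-embedding-ada-002", input=text
    )
    return result.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Index a few product descriptions, then search by meaning, not keywords.
products = ["pressure-treated deck boards", "exterior wood stain",
            "patio furniture set"]
index = [(p, embed(p)) for p in products]

query_vector = embed("what do I seal a new deck with?")
best = max(index, key=lambda item: cosine_similarity(query_vector, item[1]))
print(best[0])  # likely "exterior wood stain"
```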
Vectors serve as the "language" of LLMs, while vector search is an essential capability of databases that provides the necessary context.
Consequently, a vector database with high throughput, scalability, and reliability is a crucial component for serving LLMs with the appropriate data, especially considering the massive datasets required to fuel agent experiences.
...With the Right Database
Scalability and performance are key considerations when selecting a database for any AI/ML application. Agents require real-time access to extensive data and demand high-speed processing, particularly when deploying agents for every customer on a website or mobile application.
The ability to scale rapidly is vital for storing data that powers agent applications.
Apache Cassandra, relied upon by industry leaders like Netflix, Uber, and FedEx, plays a pivotal role in driving their engagement systems, and AI is becoming integral to every interaction those systems handle.
As agent-powered engagement becomes more prominent, Cassandra becomes essential due to its horizontal scalability, speed, and rock-solid stability. It naturally fits as a database for storing the data required to power agent-based applications.
In line with this, the Cassandra community has developed critical vector search capabilities to simplify building AI applications on vast datasets. DataStax has made these capabilities easily accessible through Astra DB, the first petascale NoSQL database that is AI-ready with vector capabilities.
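As a rough sketch of what this looks like in practice, the following uses the DataStax Python driver against Astra DB. The keyspace, table, and embedding dimension are hypothetical, and the CQL assumes the vector type and ANN ordering introduced with Cassandra's vector search capability.

```python
# Storing and querying embeddings in Astra DB / Cassandra via CQL.
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

cluster = Cluster(
    cloud={"secure_connect_bundle": "secure-connect-bundle.zip"},
    auth_provider=PlainTextAuthProvider("token", "<astra-db-token>"),
)
session = cluster.connect("store")  # hypothetical keyspace

session.execute("""
    CREATE TABLE IF NOT EXISTS products (
        sku text PRIMARY KEY,
        description text,
        embedding vector<float, 1536>
    )
""")
session.execute("""
    CREATE CUSTOM INDEX IF NOT EXISTS ON products (embedding)
    USING 'StorageAttachedIndex'
""")

# Find the products whose embeddings are closest to a query vector
# (for example, one produced by the embedding sketch above).
query_vector = [0.0] * 1536  # placeholder; use a real embedding here
rows = session.execute(
    "SELECT sku, description FROM products "
    "ORDER BY embedding ANN OF %s LIMIT 5",
    (query_vector,),
)
for row in rows:
    print(row.sku, row.description)
```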
How Is It Achieved?
Organizations have multiple avenues to create agent application experiences, as mentioned earlier.
Developers often discuss frameworks like LangChain, which enables the development of LLM-powered agents by chaining together multiple LLM invocations and automatically retrieving the necessary data from relevant sources.
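For illustration, here is a small example in the classic LangChain style; the library's API has changed considerably across versions, so treat this as a sketch rather than current best practice.

```python
# A single prompt-plus-LLM chain in the classic LangChain interface.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["question", "context"],
    template=(
        "You are a helpful sales assistant.\n"
        "Context from the store database:\n{context}\n"
        "Customer question: {question}"
    ),
)
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
print(chain.run(question="What do I need to build a deck?", context="..."))
```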
However, perhaps the most significant way to build these experiences is to leverage the world's most popular agent at present: ChatGPT.
ChatGPT plugins enable third-party organizations to connect to ChatGPT and provide add-ons that offer information specific to their companies. Think of it like Facebook, which became a social network platform with an ecosystem of organizations building games, content, and news feeds that could integrate with it. ChatGPT has become a similar platform—a "super agent."
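At its core, a ChatGPT plugin is just an API plus a manifest that tells ChatGPT what the API does. Here is a bare-bones sketch using FastAPI; the names, URLs, and the /products endpoint are hypothetical, and FastAPI serves the referenced OpenAPI spec automatically at /openapi.json.

```python
# The two pieces a ChatGPT plugin serves: a manifest and the API itself.
from fastapi import FastAPI

app = FastAPI(title="Store Plugin API")

@app.get("/.well-known/ai-plugin.json")
def manifest():
    # Manifest fields follow OpenAI's plugin schema; values are made up.
    return {
        "schema_version": "v1",
        "name_for_human": "Home Improvement Store",
        "name_for_model": "home_improvement_store",
        "description_for_human": "Find products and check local inventory.",
        "description_for_model": (
            "Use this to look up products, prices, and store inventory "
            "for DIY projects."
        ),
        "auth": {"type": "none"},
        "api": {"type": "openapi", "url": "https://example.com/openapi.json"},
        "logo_url": "https://example.com/logo.png",
        "contact_email": "support@example.com",
        "legal_info_url": "https://example.com/legal",
    }

@app.get("/products")
def search_products(query: str):
    # Hypothetical endpoint ChatGPT would call on a user's behalf.
    return {"results": [{"sku": "LUM-2x6-12", "name": "2x6 board, 12 ft"}]}
```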
While your developers may be working on creating your proprietary agent-based application experience using frameworks like LangChain, solely focusing on that would come at a significant opportunity cost.
If they don't work on a ChatGPT plugin, your organization would miss out on a massive distribution opportunity to integrate context tailored to your business into the range of information and recommended actions that ChatGPT can provide to its users.
Several companies, such as Instacart, Expedia, OpenTable, and Slack, have already developed ChatGPT plugins, offering them a competitive edge through integration with ChatGPT.
An Accessible Agent for Transformation
Building ChatGPT plugins will be a crucial aspect of AI agent projects that businesses aim to undertake.
Having the right data architecture, particularly a vector database, simplifies the process of developing high-performance agent experiences that can swiftly retrieve the necessary information to power their responses.
In the future, all applications will become AI applications, and the rise of LLMs and capabilities like ChatGPT plugins makes this future more accessible.