How to Optimize LLM Performance with AI Agents

Apple popularized the concept of AI agents in 1994. Thirty years later, they are about to be integrated into desktops and mobile devices. In that time, AI evolved from a research discipline into programs for natural language processing (NLP), then generative AI applications, and now autonomous agents. Agents can automate actions and improve the output of large language models (LLMs). Open-source projects including Langchain, LlamaIndex, and Eidolon AI provide frameworks for agents capable of performing many tasks, including retrieval augmented generation (RAG). RAG supplies helpful context for domain-specific questions, which can enhance employee experiences, deliver customer support, and share specialized knowledge.

Agents on the global stage (and in the palms of mobile users)

On June 10, Apple made headlines at its Worldwide Developers Conference by announcing AI-native capabilities for iOS 18, iPadOS 18, and macOS Sequoia, slated to launch in September. The company broke its long-standing silence on its AI development plans with the debut of Apple Intelligence. CEO Tim Cook and team demoed new features, including:

  • Custom image & emoji generation
  • Integrations with OpenAI’s GPT-4o
  • Apple Silicon for on-device processing
  • Private Cloud Compute for AI processing
  • Enhanced NLP and text commands for Siri

Apple revealed the soon-to-be multimodal virtual assistant, which executes actions within and across many applications. Device owners can expect to search for photos, videos, or files with natural language, improving productivity and native app experiences. Furthermore, developers will soon be able to integrate voice and text interactions into third-party applications.

Siri’s new functionality marks a shift from conversational AI to AI agents, which are defined as software programs that perceive their environment and act autonomously to complete a task. Apple placed agents on the world stage (and in the palms of mobile users, except for those in Europe), but it’s hardly a new concept.

Thirty years ago, three Distinguished Research Scientists at Apple described agents as “a persistent software entity dedicated to a specific purpose.” Allen Cypher, David Canfield Smith, and Jim Spohrer emphasized that “‘persistent’ distinguishes agents from subroutines; agents have their own ideas about how to accomplish tasks, their own agendas.” They further elaborated that “‘special purpose’ distinguishes them from entire multifunction applications.”

SVP of Software Engineering Craig Federighi presenting Apple Intelligence

Improving the output of large language models

One may wonder how agents relate to LLMs given the parallel release of Siri’s actions and ChatGPT integrations. Whereas agents complete tasks, generative AI applications built on LLMs predict the next token in a string.
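To make that contrast concrete, here is a toy sketch of next-token prediction in Python, using the small open GPT-2 model from Hugging Face (an illustrative choice, not a model discussed in this article). The model does nothing more than score which token is most likely to come next.

# Toy illustration of next-token prediction with a small open model (GPT-2).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Retrieval augmented generation helps language models", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence_length, vocab_size)

# The highest-scoring entry at the final position is the model's guess for the next token.
print(tokenizer.decode(logits[0, -1].argmax()))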

It’s no secret that Siri struggles with natural language understanding, just as LLMs hallucinate when responding. A 2019 Statista report put the voice assistant’s answer accuracy at 83.1%, while foundation models from OpenAI, Microsoft, Google, Meta, and Anthropic maintained a greater rate of factual consistency, according to the public LLM leaderboard published by Palo Alto startup Vectara. Coincidentally, Apple’s OpenELM-3B-Instruct ranked as the least accurate model evaluated.

LLM leaderboard by Vectara

There are two primary methods to improve the performance of LLMs: fine-tuning and RAG. The two are not mutually exclusive, and which to use depends on the use case and its constraints.

Fine-tuning involves further training a foundation model on domain-specific datasets or modifying its parameters to influence its behavior. Although tuned to produce more specific output, fine-tuned models have been known to generate unexpected answers, and they require additional resources for data labeling, model training, and adjustments.
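For reference, this is roughly what starting a supervised fine-tuning job can look like with the OpenAI Python client. It is a minimal sketch under assumptions not stated in this article: an openai v1.x client, a prepared chat-format train.jsonl dataset, and a placeholder base model name.

# Minimal fine-tuning sketch; the file name and model name are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the labeled, domain-specific dataset.
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")

# Start the fine-tuning job against a base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)
print(job.id, job.status)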

RAG employs agents to query knowledge bases so pre-trained models can access recent, reliable, and relevant information. Models grounded in a source of truth and supplemented with contextual understanding exhibit a lower risk of hallucination. Developers seemingly prefer RAG for scenarios that demand subject matter expertise, and it also tends to be less complex, costly, and resource-intensive than fine-tuning.
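The pattern itself is simple: retrieve relevant text, then ground the model’s answer in it. Below is a bare-bones sketch, assuming an openai v1.x client and a tiny in-memory knowledge base; a production system would add a vector database, document chunking, and an agent framework like those discussed next.

# Bare-bones RAG: embed documents, retrieve the closest one, ground the answer in it.
import numpy as np
from openai import OpenAI

client = OpenAI()
docs = [
    "Eidolon AI is an open-source framework for building multi-agent services.",
    "RAG grounds a model's answer in retrieved documents to reduce hallucinations.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(docs)

def answer(question):
    # Retrieve: pick the document most similar to the question (cosine similarity).
    q = embed([question])[0]
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = docs[int(scores.argmax())]
    # Augment and generate: instruct the model to answer from the retrieved context.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("What is Eidolon AI?"))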

Many open-source projects, including Langchain and LlamaIndex, enable developers to build agents capable of performing RAG. However, Eidolon AI by August Data stands out for its ease of use, modularity, and multi-agent architecture. The organization also demonstrates the value of its solution by publishing the codebase for agents that search and retrieve information about its very own GitHub repository.

Retrieval augmented generation with AI agents

Below are seven steps to implement agents that conduct RAG for the Eidolon AI GitHub repository. Before starting, note that this open-source project only supports macOS and Linux; those using Windows must install Windows Subsystem for Linux (WSL).

1. Set up the developer environment by downloading Python 3.11 or 3.12 and Python Poetry, then obtain a paid OpenAI account with access to an API key.
2. Copy and paste this code snippet into a CLI to clone the Eidolon quickstart repository and download all necessary dependencies locally:

git clone https://github.com/eidolon-ai/eidolon-quickstart.git
cd eidolon-quickstart

3. Run the Eidolon HTTP server in developer mode by entering the following command:

make serve-dev

4. When prompted by the program, input the OpenAI API key, which is accessible from an OpenAI account.

5. Fork the Eidolon chatbot repository, clone it locally, and start the server with this script:

git clone https://github.com/eidolon-ai/eidolon-git-search.git
cd eidolon-git-search
make serve-dev

6. Add a GitHub token, which can be generated in a GitHub account, to avoid any rate limit errors.

7. Navigate to the chatbot UI in a web browser. Select the agent, open a chat, and enter a prompt.

Eidolon AI chatbot UI

Completing these seven steps deploys two agents: the repo expert agent and the repo search agent. The repo expert agent acts as a user-facing copilot that receives and responds to questions about the Eidolon GitHub repository. It retrieves answers from the repo search agent, which converts natural-language queries into vector searches and returns the top result.
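As a rough illustration of that division of labor (and not Eidolon’s actual code), the pattern looks something like the sketch below, with a trivial keyword match standing in for the real vector search and LLM call.

# Conceptual two-agent delegation pattern; a toy stand-in for the real system.
from dataclasses import dataclass

@dataclass
class SearchResult:
    path: str
    snippet: str

# Stand-in for an indexed copy of the repository.
INDEX = [
    SearchResult("README.md", "Eidolon is a framework for building agent services."),
    SearchResult("docs/quickstart.md", "Run make serve-dev to start the dev server."),
]

class RepoSearchAgent:
    """Converts a natural-language query into a search and returns the top result."""

    def search(self, query):
        # Real version: embed the query and rank repository chunks by vector similarity.
        words = set(query.lower().split())
        return max(INDEX, key=lambda r: len(words & set(r.snippet.lower().split())))

class RepoExpertAgent:
    """User-facing copilot that answers questions, grounded by the search agent."""

    def __init__(self, searcher):
        self.searcher = searcher

    def ask(self, question):
        hit = self.searcher.search(question)
        # Real version: an LLM composes the answer from the retrieved context.
        return f"From {hit.path}: {hit.snippet}"

print(RepoExpertAgent(RepoSearchAgent()).ask("How do I start the dev server?"))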

Now that the agents are programmed to answer questions, feel free to ask about Eidolon AI or how to customize its agents for specific use cases. Troubleshoot potential errors by referencing the quickstart guide or the “recipe” for the GitHub repo expert and search agents. Join this Discord channel to contribute to the project or send inquiries to the developers.

Incorporating search agents into user workflows

These instructions not only show how to implement agents for RAG; the agents grounded in the Eidolon GitHub repository also hint at the many applications that can be built with the open-source framework. Developers generally agree that RAG provides the most value when sharing specialized knowledge, enhancing employee experiences, and delivering customer support.

Employee experience agents are critical for operational efficiency because they unlock productivity gains for staff. Glean and Moveworks, for example, have developed products to help end clients search for information in their corporate intranet and learn more about the inner workings of their organization. Customer support agents are crucial for commercial growth, as they are responsible for fielding and addressing questions about technical documentation for software products. Vendors such as Intercom and Aisera offer solutions to help software companies retain users by resolving their confusion.

Companies like Notion are launching AI productivity features that leverage RAG to answer questions from tens of millions of users instantaneously by searching billions of documents, which has reduced operating costs by 60%. Health and legal tech providers like InpharmD and DISCO, respectively, have also leveraged RAG to realize productivity and cost savings while optimizing response times and accuracy. It will be interesting to see how the enterprises, law firms, and healthcare centers of the future adopt RAG for various use cases.

