Yet Another Rant About Why AI Doesn’t Meet My Expectations (Despite All the Hype)
Dmitry Moiseev
Senior Director of Engineering @ Cambium Networks | Wireless, Security & IoT
Is AI ready to understand your business? Oh, absolutely, if your business is answering trivia questions or writing dad jokes. But for anything beyond that? Not even close. The current generation of LLMs is undeniably impressive: from everyday users chatting away on ChatGPT to developers tinkering with hosted APIs (Anthropic's Claude, for example) and self-hostable models like Meta's Llama series, NVIDIA's NeMo, or Mistral's releases, these tools are reshaping how we approach a wide range of tasks.
They're excellent at handling generic questions and performing generic tasks, making our lives a tad easier. (Full disclosure: I used an LLM to polish the grammar of this essay; however, the original content is entirely my own.) But let's not get too carried away.
The biggest headache is context. These LLMs are trained on publicly available data and are blissfully unaware of your company's inner workings. Sure, they can answer generic questions like a champ, but when it comes to your specific needs—like your unique codebase or the hardware your software runs on—they’re left scratching their digital heads. Imagine a developer asking an LLM to debug a specific error related to a custom, in-house API. The LLM might generate plausible-sounding but incorrect solutions simply because it lacks knowledge of the company's proprietary codebase.
Making them understand your unique data isn’t exactly straightforward. Currently, there are three "solutions" on the table:
a. Fine-tuning the model, which sounds great until you realize it demands well-structured, meticulously prepared input-output pairs for training. For example, a software development company might want to fine-tune an LLM to act as a code assistant capable of understanding and debugging its proprietary systems. The company relies on extensive internal knowledge, third-party APIs, and complex hardware interactions, all documented in disparate, often inconsistent PDFs. Unfortunately, you can't just toss in your PDFs and source code and hope for the best (unless, of course, your idea of 'the best' is random, unpredictable chaos). Converting that unstructured data into a structured dataset of input-output examples suitable for fine-tuning is time-intensive and requires extensive human oversight: developers must meticulously parse those PDFs, extract the relevant API calls, error codes, and troubleshooting steps, and reformat them into a training-friendly format. This preprocessing stage is essential yet complex, since it demands a deep understanding of both the proprietary systems and the nuances of LLM training.
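To make the preprocessing burden concrete, here is a minimal sketch of the final step, turning already-extracted snippets into a JSONL training file. The `symptom`/`fix` examples and the `radio_release()` API are invented for illustration; the chat-style `messages` schema is a common fine-tuning input format, though exact schemas vary by provider. The hard part the paragraph describes, extracting these pairs from inconsistent PDFs in the first place, is exactly what this sketch assumes has already been done by hand:

```python
import json

# Hypothetical snippets of the kind a team would have to extract by hand
# from internal PDFs and source code before any fine-tuning can start.
raw_snippets = [
    {
        "symptom": "API returns ERR_RADIO_BUSY on channel switch",
        "fix": "Call radio_release() before set_channel(); the driver "
               "holds the lock until the previous scan completes.",
    },
    {
        "symptom": "Firmware upload stalls at 99%",
        "fix": "Increase the TFTP block timeout; boards with the older "
               "bootloader ack the final block late.",
    },
]

def to_training_record(snippet):
    """Reshape one extracted snippet into a chat-style input/output pair."""
    return {
        "messages": [
            {"role": "user", "content": f"Debug this issue: {snippet['symptom']}"},
            {"role": "assistant", "content": snippet["fix"]},
        ]
    }

def write_jsonl(snippets, path):
    """Serialize records as JSON Lines, one training example per line."""
    with open(path, "w") as f:
        for s in snippets:
            f.write(json.dumps(to_training_record(s)) + "\n")

write_jsonl(raw_snippets, "finetune_dataset.jsonl")
```

Two toy records take a dozen lines; a useful dataset needs thousands, each vetted by someone who understands the proprietary system well enough to judge whether the "fix" is actually correct.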
b. Retrieval-Augmented Generation (RAG)—an even trickier route. RAG works by retrieving relevant information from a knowledge base to supplement the LLM's responses, effectively bridging the gap between the model and your proprietary data. This involves techniques like semantic search and using vector databases to find and fetch the most relevant pieces of information. For instance, a customer support team might use RAG to retrieve troubleshooting steps for specific products. However, if the data includes irrelevant details or outdated procedures, the LLM may produce misleading responses. The main challenges here are ensuring the retrieval of accurate, relevant data and effectively summarizing it within the token limits imposed by the model. Any noise or irrelevant details in the data can confuse the LLM, so yes, you get the joy of playing 'data janitor' for hours to avoid confusing your very expensive AI babysitter. Additionally, the limitations of context size and associated costs restrict the amount of information that can be included. Even after jumping through all these hoops, the LLM might still focus on minor details, ignoring what's actually relevant because it can't see "the whole picture."
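The retrieve-then-prompt flow above can be sketched in a few lines. This toy uses bag-of-words cosine similarity in place of a real embedding model and vector database, and a crude word budget in place of a token limit; the documents and query are invented. It also shows the failure mode described above: the quarterly-report snippet is exactly the kind of irrelevant entry that, if retrieved, would waste context and confuse the model:

```python
import math
import re
from collections import Counter

# Toy knowledge base: the kind of internal snippets a RAG system would index.
# In production these would be embedded with a neural model and stored in a
# vector database; bag-of-words cosine similarity stands in here.
documents = [
    "To reset the AP, hold the recovery button for 10 seconds.",
    "ERR_RADIO_BUSY means another scan holds the radio lock.",
    "The quarterly report template lives on the shared drive.",
]

def vectorize(text):
    """Crude stand-in for an embedding: a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query; keep the top k."""
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, budget_words=50):
    """Stuff retrieved context into the prompt, truncated to a word
    budget standing in for the model's context-size limit."""
    context = " ".join(retrieve(query, docs))
    context = " ".join(context.split()[:budget_words])
    return f"Context: {context}\n\nQuestion: {query}"

print(build_prompt("Why do I get ERR_RADIO_BUSY?", documents))
```

Every knob here, the ranking function, how many documents to keep, the truncation budget, is a place where the "data janitor" work happens: get any of them wrong and the model answers from noise.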
c. Agents, or, as some call them, "autonomous" LLM extensions. Agents attempt to carry out complex, multi-step tasks on their own by calling APIs, retrieving data, or performing other actions, and frameworks like AutoGPT and LangChain agents are pushing the boundaries of what's possible. Sounds impressive, right? Almost like having a digital assistant that promises to make your life easier, then proceeds to make you question your life choices when it spirals into an infinite loop over a minor ticket. Faced with a novel issue, an agent may loop endlessly between unrelated steps, perform redundant actions, or wander down irrelevant paths, because it has little real understanding of when it is veering off track. Ongoing research aims to improve reliability through techniques like reinforcement learning and human-in-the-loop approaches, but for now agents depend on finely tuned configurations and human oversight to stay on course, so "autonomous" may be a bit of a stretch.
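The guardrails that keep such an agent from spiraling are usually dead simple: a step budget and a check for repeated actions. The sketch below is not any real framework; the planner is a stand-in that deliberately keeps proposing the same unhelpful step, the way a confused agent does, so the loop detection can catch it:

```python
# A toy agent loop illustrating why human-imposed guardrails matter:
# without a step budget and a repeated-action check, a confused
# "autonomous" agent happily loops forever.

def plan_next_action(observation):
    """Stand-in planner that keeps suggesting the same unhelpful step,
    as an LLM agent can when a task falls outside what it understands."""
    return ("restart_service", "api-gateway")

def run_agent(task, max_steps=5):
    history = []
    for _ in range(max_steps):
        action = plan_next_action(task)
        if action in history:  # loop detection: we have tried this already
            return f"Stuck repeating {action[0]}; escalating to a human."
        history.append(action)
        task = f"after {action[0]}: still failing"  # simulated observation
    return "Step budget exhausted; escalating to a human."

result = run_agent("Ticket #1234: users report intermittent 502s")
print(result)
```

Note that both exit paths end at the same place: a human. That is the "finely tuned configuration and human oversight" the paragraph describes, encoded as two if-statements.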
The real “next big thing” is figuring out how to provide AI with comprehensive context for your data, a challenge that many researchers and organizations are actively working to solve. Current approaches include developing more efficient memory architectures for LLMs, using vector databases to expand retrieval capabilities, and building hierarchical or multi-step processing pipelines to manage larger contexts. Other promising techniques, such as adaptive context windows, dynamic summarization, and reinforcement learning with feedback loops, are also emerging. These advancements aim to enable AI to handle and understand complex, business-specific information more effectively, bringing us closer to seamless integration with proprietary data.
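Of the approaches listed above, the hierarchical pipeline is the easiest to sketch: split a document that exceeds the context window into chunks, summarize each chunk, then summarize the summaries. The `summarize` function below is a trivial placeholder (it just keeps the first few words); in a real system each call would be an LLM request, and the word counts would be token counts:

```python
# Map-reduce summarization: a minimal sketch of the hierarchical
# processing pipelines mentioned above, with a placeholder summarizer.

def summarize(text, max_words=8):
    """Placeholder 'summarizer': keep the first few words. In practice
    this would be an LLM call with its own context limit."""
    return " ".join(text.split()[:max_words])

def chunk(words, size):
    """Yield consecutive chunks of the document, each small enough
    to fit the (simulated) context window."""
    for i in range(0, len(words), size):
        yield " ".join(words[i:i + size])

def hierarchical_summary(document, chunk_size=20, max_words=8):
    # Map: summarize each chunk independently.
    partials = [summarize(c, max_words) for c in chunk(document.split(), chunk_size)]
    # Reduce: summarize the concatenated partial summaries.
    return summarize(" ".join(partials), max_words)
```

The catch, and the reason this is still an open problem, is that each summarization step discards detail, and nothing in the pipeline knows which details your business actually cares about.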
Until then, there is still plenty of work for humans that LLMs simply aren't capable of yet, like actually knowing what's important versus getting lost in meaningless details: fine-tuning models, designing robust RAG pipelines, optimizing memory architectures, coordinating agents, and weaving these solutions together into seamless, context-aware AI systems. Each of these components demands specialized expertise and careful integration, underscoring the need for human insight and oversight as we push the boundaries of what AI can achieve. Seems like we're safe from complete AI domination for at least a little while longer.