Yet Another Rant About Why AI Doesn’t Meet My Expectations (Despite All the Hype)
Dmitry Moiseev
Senior Director of Engineering @ Cambium Networks | Wireless, Security & IoT
Is AI ready to understand your business? Oh, absolutely, if your business is answering trivia questions or writing dad jokes. But for anything beyond that? Not even close. The current generation of LLMs is undeniably impressive: from everyday users chatting away on ChatGPT to developers tinkering with hosted APIs (Anthropic's Claude, for example) and self-hostable models like Meta's Llama series, NVIDIA's NeMo, or Mistral's releases, these tools are reshaping how we approach a wide range of tasks.
They're excellent at handling generic questions and performing generic tasks, making our lives a tad easier. (Full disclosure: I used an LLM to polish the grammar of this essay; however, the original content is entirely my own.) But let's not get too carried away.
The biggest headache is context. These LLMs are trained on publicly available data and are blissfully unaware of your company's inner workings. Sure, they can answer generic questions like a champ, but when it comes to your specific needs—like your unique codebase or the hardware your software runs on—they’re left scratching their digital heads. Imagine a developer asking an LLM to debug a specific error related to a custom, in-house API. The LLM might generate plausible-sounding but incorrect solutions simply because it lacks knowledge of the company's proprietary codebase.
Making them understand your unique data isn’t exactly straightforward. Currently, there are three "solutions" on the table:
a. Fine-tuning the model, which sounds great until you realize it demands well-structured, meticulously prepared input-output pairs for training. For example, a software development company might want to fine-tune an LLM to act as a code assistant capable of understanding and debugging its proprietary systems. The company relies on extensive internal knowledge, third-party APIs, and complex hardware interactions, all documented in disparate, often inconsistent PDFs. Unfortunately, you can't just toss in your PDFs and source code and hope for the best (unless, of course, your idea of 'the best' is random, unpredictable chaos). Converting that unstructured data into a structured dataset of input-output examples suitable for fine-tuning is time-intensive and requires extensive human oversight: developers must meticulously parse those PDFs, extract the relevant API calls, error codes, and troubleshooting steps, and reformat them into a training-friendly format. This preprocessing stage is essential yet complex, since it demands a deep understanding of both the proprietary systems and the nuances of LLM training.
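To make the preprocessing burden concrete, here is a minimal sketch of the final step, turning already-extracted snippets into a JSONL training file. The `symptom`/`fix` examples and the `radio_release()` API are invented for illustration; the chat-style `messages` schema is a common fine-tuning input format, though exact schemas vary by provider. The hard part the paragraph describes, extracting these pairs from inconsistent PDFs in the first place, is exactly what this sketch assumes has already been done by hand:

```python
import json

# Hypothetical snippets of the kind a team would have to extract by hand
# from internal PDFs and source code before any fine-tuning can start.
raw_snippets = [
    {
        "symptom": "API returns ERR_RADIO_BUSY on channel switch",
        "fix": "Call radio_release() before set_channel(); the driver "
               "holds the lock until the previous scan completes.",
    },
    {
        "symptom": "Firmware upload stalls at 99%",
        "fix": "Increase the TFTP block timeout; boards with the older "
               "bootloader ack the final block late.",
    },
]

def to_training_record(snippet):
    """Reshape one extracted snippet into a chat-style input/output pair."""
    return {
        "messages": [
            {"role": "user", "content": f"Debug this issue: {snippet['symptom']}"},
            {"role": "assistant", "content": snippet["fix"]},
        ]
    }

def write_jsonl(snippets, path):
    """Serialize records as JSON Lines, one training example per line."""
    with open(path, "w") as f:
        for s in snippets:
            f.write(json.dumps(to_training_record(s)) + "\n")

write_jsonl(raw_snippets, "finetune_dataset.jsonl")
```

Two toy records take a dozen lines; a useful dataset needs thousands, each vetted by someone who understands the proprietary system well enough to judge whether the "fix" is actually correct.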
b. Retrieval-Augmented Generation (RAG)—an even trickier route. RAG works by retrieving relevant information from a knowledge base to supplement the LLM's responses, effectively bridging the gap between the model and your proprietary data. This involves techniques like semantic search and using vector databases to find and fetch the most relevant pieces of information. For instance, a customer support team might use RAG to retrieve troubleshooting steps for specific products. However, if the data includes irrelevant details or outdated procedures, the LLM may produce misleading responses. The main challenges here are ensuring the retrieval of accurate, relevant data and effectively summarizing it within the token limits imposed by the model. Any noise or irrelevant details in the data can confuse the LLM, so yes, you get the joy of playing 'data janitor' for hours to avoid confusing your very expensive AI babysitter. Additionally, the limitations of context size and associated costs restrict the amount of information that can be included. Even after jumping through all these hoops, the LLM might still focus on minor details, ignoring what's actually relevant because it can't see "the whole picture."
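The retrieve-then-prompt flow above can be sketched in a few lines. This toy uses bag-of-words cosine similarity in place of a real embedding model and vector database, and a crude word budget in place of a token limit; the documents and query are invented. It also shows the failure mode described above: the quarterly-report snippet is exactly the kind of irrelevant entry that, if retrieved, would waste context and confuse the model:

```python
import math
import re
from collections import Counter

# Toy knowledge base: the kind of internal snippets a RAG system would index.
# In production these would be embedded with a neural model and stored in a
# vector database; bag-of-words cosine similarity stands in here.
documents = [
    "To reset the AP, hold the recovery button for 10 seconds.",
    "ERR_RADIO_BUSY means another scan holds the radio lock.",
    "The quarterly report template lives on the shared drive.",
]

def vectorize(text):
    """Crude stand-in for an embedding: a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query; keep the top k."""
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, budget_words=50):
    """Stuff retrieved context into the prompt, truncated to a word
    budget standing in for the model's context-size limit."""
    context = " ".join(retrieve(query, docs))
    context = " ".join(context.split()[:budget_words])
    return f"Context: {context}\n\nQuestion: {query}"

print(build_prompt("Why do I get ERR_RADIO_BUSY?", documents))
```

Every knob here, the ranking function, how many documents to keep, the truncation budget, is a place where the "data janitor" work happens: get any of them wrong and the model answers from noise.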
c. Agents, or, as some call them, "autonomous" LLM extensions. Agents attempt to carry out complex, multi-step tasks on their own by calling APIs, retrieving data, or performing other actions, and frameworks like AutoGPT and LangChain agents are pushing the boundaries of what's possible. Sounds impressive, right? Almost like having a digital assistant that promises to make your life easier, then proceeds to make you question your life choices when it spirals into an infinite loop over a minor ticket. Faced with a novel issue, an agent may loop endlessly between unrelated steps, perform redundant actions, or wander down irrelevant paths, because it has little real understanding of when it is veering off track. Ongoing research aims to improve reliability through techniques like reinforcement learning and human-in-the-loop approaches, but for now agents depend on finely tuned configurations and human oversight to stay on course, so "autonomous" may be a bit of a stretch.
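The guardrails that keep such an agent from spiraling are usually dead simple: a step budget and a check for repeated actions. The sketch below is not any real framework; the planner is a stand-in that deliberately keeps proposing the same unhelpful step, the way a confused agent does, so the loop detection can catch it:

```python
# A toy agent loop illustrating why human-imposed guardrails matter:
# without a step budget and a repeated-action check, a confused
# "autonomous" agent happily loops forever.

def plan_next_action(observation):
    """Stand-in planner that keeps suggesting the same unhelpful step,
    as an LLM agent can when a task falls outside what it understands."""
    return ("restart_service", "api-gateway")

def run_agent(task, max_steps=5):
    history = []
    for _ in range(max_steps):
        action = plan_next_action(task)
        if action in history:  # loop detection: we have tried this already
            return f"Stuck repeating {action[0]}; escalating to a human."
        history.append(action)
        task = f"after {action[0]}: still failing"  # simulated observation
    return "Step budget exhausted; escalating to a human."

result = run_agent("Ticket #1234: users report intermittent 502s")
print(result)
```

Note that both exit paths end at the same place: a human. That is the "finely tuned configuration and human oversight" the paragraph describes, encoded as two if-statements.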
The real “next big thing” is figuring out how to provide AI with comprehensive context for your data, a challenge that many researchers and organizations are actively working to solve. Current approaches include developing more efficient memory architectures for LLMs, using vector databases to expand retrieval capabilities, and building hierarchical or multi-step processing pipelines to manage larger contexts. Other promising techniques, such as adaptive context windows, dynamic summarization, and reinforcement learning with feedback loops, are also emerging. These advancements aim to enable AI to handle and understand complex, business-specific information more effectively, bringing us closer to seamless integration with proprietary data.
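Of the approaches listed above, the hierarchical pipeline is the easiest to sketch: split a document that exceeds the context window into chunks, summarize each chunk, then summarize the summaries. The `summarize` function below is a trivial placeholder (it just keeps the first few words); in a real system each call would be an LLM request, and the word counts would be token counts:

```python
# Map-reduce summarization: a minimal sketch of the hierarchical
# processing pipelines mentioned above, with a placeholder summarizer.

def summarize(text, max_words=8):
    """Placeholder 'summarizer': keep the first few words. In practice
    this would be an LLM call with its own context limit."""
    return " ".join(text.split()[:max_words])

def chunk(words, size):
    """Yield consecutive chunks of the document, each small enough
    to fit the (simulated) context window."""
    for i in range(0, len(words), size):
        yield " ".join(words[i:i + size])

def hierarchical_summary(document, chunk_size=20, max_words=8):
    # Map: summarize each chunk independently.
    partials = [summarize(c, max_words) for c in chunk(document.split(), chunk_size)]
    # Reduce: summarize the concatenated partial summaries.
    return summarize(" ".join(partials), max_words)
```

The catch, and the reason this is still an open problem, is that each summarization step discards detail, and nothing in the pipeline knows which details your business actually cares about.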
Until then, there is still plenty of work for humans that LLMs simply aren't capable of yet, like actually knowing what's important versus getting lost in meaningless details: fine-tuning models, designing robust RAG pipelines, optimizing memory architectures, coordinating agents, and weaving these solutions together into seamless, context-aware AI systems. Each of these components demands specialized expertise and careful integration, underscoring the need for human insight and oversight as we push the boundaries of what AI can achieve. Seems like we're safe from complete AI domination for at least a little while longer.