A No-Nonsense Guide to Large Language Models
Adityaojas Sharma
NASSCOM AI GameChanger '24 · e4m Rising Star of the Year '23 · Young EMVIE of the Year '22
Have you ever found yourself in a meeting where your boss's boss asks for something like this:
"We need a hyper-advanced, fully autonomous AGI agentic model with extensive neural network capabilities and quantum computing optimizations to write social media posts?"
Sound familiar? You're certainly not alone.
The gap between expectations and the actual AI/LLM capabilities a given application requires can be vast.
Recognizing this, I've put together a straightforward guide to demystify which capabilities various AI use cases actually need, structured from the simplest to the most complex levels of AI capability.
Let's dive into the no-nonsense details with...
Level 1: The Question-Answering Engine
Imagine you're using an LLM engine for the very first time. You ask, "Who is the CEO of Google?" Within seconds, you receive the reply, "Sundar Pichai." At this initial level, the LLM operates purely as a question-answering engine.
You provide a prompt, and it fetches the answer. It’s straightforward and factual—ideal for quick, specific queries. There's no need for context or memory; it's all about direct responses to direct questions.
Now let's add another dimension, a layer of short-term memory, and take it to...
Level 2: Conversational Agents
This upgrade transforms our LLM from a simple Q&A bot into a conversational agent, capable of understanding context within a session. Technically, this is known as in-context learning.
For instance, after you’ve asked about the CEO of Google, a follow-up question like, "Where are the headquarters located?" yields "California." The system remembers that you were discussing Google and tailors its response accordingly. This in-context ability makes the interaction feel more natural and connected.
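To make this concrete, here's a minimal sketch of how session memory works under the hood. The `fake_llm` function below is a stand-in for a real model call (a hypothetical stub, not any actual API); the key idea is that the full conversation history is sent with every request, which is what lets the model resolve follow-ups like "Where are the headquarters?"

```python
def fake_llm(messages):
    # Stand-in for a real LLM API call; a real model would read the
    # whole message list and answer in context.
    last = messages[-1]["content"].lower()
    if "ceo of google" in last:
        return "Sundar Pichai"
    if "headquarters" in last:
        # The model can only know we mean Google because the earlier
        # turns are still present in the prompt.
        return "California"
    return "I'm not sure."

history = []

def ask(question):
    # Append the user turn, send the ENTIRE history, store the reply.
    history.append({"role": "user", "content": question})
    answer = fake_llm(history)
    history.append({"role": "assistant", "content": answer})
    return answer

print(ask("Who is the CEO of Google?"))           # Sundar Pichai
print(ask("Where are the headquarters located?")) # California
```

Notice that "memory" here is nothing magical: it's just the client re-sending past turns, and the model conditioning on them.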
Now let's take another leap, and make the memory a bit longer, and take it to...
Level 3: The RAGs
You see, factual and in-context knowledge isn't always enough, especially for complex or niche queries.
Suppose you ask, "Who was the senior sales manager of the Google Pixel 7 team?" The answer might be inaccurate or fabricated because it’s not commonly known information.
This is where Retrieval-Augmented Generation (RAG) becomes crucial. RAG systems enhance LLMs by pulling from extensive external data sources, enabling them to answer more specialized questions accurately.
This integration marks a significant evolution from mere chatbots to intelligent systems capable of deep, reliable interactions.
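Here's a toy sketch of the RAG loop. The document store, the made-up "senior sales manager" fact, and the word-overlap scoring are all deliberately naive stand-ins (a production system would use an embedding model and a vector index), but the shape is the same: retrieve relevant text, then stuff it into the prompt so the model answers from evidence instead of guessing.

```python
DOCUMENTS = [
    # Made-up fact, purely for illustration:
    "Jane Doe was the senior sales manager of the Google Pixel 7 team.",
    "The Pixel 7 launched in October 2022.",
    "Google's headquarters are in Mountain View, California.",
]

def retrieve(query, docs, k=1):
    """Return the k docs sharing the most words with the query
    (a crude stand-in for embedding-based semantic search)."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query, DOCUMENTS))
    # The LLM now answers from the retrieved context instead of
    # fabricating an answer from its training data.
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Who was the senior sales manager of the Pixel 7 team?"))
```

The retrieval step is what gets swapped out as systems mature: keyword overlap becomes embeddings plus semantic search, and the list becomes an indexed vector store.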
Now when we are talking about RAGs, there are a lot of other key components that one needs to know about - like Context Windows, Indexing, Embedding, Semantic Search, etc. But we'll keep all that for later.
So far we've covered three levels. Take a moment to let that sink in before we move from RAGs to riches, and hop on to the next level, which is...
Level 4: The Agents
The evolution to agents addresses the need for dynamic interaction with external systems and data. Agents can execute tasks, interact with APIs, and handle complex sequences of actions.
For example, at one of the verticals in LEAPX.AI, we're building AI Agents that automate and optimize the entire process of launching digital ads across all sorts of platforms. Along with executing tasks, our agents also get smarter over time for different industries and categories, by storing their learnings in dedicated databases and upgrading regularly.
Now you must have heard of Function Calling, which sounds rather complicated but is in fact a simple concept: structuring LLM responses into a format acceptable to external APIs, functions, and tools.
For instance, if a logistics manager needs to convert currency values based on the latest exchange rates, the LLM could format a query in JSON that directly calls a currency conversion API, providing instant, accurate results.
This capability is vital for adapting to the fluidity of real-world data and for integrating with the myriad of digital tools that professionals use today.
Agents can autonomously execute these function calls, significantly reducing manual workloads and enhancing efficiency.
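A minimal sketch of that dispatch loop, using the currency example: the model emits JSON naming a tool and its arguments, and our code routes that JSON to a real function. The `convert_currency` tool, the fixed exchange rate, and the sample model output below are all hypothetical, invented for illustration.

```python
import json

RATES = {("USD", "EUR"): 0.92}  # hypothetical fixed rate; a real agent would call a live API

def convert_currency(amount, source, target):
    return round(amount * RATES[(source, target)], 2)

# Registry of tools the agent is allowed to call.
TOOLS = {"convert_currency": convert_currency}

# What a function-calling-capable LLM might return when asked
# "How much is 100 US dollars in euros?"
llm_output = '{"function": "convert_currency", "args": {"amount": 100, "source": "USD", "target": "EUR"}}'

def execute(call_json):
    call = json.loads(call_json)
    fn = TOOLS[call["function"]]  # look up the named tool
    return fn(**call["args"])     # invoke it with the model's arguments

print(execute(llm_output))  # 92.0
```

The LLM never runs code itself; it only produces structured intent, and the surrounding agent code decides what actually gets executed. That separation is what makes function calling both simple and safe to reason about.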
That was a lot on Agents, and it's exactly what most startups are working on right now. More on Agents soon!
But now, let's touch upon an aspirational level: capabilities of LLMs that we might get to see in the near future! Something like...
Level 5: LLM Operating Systems
The futuristic concept of an LLM Operating System (OS) integrates all the previous functionalities into a unified, intelligent system. It's going to be more of a Jarvis or Ultron than anything we've seen before!
An LLM OS could manage diverse tasks across an enterprise, from handling customer service inquiries to managing supply chains, all while interacting with various data sources and applications to automate routine processes and deliver insights.
This advanced system would not only respond to inputs but also anticipate needs and adapt to new information, potentially transforming industries by enabling a higher degree of automation and decision-making support.
I am tired now; I'll probably write more about LLM OSes when they seem more accessible and realistic! So...
As I wrap up this article, I hope it helped you gain better clarity on how to match the right AI Capabilities to your specific needs. (I think I will keep on writing about this in future posts and articles.)
AI buzzwords are often oversold in ambition, but with the right knowledge and expectations, you can harness the technology's true potential effectively.
Whether you're looking to automate routine tasks or tackle more complex challenges, remember that the key lies in understanding both the capabilities and limitations of these technologies.
Thank you for reading till the end!