LLMs: From Chatbots to Orchestrators of Future Applications
Arpit Tandon
Director of AI Strategy and IP | P&L Leadership | Market Expansion | Scaling Teams
Inspired by the fantastic LLM talk by Andrej Karpathy, this post explores the potential of these models beyond chatbots and assistants. As AI technology evolves, the scope of LLMs extends far beyond assistance, pointing to a future where they act as a central operating system, seamlessly orchestrating many facets of our lives.
LLMs as the Intelligent OS of the Future
In the future, LLMs could act as the intelligent operating system that powers our everyday devices and applications. Imagine an LLM-powered smart home that anticipates your mood based on facial expressions, adjusts lighting and temperature, and even suggests recipes based on the ingredients in your fridge. Picture an ecosystem where these models interact with Internet of Things (IoT) devices, analysing sensor data, understanding user intentions, and automating tasks. In this vision, LLMs function as an operating system, not just comprehending human language but dynamically adapting to evolving needs through an enriched reasoning layer.
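To make this orchestration idea concrete, here is a minimal sketch in which an LLM-style component translates a free-form request into a structured device action, and a dispatcher routes it to the right handler. The intent parser is a keyword stub standing in for a real model call, and all device names and handlers are hypothetical.

```python
# Sketch of an LLM acting as an orchestration layer for smart-home devices.
# llm_parse_intent is a stub: a real system would call a model that converts
# free text into a structured action (e.g. via function calling).

def llm_parse_intent(utterance: str) -> dict:
    """Stand-in for an LLM call that turns free text into a structured action."""
    text = utterance.lower()
    if "dim" in text or "light" in text:
        return {"device": "lights", "action": "set_brightness", "value": 30}
    if "cold" in text or "temperature" in text:
        return {"device": "thermostat", "action": "set_temperature", "value": 22}
    return {"device": None, "action": "unknown"}

# Hypothetical device handlers keyed by device name.
HANDLERS = {
    "lights": lambda a: f"lights -> brightness {a['value']}%",
    "thermostat": lambda a: f"thermostat -> {a['value']}C",
}

def orchestrate(utterance: str) -> str:
    """Parse the user's intent and dispatch it to the matching device handler."""
    action = llm_parse_intent(utterance)
    handler = HANDLERS.get(action["device"])
    return handler(action) if handler else "no matching device"
```

The key design point is that the model produces a structured action rather than free text, so the surrounding system can route and execute it deterministically.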
Long-term memory of LLMs
A major limitation of LLMs is that they are stateless: if multiple prompts are given one after another, an LLM processes each prompt independently, without remembering the earlier ones. In practice, prompts are chained together (earlier prompts are appended to new ones) to produce more contextual responses. But this approach increases the context length, and with it the computational cost of inference. More advanced approaches, such as semantic search and relevance-weighting of older prompts, are helping add longer-term memory to LLMs. Indeed, adding long-term memory to LLMs may be a key step on the path towards Artificial General Intelligence (AGI). The situation is akin to RAM, which was very expensive in the 90s; as hardware became cheaper, computers shipped with ever more of it. With decreasing compute costs and smarter algorithms, context lengths will grow, giving LLMs longer-term memory.
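The prompt-chaining approach described above can be sketched as a simple conversation buffer that appends each turn and evicts the oldest ones once a context budget is exceeded. Word counts stand in for real token counting, and the budget size is an illustrative assumption.

```python
# Sketch of prompt chaining with a bounded context window. Real systems count
# tokens with the model's tokenizer; words are a crude stand-in here.

class ConversationMemory:
    def __init__(self, max_words: int = 50):
        self.turns: list[str] = []
        self.max_words = max_words

    def add(self, turn: str) -> None:
        """Append a turn, then drop the oldest turns if over the word budget."""
        self.turns.append(turn)
        while sum(len(t.split()) for t in self.turns) > self.max_words:
            self.turns.pop(0)

    def build_prompt(self, new_prompt: str) -> str:
        """Chain the retained history in front of the new prompt."""
        return "\n".join(self.turns + [new_prompt])
```

This eviction-by-recency policy is the simplest option; the semantic-search approaches mentioned above instead keep whichever old turns are most relevant to the new prompt.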
Retrieval augmented generation (RAG)
Static training data limits LLMs. Enter Retrieval-Augmented Generation (RAG), which integrates LLMs with external knowledge bases: think of an LLM accessing databases on demand, much as a CPU accesses a hard drive. This unlocks hyper-personalisation, pairing specialised LLMs with individual data in fields like healthcare and education. Imagine a medically specialised LLM, paired with your medical records, suggesting personalised treatment plans, or one tailoring learning materials to your child's unique needs.
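A minimal RAG sketch of the database-on-demand idea: retrieve the most relevant document, then prepend it to the prompt. Word overlap stands in for embedding-based semantic search here; a production system would use a vector index over embeddings.

```python
# Toy RAG pipeline: rank documents by word overlap with the query (a stand-in
# for embedding similarity), then splice the best match into the prompt.

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Assemble the prompt the LLM would actually receive."""
    context = retrieve(query, docs)
    return f"Context: {context}\nQuestion: {query}"
```

The point of the sketch is the shape of the pipeline: retrieval happens outside the model, so the knowledge base can be updated without retraining the LLM.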
Multimodality of LLMs
Modern LLMs are evolving towards multimodality, integrating forms of information beyond text, including images and audio. This expansion broadens their capacity to comprehend and generate content across diverse mediums. Integration with external peripheral devices will enhance this multimodality, allowing LLMs to interface with sensors, cameras, or microphones and thereby access real-time data streams. This connection will make these models more contextually aware, bridging the gap between the digital and physical worlds. Google's latest Gemini suite of LLMs, for instance, is natively multimodal: the models are trained on audio, images, and video, and accept inputs directly in these forms for further processing.
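Purely as an illustration of what mixed-media input can look like, here is a sketch that packages text and binary media into a list of typed parts before sending them to a model. The field names, structure, and model name are hypothetical, not the API of any real provider.

```python
# Illustrative multimodal request: text and binary media travel together as a
# list of typed "parts". Binary payloads are base64-encoded for transport, a
# common convention; everything else here is a made-up schema.

import base64

def make_part(kind: str, payload) -> dict:
    """Wrap a payload as a typed part: text stays as-is, media gets base64."""
    if kind == "text":
        return {"type": "text", "text": payload}
    return {"type": kind, "data": base64.b64encode(payload).decode("ascii")}

request = {
    "model": "example-multimodal-model",  # hypothetical model name
    "parts": [
        make_part("text", "What object is in this photo?"),
        make_part("image", b"\x89PNG...fake bytes"),  # placeholder image bytes
    ],
}
```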
LLMs embedded into browsers
Bing kicked off this innovation by offering LLM-powered internet search built on OpenAI's GPT-3.5 and GPT-4 models, and Google has launched an early release of Gemini-powered search. This could massively disrupt the way we have searched the internet for the past couple of decades: hyper-personalised results, combined with the ability to surface information from obscure sources, can yield surprising answers. Expect a major shift in digital advertising, with Search Engine Optimisation (SEO) experts focusing on LLM-friendly content. However, concerns around misinformation, bias, and privacy will be amplified in this new landscape.
Connection to other LLMs
Imagine a world where LLMs not only talk to us but also converse with each other. The emergence of multi-LLM communication opens a Pandora's box of possibilities, the kind of transformative applications that are hard to conceive of today. Subject-specific LLMs interacting with one another could uncover complex interdisciplinary patterns; picture a sports LLM teaming up with a medical LLM to optimise an athlete's performance. In a multi-LLM ecosystem, each LLM can play a different (and sometimes opposing) role to produce better-quality outputs. However, this would again amplify concerns around privacy, bias, and explainability, and could be computationally expensive in the short term.
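A generator/critic loop is one simple shape such opposing-role interaction can take: one model drafts an answer, another critiques it, and the loop runs until the critic is satisfied. Both "models" below are stubs; a real system would call two separate model endpoints.

```python
# Two cooperating "LLMs" in opposing roles. Both functions are stubs standing
# in for model calls: the generator drafts, the critic returns feedback or
# None to accept.

def generator(prompt: str, feedback) -> str:
    """Draft an answer, revising it if the critic supplied feedback."""
    draft = f"Answer to: {prompt}"
    return draft + " (revised)" if feedback else draft

def critic(draft: str):
    """Return feedback on the draft, or None when it is acceptable."""
    return None if "revised" in draft else "add more detail"

def debate(prompt: str, max_rounds: int = 3) -> str:
    """Alternate generator and critic until acceptance or the round limit."""
    feedback = None
    draft = ""
    for _ in range(max_rounds):
        draft = generator(prompt, feedback)
        feedback = critic(draft)
        if feedback is None:
            return draft
    return draft
```

The round limit matters in practice: each iteration costs two model calls, which is exactly the short-term computational expense noted above.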
Will AI technology market be more fragmented or consolidated?
As LLMs become central to future applications, the question of market dynamics arises. While intuition might suggest a highly consolidated AI market, I believe the opposite is true.
Unlike the internet boom of the late 90s and beyond, where network effects led to big-tech dominance, AI faces widespread skepticism towards closed-source models and a surge in open-source development. Companies and individuals are wary of data privacy and vendor lock-in, and customers are increasingly embracing open-source alternatives even at some cost in performance. So expect a fragmented AI market with multiple successful players, though the infrastructure and chip layers of the AI ecosystem may remain highly consolidated.
Disclaimer- The views and opinions expressed in this post are my own and do not necessarily reflect the views or opinions of my employer.