Embracing Model Diversity: Why Organisations Should Adopt Multiple Large Language Models



Introduction

In the realm of artificial intelligence, Large Language Models (LLMs) have begun to revolutionise how machines understand and generate human-like content. These models have become foundational in developing applications ranging from automated customer service to advanced content generation and application development. However, as our reliance on these technologies grows, the strategy of using a single model for multiple tasks is becoming increasingly questioned. In this blog, I will explore why adopting a variety of LLMs is not just a viable alternative but a necessary strategy for organisations to consider as they embark on their AI journey.


The Limits of a "One-Size-Fits-All" Approach

Consider this: we’ve moved from mainframes to servers, to virtual machines, to microservices and on to serverless systems. In many instances we are building highly distributed, bespoke systems with the optionality of being run across public, private and hybrid cloud instances, and with the ability to make use of utility compute resources on demand at competitive cost (notwithstanding, of course, the data gravity conundrum).

Yet, in the midst of the AI hype curve, we now find ourselves broadly gravitating towards a few preferred models or model providers to power the decisions and interactions of our global economy, our social and health care systems, and potentially critical national infrastructure services like electricity, gas and water supplies. Whilst adoption varies across industry sectors, most of the organisations I speak with have, one way or another, begun to conduct some form of AI/generative AI pilot or proof of concept. Yet the same small smattering of LLMs is being applied and, in most instances, organisations are using a single LLM.


One model to rule them all?

Whilst using a single LLM across various applications might seem efficient, this approach brings limitations. Firstly, a universal model may not adequately capture the specific jargon and nuances of different industries or fields. For instance, the language and knowledge required for legal advice differ vastly from what is needed in medical diagnosis or creative writing. Additionally, relying solely on one model amplifies risk: if the model experiences downtime or errors, every application dependent on it could fail simultaneously.

Indeed, I’d also add that throwing the most powerful LLM at a specific problem is probably not the most cost-effective approach for organisations to take, as there will likely be many finely tuned models and, in some instances, Small Language Models (SLMs) that are increasingly cost effective and more proficient than their LLM counterparts at performing a specific task (e.g. language translation). A quick search on platforms such as Hugging Face will provide a litany of results, tips and tricks for model deployment, as well as ratings from consumers/users of a model describing how they found it to operate for their specific task.

I mean, imagine you're a skilled carpenter tasked with crafting a fine piece of furniture, like a delicate wooden chair. Your workshop is equipped with a variety of tools, from heavy-duty power saws to fine chisels and everything in between.

Using the largest power saw for every detail—whether cutting the basic frame or intricately carving the legs—would not only be inefficient but could ruin the finer aspects of your work. Instead, the power saw is ideal for cutting large, rough pieces quickly, while the finer chisel is perfect for the delicate carvings that give the chair its elegance and charm.

In the same way, deploying the most powerful large language model (LLM) for every AI task can be like using a power saw to carve chair legs. While powerful models can handle broad and complex tasks effectively, smaller, more specialised models are often better suited for specific, nuanced problems. Just as a good carpenter chooses the right tool for each part of the job, a wise use of technology involves selecting the most appropriate model for each particular challenge.

While single-model language systems have made remarkable strides, they come with inherent limitations including biases, a lack of flexibility across various tasks, and a susceptibility to overfitting. Adopting a multi-model approach addresses these issues by leveraging the strengths of diverse models. By employing different models tailored to specific tasks, businesses can significantly improve their data analytics capabilities and mitigate the challenges associated with depending solely on one predominant model.

This is where I think the likes of AWS have played a cute game by positioning the notion of model optionality for builders of AI systems. At a curious glance, AWS provides access to models from the likes of AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon's own bespoke models like Titan. Indeed, Microsoft has also started to hedge its bets by bringing Mistral to the AI party in addition to its deep partnership with OpenAI.

This is where a multi-model approach makes sense and, to a certain degree, mirrors the movement we saw with multi-cloud strategies some four to five years ago.


The Benefits of Multiple Models

Top of mind, the deployment of multiple LLMs within an organisation offers several compelling advantages:

Customisation: Tailoring different models to specific tasks or sectors can dramatically improve performance and relevance. Specialised models can be trained on domain-specific data, leading to more accurate and contextually appropriate responses for organisations. Indeed, this means you need to have your data in order and ensure it is of a sufficient calibre.

Innovation & Supplier Bartering: A diverse ecosystem of models encourages competition and innovation within the field. As different models specialise and improve, organisations can benefit from the latest advancements in specific areas of AI. Indeed, model optionality can also be used as a bargaining chip with Big Tech providers to negotiate more economical compute resources for reserved instances, GPUs and model training tooling.

Resilience & Regulation: Diversity in technology tools can enhance system robustness. By deploying multiple models, an organisation can ensure that the failure of one does not cripple its entire AI infrastructure. Indeed, I suspect that in the coming years regulators will apply similar oversight to business use of LLMs across sectors like financial services, manufacturing, energy & utilities and FMCG to what we have witnessed in the cloud computing sector, specifically around considerations such as operational resilience, third-party risk management and concentration risk.

Cost and Efficiency: Different models may also optimise operational costs and efficiency. For example, lighter, faster models could handle routine queries while more complex, resource-intensive models tackle advanced problem-solving tasks.
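The cost and resilience points above can be sketched in a few lines of code. This is an illustrative sketch only: the provider names, model names, complexity heuristic and the `call_model` stub are all placeholder assumptions, not a real provider SDK.

```python
# Illustrative sketch: send routine queries to a cheap, small model,
# escalate complex queries to a larger one, and fall back to a second
# provider if the preferred one fails. All names here are placeholders.

ROUTES = {
    # task tier -> ordered list of (provider, model) preferences
    "routine": [("provider-a", "small-model"), ("provider-b", "small-model")],
    "complex": [("provider-a", "large-model"), ("provider-b", "large-model")],
}

def call_model(provider: str, model: str, prompt: str) -> str:
    """Placeholder for a real SDK call; would raise on a provider outage."""
    return f"[{provider}/{model}] answer to: {prompt}"

def classify(prompt: str) -> str:
    """Naive complexity heuristic; a real system might use a classifier model."""
    return "complex" if len(prompt.split()) > 50 else "routine"

def route(prompt: str) -> str:
    tier = classify(prompt)
    last_error = None
    for provider, model in ROUTES[tier]:
        try:
            return call_model(provider, model, prompt)
        except Exception as exc:  # provider down or rate-limited
            last_error = exc
    raise RuntimeError(f"All {tier} models failed") from last_error

print(route("Translate 'hello' into French"))
```

The ordered preference list is what buys the resilience: losing one provider degrades the system rather than crippling it, while the tiering keeps the expensive model reserved for the work that actually needs it.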

Indeed, just as with any system architecture, in an ideal world you want a solution that is composable and modular, and that provides you with the ability to swap and change components based on your business needs. This is why I believe the direction of travel is towards many LLMs and SLMs working in unison across a business landscape. Indeed, LLMs are still in their infancy in enterprise organisations, and the trade-offs of a multi-model approach aren't yet well travelled or deeply understood by many. However, there are synergies with the challenges we faced with the multi-cloud agenda several years ago, which still percolates today!


Practical Considerations for Implementing Multiple Models

Much like running multiple clouds, data centres or many distributed systems, deploying and maintaining multiple LLMs comes with an overhead and an "engineering tax". While there are benefits to a multi-model approach, there are also practical challenges to manage when adopting multiple LLMs:

Infrastructure: Organisations need the right technical infrastructure to support multiple models, which might include significant investment in computational resources and data storage across multiple cloud providers.

Talent and Expertise: Managing various models requires a diverse set of skills within the workforce. In this instance, your organisation would need to invest in training or hiring specialists familiar with different AI models, platform services and underlying compute resources. This is a massive challenge when the talent pool for highly skilled ML/AI experts is becoming more competitive week by week.

Integration Challenges: Ensuring that different models work harmoniously within the same ecosystem can be complex. Organisations must design systems where models can share insights and data where appropriate, without causing conflicts or data integrity issues.
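One common way to tame the composability and integration concerns above is to put a thin, shared interface between applications and the models behind them. The sketch below assumes this pattern; the two adapapter classes are hypothetical stand-ins for real provider SDK wrappers, not any vendor's actual API.

```python
# Illustrative sketch: calling code depends only on a shared interface,
# so concrete models can be swapped, mixed or A/B-tested without touching
# the application. The adapters below are placeholder stand-ins.
from typing import Protocol

class TextModel(Protocol):
    def generate(self, prompt: str) -> str: ...

class LegalModelAdapter:
    """Would wrap a domain-tuned legal model behind the shared interface."""
    def generate(self, prompt: str) -> str:
        return f"legal-model: {prompt}"

class TranslationSLMAdapter:
    """Would wrap a small, task-specific translation model."""
    def generate(self, prompt: str) -> str:
        return f"translation-slm: {prompt}"

def summarise_contract(model: TextModel, contract: str) -> str:
    # The caller never sees which provider or model size sits behind
    # the interface, which is what makes components swappable.
    return model.generate(f"Summarise: {contract}")

print(summarise_contract(LegalModelAdapter(), "clause 4.2 on liability caps"))
```

The design choice mirrors the multi-cloud playbook: the adapter layer is part of the "engineering tax", but it is also the thing that stops a single model choice from hardening into a dependency you cannot walk back.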


Big and Powerful, Doesn’t Mean Beautiful

In the last few weeks alone, the leaderboard for the biggest context window, accuracy, toxicity and the most performant LLM has changed hands several times over. This isn't going to change and, much like cloud computing, models themselves will likely become a commodity. This is typified by the likes of OpenAI launching their Apple-style GPT store. However, the thing that will drive the aforementioned leadership position is high quality data. But we will leave that for another blog!

There is no denying in my opinion, that the future of LLMs in business and technology looks promising, with continuous advancements likely to spur even more specialised models. As AI becomes more ingrained in business processes, the ability to deploy and manage a portfolio of models will be a significant competitive advantage.

That said, the adoption of multiple large language models offers numerous benefits for organisations, from enhanced customisation and innovation to improved resilience and cost efficiency. As the AI landscape evolves, embracing a variety of specialised models will be crucial for staying competitive and agile in an increasingly automated world.
