The Rise of Open-source LLMs
Introduction
It’s been a year since I wrote my first article on Generative AI, so I thought it’d be worth penning my latest thoughts on the state of the industry. In particular, what I think will be the biggest trend in GenAI in 2024: the rise of Open-source Large Language Models (LLMs).
Until early 2023 and the advent of enterprise offerings for Generative AI, the market for AI/ML was dominated by open source models and frameworks. These would be customised and tailored to specific client problems and data, with developments and improvements continually worked on by the open source community. There was little space for Proprietary ML models, and even out-of-the-box solutions from vendors such as Salesforce were just custom models built from Open-source code.
Cut to 2023 and the latest AI technology is suddenly locked behind a paywall of sorts: access to the best Generative AI models is available only via API, at a not insignificant cost, and for the most part the algorithm is completely hidden from the end user and not easily customisable. Basically, you don’t get access to the source code. Note that this is somewhat changing for Proprietary models, with fine-tuning customisations of frozen foundation models coming online, but this is still for advanced users, with uncertain benefits and additional complications over RAG implementations (a whole topic in itself!).
So where are the Open-source LLMs then? Well, they have always been there. Up until the last six months, though, they’ve largely been overshadowed by models from the hyperscalers and OpenAI. The release of Mistral’s LLM from seemingly nowhere (a relatively obscure French company founded by ex-Meta and Google folk) changed the perception of open source. Add the continued improvements of Falcon and Llama, amongst a host of other models, and there is a real question now about using Open-source for enterprise solutions. Perhaps the biggest arguments for using Open-source instead of Proprietary focus less on pure performance and more on the notable downsides of the Proprietary models.
Based on conversations with clients over the last few months, six common concerns with Proprietary models come up again and again. So let’s dig into these and break them down.
1. Legal Implications
The legal landscape surrounding LLMs is complex to say the least, encompassing issues of intellectual property (IP), data security, and privacy concerns.
One of the primary legal challenges involves navigating data sourcing considerations to ensure compliance with data protection laws. This includes adhering to regulations that govern the processing of personal data and avoiding unauthorized access to or use of third-party data, as prohibited by the Computer Misuse Act in the UK.
Additionally, the monetization and creative use of content generated by LLMs have raised significant IP and copyright challenges. Legal systems around the world are being called upon to clarify ownership rights and protect creators' rights, leading to debates over authorship and the development of new licensing and royalty models for AI-generated content.
Privacy and security concerns are paramount, given LLMs' capabilities to process vast amounts of data, including potentially sensitive information. The risk of data breaches, unintended disclosure of sensitive information, and the introduction of biases in AI outputs necessitates stringent data protection measures. Organizations must adopt comprehensive approaches to safeguard data, encompassing policy enforcement, access controls, and continuous monitoring.
Furthermore, the use of LLMs in business operations introduces specific privacy and security challenges, such as the misuse of "dark data" and the need for clear data stewardship. Companies must ensure that they have appropriate consent to process data and are capable of managing the complex legal obligations associated with data stewardship.
Addressing these challenges requires a careful balance between leveraging the capabilities of LLMs and adhering to evolving legal and ethical standards. Organizations must stay informed of the legal landscape and implement robust governance frameworks to mitigate risks associated with the deployment of LLM technologies.
These looming legal troubles are concerning from a long-term perspective and they aren’t going away. The good news for the Open-source community is that a lot of these challenges disappear, as the data sets used to train the model are known and local deployments of models remove privacy and security risks. The hyperscalers and OpenAI know this and are trying to assuage customer fears. Sam Altman famously said that OpenAI would cover the legal fees of customers sued for IP infringement. Google has recently gone further with specific indemnity: they are now committing that if customers are challenged on copyright grounds, they will provide indemnity on both the training data used to build the model and the output of those models, including Duet, which will be used by many knowledge workers on a day-to-day basis. Whilst these legal issues have not yet been decided, the major providers of Proprietary models are providing strong coverage. If that changes with legal precedent, it could upend the Generative AI industry.
2. Model Deprecation
Model deprecation in Large Language Models (LLMs) refers to the phenomenon where a model's performance or relevance decreases over time due to various factors such as changes in language, the emergence of new information, or shifts in societal norms. This presents significant risks and challenges to organizations working with vendors of Proprietary LLMs. Firstly, there's the risk of obsolescence, where a once cutting-edge model becomes outdated, necessitating costly updates or replacements. Secondly, organizations face integration challenges, as constantly evolving models may require frequent adjustments to their existing systems and workflows. Additionally, the Proprietary nature of these models often means limited customization or insight into the model's inner workings, making it harder for organizations to adapt the model to their specific needs or to mitigate biases. This can lead to a reliance on vendors for updates and support, potentially locking organizations into expensive and inflexible contracts. The worst-case scenario is where a provider discontinues a model completely, forcing users to switch to a suitable replacement. This is a risk when you don’t own your own models. It became real in July 2023, when OpenAI announced they were deprecating 28 of their models; as of January 4, 2024, those models would no longer be available.
3. Cost
Cost concerns associated with using Proprietary Large Language Models (LLMs) like GPT-4 and Gemini Pro, as opposed to open-source models, are significant for organizations. Proprietary models often come with licensing fees and usage costs based on the volume of data processed or the number of API calls made, which can scale rapidly with increased usage, leading to unpredictably high operational costs. In contrast, open-source models can be more cost-effective since they can be deployed on an organization's own infrastructure, eliminating per-query costs and offering more control over operational expenses. However, deploying and maintaining open-source models also incurs costs related to infrastructure, development, and potentially model customization to suit specific needs. Despite these costs, the flexibility and potential for cost optimization make open-source models an attractive option for organizations with the capability to manage them, especially when compared to the ongoing costs associated with Proprietary LLMs. This is an especially acute issue for use cases which demand a high volume of API calls, such as business process optimisation or customer service queries for large retailers.
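To make that trade-off concrete, here is a minimal back-of-the-envelope sketch. All of the rates below (per-token API pricing, GPU hourly cost, request volumes) are illustrative assumptions I've picked for the example, not published vendor prices.

```python
# Hypothetical cost comparison: pay-per-token API vs. a self-hosted
# open-source model. All rates are illustrative assumptions only.

def api_monthly_cost(requests_per_day, tokens_per_request, price_per_1k_tokens):
    """Pay-per-token API cost, scaling linearly with usage."""
    monthly_tokens = requests_per_day * tokens_per_request * 30
    return monthly_tokens / 1000 * price_per_1k_tokens

def self_hosted_monthly_cost(gpu_hourly_rate, hours=24 * 30):
    """Flat infrastructure cost: one always-on GPU instance, regardless of volume."""
    return gpu_hourly_rate * hours

# A high-volume customer-service workload: 50,000 requests/day, ~1,500 tokens each.
api = api_monthly_cost(50_000, 1_500, price_per_1k_tokens=0.03)   # assumed rate
hosted = self_hosted_monthly_cost(gpu_hourly_rate=2.50)           # assumed rate

print(f"API cost/month:         ${api:,.0f}")     # $67,500
print(f"Self-hosted cost/month: ${hosted:,.0f}")  # $1,800
```

The point is not the specific numbers but the shape of the curves: API spend grows linearly with volume, while self-hosting is a (larger up-front, then roughly flat) infrastructure cost, which is why high-volume use cases are where open source becomes compelling.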
4. Latency
The latency when using advanced language models like GPT-4 and Gemini Pro, particularly when sending multiple API calls simultaneously, can significantly impact user experience and operational efficiency. These models, due to their complexity and the computational resources required to generate responses, can experience increased response times under heavy load or when processing large volumes of requests concurrently. This challenge is exacerbated in real-time applications or services that rely on swift data processing, where any delay can affect user satisfaction or decision-making processes. Moreover, the infrastructure and network bandwidth also play a crucial role in the latency experienced by users; as data travels back and forth between the user's systems and the model's servers, any bottlenecks in this path can further increase response times. Organizations must therefore carefully manage their request volumes, possibly implementing queueing mechanisms or optimizing request patterns, to mitigate these latency issues while ensuring that their use of such models remains cost-effective and efficient.
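The queueing idea above can be sketched very simply: cap the number of in-flight API calls on the client side so that bursts queue up instead of piling onto the provider. The `send` function below is a stand-in for a real model API call, not any vendor's SDK.

```python
# Client-side throttle: at most MAX_CONCURRENT requests hit the API at once;
# the rest wait their turn, smoothing latency spikes under bursty load.
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 4  # tune to the provider's documented rate limits
semaphore = threading.Semaphore(MAX_CONCURRENT)

def send(prompt: str) -> str:
    """Placeholder for a real model API call."""
    return f"response to: {prompt}"

def throttled_send(prompt: str) -> str:
    # Block until one of the MAX_CONCURRENT slots frees up.
    with semaphore:
        return send(prompt)

# Fan out 10 requests; only 4 are "in flight" at any one time.
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(throttled_send, [f"query {i}" for i in range(10)]))

print(len(results))  # 10 responses, in request order
```

In production you would typically add retries with backoff and per-request timeouts on top of this, but the semaphore illustrates the core mechanism.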
Deploying open-source LLMs locally or in a private cloud can address some of the latency challenges associated with using Proprietary models through external APIs. Local or private cloud deployment allows for greater control over the computing resources allocated to the model, enabling organizations to scale up their infrastructure to meet demand and reduce response times. This setup minimizes the network latency that comes with sending data to and from an external server, as computations are performed within the organization's own network. Moreover, having direct access to the hardware can facilitate optimizations tailored to the specific model and use case, further enhancing performance. However, this approach requires significant investment in hardware and expertise to manage the infrastructure and model updates efficiently. It also shifts the responsibility for data privacy and security entirely onto the organization, which can be a complex challenge but offers the advantage of keeping sensitive data within a controlled environment.
5. SLAs
Service Level Agreements (SLAs) provided with Proprietary Large Language Models (LLMs) like GPT-4 and Gemini Pro are critical for enterprises relying on these services for operational tasks. SLAs typically outline the expected performance standards, availability, and response times, offering a measure of reliability and assurance to organizations. For enterprise solutions, these SLAs can significantly impact operational reliability; they define the vendor's commitment to uptime and performance metrics, which are vital for planning and maintaining service continuity. A strong SLA ensures that the model is available and performs at the required level, minimizing disruptions to business operations. However, reliance on these SLAs also means that enterprises must prepare for scenarios where SLA conditions are not met, which might include implementing fallback mechanisms or diversifying their dependencies on multiple services to mitigate risks. The terms of these agreements can also affect how enterprises plan their resource allocation and manage their service expectations, making the specifics of SLAs a crucial factor in the operational planning and risk management strategies of organizations using Proprietary LLMs.
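When comparing vendor commitments, it helps to translate an SLA uptime percentage into allowed downtime. A quick worked example (the percentages are generic tiers, not any specific vendor's terms):

```python
# Convert an SLA uptime percentage into permitted downtime per 30-day month.
def downtime_minutes_per_month(uptime_pct, minutes_in_month=30 * 24 * 60):
    return minutes_in_month * (1 - uptime_pct / 100)

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime -> {downtime_minutes_per_month(sla):.1f} min downtime/month")
# 99.0%  -> ~432 minutes (over 7 hours)
# 99.9%  -> ~43 minutes
# 99.99% -> ~4 minutes
```

The gap between "three nines" and "four nines" is an order of magnitude of downtime, which is exactly the kind of difference that should drive decisions about fallback mechanisms and multi-vendor strategies.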
6. Vendor Risk
Relying on a single vendor for Large Language Model (LLM) services presents several risks, including vendor lock-in, limited flexibility, and potential service disruption. Vendor lock-in occurs when an organization becomes so dependent on a vendor's products and services that switching to another provider is prohibitively costly or complex. This can limit an organization's ability to adapt to new technologies or negotiate favourable terms. With only one vendor, there's also a risk of limited flexibility; the organization may find itself constrained by the vendor's roadmap, capabilities, or data handling practices, which may not always align with its evolving needs. Furthermore, reliance on a single source for LLM services increases the risk of service disruption. Should the vendor experience downtime, security breaches, or terminate the service, the organization could face significant operational challenges without a ready alternative. Diversifying vendors or considering open-source alternatives where feasible can mitigate these risks, offering greater control and flexibility.
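One practical mitigation for lock-in is to route all model calls through a thin internal interface, so a provider can be swapped without touching application code. The sketch below uses illustrative stubs rather than real SDK clients; the class and method names are my own invention for the example.

```python
# Abstraction layer over LLM providers: callers depend only on the interface,
# never on a specific vendor's SDK, so swapping vendors is a one-line change.
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class ProprietaryAPI(LLMProvider):
    """Stub standing in for a hosted, pay-per-call vendor API."""
    def complete(self, prompt: str) -> str:
        return f"[hosted] {prompt}"

class LocalOpenSource(LLMProvider):
    """Stub standing in for a locally deployed open-source model."""
    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"

def answer(provider: LLMProvider, question: str) -> str:
    # Application logic sees only LLMProvider, not a vendor SDK.
    return provider.complete(question)

print(answer(ProprietaryAPI(), "hello"))   # [hosted] hello
print(answer(LocalOpenSource(), "hello"))  # [local] hello
```

The same pattern also makes it straightforward to run two providers side by side during a migration, or to fail over to a second provider when the primary misses its SLA.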
The LLM market still has a way to mature, and with so many new models being released and major competition between the hyperscalers and established software vendors to embed GenAI into their offerings, organisations should be wary of committing to a single vendor for LLM services. It is simply a poor business strategy at this stage of the maturity curve.
Conclusion
Given all of the above, my bold prediction for 2024 is that Open-source models will seriously challenge the hegemony of the Proprietary models in the LLM space. With each new iteration, increasingly performant Open-source models challenge the capabilities of the GPTs and Geminis of the world, and they do so without the downside risks and challenges of using Proprietary models. Yes, there are added complications in deploying and managing an LLM locally, but as we continue to improve our understanding of how to do this efficiently and effectively, the remaining benefits of Proprietary models may be quickly eclipsed by the Open-source marketplace. I could be totally wrong, and that’s fine too! Just the other day the introduction of Sora blew the industry away and gave the big players another leg up in the arms race that is Generative AI. Where this ends up is anyone’s guess, but if I were a betting man I’d back Open-source every day of the week.