Is the cost of Large Language Models (LLMs) or Small Language Models (SLMs) impacting the sustained adoption of AI within your enterprise?
Jacob Mathew
C-level Sales, Consulting & Growth Professional | Enterprise Solution Sales | SaaS | Open Source AI | LLMs | Quantum AI | Cloud | Commercial Open Source | IT/OT Data Transformation | Ex IBM, CISCO, AVEVA, WIND RIVER.
Artificial Intelligence (AI) is advancing at a breakneck pace globally. Today’s announcement by President Macron at the Paris AI Summit that France is to receive 109 billion euros ($113 billion) from international investors mirrors the scale of US AI investment in the $500-billion "Stargate" project backed by OpenAI. The AI revolution is here, and I’m excited about the opportunities AI presents as enterprises and public bodies undergo their transformation journeys in an AI-powered world to deliver better outcomes for humans and societies at large. The future of AI is now!
Recently, the European Commission launched OpenEuroLLM under the Digital Europe Programme, an initiative to develop open-source, multilingual large language models (LLMs) that reflect the European values of transparency, openness and accessibility, in compliance with the European AI Act, which came into force on 1 August 2024. Clearly, the massive spending on AI is set to continue, not only by big tech firms but also by government bodies globally, and the race is on.
Large Language Models (LLMs) such as ChatGPT and LLaMA have been trained on a massive corpus of data, where the data sets are traditionally internet-scale and made up of publicly available data, with parameters numbering in the billions. Even when organisations release their pre-trained models, LLMs still require additional fine-tuning to meet real-world enterprise requirements. Retraining and tuning can be expensive, especially when you consider that it could take multiple training cycles to get the outcomes you need from your AI models.
As cited by Andreessen Horowitz, the financial costs are significant, ranging from US $500,000 to more than US $4 million to train an AI model like GPT-4. Beyond the financial costs, companies that have established climate and ESG goals and are tracking their carbon emissions might find deploying LLMs expensive and at odds with delivering on their ESG and carbon-emissions commitments.
Furthermore, a global study conducted by S&P Global Market Intelligence in 2023 indicated that more than 68% of the organisations interviewed are concerned about the impact of AI/ML on their organisation’s energy use and carbon footprint. It is fair to say that the supply of compute is severely constrained. Right now, access to compute resources at a lower cost has become a determining factor in deploying LLMs successfully at enterprise scale and creating value.
Moreover, whilst large language models (LLMs) have dominated headlines, enterprises today are increasingly recognising the strategic value of Small Language Models (SLMs) as a more targeted, efficient and cost-effective approach to implementing AI within their functional departments. For instance, SLMs could be used in real time to monitor or perform preventative maintenance on Internet of Things (IoT) devices running mission-critical assets in sectors like energy, telecom and retail. Within automotive systems, SLMs can go a long way in offering real-time traffic updates for smarter road navigation, or improving voice commands and hands-free calling.
In December 2024, Microsoft released the latest SLM in its Phi family, Phi-4, of which Microsoft says “Phi-4 outperforms comparable and larger models on math-related reasoning” while also being able to carry out conventional language processing.
Therefore, in my opinion, having a world-class LLM or SLM compression solution that is effective yet less resource-intensive and lower-cost is vital for every organisation, as well as for public institutions.
Multiverse Computing, a Spanish-headquartered company, has developed CompactifAI, an LLM compressor that uses "quantum-inspired" tensor networks to make high-performing foundation models from leading AI companies like Meta, Mistral AI and DeepSeek more efficient and portable, reducing model size by over 90% with only a 2%–3% drop in accuracy, and delivering better price-performance with over 50% tangible cost savings on retraining and inference.
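To give a feel for how factorising weights shrinks a model, here is a deliberately simplified sketch. CompactifAI's actual quantum-inspired tensor-network method is proprietary; the toy below uses a plain truncated SVD (the simplest two-factor "tensor network") on a single hypothetical weight matrix to show why factorised layers need far fewer parameters while losing little accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 1024x1024 dense layer with approximately low-rank
# structure, as trained weight matrices often exhibit in practice.
rank_true = 32
W = rng.standard_normal((1024, rank_true)) @ rng.standard_normal((rank_true, 1024))

# Truncated SVD: keep only the top-k singular triplets and store the
# layer as two thin factors A (1024 x k) and B (k x 1024).
k = 64
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * s[:k]
B = Vt[:k, :]

params_before = W.size                 # 1024 * 1024 parameters
params_after = A.size + B.size         # 2 * 1024 * k parameters
compression = 1 - params_after / params_before

# Relative reconstruction error, a rough proxy for accuracy drop.
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)

print(f"compression: {compression:.1%}, relative error: {err:.2e}")
```

Because the layer here is genuinely low-rank, the two factors cut the parameter count by 87.5% with essentially no reconstruction error; real tensor-network compressors apply far more sophisticated decompositions across all layers, which is where figures like 90%+ size reduction come from.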
Our tensor-network compression methods can be applied to any leading LLM- or SLM-powered AI model, such as Llama 3.1 8B, Microsoft Phi-4, Mistral-8B and Llama 2 7B, offering state-of-the-art compressed models, detailed performance benchmarks, comprehensive evaluations, and dedicated enterprise support: everything you need to optimise your costs, accelerate time-to-production, and easily scale your organisation's AI solutions.
High-performing AI foundation models (FMs) compressed and optimised with CompactifAI can be deployed across a wide range of environments, from enterprise data centers and on-prem infrastructure to lightweight embedded devices running LLMs or SLMs at the edge, as well as cloud, hybrid, and multi-cloud environments.
In addition, we have numerous active projects in Europe focused on compressing Text-to-Speech (TTS) and Speech-to-Text (STT) models using CompactifAI. These models feature distinct architectures compared to Large Language Models (LLMs) and computer vision models, and support numerous use cases in the automotive and defense sectors.
By reducing the foundation AI model footprint, CompactifAI minimises hardware (GPU/CPU) requirements, unlocking new possibilities for cost-effective and sustainable AI deployments with the following benefits:
- 50%–75% lower LLM compute/energy costs
- 25%–50% faster inference
- 2x–8x faster training
- LLM portability to cloud, on-prem, in-device, IoT, edge, etc.
- Negligible accuracy drop
Check out: https://lnkd.in/dPj8844F
Ready to optimise your enterprise LLM or SLM deployments with speed, efficiency, and faster time-to-market?
Get in touch via LinkedIn or email me at [email protected] to discover the world-class quantum-inspired AI compression tools, models, and support you need to succeed with your enterprise AI programs and projects.