Think Small, Dream Big
Generative AI is having a bit of a blockchain moment. Remember when everyone was hyping up blockchain's potential, only for the conversation to shift to the technology's massive energy consumption, which arguably inhibited its adoption?
AI is facing a similar narrative now. Despite its enormous value and potential (estimates range from $1 trillion in US GDP growth to a 3.5% productivity boost), AI is dogged by concerns about its environmental impact.
And to be fair, there's reason for concern: generative AI's hunger is undeniably causing massive spikes in energy usage. Google is missing its sustainability targets, with emissions up 48% over its 2019 baseline year. Microsoft's power demands are soaring (a projected $100 billion data center construction project will see to that), and the US might even delay shutting down coal plants to feed AI's insatiable appetite. What's more, given the arms race to build the biggest, best, and most intelligent model on the road to artificial general intelligence, it's hard to say whether anyone in big tech really cares.
Some of the proposed solutions to AI's sustainability problem are insufficient on their own. While AI itself can discover sustainability solutions and improve efficiency, a comprehensive, global approach is needed for broader impact. Focusing on efficient hardware is not enough, as advancements are often met with increased model complexity, perpetuating the issue. And future energy-focused technologies such as fusion and reversible computing show promise but, frankly, remain distant possibilities.
Sometimes, less is more
Achieving truly sustainable AI development will require short- and long-term innovative solutions and unprecedented global collaboration. There's a difference businesses can make right now, though, and it’s all to do with making sure you’re using the right tool for the job at hand. When it comes to generative AI, that tool isn’t always a large language model. While LLMs receive much of the attention, small language models offer a compelling alternative.
SLMs do effectively the same thing as LLMs, but they differ in one key (not to mention obvious) way: size. SLMs have a smaller number of parameters, the values a model learns during training. This makes them faster, more lightweight, often more customizable, and crucially—for this discussion at least—far more energy efficient. Don't let their size fool you, though. SLMs are trained on massive datasets and are capable of handling tasks like text summarization, translation, question-answering, and content generation.
Many enterprise platforms for gen AI now offer solutions that incorporate SLMs. This allows businesses to harness the power of AI without the extensive computing resources required by LLMs. In addition, many SLMs are open source and can be run on local hardware, even mobile devices.
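To make the size gap concrete, here is a back-of-envelope sketch of the memory needed just to hold a model's weights, assuming half-precision (fp16) storage at 2 bytes per parameter. The parameter counts are illustrative round numbers, not figures for any specific vendor's model.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Rough memory needed to hold the weights alone (fp16 = 2 bytes each)."""
    return n_params * bytes_per_param / 1e9

# Illustrative sizes: a ~3B-parameter SLM vs. a ~175B-parameter LLM.
for name, n in [("SLM (~3B params)", 3e9), ("LLM (~175B params)", 175e9)]:
    print(f"{name}: ~{weight_memory_gb(n):.0f} GB in fp16")
```

The arithmetic alone explains why many SLMs fit on a laptop or phone (a few gigabytes) while frontier LLMs demand racks of accelerators, with energy costs to match.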
Effort is also happening under the hood of even these small models, as engineers work to make them leaner and greener. Techniques like quantization and pruning are important in this quest for efficiency. (Quantization involves reducing the precision of the numbers used in a model's calculations; it’s akin to rounding off a long decimal to a simpler fraction. This makes the model smaller and faster, with minimal impact on performance. Pruning, for its part, involves removing unnecessary connections within the model, much like trimming dead branches from a tree. This can drastically reduce a model's size and computational demands without sacrificing much accuracy.)
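The two techniques above can be sketched in a few lines of NumPy. This is a minimal, illustrative version—symmetric int8 quantization with a single scale factor, and magnitude-based pruning—not the production variants frameworks actually ship:

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric quantization: map float weights onto int8 via one scale factor."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights; error is bounded by scale / 2."""
    return q.astype(np.float32) * scale

def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights ('trimming dead branches')."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)          # 4x smaller: int8 instead of float32
w_hat = dequantize(q, s)         # close to w, but not exact
w_sparse = prune_by_magnitude(w, sparsity=0.5)  # half the weights zeroed
```

Storing int8 instead of float32 cuts weight memory by 4x, and a 50%-sparse matrix halves the multiply-accumulate work on hardware that can skip zeros—which is exactly where the energy savings come from.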
Rightsizing
The answer, though, is not always to simply switch from large to small language models—that would be foolish. While small language models have many benefits for sustainability, cost, and specialization, they lack the complexity, nuance, and creativity required for some use cases.
The key is to use the right model for the job at the time, bringing in some of the elasticity we see with cloud providers. Those providers have become adept at serving up just enough compute power to cover demand, scaling up and down as necessary. It's a model the gen AI world should emulate. Businesses need not throw GPT-4o, or whatever the biggest and best model happens to be, at every query; rather, they should match the size of the model to the complexity and nature of the task at hand.
Apple has demonstrated this quite elegantly with Apple Intelligence, wherein a prompt is first assessed to see if it can be completed using an on-device AI model. If not, off it goes to the cloud to be run by Apple's own models. If the response isn't good enough, the user then has the option to throw the problem at ChatGPT.
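The cascade pattern described above can be sketched as a simple tiered router. This is a toy illustration, not Apple's actual logic: the tiers, the word-count heuristic, and the handler functions are all hypothetical stand-ins for real capability checks and model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ModelTier:
    name: str
    # A handler returns an answer, or None to signal "escalate to the next tier".
    handler: Callable[[str], Optional[str]]

def cascade(prompt: str, tiers: list[ModelTier]) -> tuple[str, str]:
    """Try the cheapest tier first; escalate only when a tier declines."""
    for tier in tiers:
        answer = tier.handler(prompt)
        if answer is not None:
            return tier.name, answer
    raise RuntimeError("no tier could handle the prompt")

# Toy handlers: the on-device model only accepts short prompts.
on_device = ModelTier(
    "on-device SLM",
    lambda p: f"[local] {p}" if len(p.split()) <= 8 else None,
)
cloud = ModelTier("cloud LLM", lambda p: f"[cloud] {p}")

tier, _ = cascade("summarise this note", [on_device, cloud])
```

The point of the pattern is that the expensive tier is only touched when the cheap one genuinely can't cope—the compute (and energy) bill scales with task difficulty rather than defaulting to the maximum for every request.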
This isn't to say the be-all and end-all solution to generative AI’s sustainability challenge is just about business decisions, model choice, and rightsizing. But it is something we can do now, without waiting for more efficient hardware or new and miraculous power sources. As ever, it will take a combination of efforts over the long term to ensure the tech doesn’t push the planet over the edge and that the outputs are truly valuable to employees, consumers and society.
The relentless pursuit of AGI will undoubtedly demand massive amounts of power. However, a multitude of current AI applications don’t require it, and could be re-evaluated for their environmental impact and optimized for greater efficiency. This would not only reduce their ecological footprint but also make them more cost-effective and faster. By taking a critical look at existing use cases, we can identify areas where more sustainable AI models or less resource-intensive algorithms could be employed without sacrificing performance.
When the conversation around blockchain was dominated by its environmental impact, it wasn't the technology itself that was the main issue, but rather the way Bitcoin, specifically, validated transactions, which consumed a lot of energy. However, many other blockchains have since become much more efficient by adopting a different, less energy-intensive validation method.
We can expect a similar situation with generative AI: the environmental impact isn't solely about the technology, but about how it's put into practice. Therefore, we should focus on responsible AI practices from the beginning and always select the right tool for the job.