Think Again Before Splurging on Top-end GPUs

We are in the midst of AI mania, and it comes with a close relative: GPU mania. As AI makes its mark on industry after industry, enterprise investment in the infrastructure needed to support AI workloads has turned into a GPU land grab. Many cloud service providers and enterprises are buying up GPUs in anticipation of future AI applications. It’s an “if we build it, they will come” strategy.

But GPUs are not the be-all and end-all, and there’s no one-size-fits-all solution. In fact, as the AI industry shifts from building and pre-training foundation models to using those models for inference and generation, we can expect today’s GPU-heavy compute to spread across a continuum: from cutting-edge GPUs for the largest “ask me anything” LLMs, to mid-level GPUs for mid-size, task-specific LLMs, to software-optimized CPUs for small, low-latency edge LLMs.

It’s clear that many organizations are taking their time to sort through the hype – and rightfully so. AI adoption is uneven across industries right now, with large companies and industries like manufacturing leading the charge. Overall, according to a February 2024 survey from the U.S. Census Bureau, only 5% of U.S.-based companies were using AI, and 7% planned to adopt it within the next six months.

At Akamai, we’ve been looking closely at how to deploy GPUs, and especially at which GPUs are needed for AI inference at the edge. The answers aren’t necessarily obvious. Here’s why:

1. GenAI is just the tip of the iceberg

When it comes to AI applications and models, think of an iceberg. Above the surface are the GenAI models we’ve all heard of, the ones that garner all of the attention and headlines. But below the surface, there’s so much more: the myriad AI models you’ve never heard of. These are the deep learning models that do things like image classification, anomaly detection, and clustering, and that have been delivering real value to enterprises for a decade or more.
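
To make the “below the surface” point concrete, here is a minimal sketch of that kind of workhorse model: classic anomaly detection that runs comfortably on ordinary CPUs, no GPU in sight. The synthetic data and the scikit-learn model here are my own illustrative assumptions, not a description of any particular production system.

```python
# A sketch of "below the iceberg" AI: classic anomaly detection on a CPU.
# Synthetic data and parameters are illustrative assumptions only.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal_traffic = rng.normal(loc=0.0, scale=1.0, size=(10_000, 8))  # typical samples
anomalies = rng.normal(loc=6.0, scale=1.0, size=(50, 8))           # injected outliers
X = np.vstack([normal_traffic, anomalies])

# Fit a classic tree-based anomaly detector; trains in seconds on a laptop CPU.
model = IsolationForest(contamination=0.005, random_state=0)
model.fit(X)

flagged = (model.predict(X) == -1).sum()  # -1 marks predicted anomalies
print(f"Flagged {flagged} of {len(X)} samples as anomalous")
```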

When you realize that enterprises were putting AI to work well before the current GenAI and GPU mania, it becomes clear just how much hype there is to sort through.

2. You don’t need the best GPUs, you need the right GPUs

GPUs for AI are evolving quickly. For example, if you had bought NVIDIA’s top-end GPUs for AI just a couple of years ago, you’d already be a factor of 32 below the stated performance of the newest generation.

Further, GPUs are not “one size fits all”. They span a wide range of performance capabilities and price points. Before purchasing GPUs, it’s important to know what your use case is. Are you trying to build your own LLM? Deploy machine learning models at the edge? Leverage deep learning to help address cybersecurity risk? Customize an existing LLM to support your customer experience?

A GPU that is well suited to one AI model may be a poor fit for another. Simply recognizing that the GPUs you buy today might not match the use cases that emerge tomorrow sets you up for success.

3. AI may not always need GPUs

Finally, it’s important to remember that AI algorithms are also progressing at a breakneck pace, and they aren’t always getting bigger and more complex. A particularly interesting example is the 1-bit transformer from Microsoft. This algorithm represents certain values in the model with just one bit – one bit to represent the values 1 and -1 – so matrix multiplication (the core task GPUs do so well) becomes simple addition and subtraction. This Microsoft algorithm is an extreme example of a technique called quantization, but it suggests that AI models that need GPUs today may not need GPUs tomorrow.
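
As a rough illustration of why 1-bit weights eliminate multiplication, here is a toy NumPy sketch. It is not Microsoft’s implementation; the shapes and random data are assumptions for demonstration. It simply shows that a matrix multiply against weights restricted to -1 and +1 collapses into additions and subtractions.

```python
# Toy demo: with weights in {-1, +1}, x @ W needs no multiplications.
# Illustrative only -- not the Microsoft 1-bit transformer itself.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))               # activations
W = rng.choice([-1.0, 1.0], size=(16, 8))  # 1-bit weights: only +1 and -1

# Standard matrix multiply (the operation GPUs accelerate).
y_matmul = x @ W

# Same result using only additions and subtractions: for each output column,
# add the activations where the weight is +1 and subtract where it is -1.
y_addsub = np.stack(
    [x[:, W[:, j] == 1].sum(axis=1) - x[:, W[:, j] == -1].sum(axis=1)
     for j in range(W.shape[1])],
    axis=1,
)

print(np.allclose(y_matmul, y_addsub))  # True: identical outputs, zero multiplies
```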

4. The energy demands of GPUs can’t be ignored

The energy required to power GPUs is a significant concern. As enterprises scale up their AI workloads, the energy demand of GPU-heavy infrastructure grows rapidly. On average, GPUs use more than twice as much power per unit of processing as CPUs.

An interesting point here is that, when it comes to AI inference, there are software-led innovations that can help reduce energy demands. Generally, these work by reducing the size of the AI model through various techniques. Today, in the early days of GenAI, we see cases of throwing megawatts at problems that could be solved with milliwatts.

Okay, I might be exaggerating a bit here, but the point still stands. LLMs are being touted as the solution for problems that are more economically solved with small language models (SLMs), deep learning, machine learning, or even plain old data analytics. If enterprises are willing to take their time and optimize their chosen AI model for their specific use case, they can limit energy demands.
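
To make the model-shrinking idea concrete, here is a minimal sketch of one software-led technique: post-training dynamic quantization, which converts a model’s linear-layer weights to int8 for CPU inference. The tiny PyTorch model below is a stand-in I made up for illustration, not a production LLM or SLM, and the exact size reduction will vary by architecture.

```python
# Sketch of post-training dynamic quantization in PyTorch.
# The toy model is an illustrative stand-in, not a real LLM or SLM.
import io
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Quantize only the Linear layers' weights to int8 for CPU inference.
quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def size_mb(m: nn.Module) -> float:
    """Approximate serialized model size in megabytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.tell() / 1e6

print(f"fp32 model: {size_mb(model):.2f} MB")
print(f"int8 model: {size_mb(quantized):.2f} MB")

x = torch.randn(1, 512)
print(quantized(x).shape)  # inference still works, now on int8 weights
```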


The GPU conversation is complex. But developing a thoughtful strategy that avoids succumbing to the hype is essential for making sound hardware investments. By understanding the broader AI landscape, selecting the right GPUs for your specific needs, and considering the future evolution of AI algorithms, you can position your organization for long-term success.

Thank you for bringing attention to the nuances of GPU selection in the context of AI. Your point about tailoring technology to specific business needs is crucial, as the demands of AI applications can vary significantly across different industries. It would be interesting to hear more about your experiences in evaluating the right mix of GPUs and CPUs. What criteria do you prioritize when making these decisions? Looking forward to further insights from you and the community.

Mark Dronzek

Transformational CIO | Global Leadership & Strategy | Data Advocate | Talent Amplifier

7 months ago

Spot on as usual, Robert Blumofe: cut through the hype and pragmatically identify real-world solutions. Thanks for posting. I remember back in the late 2000s I heard you speak at a Thornton May event. You shared that rather than buying the latest, greatest (and most expensive) hardware for your edge servers, you simply bought dependable, fit-for-purpose servers in bulk with the expectation that they’d fail and could quickly be replaced from local inventory. Rational and common sense when shared, but counterintuitive for most. I subsequently applied that approach in Sub-Saharan Africa and Asia to great advantage. Thanks again.

Matthias Wagner

CTO at Cellino

7 months ago

Well said. Very few purpose-specific AI models have a trillion parameters, but they may have very different throughput and cost requirements. Take, for example, autonomous driving, or therapeutic cell and tissue manufacturing for that matter!

Tim Napoleon

Head Of Partnerships @ Skillz Inc. | Strategic Partnerships | ex-Amazon | ex-Akamai

7 months ago

As we scale workflows like encoding or AI inference, operationalizing costs becomes crucial. Here’s how companies can challenge Nvidia’s market position:
Cost and power efficiency: ARM processors and specialized GPUs offer significant cost and power savings.
Open-source advantage: Robust Linux software stacks provide flexibility and drive continuous innovation.
Example: FFmpeg on ARM chips has dramatically reduced broadcasting costs, showcasing the potential for other industries.
