Think Again Before Splurging on Top-end GPUs
We are in the midst of AI mania, and it comes with a close relative: GPU mania. As AI makes its mark on industry after industry, enterprise investment in the infrastructure needed to support AI workloads has turned into a GPU land grab. Many cloud service providers and enterprises are buying up GPUs in anticipation of future AI applications. It’s an “if we build it, they will come” strategy.
But GPUs are not the be-all and end-all, and there’s no one-size-fits-all solution. In fact, as the AI industry shifts from building and pre-training foundation models to using those models for inference and generation, we can expect today’s GPU-heavy compute to spread across a continuum: cutting-edge GPUs for the largest “ask me anything” LLMs, mid-level GPUs for mid-size task-specific LLMs, and software-optimized CPUs for small, low-latency edge LLMs.
It’s clear that many organizations are taking their time to sort through the hype, and rightfully so. AI adoption is uneven across industries right now, with large companies and sectors like manufacturing leading the charge. According to a February 2024 survey from the U.S. Census Bureau, only 5% of U.S.-based companies were using AI at the time, and another 7% planned to adopt it within the next six months.
At Akamai, we’ve been looking closely at how to deploy GPUs, and especially at which GPUs are needed for AI inference at the edge. The answers aren’t necessarily obvious. Here’s why:
1. GenAI is just the tip of the iceberg
When it comes to AI applications and models, think of an iceberg. Above the surface are the GenAI models we’ve all heard of, the ones that garner all of the attention and headlines. But below the surface, there’s so much more: the myriad AI models you’ve never heard of. These are the deep learning models that do things like image classification, anomaly detection, and clustering, and that have been delivering real value to enterprises for a decade or more.
When you realize that AI has been in use by enterprises since well before the current GenAI and GPU mania, it becomes clear that there is significant hype to sort through.
2. You don’t need the best GPUs, you need the right GPUs
GPUs for AI are evolving quickly. For example, if you had bought NVIDIA’s top-end GPUs for AI just a couple of years ago, you’d already be a factor of 32 below the stated performance of the newest generation.
Further, GPUs are not “one size fits all”. They span a wide range of performance capabilities and price points. Before purchasing GPUs, it’s important to know your use case. Are you trying to build your own LLM? Deploy machine learning models at the edge? Leverage deep learning to help address cybersecurity risk? Customize an existing LLM to support your customer experience?
A GPU that is well suited to one AI model may be a poor fit for another. Simply recognizing that the GPUs you buy today might not match the use cases that emerge in the future sets you up for success.
3. AI may not always need GPUs
Finally, it’s important to remember that AI algorithms are also progressing at a breakneck pace. And they aren’t always getting bigger and more complex. A particularly interesting example is the 1-bit transformer from Microsoft. This approach represents certain values in the model with just one bit, encoding only 1 or -1, so matrix multiplication (the core task GPUs do so well) reduces to simple addition and subtraction. The Microsoft work is an extreme example of a technique called quantization, but it suggests that AI models that need GPUs today may not need them tomorrow.
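To make that intuition concrete, here is a minimal, hypothetical sketch (it is not Microsoft’s BitNet code) showing that a matrix-vector product with weights restricted to +1 and -1 needs only additions and subtractions in its inner loop:

```python
import numpy as np

def binarize_weights(w: np.ndarray) -> np.ndarray:
    """Quantize full-precision weights to 1-bit values (+1 or -1) by sign.
    This mirrors the extreme quantization idea behind 1-bit transformers."""
    return np.where(w >= 0, 1, -1).astype(np.int8)

def binary_matvec(w_1bit: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Matrix-vector product with +/-1 weights: every 'multiply' is just
    adding or subtracting an activation, so the inner loop needs no
    floating-point multiplications at all."""
    out = np.zeros(w_1bit.shape[0], dtype=x.dtype)
    for i, row in enumerate(w_1bit):
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

# Toy layer sizes chosen purely for illustration: 4 outputs, 8 inputs.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8))
x = rng.standard_normal(8).astype(np.float32)

w_1bit = binarize_weights(w)
print(binary_matvec(w_1bit, x))          # additions and subtractions only
print(w_1bit.astype(np.float32) @ x)     # same result via a standard matmul
```

The toy sizes and the NumPy implementation are illustrative only; real 1-bit inference kernels pack the weights into bits and use specialized routines, but the arithmetic simplification is the same.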
4. The energy demands of GPUs can’t be ignored
The energy required to power GPUs is a significant concern. As enterprises scale up their AI workloads, the energy demand of GPU-heavy infrastructure grows exponentially. On average, GPUs use more than twice as much power per unit of processing as CPUs.
An interesting point here is that when it comes to AI inference, there are software-led innovations that can help reduce energy demands. Generally, these work by reducing the size of the AI model through various techniques. Today, in the early days of GenAI, we see organizations throwing megawatts at problems that could be solved with milliwatts.
Okay, I might be exaggerating a bit here, but the point still stands. LLMs are being touted as the solution for problems that are more economically solved with small language models (SLMs), deep learning, machine learning, or even plain-old data analytics. If enterprises are willing to take their time and optimize their chosen AI model for their specific use case, they can limit energy demands.
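To put rough numbers behind that claim, here is a hypothetical back-of-envelope sketch. It relies on the common approximation that dense-transformer inference costs about 2 FLOPs per parameter per generated token, and the model sizes below are illustrative assumptions, not measurements:

```python
def flops_per_token(num_params: float) -> float:
    """Approximate inference cost for a dense transformer:
    roughly 2 FLOPs per parameter per generated token."""
    return 2.0 * num_params

slm = flops_per_token(3e9)      # assumed 3B-parameter small language model
llm = flops_per_token(500e9)    # assumed 500B-parameter frontier-scale LLM

print(f"SLM: {slm:.2e} FLOPs/token")
print(f"LLM: {llm:.2e} FLOPs/token")
print(f"Ratio: ~{llm / slm:.0f}x more compute per token for the LLM")
```

Under those assumptions, the smaller model does the same token-generation work with roughly two orders of magnitude less compute, which translates directly into lower energy demand.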
The GPU conversation is complex. But developing a thoughtful strategy that avoids succumbing to the hype is essential for making sound hardware investments. By understanding the broader AI landscape, selecting the right GPUs for your specific needs, and considering the future evolution of AI algorithms, you can position your organization for long-term success.
Thank you for bringing attention to the nuances of GPU selection in the context of AI. Your point about tailoring technology to specific business needs is crucial, as the demands of AI applications can vary significantly across different industries. It would be interesting to hear more about your experiences in evaluating the right mix of GPUs and CPUs. What criteria do you prioritize when making these decisions? Looking forward to further insights from you and the community.
Transformational CIO | Global Leadership & Strategy | Data Advocate | Talent Amplifier
7 months ago
Spot on as usual, Robert Blumofe. Cut through the hype and pragmatically identify real-world solutions. Thanks for posting. I remember back in the late 2000s I heard you speak at a Thornton May event. You shared that rather than buying the latest, greatest (and most expensive) hardware for your edge servers, you simply bought dependable, fit-for-purpose servers in bulk with the expectation that they'd fail and could quickly be replaced from local inventory. Rational and common sense once shared, but counterintuitive for most. I subsequently applied that approach in Sub-Saharan Africa and Asia to great advantage. Thanks again.
CTO at Cellino
7 months ago
Well said. Very few purpose-specific AI models have a trillion parameters, but they may have very different throughput and cost requirements. Take, for example, autonomous driving, or therapeutic cell and tissue manufacturing for that matter!
Head Of Partnerships @ Skillz Inc. | Strategic Partnerships | ex-Amazon | ex-Akamai
7 months ago
As we scale workflows like encoding or AI inference models, operationalizing costs becomes crucial. Here's how companies can challenge Nvidia's market position:
Cost & Power Efficiency: ARM processors and specialized GPUs offer significant cost and power savings.
Open-Source Advantage: Robust Linux software stacks provide flexibility and drive continuous innovation.
Example: FFMPEG with ARM chips has dramatically reduced broadcasting costs, showcasing the potential for other industries.