GPUs are the new oil - How to survive the GPU War...


Assumption 1: AI models & their capabilities determine how much value someone can derive from AI

Assumption 2: GenAI use cases, and how they are implemented, define the AI value

Assumption 3: It's the AI feature pricing & monetisation strategy that ultimately helps realise the value of an AI implementation

Assumption 4: It's how we govern and create controlled/responsible AI that defines longer-term sustained value.

All four of these assumptions are right. In combination, they determine the value a company can derive from GenAI, and they can become key differentiators to win in the marketplace.

But there's one more key factor that's emerging fast. It's what's causing headlines like these...

  • Nvidia became a $1 trillion company thanks to the AI boom: https://www.theverge.com/2023/5/30/23742123/nvidia-stock-ai-gpu-1-trillion-market-cap-price-value
  • How GPU Shortage is Limiting Potential of ChatGPT: https://www.gizchina.com/2023/06/08/how-gpu-shortage-is-limiting-the-potential-of-chatgpt/
  • To combat GPU shortage for generative AI, startup works to optimize hardware: https://venturebeat.com/ai/to-combat-gpu-shortage-for-generative-ai-startup-works-to-optimize-hardware/
  • Google's Tensor Processing Units & Amazon's Trainium AI Chips: https://www.theregister.com/2022/10/11/google_amazon_ai_chips_nvidia/

They said infrastructure is a commodity, and they were right at the time. They also said that what matters is the value of the software written on top of that infrastructure, which is right again. For a considerable time, it felt that innovation in silicon was good enough, and that all the innovation due was in AI/ML and the other top layers. But now, silicon has once again become the key bottleneck: GPUs are a key ingredient for Generative AI, and they are not available in the quantities companies are looking for. That makes them a newly scarce resource, and it calls for better planning if we want to continue the AI revolution we are currently in...

First of all, what's a GPU?

A GPU, or Graphics Processing Unit, is a specialized electronic circuit or chip that utilizes parallel processing to accelerate and optimize the rendering and processing of computer graphics and images. Composed of multiple processing cores, GPUs excel at handling complex mathematical calculations and data parallelism, allowing for faster and more efficient computation compared to traditional CPUs. While initially developed for graphics rendering in gaming and visual applications, GPUs have evolved to play a crucial role in various fields such as scientific research, machine learning, cryptocurrency mining, and artificial intelligence, where their parallel architecture and high memory bandwidth make them ideal for handling computationally intensive tasks.

GPUs vs CPUs for your ready reference:

  1. Parallel Processing Power: GPUs are highly efficient at parallel processing, making them particularly well-suited for training and running deep learning models that require heavy computation on large datasets (a timing sketch follows this list). CPUs, on the other hand, are better suited for sequential processing and tasks that involve complex control flow.
  2. Accelerated AI Workloads: GPUs often include specialized AI hardware, such as Tensor Cores, which are designed to accelerate specific AI operations like matrix calculations. This hardware acceleration can significantly speed up AI computations compared to CPUs.
  3. Memory Bandwidth and Data Handling: GPUs generally have higher memory bandwidth, enabling faster data transfer and manipulation, which is crucial for handling the massive datasets involved in AI tasks. CPUs, while still capable, may have relatively lower memory bandwidth.
  4. Programming Flexibility: CPUs offer greater flexibility in terms of programming languages and frameworks, making them suitable for a wide range of tasks and software applications beyond AI. GPUs, on the other hand, typically require specific programming languages like CUDA or specialized frameworks like TensorFlow or PyTorch for efficient AI computation.
  5. Scalability and Cost-Effectiveness: GPUs can be easily scaled by running multiple GPUs in parallel, allowing for increased computational power and performance gains in large-scale AI workloads. However, GPUs can be more expensive to procure and maintain compared to CPUs, which are more cost-effective for general-purpose computing tasks.
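To make the parallel-processing point concrete, here is a minimal sketch, assuming PyTorch is installed and a CUDA-capable GPU is present, that times the same large matrix multiplication on CPU and on GPU:

```python
import time
import torch

def time_matmul(device: str, n: int = 4096) -> float:
    """Time an n x n matrix multiplication on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    # Warm-up run so one-time setup cost (e.g. CUDA kernel load) isn't measured
    _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for asynchronous GPU work to finish
    start = time.perf_counter()
    _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start

print(f"CPU: {time_matmul('cpu'):.3f}s")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.3f}s")
```

On most hardware the GPU run is dramatically faster, and this same matrix-multiply workload is the heart of GenAI training and inference, which is exactly why everyone is competing for the same scarce silicon.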

Widespread uses & increasing datacenter-based usage (GenAI & others)

[Chart: widespread GPU uses & growing datacenter demand. Credit: App Economy Insights, Twitter handle: @EconomyApp]


What can I do as CTO/CIO?

  • Start estimating your GenAI compute needs for the next 12-18 months right away. Your Product & Engineering teams should be able to tell you the compute needed per scenario (use case) in terms of # of tokens, total scenarios per customer, and # of customers, and produce percentile-based estimates (a worked estimation sketch follows this list).
  • Share the estimates with your current GenAI provider (OpenAI, Azure, GCP, AWS or others) and block capacity through reservations if you see a compelling ROI/business case.
  • Build a GenAI circuit breaker into the product design - design your product so that if GenAI runs out of capacity or has an outage, you can still run the product's core features without GenAI (a minimal circuit-breaker sketch follows this list). This is your insurance policy for staying in business if your GenAI provider goes down or can't serve you for some reason.
  • Optimize the product to use minimal GPU/tokens. This is possible by using the right model for each use case: some models are good for code generation, others for text summarisation. Using one for the other can burn through GenAI compute and budgets pretty fast (a routing sketch follows this list).
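To illustrate the percentile-based estimation from the first bullet, here is a minimal sketch. All numbers (tokens per scenario, call rates, customer count) are hypothetical placeholders; replace them with figures from your Product & Engineering teams. The Poisson-noise simulation is just one simple way to get p95/p99 figures rather than a single average:

```python
import numpy as np

# Hypothetical inputs -- replace with your own Product & Engineering numbers.
rng = np.random.default_rng(42)

tokens_per_scenario = {           # average tokens per call, per use case
    "summarisation": 1_500,
    "code_generation": 3_000,
    "chat_support": 800,
}
calls_per_customer_per_month = {  # how often a customer triggers each use case
    "summarisation": 40,
    "code_generation": 25,
    "chat_support": 120,
}
num_customers = 5_000

# Simulate per-customer monthly usage with Poisson variance, then take
# percentiles so capacity is planned for peak demand, not just the average.
samples = []
for _ in range(10_000):
    total = 0
    for use_case, tokens in tokens_per_scenario.items():
        calls = rng.poisson(calls_per_customer_per_month[use_case])
        total += calls * tokens
    samples.append(total * num_customers)

p50, p95, p99 = np.percentile(samples, [50, 95, 99])
print(f"Monthly tokens -- p50: {p50:,.0f}, p95: {p95:,.0f}, p99: {p99:,.0f}")
```

The p95/p99 numbers, not the average, are what you take to your provider when negotiating reservations.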
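For the circuit-breaker bullet, here is a minimal sketch of the pattern itself, not any specific provider's API. `summarise_with_llm` and `extractive_summary` are hypothetical functions: the first calls your GenAI provider, the second is a cheap, GPU-free core feature the product falls back to:

```python
import time

class GenAICircuitBreaker:
    """Minimal circuit breaker: after `max_failures` consecutive GenAI errors,
    route requests to a non-AI fallback for `cooldown_seconds`."""

    def __init__(self, max_failures: int = 3, cooldown_seconds: float = 60.0):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # timestamp of when the breaker tripped

    def call(self, genai_fn, fallback_fn, *args, **kwargs):
        # If the breaker is open, skip GenAI until the cooldown elapses.
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown_seconds:
                return fallback_fn(*args, **kwargs)
            self.opened_at = None  # cooldown over -> try GenAI again
            self.failures = 0
        try:
            result = genai_fn(*args, **kwargs)
            self.failures = 0  # success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            return fallback_fn(*args, **kwargs)

# Hypothetical usage -- both functions are placeholders for your own code:
# breaker = GenAICircuitBreaker()
# summary = breaker.call(summarise_with_llm, extractive_summary, document_text)
```

The key design choice is that every request still gets an answer; the product degrades to core features instead of going down with the provider.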
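Finally, for the model-per-use-case bullet, the simplest version is a lookup table that routes each use case to the cheapest model that handles it well, rather than sending everything to the largest (most GPU-hungry) model. The model names here are hypothetical placeholders:

```python
# Hypothetical model IDs -- substitute your provider's actual model names.
MODEL_BY_USE_CASE = {
    "code_generation": "provider-code-model",
    "text_summarisation": "provider-small-fast-model",
    "complex_reasoning": "provider-large-model",
}

def pick_model(use_case: str) -> str:
    """Route each use case to the cheapest adequate model; default to the
    small, fast model so unknown traffic never burns large-model tokens."""
    return MODEL_BY_USE_CASE.get(use_case, "provider-small-fast-model")
```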

In the next article, I will share a few methods for using cloud-based vs on-prem GPU resources for non-prod vs prod workloads, to minimise cost & maximise innovation in your company. Remember, if you are the person responsible for forecasting and planning how to keep your company's GenAI dreams running, then you should read on, and the time to start planning is now...