Adapting to the changes in fundamental forces in GenAI

Over the last two years in the GenAI space, I've been fascinated by how conversations with clients have evolved. Initially, discussions revolved around assessments, strategy building, and prompt engineering. These conversations matured into developing POCs, pilot use cases, and even some production deployments. Nowadays, the focus has shifted towards enhancing customer experience and streamlining processes through GenAI-driven automation. Designing solutions for these use cases often involves Retrieval-Augmented Generation (RAG), prompt engineering, and sometimes an agentic approach. Moreover, more mature customers are now discussing the scalability of their GenAI solutions and addressing performance-related issues.
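For readers less familiar with the RAG pattern mentioned above, here is a minimal, self-contained sketch of the idea: retrieve the most relevant context, then assemble it into the prompt sent to the model. The document list and the keyword-overlap scorer below are illustrative placeholders, not a production design; real deployments typically use embeddings, a vector database, and whichever LLM provider the solution is built on.

```python
# Minimal RAG sketch: retrieve relevant context, then build an augmented prompt.
# The documents and the overlap-based scorer are placeholders for illustration only.

DOCUMENTS = [
    "Our claims platform processes prior-authorization requests within 48 hours.",
    "Refunds for cancelled orders are issued to the original payment method.",
    "GPU quotas for the training cluster are reviewed every quarter.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(query_terms & set(d.lower().split())), reverse=True)
    return scored[:top_k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the retrieved context and the user question into a single prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    # The resulting string would be passed to an LLM of your choice.
    print(build_prompt("How fast are prior-authorization requests handled?", DOCUMENTS))
```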

While Parameter-Efficient Fine-Tuning (PEFT) techniques exist, they remain largely academic from the customers' perspective, partly due to the lack of underlying data foundations for LLM builds and the time and effort involved. Recently, I read a few articles that made me think the fundamental forces behind GenAI will change very soon, bringing discussions around model tuning and custom models to the table. Here's why:
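To make PEFT concrete, below is a minimal sketch of LoRA, one common PEFT technique, using Hugging Face's transformers and peft libraries. The base model name and the LoRA hyperparameters are illustrative assumptions, not recommendations; the point is simply that only a small fraction of parameters becomes trainable, which is what keeps time and cost manageable compared with full fine-tuning.

```python
# Minimal LoRA (a common PEFT technique) sketch with Hugging Face peft.
# Model choice and hyperparameters are placeholders for illustration only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # small example model

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model's parameters

# From here, `model` can be trained with a standard Trainer / PyTorch loop on domain
# data; the base weights stay frozen and only the LoRA adapters are updated.
```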

1. Alternative AI Chips by Cloud Providers: Almost all major cloud providers (Azure, AWS, GCP) have developed their own AI chips designed specifically for model training and inference. Azure has Maia and Cobalt, AWS offers Inferentia and Trainium, and GCP has its TPU. These chips are marketed as faster and cheaper alternatives to Nvidia GPUs.

2. Competition from AMD, Intel, and Others: AMD and Intel have significant order backlogs for their chipsets, promising cheaper and faster options than Nvidia GPUs. This competition is bolstered by substantial VC investment, with $4 billion already funneled into 93 separate efforts according to PitchBook. Nvidia's dominance, with a market cap of $2.7 trillion and $80 billion in annual revenue at 78% gross margins, shows just how lucrative the industry has become. If that were not enough, AI chip sales are expected to hit $400 billion annually within five years, a projection that has sent the rest of the market racing after the opportunity.

3. Edge Device AI Chips: Apple’s latest laptops and tablets, optimized for AI with neural engines, and Qualcomm’s PC chips enabling laptops to run Microsoft AI services, are shifting AI work from server farms to consumer devices. This localizes AI processing, making it more efficient and accessible.

4. OneAPI as an Alternative to CUDA: CUDA has been crucial for building large language models that require hundreds of thousands of GPU cores. However, a coalition of tech companies, including Qualcomm, Google, and Intel, is developing oneAPI technology to create open-source software compatible with multiple AI chips. This move aims to challenge Nvidia's dominance through a cross-device compatible solution.

For practitioners like me, who work with customers on industry-specific problem statements, these changes imply several key points:

1. Cost Mechanics Will Become Crucial: As discussions advance into use cases involving custom model building and fine-tuning, decisions around processors, GPUs, and their associated costs will move to the forefront.

2. Choice of Training Framework and Heterogeneous Computing: Deciding between TensorFlow and PyTorch, and between CUDA and oneAPI, will be essential to avoid vendor lock-in. The ability to build solutions that run across different accelerators, including AMD, Intel, Google TPUs, and Nvidia GPUs, will be critical (see the device-selection sketch after this list).

3. Flexibility, Modularity, Scalability, Interoperability: These qualities will become even more pertinent. The ability to switch between LLM providers, integrate fine-tuned models based on cost dynamics and performance needs, and decide between cloud and edge-device models, as well as between small language models and LLMs, will be crucial topics in client discussions.
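On point 2, here is a minimal sketch of what heterogeneous computing can look like in day-to-day code, using PyTorch's device abstraction so the same script can target Nvidia GPUs (ROCm builds for AMD also report as "cuda"), Intel XPUs, Apple silicon, or plain CPUs. Whether the xpu and mps backends are present depends on your PyTorch build, so treat the checks below as assumptions about the environment rather than guarantees.

```python
# Device-agnostic PyTorch sketch: pick whatever accelerator the current
# environment exposes instead of hard-coding a single vendor's stack.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():
        # Nvidia GPUs; AMD ROCm builds of PyTorch also surface here as "cuda".
        return torch.device("cuda")
    if getattr(torch, "xpu", None) is not None and torch.xpu.is_available():
        # Intel GPUs via the XPU backend (available in newer PyTorch builds).
        return torch.device("xpu")
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        # Apple-silicon devices.
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(4, 4, device=device)
print(f"Running on {device}: {x.sum().item():.3f}")
```

Google TPUs are not covered by this check and would typically require the separate torch_xla package, which is exactly the kind of framework-level dependency that makes these abstraction choices matter.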

Despite these changes, it’s still Day 1 in the field of GenAI, and the real value realization from these solutions is yet to come. The landscape is evolving rapidly, and staying ahead means adapting to these transformative forces.

References

1. "Nvidia dominates the AI chip market, but there's rising competition," CNBC
2. "Exclusive: Behind the plot to break Nvidia's grip on AI by targeting software," Reuters



