The Brutal Economics of AI in the Post-Training Era

Artificial intelligence (AI) is becoming table stakes in practically every market niche. And as it evolves in this direction, it’s sure to become a commodity.

Microsoft CEO Satya Nadella recently said as much when he tweeted that “as AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of.”

In economics, one defines a commodity as an offering that has lost differentiation across its supply base. As this happens, the only differentiator between available offerings is price. As products and services undergo commoditization, their prices plunge and profit margins drop ever closer to zero.

In a hotly competitive niche such as generative AI, commoditization is already underway. We can see it in the glut of large language models (LLMs) on the market, which increasingly compete as much on price-performance benchmarks as on features and functions. Free-tier generative AI offerings abound, which helps to explain why OpenAI, DeepSeek, Google, and other providers are attempting to differentiate in such adjacent niches as deep-reasoning, agentic, and copilot AI.


In a market where every vendor is competing against free, it’s not prudent to count one’s profitability chickens before they’re hatched. That’s why I chuckled at a recent headline in which DeepSeek claimed its AI models would have a 545% profit margin if (and it’s a huge “if”) everyone who used them actually paid.

Fat chance! AI solution providers will have a tough time expanding their pool of paying customers for plain-vanilla LLMs. They’ll have just as much trouble getting existing customers to pay more than break-even. In the commoditized AI market that’s emerging, few providers will be able to charge any margin at all for AI’s magical intelligence. The money to be made in AI will come from using it to boost the differentiating value of applications, tools, processes, and goods of every sort.

So don’t be fooled by the current AI gold rush. The economics of the AI industry are growing more brutal every day. Yes, DeepSeek currently boasts that its open-source foundation model is 20-40x cheaper to use than comparable models from OpenAI. And it has slashed the prices for its services, accelerating the price war already overtaking the consumer-facing AI market. Most immediately, this move prompted OpenAI to offer new free versions and Google to introduce lower-cost access tiers for its Gemini AI models.

The price war is likely to continue indefinitely in the AI market. There is plenty of margin to slash, according to recent research such as a study by Aaron Scher of the MIRI Technical Governance Team. It shows that many LLM providers enjoy fat margins, sometimes charging as much as 10x what their lowball competitors charge to serve the exact same open-weight LLMs.

My sense is that this price differential reflects the contrasting go-to-market strategies of rival LLM providers. The providers charging the largest markups are probably trying to get early adopters to fund the exorbitant upfront costs of building and pre-training their models. The providers charging much less are probably trying to acquire as many customers as possible, and hence scale economies, before the LLM market shakes out.

Scher also observes that providers of proprietary LLMs are charging substantial markups, taking advantage of the fact that their models entail high development costs and face limited or no competition in serving them. These providers are almost certainly attempting to gain first-mover advantage, accrue high early-adopter margins, and build customer bases they can upsell to premium offerings such as agentic AI and deep-reasoning solutions.

One thing that’s for sure in today’s AI market is that DeepSeek and all other providers—as they watch the prices of their offerings plummet—will continue to burn ungodly amounts of cash to grow their businesses.

Even DeepSeek won’t be able to sustain its price-performance advantage in the niches, such as deep reasoning, where it has a foothold. Its splashy rollout signaled the start of a new paradigm for building smarter inferencing: the post-training era, in which models are improved through a multistage blend of reinforcement learning, supervised fine-tuning, and distillation. DeepSeek itself helped to set this trend in motion with its approach of “inference-time compute scaling” (which some refer to as “test-time compute scaling”).

Before we discuss what this new AI approach entails, let’s examine the state of the art heretofore in scaling LLMs. In the AI era that’s now receding into the past, scaling LLMs and other foundation models has required building ever-larger models and using unsupervised learning to pre-train them on ever-larger datasets. Generally, this has required more powerful AI-optimized chips and server clusters. In this regard, generative AI early movers have had a multi-year head start, along with ample data and the financial and technical resources to train and serve ever larger, faster, and more accurate models for a wide range of use cases.

However, in the past year or more, this approach has started to yield diminishing returns in the performance of LLMs and other frontier models. Also, plain-vanilla LLMs are unsuited to the new challenges in the domain of “deep reasoning,” which focuses more on scaling inferencing workloads executed in quasi-real time in response to user prompts than on training jobs that may take days or weeks to complete.

Inference-time compute scaling, which DeepSeek pioneered and other providers have since widely adopted, is well suited to complex reasoning, mathematical problem solving, code generation, and other use cases where high accuracy is essential, even if it requires somewhat longer response times than simply serving prompts against a plain LLM.

Under this approach, AI professionals rely on the following techniques:

  • Increase computational resources dynamically during AI model inferencing;
  • Execute longer, more complex chains of thought by performing additional calculations at runtime;
  • Leverage reinforcement-learning reward models to assess the quality of generated responses and thereby guide search towards better solutions;
  • Explore more potential solutions to complex problems;
  • Refine output iteratively through additional computation steps, identifying and correcting errors in previous iterations;
  • Generate multiple candidate answers and use a dedicated "verifier" model to select the best option based on its quality; and
  • Adjust the amount of compute used dynamically based on the complexity of the input prompt, allocating more resources to challenging problems.
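Several of the techniques above can be sketched together in a few lines. The following is a minimal, illustrative Python sketch of the best-of-N flavor of inference-time compute scaling; `estimate_complexity`, `generate_candidate`, and `verifier_score` are hypothetical stand-ins for a real complexity estimator, LLM sampler, and verifier/reward model.

```python
import random

def estimate_complexity(prompt: str) -> int:
    """Crude stand-in for a complexity estimator: allocate more
    samples to longer prompts (illustrative heuristic only)."""
    return min(16, 2 + len(prompt.split()) // 5)

def generate_candidate(prompt: str, rng: random.Random) -> str:
    """Stub for a model call; a real system would sample an LLM here."""
    return f"answer-{rng.randint(0, 9)} to: {prompt}"

def verifier_score(prompt: str, candidate: str) -> float:
    """Stub verifier: a real system would score each candidate with a
    trained reward model; here a toy hash-based heuristic ranks them."""
    return -abs(hash((prompt, candidate)) % 100 - 50) / 50.0

def best_of_n(prompt: str, seed: int = 0) -> str:
    """Best-of-N selection: spend more samples on harder prompts,
    then let the verifier pick the highest-scoring candidate."""
    rng = random.Random(seed)
    n = estimate_complexity(prompt)  # dynamic compute budget
    candidates = [generate_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: verifier_score(prompt, c))
```

The design choice to make here is where the extra compute goes: more candidates, longer chains of thought per candidate, or iterative self-correction passes. This sketch shows only the first, simplest lever.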

One downside to this approach is that it may trade off inferencing speed for greater depth and accuracy of reasoning.

Nevertheless, it’s becoming clear that the future of AI in deep reasoning and other frontier use cases will require further innovations in inference-time compute scaling. This became evident when OpenAI’s new GPT-4.5 release landed with a thud. To the dismay of many, GPT-4.5 delivers marginally better performance than GPT-4o at 30 times the cost for input tokens and 15 times the cost for output tokens. This points to diminishing returns from using unsupervised learning alone to train LLMs.
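To see why those multiples raised eyebrows, consider a back-of-the-envelope comparison. The per-million-token prices below are assumed, illustrative figures chosen only to match the 30x/15x multiples cited above; actual prices vary by provider and change often.

```python
# Hypothetical per-million-token prices (USD), chosen only to
# illustrate the 30x input / 15x output multiples cited above.
BASELINE = {"input": 2.50, "output": 10.00}            # GPT-4o-class, assumed
PREMIUM = {"input": 2.50 * 30, "output": 10.00 * 15}   # GPT-4.5-class, assumed

def job_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a job at the given per-million-token prices."""
    return (prices["input"] * input_tokens
            + prices["output"] * output_tokens) / 1_000_000

# A chat-style workload: 800k input tokens, 200k output tokens.
cheap = job_cost(BASELINE, 800_000, 200_000)   # 4.0
pricey = job_cost(PREMIUM, 800_000, 200_000)   # 90.0
blended_multiple = pricey / cheap              # 22.5x for this token mix
```

For a marginal quality gain, the blended bill on this workload mix comes out more than 20x higher, which is the “thud” in a nutshell.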

For its part, OpenAI seems to acknowledge this truth by referring to the release as a “research preview” and nudging industry attention toward its upcoming GPT-5. GPT-4.5 is slated to be the last of OpenAI’s traditional AI models, with GPT-5 planned as a dynamic combination of “non-reasoning” LLMs (such as GPT-4.5) and deep-reasoning models built on inference-time compute scaling, such as OpenAI’s o3 (and DeepSeek’s R1).

Before long, every contender in the generative AI arena will adopt inference-time compute scaling for the most demanding deep-reasoning and agentic use cases. Unsupervised learning, the legacy approach used for LLMs and other non-reasoning AI, will endure but will increasingly be a dead end when what’s needed is greater model accuracy, precision, and trustworthiness in decision support.

Going down that road will require AI providers to plow more cash into inference computing infrastructure, which won’t come cheap. All of these investments must be made while the AI market as a whole, spanning generative, agentic, reasoning, and other segments, keeps its head above water even as the products and services it provides are commoditized to the point of razor-thin margins.

These AI markets are increasingly open, in models, APIs, platforms, tools, and the like. So it’s not likely that first movers will be able to lock in their customers through technical means. In the battle for customers, share, and profitability, the ultimate winners in this new AI-centric economic order will be providers with the broadest product and service portfolios, the most extensive partner ecosystems, the most scalable and cost-efficient operations, and the most reliable supplies of GPUs and other AI-optimized chipsets.

Looking ahead, the AI market might stratify into mass-market providers of low-cost, commoditized generative services that leverage pre-training, versus premium-priced, domain-focused deep-reasoning services in which the value is produced in post-training through inference-time compute scaling and other approaches for boosting AI’s ability to outdo human intelligence.

But it’s clearly too early to say who will come out on top in the AI market. Merger and acquisition activity will keep shaking up this arena for some time to come. Nevertheless, the providers with the deepest pockets will remain the ones to watch. Yes, that means the billionaires who now occupy an outsized presence in the geopolitical sphere. Elon Musk, for example, has invested heavily in inference-time compute scaling, and it’s central to his xAI Grok 3 technology.

As recent world events have shown, Musk has a way of making things happen. But it’s just as likely that the next world-shaking AI innovations might come again from China, or India, or the United States, or pretty much anywhere else on this planet.

Smart people are producing brilliant AI innovations everywhere. And smart money has a way of finding them.
