Foundation Model Battle: Why OpenAI Should Use That $6.6B To Build An Enterprise Application Moat
If you want to understand just how red-hot #GenAI investments are, consider that Paris-based poolside announced a $500M early-stage funding round last week and the news barely made a ripple because...
... OpenAI confirmed later that day that it had closed its latest funding round of $6.6B at a $157B valuation. Because much of the funding is in convertible notes, OpenAI will eventually change its structure from a non-profit to a for-profit business. Despite the sudden departure of several key executives, investors appear to remain confident in CEO Sam Altman's ability to steer the ship.
But where is he taking OpenAI? This was, in part, the topic of my newsletter last week just before the funding news dropped.
OpenAI has been hugely successful by most user metrics. The $3.6 billion in revenue it is reportedly on track to earn this year puts it well ahead of Altman's early revenue projections. But that growth has been built largely on consumer products and usage: today, consumer revenue is $2.7B, while enterprise revenue is $1B.
The result: the company is on track to lose $5 billion this year, according to Reuters.
No matter how many checks investors write, that is not sustainable, especially with fast-moving competitors like Nvidia, whose new NVLM 1.0 family of large multimodal language models was just released.
I believe that Altman understands this. The latest funding and governance change signals that he knows the key to OpenAI's future is to build a moat around its business by focusing on enterprise applications.
As I wrote last week:
The real potential for foundation models lies in business applications. While consumers may gravitate toward one dominant platform — OpenAI’s ChatGPT, for instance — the enterprise sector is a much more complex and significant battleground. Enterprises with larger budgets and complex needs are the driving force behind the evolution and diversification of this market, and understanding their role is crucial for investors. The race to capture this enterprise market is crucial. Business applications are where foundation models can generate sustainable profits, especially since consumer-facing products tend to operate in a winner-takes-all landscape. Though less shiny, this is where profitability will be ultimately found.
Just a few weeks earlier, OpenAI caused a stir with the release of its o1-preview model, code-named “Strawberry.” Pitchbook noted that this represents a new category of LLMs with more powerful reasoning capabilities that simulate human thinking and a new avenue for applications and investment.
Even as consumers raced to understand the implications, Microsoft signaled that the real action would be in the enterprise by highlighting all the ways the new o1 model was already integrated into its Azure platform and inviting select enterprise users to get early access.
This transformation is happening at a blistering pace, and the competition is becoming more intense. My full breakdown is below, including my advice for how investors need to rethink their understanding of valuations in an era that has gone beyond disruption into discontinuity.
GenAI Foundation Models: The LLM Race Has Only Just Begun
The generative AI boom continues and shows no signs of slowing down. According to IDC, enterprise spending on generative AI will surge from $16 billion in 2023 to $143 billion by 2027.
So far, the most significant investments have been in the companies building the foundation models that enable the new technology. The sums have gotten so large and the usage so widespread that, on the surface, the LLM market looks mature and dominated by just a handful of names: OpenAI, Anthropic, Meta, Google, Mistral, and Cohere. But just today, Nvidia released a family of powerful open-source models able to compete with the leaders.
One year after writing about the market for GenAI foundation models, I wanted to revisit the topic and understand what has changed. Note that I am excluding Nvidia's just-released NVLM 1.0 family of large multimodal language models from my analysis.
My key takeaways: Winners are emerging in the foundation model race, though innovation continues at full speed. Understanding those potential investment prospects and valuing them correctly requires rethinking the analysis of companies building foundation models.
To get an idea of just how dynamic the market for LLMs remains, glance at Stanford's Holistic Evaluation of Language Models (HELM), widely considered the gold standard for evaluating and ranking foundation models. The current leaderboard highlights the number of models vying for technical supremacy, and just how much more powerful successive versions have become. Two small models are in the top 6, including Mistral's 7B model, while Meta dominates the ranking with its 70B and 65B models.
Chatbot Arena, an open-source platform for evaluating AI created by researchers at UC Berkeley SkyLab and LMSYS, confirms the same evolution but with different results. Chatbot Arena evaluates human preference and shows OpenAI dominating the race, though some lesser-known names remain in the top rankings. Though it has gotten less public attention, xAI's Grok-2 (standard and mini) is in the top 10.
Among the LLM leaders, the market is evolving and diversifying at an accelerated pace. OpenAI caused a stir in September with the release of its o1-preview model, code-named “Strawberry.” Pitchbook noted that this represents a new category of LLMs with more powerful reasoning capabilities that simulate human thinking and a new avenue for applications and investment. As General Catalyst principal Chris Kauffman told Pitchbook: “There’s this whole new room for competition.”
This LLM technical arms race doesn't yet include companies whose models have yet to appear but which have already received huge amounts of funding. That includes Safe Superintelligence, the company founded by former OpenAI Chief Scientist Ilya Sutskever, which just raised $1 billion to build AI systems that are both safe and more powerful.
Such rapid developments remind us that GenAI is still in its infancy as a technology. Some LLM companies have already closed their doors, and others have been effectively acquired (Character.ai's founding team was absorbed by Google, and Inflection's by Microsoft), but new ones keep emerging.
Amid this technical upheaval, investors have a few choices. They could sit on the sidelines and wait for safer, more predictable opportunities to emerge (if they ever do!). Or they could treat LLMs like an index and bet on all (or most) of them, hoping a winner-take-all market eventually offsets the losing bets. Fortunately, there is a third option: much more financial data is now available, allowing investors to conduct thorough due diligence, understand the unique fundamentals of each company, and cherry-pick the companies with the highest potential to be winners.
To understand how to frame that analysis, I want to focus on the names noted above that are at the forefront of this foundation model race (for now!): OpenAI, Anthropic, Meta, Google, Mistral, xAI, and Cohere. These are the heavyweights vying for dominance in the broader business application market, where deep-pocketed enterprises represent the most significant revenue opportunity. I’m excluding companies that are developing their models specifically for internal use.
The real potential for foundation models lies in business applications. While consumers may gravitate toward one dominant platform — OpenAI’s ChatGPT, for instance — the enterprise sector is a much more complex and significant battleground. Enterprises with larger budgets and complex needs are the driving force behind the evolution and diversification of this market, and understanding their role is crucial for investors.
The race to capture this enterprise market is crucial. Business applications are where foundation models can generate sustainable profits, especially since consumer-facing products tend to operate in a winner-takes-all landscape. Though less shiny, this is where profitability will be ultimately found.
I’ll use our Advanced Growth Intelligence (AGI) methodology to explain how investors can better understand the LLM market. The AGI methodology is a comprehensive approach to analyzing companies’ growth potential and resilience in the genAI sector.
Quality of Revenue
Foundation model companies operate on intricate revenue models, primarily driven by API access rather than subscriptions. Companies like OpenAI and Anthropic do offer consumer subscriptions for chatbots (e.g., ChatGPT Plus or Claude Pro), but these represent only a fraction of the revenue potential, and there's little room for differentiation.
The bulk of income comes from API access, both to the model itself and to its fine-tuning, with businesses paying per token at rates that scale with model complexity. For example, OpenAI charges $0.03 per 1,000 input tokens for GPT-4 (8k context), compared to $0.002 for GPT-3.5 Turbo. This token-based pricing structure has significant implications for valuations because it introduces uncertainty into revenue projections. It also complicates the assessment of annual recurring revenue (ARR), which is critical for valuation models: the quantum at stake is unknown. To address the difficulty of establishing a trustworthy revenue baseline, the key is to build the right approach to determine API Access ARR (similar to transaction-based ARR). This allows foundation model valuations to be rationalized. (See our previous analysis of Mistral AI.)
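As an illustration of how an API Access ARR figure can be derived, here is a minimal sketch that annualizes one month's token consumption at per-token rates. The model names, usage volumes, and prices are hypothetical placeholders, not actual vendor figures.

```python
# Sketch: deriving an "API Access ARR" figure (analogous to transaction-based
# ARR) by annualizing one month's token consumption. Model names, volumes,
# and per-token prices are hypothetical placeholders.

MONTHLY_TOKENS = {                 # tokens consumed this month, by model
    "flagship": 40_000_000_000,
    "small": 25_000_000_000,
}
PRICE_PER_1K_TOKENS = {            # USD per 1,000 tokens (illustrative)
    "flagship": 0.03,
    "small": 0.002,
}

def api_access_arr(monthly_tokens, price_per_1k):
    """Annualize the current month's token revenue run-rate."""
    monthly_revenue = sum(
        tokens / 1_000 * price_per_1k[model]
        for model, tokens in monthly_tokens.items()
    )
    return monthly_revenue * 12

print(f"API Access ARR: ${api_access_arr(MONTHLY_TOKENS, PRICE_PER_1K_TOKENS):,.0f}")
```

The run-rate is only as trustworthy as the stability of the monthly usage behind it, which is exactly why token-based ARR demands the cohort-level scrutiny discussed in the sections that follow.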
Even so, a word of caution: applications are being built to optimize token costs. That is both a tailwind, as foundation model usage increases when costs fall, and a headwind, as revenue compresses when applications require fewer tokens.
Foundation model companies can enhance the quality of their revenue by building high-margin professional services fees, in addition to token revenue, on enterprise use cases. Though this is not "technological" per se, and thus drives a lower valuation than license revenue, it could become a critical revenue stream if token prices start to become commoditized.
Quality of Growth
The immediate goal for foundation models is to transform business applications from objects of curiosity into business-critical applications. The ability to catalyze business apps that become vital to customers will determine the winners of the foundation model race.
So far, enterprises have been experimenting with many models. In only a few cases has the technological choice of a foundation model been made definitive.
Several factors seem to drive decisions for businesses and developers:
Gross Retention Rate (“GRR”) is a critical indicator that confirms this choice as it measures a company’s ability to retain token revenue. However, assessing GRR is complex because revenue (as noted above) is token-based rather than subscription-based. While GRR analysis should be performed on overall revenue, investors can take a deep dive into both use cases and products to assess the Quality of Growth at the cohort level.
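To make the token-revenue GRR idea concrete, here is a minimal sketch computing gross retention at the cohort level; the customer names and revenue figures are invented for illustration, and each customer's retained revenue is capped at its prior-period level so expansion is excluded, as GRR requires.

```python
# Sketch: gross retention rate (GRR) on token revenue, computed per cohort.
# Customer names and token spend (in $k per quarter) are hypothetical.

def grr(prior: dict, current: dict) -> float:
    """Gross retention = retained revenue / prior revenue.

    Each customer's retained revenue is capped at its prior-period level,
    so expansion does not inflate the ratio (that belongs in NRR, not GRR).
    """
    retained = sum(min(prior[c], current.get(c, 0.0)) for c in prior)
    return retained / sum(prior.values())

# One enterprise cohort across two quarters
q1 = {"acme": 100, "globex": 80, "initech": 50}
q2 = {"acme": 120, "globex": 40, "initech": 0}   # acme expanded, initech churned

print(f"Cohort GRR: {grr(q1, q2):.1%}")
```

Running this per use case and per product, rather than only on overall revenue, is what surfaces where token spend is sticky and where it is quietly leaking away.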
Use cases first: which ones are supported by a company's different models, and how sticky is each model? For example, in the case of customer chatbots, are customers sticking with GPT-3 (text-davinci-002) or moving to other LLMs? This provides better insight into where the spending is going and helps establish the sustainability of the company's growth.
As model costs increase and bottlenecks appear, comfort with growth prospects at iso-product (holding the product mix constant) is also necessary. Growth trends should be evaluated on the current models sold to enterprise customers, with improving technology and performance, product after product, creating new avenues of growth rather than substituting for existing ones.
That leads to the other critical factor for assessing the Quality of Growth: the cost of customer acquisition (CAC). Beyond the initial virality that tempts many to try foundation models, these companies eventually compete for enterprise IT budgets. As such, the ability to build an effective go-to-market strategy (at scale) is critical. However, assessing CAC becomes complex when foundation model companies need to target both developers and customers, as in the case of open-source models like the Mistral AI example.
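One hedged way to make CAC concrete under usage-based revenue is a payback sketch: the number of months until cumulative gross profit on a customer's token spend covers the acquisition cost. All figures below are hypothetical illustrations, not data on any actual vendor.

```python
# Sketch: CAC payback under token-based (usage) revenue. Unlike a fixed
# subscription, payback depends on how quickly the customer's token usage
# ramps. CAC, the usage ramp, and the gross margin are all hypothetical.

def cac_payback_months(cac, monthly_token_revenue, gross_margin=0.6):
    """Months until cumulative gross profit on token revenue covers CAC."""
    cumulative = 0.0
    for month, revenue in enumerate(monthly_token_revenue, start=1):
        cumulative += revenue * gross_margin
        if cumulative >= cac:
            return month
    return None  # not recovered within the observed window

# A customer ramping token usage over its first year ($k per month)
ramp = [5, 8, 12, 18, 25, 30, 32, 33, 34, 34, 35, 35]
print(cac_payback_months(cac=60, monthly_token_revenue=ramp))
```

The same CAC can imply a fast or a painfully slow payback depending on the ramp, which is why cohort-level usage curves matter more here than a single blended CAC number.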
Quality of Margins
Foundation model companies are trying to optimize compute costs; OpenAI's o1-mini is a good example. While some argue that LLMs are becoming commoditized, OpenAI's o1 costs as much as 20x more per token than its mini counterpart.
That will help offset margin pressures to a degree. However, foundation model companies still face the challenge of staying ahead of commoditization by rolling out new and more powerful versions, and the computing required means each new generation costs more to develop.
Those computing costs have the potential to sink LLM companies. Google's Gemini model reportedly cost $191M to train, and OpenAI's GPT-4 $78M. Researchers are working at full speed to make training more compute-efficient, but it will remain a struggle for now.
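For intuition on where such training bills come from, here is a back-of-envelope sketch using the widely cited ~6 × parameters × tokens FLOPs approximation for transformer training. The throughput, utilization, and GPU pricing assumptions are mine for illustration, not figures from any vendor.

```python
# Sketch: back-of-envelope training cost via the common ~6 * N * D FLOPs
# approximation (N = parameter count, D = training tokens). The per-GPU
# throughput, utilization, and hourly price are illustrative assumptions.

def training_cost_usd(params, tokens, peak_flops_per_gpu=1.0e15,
                      utilization=0.4, usd_per_gpu_hour=2.5):
    """Estimate training cost in USD from model and hardware assumptions."""
    total_flops = 6 * params * tokens
    effective_flops_per_gpu_hour = peak_flops_per_gpu * utilization * 3600
    gpu_hours = total_flops / effective_flops_per_gpu_hour
    return gpu_hours * usd_per_gpu_hour

# A hypothetical 70B-parameter model trained on 2T tokens
cost = training_cost_usd(params=70e9, tokens=2e12)
print(f"Estimated training cost: ${cost:,.0f}")
```

The estimate scales linearly in both parameters and tokens, which is exactly why each new frontier generation, trained on more data with more parameters, costs a multiple of the last.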
The most significant component of computing costs is GPU spend, whether through direct hardware purchases or as part of computing services. We have moved away from the potential for GenAI to be a black swan event, but the viability of these costs remains central. We had highlighted this as a key risk, in addition to commoditization, in our analysis of Mistral's economics, but it applies to all LLMs.
When evaluating margin quality for foundation models, investors need to recognize a fundamental accounting distinction with major P&L impact: Opex vs. Capex.
Here's why: GPU costs for development and training can potentially be treated as either, and the criteria may well depend on the actual use of the GPU. These companies are evolving far from public scrutiny, so the temptation for accounting manipulation is great. The accounting rules are complex, and investors must understand how potential investment targets are applying them.
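A toy P&L shows why the distinction matters: the same GPU outlay produces very different reported operating margins depending on whether it is expensed immediately (Opex) or capitalized and depreciated over several years (Capex). All figures are hypothetical.

```python
# Sketch: the same GPU spend, expensed vs. capitalized, and its effect on
# reported operating margin. Revenue, opex, GPU spend, and the useful life
# are hypothetical illustrations.

def operating_margin(revenue, other_opex, gpu_spend,
                     capitalized=False, useful_life_years=4):
    """If capitalized, only one year of straight-line depreciation hits the P&L."""
    gpu_expense = gpu_spend / useful_life_years if capitalized else gpu_spend
    return (revenue - other_opex - gpu_expense) / revenue

REVENUE, OTHER_OPEX, GPU_SPEND = 1000.0, 400.0, 400.0   # $M, illustrative

print(f"Expensed (Opex):     {operating_margin(REVENUE, OTHER_OPEX, GPU_SPEND):.0%}")
print(f"Capitalized (Capex): {operating_margin(REVENUE, OTHER_OPEX, GPU_SPEND, capitalized=True):.0%}")
```

The cash outflow is identical in both scenarios; only the reported margin changes, which is why investors should normalize the treatment across targets before comparing them.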
Rising compute costs give an advantage to smaller models that work on smaller data sets. In our view, though, the win goes to applications: they are undifferentiated at the model layer and can pass the costs on to customers.
We should note other costs, such as data labeling, security, and electricity and cooling. But the one that looms largest is R&D.
Within R&D, investors should pay specific attention to alignment costs, a fundamental factor ensuring AI systems act in human interest.
Single Points of Failure
Foundation model development has a clear bottleneck: it needs staggering power and digital infrastructure to keep advancing. The computing demanded by transformer-based models requires far more energy than previous technological innovations.
Of course, this is a huge opportunity for private market investors. BlackRock CEO Larry Fink recently characterized this as a “multi-trillion long-term investment opportunity” when announcing a new $30 billion AI energy fund with Microsoft.
Despite the opportunity, foundation model growth is limited in the short term. For example, Llama 4 is reportedly 100x more compute-intensive than Llama 3. This is part of the gap between AI revenues and the growing cost of infrastructure needed to support this growth, which Sequoia's David Cahn dubbed "AI's $600 billion question."
Finally, today's LLMs have token limits. As some models become larger, many research teams are trying to go beyond the transformer paradigm and develop new variants.
A Marathon, Not a Sprint
It’s too early to declare an LLM winner as things stand today.
OpenAI has a clear lead, and Anthropic and Mistral appear to be serious challengers, thanks to their traction at the enterprise level. However, only the ability to sustain long-term growth and resilience will determine the winner.
Adoption, monetization, and margin optimization, along with the criteria I’ve outlined, will largely determine this. OpenAI’s leadership may be challenged if Mistral or Anthropic are able to navigate the complex landscape of business applications, manage costs, and drive sustainable growth. It could also be challenged by bold moves from existing or new actors in the LLM space.
In the meantime, the ability to attract funding is a strong determinant of success, given the slower adoption and monetization of business applications versus the cost of building LLMs. Investors have incredible sway in determining who will win — perhaps more so than we’ve ever seen in Tech.
The LLM competition has just started. Investors need the right tools to get into the game.
Read the full article here: https://raphaelledornano.medium.com/genai-foundation-models-the-llm-race-has-only-just-begun-but-it-has-its-favorites-827b05e9d601
Sign up for my newsletter here: https://dornanoco.substack.com/