Platform vs. Pipeline — The Difficult Path to Monetization for Foundation Models

Original Post on Medium: Platform vs. Pipeline — The Difficult Path to Monetization for Foundation Models | by Sam Bobo | Jul, 2024 | Medium

History continues to repeat itself. It's an age-old adage that underpins society and sits constantly at the forefront of the minds of historians and analysts alike. The history I am talking about today is Artificial Intelligence and, more specifically, the monetization of software. Artificial Intelligence is inherently a tool, one that has sent society into a frenzy, from lawmakers to white-collar workers to investors and the like, but in its purest form it is software. Attention mechanisms, deep neural networks, and natural language processing are all algorithms written in computer code, compiling down to 1s and 0s that run on AI accelerator chips, GPUs, and other silicon. All else equal, it is the same as an operating system, a SaaS service, or any other software making the same infrastructure dependency assumptions we made in the past.

Traditionally, closed- or open-source software combines third-party libraries (code that performs a particular function) and layers proprietary (or open/communal) intellectual property on top to produce a particular set of functionality. Taking the user interface out of scope for purposes of this argument, the end result is yet another set of libraries stitched together to form a holistic solution, either to be re-consumed and built upon (say, via an API) or monetized directly. This was the premise behind APIs and Building Blocks, a legacy article I wrote during the early experimental days of my blogging. To make the analogy easier, the forked paths (no software pun intended) are: Solution/Platform or Library.

It's no secret that the cost of training a Large Language Model (LLM) with increasingly large context windows grows astronomically, holding technological advances constant. As I've mentioned in a previous post, LLM providers such as OpenAI, Meta, Anthropic, Mistral, and others are in a never-ending race to build larger and more capable models, with the general populace as the beneficiary:

These intertwined systematic cycles at work create a flywheel engine that hinders LLM providers, removes risk for large technology companies, and drives down prices for consumers to build Generative AI solutions.

This never-ending race operates in the harsh reality that LLMs are inherently a commodity, similar to Conversational AI engines today. As a result, these companies must compete on price, chiefly for inference but also for training by those performing additional tuning; thus, these companies are burning more and more money on what is systematically a losing race. The only way out is to monetize in some capacity, a la the "solution/platform" fork mentioned earlier, but the road to monetization comes at a steep investment.

OpenAI monetized GPT models in two capacities:

  • Solution (ChatGPT) — the household Generative AI solution that immediately comes to mind, with a massive first-mover brand presence similar to that of Kleenex.
  • API Pipeline — a hosted endpoint for accessing GPT models via an API call, for integration and fine-tuning (within a closed tenant).

Let's start with the Solution (ChatGPT).

ChatGPT is a consumer-focused solution developed by OpenAI that presents the user with a chatbot interface for a conversational interaction modality. Users interact with the chatbot by raising questions or making statements and holding a turn-by-turn conversation to ultimately arrive at a conclusion. This can be anything from asking a question and obtaining a response, to generating content such as a study guide, code, or an image, to imbuing ChatGPT with a role to modify the interaction in some manner. (Honestly, I'm not telling you anything you don't already know, but I'm typing it here for illustrative purposes.) ChatGPT's inception involved taking a GPT model such as GPT-3.5 or GPT-4o and employing reinforcement learning from human feedback (RLHF) techniques to guardrail and tune the output diction of the model to be conversational in nature. Furthermore, pre-canned responses for failover scenarios and small-talk banter were included to make the solution more consumer-friendly, similar to the smart assistant devices of the past.

ChatGPT mimics a common pattern with consumer-facing solutions, employing the all-too-common subscription model.

Subscription revenue takes multiple forms, ranging from customers who purchase a subscription and simply forget they are paying for access until they eventually see a credit card bill, to the super users who reach the upper echelon of usage limits. Businesses plan to target the mean usage to balance out the edge cases, assuming usage follows a standard-normal curve. With a subscription typically comes the future promise of upgrades — such as model updates to GPT — and additional functionality.
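To make the economics concrete, here is a minimal sketch, with purely illustrative numbers (not OpenAI's actual costs or prices), of how a flat subscription priced against the mean plays out over normally distributed usage:

```python
import random

# Illustrative assumptions only: a flat monthly price, a per-query
# serving cost, and usage that is roughly normally distributed.
PRICE_PER_MONTH = 20.00          # USD, flat subscription
COST_PER_QUERY = 0.01            # USD, hypothetical serving cost
MEAN_QUERIES, STDDEV = 500, 300  # queries per subscriber per month

random.seed(42)
usage = [max(0.0, random.gauss(MEAN_QUERIES, STDDEV)) for _ in range(100_000)]

# Light users (who "forget they are paying") subsidize the super users;
# the business prices against the mean of the distribution.
avg_cost = COST_PER_QUERY * sum(usage) / len(usage)
print(f"average serving cost per subscriber: ${avg_cost:.2f}")
print(f"margin at the mean: ${PRICE_PER_MONTH - avg_cost:.2f}")
```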

Companies that seek to outsource functionality employ a platform play by creating a two-sided market of consumers and developers; the Apple iPhone and its App Store are an easy example. OpenAI pursued the platform paradigm by announcing the GPT Store. The GPT Store, in effect, allowed people and organizations to create custom "GPTs," or plugins that work with ChatGPT. These GPTs could range from a simple API passthrough, such as Wolfram Alpha for answering mathematical questions or The New York Times for querying news stories, to custom GPTs built by hobbyists and developers looking to make money from novel GPTs using clever prompting, ranging from resume builders to virtual girl(boy)friends and the like. With most stores, 80% is garbage, but the other 20% is highly useful and creates a significant amount of extensible value. That, in turn, attracts new customers, with the originating company taking a rake as a platform fee. The two-sided market approach requires incentivizing one party, typically the developers, to create a flywheel where more value = more users = more incentive to develop, and so on.

The API Pipeline Side

The other approach is to expose functionality as an API and monetize access on a per-usage basis. Many Conversational AI engines operate in this manner, particularly those hosted on hyperscaler Platforms as a Service (PaaS). The pay-per-use (PPU) business model is predicated on a future "API building block" bet that the integrating solution grows to success and scales its usage. Many PPU business models include tiered volume discounting, based on volume precommitments and/or a lower transactional cost past a certain threshold, largely because the per-tenant cost to the model organization decreases at scale. For text-to-speech systems, the unit of measure is per x characters; for natural language understanding, it may be per x classifications; for large language models, it is the token (or 1M tokens, which has become the standard across model providers). A token, as many knowledgeable readers know, is a near-equivalent of a word but may fall at the syllable level in some cases, depending on how the model tokenizes the input and output text. Moreover, input tokens are far cheaper than output tokens due to the cost of inference.

This certainly is the case with OpenAI, which provides pricing schedules based on the model being used.
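As a back-of-the-envelope sketch of how per-token billing adds up (the prices below are hypothetical placeholders, not OpenAI's published rates), consider:

```python
# Hypothetical per-1M-token prices; substitute the provider's actual
# pricing schedule for the model in question.
INPUT_PRICE_PER_1M = 5.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_1M = 15.00  # USD per 1M output tokens (inference-heavy, hence pricier)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single API call under per-1M-token pricing."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M

# A ~2,000-token prompt yielding a ~500-token completion:
per_request = request_cost(2_000, 500)
print(f"${per_request:.4f} per request")
# The "API building block" bet: one million such requests per month.
print(f"${per_request * 1_000_000:,.2f} per month at scale")
```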

The audience for API-based consumption models is typically data scientists and developers, so the marketing of such a solution is grounded in benchmarking and achievement metrics: the vicious cycle model providers face.

Across the AI landscape, this parallelism of go-to-market strategies by model providers (reaching back into the Conversational Era of AI) has become a best practice that incumbents are trying to mimic in the Generative AI era. My argument in crafting this post is that pursuing the common standard might be disadvantageous to incumbent players, a pitfall.

Take Hume.AI as the leading example. Hume deeply rooted itself in scientific prowess within the emotional sciences: cofounded by PhDs, backed by numerous scientific publications, and built on a highly differentiated model that is extremely hard to replicate, warranting a competitive advantage and unique differentiation among Large Language Model players broadly. Hume first exposed an API in beta for emotional inference across video, image, and text modalities, ranging from recognizing emotions in one's facial features, to inflection in one's vocal prosody, to one's diction. Hume thereafter poured R&D resources into building out the Empathic Voice Interface (EVI), an emotionally aware chatbot. Announcement after announcement arrived on Discord channels and LinkedIn (my primary information dissemination channels) about integration with Twilio for voice calls, an iOS application, function calling, and more. Hume quickly erected a solution-side business centered around EVI, clearly following the OpenAI ChatGPT model and a specific phase of standardized go-to-market motions:

Unfortunately, there may exist a secondary force driving these parallel productization efforts: a customer-facing solution a la ChatGPT or EVI may be required to prove out the benchmarking metrics of the underlying APIs, as those benchmarks alone may not be enough to usurp OpenAI's first-mover brand dominance. I personally believe that the convergence on a standardized go-to-market is fueled by the highly accelerated pace of innovation perceived to be required in the Generative AI market since the turn of 2023. There may be alternatives, however, for breaking the vicious cycle:

  • Foundation Models — Traditional foundation model providers such as OpenAI (GPT), Anthropic, Mistral, and others should continue the cycle of innovation, breaking new context window barriers and providing a foundation of ground truth on which to build solutions. These, again, are commodities, with OpenAI holding the first-mover advantage. The only differentiating factors may be licensing agreements and/or government enforcement; we should not, however, fall into the trap of the streaming industry. These foundation models pair synergistically with hyperscaler platforms seeking to amortize compute (after all, many have compute-based deals in lieu of venture capital). Competing with OpenAI is a difficult endeavor; instead, pursue an API pipeline approach to capitalize on the solutions built on top of you in the value chain, a la "API building blocks." Furthermore, I would abandon "skill"-based ecosystems attached to customer-facing solutions, as the extensibility rarely comes to fruition.
  • Specialty Models — Specialty models in adjacent markets (Hume in Emotion AI being the example) should follow suit with Foundation Models: capitalize on solutions incorporating the model(s) into more formal offerings and on the pay-per-use model standard to the market. These specialty providers should seek launch partners that highlight the uniqueness of their model to drive adoption, as well as the attention of hyperscalers to host the model and expose the API in a marketplace-type setting, similar to existing model marketplaces. Ecosystem partners, in Hume's case, could include retail marketing (though fighting against the "creepy" factor) or even plays against traditional sentiment analysis to replace those primitive services. Where specialty model providers are born from inherent data-driven advantages, such as Bloomberg or Pandora, they should themselves be customer zero, expanding the competitive advantages already achieved. Note: this also applies to Small Language Models (SLMs).
  • Solution Providers — Those who build on top of LLMs and specialty models / SLMs should employ a subscription-based model for access to the solution offering. These solutions build upon the AI capabilities inherent in the model(s) and add unique value to fill a market need. I'd make a plea, however, that solution providers steer away from the pitfall of chatbot-only interfaces and invent new modalities of interaction, especially when competing in mature markets, focusing on the Disruption of Design.

I continue to advocate for specialty models and for employing a Mixture of Experts and/or a leader-agent model as a value-optimizing framework for new entrants imbuing AI capabilities into solutions.
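As a minimal sketch of the leader-agent arrangement I have in mind (every model name and the `classify` helper below are hypothetical placeholders, not real endpoints), a lightweight leader classifies each request and routes it to the specialty model best suited to answer:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Specialist:
    """A specialty model wrapped behind a uniform call interface."""
    name: str
    handles: set[str]             # task labels this expert covers
    answer: Callable[[str], str]  # stand-in for a hosted inference API

def leader_route(query: str, classify: Callable[[str], str],
                 experts: list[Specialist]) -> str:
    """Leader-agent pattern: a cheap classifier picks the expert,
    and the chosen specialty model produces the answer."""
    task = classify(query)  # e.g. "emotion", "finance", "general"
    for expert in experts:
        if task in expert.handles:
            return expert.answer(query)
    return experts[-1].answer(query)  # fall back to a generalist

# Usage with stubbed-out experts; a real system would call hosted APIs.
experts = [
    Specialist("emotion-model", {"emotion"}, lambda q: f"[emotion] {q}"),
    Specialist("finance-model", {"finance"}, lambda q: f"[finance] {q}"),
    Specialist("foundation-model", {"general"}, lambda q: f"[general] {q}"),
]
print(leader_route("How upset does this customer sound?",
                   lambda q: "emotion" if "upset" in q else "general",
                   experts))
```

The design choice mirrors the argument above: the generalist foundation model stays a commodity fallback, while differentiated specialty models capture the requests where they add unique value.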

In conclusion, selecting a route to market when presented with the fork is straightforward: models are poised to capitalize on API pipeline routes to market, while solutions should build user bases on a subscription model. Simply put, while building platforms and their associated two-sided markets might be intriguing, it is not always optimal.
