Platform vs. Pipeline — The Difficult Path to Monetization for Foundation Models
Original post on Medium: "Platform vs. Pipeline — The Difficult Path to Monetization for Foundation Models," by Sam Bobo, July 2024.
History continues to repeat itself. It's an age-old adage that underpins society and stays at the forefront for historians and analysts alike. The history I am talking about today is Artificial Intelligence, and more specifically, the monetization of software. Artificial Intelligence is inherently a tool, one that has sent society into a frenzy, from lawmakers to white-collar workers to investors, but in its purest form it is software. Attention mechanisms, deep neural networks, and natural language processing are all algorithms written in computer code, compiling down to 1s and 0s that run on AI accelerator chips, GPUs, and other silicon. All else equal, it is the same as an operating system, a SaaS service, or any other software making the same infrastructure assumptions we made in the past.
Traditionally, closed- or open-source software combines third-party libraries (code that performs a particular function) with proprietary (or open, communal) intellectual property layered on top to produce a particular set of functionality. Setting the user interface aside for purposes of this argument, the end result is yet another set of libraries stitched together into a holistic solution, either to be re-consumed and built upon (say, via an API) or monetized directly. This was the premise behind "APIs and Building Blocks," a legacy article I wrote during my early experimental days of blogging. To make the analogy easier, the forked paths (no software pun intended) are either: Solution/Platform or Library.
It's no secret that, holding technological advances constant, the cost of training a Large Language Model (LLM) with ever-larger context windows grows astronomically. As I've mentioned in a previous post, LLM providers such as OpenAI, Meta, Anthropic, Mistral, and others are in a never-ending race to build larger and more capable models, with the general populace as the beneficiary:
These intertwined systematic cycles at work create a flywheel engine that hinders LLM providers, removes risk for large technology companies, and drives down prices for consumers to build Generative AI solutions.
This never-ending race operates in the harsh reality that LLMs are inherently a commodity, much like Conversational AI engines today. As a result, these companies must compete on price, chiefly inference, but also training for those performing additional tuning, and so they burn more and more money on what is systematically a losing race. The only way out is to monetize in some capacity, a la the "solution/platform" fork mentioned earlier, but the road to monetization comes at a steep investment.
OpenAI monetized GPT models in two capacities:
Let's start with the Solution (ChatGPT).
ChatGPT is a consumer-focused solution developed by OpenAI that presents the user with a chatbot interface for a conversational interaction modality. Users interact by asking questions or making statements, holding a turn-by-turn conversation to ultimately arrive at a conclusion. This can be anything from asking a question and obtaining a response, to generating content such as a study guide, code, or an image, to imbuing ChatGPT with a role that modifies the interaction in some manner. (Honestly, I'm not telling you anything you don't already know, but I am typing it here for illustrative purposes.) ChatGPT's inception involved taking a GPT model such as GPT-3.5 or GPT-4o and employing reinforcement learning from human feedback (RLHF) to guardrail and tune the output diction of the model to be conversational in nature. Furthermore, pre-canned responses for failover scenarios and small-talk banter were included to make the solution more consumer-friendly, similar to the smart assistant devices of the past.
ChatGPT follows a common pattern for consumer-facing solutions, employing the all-too-common subscription model.
Subscription revenue takes multiple forms, ranging from customers who purchase a subscription and simply forget they are paying for access until they eventually see a credit card bill, to the super users who reach the upper echelon of usage limits. Businesses price to the mean usage on a roughly normal curve, letting the two edge cases balance each other out. With a subscription typically comes the future promise of upgrades, such as model updates to GPT, and additional functionality.
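To make that averaging concrete, here is a back-of-the-envelope sketch in Python. The price, per-query inference cost, and usage distribution are all invented for illustration; none of these figures come from OpenAI.

```python
import numpy as np

# Hypothetical numbers for illustration only: a $20/month subscription,
# an assumed $0.01 marginal inference cost per query, and monthly usage
# that is roughly normally distributed across subscribers.
rng = np.random.default_rng(seed=42)

price_per_month = 20.00
cost_per_query = 0.01
mean_queries, sd_queries = 600, 300

# Simulate 100k subscribers; usage can't go below zero.
usage = np.clip(rng.normal(mean_queries, sd_queries, 100_000), 0, None)
margin = price_per_month - usage * cost_per_query

print(f"mean margin per user: ${margin.mean():.2f}")
print(f"share of users unprofitable: {(margin < 0).mean():.1%}")
```

The forgetful light users subsidize the super users; as long as the mean sits comfortably below the flat price, the subscription clears a margin.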
Companies that seek to outsource functionality employ a platform play by creating a two-sided market of consumers and developers; the Apple iPhone and its App Store are the easy example. OpenAI pursued the platform paradigm by announcing the GPT Store. The GPT Store, in effect, allowed people and organizations to create custom "GPTs," or plugins, that worked with ChatGPT. These could range from a simple API passthrough, such as Wolfram Alpha for answering mathematical questions or The New York Times for querying news stories, to custom GPTs built with clever prompting by hobbyists and developers looking to make money, from resume builders to virtual girl(boy)friends and the like. As with most stores, 80% is garbage, but the other 20% is highly useful and creates a significant amount of extensible value. That value attracts new customers, with the originating company taking a rake as a platform fee. The two-sided market approach requires incentivizing one party, typically the developers, to create a flywheel where more value means more users, which means more incentive to develop, and so on.
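As a rough illustration of that flywheel, the toy simulation below uses invented growth coefficients and a hypothetical 30% rake; it captures the dynamic, not any real GPT Store data.

```python
# Toy model of a two-sided platform flywheel. All coefficients are
# invented for illustration; nothing here reflects real marketplace data.
users, developers = 1_000, 50
rake = 0.30                # platform's hypothetical cut of plugin revenue
revenue_per_user = 0.50    # assumed monthly spend routed through plugins

for month in range(1, 13):
    value = developers * 0.8            # each plugin adds a bit of value
    users += int(value * 10)            # value attracts users
    developers += int(users * 0.001)    # users attract developers
    platform_cut = users * revenue_per_user * rake
    print(f"month {month:2d}: {users:7,d} users, "
          f"{developers:5,d} developers, platform rake ${platform_cut:,.0f}")
```

Each side's growth feeds the other, which is exactly why the platform owner is willing to subsidize developers early on.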
The API Pipeline Side
The other approach is to expose functionality as an API and monetize access on a per-usage basis. Many Conversational AI engines operate in this manner, particularly those hosted on hyperscaler Platforms as a Service (PaaS). The pay-per-use business model is predicated on a future "API building block" bet: that the integrating solution grows to success and scales its usage. Many pay-per-use models include tiered volume discounting, based on volume precommitments and/or a lower transactional cost beyond a certain threshold, mostly because the model provider's per-tenant cost decreases at scale. For text-to-speech systems, the unit of measure is per so-many characters; for natural language understanding, it may be per so-many classifications; for large language models, it is the token (or per 1M tokens, which has become the standard across model providers). A token, as many knowledgeable readers may know, is a near-equivalent of a word but may fall at the syllable or subword level, depending on how the model tokenizes the input and output text. Moreover, input tokens are far cheaper than output tokens, since output tokens must be generated one at a time at inference.
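For a concrete feel of the word-to-token gap, here is a short sketch using OpenAI's tiktoken library (the cl100k_base encoding used by the GPT-3.5/GPT-4 family); the sample sentence is arbitrary.

```python
# Counting billable tokens with OpenAI's tiktoken library
# (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Tokenization splits text into subword units, not whole words."
tokens = enc.encode(text)

print(f"{len(text.split())} words became {len(tokens)} tokens:")
print([enc.decode_single_token_bytes(t).decode("utf-8", "replace")
       for t in tokens])
```

Common words map to a single token while rarer ones split into several, which is why billing is quoted per token rather than per word.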
This is certainly the case with OpenAI, which publishes a pricing schedule for each model being used.
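As a sketch of how such a schedule translates into a per-call cost, the snippet below uses placeholder model names and rates, not OpenAI's actual prices; substitute the provider's current schedule.

```python
# Estimating the cost of a single API call from a per-1M-token price
# schedule. Rates and model names below are placeholders for illustration.
PRICE_PER_1M = {
    "small-model": {"input": 0.50, "output": 1.50},
    "large-model": {"input": 5.00, "output": 15.00},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one call under the placeholder schedule."""
    rates = PRICE_PER_1M[model]
    return (input_tokens * rates["input"] +
            output_tokens * rates["output"]) / 1_000_000

# A 2,000-token prompt producing an 800-token completion:
print(f"${call_cost('large-model', 2_000, 800):.4f}")
```

Note how the output side dominates the bill even at a fraction of the input volume, reflecting the asymmetric cost of generation.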
The audience for API-based consumption models is typically data scientists and developers, so the marketing of such offerings is grounded in benchmarks and achievement metrics, the very vicious cycle model providers face.
Across the AI landscape, this parallel set of go-to-market strategies from model providers (reaching back into the Conversational era of AI) has hardened into a best practice that newcomers and incumbents alike try to mimic in the Generative AI era. My aim in crafting this post is to argue that pursuing the common standard can be disadvantageous, a pitfall.
Take Hume.AI as the leading example. Hume rooted itself deeply in scientific prowess within the emotion sciences: cofounded by PhDs, backed by numerous scientific publications, and built on a highly differentiated model that is extremely hard to replicate, warranting a competitive advantage and unique differentiation among Large Language Model players broadly. Hume first exposed a beta API for emotional inference across video, image, and text modalities, recognizing emotions in one's facial features, in the inflection of one's vocal prosody, and in one's diction. Hume thereafter poured R&D resources into building the Empathic Voice Interface (EVI), an emotionally aware chatbot. Announcement after announcement arrived on Discord channels and LinkedIn (my primary information channels): integration with Twilio for voice calls, an iOS application, function calling, and more. Hume quickly erected a solutions-side business centered on EVI, clearly following the OpenAI ChatGPT playbook and the standardized go-to-market sequence.
Unfortunately, there may be a second force compelling these parallel productization efforts: a customer-facing solution a la ChatGPT or EVI may be required to prove out the benchmark metrics of the underlying APIs, as those benchmarks alone may not be enough to usurp OpenAI's first-mover brand dominance. I personally believe the convergence on a standardized go-to-market is fueled by the highly accelerated pace of innovation perceived to be required in the Generative AI market since the turn of 2023. There may, however, be alternatives that break the vicious cycle:
I continue to advocate for specialty models, employing a Mixture of Experts and/or a leader-agent model as a value-optimizing framework for new entrants imbuing AI capabilities into their solutions.
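To make the leader-agent idea concrete, here is a minimal sketch: a lightweight leader classifies each request and routes it to a specialty model. The keyword router and model stubs are hypothetical stand-ins for a real classifier and real specialty models.

```python
# Minimal sketch of a leader-agent routing pattern. The route table and
# keyword rules are hypothetical placeholders for real models.
from typing import Callable

ROUTE_TABLE: dict[str, Callable[[str], str]] = {
    "emotion": lambda q: f"[empathic specialist] {q}",
    "math":    lambda q: f"[symbolic-math specialist] {q}",
    "general": lambda q: f"[general-purpose LLM] {q}",
}

def leader(query: str) -> str:
    """Naive keyword router standing in for a small classifier model."""
    if any(w in query.lower() for w in ("feel", "upset", "mood")):
        topic = "emotion"
    elif any(ch in query for ch in "+-*/=") or "solve" in query.lower():
        topic = "math"
    else:
        topic = "general"
    return ROUTE_TABLE[topic](query)

print(leader("I feel upset about my bill"))
print(leader("solve 3x + 4 = 19"))
```

The appeal of this pattern is that each specialty model can stay small, differentiated, and priced on its own terms, rather than competing head-on in the frontier-model race.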
In conclusion, selecting a route to market when presented with the fork is straightforward: models are poised to capitalize on the API pipeline route, while solutions should build user bases on a subscription model. Simply put, while building a platform and its associated two-sided market might be intriguing, it is not always optimal.
Founder & CEO, Group 8 Security Solutions Inc. DBA Machine Learning Intelligence
2 个月Loved this post.