Selling with Data #44 - The cost of creating AI

Warning: This article is longer and geekier than my past articles, but I am trying something different. Please let me know your feedback.

We are entering a pivotal point for AI. Many companies are coming off the sidelines and greenlighting AI projects, especially generative AI. A buddy just told me that his leadership team asked him to create a website page outlining all the AI the company is doing. He was struggling, since most of his AI projects were still in labs or hidden in small pockets of the business. Showcasing AI has gone from leading to keeping up. This focus on accelerating AI makes it an amazing time to be a seller of AI.

AI investments are happening in two areas: companies that are building AI and companies that are using AI. I mostly see companies using AI to improve digital labor, intelligent automation, advanced security, and customer care, for example. These are great uses, and I love seeing them.

In this article, I will focus on the other category: the cost of building AI models.

The cost to build AI models

According to IDC, worldwide spending on AI is set to grow 19.6% year-over-year in 2022 to $432.8 billion, well on its way to breaching the $500 billion mark by 2024 (source). So far, 88% of that spend has been on AI software. Over the next several years, the share spent on AI hardware is expected to grow faster than other categories.

Running large language models (LLMs) requires more horsepower: faster and more complex infrastructure at a higher cost. To illustrate, consider the average cost of running a basic ML model versus running an LLM.

Example 1 - A basic ML solution costs about $25,000 per year: $15,750 in labor and $9,000 in infrastructure (source). Here's how that breaks down:

  • Infrastructure: a single machine in the cloud without load management, ~$9,000/year
  • Data support: labor to pull data, ~$6,750
  • Engineering/deployment: labor to move the model from the data scientist's workstation to the cloud, ~$9,000

Example 2 - ChatGPT or other large models like Meta's LLaMA can cost $2 million to $4 million or more to build and run. CNBC's article, "ChatGPT and generative AI are booming, but the costs can be extraordinary," says, "Analysts and technologists estimate that the critical process of training a large language model such as GPT-3 could cost over $4 million." Another source estimates that a single training run of an OpenAI GPT model can cost $12 million, and that OpenAI requires ~3,617 HGX A100 servers (28,936 GPUs) to serve ChatGPT (source: Hitchhiker's Guide to ML Training Infrastructure). Meta's largest model, LLaMA, is estimated to have cost $2.4 million to train: 2,048 Nvidia A100 GPUs, 1.4 trillion tokens, and 1 million GPU hours (source: ChatGPT and generative AI are booming, but the costs can be extraordinary).
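As a sanity check, the figures in the two examples above hold up to quick back-of-envelope arithmetic. The dollar amounts come from the sources cited above; the implied GPU-hour rate is my own derivation, not a published number:

```python
# Example 1: basic ML solution, annual cost (figures from the source cited above)
infrastructure = 9_000        # single cloud machine, no load management
data_support = 6_750          # labor to pull data
engineering = 9_000           # labor to deploy the model
labor = data_support + engineering            # $15,750
total_basic = infrastructure + labor          # ~$25,000 / year

# Example 2: Meta's LLaMA training run (figures from the CNBC article)
llama_gpu_hours = 1_000_000
llama_cost = 2_400_000
rate_per_gpu_hour = llama_cost / llama_gpu_hours   # implied ~$2.40 per A100-hour

print(f"Basic ML: ${total_basic:,} / year")
print(f"One LLaMA training run is ~{llama_cost / total_basic:.0f}x that annual cost")
```

That ratio, roughly 97x for a single training run versus a full year of a basic ML solution, is the gap the rest of this article is about.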

Granted, comparing a simple ML model to Meta's largest model isn't an apples-to-apples comparison, and some ML models can be quite large. Still, it is safe to assume that running an LLM will cost more than running a simple ML model. As LLMs take on more and more, they are growing bigger and more complex, and the cost to train and run them is going up.


AI and business models for business to consumer companies

Many of the largest foundation models already in production are concentrated in a handful of big companies that can afford to pay. Business-to-consumer (B2C) companies like Microsoft and OpenAI are offering these models to users, often without yet charging for the AI services. They are free to users because the companies are following Google's advertisement-driven business model: Google offers search to users at no charge, spends billions on infrastructure, data management, and labor behind the scenes, and makes billions in advertising revenue from optimized ad placement based on user search history. In many cases, large AI B2C companies are willing to take the loss on the cost of infrastructure and model development so that millions of users engage with and train their models. When so many users are improving and training the models at scale, the result is long-term monetization potential of the user data.

As the saying goes, "If you are not paying for something, you're not the customer; you're the product." Every prompt on ChatGPT is a user training the model, sharing a little bit about themselves, and in turn adding value to the product.

This business model comes at a cost. The Information article, "OpenAI's Losses Doubled to $540 Million as It Developed ChatGPT," says OpenAI's $540 million loss "reflects the steep costs of training its machine-learning models during the period before it started selling access to the chatbot." OpenAI isn't alone. Financial analysts estimate that Microsoft's Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.

Model costs - training and inference

As more enterprise companies ban open models like ChatGPT over fears about the quality of the training data and the privacy of employee searches (link), many are looking for substitutes to general-use models - options that provide governance, traceability, and trust. For enterprise companies, providing their own version of these models isn't cheap. The cost falls into two main categories: training and inference.

Training costs are the costs of the resources, computational power, and data required to train a model. Training adjusts the model based on information specific to the training dataset to improve the model's accuracy and minimize errors.

There are differing approaches to model training.

  1. Training from scratch - this is the most manual and typically the most expensive.
  2. Fine-tuning - this starts with a pre-trained model and continues training the model using the customer's own data.
  3. Prompt-tuning - this starts with a pre-trained model and steers the model to generate a desired outcome without changing the underlying model, e.g., providing instructions on how to answer a question, or providing one or more documents and having the model answer based on that content.

Training is typically charged per compute hour, which makes picking the right training approach important.

  • Training from scratch is typically expensive because of the heavy computation needed to train a model from nothing. The meter is running as everything is being built from zero.
  • Fine-tuning is more manageable because the model already has some of its training completed, and that existing training was already paid for by someone else. Fine-tuning makes a copy of the underlying model and changes it, so the "meter" only applies to the changes.
  • Prompt-tuning is the most cost-effective option because it focuses on getting the desired outcome without changing the underlying model.
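The "meter" framing above can be made concrete. In this sketch, the GPU-hour counts for each approach are rough assumptions for illustration only; the $2.40/hour rate is simply in line with the LLaMA figures cited earlier ($2.4M over 1 million GPU hours):

```python
# Illustrative per-compute-hour training "meter" for the three approaches.
# GPU-hour counts are assumed orders of magnitude, not measured figures.
RATE_PER_GPU_HOUR = 2.40   # roughly the rate implied by the LLaMA numbers

gpu_hours = {
    "training from scratch": 1_000_000,  # everything built from zero
    "fine-tuning": 10_000,               # only the delta on a pre-trained model
    "prompt-tuning": 100,                # only a small prompt is optimized
}

for approach, hours in gpu_hours.items():
    print(f"{approach}: ${hours * RATE_PER_GPU_HOUR:,.0f}")
```

Even if the exact hour counts vary by project, the point survives: each step down the list cuts the metered compute by orders of magnitude.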

Prompt-tuning uses embedding models to perform prompt engineering automatically, replacing a human manually and randomly experimenting with different prompts. Picture a person using ChatGPT through trial and error, changing the prompt with more detailed and different terms until the output starts to match their expectation. By using embedding models, prompt-tuning can identify the right prompts in fewer iterations, resulting in lower inference costs. More information from IBM Research on prompt-tuning can be found here.
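To see why this is cheap, here is a toy sketch of the core idea (my own simplification, not IBM's or anyone's implementation): the model stays frozen, and only a small set of "soft prompt" vectors, prepended to the input embeddings, is ever trained. The dimensions and parameter counts below are illustrative assumptions:

```python
import numpy as np

embed_dim = 768
frozen_model_params = 125_000_000   # e.g., a GPT-2-sized model, never updated
prompt_length = 20                  # number of trainable soft-prompt tokens

# The soft prompt is the ONLY trainable part of the whole system.
soft_prompt = np.zeros((prompt_length, embed_dim))
trainable = soft_prompt.size        # 15,360 parameters

# At inference, the learned vectors are concatenated ahead of the input
# token embeddings before the frozen model runs:
input_embeddings = np.zeros((12, embed_dim))   # a 12-token input
model_input = np.concatenate([soft_prompt, input_embeddings], axis=0)

print(f"Trainable fraction vs full fine-tuning: {trainable / frozen_model_params:.6%}")
```

Optimizing ~15 thousand parameters instead of 125 million is why the training "meter" barely moves.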

The best approach to model training is the one that balances the quality of the output against the lowest cost.

I am already seeing signs that smaller models trained on cleansed data can outperform larger models on specific tasks - for example, coding models, financial models, and other use-case-specific models. Companies are weighing model cost against the return of the use case, and there is a high sense of urgency: using a smaller model can speed up the time to production. Using the right tuning approach can make models more accessible to more companies.

A recent report by OpenAI found that the cost of training large AI models is expected to rise from $100 million to $500 million by 2030 (source), suggesting that only the wealthiest companies will be able to afford to develop and use AI technology. Before reacting, consider the source of the information: some companies, like OpenAI, benefit from framing model creation as prohibitively expensive, creating a moat and persuading companies to use their LLMs rather than build smaller models or use lower-cost approaches like prompt-tuning. Working through the various training options is important.

Inference costs are the costs of the resources required to deploy and run the trained AI model on new data after the training process is complete.

Once a model is trained, using it is referred to as inference. Before the model can be used, it needs to be hosted or deployed somewhere that allows API queries to call the model. Every time a query is made, server compute runs the query through the model and generates the output. The bigger the model, the higher the compute. This is why inference pricing is typically provided per model using tokens (1K tokens ≈ 750 words). For example, OpenAI has seven different inference rates depending on which model is used and the compute needed per model - https://openai.com/pricing.
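Token-based pricing is easy to model. A minimal sketch, assuming invented per-1K-token rates (these are not OpenAI's actual prices; see the pricing page above for real figures):

```python
# Hypothetical per-1K-token inference rates for two model sizes.
PRICE_PER_1K_TOKENS = {"small-model": 0.0004, "large-model": 0.02}

def inference_cost(model: str, words: int) -> float:
    """Estimate query cost using the rule of thumb 1K tokens ~ 750 words."""
    tokens = words / 0.75
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model]

# Under these assumed rates, the same 750-word answer costs 50x more
# on the bigger model:
small = inference_cost("small-model", 750)
large = inference_cost("large-model", 750)
print(f"small: ${small:.4f}  large: ${large:.4f}")
```

The per-query cost looks tiny, but it is metered on every single call, which is what makes model size matter at scale.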

An alternative is for customers to host the model themselves. If self-hosting, customers pay for the compute infrastructure rather than a small price per query. For companies in regulated businesses, or for sensitive data that isn't allowed on the public cloud, self-hosting may be the best option. This approach requires an upfront infrastructure investment, and a drawback is that it can't easily leverage the elastic capacity offered by the cloud.
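The API-versus-self-hosting decision often comes down to a break-even calculation. All figures below are hypothetical assumptions for illustration:

```python
# Back-of-envelope break-even between pay-per-query API pricing and
# self-hosting on dedicated infrastructure (assumed figures).
api_cost_per_query = 0.002       # $ per query on a hosted API
self_host_monthly = 15_000       # $ per month for dedicated GPU servers

break_even = self_host_monthly / api_cost_per_query
print(f"Self-hosting breaks even above {break_even:,.0f} queries per month")
```

Below the break-even volume the metered API is cheaper; above it, the fixed infrastructure cost wins, assuming the servers can actually handle that load.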

According to OpenAI's report from 2018, most compute used for deep learning is spent not on training but on inference. Inference costs far exceed training costs when deploying a model at any reasonable scale.

Despite the cost, the value of the models often produces a positive ROI with significant advancements in insights driving automation that increase revenue, reduce cost, and increase reach into new markets.

If AI models are expensive, what should enterprise companies do?

  1. Invest in building their own specialized models: While building models may require a significant upfront investment, it can result in more cost-effective solutions in the long run. This approach suits companies with significant data science expertise, resources, and long-term plans to develop specialized models for specific purposes. Communities like HuggingFace have made open-source ML models available and accessible.
  2. Use prompt tuning rather than fine tuning for model tuning: Prompt tuning is a more cost-effective approach to building custom models. Using embedding models to do prompt engineering automatically replaces a human randomly experimenting with different prompts, reducing compute and cost.
  3. Use cloud-based solutions, if available: Cloud-based machine learning services, like those available from IBM, provide access to powerful machine learning tools without a significant upfront investment in hardware and software infrastructure. Most cloud-based solutions offer pay-as-you-go pricing that can be run cost-effectively and allows bursting when more compute is needed.
  4. Work with trusted partners: Enterprise companies can collaborate with partners that specialize in governed AI for business to develop cost-effective solutions. While there is a lot of AI washing lately, trusted companies have been working on governed AI for business for years. Leveraging commercialized intellectual property can be particularly beneficial for companies with limited resources or machine learning expertise. Instead of becoming AI experts just to add AI to their applications, they can embed off-the-shelf, governed, enterprise-grade AI tools, often through simple API calls.

What should enterprise sellers who want to stay ahead of the curve do?

  1. Stay current with the latest advancements in AI. A great enterprise seller brings value to customers through expertise and examples. In AI, and specifically generative AI, very few people have years of experience - everyone is learning together. Sellers who invest the time to become experts provide value to others. Great sellers can invest in their skills by attending industry conferences, reading relevant publications, and keeping track of the latest research in the field. There is a basic set of terms every seller should understand: LLM, foundation model, prompt, tuning, fine-tuning, prompt engineering, deploy, and inference, to name a few. Here is a great link to get started in learning more.
  2. Be hands-on. There is so much free information on the internet, and SaaS sites make the models so accessible, that there really isn't a good excuse for anyone motivated not to dive in and learn the models themselves. Last week I built several models using HuggingFace and Cloud Pak for Data as a Service in under 30 minutes. Take online introductory classes and demystify AI down to skills you can teach yourself.
  3. Provide AI solutions and use cases. AI currently sits with the Chief Data Officer, Chief Analytics Officer, and other data science and analytics functions. Generative AI is introducing new players: the AI builders. AI builders are application developers (software engineers, not typical ML engineers) who focus on applications that call AI models. Sellers are experts at working across organizations and connecting AI experts, AI builders, and business owners who are yearning for help deploying AI at scale within their organizations.
  4. Provide an outside view, through industry case studies, that customers value but otherwise wouldn't have access to. Similar to how bees are a critical part of cross-pollination, sellers play an important role in cross-pollinating real AI use cases across companies that otherwise aren't talking. As the cartoon below suggests, sellers are going to bring innovation to many companies. Sellers, stay persistent and believe in the value you bring!

[Cartoon image omitted]

What do you think is the biggest obstacle or opportunity for enterprise companies deploying AI, and what is the best way enterprise sellers can help?

A special thanks to Maryam Ashoori, PhD and Carlo Appugliese for the help on this article.

Good selling.

Jane Hiscock

Founder and President, Farland Group

1y

Another great article Ayal Steinberg - I appreciate the depth you provided here. The space is moving fast and you've provided excellent insight into areas that enterprise CI/TOs are trying to get their arms around. We've started to hear about an increasing focus on recruiting for prompt engineers. If I'm reading and understanding prompt tuning correctly (which I may not be!) it seems like this explosion of prompt engineers may not be a long term trend. Will tuning replace this role? Keep up the great insights. I learn with every read!

Sundeep Kumar

Senior Sales & Business Leader | Driving Multi-Million Dollar Revenue Growth in AI, Cloud, & SaaS | Expertise in APAC & Global Markets

1y

Great one Ayal! and yes, it was a long read :-) Would this be a good summary?

  • The cost of building a foundational model is super high and only a few companies today have the resources to invest.
  • The cost of providing these models is high, falling into two main categories: training and inference costs.
  • Current foundational models are focused on B2C on a freemium model, more to outsource training the model than training it for specific industry needs.
  • Enterprises are fearful due to the quality of the training data and privacy of their employee searches, and are looking for substitutes to general-use models that provide governance, traceability, and trust.
  • The real market (who will pay for this) will be enterprises, but they need someone to build models specific to their industry, that is regulated, and that the industry needs to trust. Or they would spend on building their own models (high upfront cost - might work out in the long run).

Anything I missed?

Tom McPherson

General Manager, IBM Power Systems

1y

Great read, thanks Ayal.
