Day 4 - Starting an AI Company in 30 Days – Comparing Costs of AI Models
Midjourney Prompt: an image of a person navigating a labyrinth where the maze has openings to different AI Models: Claude, GPT, Llama, Gemini, Mistral

By Day 4, I had a good grasp of the AI ecosystem and where my company delights.ai would fit within it, but I still needed to figure out our costs.

(Our goal is to launch delights.ai in Q3’24 and I hope I can reveal more as we get closer to that date.)

Searching for an AI Model Cost Comparison Table

My partners and I needed to figure out whether the product idea for delights.ai would hold up to the rigor of financial modeling. We needed a solid understanding of the ongoing costs of the different AI models and which would be the most economical to build our solution on top of without compromising quality. This is where my head started hurting: we had to work out which model would serve us best without blowing our budget. I searched high and low for a comprehensive comparison of AI model costs. Unfortunately, I could not find anything that pulled the costs together into one comprehensive table.

AI Model Cost Comparison Table

Since there was nothing out there that we could easily use to understand the costs of the AI models, we decided to create a table ourselves. The comparison table my partners and I created below covers the basics that may help you in the financial planning for your AI company. The table includes the data we needed, namely:

1. How many “Tokens” were allowed for input and output.

2. When each AI model’s training data cutoff was, since that indicates how much knowledge of current events (and, for us, of rapidly evolving internet jargon) it has.

3. The costs associated with those tokens.

Definitions of AI Jargon

I wrote some quick definitions for the AI jargon because I needed a clear understanding of these terms myself. The definitions below capture the most commonly agreed explanations; a short token-counting code sketch follows them.

* TOKEN - Tokens can be thought of as pieces of words. Before the API processes the request, the input is broken down into tokens. These tokens are not cut up exactly where the words start or end - tokens can include trailing spaces and even sub-words. Here are some helpful rules of thumb for understanding tokens in terms of lengths:

· 1 token ~= 4 chars in English

· 1 token ~= ¾ words

· 100 tokens ~= 75 words

Or

· 1-2 sentence ~= 30 tokens

· 1 paragraph ~= 100 tokens

· 1,500 words ~= 2048 tokens

* TOKEN LIMITS - Depending on the model used, requests can use up to 128,000 tokens (in GPT-4 Turbo) shared between prompt and completion. Some models, like GPT-4 Turbo, have different limits on input and output tokens.

OUTPUT TOKEN LIMIT - The maximum number of tokens an AI model can generate in a single response. In simpler terms, it is the longest response you can get back from the model, and it counts toward the overall token limit.

CONTEXT WINDOW - denotes the total number of tokens permissible in a session. This includes tokens from a user's interaction with a chatbot in a conversational exchange or the processing of code, files, images, videos, etc. It accounts for both the user's input and the model's output.

Note: These * marked definitions were taken from OpenAI Help Community directly without edits.
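
To make the token rules of thumb above concrete, here is a minimal Python sketch that counts tokens with OpenAI's open-source tiktoken library. The sample strings and the printed format are my own; exact counts depend on which tokenizer a given model uses.

```python
# pip install tiktoken
import tiktoken

# GPT-4 and GPT-4 Turbo use the cl100k_base tokenizer.
enc = tiktoken.encoding_for_model("gpt-4")

samples = [
    "Hello, world!",
    "Tokens can include trailing spaces and even sub-words.",
]

for text in samples:
    token_ids = enc.encode(text)
    # Rough rule of thumb: ~4 characters (about 3/4 of a word) per English token.
    print(f"{len(text):3d} chars -> {len(token_ids):2d} tokens")
```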

AI Model Cost Comparison Table as of April 10, 2024

The input and output costs below are per 1 million tokens. Also, this is primarily chat pricing, not embeddings API pricing, which often costs less than shown here. I’ve linked each AI model’s pricing page below the table. Please let me know if any of the information is incorrect or needs to be updated.

AI Model Cost Comparison Table

As you can see, there are vast differences between the versions of each leading AI model. For example, GPT-4 Turbo output costs $30 per one million tokens vs. GPT-4, which costs $60 per one million tokens. What is the difference between the two, and why would GPT-4 Turbo be cheaper when it has a larger context window? The difference in cost comes down to what each version is good at: GPT-4 Turbo is designed for fast processing of less complex tasks with more recent training data, whereas GPT-4 is described by OpenAI as being more accurate and better suited to solving more difficult problems.
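
To see how per-million-token prices translate into the cost of a single request, here is a quick back-of-the-envelope sketch. The output prices come from the discussion above ($60 for GPT-4 and $30 for GPT-4 Turbo per 1M output tokens); the input prices ($30 and $10) are OpenAI's published rates as of this writing and may have changed, so treat them as illustrative.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices per 1 million tokens."""
    return (input_tokens / 1_000_000) * input_price_per_m + \
           (output_tokens / 1_000_000) * output_price_per_m

# Example: a 2,000-token prompt that yields a 500-token completion.
print(f"GPT-4:       ${request_cost(2_000, 500, 30.0, 60.0):.4f}")   # $0.0900
print(f"GPT-4 Turbo: ${request_cost(2_000, 500, 10.0, 30.0):.4f}")   # $0.0350
```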

https://openai.com/pricing

In general, this holds across the different versions of the AI models: the cheaper they are, the more likely they are designed for simplicity and speed rather than accuracy and complexity. When you compare the most capable versions of each model, the latest Gemini 1.5 Pro is in the lead thanks to its large context window and its input/output costs. Of course, this may change next week, since these models and their pricing are evolving so quickly, and the Gemini 1.5 Pro figure is its “Preview Pricing.”

Comparing the larger versions of each model

Here is how Google Gemini 1.5 Pro compares itself to other models

https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#architecture

Here are the links to the pricing pages. I included the full-stack third-party provider octo.ai, whose pricing may be cheaper than engaging directly with the AI model providers. We are looking at octo.ai as our full-stack platform.

https://octo.ai/docs/getting-started/pricing-and-billing

https://openai.com/pricing

https://docs.mistral.ai/platform/pricing/

https://www.anthropic.com/api#pricing

https://ai.google.dev/pricing

https://ai.google.dev/models/gemini

AI Model Strengths & Weaknesses with Use Cases

We next needed to understand the strengths and weaknesses of each AI model to determine which one would suit our purposes best. My partner, our AI lead, was looking at factors such as ease of deployment, integration, and flexibility. For this article, however, I wanted to simplify things with basic use cases and whether or not the models are multi-modal (i.e., able to handle images, audio, video, etc.). But before I share my comparison table, I want to show you what the AI models told me when I asked them about themselves and the other models. I prompted each of them to "put together a comparison table on strengths and weaknesses and the ability to understand and process image, audio, and videos and the ideal use cases for [all the models]."

Here is what OpenAI's GPT-4 told me…

"Here's a table summarizing the capabilities, strengths, weaknesses, and use cases for the AI models Gemini 1.5 Pro, GPT-4 Turbo, Mistral Large, Mixtral 8x7B, Llama 2, and Claude Opus, including their capabilities to handle different media types like images, videos, and audio:

OpenAI GPT-4 Comparing its strengths & weaknesses against other leading LLMs

Claude stands out as the only model in this list with capabilities extending beyond text, able to process and generate content related to audio, video, and images, thus offering a broader application scope." <--- interesting that GPT-4 said this about a competitor!

Here is what Claude 3 Sonnet told me:

"Here's a table with the strengths, weaknesses, multimedia capabilities, and potential use cases for the AI models you mentioned, based on my current knowledge:

Anthropic Claude 3 Sonnet Comparing its strengths & weaknesses against other leading LLMs

"Some key points:

  • Gemini 1.5 Pro, GPT-4 Turbo, LLaMA 2, and Claude Opus are large language models focused on understanding and generating text, with varying levels of multimedia support.
  • GPT-4 Turbo is hypothesized to have strong multimedia and multi-modal capabilities given its position as a cutting-edge AI model.
  • Mistral Large is designed specifically for workflow orchestration rather than general language tasks.
  • I do not have reliable information about "Mixtral 8x7b" to include details.
  • Use cases range from virtual assistants to coding aids to decision support systems.

Please note that my knowledge comes from my training data, which has a cutoff date. The actual multimedia capabilities and strengths/weaknesses of these models may have evolved beyond what I've represented here. I recommend checking official sources for the latest updates on these AI models and their real-world applications. Let me know if you need any clarification or have additional questions!"

Here is what Gemini 1.5 Pro Preview told me:

Google Gemini 1.5 Pro

General Observations:

  • Advanced Capabilities: These models represent advancements in text-based AI tasks, potentially offering improved accuracy, reasoning, and creative capabilities.
  • Accessibility: Access remains a concern, with several models being limited to research or requiring specific permissions.
  • Multimedia Handling: Direct image, video, or audio processing is still not a feature, but these models could be integrated with other AI systems for multimedia tasks.
  • Use Cases: These models are suitable for advanced research, content creation, complex problem-solving, and developing sophisticated language-based applications.

Additional Considerations:

  • Fine-tuning: Many of these models benefit from fine-tuning on specific datasets to optimize performance for particular tasks.
  • Ethical Implications: Responsible development and use are crucial to mitigate potential biases and ensure fairness in these powerful models.
  • Evolving Landscape: The field of AI is rapidly evolving, and new models with expanded capabilities may emerge."


Please take the above AI-generated tables with a grain of salt, since the models' training data cutoff dates are not recent and these models are being updated nearly monthly, if not more frequently. The most interesting thing to me is that the models all describe themselves and their competitors as weak in multi-modal capability and as almost exclusively text-based. Yet I have seen some amazing image, video, and audio GenAI companies being built on top of these AI models. A couple built on Claude are Speakai.co, which handles audio input, and Pictory.ai, which does video generation. Others built on GPT that do image generation include Notion.so and Hypotenuse.ai. These give you a good idea of what can be built on top of the leading AI models.

The Human-Generated AI Strengths & Weaknesses Comparison Table

Here is the comparison table that I, the human, generated with my "training data cutoff date" of April 10th, 2024, as a counterpoint to the tables the AI models generated above. We humans are prone to error as well, so let me know if I missed anything or if it needs updating.

AI Models Strengths & Weaknesses by S. Pike


I want to caveat that all of the models can be made to handle audio, video, and images with significant additional work, per the company examples I gave earlier.

The delights.ai Model Calculator

For delights.ai, we are mainly focusing on video and text processing for our Minimum Viable Product (MVP). With that in mind, we moved on to the next phase of understanding our costs and potential revenue by building a delights.ai modeling calculator that takes the price of each AI model and layers in our business assumptions. See below; I have hidden some elements to protect our IP.
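
I won't share our actual calculator, but the arithmetic behind this kind of tool is straightforward: multiply expected usage per user by each model's per-token prices to get a monthly cost per user. The sketch below is a generic illustration; every usage figure is a made-up placeholder rather than a delights.ai assumption, and the prices should be checked against the pricing links above.

```python
# USD per 1M tokens (input, output); approximate list prices circa April 2024 -- verify before use.
PRICING = {
    "gpt-4-turbo":     (10.00, 30.00),
    "claude-3-sonnet":  (3.00, 15.00),
    "gemini-1.5-pro":   (7.00, 21.00),  # preview pricing, subject to change
}

# Hypothetical usage assumptions per active user per month (placeholders only).
REQUESTS_PER_USER = 40
INPUT_TOKENS_PER_REQUEST = 1_500
OUTPUT_TOKENS_PER_REQUEST = 400

for model, (in_price, out_price) in PRICING.items():
    monthly_input = REQUESTS_PER_USER * INPUT_TOKENS_PER_REQUEST
    monthly_output = REQUESTS_PER_USER * OUTPUT_TOKENS_PER_REQUEST
    cost = (monthly_input / 1e6) * in_price + (monthly_output / 1e6) * out_price
    print(f"{model:16s} ~${cost:.2f} per user per month")
```

A real calculator would then layer business assumptions (conversion rates, pricing tiers, infrastructure overhead) on top of this per-user model cost.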

delights.ai

The above is an example of how we are evaluating the costs of these models, because the leading LLM providers are constantly updating their models, developing new ones, and changing their pricing. That is the irony of this journey of starting an AI company: the AI models and corresponding tech stacks are becoming easier and easier to use, but making sure it all makes actual business sense for our company, and that there aren't any financial gotchas, has been the most challenging part.

Also, we’re being very careful not to ask any of the AI models anything about our business idea so it doesn’t get pulled into their training data. I bet one of the AI models could generate a financial plan fairly quickly, but I don’t plan on asking. I recommend you avoid doing this as well.

I will go into more detail about how we came up with the above calculator and about the business idea and strategy in my next post.

But before I go, here is a screenshot of a funny conversation I had with Gemini 1.5 Pro, where it denies its own identity. You can preview the latest Google AI, aka Gemini Pro, at Google AI Studio.

Gemini having an identity crisis


Thanks to everyone who read this whole post and my previous articles. Please let me know if you have any questions, feedback, or suggestions.


My next post will be:

Day 5-10: Coming up with the idea – Where to Start


#GenerativeAI #StartupJourney #TechStartups #Entrepreneurship #AIInnovation #StartupChallenge #MachineLearning #BusinessGrowth #Innovation #Leadership #TechTrends #StartupLife #DigitalTransformation #EntrepreneurMindset #FutureOfWork #AIStartups #AIModels #PromptEngineers #PromptEngineering #PromptMarketPlace #PromptEntertainment #AIEntertainment #AI #ArtificialIntelligence #LLM #DataScience #Gemini #DallE #Llama #Mistral #Mixtral #Claude #GPT #OpenAI #Anthropic #Meta #Google #Notion #AIModelCostsComparison #AIModelCosts #AIModelStrengths


