The Token Economy: The Future of Computing is Tokens

The Value is in Tokens

Current market conditions present a paradox to investors: undercurrents of a recession meet bullish trends. Selloffs like the one a few weeks ago are brutal reminders of where and how value is stored in markets, and even with NVIDIA down over 10% in a week, there is value to be found. Contrasting the sea of red is a shining star in tech and AI: Groq, which closed a $640M round at a $2.8B valuation, led by none other than BlackRock.

While most tech stocks are down, what makes Groq worth such a sizable investment? Tokens. Throughout each era of computing, a single metric has defined the value creators. First it was bytes of storage, when document digitization reigned king, creating massive value in IBM, Oracle, and Sun. From storage we moved to processing power, creating titans like Intel and AMD. Next, as we scaled from the power of one computer to the power of many working together, bandwidth, or bytes per second, created new value in telecom and infrastructure companies installing hardware throughout the world. As we moved to the cloud, companies like Amazon, Microsoft, and Google captured the new metric of the era, generically known as "compute units." Each age of computing is defined by a metric. The metric of this era is tokens.

What are tokens?

Tokens, simply, are fragments of words, phrases, or characters that allow AI applications to represent, reason about, and generate language. The same way people connect ideas together, AI applications connect tokens to create human-like responses to questions and conversations.
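To make that concrete, here is a toy illustration of how text breaks into tokens. Real tokenizers (the BPE-style ones used by GPT-class models) learn their subword vocabulary from data; this hypothetical version just greedily matches a tiny hard-coded list:

```python
# Toy tokenizer sketch. TOY_VOCAB and toy_tokenize are illustrative
# inventions, not any real model's vocabulary or API.
TOY_VOCAB = ["call", "cent", "er", "token", "s", " ", "a"]

def toy_tokenize(text: str) -> list[str]:
    """Greedily match the longest known subword at each position;
    fall back to a single character for anything unknown."""
    tokens = []
    i = 0
    while i < len(text):
        match = next(
            (v for v in sorted(TOY_VOCAB, key=len, reverse=True)
             if text.startswith(v, i)),
            text[i],  # unknown character becomes its own token
        )
        tokens.append(match)
        i += len(match)
    return tokens

print(toy_tokenize("call center tokens"))
# ['call', ' ', 'cent', 'er', ' ', 'token', 's']
```

Note that "center" splits into two tokens while "call" stays whole; production tokenizers make the same kind of frequency-driven tradeoff, which is why token counts don't map one-to-one onto word counts.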

People don't think or write in tokens, but tokens are an artifact of every piece of digitized content we create. Every phone conversation, every email, every training video, and every corporate memo is a collection of tokens, generated by a person rather than an AI.

Where factories and factory workers generate machines, cars, or something physical, the office workers of the world generate tokens, and just like automation changed the factory, AI changes the office job.

Your margin is my opportunity

The most practical, commercial application of AI we have seen is the augmentation of customer support centers and call centers. These call centers have been token factories for decades: seas of employees following prewritten scripts, measured by the calls they made or the tickets they handled. For decades, to lower costs, enterprises have offshored whole functions to India, Mexico, and the Philippines, or, if an accent mattered, moved departments to low-cost regions like Oklahoma. Each of these customer support centers was, and is, a token factory, with humans generating tokens at scale for the lowest cost.

NVIDIA, as the incumbent player, needs to remain sharp, or, as capitalism goes, its margin will become everyone else's opportunity. Fumbles like delaying Blackwell mean companies craving compute will need to look elsewhere for their chips, and when speed of inference is critical, there is no better competitor than Groq. Tokens from companies like Groq or OpenAI cost roughly $0.05 per 1M tokens (or even less for smaller, more purpose-built models), while a normal person can speak fewer than 100,000 tokens per day. With the average call center employee in India making less than $3/hour, the human cost works out to roughly $240 per 1M tokens, about 4,800x more expensive than the AI.
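The comparison above is back-of-the-envelope arithmetic; the $3/hour wage, an 8-hour shift, and 100,000 tokens per day are the rough assumptions in play:

```python
# Back-of-the-envelope cost comparison between human-generated and
# AI-generated tokens. All inputs are rough assumptions, not quotes.
hourly_wage = 3.00        # USD/hour, Indian call center employee (assumption)
hours_per_day = 8         # one shift (assumption)
tokens_per_day = 100_000  # upper bound on tokens a person speaks daily

human_cost_per_1m = (hourly_wage * hours_per_day) / tokens_per_day * 1_000_000
ai_cost_per_1m = 0.05     # USD per 1M tokens, approximate small-model pricing

print(f"Human: ${human_cost_per_1m:.2f} per 1M tokens")
print(f"Human/AI cost ratio: {human_cost_per_1m / ai_cost_per_1m:,.0f}x")
```

Under these assumptions the human produces 1M tokens for about $240 versus $0.05 for the model, a ratio of roughly 4,800x.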

Speed is the Alpha

Most people are now familiar with AI's iconic typewriter-like feel. Ask AI a question and it pecks out an answer, piece by piece. This is not a fancy user-experience trick: you are watching the AI generate a response in real time, predicting each token, one at a time. This look and feel is nice when users are interacting with AI directly, but it is cumbersome if the AI is providing information to a customer support representative. Instead of you reading as the AI types the answer, the call center employee must wait for a complete response, read it, reason about the information, and then relay the relevant parts to the customer. This interaction leads to seconds, if not more than a minute, of delay, creating a very awkward conversation. This is why speed is critical to the deployment of commercial AI.

We are not seeing AI replace customer service and call center jobs, yet. Instead, we are seeing AI augment already effective customer support and call center teams. When AI inference is near real time, a whole new interaction becomes possible for an agent working on the phone with a customer. Instead of support staff listening to a customer and then clicking around a cumbersome interface to find answers, incredibly fast inference like Groq's Whisper API can transcribe the conversation in near real time, and agents backed by nearly instantaneous inference can surface relevant documents, screens, or actions to the representative. This allows the representative to focus on customer satisfaction, without the need to put the customer on hold.
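A hedged sketch of that assist loop. Both helpers here are hypothetical stubs, not a real Groq or Whisper API: in practice, transcription would be a hosted speech-to-text endpoint and suggestion would be an LLM or retrieval call.

```python
# Sketch of a real-time call center assist loop. transcribe_chunk and
# suggest_documents (and the doc names) are illustrative inventions.

def transcribe_chunk(audio_chunk: bytes) -> str:
    """Hypothetical speech-to-text stub: map a few seconds of audio to text."""
    return audio_chunk.decode("utf-8")  # pretend the audio is already text

def suggest_documents(transcript: str) -> list[str]:
    """Hypothetical retrieval stub: surface docs relevant to the call so far."""
    suggestions = []
    if "refund" in transcript.lower():
        suggestions.append("Refund policy")
    if "cancel" in transcript.lower():
        suggestions.append("Cancellation workflow")
    return suggestions

def assist_agent(audio_stream: list[bytes]) -> list[str]:
    """Feed each chunk through transcription, then surface suggestions.
    Low per-chunk latency is what makes this feel live to the agent."""
    transcript = ""
    shown = []
    for chunk in audio_stream:
        transcript += transcribe_chunk(chunk) + " "
        for doc in suggest_documents(transcript):
            if doc not in shown:
                shown.append(doc)  # push to the agent's screen
    return shown

print(assist_agent([b"I want to cancel", b"and get a refund"]))
# ['Cancellation workflow', 'Refund policy']
```

The design point is that suggestions arrive per chunk, while the customer is still talking, rather than after a full response has been generated and read.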

Stability in AI

With NVIDIA's run over the past year and Groq's big round, it's clear that investors see the value of AI and its adoption in the market. What has been unclear is where the value creation will exist. Investors were torn between the value being in foundation models like ChatGPT, Claude, and Llama 3, or in the hardware, with chips from AMD, NVIDIA, and Groq. With all models, open and closed source, converging in capability, because eventually they will all be trained on largely the same datasets, the value is in how fast and how cheaply a token can be created.


More articles by Austin V.