FOD#65: Jevons' Paradox in AI

We discuss what the rapidly decreasing token cost of LLMs means for AI companies, introduce some changes to Turing Post, and, as always, offer you the best-curated list of news and papers.

Changes on Turing Post

Good times, dear readers. It’s September, and we’re approaching some exciting months filled with ML and AI developments. We hope you had a restful summer because we’re ready to offer more insights into machine learning – not just the AGI conversation, but the technology behind it. At Turing Post, we aim to support your learning by blending history, key terms, and storytelling, hoping to inspire new, practical ideas.

To understand AGI, it’s essential to grasp the foundational technology behind it. Our AI 101 series on Wednesdays is designed with this in mind, providing clarity amidst the often inconsistent use of terminology. We’ll focus on three main areas:

While there may be overlaps between these categories, models and methods will be explained in detail, and fundamental concepts will be presented in concise, easy-to-understand formats.

Fridays will be divided between two series:

Agentic Workflows is an enormous topic with a lot happening right now. We will guide you through it, learning together along the way.

We hope you’ll find it useful. (Your feedback and sharing are the most valuable support, so please don’t hesitate!)

This week on Turing Post

  • Tuesday, Guest Post: Optimizing Multi-agent Systems with Mistral Large, Mistral Nemo, and Llama-agents (practical!)
  • Wednesday, AI 101<>Method/Technique: What is Chain-of-Knowledge? (for those interested in enhancing the reasoning capabilities of LLMs)
  • Friday, AI Unicorn: the fascinating story of 01.AI and its leader, the legendary Kai-Fu Lee.


Editorial

Have you heard of Jevons' Paradox? The British economist William Stanley Jevons (1835–1882) described it in 1865, during the Industrial Revolution. After James Watt introduced a steam engine that required far less coal than earlier designs, people assumed that Watt's engine would eventually reduce the total amount of coal consumed. The exact opposite happened: coal consumption in the UK skyrocketed. This is the phenomenon whereby making the use of a resource more efficient leads not to less consumption of that resource, but more.

In the generative AI space, the token cost of using LLMs is rapidly decreasing, especially as LLM technology development accelerates and open-source LLMs proliferate. Professor Andrew Ng wrote a piece a few days ago about the rapid decline in token costs, why it's happening, and what AI companies should be thinking about going forward. Here's a quick summary of his thoughts:

  • The LLM token price has been declining at a rate of almost 80% per year. From $36 per million tokens at GPT-4's launch in March 2023, OpenAI has recently cut the price of GPT-4o tokens to $4 per million, and the new Batch API is available at an even lower $2 per million.
  • The sharp drop in token prices is driven by open-weight models and hardware innovation. With the release of strong open-weight models like Meta's Llama 3.1, we're seeing a steady stream of mature, usable LLMs of all sizes, allowing startups like Anyscale, Fireworks, and Together.ai, as well as large cloud service providers, to compete directly on price and speed without the burden of recouping model development costs. Ongoing hardware innovation from startups like Groq, SambaNova (which serves Llama 3.1 405B at 114 tokens per second), Cerebras, and the likes of Nvidia and AMD will further accelerate price reductions going forward.
  • Recommendations for AI Companies Developing LLM Applications: Given the projected decline in token prices, focus on creating valuable applications rather than solely optimizing costs. Even if current costs seem high, pursue aggressive development and deployment with an eye on future price drops. Regularly review and switch to different models or providers as new options become available.
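The "almost 80% per year" figure above is easy to sanity-check from the two price points Ng cites. A minimal calculation, assuming roughly 17 months between GPT-4's launch ($36 per million tokens, March 2023) and the $4 GPT-4o price point:

```python
# Sanity-check the "almost 80% per year" token-price decline.
start_price = 36.0   # USD per million tokens, GPT-4 at launch (March 2023)
end_price = 4.0      # USD per million tokens, GPT-4o today
years = 17 / 12      # assumed elapsed time, roughly 17 months

# Compound annual retention factor, then express as a percentage decline.
annual_factor = (end_price / start_price) ** (1 / years)
annual_decline_pct = (1 - annual_factor) * 100
print(f"Implied annual price decline: {annual_decline_pct:.0f}%")
```

The implied compound decline comes out just under 80% per year, consistent with the figure Ng quotes.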

Paradigms of generative AI development

I also believe that the sharp decline in token prices will contribute to more experimentation, development, and deployment of LLM and generative AI applications. The real winners will be operators with multi-LLM architectures who can rapidly deploy new applications that leverage AI's generative capabilities.

While cost is a factor, the key lies in balancing utility against cost, a challenging task in generative AI. Best practices for killer applications are still emerging, and risks like hallucination, bias, and privacy leakage must be managed. If handled poorly, these risks can harm both AI companies and society.

I believe the leaders in the generative AI market will be the companies that take advantage of the falling cost of LLM technology to build and operate applications that maximize its features and benefits, quickly and with sound risk management. I call this the "risk-based generative AI paradigm".

What perspectives do you think are needed in the generative AI market, including LLMs, to allow for more experimentation, development, and deployment, like Jevons' paradox?


It’s Labor Day in the US, and I, Ksenia, am navigating a family invasion. Today's editorial is brought to you by Ben Sum, our dedicated Korean partner at Turing Post. Thanks to him, Turing Post Korea thrives (subscribe here), and he'll be contributing more insightful opinion pieces to the main Turing Post as well.


Twitter Library


Weekly recommendation from an AI practitioner:

  • Check out OpenRouter and Not Diamond. Both help you manage access to different AI models: OpenRouter simplifies using various large language models through a single API, while Not Diamond helps connect and route between multiple AI models, supporting a more interconnected AI environment.
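The core idea behind such routers, and behind the multi-LLM architectures discussed in the editorial, can be sketched in a few lines: pick the cheapest model that is capable enough for the task at hand. The model names, prices, and the prompt-length difficulty heuristic below are illustrative placeholders, not real provider data or either tool's actual algorithm:

```python
# A minimal sketch of cost-aware routing across multiple LLMs, in the spirit
# of tools like OpenRouter and Not Diamond. All models and prices here are
# hypothetical examples.
MODELS = [
    # (name, USD per million tokens, rough capability score 0-1)
    ("small-fast-model", 0.20, 0.50),
    ("mid-tier-model", 2.00, 0.75),
    ("frontier-model", 10.00, 0.95),
]

def route(prompt: str, min_capability: float = 0.5) -> str:
    """Pick the cheapest model whose capability clears the task's bar."""
    # Crude difficulty proxy: longer prompts demand a stronger model.
    if len(prompt) > 500:
        min_capability = max(min_capability, 0.9)
    elif len(prompt) > 100:
        min_capability = max(min_capability, 0.7)

    eligible = [m for m in MODELS if m[2] >= min_capability]
    # Among capable models, choose the cheapest per token.
    name, _price, _cap = min(eligible, key=lambda m: m[1])
    return name

print(route("Summarize this sentence."))  # short task: cheapest model wins
print(route("x" * 600))                   # long task: frontier model wins
```

Real routers replace the length heuristic with learned difficulty estimates and live price/latency data, but the cost-versus-utility trade-off is the same one.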


News from The Usual Suspects

  • Anthropic's System Prompts and Artifacts

Anthropic's system prompts reveal why Claude avoids “I’m sorry” intros, prefers markdown for code, and might even offer you a piecemeal approach for long tasks. Designed to be curious, yet careful, Claude steers clear of identifying human faces, sticking to facts over faces, and keeps mum about itself unless asked.

Anthropic is also rolling out Artifacts to all users, turning chats into interactive creations like code diagrams and dashboards. The user-experience revolution keeps unfolding.

  • Gemini’s New Gems

Google’s Gemini introduces Gems, customizable AI experts to assist with everything from coding to career advice. With the Imagen 3 model’s upgraded image generation, Google’s AI is shaping up to be a gem for personal and professional use. Google is also rolling out three experimental models: a new, smaller Gemini 1.5 Flash-8B; an improved Gemini 1.5 Pro (better at coding and complex prompts); and an improved Gemini 1.5 Flash.

Google is also stepping up its efforts for the upcoming U.S. elections, focusing on providing reliable information through Search, YouTube, and Google Play (e.g., monitoring abuse trends, using AI to detect misinformation, and increasing security for high-risk users).

  • Meta's Llama Stampede

Meta's Llama models are galloping ahead with 350 million downloads, showing a staggering 10x growth. From enhanced customer care to 60,000 derivative models on Hugging Face, Llama’s reach is no tall tale. Mark Zuckerberg is leading the herd, riding that llama to success.

  • Microsoft’s Brainwave

Microsoft’s new AI innovations, like CircuitNet and Spiking Neural Networks, are straight out of a sci-fi flick. Mimicking the brain’s efficiency, these tech wonders promise better AI performance with fewer resources. Microsoft is clearly aiming for both brains and beauty in AI.

  • Cerebras Speeds Ahead

Cerebras (read its fascinating profile here) has taken AI speed to new heights, launching the fastest AI inference solution yet, handling up to 1,800 tokens per second. Its Wafer Scale Engine outperforms traditional GPU setups by a wide margin, making real-time AI a reality.

  • OpenAI’s Sweet Secret

OpenAI's mysterious "Strawberry" AI has piqued U.S. national security's interest, hinting at applications beyond chit-chat. Meanwhile, OpenAI eyes a hefty funding round, pushing its valuation past the $100 billion mark. Sweet success indeed.

  • Cohere Commands Business

Cohere's latest Command R series is optimized for business, offering speedy retrieval-augmented generation and multilingual capabilities. The enhancements aim to boost efficiency across industries, making it a go-to choice for enterprise AI needs.

  • New Unicorns in Town: Codeium and Magic AI

Codeium, now valued at $1.25 billion, raised $150 million for its AI-powered development tools. Meanwhile, Magic AI introduces ultra-long context models with 100 million token capabilities, pushing AI's boundaries even further.

  • Midjourney is Into Hardware

Midjourney is hiring for its hardware effort. What are they baking?


We are watching/reading


The freshest research papers, categorized for your convenience:

