The Subtle Changes
Ray Villalobos
Generative AI, Prompt Engineering and Full Stack Development. LinkedIn Top Voice. Senior Staff Instructor at LinkedIn, Instructor at Stanford University.
Many important changes go unnoticed, but they can be indicative of AI's future direction. Some of these don't seem great at first, but stick with me, I'll show you where we're going.
For example, OpenAI recently changed the context window sizes across their tiers, reducing it from 128k tokens to 8k in the free tier and 32k in the Plus tier. They didn't make a big deal out of it, but it dramatically changed how the tool works.
I manually added the context window and the cutoff dates. You can also find the context window data on this page, but you'll have to scroll down to the bottom.
Failure to Communicate
The other thing you might be noticing more and more is LLMs like ChatGPT failing to do certain things. A lot of this is due to companies being sued for copyright infringement, or being afraid of getting sued for it.
This is a bit disappointing. I didn't ask for, or want, copyrighted toy brands, just recognizable toys, and it should have given me something similar. But between regulation and lawsuits, you're going to see more of this in the future.
I'm Tired Boss
The other thing I'm seeing is a lot of prompts returning a lazy version of what I asked for.
I was asking Claude to look through a PDF and create a web page from the data in its table of contents. It did a pretty good job, but it only generated the first two items and then gave up... By the way, if you want to know how to solve this, check out my latest course, Getting Started with Claude.
This section on using structured data shows how I fixed it. The important thing, though, is that it's often easier for a tool to quit early or tell you no. What we really need is what's coming next.
What's Next
What I really need is a tool that will do what I had to coax Claude into doing: completing a complex task at scale. The concept isn't new; some of it was pioneered by a tool called AutoGPT, which got a lot of press when it was released but didn't gain much traction because it wasn't especially useful.
One tool that does this well is ChatGPT Plus's Code Interpreter (the Advanced Data Analysis tool). It has been around in ChatGPT for quite a while, and what I really like is that when you ask it to take care of something and it fails at the task, it just tries again.
It's clear that ChatGPT can do this, but OpenAI seems to have slowed down releases recently, so the biggest news has been about companies surpassing its capabilities.
It's true: Claude 3 is now at least as good as, if not better than, GPT-4 Turbo. What's surprising is that the smaller model, Haiku, is almost as good and much faster.
However, OpenAI's most underrated product, the GPT store, is a step toward monetizing agentive products. In the future, these are going to take care of more complex tasks with a lot of accuracy, multiplying what humans are capable of producing. The world is really going to change then.
Although I expect layoffs from companies eager to save on salaries, new jobs will eventually arise, and companies will be just as bloated as before once consumers learn to expect new levels of functionality from AIs and humans working together.
Google Surprised Me
One thing I didn't expect on the way to AGI (Artificial General Intelligence) is the move toward massive token context windows: basically, how much an AI can hold in its working memory at one time. Gemini has definitely been leading here, offering a one-million-token context window releasing in May, but available for free in preview now. It's something I'm testing, and I'm releasing a course on it soon.
Although this makes extremely large documents searchable and easy to process with AI, right now it's a bit pricey at $7 per million input tokens and $21 per million output tokens. Thankfully, as computing power explodes, I'm sure these prices will come down. Claude can already handle larger context windows, and Google's Gemini has been tested up to 10 million tokens.
There is a market for huge context windows. But I do think the next step is agentive, so be on the lookout for whatever these companies announce soon.