#AGI? ARE WE THERE YET?
Michael Minkevich
Technology Executive: from startups to corporate ventures and M&A | Stanford | MIT
Spoiler alert: No. Definitely not yet. We are slowly getting there, but not exactly the way most people think. Meanwhile, as we patiently wait for the advent of our robot overlords, there are some practical things related to our everyday business we should talk about. Especially if you are an #AI startup.
In fact, I started out writing a regular post, but I quickly exceeded the word limit and just couldn't stop, so here is a long read for you.
OpenAI and Google have just completed their much-anticipated showdowns. As the initial wave of public excitement ("Oh, look at those nice pics") subsided a little, I dare add my two cents. What do we have left when it's all said and done? Not much. Good incremental progress with no major breakthroughs. Forget about #AGI and concentrate on the more mundane implications.
The new GPT-4o and Gemini 1.5 (even the version names imply minor updates) look very similar, despite different architectures. Both support multimodality: voice generation, image processing, and live video streaming seem to be necessary default features from now on. Both are faster than their predecessors, as expected. Both are also, expectedly, cheaper, but it is tricky to do a head-to-head comparison across different inference scenarios and pricing plans. I am glad that access to the voice capabilities is limited for now - we already have far too many phone scammers. Their abilities to handle live video streams vary, but the difference is, again, marginal.
Perhaps the one notable difference is the context window size (128K tokens for GPT vs. 1M for Google). Which means that if you practice family law, you can now upload all the documents of some eight-year divorce litigation over a large estate in one batch and finally bring that painful process to a conclusion (or proceed the old-fashioned way and keep charging your fees for the next ten years). Get ready for long, long prompts. Claude 3 sported a 1M-token context window as of two months ago, so there is no doubt other #LLMs will catch up soon.
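To put those window sizes in perspective, here is a toy back-of-the-envelope check. It uses the common rough heuristic of ~4 characters per token for English prose; exact counts require the model's own tokenizer (e.g. OpenAI's tiktoken). The function names, the 4:1 ratio, and the reply reserve are my own illustrative assumptions, not anything from the vendors:

```python
# Rough check: does a pile of documents fit in a model's context window?
# Heuristic only (~4 chars per token for English); real counts need the
# model's actual tokenizer.

def approx_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(docs: list[str], window: int, reserve: int = 4096) -> bool:
    """True if all docs, plus room reserved for the model's reply, fit."""
    total = sum(approx_tokens(d) for d in docs)
    return total + reserve <= window

# ~1.4M characters of hypothetical legal filings (~350K tokens).
docs = ["filing " * 200_000]
print(fits_in_context(docs, window=128_000))    # False: too big for a 128K window
print(fits_in_context(docs, window=1_000_000))  # True: fits in a 1M window
```

The point of the sketch: a 128K window already swallows a few novels' worth of text, but a genuinely large case file still forces chunking or retrieval, while a 1M window can take it whole.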
But what does it all mean from a practical perspective?
1) Foundation #LLMs are effectively becoming a commodity. We have to admit it. Similar announcements from Meta, Anthropic, and Chinese mammoths such as Tencent, not to mention smaller, more enterprise-focused players such as Databricks, are just a matter of time. And, perhaps, Microsoft AI? These guys win under any possible circumstances. And don't forget the open-source crowd, like Mistral AI and Falcon.
2) Therefore, we should expect more attempts by the large players to climb up the food chain. Competing in a commodity market without clear differentiators is hard. Hence, more vertical integration: native chatbots, document management, translation, AI assistants, etc. I would not completely exclude the possibility that some killer use case may pop up all of a sudden. OpenAI is reportedly deliberating over the idea of allowing users to generate porn. I know that is exactly what the majority of internet citizens have been longing for, for years, but it would mean the end of humankind. OpenAI, please, please don't do that.
3) Climbing up the food chain also means building an ecosystem. OpenAI is trying to strike a deal with Apple; Google has its own vast platform covering everything from office productivity to mobile. Other #LLM vendors will certainly follow suit, and the major platforms will be behind many products in different markets.
4) The competition will trickle down to the device level. #EdgeAI will finally become a really big thing. Expect more attempts to make large models smaller and faster, and to make them work on resource-constrained devices beyond high-end smartphones.
5) The need for novel model architectures. Transformers are cool and helped humanity get rid of website content creators and bloggers (not completely), but there are a few alternatives. For instance, #RNNs are now experiencing a real renaissance. I already posted about #KAN. I will cover alternative model architectures in depth in a separate article.
6) Specialized hardware. Software engineering tricks for model optimization have their limits. Soon we will see more specialized hardware for both model training and inference, especially for fast real-time edge scenarios, along with more novel ASIC architectures and system designs. Personally, I am a big fan of the neuromorphic approach. New hardware, in turn, drives new model design and optimization techniques.
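To make points 4 and 6 a bit more concrete, here is a deliberately minimal sketch of one of the standard model-shrinking tricks: symmetric post-training int8 quantization, where float weights are mapped to 8-bit integers plus a single scale factor. Real toolchains do per-channel scales, calibration, and much more; the function names and toy weights below are my own illustration:

```python
# Toy symmetric per-tensor int8 quantization: w ≈ q * scale, q in [-127, 127].
# Storing q (1 byte) instead of w (4 bytes) is the basic 4x memory win
# that makes large models more palatable for edge devices.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to int8 codes plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero tensor
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the codes."""
    return [v * scale for v in q]

w = [0.42, -1.27, 0.0, 0.9]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q)    # [42, -127, 0, 90]
print(err)  # reconstruction error, bounded by scale/2 (here ~0)
```

The trade-off the sketch exposes is exactly why point 6 matters: cheap int8 arithmetic only pays off if the silicon executes it natively, which is what inference-oriented ASICs are built for.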
Meanwhile, getting back to my original point: what does it all mean for you if you are an AI startup that builds on the major LLM platforms? Bad news. With each new release of the major LLMs, I see early-stage companies that did nothing more than stick a nice UI onto one of the foundation models dying in droves (or trying to pivot in panic). For reference, check the picture above. That is what their business looks like. Strategically. Don't be that startup.
How can you differentiate yourself and not be eaten alive by the majors? In my opinion, there are four options:
Indeed, if you have (more than sufficient) funding and a think tank full of data scientists, you can always try to build your very own LLM. Go for it - we love competition here in America! :)