Thoughts on Gemini
After much anticipation, Gemini was announced this week (on 6-Dec). There were rumors that it was being pushed to 2024, so this is a welcome move.
Firstly, I am happy that there is now a credible alternative to OpenAI. Yes, Anthropic has been around too, but it does only some things well, not all. And their rate limits suggest they are still figuring out scaling. Google does have the engineering chops to scale (they are the ones helping Anthropic scale anyway); what they did not have was a good model. Do they have it now, with Gemini?
The benchmarks suggest they are on par. Yes, slightly better, but given the narrow margins, it makes sense to read them as similar (statisticians, anyone?). It is indeed strange that all the scores are just a wee bit better than GPT-4. This means one of two things:
Google is intentionally playing down the full capabilities so they can give themselves some room to show ongoing improvements. If they show a big leap right now, they risk provoking OpenAI, and given they are just about to catch up in this race, they don't want to give OpenAI unnecessary motivation.
Of course, the other and much simpler explanation is that the model is actually not as good as advertised, and they could barely match the performance by torturing certain levers (CoT@32 vs 5-shot, using older versions of GPT-4, etc.).
Reality is perhaps somewhere in between or some combination of the two.
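On the "narrow margins" point, here is a rough sketch of why a one-point gap can be read as a tie. The accuracies and question counts below are hypothetical numbers I made up for illustration, not figures from any Gemini or GPT-4 report, and the function name is my own. It also treats the two scores as independent samples, which is a simplification, since both models are usually scored on the same questions.

```python
import math

def accuracy_diff_interval(acc_a, acc_b, n, z=1.96):
    """Approximate 95% interval for acc_a - acc_b.

    Assumes each model was scored on n questions and treats the two
    scores as independent samples (normal approximation).
    """
    # Standard error of the difference between two proportions
    se = math.sqrt(acc_a * (1 - acc_a) / n + acc_b * (1 - acc_b) / n)
    diff = acc_a - acc_b
    return diff - z * se, diff + z * se

# Hypothetical scores: 90% vs 89% on a 1,000-question benchmark
low, high = accuracy_diff_interval(0.90, 0.89, 1000)
print(f"difference is between {low:.3f} and {high:.3f}")

# The interval includes zero, so this gap alone does not separate the models
print("could be a tie:", low < 0 < high)
```

With these made-up numbers the interval straddles zero. The same one-point gap would only become convincing on a much larger test set, which is why the size of a benchmark matters as much as the headline score.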
Their demo videos have been very well made. Quite impressive. But critics were quick to point out that these demos are “fake”. The interaction shown is not what was actually fed to the model, and the actual prompts, when reviewed, make it less impressive, …
Some of this appears nitpicky to me. Yes, Google could have been more forthcoming in how it presented the benchmarks and the demos. But they were in fact transparent: they included all the details in the technical report and in blog posts on how the demos were made. So they deserve credit, not criticism. I guess people expected more from Google. After all, Sergey Brin is one of the co-authors of the paper! Anyway, I would say that the model is impressive nevertheless, even if it only matches GPT-4 and doesn’t beat it… yet.
AlphaCode 2 is another silent drop that deserves attention. On the surface, it appears to be a significant step up over the previous version and does significantly better on a lot of Codeforces tasks. More on this later, in a separate post.
While impressive, is this all still incremental? Probably. Hallucinations, prompt injection attacks and other threats still remain. Gemini models are not fundamentally different, so they can’t really reason either. Multimodal capabilities are still not great. So perhaps this is an incremental step that doesn’t really take the field to the next level. But if, like me, you have been working in this field for the past 2 years, you would welcome something that is more robust, even if incremental, rather than yet another disruption.
Oh, and conveniently, Ultra will be available only in 2024. So Google is also still figuring out the scaling. Again, this led some critics to say Google just shipped a blog post, because the Pro and Nano models are also available only next week. They are being a bit too harsh. OpenAI, for example, released GPT-4 Turbo a month back and has yet to announce GA without rate limits. Give me a break.
Research -> Business. Now that the model is out, the next step is to monetize it. Making money with Generative AI has been a difficult problem all along. Google had better models in the past too (vision, Dialogflow, …), but adoption of their cloud platform has been their Achilles heel. It will be interesting to see if and how Google overcomes this problem. They seem to be focusing first on improving their own services and products: Pro powering Bard and G Suite, Nano powering Pixel, and then the bigger question of evolving search itself. I am not sure how they are going to enable the broader developer community with APIs, or how they will attract those developers to GCP.
All in all, I love that we have another alternative to GPT-4 that can actually match its capabilities. We can debate performance numbers, but they will only improve going forward. This gives confidence to other companies and, more importantly, to the OSS community, and I am sincerely hoping that OSS models can soon match these models’ capabilities. Here is to hoping the field evolves steadily, making the capabilities accessible to everyone!