Google's Gemini
Google launched Gemini today. Gemini 1.0 is a multimodal model trained on image, audio, video, and text data, and it is intended to generalize across data modalities (e.g., you give it an image, and it can return a text description of that image). Gemini comes in three sizes: Ultra, Pro, and Nano. Ultra is intended for the most complex and difficult tasks. Pro is designed for performance and deployability (sort of akin to the enterprise tier). Nano powers on-device applications. The Nano models are smaller models distilled from the larger Gemini models, and they are quantized for on-device deployment.
The evaluation against other multimodal models shows how fast the entire industry continues to move. It feels like foundational models are yesteryear's news, and multimodal is the present. Who knows what comes next. All of this is happening even though building large foundational models is extremely challenging, and multimodal foundational models are harder still because they must handle several kinds of data at once.
Below is the evaluation from the technical paper. Gemini is good. GPT-4 is good. I expect to see further improvement in all of these models as they continue to be trained and as machine learning researchers continue to incorporate new learnings and tweaks.
Further, we can see the power of the large models as well as the performance degradation in the smaller Nano model. This is expected. Note that this doesn't mean that the Nano model or other smaller models are bad. It means that the market now has more foundational models to cater to different use cases and deployment setups. I like that Google gives this comparison across models and tasks.
Congratulations to the Google DeepMind team and to all the friends and students who have contributed to this effort in some way.
Today's announcement also gives startups in the space much to think about. I have written about AI defensibility before, and it is imperative for startups to dig in and consider where their edge is against large platform players.