Google's Gemini

Google launched Gemini today. Gemini 1.0 is a multimodal model trained on image, audio, video, and text data, and it is intended to generalize across modalities (e.g., given an image, it can return a text description of that image). Gemini comes in three sizes: Ultra, Pro, and Nano. Ultra is intended for the most complex and difficult tasks. Pro is designed for performance and deployability (roughly akin to an enterprise tier). Nano powers on-device applications: the Nano models are smaller, distilled from the larger Gemini models, and quantized for on-device deployment.

The evaluation against other multimodal models shows how fast the entire industry continues to move. It feels like foundational models are yesteryear's news, and multimodal is the present. Who knows what comes next. This comes against the backdrop that building these large foundational models is extremely challenging, and multimodal foundational models are harder still because they must handle several data modalities at once.

Below is the evaluation from the technical paper. Gemini is good. GPT-4 is good. I expect to see further improvement in all of these models as they continue to learn and as machine learning researchers incorporate new learnings and tweaks.

Source: Gemini: A Family of Highly Capable Multimodal Models

Further, we can see the power of the large models as well as the performance degradation on the smaller Nano model. This is expected. Note that it doesn't mean the Nano model or other smaller models are bad. It means that the market now has more foundational models to cater to different use cases and deployment structures. I like the fact that Google gives this comparison across models and tasks.


Source: Gemini: A Family of Highly Capable Multimodal Models

Congratulations to the Google DeepMind team and all the friends and students who have contributed to this effort in some way.

Today's announcement also gives startups in the space much to think about. I have written about AI defensibility before, and it is imperative for startups to dig in and consider where their edge is against large platform players.


Meryl Moss

President Meryl Moss Media Group--Publicity, Marketing and Social Media / Publisher BookTrib.com and CEO Meridian Editions

4 months ago

Joyce, thanks for sharing! How are you doing?

Daniel Flügger

Founder @ Trusound | Google certified engineer | deep tech | cloud | pranic healing advocate | feat. in Arch. Digest, BBC, Elle Decor, NPR | social entrepreneurship | impact investing | AI

1 year ago

Great share Joyce! For startups and small to medium sized businesses it is truly a game changer.

João Pedro de Bragança

Generative AI Tech Lead @ C6 Bank | Google Professional ML Engineer | Google Professional Data Engineer

1 year ago

Very important day for AI, for Google, and for many startups. It's amazing how much technology has advanced in such a short time, undoubtedly a very interesting era to live in!

Aditya Shah

Senior ML Scientist @ Visa ● Speaker at Google AI For Social Good

1 year ago

Enlightening
