Issue 18: A week to remember: Sora, Gemini, Nvidia
Sharmilli Ghosh
Product Management | GTM | ISV & SI Partnerships | Startup Founder | Board Member | Investor |
It’s becoming impossible to keep up with the pace of innovation, with just one newsletter per week. This week was huge. There were three big distinct “wow” moments for me, I am sure there were different ones for many of you – depending on whether you are more interested in the tech or business or hardware movements o other areas…
My three big announcements were:
1.?????? OpenAI announced Sora
2.?????? Google announced Gemini 1.5
3.?????? Nvidia surpassed Amazon and Alphabet in market cap to claim 3rd place
?
OpenAI announced Sora:
?On February 15, 2024, Open AI announced Sora, a new generative AI model that can generate high-definition videos up to 60 second long, from text prompts.
Sora, which means “sky” in Japanese, was made available to a select community of researchers (which includes some film makers and creative studios), to capture their feedback and assess scenarios where it can be misused.
The blog post announcing this on the OpenAI site, says Sora has a deep understanding of language, such that it can interpret complex prompts, and generate complex scenes with realistic characters, emotions, and accurate details of the subject and background. The model has very strong understanding of how things exist in the physical world and can generate very real, very believable footage.
In the hours that followed the announcement Sam Altman soliticated ideas from live users on X, and generated scenes live. They were pretty good, even the bizarre requests. Here are some posts from Sam on X.
The model is an evolution from DALL·E, the image creator model. In video generation, the model uses a diffusion model architecture, which represents videos and images as collections of smaller units of data called patches. Sora serves as a foundation for models that can understand and simulate the real world. This, Open AI believes, is an important milestone in the evolution of AGI - Artificial General Intelligence, which may be GPT 5, or very close to it.
This model comes with mixed feelings/reactions. The high quality of the videos, and the fact that they were generated in literally minutes, from a few lines of text description, is quite alarming. This has the potential of disrupting massive industries – the creative industry, the entire movie industry. What is the role of teams of editors, expensive studios and live actors and sets in this new world of Sora? What is the impact on industries like advertising, or fields such as graphic design, game development?or even edtech. In just one day since the announcement, there have been speculations. Forbes published its point of view here:
Forbes on X: "OpenAI Reveals ‘Sora’: AI Video Model Capable Of Realistic Text-To-Video Prompts https://t.co/cPvOyNCnrg https://t.co/hwkrSEKoq5" / X (twitter.com)
Sora has competitors racing to catchup too. It's rivals range from startups such as Runway Gen-2, Pika Labs and Stability AI to Google Lumiere.
Then there are risks, that are still being understood. Experts have expressed concerns over AI-generated realistic videos being used to influence elections, spread incorrect propaganda or false videos, playing with people’s sentiments. There can be dangerous or even criminal impact, if used in terrorism or pornography for instance. As soon as ‘text to speech’ technology catches up, AI generated video clips will be indistinguishable from any human created visual entertainment – will the latter gradually lose their value and eventually becomes obsolete? There are real risks to jobs and markets. Videos in any platform – television channels, streaming platforms, social channels… everything
This is a time I am wishing for innovation to slow down. Sora is not yet available to the public and I hope we have time to fully understand the risks, consequences and necessary constraints before we do make it generally available.?
This video captures the reactions quite well.
Google announced Gemini 1.5
??Also on February 15, 2024, Google announced Gemini 1.5, in quick succession after announcing Gemini 1.0 ultra last week. ?
The first Gemini 1.5 model being released for early testing is Gemini 1.5 Pro. Though it seemed like an incremental announcement, this was packed with two ground-breaking features:
领英推荐
Here is a visual representation of the comparison with previous release of Gemini as well as other Models. It is astounding.
?
Nvidia domination
On February 13th and 14th, Nvidia overtook Meta and Alphabet, respectively, and claimed the position of the third most valuable company, with $1.8 trillion in market cap. Nvidia is benefiting from the AI race, controlling about 80% of the high-end AI chip market. It is trading at around 34 times expected earnings, with ?adjusted net profit surging over 400% to $11.38 billion.
?
NVIDIA is also building its own AI ecosystem. It invested in 14 AI companies in 2023, and is tracking more than 8,500 AI startups through its Inception AI program. These startups are from 90 countries and have raised over $60 billion making NVIDIA’s investment strategy is quite broad and impactful across the entire tech sector.
References
Unveiling the Surprising Capabilities of Google GEMINI 1.5: A Game Changer in AI Technology (tammy.ai)
?
Coming Up Next:
Please watch this list... I have an active queue of newsletters in draft and every week I get new topics - so this list is pretty dynamic. I will publish in the order as they are finished. I couldn't be more excited about the volume of topics and the immense learning here.
?
Founder & CEO at Applied Insights AI
9 个月The exponential age is upon us; three big new announcements this week - world changing AI developments.