Issue 18: A week to remember: Sora, Gemini, Nvidia
Source: OpenAI


It’s becoming impossible to keep up with the pace of innovation with just one newsletter per week. This week was huge. There were three distinct “wow” moments for me; I am sure there were different ones for many of you, depending on whether you are more interested in the tech, business, hardware movements, or other areas…

My three big announcements were:

1. OpenAI announced Sora

2. Google announced Gemini 1.5

3. Nvidia surpassed Amazon and Alphabet in market cap to claim 3rd place


OpenAI announced Sora:

On February 15, 2024, OpenAI announced Sora, a new generative AI model that can generate high-definition videos up to 60 seconds long from text prompts.

Sora, which means “sky” in Japanese, was made available to a select community of researchers (which includes some filmmakers and creative studios), to capture their feedback and assess scenarios where it can be misused.

The blog post announcing this on the OpenAI site says Sora has a deep understanding of language, such that it can interpret complex prompts and generate complex scenes with realistic characters, emotions, and accurate details of the subject and background. According to the post, the model has a very strong understanding of how things exist in the physical world and can generate very real, very believable footage.

In the hours that followed the announcement, Sam Altman solicited ideas from users on X and generated scenes live. They were pretty good, even for the bizarre requests. Here are some posts from Sam on X.

The model is an evolution of DALL·E, OpenAI's image generation model. For video generation, Sora uses a diffusion model architecture, which represents videos and images as collections of smaller units of data called patches. OpenAI describes Sora as a foundation for models that can understand and simulate the real world. This, OpenAI believes, is an important milestone on the path to AGI (Artificial General Intelligence), which may be GPT-5, or very close to it.

This model comes with mixed feelings and reactions. The high quality of the videos, and the fact that they were generated in literally minutes from a few lines of text description, is quite alarming. This has the potential to disrupt massive industries: the creative industry and the entire movie industry. What is the role of teams of editors, expensive studios, live actors and sets in this new world of Sora? What is the impact on industries like advertising, or fields such as graphic design, game development, or even edtech? In just one day since the announcement, there has already been speculation. Forbes published its point of view here:

OpenAI Reveals ‘Sora’: AI Video Model Capable Of Realistic Text-To-Video Prompts (forbes.com)



Sora has competitors racing to catch up too. Its rivals range from startups such as Runway (Gen-2), Pika Labs and Stability AI to Google's Lumiere.

Then there are risks that are still being understood. Experts have expressed concerns over realistic AI-generated videos being used to influence elections, spread propaganda or false videos, and play with people’s sentiments. The impact can be dangerous or even criminal if the technology is used in terrorism or pornography, for instance. As soon as ‘text to speech’ technology catches up, AI-generated video clips will be indistinguishable from any human-created visual entertainment. Will the latter gradually lose its value and eventually become obsolete? There are real risks to jobs and markets, and to video on any platform: television channels, streaming platforms, social channels… everything.

This is a time I am wishing for innovation to slow down. Sora is not yet available to the public, and I hope we have time to fully understand the risks, consequences and necessary constraints before it is made generally available.

This video captures the reactions quite well.

OpenAI's NEW AI "SORA" Just SHOCKED EVERYONE! (Text To Video)


Google announced Gemini 1.5

Also on February 15, 2024, Google announced Gemini 1.5, in quick succession after announcing Gemini 1.0 Ultra last week.

The first Gemini 1.5 model being released for early testing is Gemini 1.5 Pro. Though it seemed like an incremental announcement, this was packed with two ground-breaking features:

  1. First, the long-context understanding. In natural language processing (NLP), a context window refers to “a defined span of words within a text sequence that is utilized to extract contextual information”. The sequence can be a combination of words, images, videos, audio or code. The bigger a model’s context window, the more information it can take in and process in a given prompt, making its output more consistent, relevant and useful. The Gemini 1.0 context window is 32k tokens, while the Gemini 1.5 Pro context window can go up to 1 million tokens. This is massive. It translates to 1.5 Pro being able to process 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code, or over 700,000 words in a single prompt. This truly is a huge leap forward.
  2. Second, another monumental leap in AI is the advanced “in-context learning”. The model can learn new skills or knowledge directly from its prompts without the need for additional fine-tuning.
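
To put the context window numbers above in perspective, here is a rough back-of-the-envelope sketch. It is not an official Google calculation: it simply assumes that the ratio implied by the figures quoted here (about 700,000 words per 1 million tokens) also holds at other window sizes. Real tokenizers vary by language and content, and the function names are made up for illustration.

```python
# Figures quoted in the newsletter text.
GEMINI_1_0_CONTEXT_TOKENS = 32_000
GEMINI_1_5_PRO_MAX_CONTEXT_TOKENS = 1_000_000
WORDS_AT_MAX_CONTEXT = 700_000

# Assumption: the words-per-token ratio implied by the 1M-token figure
# applies at every window size. This is a rough rule of thumb only.
WORDS_PER_TOKEN = WORDS_AT_MAX_CONTEXT / GEMINI_1_5_PRO_MAX_CONTEXT_TOKENS


def approx_words(context_tokens: int) -> int:
    """Rough number of words that fit in a window of `context_tokens` tokens."""
    return round(context_tokens * WORDS_PER_TOKEN)


def fits_in_window(word_count: int, context_tokens: int) -> bool:
    """True if a text of `word_count` words should fit, under the ratio above."""
    return word_count <= approx_words(context_tokens)


if __name__ == "__main__":
    growth = GEMINI_1_5_PRO_MAX_CONTEXT_TOKENS / GEMINI_1_0_CONTEXT_TOKENS
    print(f"Context window growth: {growth:.2f}x")
    print(f"32k tokens ~ {approx_words(GEMINI_1_0_CONTEXT_TOKENS):,} words")
    print(f"1M tokens  ~ {approx_words(GEMINI_1_5_PRO_MAX_CONTEXT_TOKENS):,} words")
```

Under this assumption, a 500,000-word document would fit in the 1 million token window but not in the 32k window, which is the practical difference the jump in context size makes.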

Here is a visual representation of the comparison with the previous release of Gemini as well as other models. It is astounding.

Source: Google

Introducing Gemini 1.5, Google's next-generation AI model (blog.google)

https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#sundar-note


Nvidia domination

On February 13th and 14th, Nvidia overtook Amazon and Alphabet, respectively, and claimed the position of the third most valuable company, with $1.8 trillion in market cap. Nvidia is benefiting from the AI race, controlling about 80% of the high-end AI chip market. It is trading at around 34 times expected earnings, with adjusted net profit surging over 400% to $11.38 billion.


NVIDIA is also building its own AI ecosystem. It invested in 14 AI companies in 2023 and is tracking more than 8,500 AI startups through its Inception AI program. These startups come from 90 countries and have raised over $60 billion, making NVIDIA’s investment strategy quite broad and impactful across the entire tech sector.




References

OpenAI is only granting access to red teamers who will assess potential risks associated with the model’s release.

OpenAI just shifted the generative AI battle to Hollywood with its new AI text-to-video product | Fortune

Unveiling the Surprising Capabilities of Google GEMINI 1.5: A Game Changer in AI Technology (tammy.ai)

Google DeepMind Gemini – Dr Alan D. Thompson – Life Architect

Nvidia replaces Alphabet as Wall St's third most valuable company | Reuters


Coming Up Next:

Please watch this list... I have an active queue of newsletters in draft, and every week I get new topics, so this list is pretty dynamic. I will publish them in the order they are finished. I couldn't be more excited about the volume of topics and the immense learning here.

  1. AI-griculture - the role of AI in Agriculture
  2. To Build or Buy: That is the question
  3. Deep dive in BFSI
  4. The rise of Small Language Models
  5. Where is Apple? Slow and steady wins the race?
  6. The environmental cost of AI
  7. The convergence of AI, Immigration and Privacy
  8. Decoding ML, DL, LLM, AGI, and the world of AI
  9. Rhythms | A new AI-powered operating system to transform the future of work
  10. AI in Cybersecurity | Auth0/Okta
