AI Newsletter - March Edition
AI Newsletter - Image generated by ChatGPT

Welcome to the March issue of our AI Newsletter, where we delve into the fascinating and rapidly evolving world of Artificial Intelligence. This edition is packed with ground-breaking developments, innovative applications, and glimpses into the future of the technology.

This is a quick summary of last month's top AI stories. Here's what we are going to cover:

  • Google has rebranded its AI service Bard to Gemini
  • Gemini can generate images, just not for UK users
  • New smartphone concept ditches apps for AI
  • You can now lip sync animated characters via PIKA
  • Text-to-video gets one step closer
  • Combining audio and portrait videos

From Google's rebranding of Bard to Gemini to the pioneering work of Alibaba's Emo, we're covering the spectrum of AI advancements that are reshaping our digital experience. Get ready to explore the dynamic intersection of technology, creativity, and practicality in the world of AI.

Google has rebranded its AI service Bard to Gemini

Google has rebranded its AI service ‘Bard’ to ‘Gemini’ and introduced a premium tier, Gemini Advanced. This tier runs on the Ultra 1.0 model, which offers enhanced capabilities for complex tasks like coding and can generate creative text formats such as poems, scripts, and emails.

Additionally, a new mobile app for Gemini and Gemini Advanced is being rolled out, providing easy access on Android and iOS devices. Gemini Advanced is part of the Google One AI Premium Plan, priced at £19/$19.99 per month with a two-month trial.

This premium plan also includes 2TB of storage and integration with other Google services, such as Gmail, Google Docs, Google Meet, and Google Slides. This integration is not yet available, but Google says it will be included in the Google One AI Premium plan once it launches.

Gemini can generate images, just not for UK users

Gemini can generate images, but this feature is not currently available to users in the European Economic Area (EEA), Switzerland, and the UK. Image generation also works only with English prompts.

Last month, Gemini generated inaccurate images of historical figures, which led to the temporary removal of the feature that generates people. That restriction is still in place, but there is hope that Gemini's image generation capabilities will be extended to UK users in the near future.

New smartphone concept ditches apps for AI

Deutsche Telekom is a German telecommunications giant and the biggest telecommunications provider in Europe. It recently unveiled a concept phone at MWC (Mobile World Congress) that ditches traditional apps in favour of an AI assistant.

The phone would use large language models (LLMs) to anticipate users' needs and simplify their digital experience. During the announcement, the company also pointed to a decline in smartphone sales and suggested that AI integration might be a way to revitalise the industry.

You can now lip sync animated characters via PIKA

PIKA is an innovative technology that focuses on synchronising lip movements with animated characters. This tool represents a significant advancement in the field of animation and virtual character design.

PIKA's core functionality lies in its ability to precisely match the lip movements of animated characters with spoken words or audio tracks. This synchronisation creates a more realistic and engaging experience for viewers, as the characters appear to be speaking naturally.

The technology behind PIKA involves complex algorithms that analyse audio inputs, breaking down speech into phonetic components. It then accurately maps these components onto the character's mouth movements. This process ensures that the animation is not only synchronised with the audio but also reflects the nuances of speech, such as emotional intonations and accent variations.

Users can apply this feature either through the text-to-audio generation feature or by uploading their own audio sample. Even those without extensive animation experience can create talking animations with PIKA's lip sync feature, opening up animation creation to a wider audience.

Text-to-video gets one step closer

OpenAI has recently announced Sora, an AI model that can generate realistic and imaginative video clips from written descriptions. Users would simply input a scenario, e.g. someone dancing in a field in the rain, and Sora could create a video of it up to 60 seconds long.

There is no official release date for Sora yet. However, OpenAI has released a video showcasing Sora's capabilities, which you can find on its website. The company states that it is prioritising safety measures before releasing Sora to the public, which is a reasonable and welcome step.

OpenAI's Sora represents a significant leap in AI-powered video generation, a milestone the company has hit before its competitors at Google. Considering the challenges Gemini is experiencing with image generation, it’s clear which AI model is ahead in this area.

Combining audio and portrait videos

"Emo," a ground-breaking technology developed by Alibaba Group, is redefining the boundaries of digital media and artificial intelligence. This innovative technology is designed to generate expressive portrait videos, complete with audio synchronisation. The essence of Emo lies in its ability to bring still images to life, transforming them into dynamic, expressive videos that are perfectly synced with an accompanying song or audio track.

Emo's technology involves a sophisticated process that begins with a simple input: a static image and a chosen song or audio piece. Using its advanced algorithms, Emo analyses the rhythm, tone, and nuances of the audio, while simultaneously interpreting the characteristics of the input image.

The technology then animates the image, creating realistic and synchronised lip movements that align with the audio. But Emo goes beyond mere lip-syncing; it also infuses the portrait with a range of facial expressions and subtle head movements, effectively capturing the emotions and mood conveyed by the audio.

What makes Emo particularly impressive is its ability to function under weak conditions, meaning it generates the video directly from the audio and a single reference image, without relying on intermediate 3D models or facial landmarks.

This accessibility opens up a realm of possibilities for various applications. For instance, in the entertainment industry, Emo can be used to create music videos or digital performances featuring historical or fictional characters, offering a new form of artistic expression without the need for technical editing skills.

Final thoughts for this month

The AI landscape continues to surprise and inspire with its relentless innovation and expansion. From Gemini's rebranding and its image generation limitations to Deutsche Telekom's AI-integrated phone concept, the trajectory of AI is clearly geared towards more intuitive, interactive, and imaginative uses.

Technologies like PIKA and Emo are not only pushing the boundaries of creative expression but are also making advanced capabilities more accessible to a wider audience. As we witness these exciting developments unfold, one thing is certain: the future of AI is bright, diverse, and full of potential for transforming how we interact with technology in our everyday lives.

Stay tuned for more updates in the next issue of our AI Newsletter, where we continue to explore the cutting edge of artificial intelligence.
