OpenAI's Voice Mode is finally here!

OpenAI's Voice Mode is finally here!

Welcome, AI entrepreneurs & enthusiasts.

After a multi-month wait, OpenAI finally announced that Advanced Voice Mode is rolling out to all ChatGPT Plus and Team subscribers this week (outside of the EU).

But with no current o1 integration, image upload option, or live video, will it live up to the hype? Let’s get into it…

In today’s AI news:

  • OpenAI rolls out Advanced Voice Mode
  • Google releases production-ready models
  • Customize images fast with PuLID-FLUX
  • James Cameron joins Stability AI’s board
  • More AI & tech news


OpenAI rolls out Advanced Voice Mode

Image source: OpenAI

The News: OpenAI is finally rolling out an enhanced Advanced Voice Mode (AVM) to all ChatGPT Plus and Teams subscribers this week, featuring new voices and improved functionality to make AI interactions feel more natural and personalized.

The details:

  • The initial rollout for OpenAI’s new Advanced Voice Mode started in July, but it only ever reached a select few ChatGPT users.
  • During the delay, OpenAI updated its AVM to integrate Custom Instructions and Memory, allowing for more personalized interactions and conversation recall.
  • OpenAI also improved AVM’s ability to understand accents and claims smoother, faster conversations, while adding five new nature-inspired voices (and removing the “Sky” voice that sounded like Scarlett Johansson).
  • AVM will not yet be available in several regions, including the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein.

Why it matters: With OpenAI CEO Sam Altman writing about AI agents and superintelligence, ChatGPT Advanced Voice Mode feels more relevant than ever. If we’re going to interact with AI every day—it has to sound and feel human—which is exactly what AVM is attempting to accomplish.

Editors note: If you still don’t have access to Advanced Voice Mode on your ChatGPT app, try uninstalling and reinstalling the app.


Google releases production-ready models

Image source: Google

The News: Google just announced significant updates to its Gemini AI models, including performance improvements, cost reductions, and increased accessibility for developers.

The details:

  • Two new production-ready models came out today: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, offering improved quality across various tasks, including a 20% boost in math-related benchmarks.
  • Pricing for Gemini 1.5 Pro has been reduced by over 50% for both input and output on prompts under 128K tokens, while rate limits have been increased significantly.
  • The models boast 2x faster output and 3x lower latency compared to previous versions, with improvements in long context understanding and vision capabilities.
  • Google also updated its default filter settings, giving developers more control over model configuration for their specific use cases.

Why it matters: Google is iterating quickly and pushing the boundaries of affordability for developers building with AI. While this isn’t Gemini 2 — it is a significant upgrade over the experimental models and will help builders create faster, smarter, cheaper applications.


James Cameron joins Stability AI’s board

Image source: Midjourney

The News: James Cameron, the acclaimed director of Titanic, Avatar, and The Terminator, recently joined the board of directors at Stability AI, the company behind the popular Stable Diffusion text-to-image AI model.

The details:

  • Cameron, known for pushing technological boundaries in filmmaking, sees the convergence of generative AI and CGI as “the next wave” in visual media creation.
  • Stability AI’s CEO, Prem Akkaraju, formerly led visual effects company WETA Digital, highlighting the firm’s focus on creative applications of AI.
  • The move comes as Hollywood grapples with AI’s potential, with some studios embracing the technology while others express concerns over content rights.

Why it matters: Just days after Lionsgate teamed up with AI startup Runway to create a custom video generation model, this move by one of Hollywood’s biggest directors could signal a significant shift in how influential filmmakers are thinking about navigating AI.


Customize images fast with PuLID-FLUX

The News: Hugging Face’s new PuLID-Flux space offers a tuning-free solution for quick image customization with your own likeness using just one reference photo.

Step-by-step:

  1. Visit PuLID-Flux Hugging Face space (also available in Replicate)
  2. Upload your reference image in the "ID Image" section
  3. Play with the different parameters it offers and write a descriptive prompt for your desired output
  4. Click "Generate" and refine as needed

Pro tip: Adjust the "timestep to start inserting ID" parameter to balance fidelity and creativity.


Trending AI Tools

  • Adobe GenStudio - Helps marketing teams measure on-brand content
  • FactBot by Snopes - Fact-checking for urban legends and misinformation
  • JustPaid - Automate invoice follow-ups and payment tracking
  • Pathway - Helps product teams test UX solutions and gather insights
  • Tubit AI - AI that summarizes YouTube videos for a deeper understanding


QUICK HITS

Warner Bros. Discovery adopted Google Cloud’s AI for caption generation, aiming to cut production time and costs for unscripted programming.

Intel launched Xeon 6 processors and Gaudi 3 AI accelerators, doubling performance for AI workloads and offering improved price and performance compared to Nvidia’s H100.

OpenAI increased API access for o1 models, adding tier 4 to the list of authorized users at 100 requests per minute and upping tier 5 users to 1000 requests per minute.

Suno AI announced a new cropping feature available to AI-generated songs, allowing Pro and Premier users to adjust the start and end of their creations.

Duolingo introduced AI-powered Adventures mini-games and a Video Call feature to enhance language learning through immersive, practical experiences for its users.

Apple unveiled its plan to roll out Siri’s major AI-powered updates gradually, with the most significant enhancements expected in iOS 18.3, likely launching in January 2025.


Thank you for reading our newsletter! If you want to stay two steps ahead of the competition, subscribe to this newsletter. If you want to leave your competition in the past, hop on a quick, complimentary, no-obligation call with our team to explore our consulting and custom development services.

We've proudly worked with over 400+ companies to revolutionize their business with AI, and our team of 4,000+ developers, engineers, consultants, and experts are more than ready to help you take advantage of all the latest and greatest AI technology for your business.

Ready to get started? Book a Consultation today!

要查看或添加评论,请登录

社区洞察

其他会员也浏览了