OpenAI Unveils Hyper-Realistic Voice Feature for ChatGPT Plus Users
ChandraKumar R Pillai
Top AI Voice | AI & Blockchain Expert | Tech Advisor | Leadership Insights
OpenAI’s Advanced Voice Mode: A Leap in Hyper-Realistic AI Interaction
OpenAI is once again at the forefront of AI innovation with the introduction of ChatGPT’s Advanced Voice Mode. The new feature offers hyper-realistic audio responses that closely mimic human speech and promises to change the way we interact with AI. The alpha version is rolling out first to a select group of ChatGPT Plus users, with a gradual rollout to all Plus users planned for this fall.
The Journey to Advanced Voice Mode
OpenAI first showcased the capabilities of GPT-4o’s voice in May 2024, stunning audiences with its quick and lifelike responses. One voice, dubbed Sky, bore an uncanny resemblance to actress Scarlett Johansson’s voice in the movie “Her.” This led to controversy, as Johansson had reportedly declined multiple requests from OpenAI to use her voice. OpenAI denied using Johansson’s voice but removed Sky from the demo, and the release of Advanced Voice Mode was later delayed to improve safety measures.
What Sets Advanced Voice Mode Apart?
Unlike the previous Voice Mode, which relied on three separate models to convert voice to text, process the prompt, and convert text back to voice, Advanced Voice Mode uses GPT-4o’s multimodal capabilities to handle all three steps in a single model. This integration allows for significantly lower-latency conversations. It also lets the model pick up on emotional intonation in the user’s voice, such as sadness or excitement, and even recognize singing.
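To make the difference concrete, here is a toy Python sketch. It is purely illustrative and does not reflect OpenAI’s actual implementation: the `Audio` class and every function name are hypothetical stand-ins. It shows why a three-model chain loses the speaker’s tone at the transcription step, while a single model that receives the audio directly can still respond to it.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Toy stand-in for a voice clip: the words plus how they were said."""
    words: str
    tone: str  # e.g. "sad", "excited", "neutral"


# --- Cascaded pipeline: three separate (hypothetical) models ---

def speech_to_text(clip: Audio) -> str:
    # Transcription keeps the words and discards the tone.
    return clip.words


def text_model(prompt: str) -> str:
    # The language model sees only text, so it cannot react to tone.
    return f"Reply to '{prompt}'"


def text_to_speech(text: str) -> Audio:
    # Synthesis has no tone information left to work from.
    return Audio(words=text, tone="neutral")


def cascaded_voice_mode(clip: Audio) -> Audio:
    # Three hops, each one a separate model call.
    return text_to_speech(text_model(speech_to_text(clip)))


# --- Single multimodal model (hypothetical stub) ---

def multimodal_voice_mode(clip: Audio) -> Audio:
    # One model receives the audio directly, so the tone is still available.
    reply = f"Reply to '{clip.words}' (heard as {clip.tone})"
    return Audio(words=reply, tone=clip.tone)


if __name__ == "__main__":
    clip = Audio(words="I missed my flight", tone="sad")
    print(cascaded_voice_mode(clip))
    print(multimodal_voice_mode(clip))
```

In the cascaded version every request also pays for three model calls in sequence, which is one plausible reason the single-model approach can respond faster.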
Gradual Rollout and Safety Measures
OpenAI is taking a cautious approach with the release of Advanced Voice Mode. Initially, a small group of ChatGPT Plus users will have access, and the feature will be gradually made available to all Plus users by the end of the year. Users in the alpha group will receive notifications through the ChatGPT app and detailed instructions via email.
To ensure the safety and reliability of this feature, OpenAI has conducted extensive testing with over 100 external red teamers across 45 different languages. A report detailing these safety efforts is expected in early August. Additionally, Advanced Voice Mode will be limited to four preset voices (Juniper, Breeze, Cove, and Ember), created in collaboration with paid voice actors.
Avoiding Deepfake Controversies
In light of past controversies surrounding deepfake technologies, OpenAI has implemented measures to prevent ChatGPT from impersonating individuals or public figures. The AI will also block requests to generate copyrighted audio, aiming to avoid legal issues similar to those faced by AI startups like ElevenLabs, Suno, and Udio.
The Future of AI Interaction
OpenAI’s Advanced Voice Mode is a significant step forward in making AI interactions more natural and human-like. By combining advanced voice synthesis with robust safety measures, OpenAI is setting a new standard for AI technology.
Critical Questions for Discussion
1. How will hyper-realistic AI voices impact user interaction and trust in AI technologies?
2. What are the potential ethical implications of AI-generated voices that closely mimic human speech?
3. How can companies balance innovation with the need to prevent misuse of AI technologies?
4. What additional safety measures should be implemented to protect against deepfake abuses?
5. How will the introduction of Advanced Voice Mode influence the competitive landscape of AI voice technologies?
The release of ChatGPT’s Advanced Voice Mode is poised to reshape the landscape of AI interaction. Share your thoughts and insights on this groundbreaking development. Engage with us and let’s discuss the future of AI together.
Join me and my incredible LinkedIn friends as we embark on a journey of innovation, AI, and EA, always keeping climate action at the forefront of our minds. Follow me for more exciting updates: https://lnkd.in/epE3SCni
#AI #VoiceTechnology #OpenAI #ChatGPT #AIInnovation #TechNews #ArtificialIntelligence #DigitalTransformation #AIRegulation #LinkedInDiscussion
Sources: TechCrunch; OpenAI
MD; MBA; Civil Eng.; LL.B.; Organic Farmer; Nonprofit Consultant Cybersecurity; 1st Officer Merchant Marine; Ethical Hacker; AI (Photo, Video, Post, Investigation).
2 months ago: Very informative
Senior Managing Director
2 months ago: ChandraKumar R Pillai Very interesting. Thank you for sharing.
Chief Data Science Officer | AI & ML Leader | Data Engineering Expert | CXO Incubator | Top 100 AI Influential Leader by AIM | AI Thought Leader: Responsible AI, Executive AI Leadership, and Generative AI Innovation
2 months ago: Great information. I'm wondering whether this will reduce the usage of NLP APIs for voice-to-text and language translation. I'm also curious about its efficacy: does it translate or transliterate before passing to the RAG to elicit a coherent response? Eager to use the Plus model.
Visionary Thought Leader | Top Voice 2024 Overall | Awarded Top Global Leader 2024 | CEO | Board Member | Executive Coach | Expert Keynote Speaker | 21 X Top Leadership Voice LinkedIn | Relationship Builder | Integrity | Accountability
2 months ago: OpenAI's Advanced Voice Mode is truly remarkable, paving the way for more immersive AI interactions. Your insights on the subject are always enlightening, ChandraKumar R Pillai.
CEO/Principal: CERAC Inc. FL USA | Consortium for Equitable Research, Analysis & Communication
2 months ago: Thank you for sharing that information. Have a blessed day!