OpenAI Expands ChatGPT's Capabilities with Voice and Image Integration

Zamir Khotov

Backend: PHP, Laravel, Go. Frontend: JavaScript, React

Published: September 26, 2023

OpenAI's ChatGPT, the groundbreaking generative AI assistant, is taking a giant leap forward. Today, OpenAI announced the integration of voice and image-based functionalities, transforming ChatGPT from a text-only chatbot into a versatile conversational companion.

Since its launch roughly ten months ago, ChatGPT has captured the imagination of users worldwide by enabling them to generate essays, poems, and summaries from simple text prompts. Now it is adding voice support, so users can hold spoken conversations with the AI assistant.

This announcement coincides with Amazon's commitment to invest up to $4 billion in Anthropic, a rival to OpenAI. This underscores the fierce competition among tech giants, with Google's Bard chatbot, Meta's open-source approach, and Microsoft's partnership with OpenAI all vying for supremacy in the generative AI landscape.

A New Era in Generative AI

Today's development represents a significant milestone in the evolution of generative AI. OpenAI is bridging the gap between voice-based assistants and its powerful large language models (LLMs).

With this advancement, users can now verbally instruct ChatGPT to compose a bedtime story on the spot, guiding the narrative with vocal prompts. Alternatively, users can simply pose questions, and ChatGPT will respond verbally.

In addition to voice interaction, ChatGPT users will gain the ability to search for answers using images. For instance, they can upload a picture and ask ChatGPT to explain its content or provide instructions for a specific task.

The voice feature relies on a new text-to-speech model capable of generating lifelike voices from text input and a short audio sample. OpenAI collaborated with established voice actors to create five distinct voices. It also uses its open-source Whisper speech recognition system to transcribe users' spoken words into text.

Spotify joins this initiative as a launch partner, introducing an innovative feature for podcasters. It allows them to translate their shows from English into Spanish, French, or German while preserving their original voice. The launch is limited to a carefully selected group of podcasters, including Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.

OpenAI acknowledges the transformative potential of this voice technology in creative and accessibility-focused applications but also highlights the associated risks, such as the potential for malicious actors to impersonate public figures or engage in fraudulent activities.

These new features will become available to paying Plus and Enterprise subscribers over the next two weeks. To activate voice capabilities, users should open the "settings" menu in the app, go to "new features," and opt in to voice conversations. They can then select their preferred voice by tapping the headphone button in the top-right corner. Initially, voice functionality will be an opt-in beta in ChatGPT's Android and iOS apps, while image search will be accessible by default across all platforms.

The Daily Brief


