OpenAI is enhancing its voice capabilities with the launch of Advanced Voice Mode for ChatGPT Plus and Team users.

This highly anticipated feature promises to transform user interactions with the chatbot into more natural, conversational experiences.

Powered by GPT-4o, OpenAI’s latest model, the voice mode integrates text, vision, and audio, resulting in faster and more fluid exchanges.
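
Advanced Voice Mode itself lives only inside the ChatGPT app, but for readers curious about how OpenAI exposes voice generation programmatically, the company's separate, publicly documented text-to-speech endpoint gives a rough feel for its voice stack. The snippet below is a minimal sketch using the official openai Python SDK; the "tts-1" model and "alloy" voice are existing public API options and are not the same models or voices that power Advanced Voice Mode.

```python
# Minimal sketch: generate spoken audio with OpenAI's public
# text-to-speech API (a separate product from Advanced Voice Mode).
# Assumes the official `openai` Python SDK and an OPENAI_API_KEY
# environment variable; "tts-1" and "alloy" are public API options,
# not the models or voices behind Advanced Voice Mode.
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.audio.speech.create(
    model="tts-1",            # public text-to-speech model
    voice="alloy",            # one of the preset API voices
    input="Sorry I'm late.",  # the phrase OpenAI joked about
)

# Save the returned audio to an MP3 file.
response.stream_to_file(Path("sorry_im_late.mp3"))
```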

OpenAI announced via an official tweet:

"Advanced Voice is rolling out to all Plus and Team users in the ChatGPT app over the course of the week."

They also highlighted an amusing aspect of the feature, stating it can say “Sorry I’m late” in over 50 languages, a nod to the project's long development timeline.

Advanced Voice is rolling out to all Plus and Team users in the ChatGPT app over the course of the week.

While you’ve been patiently waiting, we’ve added Custom Instructions, Memory, five new voices, and improved accents.

It can also say “Sorry I’m late” in over 50 languages. pic.twitter.com/APOqqhXtDg

— OpenAI (@OpenAI) September 24, 2024

A Step Toward Seamless Conversations

OpenAI confirmed that the advanced voice feature is now available to users of its premium plans.

This innovation allows users to engage in more dynamic conversations, enhancing the overall interactive experience.

However, the feature is not yet available to users in the EU, Iceland, Liechtenstein, Norway, Switzerland, or the U.K., creating a geographical divide in availability.

OpenAI is rolling out its Advanced Voice feature to Plus and Team users with 5 voices, support for 50+ languages, and customizable instructions. It's unavailable in the EU and UK. https://t.co/KmJVD1rwPE

— StartupNews.fyi (@StartupNewsFyi) September 25, 2024

Originally announced in May, the new voice capability attracted considerable attention due to a voice option named Sky, which bore a striking resemblance to the voice of Scarlett Johansson in the 2013 film “Her.”

Following this revelation, legal representatives for Johansson sent letters to OpenAI, alleging that the company lacked the rights to use a voice so similar to hers.

Statement from Scarlett Johansson on the OpenAI situation. Wow: pic.twitter.com/8ibMeLfqP8

— Bobby Allyn (@BobbyAllyn) May 20, 2024

Here's the official statement released by Scarlett Johansson, detailing OpenAI's alleged illegal usage of her voice...

...read by the Sky AI voice, because irony. pic.twitter.com/cJDlnA0hTP

— Benjamin De Kraker đŸŽâ€â˜ ïž (@BenjaminDEKR) May 20, 2024

Consequently, OpenAI halted the use of the voice in its products, as reported by CNBC.

Is it just me, or does @OpenAI's updated voice in this @ChatGPTapp demo still sound strikingly similar to Scarlett Johansson? https://t.co/ovV78IpMqd

— Marty Swant (@martyswant) September 24, 2024

A Richer Voice Experience

In the months following the initial announcement, users could already interact with ChatGPT through the standard voice mode, with several voices available even on the free tier.

The advanced version, however, significantly improves responsiveness, allowing ChatGPT to pause and listen when it is interrupted mid-conversation.

Currently, users can choose from nine different voices and can customise their experience through the app’s settings.

OpenAI is rolling out Advanced Voice Mode (AVM), an audio feature that makes ChatGPT more natural to speak with and includes five new voices pic.twitter.com/y97BCoob5b

— TechCrunch (@TechCrunch) September 24, 2024

“Hope you think it was worth the wait,” remarked Sam Altman, OpenAI’s co-founder and CEO, in a post on X, reflecting the anticipation surrounding this feature.

advanced voice mode rollout starts today! (will be completed over the course of the week)

hope you think it was worth the wait đŸ„șđŸ«¶ https://t.co/rEWZzNFERQ

— Sam Altman (@sama) September 24, 2024

As competition intensifies, OpenAI finds itself in a rapidly evolving landscape of generative AI.

Google has recently launched its Gemini Live voice feature on Android devices, while Meta is expected to unveil celebrity voices accessible through its platforms, including Facebook and Instagram.

Navigating the New Feature

OpenAI’s Advanced Voice Mode is exclusively available to subscribers of its Plus, Team, or Enterprise plans, with the Plus tier starting at $20 per month.

One hour at this part of rollout for advanced voice from @OpenAI
16 one-hour accounts * $20/mo ChatGPT Plus sub * 12 mo/yr

Live the life as shown in Her for just under $4000/yr pic.twitter.com/t7xCUIrwzX

— Joe Fetsch 🔍⏾ (@Jtfetsch) September 25, 2024

To access this new feature, users need to ensure they have the latest version of the ChatGPT app installed on their devices.

Once access is granted, a notification will appear within the app, prompting users to proceed.

To initiate a voice chat, users can swipe right or tap the two-line icon in the app’s upper left corner to create a new chat.

A sound wave icon will appear next to the message text field and microphone icon, indicating that voice functionality is ready.

Once the user taps the icon, a short “bump” sound signals readiness, and the on-screen circle transforms into a dynamic blue and white animation.

Users can begin speaking, and they should expect a prompt response.

OpenAI has made strides in improving accents across various foreign languages and enhancing conversation speed.

If users wish for a change in delivery, they can request modifications, such as asking ChatGPT to speed up its speech or adopt a Southern accent.

Limitations and Use Cases

The advanced voice mode allows ChatGPT to assist users in various tasks, from narrating bedtime stories to preparing for job interviews or practising foreign language skills.

However, users should be aware that even paying subscribers are subject to usage limits.

After approximately 30 minutes of interaction, a notification reading “15 minutes left” appears at the bottom of the screen, signalling that daily access to the feature is capped, even though the exact limit is not spelled out.

As OpenAI continues to innovate and expand its capabilities, the introduction of Advanced Voice Mode signifies a crucial step in making AI interactions more engaging and lifelike.