Anthony Bourdain's AI Voice: Unethical or Advancement?
Satabdi Mukherjee
Content Marketing Lead | Content editor | B2B SaaS (HR tech, Martech), healthcare, e-learning
Celebrity chef Anthony Bourdain, famous for his TV shows No Reservations and The Layover, died by suicide in 2018.
I remember being shocked to hear that the smiling, witty man who had taken me on several culinary journeys through his television appearances had ended his life so tragically.
Morgan Neville, an Oscar-winning documentary filmmaker, released Roadrunner: A Film About Anthony Bourdain in July 2021. This documentary features three quotes spoken by an AI voice model of Anthony Bourdain, a fact that was not revealed to viewers.
As a result, social media has erupted in a furor over the ethics of cloning a dead man’s voice without his consent and without disclosing the artifice to viewers.
Ethics of voice cloning
Neville commissioned Descript, a voice cloning company, to create an AI model of Anthony Bourdain’s voice.
Neville handed over hours of recordings of Bourdain’s voice pulled from TV shows, radio shows, podcasts, and audiobooks as “training data” for the voice cloning software. The result was an indistinguishable piece of artificial audio that went unnoticed when Roadrunner was released.
Only when Neville talked about it in interviews did people understand that a fake voice had been used. Obviously, there is concern over the ethics of it all.
When Neville was questioned about the ethics of using Bourdain’s voice without his consent, he made an off-hand comment about constituting an ethics committee later. This shows that Neville isn’t too concerned about the ethical implications of voice cloning.
It looks to me like Neville views it as a technological advancement that has allowed him to reconstruct a dead man’s life through his own words.
To be fair, the cloned bits together span only 45 seconds, and they’re words that Bourdain had written in an email. So Neville isn’t putting words in Bourdain’s mouth or attributing false statements to him.
However, it seems strange to me that Neville did not disclose the use of AI in the documentary. Did he not realize it would make him seem dishonest?
The relentless march of AI technology has roused fears of job loss. It costs less to employ an AI to perform a task than a human. It also takes less time. The benefits of using AI, especially for repetitive tasks, are clear to companies.
Deepfakes, face rendering, and voice cloning show us a glimpse of the future that belongs to AI.
Here’s a post by Descript that shares its thoughts about the Anthony Bourdain controversy.
How voice cloning works
Synthetic speech is not a new concept.
If you’ve used voice assistants like Siri or Alexa, or encountered an IVR (interactive voice response) system when calling customer care, you have experienced synthetic speech. However, these artificial voices sound distinctly robotic and it’s easy to understand that a machine is speaking and not a human.
Voice cloning uses a technology called text-to-speech (TTS) that converts text into synthetic audio. This enables humans and computers to interact through voice.
Two approaches to TTS exist:
a) Concatenative approach – wherein a collection of audio recordings is used to create a pool of words and sounds, from which sentences can be generated.
b) Parametric approach – wherein statistical models of speech are used to simplify the process of generating synthetic speech.
The parametric TTS approach costs less and requires less effort than the concatenative approach. But neither approach results in a natural human voice.
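To make the difference between the two approaches concrete, here is a toy Python sketch. It is only an illustration, not how any real TTS product works: the "recordings" are stand-in sine tones, and the names `UNIT_POOL`, `predict_parameters`, `concatenative_tts`, and `parametric_tts` are hypothetical, as are the sample rate and the pitch and duration rules.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for this toy example)


def tone(freq_hz, duration_ms):
    """Return a sine tone as a list of float samples in the range [-1, 1]."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


# Concatenative approach: a pool of pre-recorded units.
# Real systems store human recordings; sine tones stand in for them here.
UNIT_POOL = {
    "hello": tone(220, 300),
    "world": tone(330, 250),
}


def concatenative_tts(text):
    """Build audio by joining stored units; fails for words not in the pool."""
    samples = []
    for word in text.lower().split():
        if word not in UNIT_POOL:
            raise KeyError(f"no recording available for {word!r}")
        samples.extend(UNIT_POOL[word])
    return samples


# Parametric approach: a model predicts speech parameters, and a
# synthesizer generates the waveform from them. The "model" below is a
# made-up rule, standing in for a statistical model of speech.
def predict_parameters(word):
    """Return (pitch in Hz, duration in ms) for a word."""
    pitch_hz = 150 + 5 * (ord(word[0]) - ord("a"))
    duration_ms = 50 * len(word)
    return pitch_hz, duration_ms


def parametric_tts(text):
    """Generate audio for any word from predicted parameters."""
    samples = []
    for word in text.lower().split():
        pitch_hz, duration_ms = predict_parameters(word)
        samples.extend(tone(pitch_hz, duration_ms))
    return samples
```

The sketch shows the trade-off in miniature: the concatenative function can only say words it has recordings for, while the parametric function can produce audio for any text but only as well as its model of speech allows.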
AI and deep learning have advanced voice cloning technology such that a close imitation of a human voice can be generated.
Neural Networks
AI-based voice cloning software uses neural networks to generate more human-like speech.
Neural network-based TTS models like Tacotron by Google, Lyrebird, and WaveNet have the ability to recognize and learn patterns in data. They can clone any voice and make it “read” text.
Advanced models require just a few seconds of speech samples to create a natural-sounding human voice. It is also possible to change the gender and accent of the speech!
These AI-based tools are better at capturing the emotion, inflection, pronunciation, and intonation of human speech.
Applications of voice cloning
Voice cloning technology was created to help, not deceive.
a) Assistive technology
Voice cloning can help differently-abled people to communicate, especially those who have lost their voice.
b) Dubbing
Voiceover artists and actors can use AI-enabled voice cloning software to dub dialogue in different languages faster and more cost-effectively.
c) Audiobooks
AI voice cloning software can recreate the voices of famous people and authors and use them to narrate their books or letters.
Historical figures could narrate their life stories, much like Bourdain’s AI voice does in Roadrunner. Keeping aside the ethical issues, it makes the experience so much more engrossing for the viewer/listener.
Check out John F. Kennedy’s speech (in his AI voice) that he would have delivered in Dallas in 1963 had he not been assassinated.
d) Educational aids
The teaching possibilities of cloning the voices of historical figures and using them to narrate important world events and speeches are endless.
Wouldn’t you like to hear Anne Frank narrate her experiences in the Secret Annexe in her own voice?
The Dark Side of Voice Cloning
Unfortunately, humans have found ways to use voice cloning technology to deceive people and spread misinformation.
a) Voice phishing (vishing)
An evolution of email phishing attacks, vishing uses cloned voices to con people into thinking they’re speaking to someone they trust. Phone calls and voicemail are the new weapons.
b) Voice spoofing
AI-created synthetic speech could be used to “make” people say things they have never said in real life. Such voice scams can have disastrous effects on the person, eroding their credibility and reputation.
Voice spoofing can be used to impersonate government officials, bank executives, and even trusted family members.
It poses serious security issues for biometric systems that have so far considered voice to be a reliable measure of identity. Biometric systems can be fooled into thinking that an authentic user is speaking, thus granting access to sensitive information.
c) Misinformation
Do you remember Queen Elizabeth doing a TikTok dance and delivering an alternative Christmas message in a video created by Channel 4 to illustrate the dangers of deepfake videos?
That’s the danger of misusing AI voices. Criminals can make people say whatever they want to foment unrest and violence and sway public opinion. Imagine the consequences of fake politically motivated speech or hate speech.
In Conclusion
Where there’s poison, there’s got to be an antidote.
Thankfully, anti-spoofing technology exists. Called “voice liveness detection,” it is also AI-enabled software, and it can distinguish between a human voice and a fake voice.
However, the core question of whether the use of AI voice cloning is ethical remains to be satisfactorily answered.
What do you think?
Check out the trailer of the documentary here: https://www.youtube.com/watch?v=SHknGHEPtGI