The Future is Fluid: Will AI Assistants Ever Truly Transcend Gendered Voices?
For the record, I love speaking with AI assistants! So much so that the only locations in our home without an AI assistant are the bathrooms. Despite the discourse I’m met with whenever I inform people that we own multiple Echo devices, can speak to our trashcan, ask Google to turn off the TV, and ask Shazam by way of Siri to name the song that’s currently playing on Power Book III: Raising Kanan, I can’t imagine a world without these devices!
…and maybe it’s my personal feminist agenda that explains why both my Alexa and Siri devices have male voices? In fact, our Alexa devices had the voice of Samuel L. Jackson until Amazon discontinued its celebrity voices last May. It was quite a shame because we loved asking Sam to say his favorite word!
But why male voices? Honestly, it just doesn’t feel right having a female voice assistant. That’s not to say that I treat my AI assistants any worse. I’m one of those people who says “please” and “thank you” when speaking with AI, despite it being a waste of a token. Rather, I feel that leaving the voices at their female default further perpetuates the stereotype that women exist to fulfill subservient roles.
As I reflect on my own journey in both owning and developing AI assistants, I couldn’t help but wonder: Will AI assistants ever truly transcend gendered voices?
The Gender Bias Debate
Personification plays a pivotal role in the assignment of gender to AI assistants, shaping user perceptions and interactions. Human beings naturally tend to anthropomorphize objects or entities they interact with, attributing human-like characteristics and traits to them (for example, I named my German car Dieter after Ludwig Dieter in Army of Thieves – such a great character!). When developers assign a gender to an AI assistant, they imbue it with specific qualities and behaviors associated with that gender, thereby influencing users' expectations and interactions.
While assistant roles are certainly not lesser roles, societal norms often dictate that women fill them, whether as administrative assistants, secretaries, or customer service representatives. Users may subconsciously transfer gendered expectations onto AI assistants, expecting them to be polite, helpful, and accommodating—traits traditionally associated with femininity. Consequently, female-voiced AI assistants are often perceived as more approachable and nurturing, mirroring the societal stereotype of women in supportive roles.
Though these assistants are non-sentient, some users still feel the urge to sexualize female-voiced AI assistants, which in turn reinforces the notion that it's permissible to engage in such behavior with women. In 2011, Brandon Griggs of CNN Digital reported that when Siri made her debut, some users took to blogs and online forums with sexually suggestive inquiries such as “What are you wearing?”. Similarly, Microsoft told CNN that when Cortana launched in 2014, “a good chunk of early queries were about her sex life”. When users sexualize female AI assistants, it not only demeans the assistant itself but also reflects and reinforces harmful societal attitudes towards women. This normalization of sexualization in interactions with female AI assistants can further entrench harmful gender dynamics, suggesting that it's acceptable to treat women as mere objects of desire rather than as equals deserving of respect.
Conversely, male-voiced AI assistants may be perceived as authoritative or commanding, reflecting societal expectations of male leadership. For example, IBM's decision to use a male voice for Watson, despite the absence of gender in AI, underscores this phenomenon. IBM Watson's role alongside physicians in cancer treatment and its triumph on Jeopardy solidified its stature as a figure of expertise and capability, traits traditionally associated with masculinity. The deliberate use of a male voice with self-assured tones and short, definitive phrases mirrors typical male speech patterns, reinforcing perceptions of authority and competence. Moreover, research indicates that people tend to prefer masculine-sounding voices from leaders, further justifying IBM's choice to give Watson a male voice.
Gender Shift: A Dataset Rooted in Bias
Historically, text-to-speech systems have favored female voices because they offered ample data for training. As a society, we’ve built a robust dataset of female speech stretching back to the late 1800s, due in part to women coming to dominate the telephone operator industry.
Emma Nutt, hailed as the world’s first female telephone operator, possessed a demeanor that quickly set the standard for telephone operator voices. Her voice, described as "cultured and soothing," established a benchmark that all telephone companies aimed to replicate. As the 1800s drew to a close, the occupation had become a strictly female domain. This gender shift became one piece of the puzzle explaining why female voices became the default for providing assistance. As a result, we’ve amassed well over a century of female audio recordings. And therein lies the origin story of a dataset riddled with historical bias.
Machine Learning Paves the Way for Male AI Speech
Over the years, advancements in technology have significantly streamlined the process of creating male AI voice assistants. Initially, text-to-speech systems were predominantly trained on female voices due to the availability of rich data. However, with the progression of machine learning algorithms and the expansion of datasets, the barriers to synthesizing male voices have diminished. Improved speech synthesis techniques, coupled with the proliferation of diverse voice samples, have facilitated the creation of high-quality male AI voices. Additionally, advancements in natural language processing have enabled AI systems to better understand and generate male speech patterns, nuances, and intonations.
As a result, developers now have access to a broader range of tools and resources, making it increasingly easy to develop and deploy male AI voice assistants across various applications and platforms. As of 2021, Siri no longer defaults to a female or male voice in the US. Per Apple, this change was “…a continuation of Apple’s long-standing commitment to diversity and inclusion, and products and services that are designed to better reflect the diversity of the world we live in.” In the same year, Amazon rolled out a male voice for Alexa.
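Just to show how low that barrier has gotten, here’s a minimal sketch using the open-source pyttsx3 library (my choice of library is purely illustrative; which voices you’ll actually see, and whether their gender metadata is populated, depends on the text-to-speech voices installed on your operating system):

```python
# A minimal sketch (not any vendor's official sample) of how simple
# voice selection has become, using the open-source pyttsx3 library.
# The available voices depend entirely on the host OS.
import pyttsx3

engine = pyttsx3.init()
voices = engine.getProperty("voices")

# Inspect every voice the OS exposes; many platforms now ship a mix of
# male, female, and regional voices rather than a single female default.
# Note: .gender may be None on some platforms.
for voice in voices:
    print(voice.id, voice.name, voice.gender)

# Switch away from the default voice. The index here is illustrative;
# check the printed list on your own machine before choosing.
engine.setProperty("voice", voices[-1].id)

engine.say("Hello! I am whichever voice you chose, not a default.")
engine.runAndWait()
```

The switch itself is now a one-property change; the hard part was never flipping the voice but building up enough non-female training data to make high-quality alternatives possible in the first place.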
Moving Towards Inclusivity
Gender-neutral AI voices hold immense promise for a more inclusive future of human-computer interaction. Imagine a world where users of all backgrounds and genders can interact with technology without encountering preconceived notions about helpfulness or authority based on vocal characteristics. This inclusivity could foster a sense of belonging and comfort for users who may not identify with the traditional gender binary.
However, this path isn't without its hurdles. Developers face the challenge of creating voices that are neither inherently masculine nor feminine, a concept that may push the boundaries of current text-to-speech technology. Conceptually defining a gender-neutral voice presents its own ethical conundrum – because who’s to say what sounds “gender-neutral” anyway?
That question brought me back to my undergraduate research study on Lavender Linguistics – the study of queer speech and language. While my research focused on the speech and phonetic patterns of homosexual males, I would assume that defining, and more importantly replicating, a gender-neutral voice would be quite the challenge. One interesting approach was theorized in 2021 by Cintya Chaves, who stated that because “…the fundamental tone of a male voice ranges between 85 Hz and 180 Hz in frequency, and that a female voice ranges between 165 Hz and 255 Hz…one could argue that if a voice ranges between 125 Hz to 215 Hz, the result would be a genderless voice.” However, in their study on Voice as Identity: Creating a genderless voice assistant, Chaves concluded that “…gender is intrinsically associated with sound. As a result, it would not be possible to offer an ambiguous voice for voice assistant systems by only considering voice frequencies.” Our research shared the overlapping conclusion that “Every queer person's experience is different, and there is no one perfect way to represent a whole community”.
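Out of curiosity, here’s what Chaves’ numbers look like as a rough pitch check, a minimal sketch assuming the open-source librosa library and a hypothetical recording named voice.wav:

```python
# A rough, illustrative test of Chaves' frequency bands. Assumes librosa
# is installed and that "voice.wav" (a hypothetical file name) holds a
# short mono speech recording.
import numpy as np
import librosa

y, sr = librosa.load("voice.wav", sr=None, mono=True)

# Estimate the fundamental frequency (f0) per frame with the pYIN
# algorithm, searching across the combined male + female range.
f0, voiced_flag, voiced_probs = librosa.pyin(y, fmin=85, fmax=255, sr=sr)

# Unvoiced frames come back as NaN, so take the median over voiced frames.
median_f0 = np.nanmedian(f0)

# Chaves' hypothesized bands: male 85-180 Hz, female 165-255 Hz, and a
# proposed "genderless" band of 125-215 Hz straddling the overlap.
if 125 <= median_f0 <= 215:
    print(f"Median f0 {median_f0:.0f} Hz: inside the hypothesized ambiguous band")
else:
    print(f"Median f0 {median_f0:.0f} Hz: outside the hypothesized ambiguous band")
```

Of course, as Chaves concluded, pitch is only one of many cues our ears use (timbre, intonation, and prosody all carry gendered information), so a single frequency band was never going to be the whole answer.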
Therefore, if you’re interested in furthering research in this area, I encourage you to set your biases aside and commit to engaging with a robust pool of diverse speakers.
Ethical Considerations of Voice Design
The ethical considerations surrounding voice design extend far beyond simply choosing a gender. When crafting the voice of an AI assistant, developers must prioritize user consent and transparency. This means clearly informing users about how their voice data might be used to develop or modify the AI's speech patterns. Additionally, users should have the option to opt out of contributing their voice data or to control how it's used. Furthermore, the design process itself should be mindful of potential biases. Developers should strive for diverse teams working on voice creation to mitigate the risk of unconscious bias creeping into the AI's vocal characteristics. Ultimately, ensuring ethical voice design requires a commitment to user privacy, transparency, and inclusivity.
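To make “opt out” more than a buzzword, here’s a minimal sketch of what a consent gate could look like in code; the record fields and function names are hypothetical, illustrating the principle rather than any particular platform’s API:

```python
# A minimal, hypothetical sketch of an opt-in consent gate for voice
# data. Field and function names are illustrative, not any real API.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class VoiceDataConsent:
    user_id: str
    allow_storage: bool    # may we retain the user's recordings at all?
    allow_training: bool   # may recordings be used to train or modify voices?
    recorded_at: datetime  # when consent was captured, for auditability

def can_use_for_training(consent: Optional[VoiceDataConsent]) -> bool:
    # Default to "no": missing or withdrawn consent means the recording
    # never enters a training pipeline.
    return consent is not None and consent.allow_storage and consent.allow_training

# A user who allows storage but opts out of training stays out of training.
consent = VoiceDataConsent(
    user_id="user-42",
    allow_storage=True,
    allow_training=False,
    recorded_at=datetime.now(timezone.utc),
)
assert not can_use_for_training(consent)
```

The design choice worth noting is the default: when consent is missing or withdrawn, the answer is always no.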
Conclusion
The current dominance of female-voiced AI assistants highlights the unconscious biases and stereotypes ingrained in our technological landscape. While these voices may initially seem innocuous, they can perpetuate harmful notions about gender roles and limit user experience. As we move towards a future of increasingly sophisticated artificial intelligence, developers should carefully consider the impact of voice design. They must acknowledge the potential for bias, prioritize user consent and transparency, and strive for inclusivity in the creation of AI voices. Only then can we ensure that our digital companions are truly helpful, unbiased, and representative of the diverse world we inhabit.
Although I’m unsure whether I’ll ever change the voice of my AI assistants to female, I can appreciate the strides being made towards leveling the playing field across AI assistant voices. Maybe if someone creates a non-problematic gender-neutral voice, I’ll consider a switch. Heck, I still find it cringe when generative AI attempts to emulate Black women’s speech patterns.
But let me know what you think! Do you see a future where we could resolve the gender-role inequities in how we interact with, and perceive, male-, female-, and gender-neutral-voiced AI assistants?
_
Note: I wrote this article to support #MarchResponsibly, a campaign hosted by our AI Cloud Advocacy team at Microsoft! Our goal is to raise awareness of Responsible AI as a critical practice. Microsoft is committed to the advancement of AI driven by ethical principles. If you’re interested in leveling up your understanding of Responsible AI principles, check out the MS Learn Embrace Responsible AI Principles and Practices module! In addition, our MS Learn Responsible AI Dashboard module is a great starting place for data scientists and developers to explore essential tools necessary to craft machine learning models that prioritize societal well-being and inspire trust.
And lastly, if you enjoyed reading this article and would like to read more from me about developing AI responsibly, let me know in the comments!