AI Data Bias, The 2024 Challenge: Solutions for Multicultural Integrity
Liz Castells-Heard
Stanford MBA, Multicultural marketing leader of ROI-powered ideas and business integration
By Liz Castells-Heard, CEO & Chief Strategy Officer, INFUSION by Castells
Generative AI (Gen AI) applied to marketing has tremendous value, and we leverage it across the board. However, its inherent biases and limitations in Multicultural/Ethnic accuracy and representation require guardrails, human contextual and deductive skills, and human involvement throughout the process.
MCM Stakes Higher Than Ever
As a Multicultural agency, we’ve seen unintentional data bias for decades in first-party client research data, third-party media metrics and research studies, and even the Census, caused by limited or erroneous samples, mistaken assumptions, a lack of relevant content or context, and language or cultural biases. But now that we are all using AI tools and models, the stakes are much higher. Clients are using, or will use, Gen AI optimization and MMX models with aggregated data sets to make major business decisions: from resource allocation, marcomm and media optimization strategies to identifying target priorities/profiles, messaging, and developing or transcreating content. Imagine the domino effect of even one flawed or biased data set.
AI’s Unchecked DEI Data Bias
Beyond the limited samples, missing context or prejudice raised above, data bias can come from a lack of diversity among the humans who built the models and the users who interpret them, from incomplete algorithms, and/or from historical data that no longer reflects current populations. For example, Amazon’s AI hiring algorithm amplified severe gender and ethnic bias because it was trained on decades of historical recruiting records dominated by white males; despite efforts to rectify it, the company lost confidence and abandoned the model. Imaging tools depict “attractive” or “productive” people as light-skinned individuals, while those “with social services” are darker-skinned Blacks or Hispanics, despite the majority of recipients being White. Even Stability AI’s image generator (Stable Diffusion XL), one of the best available, defaults to outdated Western stereotypes. Content AI output varies in tone, style and accuracy, and most AI tools (including ChatGPT) are trained on general online content without data cleansing. Closed systems still rely on years of first- or third-party metrics, so they inherit those biases too. And AI trained on human-developed materials not only inherits biases but amplifies them.
Addressing this requires a nuanced approach, considering diverse demographics and real-world vetting.
Spanish-Language Model Limitations
Since AI/ML Natural Language Processing (NLP) systems come largely from English-speaking countries and are trained primarily on English data, Spanish language and cultural nuances get lost in translation, even with the newest Large Language Models (LLMs). Consider just a few of the variables involved in transcreating English content into Spanish so that it is understood by U.S. Hispanics: 27+ country dialects, masculine/feminine noun genders, the conditional use of Tú vs. Usted, category nomenclature that varies with context and includes English or bilingual words, and consistency in brand tonality.
These nuanced variables require multiple human factors, decades of human training and consistent human input.
Gen AI Use In Perspective
If we leverage AI’s strengths, understand its limitations and work to mitigate its biases, we can apply it astutely. We apply Gen AI project-management tools to laborious execution tasks, and now publishing tools to repetitive tasks like resizing or reformatting the same content, to improve workflow productivity and speed to market. We use content tools mostly for inspiration, ideation and simple tasks like email subject lines, but ensure copywriters vet the output for errors and edit it to sound more human. We have used NLP CAT tools for years as a starting point, with heavy proofing and editing. We use imaging tools to help with storyboards, mock-ups and social, but not for major digital, print or video campaigns; likewise, video production tools lack the high-level imaging, GFX and nuanced complexity that ads require. We use AI tools to mine data for deeper insights and implications, but are acutely aware of the drawbacks. And when it comes to data analysis, we constantly audit to catch errors and bias, and use various metrics/databases on the same subject for comparison and vetting.
The goal is to deliver a more effective “Augmented Workforce”: AI + human partnerships that bring the best of both worlds.
Best Practices Addressing AI Data Model Bias
A 360° view, “HOTL” (human-on-the-loop) models and safeguards throughout the process can help address the complexities that require human contextual and deductive skills to mitigate biases between and within datasets:
Incorporating different human views, and different types and sources of data, at each step helps ensure representation and flexibility.
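One concrete form the auditing safeguard above can take is a representation check: compare each group's share in a dataset against an external benchmark (for example, Census proportions) and flag under-represented groups for human review. The sketch below is a minimal illustration, not any agency's actual tooling; the function name, field names, benchmark figures and threshold are all hypothetical.

```python
from collections import Counter

def representation_gap(records, group_key, benchmark):
    """Return observed-minus-expected share for each benchmark group.

    records   : list of dicts, each with a demographic field (group_key)
    benchmark : dict mapping group name -> expected population share
    A negative gap means the group is under-represented in the data.
    """
    counts = Counter(r[group_key] for r in records)
    total = sum(counts.values())
    gaps = {}
    for group, expected in benchmark.items():
        observed = counts.get(group, 0) / total if total else 0.0
        gaps[group] = observed - expected
    return gaps

# Illustrative sample of 100 records vs. hypothetical benchmark shares
sample = ([{"ethnicity": "White"}] * 70
          + [{"ethnicity": "Hispanic"}] * 20
          + [{"ethnicity": "Black"}] * 10)
benchmark = {"White": 0.58, "Hispanic": 0.19, "Black": 0.13}

gaps = representation_gap(sample, "ethnicity", benchmark)
# Flag any group more than 2 points below its benchmark for human review
flagged = [group for group, gap in gaps.items() if gap < -0.02]
```

A check like this does not remove bias by itself; it simply surfaces skewed samples early so the human-on-the-loop can decide whether to reweight, resample or exclude the dataset before it feeds a model.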
The Bottom Line: A Collective Responsibility
AI’s inability to fully grasp and respond to intangible human factors in decision-making, such as ethnicity, race or culture, or the proper ethics, morality and empathy, limits its readiness for viable decisions. While completely eliminating AI bias is unattainable, marketing leaders must ensure AI systems are equitable and enhance business outcomes and human decision-making rather than perpetuating human prejudices and low MCM representation. It is our collective responsibility as an industry to foster research and establish standards that reduce AI bias.
If we can find solutions to assure Multicultural integrity in data models and do it right, then Generative AI is the seismic change that will truly enhance our cognitive capabilities, change how we work, how people buy and interact with brands, and lead us into the new golden age of creativity and exponential business value.