We Have a Problem with Racist AI
Racial, gender and other biases in AI are a pervasive problem. Google is just one timely example of this at work.
I can’t imagine it’s intentional, but it’s been happening for years and continues today. In 2015, Google Photos came under fire for mislabeling two dark-skinned individuals as “gorillas”: https://www.usatoday.com/story/tech/2015/07/01/google-apologizes-after-photos-identify-black-people-as-gorillas/29567465/
Most recently, Google announced the availability of an AI trained to identify skin conditions, including moles. See “Google’s new AI dermatologist can help you figure out what that mole is”: https://www.fastcompany.com/90637506/google-ai-dermatologist.
Unfortunately, the researchers used a training dataset of 64,837 images from 12,399 patients, and only 3.5% represent people with brown, dark brown, or black skin. In other words, Google repeated its past mistake.
To their credit, Google did learn something from the past: they published a study in JAMA Network Open outlining how effective the AI is across different ethnicities. That work is a welcome step, but ethnicity is not the same as skin colour, and the study is weakened by that conflation. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2779250
As an industry, we in AI can and must do better. But how?
Not surprisingly, the solution lies in the completeness of source data. Simply put, we are not putting effort into collecting data that covers women, people of colour, or even a broad spectrum of ages. And to put a Canadian spin on it, I’ve never seen a medical dataset covering our own Indigenous peoples.
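As a purely illustrative sketch (not any vendor’s actual pipeline), here is one way a team might audit the demographic make-up of a training set before modelling even begins. The metadata file name, the fitzpatrick_type column, and the 10% threshold are assumptions made up for the example:

```python
# Minimal sketch: audit demographic representation in a training dataset.
# Assumes a hypothetical metadata CSV with one row per patient and a
# "fitzpatrick_type" column (I-VI skin-tone scale); names are illustrative only.
import pandas as pd

MIN_SHARE = 0.10  # illustrative threshold: flag any group below 10% of the data


def audit_representation(metadata_path: str, group_col: str = "fitzpatrick_type") -> pd.Series:
    """Print each group's share of the dataset and flag under-represented groups."""
    df = pd.read_csv(metadata_path)
    shares = df[group_col].value_counts(normalize=True).sort_index()
    for group, share in shares.items():
        flag = "UNDER-REPRESENTED" if share < MIN_SHARE else "ok"
        print(f"{group}: {share:.1%} ({flag})")
    return shares


if __name__ == "__main__":
    audit_representation("patient_metadata.csv")
```

A report like this, run before any model training, makes gaps such as a 3.5% share impossible to miss or ignore.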
The best book I’ve read on gender bias is “Invisible Women: Data Bias in a World Designed for Men” by Caroline Criado Pérez (https://www.goodreads.com/book/show/41104077-invisible-women). Like so many others, I was enraged by what this book outlines. Special thanks to Lishni Salgado for recommending it to me.
I often joke that AI isn’t new, and point to Lt. Kenneth Levin’s master’s thesis from June 1972, in which he writes about using AI to improve air-to-air combat: https://calhoun.nps.edu/handle/10945/16115. That means the problem of biased data, and its impact on AI outcomes, isn’t new either.
Our goal should be equity in outcome and value, and to get there we need equity in source data. As an industry, it behooves us to recognize, understand, and wherever possible mitigate data bias. Before we productize any AI model, we need to understand our own inherent biases and question whether any of them have made their way into what we're delivering.
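One hedged illustration of what that questioning can look like in practice (again, not any vendor’s actual release process): compare a model’s performance across skin-tone groups instead of reporting a single aggregate number. The labels and groups below are toy data invented for the sketch:

```python
# Minimal sketch: compare model performance across demographic groups before release.
# y_true, y_pred, and group are illustrative; in practice they would come from a
# held-out test set with per-patient skin-tone labels.
from collections import defaultdict


def per_group_accuracy(y_true, y_pred, group):
    """Return accuracy broken out by demographic group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, g in zip(y_true, y_pred, group):
        total[g] += 1
        correct[g] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Example usage with toy data (hypothetical labels):
scores = per_group_accuracy(
    y_true=[1, 0, 1, 1, 0, 1],
    y_pred=[1, 0, 0, 1, 0, 0],
    group=["light", "light", "dark", "light", "dark", "dark"],
)
for g, acc in sorted(scores.items()):
    print(f"{g}: accuracy {acc:.0%}")
# A large gap between groups is a signal to fix the data, not to ship the model.
```

If the gap between groups is large, that is the moment to go back and fix the source data rather than ship the model.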
It won’t happen overnight, but if we focus on correcting racial and gender bias in source data, and treat that as critical to real-world AI success, we can make it happen.