To stop bias, we must fix the root cause: badly biased data and algorithms.
Alex Salkever
Techquity.ai / Vionix Biosciences / Product + GTM Advisor (focus on Open Source, AI, and where they meet) / Author of books about Technology, AI and Society / Strong Opinions, Gently Argued
Today Starbucks shut down all 8,000 of its company-owned U.S. stores for a few hours to give its employees bias training. The move was sparked by an incident in Philadelphia, where a Starbucks manager called the police on two African-American men who had asked to use the bathroom before making a purchase. It is a laudable step. Executive Chairman Howard Schultz recognizes that latent bias remains a nasty problem tearing at the fabric of our society.
It’s a good start. But the reality is that we are making society more biased, not less, as more and more artificial intelligence systems are allowed to make real decisions that affect real lives. These algorithmic decisions touch far more important parts of our lives than whether we can use the bathroom in a coffee shop. Whether we can get a loan, and at what rate, depends on these algorithms. Whether we are eligible for early parole (or any parole at all), and whether the state can remove our children from our custody if it perceives a risk, depends on these algorithms. What we pay for insurance increasingly depends on these algorithms.
Far too often those decisions are heavily biased against women, minorities, the elderly, or other groups. This is only logical.
The data we collect and use to train these systems reflects decades (really, centuries) of bias. Whites and blacks use drugs at roughly the same rate, yet blacks are far more likely to be arrested for drug possession and dealing, and far more likely to go to prison. This creates a data problem: an algorithm looking only at the arrest data would conclude that blacks pose a bigger risk for drug crime, when in fact they are simply arrested and prosecuted more often. Naturally, this stokes real-world fears that are largely unjustified and lead to discriminatory actions like what transpired at Starbucks.
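To see the mechanism, here is a minimal sketch (purely hypothetical numbers, chosen for illustration): two groups use drugs at an identical rate, one is arrested far more often, and a model trained on the arrest records alone duly "learns" that the over-policed group is riskier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical setup: both groups use drugs at the same 10% rate,
# but group B is arrested three times as often when they do.
group = rng.integers(0, 2, n)                  # 0 = group A, 1 = group B
uses_drugs = rng.random(n) < 0.10              # identical underlying behavior
arrest_rate = np.where(group == 1, 0.30, 0.10)
arrested = uses_drugs & (rng.random(n) < arrest_rate)

# A model trained only on arrest records concludes group B is riskier,
# even though the real behavior is identical by construction.
model = LogisticRegression().fit(group.reshape(-1, 1), arrested)
print(model.predict_proba([[0], [1]])[:, 1])   # predicted "risk" per group
```

The model is not malfunctioning; it is faithfully reproducing the bias baked into its training data.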
The algorithms we build to process that data are, by and large, opaque and unaccountable. (This is, in part, why Europe passed the General Data Protection Regulation, which includes a mandate for algorithmic transparency.) The people who build these algorithms are mostly white, Asian, or South Asian, and overwhelmingly male. They fall within a narrow age band.
That is why algorithms often contain built-in proxies for discrimination that their authors never even thought about. For example, if a company like Facebook puts a premium on employees who live close to the office in its hiring algorithms, then the hiring pool is more likely to reflect the demographics of the surrounding community. In downtown San Francisco, that community has a vanishingly small percentage of African-American residents.
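Here is a sketch of the proxy effect, with an invented applicant pool and invented numbers: a filter that never looks at race still produces a skewed shortlist, simply because distance from the office correlates with demographics.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical pool: because of segregated housing patterns, protected-group
# applicants live farther from the office on average (numbers invented).
in_group = rng.random(n) < 0.20
distance_km = rng.normal(np.where(in_group, 25.0, 8.0), 5.0).clip(min=0)

# A "race-blind" rule: prefer applicants who live within 10 km of the office.
shortlisted = distance_km < 10

print(f"group share of applicant pool: {in_group.mean():.1%}")
print(f"group share of shortlist:      {in_group[shortlisted].mean():.1%}")
```

No protected attribute ever enters the rule, yet the shortlist barely resembles the pool it was drawn from.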
Many experts in the field know this is a problem. They are trying to solve it by creating tools that let engineers and programmers more easily track which factors go into machine-learning decisions and how deep-learning neural networks build their models of the world. Within a few years, I hope, the tools will be better suited to the problem. But fixing the tools is easier than fixing the biased data. That polluted data is omnipresent; the drug-arrest data is just the tip of the iceberg. Salary histories, job titles, loan performance records, and everything else you can imagine conspire to paint a darker algorithmic picture of those who are already at a disadvantage.
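To make the tooling half of that concrete, here is one hedged illustration using permutation importance from scikit-learn (the lending data and feature names below are invented): shuffle each input in turn, measure how much the model's accuracy drops, and you learn which factors actually drive its decisions, including proxies no one intended to rely on.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(2)
n = 5_000
features = ["credit_score", "zip_code_income", "years_employed"]

# Hypothetical loan data; "zip_code_income" stands in for the kind of
# innocuous-looking field that can quietly proxy for race.
X = np.column_stack([
    rng.normal(650, 80, n),     # credit_score
    rng.normal(50, 20, n),      # zip_code_income
    rng.integers(0, 40, n),     # years_employed
])
y = (X[:, 0] + 2 * X[:, 1] + rng.normal(0, 50, n)) > 750   # loan approved

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Shuffle one column at a time and measure the drop in accuracy.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
for name, imp in zip(features, result.importances_mean):
    print(f"{name:16s} {imp:.3f}")
```

A high score for zip_code_income would tell an auditor to ask why a lending model leans so heavily on geography.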
So what’s the solution? It’s going to take a lot of time and thought, far more than a few hours. It will require sustained effort to root out proxies for bias and to make algorithms as transparent and accountable as humanly possible. It will also take more than a fair dose of common sense. There is a gray space between algorithmic bias and entirely defensible algorithmic targeting that addresses specific audiences with specific messages. Alas, we remain nowhere near those edge cases when social networks allow racially targeted advertising for jobs and apartments in violation of federal law.
But let's give Howard Schultz credit. He sacrificed profits and sales to do the right thing and start down the path to a better place. A good next step would be for the masters of the algorithms (Google, Facebook, and myriad other companies) to take an afternoon company-wide to figure out how to fight this scourge. Starbucks made it abundantly clear that this is a top priority. It's high time we saw that sort of commitment from others.
Alex Salkever started thinking about algorithmic bias while reporting on AI for his last book, "The Driver in the Driverless Car: How Our Technology Choices Will Create the Future".
Great article. @black_in_ai was founded in part to deal with bias in data sets by Rediet Abebe (Cornell), Moustapha Cisse (Facebook AI Research), and Timnit Gebru (Microsoft Research), among others. They’ve done some revealing work on how biased data affects outcomes.