The story of data
By Theodora Lau and Bradley Leimer of Unconventional Ventures
A picture is worth a thousand words, the saying goes. But what about the endless streams of data underlying those images? Or the data that traces the broader picture of our digital activities and footprints? If data could talk, what kind of picture would it paint? What stories would it tell?
Behind every 1 and 0 is a human story. A chain of events. A version of truth.
But what is the real truth?
Take the U.S. Census, for example. “Also known as the Population and Housing Census, the Decennial U.S. Census is designed to count every resident in the United States. It is mandated by Article I, Section 2 of the Constitution and takes place every 10 years.”
Please note the important distinction: every resident is counted, not every citizen.
The census data collected helps paint a picture of where we are as a country. Unfortunately, according to the Census Bureau’s own assessment, the last census conducted in 2010 missed about 2.1 percent of black Americans and 1.5 percent of Hispanics, altogether missing more than 1.5 million people of color. Overall, the Bureau estimated 16 million omissions in the completed census. Since census data is used to determine the number of seats each state has in the U.S. House of Representatives, as well as how federal funds are shared amongst local communities, the impact of undercounting, especially of our society's most vulnerable communities, cannot be overstated.
While history does not repeat itself, it often rhymes. The fear is that the current pandemic and the resulting economic turmoil have introduced yet another set of biases into the 2020 Census, and an even greater undercount of residents whose needs would then be further underrepresented.
Back to the source
Much attention has been paid to algorithms and decisioning. But what about the crucial input: data? After all, data is what fuels the millions of algorithms that determine much of our digital life. Poor data quality will lead to poor output and to poor outcomes. A model is only as good as the data that is used to train it. And bias is magnified as much by those who measure it as by the data that encodes it.
Of course, the challenge with data quality goes beyond the Census. Examples abound, especially around racial bias. As we have learned with image AI, if the data sets consist mostly of white people, it is not surprising to end up with an algorithm that fails to recognize those with darker skin tones. The same holds for health technology solutions: if an algorithm is trained on only a subset of the population, it reflects that singular viewpoint, and the system is good for just that subset. Without proper context, incorrect conclusions may be drawn for other demographics. It is one thing to misinterpret a music preference; the outcome is far more severe if the system delivers a wrong medical diagnosis.
Other challenges exist for automated financial services as well, such as mortgage underwriting. Even though lenders cannot consider factors such as race, sex, and religion, other variables (such as schools attended) can still act as proxies for a protected classification. Data points from existing financial products can also introduce bias, such as a high interest rate on an auto loan triggering a higher mortgage rate in the model, beyond what normalized credit factors would justify.
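To make the proxy risk concrete, one simple screen is to measure how much information each candidate feature shares with a protected attribute in an audit sample used only for fairness testing. Below is a minimal sketch in Python; the file name, column names, and threshold are hypothetical assumptions for illustration, not a prescribed method.

```python
import pandas as pd
from sklearn.metrics import normalized_mutual_info_score

# Audit sample containing both model features and protected attributes
# (hypothetical file and column names, used only for fairness testing).
audit = pd.read_csv("audit_sample.csv")

protected = "race"  # protected classification
candidate_features = ["school_attended", "zip_code"]

# Crude proxy screen: a feature that shares a lot of information with the
# protected attribute may act as a stand-in for it inside the model.
# The 0.3 threshold is chosen purely for illustration.
for feature in candidate_features:
    score = normalized_mutual_info_score(audit[protected], audit[feature])
    flag = "possible proxy, review before use" if score > 0.3 else "ok"
    print(f"{feature}: NMI = {score:.2f} ({flag})")
```

A flagged feature is not automatically disallowed, but it deserves the same scrutiny as the protected attribute it may stand in for.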
Moving forward
While we cannot completely eliminate bias, there are steps that can be taken to reduce it. Having a diverse team involved in creating a product and validating it for bias is the most obvious, and most important, first step. We must ensure that different demographic aspects are considered along the entire process, from data collection to modeling and deployment. And we must avoid carrying historical mistakes and biases into solutions that shape future outcomes.
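As one small illustration of what checking demographic aspects at the data collection stage can look like, the sketch below compares group shares in a training set against a reference population; the column name and baseline figures are hypothetical placeholders, not real census numbers.

```python
import pandas as pd

# Hypothetical training data with a demographic column used only for auditing.
train = pd.read_csv("training_sample.csv")

# Reference shares for the population being served, e.g. drawn from census data
# (placeholder values for illustration).
population_share = {"group_a": 0.60, "group_b": 0.19, "group_c": 0.13, "group_d": 0.08}

train_share = train["demographic_group"].value_counts(normalize=True)

print(f"{'group':<10}{'train':>10}{'population':>12}{'gap':>10}")
for group, expected in population_share.items():
    observed = float(train_share.get(group, 0.0))
    print(f"{group:<10}{observed:>10.1%}{expected:>12.1%}{observed - expected:>+10.1%}")
```

Large gaps at this stage are an early warning that the model will see some groups far less often in training than it will meet them in production.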
With thoughtfulness, empathy, and human perseverance, we have a chance to create a better and more equal future — together.
Human and technology — acting as one.
________
In this episode of One Vision, Theo and Bradley chat with Uday Akkaraju, CEO of Bond.AI, a human-centered artificial intelligence platform for banks. The trio explore the challenges around data collection and human bias, and the fintech innovation ecosystem in Little Rock, Arkansas, including the great community created by the Venture Center. You can listen to this week’s episode on iTunes and Spotify. Please hit that like button and consider subscribing.
________
Unconventional Ventures helps drive innovation to improve systematic financial wellness. We connect founders to funders, provide mentorship to entrepreneurs, strategic advisory services to a broad set of corporates, and broaden opportunities for diversity within the ecosystem. Our belief is that anyone with great ideas should have a chance to succeed and every voice should be heard. Visit unconventionalventures.com to learn how you can partner with us today.
"AI Solutions Architect": Empowering SMEs to master AI tools for instant, impactful results. Custom AI Assistants, AI Automation Agents Personal Knowledge Management (PKM), Second Brain
Thanks, Theodora. Indeed, "Poor data quality will lead to poor output and to poor outcomes." Sometimes that result is almost intended. ;-) When I worked with a large media company in Germany, we built an expert system to predict sales figures. The goal was to let the company know how many books to print for launch day. Our model was sound, but the results were poor at first, and we wondered why. Until we figured out that some of the book experts had not taken enough care entering their data into the system, simply because they were scared of being replaced by the system we built. Transferring their knowledge into intelligent data systems is scary for many. We were able to point out how jobs were changing, not being replaced, which eased the fear of job loss. Once it was clear that our predictive system wasn't built to replace the experts, data quality and results improved vastly. "Human and technology — acting as one."