Every Database is Biased
Photo by Jan Antonin Kolar on Unsplash


“People generally see what they look for, and hear what they listen for.”

To Kill a Mockingbird, Harper Lee

AI and data science adoption rates are soaring as more organizations pursue a data-driven agenda. But have you stopped to consider the ethics of AI? It's a complex undertaking, and many businesses struggle to apply ethical considerations in their day-to-day work.

‘Bias’ is a term that often gets thrown around, and it can stall data-driven initiatives, complicate project implementations and confuse stakeholders. But it’s a key consideration.

So, how can your organization strike the right balance between the ethics of AI and your business objectives? In this post, I’ll focus on a few important elements of bias, explaining how your business can embrace AI and data science in an ethical manner for digital success.

Wikipedia defines bias as:

“Bias is a disproportionate weight in favor of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair. Biases can be innate or learned. People may develop biases for or against an individual, a group, or a belief. In science and engineering, a bias is a systematic error. Statistical bias results from an unfair sampling of a population, or from an estimation process that does not give accurate results on average.”

Let’s unpack the basics of that statement and explain why they are important in a data science context.

1. Data conditions 

Good quality data is not important for good AI, right? Wrong. Ask any experienced data scientist and they’ll tell you the same thing: to make accurate (and therefore ethical) decisions based on your data, the quality of your data is essential.  

Another misconception is that data is objective. Bias within data, however, can lead to incorrect conclusions or reinforce existing prejudices within your data. As such, the state of your data and your data management efforts are incredibly important. Data privacy and data security are, therefore, vital boundary conditions for ethical data usage.  

From another perspective, you could argue that every database is biased since, by its very nature, it is a selection of data (and cannot include everything). What matters more is understanding the basics of your data sample, including how your selection of data (i.e. your database) and its sub-selections relate to one another.
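One simple way to examine how a data selection relates to the population it is drawn from is to compare group shares. The sketch below is only an illustration under stated assumptions: the `region` attribute, the sample records and the reference shares are all hypothetical, and the function names are invented for this example.

```python
from collections import Counter


def group_shares(records, key):
    """Return the share of each group value among the records."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_gaps(sample, reference_shares, key):
    """Return sample share minus reference share for every group.

    A positive gap means the group is over-represented in the sample,
    a negative gap means it is under-represented.
    """
    sample_shares = group_shares(sample, key)
    groups = set(sample_shares) | set(reference_shares)
    return {
        group: sample_shares.get(group, 0.0) - reference_shares.get(group, 0.0)
        for group in groups
    }


# Hypothetical database selection: 10 records with a 'region' attribute.
sample = (
    [{"region": "north"}] * 6
    + [{"region": "south"}] * 3
    + [{"region": "east"}] * 1
)

# Hypothetical shares of each region in the wider population.
reference = {"north": 0.4, "south": 0.4, "east": 0.2}

gaps = representation_gaps(sample, reference, "region")
for region, gap in sorted(gaps.items()):
    print(region, round(gap, 2))
```

A gap far from zero does not prove that the data is unusable, but it is a prompt to ask why a group is over- or under-represented before any model is built on the selection.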

2. Model conditions  

You must take data bias and quality into account at the modeling stage. Bias can show up in the data and it can also be introduced when you select attributes for an AI model.

The transparency of your model matters. You must have justifiable reasons to opt for a more powerful but less transparent model. The good news is that transparency is not impossible to achieve. You can increase the transparency of, for example, a complicated neural network model by analyzing its operation or function, or by introducing human supervision.

Either way, an AI model must be auditable, so that its output can be verified and the steps leading to the model can be replicated. An external company or your internal teams can conduct the audit.
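As a rough illustration of what replicable steps can look like in practice, the sketch below records a fingerprint of the input data, the random seed and the parameters next to the output, so that an auditor can re-run the step and compare results. Everything in it is an assumption made for the example: `train_toy_model` is a stand-in for a real training step, and the record layout is hypothetical rather than any standard.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Return a SHA-256 hash of the rows in a canonical JSON form."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def train_toy_model(rows, seed, sample_fraction):
    """Stand-in for real training: mean of a seeded random subset."""
    rng = random.Random(seed)  # local generator, so the seed fully fixes the run
    k = max(1, int(len(rows) * sample_fraction))
    subset = rng.sample(rows, k)
    return sum(row["value"] for row in subset) / len(subset)


def audit_record(rows, seed, sample_fraction):
    """Bundle everything an auditor needs to replicate the run."""
    return {
        "data_sha256": fingerprint(rows),
        "seed": seed,
        "params": {"sample_fraction": sample_fraction},
        "output": train_toy_model(rows, seed, sample_fraction),
    }


# Hypothetical input data.
rows = [{"id": i, "value": i * 1.5} for i in range(10)]

first = audit_record(rows, seed=42, sample_fraction=0.5)
second = audit_record(rows, seed=42, sample_fraction=0.5)

# Same data, seed and parameters should give an identical record.
print(first == second)
```

If the data changes, the fingerprint changes with it, which gives the auditor a quick signal that a different result may come from different input rather than from the model itself.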

3. Data scientist conditions 

Whatever project you’re working on, it is unethical to act against your existing policies, rules or regulations.

This tenet also applies to data science. In addition, you must have a clear accountability agreement in place to provide a consistent approach to ethics across your team. Your data scientists must also work in a proportional and transparent manner, adopting the least intrusive data strategies and clearly documenting your policies, rules and regulations.

4. Impact on stakeholders 

Your AI and data science project affects people, both your employees and the data owners.

You should allow employees to provide feedback across the project lifecycle, including after deployment. You should also allow data owners to report any suspected issues. You may also need to make special considerations around the impact of your data project on vulnerable groups.

Accessibility is another consideration: people should be able to access your AI products and services. This safeguards certain groups within society, ensuring they are not discriminated against when your AI-based technologies are used in the wider world.

5. Impact on community 

From a social, environmental and democratic perspective, data projects must have a positive impact on our community. Cambridge Analytica’s use of Facebook data during the 2016 US election is a clear example here of what not to do.

You should also apply one final consideration: the headline check. If you cannot easily justify your data project in one simple sentence, you may want to leave it on the drawing board. 

Here are some of the key ethical considerations for every data-driven initiative:

Integrated Approach of Data Science and Ethics



Martin Haagoort

MD Intellerts



 

Excellent read! It would be interesting to see to what extent democratic governments around the world are willing to collaborate and play the role of watchdog (or not) to ensure a fair framework for all.
