Every Database is Biased

Martin Haagoort

Trying to become who I was designed to be

Published: March 26, 2021

“People generally see what they look for, and hear what they listen for.”

To Kill a Mockingbird, Harper Lee

AI and data science adoption rates are soaring as more organizations pursue a data-driven agenda. But have you stopped to consider the ethics of AI? It’s a complex undertaking, with many businesses struggling to apply ethical considerations in their day-to-day work.

‘Bias’ is a term that often gets thrown around, and loose use of it can stall data-driven initiatives, complicate project implementations and confuse stakeholders. It is still a key consideration.

So, how can your organization strike the right balance between the ethics of AI and your business objectives? In this post, I’ll focus on a few important elements of bias, explaining how your business can embrace AI and data science in an ethical manner for digital success.

Wikipedia defines bias as:

“Bias is a disproportionate weight in favor of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair. Biases can be innate or learned. People may develop biases for or against an individual, a group, or a belief.[1] In science and engineering, a bias is a systematic error. Statistical bias results from an unfair sampling of a population, or from an estimation process that does not give accurate results on average.”

Let’s unpack the basics of that statement and explain why they matter in a data science context.

1. Data conditions

Good quality data is not important for good AI, right? Wrong. Ask any experienced data scientist and they’ll tell you the same thing: to make accurate (and therefore ethical) decisions based on your data, the quality of your data is essential.

Another misconception is that data is objective. In practice, bias within data can lead to incorrect conclusions or reinforce existing prejudices. The state of your data and your data management efforts are therefore incredibly important, and data privacy and data security are vital boundary conditions for ethical data usage.

From another perspective: you may think all databases are biased since, by their very nature, they are a selection of datasets (and cannot include everything). However, it is more important to understand the basics of your data sample, including how your selection of data (i.e. your database) and its sub-selections relate to one another.
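One way to make this concrete is to compare the make-up of your sample with a reference population. The sketch below is a minimal illustration, not a full bias audit: the function name `representation_gaps`, the age bands and the reference shares are all hypothetical and invented for the example.

```python
from collections import Counter


def representation_gaps(sample_groups, population_shares):
    """Return, per group, the sample share minus the reference population share.

    A positive value means the group is over-represented in the sample,
    a negative value means it is under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    gaps = {}
    for group, expected_share in population_shares.items():
        observed_share = counts.get(group, 0) / total
        gaps[group] = observed_share - expected_share
    return gaps


# Hypothetical data: age bands of the customers held in a database.
sample = ["18-34"] * 60 + ["35-54"] * 30 + ["55+"] * 10
# Hypothetical reference shares for the population you want to draw conclusions about.
reference = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

gaps = representation_gaps(sample, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large gap does not by itself make a database unusable, but it shows which groups your conclusions are likely to over- or under-represent.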

2. Model conditions

You must take data bias and quality into account at the modeling stage. Bias can show up in the data and it can also be introduced when you select attributes for an AI model.
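Attribute selection can introduce bias when a seemingly neutral attribute acts as a proxy for a sensitive one. The following is a rough sketch of a first-pass check, assuming numeric attributes and a 0/1 protected attribute. The names `flag_proxy_candidates`, `postcode_area` and `tenure_years`, the data and the 0.5 threshold are all hypothetical. Pearson correlation only detects linear relationships, so a clean result here is not proof that no proxy exists.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def flag_proxy_candidates(features, protected, threshold=0.5):
    """Return attributes whose absolute correlation with the protected attribute
    is at or above the threshold (the threshold is an arbitrary assumption)."""
    flagged = {}
    for name, values in features.items():
        r = pearson(values, protected)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


# Hypothetical data: 1 marks membership of a protected group.
protected = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "postcode_area": [1, 1, 2, 1, 3, 3, 2, 3],
    "tenure_years": [2, 5, 3, 6, 2, 5, 3, 6],
}

flagged = flag_proxy_candidates(features, protected)
print(flagged)
```

An attribute flagged this way is a candidate for review with domain experts, not an automatic exclusion.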

The transparency of your model matters. You must have justifiable reasons to opt for a more powerful but less transparent model. The good news is that transparency is not impossible to achieve. You can increase the transparency of, for example, a complicated neural network model by analyzing its operation or function, or by introducing human supervision.
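One common way to analyze the operation of an opaque model is permutation importance: shuffle one input attribute and measure how much the model's error grows. This technique is not named in the article; it is offered here as one possible sketch. The `black_box` function below is a hypothetical stand-in for a trained model, and the data is invented.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature_index, seed=0):
    """Increase in error after shuffling one feature column across the rows."""
    rng = random.Random(seed)  # fixed seed so the check can be repeated
    baseline = mean_squared_error(model, rows, targets)
    column = [r[feature_index] for r in rows]
    rng.shuffle(column)
    shuffled_rows = [
        r[:feature_index] + (value,) + r[feature_index + 1:]
        for r, value in zip(rows, column)
    ]
    return mean_squared_error(model, shuffled_rows, targets) - baseline


def black_box(row):
    """Hypothetical opaque model: it only uses the first attribute."""
    return 2.0 * row[0] + 1.0


# Hypothetical data: each row is a tuple of two attributes.
rows = [(float(x), float(x % 3)) for x in range(20)]
targets = [black_box(r) for r in rows]

importance_first = permutation_importance(black_box, rows, targets, 0)
importance_second = permutation_importance(black_box, rows, targets, 1)
print(importance_first, importance_second)
```

Shuffling the second attribute leaves the error unchanged because this stand-in model ignores it, while shuffling the first attribute increases the error. With a real model, the same comparison gives a reviewer a rough picture of which attributes drive the output.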

Either way, an AI model must be auditable, so that its output and the steps leading to the model can be replicated. An external company or your internal teams can conduct the audit.
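Replicability is easier to audit when each training run leaves a record behind. As a minimal sketch, assuming the training data and parameters can be serialized to JSON, you could store a fingerprint of the data alongside the parameters and random seed. The function names and the record fields below are hypothetical, not a standard.

```python
import hashlib
import json


def fingerprint(obj):
    """SHA-256 hex digest of a JSON-serializable object.

    Keys are sorted so that the same content always yields the same digest.
    """
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_audit_record(training_rows, parameters, random_seed):
    """Collect what an auditor needs to check that a run can be repeated."""
    return {
        "data_sha256": fingerprint(training_rows),
        "parameters": parameters,
        "random_seed": random_seed,
    }


# Hypothetical training data and model settings.
training_rows = [{"age": 34, "label": 1}, {"age": 51, "label": 0}]
parameters = {"model": "logistic_regression", "regularization": 0.1}

original_run = build_audit_record(training_rows, parameters, random_seed=42)
repeated_run = build_audit_record(training_rows, parameters, random_seed=42)

# A changed data set produces a different fingerprint.
changed_rows = training_rows + [{"age": 29, "label": 1}]
changed_run = build_audit_record(changed_rows, parameters, random_seed=42)

print(original_run == repeated_run)
print(original_run["data_sha256"] == changed_run["data_sha256"])
```

An auditor can then compare the stored record with a fresh run: if the records match but the outputs differ, something outside the record (for example a library version) has changed.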

3. Data scientist conditions

Whatever project you’re working on, it is unethical to act against your existing policies, rules or regulations.

This tenet also applies to data science. To provide a consistent approach to ethics across your team, you must have a clear accountability agreement in place. Your data scientists must also work in a proportional and transparent manner, adopting the least intrusive data strategies and clearly documenting your policies, rules and regulations.

4. Impact on stakeholders

Your AI and data science project affects people: both your employees and the data owners.

You should allow employees to provide feedback across the project lifecycle, including after deployment. You should also allow data owners to report any suspected issues. You may also need to make special considerations around the impact of your data project on vulnerable groups.

Accessibility is another consideration: people should be able to access your AI products and services. This safeguards certain groups within society, ensuring they are not discriminated against when your AI-based technologies are used in the wider world.

5. Impact on community

From a social, environmental and democratic perspective, data projects must have a positive impact on our community. Cambridge Analytica’s use of Facebook data during the 2016 US election is a clear example of what not to do.

You should also apply one final consideration: the headline check. If you cannot easily justify your data project in one simple sentence, you may want to leave it on the drawing board.

Here are some of the key ethical considerations for every data-driven initiative:

[Figure: Integrated Approach of Data Science and Ethics]

Intellerts

Martin Haagoort

MD Intellerts

Rajesh H.

3 years ago

Excellent read! Would be interesting to see to what extent democratic governments around the world are willing to collaborate and play the role of watchdog (or not) to ensure a fair framework for all.

