A Code of Ethics for Data Science

2.5 quintillion bytes of data are created every day. That’s a huge number, so big it’s hard to even get our heads around. It’s created by you when you commute to work or school, when you’re shopping, when you get medical treatment, and even when you’re sleeping. It’s created by you, your neighbors, and everyone around you. So, how do we ensure it’s used ethically?

Back in 2014, before I entered public service, I wrote a post called Making the World Better One Scientist at a Time that discussed concerns I had at the time about data. What’s interesting is how much of it is still relevant today. The biggest difference? The scale and coverage of data have massively increased since then, and with them the opportunity to do both good and bad.

In the bucket of good, we’re finding incredible insights using data to develop tailored medical treatments (Precision Medicine). Recently a data scientist at the Data Science for Social Good Program at the University of Chicago used machine learning/artificial intelligence to automatically detect flooded bridges in satellite images for first responders. Crisis Text Line has been literally saving lives every day through an all-volunteer network of counselors with powerful data and technology superpowers to help those in crisis. And through the Data-Driven Justice Initiative we’ve seen local counties get people who need mental health and drug treatment out of our overcrowded jails and into treatment facilities through the safe sharing of data. These solutions not only save money; they are a proven success.

I could go on and on about all of the amazing work that is happening around the world using data to make lives better every day, but we also have to address where data is causing more harm than good. As ProPublica has shown, algorithms are being used in the courtroom to make decisions that have an adverse racial impact. We know that data used in predictive policing can reinforce traditional stereotypes. And my friend Cathy O’Neil documents many more cases in her great book Weapons of Math Destruction. Let’s not forget about people stealing our data. From healthcare breaches to data brokers, we have systems holding on to our most sensitive data with minimal oversight and protections. And finally, our democratic systems have been under attack using our very own data to incite hate and sow discord.

With the old adage that with great power comes great responsibility, it’s time for the data science community to take a leadership role in defining right from wrong. Much like the Hippocratic Oath defines Do No Harm for the medical profession, the data science community must have a set of principles to guide and hold each other accountable as data science professionals. To collectively understand the difference between helpful and harmful. To guide and push each other in putting responsible behaviors into practice. And to help empower the masses rather than to disenfranchise them. Data is such an incredible lever arm for change that we need to make sure the change that is coming is the one we all want to see.

So how do we do it? First, there is no single voice that determines these choices. This MUST be a community effort. Data Science is a team sport and we’ve got to decide what kind of team we want to be.

To start, we need to engage in conversation and spend much more time talking about the changes that are about to take place (to those who have been doing this, thank you!).

That’s why I’m excited about the opportunity for the ENTIRE data science community to take part in helping define what a Code of Ethics for data sharing would look like for data scientists. How do you get involved?

  1. Join the global conversation remotely over Slack (channel #p-code-of-ethics) and follow @TechAtBloomberg on Twitter to tune into a livestream of portions of the Data for Good Exchange SF starting at 12 PM EST/9 AM PST on Tuesday, February 6th.
  2. Get your team of data scientists at work or at a meetup together and start talking about what a Code of Ethics for us would look like. And post it on the Slack channel.
  3. Most of all, share what you're learning! I want to hear from you on the Slack channel, here on LinkedIn, or on Twitter @dpatil.


Steven Adler

Data Industry Pioneer | Awarded Tech Leader at IBM | TEDx Speaker | Startup Mentor | Adjunct Professor

5 years ago

I'm going to back off my comment 2 years ago. Data Ethics really matter. How many Data Scientists helped big pharma companies to improve operations that shipped Opioid drugs that killed Americans? How many professionals put money first and outcomes later?

Richard J. Bendell

Finance at City of Welland

6 years ago

Great article, insightful, collaborative, forward thinking and inclusive. A tremendous basis for the right steps forward towards data science responsibility to help maximize benefits with minimum risk. The first and most important thing for data scientists to realize is this is NOT your data, it is the user's data, plain and simple. And, they (the data scientists) bear the greatest responsibility to serve and protect those users. Call it the data police if you might, but that would not be correct for truly they become the data PROTECTORS. With that in mind, how can you fail? All the best.

Terri Lewis

Senior Executive at Planet Connected, Board Member at CereBulb

6 years ago

DJ. Nice article. There are basic engineering tools, like FMEA (Failure Mode & Effect Analysis), that should be applied to data. Just as engineers developing products write validation plans to address “what can cause a product reliability issue”, data “engineers” need to ask “how could this use of data cause harm”.

Hi, an interesting point, but surely, as specialists in data science, rather than ethics, should not part of the call be for Philosophers, Theologians, etc. to collaborate with data science - to share their years, decades, centuries, millennia of expertise in the field of "right and wrong" and how that could be applied? I put some initial thoughts out a little while ago: https://daveflanaganblog.wordpress.com/2017/09/29/algorithm-youre-fired/ Best regards Dave

Charles Coleman

Quantitative Analyst in Statistics, Economics and Demography

6 years ago

The American Statistical Association has its Ethical Guidelines for Statistical Practice available at https://www.amstat.org/ASA/Your-Career/Ethical-Guidelines-for-Statistical-Practice.aspx.
