Open Data - a vaccine for re-identification and privacy issues?

Thank you @LATA_org for inviting me to speak and host one of the breakout sessions at the Data Driven Nation conference!

During the event we discussed privacy risks from open data initiatives - how important it is to discuss the possible risks of opening data and to take appropriate legal and technical steps to counter them. Re-identification is a disease. It is actually a pandemic of the data world.

But during the discussions a novel idea appeared that has not seen much of the spotlight so far. It haunted me for a few days after the event, and after many second thoughts I decided it is actually worth sharing with the broader community.

Politicians are actively discussing splitting up companies that have amassed too large a data context by mining their users - let's split Facebook and others into parts. The claimed goal is better user privacy protection, improved democracy, etc. But doesn't that sound like nuking a meteorite on a collision course with Earth and creating a bunch of objects that would certainly bring an extinction-level event?

What if instead of, or indeed in addition to, splitting up companies that have too large a data context, these companies were required to open their data?

What if Open Data actually became a vaccine for re-identification and other privacy issues stemming from the existing imbalance of power arguably brought about by these companies?

What if tomorrow, when I use a search product - a narrowly defined primary product/service - my data could be processed to deliver the search results, but if that data were used to serve advertising or transferred to third parties in any form, it would have to be made publicly available in a machine-readable format, for free - opened?

At first sight, that sounded like heresy - wouldn't it create privacy havoc, and why would anyone agree to it? But after many discussions and multiple second thoughts, it actually might make sense. First, privacy would actually be better protected, because (a) no private data can be publicly shared, so such a requirement would directly limit the use of personally identifiable data for secondary purposes, and (b) there would be a register of anonymised data outflows from such companies that could later be tracked and checked in cases where re-identification has occurred. Effectively, such a requirement would change the economic incentives for these companies: it would limit data reuse (using data harvested from users in product A to serve products B, C, ..., XYZ), remove the financial motivation to resell data even in anonymised form (as it would be public and free) and enable the community to fact-check published data sets to prevent re-identification in the future.
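
To make the community fact-check a little more concrete, here is a minimal sketch of one check a reviewer could run on a published data set. It uses k-anonymity, a common measure of re-identification risk that was not part of the original discussion and is brought in here only as an illustration. The function names, the column names and the sample rows are all hypothetical.

```python
from collections import Counter


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group; the data set is k-anonymous for this k."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def risky_rows(rows, quasi_identifiers, k):
    """Return rows whose quasi-identifier combination is shared by fewer than k rows."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] < k
    ]


# Hypothetical sample of an opened data set (all values are made up).
published = [
    {"age_band": "30-39", "postcode_prefix": "LV-10", "query_topic": "travel"},
    {"age_band": "30-39", "postcode_prefix": "LV-10", "query_topic": "health"},
    {"age_band": "40-49", "postcode_prefix": "LV-10", "query_topic": "cars"},
]

# Assumption: these two columns are the ones an attacker could link to other sources.
quasi = ["age_band", "postcode_prefix"]

print(smallest_group_size(published, quasi))
print(risky_rows(published, quasi, k=2))
```

A row that is unique on its quasi-identifiers is the kind of record a reviewer would flag before or after publication. A real review would need more than this single check, since k-anonymity alone does not rule out re-identification.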

As for the regulatory challenge, oh yes, that would be a huge one. Historically, many industries have had regulatory frameworks applied to them after an initial period of crazy free-fall development, e.g. the financial industry. None of those industries were happy about that. But in all cases regulation was applied because these industries created an imbalance of power to the detriment of consumers. As in all such cases, significant enforcement effort would be required, but maybe the open data vaccine idea is indeed something that everyone should give a second thought?


More articles by Jaundālders Aigars