About Data Privacy and the GDPR
Mohammed Brueckner
Strategic IT-Business Interface Specialist | Microsoft Cloud Technologies Advocate | Cloud Computing, Enterprise Architecture
Data Privacy for modern day companies is like Santa Claus for kids.
Great to have around but terrifying up-close.
Let's face it, data privacy is not a cool topic. Not by a long stretch is it one that would get you into entertaining discussions at a tech conference. Even from a business perspective there is not much to dig about it (at first glance). After all, we all want to connect the dots and serve our dear customers best – you know the drill. Let's collect whatever we can about our customers, give them better treatment, and everybody is happy. Right?
Not quite, if you are serious about putting your customers at the centre of everything. People are not at all indifferent about the use of their data. 89% of consumers actually want to know how companies keep their personal information secure, and 86% insist they should know when their data is passed to third parties. In fact, eConsultancy states that data privacy is a mega-trend marketers need to be on top of.
What it all translates to is trust. If you check out the certifications of the top three cloud computing vendors, you'll quickly conclude that all of them pour incredible heaps of money and effort into getting every IT security and data privacy certification under the sun to prove they are getting it right. Fail to show you care about your clients' data and you only show that you do not care about your clients.
The other side of the coin, not related to user experience but certainly just as relevant, is the legal repercussions that come with new data protection regulations all around the globe. In Europe, the first thing that comes to mind is of course the famous GDPR, kicking in on 25 May 2018.
More on that later, let's establish something fundamental first.
There is no privacy where there is no security
There is a reason why I mention IT security and data protection in one go. Data protection is all about a promise you make: consumer data will be secured to an extent that depends on its nature. Of course this promise needs to be backed up by evidence – your understanding of the matter and the policies you create around that understanding. Policies and guidelines only take you so far, however, if your enterprise architecture, encompassing the data, application and technology (including integration) domains, is not carefully protected by technical means. So any serious data protection discussion needs to account for IT security. In practice that means legal experts and IT experts need to join forces to make it really happen. Luckily, there is great support from all directions for IT security; more on that later.
Less is more
When it comes to being on the safe side of things, you want to stick to the least amount of sensitive data possible. Relationships still need to be built, yet you can always abstract everything away by means of "hashing" or, less securely, "encrypting". Hashing means replacing actual data with alphanumerical gibberish that cannot be turned back into the original data, while encryption allows exactly that for anyone holding the key. Not all encryption is equal, however, so things can be more or less complex. Suffice it to say that encryption is advisable as a default, while not having any sensitive data sticking around should be your very aim.
That completely clashes with the whole notion of being consumer-centric and acting in a targeted way, we'd all tend to believe. Yet think again – what you really want to do is deliver the right message to the right audience at the right time. For most interactions you can easily get away with certain traits and anonymous personas of your target audience. Context in these cases arguably matters more than knowing the full vita of your customer.
Granted, an abstract understanding of your consumers will not be good enough at all times. The message here is only that once you introduce Personally Identifiable Information (PII), you enter a realm of caution and you need to deal with it, which adds considerable effort to pretty much every consumer-related process. Hence it has to be worth it, literally. And numerically, on a budget and cost sheet and in your next P&L discussions. No matter what decision you take, do yourself a favour and always be ready to answer the three golden questions as far as protecting sensitive data goes.
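To make the hashing option mentioned above a bit more tangible, here is a minimal sketch in Python. It assumes a keyed hash (HMAC-SHA256) rather than a plain hash, because plain hashes of guessable values such as e-mail addresses can be brute-forced. The key, the `pseudonymize` helper and the sample orders are all made up for illustration; a real key would live in a key management service.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key management service.
SECRET_KEY = b"replace-with-a-key-from-your-key-store"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (HMAC) of a normalized identifier."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


# Illustrative sample data, not taken from any real system.
orders = [
    {"email": "Jane.Doe@example.com", "amount": 40},
    {"email": "jane.doe@example.com ", "amount": 25},
    {"email": "john.roe@example.com", "amount": 10},
]

# Replace the e-mail address with its pseudonym before storing or analysing.
# The same person always maps to the same digest, so totals per customer
# can still be built without keeping the address itself.
totals = {}
for order in orders:
    customer = pseudonymize(order["email"])
    totals[customer] = totals.get(customer, 0) + order["amount"]
```

The point of the design: relationships (here, spend per customer) survive, while the stored key no longer reveals who the customer is. Whoever holds the secret key can still re-compute a digest for a known address, so the key deserves the same protection as the data it replaced.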
Technology is a great ally
For many "technical aspects" of IT there is a misconception that responsibility ends with the folks who deal directly with, say, your databases. That was always wrong, and when it comes to data integrity it is seriously harmful for your business if nobody understands that responsibility goes all the way to the top, to where decisions are made.
As uneasy as that epiphany might be, seeing all the possibilities of your business crumble away, because of attacks on one end or the hefty penalties that come with breaking laws on the other, is daunting to say the least. On the bright side, IT security is, as mentioned earlier, one of the pillars of every data protection strategy. Modern-day cloud providers are well prepared and certainly have it on their agenda to aid you with that strategy.
Let's take Amazon Web Services (AWS): AWS offers a "Well-Architected Framework" which features a "Security Pillar" whitepaper, summarizing best practices implemented by the services and features the rich AWS ecosystem has to offer. Microsoft Azure has no shortage of focus either; its Trust Center is a rich source of information on plenty of security-related topics. The risk assessment section is highly advisable.
The bottom line with regard to security is: trust automation and built-in functionality that fosters security. Things that come to mind are automated encryption at rest and in flight, and automated responses to different types of incidents. A strict authentication and authorization model, tied to e.g. the identity and access management (IAM) services on the platforms of your choice for the best possible impact (without keeping people from doing actual work). Intelligent, event-based analysis of potential security threats and counter-measures against them, again something that is simplified by the availability of overarching event-based triggers and services that allow you to e.g. have these events flow into serverless functions. Think Lambda on AWS, Azure Functions on Microsoft, Cloud Functions on Google Cloud Platform. If there is anything that strikes you as suspicious, you now have all the computing power and tools to diagnose whatever you want on the fly. Since there is nothing blocking in that chain – unless you make some really poor architectural decisions – it would not even impact your ability to scale.
Expanding on that point of being able to hook into anything and everything going on in your infrastructure landscape: be it AWS CloudTrail and Amazon CloudWatch on AWS or Azure Monitor, you are empowered to screen pretty much everything you want and log anything you want, and all of it can, again, be hooked up in an event-based fashion to accomplish checks and abstractions to any level you deem necessary. Abstraction here is about how immediate an interaction is and therefore what attack surface you offer, with the trade-off of greater complexity. (Typical abstraction techniques include using "firewall" proxies.)
Necessary in this context is not to be taken lightly, though: applying maximum security measures to just everything is way too broad a brush to paint with. These security measures and techniques come with effort and resource consumption, so only put your most valuable data assets into the proverbial digital Fort Knox and leave the rest to something more reasonable. What you want to avoid as well is having everything run via "firewall" proxies and enforcing a high number of hops that limits the capacity of your total system. Different classifications of sensitive data can and should guide you towards suitable and reasonable solutions that strike the right balance. Services like AWS X-Ray will help you gauge the performance of your system end to end, and when it comes to adding security gateways that's something you want to keep an eye on.
After all, you still have to stay productive, which certainly means being able to scale. There are great examples proving that running securely and running at scale is achievable, like Coinbase. Scalability is not the only concern for productivity though – you could easily kill off all productivity by restricting users and developers too much. I recommend listening to this podcast on how Turbot tackled that challenge without time-consuming workarounds. It's better to respond to security-related events, automated of course wherever possible.
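As a rough sketch of what "events flowing into a serverless function" could look like, here is a handler in the style of such a function, in plain Python. The event shape is a simplified assumption loosely modelled on an audit-log record (`eventName`, `sourceIPAddress`, optionally wrapped in `detail`); the list of sensitive actions, the trusted network prefixes and the `assess` helper are hypothetical and would need to match your own environment.

```python
# Event names treated as sensitive in this sketch (an illustrative list).
SENSITIVE_ACTIONS = {"DeleteTrail", "PutBucketPolicy", "CreateAccessKey"}
# Hypothetical allow-list of network prefixes considered internal.
TRUSTED_PREFIXES = ("10.", "192.168.")


def assess(record: dict) -> dict:
    """Collect reasons why a single audit record looks suspicious."""
    action = record.get("eventName", "")
    source_ip = record.get("sourceIPAddress", "")
    reasons = []
    if action in SENSITIVE_ACTIONS:
        reasons.append(f"sensitive action: {action}")
    if not source_ip.startswith(TRUSTED_PREFIXES):
        reasons.append(f"untrusted source: {source_ip or 'unknown'}")
    # Only escalate when both signals coincide, to keep the noise down.
    return {"suspicious": len(reasons) == 2, "reasons": reasons}


def handler(event: dict, context=None) -> dict:
    """Entry point in the style of a serverless function."""
    # Accept both a wrapped event ({"detail": {...}}) and a bare record.
    record = event.get("detail", event)
    verdict = assess(record)
    # Placeholder for an automated response, e.g. notifying a team
    # or revoking a credential; here we only report the decision.
    verdict["action"] = "alert" if verdict["suspicious"] else "none"
    return verdict
```

The handler holds no state and blocks nothing else, which is what lets this kind of check scale along with the rest of the system.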
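One way to let data classifications drive the architecture is to write the mapping down explicitly. The following is only a sketch under assumed names: the four sensitivity levels, the three controls and the `POLICY` table are invented for illustration, and a real policy would come out of the joint work of legal and IT experts.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    PERSONAL = 2           # PII such as names or e-mail addresses
    HIGHLY_SENSITIVE = 3   # e.g. health or payment data


@dataclass(frozen=True)
class Controls:
    encrypt_at_rest: bool
    route_via_proxy: bool      # the extra hop that costs capacity
    audit_every_access: bool


# Hypothetical policy: only the top class gets the full "Fort Knox" treatment.
POLICY: Dict[Sensitivity, Controls] = {
    Sensitivity.PUBLIC: Controls(False, False, False),
    Sensitivity.INTERNAL: Controls(True, False, False),
    Sensitivity.PERSONAL: Controls(True, False, True),
    Sensitivity.HIGHLY_SENSITIVE: Controls(True, True, True),
}


def controls_for(fields: Dict[str, Sensitivity]) -> Controls:
    """Pick controls based on the most sensitive field in a data set."""
    highest = max(fields.values(), default=Sensitivity.PUBLIC)
    return POLICY[highest]
```

For example, `controls_for({"sku": Sensitivity.PUBLIC, "email": Sensitivity.PERSONAL})` selects encryption and auditing but skips the proxy hop, so the expensive measures stay reserved for the data that warrants them.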
Brace yourselves, GDPR is coming
There are other reads about the GDPR (General Data Protection Regulation) I would recommend if the regulation itself is your main interest and not so much the technology that supports it. However, I'll break down the practical implications for you, quote:
"The GDPR provides EU residents with control over their personal data through a set of “data subject rights.” This includes the right to:
- Access readily-available information in plain language about how personal data is used
- Access personal data
- Have incorrect personal data deleted or corrected
- Have personal data rectified and erased in certain circumstances (sometimes referred to as the “right to be forgotten”)
- Restrict or object to processing of personal data
- Receive a copy of personal data
- Object to processing of data for specific uses, such as marketing or profiling" (Source)
To expand on that:
"Data Breach Notification – the new regulation will require companies to notify the data protection supervisory authority of data breaches within 72 hours
Right to erasure – to emphasize my earlier statement, only collect what you need to collect. This means companies have to delete personal data and any related links if they no longer find it accurate or relevant to the business
Right to information and transparency – customers should have the right to opt out and have a very clear understanding of how you store their personal data and what you do with it.
Data retention is a general concern as the GDPR wants you to use sensitive data only to the extent utterly required to perform a certain task." (Source)
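The 72-hour notification window quoted above is simple arithmetic, but it is worth having in your incident tooling rather than in someone's head. A minimal sketch, assuming the clock starts when the company becomes aware of the breach and that timestamps are kept in UTC; the function names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest moment to notify the supervisory authority."""
    return became_aware_at + NOTIFICATION_WINDOW


def hours_left(became_aware_at: datetime, now: datetime) -> float:
    """Remaining hours until the deadline; negative once it has passed."""
    remaining = notification_deadline(became_aware_at) - now
    return remaining.total_seconds() / 3600


# Illustrative incident: detected on a Monday morning.
aware = datetime(2018, 5, 28, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware)  # 2018-05-31 09:00 UTC
```

Hooked into the event-based monitoring described earlier, a value like `hours_left` could drive escalation automatically instead of relying on someone to count days.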
NB: Privacy Shield and GDPR will go hand in hand, with an uneasy marriage yet to be witnessed.
For digital marketing, managing consent and getting consent right is extremely important: not only do you have to be able to prove consent at any time, you also have to be able to erase data when the call to do so comes.
While we're on the things you need to get right: they all more or less boil down to transparency. In most cases they will challenge your own architecture and the degree to which you have your systems (and their integration) in order.
One takeaway from the GDPR is that applying encryption is counted in your favour as an implementer: for example, how you have to let customers know about a data breach depends heavily on whether you encrypted the data or not.
An important concept of the GDPR is "pseudonymisation", which is dealt with as an alternative to encryption. In practice it comes down to tokenizing and hashing. Following that route and letting go of explicit PII comes with some serious advantages, as mentioned, for example in how you have to behave in case of a breach.
The GDPR is way too big to cover in a side note and I'll not attempt to do that, so here are some resources to start your investigation in case you are not familiar with the GDPR yet:
One is an article by Craig Clark and the other is Microsoft's Trust Center section on the GDPR. Hopefully you are already preparing, because the changes waiting to be implemented are probably many.
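Both requirements, proving consent and erasing on request, can be sketched as a tiny registry. This is an in-memory stand-in with hypothetical names (`ConsentRegistry`, `wording_version` and so on), not a reference to any real product; whether some proof of consent must be retained after erasure is a legal question this sketch deliberately leaves open.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ConsentRecord:
    subject_id: str
    purpose: str          # e.g. "newsletter"
    wording_version: str  # which consent text the person actually saw
    granted_at: datetime


class ConsentRegistry:
    """In-memory stand-in for a consent store; all names are hypothetical."""

    def __init__(self) -> None:
        self._consents: Dict[Tuple[str, str], ConsentRecord] = {}
        self._profiles: Dict[str, dict] = {}

    def grant(self, subject_id: str, purpose: str,
              wording_version: str, profile: dict) -> ConsentRecord:
        record = ConsentRecord(subject_id, purpose, wording_version,
                               datetime.now(timezone.utc))
        self._consents[(subject_id, purpose)] = record
        self._profiles[subject_id] = profile
        return record

    def proof(self, subject_id: str, purpose: str) -> Optional[ConsentRecord]:
        """Return the stored evidence of consent, or None if there is none."""
        return self._consents.get((subject_id, purpose))

    def erase(self, subject_id: str) -> int:
        """Remove the profile and every consent record of one person."""
        keys = [key for key in self._consents if key[0] == subject_id]
        for key in keys:
            del self._consents[key]
        had_profile = self._profiles.pop(subject_id, None) is not None
        return len(keys) + (1 if had_profile else 0)

    def knows(self, subject_id: str) -> bool:
        return subject_id in self._profiles
```

Keeping the purpose and the wording version next to the timestamp is the design choice that matters here: it lets you show not just that someone consented, but to what exactly.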
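Tokenizing, as opposed to hashing, can be sketched as a small vault that hands out random tokens and keeps the mapping to the original value in one well-guarded place. The `TokenVault` class below is a hypothetical in-memory illustration; a real vault would be a separately secured service with its own access controls.

```python
import secrets
from typing import Dict, Optional


class TokenVault:
    """Maps random tokens to original values; hypothetical in-memory stand-in."""

    def __init__(self) -> None:
        self._by_token: Dict[str, str] = {}
        self._by_value: Dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return the existing token for a value, or mint a new random one."""
        if value in self._by_value:
            return self._by_value[value]
        token = "tok_" + secrets.token_hex(16)
        self._by_token[token] = value
        self._by_value[value] = token
        return token

    def detokenize(self, token: str) -> Optional[str]:
        """Look up the original value; None if the token is unknown."""
        return self._by_token.get(token)

    def forget(self, value: str) -> bool:
        """Drop the mapping so existing tokens can no longer be resolved."""
        token = self._by_value.pop(value, None)
        if token is None:
            return False
        del self._by_token[token]
        return True
```

Downstream systems only ever see the token. Because the token is random rather than derived from the value, a leak of those systems reveals nothing by itself, and calling `forget` leaves the tokens scattered elsewhere pointing at nothing.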
Great excuse for tidying up
This article began by stating that data privacy is not a cool topic, while clearly pointing out that it is an important one and that it all boils down to trust. So: uncool but important. Is there more to it? Yes – a chance, a perfect excuse to get all the uneasy renovations and transformations started. Think of it like moving from one place to another. Isn't that always the perfect opportunity to get rid of all the junk and replace the broken stuff with something modern and better? How many companies state they would like to move but are just stuck in their ways? Thing is, not complying with data privacy regulations is becoming more and more of a risk. Managing that risk will more or less force transformation to happen. Why not turn constraints into differentiators by changing it all for the better? Maybe I am overly optimistic about this, but experience shows that individuals only move if pain or gain exceeds a certain threshold. It's up to us technologists to make the best of it. This is a good time to bring on new technical aids in the form of Google's Data Loss Prevention API and the like, which automatically classify sensitive data on the spot, no matter where it came from, thanks to natural language processing and machine learning.
In fact, I would not be surprised to see many solutions arise out of the need to address data privacy as these types of technology get more and more commoditized. Add tactics like sticking to a certain region in AWS and avoiding multi-region redundancy to that equation, or go for the Azure EU cloud, and you are well equipped. These tactics can of course be mixed and matched with other regions and their availability zones for less sensitive data, or with your own data centre in a hybrid model, based on data classifications to avoid overspending. If you go easy on security, namely authentication and authorization, then all these efforts are worthless, so don't go easy. Ever.
The bottom line
Data security and privacy need to be an integral part of every architectural decision; that's not news. Their rising importance in the eyes of the public on the one hand and the legal obligations on the other make it more and more worthwhile to be disciplined about it. Technology can aid you. In fact, technology is probably your only way to deal with the vast amounts of data, decision points and events related to securing the integrity of data and the validity of authorizations granted. Automation is front and center in all of this and, for good reason, an ongoing theme in this article.
Want to read up more on privacy in technology? Here's a read for your daily commute.
I hope this article gave you another, or at least a worthwhile additional, perspective and plenty of starting points for your own investigation. If you feel like exchanging thoughts and ideas on the matter, drop me a message. I am as much on this journey as you probably are and value your critique and ideas.