Privacy vs. Security in a Big Data World
Tamara Dull
Principal Product Marketing Lead, Migration & Modernization at Amazon Web Services (AWS) | Dog Rescuer
I still haven’t decided if Edward Snowden is a hero, a traitor, or a schmuck. I watched Citizenfour, the documentary that captures his story in real time, thinking it would help me figure it out, but I walked away liking him more and his actions less, and still not ready to hang a scarlet letter around his neck.
What I do know, however (and I thank him for this), is that Snowden helped bring the discussion of big data privacy and security to the public square, and not just the American public square but the global one as well. This is a good thing because in this era of big data, not to mention the Internet of Things, we can no longer relegate this discussion to the privacy freaks and security geeks in the back room. It’s a discussion in which we all should participate.
To understand it better, let’s take a brief look at some of the privacy and security issues in the context of the (big) data lifecycle.
Privacy, Security, And The Data Lifecycle
In data security circles, the six stages of the data lifecycle are well known: create, store, use, share, archive, and destroy. While these six stages have a strong foundation in security, it is worth noting that the two privacy-related stages – use and share – sit squarely in the middle. Is it just a coincidence that privacy is at the heart of the matter?
Create
If data is not collected and/or created, there is no need to secure it. This may seem obvious, but it’s astonishing how many websites and apps forget or disregard this point. They collect it all “just in case” – with little consideration for how the data may be handled downstream.
Why this matters: Data security begins at the point of creation or collection. Organizations need to be deliberate in the data they request or receive, and individuals should be mindful of the data they’re sharing – whether it’s sensitive data on a financial site or a viral video on YouTube. If this data is not secured, it could end up in the wrong hands.
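For readers who build these websites and apps, being deliberate at collection time can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any particular site's implementation: the `ALLOWED_FIELDS` allowlist, the `minimize` helper, and the form fields are all hypothetical.

```python
# Hypothetical allowlist: the only fields this imaginary sign-up form needs.
ALLOWED_FIELDS = {"email", "display_name"}


def minimize(submitted: dict) -> dict:
    """Keep only allowlisted fields; anything else is never stored."""
    return {key: value for key, value in submitted.items() if key in ALLOWED_FIELDS}


# Example submission that includes data the service does not need.
form = {
    "email": "pat@example.com",
    "display_name": "Pat",
    "birth_date": "1980-01-01",
    "phone": "555-0100",
}

stored = minimize(form)
print(sorted(stored))  # ['display_name', 'email']
```

Data that is dropped at this point never has to be stored, shared, archived, or destroyed later, which is why the allowlist sits at the very front of the lifecycle.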
Store
With the volume of big data being generated these days, it’s not just a question of what data to store, but also how to store it all without blowing the budget. Open-source big data technologies are helping to greatly reduce the cost of data storage, both on-premises and in the cloud.
Why this matters: If an organization creates or collects data, it becomes that organization’s responsibility – not the individual’s – to secure and protect it from corruption, destruction, interception, loss, or unauthorized access. Some organizations take this responsibility more seriously than others.
Use
When an individual sets up a new account with an organization through its website/app, the individual is asked to read and agree to the terms of service and/or privacy policy. This legal contract typically defines how the individual’s data will be managed and used inside and outside the organization. Granted, few people read this legalese, but our expectation is that the organization will use our data “responsibly,” and when this usage changes, we expect to be notified.
Why this matters: It’s the usage – not the collection or storage – of data that concerns most people. It’s this stage where individuals want to be in control. For example, they want to set the dial on how public or private their data should be, who can access their data, and whether their data (aggregated or not) can be sold or rented to third parties. In this big data era, when organizations don’t provide this level of privacy control, they risk losing the loyalty and trust of their customers and users.
Share
Organizations continue to share data between internal systems and external partners, but with the advent of social networks and “smart” devices, sharing data has become a public pastime – even to the point of “selfie” narcissism.
Why this matters: On one hand, individuals want control over how their personal data is being used. Yet some of these same individuals show little to no restraint in what personal data they’re sharing. Even though it’s the responsibility of the organization behind the website or app to secure users’ data and respect privacy settings (if they exist), it’s up to the individual to determine what and how much information they’re willing to share. If you put it on the internet, it’s not a question of if, but when, your information may be used in unintended ways.
Archive
Between big data technologies and the cloud, it’s become relatively cost-effective for organizations to store data for longer periods of time, if not indefinitely. In some cases, regulations stipulate how long certain data must be kept – as in the US financial and health industries – but in most cases, the budget and space constraints that once limited retention are being alleviated.
Why this matters: Being able to store more data for longer periods of time at a fraction of the cost is an appealing proposition for organizations. The more exciting proposition, however, is the ability to analyze even more data over greater periods of time to discover new questions, patterns, trends, and anomalies. The gotcha here is: The more data an organization stores and archives, the more data it has to secure.
Destroy
If and when data is tagged for destruction, the question is to what extent. For example, if a website user requests that his account be deleted, what does this mean? Is just the access to his account/data removed (so that he can request access later if he changes his mind) or does a deletion request trigger the destruction of all his data, including archived data? The answer most likely lies somewhere in between for most organizations.
Why this matters: Regulations and governance policies will dictate the extent to which data may be destroyed for many organizations. The data that does not get destroyed must then be secured. So using the example above, if a website user requests that his account be deleted, and he receives an email notification to that effect, what he doesn’t know is what personal data, if any, still exists in the organization’s systems. He may still be vulnerable to a potential data breach, long after he’s been deleted.
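The difference between removing access and destroying data can be made concrete with a toy Python model. It is a sketch under stated assumptions, not how any real organization handles deletion: the `UserStore` class, its `deactivate` and `purge` methods, and the sample user are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UserStore:
    """Toy stand-in for an organization's live and archived systems."""

    live: dict = field(default_factory=dict)     # user_id -> profile data
    archive: dict = field(default_factory=dict)  # user_id -> archived copies
    active: set = field(default_factory=set)     # user_ids that can still log in

    def deactivate(self, user_id: str) -> None:
        """Remove access only; the data itself stays in place."""
        self.active.discard(user_id)

    def purge(self, user_id: str) -> None:
        """Remove access and destroy both live and archived data."""
        self.active.discard(user_id)
        self.live.pop(user_id, None)
        self.archive.pop(user_id, None)

    def residual_data(self, user_id: str) -> bool:
        """True if any personal data for this user still exists."""
        return user_id in self.live or user_id in self.archive


def make_store() -> UserStore:
    """Build a store holding one hypothetical user with live and archived data."""
    return UserStore(
        live={"u1": {"email": "pat@example.com"}},
        archive={"u1": [{"email": "old@example.com"}]},
        active={"u1"},
    )


soft = make_store()
soft.deactivate("u1")
print(soft.residual_data("u1"))  # True: account is gone from view, data remains

hard = make_store()
hard.purge("u1")
print(hard.residual_data("u1"))  # False: live and archived data are destroyed
```

In both cases the user could receive the same "your account has been deleted" email, yet only the second path leaves nothing behind to be exposed in a later breach.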
It Cuts Both Ways
While a citizen’s right to privacy and freedom from government surveillance has been top of mind for Edward Snowden, national security has been top of mind for the US government.
And therein lies the rub: security cuts both ways. On one hand, it’s the responsibility of an organization to secure and protect any digital information it collects, stores, and transmits. But on the other hand, our governments are knocking on organizations’ doors demanding access to this protected information – all in the name of preserving national security.
This is only the beginning. Snowden may have been a catalyst in getting the big data privacy discussion started, but it’s not his to carry on and finish. It’s yours, mine, and ours.
Originally written for and published on Brand Quarterly.