Blockchain and the General Data Protection Regulation (GDPR)
Blockchain is a much-discussed technology that, according to some, promises to inaugurate a new era of data storage and code execution, which could, in turn, stimulate new business models and markets. The precise impact of the technology is, of course, hard to anticipate with certainty, particularly as many remain skeptical of its potential. In recent times, there has been much discussion in policy circles, academia and the private sector regarding the tension between blockchain and the European Union's General Data Protection Regulation (GDPR). Many of the points of tension are due to two overarching factors, which are examined later in this article.
How It Starts
For the majority of us, it began with email. From the moment we create an account, with a password and so on, we trust someone else to store and maintain our personal information. Then came Internet banking, retailers, hotel booking sites, social networks, mobile phone apps – eventually, almost every commercial or social relationship in our daily lives and ordinary course of business moved online, each one demanding a username and password and a long list of personal information.
All of that information is stored somewhere, but where? It may be in another country, or in several countries. Quite often, even the organisation handling the data cannot be sure precisely where it is stored, how many copies exist, or who might be able to access it. This is because modern network technology makes it very easy to copy data and transmit it instantly across vast distances, ignoring national borders, whereas storing it securely in one location can be costly and inconvenient.
That data has value – not just to its owner, but also to others who may use it for financial gain, for political ends, to steal, or even to cause physical harm. The extreme proliferation of valuable personal data leads to major risks, as recent events have made clear. Cyber-attacks such as WannaCry and the Equifax hack continue to grow in frequency and severity, and represent a flourishing criminal industry based on exploiting the value of personal data. However, this merely mirrors the growth of legitimate business based on the collection and use of personal data, with companies like Google and Facebook now among the world’s biggest; disturbingly, that legitimate business also gives rise to concerns over privacy and potential misuse, as demonstrated by the Cambridge Analytica scandal, in which personal data from 50 million Facebook profiles was reportedly used to influence US voters in the 2016 presidential election.
Hong Kong is one of the world’s hotspots for cyberattacks. Many major incidents are never revealed to the regulators or the press, since the Personal Data (Privacy) Ordinance (Cap. 486) (“PD(P)O”) does not require data users to report breaches and a culture of silence largely prevails. There have, however, been some high-profile incidents, including the infiltration of an inactive database owned by Hong Kong Broadband Network that held information on 380,000 customers, and cyberattacks targeting the personal data of approximately 220,000 individuals held in travel agencies’ databases. The Privacy Commissioner for Personal Data (“PCPD”) has revealed that the number of data breach notifications surged by nearly 20% in 2017.
A New Generation of Data Protection Regulations
To address this concern, a new wave of regulation is sweeping the globe, imposing stricter obligations on those using, controlling or processing personal data to keep it secure. The most prominent example of this trend is the EU’s General Data Protection Regulation (“GDPR”), which became applicable on 25 May 2018.
The GDPR represents a significant tightening-up of previous European data protection regulations; however, although the regulation is European, its influence (along with much associated anxiety) traverses the globe, because of three key elements:
- It has extra-territorial effect. Put simply, a non-EU business (including a Hong Kong business) needs to comply with the GDPR if it offers goods or services to, or monitors the behaviour of, EU residents (Article 3).
- Unlike the PD(P)O, which requires no data breach notification at all, the GDPR requires data controllers to notify the supervisory authority of a data breach within 72 hours of becoming aware of it. Data subjects must also be notified where a breach is likely to result in a high risk to their rights and freedoms (Articles 33 and 34).
- Most concerning of all are the sanctions for breach of the GDPR: the higher of €20 million or up to 4% of an organisation's total worldwide annual turnover of the preceding financial year. Had GDPR been in force during the Cambridge Analytica affair, Facebook could have been fined as much as US$1.9 billion – nearly 3,000 times the £500,000 fine levied by the ICO under the UK's Data Protection Act 1998 (being the maximum fine under that legislation).
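The arithmetic behind the sanctions ceiling described in the last point can be sketched as follows. This is a minimal illustration only: the function name and the turnover figures are hypothetical, and currency conversion is ignored.

```python
def gdpr_max_fine_eur(annual_turnover_eur: float) -> float:
    """Ceiling on the fine as described above: the higher of
    EUR 20 million or 4% of total worldwide annual turnover."""
    flat_cap_eur = 20_000_000
    turnover_share = 0.04
    return max(flat_cap_eur, turnover_share * annual_turnover_eur)


# Hypothetical turnover figures, chosen only to show both branches.
smaller_business_cap = gdpr_max_fine_eur(50_000_000)      # 4% is below the flat cap
larger_business_cap = gdpr_max_fine_eur(40_000_000_000)   # 4% exceeds the flat cap
```

For the smaller business the flat EUR 20 million ceiling applies, whereas for the larger one the turnover-based ceiling takes over, which is why the exposure of very large platforms runs into the billions.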
Other important provisions introduced in the GDPR are restrictions on the transfer of data out of the European Economic Area and a “right to be forgotten”, i.e. a right to request the deletion of data in certain circumstances.
Other jurisdictions around the world are also tightening up their data protection regulations. For example, with the world’s largest online population (around 772 million), China has also taken steps to regulate its cyberspace, introducing a new Cybersecurity Law in June 2017. While the international media has tended to focus on the national security aspects of the legislation, many of its provisions mirror those in the GDPR, including tighter controls on collection and use of data as well as obligations to inform data subjects and authorities of breaches. In the wider Asia-Pacific region, mandatory data breach reporting has also been introduced in South Korea, Taiwan, the Philippines, Indonesia and Australia.
The Limits of Regulations
The stricter regimes implemented by this new generation of data protection regulations, and in particular the heavy sanctions and extra-territorial reach of the GDPR, should encourage many of those handling personal data to do so more responsibly. However, there remain concerns that these regulations cannot alone solve the problem. In particular:
- The technological and commercial forces behind data proliferation will remain – put simply, data is valuable and easier than ever to copy and transmit. As increasing numbers of commercial and governmental entities collect and store multiple copies of valuable personal data, greater risks will ensue.
- Despite the GDPR's extra-territorial application, in practice it may prove difficult to enforce against infringers based outside the EU. While the GDPR seeks to address this by requiring entities handling EU residents' data to appoint an EU-based representative, it is easy to foresee widespread violation of this requirement.
- There are already indications that the cost of compliance with the GDPR is too much for many smaller businesses to bear – especially technology companies handling large volumes of personal data.
- In today's connected world, the restrictions on cross-border data transfer set out in the GDPR and China's Cybersecurity Law are arguably unworkable and fail to acknowledge the Internet's capacity to transcend national boundaries.
What, then, is the solution?
It has been argued that blockchain technologies might be a suitable tool to achieve some of the GDPR's underlying objectives.
- Decentralisation
As the Cambridge Analytica scandal shows, having thousands of copies of sensitive information held by a centralised platform is inherently problematic. At the core of blockchain technology is the concept of decentralisation: a blockchain is typically run on a peer-to-peer network, meaning that there is no central entity with direct access to users' private information, reducing the opportunity for data to be harvested and sold. Furthermore, the absence of a centralised point of vulnerability reduces the risk of data leaks resulting from cyberattacks or human error.
- Encryption
Crucially, encryption can be used to protect data privacy on a blockchain network. If an individual consents to share his or her personal data with a particular organisation, the associated decryption key will be transferred to the recipient, who will then be able to use it to unlock the encrypted data. As opposed to most modern storage systems, where there is root-level administrator access, there is no "back door" through which a third party could access users’ information; nobody other than the holder of the decryption key is able to unlock the encrypted data.
- Immutability
Once a record is stored and spread across the nodes, groups of data are built into blocks and chronologically chained to each other, making it virtually impossible to change existing data without altering the other blocks. Compared with traditional databases, a blockchain provides greater reliability and security, since any manipulation of data can be easily identified and traced.
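The chaining idea described above can be illustrated with a minimal sketch. It is not how any particular blockchain is implemented: the block layout and the function names are assumptions made for illustration, and consensus and replication across nodes are left out entirely.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the hash of the previous block."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Chain records in order, each block committing to its predecessor."""
    chain, prev_hash = [], GENESIS_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev_hash = GENESIS_HASH
    for block in chain:
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["prev_hash"] != prev_hash or block["hash"] != recomputed:
            return block["index"]
        prev_hash = block["hash"]
    return None


chain = build_chain(["record A", "record B", "record C"])
valid_before = first_invalid_block(chain)   # None: the chain verifies
chain[1]["data"] = "record B (edited)"      # tamper with an existing block
tampered_at = first_invalid_block(chain)    # 1: the edit is detected
```

Because each block's hash covers the previous block's hash, an edit to one block can only be hidden by recomputing every later block as well, which is what makes manipulation easy to identify.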
The Tension Between Blockchain and the GDPR
In recent years, multiple points of tension between blockchain technologies and the GDPR have been identified. Broadly, it can be argued that these tensions are due to two overarching factors. First, the GDPR is based on the underlying assumption that in relation to each personal data point there is at least one natural or legal person – the data controller – whom data subjects can address to enforce their rights under EU data protection law. Blockchains, however, often seek to achieve decentralisation by replacing a unitary actor with many different players. This makes the allocation of responsibility and accountability burdensome, particularly in light of the uncertain contours of the notion of (joint) controllership under the regulation.
A further complicating factor in this respect is that, in light of recent case law developments, determining which entities qualify as (joint) controllers is fraught with legal uncertainty and unpredictability.
Second, the GDPR is based on the assumption that data can be modified or erased where necessary to comply with legal requirements such as Articles 16 and 17 GDPR. Blockchains, however, render such modifications of data purposefully onerous in order to ensure data integrity and to increase trust in the network. Again, this tension is compounded by existing uncertainty in EU data protection law itself. For instance, it is presently unclear how the notion of 'erasure' in Article 17 GDPR ought to be interpreted. These tensions play out in many domains. For example, there is an ongoing debate surrounding whether data typically stored on a distributed ledger, such as public keys and transactional data, qualifies as personal data for the purposes of the GDPR.
Specifically, the question is whether personal data that has been encrypted or hashed still qualifies as personal data. Whereas it is often assumed that this is not the case, such data likely does qualify as personal data for GDPR purposes, meaning that European data protection law applies where such data is processed. More broadly, this analysis also highlights the difficulty in determining whether data that was once personal data can be sufficiently 'anonymised' to meet the GDPR threshold of anonymisation.
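One reason hashed data may still count as personal data can be shown with a short sketch. It assumes, purely for illustration, that a ledger stores the unsalted SHA-256 hash of a date of birth; the date used and the function names are hypothetical. Because the set of plausible inputs is small, anyone can hash every candidate and compare.

```python
import hashlib
from datetime import date, timedelta


def sha256_hex(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def candidate_birth_dates(start_year=1900, end_year=2020):
    """Yield every date in the range as an ISO string (roughly 44,000 values)."""
    day, last = date(start_year, 1, 1), date(end_year, 12, 31)
    while day <= last:
        yield day.isoformat()
        day += timedelta(days=1)


def reidentify(target_hash, candidates):
    """Brute-force search: hash each candidate until one matches."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# Hypothetical value written to the ledger in hashed form.
stored_hash = sha256_hex("1985-03-14")
recovered = reidentify(stored_hash, candidate_birth_dates())  # "1985-03-14"
```

The hash hides the value only from someone who cannot guess the input space. Salting or keyed hashing raises the cost of such a search, but whether that is enough to meet the GDPR threshold of anonymisation is exactly the open question discussed above.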
Another example of the tension between blockchain and the GDPR relates to the overarching principles of data minimisation and purpose limitation. Whereas the GDPR requires that personal data that is processed be kept to a minimum and only processed for purposes that have been specified in advance, these principles can be hard to apply to blockchain technologies. Distributed ledgers are append-only databases that continuously grow as new data is added. In addition, such data is replicated on many different computers. Both aspects are problematic from the perspective of the data minimisation principle.
It is moreover unclear how the 'purpose' of personal data processing ought to be applied in the blockchain context, specifically whether this only includes the initial transaction or whether it also encompasses the continued processing of personal data (such as its storage and its usage for consensus) once it has been put on-chain. It is the tension between the right to erasure (the 'right to be forgotten') and blockchains that has probably been discussed most in recent years. Indeed, blockchains are usually deliberately designed to render the (unilateral) modification of data difficult or impossible. This, of course, is hard to reconcile with the GDPR's requirements that personal data must be amended (under Article 16 GDPR) and erased (under Article 17 GDPR) in specific circumstances.
Although blockchain technology’s potential to mitigate data privacy and security risks is beginning to be recognised, the question of whether its use will comply with the new generation of data protection regulations has received little attention. It is perhaps unsurprising, however, that potential difficulties do arise – the GDPR was drafted for a world of more traditional centralised databases, while the blockchain is, by its nature, decentralised.
On the face of it, it may be difficult to reconcile the use of technology designed to maintain multiple copies of (albeit encrypted) data on thousands of networked computers with regulations aimed at controlling where data is stored and by whom. In addition, the immutability of blockchain databases would appear to put the technology at odds with the “right to be forgotten” provided under the GDPR – while Article 17 grants individuals the right, in certain circumstances, to require a data controller to erase their personal data, it is technically difficult, if not impossible, to delete data from a blockchain. Records can be updated and amended by supplementation, but the past cannot be erased.
One potential way to resolve this conflict between technology and regulation is to store the data in question off the blockchain, with only proof of its existence and integrity (known as a “hash”) kept on the chain. For instance, with a blockchain established for the purpose of maintaining medical records, only the hashes would be stored on the chain, the actual medical records remaining in secure “off-chain” hospital databases, thereby reducing proliferation and ensuring that the data can be deleted if required. Whether this technique would be sufficient to ensure regulatory compliance, however, remains unclear – even a hash could constitute personal data – and the technology presents a range of other compliance challenges.
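The off-chain pattern described above can be sketched in a few lines. This is an illustrative model under stated assumptions, not a production design: the class name, the record identifier and the sample content are hypothetical, a Python list stands in for the append-only ledger, and a dictionary stands in for the hospital database.

```python
import hashlib


class HashAnchoredStore:
    """Keeps records off-chain and only their hashes on an append-only list."""

    def __init__(self):
        self.on_chain_hashes = []    # stand-in for the ledger: never shrinks
        self.off_chain_records = {}  # stand-in for the off-chain database

    def store(self, record_id: str, content: bytes) -> int:
        self.off_chain_records[record_id] = content
        self.on_chain_hashes.append(hashlib.sha256(content).hexdigest())
        return len(self.on_chain_hashes) - 1  # position of the hash on the chain

    def verify(self, record_id: str, position: int) -> bool:
        content = self.off_chain_records.get(record_id)
        if content is None:
            return False  # the record has been erased or never existed
        return hashlib.sha256(content).hexdigest() == self.on_chain_hashes[position]

    def erase(self, record_id: str) -> None:
        # Only the off-chain copy can be deleted; the hash stays on the chain.
        self.off_chain_records.pop(record_id, None)


store = HashAnchoredStore()
position = store.store("patient-001", b"blood type: O+")
intact_before = store.verify("patient-001", position)  # True
store.erase("patient-001")
intact_after = store.verify("patient-001", position)   # False
hashes_left_on_chain = len(store.on_chain_hashes)      # 1
```

The record itself can be deleted on request, yet its hash remains on the ledger, which is why the compliance question turns on whether that leftover hash is itself personal data.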
Despite regulators’ valiant efforts to update data protection regulations to address modern risks, the gulf between law and technology has perhaps never been wider. Lawmakers now face the daunting task of keeping up with fast-moving technology and ensuring that regulation does not stand in the way of much-needed technological solutions.
Blockchain as a Means to Achieve GDPR Objectives
It has been argued that blockchain technologies might be a suitable tool to achieve some of the GDPR's underlying objectives. Indeed, blockchain technologies are a data governance tool that could support alternative forms of data management and distribution and provide benefits compared with other contemporary solutions. Blockchains can be designed to enable data-sharing without the need for a central trusted intermediary, they offer transparency as to who has accessed data, and blockchain-based smart contracts can moreover automate the sharing of data, hence also reducing transaction costs. Furthermore, blockchains' crypto-economic incentive structures might have the potential to influence the current economics behind data-sharing. These features may benefit the contemporary data economy more widely, such as where they serve to support data marketplaces by facilitating the inter-institutional sharing of data, which may in turn support the development of artificial intelligence in the European Union.
These same features may, however, also be relied upon to support some of the GDPR's objectives, such as providing data subjects with more control over the personal data that directly or indirectly relates to them. This rationale is also reflected in data subject rights, such as the right of access (Article 15 GDPR) or the right to data portability (Article 20 GDPR), which give data subjects control over what others do with their personal data and what they can do with that personal data themselves. These ideas might be helpful in ensuring compliance with the right of access to personal data that data subjects enjoy under Article 15 GDPR. Furthermore, distributed ledger technology (DLT) could support control over personal data by allowing data subjects to monitor respect for the purpose limitation principle. In the same spirit, the technology could be used to help with the detection of data breaches and fraud.
Practical Applications
Blockchain technology is rapidly being adopted to help manage and verify personal data. For example, since as far back as 2012, Estonia has been using the technology in its data registries across national health, judicial, legislative, security and commercial code systems. Illinois is also testing various blockchain initiatives, including a birth registry which will allow companies and government departments to verify and authenticate citizens’ identities by making a request for encrypted access to certain information, such as name, date of birth, sex or blood type.
One of the clearest use cases for blockchain technology is to address concerns over the security of medical records. Around the world, large, centralised databases containing medical records have frequently been targeted by hackers; for example, in July 2018, it emerged that personal data belonging to approximately 1.5 million Singaporeans – including Singapore’s Prime Minister, Lee Hsien Loong – had been stolen from the database of the country’s largest healthcare institution. Were a similar data breach to occur in Hong Kong, only very limited sanctions would be available under the PD(P)O (although civil claims could also be brought in the courts). It remains to be seen whether those sanctions will be strengthened in line with global trends, but penalties, however harsh, cannot undo the damage done.
A number of blockchain projects are now in development seeking to address the problem at source. Developers are building new infrastructure to allow medical records to be stored on a “permissioned” blockchain where only members of a closed group will have access rights. With this technology, medical records will be encrypted and not directly accessible on the blockchain itself; users will only be given indicators as to the actual location of the records, and particular members’ rights to access the data can be limited in scope and time as required. This offers greater security in terms of data management, as opposed to a traditional database in which personal data can be obtained directly from a centralised location. These and similar developments have the potential to give individuals greater control over their personal data and reduce the risks associated with data proliferation.
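The access model described above, in which members receive only a pointer to a record and rights limited in scope and time, might look roughly like the following sketch. All names here are hypothetical, including the member identifiers and the pointer format; real permissioned ledgers define their own access-control mechanisms.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    member: str              # member of the closed group the grant is issued to
    record_pointer: str      # indicator of where the record lives, not the record
    scope: frozenset         # permitted actions, e.g. {"read"}
    expires_at: datetime     # the grant is void from this moment on


def may_access(grant: AccessGrant, member: str, action: str, now: datetime) -> bool:
    """A request succeeds only for the right member, action and time window."""
    return (
        grant.member == member
        and action in grant.scope
        and now < grant.expires_at
    )


issued = datetime(2018, 9, 1, tzinfo=timezone.utc)
grant = AccessGrant(
    member="clinic-a",
    record_pointer="hospital-db://records/patient-001",  # hypothetical format
    scope=frozenset({"read"}),
    expires_at=issued + timedelta(days=7),
)

read_in_window = may_access(grant, "clinic-a", "read", issued + timedelta(days=1))
write_in_window = may_access(grant, "clinic-a", "write", issued + timedelta(days=1))
read_after_expiry = may_access(grant, "clinic-a", "read", issued + timedelta(days=8))
read_by_outsider = may_access(grant, "clinic-b", "read", issued + timedelta(days=1))
```

Only the first request succeeds. The other three fail on scope, expiry and membership respectively, which is the sense in which access can be limited "in scope and time as required".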