Data Ethics: Is Health Data Underregulated?
Dimitrios Kalogeropoulos, PhD
CEO Global Health & Digital Innovation Foundation | UCL GBSH Health Executive in Residence | EU AI Office GPAI Code of Practice | PhD AI in Medicine | IEEE European Public Policy Committee | Chair IEEE P3493.1 | Speaker
The full Q&A from the PrivSec Global discussion on March 23, 2021.
The key message I would like to get across today is that data protection, ethics and regulation serve data sharing, and that to deliver we must regulate our data-sharing habits.
Let me give you a little background. International development consulting has certainly been a fascinating and challenging journey, discussing the need for data sharing with government officials, international development partners and high-level policy advisory committees.
From AI Methods to Global Regulatory Framework Development, the goal has always been to facilitate Universal Health Coverage with shared data, supporting value-based care collaboration, empowering and strengthening primary care, and improving clinical economics. These are very important qualities for a health system called to deal with today's pressing health matters in a reality of limited resources.
For this reason, in the sector development strategy of each of the countries I have worked in, I encountered one common global concern: how to gain access to quality, reliable data. The thing is, this cannot be achieved without strategic, regulatory oversight – something made abundantly clear by enterprising big-data AI applications over the past few years.
Again, to gain access to reliable data, it first needs to be captured and shared. The problem is that the health sector, in contrast to industries such as banking or travel, and in spite of its needs, has done very little sharing and has therefore been terrible at capturing meaningful data that can be reliably shared. And with little sharing going on, it is easy to protect data. All you have to do is watch it travel from the admissions department to a payer's computer system, or to a cancer registry.
Then, literally overnight, we witnessed sharing complexity reach historic heights, and this is when regulators suddenly realised we need long-life-milk kinds of protection for our health data, as data now needs to travel far and across many functional, semantic, institutional, jurisdictional and sovereign boundaries. What happened is we realised we were in a worse situation than we thought.
Well, change is sometimes forced upon us. And to succeed, change needs to be far wider in scope and certainly proactive enough. This is an experience I want to bring to the conversation today.
Now is the time to discuss what more we can do to protect data and truly enable reliable data sharing, to leverage high quality data and to finally steer our health systems into the kind of 21st century where we can respond to any data challenge at the touch of a button.
Why has the roll-out of the Covid-19 vaccine in Israel been such a success story?
Allow me to provide a little context before answering.
In Europe, the GDPR has been instrumental in enabling digital development. Yet when sharing data became an imperative for global health, the data was simply not there. For instance, how much comorbidity information do we have without a study? And how long is it before we can have the raw data?
We found ourselves in this situation because, whatever the use, health institutions find it difficult to implement data sharing formatively. A number of things are to blame for this, many of which are currently being resolved, partly because of the pandemic.
It is fair to also say that many important developments took place before the coronavirus, and as a result countries were better prepared to collaborate and deal with this public health emergency. For instance, the GDPR was already being harmonised, to varying degrees, with national health data protection law.
Moreover, the disease has accelerated progress considerably, and has been driving the adoption of innovative solutions, many of which are AI-based, that were ignored for quite some time. This is important, considering health is a very stubborn sector of the economy, which over the years, and despite best intentions, became very good at entrenching interests.
For example, Israel's vaccination programme relied on enabling legislation, a well-developed public-health network, and well-assimilated EMRs in order to dispense data under what the EU GDPR calls substantial public-interest conditions. As we all know, Israel conducts and capitalises on a series of vaccine effectiveness and impact studies with its own real-world data, on behalf of vaccine manufacturers and ultimately its people.
As expected, those of us who feel the pulse of news on social networking platforms have detected the criticism related to this arrangement, and this is why proactive regulatory oversight is important. When communicated clearly, together with a good measure of how it serves goals and benefits, it does away with doubts about the lawfulness and ethics of accelerated digital development.
There are other examples. For instance, a country might offer digital innovation sandboxing to test AI tools such as symptom checkers. This is what Rwanda has been doing, with a world innovation capitalisation strategy pitched as "if you have a great idea, test it in Rwanda and scale it up to the rest of the world".
Again, an enabled data platform is leveraged to foster digital development.
All such examples are of forward-thinking countries, which capitalise on their collective digital development efforts, with regulatory data sharing ecosystems and the proactive transformation of their health sectors.
So, what have we learned?
Clearly, data sharing is very important for our health. We have also agreed that sharing and protecting go hand-in-hand. In other words, to enable data sharing we must guarantee the protection of data, as agreed in the EU's GDPR, the UK Data Protection Act, or the Health Information Privacy provisions of the US HIPAA (Health Insurance Portability and Accountability Act).
This sounds like not much, but actually it is. What is not so clear, or has not been agreed at all, is the following.
Firstly, goals must be clear and the means to reach them well justified. Otherwise our sense of protection is fake, unjust or biased. For instance, protecting data means more than privacy when the ethics of AI deployment are discussed.
Once goals are well-defined, sharing protected data also requires defining what we are protecting against, in order to evaluate the protection offered. There is no room for ambiguity here. Again, we must protect against misuse of patients' data that encourages bias, discrimination, or the exacerbation of existing biases.
This protection is addressed by ethical AI governance frameworks but somehow they skim over the data-ethics core.
We keep forgetting that, together with the GDPR data minimisation principle, we need to provide for and protect (1) the Completeness and Accuracy of data for all uses, thereby protecting reliability.
We also seem to forget that, further to enabling data sharing by means of GDPR, or equivalent, and governance models, we must protect (2) the Meaning of data.
The goal is to succeed in safely moving away from care silos and siloed data. For that, we must adopt successful data sharing paradigms, building on popular ideas such as big data analytics, which is now long analytics, and data collaboratives and trusts, such as for example the ones established to combat rare diseases and develop orphan drugs.
And there is more. Protecting the meaning of data, offers significant Spillover Effects that benefit privacy.
To name some, we can:
- Protect the accuracy of data, which is an under-regulated yet fundamental GDPR principle,
- Extend the useful life of captured data and thus make data re-usable and longitudinal,
- Protect a perpetually extended right to recall data sharing permissions, regardless of de-identification mechanisms, and thus make provisions for a number of things: true user-centric privacy management, the building of collective agencies and chains of trust, the implementation of decentralised identifiers or self-sovereign identity management, and ultimately patient or self-agency in data uses; all of which further promote trust.
The upshot is that extending data regulation to protect the Original Longitudinal & Integrated Care Contexts makes meaningful, usable and useful data accessible to those mandated. There are standards and guidance for this, many of which were developed by the World Health Organization. Of course, adopting them comes with its own set of challenges.
Imagine blockchain used for this purpose: to implement perpetual provenance protection with digital fingerprinting, even for anonymised data, and context immutability to safeguard the reliability of data, such as is required, for example, to build ground truths in machine learning. Incidentally, this is a task which accounts for over 80% of the effort in machine learning implementation projects, and is often carried out without even the prospect of re-using the data.
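To make the fingerprinting idea concrete, here is a minimal sketch, illustrative only and not tied to any particular blockchain platform: each record entry is hashed together with the fingerprint of its predecessor, so that any later alteration of a record's clinical context breaks the chain and becomes detectable. All names in the sketch are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # fingerprint used before any entries exist

def fingerprint(entry: dict, prev_hash: str) -> str:
    """Digital fingerprint of a record entry, chained to its predecessor."""
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class ProvenanceLog:
    """Append-only log: each entry carries a hash linking it to the previous
    one, so tampering with any entry invalidates every later fingerprint."""

    def __init__(self):
        self.entries = []  # list of (entry, fingerprint) pairs

    def append(self, entry: dict) -> str:
        prev = self.entries[-1][1] if self.entries else GENESIS
        h = fingerprint(entry, prev)
        self.entries.append((entry, h))
        return h

    def verify(self) -> bool:
        """Recompute the whole chain; False if any entry was altered."""
        prev = GENESIS
        for entry, h in self.entries:
            if fingerprint(entry, prev) != h:
                return False
            prev = h
        return True
```

A real deployment would distribute these fingerprints across a ledger rather than hold them in one process, which is precisely what makes the provenance protection "perpetual": no single custodian can quietly rewrite the context.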
And consider also what we may achieve with readily available context- and outcome-specific Long data, a quality upgrade of the big data paradigm which has been producing the marvels of systems medicine and precision treatment development, also applied to the coronavirus disease.
How can the lessons being learned from Israel's vaccine programme be utilised to support better regulation of health data?
The lesson is that data sharing comes with a great many strategic advantages: for the health system, society and scientific progress, in strengthening the collaboration between industry and government to benefit society, and for the acceleration of digital investments and transformation toward digital economy growth.
One key lesson regarding better regulation comes from observing the flow of real-world data as reported in the context of the pragmatic controlled trials in Israel. I believe some effort went into producing the necessary results, in spite of the country’s technological and health system preparedness. This is the 80% wasted effort share mentioned earlier.
Again, post-enablement, there is plenty of room for improvement and progress.
Another important lesson regarding better regulation is in the potential to leverage national health data ecosystems and to capitalise on digital development with much more than the industry-centric productivity gains expected from data sharing toward and within payer systems. Refocusing our expectations is of the essence.
I am certain there is plenty more to come in due course.
What would the benefits be from more regulatory oversight for the digital health industry?
Well, the goal is to establish organised, lawful, effective, reliable and interoperable national, regional and global health data spaces, driven by specific purposes such as rare-disease data trusts, developed to foster care and medical research innovation, and to deliver vertical and horizontal spillovers within and across sectors, thus benefiting patients and populations as well as innovators and industry start-ups.
This is an evolutionary process and needs oversight to deliver results. Consider for instance the European Health Data Space vision. The list of benefits is endless.
Allow me to deviate a little by saying that, en route to this vision, a "Health Data Protection Regulation" would build on the GDPR to provide regulatory oversight related to data reliability, and should include the following provisions:
- Prohibit or at least discourage orphan data
- Promote with data minimisation the principle of longitudinality as a minimum reliability standard
- Enforce collective data provenance assurance and clinical context immutability, to reflect the actual patient journey
- Protect patient agency over data and decentralised self-sovereignty to truly enable health data sharing
- Extend trust services to cover with patient-agency the entire distributed and semantically federated health data Ledger, including the collective digital fingerprinting of each record entry.
Benefits would include:
- Protection against unethical A.I., for instance by enabling the establishment of Reliable Non-Discriminative Ground Truths to steer AI into trusted clinical use
- Furnish medical research innovation and Systems Clinics with trusted Longitudinal Data sources as needed, for example, for observational, outcomes-based policy R&D
- Foster care provider collaboration and Clinical / Care Innovation with shared health data spaces
- Accelerate the integration of Precision Medicine into the disease eradication arsenal.
Having said all that, let us not forget, regulatory oversight requires the setting of and compliance with standards.
For instance, we can regulate all we want, but without HL7 messaging no one in the industry will move health data around for you. And this is but one of the standards that must be used proficiently in order to protect accuracy.
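To give a flavour of what "HL7 messaging" looks like in practice, here is a sketch that assembles and reads a minimal HL7 v2 ADT (admission) message. The application and facility names are invented for illustration, and a production interface would rely on a tested HL7 library rather than this kind of string handling.

```python
def build_adt_message(patient_id: str, family: str, given: str) -> str:
    """Assemble a minimal HL7 v2 ADT^A01 (admission) message.
    Segments are separated by carriage returns, fields by '|'."""
    msh = ("MSH|^~\\&|DEMO_APP|DEMO_FAC|DEST_APP|DEST_FAC|"
           "20210323120000||ADT^A01|MSG0001|P|2.5")
    pid = f"PID|1||{patient_id}^^^HOSP^MR||{family}^{given}"
    return "\r".join([msh, pid])

def get_field(message: str, segment: str, index: int) -> str:
    """Fetch a field by segment name and positional index after splitting
    on '|' (note: official HL7 field numbering differs slightly for MSH)."""
    for seg in message.split("\r"):
        fields = seg.split("|")
        if fields[0] == segment:
            return fields[index]
    raise KeyError(segment)
```

Even this toy example shows why proficiency matters for accuracy: the meaning of every field is positional, so a sender and receiver that disagree on segment structure will silently corrupt the data they exchange.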
What is the work being done to improve the regulation of health data?
Data regulation is a responsibility which is no longer restricted by the mandates of care providers and concepts such as EMRs, EPRs or EHRs. These are data capture tools which, instead of resolving the problem of data and care fragmentation, accentuate it.
Luckily, the need for more and better data has been driving the formation of data collaboratives, such as the ones encountered in the US, where payer systems have been leveraging their care networks and ICTs to build up data spaces to scale and scope, which they can then exploit to develop digital innovation capacities.
Scale up a little and we have big data, a kind of open democratic version of data collaboratives, now accessible by everyone, small or big, brought about by the democratisation of frontier data technologies. For instance, today Facebook data are mined for emotions in order to predict postpartum depression.
It is a kind of deregulation of the platform economy which made this possible. And with it came the world’s love-hate relationship with A.I. and the endless discussion on ethical A.I.
So now we are once again establishing rules to bring order into the chaos of deregulated data spaces and to mitigate the many risks those came with. Big data is being silently transformed into organised big data, where different problem-oriented macroscopic views are possible, given the proper standards.
Thing is, if regulating EMRs as EHRs was hard enough in order to enable data sharing and thus clinical process sharing, imagine today’s pressure on those responsible to enable regulatory oversight with basic tools such as the GDPR.
As we speak, the EU is creating a European Self-Sovereign Identity Framework (ESSIF) which is compatible with the EU's electronic Identification, Authentication and trust Services regulation (eIDAS). The identity framework is a tech platform which makes use of decentralised identifiers (DIDs) and the European Blockchain Services Infrastructure (EBSI) to bring the EHDS vision a step closer.
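For readers unfamiliar with DIDs, the sketch below shows the general shape of a W3C-style DID document: the identifier itself plus the public verification material a relying party can resolve to authenticate its controller. It uses the `did:example` placeholder method reserved by the W3C specification, the key value is a dummy, and ESSIF/EBSI define their own methods and infrastructure on top of concepts like this.

```python
def make_did_document(method: str, unique_id: str, public_key_multibase: str) -> dict:
    """Build a minimal W3C-style DID document: a decentralised identifier
    plus the verification key used to authenticate its controller."""
    did = f"did:{method}:{unique_id}"
    return {
        "@context": "https://www.w3.org/ns/did/v1",
        "id": did,
        "verificationMethod": [{
            "id": f"{did}#key-1",
            "type": "Ed25519VerificationKey2020",
            "controller": did,
            "publicKeyMultibase": public_key_multibase,
        }],
        # References the key above as usable for authentication
        "authentication": [f"{did}#key-1"],
    }
```

The design point for health data is that the patient, not an institution, controls the keys behind the identifier, which is what makes patient agency over data sharing technically enforceable rather than merely contractual.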
There is also the GAIA-X project, which supports data infrastructure development and is not limited to the health sector (Γαῖα in Greek mythology symbolised a greater notion of earth, as in our material existence within chaos).
The FDA regulates Software as a Medical Device (SaMD), as does the Medical Device Regulation (MDR) in the EU; both mechanisms were established roughly three years ago.
SaMD and MDR are concerned with the traditional moral values built into the medical profession, pharma regulation and increasingly HTA: that is, safety, beneficence and non-maleficence, efficacy and effectiveness.
Then there are independent regulatory agencies such as the UK’s Organisation for the Review of Care and Health Applications (ORCHA), a digital health evaluation and distribution organisation that aims to establish order in the digital health application maze by catalysing, accrediting and cataloguing them.
Finally, there are the various Ethical AI Governance Frameworks.
One thing is certain, we are experiencing one of those daring and disruptive innovation diversification periods in our history, and what is soon to follow is convergence, as in the collective life-improving design processes we are undertaking as a society. This is the role of data regulation and governance. The question is: are we really coming up to another GSM type of milestone? And have we learned a lesson from our mistakes of the past?
Can health data regulation and commercial desire ever be in agreement?
Yes of course. In fact they should coexist in harmony, and this is what this is all about - the point I want to get across today. This is about data for sustainable digital development acceleration in health. We cannot harm our health systems by safely regulating digital innovators into our health data spaces. Quite the opposite. But it has to be done properly.
The GDPR is a good example of a solid foundation for orderly data sharing which installs vulnerabilities together with the data protection firewall, so to speak. These have to be addressed and domain-specific HDPR developed to build up from this foundation and to allow each stakeholder group to capitalise on the commercial desires of others.
On the side of current demonstrations, look at GAIA-X. “With GAIA-X, representatives from business, science and politics on a European level create a proposal for the next generation of a European data infrastructure: a secure, federated system that meets the highest standards of digital sovereignty while promoting innovation. This project wants to be the cradle of an open, transparent digital ecosystem, where data and services can be made available, collated and shared in an environment of trust”.
This is certainly where we want to go. What we need to ensure is that once we get to that point, we don’t find ourselves in a situation similar to the one we stared at during the Coronavirus outbreak: that is, loads of fancy weapons and no bullets.