Getting the Most out of Healthcare Data: De-Identification Methods at Your Service

Considerations on Healthcare Data Governance Under GDPR

The digitization of healthcare is in full swing, transforming every industry sector. The introduction of a multitude of digital solutions has led to an explosion of health data, which has quickly become the industry’s most valuable asset. Effective and secure collection, processing, storage, and analysis of health data is essential for healthcare companies, health professionals, and patients, ensuring better health outcomes through more targeted product offerings and data-driven decisions.

Indeed, the sheer volume of valuable health data is enormous. It is constantly being collected from multiple sources such as electronic patient health records, patient-generated data and patient-generated outcomes (e.g., surveys), healthcare applications (including DiGAs), IoT devices and wearables, clinical studies, prescriptions, and medication adherence data. This creates large and complex data sets with tremendous potential to provide valuable insights. However, the healthcare industry itself is complex, highly regulated, and subject to changing rules and constraints, making compliant data operations a challenging task. Addressing these challenges is key to unlocking the power of predictive analytics in healthcare.

Health data collected in electronic patient records is considered patient-related personal data, which under the General Data Protection Regulation (EU) 2016/679 (hereinafter "GDPR") may in general be used only for dedicated purposes. Nevertheless, in certain cases, the GDPR and the national data protection regulations of the member states allow such data to be used for specific further use cases. If use of the data outside the scope of the original purposes is permitted by national legislation and the GDPR, healthcare companies will usually be required to remove all personally identifiable information (PII) so that the data can no longer be traced back to an individual. The anonymized data can then be used for specific new purposes and may be further analyzed, aggregated, and processed to generate new insights for healthcare entities. These insights can in turn be used to offer highly customized solutions to clients, whether they are patients, medical staff, or research professionals.

How to Securely Work with Personal Data?

The current European data protection law regulates data governance, protection, and security. The GDPR provides the basis to recognize a much more complete spectrum of de-identification.

To date, the GDPR has taken a binary approach to de-identification: data is either personal data or anonymous. Recital 26 of the GDPR states the following: "The principles of data protection should apply to any information concerning an identified or identifiable natural person. Personal data which have undergone pseudonymization, which could be attributed to a natural person by the use of additional information, should be considered information on an identifiable natural person. To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or another person to identify the natural person directly or indirectly."

An effective anonymization solution prevents all parties from singling out an individual in a dataset, linking two records within a dataset (or between two separate datasets), and inferring any identifiable information within such a dataset. Consequently, it is necessary to remove more than just the directly identifiable information to ensure that a data subject can no longer be identified. Depending on the context and the data processing purpose, additional measures may be needed to prevent re-identification. Only once these measures are in place can the data be considered anonymous.
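One way to make the singling-out risk concrete is to count how many records share the same combination of indirect identifiers. The sketch below is an illustration only: it borrows the k-anonymity idea, which the text above does not name, and the field names and sample records are hypothetical.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifier fields (the "k" in k-anonymity)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0

# Hypothetical records: names are already removed, yet one person stands out.
records = [
    {"age_band": "30-39", "postcode_prefix": "101", "diagnosis": "asthma"},
    {"age_band": "30-39", "postcode_prefix": "101", "diagnosis": "diabetes"},
    {"age_band": "70-79", "postcode_prefix": "104", "diagnosis": "asthma"},
]

k = smallest_group_size(records, ["age_band", "postcode_prefix"])
print(k)  # a value of 1 means at least one record can be singled out
```

Here the third record is the only one in its age band and postcode area, so removing the name alone would not protect that person. A check like this addresses singling out only; linkage and inference need separate analysis.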

However, in specific data analytics cases, it is important to keep pieces of sensitive data, for example the user ID, that can potentially be traced back to the individual patient record. Strictly speaking, such data cannot be considered fully anonymous, and the term "pseudo-anonymous" is used instead (the GDPR itself speaks of pseudonymization).
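A common way to keep a stable identifier without storing the original user ID is a keyed hash. The following is a minimal sketch, assuming HMAC-SHA256 from the Python standard library; the function name, the sample IDs, and the key are hypothetical, and the article does not state which technique its authors used.

```python
import hashlib
import hmac

def pseudonymize_user_id(user_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a user ID with HMAC-SHA256."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

# Hypothetical key; in practice it would be kept apart from the analytics data.
SECRET_KEY = b"replace-with-a-key-from-a-secrets-manager"

p1 = pseudonymize_user_id("patient-0042", SECRET_KEY)
p2 = pseudonymize_user_id("patient-0042", SECRET_KEY)
p3 = pseudonymize_user_id("patient-0043", SECRET_KEY)
print(p1 == p2)  # same input and key give the same pseudonym
print(p1 == p3)  # a different user ID gives a different pseudonym
```

Because anyone holding the key can recompute the mapping, the output remains personal data in the sense described above: it is pseudonymized, not anonymous.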

Various de-identification techniques provide a wide range of valuable tools that can be used to protect individual privacy. They range from relatively weak techniques, which reduce privacy risks only to a modest degree, to powerful ones that can effectively eliminate most privacy risks.

At present, there is no universal method for making data pseudo-anonymous. Every scenario demands a thorough evaluation and a customized blend of de-identification approaches. The table below enumerates several of these methods:

[Table: overview of de-identification methods (image not available)]

The de-identification techniques and methods described in the table above still carry some risk that individual subjects can ultimately be identified. Thus, such data can only be considered pseudo-anonymized. An additional level of data protection can be achieved by employing a combination of technological, organizational, and legal measures. Proper data governance will include, but should not be limited to, the following:

  • An access and permissions policy with appropriate authentication of the individuals accessing the data
  • A defined retention policy
  • Mechanisms to delete the data upon request
  • Personnel training and education on data governance and protection principles
  • Collection of the users' consent, clearly describing to the data subject the purpose of the data collection and how the data will be used
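Two of the measures above, the retention policy and deletion upon request, can be sketched in a few lines. This is an illustration under stated assumptions: the record layout (`subject_id`, `collected_at`), the one-year retention window, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def apply_retention(records, max_age, now):
    """Keep only records collected within the retention window."""
    cutoff = now - max_age
    return [r for r in records if r["collected_at"] >= cutoff]

def delete_subject(records, subject_id):
    """Drop every record belonging to a subject who asked for deletion."""
    return [r for r in records if r["subject_id"] != subject_id]

now = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"subject_id": "a", "collected_at": now - timedelta(days=10)},
    {"subject_id": "b", "collected_at": now - timedelta(days=400)},
    {"subject_id": "c", "collected_at": now - timedelta(days=20)},
]

kept = apply_retention(records, timedelta(days=365), now)  # drops "b"
kept = delete_subject(kept, "c")                           # drops "c"
print([r["subject_id"] for r in kept])  # ['a']
```

In a real system the same rules would have to be applied to every copy of the data, including backups and de-identified stores, which is one reason the organizational measures matter as much as the code.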

Thus, to ensure that data governance is performed correctly, both an experienced technology partner and a legal regulatory consultant with industry knowledge must be involved in all stages of the development of a digital health product.

What Compliant Data Solutions May Look Like

In partnership with NEXTEC medical, DataArt has worked, and continues to work, on solutions for medical device development for the EU market, where de-identified data is used for product analytics and to generate product insights.

These data solutions typically include the following pillars:

  • Data collection: Data collection is performed by client-side applications (e.g., a mobile app or web app), which monitor user activities. Collected data (events of user actions) is sent to the back-end system on a defined schedule or trigger (e.g., when the app is opened or a specific action is performed). The client ensures that all processing activities, including the use of data for the original purpose, the anonymization, and the further analysis of anonymous data, are in accordance with the applicable data protection regulations.
  • De-identification transformation: Back-end services include de-identification techniques specifically tailored to each application. Usually, full data anonymization is not achieved due to the need to keep certain unique identifiers such as a user ID or member ID. As an organizational measure, during clinical studies, users give separate, optional consent for this kind of data processing.
  • Data storage: Raw data is stored securely in encrypted form either on EU-based on-premise infrastructure or on EU-based servers of local and international hosting providers. Raw and de-identified data are kept in separate storage, with a proper permissions schema and data access control applied to all employees. Retention policies are applied to data storage where applicable.
  • Data transfer and visualization: Pseudo-anonymized data is securely transferred via REST API to data visualization tools, such as Power BI, Looker, and Tableau, for further visualization and analytics. Correlation and multivariate analyses then inform product and treatment decisions for the relevant user cohorts.
  • Data security: All data processing happens in a safe environment with secure connections and authentication between the services, in line with standard industry best practices.
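The de-identification transformation pillar can be illustrated with a small sketch. It is not the DataArt or NEXTEC medical implementation, which the article does not describe in detail; the event fields, the list of direct identifiers, and the pseudonym lookup table are all hypothetical.

```python
from datetime import datetime
from typing import Callable

DIRECT_IDENTIFIERS = {"name", "email", "phone"}  # hypothetical field names

def age_band(age: int) -> str:
    """Generalize an exact age into a ten-year band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def deidentify_event(event: dict, pseudonymize: Callable[[str], str]) -> dict:
    """Return a copy of the event with direct identifiers removed,
    the user ID replaced by a pseudonym, and age and time generalized."""
    out = {k: v for k, v in event.items() if k not in DIRECT_IDENTIFIERS}
    out["user_id"] = pseudonymize(event["user_id"])
    out["age"] = age_band(event["age"])
    # Keep only the calendar date of the event, not the exact time.
    out["timestamp"] = datetime.fromisoformat(event["timestamp"]).date().isoformat()
    return out

# Hypothetical lookup table; it would be stored apart from the analytics data.
pseudonyms = {"u-17": "p-0001"}

raw = {
    "user_id": "u-17", "name": "Jane Doe", "email": "jane@example.com",
    "age": 34, "timestamp": "2024-03-05T14:22:10", "action": "app_opened",
}
clean = deidentify_event(raw, pseudonymize=pseudonyms.__getitem__)
print(clean)
```

The user ID survives as a pseudonym so that events from the same user can still be grouped, which is why the result counts as pseudo-anonymized rather than anonymous. The raw event is left unmodified, matching the separation of raw and de-identified storage described above.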

Data solutions delivered by DataArt and NEXTEC medical represent a powerful engine that is able to inform the product management teams about the benefits as well as potential weaknesses of their application at the very start of clinical trials. This is key to ensuring a fast decision-making process.

With the correct data solution, it is possible to comply with the strict requirements of the GDPR and still gain valuable insight from anonymized medical data. These insights can then be used to optimize product design and target offers to users more accurately, bringing benefits to all parties in the healthcare ecosystem: patients, care providers, researchers, and manufacturers.

Originally published here.
