Leveraging Synthetic Data to Optimize Clinical Trials
Basia Coulter, Ph.D., M.Sc.
Global Digital & AI Enablement Executive | Health & Life Sciences | R&D and Real-World Evidence (RWE) | Digital Transformation | Harnessing AI for Breakthrough Innovation & Strategic Impact
Synthetic data could become the next “big thing” in the healthcare and life sciences industry in the near future. According to Gartner, “by 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.”
In this article, you will learn about:
Synthetic Data
Synthetic data is annotated information generated by computer simulations or algorithms as an alternative to real-world data. Put another way, synthetic data is created digitally rather than collected from or measured in the real world.
Some of the methods to generate synthetic data include statistically rigorous sampling from real-world data, generative modeling, or simulation scenarios where models and processes interact to create completely new datasets of events.
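To make the first of these methods concrete, here is a minimal Python sketch of sampling synthetic records from statistics fitted to real ones. It is an illustration under loud assumptions, not a production approach: the toy records, the column names (`age`, `sex`) and the helper functions `fit_marginals` and `sample_synthetic` are all hypothetical, and each column is sampled independently, so any correlation between columns in the real data is lost.

```python
import random
import statistics


def fit_marginals(records):
    """Estimate simple per-column statistics from the real records."""
    ages = [r["age"] for r in records]
    sexes = [r["sex"] for r in records]
    return {
        "age_mean": statistics.mean(ages),
        "age_sd": statistics.stdev(ages),
        "sex_freq": {s: sexes.count(s) / len(sexes) for s in set(sexes)},
    }


def sample_synthetic(model, n, seed=0):
    """Draw n synthetic records; columns are sampled independently (an assumption)."""
    rng = random.Random(seed)
    categories = sorted(model["sex_freq"])
    weights = [model["sex_freq"][c] for c in categories]
    rows = []
    for _ in range(n):
        age = rng.gauss(model["age_mean"], model["age_sd"])
        sex = rng.choices(categories, weights=weights, k=1)[0]
        rows.append({"age": round(age), "sex": sex})
    return rows


# Hypothetical toy "real" records, invented purely for illustration.
real = [
    {"age": 54, "sex": "F"}, {"age": 61, "sex": "M"},
    {"age": 47, "sex": "F"}, {"age": 70, "sex": "M"},
    {"age": 58, "sex": "F"}, {"age": 65, "sex": "F"},
]
model = fit_marginals(real)
synthetic = sample_synthetic(model, n=1000)
```

Generative models and simulation-based approaches go further by also capturing relationships between variables, which this sketch deliberately ignores.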
The Challenge of Patient Recruitment in Clinical Trials
In drug development, the safety and efficacy of treatment are assessed during clinical trials. Recruiting patients who meet eligibility criteria is one of the most significant challenges that contribute to extended timelines and high costs of clinical trials.
80% of clinical trials globally fail to enroll patients on time. With some sources suggesting that each day of delay of the drug launch costs sponsors between $600,000 and $8 million, failures in timely patient enrollment are a significant contributor to the cost of clinical trials, and the overall cost of bringing new treatments to the market.
The Value of Synthetic Data in Clinical Trials
In randomized clinical trials (RCTs), patients are randomly assigned to one of two groups. Patients in the experimental arm receive the treatment, while those who receive a placebo or the standard of care make up the control arm.
The use of synthetic data offers the potential to replace the randomized control arm, made up of patients receiving a placebo or the standard of care, with a synthetic control arm, also known as an external control arm, made up of synthetic data. While not every clinical trial is a suitable candidate for a synthetic control arm, the approach holds great promise, particularly where patient populations are challenging to recruit or assess in randomized clinical trials.
Synthetic control arms can be derived from real-world data (RWD) such as electronic health records (EHR), claims and prescription databases, wearables data, and other sources of de-identified patient health records.
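As a rough illustration of how such a control arm might be assembled from de-identified records, the Python sketch below filters a pool of real-world records by eligibility and then pairs each treated patient with the closest eligible record by age. Everything here is an assumption made for the example: the record fields, the eligibility thresholds, the `condition_x` label and the function names are hypothetical, and greedy one-to-one matching on a single covariate stands in for the richer methods (such as propensity score matching across many covariates) that a real study would need.

```python
def is_eligible(record, min_age=18, max_age=75, diagnosis="condition_x"):
    """Hypothetical eligibility rule: age window plus a matching diagnosis."""
    return min_age <= record["age"] <= max_age and record["diagnosis"] == diagnosis


def build_external_control(treated, rwd_pool):
    """Greedy 1:1 nearest-neighbor match on age, without replacement."""
    candidates = [r for r in rwd_pool if is_eligible(r)]
    controls = []
    for patient in treated:
        if not candidates:
            break  # pool exhausted; remaining treated patients stay unmatched
        best = min(candidates, key=lambda r: abs(r["age"] - patient["age"]))
        controls.append(best)
        candidates.remove(best)
    return controls


# Hypothetical toy data, invented purely for illustration.
treated = [{"id": "T1", "age": 50}, {"id": "T2", "age": 66}]
rwd_pool = [
    {"id": "R1", "age": 49, "diagnosis": "condition_x"},
    {"id": "R2", "age": 80, "diagnosis": "condition_x"},  # outside the age window
    {"id": "R3", "age": 64, "diagnosis": "condition_x"},
    {"id": "R4", "age": 51, "diagnosis": "other"},        # different diagnosis
    {"id": "R5", "age": 58, "diagnosis": "condition_x"},
]
controls = build_external_control(treated, rwd_pool)
control_ids = [c["id"] for c in controls]
```

With this toy pool, T1 is paired with R1 and T2 with R3; the ineligible records R2 and R4 are never considered.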
The use of synthetic control arms helps to address the ethical challenge of directing patients to clinical trials in which they may receive a placebo rather than an active agent. Since patients with severe diseases, such as cancer, may decide not to participate in a clinical trial knowing they could be randomized into a control group, synthetic control arms improve the chances of recruiting the required number of qualifying patients. Additionally, synthetic control arms offer unique value for clinical trials in rare diseases, where the number of eligible patients is small and patient recruitment therefore presents an even greater challenge.
By leveraging synthetic control arms, organizations can significantly improve their enrollment timelines and reduce the cost of patient recruitment, ultimately accelerating time-to-market for novel treatments.
The “Fairness” of Synthetic Data
Health inequities among different patient populations are a persistent problem in public health. Inequitable representation of patient subpopulations in synthetic data could lead to inaccurate analysis where conclusions and predictive models do not represent the real world. Any application of synthetic data in clinical trials must be accompanied by efforts to measure fairness to enable the development of machine learning models that create more equitable synthetic healthcare datasets.
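One simple way to start measuring representation is to compare how often each subpopulation appears in a synthetic dataset versus a reference population. The Python sketch below does only that. The `group` field, the toy counts, the 0.10 threshold and the function names are hypothetical, and representation parity is just one narrow notion of fairness among many.

```python
from collections import Counter


def proportions(records, key):
    """Share of records falling into each value of `key`."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_gaps(reference, synthetic, key):
    """Synthetic share minus reference share per group (positive = over-represented)."""
    ref = proportions(reference, key)
    syn = proportions(synthetic, key)
    groups = sorted(set(ref) | set(syn))
    return {g: syn.get(g, 0.0) - ref.get(g, 0.0) for g in groups}


# Hypothetical toy populations, invented purely for illustration.
reference = [{"group": "A"}] * 50 + [{"group": "B"}] * 30 + [{"group": "C"}] * 20
synthetic = [{"group": "A"}] * 70 + [{"group": "B"}] * 25 + [{"group": "C"}] * 5

gaps = representation_gaps(reference, synthetic, key="group")
# Flag groups whose share differs by more than an (assumed) 10-point threshold.
flagged = [g for g, gap in gaps.items() if abs(gap) > 0.10]
```

In this toy case, group A is over-represented and group C under-represented in the synthetic data, so both are flagged for review.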
The Potential of Synthetic Data in the Future of Clinical Trials
Could the use of synthetic data in clinical trials go beyond synthetic control arms? Yes, it could. In silico clinical trials use patient-specific models to create virtual cohorts for testing the safety and efficacy of new drugs or medical devices. Examples include:
While in silico clinical trials are unlikely to completely replace clinical trials with real patients any time soon, the methodology can be used to model clinical trials as a way to predict outcomes. With the probability of success (POS) of clinical trials often in the single digits, synthetic data offers an opportunity to optimize study design and increase the POS.
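A very simplified version of this idea can be sketched in Python: simulate many virtual two-arm trials under assumed response rates and count how often the treatment arm shows a statistically significant benefit. This is a sketch under stated assumptions, not a validated in silico trial. The response rates, sample sizes and function names are hypothetical, and the quantity estimated is the chance of a significant result given those assumed rates, which is narrower than the POS figures reported for drug development programs.

```python
import math
import random
from statistics import NormalDist


def trial_succeeds(n_per_arm, p_control, p_treatment, rng, alpha=0.05):
    """Simulate one trial; success = one-sided two-proportion z-test rejects."""
    x_c = sum(rng.random() < p_control for _ in range(n_per_arm))
    x_t = sum(rng.random() < p_treatment for _ in range(n_per_arm))
    p_pool = (x_c + x_t) / (2 * n_per_arm)
    se = math.sqrt(2 * p_pool * (1 - p_pool) / n_per_arm)
    if se == 0:
        return False  # no variation in outcomes, so no evidence of a difference
    z = ((x_t - x_c) / n_per_arm) / se
    return z > NormalDist().inv_cdf(1 - alpha)


def estimate_success_probability(n_per_arm, p_control, p_treatment,
                                 n_sims=2000, seed=0):
    """Fraction of simulated trials that reach significance."""
    rng = random.Random(seed)
    wins = sum(
        trial_succeeds(n_per_arm, p_control, p_treatment, rng)
        for _ in range(n_sims)
    )
    return wins / n_sims


# Hypothetical design question: how much does a larger trial help if the true
# response rates were 30% (control) and 45% (treatment)?
small_design = estimate_success_probability(50, 0.30, 0.45)
large_design = estimate_success_probability(200, 0.30, 0.45)
```

Comparing `small_design` with `large_design` shows how a sponsor could weigh enrollment size against the chance of a conclusive result before recruiting a single patient.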
As organizations explore data-driven strategies to bring new treatments to the market faster and at lower cost, while adding precision to drug development and improving patient outcomes, the application of synthetic data is a strategy well worth considering.