ClinTech - Doing Better (2) - Data Integration with AI

Doug Bain

Chief Technology Officer at KCR

Published: December 6, 2024

Second in my series of posts on how we can improve the technologies used in clinical trials. I cover quite a lot of ground here and it's a bit techie for a blog post, so I would suggest bookmarking this one and re-reading it later.

I am going to suggest how effective integration can help resolve some of the problems we see in clinical research.

Technology in Clinical Trials

We have had clinical research technologies such as EDC, RTSM, eCOA, CTMS and eTMF for 20 years. They have met a relatively constant set of functional requirements over this time. The Clinical Trial Management System (CTMS) that I worked on at IBM back in the late 90s is not massively different from the systems you see today. What has changed is the evolution and power of the internet.

What has not significantly improved is the ease of integration. As mentioned in Part (1), the lack of good integration between these different technologies is often the underlying cause of inefficiencies, quality problems and delays.

Fragile / Non validated Integrations

I would class many of the clintech application integrations used in clinical trials today as being in a questionable state of validation.

In my definition of 'validated', if the end user can adversely affect the safe running of software, then it is not validated. There are 2 common situations where this happens:

If a user can impact an integration by changing values in the front end of an integrated application - such as a Site Identifier.
If the integration relies on importing a file that has no audit or control (CSV or Excel). This breaks the principles of 21 CFR Part 11 and ALCOA.
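To illustrate the second situation, here is a minimal Python sketch of the kind of control an uncontrolled file import lacks: a recorded fingerprint of exactly what was imported, by whom and when. The function name, the field names and the sample CSV content are all hypothetical, and a record like this would not by itself satisfy 21 CFR Part 11.

```python
import hashlib
from datetime import datetime, timezone


def audit_entry(file_bytes, filename, imported_by):
    """Build an audit record for an imported file (hypothetical structure)."""
    return {
        "filename": filename,
        "sha256": hashlib.sha256(file_bytes).hexdigest(),
        "imported_by": imported_by,
        "imported_at": datetime.now(timezone.utc).isoformat(),
    }


# Illustrative file content only.
original = b"site_id,patient_id\nS-01,P-0001\n"
entry = audit_entry(original, "enrolment.csv", "jsmith")

# If someone edits the file after import, the fingerprint no longer matches.
tampered = b"site_id,patient_id\nS-02,P-0001\n"
still_matches = hashlib.sha256(tampered).hexdigest() == entry["sha256"]
print(still_matches)  # False
```

Without something like this, there is no way to show afterwards that the file held today is the file that was loaded.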

Most clintech integrations rely on the synchronisation of meta information ('external keys') such as study id, site id, patient id or visit id. If the app on either side of an integration allows any of these external identifiers to be changed, then that breaks the integration.

Now, I can hear the 'get out of jail' argument against this - if the integration fails due to a key change, then an alert is raised and a manual intervention occurs. Downstream, this typically means that reports and dashboards are out of date until the integration is fixed.

To do this properly, the integrations need to be more intelligent, and the applications being integrated need to maintain internal IDs that do not change if the external ID changes. This means that integrations are NOT broken if the user changes a key on the front end. It is not 'as easy' and, given the limited time available to implement an integration, that is probably the reason we do not routinely do this.
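As a rough illustration of the internal ID idea, the Python sketch below keeps an immutable internal ID alongside an editable external key. The `Site` and `SiteRegistry` names are hypothetical and do not come from any particular product.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Site:
    """A site with an editable external key and a fixed internal ID."""
    external_id: str  # what users see and may change, e.g. "SITE-001"
    internal_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class SiteRegistry:
    """Resolves sites by internal ID, so external renames do not break links."""

    def __init__(self):
        self._by_internal = {}  # internal_id -> Site

    def add(self, external_id):
        site = Site(external_id=external_id)
        self._by_internal[site.internal_id] = site
        return site

    def rename(self, internal_id, new_external_id):
        self._by_internal[internal_id].external_id = new_external_id

    def get(self, internal_id):
        return self._by_internal[internal_id]


registry = SiteRegistry()
site = registry.add("SITE-001")

# The integration stores only the internal ID as its link to the site.
integration_link = site.internal_id

# A user edits the site identifier on the front end.
registry.rename(site.internal_id, "SITE-101")

# The link still resolves, and it sees the updated external key.
resolved = registry.get(integration_link)
print(resolved.external_id)  # SITE-101
```

The design choice is simply that nothing outside the application ever joins on the value a user can edit.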

Data Lag

A lot of the cost in clinical trials is measured in 'lag'. We have the time to complete an activity and the lag between each instance of each activity. Cost is measured by the time spent performing a task, and by the overhead of time not spent working but spent being prepared during the lag. Costs can also be measured by the inability to make timely decisions. If it takes, let's say, 6 weeks to gather information on monitoring visits - that is execute, record, clean and report - that is 6 weeks before a decision might be taken on the information contained. These delays accumulate.

Integration Lag

Clinical trials are slow because data is slow. Data is slow in part because integrations are poor or non-existent.

With no integration, you have the lag between data arriving in System A and being manually re-keyed into System B. In many cases, following this re-keying, it cannot be classed as 'clean' until it has been QC checked. The time to QC check amplifies the delay before data is considered trustworthy and usable.

Lag can fall back to the 'lowest common denominator'. The laggiest data determines the timeline. To some degree, a patient's data is not considered clean until all of that patient's data is clean. Classifying data by importance (e.g. significant to the primary endpoint) when determining 'clean' occurs in places - Biostats for example - but rarely in status roll-up and data cleaning workflows.
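The 'laggiest data' effect is easy to show with a little arithmetic. In the Python sketch below, the domain names, dates and the `critical` flag are all made up for illustration. A patient is 'clean' on the latest clean date across the selected domains, so restricting the roll-up to critical data can bring that date forward.

```python
from datetime import date

# Hypothetical clean dates for one patient's data domains.
# 'critical' marks data tied to the primary endpoint (an assumed flag).
domains = [
    {"name": "primary_endpoint_labs", "clean_on": date(2024, 3, 4), "critical": True},
    {"name": "adverse_events", "clean_on": date(2024, 3, 11), "critical": True},
    {"name": "concomitant_meds", "clean_on": date(2024, 4, 15), "critical": False},
]


def patient_clean_date(items, critical_only=False):
    """Return the date on which every selected domain is clean (the latest one)."""
    selected = [d for d in items if d["critical"] or not critical_only]
    return max(d["clean_on"] for d in selected)


all_clean = patient_clean_date(domains)
critical_clean = patient_clean_date(domains, critical_only=True)

print(all_clean)                          # 2024-04-15
print(critical_clean)                     # 2024-03-11
print((all_clean - critical_clean).days)  # 35
```

In this invented case, one non-critical domain holds the patient's 'clean' status back by 35 days.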

Reducing variability to reduce complexity

The greater the variance between systems, technologies and standards, the more difficult it is to have System A speak reliably to System B. The most obvious answer to that problem is to have System A and System B as part of the same product - single platform solutions. We see solutions like this from companies such as Clincase, where the traditionally separate software modules EDC and IVRS are part of the same instance. These systems share the same study, site and patient records, so interfacing is less complex.

The second form of simplification is for software products to share the same platform, as with Veeva. The Veeva Vault Platform helps ensure a level of consistency for a software product. If you are able to use Veeva Vault CTMS, you are likely to be able to pick up another Veeva Vault product with limited training. This also extends to integration. Vault to Vault integration software is similar.

A hybrid solution that I have not seen yet is the application of platform solutions combined with the centralization of common (Master) data used by all connected applications.

We do not need multiple copies of a study, a site, a patient, a patient visit or even key data within a patient, except where we need to manifest this data 'somewhere else'. Within a clinical trial platform, the most efficient and reliable approach is for every application to refer to the same master data.
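A minimal Python sketch of that idea follows. The class names (`MasterData`, `EdcModule`, `RtsmModule`) are hypothetical and stand in for any two modules on a shared platform. Both modules hold a reference to the same master record rather than their own copy, so there is nothing to synchronise.

```python
from dataclasses import dataclass


@dataclass
class MasterPatient:
    """Single master record for a patient, owned by the platform."""
    patient_key: str
    site_key: str
    status: str


class MasterData:
    """Hypothetical central store that every module reads from."""

    def __init__(self):
        self._patients = {}

    def register(self, patient):
        self._patients[patient.patient_key] = patient

    def patient(self, patient_key):
        return self._patients[patient_key]


class EdcModule:
    def __init__(self, master):
        self.master = master

    def patient_status(self, patient_key):
        return self.master.patient(patient_key).status


class RtsmModule:
    def __init__(self, master):
        self.master = master

    def withdraw(self, patient_key):
        self.master.patient(patient_key).status = "withdrawn"


master = MasterData()
master.register(MasterPatient("P-0001", "S-01", "enrolled"))
edc, rtsm = EdcModule(master), RtsmModule(master)

# A change made through one module is immediately visible to the other.
rtsm.withdraw("P-0001")
print(edc.patient_status("P-0001"))  # withdrawn
```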

The challenges of complex data mapping

Having previously worked on the designs for mapping EHR data from large-scale primary care repositories to clinical trial systems, I have seen first-hand the variability and complexity of data mapping. This is especially the case when the EHR sources differ from site to site. To elaborate, here are 3 examples of where mapping can be nasty:

Picklists differ between source and destination - EHR has a choice of 7... EDC has a choice of 5... where do the values go?
EHR data is not attached to any particular 'eCRF visit'. Should data inserted into EDC be logged to a particular visit, or to an unscheduled visit?
eCRFs contain Medical History, but not all medical history, only 'Relevant' medical history. How do we define 'relevant'?

None of the above issues are insurmountable, but they are multiplied if each site operates with a different EHR or source record system.
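The picklist problem can be sketched in a few lines of Python. The smoking status values below are invented for illustration: 7 choices on the EHR side, 5 on the EDC side. The point is that a value with no safe target should be flagged for review rather than forced into the nearest option.

```python
# Hypothetical EDC picklist with 5 choices.
EDC_CHOICES = {"Current", "Former", "Never", "Unknown", "Other"}

# Hypothetical EHR picklist with 7 choices; 6 have an agreed EDC target.
EHR_TO_EDC = {
    "Current every day smoker": "Current",
    "Current some day smoker": "Current",
    "Heavy tobacco smoker": "Current",
    "Former smoker": "Former",
    "Never smoker": "Never",
    "Unknown if ever smoked": "Unknown",
    # "Smoker, current status unknown" is deliberately left unmapped.
}

# Guard against typos: every target must exist in the EDC picklist.
assert set(EHR_TO_EDC.values()) <= EDC_CHOICES


def map_picklist(ehr_value):
    """Map an EHR value to an EDC value, or flag it for human review."""
    target = EHR_TO_EDC.get(ehr_value)
    return {"source": ehr_value, "value": target, "needs_review": target is None}


print(map_picklist("Former smoker"))
print(map_picklist("Smoker, current status unknown"))
```

Each additional EHR system at a site brings its own version of this table, which is where the multiplication comes from.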

When it comes to mapping between 2 products used in a single clinical trial, we often revert to programming. Products such as Veeva Vault and Medidata Rave have well-established application programming interfaces (APIs). However, as both products are 'configured' with metadata specific to each clinical trial, any interface code that you write needs to be advanced enough to read this configured metadata and use it as the basis for the mapping and rules to transfer data from System A to System B. That is far from easy. If you are a CRO thrown a set of disparate technologies with only weeks between the configured systems being ready and First Patient In, a validated integration is hard, if not impossible, to deliver.
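To show what 'reading the configured metadata' means in practice, here is a simplified Python sketch. It does not use the Veeva Vault or Medidata Rave APIs. The metadata structures, field names and the `standard_name` annotation are all assumptions made for the example. The mapping is derived from the metadata at run time instead of being hard-coded per study.

```python
# Hypothetical per-study metadata, as it might look once read from each system.
system_a_meta = [
    {"field": "dm_brthdat", "standard_name": "BRTHDTC", "type": "date"},
    {"field": "dm_sex", "standard_name": "SEX", "type": "text"},
    {"field": "vs_sysbp_1", "standard_name": "SYSBP", "type": "integer"},
]
system_b_meta = [
    {"field": "BirthDate", "standard_name": "BRTHDTC", "type": "date"},
    {"field": "Sex", "standard_name": "SEX", "type": "text"},
    {"field": "Weight", "standard_name": "WEIGHT", "type": "float"},
]


def build_mapping(source_meta, target_meta):
    """Pair fields that share a standard name and type; report the rest."""
    targets = {(m["standard_name"], m["type"]): m["field"] for m in target_meta}
    mapping, unmapped = {}, []
    for m in source_meta:
        key = (m["standard_name"], m["type"])
        if key in targets:
            mapping[m["field"]] = targets[key]
        else:
            unmapped.append(m["field"])
    return mapping, unmapped


def transfer(record, mapping):
    """Rename the mapped fields of a source record; unmapped fields are dropped."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}


mapping, unmapped = build_mapping(system_a_meta, system_b_meta)
record_b = transfer({"dm_brthdat": "1980-05-17", "dm_sex": "F", "vs_sysbp_1": 128}, mapping)

print(mapping)   # {'dm_brthdat': 'BirthDate', 'dm_sex': 'Sex'}
print(unmapped)  # ['vs_sysbp_1']
print(record_b)  # {'BirthDate': '1980-05-17', 'Sex': 'F'}
```

Even in this toy form, the unmapped list is the part that consumes the time: someone has to decide what happens to every field the metadata alone cannot place.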

The role of AI

A solution to this problem is for vendors to implement visual, high-end integration components within their products, ideally supported by Retrieval-Augmented Generation (RAG) enhanced AI to help smooth out the complexity of mapping that cannot be defined up front.

That is a big sentence. Let me break this down. Integrations should have a user interface. Without a user interface they remain with the techies. One complexity of a user interface is the representation and configuration of mapping - how data from one system maps to data in another system. RAG enhanced Artificial Intelligence can be used to automate the default mapping between systems - partly from convention, partly from loaded business knowledge.

Human involvement will, initially at least, continue to play a role where weak meta information on either side of an integration is insufficient for AI-based mapping. The excellent work carried out by Andrew Mitchell and team at Yeza integrating SAE PDFs into safety case management systems is a perfect example of this semi-automated, AI-based data mapping.
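The split between automated default mapping and human review can be sketched as follows. To be clear, this is not RAG and not AI: it uses plain string similarity from Python's standard `difflib` module as a stand-in for whatever model produces the suggestion. The target field list and the 0.8 threshold are assumptions. What it shows is the routing: confident suggestions become defaults, weak ones go to a person.

```python
from difflib import SequenceMatcher

# Hypothetical target field labels in the receiving system.
TARGET_FIELDS = ["date of birth", "sex", "systolic blood pressure"]
REVIEW_THRESHOLD = 0.8  # assumed cut-off; would need tuning and validation


def suggest_mapping(source_label):
    """Return the closest target field and whether a human must confirm it."""
    label = source_label.strip().lower()
    scored = [(SequenceMatcher(None, label, t).ratio(), t) for t in TARGET_FIELDS]
    score, best = max(scored)
    return {
        "source": source_label,
        "suggested": best,
        "needs_review": score < REVIEW_THRESHOLD,
    }


for label in ["Date of Birth", "Systolic Blood Pressure (mmHg)", "Field 17"]:
    print(suggest_mapping(label))
```

A label such as "Field 17" carries no usable meta information, so it lands in the review queue whatever the suggestion engine is.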

Conclusion

Poor or non-existent integration between both systems and processes is leading to hidden costs and delays. The proliferation of autonomous modular technologies is compounding this issue, leading to inefficiencies across the clinical trial lifecycle. Augmented AI has the potential to smooth out some of the problem areas.

Our ability to effectively manage change in clinical R&D is not as good as it should be. We tend to implement the 'new' without phasing out the 'old'. AI's success relies on it NOT becoming another one of these long term additive layers that add further costs and complexity.

I will describe the impact of a lack of process integration in a following post.
