Unlocking Healthcare Potential: Precision AI's Edge in Clinical Adoption

The landscape of AI technology is evolving at an unprecedented pace, and the future remains largely unpredictable. Despite being an early adopter and frequent user of the OpenAI sandbox (an early precursor to ChatGPT), I confess that I didn't foresee the explosive growth of AI’s capabilities.

However, it’s vital to strike a note of caution. While the advancements of generative AI (GenAI) platforms like ChatGPT are astonishing and surpass previous expectations for AI, we must stay grounded when predicting their near-term implications. We must also differentiate between types of AI as they are applied in clinical settings.

Working with ChatGPT and similar foundation models could create the false expectation that, very soon, any medical condition could be diagnosed from an image or medical record by a highly generalizable foundation model, without the additional work of tuning it to a specific task. While these AI models can prove invaluable for tasks like reducing administrative burdens, the fact is that GenAI models currently do not offer the levels of diagnostic accuracy needed in high-stakes clinical settings.

While there remain challenges in adapting foundation models such as ChatGPT to many clinical tasks, the dominant form of AI for clinical practice will remain what we call “Precision AI”: models trained to solve specific tasks and, above all, to achieve the diagnostic accuracy that makes them valuable in clinical practice.

The Cost of an Error

First, it’s essential to highlight a fundamental question that, while intuitively understood, warrants explicit mention: what is the cost of an error in AI? The risk profile of drafting a consumer email, for instance, is vastly different from that of making a medical decision.

A recent Johns Hopkins study revealed that every year in the US, 795,000 patients either die or suffer permanent disability due to diagnostic errors. Clearly, when it comes to creating AI for clinical use, the stakes are remarkably high. If AI is to act as an aid to physicians, and medical decisions in clinical environments can have such a drastic impact on patient outcomes, then clinical AI must prove its accuracy.
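
To make these stakes concrete, here is a minimal back-of-envelope sketch (all numbers are illustrative assumptions, not figures from the study above): even a detector with strong headline accuracy produces mostly false alarms when a finding is rare.

```python
# Illustrative Bayes' rule arithmetic: the positive predictive value (PPV)
# of a detector at low disease prevalence. All numbers are hypothetical.

def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """P(disease | positive result) via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# A seemingly strong model (95% sensitive, 95% specific) applied to a
# finding present in 1 of 100 scans: most of its alerts are false alarms.
print(f"PPV at 95% specificity:   {ppv(0.95, 0.95, 0.01):.1%}")   # ~16.1%
# Diagnostic-grade specificity changes the picture entirely.
print(f"PPV at 99.9% specificity: {ppv(0.95, 0.999, 0.01):.1%}")  # ~90.6%
```

This base-rate effect is one reason accuracy claims for clinical AI have to be validated per task and per population, rather than inferred from general benchmarks.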

The Complexity of Healthcare Data and Applications

Let’s consider the complexity of healthcare data. It is:

  • Inherently multi-modal: Healthcare data demands the integration of a diverse range of information types, including imaging data, textual information, genomic profiles, lab results and even time-series data like vitals. This multitude of data types makes the task of creating coherent and comprehensive healthcare AI models far more challenging.
  • Highly dimensional: The sophisticated data mentioned above necessitates an extensive context size. Current foundation models like ChatGPT typically handle contexts on the order of 100,000 tokens at “non-diagnostic” accuracy, while a typical CT scan can easily contain millions of tokens and requires “diagnostic” accuracy (see the back-of-envelope sketch after this list).
  • Highly domain-specific: Many real-world problems become easier to solve as foundation models evolve, due to the similarity between different domains. For example, an autonomous vehicle camera is still a digital camera with many similarities to your smartphone camera. In contrast, the medical data domain is inherently different from everyday data (an x-ray of your hand will look nothing like any photo produced by your smartphone), so a completely dedicated model is required for the medical domain, and its development can’t be accelerated by relying on previous models.
  • Scarce in expert labels: Vast amounts of data are annotated for the training or validation of many “general domain” foundation models today. For instance, GenAI models for image segmentation are often built on annotations of millions of images from non-experts. Even many models trained on un-annotated data are validated on vast amounts of data annotated by non-experts. The more general-purpose a model becomes, the more use-cases need to be validated, and this is of even greater importance in the clinical domain.
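
As referenced in the “highly dimensional” point above, here is a rough sketch of the scale gap (scan dimensions and patch sizes are typical assumptions, not figures from any specific scanner or model):

```python
# Back-of-envelope comparison: text context window vs. one CT volume.
# Scan dimensions and patch sizes below are illustrative assumptions.

text_context_tokens = 100_000  # context size on the order handled by models like ChatGPT

# A routine CT series: 512 x 512 pixels per slice, ~300 slices.
voxels = 512 * 512 * 300
print(f"CT voxels: {voxels:,}")  # 78,643,200 -- vs. a 100,000-token text window

# Even coarse ViT-style 16x16 patches per slice leave a long sequence...
patches_16 = (512 // 16) * (512 // 16) * 300
print(f"16x16-patch tokens: {patches_16:,}")  # 307,200
# ...and the finer patching that subtle findings demand reaches millions.
patches_4 = (512 // 4) * (512 // 4) * 300
print(f"4x4-patch tokens:   {patches_4:,}")  # 4,915,200
```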

Furthermore, there is a complexity to the tasks you need AI to perform, which fall under two broad categories: Detection and Extraction. Current AI systems, including ChatGPT, are used mainly for extraction of insights from the text or corpus they were trained on. Detection, particularly of subtle anomalies, is far more challenging than extraction.

Consider a radiologist reading a CT scan and detecting a subtle brain aneurysm. This requires “detection” at “diagnostic” accuracy. Once the radiologist writes the finding into the report, anyone reading the report needs only “extractive” accuracy to understand that the patient has a brain aneurysm. This is the key differentiator that necessitates “Precision AI” for clinical relevance, rather than the extractive accuracy found in foundation models like ChatGPT.
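
A minimal sketch of that asymmetry (the data is synthetic and score_window is a hypothetical placeholder I introduce for illustration, not a real diagnostic model):

```python
import numpy as np

# --- Extraction: the finding is already stated in the report. ---
report = "Impression: 4 mm saccular aneurysm of the anterior communicating artery."
print("aneurysm" in report.lower())  # True -- a trivial string-level lookup

# --- Detection: the finding is implicit in ~79M raw voxels. ---
volume = np.zeros((300, 512, 512), dtype=np.float32)  # stand-in CT volume

def score_window(patch: np.ndarray) -> float:
    """Placeholder for a trained aneurysm classifier (hypothetical)."""
    return float(patch.mean())  # a real model would replace this heuristic

# Even a coarse 32-voxel sliding window at stride 16 means thousands of
# classifier calls per scan, each of which must be near-perfectly specific.
zs = range(0, volume.shape[0] - 32 + 1, 16)
ys = range(0, volume.shape[1] - 32 + 1, 16)
xs = range(0, volume.shape[2] - 32 + 1, 16)
print(f"Windows to score per scan: {len(zs) * len(ys) * len(xs):,}")  # 16,337
one_score = score_window(volume[:32, :32, :32])  # a single example call
```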

GenAI Accuracy: A Work in Progress

Achieving accuracy in AI, particularly in healthcare applications, is a more intricate challenge than it might initially seem. Despite constant advancement, we may still be some distance from the level of accuracy necessary for effective clinical use of GenAI models. Most GenAI models, like ChatGPT, have been trained and validated to solve problems significantly different from diagnostic-level detection. Consider the difference in complexity between answering a question about a text and detecting a subtle brain hemorrhage in a CT scan: the latter is a task of immense precision and subtlety, which might require detecting a subtle change in a 15-pixel needle in a 100-million-pixel haystack. It is a vastly different problem, and its dimensionality is immense. Recent research tested ChatGPT on detection in long text, an easier variation of the needle-in-a-haystack problem, and found that as the input size grew (the number of words ChatGPT is given to search), it became less capable of answering questions about that input, yielding below 60% accuracy.
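
Spelling out the needle-in-a-haystack arithmetic (the per-pixel false-positive rate below is an illustrative assumption, chosen only to show the scale):

```python
# Illustrative arithmetic for a 15-pixel needle in a 100-million-pixel
# haystack. The per-pixel false-positive rate is an assumed number.

haystack_pixels = 100_000_000
needle_pixels = 15
print(f"Signal fraction: {needle_pixels / haystack_pixels:.2e}")  # 1.50e-07

# Suppose a per-pixel false-positive rate of one in a million -- excellent
# by most ML standards. Over a single scan that still yields:
per_pixel_fp_rate = 1e-6
false_positives = haystack_pixels * per_pixel_fp_rate
print(f"Expected false positives per scan: {false_positives:.0f}")  # 100
# 100 false alarms per scan against one true 15-pixel finding is why
# "diagnostic" accuracy is a different regime from "extractive" accuracy.
```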

In short, ChatGPT is not great at finding a needle in a haystack, which is exactly what clinical AI needs to do.

Charting the Course

The journey towards achieving the necessary accuracy for clinical AI implementation is filled with significant challenges. The substantial difference between extraction and detection, the multi-modal nature of healthcare data, the requirement for an extensive context size at “diagnostic” accuracy, and the scarcity of expertly annotated data all contribute to these obstacles.

While foundation models like ChatGPT have made impressive strides in generic applications, their application in the nuanced and complex field of healthcare, where diagnostic accuracy is required, is not an immediate prospect.

I do believe that we will see GenAI applied to clinical use-cases with lower accuracy requirements, for example anatomy detection and automatic measurement (which are considered “easier problems” but can nevertheless help solve many clinical challenges).

As the technology evolves, we'll see AI move towards more advanced diagnostic use-cases, until one day reaching prediction. At Aidoc, we strongly believe in the potential of the technology to overcome these challenges, and have decided to make a $30M investment in research dedicated to the scientific breakthroughs that will enable foundation models to bridge the gaps required for diagnostic accuracy.

As it stands, “Precision AI” remains a crucial component for the continued implementation of AI in clinical settings.

Demetrius Kirk, DNPc, MBA, MSN, RN, LNHA, LSSGB, PAC-NE, QCP

Healthcare Consultant | Expert Leadership Coach | CMS Regulatory Expert | Top Healthcare Executive | Compliance Specialist | Servant Leader

11 months ago

Looking forward to your insights on navigating the AI landscape!

Stefan Schröder

Leading AI Adoption for Clinical Trials and Healthcare

11 months ago

Great article. I do think AI offers huge potential to support HCPs in clinical analysis. There are many examples of AI analysing results just as well as or better than HCPs. And with 49% of HCPs experiencing burnout, the current system doesn't work for them or for patients. Having a support mechanism which can back up human findings or support in relatively mundane tasks does offer huge potential to change an industry that is begging for support.

Margaretta Colangelo

Leading AI Analyst | Speaker | Writer | AI Newsletter 56,800+ subscribers

11 months ago

Great article Elad Walach

Ajit Deshpande

CEO - Rises Analytics Solutions [rises.io]: startup in AI/ML, BigData, Blockchain. FinTech/BFS Enterprise Solutions.

11 months ago

Very well articulated Elad Walach Aidoc

Well said and very helpful way to examine the vast landscape - thank you Elad Walach
