Are multimodal models fit for medical use?
Keith Grimes
Digital Health Doctor, Clinical Safety Officer & Founder of Curistica - partnering to build and deploy safe, effective, equitable & sustainable Clinical AI solutions.
After a week or so of waiting, I was finally granted access to GPT-4V, OpenAI's latest Large Multimodal Model (LMM), through my ChatGPT Plus subscription. I immediately put it to work on the nearest medically interesting image data that I could share: a routine ECG/EKG taken from me back in 2017, when I was otherwise well, with no cardiovascular issues of note and taking no medication.
I cropped it to remove all the summary data and uploaded using the following prompt:
"You are an expert emergency department physician, skilled in interpreting all forms of medical investigation. I will provide you with an image, and you will provide a summary of the information that is of clinical significance. "
I got the following response:
"The image you provided appears to be an electrocardiogram (ECG) tracing, which shows the electrical activity of the heart. Here's a basic interpretation of the ECG based on the visual:
1. Rate: The heart rate appears to be regular, though the exact rate cannot be calculated without additional scale information.
2. Rhythm: Sinus rhythm. The P waves appear to be consistent and regular, followed by a QRS complex, which suggests a normal sinus rhythm.
3. QRS Complex: The width of the QRS complex appears to be normal, indicating normal ventricular depolarization.
4. ST Segment: There seems to be some ST elevation in leads V2 and V3, and possibly V4. This can be suggestive of an ST-segment elevation myocardial infarction (STEMI) in the anterior wall of the heart, particularly if associated with symptoms like chest pain.
5. T Wave: T waves appear upright in most leads, though a closer assessment would be needed to rule out any abnormalities.
6. Intervals: Without clear calibration markers, it's challenging to determine the exact duration of the PR, QRS, and QT intervals. These would need to be assessed to identify potential abnormalities.
7. Lead aVR: There appears to be some ST segment depression. This, combined with ST elevation in the precordial leads, can increase suspicion for certain types of cardiac events.
Clinical Significance:
The primary concern on this ECG is the potential ST-segment elevation in the anterior leads, suggestive of an anterior wall myocardial infarction. Immediate clinical correlation, further testing, and potential intervention are critical if this finding is consistent with the clinical picture.
However, it's essential to interpret this ECG in the context of the patient's clinical symptoms, history, and physical exam. If a real patient is involved, urgent consultation with a cardiologist and immediate medical attention are advised."
OK - it looks pretty impressive superficially, but it mistakenly identified ST elevation in V2/V3 and ST depression in aVR, then doubled down on this with a diagnosis of potential anterior MI.
I know my cardiovascular history, and the context of when this was taken, so I know this to be incorrect, but I have to say seeing it made me worry a little.
It also missed the scale details on the image, so I provided some more info:
"regarding scale, each small dot is 2mm and each large dot is 10mm"
Thank you for providing the scale. Given the information, let's refine the interpretation:
Rate: Given the provided scale, we can approximate the heart rate. Counting the number of large squares (each representing 200 ms) between successive R waves gives a rough estimate. If, for example, there are 4 large squares between successive R waves, the heart rate would be approximately 75 beats per minute.
Rhythm: Still appears to be a normal sinus rhythm.
QRS Complex: Each small square represents 40 ms. A normal QRS duration is less than 120 ms (or 3 small squares). Based on the image, the QRS complex appears to be within normal limits.
ST Segment: With the scaling in mind, the ST elevation in leads V2, V3, and possibly V4 seems more pronounced. The elevation appears to be more than 2 mm (1 small square) in V2 and V3, which is clinically significant. This increases the suspicion of an anterior wall STEMI.
T Wave: Appears normal in amplitude and morphology in most leads.
Intervals:
PR Interval: Normal PR interval ranges between 120 ms (3 small squares) and 200 ms (5 small squares). By estimation, the PR interval appears to be within normal limits.
QT Interval: Without a clear start and end, it's challenging to measure the QT interval accurately. However, it's essential to ensure it's not prolonged, as that can indicate other cardiac abnormalities.
Lead aVR: The ST segment depression in aVR combined with ST elevation in the precordial leads makes the suspicion for an anterior wall myocardial infarction even stronger.
Clinical Significance: The ST-segment elevation in the anterior leads, given the provided scale, is indicative of an anterior wall myocardial infarction. It's crucial to interpret this in the patient's clinical context, but this finding warrants immediate medical attention. If this is from a real patient, they should receive urgent care, and a cardiologist should be consulted immediately.
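The grid arithmetic quoted in that response (200 ms per large square, 40 ms per small square, and the normal ranges for QRS and PR) is easy to check by hand. Below is a minimal Python sketch of it. The function names are my own and purely illustrative, and the timings are simply the ones the model stated, which match conventional 25 mm/s ECG paper; the article does not give the actual calibration of this recording, so treat this as a check of the sums rather than of the ECG.

```python
# Grid timings as quoted in the model's response (conventional 25 mm/s paper).
# Assumption: these are NOT confirmed for the recording discussed in the article.
LARGE_SQUARE_MS = 200
SMALL_SQUARE_MS = 40


def heart_rate_bpm(large_squares_between_r_waves: float) -> float:
    """Estimate heart rate from the R-R distance measured in large squares."""
    if large_squares_between_r_waves <= 0:
        raise ValueError("R-R distance must be positive")
    rr_interval_ms = large_squares_between_r_waves * LARGE_SQUARE_MS
    return 60_000 / rr_interval_ms  # 60,000 ms in one minute


def small_squares_to_ms(small_squares: float) -> float:
    """Convert a width measured in small squares to milliseconds."""
    return small_squares * SMALL_SQUARE_MS


def qrs_is_normal(qrs_small_squares: float) -> bool:
    """QRS under 120 ms (3 small squares), the limit quoted in the response."""
    return small_squares_to_ms(qrs_small_squares) < 120


def pr_is_normal(pr_small_squares: float) -> bool:
    """PR between 120 ms and 200 ms (3 to 5 small squares), as quoted."""
    return 120 <= small_squares_to_ms(pr_small_squares) <= 200


if __name__ == "__main__":
    print(heart_rate_bpm(4))   # 75.0, the example given in the response
    print(qrs_is_normal(2))    # True: 80 ms
    print(pr_is_normal(4))     # True: 160 ms
```

Note that the response only ever gives a hypothetical rate ("if, for example, there are 4 large squares"); it never reports a count taken from the image itself.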
This is, if anything, more worrying. The cropped metrics are shown below.
How did it do?
Now it's even more concerned about the ECG! Again, I'm aware of the errors it has made, but were I less well informed I might be calling an ambulance right now. I'll give it one more try:
"Here is some additional context. The EKG was a routine one, taken on a 44 year old man. He has no cardiovascular history, was well, and took no medication. He has had no cardiovascular issues since. Does this change your interpretation and recommendations?"
Certainly, clinical context plays a pivotal role in the interpretation of any diagnostic tool, including an EKG.
Given the additional context:
1. **Absence of Symptoms**: If the EKG was performed on an asymptomatic individual with no complaints, the chance of the findings representing an acute myocardial infarction (MI) decreases significantly.
2. **Age and Risk Factors**: While a 44-year-old is relatively young, myocardial infarctions can occur in this age group, especially if there are other risk factors present. However, you mentioned no cardiovascular history, which makes the clinical picture less concerning.
3. **Follow-Up**: The fact that he has had no cardiovascular issues since the EKG is reassuring.
Taking all these factors into account:
The EKG changes, in isolation, are concerning, but in the context of a well, asymptomatic individual with no cardiovascular history and no subsequent cardiovascular issues, the findings are less alarming. The changes might be due to an early repolarization pattern, which is a benign finding commonly seen in healthy individuals. Other possibilities include benign variants of ST segment elevation, which can sometimes be seen in certain individuals without any pathologic significance.
**Recommendation**:
While the immediate concern of an acute myocardial infarction is much lower given the clinical context, it would still be prudent for this individual to have a follow-up with a cardiologist. Further evaluation, including possibly a stress test or imaging, might be useful to rule out any underlying coronary artery disease, given the EKG findings. It's always a good approach to be cautious, especially with cardiac findings, even in asymptomatic individuals.
Phew - some improvements there, but it's still being cautious. It correctly picks up the early repolarisation, a benign incidental finding that has been explained to me before by a cardiologist. That fear is not quite gone yet though.
CONCLUSIONS
Despite the errors, this is still impressive stuff. That it can interpret images at all is incredible, but it loses marks by putting anxiety into my chest early in the morning. I'll revisit this with some more examples, and work on improving my prompting, but I would also admit that expecting miraculous accuracy from a data set so slim that a clinician would not usually diagnose on it with confidence is a tough ask.
For anyone reading this, I'd support you testing it with data that is safe to share, but I wouldn't be using it in the real clinical world just yet.
Director of Health and Wellness & Co-Founder at Nobody Studios | Clinician & Clinic Director at The Neural Connection | International Speaker
1 year ago: Enlightening share Keith Grimes... This provides a ton of evidence that we're still quite far away from a dystopian future where AI replaces medical providers. We need to keep working to improve these systems and understand their usage, but they're better off being copilots at this point in time versus a replacement of the human cortex.
Technology driven delivery of safe and effective healthcare
1 year ago: This is fascinating. If this sort of AI reading of an ECG was done as part of clinical decision support, you might expect to see lots of disclaimers about the provenance of its interpretation. The fact that anyone can essentially interact with this and get a misleading output, in this case a false positive but perhaps more worryingly false negatives, is a bit worrying.
AI for healthcare | Dev&Doc Podcast | Clinical Lead | Neurology Registrar
1 year ago: Great stuff Keith
Advisor / Investor / Fire Starter
1 year ago: Extremely interesting. Thanks :)
NHS GP | Digital Transformation | Healthcare Innovation | Metabolic Health | Physician on a journey
1 year ago: It is a really impressive ability to move away from text to other modalities. Unfortunately, the ability to induce anxiety seems the same as with "Doctor Google". I think big general LLMs like GPT are not likely to be foundation models for the clinical decision tools of the future. Still, human beats the machine, and let's not forget one of the principles of medicine: treat the patient in front of you, not the test result.