Can ChatGPT Replace an ER Doctor?
[Header image generated via Leonardo.ai's generative AI]

I keep going back to something Sam Altman said at his Senate hearing: GPT-4 is "good at doing tasks, not jobs." It's a concise and important distinction that many people gloss over when they first try ChatGPT: you instantly recognize all of your job's tasks that it can do and overlook all the rest of your work that it can't.

So I thought I'd do a little experiment to see how quickly I'll be out of work. Let's take just one of the 20+ patients I see every shift in the ER and see how much of my job AI could do for me. I'll focus on ChatGPT and large language models, since that's where all the buzz is, but I'll consider other AI tools as well. And just for crystal clarity: this is not an actual patient; numerous demographic and other details about the case have of course been changed for privacy and ethical reasons.

Heart Attack or Not?

It's 6:11 on a Saturday morning, and my first patient has walked in reporting upper abdominal pain that started this morning. She's a kind Farsi-speaking woman in her 60s who smokes and has diabetes and COPD. We get a video translator. The pain woke her from sleep, she vomited once, she's feeling a little short of breath, and her color just looks kind of... off. You know when someone's queasy and is going to either vomit or pass out? Pale and kinda green? She looks like that, but isn't passing out or vomiting. She keeps saying upper abdominal pain but really points to her lower chest when I ask her to show me exactly where it hurts. We get an EKG, which looks a bit different to me from her prior one in the system, but the computer reads it as completely normal. We send off numerous labs and give medicines for pain and nausea.

Despite the medications, over the next 20 minutes her symptoms worsen, so the nurse comes and grabs me — more pain, more vomiting, now sweaty. (We have a mantra in emergency medicine: If your patient is involuntarily sweating, you should be sweating — they usually have something very bad.)

I repeat the EKG — it's very slightly worse than the first, but it still doesn't meet the exact criteria we use for a heart attack (and again the computer reads it as 100% normal). Regardless, I decide to activate the cardiology team given her EKG and symptoms, as I'm now sure she has a very time-sensitive diagnosis: a massive heart attack (an ST-elevation myocardial infarction, or STEMI).

[Image: a blocked left circumflex coronary artery on catheterization (via Google Image search, not patient information, of course)]

Our amazing cardiology team, awakened on a Saturday morning by their pagers, drives in from home, talks with me, reviews the EKGs, then talks with the patient and gets consent for a catheterization procedure. They take her down the hall, thread a long catheter from her groin into her heart, and find and open up her 99% occluded left circumflex artery. She's discharged a day later and thankfully makes a full recovery. So what all did I have to do as an ER doctor for this patient?

Human vs. AI: Task Review

  1. Take a history from the patient. ChatGPT can take a pretty decent history, and can translate numerous languages — including Farsi. But it can't see, and I think it would have been slightly thrown off because the patient was complaining of abdominal pain, not chest pain, and actually said (via human translator) she didn't have chest pain. 5/10.
  2. Read the EKG. While better AI-trained EKG interpretation systems are being developed (along with new paradigms for the signs of a heart attack on an EKG), the current system entirely missed this one. Often these systems will at least say something like, "Nonspecific changes, consider ischemia," but on multiple EKGs, this was interpreted as stone-cold normal. That would have been a fatal mistake. 0/10. (For the curious, a toy sketch of what these EKG models look like under the hood follows this list.)
  3. Recognize she's getting worse. It was the astute emergency nurse who knew to grab me and escalate her concerns about the patient; she was able to recognize visual clues. The woman wasn't reporting new symptoms, but she had become visibly sweaty, and her continued pain and new vomiting despite appropriate medications were the tip-offs. AI programs — just like the human brain — need inputs in order to make outputs. In this case, AI (without additional senses) wouldn't have had much to go on. NB: We see lots of patients who have continued pain and vomiting despite medications, but it's that gestalt/gut feeling/spidey sense of an experienced doctor or nurse that's really critical. 0/10. (I do think AI models will eventually be able to develop a spidey sense with lots of training and LOTS of data, but not just yet.)
  4. Know which diagnoses to consider (and what tests and medications to order). ChatGPT might have been initially thrown off by the report of only abdominal pain, but I think it eventually would have gotten there. I've definitely seen it cast too wide a net, though, listing far too many possibilities (and missing others). In this case, it was critical to make a decision with limited information. Given all the same digital data I had at the time, I think it would have done a decent job. 7/10.
  5. Decide to make the call and wake up the Cardiology STEMI team on their day off. Let's be very clear: I work with an incredible group of consummate professionals who understand that my job in the ER is to not miss any heart attacks — and that I may wake them up at 2am sometimes, have them drive into the hospital... and be wrong. But even still, I don't want to be wrong very often. (Worse, of course, would be to not call them when someone is having a heart attack and I'm just not sure.) But I think we're a long way off from a cardiology team agreeing to be called in based on the information from a ChatGPT-esque large language model. Like I said above, I have no doubt that we'll develop better EKG-reading tools, and we'll be able to combine them with some elements of the patient's story, age, risk factors, and maybe even things like subtle vital sign abnormalities or a video feed showing that they're sweaty — and then say "ALERT ALERT IT'S A HEART ATTACK!" But even if those systems exist, I think we'll want someone with medical knowledge — maybe someone supervising the AI tool — to make the final call. 0/10.
  6. Decide to perform a procedure on the patient... It's one thing for me to call in the cardiology team. It's another for the cardiologist to do their own independent interpretation of the EKG, talk with the patient, and then decide: "Yes, I am concerned enough that this is a heart attack to subject the patient to the risk of this procedure, where I take a really long wire, stick it into the femoral artery in their groin, guide it all the way up into the heart, look for a blockage, and open that blockage if it exists." (Again, our cardiologists are amazing, but invasive procedures are never without risks, and while yes, we doctors tend to do something of a logical math game of "benefits > risks" in our heads... it's another thing entirely to know that you're the one doing the procedure and some percentage of the risk depends on how you do it.) 4/10. (This one is extremely subjective, because AI could certainly do the calculation, but as a physician, it feels like way more than a calculation.)
  7. ...And consent the patient for the procedure. This I think ChatGPT could really do a great job with. From language translation to explaining things in patient-centric terms that patients can clearly understand, I think it probably does this as well as a doctor, and possibly even better. A chatbot also obviously has near-infinite time to answer additional questions. However, a chatbot probably can't do a great job of reassuring a patient (things start happening very quickly once we're trying to get an artery opened) and might have a hard time conveying the time-sensitivity to a patient with numerous questions. 9/10.
  8. Write the physician note. Another really strong area for ChatGPT and AI in general, especially when you add in speech-to-text capabilities. I'm certain these tools could write most of my note for me, especially with a bit of instruction about what information to include for quality and regulatory reasons. 9.5/10. (A sketch of such a dictation-to-note pipeline also follows this list.)
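
To make item 2 concrete: here's a minimal, purely illustrative sketch of the kind of model behind AI EKG interpretation, written as a small 1D convolutional network in PyTorch. The lead count, sampling rate, labels, and architecture are all my own assumptions for illustration, not any vendor's actual (and far more sophisticated) algorithm.

```python
# Toy sketch of an AI EKG classifier. Illustrative only: hypothetical
# shapes and labels, not a real clinical system.
import torch
import torch.nn as nn

class EKGClassifier(nn.Module):
    def __init__(self, n_leads: int = 12, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_leads, 32, kernel_size=15, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=15, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one summary per filter
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, leads, samples), e.g. a 10-second, 12-lead tracing at 500 Hz
        return self.head(self.features(x).squeeze(-1))

model = EKGClassifier()
tracing = torch.randn(1, 12, 5000)            # one synthetic 12-lead EKG
probs = torch.softmax(model(tracing), dim=-1)
print(probs)  # [P(not STEMI), P(STEMI)]; untrained here, so meaningless
```

The hard part, as this case shows, isn't the architecture; it's training on the subtle, evolving EKGs that don't meet classic criteria, which is exactly where the current computer read failed.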
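And for item 8, a minimal sketch of the dictation-to-note pipeline, assuming the OpenAI Python SDK (v1+), an API key in the environment, and a hypothetical audio file name. The prompt and the list of required note elements are my own illustrative choices, not any real documentation product's.

```python
# Sketch of a speech-to-text -> LLM note-drafting pipeline.
# Assumptions: OpenAI Python SDK v1+, OPENAI_API_KEY set, hypothetical file name.
from openai import OpenAI

client = OpenAI()

# 1) Transcribe the physician's dictation of the encounter.
with open("encounter_dictation.wav", "rb") as audio:
    transcript = client.audio.transcriptions.create(
        model="whisper-1", file=audio
    ).text

# 2) Draft a structured ER note, with explicit instructions about
#    what must be documented for quality and regulatory reasons.
draft = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "Draft an emergency department physician note from the "
                "dictation below. Include HPI, exam, EKG interpretation, "
                "medical decision-making, and disposition. Flag anything "
                "uncertain for physician review; never invent findings."
            ),
        },
        {"role": "user", "content": transcript},
    ],
).choices[0].message.content

print(draft)  # a draft only: the physician still reviews, edits, and signs
```

Even in this rosy sketch, the physician remains the final reviewer and signer; the tool saves typing, not judgment.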

All the Other Tasks

And let's not forget all of the other tasks that I haven't mentioned — not just those of the cardiologist (doing the actual procedure) but all of the other people who have tasks and jobs that make an ER work, including but not limited to:

  • Registration and insurance associates
  • ER technicians (who perform EKGs, shave the groin for STEMI patients, do splinting and dozens of other daily tasks in the ER)
  • ER nurses (place an IV, do their own assessment, draw blood, and safely, correctly, and rapidly administer numerous medications)
  • Pharmacists (stock medications, review medication interactions, triple-check orders from physicians)
  • Housekeeping to keep rooms clean, safe, and infectious-agent-free for the patient who needs the bed next
  • Radiology to perform imaging
  • Laboratory technicians to make sure blood work is processed
  • Engineers to keep the lights on and make sure the technology is working correctly
  • IT to make sure computers, wireless, and telephones are working and accessible (ChatGPT would certainly be a 0/10 without them, or without engineering keeping the power on)

(And that's just for this one patient, for her very brief 1-hour ER stay — not the ICU team who takes over after the procedure, or the dozens of other people the patient will come into contact with.)

34.5/80, and I Was Being Generous

If you're reading this as "43.1% of an ER doctor's job can be replaced by AI today," remember: I haven't even mentioned the 3 other patients who arrived in that same hour (a young man who'd fallen off a scooter, hit his head, and broken his arm; an elderly woman with an itchy rash behind her knee for a week; an elderly man from a nursing home who was sleepier than usual), plus the 7 patients handed off from the overnight doctor who still needed additional time and testing.

As I will continue to say: when these AI tools are supervised, evaluated, tested, and validated by doctors and nurses and proven to be ready to help me do some of my tasks, SIGN. ME. UP. For the time being, however, there's no better place to be than under the care of an experienced ER doctor and nurse when you're in need.

Sean McLeary

VP, Product Experience @ Intapp | Future-focused product leader, expert in driving innovation, scaling teams, and delivering exceptional user-centric software

1 yr

Seriously great write up, Graham. I’m going to save this for later discussions. I appreciate that you mention the growth in both inputs and data processing that will inevitably come to be. There may still be encroachment in the future on the “fuzzy logic” analysis and decision making that skilled humans do. I am dubious, however, that AI will ever be able to effectively mimic true bedside manner and empathy.

Holly Urban, MD, MBA

Vice President, Business Development at Wolters Kluwer

1 yr

Thank you for this reasoned take that shows Generative AI will never replace physicians, but hopefully will make them more efficient. (disclaimer: ChatGPT did not write this comment.)

Bruce Powell

Writer . Broadcaster . Podcaster . Presenter . Brain Injury Advocate

1 yr

Humans' reaction to AI is fascinating and will form a significant portion of our resistance to its gainful use. Much of the perceived "poor performance" is user-generated, and even "instinctive" human judgments use past experiences to guide decision-making. If you instruct ChatGPT to be less prescriptive and more open in its decision-making, it will be. Seniors' roles will endure. There will always be times when there is no prescriptive answer. If there were an answer, then your juniors would not be seeking your advice. Senior clinicians will continue to lead by making decisions when there is no 'right' or 'wrong', except in hindsight. With that reflection will come the bearing of responsibility and acknowledgement of the uncertainty that will persist indefinitely. That is the human bit that we can focus upon and celebrate while our AI colleague does the mundane grunt work.

Bruce Powell

Writer . Broadcaster . Podcaster . Presenter . Brain Injury Advocate

1 yr

Of course the ER physicians could just call a clinical expert if they want human-based answers?

Bruce Powell

Writer . Broadcaster . Podcaster . Presenter . Brain Injury Advocate

1 yr

Whenever I read articles about ChatGPT, I am reminded of the old adage "a bad workman blames his/her tools". If we ask poor questions, we get shoddy answers. After all, we built ChatGPT and we asked him/her the questions. Let AI provide the knowable answers and concentrate our extraordinary talents on the puzzles that have no evidence-based solution.
