Can ChatGPT Replace an ER Doctor?
[Header image generated via Leonardo.ai's generative AI]

I keep going back to something Sam Altman said at his Senate hearing: GPT-4 is "good at doing tasks, not jobs." It's a concise and important distinction that many people gloss over when they first try ChatGPT: you instantly recognize all of your job's tasks that it can do and overlook all the rest of your work that it can't.

So I thought I'd do a little experiment to see how quickly I'll be out of work. Let's take just one of the 20+ patients I see every shift in the ER and see how much of my job AI could do for me. I'll focus on ChatGPT and large language models, since that's where all the buzz is, but I'll consider other AI tools as well. And just for crystal clarity: this is not an actual patient; numerous demographic and other details about the case have of course been changed for privacy and ethical reasons.

Heart Attack or Not?

It's 6:11 on a Saturday morning, and my first patient has walked in reporting upper abdominal pain that started this morning. She's a kind Farsi-speaking woman in her 60s who smokes and has diabetes and COPD. We get a video translator. The pain woke her from sleep, she vomited once, she's feeling a little short of breath, and her color just looks kind of... off. You know when someone's queasy and is going to either vomit or pass out? Pale and kinda green? She looks like that, but isn't passing out or vomiting. She keeps saying upper abdominal pain but really points to her lower chest when I ask her to show me exactly where it hurts. We get an EKG, which looks a bit different to me from her prior one in the system, but the computer reads it as completely normal. We send off numerous labs and give medicines for pain and nausea.

Despite the medications, over the next 20 minutes her symptoms worsen, so the nurse comes and grabs me — more pain, more vomiting, now sweaty. (We have a mantra in emergency medicine: If your patient is involuntarily sweating, you should be sweating — they usually have something very bad.)

I repeat the EKG — it's very slightly worse than the first, but it still doesn't meet the exact criteria we use for a heart attack (and again the computer reads it as 100% normal). Regardless, I decide to activate the cardiology team given her EKG and symptoms, as I'm now sure she has a very time-sensitive diagnosis: a massive heart attack (an ST-elevation myocardial infarction, or STEMI).

[Image: a blocked left circumflex coronary artery on catheterization (via Google Image search, not patient information, of course)]

Our amazing cardiology team, awakened on a Saturday morning by their pagers, drives in from home, talks with me, reviews the EKGs, then talks with the patient and gets consent for a catheterization procedure. They take her down the hall, thread a long catheter from her groin into her heart, and find and open up her 99% occluded left circumflex artery. She's discharged a day later and thankfully makes a full recovery. So what all did I have to do as an ER doctor for this patient?

Human vs. AI: Task Review

  1. Take a history from the patient. ChatGPT can take a pretty decent history, and can translate numerous languages — including Farsi. But it can't see, and I think it would have been slightly thrown off because the patient was complaining of abdominal pain, not chest pain, and actually said (via human translator) she didn't have chest pain. 5/10.
  2. Read the EKG. While better AI-trained EKG interpretation systems are being developed (along with new paradigms for the signs of a heart attack on an EKG), the current system entirely missed this one. Often these systems will at least say something like, "Nonspecific changes, consider ischemia," but on multiple EKGs, this was interpreted as stone-cold normal. That would have been a fatal mistake. 0/10. (For the curious, a toy sketch of what these EKG models look like under the hood follows this list.)
  3. Recognize she's getting worse. It was the astute emergency nurse who knew to grab me and escalate her concerns about the patient; she was able to recognize visual clues. The woman wasn't reporting new symptoms, but she had become visibly sweaty, and her continued pain and new vomiting despite appropriate medications were the tip-offs. AI programs — just like the human brain — need inputs in order to make outputs. In this case, AI (without additional senses) wouldn't have had much to go on. NB: We see lots of patients who have continued pain and vomiting despite medications, but it's that gestalt/gut feeling/spidey sense of an experienced doctor or nurse that's really critical. 0/10. (I do think AI models will eventually be able to develop a spidey sense with lots of training and LOTS of data, but not just yet.)
  4. Know which diagnoses to consider (and what tests and medications to order). ChatGPT might have been initially thrown off by the report of only abdominal pain, but I think it eventually would have gotten there. I've definitely seen it cast too wide a net, though, listing far too many possibilities (and missing others). In this case, it was critical to make a decision with limited information. Given all the same digital data I had at the time, I think it would have done a decent job. 7/10.
  5. Decide to make the call and wake up the Cardiology STEMI team on their day off. Let's be very clear: I work with an incredible group of consummate professionals who understand that my job in the ER is to not miss any heart attacks — and that I may wake them up at 2am sometimes, have them drive into the hospital... and be wrong. But even still, I don't want to be wrong very often. (Worse, of course, would be to not call them when someone is having a heart attack and I'm just not sure.) But I think we're a long way off from a cardiology team agreeing to be called in based on the information from a ChatGPT-esque large language model. Like I said above, I have no doubt that we'll develop better EKG-reading tools, and we'll be able to combine them with some elements of the patient's story, age, risk factors, and maybe even things like subtle vital sign abnormalities or a video feed showing that they're sweaty — and then say "ALERT ALERT IT'S A HEART ATTACK!" But even if those systems exist, I think we'll want someone with medical knowledge — maybe someone supervising the AI tool — to make the final call. 0/10.
  6. Decide to perform a procedure on the patient... It's one thing for me to call in the cardiology team. It's another for the cardiologist to do their own independent interpretation of the EKG, talk with the patient, and then decide: "Yes, I am concerned enough that this is a heart attack to subject the patient to the risk of this procedure, where I take a really long wire, stick it into the femoral artery in their groin, guide it all the way up into the heart, look for a blockage, and open that blockage if it exists." (Again, our cardiologists are amazing, but invasive procedures are never without risks, and while yes, we doctors tend to do something of a logical math game of "benefits > risks" in our heads... it's another thing entirely to know that you're the one doing the procedure and some percentage of the risk depends on how you do it.) 4/10. (This one is extremely subjective, because AI could certainly do the calculation, but as a physician, it feels like way more than a calculation.)
  7. ...And consent the patient for the procedure. This I think ChatGPT could really do a great job with. From language translation to explaining things in patient-centric terms that patients can clearly understand, I think it probably does this as well as a doctor, and possibly even better. A chatbot also obviously has near-infinite time to answer additional questions. However, a chatbot probably can't do a great job of reassuring a patient (things start happening very quickly once we're trying to get an artery opened) and might have a hard time conveying the time-sensitivity to a patient with numerous questions. 9/10.
  8. Write the physician note. Another really strong area for ChatGPT and AI in general, especially when you add in speech-to-text capabilities. I'm certain these tools could write most of my note for me, especially with a bit of instruction about what information to include for quality and regulatory reasons. 9.5/10. (A sketch of such a dictation-to-note pipeline also follows this list.)
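
To make item 2 concrete: here's a minimal, purely illustrative sketch of the kind of model behind AI EKG interpretation, written as a small 1D convolutional network in PyTorch. The lead count, sampling rate, labels, and architecture are all my own assumptions for illustration, not any vendor's actual (and far more sophisticated) algorithm.

```python
# Toy sketch of an AI EKG classifier. Illustrative only: hypothetical
# shapes and labels, not a real clinical system.
import torch
import torch.nn as nn

class EKGClassifier(nn.Module):
    def __init__(self, n_leads: int = 12, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_leads, 32, kernel_size=15, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=15, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one summary per filter
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, leads, samples), e.g. a 10-second, 12-lead tracing at 500 Hz
        return self.head(self.features(x).squeeze(-1))

model = EKGClassifier()
tracing = torch.randn(1, 12, 5000)            # one synthetic 12-lead EKG
probs = torch.softmax(model(tracing), dim=-1)
print(probs)  # [P(not STEMI), P(STEMI)]; untrained here, so meaningless
```

The hard part, as this case shows, isn't the architecture; it's training on the subtle, evolving EKGs that don't meet classic criteria, which is exactly where the current computer read failed.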
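And for item 8, a minimal sketch of the dictation-to-note pipeline, assuming the OpenAI Python SDK (v1+), an API key in the environment, and a hypothetical audio file name. The prompt and the list of required note elements are my own illustrative choices, not any real documentation product's.

```python
# Sketch of a speech-to-text -> LLM note-drafting pipeline.
# Assumptions: OpenAI Python SDK v1+, OPENAI_API_KEY set, hypothetical file name.
from openai import OpenAI

client = OpenAI()

# 1) Transcribe the physician's dictation of the encounter.
with open("encounter_dictation.wav", "rb") as audio:
    transcript = client.audio.transcriptions.create(
        model="whisper-1", file=audio
    ).text

# 2) Draft a structured ER note, with explicit instructions about
#    what must be documented for quality and regulatory reasons.
draft = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "Draft an emergency department physician note from the "
                "dictation below. Include HPI, exam, EKG interpretation, "
                "medical decision-making, and disposition. Flag anything "
                "uncertain for physician review; never invent findings."
            ),
        },
        {"role": "user", "content": transcript},
    ],
).choices[0].message.content

print(draft)  # a draft only: the physician still reviews, edits, and signs
```

Even in this rosy sketch, the physician remains the final reviewer and signer; the tool saves typing, not judgment.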

All the Other Tasks

And let's not forget all of the other tasks that I haven't mentioned — not just those of the cardiologist (doing the actual procedure) but all of the other people who have tasks and jobs that make an ER work, including but not limited to:

  • Registration and insurance associates
  • ER technicians (who perform EKGs, shave the groin for STEMI patients, do splinting and dozens of other daily tasks in the ER)
  • ER nurses (place an IV, do their own assessment, draw blood, and safely, correctly, and rapidly administer numerous medications)
  • Pharmacists (stock medications, review medication interactions, triple-check orders from physicians)
  • Housekeeping to keep rooms clean, safe, and infectious-agent-free for the patient who needs the bed next
  • Radiology to perform imaging
  • Laboratory technicians to make sure blood work is processed
  • Engineers to keep the lights on and make sure the technology is working correctly
  • IT to make sure computers, wireless, and telephones are working and accessible (ChatGPT would certainly be a 0/10 without them, or without engineering keeping the power on)

(And that's just for this one patient, for her very brief 1-hour ER stay — not the ICU team who takes over after the procedure, or the dozens of other people the patient will come into contact with.)

34.5/80, and I Was Being Generous

If you're reading this as "43.1% of an ER doctor's job can be replaced by AI today," remember: I haven't even mentioned the 3 other patients who arrived in that same hour (a young man who'd fallen off a scooter, hit his head, and broken his arm; an elderly woman with an itchy rash behind her knee for a week; an elderly man from a nursing home who was sleepier than usual), plus the 7 patients handed off from the overnight doctor who still needed additional time and testing.

As I will continue to say: when these AI tools are supervised, evaluated, tested, and validated by doctors and nurses and proven to be ready to help me do some of my tasks, SIGN. ME. UP. For the time being, however, there's no better place to be than under the care of an experienced ER doctor and nurse when you're in need.

Sean McLeary

VP, Product Experience @ Intapp | Future-focused product leader, expert in driving innovation, scaling teams, and delivering exceptional user-centric software

1 yr

Seriously great write up, Graham. I’m going to save this for later discussions. I appreciate that you mention the growth in both inputs and data processing that will inevitably come to be. There may still be encroachment in the future on the “fuzzy logic” analysis and decision making that skilled humans do. I am dubious, however, that AI will ever be able to effectively mimic true bedside manner and empathy.

Holly Urban, MD, MBA

Vice President, Business Development at Wolters Kluwer

1 yr

Thank you for this reasoned take that shows Generative AI will never replace physicians, but hopefully will make them more efficient. (disclaimer: ChatGPT did not write this comment.)

Bruce Powell

Writer . Broadcaster . Podcaster . Presenter . Brain Injury Advocate

1 yr

Humans' reaction to AI is fascinating and will form a significant portion of our resistance to its gainful use. Much of the perceived "poor performance" is user-generated, and even "instinctive" human judgments use past experiences to guide decision-making. If you instruct ChatGPT to be less prescriptive and more open in its decision-making, it will be. Seniors' roles will endure. There will always be times when there is no prescriptive answer. If there were an answer, then your juniors would not be seeking your advice. Senior clinicians will continue to lead by making decisions when there is no 'right' or 'wrong', except in hindsight. With that reflection will come the bearing of responsibility and acknowledgement of the uncertainty that will persist indefinitely. That is the human bit that we can focus upon and celebrate while our AI colleague does the mundane grunt work.

Bruce Powell

Writer . Broadcaster . Podcaster . Presenter . Brain Injury Advocate

1 yr

Of course the ER physicians could just call a clinical expert if they want human-based answers?

Bruce Powell

Writer . Broadcaster . Podcaster . Presenter . Brain Injury Advocate

1 yr

Whenever I read articles about ChatGPT, I am reminded of the old adage "a bad workman blames his/her tools". If we ask poor questions, we get shoddy answers. After all, we built ChatGPT and we asked him/her the questions. Let AI provide the knowable answers and concentrate our extraordinary talents on the puzzles that have no evidence-based solution.
