Unraveling the Role of Humans in Shaping Responsible AI in Medicine

Ilya Sutskever, OpenAI’s Chief Scientist, who helped fire Sam Altman (which didn’t last long), was recently quoted in the WSJ telling an OpenAI employee that they needed to “formalize what it means for Artificial Intelligence (AI) to love humanity.” [1] Ilya’s concern with safe AI is noble and important, but love?

Love and safety are different. One is an emotion; the other is a feature. AI has features; humans have emotions. AI will never be sentient. AI will never make decisions. AI will never love, hate, or even be indifferent towards humanity. What it will do is reflect the sentience, decisions, love, hate, and indifference of humanity back at humanity.

AI is not a new concept or even a new technology. In fact, AI is almost as old as computers themselves, and the first computers were humans performing math by hand. Alan Turing and many other early computer science pioneers were mathematicians. Given that AI is just math, it’s not hard to accept that AI is a collection of evolving mathematical processes and strategies (tools) rather than a breakthrough event in computer evolution.

The relationship between humans and computation is an important one to remember. In the same way we harnessed machines to industrialize and automate manual labor, we bent silicon and electric charge to do our math homework. In that sense, computers are no more likely to become sentient or even creative than the cotton gin or your lawn mower.

Sentience, or the capacity to feel, is biologically the ability to respond to the environment via biochemistry that happens in our cells; it’s one of the seven characteristics of life. [2] That is why the notion of AI leading to sentient computers is a non-starter: while computers may imitate intelligence at times, they completely lack the characteristics of life that are inseparably connected to sentience.

Unthinking, unfeeling computers are not a sexy narrative (no blockbuster movie script here), but it is imperative we embrace the truth that we, humans, are the sentient element of AI. The classification of an AI outcome as a ‘bug’ or ‘feature’ depends on the data used to train the AI and on user (human) acceptance of the outcome.

This means that if AI “decided” to launch a nuclear missile at Los Angeles (the plot of an upcoming blockbuster), it would be a bug, not a decision. Rather than asserting that AI made a decision and rushing off to fight AI-mageddon, we would ask ourselves why the algorithm produced that outcome, and then we would change the inputs to effect a different outcome.

Like any machine, AI will do exactly what it is programmed to do by a sentient being.

This further underscores the fundamental theorem of biomedical informatics proposed by Charles Friedman: the partnership between computer and human is always better than the computer or human alone. [3]

A famous example: amateur chess players paired with multiple chess algorithms beat grandmasters and the chess algorithms playing on their own. [4]

What’s important is that we understand what computers are much better at than humans and then do what we do much better than computers: apply our creativity, innovation, morality, risk tolerance, and emotion to point AI at the right problems.

This is best accomplished with data. Data just represents a pattern to AI. Turns out, computers are capable of processing the breadth, depth, and dimension of patterns in data in ways humans simply cannot.

Making the connection between data and patterns with relation to AI training might be the most fundamentally important thing to understand about AI, machine learning, ontological reasoning, statistical modeling, or any other application of math using computational means.

Equally important: computers are so good at pattern recognition they can’t operate outside of the pattern.

So how do you get good AI outcomes? You’d better have excellent data for the AI to extract patterns from. We call that training. You give the math your data (the patterns) and use computers to create an equation, model, or network. Then you feed that equation new, unseen data and it produces an answer.
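The training loop described above can be sketched in a few lines. This is a toy illustration only: the data points are made up, and ordinary least squares stands in for whatever model a real system would use. It shows data going in, an equation coming out, and the equation being applied to an input it has never seen.

```python
# Toy illustration: "training" fits an equation to example data,
# then the equation is applied to a new, unseen input.

def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned equation to an input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data that follows the pattern y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)   # training: data in, equation out
print(predict(model, 10.0))          # prints 21.0
```

The model has no idea what the numbers mean. It only reproduces the pattern it was given.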

If your data has holes, your answers will have holes. This phenomenon is often characterized by the adage: “garbage in, garbage out”.

It doesn’t matter how smart you are, how many computers you have, how big your data is, how many engineers you have, how much money you can spend, where you are from, the phase of the moon, or your astrological sign: if your data is garbage, your AI will learn bad patterns. AI that learns bad patterns will produce garbage outcomes.
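Here is a minimal sketch of "garbage in, garbage out", using invented records and a deliberately simple majority-vote learner (an assumption chosen for readability, not a method the article prescribes). The same code trained on mislabeled records learns the wrong pattern, and it has no answer at all for an input outside the pattern.

```python
from collections import Counter, defaultdict


def train_majority(records):
    """Learn, for each feature value, the most common label seen in training."""
    counts = defaultdict(Counter)
    for feature, label in records:
        counts[feature][label] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}


# Hypothetical clean records: (observation, label).
clean = [
    ("fever", "sick"), ("fever", "sick"), ("fever", "sick"),
    ("no_fever", "well"), ("no_fever", "well"), ("no_fever", "well"),
]

# Same observations, but two of the three "fever" labels were entered wrong.
garbage = [
    ("fever", "well"), ("fever", "well"), ("fever", "sick"),
    ("no_fever", "well"), ("no_fever", "well"), ("no_fever", "well"),
]

clean_model = train_majority(clean)
garbage_model = train_majority(garbage)

print(clean_model["fever"])      # prints sick
print(garbage_model["fever"])    # prints well: the bad pattern was learned faithfully
print(clean_model.get("cough"))  # prints None: a hole in the data is a hole in the answers
```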

This is best understood in the context of images. AI performs really well with image data. Why? The images the AI trains on are complete: none of them are missing pixels. This means the data will have patterns the AI can reliably match to produce strong outcomes.

Many of the AI algorithms that have become explosively popular recently depend on data that is either missing lots of “pixels” or bloated with willfully or accidentally distorted “pixels”. Think social media, web forums, wikis, online chat rooms, emails, blog posts, advertisements, legal records, websites: these data sources inherently contain the patterns popular AI tools are following.

All were created by humans and contain our sentience. AI just reflects it back at us.

Cathy O’Neil famously said in The Social Dilemma on Netflix (which everyone should watch), “We are allowing the technologists to frame this as a problem that they are equipped to solve. That’s a lie. People will talk about AI as if it will know truth. AI’s not going to solve these problems. AI cannot solve the problem of fake news. Google doesn’t have the option of saying, ‘Oh, is this conspiracy? Is this truth?’, because they don’t know what truth is. They don’t have a proxy for truth that’s better than a click.” [5]

In other words, we are the click that provides “truth” to the AI. We think and feel for it. Ever heard the saying that if a product is free, you are the product?

This is why AI can’t be relied on to improve AI: we are the product. Every click in every app we use; every bit of content we post anywhere online; every pattern we emulate in the data we generate is fair game to train AI. Are you completely truthful on the internet? Is anyone? I think we can bet the farm on this one: there is an enormous amount of falsehood on the internet. So much so, it’s a widely accepted joke not to believe everything you read on the internet.

Of course, we believe everything we read on the internet anyway because our intelligence isn’t artificial. Our intelligence is emotional, creative, hopeful, faithful, foolish, misguided, lucky, and singularly unique. We don’t fall for convincing falsehood on the internet because we are following a pattern; we fall for it because we want it to be true.

Maybe that’s what makes human intelligence so hard to replicate: we aren’t indifferent about the information we consume. We have opinions. Opinions buck patterns. Opinions simultaneously give us the capacity to create and infer and the liability to be confused, deluded, and misled by an assertion that sounds true but isn’t.

AI can’t distinguish between fact and opinion. That’s scary considering the relative distribution of fact (accurate or inaccurate) and opinion in the data AI is using to learn its patterns. Never mind the virtue (or lack thereof) of the user’s intention.

With the advent of Chat-GPT 3 and 4 and image engines like DALL-E 2 and 3, high-fidelity fake AI images, videos, audio recordings, and articles just became a cheap commodity.

If we are the sentient element of AI, then every lie, every immorality, every error, willful or otherwise, on the internet just collectively became the villain from every Hollywood-inspired apocalyptic AI movie, from The Matrix to The Terminator to the latest Mission: Impossible. No better proxy for the truth than a click.

To avoid turning tools of our own making on our own heads, we have to collectively be the responsible sentient in cyberspace. Especially in medicine.

Consider medical records: they’re full of accurate and inaccurate facts and opinions. Most of the opinions are highly educated and skilled opinions, but opinions nonetheless. Most of the facts are accurate, but many are out of date, out of context, or simply wrong. Many facts are just missing.

Fragmentation makes things worse: medical records are scattered across many databases, with duplication and interoperability hurdles between them.

Essentially, medical records have lots of missing and distorted “pixels”. Which means training on medical records will produce output that is also missing “pixels” or contains distorted ones. Worse yet, those missing or distorted “pixels” will be much more difficult to spot in a medical record than they would in, say, an image.

In a recent interview, David Ferrucci, who led the team that built Watson, said, “One interesting thing about human cognition is that we conflate coherent-sounding text with facts. We are like, that sounds really good, it must be true”.

In other words, to us, missing and distorted “pixels” present much like highly informed opinions. Yikes.

Speaking of, remember Watson Health? Me neither. Because they weren’t responsible with their AI. When David Ferrucci and team successfully built Watson to win at Jeopardy, IBM mistook really impressive pattern inference from training data for adaptable intelligence.

They were wrong. David Ferrucci told them so. IBM spent an enormous amount of time and money promoting Watson as a silver bullet for everything from finance to medicine. Ferrucci left IBM a year later. Who could have guessed that Jeopardy was the wrong pattern for medicine? What is: the patient should take an aspirin and lie down for a while?

No shock Ferrucci left IBM when they wouldn’t heed his warnings. Watson never amounted to anything close to its hype and was well known for recommending inaccurate and unsafe cancer treatments. IBM sold Watson Health to a private equity firm in 2022. [6,7] Importantly, Watson was a natural language question-answering system, a close cousin of today’s large language models (LLMs).

New LLMs have burst on the scene: Chat-GPT 3 and 4 are an impressive iteration on what Watson once was. But they are still just reflecting back at humans everything humans have produced: right and wrong, accurate and inaccurate, moral and immoral. Any intervention to temper that is a human intervention.

As impressive as Chat-GPT is, it still struggles with basic medical tasks. Asked to provide all the medical codes for Congestive Heart Failure from SNOMED-CT, a standardized terminology for clinical concepts, Chat-GPT dutifully spit out 27 codes.

Problem: over half of the codes it produced weren’t real. All of SNOMED-CT is part of the underlying training but that doesn’t ensure an optimal pattern for the algorithm. We have to steer it.
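One way a human can steer here is to check every generated code against the authoritative terminology before trusting it. The sketch below assumes the reference codes have already been loaded into a set. The identifiers are placeholders invented for illustration, not real SNOMED-CT codes, and `split_by_reference` is a hypothetical helper name.

```python
def split_by_reference(candidate_codes, reference_codes):
    """Separate generated codes into those found in the reference set and those not."""
    reference = set(reference_codes)
    found = [code for code in candidate_codes if code in reference]
    not_found = [code for code in candidate_codes if code not in reference]
    return found, not_found


# Placeholder identifiers for illustration only; these are NOT real SNOMED-CT codes.
reference_codes = {"CODE-A", "CODE-B", "CODE-C"}
generated_codes = ["CODE-A", "CODE-X", "CODE-C", "CODE-Y"]

found, not_found = split_by_reference(generated_codes, reference_codes)
print(found)      # prints ['CODE-A', 'CODE-C']
print(not_found)  # prints ['CODE-X', 'CODE-Y']
```

A membership check only catches codes that don't exist. A real code attached to the wrong concept would pass it, so the check reduces the review burden without replacing human review.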

Medical calculators are tough on Chat-GPT too. Asked to compute a simple stroke risk score given a handful of inputs for a patient like age, sex, presence of diabetes, hypertension, congestive heart failure, etc., it accurately calculated a CHA2DS2-VASc score. But it forgot to provide the annual stroke risk associated with the score.

Asked specifically to provide the stroke risk, it provided a risk that was close but not exact. More complicated calculations based on regression analysis and other statistical learning methods produced even less encouraging results. Results for a MELD score predicting 90-day mortality for cirrhosis patients were flat-out wrong.
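For contrast, the CHA2DS2-VASc score mentioned above is plain, deterministic arithmetic. The sketch below uses the standard point assignments (congestive heart failure 1, hypertension 1, age 75 or older 2, diabetes 1, prior stroke/TIA/thromboembolism 2, vascular disease 1, age 65 to 74 1, female sex 1). The function and parameter names are illustrative assumptions, and mapping the score to an annual stroke risk is deliberately left out because those percentages come from published tables that should be looked up rather than guessed.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 stroke_or_tia, vascular_disease):
    """Compute the CHA2DS2-VASc score (0 to 9) from standard point assignments."""
    score = 0
    score += 1 if chf else 0               # C: congestive heart failure
    score += 1 if hypertension else 0      # H: hypertension
    if age >= 75:                          # A2: age 75 or older
        score += 2
    elif age >= 65:                        # A: age 65 to 74
        score += 1
    score += 1 if diabetes else 0          # D: diabetes mellitus
    score += 2 if stroke_or_tia else 0     # S2: prior stroke, TIA, or thromboembolism
    score += 1 if vascular_disease else 0  # V: vascular disease
    score += 1 if female else 0            # Sc: sex category (female)
    return score


# Hypothetical patient: 72-year-old woman with hypertension and diabetes.
print(cha2ds2_vasc(age=72, female=True, chf=False, hypertension=True,
                   diabetes=True, stroke_or_tia=False,
                   vascular_disease=False))  # prints 4
```

A formalized rule like this returns the same answer every time and can be audited line by line, which is the property the LLM output lacked.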

Objectively spotting these errors is easy enough. But what about when it’s not? Five or six inputs are easy to account for, but what about thousands of inputs across thousands of medical calculations across hundreds of patients per doctor per day?

Artificial intelligence, machine learning, statistical models, large language models, natural language processing, simple math: it all constitutes an impressive hammer. But not everything is a nail.

At AI Medica, we’re being really careful how we wield the AI hammer. Our products would certainly be sexier if we added AI everywhere. But most of the technology problems that need fixing in healthcare don’t need AI, at least not yet. They need formalization.

Formalization is a key branch of AI. Knowledge formalization is accomplished by creating relationships between concepts in a branching graph-like network called an ontology. Much like an artificial neural network (ANN), ontologies are reminiscent of how neurons in our brains form connections.

However, there is a distinct difference between ANNs and ontologies. Where ANNs, and AI in general, seek to create networks automatically by “learning” a data set, ontologies are taught. Ontologies assume that anything not explicitly stated could still be true. This is called the open-world assumption.

The open-world assumption leaves it to us to specifically formalize what a concept is in the ontology. This is done with rules expressed in the context of other concepts in the ontology. Ontologies can be processed with powerful tools called reasoners that rely on math (no surprise here) to validate the consistency of the assertions about the knowledge in the ontology.
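A toy sketch can make the open-world assumption and consistency checking concrete. Everything here is an illustrative assumption: the concept names are invented, and the two small functions stand in for what a real reasoner does with far more expressive logic. A query returns True when a relationship follows from the stated rules, False when the rules contradict it, and None when nothing has been stated either way.

```python
# Toy ontology: explicit "is a" links plus one disjointness rule.
# Concept names are invented for illustration.
IS_A = {
    "CongestiveHeartFailure": {"HeartDisease"},
    "HeartDisease": {"Disease"},
    "Fracture": {"Injury"},
}
DISJOINT = {frozenset({"Disease", "Injury"})}  # nothing can be both


def ancestors(concept, is_a):
    """All concepts reachable by following is-a links (transitive closure)."""
    seen, stack = set(), [concept]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a_query(child, parent, is_a=IS_A, disjoint=DISJOINT):
    """True if entailed, False if contradicted, None if unknown (open world)."""
    child_types = ancestors(child, is_a) | {child}
    if parent in child_types:
        return True
    parent_types = ancestors(parent, is_a) | {parent}
    for a in child_types:
        for b in parent_types:
            if frozenset({a, b}) in disjoint:
                return False
    return None  # not stated, so it could still be true


def inconsistent_concepts(is_a=IS_A, disjoint=DISJOINT):
    """Concepts asserted to fall under two disjoint concepts at once."""
    bad = []
    for concept in is_a:
        types = ancestors(concept, is_a) | {concept}
        if any(pair <= types for pair in disjoint):
            bad.append(concept)
    return bad


print(is_a_query("CongestiveHeartFailure", "Disease"))           # prints True
print(is_a_query("Fracture", "HeartDisease"))                    # prints False
print(is_a_query("CongestiveHeartFailure", "ChronicCondition"))  # prints None
print(inconsistent_concepts())                                   # prints []
```

Because every link is written down, a wrong answer can be traced to a specific assertion and corrected.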

The true difference between ontologies and ANNs: ontologies can be deciphered and therefore modified to be accurate per the knowledge. ANNs (and AI algorithms in general) are famously inscrutable black boxes only influenced by their inputs and informed by their outputs. The connection between input and output is a mystery.

Medical data in the EHR is an unreliable input; AI is indifferent about the output; Jeopardy is a bad model for healthcare. We keep looking for AI to learn us and teach us. It’s the other way around. We are the sentient element of AI and AI needs to be taught.

At AI Medica, that’s exactly what we do.


1. University of Toronto Magazine. Ilya Sutskever: The OpenAI Genius Who Told Sam Altman He Was Fired. https://wsj-article-webview-generator-prod.sc.onservo.com/webview/WP-WSJ-0001376973?adobe_mc=TS%3D1700588503%7CMCMID%3D29085267006007352997277206220409754253%7CMCORGID%3DCB68E4BA55144CAA0A4C98A5%40AdobeOrg&wsj_native_webview=android&ace_environment=androidphone%2Cwebview&ace_config=%7B%22wsj%22%3A%7B%22djcmp%22%3A%7B%22propertyHref%22%3A%22https%3A%2F%2Fwsj.android.app%22%7D%7D%7D

2. 3.1. What are the characteristics of life? https://astrobiology.nasa.gov/education/alp/characteristics-of-life/

3. Friedman, C. P. A ‘fundamental theorem’ of biomedical informatics. J. Am. Med. Inform. Assoc. 16, 169–170 (2009)

4. Mani, S. Note on Friedman’s ‘fundamental theorem of biomedical informatics’. Journal of the American Medical Informatics Association: JAMIA vol. 17 614 (2010)

5. The Social Dilemma - A Netflix Original documentary. The Social Dilemma https://www.thesocialdilemma.com/

6. Landi, H. IBM sells Watson Health assets to investment firm Francisco Partners. https://www.fiercehealthcare.com/tech/ibm-sells-watson-health-assets-to-investment-firm-francisco-partners

7. Diaz, N. Francisco Partners completes acquisition of IBM Watson Health’s data, assets. Becker’s Hospital Review https://www.beckershospitalreview.com/healthcare-information-technology/francisco-partners-completes-acquisition-of-ibm-watson-health-s-data-assets.html



