Koko, ChatGPT, and the outrage over corporate experimentation
Online mental health service Koko inadvertently sparked outrage recently by disclosing that it had experimented with using ChatGPT to write messages to users. But in my view, the outrage focused on the wrong issue: nonconsensual experimentation.
What happened
Koko is a peer-to-peer messaging service for mental health support. Users post messages anonymously, and other users answer them anonymously. According to tweets from its CEO, Rob Morris, the company ran an experiment in which help-givers had the option to generate draft messages with OpenAI's ChatGPT. The human helper then had three options: send the draft unedited, edit it before sending, or scrap it and write their own message. After analyzing thousands of chats with and without ChatGPT, the company said it found that people rated ChatGPT-assisted messages more highly than purely human-composed ones.
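To make the workflow concrete, here is a minimal Python sketch of what a draft-assist flow along these lines might look like. Everything in it is invented for illustration: `generate_draft` is a stub standing in for a real ChatGPT call, and none of the names come from Koko's actual system.

```python
from dataclasses import dataclass

@dataclass
class Reply:
    text: str
    ai_assisted: bool  # lets the UI attach a "co-written with AI" notice


def generate_draft(request_text: str) -> str:
    """Stand-in for a ChatGPT call: a real integration would send the
    support request to an LLM API and return its suggested reply."""
    return f"Draft reply to: {request_text!r}"


def compose_reply(request_text: str, choice: str, helper_text: str = "") -> Reply:
    """The helper picks one of the three options described above."""
    draft = generate_draft(request_text)
    if choice == "send":   # send the AI draft unchanged
        return Reply(draft, ai_assisted=True)
    if choice == "edit":   # edit the draft, then send
        return Reply(helper_text, ai_assisted=True)
    return Reply(helper_text, ai_assisted=False)  # scrap the draft, write their own


reply = compose_reply("I'm feeling really anxious today.", "edit",
                      "You're not alone; a lot of people feel this way.")
print(reply.ai_assisted)  # True, so the message can carry a disclosure notice
```

The `ai_assisted` flag is the kind of bookkeeping that would make both the ratings comparison and the disclosure notice possible.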
At first it appeared that recipients of help weren’t informed of ChatGPT’s involvement, and several mainstream media outlets reported the story that way. “Mental health service used an AI chatbot without telling people first” read the headline of a New Scientist article, which has since been changed. Later, articles in Gizmodo and Vice clarified that all users were informed, and that AI-generated messages came with a notice saying “written in collaboration with kokobot”. It was easy to get the wrong impression, though. Morris had said on Twitter, “once people learned the messages were co-created by a machine, it didn’t work,” which implied they hadn’t been informed from the start.
What people got outraged about
The outraged discussion has focused on the fact that Koko was (apparently) experimenting on users without their knowledge. We now know that wasn’t true, but the nature of the outrage is interesting. People wanted to know: did Koko have this reviewed by an IRB (Institutional Review Board)? The CEO responded, correctly, that IRB review wasn’t required. Further outrage ensued: this kind of experimentation is wildly unethical and probably illegal, people said.
Experimentation is the wrong focus for our moral outrage
In general, product testing is not subject to IRB approval. Companies experiment on users all the time without IRB review and without specifically notifying them. Such tests are not only legal; they can be genuinely beneficial.
The term "A/B Illusion", coined by Bioethics professor Michelle Meyer, brilliantly captures the faulty reasoning behind fear of corporate experiments. It describes the fact that we tend to think A/B testing is bad, but just implementing A or B without any testing is fine. In a 2015 New York Times Op-Ed with the provocative title “Please, Corporations, Experiment on Us,” Meyer and Christopher Chabris make a compelling argument that (in some cases) experimenting on people without their consent is not only legal and ethical – it’s a good idea. If we don’t know whether A or B is better, performing an experiment may be the best way of finding out.?Absence of consent might be essential, because If people know about an experiment, that knowledge can distort the results. They’re careful to note this argument only applies to low-risk settings, and no one should be performing high-risk experiments without informed consent.
You might argue that mental health support is a high-risk setting, and therefore doesn’t fit Meyer and Chabris’s argument. In addition, OpenAI specifically disallows the use of ChatGPT for health purposes. But keep in mind that Koko is a peer support network; the human helpers are likely unqualified too. No one is receiving, or expecting to receive, standard-of-care treatment through Koko.
To be clear, I’m not arguing that this means it would have been fine for Koko to use ChatGPT without informing users. What I'm arguing is that the failure to inform users would have violated a different ethical principle, unrelated to experimentation.
Transparency is the real issue
Transparency is one of the core principles of AI ethics frameworks such as the EU Ethics Guidelines, the IEEE Ethically Aligned Design guidelines, the OECD AI Principles, and many others. In fact, a systematic review found transparency to be the single most common principle across AI ethics frameworks. Transparency includes many elements, but one of its most basic requirements is that users know when they’re interacting with AI.
This is where Koko appeared to have failed. In Morris’s initial descriptions, it sounded as if support-givers knew they were using ChatGPT, but support-receivers didn’t. Had that been true, it would have been a clear violation of one of the most universally agreed-upon principles in AI ethics.
We can learn something else about transparency from this incident: having a human in the loop doesn’t remove the need for transparency. Even though Morris always made it clear that humans were reviewing every ChatGPT-generated message and deciding whether to use it, people were outraged at the idea that recipients didn’t know the messages came from an algorithm. If there’s one important takeaway here, it’s this: people really want to know when they’re interacting with AI.
Comments
Principal Platform PM | Former Applied Scientist | MS in Data Science | MBA
Super insightful article Carol! Thank you! I am excited to read more content from you. I agree with the sentiment that, "people really want to know when they’re interacting with AI." One question I am curious to see unfold is what will consumers classify as an interaction with AI? In this case, for example, I imagine a lot of the Koko peer helpers use Google to help them find articles etc. for answers. So, even though they are not generating their answers with ChatGPT and AI, they are using Google (which you could argue is AI) to find content, so in some ways it is AI assisted. While a stretch, we do know that Google Search does influence your view of reality and truth, thus the Koko responses without ChatGPT could still be AI influenced responses, but I don't think that surfaces up any red flags to the consumers. Obviously, in this case my example is a bit of a stretch and more clear cut on what is AI interaction (ChatGPT) and what is not (user only), but I do wonder as we have more and more AI embedded into our everyday "accepted" products, then what will consumers flag as AI interactions vs. not?
Vice President and Head, Computational Biology at Bayer Pharmaceuticals
Thanks for your insightful post! As part of Transparency, can we get Terms & Conditions that are succinct and written in colloquial language? Burying permissive language in legalese isn’t transparent at all
Senior Researcher, XAI | PhD | PII Detection | NLP
Did the CEO ever clarify that the end-users (patients) were informed in advance that they were part of this study? I asked him the question on Twitter and never heard back.