A promising approach to mitigating the risk of inaccuracies in Generative AI
SARUA - Southern African Regional Universities Association
Facilitating meaningful connections between SADC universities.
Generative AI chatbots (ChatGPT and kin) are notorious for generating inaccuracies, often also called hallucinations, confabulations, fabrications or simply falsifications. Various recent publications regard the designation ‘hallucination’ as a misnomer and express a preference for the other terms mentioned. The inaccuracies can concern factual matters or the logical coherence of the responses generated. In addition, some chatbots can generate ‘bibliographic resources’ that do not exist at all.
This is a serious challenge in many areas of life and work. In the academic domain of learning/teaching and research it poses a serious problem for lecturers/researchers and learners alike: it takes additional effort to identify such inaccuracies, and it also puts the academic and professional careers of those involved at risk. Similarly, in health, business and many other sectors, chatbot-generated inaccuracies cause concern and stymie implementation in areas where the risk could be severe.
It has also become clear that these inaccuracies are inherent to the nature of GenAI as based on large language models (LLMs) and that they might not go away even as the technology improves. Some users have pinned their hopes on developing better prompts. One piece of advice from a well-known internet source suggested that the chances of inaccuracies can be limited by not asking the chatbot certain types of questions. There are indeed indications that some newer versions of chatbots produce fewer inaccuracies, and there are also differences between the popular chatbots.
In a paper that has just appeared in its final version in Business Horizons, a highly regarded academic journal, a group of scholars analyse the issue of inaccuracies and suggest a framework of ‘chatbot work’ that provides a way of understanding and mitigating the epistemic risks involved: T.R. Hannigan et al., ‘Beware of Botshit: How to manage the epistemic risks of generative chatbots’ (April 2024).
In developing this framework, the authors focus on the use to which a chatbot response is put and on the implications of inaccuracies in that context. From this, the required mitigation measures become clear.
Having identified the various types of epistemic risk in chatbot responses, the authors propose that chatbot work can be classified along two axes: response veracity importance (which can be crucial or unimportant) and response veracity verifiability. This yields a quadrant with four modes of chatbot work: authenticated chatbot work, augmented chatbot work, automated chatbot work and autonomous chatbot work. In the case of authenticated chatbot work, the output must be checked meticulously against other sources for factual accuracy, logical coherence and truthfulness. Relevant mitigation measures for the other modes in the quadrant are also identified. Once the user understands the framework, it effectively becomes a lens through which to view chatbot output and to take the necessary mitigation measures, where applicable.
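For readers who like to see the logic spelled out, the two-axis classification can be pictured as a simple decision table. The sketch below is purely illustrative and not taken from the paper: apart from authenticated chatbot work, which the article links to output whose veracity is crucial and must be checked against other sources, the quadrant assignments, names and example values here are assumptions made for this illustration only.

```python
# Illustrative sketch only: the quadrant assignments below (other than
# authenticated chatbot work) are assumptions, not the paper's definitive mapping.
from dataclasses import dataclass


@dataclass
class ChatbotTask:
    veracity_is_crucial: bool     # axis 1: response veracity importance
    veracity_is_verifiable: bool  # axis 2: response veracity verifiability


def classify_chatbot_work(task: ChatbotTask) -> str:
    """Map a task onto one of the four modes of chatbot work."""
    if task.veracity_is_crucial and not task.veracity_is_verifiable:
        # Per the article: output must be checked meticulously against other
        # sources for factual accuracy, logical coherence and truthfulness.
        return "authenticated chatbot work"
    if task.veracity_is_crucial and task.veracity_is_verifiable:
        return "augmented chatbot work"   # assumed quadrant placement
    if not task.veracity_is_crucial and task.veracity_is_verifiable:
        return "automated chatbot work"   # assumed quadrant placement
    return "autonomous chatbot work"      # assumed quadrant placement


# Hypothetical example: a task where accuracy is crucial but hard to verify.
print(classify_chatbot_work(ChatbotTask(veracity_is_crucial=True,
                                        veracity_is_verifiable=False)))
# -> authenticated chatbot work
```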
Of the various solutions proposed for dealing with chatbot-generated inaccuracies, this framework possibly provides the deepest understanding of their nature and implications, and the most workable approach to handling them.
For readers who do not have access to the article behind the paywall, the pre-publication version of the text is still freely available on the SSRN Generative AI Hub: https://www.ssrn.com/index.cfm/en/AI-GPT-3/?page=1&sort=0&term=botshit. The exceptionally high number of downloads of the pre-publication version from this hub over a period of three months is an indication of the interest in resolving the matter of inaccuracies in chatbot work.