Mitigating the Risks of Hallucination in Generative AI

There is no doubt that generative AI delivers impressive results, especially when it comes to democratizing access to information. But as the public grows more comfortable trusting the information generated by large language models, some risks need to be carefully managed. One risk we safeguard against at Crediture is hallucination: when an AI generates convincing but false or misleading information.

Unlike humans, generative AIs do not have an actual understanding of facts and common sense. Their knowledge comes from their training data, which can't cover every possible real-world scenario. So when large language models are asked a new question outside of their training data, they can "hallucinate" a plausible-sounding but false response.

For example, imagine asking an AI financial advisor, "What stocks should I invest in to maximize my retirement income?" The AI might generate a list of company names and financial details that seem credible but are entirely fabricated. Even worse, the AI has no way of knowing that its advice is hallucinated rather than grounded in real data.

If deployed irresponsibly, generative AI has the potential to automate the mass production of false expert advice at unprecedented scale. This could even lead to the misdirection of capital through false predictions. If an AI hallucinates market trends rather than analyzing real signals, it could recommend investments that appear poised to succeed but are doomed to fail, and billions in collective wealth could vanish if users act on that guidance. Even worse, the AI could generate advice that circumvents laws and regulations, recommending illegal tax evasion schemes, securities violations, and the like. Users who follow such advice could end up in legal jeopardy.

Without safeguards, AI hallucination could result in users receiving and acting upon false recommendations with devastating consequences. Although this article focuses on generative financial advice, the same issue applies to many other domains, such as legal and healthcare advice.

Implementing Safeguards Against AI Hallucination

The good news is that there are steps that can be taken to prevent the dangers of unchecked AI hallucination when it comes to financial advice. Hopefully, sharing some of our strategies will inspire other AI-powered companies to help make this amazing technology safer for all of us. At Crediture, this comes down to three pillars: constraining prompts, rigorous testing, and evaluating response confidence scores.

First, we build a constrained interface that narrows the domain of allowed prompts, which reduces the opportunities for the system to generate false information. Our conversational AI exclusively accepts prompts related to common personal finance goals, such as "How do I eliminate my debt?" or "How can I budget to save for retirement?", and we avoid open-ended prompts that could invite hallucination. We restrict output to responses that reference certified financial products and instruments within the AI's verified knowledge base, rather than searching the open web for any available product, and we steer user prompts away from tax and legal advice. Additionally, we have been experimenting with substituting free generation with pre-written paragraphs and content blocks that have been legally reviewed and fact-checked; the AI then customizes these templates to fit the user's context. A simplified sketch of this kind of prompt gating appears below.
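
To make the idea concrete, here is a minimal sketch of how a prompt gate along these lines could work. The topic lists, the GateResult type, and the gate_prompt helper are illustrative assumptions for this post, not Crediture's production implementation, which would use a proper intent classifier rather than keyword matching.

```python
# Minimal sketch of a constrained prompt gate (illustrative only).
# Assumes simple keyword lists; a real system would use an intent classifier.
from dataclasses import dataclass

ALLOWED_TOPICS = {"debt", "budget", "retirement", "savings", "emergency fund"}
BLOCKED_TOPICS = {"tax", "legal", "lawsuit", "loophole"}  # out-of-scope areas


@dataclass
class GateResult:
    allowed: bool
    reason: str


def gate_prompt(prompt: str) -> GateResult:
    """Allow only prompts scoped to supported personal-finance goals."""
    text = prompt.lower()
    if any(term in text for term in BLOCKED_TOPICS):
        return GateResult(False, "Tax and legal questions are out of scope.")
    if not any(term in text for term in ALLOWED_TOPICS):
        return GateResult(False, "Please ask about a supported personal-finance goal.")
    return GateResult(True, "ok")


if __name__ == "__main__":
    print(gate_prompt("How do I eliminate my debt?"))            # allowed
    print(gate_prompt("Is there a tax loophole I should use?"))  # blocked
```

The design choice here is to fail closed: anything that cannot be positively matched to a supported goal is rejected with a request to rephrase, rather than passed through to free generation.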

Second, we have implemented testing that challenges the AI with edge cases and nonsense inputs to detect hallucination issues prior to deployment. Rigorously testing generative AI systems before deployment reveals problems that might otherwise go unnoticed. We apply edge-case testing by entering prompts that represent rare scenarios where the AI's knowledge may be stretched too thin, for example, prompts about complex derivative products or niche investment scenarios the AI may not have seen during training. If the output still seems plausible rather than the AI acknowledging that it does not have enough information, that indicates a higher risk of hallucination. We also apply invalid-input testing by feeding the AI nonsense inputs, such as random word combinations or grammatically incorrect prompts; a human expert would refuse to answer these, so if the AI generates verbose, seemingly on-topic responses to nonsense, it demonstrates an over-eagerness to produce possibly false information. Lastly, we apply invariant testing: the same prompt is rephrased in several different ways that should not alter the fundamental information, and if the AI generates contradictory responses to these restatements, that is another sign of hallucination issues. A simplified sketch of an invariance check follows this paragraph.
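
Below is a minimal sketch of what a paraphrase-invariance check could look like. The generate() wrapper is a placeholder for a call to the model, and the number-based claim extraction is a deliberately crude stand-in for comparing responses; neither represents our actual test harness.

```python
# Minimal sketch of invariant (paraphrase-consistency) testing (illustrative only).
import re


def generate(prompt: str) -> str:
    """Placeholder for a call to the underlying generative model."""
    raise NotImplementedError("wire in the model call here")


def extract_claims(response: str) -> set[str]:
    """Crude claim extraction: pull out numbers, dollar amounts, and percentages."""
    return set(re.findall(r"\$?\d[\d,.]*%?", response))


PARAPHRASES = [
    "How much should I save each month to retire at 65?",
    "To retire at 65, what monthly savings amount do I need?",
    "What monthly amount should I put aside to retire at age 65?",
]


def check_paraphrase_invariance(paraphrases: list[str]) -> bool:
    """Return True only if every rephrasing yields the same extracted figures."""
    claim_sets = [extract_claims(generate(p)) for p in paraphrases]
    return all(claims == claim_sets[0] for claims in claim_sets[1:])
```

If the check fails, the prompt family is flagged for review before the model is allowed to answer that class of question in production.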

Third, we evaluate the confidence scores of generated text to screen out low-certainty responses. In addition to the generated text itself, generative AI systems can output metadata about how confident the model is in its output. Evaluating these confidence scores and filtering out lower-certainty generations, which may be more prone to hallucination, helps protect our users. Typically, two main types of confidence score can be evaluated: token-level and prompt-level. With token-level scores, the model assigns a score between 0 and 1 to each word in the generated text, representing its confidence that the word belongs in that position; very low scores indicate uncertainty about a particular word and are flagged for our internal review. With prompt-level scores, the model outputs an aggregate score between 0 and 1 reflecting its overall confidence in the accuracy of the entire text generated for a given prompt. We completely hide generations whose prompt-level confidence falls below a specific threshold and instead ask the user to rephrase the prompt and/or provide more specific information. A simplified sketch of this screening logic appears below.
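
Here is a minimal sketch of confidence-based screening, assuming the model API returns per-token log-probabilities, as many LLM APIs do. The thresholds, the geometric-mean aggregation, and the queue_for_internal_review hook are hypothetical choices for illustration, not Crediture's production values or code.

```python
# Minimal sketch of confidence-based screening (illustrative only).
import math
from typing import Optional

PROMPT_LEVEL_THRESHOLD = 0.80  # hypothetical cutoff for the aggregate score
TOKEN_LEVEL_THRESHOLD = 0.30   # hypothetical cutoff for flagging single tokens


def prompt_confidence(token_logprobs: list[float]) -> float:
    """Aggregate token log-probs into a 0-1 prompt-level score (geometric mean)."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def queue_for_internal_review(text: str, weak_positions: list[int]) -> None:
    """Placeholder for routing low-confidence tokens to human reviewers."""
    pass


def screen_generation(text: str, token_logprobs: list[float]) -> Optional[str]:
    """Return the text only if it clears the prompt-level threshold; flag weak tokens."""
    if prompt_confidence(token_logprobs) < PROMPT_LEVEL_THRESHOLD:
        return None  # hide the generation and ask the user to rephrase
    weak = [i for i, lp in enumerate(token_logprobs)
            if math.exp(lp) < TOKEN_LEVEL_THRESHOLD]
    if weak:
        queue_for_internal_review(text, weak)
    return text
```

Returning None rather than a hedged answer is intentional in this sketch: a low-confidence generation is withheld entirely, and the user is prompted to rephrase or add detail.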

All three of these measures help keep our users safe from potentially harmful advice while still giving them access to highly personalized financial guidance.

At Crediture, we're excited about the tremendous upside of leveraging AI to expand access to financial expertise. The risks of generative models should not discourage us from innovation but rather motivate us to take greater care and responsibility. Financial guidance is deeply personal, so we must approach it with careful consideration. By implementing rigorous safeguards like constrained interfaces, thorough testing for edge cases, and careful evaluation of confidence scores, we can mitigate these risks.

Follow Crediture's LinkedIn Page to learn more and keep up with our latest advancements. #Crediture
