When AI lies: Detecting hallucinations in Gen AI
Vinay Ananth R.
Empowering businesses with innovative solutions | Sales | Generative AI & ML | IoT/ IIoT | Cloud | Presales | Product Owner
Have you ever asked generative AI a straightforward question, only to receive a wildly inaccurate or even downright bizarre answer?
Maybe you requested a text summary, but the AI confidently fabricated details that weren't in the original text. Or maybe you asked for technical advice, only to receive instructions that were completely wrong.
These errors—known as "hallucinations"—happen when AI generates information that isn't grounded in its training data.
Hallucinations aren't always bad. In creative tasks, they can be a feature, not a bug—improvising fictional characters or dreaming up imaginative stories. But in high-stakes areas like healthcare, finance, or legal services, hallucinations can be dangerous—sometimes even life-threatening.
As Generative AI becomes more embedded in application development, it's crucial to understand why hallucinations happen and how to manage them effectively.
Let's dive in:
What makes LLMs hallucinate?
To understand hallucinations, it's important to first understand how large language models (LLMs) generate content. Let's use an example: text generation.
LLMs predict the next word in a sentence based on patterns from their training data. LLMs are trained on massive amounts of text data—such as books, websites, and articles—to learn patterns, grammar, and word relationships.
Think of it like completing a sentence in a conversation. For example, if someone asks you to complete the sentence “The cat is ...”, you might automatically think of words like “sleeping” or “running.”
LLMs work similarly. They are very good at predicting what comes next based on patterns in the data they’ve seen before. This allows them to generate anything from essays and poems to code and answers to questions.
But, unlike you, they have no actual understanding—just pattern recognition. That's why they can sometimes guess convincingly, but incorrectly.
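To make this concrete, here is a toy, purely illustrative sketch in Python. The words and probabilities are invented; a real LLM scores tens of thousands of candidate tokens, but the principle is the same:

```python
import random

# Hypothetical next-word probabilities a model might assign after seeing
# the prompt "The cat is ..." (illustrative numbers only).
next_word_probs = {
    "sleeping": 0.40,
    "running": 0.25,
    "hungry": 0.20,
    "outside": 0.10,
    "purple": 0.05,  # unlikely continuations still get some probability mass
}

# The model samples one continuation according to these weights.
words, weights = zip(*next_word_probs.items())
print(random.choices(words, weights=weights, k=1)[0])
```

The program above has no concept of a cat; it only has numbers attached to words. That is exactly why a confident-looking continuation can still be wrong.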
How to control hallucinations
One effective way to manage hallucinations is by fine-tuning the temperature parameter when generating content. The temperature parameter influences the randomness of an AI model's predictions. It determines how the model weighs probabilities when selecting the next word.
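A minimal sketch of the mechanics (the candidate words and raw scores are made up): temperature divides the model's raw scores, called logits, before they are converted into probabilities. A low temperature concentrates probability on the most likely word, while a high temperature flattens the distribution and invites more surprising choices.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for candidate next words after "The cat is ..."
words = ["sleeping", "running", "purple"]
logits = [3.0, 2.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: " + ", ".join(f"{w}={p:.2f}" for w, p in zip(words, probs)))
```

At temperature 0.2 nearly all of the probability lands on "sleeping"; at 2.0 the three options become much closer, which is where creative, and occasionally hallucinated, outputs come from.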
Through its Amazon Bedrock service, AWS has introduced tools that make it easy to build and scale generative AI applications using foundation models from providers like Anthropic and Cohere.
With Bedrock, you can easily configure parameters, including temperature, to fine-tune model behavior and control output quality, tailoring responses to your application's needs.
To see this in action, let's look at how Bedrock responds to the same prompt at different temperature values; a minimal sketch of the call is shown below.
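Here is a minimal sketch using the Bedrock Converse API via boto3. The model ID and prompt are placeholders; substitute a model that is enabled in your own account and region:

```python
import boto3

# Minimal sketch: send the same prompt at two temperature settings.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

prompt = "Summarize the key risks of offering a personal loan in two sentences."

for temperature in (0.1, 0.9):
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": temperature, "maxTokens": 256},
    )
    text = response["output"]["message"]["content"][0]["text"]
    print(f"--- temperature={temperature} ---\n{text}\n")
```

At the low temperature you will typically see near-identical, conservative summaries across runs; at the high temperature the wording and emphasis vary far more.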
Adjusting the temperature with Amazon Bedrock lets you balance creativity and accuracy, making it easier to generate outputs suited to your application's needs, whether you're building a chatbot or generating marketing copy.
Factual hallucination is misinformation
But what if LLMs generate content that is presented as factual, but is false or inaccurate?
These errors are commonly referred to as "factual hallucinations." A factual hallucination happens because the model predicts based on patterns in its training data without verifying the truth.
Let’s look at an example where these factual errors can be dangerous.
Think of an organization that wants to use a customer support chatbot to enhance its workflow. Would you trust this chatbot if it occasionally responded with made-up information? The answer would be no.
However, we can implement a system that ensures the chatbot provides correct information and prevents hallucinations in its responses.
How AWS helps safeguard Generative AI outputs
Amazon Bedrock Guardrails is designed to enhance the reliability, accuracy, and compliance of generative AI applications by applying configurable safeguards to model inputs and outputs.
At re:Invent 2024, AWS announced the launch of automated reasoning checks in Amazon Bedrock Guardrails. These checks allow organizations to mitigate hallucinations by automatically evaluating the outputs of generative AI models against predefined rules and constraints. Unlike Retrieval-Augmented Generation (RAG), which relies on fetching data from trusted sources, automated reasoning checks ensure that outputs strictly adhere to predefined policies, domain rules, or regulatory guidelines.
How do reasoning checks work?
Reasoning is the cognitive process of drawing conclusions, solving problems, or making decisions based on evidence, logic, or information. It involves analyzing and evaluating information, identifying relationships, and applying critical thinking to derive insights or solutions.
Automated reasoning detects hallucinations by using a mathematical framework to validate the information generated by the model: reasoning techniques analyze the outputs of LLMs against established rules or logical frameworks to identify inconsistencies.
Reasoning capabilities can cross-check the generated content with known facts or logical deductions, effectively filtering out hallucinations before they reach the user.
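As a deliberately simplified stand-in (Amazon Bedrock uses formal logic solvers internally, not hand-written checks like this), the core idea is to test structured claims from the model's answer against explicit rules:

```python
# Toy illustration of rule-based validation of an LLM's claim.
# The rule names and thresholds here are hypothetical.
RULES = {
    "min_credit_score": 700,
    "max_debt_to_income": 0.40,
}

def claim_is_consistent(claim: dict) -> bool:
    """Return True if the model's 'eligible' claim follows from the rules."""
    implied = (
        claim["credit_score"] >= RULES["min_credit_score"]
        and claim["debt_to_income"] <= RULES["max_debt_to_income"]
    )
    return claim["eligible"] == implied

# Suppose the chatbot asserted this applicant is eligible:
model_claim = {"credit_score": 640, "debt_to_income": 0.35, "eligible": True}

if not claim_is_consistent(model_claim):
    print("Hallucination flagged: the answer contradicts the policy rules.")
```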
Automated reasoning checks in AWS
To use automated reasoning checks in AWS, you create an automated reasoning policy, attach it to a guardrail, and then validate model outputs against it; the example later in this article walks through these steps.
Use cases of automated reasoning checks
Automated reasoning checks are particularly valuable for use cases requiring factual correctness, such as healthcare, financial services, legal advice, and regulatory compliance.
Let's see what all this looks like in action.
Example: Chatbot for Loan Applications
Let’s look at an example of deploying a customer support chatbot for a financial services company. The chatbot decides the applicant’s loan eligibility based on predefined rules.
Step 1: Create an automated reasoning policy—Rules
You can start by uploading the policy document to Amazon Bedrock Guardrails. Amazon Bedrock will analyze the document and generate an automated reasoning policy.
An automated reasoning policy consists of variables, each defined by a name, type, and description, along with logical rules that operate on those variables.
For our example, let’s suppose the policy defines variables such as the applicant’s credit score, annual income, and requested loan amount, each with a type and description. Using the policy document, Bedrock also expresses the eligibility criteria as logical rules over these variables (a hypothetical sketch follows after this step).
The chatbot will initially assess the loan application’s eligibility based on these rules. As a domain expert, you can refine the policy by updating the rules in natural language, without needing formal-logic expertise.
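For illustration only, the variable names, types, and thresholds below are hypothetical and not taken from an actual Bedrock policy; they simply show the shape of what gets extracted:

```python
# Hypothetical policy variables Bedrock might extract from the loan document.
policy_variables = {
    "credit_score":  {"type": "integer", "description": "Applicant's credit score"},
    "annual_income": {"type": "integer", "description": "Applicant's annual income in USD"},
    "loan_amount":   {"type": "integer", "description": "Requested loan amount in USD"},
    "is_eligible":   {"type": "boolean", "description": "Whether the applicant qualifies"},
}

# Hypothetical logical rules derived from the policy document.
policy_rules = [
    "is_eligible IMPLIES credit_score >= 700",
    "is_eligible IMPLIES loan_amount <= 5 * annual_income",
]
```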
Step 2: Configure automated reasoning checks in Amazon Bedrock Guardrails
Once the policy is created, you can configure it within Amazon Bedrock Guardrails: enable the automated reasoning check and select the policy to use. Amazon Bedrock Guardrails will then apply automated reasoning to validate every chatbot response.
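A minimal sketch of attaching the guardrail to a Bedrock Converse call via boto3; the guardrail ID, guardrail version, and model ID are placeholders for the values created in the earlier steps:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholder identifiers: use the guardrail ID/version created in Step 2.
GUARDRAIL_ID = "your-guardrail-id"
GUARDRAIL_VERSION = "1"

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{
        "role": "user",
        "content": [{"text": "My credit score is 640. Am I eligible for a $50,000 loan?"}],
    }],
    guardrailConfig={
        "guardrailIdentifier": GUARDRAIL_ID,
        "guardrailVersion": GUARDRAIL_VERSION,
        "trace": "enabled",  # include the guardrail's assessment in the response
    },
)

# If the guardrail intervenes, the response contains its configured message
# instead of the raw model output.
print(response["output"]["message"]["content"][0]["text"])
```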
Step 3: Validate and correct LLM answers
After configuring the guardrails, you can test the automated reasoning checks to validate the policy’s effectiveness. Amazon Bedrock provides a test playground where you can simulate real-world scenarios by inputting sample prompts and reviewing the outputs.
If any output violates the reasoning policy or contains inaccuracies, the system flags the issue and suggests corrections. This feedback loop helps domain experts refine the policy or adjust the model’s behavior to ensure compliance.
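Outside the console playground, you can run the same kind of spot check programmatically. This sketch assumes the bedrock-runtime ApplyGuardrail API and uses placeholder identifiers and a made-up model response:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Validate a single candidate response against the guardrail created earlier.
result = bedrock.apply_guardrail(
    guardrailIdentifier="your-guardrail-id",  # placeholder
    guardrailVersion="1",
    source="OUTPUT",  # we are checking a model response, not a user input
    content=[{"text": {"text": "Yes, with a credit score of 640 you are approved for the loan."}}],
)

# 'GUARDRAIL_INTERVENED' means the response was blocked or rewritten;
# the assessments explain which checks were triggered.
print(result["action"])
for assessment in result.get("assessments", []):
    print(assessment)
```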
Hallucinations aren’t always bad—they’re what make AI creative. But when accuracy matters (like in healthcare or finance), they can cause real harm.
That’s where AWS automated reasoning checks come in. Using formal verification, AWS ensures AI systems behave as intended, making it possible to build safer, more reliable AI for high-stakes use cases like legal advice, healthcare, and financial services.