Hallucination In AI Models

Hallucination in Natural Language Generation:

Hallucination occurs when a language model generates text that fits a given context or prompt but introduces details or nuances that are not supported by its input or training data. The model extrapolates beyond what it has seen, producing responses that read as plausible and contextually coherent even though they are not grounded in any source.

  • Example: Given the input "The cat sat on the mat," the model might continue with "and started playing with a ball nearby," even though it has never seen that exact continuation during training. The continuation is plausible given its understanding of language and context, but it is invented rather than grounded in the input.

Challenges: Evaluating hallucination is difficult because it requires assessing whether generated outputs are not only grammatically correct but also semantically coherent and contextually appropriate. Models must strike a balance between generating novel responses and staying faithful to the input context.

Solution using Cross-Encoders

Contextual Understanding: Cross-encoders can aid in evaluating hallucination by providing a more holistic view of the relationship between input and output sequences. By jointly encoding both the input context and the generated output (or response), cross-encoders can better capture whether the generated text fits the given context.
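Before turning to the dedicated HHEM model below, here is a minimal sketch of the general idea: the source text and the generated text are fed to a cross-encoder as a single pair, and the joint encoding yields a judgment about whether the output is supported by the input. The cross-encoder/nli-deberta-v3-base checkpoint and its label ordering are assumptions taken from its public model card; any NLI-style cross-encoder would work the same way.

from sentence_transformers.cross_encoder import CrossEncoder

# Assumed checkpoint: any NLI-style cross-encoder can be substituted here.
nli_model = CrossEncoder('cross-encoder/nli-deberta-v3-base')

source = "The cat sat on the mat."
generated = "The cat started playing with a ball nearby."

# The pair is encoded jointly, so the model sees input and output together.
logits = nli_model.predict([(source, generated)])

# Label order per the model card: contradiction, entailment, neutral.
label_mapping = ['contradiction', 'entailment', 'neutral']
print(label_mapping[logits[0].argmax()])

A high entailment score suggests the generated text is grounded in the source; a high neutral or contradiction score is a signal of possible hallucination.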

The HHEM model, developed by Vectara, is an open-source tool designed to identify hallucinations in the output of large language models (LLMs). It is especially useful in retrieval-augmented generation (RAG) applications, where an LLM condenses a collection of retrieved facts into a summary, but it is versatile and applicable beyond RAG scenarios.

The model was trained using the CrossEncoder class from SentenceTransformers. It produces a probability score between 0 and 1, where 0 indicates hallucination and 1 indicates factual consistency. Applying a threshold of 0.5 to this score turns the prediction into a binary judgment of whether a generated document is consistent with its source.


from sentence_transformers.cross_encoder import CrossEncoder

# Load Vectara's hallucination evaluation model (HHEM) as a cross-encoder.
model = CrossEncoder('vectara/hallucination_evaluation_model')

# Each pair is (source text, generated claim); the model returns a
# factual-consistency score between 0 (hallucination) and 1 (consistent).
scores = model.predict([
    ["A man walks into a bar and buys a drink", "A bloke swigs alcohol at a pub"],
    ["A person on a horse jumps over a broken down airplane.", "A person is at a diner, ordering an omelette."],
    ["A person on a horse jumps over a broken down airplane.", "A person is outdoors, on a horse."],
    ["A boy is jumping on skateboard in the middle of a red bridge.",
     "The boy skates down the sidewalk on a blue bridge"],
    ["A man with blond-hair, and a brown shirt drinking out of a public water fountain.",
     "A blond drinking water in public."],
    ["A man with blond-hair, and a brown shirt drinking out of a public water fountain.",
     "A blond man wearing a brown shirt is reading a book."],
    ["Mark Wahlberg was a fan of Manny.", "Manny was a fan of Mark Wahlberg."],
])

# One score per pair, in the same order as the input.
print(scores)
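
As a small usage note (not part of the original snippet), the 0.5 threshold described above can be applied directly to the returned scores array to obtain a binary consistent-versus-hallucinated decision for each pair:

# scores is the NumPy array returned by model.predict() above.
threshold = 0.5
is_consistent = scores >= threshold  # True = factually consistent, False = likely hallucination
print(is_consistent)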
