A bit about hallucinations
Venkat Ramakrishnan
Chief Quality Officer | Software Testing Technologist | Keynote Speaker | Corporate Storyteller
While LLMs are hot, their hallucinations are stark. To a casual user of LLMs, these might seem like minor mistakes, pardonable the way we pardon human slips, more so since LLMs are so polite and human-like in their conversations these days. But those small slips could turn dangerous as we come to depend more and more on LLMs for our lives and work, and even more so when we automate our work (without human oversight) based on LLM outputs.
A while back, when the Baltimore bridge collapsed after a ship collided with one of its piers, I was tracking the incident online that evening in my time zone, and I asked one of the LLMs to give me more information about the Baltimore bridge. That LLM is connected to the internet in real time (unlike ChatGPT 3.5, whose information is not real-time). Initially, the LLM didn't seem to have any clue about the incident and said nothing about the collision. Eventually, two hours later, it mentioned that the bridge had collapsed, but gave a date of collapse three months in the past! Surprised, I asked it, 'Are you sure?' It then apologized and gave the correct date and time of the collapse.
It may not sound like a big issue, but imagine if no human had been overseeing the output and an automated script had taken that date and time as input to act on in some fashion, say, out of my fertile imagination, a shipping company's insurance processing. You can appreciate the financial and reputational distress that company would have been put through!
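To make that concrete, here is a minimal sketch of the kind of guardrail such a pipeline would need. Everything in it is assumed for illustration: query_llm stands in for a real LLM API call, and the seven-day plausibility window is an arbitrary choice for a breaking-news event. The point is simply that an LLM-supplied fact should never drive an action without an independent sanity check or a human fallback.

```python
from datetime import datetime, timedelta, timezone

def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "2024-01-02"  # imagine the model hallucinating a date months in the past

def parse_incident_date(answer: str) -> datetime:
    # Assumes the prompt asked for a bare ISO-8601 date (e.g. "2024-03-26").
    return datetime.fromisoformat(answer.strip()).replace(tzinfo=timezone.utc)

def plausible(event_date: datetime, max_age_days: int = 7) -> bool:
    """For breaking news, reject dates in the future or older than a few days."""
    now = datetime.now(timezone.utc)
    return now - timedelta(days=max_age_days) <= event_date <= now

answer = query_llm("When did the Baltimore bridge collapse? Reply with only an ISO date.")
incident_date = parse_incident_date(answer)
if plausible(incident_date):
    print(f"Proceed with automated processing for {incident_date.date()}")
else:
    print(f"Implausible date {incident_date.date()}: route to a human reviewer")
```

Even a check this crude would have caught the three-months-old date from my Baltimore example and routed it to a human, instead of straight into an insurance workflow.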
This is an example of an LLM hallucination. It is one type; there are many others. I am not kidding when I say that organisations are relying more and more on automation driven by LLMs' outputs. We should be really worried about the quality of these outputs. As a Software Tester and someone who cares about Quality, I am.
Language processing is not simple. I was casually going through the various Python libraries for text processing, and I was not at all impressed by the quality of their outputs. I appreciate and deeply respect the effort that has gone into these open-source projects, many of them built by top-notch university departments and their students, but the results are very disappointing.
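To give a flavour of why the task is hard (this is not a test of any particular library), here is a naive pure-Python sentence splitter of the sort that lurks inside many text pipelines; the sample sentence is my own. Ordinary abbreviations alone are enough to break it.

```python
import re

def naive_sentence_split(text: str) -> list[str]:
    # Split wherever '.', '!' or '?' is followed by whitespace.
    return re.split(r"(?<=[.!?])\s+", text)

text = "Dr. Smith arrived at 5 p.m. on Mar. 26. The bridge had already collapsed."
for sentence in naive_sentence_split(text):
    print(repr(sentence))

# Two sentences come out as five fragments, because 'Dr.', 'p.m.' and 'Mar.'
# all look like sentence endings to the regex:
#   'Dr.'
#   'Smith arrived at 5 p.m.'
#   'on Mar.'
#   '26.'
#   'The bridge had already collapsed.'
```

Real libraries do better than this, of course, but the long tail of abbreviations, quotations, numbers and multilingual text is exactly where their output quality still falls short.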
I am actively working on how to expose these kinds of problems, and I would be glad to join forces with others who are doing the same. If you are interested, give me a buzz and let's talk about it.