LLM Hallucinations or much worse?
On Friday, Sep 20, during the last day of DecisionCAMP (https://dmcommunity.org/2024/09/20/decisioncamp-day-3-presentations-and-recordings/), we had an interesting presentation by Bruce Silver, "DMN Solver/Helper for Sudoku". Before submitting his presentation, Bruce told me: "I know, Jacob, that your constraint solvers can easily handle Sudoku problems, but I wanted to check how DMN FEEL will be able to support different complex patterns". I knew that Bruce is not only a big enthusiast of the DMN standard but also a great presenter and trainer, and I encouraged him to submit his presentation. His presentation was a success (you can watch it at https://www.youtube.com/watch?v=9NUAMqGeMZ0).
During the Q&A, the question of LLMs solving Sudoku came up. So I simply copied a Sudoku image from Bruce's presentation and decided to give it to Copilot while we talked about it. I had already had a positive experience with Copilot easily handling table images; see, for instance, https://dmcommunity.org/wp-content/uploads/2024/09/challenge2024sep.copilot.pdf.
So I had a brief dialog with Copilot, presented in the top image. The solution came back almost immediately. Then I asked, "Can you solve Sudoku of any complexity?" The answer was: "Yes, I can help you solve Sudoku puzzles of any complexity! Whether it’s an easy, medium, hard, or even an expert-level puzzle, I’m here to assist. If you have a specific Sudoku puzzle you’d like help with, feel free to share the details or an image of the puzzle, and I’ll guide you through the solution."
It was impressive, and I shared these results with Bruce and the DecisionCAMP audience as proof of the power of LLMs. It confirmed my experience that LLMs can handle certain logical problems that are usually not difficult for more traditional symbolic AI technologies such as constraint solvers.
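For readers who have not worked with constraint solvers, here is why Sudoku is an easy problem for them: the whole puzzle is just nine rows, nine columns, and nine 3x3 boxes that must each contain distinct digits, plus the given clues. The sketch below uses Google OR-Tools CP-SAT purely for illustration; the solver choice, function name, and the 0-for-blank convention are my assumptions, not something prescribed in this post.

```python
# A rough sketch of Sudoku as a constraint model, using Google OR-Tools CP-SAT
# (illustrative choice of solver; the post does not name a specific one).
from ortools.sat.python import cp_model

def solve_sudoku(puzzle):
    """puzzle: 9x9 list of ints, 0 for blanks. Returns a solved grid or None."""
    model = cp_model.CpModel()

    # One integer variable per cell, each in 1..9.
    cells = [[model.NewIntVar(1, 9, f"c{r}{c}") for c in range(9)] for r in range(9)]

    # Fix the given clues.
    for r in range(9):
        for c in range(9):
            if puzzle[r][c] != 0:
                model.Add(cells[r][c] == puzzle[r][c])

    # Rows, columns, and 3x3 boxes must contain distinct digits.
    for i in range(9):
        model.AddAllDifferent(cells[i])                          # row i
        model.AddAllDifferent([cells[r][i] for r in range(9)])   # column i
    for br in range(0, 9, 3):
        for bc in range(0, 9, 3):
            model.AddAllDifferent(
                [cells[br + dr][bc + dc] for dr in range(3) for dc in range(3)]
            )

    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
        return [[solver.Value(cells[r][c]) for c in range(9)] for r in range(9)]
    return None
```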
After a few minutes, I started to receive these messages in the chat:
====================================================
Gary Hallmark: Jacob, that sudoku "solution" is quite an hallucination!
Me: From a quick glance I don't see any errors. Do you?
Rimantas Zukaitis: The numbers which were in original sudoku are not the same
Me: WOW! What a cheater...
Rimantas Zukaitis: it just gave a random solution, not related to the task at hand. well, its just a prediction - chances are this is most common sudoku solution it found
====================================================
After that, I went back to Copilot and insisted on this particular problem, but instead of a solution or an explanation, Copilot started to produce carefully evasive sentences and switched the topic.
My initial enthusiasm ended in serious frustration. Was it a hallucination caused by the statistical nature of LLMs? Or was it an intentional problem substitution? Note that the first number in the top-left corner (where a regular person would look first) was "3" in both cases.
In my DecisionCAMP Closing Remarks (https://www.youtube.com/watch?v=F4spCC9Bh7c) I spoke about the necessity of always including rules-based Sanity Checkers in our end-to-end decisioning processes. Experienced people did this for mission-critical applications long before LLMs were born. However, we could never have imagined that a decision engine might replace our input data, as Copilot did in this case.
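Such a sanity checker does not need to be sophisticated. Here is a minimal sketch of what one might look like for this Sudoku case (the function name, grid representation, and 0-for-blank convention are my own illustrative assumptions): it first verifies that the proposed solution preserves every given clue, exactly the check that caught Copilot here, and then verifies that every row, column, and 3x3 box is a permutation of 1-9.

```python
# Minimal sanity checker for an LLM-proposed Sudoku solution (illustrative sketch).
# Assumes puzzle and solution are 9x9 lists of ints, with 0 marking empty puzzle cells.

def check_sudoku_solution(puzzle, solution):
    """Return a list of problems; an empty list means the solution passes."""
    problems = []

    # 1. The solver must not alter any of the given clues.
    for r in range(9):
        for c in range(9):
            given = puzzle[r][c]
            if given != 0 and solution[r][c] != given:
                problems.append(
                    f"Clue changed at row {r + 1}, col {c + 1}: "
                    f"{given} -> {solution[r][c]}"
                )

    # 2. Every row, column, and 3x3 box must contain the digits 1-9 exactly once.
    def check_group(values, label):
        if sorted(values) != list(range(1, 10)):
            problems.append(f"{label} is not a permutation of 1-9: {values}")

    for i in range(9):
        check_group([solution[i][c] for c in range(9)], f"Row {i + 1}")
        check_group([solution[r][i] for r in range(9)], f"Column {i + 1}")
    for br in range(0, 9, 3):
        for bc in range(0, 9, 3):
            box = [solution[br + dr][bc + dc] for dr in range(3) for dc in range(3)]
            check_group(box, f"Box at rows {br + 1}-{br + 3}, cols {bc + 1}-{bc + 3}")

    return problems
```

The first check alone would have flagged Copilot's answer immediately, because the "solution" no longer matched the original givens.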
I wonder what other people think: was it an unintentional random hallucination or something much worse?