Exploring the Chain-of-Thought & Reasoning in ChatGPT o1 Preview

With the introduction of ChatGPT o1 Preview and its new Chain-of-Thought feature, AI has made strides in approaching problem-solving in a way that mirrors human reasoning. This feature allows the AI to break big problems down into smaller, logical steps, improving the accuracy and consistency of its solutions.

To explore the chain-of-thought and reasoning features, I decided to test how ChatGPT o1 approaches a simple, straightforward task: an arithmetic sequence problem. I partnered with my father, C.P. Rajagopalan, a retired engineer from ONGC and a lifelong learner with a deep love for logical reasoning and math.

When faced with a simple arithmetic problem, most people rely on intuition and pattern recognition. For instance, if asked to find numbers in the sequence 100, 97, 94, 91, 88, …, we would likely notice the pattern and subtract multiples of 3 from 100 to quickly check which numbers fit the sequence. The task feels effortless because we instinctively recognize the pattern.
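To make that shortcut explicit, here is a minimal Python sketch of the mod-3 check a person performs implicitly. The candidate values are illustrative, not the ones from our test, and the sketch assumes the sequence continues indefinitely below 100:

```python
def in_sequence(x: int) -> bool:
    # x is a term of 100, 97, 94, 91, 88, ... exactly when it does not
    # exceed 100 and differs from 100 by a multiple of 3.
    return x <= 100 and (100 - x) % 3 == 0

# Illustrative candidates, not the actual test numbers:
for x in (52, 51, 1, -2, 103):
    print(x, in_sequence(x))
```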

But how does ChatGPT o1, a sophisticated AI model designed to reason through problems, approach the same task? Would it solve the problem in the same way, or would it take a different route?

We provided it with two scenarios:

  1. Without multiple-choice options: The AI had to determine what numbers appeared in the sequence without predefined choices.
  2. With multiple-choice options: The AI was provided with a set of options and had to verify whether each number was part of the sequence.

Without Multiple-Choice Options (23 seconds)

In the first scenario, where no options were provided, ChatGPT o1 Preview took 23 seconds to solve the problem. Here’s a breakdown of its reasoning process:

Chain of Thought - without the multiple-choice options
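The screenshot of the model’s working is not reproduced here. The derivation below is our reconstruction of the kind of general-term formula the post describes the model deriving, not a verbatim transcript of its chain of thought:

```latex
a_1 = 100, \quad d = -3, \qquad
a_n = a_1 + (n - 1)d = 100 - 3(n - 1) = 103 - 3n
```

A candidate $x$ is then a term of the sequence exactly when $n = (103 - x)/3$ works out to a positive integer.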


With Multiple-Choice Options (5 seconds)

When given multiple-choice options, ChatGPT o1 Preview completed the task in 5 seconds. Here’s how it approached the task with predefined options:


Chain of Thought - with the multiple-choice options
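Again, the screenshot is not reproduced, and the post does not list the actual options, so the candidates below are hypothetical. The sketch simply illustrates why a fixed option list is quick to dispatch: the same membership test runs once per option.

```python
def in_sequence(x: int) -> bool:
    # Same test as above: a term of 100, 97, 94, ... is at most 100
    # and differs from 100 by a multiple of 3.
    return x <= 100 and (100 - x) % 3 == 0

options = [88, 65, 52, 30, -5]  # hypothetical options, not the ones we used
for x in options:
    verdict = "in the sequence" if in_sequence(x) else "not in the sequence"
    print(f"{x}: {verdict}")
```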

Key Observations from the Experiment

  1. Consistency in Problem-Solving: ChatGPT o1 Preview displayed consistency in both tasks, following a structured approach: identifying the sequence’s pattern, deriving a formula, and using that formula to check the numbers. This approach ensured accuracy and demonstrated the strength of the Chain-of-Thought feature in enabling the AI to reason logically and reliably across different tasks.
  2. Task Simplicity and Adaptation: While ChatGPT o1 Preview’s method was thorough, it didn’t take the intuitive shortcut a human might use for a task like this; we would reason that those extra steps aren’t necessary to ensure accuracy.

We would likely have used pattern recognition, subtracting multiples of 3 from 100 to quickly determine whether 52 (or any other number among the multiple-choice options) fits the sequence.
ChatGPT o1, however, applied a method that guaranteed correctness and generality, ensuring its solution works for both simple and complex tasks.
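To make the contrast concrete, here is a sketch of the formula-driven route the post attributes to the model, as opposed to the mod-3 shortcut above. It uses the general term a_n = 103 - 3n from the reconstruction earlier and assumes the sequence continues indefinitely; it illustrates the method, not the model’s actual output:

```python
def term_index(x: int):
    """Solve 103 - 3n = x for n; return n if it is a positive integer, else None."""
    n, remainder = divmod(103 - x, 3)
    return n if remainder == 0 and n >= 1 else None

# The formula answers both "is x a term?" and "at which position?"
print(term_index(52))  # 17   -> 52 is the 17th term
print(term_index(50))  # None -> 50 is not a term
```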

This approach doesn't reflect a limitation of the AI, but rather its design. ChatGPT o1 Preview is built to prioritize accuracy and generality, meaning it applies a structured solution that works universally.

The Central Question: Is ChatGPT o1 Preview Truly Reasoning?

I think it is, but it does so in a structured, logical manner; it is essentially doing what it was trained to do. The AI used its Chain-of-Thought feature to break the problem into manageable steps, identify the sequence pattern, derive a general formula, and apply that formula to specific numbers. This process reflects a reasoning approach that ensures accurate solutions regardless of task complexity.

Unlike humans, who might adapt their methods based on how simple a task is, ChatGPT o1 Preview applies the same level of rigor and consistency to all problems. This ensures reliable answers but means the AI doesn’t take shortcuts that a human might use for faster results.

ChatGPT o1 Preview’s Chain-of-Thought feature enables it to solve problems (simple or complex) using clear, logical reasoning. While humans might rely on intuition and shortcuts to solve simple tasks, ChatGPT o1 Preview’s strength lies in its ability to provide accurate and reliable solutions through structured reasoning.

The AI’s reasoning isn’t about mimicking human intuition; it’s about ensuring solutions are accurate and applicable across a wide variety of tasks, simple or complex. As AI continues to evolve, its ability to blend intuitive and structured reasoning will be crucial.

Note:

This experiment was conducted to explore ChatGPT o1 Preview’s reasoning capabilities, specifically its Chain-of-Thought feature, in the context of simple problems that may not require complex reasoning skills. ChatGPT o1 Preview prioritizes accuracy and generality, ensuring that it solves tasks consistently across simple and complex problems. While the AI did not take the intuitive shortcut a human might have used, its structured process yields reliable, accurate results.


Dr Nick Jackson

Leader of Digital Technologies at Scotch College - Adelaide

1 month ago

Hmm. Without wishing to be critical, how is this different from asking 4o to explain step by step?

Adrian Cotterell

Director of Studies ELC-12 | Exploring AI in Education

1 month ago

This is really interesting Dr. Anjali Rajan Puthiyedath. I recently wrote the following post about the limitations around different types of cognition, framed around the “System 1 & System 2 thinking” made popular by Daniel Kahneman. LLMs at that stage mimicked quick System 1 thinking because of the underlying probability they used to both comprehend input and produce outputs. However, they seemed to lack System 2 thinking: deliberate, “effortful” thinking through a problem. I wonder if your experiment suggests that this chain-of-thought “thinking” that o1-preview is utilising to ensure that it finds a solution that is both accurate and generalisable is now mimicking System 2 thinking? Do we now have something that truly thinks through a problem? Or is it just appearing to do this but still just based on probability? https://adriancotterell.com/2024/05/11/part-1-ignoring-our-ignorance/

Saeed Al Dhaheri

UNESCO co-Chair | AI Ethicist | Thought leader | Author | Certified Data Ethics Facilitator | LinkedIn Top Voice | Public Speaker

1 month ago

A good experiment for understanding the structured, chain-of-thought reasoning approach o1 Preview uses. It will be interesting to see whether this structured, generalized approach differs from one problem to another! Thank you

James Hutson, PhD, PhD

AI Innovator | Top AI and Higher Education Voice | PhD, AI + PhD, Art History

1 month ago

Very informative
