What are LLM reasoning models and why you should care?
To achieve complex goals, you need reasoning and tools - humans and AI alike

I think LLMs with reasoning capabilities get far too little attention compared to some of the other technical advances in the general AI hype. This could change as DeepSeek gets some limelight, but at the moment even those discussions focus on other aspects.

We will dive into the details, but let me start with a personal anecdote that illustrates the fairly substantial difference between LLMs with and without reasoning.

Shortly after o1 was released (OpenAI's first LLM with reasoning), I had dinner with some friends. AI was on the list of conversation topics, as always these days: the usual rants about hallucinations, being bad at math, not being able to play chess, and so on. On a separate thread, one of the guests brought up an old riddle. Allegedly, the former president George W. Bush visited all 50 states but one. As a coincidence(?), the name of this state does not contain any of the letters in his name. It is harder than you think to figure out, but it's true: only one state fits the description of not containing 'G', 'E', 'O', … and so on.

Almost instantly one of the guests said, “I bet ChatGPT can’t solve this riddle!” I said, “I think it depends on which model you use.” Naturally, we needed to find out. ChatGPT with model 4 quickly answered “OHIO,” with the certainty only a four-year-old can deliver. Obviously wrong; even a five-year-old would see that. Then we tried the (then fairly new) o1 model in ChatGPT. It thought for a long while, shared its reasoning, and came up with the correct answer.
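If you want to verify the letter claim yourself, a few lines of Python are enough. This is a minimal sketch; it assumes, as the riddle does, the short form “George Bush” of the name.

```python
# Which US state name shares no letters with "George Bush"?
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine",
    "Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi",
    "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
    "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virginia",
    "Washington", "West Virginia", "Wisconsin", "Wyoming",
]

name_letters = set("georgebush")  # the letters of the (short) name, lower-cased

# Keep only states whose letters do not overlap with the name at all
matches = [s for s in STATES if not set(s.lower().replace(" ", "")) & name_letters]
print(matches)  # exactly one state remains
```

Running it leaves a single state in the list, the same answer the reasoning model eventually arrived at.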

In this article I will share some basics on reasoning models and why this is a very important step, not just for solving dinner riddles.

PS: As I write this, I just got access to the o3 model in ChatGPT and it solved the riddle in just a second. Things are progressing fast.

Types of reasoning

LLM reasoning models have big potential for all sorts of tasks, but in this context the most important trait is that they are a cornerstone of agentic workflows and architectures, which I covered briefly in another article. It's time to understand what reasoning in LLMs really means, what it entails, and its limitations.

Let's assume AI and large language models (LLMs) will also have an impact on the media and broadcasting industries. This article breaks down the different types of reasoning LLMs use, how they compare to human thinking, and how they fit into automated media workflows.

Types of LLM Reasoning Models

LLMs can handle different types of reasoning to varying degrees. They can do deductive, inductive, abductive, and analogical reasoning, but they aren't perfect and still struggle with complex or new problems.

LLMs use several types of reasoning to process information and make decisions. It’s important to understand that regular LLMs and LLMs with reasoning capabilities all use these types of reasoning, but with some very important differences. Regular LLMs primarily mimic patterns and do not break down and check their conclusions; it’s an “I’m feeling lucky” attitude that works well in some cases, but not at all in others. LLMs with reasoning capabilities, on the other hand, break down the question, run a chain-of-thought, and often check the answer again before giving the actual response.

In the text below, I will let 4o represent an LLM without reasoning, with o1 as an example of an LLM with reasoning capabilities.
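To make the difference concrete, here is a minimal sketch using the OpenAI Python SDK. The model identifiers below are the ones available as I write this and may change; treat the snippet as an illustration, not a recommendation.

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

riddle = (
    "George W. Bush visited all 50 states but one, and that state's name "
    "contains none of the letters in his name. Which state is it?"
)

# Pattern-matching model: answers in one shot, no visible intermediate steps
fast = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": riddle}],
)

# Reasoning model: internally breaks the problem into steps and checks them
# before answering (the hidden chain of thought is not returned by the API)
deliberate = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": riddle}],
)

print("4o:", fast.choices[0].message.content)
print("o1:", deliberate.choices[0].message.content)
```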

These are the most important kinds of reasoning:

1. Deductive Reasoning

Deductive reasoning involves applying general rules to specific cases to reach a logical conclusion. LLMs mimic this process by leveraging patterns learned from vast datasets to generate consistent responses. For example, in media applications, an LLM might apply established broadcasting guidelines to determine if content meets industry standards.

However, LLMs struggle with new or unclear rules that aren't in their training data. Humans are good at filling in the gaps, but we also make mistakes and have trouble with complex, multi-step logic.

Comparing models with and without reasoning, 4o is not really doing deduction; in most cases it is filling in the most probable answer and/or pattern matching.

o1, on the other hand, is designed to perform chain-of-thought processing, meaning it can break the problem into explicit steps. It “reasons” by explicitly applying the general rule to the specific case, potentially even showing its steps so a human can follow, and perhaps learn from, the chain of thought. Reasoning models are also designed to break down statements and questions into chunks that are easier to verify.
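To illustrate what “explicit steps” can mean, here is a small hand-rolled sketch of deduction written out as code: general rules applied to a specific content item, one check at a time. The rules, thresholds, and field names are hypothetical, loosely inspired by loudness and subtitle requirements.

```python
# Apply general rules (hypothetical ones) to a specific case, step by step
RULES = [
    ("loudness", lambda item: item["loudness_lufs"] <= -23.0,
     "Programme loudness must not exceed -23 LUFS"),
    ("subtitles", lambda item: item["has_subtitles"],
     "Subtitles are required for broadcast"),
]

def check_compliance(item: dict) -> tuple[str, list[str]]:
    # Evaluate each rule against the item and collect the ones that fail
    failures = [message for _, rule, message in RULES if not rule(item)]
    return ("pass" if not failures else "fail", failures)

print(check_compliance({"loudness_lufs": -20.5, "has_subtitles": True}))
# -> ('fail', ['Programme loudness must not exceed -23 LUFS'])
```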

2. Inductive Reasoning

Inductive reasoning helps LLMs identify patterns from examples and predict trends. This could, for example, be useful for analyzing viewer behavior and improving programming schedules. It is also close to a whole family of related machine-learning methods.

A known issue with LLMs is overfitting (https://en.wikipedia.org/wiki/Overfitting): relying too heavily on historical data and failing to adjust to new trends. Humans, on the other hand, can naturally weigh different factors and adapt.

4o (without reasoning) and o1 (with reasoning) differ in that the former merely mimics patterns, whereas the latter weighs different data points and statements, since it breaks down and looks back at the reasoning steps that led to the conclusion.

In a media supply chain, 4o would not be able to track the cause of a workflow step that fails repeatedly, whereas an o1 model could potentially check the incoming data or metadata and reason about errors in the input set separately from the result itself.
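As a toy illustration (the asset records and field names below are invented), the first thing such a reasoning step might do is look at the inputs of the failed runs rather than only at the failed result:

```python
from collections import Counter

# Hypothetical failure records from a repeatedly failing workflow step
failures = [
    {"asset_id": "A1", "codec": "prores", "vendor": "acme", "error": "timeout"},
    {"asset_id": "A2", "codec": "prores", "vendor": "beta", "error": "timeout"},
    {"asset_id": "A3", "codec": "prores", "vendor": "acme", "error": "timeout"},
    {"asset_id": "A4", "codec": "h264", "vendor": "acme", "error": "bad_metadata"},
]

# Tally which attribute values co-occur with failures; a reasoning step could then
# form and test a hypothesis such as "ProRes assets time out in this step"
for field in ("codec", "vendor", "error"):
    print(field, Counter(record[field] for record in failures).most_common())
```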

3. Abductive Reasoning

Abductive reasoning means finding the most likely explanation based on incomplete data. LLMs could use it to troubleshoot broadcast issues by analyzing logs and suggesting possible causes for failures.

However, since the models do not possess true understanding or intuition, the hypotheses they generate can sometimes be off-target or overly simplistic compared to the insights of an experienced human expert.

Reasoning models are more likely to get closer to what we humans call ‘intuition’ since, again, they break down the steps, reason about each one, and can suggest a hypothesis. There are still limitations in the training data, expertise, and more, but the gap might be nearly closed with fine-tuning and training.
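Here is a very small sketch of what abductive reasoning looks like when written out as explicit steps: rank candidate causes by how many of the observed symptoms each one would explain. The causes and symptoms are made up.

```python
# Observed symptoms from a broadcast incident (invented for illustration)
observed = {"black_frames", "bitrate_drop"}

# Candidate explanations and the symptoms each would account for
candidate_causes = {
    "encoder overload":   {"black_frames", "bitrate_drop", "dropped_frames"},
    "SDI cable fault":    {"black_frames", "audio_loss"},
    "network congestion": {"bitrate_drop", "packet_loss"},
}

# Rank hypotheses by explanatory power (most observed symptoms covered first)
ranked = sorted(candidate_causes.items(),
                key=lambda kv: len(kv[1] & observed),
                reverse=True)
for cause, symptoms in ranked:
    print(f"{cause}: explains {len(symptoms & observed)} of {len(observed)} observed symptoms")
```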

4. Analogical Reasoning

Analogical reasoning helps LLMs find similarities between different situations and datasets.

For example, an LLM that understands sports preferences can suggest similar content for music broadcasts. However, they may miss subtle user preferences that humans would pick up on.
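One simplified way to picture this kind of similarity matching is to describe each programme with a handful of attributes and compare the vectors. A real system would more likely use learned embeddings, and the attributes below are invented.

```python
import math

def cosine(a, b):
    # Cosine similarity: 1.0 means the two attribute profiles point the same way
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Attributes (made up): [live, competitive, commentary-driven, music-heavy]
football_final = [1.0, 1.0, 1.0, 0.0]
esports_final = [1.0, 1.0, 1.0, 0.2]
studio_concert = [1.0, 0.0, 0.1, 1.0]

print("football vs esports:", round(cosine(football_final, esports_final), 2))  # high
print("football vs concert:", round(cosine(football_final, studio_concert), 2))  # lower
```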

Here the difference between 4o and o1 is more subtle and not a major differentiator.

Conclusions comparing LLMs with and without reasoning

Models with reasoning

Chain-of-thought and multi-step reasoning: Models like o1 and o3 are designed to do multi-step reasoning. They break down complex problems into a sequence of logical steps, which is particularly useful when dealing with tasks that require inference, deduction, or a structured analysis.

Task decomposition: These models have mechanisms that allow them to decompose a problem into subproblems before coming up with an answer. This means more solid explanations and better performance on tasks that require understanding context or handling abstract concepts.

Adaptability and robustness: Because they can simulate human-like reasoning processes, these models tend to be more adaptable when encountering unknown scenarios. The result is a more transparent "thinking process" that users can sometimes follow (via chain-of-thought outputs), making it easier to diagnose errors or bias.

If you haven’t yet, go try it out and see how the models reason, especially the new DeepSeek model that is, ironically, more transparent with its reasoning than the “western” models.

Models without reasoning

Pattern matching: Models such as 4o primarily match patterns in the data they were trained on. They excel at generating fluent text and retrieving information but do not perform multi-step reasoning. This can lead to answers that look fine but lack the depth required for step-by-step problem-solving, which is essential when creating agentic architectures.

Limited explanation: Models without reasoning can’t really justify or explain how they came to a conclusion; it’s a black box. Their responses may appear to “jump” to conclusions without the logical steps that lead there.

Application-specific strengths: While lacking deep reasoning, models like 4o are still highly effective in applications where the task primarily involves recalling or rephrasing information from their training data, such as understanding incoming data (image, text, audio), summarization or translation. In many, many cases this is more than enough. Also, if you are building AI-fused workflows rather than agentic architectures, it’s easier, more efficient, and less costly to use models without reasoning.

Bottom-line: Agentic architectures need reasoning …

… but far from all tasks require agents.

Decomposing tasks, understanding nuances, and being able to reason in multiple steps about your own output are absolutely crucial if you, as an AI agent, are asked to interpret incomplete, high-level tasks and then delegate the subtasks to other systems and more specialized agents. AI agents are also usually trained to ask humans when tasks are incomplete or otherwise require human attention. If the agents are additionally expected to make autonomous decisions, they need to understand steps, subtasks, and context at a much deeper level.

An LLM that is largely doing pattern matching will simply not suffice. Such models can have roles in smaller subtasks (translation, text generation, sending ‘I’m sorry’ emails, and the like), but a reasoning model is needed to understand the bigger scope and context.

For example, in media ops, an agentic workflow might receive a request to develop a multi-platform content strategy. The LLM reasoning model can break this down into smaller tasks such as audience analysis, content format recommendations, scheduling, and performance monitoring. There will be human intervention and also a final decision from a human, but I think we can expect an agentic, reasoning AI to do a lot of the groundwork by itself.
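As a hedged sketch of that pattern (the prompt, handler names, and the call_reasoning_model function are assumptions for illustration, not any specific product's API), the planner-plus-delegation loop could look roughly like this:

```python
import json

def build_planner_prompt(request: str) -> str:
    # Ask the reasoning model for a machine-readable plan
    return (
        "Break the following request into 3-6 concrete subtasks. Respond with a JSON "
        'list of objects with keys "task" and "owner", where owner is one of: '
        "audience_analysis, formats, scheduling, monitoring.\n"
        f"Request: {request}"
    )

# Specialized handlers; in practice these could be simpler models, tools, or services
HANDLERS = {
    "audience_analysis": lambda task: f"[stub] analysing audience for: {task}",
    "formats":           lambda task: f"[stub] recommending formats for: {task}",
    "scheduling":        lambda task: f"[stub] drafting schedule for: {task}",
    "monitoring":        lambda task: f"[stub] defining KPIs for: {task}",
}

def run(request: str, call_reasoning_model) -> list[str]:
    # The reasoning model plans; cheaper components execute the subtasks
    plan = json.loads(call_reasoning_model(build_planner_prompt(request)))
    # A human still reviews the plan and makes the final decision
    return [HANDLERS[step["owner"]](step["task"]) for step in plan]
```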

The most common reasoning models available (January 2025)

OpenAI's o1 model was a big step forward in AI reasoning. It uses "chain-of-thought" processing to break down problems into smaller steps before giving an answer. This helps it perform better in areas like math, coding, and science.

More recently, OpenAI's o3 model has been making headlines for its enhanced reasoning capabilities and efficiency. The o3 model builds on the foundation of o1, but with improved processing speed, better contextual understanding, and a more robust ability to handle complex multi-step problems. The o3 model is setting a new standard for AI-assisted workflows and agentic architectures. (I’ve just recently started using it and first impression is that it is much, much faster yet with equal quality compared to o1.)

A key feature of o3 is its performance on the ARC benchmark (Abstraction and Reasoning Corpus, https://arcprize.org/), which evaluates an AI's ability to solve abstract reasoning problems that require pattern recognition and generalization. The o3 model has shown remarkable improvements on these tests, demonstrating capabilities for complex problem-solving. Look it up: as a human, there are tests you can take that highlight in a very concrete way what we are good at and what AI struggles with. Strongly recommended.

DeepSeek models have recently gained attention as well. These distilled models are smaller, more efficient versions of the powerful DeepSeek-R1 AI system. They specialize in tasks like solving math problems, writing code, and answering complex questions while being lightweight enough to run on regular computers and smartphones. This distillation process, which transfers knowledge from larger models to smaller ones, makes AI more accessible and cost-effective. However, compared to OpenAI's o1 and o3, DeepSeek has faced serious integrity concerns. This highlights the importance of understanding what you, as a technical decision-maker, bring inside your gates. You should be curious but not naive.

The more interesting aspect, I think, is that the DeepSeek models offer even more transparent reasoning by showing their step-by-step thinking process in greater detail, making them easier to trust and debug compared to other models. Their open-source nature is also a potential game-changer. Yes, there are other open-source LLMs, but they are not on par with DeepSeek yet.

OpenAI has shown that breaking problems into reasoning steps improves model performance across various tasks. LLMs can switch between rapid response generation and more deliberate, step-by-step reasoning, improving decision-making accuracy for complex tasks. Additionally, effective reasoning in media supply chain workflows can be enhanced through prompting techniques, encouraging the model to generate responses with greater accuracy and logical flow. You need to make sure you design your systems so that you use the right model for the right task.
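A deliberately crude sketch of what such routing can look like in practice follows; real systems might use a classifier or an LLM call instead of keywords, and the model names are examples only.

```python
# Route short, well-defined requests to a fast pattern-matching model and
# multi-step problems to a reasoning model
REASONING_KEYWORDS = ("plan", "diagnose", "why", "strategy", "root cause", "step by step")

def pick_model(task: str) -> str:
    needs_reasoning = any(keyword in task.lower() for keyword in REASONING_KEYWORDS)
    return "o1" if needs_reasoning else "gpt-4o"

print(pick_model("Translate this synopsis to German"))              # -> gpt-4o
print(pick_model("Diagnose why the transcode step keeps failing"))  # -> o1
```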

Conclusion

LLM reasoning models can help media and broadcast companies work smarter and faster. While they aren’t perfect, their ability to automate processes, predict trends, and enhance decision-making makes them valuable tools.

With that said, it's important to be cautious about what this means in the short term. While LLMs offer exciting possibilities, we need to understand their limitations and strengths. It's crucial to get your head around the fundamentals of how LLM reasoning works because AI will increasingly become integrated into daily operations. Organizations should pay attention to what humans do well, such as contextual understanding, creativity, and ethical judgment, and what AI does well, like processing large datasets and providing fast insights. Understanding these differences will be key to leveraging AI.
