Let's Merge Incident and Problem Management

I’ve been involved with ITIL® for nearly 20 years, through my time in OGC / HMT and decades of consultancy, training, writing, and blogging. All this time I’ve preached the benefits of both Incident and Problem Management as practices, and how they should be kept (and managed) separately. These two practices are at the heart of efficient and effective service management and customer support, and they serve two very distinct purposes.

But now I’m asking the question: is ITIL® wrong? Or rather, is the advice to manage the practices separately wrong?

It’s true that, for most organizations, the default recommendation of keeping these practices separate is still the best, as their objectives, timelines, and workflows are fundamentally different. However, ITIL® is designed to be flexible, isn’t it? There's an increasing demand to combine the practices, so I’ve been using the flexibility of the framework to do exactly that, and with great results.


What is Incident Management?

At its core, Incident Management is about one thing: restoring normal service operation as quickly as possible. It’s reactive: an event has happened that has impacted someone in a negative way, and Incident Management is designed to minimise disruption to users and business operations by getting things back to “business as usual” (whatever that may look like) as quickly as possible. It’s the sticky tape that holds things together, and it is never meant to be permanent.

Think of Incident Management as the IT organization’s emergency response team. When something breaks or is running slowly, Incident Management’s job is to put out the fire as efficiently as possible.

I use a number of examples when I’m teaching, but let’s use my pothole example. Imagine, if you will, that you’re walking out of your office (or any office for that matter) looking at your phone. We all do it. You don’t see the pothole in the pavement and you trip, breaking your leg. Ouch!

Incident Management will:

  • Call an ambulance.
  • Treat your pain.
  • Put a cast on your leg.
  • Give you crutches.

In other words, restore "service" to you, getting you up and walking again as quickly as possible, with the immediate disruption addressed.

The focus for Incident Management is on urgency and restoring functionality, not on understanding or fixing the root cause of the problem.


What is Problem Management?

By contrast, Problem Management focuses on a long-term, and sometimes proactive, view. Its purpose is to prevent further incidents from happening by identifying and resolving the underlying causes of disruptions. The root cause.

Returning to the pothole:

Problem Management will:

  • Immediately mark the pothole with bollards to improve visibility in the short term.
  • Arrange for the pothole to be repaired permanently, ensuring it no longer poses a hazard to others.
  • Ask why the pothole was there in the first place.
  • Investigate measures to ensure no further potholes appear in the pavement.

Problem Management is about identifying and analysing the root cause, then finding solutions to resolve the underlying causes. Unlike Incident Management, which works in the present and with a sense of urgency, Problem Management often operates on a slower timeline (when was the last time you tried to get a pothole repaired?), prioritising long-term improvements over immediate results.


Why Are These Practices Typically Separate?

Let’s face it: you wouldn’t want the roadwork crew attending to your broken leg, and you also wouldn’t want the doctor fixing the pothole. The objectives, workflows, and skills needed for Incident and Problem Management are fundamentally different, which is why ITIL® treats them as distinct disciplines:

Different Goals:

  • Incident Management aims to minimise downtime and restore service quickly.
  • Problem Management seeks to identify root causes and prevent recurring incidents.

Different Timelines:

  • Incident Management operates on short, urgent timelines.
  • Problem Management is a longer-term discipline, involving deeper investigations and systemic fixes.

Different Skillsets:

  • Incident Management involves Service Desk / front-line IT staff skilled in, and tasked with, rapid resolution; they hold the reel of tape that keeps things together.
  • Problem Management typically requires specialist technical analysts skilled in root cause analysis. They’re looking for that permanent fix.

By keeping these disciplines separate, organisations ensure that urgent issues don’t get delayed by lengthy investigations, while systemic problems receive the attention they deserve.


So When Should We Combine Them?

Whilst keeping Incident and Problem Management separate is often the best practice, there are situations where combining them both is needed:

  • Smaller Organisations with Limited Resources

In small IT teams, resources are often stretched thin. Maintaining separate workflows and skills for incidents and problems can lead to inefficiencies, duplicated effort, and skills shortages. In these environments, combining the practices ensures that both the immediate and long-term needs of the customer are addressed without unnecessary complexity.

  • Organizations Implementing a Structured ITSM Framework

When first implementing a structured ITSM framework, organizations will invariably select Incident Management as their starting point. This is a natural choice, but you can’t ignore long-term, permanent fixes in favor of short-term workarounds. Without any Problem Management activities, issues will recur frequently, so what is implemented isn’t true Incident Management but a merger of Incident and Problem Management practices.

  • High-Volume, Repetitive Incidents

I expand on proactive and reactive Problem Management, and the specialist skills needed, during training sessions and on my YouTube channel if you need further explanation of the practice. When incidents occur frequently due to the same underlying issue, however, proactive Problem Management isn’t always the first practice to identify the frequency of these incidents. Invariably, Incident Management and the Service Desk are the first to identify frequent and widespread service interruptions. A combined practice can accelerate both resolution and root cause identification. For example, if a specific group of users frequently experiences the dreaded “blue screen of death”, the Service Desk can identify the common attributes amongst the users, which forms part of the root cause analysis. That same Service Desk team can also investigate the root cause to implement a fix.

  • Organizations Seeking Operational Simplicity

Some organizations aim to streamline their workflows, eliminating handoffs and reducing bureaucracy. Combining incident and problem management allows them to work more efficiently, particularly in fast-moving environments like startups.

  • Integrated ITSM Tools

Most ITSM tools support the integration of incident and problem records, allowing for seamless tracking and collaboration. This makes it easier to combine practices without losing visibility into either, and is especially important when first implementing an ITSM framework.


What a Combined Practice Looks Like

If you choose to combine Incident and Problem Management, it could look like this:

1: Centralized Logging

  • All disruptions are logged as incidents in the ITSM tool.
  • Incidents that are recurring or have significant impacts are automatically flagged as potential problems.

2: Immediate Response

  • Prioritise restoring service quickly for high-priority incidents.
  • Front-line staff focus on minimising user disruption.

3: Root Cause Investigation (Parallel Workstreams)

  • While resolving incidents, teams initiate root cause analysis.
  • Use knowledge bases or previous problem records to accelerate RCA.

4: Preventative Measures

  • If a root cause is identified, implement temporary workarounds (e.g., marking the pothole with bollards).
  • Schedule permanent fixes where possible.

5: Knowledge Sharing

  • Document known errors, workarounds, and fixes in a shared knowledge base.
  • Ensure that future incidents related to the same problem can be resolved faster.

6: Continual Improvement

  • Periodically review combined processes to ensure they are efficient and achieving the desired outcomes.
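
The automatic flagging described in step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular ITSM tool does it: the Incident fields, the two thresholds, and the function name flag_potential_problems are all hypothetical, and a real tool would work from its own incident records and rules.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical thresholds; tune them to your own environment.
RECURRENCE_THRESHOLD = 3   # incidents sharing the same category and service
HIGH_IMPACT_PRIORITY = 1   # assumes priority 1 is the most severe


@dataclass(frozen=True)
class Incident:
    id: str
    category: str   # e.g. "blue screen"
    service: str    # affected service or configuration item
    priority: int   # 1 (highest) to 4 (lowest), an assumed convention


def flag_potential_problems(incidents):
    """Return ids of incidents that should be reviewed as potential problems.

    An incident is flagged when it is high impact, or when its
    (category, service) pair has occurred at least RECURRENCE_THRESHOLD times.
    """
    counts = Counter((i.category, i.service) for i in incidents)
    flagged = []
    for i in incidents:
        recurring = counts[(i.category, i.service)] >= RECURRENCE_THRESHOLD
        high_impact = i.priority <= HIGH_IMPACT_PRIORITY
        if recurring or high_impact:
            flagged.append(i.id)
    return flagged


incidents = [
    Incident("INC001", "blue screen", "laptop-fleet", 3),
    Incident("INC002", "blue screen", "laptop-fleet", 3),
    Incident("INC003", "password reset", "directory", 4),
    Incident("INC004", "blue screen", "laptop-fleet", 2),
    Incident("INC005", "outage", "email", 1),
]
print(flag_potential_problems(incidents))
```

Every disruption is still logged and handled as an incident; the flag only marks candidates for root cause investigation, which keeps the distinction between an Incident and a Problem intact.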


Potential Challenges

Combining Incident and Problem Management practices isn’t without risks:

  • Loss of Focus: Urgent incidents may overshadow longer-term problem management efforts.
  • Resource Strain: Teams may struggle to balance reactive and proactive tasks.
  • Skill Gaps: Service Desk staff may lack the expertise (and time) needed for effective root cause analysis.

It’s therefore important that a clear governance structure is in place, with defined policies, processes, skills, roles, and tools to address these challenges and support efficient and effective issue management. It should also be noted that the same distinction between an Incident and a Problem must be maintained even when the practices are combined.


Final Thoughts

Incident and Problem Management are distinct disciplines for good reason, but there are times when a combined approach can bring value, particularly for smaller teams or high-volume environments. By carefully designing a framework that balances immediate resolution with long-term prevention, you can achieve the best of both worlds.




Jo Peacock is a visionary leader in IT governance and organizational change, empowering teams through strategic innovation and best-practice guidance.

Jo Peacock

919 308 0634


Beverly Weed-Schertzer

Author, IT and Business | "Critical Thinking, Limitless Possibilities." Global Program Manager @ BT | Cross-Functional Team Leadership| ITSM Expert| #ITSMForBusiness #ITSMKMBusiness #ITSM4AI #humaninfluenceintechnology

2 months ago

Incidents and Problems are not the same. I’ve seen organizations fail miserably trying to merge into one process for efficiency reasons. The result: SLA achievement decreased by 27% in the first 30 days. Not wanting to give up, one client invested one year thinking it was a temporary dip and that the benefits would be seen in the long term. Nope, SLA performance didn’t improve, nor did the performance of the new consolidated Incident and Problem process. My client reverted to two separate processes integrated to work together, and within 3 months SLA performance was back on track and performing well.

Faisal Alshammari

IT Support Manager, MIET

2 months ago

The fun part is when you do it without even realizing it! Sometimes we get so invested in the how, why, and what that, by the end of the day, we look back and see we’ve seamlessly combined two critical practices, and got the job done. Flexibility doesn’t mean fragility! So why not?

Ian MacDonald FBCS CITP (ITIL Author and Ambassador )

Award Winning ITSM Consultant | ITIL Author | ITIL 4 Master | Trainer

2 months ago

The only consideration in a shared approach is that under stress the behaviours and characteristics of incident management will inevitably dominate and PM gets less focus. This of course can be anticipated and managed. A suggestion to ensure balance and ensure strong PM focus is to set a top level team KPI that drives reward/recognition for the combined team based on ‘prevention and reduction of incidents’ this shows that success of the team is not biased and skewed towards incident resolution.

Rich Petti

ITIL® 4 Master, Managing Professional, Practice Manager, & Strategic Leader | ITSM Coach, Consultant, & Trainer | Husband, Father, Papa, Brother

2 months ago

P.S. Much discussion, including my comments are on IM and PM from a reactive perspective, but let us not forget the value of proactive PM, especially when assisted using the Monitoring&Event practice! At least one person should be designated for this work in a domain that all agree is most critical and valuable to the organization.

Rich Petti

ITIL® 4 Master, Managing Professional, Practice Manager, & Strategic Leader | ITSM Coach, Consultant, & Trainer | Husband, Father, Papa, Brother

2 months ago

Then plz color me a purist too. Before #ITIL was 'invented', in managing small to large IT support operations; for high volume, multi-category [multiple hw & sw domains], & multi-support levels we kept them separate. We even had two KE db's, one proprietary for 'public' consumption. In small IT startup scenarios, it makes sense to initially mingle the two practices, but the vision should be to mature the capability of each so they are highly integrated but separate practices. The timeline of each practice is very different, with IM needing to be as short as possible and problem using work-arounds to buy time. The most common time both practices would be running concurrently/in parallel with each other would be when a newly discovered incident is registered. IM would trigger the opening of a new problem record & auto-link the new incident record to it for review by PM to decide what action, if any, to take. Another common scenario would be when IM creates a new work-around, for any incident to meet IM's purpose, then also triggers the PM practice to officially 'bless' the WA as valid for use in other occurrences &/or improve the WA. Eventually, all incidents would be auto-linked to existing P/KE records. #ITSM #Incident #Problem

