Building a "root cause" mindset
In a fast-moving or scaling environment, our processes cannot be static. Learning and continuous improvement must be at the heart of how we work, how we design our processes and how leaders work with teams. We need to build scalability into how we work.
Learning becomes a key skill, so creating a learning mindset is important. There are key steps which we can take as leaders to help.
Culture is important, and the role of leader as coach to build this is critical. But this goes hand-in-hand with mastering good techniques for learning. For example, retrospectives are a key activity, but running good retrospectives is a skill which needs to be taught and practised.
In this Play, we look at one of the key skills needed to make learning successful - Root Cause Analysis.
Responding to symptoms
Ask your team about something which has gone wrong. What is their immediate response? They will tell you what happened. What went wrong. In general this will be an observed symptom. It is very easy to react directly and try and address that symptom.
Let's look at an example scenario with four players.
Steve is a directive manager who prides himself on keeping tight control of his team. Mary has raised a concern with Steve that handovers from Jon often seem to be much later than predicted. She is finding it difficult to get the code tested in time to have it released to Ranjit on the planned schedule.
How is Steve going to respond?
There is clearly an inefficiency in the system that needs to be looked at. A naïve response would be that if a symptom is observed, we must push back directly against the measured parameter. Steve's immediate assumption is that this is a failure on Jon's part. If Jon’s delivery is late, Jon must be at fault and must make the deliveries earlier.
As a directive manager following Scientific Management approaches, Steve assumes that Jon must be under-skilled or under-motivated. Since he sees motivation as extrinsic (Theory X), Steve will aim to punish Jon. And since he sees work as predictable and reductionist, Steve will step in and take control, probably initiating a performance improvement plan (PIP) for Jon.
Read: Scientific Management - https://agileplays.co.uk/why-should-we-move-away-from-scientific-management/
Read: Reductionism and planning - https://agileplays.co.uk/agile-for-pms-are-there-any-plans/
Looking for causes
Let's step back a little and replay the scenario with a different leader.
Anne is a more cautious leader who is data-led and collaborative. She is aware that Mary's observation is valid, but is a symptom of the problem, not the problem itself. She cannot address the situation until she understands the underlying problem which causes the issue observed by Mary.
The purpose of Root Cause Analysis is to look beyond the symptoms and try and assess what the underlying (or "root") causes of the problem may be. If we address the cause, we are far more likely to deal with the problem than if we try and fix the immediate symptoms.
So let's look beyond that first symptom. We know that Jon's deliveries to Mary are late. The next step is to ask "why?". Mary only knows the symptom, so Anne needs to ask why the deliveries are late. It's time to call a retrospective to look at this problem with the whole team.
Note that we need a certain level of psychological safety to do this as a group exercise. It will work much better that way, but if there is a blame culture, Anne may need to talk to Jon individually first and collect people's opinions in private. Fortunately Anne has spent a while working on building a positive culture.
By probing into the symptom and asking "why" questions, Anne is now discovering underlying problems which she can address. Indeed it appears that Anne's own prioritisation of new features may be the underlying cause of the delays that Mary is seeing!
Read: Psychological safety - https://agileplays.co.uk/what-do-we-mean-by-psychological-safety/
Read: Why do retrospectives fail? - https://agileplays.co.uk/are-your-retrospectives-failing/
The "5 Whys" approach
In the example above, I was using a popular approach from Lean called "5 Whys". This involves repeatedly asking the question "Why" to look at the underlying cause of the issue identified in the last step. Each time we probe deeper into the problem, looking for a "root cause".
As we see, after questioning five times, we are reaching a real understanding on which we can act.
There is nothing "magical" about the number five. It emphasises that the approach is not simple and that it is necessary to keep pushing to get beyond symptoms to problems. Often we find that three "why" questions might get a technical explanation (here identifying feature quality) but that more probing is needed to understand a systemic root cause - what part of the process has caused the issue.
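To make the shape of the technique concrete, here is a minimal sketch of a "5 whys" chain as a small Python structure. The class name FiveWhys and its methods are hypothetical, and only the symptom, the third answer (feature quality) and the final answer (new features being prioritised) come from the scenario above; the other answers are invented purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FiveWhys:
    """A symptom plus the answers to successive "why?" questions."""

    symptom: str
    answers: list[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> FiveWhys:
        # Each answer becomes the statement that the next "why?" probes.
        self.answers.append(answer)
        return self

    def root_cause(self) -> str:
        # The deepest answer so far; with no answers we only have the symptom.
        return self.answers[-1] if self.answers else self.symptom

    def report(self) -> str:
        lines = [f"Symptom: {self.symptom}"]
        for depth, answer in enumerate(self.answers, start=1):
            lines.append(f"{'  ' * depth}Why {depth}? {answer}")
        return "\n".join(lines)


analysis = (
    FiveWhys("Jon's handovers to Mary are later than predicted")
    .ask_why("The code needs rework before it can be handed over")  # illustrative
    .ask_why("Defects are being found late in development")  # illustrative
    .ask_why("Feature quality is suffering")  # the technical explanation
    .ask_why("No time is allowed for addressing quality problems")  # illustrative
    .ask_why("New features are prioritised over everything else")  # systemic cause
)

print(analysis.report())
print("Root cause:", analysis.root_cause())
```

Nothing in the structure stops at five answers. The point of the sketch is only that each answer is treated as the next thing to question, and that the team acts on the deepest answer rather than on the symptom.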
"5 whys" is a very effective technique and one which is relatively simple to understand and apply. It does, however, need some practice. Asking "why" repeatedly can seem artificial and disruptive. Personally I've found that once you get over the slightly stilted repetition, it can be a hugely effective way to find out what the underlying root causes are.
Often "5 whys" can take you down a path which discovers something unexpected and a root cause which is not obvious from the symptoms. The example above shows this, but a classic example was an examination of why the Lincoln Memorial in Washington was deteriorating.
In this example (which is often quoted but is probably apocryphal) the solution to damaged stonework proves to be to adjust the lighting to attract less dusk-flying midges. This shows some of the possible power of the technique to propose solutions which are very different from directly responding to the symptoms.
Read: Effective retrospectives - https://agileplays.co.uk/effective-retrospectives-in-agile-development/
Read: Risks of local optimisation - https://agileplays.co.uk/the-risks-of-local-optimisation/
Read: Waste in Lean software - https://agileplays.co.uk/what-is-waste-muda-in-lean/
Extending "5 whys"
Although the "5 whys" approach is often advertised as the sole answer to root cause analysis, it does have limitations which are inherent in its simplicity. In our example, we identified that the root cause of the problem of delays was new features being prioritised. But we didn't identify that having Mary test the code after development isn't a good practice, and that quality could be improved by better develop/test integration rather than only testing at the end.
Despite the name, there is often not a single root: there may be multiple underlying causes. Lean uses a technique called an Ishikawa diagram (also known as a "fishbone diagram" because of its shape). This is a tree-like structure, with a single outcome but multiple paths to root causes. The left-hand part is split into categories and each individual root cause is "hung" from one of these categories.
The Ishikawa diagram is more complex to draw and requires more extensive analysis, but it emphasises how multiple factors may play into a particular incident. Standardised categories may also make it easier to identify root causes. I'd recommend an approach like this for a formal analysis after an incident, but "5 whys" makes a great starting point for assessing a situation.
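As a rough sketch of how this differs from a single "why" chain, the structure below groups several root causes under categories for one outcome. The class name Ishikawa and the category labels "Priorities" and "Process" are assumptions made for illustration; the two causes are the ones identified in our example.

```python
from collections import defaultdict


class Ishikawa:
    """One outcome with root causes grouped under categories."""

    def __init__(self, outcome):
        self.outcome = outcome
        self.causes = defaultdict(list)  # category -> list of root causes

    def add(self, category, cause):
        self.causes[category].append(cause)

    def root_causes(self):
        # Flatten to (category, cause) pairs so they can be compared and ranked.
        return [(cat, c) for cat, items in self.causes.items() for c in items]

    def render(self):
        lines = [f"Outcome: {self.outcome}"]
        for category, items in self.causes.items():
            lines.append(f"  [{category}]")
            lines.extend(f"    - {cause}" for cause in items)
        return "\n".join(lines)


diagram = Ishikawa("Handovers to test are later than predicted")
# Category names are hypothetical; the causes come from the scenario above.
diagram.add("Priorities", "New features are prioritised over quality")
diagram.add("Process", "Testing only happens after development is finished")

print(diagram.render())
```

Keeping every cause visible side by side makes it easier to judge which one matters most, rather than stopping at the first one found.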
To look at why multiple root causes may be important, consider this analysis from a hospital (from "Sensemaking of patient safety risks and hazards" - Battles et al). An incident in which the wrong medication was given to a patient is assessed as being due to poor design of the patient wristband.
As the authors point out, this has correctly identified a factor in the problem but there could be multiple causes forming a complex tree as below. Many of these branches have a root in organisational culture, which is probably a greater factor than the wristband design. By just focussing on the first identified root cause, we may not be addressing the most important one.
Good practices
As an Agile leader, you want to avoid making rapid decisions based on observed symptoms. This is the traditional "go with the gut" management, and can often lead you astray. Instead, you should endeavour to be analytical and to look beyond the immediate symptoms of an issue to find the underlying root causes.
For this to be effective, you will need to develop a culture of psychological safety which allows the teams to discuss issues openly and to understand the causes without blame or defensiveness. You will also need to work on your own skills to ensure that you develop the patience to find underlying problems rather than leaping at immediate solutions.
A great starting point is the "5 whys" technique which probes into the problem to find an underlying root cause. With some practice you can use this regularly as part of your normal approach to understand what the cause is behind symptoms which you observe.
A limitation of "5 whys" is the focus on a single root cause. For more important cases you may need to develop a more formal process, such as using Ishikawa diagrams to identify multiple root causes and judge which you need to address.
Remember of course that identification is not enough. As with a retrospective, you need to plan improvement activity to address the underlying issue and make sure it is resourced and tracked to prevent the issue recurring.