Failure Is Predictable – So Why Aren’t We Better Prepared?
Last Friday’s Disruption
Last Friday, Barclays Bank experienced a significant service disruption affecting its digital banking platforms. The timing couldn’t have been worse: the outage overlapped with HMRC’s self-assessment tax deadline. Although Barclays will claim it moved swiftly to communicate with customers and implement mitigations, media reports suggest otherwise. The incident raises important questions about how financial institutions handle critical operations and maintain resilience under pressure.
Understanding the Impact
The outage began around 5am on Friday, 31 January, with peak disruption at 7am and 9:20am—prime times for both retail and business customers. While cards and ATMs remained operational, many users struggled with:
For a major bank handling billions of transactions across tens of millions of accounts, even a brief outage can cause widespread disruption.
Critical Risk Management Concerns
What's particularly concerning is that these incidents are entirely predictable - they are the definition of planning for "When, not if." While operational risk teams gather data and expert views on what could go wrong, they don't always take the critical step of turning this into actionable insight through robust quantitative methods. Without such analysis, how can management make informed decisions? Even a basic Monte Carlo simulation can yield meaningful results in hours - maybe a little longer if inputs need to be confirmed. The fact that we still see multi-day outages suggests that this analysis either isn't being done or isn't reaching the right decision-makers.
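To give a sense of how little is needed to get started, here is a minimal Monte Carlo sketch in Python. It is not the model discussed in this article: the function name, the choice of distributions, and every number in it (outage duration, customers affected, contact rate, cost per contact) are hypothetical assumptions chosen purely for illustration.

```python
import random
import statistics


def simulate_outage_cost(n_trials=10_000, seed=42):
    """Monte Carlo sketch of the customer-contact cost of one outage.

    All distributions and parameters are illustrative assumptions,
    not data from any bank.
    """
    rng = random.Random(seed)
    costs = []
    for _ in range(n_trials):
        # Assumed outage duration in hours: triangular(low, high, mode).
        duration_h = rng.triangular(1.0, 48.0, 6.0)
        # Assumed number of customers affected.
        affected = rng.triangular(50_000, 500_000, 150_000)
        # Assumed share of affected customers who contact the bank;
        # it grows with outage duration and is capped at 100%.
        contact_rate = min(1.0, rng.uniform(0.02, 0.10) * (1 + duration_h / 24))
        # Assumed cost of handling one contact, in GBP.
        cost_per_contact = rng.uniform(4.0, 8.0)
        costs.append(affected * contact_rate * cost_per_contact)

    costs.sort()
    return {
        "mean": statistics.fmean(costs),
        "p50": costs[n_trials // 2],           # median trial
        "p95": costs[int(n_trials * 0.95)],    # 95th-percentile trial
    }


if __name__ == "__main__":
    for name, value in simulate_outage_cost().items():
        print(f"{name}: GBP {value:,.0f}")
```

Reporting a median alongside a 95th percentile, rather than a single figure, is the point of the exercise: it gives management a range to plan against. Swapping in confirmed inputs for the assumed ones is where most of the real effort goes.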
Quantifying Operational Risk
By coincidence, I recently modelled a hypothetical authentication service failure for a mid-sized UK bank—actually, two scenarios:
These were based on basic assumed parameters (e.g., call volumes, contact rates) and tested using Monte Carlo simulations. The goal wasn’t to produce an exact figure but to show that quantification is an essential element of effective risk management, because it demonstrates how quickly costs escalate when key drivers shift. These drivers include:
Looking Forward
Although these simulations weren’t specific to Barclays, they demonstrate the value of scenario analysis in:
Ultimately, scenario planning and clear communication help organisations respond faster and minimise damage during an incident. Financial institutions, regardless of size, must invest in robust infrastructure, appropriately tested contingency plans, and effective incident response, all of which mitigate risk and improve customer trust.
Explore The Models
Want to see these authentication failure simulations in action? You can explore and run these models yourself at the Risk Insights Explorer (riskspace.com). Click Select Scenario, then choose "Online Banking Authentication Service Failure" from the drop-down. The platform lets you adjust parameters, challenge assumptions, and develop your own scenarios, helping you build a better understanding of operational risk modelling and scenario requirements.
Note: The Monte Carlo simulations referenced are illustrative examples only and are not based on any specific data, from Barclays or otherwise. They serve as examples of risk quantification approaches rather than predictions of actual impact.