Denial of Suez: What can we learn about risk assessing SPOF?

Single points of failure (SPOF) creep into many business processes, often unintentionally. Some exist from the outset but were never assessed, or were assessed and deemed low risk. That legacy server running a critical piece of code wasn't legacy at the beginning. That retiring SME, the one who wrote the code, had just started. That supplier, now the only company that still maintains the legacy server, was once one of many who could provide support. The process may not even have been that important back then; now it is critical to the success of the business.

We may know instinctively that these single points of failure exist. We may have assessed the risk at the beginning - BUT - do we know how much risk exists now? Has something changed in the system that has introduced a SPOF where one didn't previously exist? Even where you do know where your single points of failure reside, what happens when a SPOF can't be engineered out, or is too expensive to fix? How do you understand the risks associated with SPOF and, over time, maintain effective mitigation should something [inevitably] go wrong? Using the recent blockage in the Suez Canal as a case study, let's find out...

What happened in the Suez?

On Tuesday 23rd March 2021, a cargo ship by the name of Ever Given (not Evergreen, as is plastered on the side of the ship in the picture above) got...well, stuck...mid-way along the Suez Canal. The 400-metre cargo ship, weighing in at around 220,000 tonnes, blocked one of the most important man-made shipping lanes in the world. According to media reports, the blockage held up the transit of around £7 billion - yes, billion - worth of trade per day. Lloyd's of London estimates the cost to the global economy to be a sobering £300 million per hour! The impact has been so great that oil prices rose 3% on the news the ship was likely to take more than a week to dislodge - in the event, it was refloated on 29th March, six days after it grounded.

Why didn't the ships just go another way?

In short, there are two routes between Europe and Asia, and the Suez Canal is nearly a fortnight quicker. The Suez Canal is a man-made waterway that connects the Mediterranean Sea in the north with the Red Sea in the south. Prior to the opening of the Suez Canal in 1869, ships transiting between Europe and Asia were forced to take a long, perilous route down the coast of West Africa and around the Cape of Good Hope. The Suez Canal effectively provided an 8-10 day shortcut. This shortcut naturally saved shipping companies a significant amount of money and time, and as such it is the preferred route for 19,000 vessels a year and 12% of the world's freight.

Even though it is located wholly in the state of Egypt, the Suez Canal is of such material geopolitical importance that its use is governed by the Convention of Constantinople, an international treaty which states the Canal:

"may be used in time of war as in time of peace, by every vessel of commerce or of war, without distinction of flag"

So, whilst ships could go another way, the cost of the alternative route would likely make many journeys economically unviable.

Single Point of Failure

The canal's importance to global trade, however, gives rise to what is technically a single point of failure (SPOF). For those not acquainted with the term, a single point of failure is something that, when broken, causes everything else connected to it to shut down or become significantly impacted. There is no [viable] plan B. In the case of the Suez Canal, if the canal is blocked, a significant volume of shipping is left idling on either side with nowhere to go. As you can see in the Vesselfinder snapshot below, the number of vessels awaiting passage has built up significantly. For every day of delay, costs mount and so too does the local environmental impact.

On top of the cost of delays, the blockage increases the amount of maritime traffic around the Horn of Africa, specifically off the coasts of Somalia and Eritrea - no doubt increasing the risk associated with piracy. Economic impact, environmental impact, piracy impact. Not something that you want at the best of times and certainly not when you're already in the midst of a global pandemic!

[Image: Vesselfinder snapshot of vessels awaiting passage through the Suez Canal]

Could the Suez SPOF be avoided?

In short, yes. Like most things, a SPOF can usually be engineered out if it is identified and someone is willing to throw enough money at the problem to get it fixed. But just because you can engineer out a single point of failure doesn't mean you should. The system as a whole should be risk assessed first. Where the impact of the SPOF is deemed to be greater than the cost to mitigate, it is appropriate to bake in further resilience. If the impact of the SPOF manifesting is lower than the cost of control, investing in mitigation may not be cost-effective - Risk Management 101. Up until recently, the impact and associated mitigation have been fairly balanced. Of course, there was the Yellow Fleet incident that lasted 8 years between 1967 and 1975. There were also two incidents in 2016 in which ships blocked the entrance to the canal - one of those incidents resulted in the canal shutting down for 2 days. In the case of the most recent Suez Canal blockage, what we're seeing is the impact resulting from a failure to periodically re-assess inherent risk.
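To make the "Risk Management 101" comparison above a little more concrete, here is a minimal Python sketch. It assumes a simple expected-loss model (annual likelihood multiplied by impact), and every name and figure in it is hypothetical, picked purely for illustration rather than taken from the Suez incident.

```python
from dataclasses import dataclass


@dataclass
class SpofOption:
    """One way of treating a single point of failure (all figures hypothetical)."""
    name: str
    annual_likelihood: float    # assumed chance of the failure in a year, 0..1
    impact: float               # assumed cost (GBP) if the failure occurs
    annual_control_cost: float  # assumed yearly cost (GBP) of the mitigation

    def expected_annual_loss(self) -> float:
        # Simple expected-loss model: likelihood x impact.
        return self.annual_likelihood * self.impact

    def total_annual_cost(self) -> float:
        # What the option costs overall: remaining expected loss plus the control.
        return self.expected_annual_loss() + self.annual_control_cost


def cheapest(options):
    """Return the option with the lowest total annual cost."""
    return min(options, key=lambda option: option.total_annual_cost())


# Hypothetical treatment options for a legacy server that is a SPOF.
options = [
    SpofOption("accept the risk", 0.10, 2_000_000, 0),
    SpofOption("add a standby server", 0.10, 200_000, 50_000),
    SpofOption("fully redundant site", 0.10, 20_000, 400_000),
]

for option in options:
    print(f"{option.name}: total annual cost ~ GBP {option.total_annual_cost():,.0f}")

print("Most cost-effective:", cheapest(options).name)
```

With these made-up numbers the middle option wins: doing nothing leaves too much expected loss on the table, while the fully redundant site costs more than the risk it removes. Change the likelihood or impact and the answer changes, which is exactly why the assessment needs repeating.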

Increased Inherent Risk

In 2015, Egypt's government chose not to invest in engineering out the single point of failure inherent in a single-channel design, and instead chose a path that ultimately increased the likelihood of this type of full-channel blockage occurring. Rather than investing in a second channel, the Egyptian government invested in making the canal wider and deeper so that bigger ships could navigate it. Bigger ships are harder to control and thus more likely to ground in the mud. The Ever Given is not the first ship to ground, but when such a large ship grounds, recovery is more complex. These larger ships can block the whole canal, are harder to dig out and ultimately mean it takes a lot longer to restore the canal to normal service. The same issues can occur with smaller ships, but their recovery is swifter and more straightforward. In some cases, traffic can still navigate the canal whilst the smaller grounded vessel is recovered. The decision of the Egyptian government to support these bigger ships means the inherent risk of a complex blockage occurring in the Suez has increased materially. The increased likelihood of similar events occurring in the future is something I am sure shipping companies are factoring into their risk management programmes. But can the rest of us learn too?

What can businesses learn from the 2021 Suez incident?

Whilst you may not be an international shipping magnate or someone in the maritime insurance business, you can still learn from the Ever Given Suez Incident. Here are some of the more salient takeaways:

Design-phase risk assessment must seek to identify SPOF

Whilst probably not a major component of civil engineering projects in the late 19th century, when the Suez was first constructed, risk assessment is now fundamental. Time spent assessing risk at the design phase of a project is seldom wasted. It is the time when the proposed design can be pulled apart, nay, ripped apart, for possible weaknesses. Ensure this initial risk assessment explicitly seeks out SPOF and the associated impacts should a SPOF fail. If you are running a project, ensure you get the right stakeholders involved as early as possible. Technical Experts, Risk, InfoSec, Business Resilience, DPO, Legal and Compliance will all be well-placed to look at the design and call out potential issues. The earlier these people are involved, the more likely it is that single points of failure will be identified and a mitigation plan put in place.
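For technical designs, one simple way to seek out SPOF explicitly is to write the design down as a dependency map and ask, for each component in turn, "if this is taken away, can the process still complete?". The Python sketch below does exactly that with a brute-force search. The component names and the map itself are hypothetical, and a real design review would need to cover people and suppliers as well as servers.

```python
from collections import deque


def reachable(graph, start, goal, removed=None):
    """Breadth-first search: can `goal` be reached from `start` if `removed` fails?"""
    if removed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def single_points_of_failure(graph, start, goal):
    """Return every intermediate component whose loss cuts `start` off from `goal`."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    candidates = nodes - {start, goal}
    return sorted(n for n in candidates if not reachable(graph, start, goal, removed=n))


# Hypothetical dependency map: each key depends on (routes through) its values.
dependencies = {
    "customer": ["load_balancer"],
    "load_balancer": ["web_a", "web_b"],
    "web_a": ["legacy_server"],
    "web_b": ["legacy_server"],
    "legacy_server": ["database"],
    "database": [],
}

print(single_points_of_failure(dependencies, "customer", "database"))
```

In this made-up design the two web servers back each other up, but the load balancer and the legacy server each sit alone on the only path through, so both are flagged.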

Model inherent risk against future system use

The risks associated with the Suez changed as the ships transiting the canal got bigger. In risk terminology, the level of inherent risk increased. The inherent risk associated with your business-critical processes may be doing something similar, i.e. drifting upwards unchecked. Inherent risk is brought down to an acceptable "residual" risk level using mitigating controls. The larger the likely impact, the greater the investment in compensating controls that should be considered. If, however, inherent risk drifts upwards unmonitored, what was once deemed adequate investment in control for the then-perceived risk may now be in material deficit.

An effective way to keep this drift in check is, first, to model different scenarios at project inception, up to and including the worst-case scenario. The outputs of these assessments should inform investment in mitigating controls and the point at which these controls should be introduced. Secondly, systems change and so too does system usage, so ensure you periodically reassess inherent risk and then factor this new level of inherent risk into your worst-case scenario models. The output should again inform what additional controls may be needed - including the response to incidents when they occur.
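As a rough illustration of how unmonitored drift erodes a control that was once adequate, here is a short Python sketch. It assumes a common 1-5 scoring scheme where inherent risk is likelihood multiplied by impact, and where controls remove a fixed fraction of that risk. The scores, the control effectiveness and the risk appetite threshold are all hypothetical.

```python
RISK_APPETITE = 6.0          # hypothetical maximum acceptable residual score
CONTROL_EFFECTIVENESS = 0.6  # hypothetical: controls remove 60% of inherent risk


def residual_risk(likelihood, impact, control_effectiveness):
    """Residual risk = inherent risk (likelihood x impact) reduced by the controls."""
    inherent = likelihood * impact
    return inherent * (1 - control_effectiveness)


def review(assessments, control_effectiveness, appetite):
    """Return (period, inherent, residual, exceeds_appetite) for each assessment."""
    findings = []
    for period, likelihood, impact in assessments:
        residual = residual_risk(likelihood, impact, control_effectiveness)
        findings.append((period, likelihood * impact, residual, residual > appetite))
    return findings


# Hypothetical periodic reassessments: the controls stay the same while
# likelihood and impact (scored 1-5) drift upwards as system usage changes.
assessments = [
    ("year 1", 2, 4),
    ("year 3", 3, 4),
    ("year 5", 4, 5),
]

for period, inherent, residual, breach in review(
    assessments, CONTROL_EFFECTIVENESS, RISK_APPETITE
):
    status = "REVIEW CONTROLS" if breach else "within appetite"
    print(f"{period}: inherent {inherent}, residual {residual:.1f} -> {status}")
```

Nothing about the controls changes between the first and last assessment in this example. The residual risk only breaches the appetite because the inherent risk underneath it has grown, which is the kind of drift a periodic reassessment is there to catch.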

Regularly test incident response

Once a single point of failure fails, what you don't want is to be in a situation where your corrective controls are no longer able to support recovery because they weren't designed to cope with a big mother of a ship-sized disaster! What exacerbated the Suez blockage was the incident response. The initial response appears to be woefully short of where it needs to be - as can be seen below, where Digger McDigface is clearly outgunned!

[Image: a single digger dwarfed by the grounded Ever Given]

At the end of your periodic risk assessment, feed the worst-case scenario into your incident response exercises. Test as close as you can to the real-life worst-case scenario. Was your response effective? If not, ensure you update your plans accordingly, supported by resources commensurate with the likely tasks responders will be facing...maybe two diggers going forward (...and I know there are also dredgers working to dislodge the ship).

Summing up...

Single points of failure can and do happen, but they don't need to be a complete blockage of the Suez. Assess your SPOF early. Take steps in your risk assessment process to first identify SPOF and mitigate where possible. If you can't engineer out the SPOF during the design phase, make sure your compensating controls remain effective as demand on the underlying processes grows. Keep a check on the current inherent risk and re-assess the adequacy of your current controls - if they are not effective, do something about that. And if the worst comes to the worst, and the failure occurs, make sure you can unblock your canal as quickly as possible! If you need help to identify single points of failure in your systems or test your incident response plans, get in touch, Fox Red Risk can help!

About The Author

Stephen Massey is the Managing Director of Fox Red Risk, a boutique Business Resilience, Cyber Security and Data Protection consultancy. Stephen has worked in the information security risk, business continuity and data protection world for nearly 20 years. Stephen has delivered complex security programmes across defence, real estate, and financial services. Stephen has also authored the popular book "The Ultimate GDPR Practitioner Guide" which is available on Amazon in both paperback and Kindle eBook.

About Fox Red Risk

Fox Red Risk is a boutique data protection and cybersecurity consultancy and Managed Security Service Provider which, amongst other things, helps client organisations with implementing controls frameworks for resilience, data protection and information security risk management. Call us on 020 8242 6047 or contact us via the website to discuss your needs.

Rachel Dyges MCMI ChMC

Global Business Continuity Management Lead | ISO22301 Lead Implementer & Lead Auditor | AMBCI

3 years ago

Great post Stephen! Lots to learn from this one!

Craig Aspey

Sales Manager APAC - Empowering companies to drive revenue from their social media relationships | Growing businesses and people

3 years ago

Spot on!

Vicki Gavin

Cyber Security Business Partner

3 years ago

In fact quite the opposite. Risks must be actively managed with both protective and detective controls to ensure rapid response and recovery when risks are realised.

Harrison Barrett

CIPP/E, CIPM, CIPT

3 years ago

Good article. Nice to see discussions on inherent risk and the balance of implementing controls versus accepting the risk. When something goes wrong, people are often quick to point out how it could have been prevented. Some risks, however, should be accepted, or aren’t even comprehensible until they occur.

James Spencer MPhil MBCI

Global Political Violence Risk Consultant at AIG

3 years ago

Stephen Massey MSc CISSP FIP Great post. I'd add to / clarify your point about "legacy" items: if there is a change (including upgrades) somewhere, it's important to check that the risk assessment remains valid throughout the whole system. A change in one area may have unexpected repercussions elsewhere, occasionally creating a SPOF where there was none before.
