The Multi-Faceted Threat of Single-Point Failures

Every so often, a headline-grabbing incident delivers a sharp reminder of an overlooked vulnerability that needs addressing. Such incidents often originate from a single point of failure, where the failure of one part of a system leads to a breakdown of the entire system.

The recent CrowdStrike outage was such an incident, and the vulnerability it highlights is the growing risk associated with single-point failures.

On July 19, 2024, malfunctioning software from CrowdStrike caused what many believe to be the largest IT outage ever, inflicting over $10 billion in losses on companies in many industries. The flawed update to CrowdStrike’s security software, which runs on the widely used Microsoft Windows operating system[1], caused 8.5 million systems to crash. Somehow, CrowdStrike’s testing and validation system allowed the faulty software to be released. The fallout is still being felt.

The software failure and its cascading effects highlight companies’ dependence on the Microsoft Windows operating system and Windows’ reliance on a piece of software from this one supplier of cybersecurity services. Initial assessments of the incident suggest that key reasons for this over-dependence are industry consolidation that has shrunk the pool of suppliers and the interconnectedness of complex IT systems.

Diamond-Shaped Risks

This is a familiar story in supply chain circles. Supply chain professionals are well aware of the dangers of relying on a single core supplier, also known as single sourcing, and have developed ways to mitigate the risks.

My book The Power of Resilience: How the Best Companies Managed the Unexpected (MIT Press, 2015) takes a detailed look at these issues.

In the book, there is an industry supply chain schema that represents the archetypal supply chain (see Figure 1). It is an inverted “tree structure” where each OEM or brand owner has many suppliers, and each supplier has many suppliers of its own. This structure appears to be robust in that there are numerous supply options.


Figure 1. Industry Supply Chain Structure

However, OEMs and brand owners are sometimes unaware that their supply chain structure looks more like a diamond (see Figure 2). In these cases, enterprises may rely on a single supplier embedded deep in a lower supply chain tier.


Figure 2. A “Diamond” Structure – A Hidden Deep Risk

One example of such vulnerability and its consequences described in the book involved a fire in 2012 at the Evonik Industries cyclododecatriene (CDT) factory in Marl, Germany. CDT was a key ingredient in the manufacture of nylon, a high-strength plastic used widely in the auto industry. Other products, including solar cells, athletic shoes, and optical fibers, also used CDT. That one fire destroyed almost half the world’s production capacity for CDT. Moreover, supplies were tight because of demand from the booming market for solar panels. The automobile industry was about to grind to a partial halt because of the crisis.

Sometimes the chokepoint is a region or a country. For instance, Taiwan is a single-source supplier of various types of microchips. Taiwanese companies produce over 90% of the most advanced semiconductors, with most manufactured by a single company, Taiwan Semiconductor Manufacturing Company (TSMC). These chips power everything from mobile phones to automotive systems to appliances. Such dependency is behind the US government’s attempt to rekindle chip manufacturing in the United States.

More Weak Links

However, single-point failures and their potentially devastating effects are by no means confined to supply issues for parts and components.

The breakdown of critical physical infrastructure can have single-point failure ramifications. In 2021, a six-day blockage of the Suez Canal caused by a stranded container vessel rippled through many supply chains. More recently, the Panama Canal Authority reduced the number of daily transits and introduced weight restrictions in response to drought conditions in 2023. The Yemen-based, Iran-backed Houthi rebel group has attacked dozens of commercial ships in the Red Sea since November 2023, causing shipping companies to navigate around Africa. The detour has lengthened the voyage time from Asia to Europe by about 10 days. The Panama Canal and Red Sea disruptions have both reduced global maritime shipping capacity, leading to occasional shortages and increased consumer prices.

The Covid-19 pandemic brought to light the lack of personal protective equipment (PPE) production capacity in the US. The resulting shortages contributed to widespread virus transmission in the population, notably among hospital workers, which reduced hospital capacity and made it harder to combat the outbreak.

Labor-related problems can be single-point disruptors.

My book The Resilient Enterprise: Overcoming Vulnerability for Competitive Advantage (MIT Press, 2005) describes the 2002 West Coast ports’ lockout that shut down all the ports along the US Pacific coast. The damage to the US economy prompted then-President George W. Bush to invoke the 1947 Taft-Hartley Act to open the ports. In The Power of Resilience, I describe a strike by 400 unionized shipping clerks at the Ports of Los Angeles and Long Beach in 2012. The action shut down three-quarters of the port complex, idling an estimated $760 million worth of goods per day.

The Writers Guild of America strike in 2023 underscored Hollywood’s dominant position in producing movies and TV series. It also showed the dependence of Hollywood on the writers and their union. The action idled actors, photographers, set managers, and thousands of small businesses that depended on the big studios.

Red Flag Warnings

The range and severity of single-point failures have increased in line with the expanding universe of risks companies now face. Industry consolidation, increasing system complexity, and more interconnectedness—trends that fueled the CrowdStrike outage—have increased our over-reliance on single critical links in the chain and made them opaque.

Companies have responded to these supply chain threats in various ways. For instance, in response to the Evonik Industries crisis described above, auto industry companies got together with leading chemical manufacturers to rationalize demand for CDT and develop alternative sources. The effort mitigated the effects of the sudden supply interruption.

Collaboration on supply chain mapping among companies in the same vertical can highlight suppliers that an entire industry depends on. Such suppliers, many of which supply critical materials or components to multiple industries, may be buried deep in the supply chain. Mapping the supply chain in as much depth as possible is an important step toward discovering these single-point failure risks. After unearthing hidden dependencies, companies can develop backup suppliers.
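To make the mapping step concrete, here is a minimal Python sketch of how a mapped supply chain could be screened for a “diamond.” It is an illustration under stated assumptions, not a method taken from the book: the company names, the `graph` layout (each company mapped to its direct suppliers), and the `min_share` threshold are all hypothetical. The idea is simply to count how many of an OEM’s tier-1 suppliers sit downstream of the same lower-tier supplier.

```python
from collections import defaultdict

def upstream(graph, node, seen=None):
    """Return every direct and indirect supplier of `node`."""
    seen = set() if seen is None else seen
    for supplier in graph.get(node, ()):
        if supplier not in seen:  # also guards against cycles in the map
            seen.add(supplier)
            upstream(graph, supplier, seen)
    return seen

def shared_deep_suppliers(graph, oem, min_share=1.0):
    """Flag lower-tier suppliers that sit behind at least `min_share`
    of the OEM's tier-1 suppliers (a screening heuristic only)."""
    tier1 = graph.get(oem, [])
    if not tier1:
        return []
    counts = defaultdict(int)
    for t1 in tier1:
        for supplier in upstream(graph, t1):
            counts[supplier] += 1
    return sorted(s for s, n in counts.items() if n / len(tier1) >= min_share)

# Hypothetical supply chain map: company -> its direct suppliers.
graph = {
    "OEM": ["FuelLines", "BrakeLines", "Connectors"],
    "FuelLines": ["ResinMaker1"],
    "BrakeLines": ["ResinMaker2"],
    "Connectors": ["ResinMaker1", "SteelMill"],
    "ResinMaker1": ["ChemCo"],
    "ResinMaker2": ["ChemCo"],
}

# Suppliers that every tier-1 supplier ultimately depends on.
print(shared_deep_suppliers(graph, "OEM"))
# Suppliers behind at least 60% of tier-1 suppliers.
print(shared_deep_suppliers(graph, "OEM", min_share=0.6))
```

In this made-up map, every tier-1 supplier traces back to the same chemical producer, which is the diamond shape in Figure 2. A flag from a screen like this is only a prompt for further review, since it says nothing about whether alternative sources exist or how quickly they could be qualified.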

IT infrastructure and the ecosystem of companies that support it are typically not under the purview of supply chain managers. However, some techniques described above may help reduce the chances of CrowdStrike-like crises.

The regulatory frameworks that aim to control the market forces that drive supplier consolidation in the tech world may also need examination. Instead of counteracting a merger between two OEMs, in some instances it may be better to ensure that neither the companies involved nor the merged entity is blind to single-point failure risks buried deep in its supply chain or IT infrastructure. To some extent, this is what the US government is trying to do in its attempts to reduce supply-related risks associated with strategically essential products like microprocessors.

More broadly, the CrowdStrike debacle offers a warning that companies and governments need to pay attention to the single-point vulnerabilities emanating from our complex, specialized world.


[1] The Windows operating system is used by more than 70% of computers worldwide (https://gs.statcounter.com/os-market-share/desktop/worldwide/)

Robert Juricic

Stop being surprised by end-of-life messages | Use our software to automate lifecycle management and proactively monitor components, reducing downtime and delays

4 weeks ago

The CrowdStrike incident is a powerful reminder of how hidden dependencies can spiral into major disruptions. What struck me most about your supply chain mapping insights is how they parallel what we're seeing in obsolescence management, especially in industries like aerospace and defense where components can become extinct long before the system's end-of-life. I'd love to hear your thoughts on how companies can better integrate obsolescence planning into their risk management strategies, as we're seeing these challenges overlap more frequently in our work with manufacturers.

Veera Baskar K

End to end supply chain solutions to reduce cost, optimise inventory, improve customer satisfaction, smarter processes and capability building | Founder & CEO - 7th Mile Shift | Ex-TVS Motor Company - AVP Logistics.

3 months ago

Your perspective on the critical risks of single-point failures is spot on, especially as you highlight the cascading effects in interconnected systems. However, while redundancy is a common strategy, have you considered how over-reliance on redundancy itself could introduce new vulnerabilities?

Amit Ben-Raphael

Founder & CEO At CSO Projement

3 months ago

Great insights on single-point failures. Proactive risk management and system diversification are crucial to mitigating these risks.

Yann Genais

Indépendant - Flowlog- Expertise Supply Chain

3 months ago

Convinced by the “Diamond structure” and always consider it when analysing how the supply chain manages its sourcing responsibility.
