CrowdStrike Outage: A Wake-Up Call for Holistic IT and Business Resilience

Flights have been grounded, live TV broadcasts halted, and retail terminals shut down. Business operations around the globe have been severely disrupted by an erroneous security software update.

The global fallout from a faulty software update is a stark reminder that digital roots permeate all organizations and ecosystems so deeply that when these events strike, the impact is increasingly disruptive to firms, supply chains, and society at large.

What we have already learned from the CrowdStrike incident:

  • An error with an automatic update to the CrowdStrike security software caused computers running Microsoft Windows operating systems to crash and then fail to restart.
  • CrowdStrike has a large, global install base, so this incident had a broad reach and cross-industry impact.
  • The process to recover systems is highly manual and is further complicated in cloud environments. As such, the time and effort will be significant, resulting in a long tail of business disruption.
  • Furthermore, the incident's impact is not limited to CrowdStrike customers. We are witnessing significant indirect supply chain disruption, which we expect to continue over the next few days and weeks.

A call to action for business leadership

While it’s easy to focus solely on the technical specifics of this incident and assign blame externally, organizations must refrain from tactical overreactions. Instead, leaders need to accept that a perfect storm of changes in how IT is consumed will lead to more events of this nature, and that we won’t be able to predict their source.

How do business leaders face the larger systemic challenge of organizational resilience in a complex, highly digitized, and hyper-connected world where:

  • IT is embedded deeply into every business process. Disruption of IT now means disruption of all our business and public systems on a grand scale.
  • Digital environments have gone from simple to complicated to complex in the last decade, and it is impossible to predict and control every aspect of how they operate.
  • Control environments are complicated and layered, often presenting us with surprising side effects that amplify other problems.
  • IT organizations are not close enough to the business, and thus leaders as a whole struggle to protect or rebuild their organization’s value chains quickly.
  • Leaders don’t truly believe bad things will happen until they do. Thus, resilience activities are de-emphasized in the face of other priorities, forcing organizations to learn on the fly and under pressure, which amplifies the consequences that will play out over the next days and weeks.

The five dos and don’ts for business leaders:

  1. Don’t panic. But do understand this could easily have been your company or entity that got disrupted. The organizations hit made sensible, best-practice tooling decisions and are victims of the complexity of the IT worlds we have created.
  2. Do get an independent review of your resilience capability from people with a post-incident mindset: folks who have been through major outages and know the nuanced reality organizations must prepare for. The answers are often counter-intuitive and are hard to come to without experience.
  3. Do understand your supply chain resilience. Speak to your third parties and understand how they are assuring their posture. Many of those impacted will not be directly disrupted, but will suffer from third-party outages.
  4. Don’t expect IT to solve this for you. This is not an IT issue; it is a broad business issue, and it needs a business-led solution with IT supporting.
  5. Do act now. Our digital systems are enabling an exciting future, and we need to keep evolving and innovating as fast as we can. But we need to respect that this evolution brings with it risk that we must address. We need resilient IT and business systems now, and if you’re still waiting to address this, you are likely asleep at the wheel.

---

BCG’s first-hand experience supporting dozens of organizations impacted by massively disruptive cyber and technology crisis events has led us to develop pragmatic approaches to resilience based on a “post-incident” mindset.

You can read more about BCG’s research on general organizational resilience at the following link: https://www.bcg.com/publications/2020/how-to-become-an-all-weather-resilient-company


Charles Hosner Or Klier Thomas Bohne Walter W. Bohmayr Shoaib Yousuf Jean-Francois Bobier Jean-Christophe LAISSY Pierre Roussel Philippe Savary Gildas Bouteiller Liana Shalev Moti BenMocha Stefan A. Deutscher Biljana Bajic-Bizumic Tad Roselund Nadine Moore Colin Troha Faizul Ali Nadya Bartol Russell S. Benjamin Rehberg Jeanne Kwong Bickford Paras Malik Stefan Mohr Sugar Chan Dr. Bernhard Gehra Romain de Laubier Abhinav Bansal Alex Asen Clark O'Niell Vladimir Lukic Matteo Coppola Dylan Bolden Tawfik Hammoud Sylvain Duranton Vijay Pasupathinathan Jessica Apotheker Mary Martin

According to the WSJ, the update occurred at 9:30 a.m. Indian time. Is it an old legacy system from IBM AS400 or Mainframe with Falcon installation that is causing the issues? Could it be related to the protests in Bangladesh and human errors? Were change management procedures adhered to? Stay vigilant

Rich Klein

Crisis Management Consultant @ Rich Klein | 24/7 Reputation Protection / Investigative Journalist/ Portrait and Headshot Photographer

4 months

Great piece, Vanessa. Of course, there are also the twin components of crisis management and crisis communications. Every organization must prepare for the worst, long before significant incidents like this occur. I've been disappointed in the initial statements from the CEOs at Microsoft and CrowdStrike in the immediate aftermath of this disruption. Both lacked sufficient empathy for what institutions, businesses and individuals (e.g., travelers) had to endure days after the strike.

Robert Fox

Performance-driven CIO / CTO / CISO, Senior Program Manager and Information Security Evangelist

4 months

OUCH! All in the name of "Security" . . . Now we see how fragile our systems really are - and how dependent we are on today's technology! Now we also know that major events aren't always caused by bad actors!! Stay Vigilant and Keep Safe !! #majorincident #dependencies #securityglitch

Miranda Meng, ACC

Help Leaders Navigate Change and Align Teams | Coach & Speaker & Facilitator | Leadership Through Change | Global Experience Across Asia to America

4 months

This IT and business resilience needs leaders to demonstrate leadership and commitment. Change will always happen. Crises will never be eradicated. But it is people who have the capacity and ownership to make things work regardless of the circumstances. Here is my take from a leadership development perspective, with a real story from a Fortune 500 company: https://www.dhirubhai.net/posts/miranda-meng-leadership-coach_leadership-riskmanagement-mindset-activity-7220268687277142016-m23I?utm_source=share&utm_medium=member_ios

Bhavin P. Kapadia

AI-Security Cyber GenAI Data I Banking Strategy Adviser | Speaker @ Imperial College London | Adversarial Threat Bayesian LLM Models | Fraud Identity AML KYC | Regulatory, Compliance | Financial Services Consulting

4 months

on point
