The CrowdStrike / Microsoft Chaotic Outage
CYOODA Security

Here are my thoughts on what transpired last Friday concerning the CrowdStrike / Microsoft global IT outage.

Like many of you, I was indirectly caught up in the crisis, seeing first hand the disruptions at retailers and airlines (thankfully my flight wasn't impacted).

Reading through many post comments this morning, I sense a mixture of anger, finger-pointing, frustration, empathy, and compassion for those fighting to get systems back online as quickly as possible. Those of you who, like me, have been in a crisis situation will know how challenging and stressful that can be. So to all CrowdStrike employees and the hard-working IT support staff in the thick of it right now: stay strong, it will get better.

Some History

What happened on Friday could have happened to anyone, but unfortunately for CrowdStrike it was at a scale that businesses globally had neither seen before nor were prepared for. In the past, for those of you who remember, we had the ILOVEYOU virus, Slammer, and more recently, NotPetya. While all of these had a serious impact on those affected, they were not on the scale of what happened on Friday.

More than ten years ago, I wrote an article and spoke about the pervasiveness of software, in particular VMware, and warned then that it might not happen tomorrow or next week, but years from now we would have a catastrophic technology disaster. As much as we can blame the vendor, and sometimes quite rightly, we collectively have to take responsibility for our own resilience and do the right thing. Vendors release patches; it's up to us to implement them as fast as we can and in a safe way so as not to impact the business. Equally, if we don't implement the patch, then we can hardly blame the vendor for not trying. But in that conundrum, there is always the balance of risk. Do we go quickly in order to protect the organisation, at the risk that a failed patch causes business disruption, or do we wait, at the risk that we get compromised by an exploit? That decision essentially boils down to risk appetite.

The Friday event and CrowdStrike Response

No one saw what happened on Friday coming. There was no warning, it just happened, which led some to wonder whether this was the beginning of some kind of cyber war. Earlier that day Microsoft also suffered a significant outage which, although not on the same scale as CrowdStrike's, was nonetheless a double whammy for the organisations affected. Shit happens, and it is refreshing to see that, after the initial marketing-speak communications from CrowdStrike, their CEO today stood up as he should, took accountability, and apologised for what happened. For that transparency and leadership I commend CrowdStrike, and all organisations should take heed (in particular, Microsoft, I hope you are listening...).

Technicals

A lot of people have already commented on the technical aspects of what went wrong, and all I will say is that when software like CrowdStrike is so deeply embedded in the Microsoft OS tech stack (e.g., the kernel), Microsoft should take some responsibility for what happened on Friday. If I'm not mistaken, there should be a QA process for software that hooks into the kernel before it is allowed to be released, on the part of both the software vendor and the OS vendor. So, processes clearly failed here. The same goes for deployment: there should have been a fail-safe process, deploy to N+1 and then wait, not simply hit the button and go!
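To make the "deploy to a few, then wait" idea concrete, here is a minimal sketch of a staged rollout. It is not how CrowdStrike or any vendor actually deploys; the function name `staged_rollout`, the `deploy` and `is_healthy` callbacks, and the ring sizes are all hypothetical, chosen purely for illustration.

```python
from typing import Callable, Iterable, List


def staged_rollout(
    hosts: List[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    ring_sizes: Iterable[int] = (1, 10, 100),
) -> List[str]:
    """Deploy ring by ring and halt as soon as a ring contains an unhealthy host.

    All names and ring sizes here are illustrative assumptions.
    Returns the hosts that received the update before the rollout
    finished or was halted.
    """
    updated: List[str] = []
    remaining = list(hosts)
    sizes = list(ring_sizes)
    while remaining:
        # Use the next configured ring size; once they run out,
        # the final ring is everything that is left.
        size = sizes.pop(0) if sizes else len(remaining)
        ring, remaining = remaining[:size], remaining[size:]
        for host in ring:
            deploy(host)
            updated.append(host)
        # A real pipeline would wait here (soak time) before checking health.
        if not all(is_healthy(host) for host in ring):
            break  # stop: the damage is limited to the rings deployed so far
    return updated
```

With 20 hosts and one that fails its health check in the second ring, only 11 hosts are ever touched and the other 9 keep running the old version. That is the whole point of waiting between rings.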

Business Continuity

I see a lot of commentary on BCP and DR plans. The reality of what happened on Friday is that hardly anyone could have predicted that type of scenario, and even if they had, they would probably have dismissed it as unlikely to ever happen and, therefore, low risk. The event hit EUC environments hard, and since COVID our working lives have changed: many people now work from home a few days a week or permanently. That has made remediation challenging for already stretched IT support staff in many businesses, so this needs to be considered in future BCP plans too.

Eggs in one Basket

Think about your technology stack and single points of failure, particularly where your tech is pervasive. For security, although it is a cliché, defence in depth (or, as I like to look at it, a layered onion approach) is always preferable and will give you options for resilience. Not that long ago, organisations used to have multiple firewall stacks (different vendors) at each layer. Maybe think about using a different vendor for AV / EDR on your server stack vs your EUC environment. That again may seem like a costly old-school approach, but sometimes the old ways are the best!
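As a starting point for that exercise, a simple inventory check can show where one vendor covers the same function in more than one environment. The sketch below is only illustrative: the function name `shared_vendors`, the tuple layout, and the vendor names are all made up, and a real inventory would come from your CMDB or asset register.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_vendors(
    inventory: List[Tuple[str, str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Return (function, vendor) pairs that cover more than one environment.

    Each inventory entry is (environment, function, vendor).
    The layout and names are hypothetical, for illustration only.
    """
    seen: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for environment, function, vendor in inventory:
        seen[(function, vendor)].append(environment)
    # Keep only the pairs where a single vendor spans several environments.
    return {key: sorted(envs) for key, envs in seen.items() if len(envs) > 1}


# Hypothetical example: the same EDR vendor on both servers and EUC.
inventory = [
    ("servers", "EDR", "VendorA"),
    ("EUC", "EDR", "VendorA"),
    ("perimeter", "firewall", "VendorB"),
    ("internal", "firewall", "VendorC"),
]
print(shared_vendors(inventory))
```

Anything this returns is not automatically wrong, but it is a place where one bad update could take out several environments at once, so it deserves a conscious risk decision.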

A New Chapter

Today is Monday, and it's time to turn a new page. Because of our interconnected world, we must rewrite BCP and DR plans, conduct regular testing, and account for supply chains and SaaS platforms that are so intertwined. So, take a hard look at all of your technology and consider how pervasive it is and what interconnections it has across your lines of business. Only then can you make a more informed risk decision about what you need to do and your accepted risk tolerance levels.

Reach out if you need any help or guidance. In particular, for small to medium-sized businesses that may be suffering, I'm offering my services pro bono (DM me here on LinkedIn).

Alan Chan

A Proud Dad, Passionate Cyber Security Leader & Basketball Fan...

4 months ago

Great writeup John Reeman
