Understanding the HISTORIC CrowdStrike Outage and How to Enhance YOUR Cybersecurity Resilience.
July 19, 2024 - Historic CrowdStrike and Microsoft global outages

By Jen Waltz, Vice President, Global Alternate Channels

On July 19, 2024, a significant global computer outage occurred due to a faulty sensor configuration update in CrowdStrike's Falcon cybersecurity platform. This update affected approximately 8.5 million Windows devices, causing disruptions across airlines, hospitals, and financial institutions. Although the outage impacted less than one percent of all Windows machines, its economic and societal impacts were significant because of the critical services these enterprises provide. As I write this blog post, my husband, Dr. Aaron Waltz, has been stranded in Calgary, Alberta, Canada, because his flight home was canceled.

The Cause of the Outage

According to CrowdStrike, the issue originated from a sensor configuration update released during ongoing operations. This update triggered a logic error, resulting in a system crash and a blue screen of death (BSOD) on impacted systems. A logic (or semantic) error is a bug in a program's source code that can cause abnormal application behavior or system crashes.
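
To make the idea of a logic error concrete, here is a minimal sketch in Python. It is purely illustrative and is not CrowdStrike's code: the function names and the two-field update are hypothetical. The first function is syntactically valid but silently assumes that the requested field always exists; the second checks that assumption before reading.

```python
def read_field(fields: list[str], index: int) -> str:
    """Buggy version: trusts that `index` is always valid for `fields`."""
    return fields[index]


def read_field_checked(fields: list[str], index: int, default: str = "") -> str:
    """Safer version: validates the index before reading."""
    if 0 <= index < len(fields):
        return fields[index]
    return default


# Hypothetical update that carries two fields, while the code expects a third.
update = ["rule-a", "rule-b"]

try:
    read_field(update, 2)
    crashed = False
except IndexError:
    crashed = True

print(crashed)                                   # True
print(read_field_checked(update, 2, "missing"))  # missing
```

In an ordinary application, a mistake like this surfaces as an exception that can be caught. In a component that runs inside the operating system kernel, an unhandled fault of this general kind can bring down the whole machine.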

CrowdStrike's Falcon platform is a breach-prevention tool using cloud-delivered technologies to prevent various attacks, including malware. The platform's core functions include antivirus, endpoint detection and response (EDR), cyber threat intelligence, managed threat hunting, and security hygiene. Falcon operates with a lightweight sensor that is cloud-managed and delivered.

CrowdStrike Response and Apology

“I want to sincerely apologize directly to all of you for today’s outage,” said George Kurtz, CrowdStrike’s founder and CEO.

CrowdStrike promptly apologized and began a thorough root cause analysis to understand how the logic flaw occurred and identify process improvements. Kurtz also emphasized the company's commitment to preventing similar incidents in the future.

Complete coverage of CrowdStrike's botched update and Microsoft outage aftermath

Lessons Learned and Necessary Changes

The CrowdStrike outage highlights several key areas where organizations can enhance their cybersecurity resilience:

Enhanced Incident Response Plans

Organizations must ensure that their incident response plans are comprehensive and well-practiced. Response plans include:

  • Regular Drills: Conduct frequent incident response drills to ensure all team members are well-prepared.
  • Updated Playbooks: Keep incident response playbooks current with the latest threat intelligence and lessons learned from past incidents, including the CrowdStrike outage.

Implement Redundant Systems

To mitigate the impact of outages, organizations should consider implementing redundant systems and services:

  • Backup Solutions: Ensure critical security functions have backup solutions or failover systems that can take over during an outage.
  • Multi-Vendor Strategy: Adopt a multi-vendor strategy for cybersecurity tools to reduce reliance on a single provider.
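
As a rough illustration of the failover idea, the sketch below tries a primary provider first and falls back to a backup if the primary fails. The provider functions and their return values are hypothetical stand-ins, not any vendor's API.

```python
from typing import Callable, Sequence


def first_available(providers: Sequence[Callable[[], str]]) -> str:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider()
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} providers failed")


def primary_scanner() -> str:
    # Simulated outage of the primary vendor.
    raise ConnectionError("primary vendor unavailable")


def backup_scanner() -> str:
    return "scan completed by backup"


print(first_available([primary_scanner, backup_scanner]))  # scan completed by backup
```

The order of the list encodes the priority, so adding a third vendor is a one-line change. If every provider fails, the function raises an error rather than hiding the outage.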

Strengthen Communication Protocols

Clear communication is vital during an outage. Enhancing communication protocols includes:

  • Crisis Communication Plans: Develop and maintain crisis communication plans for internal and external stakeholders.
  • Automated Alerts: Implement systems to quickly notify relevant teams about the outage and provide ongoing status updates.

Improve Monitoring and Logging

Enhanced monitoring and logging can provide better visibility and faster detection of issues:

  • Comprehensive Logging: Ensure all systems, including backups, have detailed logging for better post-incident analysis.
  • Centralized Monitoring: Utilize centralized monitoring solutions for a unified view of the organization's security posture.

Review and Update Access Controls

Reevaluating and strengthening access controls is crucial to minimize risks during outages:

  • Privileged Access Management (PAM): Implement or enhance PAM solutions to control and monitor access to critical systems. Kron Technologies' flagship cybersecurity product is Kron PAM.
  • Role-Based Access Control (RBAC): Grant users only the access they need based on their roles.
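
A minimal sketch of role-based access control might look like the following. The role names and permissions are invented for illustration; a real deployment would load them from a directory service or a PAM product rather than a hard-coded table.

```python
# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "helpdesk": {"read_tickets"},
    "sysadmin": {"read_tickets", "restart_service"},
    "incident_lead": {"read_tickets", "restart_service", "approve_emergency_access"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True only if the role explicitly includes the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("sysadmin", "restart_service"))  # True
print(is_allowed("helpdesk", "restart_service"))  # False
print(is_allowed("contractor", "read_tickets"))   # False (unknown role)
```

Unknown roles fall through to an empty permission set, so the default is to deny access.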

Enhance Backup and Recovery Procedures

Robust backup and recovery procedures can ensure quick restoration of services:

  • Regular Backups: Perform regular backups of critical data and systems.
  • Disaster Recovery Plans: Develop and test disaster recovery plans to ensure swift restoration of operations after an outage.

Strengthen Vendor Management

Improving vendor management practices ensures better coordination and support during incidents:

  • Vendor SLAs: Review and update Service Level Agreements (SLAs) with vendors to include provisions for incident response and support during outages.
  • Regular Assessments: Conduct regular assessments of vendor performance and reliability.

Invest in Cybersecurity Training

Staff need continuous training on cybersecurity best practices and incident response:

  • Regular Training: Provide ongoing training sessions on the latest cybersecurity threats and response techniques.
  • Simulated Attacks: Conduct simulated attacks to test and improve the organization's preparedness.

Adopt a Resilience-Focused Approach

A resilience-focused cybersecurity strategy anticipates and mitigates disruptions:

  • Resilience Planning: Develop comprehensive plans, including technical measures and business continuity planning.
  • Continuous Improvement: Establish a culture of continuous improvement using lessons learned from incidents like the CrowdStrike outage.

Review Cybersecurity Architecture

Organizations should assess, and potentially redesign, their cybersecurity architecture to address vulnerabilities exposed by the outage:

  • Zero Trust Architecture: Consider adopting a Zero Trust architecture to enhance security by verifying all access requests.
  • Micro-Segmentation: Implement micro-segmentation to limit the spread of threats within the network.

How Kron Technologies' Kron PAM Could Have Helped Mitigate the CrowdStrike Outage


Kron PAM diagram

To illustrate how Kron PAM (Privileged Access Management) could have assisted during the CrowdStrike outage, it's essential to consider the functionalities and advantages of PAM solutions in managing such incidents. Here's how Kron Technologies could have helped:

Enhanced Privileged Access Controls

  • Restricted Unnecessary Access: Ensured that only essential personnel had access to affected systems, reducing the risk of accidental or malicious changes during the outage.
  • Session Monitoring: Monitored and recorded privileged sessions to detect unauthorized or suspicious activities in real-time.

Rapid Incident Response

  • Quickly Identified Breaches: Provided visibility into who accessed what, when, and how, aiding in the rapid identification of the source and extent of the issue.
  • Automated Response Actions: Enabled predefined automated responses to detected anomalies, such as terminating sessions or revoking access to contain the incident swiftly.

Audit and Compliance

  • Facilitated Root Cause Analysis: Detailed logs would help understand the events leading up to the outage.
  • Supported Compliance Requirements: Kron PAM ensures documentation of all actions taken during the outage, aiding in compliance with regulatory requirements.

Access Workflow Management

  • Streamlined Access Requests: Managed emergency access requests through approval workflows, ensuring only authorized personnel received elevated access.
  • Temporary Access Controls: Kron PAM provides time-bound access to critical resources, ensuring elevated privileges are revoked once the emergency passes.
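
The general idea of time-bound access can be sketched in a few lines of Python. This is not Kron PAM's API; the class, function, user, and resource names are all hypothetical and serve only to show how a grant can carry its own expiry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """A hypothetical emergency grant that is valid only until `expires_at`."""
    user: str
    resource: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at


def grant_emergency_access(
    user: str,
    resource: str,
    now: datetime,
    duration: timedelta = timedelta(hours=1),
) -> AccessGrant:
    """Issue a grant that lapses automatically after `duration`."""
    return AccessGrant(user, resource, now + duration)


start = datetime(2024, 7, 19, 12, 0, tzinfo=timezone.utc)
grant = grant_emergency_access("alice", "db-prod-01", start)

print(grant.is_active(start + timedelta(minutes=30)))  # True
print(grant.is_active(start + timedelta(hours=2)))     # False
```

Because the expiry is part of the grant itself, nobody has to remember to revoke the elevated privileges after the emergency passes.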

Risk Mitigation

  • Minimized Attack Surface: Reduced the number of high-risk accounts that attackers could exploit during the incident.
  • Credential Hygiene: Kron PAM ensures privileged credentials are rotated and not exposed, reducing the likelihood of credential-based attacks.

Collaboration and Communication

  • Secure Access for Remote Teams: Ensured remote incident response teams had secure, controlled access to necessary systems without compromising security.
  • Centralized Management: Provided a centralized platform for managing and auditing privileged access, enhancing coordination among different teams.

Moving Forward

The CrowdStrike outage is a stark reminder of the importance of robust cybersecurity practices and disaster recovery planning. By implementing these changes and leveraging solutions like Kron PAM, organizations can better prepare for and respond to cybersecurity incidents, minimizing their impact and ensuring greater continuity of operations. This proactive approach will improve resilience and bolster overall security posture in an increasingly complex threat landscape.

If you would like more detailed information on Microsoft's response and remediation efforts, please read David Weston's response from Microsoft, 'Helping our customers through the CrowdStrike outage.'

If you want more detailed information on CrowdStrike's remediation efforts, visit the CrowdStrike Tech Alert support page. There you'll find resources, recommended fixes, and tools to identify impacted hosts.

For more information on Kron Technologies and why our Kron PAM Privileged Access Management Suite is known as the fastest to deploy and the most secure PAM solution in the marketplace, click here.

Reflecting on this incident and taking proactive steps can help us better prepare for future challenges. If you would like more details, please get in touch with me at [email protected].



