How a CrowdStrike Update Bricked 8.5 Million Computers

What is CrowdStrike?

CrowdStrike is a cybersecurity firm whose flagship product, Falcon Endpoint Security, is widely deployed by enterprises to protect Windows computers.

How Software Applications Can Brick Computers

Software can leave a computer unable to start when a bug or a bad update affects code that runs early in the boot process. Windows security software often includes a component that runs in kernel mode from boot time, so that it can block boot-time threats and monitor system behavior with high privileges. Endpoint security software typically combines kernel mode and user mode components:

  • Kernel Mode: Runs with high-level privileges, directly interacting with the hardware and system core. If a kernel mode component crashes, the entire system crashes with it. If that happens on every boot, the computer is effectively "bricked" until it is repaired.
  • User Mode: Operates with limited privileges, interacting primarily with applications and user-level processes. Crashes in user mode typically only affect the specific application, not the entire system.
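
The user mode half of this contrast can be observed from any ordinary program. The sketch below starts a child process that deliberately aborts and shows that the parent keeps running. The helper name run_crashing_child is made up for illustration. A kernel mode fault has no comparable containment boundary, so it is not demonstrated here.

```python
import subprocess
import sys


def run_crashing_child() -> int:
    """Start a child Python process that aborts itself and return its exit code."""
    # os.abort() terminates the child abnormally, simulating an application crash.
    result = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    code = run_crashing_child()
    # The parent reaches this line even though the child crashed.
    print(f"child exited abnormally with code {code}; parent is still running")
```

The exact exit code depends on the operating system, so the sketch only reports it rather than assuming a value.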

The CrowdStrike Incident

On July 19, 2024, a faulty update from CrowdStrike caused the kernel mode component of Falcon Endpoint Security to crash, taking the whole system down with it. This kernel mode component is a device driver, and the security software cannot work without it. Device drivers are normally tested and signed through Microsoft, and the driver itself was not changed in this incident. Instead, the driver dynamically downloads and processes content updates from CrowdStrike so that it can detect new threats. One such update was faulty, and processing it crashed the driver and, with it, the entire system.
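
One way to picture this failure mode is a parser that trusts the shape of the content it downloads. The sketch below is purely illustrative: the record layout, EXPECTED_FIELDS, parse_record, and load_update are hypothetical and do not reflect CrowdStrike's actual file format or code. It shows the kind of check that lets a consumer reject a malformed update instead of reading data that is not there.

```python
EXPECTED_FIELDS = 3  # hypothetical: number of fields every record must supply


class MalformedUpdate(ValueError):
    """Raised when downloaded content does not have the expected shape."""


def parse_record(line: str) -> dict:
    """Parse one comma-separated record, validating it before indexing into it."""
    fields = [f.strip() for f in line.split(",")]
    # Check the field count first. Reading a missing field without this check
    # is the Python analogue of an out-of-bounds read in kernel mode code.
    if len(fields) != EXPECTED_FIELDS:
        raise MalformedUpdate(
            f"expected {EXPECTED_FIELDS} fields, got {len(fields)}"
        )
    name, pattern, action = fields
    return {"name": name, "pattern": pattern, "action": action}


def load_update(text: str) -> list[dict]:
    """Return all records, or raise before any record is put into use."""
    return [parse_record(line) for line in text.splitlines() if line.strip()]
```

Because load_update raises before returning anything, a caller can keep using the previous known-good content when a new update fails validation.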

Why Couldn't Windows Prevent This?

Windows can usually recover from a problematic driver by skipping it on the next restart. However, the CrowdStrike driver was marked as a "boot start" driver, which tells Windows that it is essential to the boot process, so Windows does not skip it even when it malfunctions.
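
On Windows, when a driver loads is recorded in the Start value of its service key under HKLM\SYSTEM\CurrentControlSet\Services. As a small reference sketch, the code below maps the documented Start values to names. The StartType enum and the is_boot_start helper are illustrative names, not part of any Windows API, and the code does not model how Windows decides whether to skip a driver.

```python
from enum import IntEnum


class StartType(IntEnum):
    """Values of the Start registry entry for a Windows driver or service."""

    BOOT = 0      # loaded by the boot loader, before the kernel initializes
    SYSTEM = 1    # loaded during kernel initialization
    AUTO = 2      # started automatically at system startup
    DEMAND = 3    # started only on request
    DISABLED = 4  # never started


def is_boot_start(start_value: int) -> bool:
    """Return True if the given Start value marks a boot-start driver."""
    return StartType(start_value) is StartType.BOOT
```

On a Windows machine the value could be read with the standard library winreg module; that step is left out so the sketch runs on any platform.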

Repairing the Issue

To repair a computer that no longer boots because of the faulty CrowdStrike update, follow these steps:

  1. Physical Access: Make sure you have physical access to the affected device.
  2. Safe Mode: Start the computer in "Safe Mode," where the CrowdStrike driver is not loaded.
  3. Delete Update Files: Navigate to the CrowdStrike driver folder and delete the faulty update files.
  4. Restart: Restart the computer, which should now boot normally.

If the drive is encrypted with BitLocker, you may need the BitLocker recovery key to access it. Because these steps require administrator access and possibly recovery keys, it is advisable to involve your IT support team.

In short:

  1. Boot Windows into Safe Mode.
  2. Navigate to the folder %WINDIR%\System32\drivers\CrowdStrike
  3. Delete the faulty update files (not the driver itself, despite the .sys extension): del C-00000291*.sys
  4. Restart Windows.
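
For administrators who script the cleanup, the deletion step might look like the sketch below. It assumes a Python interpreter is available in the recovery environment, which is often not the case on a machine in Safe Mode, where the del command above does the same job. The function names and the dry_run flag are illustrative. The folder and file pattern are taken from the steps above.

```python
import os
from pathlib import Path

# File pattern taken from the manual steps above.
PATTERN = "C-00000291*.sys"


def default_driver_dir() -> Path:
    """Build the CrowdStrike driver folder path from the WINDIR variable."""
    # Falls back to C:\Windows when WINDIR is not set (an assumption).
    windir = os.environ.get("WINDIR", r"C:\Windows")
    return Path(windir) / "System32" / "drivers" / "CrowdStrike"


def remove_faulty_files(folder: Path, dry_run: bool = True) -> list[Path]:
    """Return the matching files and, unless dry_run is set, delete them."""
    matches = sorted(folder.glob(PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: only report what would be deleted.
    for path in remove_faulty_files(default_driver_dir(), dry_run=True):
        print(f"would delete {path}")
```

Running with dry_run=True first lists the files that match, so the pattern can be checked before anything is removed.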

More detailed instructions are here

Preventing Future Outages

To prevent similar issues in the future, several measures should be taken:

  1. Thorough Testing: CrowdStrike's engineering team should rigorously test updates before deployment.
  2. Verification Mechanisms: Validate the integrity and format of every update before the driver loads it, and reject anything malformed.
  3. Controlled Rollouts: Initially push updates to a small, controlled group of computers to monitor for any issues before a global rollout.
  4. Windows OS Improvements: Microsoft should enhance Windows to skip any malfunctioning driver, even if it is marked as a boot start driver.
  5. Product Redesign: Minimize kernel driver updates and shift more updates to user mode components to reduce the risk of system-wide failures.
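
The controlled rollout in item 3 can be sketched as a loop that deploys to growing slices of the fleet and halts at the first sign of trouble. This is a minimal illustration under stated assumptions, not a description of any vendor's pipeline: staged_rollout, the stage fractions, and the deploy callback (which returns True when a host is healthy after the update) are all hypothetical.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 1.0),
    max_failure_rate: float = 0.0,
) -> tuple[list[str], bool]:
    """Deploy to growing slices of hosts, halting if a stage fails too often.

    Returns (hosts updated successfully, whether the rollout was halted).
    """
    updated: list[str] = []
    done = 0  # number of hosts already attempted
    for fraction in stage_fractions:
        # Cumulative number of hosts that should be covered after this stage.
        target = max(1, int(len(hosts) * fraction))
        stage = hosts[done:target]
        if not stage:
            continue
        results = [deploy(host) for host in stage]
        done = target
        updated.extend(h for h, ok in zip(stage, results) if ok)
        failures = sum(1 for ok in results if not ok)
        if failures / len(results) > max_failure_rate:
            return updated, True  # stop before reaching the remaining hosts
    return updated, False
```

With the default fractions and 100 hosts, a bad update that fails on the first host is stopped after one machine instead of reaching all 100.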

By adopting these practices, CrowdStrike and other security software vendors can significantly reduce the risk of bricking computers and improve overall system stability and security.
