CrowdStrike: how a single corrupt file ...
Raphael Reischuk
Partner & Group Head Cybersec at Zühlke, VP Cybersecurity Digitalswitzerland, Co-Founder National Testinstitute for Cybersecurity, Innovation Council Innosuisse, Advisory Board SATW, Top100 Digital Shaper, PhD in Infosec
... brought down entire industries, and what we can learn from it.
Swiss TV "10vor10" asked me for an expert opinion on the CrowdStrike incident and the massive outages around the world. As I answered their questions, I thought I would share them with you below. Thanks to Jan Beilicke and Pascal C. Kocher for some good discussion.
Concentration in the area of cyber security software providers — how big is the problem?
Every IT system today needs an endpoint protection solution.
Rather, the problem is that endpoint security software, by its very nature, is deeply integrated into the operating system (close to the heart) and has many privileges, i.e., it is very powerful. Consequently, if there is a breach or an error in a powerful component near the heart, this can quickly have a negative impact on the entire system.
The question is whether a single corrupt file should be able to bring down an entire operating system. Is this consistent with a resilient architecture?
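To make the resilience question concrete, here is a minimal sketch of one defensive pattern: validate an update file completely before using any of it, and fall back to the last known good state if validation fails. Everything in it is an assumption for illustration. The `ContentUpdate` type, the JSON rule format, and the function names are hypothetical and do not describe how CrowdStrike or any real endpoint product handles its content files. Real kernel-level components are not written in Python, so this only shows the idea.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentUpdate:
    """Hypothetical update file: raw bytes plus the checksum the publisher claims."""
    payload: bytes
    sha256: str


def parse_rules(update: ContentUpdate) -> list[dict]:
    """Validate the whole update before any of it is used; raise ValueError if invalid."""
    if hashlib.sha256(update.payload).hexdigest() != update.sha256:
        raise ValueError("checksum mismatch")
    try:
        rules = json.loads(update.payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"malformed content: {exc}") from exc
    # Assumed schema: a list of objects, each with a string "pattern" field.
    if not isinstance(rules, list) or not all(
        isinstance(r, dict) and isinstance(r.get("pattern"), str) for r in rules
    ):
        raise ValueError("unexpected rule structure")
    return rules


def load_rules(
    new: ContentUpdate, last_known_good: list[dict]
) -> tuple[list[dict], bool]:
    """Return (rules, accepted). A bad update is rejected instead of crashing the host."""
    try:
        return parse_rules(new), True
    except ValueError:
        return last_known_good, False
```

With this shape, a file full of zero bytes would be rejected and the previous rule set would stay active. Note that a checksum alone would not catch such a file if the publisher computed the checksum over the already corrupt payload, which is why the sketch also checks the structure.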
How can such a large-scale breakdown occur?
Several reasons. First, it's the need for security software: no system today can do without securing the endpoints. Almost all connected systems require live monitoring and control, whether as a matter of good practice, regulatory requirement, or simply because of the growing threat landscape.
Second, it's auto-updates, which automatically deploy updates without the operating system, the operator, or the end user being able to see them, to inspect them, to test them. IMHO, this practice urgently needs to be reconsidered. There is a conflict of objectives here between quickly distributing security measures and at the same time checking thoroughly enough what is being distributed. And yes, hardly any CISO would want to delay the installation of security updates. This is similar to smartphones, where we all recommend installing updates quickly. Either way, the stability (and therefore availability) of the infrastructure currently comes with too many single points of failure.
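One common way to soften this conflict of objectives is a staged (ring-based) rollout: deploy to a small group first, check health, and only then widen the audience. The sketch below is a simplified illustration, not a description of any vendor's actual process. The function name, the ring fractions, and the `apply_update` and `is_healthy` callbacks are all hypothetical assumptions.

```python
import math
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    ring_fractions: Sequence[float] = (0.01, 0.10, 1.0),  # assumed cumulative ring sizes
    max_failure_rate: float = 0.0,
) -> tuple[list[str], bool]:
    """Update hosts ring by ring; stop as soon as a ring exceeds the failure budget.

    Returns (hosts that received the update, whether the rollout completed).
    """
    updated: list[str] = []
    for fraction in ring_fractions:
        # Cumulative number of hosts that should be updated once this ring is done.
        cutoff = max(1, math.ceil(len(hosts) * fraction))
        ring = list(hosts[len(updated):cutoff])
        for host in ring:
            apply_update(host)
        updated.extend(ring)
        failures = sum(1 for host in ring if not is_healthy(host))
        if ring and failures / len(ring) > max_failure_rate:
            return updated, False  # halt: the remaining hosts stay untouched
    return updated, True
```

With 100 hosts and an update that breaks every machine, this sketch stops after the first ring, so only one host is affected instead of all of them. The price is the delay the text describes: later rings receive the security update later.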
Third, it's supply chains: lack of clarity, high complexity, and a large number of vendors who have a considerable say in what happens on our devices and what does not. This also needs to be rethought.
According to current knowledge, the reason for the glitch was a failed update, not a cyber attack. Does today's error also show the potential impact of a large-scale targeted attack?
Even if the CEO of CrowdStrike says it was a faulty update in a single file, we must not be naive here. Where the error ultimately came from, whether an attacker had access to the development process, or whether the corrupt file still has an ulterior motive, can hardly be answered conclusively after such a short time.
Talking about the impact, we have seen it time and again in the past: if backdoors are carefully placed, it is almost impossible to detect them. If detection is successful, then there is often more luck involved than we would like to admit. And yes, it is a foretaste of what may come. Supply chains in the digital world have long been critical and are not monitored enough. Too many providers, high complexity and very high potential for damage. This makes it extremely attractive for criminals to carry out their malicious intentions.
What challenges do you see in the area of cyber security in the future?
If I reduce it to four things, these would be: First, promoting heterogeneity at the cost of inefficiency, but for the benefit of resilience: if an airline, a hospital, or a bank relies on a single tech stack, then what we saw on Friday is much more likely to happen than if they use a variety of different solutions. macOS and Linux, for example, were not affected, at least not this time. Although this measure is more complex and expensive, it should be considered more often for the sake of resilience.
Second, we should increase testing capacities.
Third, every organization must also have proper business continuity management.
Fourth, we should reduce dependencies.
For German speakers, here is the link to the SRF (Schweizer Radio und Fernsehen) interview with Arthur Honegger, in which the above is heavily condensed: 10vor10. Enjoy!
Fronting the ship of innovation
8 months ago: I learnt that other OSs are also vulnerable to this type of error. It's not uniquely a Microsoft concern. And the blue screen is a stop-loss measure, so a system can safely recover without further damage. It's one thing to have to manually reboot PCs; it's another if data is lost and hardware needs replacing.
Founder, Chairman and Co-CEO at Exeon Analytics, Dr. sc. ETH
8 months ago: Well said!
Senior Software Engineer Team Lead. Industrial automation.
8 months ago: "Even if the CEO of CrowdStrike says it was a faulty update in a single file, we must not be naive here. Where the error ultimately came from, whether an attacker had access to the development process, or whether the corrupt file still has an ulterior motive, can hardly be answered conclusively after such a short time." The following link supports your statement and shows that in our complex software landscape it is hard, if not impossible, to have everything under control: https://trufflesecurity.com/blog/anyone-can-access-deleted-and-private-repo-data-github
Senior Business Consultant; Senior Business Analyst, Digital Strategy, Digital Transformation, Using Artificial Intelligence Intelligently, Cloud, Corporate Development, Interim Management
8 months ago: Very interesting article. I only disagree about politicians caring for security. This would not lead to problem solving but to chaos. IT has to do this.