CrowdStrike: Us or them?
Mark Lomas
Cloud Solutions Architect & Digital Workforce Empowerment Specialist | Volunteer | Tech enthusiast
The CrowdStrike update, which caused so many issues around the world last week, has resulted in a lot of questions.
The impact of that outage was far-reaching, extending well beyond the 8.5 million PCs directly affected (as reported by Microsoft). Even businesses that did not utilise CrowdStrike directly were impacted, with retail payment systems going down, deliveries delayed, and all manner of secondary effects observed.
Questions? There are plenty, some of which fall under the category of who is to blame. CrowdStrike will undoubtedly take the majority of the heat here, with CEO George Kurtz already summoned to appear before the US House Committee on Homeland Security.
Some blamed Microsoft. In the first instance, the fact that it was Windows systems being impacted led the initial knee-jerk reaction to point the finger at the software giant.
As the culprit was quickly identified as the errant CrowdStrike update, the heat on Microsoft cooled, but not entirely. This led one Microsoft executive, Frank Shaw, to tweet: “A Microsoft spokesperson would not have to make this point [that the changes that enabled the CrowdStrike outages came out of an agreement with EU regulators] if the reporters did their jobs.”
Yikes.
Indeed, Microsoft have pointed to the fact that they deliberately allowed third-party security firms to have API access to a feature of Windows called PatchGuard (or Kernel Patch Protection). This, in turn, is why a security tool like CrowdStrike, if it began operating in an unstable manner, could ‘spread’ that instability to the Windows kernel, resulting in such a catastrophic crash of the OS.
This API access was, Microsoft assert, enabled to appease EC antitrust regulators way back in the days of Windows Vista. Others have correctly pointed out that the EC didn’t explicitly demand the technological changes Microsoft implemented; they merely insisted that Microsoft play fair.
Today, Microsoft have made a variety of changes to Windows 11 that look rather stark when you remember the history of Windows ‘features’ like the Browser Ballot screen. If Microsoft can make such changes without the original antitrust complaints coming back to haunt them, could they not have strengthened Kernel Patch Protection too?
These are interesting questions, but technical ones, to be sure.
All sorts of other questions are being asked. Are we over-reliant on big tech firms? Are we over-reliant on foreign tech firms? Does this teach us lessons about cloud?
Most of these questions largely miss the point. If we are to ask ourselves any question about this at all, it should be this: are we too reliant on ‘automatic’ to patch our systems?
Patches are a fact of life in both Operating Systems and software. Security vulnerabilities happen. There’s no getting away from that. Thus, patches happen – no getting away from that either.
Thus a Patch Management solution would seem to be in order. However, patch management solutions are not there simply to validate that all the patching is happening automatically, without you having to lift a finger.
If all this is starting to sound like I’m saying it’s the fault of the IT department that the CrowdStrike issue occurred … I’m not. However, testing software (including patches) is a shared responsibility between a software vendor and IT.
Even the most robust patch testing process on the part of the software vendor can never account for every single possible configuration scenario and software interaction that might occur in the real world.
As such, it’s incumbent on IT to carry out patch testing. Of course, this assumes that a software vendor gives you the control to roll out patches on your terms in the first place. CrowdStrike have themselves now stated in an incident report that they will “Provide customers with greater control over the delivery of Rapid Response Content updates by allowing granular selection of when and where these updates are deployed.”
One wonders what control they offered (or did not) before.
A thought has to occur to all of us here: If any software slated for deployment in our environment, from the largest LoB application down to the smallest utility, doesn’t offer a decent mechanism for us to control update rollout ourselves, should we not think twice before deploying it?
For mission critical IT systems (and that may also have to include PCs too – after all, if they all go down at once, that’s a big deal!), the concept of maintaining a ‘steady state’ is something worth bringing back into our thinking.
The question of ‘is the cloud a problem?’ has cropped up a few times over the last few days. In reality, the issue is more the ‘evergreen’ approach to IT: software that is ‘always up to date’, with a frequent update cadence, automatically applied.
However, we’re in the world of business and enterprise critical IT systems here (and those in the Public Sector too). Maintaining a static configuration, so that continuity is not disrupted, is likely a better choice.
This does pose a problem. If we’re going to need to test every single patch and every single update, how on earth do we resource that when there are so many?
Two synergistic demands are likely to emerge.
First, a demand for a significantly greater level of transparency into software & patch testing processes. We want to see the data. We need the reports. We need the auditing. We need insight. Plus, we need it in a standard format. We also need that data not just as a PDF, but in a format that can be imported or directly fed into our own patch management solutions.
In addition, we don’t just want vendors to test their software; we also want them to maintain information about any issues reported by others following deployment.
This type of transparency will assist in our own internal risk-assessment processes for software delivery, patch testing, and deployment.
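To make that concrete, here is a minimal sketch of what a machine-readable vendor test report might look like, and how it could feed an internal triage step. Every field name and value below is hypothetical; no such industry standard exists today.

```python
import json

# Hypothetical machine-readable vendor test report.
# Every field name and value is illustrative; no such industry standard exists yet.
vendor_report = json.loads("""
{
  "vendor": "ExampleSoft",
  "product": "ExampleAgent",
  "patch_id": "2024.07.19-001",
  "tested_platforms": ["Windows 11 23H2", "Windows Server 2022"],
  "test_suites_passed": 412,
  "test_suites_failed": 0,
  "known_field_issues": [],
  "rollback_supported": true
}
""")

def risk_assess(report: dict) -> str:
    """Crude illustrative triage: hold anything with failures or reported field issues."""
    if report["test_suites_failed"] > 0 or report["known_field_issues"]:
        return "hold-for-review"
    if not report["rollback_supported"]:
        return "pilot-ring-only"
    return "eligible-for-staged-rollout"

print(risk_assess(vendor_report))  # -> eligible-for-staged-rollout
```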
The second demand is for significantly better patch management tools and support.
This requires some effort. Right now, many vendors simply provide their own solution for delivering product updates. Some do support third-party tools, but the industry has not agreed a single, standardised mechanism for a vendor to provide a repository of software updates, and for patch management tools to fetch those updates.
At this point I can already hear the clamour of Linux admins ready to shout about the software & package repository solutions that have been prevalent in that world for quite some time. I hear you – believe me, I do. Perhaps it’s time for software vendors to embrace this approach in the world of closed-source commercial software too.
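Purely to illustrate the idea, here is a rough sketch of what querying a repository-style update index for a commercial product could look like. The URL, index structure, and field names are all invented for the example; no real vendor endpoint is implied.

```python
import json
from urllib.request import urlopen

# Hypothetical, apt/yum-style update index for a commercial product.
# The URL and JSON structure are invented purely for illustration.
INDEX_URL = "https://updates.example.com/exampleagent/index.json"
INSTALLED_VERSION = "7.11.18110"

def version_key(version: str) -> tuple:
    """Turn '7.11.18110' into (7, 11, 18110) so versions compare correctly."""
    return tuple(int(part) for part in version.split("."))

def fetch_update_index(index_url: str) -> list:
    """Download the vendor's update index and return its list of entries."""
    with urlopen(index_url) as response:
        return json.load(response).get("updates", [])

def pending_updates(updates: list, installed: str) -> list:
    """Filter to updates newer than what is currently deployed."""
    return [u for u in updates if version_key(u["version"]) > version_key(installed)]

if __name__ == "__main__":
    for update in pending_updates(fetch_update_index(INDEX_URL), INSTALLED_VERSION):
        print(update["version"], update["release_channel"], update["sha256"])
```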
However, we need more than just a standardised mechanism for software & update distribution. We also need a mechanism to assist with testing. After all, in any given month there can be plenty of updates we might need to test. Having to do all that testing manually would require a lot of time and effort, as well as testing platform resources.
Tools that can automate that process would be hugely beneficial!
One might imagine (for example) a solution that can fire up a VM based on a standard image, carry out pre-flight tests, apply an update, complete post-deploy testing, and then shut down – generating a report on the success (or not) of patch deployment. It must do this not just for OS updates, but for all third-party software updates.
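A rough sketch of what that control flow might look like is below. The vmctl, testctl, and patchctl commands are entirely hypothetical stand-ins for whatever hypervisor CLI and test harness you actually use; treat this as an outline of the pipeline, not a working tool.

```python
import datetime
import subprocess

def run(cmd: list) -> bool:
    """Run a command and report success/failure (False if the tool isn't installed)."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except FileNotFoundError:
        return False

def test_patch(image: str, patch_id: str) -> dict:
    """Spin up a disposable VM from a standard image, test a patch, and report."""
    vm = f"patch-test-{patch_id}"
    result = {"image": image, "patch": patch_id,
              "started": datetime.datetime.now().isoformat()}

    # 'vmctl', 'testctl' and 'patchctl' are hypothetical; substitute your own tooling.
    run(["vmctl", "create", "--from-image", image, vm])
    run(["vmctl", "start", vm])

    result["preflight_ok"] = run(["testctl", "preflight", vm])    # baseline health checks
    result["patch_applied"] = run(["patchctl", "apply", vm, patch_id])
    result["postdeploy_ok"] = run(["testctl", "postdeploy", vm])  # OS + application checks

    run(["vmctl", "destroy", vm])
    passed = result["preflight_ok"] and result["patch_applied"] and result["postdeploy_ok"]
    result["verdict"] = "pass" if passed else "fail"
    return result

if __name__ == "__main__":
    print(test_patch("win11-23h2-standard", "KB-EXAMPLE-0001"))
```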
You’ll also need to first design (and then regularly review) your approach to patch testing and (likely staggered) rollout. What is the schedule for security patches? How about bug & performance updates? How frequently will feature updates be deployed? Plus of course, how will your plan respond to critical updates and zero-day patching?
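Purely as an illustration, such a plan can be captured as a simple, reviewable data structure rather than living in someone's head. The ring names and delays below are placeholders, not recommendations.

```python
# Illustrative only: a staggered-ring rollout plan expressed as data, so it can be
# version-controlled, reviewed, and fed to tooling. All names and delays are placeholders.
ROLLOUT_PLAN = {
    "security_patches": {
        "ring0_lab":      {"delay_days": 0,  "scope": "test VMs"},
        "ring1_pilot":    {"delay_days": 2,  "scope": "IT and pilot volunteers"},
        "ring2_broad":    {"delay_days": 7,  "scope": "general estate"},
        "ring3_critical": {"delay_days": 14, "scope": "mission-critical systems"},
    },
    "feature_updates": {
        "ring0_lab":      {"delay_days": 0,  "scope": "test VMs"},
        "ring1_pilot":    {"delay_days": 14, "scope": "IT and pilot volunteers"},
        "ring2_broad":    {"delay_days": 45, "scope": "general estate"},
    },
    # Zero-day fixes compress the cadence but should still pass ring0 checks first.
    "zero_day": {
        "ring0_lab":        {"delay_days": 0, "scope": "test VMs"},
        "ring1_everything": {"delay_days": 1, "scope": "entire estate"},
    },
}
```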
Of course, this all raises an additional need: if you had a simple VM (or set of VMs) to test against, how can you keep that ‘set’ simple? It requires that you have a finite number of supported configurations in your environment, which are, as much as possible, maintained in a steady state, save for standardised patching. Any variation from those static configurations must also be documented and regularly reviewed.
To do that, we may first need to re-evaluate some of our choices. For example, for endpoints, do you currently use the Long-Term Servicing Channel (LTSC) deployment of Windows (and indeed of other software for which that approach is available)? If not, it might be time to look at this.
I am rather leaving some aspects of this unexplored. After all, it’s not just OS and software patching that needs to be considered, but in some cases firmware updates too. Testing such things raises additional complexities. Then there are other devices and systems in our environments where it’s not a simple case of spinning up VMs to check.
We also need to consider the plight of the Small Business, where the resources to carry out this kind of testing simply won’t exist. In such environments, businesses will look to MSPs to handle this workload and carry out the patch testing processes for them.
All of these changes take time. These demands won’t be met overnight. However, many will be left asking whether it is indeed too much to ask that ISVs, OS vendors, and other Big Tech firms come together to agree some standards, provide better tooling to give us back some control, and help us leverage it.
Personally, I don’t think it is.