Crowdstrike: Should it be lols, or should it be sad?

We're at BILLIONS in this Crowdstrike mega debacle across the globe now. Building on over 30 years in this game, my first emotion was to ROFLCOPTER @ the stunning idiocy of a behavioural culture inspired by the Titanic. That emotion, in time, simply led to sadness. Sadness, truly, in my heart. Sure, the money, but more than that: this is what engineering has come to. After all this time, since say 1950 with the scientific and military application of mainframes, we're at de-evolution rather than ongoing mastery. The sadness is about the disturbing observation that a behavioural culture inspired by the Titanic isn't unique in any way to Crowdstrike.

The anger I have in frustration is defined in five strikes against Crowdstrike. To Anyone Actually Competent (tm), these truths are self-evident. Anyone Actually Competent (tm) would immediately perceive the context, take decisive action in rescuing the organisation from itself, protect the honour of our wider industry, and protect us all from wastefully dumb examples of ineptitude. There is no honour in these Sins, and they are most offensive to everything I am about.

Strike 1 - Ring 0. Ring zero kernel processes in the x86 architecture are privileged over ring three in user land, and require higher levels of risk management in testing and quality assurance. If it goes south, the entire OS goes south. Avoid it in the architecture at all costs, and if it can't be avoided, special treatment is obligatory.

Strike 2 - Boot-start Flagging. When you develop a kernel mode device driver AND THEN mark it as boot-start, you've added a whole new order of magnitude of risk on top, because you are forcing the Windows kernel to never drop the device driver on boot should there be a problem. This flag is suicide. If it can't be avoided and is obligatory to the capability you're delivering to customers, then this codebase is now the most dangerous, and must be the most carefully managed, nuclear weapon, one that can only be entrusted to Actually Competent (tm) persons.

Strike 3 - Uncertified Device Drivers. Self-signed device drivers are utter nonsense. Microsoft know the risks all too well, and are continually and unfairly blamed when it's the IHVs (independent hardware vendors) who, most of the time, are the root cause of stop codes that are not due to equipment failure but to something going wrong in the ring zero kernel space stack. WHQL (Windows Hardware Quality Labs), where Microsoft verify and validate, amongst other things, foreign device drivers, actually matters. Microsoft run this program for very good reason. Who in their right mind is going to run a mission critical Windows domain without the group policy option of forcing WHQL certified device drivers? Invincible Ignorance is the realm of the short termers.

Strike 4 - Shoddy Development and Testing. Perhaps the biggest failure of all in this is the failure to absolutely smash this part of their product codebase in the Crowdstrike Dev/Test efforts. The danger is rapidly self-evident: messing with ring 0 device drivers, boot-start flagging, self-signing silliness. It's self-evident this outfit is in extreme danger, where no one else is going to save them in this context. Any pathway to success here needs insightful, competent Leaders and the Dev\Test Team rallying to the cause of special treatment. Honour. Passion. The gravity of this responsibility was clearly lost on this Dev\Test Team.

None of the REAL solutions are new, and none of them are so difficult that only a few people across the planet can do them. I'm talking about around the clock, lights out, automated regression testing, using a test suite that is large and deep and that's entirely integrated with the SCCM repository, triggered on any check in. I'd want it proven, on every single release candidate, proven beyond doubt, that there is ONGOING strength in this device driver's robustness through a combination of test automation and manual testing. Specifically, I'd want extreme focus put into demonstrating that its execution is a tight show, where all formatted arguments, any inputs, are very extensively parsed and are subject to the most stringent checking and error handling. This means a lot of fuzzing and similar techniques to tease out the tricky bugs. It's simply unacceptable that UNHANDLED exceptions can occur that stop the execution. This is the nuclear weapon that must be understood, must be managed, must be controlled. In this Titanic inspired systems architecture, there can NEVER be unhandled exceptions at runtime.

Then, over time, negotiate with influence. Chip away at this madness of architectural decisions, towards permanent root cause change that breeds lasting organisational success. Find a way to integrate with the WHQL labs timings. Find a way over time to shift the architecture away from so much dependency on ring zero execution. Let staff see the honour that comes from customer praise on consistency, on bedrock service delivery. This defect is so bad that it literally takes less than forty seconds to replicate......I'm almost lost for words as to how that can happen in any organisation, given the context of what they have exposed themselves to by their ignorant decision making on design fundamentals. Forty seconds to identify it!!!
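To make the parsing and fuzzing point concrete, here is a minimal sketch in Python. It is not Crowdstrike's code or file format: the record layout (a one-byte field count, then length-prefixed fields), `EXPECTED_FIELDS`, `ParseError`, `parse_record` and `fuzz` are all hypothetical names invented for illustration. The idea it shows is that every length and count is checked before it is used, that malformed input can only ever produce one well-defined error, and that a dumb random fuzzer is cheap enough to run on every check in.

```python
import random
import struct

EXPECTED_FIELDS = 4  # hypothetical: how many fields the consuming code will index


class ParseError(ValueError):
    """The single, well-defined failure for any malformed input."""


def parse_record(data: bytes) -> list[bytes]:
    """Parse a hypothetical record: 1-byte field count, then for each field
    a 2-byte big-endian length followed by that many payload bytes."""
    if len(data) < 1:
        raise ParseError("empty input")
    count = data[0]
    # Reject a count mismatch up front instead of indexing past the end later.
    if count != EXPECTED_FIELDS:
        raise ParseError(f"expected {EXPECTED_FIELDS} fields, got {count}")
    fields = []
    offset = 1
    for _ in range(count):
        if offset + 2 > len(data):
            raise ParseError("truncated length prefix")
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        if offset + length > len(data):
            raise ParseError("field length exceeds remaining input")
        fields.append(data[offset:offset + length])
        offset += length
    if offset != len(data):
        raise ParseError("trailing bytes after last field")
    return fields


def fuzz(iterations: int = 10_000, seed: int = 0) -> int:
    """Feed random byte strings to the parser and return how many were rejected.
    Anything other than ParseError propagates and fails the run."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(iterations):
        blob = bytes(rng.randrange(256) for _ in range(rng.randrange(0, 32)))
        try:
            parse_record(blob)
        except ParseError:
            rejected += 1
    return rejected
```

Calling `fuzz()` from the automated regression suite would turn "no unhandled exceptions" into something that is checked on every build rather than assumed. A real kernel driver would be written in C or C++ and fuzzed with purpose-built tooling; the sketch only illustrates the discipline.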

Strike 5 - Bizarre Release Management and Quality Control. Ummmm, so real men and real women test in production?!!!??!!!!11111111 roflcoptering. If I were the CTO/CIO, no way would all these prior sins be anywhere near the realm of OK Guys Go Team. Yet GO FOR GREEN is the insane mantra of this crew, where the final horrific display of ineptitude is to somehow just allow all this garbage to go into PROD!!! I don't think being called to front up to the US Congress to explain your company's idiocy, which is happening now that Crowdstrike has been formally summoned, means wearing cowboy spurs on your boots as a medal of delusional pride. The irony is that, in any otherwise sane outfit, an Actually Competent (tm) Release Manager would have shifted this baseless and unworthy junk right back to Dev\Test, under threat that keyboards will become contraband and there will be a Borg Inspired assimilation of brain wiping and re-education should the silliness continue. There is ZERO gating, there is zero quality control, there are zero release criteria for PROD. How is the difference between zero and forty seconds to replicate the defect in any way not important?
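For what a release gate could look like at its most basic, here is a hedged Python sketch. Everything in it is an assumption made for illustration: the `CandidateReport` fields, the `release_blockers` function and the `min_boot_hosts` threshold are hypothetical, and do not describe any real Crowdstrike or Microsoft process. The point is only that "go to PROD" becomes a function of evidence, and a candidate that has not been proven to boot cleanly on lab machines is sent back to Dev\Test.

```python
from dataclasses import dataclass


@dataclass
class CandidateReport:
    """Hypothetical summary of the evidence gathered for one release candidate."""
    regression_passed: bool   # did the automated regression suite pass?
    fuzz_crashes: int         # unhandled failures found by fuzzing
    boot_test_hosts: int      # lab machines that installed the candidate and rebooted
    boot_test_failures: int   # of those, how many failed to come back up


def release_blockers(report: CandidateReport, min_boot_hosts: int = 10) -> list[str]:
    """Return the reasons this candidate may not ship; an empty list means go."""
    blockers = []
    if not report.regression_passed:
        blockers.append("regression suite did not pass")
    if report.fuzz_crashes > 0:
        blockers.append(f"{report.fuzz_crashes} unhandled failure(s) found by fuzzing")
    if report.boot_test_hosts < min_boot_hosts:
        blockers.append(
            f"only {report.boot_test_hosts} boot-test hosts, need {min_boot_hosts}"
        )
    if report.boot_test_failures > 0:
        blockers.append(f"{report.boot_test_failures} host(s) failed to boot")
    return blockers


if __name__ == "__main__":
    candidate = CandidateReport(
        regression_passed=True, fuzz_crashes=0, boot_test_hosts=0, boot_test_failures=0
    )
    for reason in release_blockers(candidate):
        print("BLOCKED:", reason)
```

A defect that reproduces within forty seconds of boot would show up as a boot-test failure here, provided the candidate is actually installed on at least one real machine before release, which is why the sketch treats too few boot-test hosts as a blocker in its own right.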

My very best career moments have not been things like my Australia Day award ceremony. The things etched into my heart are witnessing human beings rallying together. Having a mutual sense of honour. A mutual sense of urgency to protect. An openness where discussing problems and alternate thinking is respected and considered. When people know that countless companies start, when they know countless companies end, when they know that their work is not a right but indeed a privilege, that is when real effort is applied and real success is achieved long term. I'm decidedly SAD. SAD for the families who will suffer in job losses, SAD for the industry and public perception, SAD for the frustrating idiocy of a minority that, unless directly challenged, rots like cancer within us all. The quackery must be cast out. They do not belong amongst us.
