The Fragility of IT Ecosystems: Why Our Connected World Depends on Every Link

Ann-Mary Rajanayagam

Tech, Data & AI Leader, Advisor, and Innovator

Published: July 25, 2024

Background

As systems grow more complex and third-party software integrates ever more deeply with critical systems, IT departments worldwide find it increasingly hard to maintain system integrity and security.

Recently, many have encountered a particularly vexing issue: the need to manually reboot devices after unforeseen failures on Microsoft Windows systems. These disruptions highlight a broader systemic vulnerability in IT infrastructure.

CrowdStrike & Its Clients

CrowdStrike, a prominent cybersecurity company, prides itself on protecting clients from cyber threats. However, even robust products like CrowdStrike's can cause disruptions. Last week, an update to its software inadvertently led to widespread system outages, and the software's integration at the kernel level meant that the failure took down the machines it runs on.

A Bigger IT Problem: Single Points of Failure

The CrowdStrike incident underscores a significant issue in IT: single points of failure. Systems are becoming increasingly complex, and in the race to prioritize speed and convenience, resilience often takes a back seat. This recent event should serve as a wake-up call to re-evaluate our approach to system design and risk management.

The Complexity and Vulnerability of Modern IT Systems

In our interconnected world, the fragmentation of suppliers delivering a single service complicates architecture and makes risk management more challenging. A simple update, like the one involving CrowdStrike and Microsoft, can have ripple effects that impact entire sectors.
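To make the ripple effect concrete, here is a minimal sketch in Python that walks a dependency map and lists everything affected when one component fails. The service names and the `DEPENDS_ON` map are hypothetical and purely illustrative; they do not describe any real environment or the actual incident.

```python
from collections import deque

# Hypothetical dependency map: each service lists what it depends on directly.
DEPENDS_ON = {
    "checkin-kiosk": ["windows-host"],
    "payment-terminal": ["windows-host"],
    "windows-host": ["endpoint-agent"],
    "endpoint-agent": [],
    "status-page": [],
}


def blast_radius(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the map so we can walk from a component to its dependents.
    dependents = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    # Breadth-first walk outward from the failed component.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    return affected


if __name__ == "__main__":
    for component in DEPENDS_ON:
        print(component, "->", sorted(blast_radius(component, DEPENDS_ON)))
```

A component whose failure reaches most of the map is a candidate single point of failure, and the more suppliers sit behind one service, the harder that map is to keep accurate.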

The Role of CrowdStrike and Microsoft

CrowdStrike’s Responsibility

CrowdStrike’s involvement in the incident brings to light the delicate balance between cybersecurity and system functionality. While their software is designed to protect, this event shows that even protective measures can become vulnerabilities if not properly managed.

Microsoft’s Role and Culpability

Microsoft’s susceptibility to this disruption raises questions about the resilience of its platform. What could it have done differently in its architecture to prevent such widespread impact? It needs to ensure that no single point of failure can take down entire systems.

Lessons Learned

Integrated Testing

There is a critical need for more comprehensive and integrated testing across all systems. If an update has the potential to impact other software, it must be rigorously tested in environments that closely mimic production settings.
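One way to put this into practice is to release an update in stages and only widen the audience while the machines already updated stay healthy. The sketch below is illustrative only: the ring names, the `decide` function, and the 0.1% crash-rate threshold are assumptions made for this example, not a description of how CrowdStrike or Microsoft ship updates.

```python
# Hypothetical rollout rings, ordered from smallest to largest audience.
RINGS = ["staging", "canary", "early-adopters", "everyone"]


def decide(ring, hosts_updated, hosts_crashed, max_crash_rate=0.001):
    """Decide what to do after an update has been deployed to `ring`.

    Returns "halt", "done", or "promote to <next ring>".
    The default threshold of 0.1% is an arbitrary illustrative value.
    """
    # With no telemetry there is no evidence the update is safe, so stop.
    if hosts_updated == 0:
        return "halt"
    # Stop as soon as the observed crash rate exceeds the threshold.
    if hosts_crashed / hosts_updated > max_crash_rate:
        return "halt"
    position = RINGS.index(ring)
    if position == len(RINGS) - 1:
        return "done"
    return "promote to " + RINGS[position + 1]


if __name__ == "__main__":
    print(decide("staging", hosts_updated=200, hosts_crashed=0))
    print(decide("canary", hosts_updated=1000, hosts_crashed=5))
```

The staging ring here stands in for an environment that closely mimics production, so a bad update stops there instead of reaching every machine at once.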

Global Standards and Redundancies

Establishing global standards and redundancies is essential to prevent a single update from crippling sectors or the global economy. Greater collaboration is required between industry and government to ensure the issue gets the right focus and attention. Resilience must be prioritized over speed and efficiency to safeguard against unforeseen failures.

Kernel Access and Security

Giving CrowdStrike kernel access proved to be a double-edged sword. While it allowed for deeper security integration, it also created a significant vulnerability. Microsoft and other companies must reconsider such dependencies and seek safer architectures.

Moving Forward: Rethinking IT Resilience

In the aftermath of this incident, it’s clear that building more resilient platforms is crucial. While no system can be entirely free of bugs, the focus should be on minimizing the impact of these bugs and ensuring that updates do not compromise overall system integrity.

CrowdStrike’s issue highlighted the risks inherent in our current architectures. It’s a call to action for all involved in IT and cybersecurity to collaborate more closely, prioritize resilience, and rethink how we design and manage our systems in an increasingly interconnected world.

Key Takeaways

Integrated Testing: Ensure updates are thoroughly tested in realistic environments.
Global Redundancies: Implement standards and redundancies to avoid widespread impact.
Resilience over Speed: Prioritize building resilient systems even if it means sacrificing speed.
Kernel Access: Rethink the necessity and safety of granting deep system access to third-party software.

By addressing these areas, we can build a more robust and resilient IT infrastructure capable of withstanding the complexities and challenges of modern technology.
