
June 22, 2022

Kannan Subbiah

FCA | CISA | CGEIT | CCISO | GRC Consulting | Independent Director | Enterprise & Solution Architecture | Former Sr. VP & CTO of MF Utilities | BU Soft Tech | itTrident

Published: June 22, 2022

What you need to know about site reliability engineering

What is site reliability engineering? The creator of the first site reliability engineering (SRE) program, Benjamin Treynor Sloss at Google, described it this way: “Site reliability engineering is what happens when you ask a software engineer to design an operations team.” What does that mean? Unlike traditional system administrators, site reliability engineers (SREs) apply solid software engineering principles to their day-to-day work. For laypeople, a clearer definition might be: Site reliability engineering is the discipline of building and supporting modern production systems at scale. SREs are responsible for reliability, performance, availability, latency, efficiency, monitoring, emergency response, change management, release planning, and capacity planning for both infrastructure and software. ... SREs should be spending more time designing solutions than applying band-aids. A general guideline is for SREs to spend 50% of their time on engineering work, such as writing code and automating tasks. When an SRE is on call, the remaining time should be split roughly evenly: about 25% managing incidents and 25% on operations duty.

Are blockchains decentralized?

Over the past year, Trail of Bits was engaged by the Defense Advanced Research Projects Agency (DARPA) to examine the fundamental properties of blockchains and the cybersecurity risks associated with them. DARPA wanted to understand those security assumptions and determine to what degree blockchains are actually decentralized. To answer DARPA’s question, Trail of Bits researchers performed analyses and meta-analyses of prior academic work and of real-world findings that had never before been aggregated, updating prior research with new data in some cases. They also did novel work, building new tools and pursuing original research. The resulting report is a 30-thousand-foot view of what’s currently known about blockchain technology. Whether these findings affect financial markets is out of the scope of the report: our work at Trail of Bits is entirely about understanding and mitigating security risk. The report also contains links to the substantial supporting and analytical materials. Our findings are reproducible, and our research is open-source and freely distributable. So you can dig in for yourself.

Why The Castle & Moat Approach To Security Is Obsolete

At first, the shift in security strategy went from protecting one, single castle to a “multiple castle” approach. In this scenario, you’d treat each salesperson’s laptop as a sort of satellite castle. SaaS vendors and cloud providers played into this idea, trying to convince potential customers not that they needed an entirely different way to think about security, but rather that, by using a SaaS product, they were renting a spot in the vendor’s castle. The problem is that once you have so many castles, the interconnections become increasingly difficult to protect. And it’s harder to say exactly what is “inside” your network versus what is hostile wilderness. Zero trust assumes that the castle system has broken down completely, so that each individual asset is a fortress of one. Everything is always hostile wilderness, and you operate under the assumption that you can implicitly trust no one. It’s not an attractive vision for society, which is why we should probably retire the castle and moat metaphor. In cybersecurity, though, it makes sense to eliminate the human concept of trust and treat every user as potentially hostile.
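To make the “fortress of one” idea concrete, here is a minimal sketch of a per-request check in which network location earns no trust. It assumes a hypothetical policy with three conditions: a verified identity, a compliant device, and an explicit grant for the resource. The names `Request`, `authorize`, and `GRANTS` are illustrative only and do not correspond to any vendor's API.

```python
from dataclasses import dataclass

# Hypothetical explicit grants: (user, resource) pairs that are allowed.
GRANTS = {("alice", "crm"), ("bob", "payroll")}


@dataclass(frozen=True)
class Request:
    user: str
    token_valid: bool       # identity was verified for this request
    device_compliant: bool  # device passed a posture check
    resource: str
    source_network: str     # "internal" or "external"; deliberately unused


def authorize(req: Request) -> bool:
    """Allow only when every check passes; network location is ignored."""
    if not req.token_valid:
        return False
    if not req.device_compliant:
        return False
    return (req.user, req.resource) in GRANTS


# An external request with valid credentials is allowed ...
print(authorize(Request("alice", True, True, "crm", "external")))
# ... while an "internal" request without a grant is still refused.
print(authorize(Request("alice", True, True, "payroll", "internal")))
```

The point of the sketch is that `source_network` never appears in the decision: being “inside” confers nothing, which is the break from the castle model described above.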

Improving AI-based defenses to disrupt human-operated ransomware

Disrupting attacks in their early stages is critical for all sophisticated attacks but especially human-operated ransomware, where human threat actors seek to gain privileged access to an organization’s network, move laterally, and deploy the ransomware payload on as many devices in the network as possible. For example, with its enhanced AI-driven detection capabilities, Defender for Endpoint managed to detect and incriminate a ransomware attack early in its encryption stage, when the attackers had encrypted files on fewer than four percent (4%) of the organization’s devices, demonstrating improved ability to disrupt an attack and protect the remaining devices in the organization. This instance illustrates the importance of the rapid incrimination of suspicious entities and the prompt disruption of a human-operated ransomware attack. ... A human-operated ransomware attack generates a lot of noise in the system. During this phase, solutions like Defender for Endpoint raise many alerts upon detecting multiple malicious artifacts and behavior on many devices, resulting in an alert spike.
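As a rough illustration of what spotting an alert spike could look like, the sketch below flags minutes whose alert count far exceeds a trailing baseline. It is a simple statistical threshold and an assumption for illustration, not how Defender for Endpoint works; the function name, window size, multiplier, and sample counts are all hypothetical.

```python
from statistics import mean, pstdev


def find_spikes(counts, window=5, k=3.0, min_count=10):
    """Return indexes where counts[i] exceeds the trailing baseline.

    The baseline is the mean of the previous `window` values. The threshold
    is mean + k * population standard deviation, with `min_count` as a floor
    so that quiet periods do not flag tiny fluctuations.
    """
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        threshold = max(mean(history) + k * pstdev(history), min_count)
        if counts[i] > threshold:
            spikes.append(i)
    return spikes


# Hypothetical alerts-per-minute series with a burst at positions 7 and 8.
alerts_per_minute = [2, 3, 2, 4, 3, 2, 3, 40, 70, 3]
print(find_spikes(alerts_per_minute))
```

A real detection pipeline would correlate alerts across devices and entities rather than count them, but the counting view shows why a human-operated attack is hard to keep quiet.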

Reexamining the “5 Laws of Cybersecurity”

The first law of cybersecurity is to treat everything as if it’s vulnerable because, of course, everything is vulnerable. Every risk management course, security certification exam, and audit mindset emphasizes that there is no such thing as a 100% secure system. Arguably, the entire cybersecurity field is founded on this principle. ... The third law of cybersecurity, originally popularized as one of Brian Krebs’ 3 Rules for Online Safety, aims to minimize attack surfaces and maximize visibility. While Krebs was referring only to installed software, the thinking behind this rule has expanded. For example, many businesses retain data, systems, and devices they no longer use or need, especially as they scale, upgrade, or expand. This is like that old, beloved pair of worn-out running shoes that sits in a closet. This excess can present unnecessary vulnerabilities, such as a decades-old exploit discovered in some open source software. ... The final law of cybersecurity states that organizations should prepare for the worst. This is perhaps truer than ever, given how rapidly cybercrime is evolving. The risks of a zero-day exploit are too high for businesses to assume they’ll never become the victims of a breach.

How to Adopt an SRE Practice (When You’re not Google)

At a very high level, Google defines the core of SRE principles and practices as an ability to “embrace risk.” Site reliability engineers balance the organizational need for constant innovation and delivery of new software with the reliability and performance of production environments. The practice of SRE grows as the adoption of DevOps grows because they both help balance the sometimes opposing needs of the development and operations teams. Site reliability engineers inject processes into the CI/CD and software delivery workflows to improve performance and reliability, but they also know when to sacrifice stability for speed. By working closely with DevOps teams to understand critical components of their applications and infrastructure, SREs can also learn which components are non-critical. Creating transparency across all teams about the health of their applications and systems can help site reliability engineers determine a level of risk they can feel comfortable with. The desired level of service availability, and the performance issues you can reasonably tolerate, will also depend on the type of service you support.
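One common way to turn “a level of risk” into a number is an error budget: the downtime an availability target leaves room for. The sketch below works through that arithmetic. The 30-day window and the 99.9% target are assumptions chosen for illustration, and the function names are hypothetical; the excerpt itself does not prescribe specific values.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability target `slo`."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Unspent budget in minutes; a negative value means the target was missed."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


# A 99.9% target over 30 days leaves roughly 43 minutes of downtime.
print(round(error_budget_minutes(0.999), 1))
# After 50 minutes of downtime the budget is overspent.
print(round(budget_remaining(0.999, 50.0), 1))
```

Teams that use this framing typically treat remaining budget as room to ship faster and an overspent budget as a signal to prioritize stability, which is one practical reading of trading stability for speed.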


Today's Tech Digest

