
New testing how-tos, CI/CD office hours, and how to deal with layoffs

Gremlin

The Reliability Management Platform for high-velocity engineering teams

Published: May 14, 2024


How-tos and best practices

How to build reliable services with unreliable dependencies

Find out how to use Gremlin to proactively test a service with multiple dependencies, including how to prepare your services for dependency failures and how to ensure they can withstand losing dependencies.

How to build zone-redundant cloud instances and clusters

Learn how to prepare for availability zone outages by proactively detecting services operating in a single zone. You’ll see how Gremlin detects this reliability risk for you, how you can mitigate it using commonly available cloud computing tools, and how you can simulate zone and region outages to prove your resilience.
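As a rough illustration of what "operating in a single zone" means, the sketch below flags services whose instances all share one availability zone. It is not how Gremlin detects this risk; the `single_zone_services` function and the inventory of (service, zone) pairs are hypothetical and stand in for whatever your cloud provider's inventory reports.

```python
from collections import defaultdict


def single_zone_services(instances):
    """Return the sorted names of services whose instances all run in one zone.

    `instances` is an iterable of (service_name, zone_name) pairs.
    """
    zones_by_service = defaultdict(set)
    for service, zone in instances:
        zones_by_service[service].add(zone)
    # A service is at risk when every one of its instances shares a zone.
    return sorted(
        service for service, zones in zones_by_service.items() if len(zones) == 1
    )


# Hypothetical inventory: "cart" runs two instances, but both in the same zone.
inventory = [
    ("checkout", "us-east-1a"),
    ("checkout", "us-east-1b"),
    ("cart", "us-east-1a"),
    ("cart", "us-east-1a"),
]

print(single_zone_services(inventory))
```

Note that instance count alone does not help here: a service with many replicas in one zone is still flagged.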

How to ensure your Kubernetes Pods and containers can restart automatically

Find out how to configure Kubernetes to automatically detect and restart failed containers. You’ll learn how to set a container restart policy, how to create liveness probes, and how to test that these mechanisms work as expected when you need them.
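For readers who want to see the two settings side by side, here is a minimal sketch that builds a Pod manifest with a restart policy and an HTTP liveness probe, then prints it as JSON. The field names (`restartPolicy`, `livenessProbe`, `httpGet`, and the timing fields) are standard Kubernetes Pod fields; the helper name, Pod name, image, port, `/healthz` path, and timing values are assumptions chosen for illustration, not recommendations from the linked article.

```python
import json


def pod_with_liveness_probe(name, image, port, path="/healthz"):
    """Build a Pod manifest that restarts its container when the probe fails."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # "Always" tells the kubelet to restart the container whenever it exits.
            "restartPolicy": "Always",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                    # The kubelet restarts the container after the probe
                    # fails `failureThreshold` times in a row.
                    "livenessProbe": {
                        "httpGet": {"path": path, "port": port},
                        "initialDelaySeconds": 5,
                        "periodSeconds": 10,
                        "failureThreshold": 3,
                    },
                }
            ],
        },
    }


# Hypothetical service name and image, used only to produce sample output.
manifest = pod_with_liveness_probe("web", "example/web:1.0", 8080)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest can be saved to a file and applied as-is in a test cluster.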

How to make your services resilient to slow dependencies

The reliability discussion often ignores a significant and ever-growing part of nearly all modern software: dependencies. This blog goes over the role dependencies play in reliability, how they can fail, and how you can build resilience against unstable and unreliable dependencies.
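One common way to protect a service from a slow dependency is to bound each call with a timeout and serve a fallback when the deadline passes. The sketch below shows that idea with Python's standard library only. It is not taken from the linked post; `call_with_timeout`, `slow_dependency`, and the timing values are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool for outbound dependency calls.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, timeout_s, fallback):
    """Run fn() in a worker thread; return fallback if it is slow or fails."""
    future = _pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The dependency did not answer in time. The worker thread keeps
        # running in the background, but the caller is no longer blocked.
        return fallback
    except Exception:
        # The dependency answered with an error.
        return fallback


def slow_dependency():
    """Stand-in for a dependency that takes 0.3 seconds to respond."""
    time.sleep(0.3)
    return "fresh"


print(call_with_timeout(slow_dependency, timeout_s=0.05, fallback="cached"))
print(call_with_timeout(lambda: "fresh", timeout_s=1.0, fallback="cached"))
```

The timeout only stops the caller from waiting. It does not cancel the work already in flight, so a real service would also cap concurrency to keep slow calls from exhausting the pool.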

How to ensure your Kubernetes cluster can tolerate lost nodes

What happens when one of your nodes fails? This blog post covers node redundancy in Kubernetes, then goes into how one of Gremlin’s built-in Recommended Scenarios can help you verify your resilience.

Three roles you need for reliability success

After analyzing successful reliability programs, we found that each one was supported by three pillar roles. Find out more about these roles and their responsibilities for improving reliability.


——

Featured Article

Hitting reliability goals in the face of layoffs

It’s never easy when layoffs hit your organization. In addition to the personal impact of losing friends and coworkers from your team, those who remain are left trying to achieve the same business goals with fewer people and resources.

Unfortunately, layoffs and restructuring have become a common part of business. But you’re not alone. Your partners (including Gremlin) are here to help you navigate your new reality.

Check out this article from Principal Engineer Jeff Nickoloff for three ways to do more with less.

——

Office Hours

Upcoming: How to test zone redundancy using Gremlin

DATE: June 13th TIME: 11am PT/2pm ET

Zone failures are rare, but they still happen often enough that your systems must account for them. When an entire zone fails, many of the most common redundancy techniques fail. How do you avoid outages like these, especially if they affect an entire data center?

In this webinar, we’ll show you how to prepare for zone outages using Gremlin. In a live demo, you’ll learn how Gremlin’s built-in reliability tests and Scenarios test your services against zone failures. You’ll also learn how to customize these tests to target different zones, how to recreate an outage in a different zone from the ones your systems are running in, and how to monitor your services throughout using Health Checks.

On-Demand: How to run Chaos Engineering experiments in your CI/CD pipeline

Ad-hoc Chaos Engineering experiments are great for learning more about how your systems work, but they don’t tell you how your systems behave over time. As new features get deployed, environments change, and regressions get introduced, even the most resilient systems can gain reliability risks. QA and performance testing are already built into CI/CD - why not reliability?

In this on-demand webinar, we’ll show you how to run Chaos Engineering experiments as part of your CI/CD process. We’ll show how to use Gremlin’s REST API to trigger experiments from Jenkins, monitor active experiments, and check whether the test completed successfully or failed.
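The trigger, monitor, and check flow described above can be sketched as a small pipeline gate. This is a generic outline, not Gremlin’s API: the `start` and `get_status` callables are hypothetical wrappers you would write around the real REST calls, and the status strings `"running"`, `"succeeded"`, and `"failed"` are assumptions made for illustration.

```python
import time


def run_experiment_gate(start, get_status, poll_s=5.0, timeout_s=600.0,
                        sleep=time.sleep):
    """Start an experiment, poll until it finishes, and return True on success.

    start:      hypothetical callable that triggers the experiment, returns its ID
    get_status: hypothetical callable that maps an ID to a status string
    sleep:      injectable so the loop can be exercised without real waiting
    """
    experiment_id = start()
    waited = 0.0
    while waited <= timeout_s:
        status = get_status(experiment_id)
        if status == "succeeded":
            return True
        if status == "failed":
            return False
        # Still running: wait before polling again.
        sleep(poll_s)
        waited += poll_s
    # Treat an experiment that never finishes as a failed gate.
    return False


# Dry run with fake callables standing in for the real API wrappers.
fake_statuses = iter(["running", "running", "succeeded"])
passed = run_experiment_gate(
    start=lambda: "exp-1",
    get_status=lambda _id: next(fake_statuses),
    sleep=lambda _seconds: None,
)
print("gate passed" if passed else "gate failed")
```

In a Jenkins job, a script like this would typically exit with a non-zero status when the gate returns `False`, so that the build step fails along with the experiment.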

——


Gremlin Reliability Newsletter
