Tips to help you avoid your worst reliability nightmares

How-tos and best practices

Best Practices for Testing Zone Redundancy

Zone redundancy is an essential part of designing resilient, reliable architectures, but how do you go from believing that you’re resilient to knowing that your systems can handle a zone going down?

In this blog from Gremlin Principal Engineer Sam Rossoff, you’ll learn best practices for setting up zone redundancy and how to verify that redundancy using Fault Injection testing.

Three serverless reliability risks you can solve today using Failure Flags

Fault Injection is a method of testing a system’s resilience by creating controlled failures. Most Fault Injection tools require an agent running on a host, but the design of serverless platforms makes this approach impossible.

Gremlin Failure Flags is a code-level Fault Injection solution that injects faults directly into your applications. In this blog, you’ll learn how it can help you uncover and address three common reliability risks in serverless applications.
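To make the idea of code-level fault injection concrete, here is a minimal Python sketch. It is not the Gremlin Failure Flags SDK: the names `ACTIVE_FAULTS`, `fault_point`, `InjectedFault`, and `handler` are all hypothetical, and a real tool would fetch experiment definitions from a control plane rather than an in-memory dictionary. It only illustrates the general pattern of placing an injection point inside application code.

```python
import time

# Hypothetical in-memory registry of active faults, keyed by flag name.
# Assumption: a real tool would fetch this from a control plane instead.
ACTIVE_FAULTS = {}


class InjectedFault(Exception):
    """Raised when an active fault tells the injection point to fail."""


def fault_point(name, sleep=time.sleep):
    """Apply the fault registered under `name`, if any; otherwise do nothing."""
    fault = ACTIVE_FAULTS.get(name)
    if fault is None:
        return  # no experiment running: normal behavior
    if "latency_seconds" in fault:
        sleep(fault["latency_seconds"])  # simulate a slow dependency
    if "error" in fault:
        raise InjectedFault(fault["error"])  # simulate a failed dependency


def handler(event, sleep=time.sleep):
    """Hypothetical serverless-style handler with one injection point."""
    try:
        fault_point("fetch-profile", sleep=sleep)
        return {"status": 200, "body": "profile for " + event["user"]}
    except InjectedFault:
        # Fallback path that the experiment is meant to exercise.
        return {"status": 503, "body": "profile unavailable, try again later"}
```

With no entry in `ACTIVE_FAULTS`, the handler behaves normally. Registering a fault under `"fetch-profile"` exercises the fallback path without needing an agent on the host, which is the property that matters on serverless platforms.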

Interpreting your reliability test results

Gremlin’s default suite of reliability tests analyzes critical functions of modern services: scalability, redundancy, and resilience to dependency failures. Services that pass this suite of tests can be trusted to remain available during unexpected incidents. But what happens when a service fails a test? How do you take failed test results and turn them into actionable insights?

This blog aims to answer that question. We’ll walk through all seven tests in the Gremlin Recommended Test Suite and explain what each one tests, what happens if your service fails, and what actions you can take to turn that failure into success.

——

Customer Webinar On-Demand

How Visa Cross-Border Solutions Reduces Outages by Testing System Resilience in Their SDLC

ON-DEMAND

In this Gremlin-hosted webinar, Chris Kempster, a Sr. NFT Engineer at Visa Cross-Border Solutions, shares their journey from early Chaos Engineering experiments to integrating reliability test suites into their staging environments and build pipelines.

Chris will share the lessons he’s learned and best practices for building an effective testing process—and rolling it out across the organization.

WATCH NOW

——

Office Hours

Upcoming! Integrating Gremlin with your observability tools

DATE: November 14th
TIME: 11am PT/2pm ET

To get the most value out of Chaos Engineering and reliability testing, you need a way to observe your service’s behavior. Observability tools offer insight into how your systems are performing, but observability on its own isn’t enough. You need a way to monitor your systems while testing their reliability so you can determine whether your service passed or failed a test.

In this Office Hours session, we’ll show you how to connect Gremlin to your observability tool via Health Checks. We’ll also discuss which metrics to choose when creating Health Checks, and why they’re important for reliability.

Have questions about observability and Gremlin? Just reply to this email and we’ll make sure to cover them in the live Q&A portion of the webinar.

REGISTER HERE

How to test serverless applications using Failure Flags

ON-DEMAND

Serverless platforms are ideal for deploying scalable applications without having to manage infrastructure, but that same lack of infrastructure access makes it difficult to test their reliability.

Failure Flags is Gremlin’s answer to serverless reliability. In this Office Hours session, we’ll show you how to run Chaos Engineering experiments on AWS Lambda functions. You’ll see how you can safely inject faults directly into your applications, how to scope experiments from individual functions to entire availability zones, and even how to create your own custom faults.

Have questions about Failure Flags? Just reply to this email and we’ll make sure to cover them in the live Q&A portion of the webinar.

WATCH NOW

——
