Building highly available (HA) and resilient microservices using Istio Service Mesh
What is high availability in microservices?
High availability systems are designed to provide continuous, uninterrupted service to the end customer by using redundant software that performs similar functions. In highly available microservices, all hosts must point to the same storage so that, if one host fails, its workload can fail over to another host without downtime. The redundant software can be installed on another virtual machine (VM) or in Kubernetes clusters across multicloud or hybrid cloud environments.
In this blog we will talk about how Tetrate helps platform architects configure Istio service mesh to enable automatic failover and achieve high availability.
Why do IT organizations need high availability in microservices?
Organizations today follow service-oriented architecture approaches, using microservices architectures to build distributed systems that span multiple workloads or multiple clouds. One of the main challenges of microservices is that services communicate with each other over the network through APIs.
Communication between these services in a distributed system can fail due to many reasons, which include:
Read the fallacies of distributed computing to see a list of assumptions an architect must consider to mitigate service outages in distributed systems.
For the above reasons, public clouds have failed multiple times. Public clouds like AWS, Azure, and Google Cloud provide a service level agreement (SLA) commitment of 99.99% uptime, which allows for about 52.6 minutes of downtime a year.
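The downtime figure follows directly from the uptime percentage. As a rough sketch (assuming an average year of 365.25 days; the function name is illustrative only), the allowed downtime for a few common SLA levels can be computed like this:

```python
# Average year length in minutes, assuming 365.25 days per year.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the minutes of downtime per year permitted by an uptime SLA."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


# Compare a few common SLA levels.
for sla in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the allowed downtime by a factor of ten, which is why redundancy and automatic failover matter more as the target rises.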
When they fail to meet SLAs, cloud providers offer service credits, but this does not prevent their customers, and consumers, from getting frustrated when their transactions cannot be completed, or when they’re unable to access applications, leading to a loss of business.
AWS, the leading public cloud player, recently experienced multiple cloud failures. Users were locked out of messaging platforms, gaming applications, and social media sites. The AWS east region outage in 2021 brought down Disney+, Netflix, and many other services. Misconfigurations to routers brought down WhatsApp, Instagram, and Facebook globally in 2021. There have been many other outages.
From an infrastructure perspective, a platform architect or enterprise architect should engineer a highly reliable and available system with redundant microservices using a service mesh, which is a communication and security services layer in your microservice setup. The idea is that each service has a proxy service (often implemented using Envoy as the proxy software), and all traffic requests and replies to and from each service go through the Envoy proxy.
If you are new to Envoy Proxy, learn what Envoy is in 5 minutes. Or, if you are interested, you can learn why Envoy-based service mesh is an integral part of cloud native applications.
In the image below (Figure A), the Envoy proxy is used both as the load balancer and as the sidecar proxy for two services (Application A and Application B). In case of failure, the service mesh can be configured to automatically redirect requests to the redundant instance of the microservice.
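To make the failover idea concrete, here is a minimal sketch of one way such a redirect could be configured in Istio: a DestinationRule that combines outlier detection with locality failover. It is not taken from the setup in this blog. The host name, rule name, regions, and threshold values are hypothetical placeholders, and the Python helper simply builds the manifest as JSON, which could then be piped to kubectl apply -f -.

```python
import json


def failover_destination_rule(host: str, from_region: str, to_region: str) -> dict:
    """Build an Istio DestinationRule manifest that fails traffic over
    from one region to another when local instances become unhealthy.

    All argument values are illustrative; adjust them to your own mesh.
    """
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "DestinationRule",
        "metadata": {"name": "application-a-failover"},  # hypothetical name
        "spec": {
            "host": host,
            "trafficPolicy": {
                "loadBalancer": {
                    "localityLbSetting": {
                        "enabled": True,
                        # Prefer from_region; send traffic to to_region
                        # once the local endpoints have been ejected.
                        "failover": [{"from": from_region, "to": to_region}],
                    }
                },
                # Locality failover relies on outlier detection to decide
                # when an endpoint is unhealthy. Thresholds are assumptions.
                "outlierDetection": {
                    "consecutive5xxErrors": 5,
                    "interval": "30s",
                    "baseEjectionTime": "1m",
                },
            },
        },
    }


# Hypothetical service host and regions, for illustration only.
rule = failover_destination_rule(
    host="application-a.default.svc.cluster.local",
    from_region="us-east",
    to_region="us-west",
)
print(json.dumps(rule, indent=2))
```

With a rule of this shape, the Envoy sidecars keep traffic within the local region while it is healthy and shift it to the redundant region only after repeated 5xx errors eject the local endpoints.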
Four steps to achieve fault-tolerant and highly available microservices using a service mesh
If you are designing a microservices application, then high availability can be achieved in four logical steps: