Microservices Architectures: 3 Overlooked Considerations
Published originally on Netsil blog
Microservices architectures have become a ubiquitous industry trend because of their promise of speed, agility and scale. Like any major paradigm shift, microservices adoption introduces changes at the architectural, technical and organizational levels, and it is not uncommon to overlook some considerations in the pursuit of microservices-based applications. While containers, orchestration, automation, service definitions, etc. take up the majority of the mindshare, here we have identified 3 often overlooked but critical considerations based on our customer conversations.
An overarching takeaway is that in a microservices architecture, complexity shifts from the code to the network. Service interactions therefore become a much richer source of information about the health, performance and security of microservices applications. Netsil has leveraged this key insight and built the Application Operations Center (AOC) using service interactions as the source of truth. As we explore the considerations below, we will highlight how the AOC helps SREs and operations teams improve reliability and deliver on service-level objectives (SLOs) for their microservices-based applications.
Addressing the Common Requirements of Service-Interactions
In microservices architectures, services interact heavily with each other to fulfill transactions. Each of these service interactions requires some subset of the following common functionality:
- Connection pooling, to avoid the expensive overhead of creating a new connection for every request
- Automatic timeouts and retries for idempotent requests
- Ability to dynamically shift an interaction to another replica of a service when the initial target is experiencing latency. Unlike simple pulse-based load balancing, this goes further by leveraging richer metrics to identify the right instance of a service
- Ability to flexibly route traffic among multiple instances of services. This is useful for blue-green and canary release management
- Support for modern microservices glue protocols such as gRPC and Thrift, which are more efficient on the wire than HTTP and offer request multiplexing and pipelining
- Automatic or manual dependency modeling/mapping for documentation and quick incident response purposes
- Building, exposing and maintaining good APIs and contracts, while constantly addressing forward and backward compatibility
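To make the timeouts-and-retries item concrete, here is a minimal sketch of a retry wrapper with exponential backoff and jitter. The function name and defaults are illustrative, not any particular library's API, and the caveat from the list applies: only retry requests that are idempotent, i.e. safe to repeat.

```python
import random
import time


def with_retries(fn, attempts=3, base_delay=0.1, retriable=(TimeoutError,)):
    """Call fn, retrying on retriable errors with exponential backoff.

    Safe only for idempotent requests: a retried call must not change
    the outcome if an earlier attempt actually succeeded server-side.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retriable:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # Exponential backoff with jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

In practice this logic (and the connection pooling above it) is exactly what you would rather get from a proxy layer than reimplement in every service.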
One option is to build all of this functionality into each service. A more efficient option is to employ lightweight proxy services such as linkerd, Envoy or Traefik. From the perspective of operations teams, these additional components are critical dependencies that they should be able to visualize and monitor. The Netsil AOC has the strong advantage of automatically discovering all service interactions involving such intermediate components, without requiring any code change! Operations teams see these components and their dependencies on a real-time map, along with key performance indicators such as latency, throughput and error rates. The picture below shows the Kubernetes L7 proxy as an example of the AOC's complete visibility into hard-to-instrument intermediate services.
Fig 1: Kubernetes Pods Shown on an Application Topology
Think Shared Services Not Shared Libraries
In any microservices application, many services will need some common functions. Caches, key-value stores, distributed lock managers and service discovery are good examples of common functions used by multiple services. In the monolithic world, multiple components could simply use shared libraries for such common functions, and with handy packaging tools such as Docker or Packer, it might be tempting to bundle the shared libraries with every service that uses them. But continuing down the path of shared libraries becomes detrimental for at least two reasons:
- Managing multiple copies of shared libraries and their dependencies soon becomes a nightmare. It also runs afoul of a core principle of microservices: the bounded context of each service
- The situation becomes unsustainable when dev teams start using multiple programming languages and frameworks; it would be an absolute waste to build and maintain copies of the shared libraries in each language
The paradigm of building and using shared services can be seen in Netflix's EVCache, in the popularity of Redis and etcd as key-value stores, and in the growing use of products such as Consul for service discovery.
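As a minimal sketch of the shared-service pattern, the snippet below hides a cache behind a thin client interface that any service, in any language, could reimplement in a few lines. The endpoint and method names are illustrative (not a specific product's API), and the in-memory dict stands in for the network round trip to a real store such as Redis or Memcached; the point is that the caching logic lives in one shared, independently operated service, not in a library bundled into every consumer.

```python
class SharedCacheClient:
    """Thin client for a shared cache service.

    Each polyglot service carries only this small client; the cache
    itself is one shared service that a provider team operates. The
    dict below is a stand-in for the remote store so the sketch runs
    standalone.
    """

    def __init__(self, host="cache.internal", port=11211):
        self.endpoint = (host, port)   # illustrative endpoint, not real
        self._store = {}               # stand-in for the networked store

    def get(self, key):
        # A real client would issue a network request to self.endpoint.
        return self._store.get(key)

    def set(self, key, value, ttl_seconds=60):
        # A real client would send the TTL along with the value.
        self._store[key] = value
```

The consumer team's footprint is a few lines of client code per language, instead of a full library build in each one.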
While it is important to recognize the need for and build shared services, it is equally important to ensure their health, availability and performance. At Netsil, we've come across both providers and consumers of such shared services, who may be on the same or different teams. If you are a consumer team, quite often you don't know the internals of the service itself but are concerned about its latency, throughput and error rates. If you are a provider of shared services, you have the additional concerns of saturation and capacity.
All of these “golden signals” are available to Netsil users irrespective of whether they own the services or are merely consuming them. Since Netsil leverages service interactions as the source of truth, it can monitor and present latency, throughput, error rates and saturation without requiring any code instrumentation of these services! As an example, the picture below captures the golden signals for a shared Memcached service.
Fig 2: Golden Signals for a Memcached Service
Don’t Forget The Circuit Breakers
Failure is inevitable in a distributed system. The role of circuit breakers is to contain the failure and avoid propagating isolated failures to the entire application.
Let's say service A calls service B, which in turn calls service C, to fulfill a particular transaction. If the downstream service C starts experiencing errors or timing out, then service B, and eventually service A, will start failing. In real-world scenarios, when such failures are left unchecked they can have disastrous effects: compromising transaction integrity, causing data inconsistency and resulting in widespread outages across multiple services. If instead service B implements the circuit breaker pattern, the circuit breaker will monitor for failures of service C and, when failures exceed specified thresholds, invoke graceful error handling.
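The B-to-C scenario above can be sketched as a minimal circuit breaker. This is an illustrative toy, not a production library: after a run of consecutive failures the circuit opens and calls fail fast to a fallback, and after a reset timeout one trial call is let through (the "half-open" state). The threshold and timeout values are arbitrary defaults.

```python
import time


class CircuitBreaker:
    """Toy circuit breaker: opens after `threshold` consecutive
    failures, fails fast until `reset_timeout` elapses, then allows
    one trial call through (half-open)."""

    def __init__(self, threshold=3, reset_timeout=30.0):
        self.threshold = threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                return fallback()      # open: fail fast, don't call C
            self.opened_at = None      # half-open: allow a trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            return fallback()
        self.failures = 0              # success closes the circuit
        return result
```

With this in place, service B degrades gracefully (serving a cached or default response from `fallback`) instead of piling timed-out requests onto a struggling service C.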
While it is important for development teams to implement circuit breakers for their services, it is equally important for the operations team to get alerted on “circuit break” incidents. Netsil provides multiple important features for handling such incidents:
- A real-time auto-discovered map of the entire microservices application. Using this map, operations teams can quickly visualize and understand service dependencies.
- Operations teams can define and monitor KPIs at the service level rather than worrying about individual instances. They get alerted when services start experiencing issues, potentially before a circuit breaker even trips.
- Circuit breakers are commonly implemented using wrapper libraries such as Netflix's Hystrix. Netsil supports gathering metrics from circuit breaker functions via the standard statsd protocol: circuit breaker functions simply send incident metrics using statsd, and Netsil enables analytics and alerting workflows on those metrics.
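As an illustration of the last point, the statsd line protocol is simple enough to emit directly from a circuit breaker's failure handler. The sketch below sends a counter increment over UDP; `8125` is the conventional statsd port, but the collector address is an assumption about your deployment.

```python
import socket


def statsd_increment(metric, host="127.0.0.1", port=8125):
    """Emit a statsd counter increment, e.g. on each circuit-break event.

    The statsd line protocol is plain text: "<metric>:<value>|<type>",
    where type "c" means counter. Fire-and-forget over UDP, so a down
    metrics collector never blocks the application.
    """
    payload = "%s:1|c" % metric
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload.encode("ascii"), (host, port))
    finally:
        sock.close()
    return payload  # returned here only so the sketch is easy to inspect
```

A circuit breaker would call something like `statsd_increment("checkout.circuit_breaker.open")` (a hypothetical metric name) at the moment it trips, which is exactly the signal to alert on.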
Fig 3: Netsil Enables Service-level Monitoring to Alert on Circuit Break Conditions
Conclusion
Microservices bring a lot of value and a lot of change. We hope this blog has put the spotlight on considerations you should take into account. If your experience has surfaced other important considerations, do share them in the comments section below.
We have also highlighted the value of the Netsil Application Operations Center (AOC) for operations teams responsible for the health and performance of microservices applications. You can check out the Netsil AOC here; we look forward to engaging with you on your microservices efforts.