Modern Integration Frameworks

The burden of communication between system and application components is one of the most difficult challenges for software designers in modern distributed systems, particularly for applications built to be hosted in the cloud (cloud-native applications). The essential step in solving communication challenges, which account for the majority of the bottlenecks in any distributed program, is determining the correct protocol and how to represent this communication.

This article demonstrates the modern communication protocols and frameworks that can be used in tandem to eliminate communication bottlenecks in large, distributed business systems. The purpose is to show the advantages of each choice and how to combine them.


Several modern communication protocols are commonly used for microservices. Some of the most popular ones include:

  1. HTTP and HTTPS: These are widely used protocols for communication between microservices, especially when the services are exposed over the internet.
  2. gRPC: This is a high-performance, open-source universal RPC framework that can run in any environment. It uses HTTP/2 for transport and Protocol Buffers for serialization.
  3. Apache Kafka: This is a distributed, pub-sub messaging system that is often used for communication between microservices in a streaming data pipeline.
  4. REST: This is a popular architectural style for building web services, which is often used for communication between microservices. RESTful APIs use HTTP methods like GET, POST, PUT, and DELETE to perform operations on resources.
  5. Asynchronous messaging: This involves using message brokers (such as RabbitMQ or ActiveMQ) to send messages between microservices asynchronously. This can be useful for decoupling microservices and improving fault tolerance.
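To make item 5 concrete, here is a toy in-process publish/subscribe broker, sketched in Python for brevity (the `Broker` class and topic names are illustrative; a production system would use a real broker such as RabbitMQ or Kafka):

```python
from collections import defaultdict

class Broker:
    """Toy in-process message broker illustrating pub/sub decoupling."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The producer never references consumers directly,
        # so the two services stay decoupled.
        for callback in self._subscribers[topic]:
            callback(message)

# Two services communicate without knowing about each other.
received = []
broker = Broker()
broker.subscribe("orders", lambda msg: received.append(msg))
broker.publish("orders", {"id": 1, "status": "created"})
```

Because the producer only knows the topic name, consumers can be added, removed, or restarted without touching the producer, which is the fault-tolerance benefit mentioned above.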


Which of these protocols you choose will depend on your specific requirements and constraints. It's important to consider factors such as performance, security, and reliability when choosing a communication protocol for your microservices.

On the other hand, once the communication method is selected, software designers should consider communication optimization, to get the most out of the chosen method and to benefit from low-latency communication. Several steps can help:

  • Choose a communication protocol that is optimized for the specific needs of your system. For example, gRPC or Apache Kafka might be a good choice if you need low-latency communication. HTTP or HTTPS might be a better option if you need to expose your services over the internet.
  • Use message batching: If you need to send multiple messages between services, consider batching them together to reduce the overhead of individual requests.
  • Use a load balancer: A load balancer can help distribute incoming requests evenly across a group of backend services, which can help improve performance and reliability.
  • Use caching: If you have data that is frequently accessed by multiple services, consider using a cache to store it in memory for faster access.
  • Use asynchronous communication: If the services don't need to receive a response from each other immediately, consider using an asynchronous communication pattern, such as message-based communication or event-driven architecture. This can help reduce the overhead of synchronous communication and improve the overall performance of your system.
  • Use a service mesh: A service mesh is a layer of infrastructure that sits between your services and handles communication between them. It can provide features such as load balancing, service discovery, and observability, which can help optimize communication between your services.
  • Use compression: If you are sending large payloads between services, consider using a compression algorithm (such as gzip) to reduce the size of the data being transmitted. This can help reduce the amount of bandwidth needed and improve performance.
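As a small illustration of the batching and compression points above, the following Python sketch batches many small messages into a single gzip-compressed payload (the message shape is hypothetical):

```python
import gzip
import json

# A batch of small messages sent as one compressed payload instead of
# 100 individual requests (payload shape is illustrative).
messages = [{"sensor": i, "reading": 20.0 + i} for i in range(100)]

raw = json.dumps(messages).encode("utf-8")
compressed = gzip.compress(raw)

# Compression trades CPU for bandwidth; repetitive JSON compresses well.
assert len(compressed) < len(raw)

# The receiver reverses both steps.
decoded = json.loads(gzip.decompress(compressed))
assert decoded == messages
```

The same idea applies over HTTP by setting `Content-Encoding: gzip`; whether it pays off depends on payload size and how repetitive the data is.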


The next sections will highlight the most commonly used protocols in modern applications:

  1. gRPC
  2. REST
  3. Orleans Actor Model Framework

The sections compare these protocols and note the best use case for each of them.


gRPC

gRPC is a modern open-source high-performance Remote Procedure Call (RPC) framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication. It is also applicable in the last mile of distributed computing to connect devices, mobile applications and browsers to backend services. gRPC can use protocol buffers as its Interface Definition Language (IDL) and its underlying message interchange format.


Figure 1: gRPC Service Boundary[1]

gRPC uses HTTP/2 as the transport layer, encapsulating the communication between the server and its clients and utilizing the HTTP protocol to deliver the functionality of the services designed and provided on the server side. gRPC layers the RPC model on top of HTTP, so the API designer is not required to map services/contracts and data types onto HTTP URLs and HTTP verbs as in REST.

Figure 2: gRPC Server-Client Communication

gRPC uses an interface definition language (IDL) to expose RPC APIs, providing a simpler and more direct way to define remote procedures and clients. The IDL is also used by code generators to generate client code easily from the service definition.

The gRPC protocol stack is built on the HTTP protocol layers to provide communication between server-side services and their clients. Securing gRPC services inherits the HTTP security models, and gRPC adds features at the content layer to provide further security compliance. gRPC uses a binary payload that is efficient to create and to parse, and it exploits HTTP/2 for efficient management of connections.


Figure 3: gRPC Protocol Layers

The following describes the gRPC protocol stack layers from bottom to top:

  • TCP transport layer: Provides connection-oriented reliable data links.
  • TLS transport layer: Provides channel encryption and mutual certificate authentication. This layer is used to provide transport security using SSL certificate mode (server, server-client, client) models.
  • HTTP/2 layer: Carries gRPC. This layer features header field compression, multiple concurrent exchanges on the same connection, and flow control.
  • gRPC layer: Defines the interaction format for RPC calls via public proto definitions. For example, the following proto definition shows how to define a service called "HelloService" with one method, "SayHello", that accepts a "HelloRequest" argument and returns a "HelloResponse" data structure.

service HelloService {
  rpc SayHello (HelloRequest) returns (HelloResponse);
}

message HelloRequest {
  string greeting = 1;
}

message HelloResponse {
  string reply = 1;
}
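One reason the Protocol Buffers payload is compact is its use of base-128 varints for integer fields. The following Python sketch (not the official protobuf library) shows the encoding rule: seven payload bits per byte, with the high bit marking continuation:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a protobuf-style base-128 varint:
    7 payload bits per byte, MSB set on every byte except the last."""
    out = bytearray()
    while True:
        bits = value & 0x7F
        value >>= 7
        if value:
            out.append(bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(bits)         # final byte: high bit clear
            return bytes(out)

# Small values fit in a single byte; a textual format like JSON spends
# one byte per digit plus field-name and quoting overhead.
assert encode_varint(1) == b"\x01"
assert encode_varint(300) == b"\xac\x02"
```

Together with numeric field tags instead of string keys, this is why the binary payload is cheap to create and parse compared to JSON.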

Like any other technology, gRPC has downsides as well. The complexity of building services and generating client code is an overhead for API designers. Also, gRPC-generated code has to be incorporated into client and server build processes, which may be onerous to some, especially those used to working in dynamic languages like JavaScript or Python, where the build process, at least on development machines, may be non-existent.


Representational State Transfer (REST)

REST was introduced in Roy Fielding's 2000 dissertation [15], which presents REST as a distillation of the architectural principles that guided the standardization process for HTTP/1.1. Fielding used these principles to decide which proposals to incorporate into HTTP/1.1.

REST APIs are widely used in modern systems to cover internet integration, especially in the microservices architecture paradigm. Distributed system components expose their contracts and functionality as REST APIs to be consumed by other components, representing integration points between different system boundaries. Web systems use REST APIs heavily to support extensibility and scalability, and in multitenant distributed applications REST APIs even support system behaviour and workflow integration with the business domain itself. In particular, the REST model makes it easy to create and extend system integration points, as it relies on the HTTP protocol definition and HTTP URL structure as its service definition language.

The flexibility and wide adoption of REST APIs are a result of how they are defined and designed over the standard HTTP protocol: all browsers support it, and there is no need to install special software on the client side to consume the services. REST is designed to support client-server architecture by separating data-storage concerns from client-interface concerns. This separation enables REST's stateless nature by shifting state management to the client side rather than the server side. Without the need to maintain state between requests, the server component can quickly release resources, increasing scalability; it also simplifies implementation by relieving the server of the burden of tracking resource utilization across requests.


Figure 4: REST APIs Client-Server Architecture


The focus that the REST architectural style places on a consistent, uniform interface across all components is the primary characteristic that sets it apart from other network-based architectural styles. Relying on the uniform interface and on statelessness supports server-side scalability, because backend servers can be scaled out easily: no stateful data or sessions need to be shared with the new back-end servers. It also enables client-side caching to minimize latency for frequently used server-side data, using modern browsers' caching capabilities.

The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction.[16] To obtain a uniform interface, multiple architectural constraints are needed to guide the behaviour of components. REST is defined by four interface constraints:

  • Identification of resources
  • Manipulation of resources through representations
  • Self-descriptive messages
  • Hypermedia as the engine of application state
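The uniform-interface idea can be sketched as a toy dispatcher that manipulates every resource through the same small set of verbs (Python for brevity; the `handle` function and resource paths are illustrative, not a real framework):

```python
# In-memory resource store: every resource is identified by a URI-like
# key and manipulated only through the uniform verb set.
store = {}

def handle(method: str, resource_id: str, representation=None):
    """Dispatch a request the way a RESTful server maps verbs to
    resource operations."""
    if method == "GET":
        return store.get(resource_id)        # read a representation
    if method == "PUT":
        store[resource_id] = representation  # create or replace
        return representation
    if method == "DELETE":
        return store.pop(resource_id, None)  # remove the resource
    raise ValueError(f"unsupported method: {method}")

handle("PUT", "/rooms/1", {"name": "general"})
assert handle("GET", "/rooms/1") == {"name": "general"}
handle("DELETE", "/rooms/1")
assert handle("GET", "/rooms/1") is None
```

Because every resource obeys the same interface, intermediaries such as caches and load balancers can operate on any request without knowing the application's semantics.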


gRPC vs REST

The next table compares REST and gRPC on four characteristics: protocol, message format, code generation, and supported communication directions.

Table 1: Comparison between gRPC and REST

  • Protocol: gRPC runs over HTTP/2; REST typically runs over HTTP/1.1.
  • Message format: gRPC uses Protocol Buffers, a binary format; REST typically uses JSON or XML, which are textual.
  • Code generation: gRPC has native client/server code generation from the proto definition; REST relies on third-party tooling such as OpenAPI.
  • Communication direction: gRPC supports unary, server-streaming, client-streaming, and bidirectional-streaming calls; REST supports request-response only.

Distributed Systems Architecture Evolution

Cloud systems and large-scale enterprise systems introduced over the last ten years have driven an evolution toward decoupling, in an attempt to control and sustain the pillars of high resiliency and low latency. The number of services in large-scale systems tends to be substantial: Table 2 demonstrates the services footprint for three enterprise companies that have implemented cloud-native techniques. Consider the speed, agility, and scalability they have achieved.

Table 2: Enterprise Services Landscape for Large Systems

Multilayer system design architecture was introduced to organize system boundaries and add control layers over those systems, easing the operational scale and maintenance of the different services in each system layer. This also created the need, and the opening, for a new type of framework, such as Orleans, to be utilized within the internal service layers: applying the actor model to handle the intercommunication between the atomic workers and to deal with those services in a more managed, organized way.

The next section will demonstrate the Orleans framework and how it can be utilized within distributed applications.


Orleans

Orleans is a cross-platform framework for building robust, scalable distributed applications. Distributed applications are defined as apps that span more than a single process, often beyond hardware boundaries, using peer-to-peer communication. Orleans scales from a single on-premises server to hundreds or thousands of distributed, highly available applications in the cloud. Orleans extends familiar concepts and C# idioms to multi-server environments. Orleans is designed to scale elastically: when a host joins a cluster, it can accept new activations; when a host leaves the cluster, either because of scale-down or a machine failure, the previous activations on that host will be reactivated on the remaining hosts as needed. An Orleans cluster can be scaled down to a single host. The same properties that enable elastic scalability also enable fault tolerance: the cluster automatically detects and quickly recovers from failures.

One of the primary design objectives of Orleans is to simplify the complexities of distributed application development by providing a common set of patterns and APIs. Developers familiar with single-server application development can easily transition to building resilient, scalable cloud-native services and other distributed applications using Orleans. For this reason, Orleans has often been referred to as "Distributed .NET" and is the framework of choice when building cloud-native apps.

Orleans invented the concept of virtual actors. Actors are purely logical entities that always exist, virtually. An actor cannot be explicitly created nor destroyed, and its virtual existence is unaffected by the failure of a server that executes it. Since actors always exist, they are always addressable.

Orleans names its actors grains. Grains are grouped into silos; every silo has a separate host, and silos are wrapped by HTTP APIs to provide communication between silos and to ease scalability and load levelling across different hosts.

The grain is one of several Orleans primitives. In terms of the actor model, a grain is a virtual actor, and it is the fundamental building block of any Orleans application. Grains are entities comprising a user-defined identity, behaviour, and state. Consider the following visual representation of a grain:

Figure 5: Orleans virtual actor (grain)

Grain state can be persisted in a database or in memory; Orleans provides wide options for integrating with different persistence providers (queues, service bus, databases, file systems). The grain state stays in memory while the grain is active. Once the grain becomes inactive, the state is stored in the persistence provider configured in the grain's implementation, which leads to low latency and less load on the data store.

Every grain has an atomic lifecycle, represented in Figure 6. Instantiation of grains is performed automatically, on demand, by the Orleans runtime. Grains that aren't used for a while are automatically removed from memory to free up resources. The Orleans runtime manages the grains' lifecycle: it is responsible for activating and deactivating grains, and for placing and locating them as needed based on hardware resources and cluster-management orchestration.


Figure 6: Grain lifecycle
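The activation-on-demand behaviour described above can be sketched as follows (a toy Python model, not the Orleans runtime; class names are illustrative):

```python
class Grain:
    """Toy grain: a user-defined identity plus in-memory state."""
    def __init__(self, grain_id):
        self.grain_id = grain_id
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

class Runtime:
    """Sketch of virtual-actor addressing: a grain reference always
    resolves, and activation happens lazily on first use."""
    def __init__(self):
        self._active = {}

    def get_grain(self, grain_id):
        # Activate on demand -- callers never create grains explicitly.
        if grain_id not in self._active:
            self._active[grain_id] = Grain(grain_id)
        return self._active[grain_id]

    def deactivate(self, grain_id):
        # An idle grain is evicted; its identity still exists virtually.
        self._active.pop(grain_id, None)

runtime = Runtime()
assert runtime.get_grain("user/42").increment() == 1
runtime.deactivate("user/42")
# The same identity is addressable again after deactivation. State is
# lost here because this sketch has no persistence provider, which is
# exactly the gap that Orleans grain persistence fills.
assert runtime.get_grain("user/42").increment() == 1
```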

Orleans added support for streaming extensions to the programming model. Streaming extensions provide a set of abstractions and APIs that make thinking about and working with streams simpler and more robust. Streaming extensions allow developers to write reactive applications that operate on a sequence of events in a structured way. The extensibility model of stream providers makes the programming model compatible with and portable across a wide range of existing queuing technologies, such as Event Hubs, Service Bus, Azure Queues, and Apache Kafka. There is no need to write special code or run dedicated processes to interact with such queues.

Figure 7: Grain Streaming in Orleans



public async Task OnHttpCall(DeviceEvent deviceEvent)
{
    // Post data directly into the device's stream.
    IStreamProvider streamProvider =
        GrainClient.GetStreamProvider("MyStreamProvider");

    IAsyncStream<DeviceEventData> deviceStream =
        streamProvider.GetStream<DeviceEventData>(
            deviceEvent.DeviceId, "MyNamespace");

    await deviceStream.OnNextAsync(deviceEvent.Data);
}

Orleans Streaming Code Sample


Indexing and queries in Orleans are not efficient, because of the isolation of grains and the atomic lifecycle of each grain. The following problems are therefore considered downsides of Orleans:

  • Cross-grain querying is expensive and not recommended.
  • The solution requires a CQRS-style approach (separating reads from writes), which adds overhead to the implementation.

To overcome these problems, there are two options:

  • Export de-normalized, query-optimized data to external stores, such as SQL or Cosmos DB.
  • Use an aggregator grain that knows about all the grains that can be queried, leveraging inter-grain communication. A real-world implementation would have one aggregator per silo, fronted by one cross-silo aggregator.
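The aggregator-grain option can be sketched as follows (a toy Python model with hypothetical names, not real Orleans grains):

```python
class WorkerGrain:
    """Toy queryable grain holding a small piece of state."""
    def __init__(self, grain_id, value):
        self.grain_id = grain_id
        self.value = value

    def query(self):
        return {"id": self.grain_id, "value": self.value}

class AggregatorGrain:
    """Toy per-silo aggregator: workers register with it, and it fans a
    query out to every known worker, so callers avoid ad-hoc
    cross-grain queries."""
    def __init__(self):
        self._known = []

    def register(self, grain):
        self._known.append(grain)

    def query_all(self):
        return [g.query() for g in self._known]

agg = AggregatorGrain()
agg.register(WorkerGrain("g1", 10))
agg.register(WorkerGrain("g2", 20))
assert agg.query_all() == [{"id": "g1", "value": 10},
                           {"id": "g2", "value": 20}]
```

In the real pattern, a cross-silo aggregator would fan out to one such aggregator per silo in the same way.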


Showcase: MMA Chat App

Overview: MMA Chat stands for Minimal Message Application. This application shows how the three technologies mentioned above can be used in a real-world implementation and provides the capability to compare those technologies side by side.

MMA provides a set of features below:

  1. Send Messages
  2. Join a Chat Room
  3. Leave a Chat Room
  4. Create a Chat Room
  5. Sync Message History
  6. Show Room Members

The architecture is built using a client-server approach in the C# programming language.

For full access to the code sample, check MMA Chat App [https://github.com/agmabrouk/MMA-Chat-App].


References:

1. gRPC Official Website: https://grpc.io/
2. gRPC in Java: https://blog.j-labs.pl/grpc-in-java
3. gRPC vs OpenAPI: https://medium.com/apis-and-digital-transformation/openapi-and-grpc-side-by-side-b6afb08f75ed
4. API Design: https://cloud.google.com/apis/design/resources
5. Web API Design Guidelines: https://pages.apigee.com/web-api-design-register.html
6. Slack API, a great example of an RPC API design: https://api.slack.com/methods#conversations
7. Perun API, another RPC API: https://perun-aai.org/documentation/technical-documentation/rpc-api/index.html
8. JSON vs Protobuf: https://www.bizety.com/2018/11/12/protocol-buffers-vs-json
9. Textual vs Binary Data: https://medium.com/better-programming/use-binary-encoding-instead-of-json-dec745ec09b6
10. HTTP/1.1 vs. HTTP/2: https://www.digitalocean.com/community/tutorials/http-1-1-vs-http-2-what-s-the-difference
11. HTTP/2's effect on gRPC: https://dev.to/techschoolguru/http-2-the-secret-weapon-of-grpc-32dk
12. Using gRPC & Protobuf: https://www.kabisa.nl/tech/sending-data-to-the-other-side-of-the-world-json-protocol-buffers-rest-grpc/
13. gRPC vs OpenAPI vs REST APIs: https://cloud.google.com/blog/products/api-management/understanding-grpc-openapi-and-rest-and-when-to-use-them
14. gRPC Technology White Paper: https://www.h3c.com/en/Support/Resource_Center/EN/Home/Switches/00-Public/Trending/Technology_White_Papers/gRPC_Technology_White_Paper-6W100/
15. Roy Thomas Fielding, "Architectural Styles and the Design of Network-based Software Architectures": https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf
16. Orleans Overview: https://learn.microsoft.com/en-us/dotnet/orleans/overview
