Post-Office Pattern
Abstract: This paper introduces a new enterprise integration pattern called the "Post Office" pattern, which provides a solution to the increasing complexity of establishing and maintaining connections between partners in a large integration platform. The pattern is based on the normalised environment of a post office and includes a service registration, a delivery channel, and a channel manager. By dynamically creating channels and linking them to service names, the Post Office pattern simplifies interoperability between service providers and consumers. We present the architecture of the pattern, its main components, and its advantages and disadvantages. We also discuss the implementation of the pattern and the selection of middleware products. Our results show that the Post Office pattern greatly simplifies connections between numerous partners. This paper is aimed at IT architects and development professionals who are dealing with the increasing complexity of setting up and maintaining connections between partners in a large integration platform. It also comes from work, help and advice from colleagues and friends. Many thanks to Jithu, Szymon, Pavel, Stephen and Dave.
I. Introduction
Projects in which integration is an essential part of the specification are increasingly becoming the norm. Whether in traditional B2B or C2B (web, banking, administration), data management (data mining, data mesh) or IoT projects (industrial, smart city), integration is becoming one of the key elements for the success of a project. Today, most of the integration platforms on the market have designs rooted in the beginning of this century, and the methods they use to connect partners are based on design patterns formalised 10 years ago. Most often, the connections between partners are the result of a manual definition implemented at development time. Second- and third-generation integration platforms[i] facilitate the creation and management of these connections with graphical tools. However, in the integration context of the 2020s, two main factors, among others, hinder the development, deployment, and maintenance of interoperability when using these integration platforms. These factors are scalability and diversity.
Scalability: In our recent projects such as ERP, data mesh or smart city projects, the number of applications or elements that need to be connected to a common integration platform is rapidly increasing and can include hundreds of thousands of partners, applications, or devices. It is known that the maximum number of connections between n partners in a P2P model is n*(n-1)/2, which grows on the order of n^2. This number can be significantly reduced when implementing bus-based, API-based and event-driven connection models. However, this simplification comes with a shift of P2P complexity to the integration platform, where connections between endpoints or services still have to be created, set up and managed manually.
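To make the orders of magnitude concrete, the short Python sketch below compares the P2P maximum n*(n-1)/2 with a hub model. The hub figure assumes exactly one link per partner to the platform, which is a simplification for illustration and not a number taken from our projects; the function names are hypothetical.

```python
def p2p_connections(n: int) -> int:
    """Maximum number of distinct links when each of n partners connects to every other."""
    return n * (n - 1) // 2


def hub_connections(n: int) -> int:
    """Links needed when each partner connects once to a shared platform (assumption)."""
    return n


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} partners: P2P={p2p_connections(n):,}  hub={hub_connections(n):,}")
```

The hub model removes the quadratic growth on the partner side, but every one of those links still has to be defined inside the platform, which is the manual effort discussed above.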
Diversity: Today, diversity is the norm. Integration is no longer uniform and limited to a single environment, technology or message exchange pattern (MEP). Integration encompasses a wide range of diversity[ii] including:
- Different environments (on-premises, edge, fog, cloud)
- Different message exchange patterns (request/response, notification, queues, streaming)
- Different quality of service (0-n, 1-n, 1-1)
- Different fault tolerance and availability levels (persistence, non-persistence, redundancy)
As the number of partners increases and the diversity of interoperability multiplies, the effort required to create, set up, and manage connections between partners also increases. The latest generations of integration platforms offer graphical tools to simplify the management of connections between partners. However, as the number of connections increases, manual management becomes a hindrance.
Currently, integration platforms lack a standardised method for automatically creating and configuring connections. This can lead to excessive resource consumption and a significant amount of time spent setting up and managing connections between partners. To address this problem, we have developed a new type of integration platform where connections between partners are established and configured through a standardised and automated process, which is simple and reliable for administrators and developers. The basis of this integration platform is the "Post Office" design pattern. In this paper we present this new design pattern, the "Post Office Pattern", and the approach we have taken to its development.
Please be aware that the "Post Office" pattern aims to minimize the effects of establishing many connections between partners. However, it does not address the integration challenges associated with implementing integration or business processes. The "Post Office" pattern is not a replacement for service choreography and orchestration, but rather works alongside these design patterns within an integration platform.
Inception
We sought to simplify the creation and configuration of thousands of connections by drawing inspiration from the postal service, which handles millions of users and billions of messages daily. By studying the process of sending and receiving mail, we identified key elements and principles to develop the "Post Office Pattern" for a fourth generation of integration architecture. Our goal is to make establishing connections between partners as easy for the end user as sending a letter.
Process and postal elements
The address: To send mail, you need an address that is clearly associated with a home or office. To ensure proper delivery, the Post Office requires that the envelope carry an address that clearly identifies the destination of the mail (recipient's name, number, street, postcode, city, country). This information makes the unambiguous identification of the destination comprehensible to humans. Identification by a simpler, unique identifier, such as a URL or UDDI, would of course be possible, but it would be difficult for humans to use.
The letter: Just as a letter consists of an envelope and a sheet of paper, we define a message with the same structure in 2 parts, the header, and the body. The body is the business content of the message and is never opened when routing the message. This voluntary restriction excludes routing patterns such as content-based routers, content filters and any other operations that require access to the content of the message. In return, this rule greatly improves the confidentiality of the message and, most importantly, it allows our integration platform to be neutral and support any format, including encryption and other encoding.
The second part of the message is the header, the equivalent of the envelope, where we find the destination address of our message, i.e., the name of the service. Optionally, further business and technical meta-information can be added to the header. Within our main specification, the name of the service is mandatory.
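The two-part structure described above can be sketched as a pair of small Python types. This is a minimal illustration, not our platform's actual message format: the class and field names (Header, Message, service_name, metadata) are hypothetical, and the only rules it encodes are the ones stated in the text, namely a mandatory service name and a body that routing never opens.

```python
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class Header:
    """The envelope: the only part the platform is allowed to read."""
    service_name: str                                           # mandatory "address"
    metadata: Mapping[str, str] = field(default_factory=dict)   # optional meta-information

    def __post_init__(self) -> None:
        if not self.service_name.strip():
            raise ValueError("service_name is mandatory")


@dataclass(frozen=True)
class Message:
    header: Header
    body: bytes   # opaque payload: any format, possibly encrypted, never parsed


def destination(message: Message) -> str:
    """Routing decisions use the header only; the body is not touched."""
    return message.header.service_name
```

Keeping the body as raw bytes is what lets the platform stay neutral about format and encryption, at the cost of ruling out content-based routing.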
The post-box: The post-box serves as an entry point into the postal routing system. Dropping a letter into the post-box initiates the process of routing the letter, regardless of the type of letter or parcel. Of course, for physical reasons, post-boxes cannot accommodate all types of items such as large parcels, but if the post-box were larger, it could.
The mailbox: The deposit of a letter in a mailbox is the end point of the postal journey. A mailbox is associated with a house or group of houses or offices (e.g., business postcode).
Equivalences
The second step in our thinking was to identify any correspondence between the elements used in sending and receiving a letter and the message routing in a large integration platform.
The address: When we studied the postal system, we wondered what the recipient's equivalent should be in our integration platform. At first, we thought of the equivalence home=endpoint, but found that this was limited and did not offer the flexibility of the equivalence home=service. We chose to associate an address with a service rather than a service implementation, as is the case in third-party integration platforms. Our main goal is that our integration architecture must be able to establish interoperability between partners using only the service name (identifier). The header of the message therefore carries the name of the service, similar to an address on an envelope.
Post-box and Mailbox: We have associated post-boxes and mailboxes with components named pipelines (asynchronous topics). These elements enable communication between the consumer and the service provider.
There are two pipelines: the ingestion pipeline for incoming messages and the publication pipeline for outgoing messages. When a message is pushed into the publication pipeline, it starts the message routing process (post-box). When a message arrives in an ingestion pipeline, the process is complete, like the deposit of a letter in the mailbox.
It is possible to associate multiple consumers with a pair of pipelines, just as multiple people or organisations use the same address by adding a 'postbox/P.O. Box'.
Having identified similarities between the elements of a post office and our integration platform, we assumed that users of our platform would use pipelines in a similar way to send and receive messages. This helped us understand the inputs and outputs of our platform without fully defining its design.
New Ideas
Message routing: We then sought architectures and components that could provide the same features as the postmen and mail distribution centres, and automatically route messages between pipelines within our platform without any effort from the users.
We discussed above that current integration platforms lack an automatic process for connecting partners, so we attempted to create an easy process to automatically connect partners' pipelines. However, after weeks of effort, we realised that implementing such a process within our platform would be difficult and would not result in a clear and simple process for creating connections. It would also complicate the architecture and the management of these connections. Since we did not want to repeat the mistakes of current integration platforms, we aimed to find a way to create a connection channel between partners that is as easy as sending mail between a sender and a recipient.
To address this challenge, we chose a solution that looks more complex on the surface than routing inside the integration platform. We first shift the routing concerns outside the integration platform. We assign a communication channel to each partner-service pair: for each service used by a partner, whether as a service consumer or provider, we create a communication channel between the partner and the integration platform.
By assigning a channel to a single service, we ensure that the messages sent through one channel are intended for one service and consequently we don’t have to route the messages within the integration platform, but simply select the right channel for a service.
To illustrate this concept with the post office process, our solution involves having a huge number of post-boxes in the street, one for each sender-addressee pair (meaning hundreds to thousands of post-boxes in each street). Moreover, every time you need to send mail to a new recipient, the post office would have to install a new post-box on the street. This would make the post office process very simple, since the sender already selects the right post-box for the addressee. On the other hand, this option would make life impossible for the post office and for the pedestrians in the street.
While this idea may seem unrealistic, modern technologies such as MQTT implementations (HiveMQ, Mosquitto) or middleware such as NATS.io make this implementation possible and allow the dynamic creation of thousands of communication channels almost instantly. This process is simple, lightweight and does not introduce latency in message transmission between partners once the channel is created.
By creating direct communication channels between the service providers and the integration platform, we eliminate the need for routing in the integration platform as the messages are already in the appropriate channel.
It is the responsibility of the integration platform to link the provider and consumer channels associated with the same service. This job is easily undertaken by any message broker.
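The idea of one channel per partner-service pair, with the platform only linking channels that share a service name, can be sketched as a lookup table. The sketch below is an in-memory illustration under our own assumptions: the class name ChannelTable and the "service.partner" naming scheme are hypothetical, and a real deployment would delegate this to the message broker.

```python
from collections import defaultdict


class ChannelTable:
    """One channel per (partner, service) pair; no routing rules are stored."""

    def __init__(self) -> None:
        self._by_pair: dict[tuple[str, str], str] = {}
        self._by_service: dict[str, set[str]] = defaultdict(set)

    def channel_for(self, partner: str, service: str) -> str:
        """Return the pair's channel, creating it on first use."""
        key = (partner, service)
        if key not in self._by_pair:
            name = f"{service}.{partner}"          # hypothetical naming scheme
            self._by_pair[key] = name
            self._by_service[service].add(name)
        return self._by_pair[key]

    def peers(self, partner: str, service: str) -> set[str]:
        """Channels of the other partners attached to the same service."""
        own = self.channel_for(partner, service)
        return self._by_service[service] - {own}
```

Because a channel carries messages for exactly one service, selecting the channel replaces the routing decision; the table only needs to know which channels belong to the same service.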
Creating connections: On the one hand, we explained that creating connections consumes time and resources as the number of connections increases; on the other hand, we designed an architecture where the number of connections is far higher than in second- and third-generation integration platforms. To resolve this paradox, we devised a new design pattern, which we called the "Post Office pattern", whose purpose is to create connections automatically. Combined with the externalisation of the routing effort outside the integration platform, the Post Office pattern defines a new generation of integration platform.
The post office pattern
High level architecture
The post office pattern defines three main elements: the service registry, the delivery channel, and the channel manager. The picture below describes the relationship between these three elements.
The picture below associates the partner applications, the pipeline, and the Post Office design pattern.
Definition: We use the term ‘channel’ to describe the connection through which a message passes from one partner to another. There are different types of channels, such as queues and topics, as well as other types of channels, e.g., those that generate streaming or support synchronous request/response. A channel is largely defined by the MEP (Message Exchange Pattern) it supports. Within this pattern context we call a connection a channel.
Delivery Channel
Publishing a message: The most important functions of the delivery channel when a message is published by a partner are:
- Receiving messages: The delivery channel pulls messages from the publication pipeline.
- Analysing the header: The delivery channel analyses each message header to determine which service the message is associated with. The message remains opaque to the integration platform.
- Selecting the right channel: the Delivery Channel searches for the channel associated with the service and sends the message to the partner(s) via that channel.
- Creating the channel: When a message is sent by a partner to provide or consume a service and the partner does not have a channel associated with the service, the Delivery Channel creates the channel. Thanks to modern middleware such as MQTT brokers or NATS.io, creating a channel is an easy operation, a far cry from cumbersome database connections, and it has very little effect on performance.
- Sending the message to the Channel manager: The Delivery Channel pushes the message to the channel.
Receiving a message: When a message is received from the Channel Manager, the Delivery Channel pushes the message to the ingestion pipeline.
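The publishing steps and the receiving path described above can be sketched as follows. This is a simplified, single-threaded illustration and not our implementation: InMemoryChannelManager stands in for real middleware, and all class, method and channel names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    service_name: str   # header: the only part the platform reads
    body: bytes         # opaque payload


class InMemoryChannelManager:
    """Stand-in for a broker; a real deployment would use middleware instead."""

    def __init__(self) -> None:
        self.channels: dict[str, list[Message]] = {}

    def create(self, name: str) -> None:
        self.channels.setdefault(name, [])

    def push(self, name: str, message: Message) -> None:
        self.channels[name].append(message)


class DeliveryChannel:
    def __init__(self, partner: str, manager: InMemoryChannelManager) -> None:
        self.partner = partner
        self.manager = manager
        self.publication: deque[Message] = deque()   # partner -> platform
        self.ingestion: deque[Message] = deque()     # platform -> partner
        self._channels: dict[str, str] = {}          # service name -> channel name

    def pump_publication(self) -> int:
        """Drain the publication pipeline; return the number of messages forwarded."""
        forwarded = 0
        while self.publication:
            message = self.publication.popleft()       # receive the message
            service = message.service_name             # analyse the header only
            channel = self._channels.get(service)      # select the channel
            if channel is None:                        # create it on first use
                channel = f"{service}.{self.partner}"
                self.manager.create(channel)
                self._channels[service] = channel
            self.manager.push(channel, message)        # hand over to the channel manager
            forwarded += 1
        return forwarded

    def on_channel_message(self, message: Message) -> None:
        """Called when the channel manager delivers a message for this partner."""
        self.ingestion.append(message)
```

Note that the body is passed through untouched at every step, and that channel creation happens at most once per service for a given partner.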
Creating the channel: When a message is published by a partner to provide or consume a service, and the partner does not yet have a channel associated with that service, the Delivery Channel creates the channel.
A channel between partners is defined by several properties, such as:
- Message Exchange Pattern (request-response, notification, streaming)
- Quality of service (0-n, 1-n, 1-1)
- Availability
- Security and authorisation
These properties (here we refer to the technical properties of the channel and not its business properties such as the price of use or the non-disclosure of the transported information) are applied when the channel is created by the Delivery Channel. This information is defined by the service provider and stored in a second component of our design pattern, the service registry.
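As an illustration, the technical properties listed above could be grouped into a single immutable record that the Delivery Channel reads when it creates a channel. The type and field names below are hypothetical, and the interpretation of availability as a simple persistence flag is an assumption made to keep the sketch short.

```python
from dataclasses import dataclass
from enum import Enum


class ExchangePattern(Enum):
    REQUEST_RESPONSE = "request-response"
    NOTIFICATION = "notification"
    STREAMING = "streaming"


class QualityOfService(Enum):
    ZERO_TO_N = "0-n"
    ONE_TO_N = "1-n"
    ONE_TO_ONE = "1-1"


@dataclass(frozen=True)
class ChannelProperties:
    pattern: ExchangePattern            # message exchange pattern
    qos: QualityOfService               # quality of service
    persistent: bool                    # availability, reduced to one flag (assumption)
    allowed_partners: frozenset[str]    # security and authorisation

    def authorises(self, partner: str) -> bool:
        """True if the partner may publish or subscribe on the channel."""
        return partner in self.allowed_partners
```

Making the record immutable fits the idea that an update is handled by replacing a channel rather than modifying it in place.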
Disabling channels:
In order to offer automatic management, the automatic creation of channels is necessary but not sufficient. In addition to this functionality, processes for deleting and updating connections between partners must be implemented. There are two types of deletions: those triggered by business constraints and those triggered by technical constraints. Business deletion processes can implement, for example, the end of a service subscription, the application of new security rules, etc. From a technical point of view, processes must be able to handle availability (e.g., timeouts) and performance issues (response time, saturation, etc.) and delete connections that hinder the smooth exchange of information between partners.
Since channel creation is automatic, we suggest that updating a connection could consist of deleting the existing connection and creating a new one. In the post-office pattern, the distinction between business and technical deletions is important, as the deletion is triggered by either the Service Registry or the Delivery Channel. Changes that relate to the contract between the partners are recorded in the Service Registry. The Service Registry informs the Delivery Channels and the channel about these changes. For example, if a service contract is terminated, the service consumer's access rights are revoked by deleting access to the channel through which the service is delivered. The Service Registry then informs the Delivery Channels that the connection is no longer authorised.
Technical restrictions, such as non-compliance with SLA, are detected by the Delivery Channels. Depending on the implementation of the post-office pattern, the deletion of the connection is triggered either by the Delivery Channel or by the Service Registry and is specifically performed by the Delivery Channel that manages the connections. It is important to link any business change to the Service Registry. This is one of the keys to success in implementing the post-office pattern.
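The two deletion triggers can be sketched as two small entry points that end in the same delete operation. This is one possible arrangement under the assumption that the Delivery Channel performs the deletion in both cases; the names (Reason, Channels, on_contract_terminated, on_response_time) and the single response-time threshold are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Reason(Enum):
    BUSINESS = "business"     # e.g. subscription ended, new security rule
    TECHNICAL = "technical"   # e.g. timeout, SLA breach


@dataclass
class Channels:
    """Channels managed by one Delivery Channel, keyed by (partner, service)."""
    active: set[tuple[str, str]] = field(default_factory=set)
    log: list[tuple[str, str, Reason]] = field(default_factory=list)

    def delete(self, partner: str, service: str, reason: Reason) -> bool:
        key = (partner, service)
        if key not in self.active:
            return False
        self.active.remove(key)
        self.log.append((partner, service, reason))
        return True


def on_contract_terminated(channels: Channels, partner: str, service: str) -> bool:
    """Business deletion: the Service Registry notifies the Delivery Channel."""
    return channels.delete(partner, service, Reason.BUSINESS)


def on_response_time(channels: Channels, partner: str, service: str,
                     seconds: float, sla_seconds: float) -> bool:
    """Technical deletion: the Delivery Channel detects an SLA breach itself."""
    if seconds > sla_seconds:
        return channels.delete(partner, service, Reason.TECHNICAL)
    return False
```

Recording the reason with each deletion keeps business changes traceable to the Service Registry, which the text identifies as one of the keys to a successful implementation.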
Service Registry
The service registry records the characteristics of the associated channels for each service. This information is provided by the service provider in the form of a set of business properties. One of the functions of the service registry is to map this information into technical properties that are used by the delivery channel to create the channel. This mapping reduces the coupling between the business properties and a particular channel type (Kafka, AMQP, MQTT...).
So, when the Delivery Channel receives a message, it looks for the service associated with the message. If the channel does not exist, the Delivery Channel calls the service registry to obtain the technical properties of the channel(s) associated with the service. It then creates any necessary channel(s).
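The lookup described above can be sketched as a registry that stores business properties and translates them into technical ones on demand. The vocabulary used here ("best-effort", "guaranteed", "single", "many") and the resulting technical keys are invented for the example; they are assumptions, not the property set of our platform or of any specific broker.

```python
class ServiceRegistry:
    # Hypothetical mapping from provider-facing terms to broker-facing settings.
    _DELIVERY = {
        "best-effort": {"persistent": False, "acknowledged": False},
        "guaranteed": {"persistent": True, "acknowledged": True},
    }
    _AUDIENCE = {"single": "queue", "many": "topic"}

    def __init__(self) -> None:
        self._services: dict[str, dict[str, str]] = {}
        self.lookups = 0   # counts registry calls, for illustration only

    def register(self, service: str, *, delivery: str, audience: str) -> None:
        """Called by the service provider with business-level properties."""
        if delivery not in self._DELIVERY or audience not in self._AUDIENCE:
            raise ValueError("unknown business property")
        self._services[service] = {"delivery": delivery, "audience": audience}

    def technical_properties(self, service: str) -> dict[str, object]:
        """Map the stored business properties to technical channel properties."""
        self.lookups += 1
        business = self._services[service]   # KeyError if the service is unknown
        return {"kind": self._AUDIENCE[business["audience"]],
                **self._DELIVERY[business["delivery"]]}


def ensure_channel(existing: dict[str, dict[str, object]],
                   registry: ServiceRegistry, service: str) -> dict[str, object]:
    """Ask the registry only when no channel exists yet for the service."""
    if service not in existing:
        existing[service] = registry.technical_properties(service)
    return existing[service]
```

Because the provider only ever states business properties, the same registration could be mapped to a different channel type by changing the two tables, without touching the provider.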
The channel manager
When a Delivery Channel creates a channel, it does not manage the implementation or maintenance of that channel. These tasks are handled by the channel manager, a role typically filled by middleware or a message broker. The responsibilities of the channel manager include:
- Creating channel implementations
- Facilitating the transport of messages through channels
- Managing security and access authorization for channels
- Ensuring the availability and redundancy of channels
- Enabling mapping and partitioning between channels
- Guaranteeing proper message delivery.
Channel manager implementation
To determine the best channel implementations for the channel manager, we evaluated 8 criteria on 6 products. Ultimately, we chose the NATS.io product for its low initial resource requirements (which make it suitable for Edge environments) and its powerful mapping and partitioning capabilities between channels.
Advantages and drawbacks
The most significant benefit of this new architectural pattern is that once a service's properties have been registered by the provider it automatically establishes communication channels between partners and uses the service name as the identifier to route the message to a partner. Additionally, this pattern makes it simpler to establish connections at the service provider level, eliminating the need for the provider to concern themselves with configuring the channel.
However, implementing this pattern necessitates the setup of pipelines, standardization of messages, and the establishment of a service registry. The extra work required is deemed worthwhile when the number of partners and services is substantial enough to justify the effort. This is particularly relevant in the context of smart cities, which fully justify this extra effort.
Summary
Design Pattern Name: Post Office Pattern
Problem: The number of services and partners is increasing, and creating and maintaining connections between partners requires significant time and resources.
Solution: In a normalised environment, such as the post office (pipelines, standard messages), we create a service registry that stores the properties of a connection and associates them with a service. We implement a delivery channel that dynamically creates the connection by using the properties stored in the service registry. We select a middleware product that allows the dynamic creation of channels in an easy and lightweight way.
Conclusion
We have implemented the "Post Office" pattern for the first prototypes and MVP of our integration platform. Together with the implementation of the ingestion and publication pipeline, the Post Office pattern proved its worth and greatly simplified the connections between numerous partners, allowing us to process millions of messages per second with only four virtual machines equivalent to an "a1.large AWS instance". However, some security issues were encountered in our prototypes due to the dynamic creation of channels and the resulting dynamic creation of publishing and subscription rights for the channels. This problem should be fixed in future versions of NATS.io.
It is not only in terms of performance that our post-like approach has proven effective in facilitating interoperability between service providers and consumers. Dynamically creating a channel and associating that channel with a service name has greatly facilitated interoperability between partners. In future research, we aim to facilitate the registration of service properties and improve the mapping and partitioning between channels and services.
[i] Generation 1: The first generation of integration platforms was enabled by the standardisation of exchange protocols such as CORBA, and the exchange protocol itself was the intermediary between the partners.
Generation 2: The second generation is organised around an integration platform where message routes to service endpoints are defined within the platform. Each service connects to the platform and does not need to connect to all partners using that service. (Examples: IBM WebSphere, Tibco, MuleSoft, WSO2, Dell Boomi).
Generation 3: The 3rd generation is similar to the 2nd generation, with the difference that the routing is based on a service definition rather than the implementation endpoint, which provides an additional degree of freedom in the choice of service implementation. 3rd generation integration platforms include Open ESB or Oracle SOA Suite.
[ii] We set aside the diversity of protocols (HTTP, MQTT, AMQP…), since using an integration platform (e.g., an ESB) implies a standard internal communication protocol between endpoints or services.