Pub/Sub Architecture: Benefits and Use Cases

The publish-subscribe (pub/sub) model is an architectural design pattern that provides a framework for exchanging messages between publishers and subscribers. It builds on the concepts of message queues and event brokers but is designed to be more scalable and flexible. That scalability and flexibility come from the fact that messages can move between the components of a large system without those components being aware of each other’s identity.

The pub/sub pattern emerged when organizations needed to expand and scale their information systems. In the early days of the internet, communication systems were scaled statically, that is, by adding more components that do the same thing. For example, if a telephone exchange had 100 subscribers on its network, it needed 100 input lines to handle their communication needs. Each request for a new connection meant adding a new input line. This is known as static scaling: new components are added to scale the system even when the existing components are not being used to their full capacity.

As the internet expanded and its adoption grew with rising smartphone usage, the need arose to scale systems dynamically. Returning to the previous example, if the telephone exchange were built on the cloud, the system could automatically increase or decrease server resources to handle changes in traffic. This results in efficient use of resources with minimal waste, and is called dynamic scaling.

A similar trend is seen in enterprise information systems, which today operate at internet scale with geographically distributed data centers. These systems deal with high volumes of data and become more complex due to factors like seasonal traffic spikes, data latency, and network data corruption. In governing such complex, dynamically scaling systems, the decoupled nature of the pub/sub framework comes in handy. It allows companies to manage systems of huge scale without overloading the program logic that manages the system components.

A typical information system consists of three parts. An input module sends data to the processing module in the form of a message. The processing module receives the data, processes it, and sends it to the output module, again as a message. The output module displays the data on the user’s screen.

But in the real world, an information system of reasonable scale will have multiple input and output modules to manage concurrent requests. At this scale, the problem arises of routing messages from their respective input modules to the corresponding output modules. To solve it, an addressing mechanism is needed so that the processing module can send each message to the correct receiver based on its address.
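The addressing mechanism described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ProcessingModule` class, the `register_output` and `handle` methods, and the addresses `out-a` and `out-b` are hypothetical names, and upper-casing the payload stands in for real processing.

```python
from typing import Callable, Dict, List

# Hypothetical type: an output module is anything that accepts a processed message.
OutputModule = Callable[[str], None]


class ProcessingModule:
    """Processes a message and forwards it to the output module at a given address."""

    def __init__(self) -> None:
        # Routing table: address -> output module. The processing module
        # has to know every receiver, which is the coupling discussed here.
        self._routes: Dict[str, OutputModule] = {}

    def register_output(self, address: str, output: OutputModule) -> None:
        self._routes[address] = output

    def handle(self, address: str, payload: str) -> None:
        if address not in self._routes:
            raise KeyError(f"no output module registered at {address!r}")
        processed = payload.upper()  # stand-in for real processing
        self._routes[address](processed)


# Two lists stand in for two user screens.
screen_a: List[str] = []
screen_b: List[str] = []

module = ProcessingModule()
module.register_output("out-a", screen_a.append)
module.register_output("out-b", screen_b.append)

module.handle("out-a", "hello")
module.handle("out-b", "world")
```

Every sender has to supply a valid address and the processing module has to keep the routing table current, which is manageable for a handful of modules but grows with every module added.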

However, an internet-scale system has to handle thousands of concurrent connections, sending and receiving messages across the world for globally distributed users. At such a scale, a single processing module can’t handle the load, so the load has to be divided between several processing modules. Introducing multiple processing modules increases complexity, because the input modules now have to route each message to the correct processing module. Maintaining predefined addressing between the different modules also becomes a large overhead, and attaching routing metadata to every message becomes a bottleneck. This problem requires a new way of thinking, and this is where pub/sub comes in.

Pub/sub is an architectural design pattern that enables publishers (pub) and subscribers (sub) to communicate with each other, allowing messages to flow between different system components without the components knowing each other’s identities. The publisher and subscriber rely on a message broker to pass messages between them. Messages or events are sent out by the publisher to a channel, which a subscriber can join.

It is an asynchronous messaging system that allows modules to have isolated and well-defined responsibilities. There is no need to maintain shared knowledge of the whereabouts of other modules. The input modules, or publishers, are only responsible for publishing messages; the processing module is responsible for processing the data; and the output module is responsible for displaying the output. The only thing the components need to know is the input and output channel. Publishers post their messages to the input channel, and subscribers retrieve them from the output channel.

The message broker, which is responsible for routing messages from the publisher to the subscriber, is divided into sub-sections called topics. A receiver subscribes to particular topics in the message broker to receive their messages. Topics are like virtual pathways that can be created and destroyed easily. This makes the management and administration of topics a responsibility of its own, separate from the modules, so developers do not face additional complexity when programming the modules.

So in the pub/sub pattern, publishers pass their messages to the input channel, through which each message reaches a particular topic inside the message broker. The message is processed there and placed in the relevant topic again. It is then read by the subscribers of that topic through the output channel.
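The flow of a message from publisher to topic to subscribers can be sketched as a small in-memory broker. This is a minimal illustration, not how any particular product works: the `Broker` class, its `subscribe` and `publish` methods, and the topic names are hypothetical, and delivery here is synchronous for brevity, whereas the pattern described above is asynchronous.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    """Routes each published message to every handler subscribed to its topic."""

    def __init__(self) -> None:
        self._topics: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._topics[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # The publisher names only the topic, never a subscriber.
        handlers = self._topics.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)  # number of subscribers reached


broker = Broker()
dashboard: List[str] = []
audit_log: List[str] = []

broker.subscribe("orders", dashboard.append)
broker.subscribe("orders", audit_log.append)
broker.subscribe("payments", audit_log.append)

delivered = broker.publish("orders", "order-1 created")
broker.publish("payments", "payment-1 received")
unrouted = broker.publish("refunds", "refund-1 requested")  # no subscribers yet
```

Adding a new subscriber is a single `subscribe` call and needs no change to any publisher, which is the decoupling the pattern is meant to provide.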

Let’s go through the advantages and disadvantages of the pub/sub model in detail.

Advantages of the pub/sub model:

1. It decouples the systems that need to communicate with each other by managing the delivery of messages to the right subscribers, even if one or more receivers are offline. This is known as separation of concerns: each application can focus on its core responsibility because the messaging infrastructure handles everything needed to route messages to multiple consumers.

2. It helps the publisher (message sender) and subscriber (message receiver) focus on their core processing responsibilities by managing the movement of messages between them. Compute power and storage are not wasted, since the publisher and subscriber don’t have to reserve part of their resources for sending and receiving messages.

3. It increases the reliability of the overall system, because asynchronous messaging lets applications keep running smoothly under increased load and handle failures more effectively.

4. It enables deferred or scheduled processing, because subscribers can wait until off-peak hours to pick up messages, or route and process messages according to a schedule.

5. It supports easy integration of systems that run on different platforms, are built in different programming languages, or use different communication protocols, whether they run on-premises or in the cloud.
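Two of the advantages above, delivery to receivers that are offline and deferred processing, both come from the broker holding messages until a subscriber asks for them. The sketch below illustrates the idea with one queue per subscriber. The `QueueingBroker` class, its `pull` method, and the subscriber names are assumptions made for this example, not the API of a real broker.

```python
from collections import deque
from typing import Deque, Dict, List


class QueueingBroker:
    """Keeps one queue per subscriber so messages wait until they are pulled."""

    def __init__(self) -> None:
        # topic -> subscriber id -> pending messages
        self._queues: Dict[str, Dict[str, Deque[str]]] = {}

    def subscribe(self, topic: str, subscriber_id: str) -> None:
        self._queues.setdefault(topic, {}).setdefault(subscriber_id, deque())

    def publish(self, topic: str, message: str) -> None:
        # Every subscriber gets its own copy, whether or not it is online.
        for queue in self._queues.get(topic, {}).values():
            queue.append(message)

    def pull(self, topic: str, subscriber_id: str, max_messages: int) -> List[str]:
        queue = self._queues[topic][subscriber_id]
        batch: List[str] = []
        while queue and len(batch) < max_messages:
            batch.append(queue.popleft())
        return batch


broker = QueueingBroker()
broker.subscribe("reports", "live-dashboard")
broker.subscribe("reports", "nightly-job")

broker.publish("reports", "sales-09:00")
broker.publish("reports", "sales-10:00")

# The dashboard reads immediately; the nightly job is still offline.
live = broker.pull("reports", "live-dashboard", max_messages=10)

broker.publish("reports", "sales-11:00")

# Off-peak: the nightly job comes online and catches up in one batch.
nightly = broker.pull("reports", "nightly-job", max_messages=10)
```

The publisher never checks who is online. Each subscriber chooses when and how much to read, at the cost of the broker storing the backlog in the meantime.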


Disadvantages of the pub/sub model:

1. The publisher of a message does not know whether the message has been delivered to the subscriber or what the status of the subscriber is. The same limitation applies on the subscriber side: it is not easy to gauge whether every published message has actually reached the subscriber.

2. In the basic pattern, the message broker does not notify the publisher about delivery status, so the publisher has no way to know whether delivery was successful. Obtaining that confirmation requires tighter coupling.

3. As the number of publishers and subscribers grows, the added load can cause some events to be lost in the system without reaching their intended destination.

4. Storage costs can rise when some systems remain offline for long periods, because messages have to be retained until all the intended subscribers come online and receive them.

5. A message that requires access to resources that don’t exist may cause a service instance to fail. Such messages are known as poison messages.
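To make the poison-message problem concrete, the sketch below shows a handler that fails whenever a message refers to a missing resource, together with one common mitigation that the article itself does not cover: capping redelivery attempts and setting the message aside in a dead-letter list. The message shape, `MAX_ATTEMPTS`, `RESOURCES`, and the `drain` function are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (message_id, resource_key); hypothetical shape

MAX_ATTEMPTS = 3  # assumed retry limit, not taken from the article

RESOURCES = {"customer-1": "Alice"}  # stand-in for an external resource store


def handle(message: Message) -> None:
    """Fails with KeyError when the referenced resource does not exist."""
    _, resource_key = message
    _ = RESOURCES[resource_key]


def drain(
    queue: Deque[Message],
    handler: Callable[[Message], None],
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[List[Message], List[Message]]:
    """Processes the queue, redelivering failures up to max_attempts times."""
    processed: List[Message] = []
    dead_letters: List[Message] = []
    attempts: Dict[str, int] = {}
    while queue:
        message = queue.popleft()
        message_id = message[0]
        try:
            handler(message)
        except Exception:
            attempts[message_id] = attempts.get(message_id, 0) + 1
            if attempts[message_id] >= max_attempts:
                dead_letters.append(message)  # give up: treat as a poison message
            else:
                queue.append(message)  # put it back for redelivery
        else:
            processed.append(message)
    return processed, dead_letters


queue: Deque[Message] = deque(
    [("m1", "customer-1"), ("m2", "customer-404"), ("m3", "customer-1")]
)
processed, dead_letters = drain(queue, handle)
```

Without the attempt cap, message `m2` would be redelivered indefinitely and the loop would never finish, which is the failure mode a poison message causes in a real consumer.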

Now that we know the benefits and disadvantages of the pub/sub model, here are some use cases for it:

1. An application needs to broadcast a message or information to a significant number of users. For example, the commercial operations team of a pharma company wants to share information in real time with many salespeople in the field. The salespeople record their interactions with healthcare professionals in a mobile application, and the commercial operations team can quickly send information back to them on their phones.

2. An application needs to share information with independently developed applications or services that may run on a different platform, be written in a different programming language, or use a different communication protocol.

3. An application needs to send messages to consumers whose availability requirements and uptime schedules are entirely different from the sender’s.

4. A system requires real-time event distribution. Pub/sub can make events, raw or processed, available to multiple applications across organizations for real-time processing, and it supports an event-driven application design pattern.

5. An enterprise-wide, real-time data-sharing event bus is needed. Pub/sub can distribute database updates, business events, and analytics events across the organization.

6. Data has to be replicated across many data warehouses while maintaining data integrity. Pub/sub can distribute change events to the warehouses that are supposed to hold the same data, and these events are used to maintain a view of database state and state history.

7. Notifications need to be sent to end users who may not be online or available at the same time. Because pub/sub is loosely coupled, publishers can send events in a fire-and-forget manner, knowing that the message broker will deliver each message to the end user as and when they become available.
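The data replication use case can be illustrated with a short sketch in which every warehouse subscribes to the same stream of change events and applies them in order. The event shape, the `Warehouse` class, and the `broadcast` helper are assumptions made for this example. A real deployment would also need ordering and delivery guarantees that are out of scope here.

```python
from typing import Dict, List, Tuple

ChangeEvent = Tuple[str, str, int]  # (operation, key, value); hypothetical shape


class Warehouse:
    """A replica that rebuilds its state by applying change events in order."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state: Dict[str, int] = {}        # current view of the data
        self.history: List[ChangeEvent] = []   # every event applied so far

    def apply(self, event: ChangeEvent) -> None:
        operation, key, value = event
        if operation == "upsert":
            self.state[key] = value
        elif operation == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation {operation!r}")
        self.history.append(event)


def broadcast(event: ChangeEvent, subscribers: List[Warehouse]) -> None:
    """Stand-in for the broker: delivers one event to every subscribed warehouse."""
    for warehouse in subscribers:
        warehouse.apply(event)


warehouses = [Warehouse("eu"), Warehouse("us")]
events: List[ChangeEvent] = [
    ("upsert", "sku-1", 10),
    ("upsert", "sku-2", 5),
    ("delete", "sku-1", 0),
    ("upsert", "sku-2", 7),
]
for event in events:
    broadcast(event, warehouses)
```

Because each replica applies the same events in the same order, the replicas end up with the same state, and the recorded history allows that state to be audited or rebuilt.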
