Event-driven architecture (EDA) pattern
Mukesh Kumar, MachineLearningProfessionalBigDataExpert
Chief Technology Officer (AI Architect and Specialist)
I used to be a staunch advocate of the layered architecture in microservices, largely owing to my extensive experience constructing designs within this framework. However, I've come to realize that my perspective might have been influenced by my exposure to layered architecture alone. That's why I've made a conscious decision to pivot my focus towards exploring event-driven architecture and its significance in today's dynamic market.
To make the comparison concrete, this article looks at two prevalent architectural patterns: layered architecture and event-driven architecture. By comparing their characteristics and merits, I hope to shed light on their differences and advantages.
Stay tuned as I gradually construct a proof-of-concept (POC) and upload the code to GitHub for easy reference. We'll explore the intricacies of both architectures, starting with their foundations and gradually moving towards a deeper understanding of their workings and implications.
Layered architecture
This pattern is quite common in software development. As indicated by the name, in this pattern, the code is implemented in layers. Having this layering enables the implementation of "separation of concerns". This is a fancy way of saying that each layer focuses on doing a few things well and nothing else. Having this separation of concerns allows us to optionally run each of these layers on separate servers and therefore allows us to run each layer on hardware that is optimized for that task.
The topmost layer communicates with users or other systems. The middle layer handles the business logic and routing of requests, and the bottom layer's responsibility is to ensure that data is permanently stored, usually in a database.
Having this separation of concerns, or individual duties for each layer, allows us to focus on the most important properties for each layer. For example, in the presentation layer, accessibility and usability are going to be important considerations, whereas in the persistence layer, data integrity, performance, and privacy may be more important. Some factors will be important regardless of the layer. An example of a ubiquitous concern is security. Keeping these concerns separate also means that teams do not need personnel who are experts in many technologies at once. With this pattern, we can hire UI experts for the presentation layer and database administrators for the persistence layer. It also provides a clear delineation of responsibilities. If something breaks, it can often be isolated to a layer, and once it is, you can reach out to the owner of that layer.
From a security standpoint, a layered architecture offers certain advantages over a more monolithic architecture. In a layered architecture, we normally place only the presentation layer services in a public subnet of the VPC and place the rest of the layers in private subnets. This ensures that only the presentation layer is exposed to the internet, minimizing the attack surface.
If a hacker wanted to use the database in an unauthorized manner, they would have to find a way to penetrate through the presentation layer and the business logic layer to access the persistence layer. This by no means implies that your system is impenetrable. You should still follow security best practices, and you may even want to hire a white hat group to attempt to penetrate your system. An example of an attack that could still happen in this architecture is a SQL injection attack.
Another advantage of having a layered architecture is gaining the ability to swap out a layer without having to make modifications to any of the other layers. For example, you may decide that AngularJS is no longer a good option for the presentation layer and instead you want to start using React. Or you may want to start using Amazon Aurora Postgres instead of Oracle. If your layers were truly independent, you would be able to convert the layers to the new technology without having to make modifications to the other layers.
In this architecture, each layer takes a set of functions and specializes in performing these functions. This is the breakdown of the separation of duties:
User Interface (UI) layer:
The user interface layer mostly manages the interface between the user (normally a human) and the rest of the application. There may be some validation performed at this layer, but the main purpose of this layer is the user experience.
API layer:
Interfacing through the UI layer may be only one of many ways to communicate with the rest of the application. For example, there may be a way to do a batch upload of data. There may be integration with other applications that have their own UI. The application may be integrated with IoT devices that generate data and don't have an interface. In addition, the application may have a variety of UIs (desktop-based, browser-based, mobile app, and so on). For all these reasons, it's a good idea to have a well-defined API layer that can be used by all these integration points.
Business Logic layer:
This layer would contain and execute the business rules of the application. An example would be any calculations that need to be performed. More specifically, in finance, "Assets minus Liabilities must always equal Equity". This would be a rule or a calculation that it would make sense to implement in this layer.
Data Access layer:
Most applications require a layer where data can be persisted. This persistence mechanism can take many forms: files, graph databases, traditional RDBMS databases, and so on. In addition, we may use a combination of these storage methods.
Just because you are using a layered approach, it does not mean that your application will be bug-free or easy to maintain. It is not uncommon to create interdependencies among the layers.
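To make the separation of duties above a little more concrete, here is a minimal sketch of three of the layers wired together. All class and method names (ApiLayer, BusinessLogicLayer, DataAccessLayer, post_balance_sheet) are hypothetical, the in-memory dictionary stands in for a real database, and the numeric status codes are only illustrative. The business rule is the accounting example mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class BalanceSheet:
    assets: float
    liabilities: float
    equity: float


class DataAccessLayer:
    """Persistence layer: an in-memory dict stands in for a real database."""

    def __init__(self):
        self._rows = {}

    def save(self, company, sheet):
        self._rows[company] = sheet

    def load(self, company):
        return self._rows[company]


class BusinessLogicLayer:
    """Business rules live here and nowhere else."""

    def __init__(self, data_access):
        self._data = data_access

    def record(self, company, sheet):
        # Rule from the text: Assets minus Liabilities must always equal Equity.
        if abs(sheet.assets - sheet.liabilities - sheet.equity) > 1e-9:
            raise ValueError("Assets minus Liabilities must equal Equity")
        self._data.save(company, sheet)


class ApiLayer:
    """Single entry point shared by every UI, batch job, or device."""

    def __init__(self, logic):
        self._logic = logic

    def post_balance_sheet(self, company, assets, liabilities, equity):
        try:
            self._logic.record(company, BalanceSheet(assets, liabilities, equity))
            return {"status": 201}
        except ValueError as err:
            return {"status": 422, "error": str(err)}


store = DataAccessLayer()
api = ApiLayer(BusinessLogicLayer(store))
ok = api.post_balance_sheet("acme", assets=100.0, liabilities=40.0, equity=60.0)
bad = api.post_balance_sheet("acme2", assets=100.0, liabilities=40.0, equity=70.0)
```

Because the API layer only talks to the business logic layer, and the business logic layer only talks to the data access layer, swapping the dictionary for a real database would, in this sketch, only touch DataAccessLayer.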
Event-driven architecture
Event-Driven Architecture (EDA) is another pattern commonly used when implementing microservices. When the event-driven pattern is used, creating, messaging, processing, and storing events are critical functions of the service. Contrast this with the layered pattern we just visited, which is more of a request/response model and where the user interface takes a more prominent role in the service. Another difference is that layered architecture applications are normally synchronous whereas an EDA relies on the asynchronous nature of queues and events.
More and more applications are being designed using EDA from the ground up. Applications using EDA can be developed using a variety of development stacks and languages. Event-driven architecture is a programming philosophy, not a technology or a language. EDA facilitates code decoupling, making applications more robust and flexible. At the center of EDA is the concept of events. Let's spend some time understanding what they are.
Understanding events
To better understand the event-driven pattern, let's first define what an event is. An event is a change in state in a system. Examples of changes that could be events are the following:
• A modification of a database
• A runtime error in an application
• A request submitted by a user
• An EC2 instance failing
• A threshold being exceeded
• A sensor in an IoT system recording a temperature of 20 degrees
• A code change that has been checked into a CI/CD pipeline
Hopefully, this list demonstrates that many changes can be an event. But not all changes are events. A change becomes an event when we decide to make it an event. Consider this example: October 13 might be just any date on the calendar, and the calendar changing from October 12 to October 13 is just a change of date. But if October 13 happens to be your birth date, then the change becomes important and it is now an event. In the next section, we'll discuss two other critical elements in event-driven architecture: the concept of producers and consumers.
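The idea that a change only becomes an event when we decide it matters can be sketched in a few lines of Python. The Event class, the to_event function, the event name, and the 20-degree threshold are all assumptions made for illustration; they echo the IoT temperature example in the list above rather than any particular framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    """An immutable record of a state change we chose to care about."""
    name: str
    payload: dict
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


THRESHOLD_CELSIUS = 20.0  # hypothetical threshold, chosen for this sketch


def to_event(sensor_id: str, celsius: float) -> Optional[Event]:
    """Promote a reading to an event only if we have decided it matters."""
    if celsius >= THRESHOLD_CELSIUS:
        return Event(
            "temperature_threshold_reached",
            {"sensor": sensor_id, "celsius": celsius},
        )
    return None  # still a change in state, but not one we treat as an event


readings = [18.5, 19.9, 20.0, 21.3]  # made-up sensor readings
events = [e for e in (to_event("sensor-1", r) for r in readings) if e is not None]
```

All four readings are changes in state, but only the two at or above the threshold are turned into events.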
Producers and consumers
Events by themselves are useless. If a tree falls in the forest and no one is around to hear it or see it fall, did it really fall? The same question is appropriate for events. Events are worthless if someone is not consuming them, and in order to have events, producers of the events are needed as well. These two actors are two essential components in event-driven architecture.
Event-driven architecture is a loosely coupled architecture. Producers of events are not aware of who is going to consume their output, and consumers of events are not aware of who generated the events. Let's now learn about two popular types of models designed around event-driven architecture.
Event-driven architecture models
There are a couple of ways to design an event-driven model. One of the main design decisions that needs to be made is whether events need to be processed by only one consumer or by multiple consumers. The first case is known as the competing consumers pattern. The second is most commonly known as the pub/sub pattern. EDA can be implemented using these two main patterns. Depending on the use case, one pattern may be a better fit than the other. Let's learn more about these two models.
Event streaming (message queuing model)
In the event streaming model, events are "popped off" the queue as soon as one of the consumers processes the message. In this model, the queue receives a message from the producer and the system ensures that the message is processed by one and only one consumer.
Event streaming is well suited for workloads that need to be highly scalable and can be highly variable. Adding capacity is simply a matter of adding more consumers to the queue, and we can reduce capacity just as easily by removing some of the consumers (and reducing our bill). In this architecture, it is extremely important that messages are processed by only one consumer. To achieve this, as soon as a message is allotted to a consumer, it is removed from the queue (or, in some queue services, hidden from the other consumers). The only time that it will be placed back in the queue is if the consumer of the message fails to process the message and it needs to be reprocessed.
Use cases that are well-suited for this model are those that require that each message be processed only once but the order in which the messages are processed is not necessarily important.
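The competing consumers behavior described above can be sketched with Python's standard queue and threading modules. This is a single-process simulation under stated assumptions, not a real message broker: the consumer and run functions, the fail_first set (used to simulate one failed attempt), and the consumer names are all hypothetical.

```python
import queue
import threading

SENTINEL = object()  # tells a consumer to shut down


def consumer(name, work, processed, lock, fail_first):
    while True:
        message = work.get()  # get() hands each message to exactly one consumer
        if message is SENTINEL:
            work.task_done()
            return
        with lock:
            should_fail = message in fail_first
            fail_first.discard(message)  # fail only on the first attempt
        if should_fail:
            work.put(message)  # processing failed: place it back in the queue
        else:
            with lock:
                processed.append((name, message))
        work.task_done()


def run(messages, consumer_count=3, fail_first=()):
    work = queue.Queue()
    processed, lock = [], threading.Lock()
    fail_first = set(fail_first)
    for message in messages:
        work.put(message)
    threads = [
        threading.Thread(
            target=consumer,
            args=(f"consumer-{i}", work, processed, lock, fail_first),
        )
        for i in range(consumer_count)
    ]
    for thread in threads:
        thread.start()
    work.join()  # wait until every message, including requeued ones, is handled
    for _ in threads:
        work.put(SENTINEL)
    for thread in threads:
        thread.join()
    return processed


results = run(range(10), consumer_count=3, fail_first={3})
```

Which consumer handles which message varies from run to run, and so does the processing order, but every message ends up processed exactly once, including message 3, which is requeued after its simulated failure. Scaling up or down is a matter of changing consumer_count.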
Publish and subscribe model (pub/sub model)
The publish and subscribe messaging model is suited for use cases in which more than one consumer needs to receive messages. An example of this is a stock price service. Typically, many market participants are interested in receiving prices in real time on a topic of their choosing (here, the topics are the individual tickers). In such a system, the sequence in which order tickets are received is incredibly important. If two traders place an order to buy a stock at the same price, it is critical that the system process the order that was received first. If it doesn't, the market maker might face accusations of front-running trades.
In this model, many publishers push events into a pub/sub cache (or queue). The events can be classified by topic, and subscribers can subscribe only to the topics they are interested in and ignore the rest. Subscribers listen to the queue and check for events being placed in it. Whenever events make it to the queue, the subscribers notice them and process them accordingly. Unlike the model in the previous section, when a subscriber sees a new event in the queue, it does not pop it off the queue; it leaves it there so that other subscribers can also consume it, and perhaps take a completely different action for the same event. The publish-subscribe model is frequently used with stateful applications. In a stateful application, the order in which the messages are received is important because the order can impact the application state.
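A minimal in-memory sketch of the pub/sub model follows. It is a simplification under stated assumptions: the Broker class is hypothetical, delivery is synchronous within one process, nothing is persisted, and the tickers and prices are made up. It is only meant to show that every subscriber of a topic sees every event on that topic, in the order the events were published.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-process broker; a real one would be a separate service."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # Every subscriber of the topic receives the event. The publisher
        # never learns who (if anyone) is listening.
        for callback in list(self._subscribers.get(topic, [])):
            callback(event)


broker = Broker()
alice, bob = [], []  # each subscriber simply collects what it receives

broker.subscribe("AMZN", alice.append)
broker.subscribe("AMZN", bob.append)
broker.subscribe("MSFT", bob.append)

broker.publish("AMZN", 101.5)
broker.publish("MSFT", 310.0)
broker.publish("AMZN", 102.0)
broker.publish("GOOG", 140.0)  # no subscribers: nobody takes any action
```

Here alice receives only the two AMZN prices, while bob receives the AMZN and MSFT prices in publication order. The GOOG event is published without error even though nobody is subscribed to it.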
Benefits of event-driven architecture
EDA can help an organization gain an edge over its competitors. This edge stems from the benefits that the pub/sub model can provide. Some of the benefits are explained in the following sub-sections.
No more polling
The publish and subscribe model delivers the benefit of real-time events through a "push" delivery mechanism. It eliminates the need to constantly poll sources to see whether data has changed. If you use a polling mechanism, you will either waste resources by checking for changes when no changes have occurred, or you will delay actions if changes occur between polls. Using a "push" mechanism minimizes the latency of message delivery. Depending on your application, delays in message delivery could translate into a loss of millions of dollars.
Example: Let's say you have a trading application. You want to buy a stock only when a certain price is reached. If you were using polling, you would have to ping the price source at regular intervals to see if the price had changed. This has two problems:
1. Computing resources will have to be used with every ping. This is wasteful.
2. If the price changes in between pings, and then changes again, the trade may not execute even though the target price was reached.
With events, a notification will be generated only once, when the target price is reached, greatly increasing the likelihood that the trade will happen.
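The second problem can be illustrated with a small deterministic simulation. The function names, the price series, the target, and the polling interval are all made up for this sketch, and "target reached" is taken to mean the price falling to or below the target.

```python
def polled_trade(prices, target, poll_every):
    """Look at the price only every `poll_every` ticks.

    Returns the tick at which the trade executes, or None if it never does.
    """
    for tick in range(0, len(prices), poll_every):
        if prices[tick] <= target:
            return tick
    return None


def pushed_trade(prices, target):
    """React to every price-change event as soon as it is delivered."""
    for tick, price in enumerate(prices):
        if price <= target:
            return tick
    return None


# Made-up price ticks: the target is reached only briefly, at tick 2.
prices = [105, 103, 99, 104, 106, 107, 108]
target = 100

polled = polled_trade(prices, target, poll_every=3)  # samples ticks 0, 3, 6
pushed = pushed_trade(prices, target)
```

With these numbers the poller samples 105, 104, and 108, so it never sees the dip and the trade does not execute. The push-based version reacts at tick 2, when the price first reaches the target.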
Dynamic targeting
EDA simplifies the discovery of services and does so in an effortless and natural way, minimizing the number of potential errors. In EDA, there is no need to keep track of data consumers and, instead, interested parties simply subscribe to the topics that are of interest. If there are parties interested in the messages, the messages get consumed by all of them. In the pub/sub model, if there aren't any interested consumers, the message simply gets broadcast without anyone taking any action.
Example: Continuing with our trading application example, let's assume that each stock is a topic. Letting users of the application select which topics (stocks) interest them greatly reduces the number of events delivered to each user and therefore reduces resource consumption.
Communication simplicity
EDA minimizes code complexity by eliminating direct point-to-point communication between producers and consumers. The number of connections is greatly reduced by having a central queue where producers place their messages and consumers collect messages.
Example: Let's assume that our trading application has 10 stocks and 10 users. If we didn't have an intermediate queue to hold the events, every stock would have to be connected to every user, for a total of 100 connections. But having a queue in the middle would mean that we only have 10 connections from the stocks to the queue and 10 connections from the users to the queue, giving us a total of 20 connections, greatly simplifying the system.
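The arithmetic in this example generalizes: point-to-point wiring grows with the product of producers and consumers, while a central queue grows with their sum. A short sketch, with hypothetical function names:

```python
def point_to_point_connections(producers, consumers):
    """Every producer is wired directly to every consumer."""
    return producers * consumers


def queued_connections(producers, consumers):
    """Every producer and every consumer has one connection to the queue."""
    return producers + consumers


direct = point_to_point_connections(10, 10)  # 100, as in the example above
via_queue = queued_connections(10, 10)       # 20, as in the example above

# The gap widens quickly as the system grows.
direct_large = point_to_point_connections(100, 1000)  # 100000
via_queue_large = queued_connections(100, 1000)       # 1100
```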
Decoupling and scalability
The publish and subscribe model increases software flexibility. There is no explicit coupling between publishers and subscribers. They are all decoupled and work independently of each other. This decoupling promotes the individual development of services, which in turn allows us to deploy and scale these services independently. Functionality changes in one part of the application should not affect the rest of the application, so long as design patterns are followed and the code is truly modularized. So long as the agreed-upon APIs stay stable, making a change in the publisher code should not affect the consumer code.
Example: In our trading application, if a new stock ticker is added, users don't need a new connection to the new stock. We simply create a connection from the new stock to the queue and now anybody can listen for events in that new topic. Something similar happens when new users get added. The user just needs to specify which stocks they are interested in. Nothing else needs to be changed in the system. This makes the overall architecture quite scalable.
Stay tuned for the upcoming architecture diagram and use case that I'll be sharing in my next article. I'll make sure to provide the URL of the article in the comment section for convenient reference. Keep an eye out for the link—it will lead you directly to the detailed content I'll be discussing.