How to Build Event-Driven Business Processes With API-Based Services For Improved Reliability and Consistency
Building reliable business processes using API-driven microservices is a common goal for modern applications, from online food ordering to financial transactions. However, this approach faces two critical challenges: reliability and consistency issues.
The reliability issue stems from the fact that a business process is only as robust as its weakest microservice link. With synchronous API calls, any service failure can break the entire process flow. Even with high service reliability, the compound probability of a successful end-to-end execution drops exponentially as more microservices are involved.
The consistency issue arises when a multi-step process fails midway. Some service calls may have already executed successfully, leading to inconsistent data states across the distributed microservices. Implementing rollbacks and manual fixes becomes increasingly complex in such scenarios.
A better solution is to adopt an event-driven architecture with an orchestration pattern. By using a dedicated workflow service to orchestrate the microservices through asynchronous events, both reliability and consistency can be greatly improved. Even if individual services experience downtime, the workflow can gracefully resume once they're back online, maintaining eventual data integrity throughout the process.
In this article, I’ll explore how Infinitic, a framework for event-based orchestration, enables easy transitioning of existing API-driven microservices to robust, event-driven business processes. I’ll dive into the architectural patterns, workflow implementation, and deployment considerations, showcasing how minimal effort can yield substantial gains in reliability and consistency for your mission-critical applications.
Building Business Processes With API-driven Services
Using API-driven microservices is an excellent approach for building scalable business processes, such as online food ordering and delivery, ride-hailing services, online financial transactions, e-commerce order processing, and more. These processes are usually initiated directly from a customer-facing application or, if recurring, through scheduled cron jobs.
However, this frequent pattern has two main issues:
Reliability Issue
A business process that relies on synchronous API calls across multiple microservices is fragile. Since the process flow depends on every service responding successfully, a failure in any one service can cause the entire process to break down. This issue is compounded as more microservices are involved, due to the multiplicative effect on end-to-end reliability.
For example, if each API-driven microservice has 99.95% reliability (approximately 21 minutes of downtime over 30 days), a business process utilizing 10 microservices will have a reliability of 0.9995^10 ≈ 99.5% (around 3 hours and 36 minutes of downtime over 30 days).
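The arithmetic above can be reproduced with a few lines of Java. This is only a sketch: it assumes service failures are independent and that the process needs every call to succeed, and the class and method names are mine.

```java
import java.time.Duration;

/** Illustrative arithmetic only: assumes independent failures and a strictly sequential chain. */
final class CompoundReliability {

    /** Availability of a process that needs {@code services} calls to all succeed. */
    static double endToEnd(double perService, int services) {
        return Math.pow(perService, services);
    }

    /** Expected downtime over {@code period} for a given availability. */
    static Duration downtime(double availability, Duration period) {
        long seconds = Math.round((1.0 - availability) * period.getSeconds());
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        Duration month = Duration.ofDays(30);
        for (int n : new int[] {1, 5, 10, 20}) {
            double availability = endToEnd(0.9995, n);
            System.out.printf("%2d services: %.3f%% available, about %d minutes down per 30 days%n",
                    n, availability * 100, downtime(availability, month).toMinutes());
        }
    }
}
```

Running it for a growing number of services shows how quickly the expected downtime adds up, even when each individual service looks very reliable.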
To mitigate this issue, you can introduce more redundancy for each microservice, but this approach raises infrastructure costs without fundamentally solving the problem (e.g., it does not prevent issues arising from network malfunctions).
Alternatively, you can add a message broker (such as RabbitMQ) and push the request to start the business process to a queue, ensuring that failed executions will be automatically retried. While this approach helps, it does not address the consistency issue below.
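To illustrate both the benefit and the limit of this approach, here is a minimal in-memory sketch of a retrying queue. It is a stand-in for a broker, not the RabbitMQ client API; `RetryingQueue`, `publish`, and `drain` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

/** In-memory stand-in for a broker queue with retries; NOT the RabbitMQ client API. */
final class RetryingQueue<T> {

    private static final class Envelope<P> {
        final P payload;
        final int attempt;

        Envelope(P payload, int attempt) {
            this.payload = payload;
            this.attempt = attempt;
        }
    }

    private final Deque<Envelope<T>> queue = new ArrayDeque<>();
    private final int maxAttempts;

    RetryingQueue(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void publish(T payload) {
        queue.addLast(new Envelope<>(payload, 1));
    }

    /** Processes every message, re-enqueueing it when the handler throws. Returns the number of handler calls. */
    int drain(Consumer<T> handler) {
        int invocations = 0;
        while (!queue.isEmpty()) {
            Envelope<T> envelope = queue.removeFirst();
            invocations++;
            try {
                handler.accept(envelope.payload);
            } catch (RuntimeException failure) {
                if (envelope.attempt < maxAttempts) {
                    // The whole process will run again from its first step.
                    queue.addLast(new Envelope<>(envelope.payload, envelope.attempt + 1));
                }
                // Messages that exhaust their attempts are simply dropped in this sketch.
            }
        }
        return invocations;
    }
}
```

Note what a retry means here: the handler restarts from its first step. If it had already charged the customer before failing at a later step, the retry charges the customer again unless every step is idempotent.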
Consistency Issue
The consistency issue arises when a business process fails after some microservice calls have already executed successfully. This can leave the system in an inconsistent state, with data updates scattered across different services. Implementing rollback mechanisms within the process controller is complex and prone to errors, especially if the controller itself fails during the rollback process.
Overall, this situation is usually challenging, as debugging and understanding the origin of discrepancies in a distributed environment is complex and time-consuming for your teams, often requiring manual fixes.
Using A Message Broker
A better solution is to use a message broker, whose role is to make inter-service communication durable.
This is typically achieved using a choreography pattern, an architectural style for event-driven systems in which individual microservices react to, and act on, events emitted by other services. Being event-based addresses the reliability issue (even if a service is down, it can resume processing when it is back up) and helps with the consistency issue, because a workflow will eventually complete even if a temporary failure strikes during its execution.
But while this approach can work for simple scenarios, it becomes increasingly difficult to code, maintain, and observe as business processes grow more complex, involving conditional logic, timeouts, and error handling.
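To make the choreography pattern concrete, here is a minimal in-memory sketch. The `EventBus` class stands in for a real broker, and the event names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for a broker; event names and payloads are hypothetical. */
final class EventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    private final Deque<String[]> pending = new ArrayDeque<>();
    private final List<String> delivered = new ArrayList<>();

    void subscribe(String eventType, Consumer<String> handler) {
        handlers.computeIfAbsent(eventType, key -> new ArrayList<>()).add(handler);
    }

    void publish(String eventType, String payload) {
        pending.addLast(new String[] {eventType, payload});
    }

    /** Delivers events in publication order until none are left; returns the event types seen. */
    List<String> dispatchAll() {
        while (!pending.isEmpty()) {
            String[] event = pending.removeFirst();
            delivered.add(event[0]);
            for (Consumer<String> handler : handlers.getOrDefault(event[0], List.of())) {
                handler.accept(event[1]);
            }
        }
        return delivered;
    }
}

final class ChoreographyDemo {
    /** Each "service" only knows which event it reacts to and which event it emits. */
    static List<String> run() {
        EventBus bus = new EventBus();
        bus.subscribe("OrderPlaced", order -> bus.publish("PaymentCompleted", order));    // payment service
        bus.subscribe("PaymentCompleted", order -> bus.publish("InvoiceCreated", order)); // invoice service
        bus.subscribe("InvoiceCreated", order -> bus.publish("EmailSent", order));        // email service
        bus.publish("OrderPlaced", "order-1");
        return bus.dispatchAll();
    }
}
```

Nothing in this code states the process as a whole: the sequence only emerges from the subscriptions. That is manageable with three steps, but it is the reason conditional logic, timeouts, and error handling become hard to follow as the process grows.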
In The Way We Are Building Event-Driven Applications is Misguided, I described why I favour the orchestration pattern, an alternative architecture for event-driven systems, where a dedicated workflow service acts as the central orchestrator, coordinating the various microservices involved in a business process. This service initiates the process flow, manages the sequence of events and service invocations, and handles errors or retries as needed. The orchestration pattern simplifies the development and maintenance of complex event-driven processes while providing better observability and control compared to the choreography pattern.
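As a rough illustration of the orchestration idea, the toy class below runs named steps, retries a failing step, and remembers completed steps so that a resumed process does not repeat them. It is a sketch of the pattern under my own assumptions, not how Infinitic is implemented: the `history` map stands in for the durable state a real workflow engine would persist, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Toy orchestrator: illustrates the pattern only, not Infinitic's implementation. */
final class Orchestrator {
    /** Results of completed steps, keyed by step name (stand-in for durable workflow history). */
    private final Map<String, String> history = new LinkedHashMap<>();
    private final int maxAttempts;

    Orchestrator(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Runs a step, retrying on failure; a step that completed in an earlier run is not executed again. */
    String step(String name, Supplier<String> call) {
        if (history.containsKey(name)) {
            return history.get(name);
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String result = call.get();
                history.put(name, result);
                return result;
            } catch (RuntimeException e) {
                last = e; // a real engine would wait or back off before the next attempt
            }
        }
        throw new IllegalStateException("step '" + name + "' failed after " + maxAttempts + " attempts", last);
    }

    List<String> completedSteps() {
        return new ArrayList<>(history.keySet());
    }
}
```

Because the sequence of steps lives in one place, the process logic stays readable, and resuming after a failure continues from the first step that has not completed instead of starting over.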
In the article mentioned above, I also introduced Infinitic, a framework I've developed to enable central orchestration of unbreakable event-driven applications.
The key components provided by Infinitic are highlighted in green in the image below:
Infinitic currently uses Apache Pulsar as the underlying messaging system.
Using Infinitic provides the following benefits:
Implementing the Business Logic With Infinitic
Infinitic implements the business logic in workflows using plain Java or Kotlin code, using only the interfaces of the distributed services. You don't need to learn a Domain-Specific Language (DSL), making it easy to write new workflows or convert existing controllers to event-based workflows.
Let’s take the example of a simple monthly billing process flow for a subscription service. The process starts when a user subscribes to the service. Subsequently, a loop is initiated that checks if the user is still subscribed at the beginning of each month. If the user is subscribed, a series of steps are executed:
1. Wait for next month
2. Get consumption data (ConsumptionService::get)
3. Request payment (PaymentService::request)
4. Create an invoice (InvoiceService::create)
5. Send the invoice via email (EmailService::send)
After the email is sent, the loop repeats to check if the user is still subscribed for the next month. If the user is not subscribed, the process exits the loop and ends.
This workflow can be implemented (here in Kotlin) as:
As you can see, this code is remarkably similar to what you would do in your Business Process Controller, except that this code pilots an event-driven process. If you are interested in how it works, you can read Under the Hood of an Event-Driven “Workflow As Code” Engine.
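For readers who want to see the shape of the billing loop described in the numbered steps, here is a plain Java sketch with no Infinitic dependency. The service and method names come from the steps above, but every signature is an assumption, and `UserService` and `MonthTimer` are hypothetical: the latter stands in for the durable timer a workflow engine would provide. A real Infinitic workflow would use the framework's own base class and service stubs, which are not shown here.

```java
import java.math.BigDecimal;

// Service contracts named in the billing steps; all signatures are assumptions.
interface UserService { boolean isSubscribed(String userId); }           // hypothetical
interface ConsumptionService { BigDecimal get(String userId); }
interface PaymentService { String request(String userId, BigDecimal amount); }
interface InvoiceService { String create(String userId, BigDecimal amount, String paymentId); }
interface EmailService { void send(String userId, String invoiceId); }
interface MonthTimer { void waitForNextMonth(); }                        // hypothetical stand-in for a durable timer

/** The process logic only depends on interfaces, never on how the services are reached. */
final class MonthlyBillingWorkflow {
    private final UserService users;
    private final MonthTimer timer;
    private final ConsumptionService consumption;
    private final PaymentService payments;
    private final InvoiceService invoices;
    private final EmailService emails;

    MonthlyBillingWorkflow(UserService users, MonthTimer timer, ConsumptionService consumption,
                           PaymentService payments, InvoiceService invoices, EmailService emails) {
        this.users = users;
        this.timer = timer;
        this.consumption = consumption;
        this.payments = payments;
        this.invoices = invoices;
        this.emails = emails;
    }

    void run(String userId) {
        while (users.isSubscribed(userId)) {
            timer.waitForNextMonth();                                         // 1. wait for next month
            BigDecimal amount = consumption.get(userId);                      // 2. get consumption data
            String paymentId = payments.request(userId, amount);              // 3. request payment
            String invoiceId = invoices.create(userId, amount, paymentId);    // 4. create an invoice
            emails.send(userId, invoiceId);                                   // 5. send the invoice via email
        }
    }
}
```

Since each contract has a single method, the workflow can be exercised locally with lambdas as stubs before wiring it to real services.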
Creating Event-Driven Services Using Your APIs
Infinitic provides ready-to-use Service workers that you can use simply by providing an implementation of your services:
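As a sketch of what such an implementation might look like, the class below wraps an existing REST endpoint behind a service interface using the JDK's `java.net.http` client. The interface signature, the `/consumption/{userId}` URL layout, and the plain-text response body are assumptions for illustration. How the implementation is registered with a Service worker is not shown; refer to the Infinitic documentation for that.

```java
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Contract the workflow depends on; the signature is an assumption. */
interface ConsumptionService {
    BigDecimal get(String userId);
}

/** Delegates to an existing REST API (hypothetical URL layout and plain-text body). */
final class HttpConsumptionService implements ConsumptionService {
    private final HttpClient client = HttpClient.newHttpClient();
    private final String baseUrl;

    HttpConsumptionService(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Builds the GET request; assumes userId contains only URL-safe characters. */
    static HttpRequest buildRequest(String baseUrl, String userId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/consumption/" + userId)).GET().build();
    }

    /** Converts a response into a result, or throws so the call is reported as failed. */
    static BigDecimal parse(int status, String body) {
        if (status / 100 != 2) {
            throw new IllegalStateException("consumption API returned HTTP " + status);
        }
        return new BigDecimal(body.trim());
    }

    @Override
    public BigDecimal get(String userId) {
        try {
            HttpResponse<String> response =
                    client.send(buildRequest(baseUrl, userId), HttpResponse.BodyHandlers.ofString());
            return parse(response.statusCode(), response.body());
        } catch (IOException e) {
            throw new IllegalStateException("consumption API unreachable", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while calling the consumption API", e);
        }
    }
}
```

Throwing on any non-2xx status or network error is a deliberate choice in this sketch: it lets whatever runs the service treat the call as failed and retry it later, rather than passing a bad value on to the workflow.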
Deploying Your Event-Driven Processes
To deploy your newly created event-driven processes:
By following these steps, your business processes become event-driven, ensuring reliability and consistency, while still utilizing your existing API-driven services. This approach requires minimal effort, taking only a couple of weeks instead of months of work.
Additional Thoughts on Code Generation
If your services are well-defined through an IDL, such as OpenAPI for REST APIs or Protocol Buffers for gRPC, the creation of Infinitic's Service Workers could be automated. With clear service definitions, code generation tools can introspect the service contracts and automatically create the boilerplate code required to invoke the services from within Infinitic's Service Workers.
This would streamline the process of integrating existing API-driven microservices with Infinitic's orchestration framework, reducing manual effort and minimizing the risk of errors. I'm exploring this avenue as a future enhancement, aiming to provide a more seamless onboarding experience for developers adopting Infinitic with their existing microservices. This approach can also help teams that do not want to use Java or Kotlin.
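One lightweight variant of this idea can be sketched with the JDK alone. Assuming the contract is already available as a Java interface (for example, one emitted by an OpenAPI or gRPC code generator), a dynamic proxy can derive the implementation at run time instead of generating source code. This is a different technique from IDL-driven code generation and not an Infinitic feature; `GeneratedServices` and the `transport` callback are hypothetical names.

```java
import java.lang.reflect.Proxy;
import java.util.function.BiFunction;

/** Example contract; in practice it would come from the service definition. */
interface InvoiceService {
    String create(String userId, String amount);
}

final class GeneratedServices {

    /**
     * Creates an implementation of {@code contract} whose methods are all forwarded to
     * {@code transport}, which receives an operation name and the call arguments.
     * The transport must return a value compatible with the method's return type.
     */
    static <T> T create(Class<T> contract, BiFunction<String, Object[], Object> transport) {
        Object proxy = Proxy.newProxyInstance(
                contract.getClassLoader(),
                new Class<?>[] {contract},
                (self, method, args) -> {
                    if (method.getDeclaringClass() == Object.class) {
                        // equals, hashCode and toString are handled locally, never sent to the service
                        switch (method.getName()) {
                            case "equals": return self == args[0];
                            case "hashCode": return System.identityHashCode(self);
                            default: return contract.getSimpleName() + " (generated stub)";
                        }
                    }
                    String operation = contract.getSimpleName() + "." + method.getName();
                    return transport.apply(operation, args == null ? new Object[0] : args);
                });
        return contract.cast(proxy);
    }
}
```

In a real integration, the `transport` callback would translate the operation name and arguments into an HTTP or gRPC call; here it is left open so the sketch stays independent of any particular API.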
Conclusion
Building reliable and consistent business processes using API-driven microservices is a challenging endeavor, but adopting an event-driven orchestration approach with Infinitic can significantly mitigate the risks of reliability and consistency issues.
Infinitic provides a powerful solution by enabling the development of event-driven business processes leveraging your existing API-driven microservices. By using Infinitic's workflow implementation approach, you can build new business processes that are inherently resilient, scalable, and maintainable, often in just a couple of weeks rather than the months of work required to build similar capabilities from scratch.
Key benefits of using Infinitic for event-driven business processes include:
By adopting Infinitic, you can future-proof your mission-critical business processes, ensuring they remain reliable, consistent, and scalable as your organization grows, without being constrained by the limitations of traditional API-driven approaches.