Cron Jobs vs Events for async data processing

Let's consider this scenario:

You are building a system that integrates with a payment provider. It stores and manages user transactions and calls the third-party payments API.

The caveat is that the third-party payment process is asynchronous: you initiate a payment with one API call, and the provider confirms it later through a webhook/callback.

Let's say the payment provider's spec states that a transaction typically completes within 1 minute.

As with any other third-party provider, we have to consider failures and retries. One noted failure scenario is that the payment provider fails to call our webhook/callback after the transaction is completed. According to the provider, this happens 2-3% of the time.

We have to be able to sync the transaction status of the payment provider with the records in our system. There are two ways of achieving this:

Using a Cron job

A task runs every 2 minutes. It fetches the 'pending' transactions, calls the payments API to get the details of each one, checks its status, and updates the records in our database accordingly.
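To make the batch flow concrete, here is a minimal sketch of what one cron run might look like. It assumes a `transactions` table with `id` and `status` columns and a `fetch_status` callable that wraps the provider's status API; both names are hypothetical and only serve the illustration.

```python
import sqlite3
from typing import Callable


def reconcile_pending(conn: sqlite3.Connection,
                      fetch_status: Callable[[str], str]) -> int:
    """Run one reconciliation pass and return the number of rows updated."""
    # A single query pulls the whole batch of pending transactions.
    rows = conn.execute(
        "SELECT id FROM transactions WHERE status = 'pending'"
    ).fetchall()

    updated = 0
    for (txn_id,) in rows:
        # Hypothetical wrapper around the provider's status endpoint.
        status = fetch_status(txn_id)
        if status == "pending":
            continue  # still unresolved; the next run will check it again
        # The extra status guard avoids overwriting a row that a webhook
        # settled while this run was in progress.
        cur = conn.execute(
            "UPDATE transactions SET status = ? "
            "WHERE id = ? AND status = 'pending'",
            (status, txn_id),
        )
        updated += cur.rowcount
    conn.commit()
    return updated
```

The scheduler itself (crontab, a Kubernetes CronJob, or similar) would simply call this function every 2 minutes.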


Using delayed events

After a transaction is initiated, we can schedule a delayed event (to be fired after 2 minutes), which will be consumed by a worker. The worker fetches the transaction record from our database, and if the status is still pending, it calls the payment API to get the transaction details and updates the record accordingly.
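A rough sketch of the worker side could look like the following. The `store` dictionary stands in for our database, while `fetch_status` and `schedule` are hypothetical callables wrapping the provider's status API and whatever delayed-queue mechanism is in use. The retry cap is also an assumption, since the scenario does not specify one.

```python
from typing import Callable, Dict

RETRY_DELAY_SECONDS = 120  # the 2-minute delay from the scenario
MAX_ATTEMPTS = 5           # assumption: the scenario gives no retry limit


def handle_check_event(
    txn_id: str,
    attempt: int,
    store: Dict[str, str],                      # stand-in for our database
    fetch_status: Callable[[str], str],         # hypothetical provider lookup
    schedule: Callable[[int, str, int], None],  # hypothetical delayed publish
) -> str:
    """Handle one delayed event for a single transaction."""
    # If the webhook already arrived, there is nothing left to do.
    if store[txn_id] != "pending":
        return "already_settled"

    status = fetch_status(txn_id)
    if status != "pending":
        store[txn_id] = status
        return "updated"

    # Still unresolved: the worker re-triggers itself with a delay.
    if attempt < MAX_ATTEMPTS:
        schedule(RETRY_DELAY_SECONDS, txn_id, attempt + 1)
        return "rescheduled"
    return "gave_up"
```

When a payment is initiated, the application would publish the first event with something like `schedule(RETRY_DELAY_SECONDS, txn_id, 1)`.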

Which method should you use? Of course, it depends.

Pros and cons

  • Events are self-managed and focused: each one deals with a single transaction, and retry is simple since the worker can trigger itself again with a delay. The code is also simpler, since you're handling a single transaction. However, with a large volume of transactions, say N, events have the drawback of firing N events, which in turn run N database fetch queries.
  • On the flip side, cron jobs run on a schedule and only fetch the pending transactions. So instead of validating N transactions, you only validate the small portion of potential failures, and you fetch them with a single database query. The code gets a bit more complex since you're handling data in batches, but you get a more optimized approach for database interactions.
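To put rough numbers on the two bullets above, here is a back-of-the-envelope sketch. The daily volume is a made-up figure, and the estimate ignores transactions that are merely still in flight when a cron run starts, so treat the output as an order-of-magnitude guide only.

```python
MINUTES_PER_DAY = 24 * 60


def estimate_daily_load(transactions_per_day: int,
                        webhook_failure_rate: float,
                        cron_interval_minutes: int = 2) -> dict:
    """Compare the database work of the two approaches over one day."""
    return {
        # Delayed events: one event and one database fetch per transaction.
        "event_db_fetches": transactions_per_day,
        # Cron: one batch query per scheduled run, regardless of volume.
        "cron_db_queries": MINUTES_PER_DAY // cron_interval_minutes,
        # Transactions whose webhook never arrived and need a status check.
        "unconfirmed_transactions": round(
            transactions_per_day * webhook_failure_rate
        ),
    }


# Hypothetical volume of 100,000 transactions a day with a 3% webhook miss rate.
print(estimate_daily_load(100_000, 0.03))
```

Under these assumptions, the event approach performs 100,000 single-row fetches a day, while the cron approach performs 720 batch queries covering roughly 3,000 unconfirmed transactions.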

Conclusion

The cron job implementation is simpler on the database side, since you fetch only the pending transactions and process them as a batch, while the event-based implementation is more distributed and entails validating every single transaction with its own query.

For this choice, the trade-off seems to center on the volume of transactions. For high transaction volumes, cron jobs would be preferred since they reduce the number of external API calls and avoid flooding the database and background workers. For lower transaction volumes, the delayed events approach makes sense since you'd get near real-time updates without putting too much load on your system.

SAYED UZ ZAMAN

Software Engineer Expert @ PETRONAS | C# | .NET Core | Java | Angular | MS SQL | Oracle | Docker | Kubernetes |

1 yr

I am not too fond of cron jobs, especially since they're very risky when you have to shut down the server during maintenance. It's difficult to detect whether a shutdown is going to interrupt an ongoing cron operation or not. But, yeah, the cron job approach is easier than other approaches. Using a combination of events + Kafka could be a better approach, and no doubt it depends on the use case.

Saad Ismail

Tech Lead / Senior SWE at Bolt

1 yr

Insightful article, thank you for writing it. Since I am dealing with the payments system and encountering similar challenges, I'd like to add one more tradeoff: With crons, you'd ideally want to only run 1 instance of this cron to avoid inconsistent or multiple updates from different processes. 6 cron instances running all at once will get more or less the same X pending payments and will all try to resolve the statuses. With events, it is generally safer to run multiple consumers in parallel to consume the events since an event would be only for a specific payment. There is a caveat though that most of MQs don't provide exactly-once processing but rather at least once processing but in my practical experience this doesn't happen very often.

Ferdous Shourove

Senior Software Engineer at Delivery Hero

1 yr

Interesting article. Thanks for sharing. I have a few questions.
1. For the event-based approach, how would you "schedule a delayed event (to be fired after 2 minutes)"? Isn't that yet another cron job? If you are not using a "Cron Job" in this approach, then won't you have to implement this logic for triggering and retrying yourself? IMO that feels like reinventing CronJob logic. But I might be missing something.
2. Both approaches are processing the data on a 2 min interval. The last sentence in the article mentions the event-based approach would give near real-time updates. But that's true for the cron-based approach as well. I didn't get the difference in terms of "near-realtimeliness".
3. If the payment provider confirms the transaction through a webhook/callback, can't we use that to update the transaction status? I understand this is too specific to this example. Maybe you just wanted to highlight the Event vs Cron trade-offs. But just curious.

MD Rakibul Hasan

Backend Engineer & DevOps

1 yr

Vai, what if we run a worker to do it? For example, when the transaction is initiated, we add a task with a 1-minute delay. Depending on the response from the payment provider, this task will either retry or update the database.
