Event Sourcing AgTech - turning the tractor into a DeLorean time machine
Part 2 - Event Sourcing Primer
In the last blog, I talked about the need for Event Sourcing in AgTech. Now we wander into the weeds to name all the pieces and cover the fundamentals of how it works.
Event Sourcing at its core is about building data models from a series of events. The question is how this can be conceptualised for mere mortal developers who are used to saving data models directly. If I have just created a new weight record, then as a developer I just want to save that weight record with minimal fuss.
While the previous pseudo event-sourcing system made records act like events (by playing through the records chronologically), they were not actually events. The weight record itself needs to be made up of events, because records can also have synchronisation conflicts (e.g. two people editing the same weight record offline at once).
So how do we transform the data object into events and back again?
One interesting design pattern that goes hand-in-hand with Event Sourcing in recent years is Command Query Responsibility Segregation (CQRS). CQRS is a fancy way of saying that the architecture for reads is separated from the architecture for writes: the flow for saving the weight record does not follow the same path as the flow for querying it. There are conceptual benefits to CQRS, but there is also a tangible one: by decoupling the read and write sides, each can be optimised for its single purpose. For example, if you are experiencing 50x more requests querying your data than writing new data, you can deploy extra servers focused solely on the read capabilities. New data and updates are persisted through a Command framework, where a Command is code that can modify the state of the system while leaving it in a valid state. If the validation of the command fails (e.g. trying to sell some animals that have already been sold), then the command will not execute, ensuring the system state remains valid.
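To make that concrete, here is a minimal sketch of the command side in TypeScript. The SellAnimalsCommand, handler and event names are hypothetical rather than our actual framework; the point is simply that validation happens before any events are created.

```typescript
// Hypothetical command-side sketch: validate first, only then emit events.
interface AnimalsSoldEvent {
  type: "AnimalsSold";
  animalIds: string[];
  occurredAt: Date;
}

interface SellAnimalsCommand {
  animalIds: string[];
  saleDate: Date;
}

function handleSellAnimals(
  cmd: SellAnimalsCommand,
  alreadySold: Set<string> // looked up from the existing system during validation
): AnimalsSoldEvent[] {
  // Validation: refuse to sell animals that have already been sold.
  const invalid = cmd.animalIds.filter(id => alreadySold.has(id));
  if (invalid.length > 0) {
    throw new Error(`Cannot sell already-sold animals: ${invalid.join(", ")}`);
  }
  // The command succeeded, so emit the event(s) describing what happened.
  return [{ type: "AnimalsSold", animalIds: cmd.animalIds, occurredAt: cmd.saleDate }];
}
```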
The primary data you are saving in an Event Sourcing system is the events. When coupled with CQRS, you now have Commands that create and save Events, and Queries that read Events to populate a data model. The data models are basically an aggregation of multiple events, and are hence known as Aggregates. Now we can see that Commands can exist completely separately from Queries, hence the catchy term Responsibility Segregation (of CQRS). The only crossover is during command validation, where the existing system may need to be queried; however, they can still exist as two separate systems with different responsibilities.
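As a sketch of the query side (again with hypothetical names, for illustration only), an aggregate is just the result of playing through the events for an entity in chronological order:

```typescript
// Hypothetical query-side sketch: build an aggregate by replaying events.
type AnimalEvent =
  | { type: "AnimalBorn"; animalId: string; breed: string; occurredAt: Date }
  | { type: "WeightRecorded"; animalId: string; weightKg: number; occurredAt: Date };

interface AnimalAggregate {
  animalId: string;
  breed?: string;
  latestWeightKg?: number;
}

function buildAnimalAggregate(animalId: string, events: AnimalEvent[]): AnimalAggregate {
  const aggregate: AnimalAggregate = { animalId };
  for (const event of events) { // events are assumed to be in chronological order
    if (event.type === "AnimalBorn") aggregate.breed = event.breed;
    if (event.type === "WeightRecorded") aggregate.latestWeightKg = event.weightKg;
  }
  return aggregate;
}
```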
Commands are written as a request for the system to execute, as it is not guaranteed that the request will succeed; hence a command is named with an imperative verb. This is in contrast to Events, which are taken as valid and named in the past tense, as they have already happened (at a particular point in time). Events just keep getting added; you do not delete or modify existing Events. This makes the architecture simpler for replication (to the mobile) and for throwing events on a bus for asynchronous operation. So how do you edit data, you ask? Well, you can make new events that target old events, but we will get into more detail in a separate blog entry.
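The naming convention in practice looks something like this (hypothetical names, just to show the grammar):

```typescript
// Commands: imperative verbs, requests that may be rejected.
type CommandName = "RecordWeight" | "SellAnimals" | "MoveMob";

// Events: past tense, facts that have already happened at a point in time.
type EventName = "WeightRecorded" | "AnimalsSold" | "MobMoved";
```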
The architecture allows one command to create multiple events. That raises the question of how many events to make. Do I bundle all these changes and save them as a single event, or do I split them into multiple events, and if so, what are the rules for splitting? If you consider an Event to be a small atomic change that leaves the system in a stable state, you can think about how these events will be used by aggregates. One aggregate may use all of the events you are saving, while another aggregate may only care about one of them. For example, an animal might consist of multiple concepts: its characteristics (like date of birth, breed and other things that do not change), its identity (the electronic and visual identifiers) and its state (its weight and whether it has been weaned). If you are displaying the animal details in a dialog, then you may need to load all these concepts. However, if you are just doing a count of your breeds, then only the characteristics matter. You need your events to be small enough to be reusable for different types of aggregates, without making ridiculous numbers of events for each animal created - as that will just slow down the system for both reading and writing. Commands are easier to map, as they just reflect the user operations. If the user drags and drops their mob of sheep on the map, then that becomes a single command, which in turn may create multiple events.
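For example, a single hypothetical CreateAnimal command might emit one small event per concept, so each aggregate can replay only the events it cares about (a sketch, not our actual event schema):

```typescript
// Hypothetical sketch: one command producing several small, reusable events.
interface CreateAnimalCommand {
  animalId: string;
  breed: string;
  dateOfBirth: Date;
  eid?: string; // electronic identifier
  visualTag?: string;
}

type AnimalEvent =
  | { type: "AnimalCharacteristicsRecorded"; animalId: string; breed: string; dateOfBirth: Date }
  | { type: "AnimalIdentityAssigned"; animalId: string; eid?: string; visualTag?: string };

function handleCreateAnimal(cmd: CreateAnimalCommand): AnimalEvent[] {
  const events: AnimalEvent[] = [
    { type: "AnimalCharacteristicsRecorded", animalId: cmd.animalId, breed: cmd.breed, dateOfBirth: cmd.dateOfBirth },
  ];
  if (cmd.eid || cmd.visualTag) {
    events.push({ type: "AnimalIdentityAssigned", animalId: cmd.animalId, eid: cmd.eid, visualTag: cmd.visualTag });
  }
  // A breed-count aggregate only replays AnimalCharacteristicsRecorded;
  // an animal-details aggregate replays both events.
  return events;
}
```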
Event Handlers are what copy the data from the Event to the Aggregate for a given query. One event could be used by different Event Handlers on different Aggregates. Each Event Handler can have two methods: a 'canHandle' method to check whether an event is relevant to a particular aggregate, and the actual 'handle' method that does the copying. The handlers themselves should act like dummies, not doing any business logic and only copying the relevant data to the aggregate.
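A minimal sketch of that shape (hypothetical names, assuming the two-method handler described above):

```typescript
// Hypothetical Event Handler sketch: no business logic, just copying fields.
interface WeightRecordedEvent { type: "WeightRecorded"; animalId: string; weightKg: number; occurredAt: Date }
interface AnimalStateAggregate { animalId: string; latestWeightKg?: number }

interface EventHandler<E, A> {
  canHandle(event: E, aggregate: A): boolean;
  handle(event: E, aggregate: A): void;
}

const weightRecordedHandler: EventHandler<WeightRecordedEvent, AnimalStateAggregate> = {
  canHandle: (event, aggregate) => event.animalId === aggregate.animalId,
  handle: (event, aggregate) => {
    aggregate.latestWeightKg = event.weightKg; // copy only, no business rules
  },
};
```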
Aggregates are a bit like the ancient gods in a fantasy story, in that they do not exist unless someone believes in (or queries for) them. This is because it is the events that are persisted, so you can have as many event handlers on different aggregates as you want - they will not run unless someone is querying for that particular aggregate. An exception to this is projections, which we will get to later.
One problem that arises is performance. If playing through events is required to build each aggregate, how does this scale over time? After a while, a single aggregate may have thousands of events to play through.
One of the answers is adding snapshots. A snapshot is a copy of an aggregate at a point in time that is persisted. Now to do a query, you just have to find the latest relevant snapshot and play events since.
Snapshot creation may be triggered manually, or automatically based on criteria such as a time period (e.g. every quarter or week) or the number of events processed since the last snapshot. The ideal solution would be autonomic (self-healing): it could dynamically create snapshots as needed and even learn when they should be created based on query usage.
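A simple automatic trigger might look like this (a sketch; the thresholds are made up):

```typescript
// Hypothetical snapshot policy: snapshot after enough events or enough elapsed time.
const MAX_EVENTS_BETWEEN_SNAPSHOTS = 100; // made-up threshold
const MAX_DAYS_BETWEEN_SNAPSHOTS = 90;    // e.g. roughly a quarter

function shouldCreateSnapshot(eventsSinceLastSnapshot: number, lastSnapshotAt: Date, now: Date): boolean {
  const daysSince = (now.getTime() - lastSnapshotAt.getTime()) / (1000 * 60 * 60 * 24);
  return eventsSinceLastSnapshot >= MAX_EVENTS_BETWEEN_SNAPSHOTS || daysSince >= MAX_DAYS_BETWEEN_SNAPSHOTS;
}
```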
Over time, multiple snapshots can be generated so the user can easily query across any point in time. Just find the latest snapshot prior to the query time, and play through the remaining events up to the query time.
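In rough TypeScript, that query path looks something like this (the store shape and names are hypothetical, not our actual implementation):

```typescript
// Hypothetical point-in-time query: latest snapshot before the query time, then replay.
interface Snapshot<A> { takenAt: Date; state: A }
interface StoredEvent<A> { occurredAt: Date; apply(aggregate: A): A }

function queryAggregateAt<A>(
  asOf: Date,
  snapshots: Snapshot<A>[],   // all snapshots for this aggregate, oldest first
  events: StoredEvent<A>[],   // all events for this aggregate, oldest first
  empty: A                    // the starting state if no snapshot exists yet
): A {
  // Find the latest snapshot taken at or before the query time.
  const snapshot = [...snapshots].reverse().find(s => s.takenAt <= asOf);
  let state = snapshot ? snapshot.state : empty;
  const from = snapshot ? snapshot.takenAt : new Date(0);
  // Replay only the events between the snapshot and the query time.
  for (const event of events) {
    if (event.occurredAt > from && event.occurredAt <= asOf) {
      state = event.apply(state);
    }
  }
  return state;
}
```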
Snapshots are like online banks - if they get outdated, just throw them away and get a new one. If an event is backdated, then relevant snapshots more recent than the new event need to be invalidated and recreated to take into account the new data change.
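A sketch of that invalidation rule (hypothetical store, illustrative only):

```typescript
// Hypothetical sketch: a backdated event invalidates any snapshot taken after it.
interface Snapshot<A> { takenAt: Date; state: A }

function snapshotsStillValid<A>(snapshots: Snapshot<A>[], backdatedEventTime: Date): Snapshot<A>[] {
  // Keep only snapshots taken before the backdated event; newer ones must be rebuilt.
  return snapshots.filter(s => s.takenAt < backdatedEventTime);
}
```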
The Time dimension is omnipresent and all-important in Event Sourcing. Every event is saved at a particular time in the event store; for example, you can backdate a weight record to have occurred 2 weeks ago. And every aggregate query is at a particular time - to handle the case where I am querying all my animals as of last financial year. However, it is important to recognise that most queries will be viewing the system right now. We can make use of this fact to optimise now-time queries using the next event sourcing feature, Projections.
A Projection is like a Snapshot, but instead of being fixed at a particular point in time, its time is the nebulous, ever-changing moment of right now. For projections to work and stay relevant, they need to be updated when the events change, which is different to the typical query where the aggregate is built at query time. Think of Projections like a preemptive strike - or "here's one I prepared earlier". The events have already been processed for the most common query time, which is right now. The downside is that the more projections you add, the more processing is required at save time, as one event may update multiple aggregates (and hence projections).
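A sketch of a projection that is kept at "now" by applying events as they are saved (hypothetical names):

```typescript
// Hypothetical projection: a persisted "right now" view, updated as events are saved.
interface WeightRecordedEvent { type: "WeightRecorded"; animalId: string; weightKg: number }

interface CurrentWeightsProjection {
  latestWeightByAnimal: Map<string, number>;
}

function applyToProjection(projection: CurrentWeightsProjection, event: WeightRecordedEvent): void {
  // Updated at save time, so "now" queries can read the projection directly.
  projection.latestWeightByAnimal.set(event.animalId, event.weightKg);
}
```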
A basic rules engine can be attached to an event stream as a consumer to determine what to do with new events in the system. For example, a projection rule processor can analyse the new events coming in, check whether there are any relevant projections for those events, and update them accordingly.
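For instance, a simple consumer might route each new event to the projections that care about it (a sketch; the rule interface is hypothetical):

```typescript
// Hypothetical event-stream consumer routing new events to relevant projections.
interface DomainEvent { type: string }

interface ProjectionRule {
  isRelevant(event: DomainEvent): boolean;
  update(event: DomainEvent): void;
}

function onNewEvent(event: DomainEvent, rules: ProjectionRule[]): void {
  for (const rule of rules) {
    if (rule.isRelevant(event)) {
      rule.update(event); // only relevant projections do any work
    }
  }
}
```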
There is a whole lot of complexity to think about when writing an Event Sourcing framework. Asynchronous systems add the world of pain that is distributed computing. We managed to write our initial version of Event Sourcing as a synchronous system, so when you save a command, the next time you query you are guaranteed the result has been updated. This was about picking our battles and making sure the core of Event Sourcing was solid before worrying about asynchronous updates to the system and User Interface (UI). However, as we started fleshing out more projections, we had to handle asynchronous operation as well to improve save performance.
These are the main components of Event Sourcing. In part 3 of this blog series I will go more into the nuts and bolts of various techniques for building an Event Sourcing system.