How to Build Discrete Event Simulations
Let's assume you develop some kind of software system whose internal state evolves over time. You want to let users explore how the system evolves in different scenarios, either to help them understand the system or to perform some kind of analysis. So you have to write a simulator.
Let's be a bit more formal about what we want to do. Say our system state consists of a number of variables v_1 ... v_n. In practice, these probably aren't a flat list of variables; they might be structured in some way, as vectors, matrices, trees or whatever, but for the purpose of this article we'll keep it simple: we treat the system state as a set of variables v_i. We use the term s (for state) to represent the set of all these variables.
Usually, some of these variables are connected through a set of system-specific rules. For example, v_2 might be computed from v_7 by dividing v_7 by 3. Or v_4 is always the sum of v_2 and v_1. Often, the current state will be computed by taking into account the state from various times previously; for example, the current v_5 might be computed from a previous (yes, that's still vague, we'll get to that!) v_5 plus the current v_3. So our system really contains a model m (in the scientific sense) of the world we want to simulate. In our example, the rules could be written informally as:
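    v_2(t) = v_7(t) / 3
    v_4(t) = v_2(t) + v_1(t)
    v_5(t) = v_5(t_prev) + v_3(t)

(with t_prev deliberately left vague for now, as discussed above).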
The purpose of the simulator is to let us explore how the model evolves over time, m(t). More specifically, we want to know how the model state evolves, s(t), so we have to run the model for every t within the range we are interested in. Now, the interesting question is: for which t_i do we have to run it? Are we really interested in s(t) for all t, or maybe only for some? How we answer this question drives the design of the simulator. I will explore three different options.
Continuous Models
The first option is relevant for systems where the state evolves continuously. Conceptually we have to compute the state for every t.
Since time in the real world is continuous, this would result in an infinite number of computation steps. So we have to discretize time. How finely grained our time discretization should be is not so easy to answer -- it depends on the precision requirements and sometimes on the model itself (I will provide an example for this later).
The typical example for such a model is a weather or climate model. The state tracked by the model is temperature, dew point, cloud cover and the like for every point in the atmosphere within a particular region (say, Germany) with a particular resolution (say, 3 km horizontally and 100 m vertically). The calculations in the model are basically the physical laws -- or approximations thereof -- that govern how "weather" evolves over time: as cloud cover increases, temperature reduces; once we have enough clouds, temperature near the ground lowers far enough for there to be fewer thermals, resulting in a reduced cloud cover. And so on. Since we are really interested in the weather at any time during the forecasted period, we have to run it for every (discretized) t, say every 10 minutes -- this would be an example of a requirement driving the granularity of t. An example of the model itself driving the granularity would be that certain physical phenomena, such as turbulence and detailed cloud formation, cannot be calculated if the temporal resolution is too coarse. But I digress.
The point is: you write such a simulation basically by running a loop over time:
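    // A minimal sketch of the time-stepping loop. State, step and the
    // numbers are illustrative placeholders; in a real simulator the
    // complexity lives inside step, i.e. in the model itself.
    data class State(val v1: Double, val v2: Double)

    fun step(s: State, t: Double): State =
        State(v1 = s.v1 + 0.1, v2 = s.v1 / 3)   // stand-in for the real model

    fun main() {
        val dt = 0.1                      // the chosen time granularity
        val tEnd = 10.0                   // end of the simulated period
        var state = State(0.0, 0.0)       // s(0)
        var t = 0.0
        while (t <= tEnd) {
            state = step(state, t)        // compute s(t) from the previous state
            println("s($t) = $state")     // record s(t) for later analysis
            t += dt                       // advance time by the granularity
        }
    }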
The mechanics of the simulator are simple to implement (the complexity is in the model itself!) but it also has a huge drawback: it is expensive to compute, because you have to run the model for every t. For realistically fine-grained granularities of t, this can result in lots of model calculations.
Schrödinger's Approach
Is the cat dead or alive? We don't know unless we look! This is another way of running a model: we only compute the state for those points in time when the user explicitly looks at the state. Let's say the user, for whatever reason, is only interested in the state of the model at times t = {7, 20, 33}. Can we make do with just three model runs, for t = 7, t = 20 and t = 33?
There are two different cases here. One case is that the underlying system is still fundamentally continuous -- like the weather. If we decided we wanted to know the weather tomorrow at noon (let's say that is t = 36), then we still have to run the whole model at its predefined granularity. The physics dictates that you have to evolve the weather step by step. Stephen Wolfram calls this computationally irreducible: there's no shortcut to running the calculations step by step. There's no "analytical" solution for weather(t). In this case we are back to the continuous case above: we run the big loop until we reach t = 36 and then decide to just look at t = 7, t = 20 and t = 33. Nothing gained.
However, let's say we have this model:
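    v_1(t) = sin(t / 10)
    v_2(t) = 2 * v_1(t) + 1

(The formulas are made up for illustration; what matters is that the state at time t is a direct function of t, not of previous states.)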
Note that here we are not calculating based on "previous" values of the model; we are just computing a new state based on actual functions of t. In this case, we can just plug in t = 7, t = 20 and t = 33 to get our three results. We don't need the states for the other t. We don't look, so the state doesn't need to exist.
This is of course much more efficient computationally. It's also trivial to implement. However, many systems -- weather! -- just don't work that way; the model doesn't allow the simplification. There's another problem: you have to know when to look. You have to know somehow when some interesting change happens so you can make the effort to look (and hence, calculate the model). For example, looking only at t = 7, t = 20 and t = 33, we might get the impression that s remains between s1 and s2, while, in reality, it leaves these bounds during times we don't look (at t = 22, s is much bigger than s2). We just didn't know that something interesting might happen between t = 20 and t = 33.
It's not always clear that the user has a means of deciding when to look. If they do -- for example, based on constraints outside the model -- and if the model allows it, then this approach is great because it is computationally very efficient.
For the intermediate times -- assuming that for some reason we can't or don't want to run the model -- we assume that the state stays constant, or that it evolves based on some other simple extrapolation (e.g., the average between two adjacent known states).
There's a slight variation of the Schrödinger approach. Let's say a state change s1 to s2 should happen at t = 25. But the user only "looks" at t = 33. This means that between 25 and 33 no state update has happened; the state remains s1. When the user looks at t = 33 and we can update the state, we can do two things: record the change as of t = 33, the time we noticed it, which keeps the history simple but records the wrong time; or backdate it to t = 25, the time it should have happened, which records the right time but means that anything we previously reported for, say, t = 30 was wrong in hindsight.
In the end, to really be correct, we'd have to use bitemporal data storage, which tracks both the time at which a change is valid and the time at which we recorded it.
Entering User Interactions
Staying with our weather example, let's play god. God can modify the weather. Or maybe more practically, let's play climate scientist, and we want to simulate the eruption of a volcano. So we run our climate model for a year, with, say, a daily granularity. As we have said, this might not allow us to model certain finer-grained processes based on the actual physics (for example cloud formation during a day), but let's assume we have ways of approximating this reasonably (we change the model to use the approximations instead of the real-physics-formulas). So we are happily running our big loop, step by step, and now, at t = 666, we want to inject a volcano eruption. This is feasible to do in both approaches we have seen so far. We stop the simulation, and just "patch" the state of the model. Potentially, if the model makes calculations based on history, we might have to patch not just s[666] but also, for example, s[600] to s[665], but once we have made the patch, we can continue the simulation regularly. We'll come back to this later.
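In code, such a patch could look roughly like this (a sketch; the volcanoAerosols field and the numbers are made up for illustration):

    data class State(val temperature: Double = 15.0,
                     val volcanoAerosols: Double = 0.0)

    // Patch the state at the eruption time; because the model also reads
    // history, patch the steps leading up to it as well.
    fun injectEruption(history: MutableList<State>, at: Int, backTo: Int) {
        for (i in backTo..at) {
            history[i] = history[i].copy(volcanoAerosols = 1000.0)
        }
    }

    // usage: injectEruption(history, at = 666, backTo = 600)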
Discrete Event Simulations
We have seen two ways of running models so far. The first one calculates the state for every t, given some previously decided granularity. It is precise (up to the granularity), but also expensive, because we have to execute the model lots of times. The second approach runs the model only at specific, user-decided times -- assuming the model's calculation logic allows it. This is very efficient (we run the model only a few times), but we might be overlooking interesting stuff between the times we look. Whether this is a problem for the particular simulation depends on the domain. It might not be, and then this approach is perfect.
There is a third option that sits between these two extremes. We run the model only for discrete times (and not for every t), but we also won't miss interesting things just because we didn't look. It's called a discrete event simulation, DES for short.
Note: There's a full implementation of a DES framework available for playing around with the idea. It's written in Kotlin. It has lots of comments that should connect well to this article. https://github.com/markusvoelter/KotlinDES
The core idea is that there is a queue of events. Every event in the queue is associated with a time t_event, and the queue is ordered by ascending time. Each event also contains instructions for how to change the state at time t_event when the event is executed. This is similar to the god-like user interaction we introduced above. We then run the model for t_event to bring the overall state in sync with the change we just made. The top-level loop looks basically as follows:
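    import java.util.PriorityQueue

    // A minimal sketch of the DES main loop. Event, Simulation and execute
    // are illustrative names; the linked Kotlin repo has a complete version.
    abstract class Event(val time: Long) {
        abstract fun execute(sim: Simulation)
    }

    class Simulation {
        val queue = PriorityQueue<Event>(compareBy<Event> { it.time })
        var clock = 0L

        fun run() {
            while (queue.isNotEmpty()) {
                val event = queue.poll()  // earliest event first
                clock = event.time        // jump directly to the event's time
                event.execute(this)       // apply the event's state changes
                // then: run the model for clock to bring derived state in
                // sync, and record the state (see "Recording State" below)
            }
        }
    }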
There are two ways events get into the queue. One is driven by the user. Through some means outside of the model, we just know when to look or when to change the state explicitly. For example, in a simulation that tracks how a medical treatment evolves over time, we might decide to look at the state of the simulation every day at 08:00, 12:00 and 19:00 hours. Assuming we want to run the simulation for two weeks, we populate the queue with a just-look-and-don't-do-anything-Event for each of the 14 days at the three wall-clock times just mentioned. Or we just "know" that on day 3, 15:00 hours the simulated patient took a particular pill. So we add a PatientTookPill event for 3@1500.
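In code, populating the queue with those look events could be as simple as this (building on the sketch above; LookEvent and timeAt are illustrative names):

    // A do-nothing event: its only effect is that the state is computed
    // and recorded at its time.
    class LookEvent(time: Long) : Event(time) {
        override fun execute(sim: Simulation) { /* just look */ }
    }

    // timeAt maps a day and a wall-clock hour to the simulation's time
    // stamps (here: minutes since the start).
    fun timeAt(day: Int, hour: Int): Long = (day * 24L + hour) * 60L

    fun populate(sim: Simulation) {
        for (day in 1..14) {
            for (hour in listOf(8, 12, 19)) {
                sim.queue.add(LookEvent(timeAt(day, hour)))
            }
        }
    }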
The other way events enter the queue is through the simulation itself. Let's say that we want to understand how a fever is reduced in our simulated patient after the patient takes some medicine on day three at 15:00. The user inserts an event into the queue for 3@1500; the instruction part of that event changes the state of the system by making the simulated patient take their pill. In addition, it inserts additional events into the queue, to track how the patient's temperature changes over the next two hours: say for 3@1530, 3@1600, 3@1630 and 3@1700. Assuming the patient's temperature is one of the state variables, then just "looking" at the system will run the model at these four times and update the patient's temperature.
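Sketched in the same style as above (assuming, for illustration, that Simulation exposes a mutable state object with a medicationLevel field):

    class PatientTookPill(time: Long) : Event(time) {
        override fun execute(sim: Simulation) {
            sim.state.medicationLevel += 500.0          // the actual state change
            for (offset in listOf(30L, 60L, 90L, 120L)) {
                // schedule a look every 30 minutes for the next two hours
                sim.queue.add(LookEvent(time + offset))
            }
        }
    }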
The previous case inserted events explicitly as part of the PatientTookPill event's instructions. But you can also register monitors that produce events if the state changes in a particular way. For example, we might want to say that whenever a patient's temperature goes above 37.5 °C (the fever threshold), then we want to automatically track the evolution of their temperature on an hourly basis for the next two days. We don't care how or why they develop a fever. It could be through a user "patch" or as a consequence of the model, but we definitely want to track it. You could put this event-creation logic directly into the model, but I find it more understandable and extensible to explicitly introduce the notion of a model monitor. Here's the one for the fever:
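    // A sketch of a model monitor. The Monitor hook and the justBecameTrue
    // query are illustrative; justBecameTrue is explained further below.
    interface Monitor {
        fun stateChanged(sim: Simulation, t: Long)
    }

    class FeverMonitor : Monitor {
        override fun stateChanged(sim: Simulation, t: Long) {
            // did the temperature just cross the fever threshold?
            if (sim.justBecameTrue(t) { it.temperature > 37.5 }) {
                for (h in 1..48) {                      // hourly, for two days
                    sim.queue.add(LookEvent(t + h * 60L))
                }
            }
        }
    }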
So, to summarize, a DES computes the state for all user-defined times plus for all those that are decided by the simulation itself. Like in the Schrödinger case, we assume that the state does not change between subsequent events (technically the state cannot change, because that change would have to be caused by an event, but our assumption is that the system we are simulating -- the real world -- also doesn't change). The system must not be continuous; our weather simulation wouldn't work as a DES.
There are two additional complications. The first one is that you can potentially run into loops where one event triggers a state change which, through monitors or other mechanisms, changes the state back (potentially indirectly). Your queue will never become empty; your main loop runs forever. If this happens you might have made a programming mistake. Or, alternatively, the system you're simulating just works that way. In order to prevent your simulation from running forever, you have to resolve the issue through fixpoint detection, by limiting the number of change-createEvent loops, or by specifying a maximum number of simulation steps after which the simulation is stopped.
As mentioned above, the queue is ordered by time. This ensures that things happen in the order they should. However, there is a slight problem: what happens if multiple things happen at the same time? Let's be more precise about what this means. Let's say we are at time t = 100. An event's instructions lead to producing another event that should happen immediately: right after the current one, but at the same "time" t = 100. To make this possible, we have to distinguish between the logical time (aka wall clock time) and a more dense time we use for execution. How can we achieve this?
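One way of doing it, as a sketch:

    // A time stamp combining the logical clock time with a "dense"
    // counter that orders events scheduled for the same clock time.
    data class Time(val clock: Long, val dense: Int = 0) : Comparable<Time> {
        override fun compareTo(other: Time): Int =
            compareValuesBy(this, other, { it.clock }, { it.dense })

        // an event scheduled "immediately after" a time keeps the clock
        // part and increments the dense part
        fun immediatelyAfter() = Time(clock, dense + 1)
    }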
There's still a problem, though. Let's say there are three events for clock time t = 200: A, X, and D, inserted in this order. Let's say you add another one, B, which, for whatever reason, should be executed before D. This won't easily work with any of the approaches outlined above, because they all rely strictly on insertion ordering. There's no easy solution to this problem; in particular, the ordering is no longer a matter of just the {clock, dense} time, you have to consider the thing that is ordered. So you maybe need a comparator for Events, together with a definition of event type priority:
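    // Sketch: events at the same time are further ordered by a per-event
    // priority (names are illustrative; Time is the class sketched above).
    enum class Priority { HIGH, NORMAL, LOW }   // declaration order = execution order

    open class PrioritizedEvent(val time: Time,
                                val priority: Priority = Priority.NORMAL)

    val eventComparator =
        compareBy<PrioritizedEvent>({ it.time }, { it.priority })

    // the queue simply uses this comparator
    val queue = java.util.PriorityQueue(eventComparator)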
Recording State
As we have seen, many simulations calculate the state of the model based on, among other data, the state of the model at previous times in the simulation's history. If we just kept the state as a bunch of global variables, this would not be possible, because older values vanish as soon as a new one is assigned. Instead you have to use a fresh copy of the state data structure(s) for each time t for which you run the model, at least conceptually. There's a secondary reason for (conceptually) keeping copies of the state for all t. After you have run the simulation, you will probably want to analyse the results. For example, you might want to plot a chart that displays how particular state variables changed over time. To be able to do this you have to remember when which state variable changed to which value. There are different ways of implementing a conceptual copy of the state for every time t; we assume our state is expressed as a class S:
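    import java.util.TreeMap

    // Sketch: one immutable snapshot of the state per executed time.
    // The fields of S are illustrative.
    data class S(val temperature: Double, val bloodPressure: Double)

    class History {
        private val snapshots = TreeMap<Long, S>()   // time -> state copy

        fun record(t: Long, s: S) { snapshots[t] = s }

        // the state at t is the latest snapshot at or before t
        fun stateAt(t: Long): S? = snapshots.floorEntry(t)?.value

        // all times at which anything changed, e.g. for charts
        fun timesOfChanges(): Set<Long> = snapshots.keys
    }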
Changes and Deltas
One way of working with the recorded state is to ask the question: what was the state at some time t? Consider the case where you want to render a chart that shows how your patient's state evolved over time, and you want your chart to have one entry per day, say at noon. You can move through time day by day at 12:00 and ask the system for a state snapshot at each of these times.
Alternatively, you might also want to make sure that the chart you render shows every change of every state variable. In other words, it needs an entry for every time at which something changed. One entry per day -- like the 12 o'clock example above -- is not enough. To this end, you could first query the history for "all times t at which something changed", and then use this list of times to populate your graph, again asking for a state snapshot for each of the t. This might result in a big chart, though; there are likely lots of changes happening in any realistic simulation. So a variant of this is to ask only for certain kinds of changes. You could ask the system for "all times t at which changes of type TemperatureChangeEvent or BloodPressureChangeEvent happened". You'll get a chart that shows all changes of temperature and blood pressure, but not the other ones. And of course you can also restrict the chart to a time range to reduce its size ("I am just interested in what happened between t = 50 and t = 90").
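With the History sketch from above, such a query could look roughly like this (filtering by change type would additionally require recording the change events themselves; the API is illustrative):

    // one chart entry for every recorded change between t = 50 and t = 90
    val times = history.timesOfChanges().filter { it in 50L..90L }
    val chartData = times.map { t -> t to history.stateAt(t) }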
But there are other kinds of queries you can run. Let's revisit the monitor example above. We wrote:
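    sim.justBecameTrue(t) { it.temperature > 37.5 }   // from the fever monitor sketch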
Note the justBecameTrue(t, p) construct. It takes a time t and a property p. It checks whether the property became true at time t. The "became" means that at the time directly before t, the property was false. So you are effectively querying the "first derivative" of the state.
You can also use this to find out when something happened, as in "when did the patient develop a fever?". You would go through all recorded TemperatureChangeEvents in the record and return all times t for which the "has fever" predicate became true (became in the above sense).
Running Models vs. Interactive Simulators
To close the article, let us look at three ways of executing simulations, from a user perspective. The first one runs the simulation completely automatically. Again, the classical example is the weather simulation. You initialize your model with a state for t = 0 and then you run it as far into the future as you need (or chaos theory permits). In a DES case, the system must produce enough events itself to keep itself going.
However, most DES systems I have seen or built take into account external actors, potentially some kind of user. So you typically script certain external events:
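    // A sketch of such a script, using an illustrative testing DSL. The
    // line numbers in the comments are the ones referenced below.
    simulation {
        /* 1 */ at(100) { patientTakesPill() }            // environment behavior
        /* 2 */ assertAt(100) { medicationLevel > 0.0 }   // expected model reaction
        /* 3 */ at(130) { patientTakesPill() }
        /* 4 */ assertAt(135) { temperature < 38.0 }
    }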
We script the behavior of the environment (lines 1 and 3) and check that the model performs as expected (all the asserts). The simulation still runs in an automated way, but with "injected" user behavior. More specifically, it consumes the queue automatically up to the time for which a user interaction is scripted. In the example above, it would run from 0 to 100, then execute lines 1 and 2, run from 101 to 130, then execute line 3, and so on.
The third option runs the system interactively with a UI. Based on the interactions described so far, the UI needs at least the following elements: a view of the current state, a control to move time forward, and a way to inject events at the current time.
When the user moves time forward to some time t, the simulation consumes the queue up to t and evolves the state accordingly. The state for t is shown, and the simulation stops processing the queue. The user can then either continue to move time forward, or inject events. The system executes those as usual and updates the state.
We can also add UI affordances for asserting that certain state variables are "correct" for the current time. The user can use these to express assertions. If this is done, test scripts like the one shown above can be generated from the user's interactive playing with the system.
Summary
Discrete Event Simulation is one of those computer science terms that sounds advanced and fancy -- until you take a closer look and understand that it's not a big deal. The goal of this article (together with the example code at https://github.com/markusvoelter/KotlinDES) was to supply that closer look at the three key ingredients: the event queue, the notion of clock and dense time, plus the recording of state in a way that lets you remember the state for each (discrete) time in the system. DES is extremely useful for any system that evolves its state in discrete steps (hence the name). Looking forward to your feedback via [email protected].