Simplifying Spaghetti Process Maps for Process Mining: Research Advances
There is no doubt that process mining has considerably lifted the game in the field of business process management. With process mining, managers and analysts can make their business process improvement decisions based on hard data stored in Customer Relationship Management (CRM) systems, Enterprise Resource Planning (ERP) systems, IT service management systems, or manufacturing execution systems.
One of the most appealing value propositions of process mining is that it allows managers and analysts to visualize how the process is actually executed, based on digital traces left behind by process workers as they perform their daily work. In this way, process mining brings transparency and enables a fact-driven analysis of business process improvement opportunities.
Under the hood, this value proposition is enabled by automated process discovery techniques. An automated process discovery technique converts an event log — a table where each row represents the execution of one step of a business process — into a picture showing the directly-follows relations between activities in the process. This picture is called a process map. Most process mining tools allow you to visualize your processes by means of process maps, or by means of BPMN diagrams derived from process maps.
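To make the conversion from event log to process map concrete, here is a minimal sketch in Python. It assumes a toy event log given as (case ID, activity, timestamp) rows; the function name `discover_dfg` and the sample data are made up for illustration and do not come from any particular process mining tool.

```python
from collections import Counter, defaultdict

def discover_dfg(event_log):
    """Count directly-follows pairs in rows of (case_id, activity, timestamp)."""
    traces = defaultdict(list)
    # Group events by case, with each case ordered by timestamp.
    for case_id, activity, _ts in sorted(event_log, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    dfg = Counter()
    for activities in traces.values():
        # Each pair of consecutive activities in a case is one directly-follows relation.
        for a, b in zip(activities, activities[1:]):
            dfg[(a, b)] += 1
    return dfg

# Hypothetical two-case log: c2 loops once through "Reorder".
log = [
    ("c1", "Receive order", 1), ("c1", "Check stock", 2), ("c1", "Ship", 3),
    ("c2", "Receive order", 1), ("c2", "Check stock", 2), ("c2", "Reorder", 3),
    ("c2", "Check stock", 4), ("c2", "Ship", 5),
]
dfg = discover_dfg(log)
for (a, b), freq in dfg.items():
    print(f"{a} -> {b}: {freq}")
```

Each key of `dfg` is an arc of the process map and each value is the frequency of that arc. Real event logs have thousands of cases and dozens of activities, which is where the trouble starts.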
But there is a catch to all of this. When you apply an automated process discovery technique to the full execution dataset of a process (e.g. an order-to-cash or a manufacturing process) without any simplification, you may get something like the following.
Figure 1. Full process map discovered from a real-life event log extracted from a manufacturing execution system.
This huge "spaghetti-like" picture of a process defeats the whole purpose of process mining. You wanted to understand how your process is executed in reality? Well, all you can see is that your process is complex, full of variations, and exceptional paths! How can you find your bottlenecks, rework and over-processing paths in this pile of spaghetti?
Ideally, you want a simplified process map (or a BPMN model) that captures the most frequent behavior, like this one:
Figure 2. Simplified process map of the manufacturing process showing only frequent arcs while keeping the process map connected.
For this reason, pretty much every automated process discovery algorithm incorporates a filtering or simplification step. One approach to produce simplified process maps is to ignore the least frequent pathways (case variants) of the process, or to ignore case variants that contain infrequent pairs of consecutive activities. However, if you use this approach, you will not see all the activities in your process. Some activities will just go missing. Or you might end up looking at your happy paths only. Again, this defeats the purpose of process mining. Why would you want to miss out on some of the activities in the process that might be causing the most delay and the worst issues?
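The effect of variant-based filtering can be sketched in a few lines of Python. This is an illustration on made-up data, not the filter of any specific tool; the function name `keep_frequent_variants` and the `min_share` parameter are assumptions introduced for the example.

```python
from collections import Counter

def keep_frequent_variants(traces, min_share):
    """Keep only the traces whose variant covers at least min_share of all cases."""
    variant_counts = Counter(tuple(t) for t in traces)
    total = len(traces)
    return [t for t in traces if variant_counts[tuple(t)] / total >= min_share]

# Hypothetical log: 8 happy-path cases and 2 cases that go through "Rework".
traces = (
    [["Receive", "Check", "Ship"]] * 8
    + [["Receive", "Check", "Rework", "Check", "Ship"]] * 2
)
kept = keep_frequent_variants(traces, min_share=0.5)

activities_before = {a for t in traces for a in t}
activities_after = {a for t in kept for a in t}
missing = activities_before - activities_after
print(missing)  # {'Rework'}
```

The rework activity, arguably the one an analyst most wants to see, is gone from the filtered log and therefore from any process map discovered from it.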
A second approach is to surgically remove the most infrequent arcs in the process map. So you get a process map that keeps all the activities in your process, and that is representative of most or all case variants. The trick here is that if we indiscriminately remove the most infrequent arcs, we do not get a process map in one single piece. Instead, we get a bunch of disconnected activities or process map fragments.
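The disconnection problem is easy to reproduce. The sketch below, again on a made-up process map, drops every arc below a frequency threshold and then checks which activities can still be reached from the start; the names `drop_infrequent_arcs` and `reachable` are hypothetical helpers written for this example.

```python
def drop_infrequent_arcs(dfg, min_freq):
    """Indiscriminately remove every arc whose frequency is below min_freq."""
    return {arc: f for arc, f in dfg.items() if f >= min_freq}

def reachable(arcs, source):
    """Return the activities reachable from source by following the arcs."""
    successors = {}
    for a, b in arcs:
        successors.setdefault(a, []).append(b)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Hypothetical process map: arc -> frequency.
dfg = {
    ("Start", "Check"): 100,
    ("Check", "Ship"): 100,
    ("Check", "Rework"): 5,
    ("Rework", "Check"): 5,
    ("Ship", "End"): 100,
}
filtered = drop_infrequent_arcs(dfg, min_freq=10)
activities = {node for arc in dfg for node in arc}
stranded = activities - reachable(filtered, "Start")
print(stranded)  # {'Rework'}
```

The activity "Rework" is still in the map, but no arc leads to it or away from it, so the picture no longer tells you where rework happens.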
So here's a key question that most automated process discovery techniques need to solve under the hood:
How can we simplify a process map to get a well-connected process map, while keeping all the activities that the user finds relevant, and the arcs with the highest frequency?
In a recent research article with David Chapela de la Campa, Manuel Mucientes, and Manuel Lama, we show that this apparently simple problem is NP-hard. What does this mean? It means that, unless a condition called "P = NP" holds (which most theoretical computer scientists believe is not the case), there is no algorithm that optimally simplifies every process map in polynomial time. In practice, exact algorithms take exponential time in the worst case and would be useless in complex cases.
Fortunately, this is not the end of it. If we just want to get a "good enough" simplified process map (not necessarily optimal), there are algorithms to simplify process maps that are fairly efficient for practical applications. In our recent research article, we look into an algorithm used by existing process mining tools, and we compare it against alternative ones, including the simplification algorithm used by the Split Miner algorithm (the algorithm used by Apromore) as well as an algorithm derived from a well-known graph algorithm: Edmonds' optimum branching algorithm.
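To give a feel for what a "good enough" simplification can look like, here is a naive greedy sketch. It is not the Split Miner filter, not the Edmonds-based algorithm, and not necessarily the algorithm used by any existing tool; it is only an illustration with no optimality guarantee. It assumes a single start and a single end activity, and it treats a map as well connected when every activity is reachable from the start and can reach the end. All function names and the sample data are made up.

```python
def _reach(arcs, source):
    """Return the nodes reachable from source by following the arcs."""
    successors = {}
    for a, b in arcs:
        successors.setdefault(a, []).append(b)
    seen, stack = {source}, [source]
    while stack:
        for nxt in successors.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_well_connected(arcs, activities, start, end):
    """Every activity lies on some path from start to end."""
    forward = _reach(arcs, start)
    backward = _reach([(b, a) for a, b in arcs], end)
    return activities <= forward and activities <= backward

def greedy_filter(dfg, start, end, min_freq):
    """Drop arcs below min_freq, rarest first, unless that breaks connectivity."""
    activities = {node for arc in dfg for node in arc}
    kept = dict(dfg)
    for arc in sorted(dfg, key=dfg.get):
        if dfg[arc] >= min_freq:
            break  # all remaining arcs are frequent enough to keep
        trial = {a: f for a, f in kept.items() if a != arc}
        if is_well_connected(trial, activities, start, end):
            kept = trial
    return kept

# Hypothetical process map: arc -> frequency.
dfg = {
    ("Start", "Check"): 98,
    ("Start", "Ship"): 2,
    ("Check", "Ship"): 98,
    ("Check", "Rework"): 6,
    ("Rework", "Check"): 6,
    ("Ship", "End"): 100,
}
simplified = greedy_filter(dfg, "Start", "End", min_freq=10)
for (a, b), freq in simplified.items():
    print(f"{a} -> {b}: {freq}")
```

On this toy map, the rare shortcut from "Start" to "Ship" is removed, while the two infrequent arcs around "Rework" survive because removing either one would cut that activity off from the rest of the process.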
In the article, we show that the Split Miner simplification algorithm and the variant of Edmonds' algorithm are highly efficient in practical use cases.
For those who want to know more, the article is available below [1]. For completeness, the second reference below [2] describes an alternative simplification approach by Sander Leemans, Erik Poppe and Moe Thandar Wynn, which does not retain all activities and case variants. This approach is representative of the first family of approaches mentioned above. In line with standard conventions in the field of process mining research, these articles use the term "directly-follows graph" or DFG to refer to a process map.
References
[1] David Chapela-Campa, Marlon Dumas, Manuel Mucientes, Manuel Lama: Efficient edge filtering of directly-follows graphs for process mining. Information Sciences, Volume 610, Pages 830-846, September 2022.
[2] Sander J. J. Leemans, Erik Poppe, Moe Thandar Wynn: Directly Follows-Based Process Mining: Exploration & a Case Study. International Conference on Process Mining (ICPM), Pages 25-32, October 2019.
Junior Business Analyst @'escent
1 year ago: Appreciate the insights on process mining. However, it appears that the complexity issue may not have been fully addressed, as the suggested simplification solutions may lead to the omission of relevant activities, limiting our understanding of processes. Nevertheless, I found your perspective on this topic quite interesting. Thank you for sharing!
Gestión de Procesos/Gestión por Procesos/Arquitecto de Negocio/Analista de Negocio
2 years ago: A method for really analyzing those branches that are NOT visible in business processes and procedures. Marlon Dumas, thank you for your contributions.
Data Scientist - Business Analyst - Data & Process Mining - Big data - Healthcare - R&D
2 years ago: Nicolas Marzin
Researcher and Developer, Object Event Modeling & Simulation (sim4edu.com/reading/oems)
2 years ago: Can such Spaghetti-like process maps (in the form of Directly-Follows Graphs) be useful for human experts, even if they are cleaned up algorithmically? I don't think so. Process mining will in most real-world cases not be able to generate a useful model, unlike a human expert who is able to abstract the process knowledge gathered from process stakeholders into a "clean" readable model.