Implementing Pull Systems should not be this hard

The Insanity of Operations Planning and Control

It is already a cliche to quote Einstein’s definition of insanity as doing the same thing over and over and expecting different results. Unfortunately, this cliche can often be applied to the approach many businesses take to production planning and control in the presence of inefficiencies, missed production commitments, stockouts, etc. Instead of rethinking the practices that led us there, we keep repeating the same formulas we were trained on, hoping for different results. Those practices, with some variation, go like this: Start by compiling a list of orders to be fulfilled within a specific timeframe. If not enough orders are available, create a forecast of estimated demand to fill the prescribed production capacity. With the resulting total demand, plan the materials and activities needed to fulfill it. Finally, issue instructions to the shop floor and procurement to execute the plan.

At this point, the plan takes on a life of its own. Once the production schedule is set, ship dates and production forecasts are communicated to management and customers, orders are issued to suppliers, and the production planning job is considered done.

Time and time again, this method creates difficulties when new orders arrive that don’t fit the forecast or, vice versa, when too much is produced for demand that does not materialize. Incidents in production or supply create additional variability and disruptions to the plan. Real-world variability has a way of invalidating the perfectly optimized plans that this top-down planning approach produces.

When speaking about the challenges of production planning with operators of small and medium-sized businesses, common reactions are:

  • "Yes, I know I should plan better and have better forecasts, but we don't have time to collect and process all the necessary information."
  • "I can't afford an ERP system. It's too expensive to run, and we don't have good enough data to feed it anyway."
  • "We have a license for the MRP module, but we don't use it because it takes too long to set up correctly."

In reality, these individuals manage their operations by maintaining spreadsheets of work in progress, communicating directly with suppliers via email or phone, and relying on their experience and intuition to adjust activities to match the actual situation. Uncertainty is accounted for by adding safety buffers of delivery time (risking revenue to more agile competitors) or materials (with all the associated costs).

It's not that these experienced professionals don't know how to do their job, as some ERP vendors might suggest. Rather, the problem stems from the accepted methodologies and tools in traditional ERP/MRP systems. Back in 1997, academics were already publishing about this, describing how conventional planning practices lead to damaging effects in supply chains [1],[2].

It is not that we have not been warned: The Goal [3], a landmark book in operations management, even has a 30th anniversary edition. Yet we keep trying to make operations conform to the idealized world in which MRP algorithms live.

Dr. Goldratt himself and plenty of others point out that this is a losing proposition and that the only way to get a handle on production control is by changing the way we look at the problem. Production systems are highly dynamic, real-time systems. We cannot reduce them to a static planning algorithm; we need to treat them as a controls problem. Pull systems promise to do precisely that and offer a way to:

  • Reduce the cycle time to deliver orders, reducing the need to rely on uncertain forecasts.
  • Manage the variability intrinsic in physical and human processes through closed-loop controls.
  • Reduce incidents, stock-outs and emergencies by using real-time feedback from the shop floor to make timely decisions.

Pull systems in operations are nothing new. Toyota pioneered them in their renowned Toyota Production System [4] decades ago. A quick Google search returns a wealth of references for pull systems, lean manufacturing, waveless operations or similar terms. Technical books like Hopp & Spearman's excellent textbook [5] provide in-depth insights on the behavior of these systems. Even the new AI assistants like Google's Gemini know about the differences between Pull and Push systems when asked the right question:

Push operations produce based on forecasts, while pull waits for actual demand. Push keeps buffer stock, pull uses just-in-time inventory. Choose push for stable demand, pull for flexibility and lower costs. [Gemini AI, July 2024]

Despite all this information and knowledge, whole industries like warehousing or supply chain management are still trying to adopt pull systems, and even in manufacturing planning, the dominant ERP systems still cling to the Top-Down, Plan-and-assign paradigms that characterize push systems.

It is worth revisiting the fundamentals of pull systems and why it is difficult to implement them, if we are to significantly change the way operations management works.

The nature of operations systems

Push and Pull methodologies aim to manage the performance of discrete item processing networks. A simplified, but still very useful, model to reason about these systems includes:

  1. Demand is received from an external source, typically in the form of orders to deliver a product or set of products by a certain time. Orders arrive over time. In the technical lingo, the inter-arrival time is the time elapsed between two consecutive order arrivals.
  2. An order release policy selects the demand to be worked on and in which sequence.
  3. Orders are processed by the system, consuming resources and taking up processing time until they are completed and delivered. To do this, the system must assign resources and capabilities (materials, labor, machine time, ...) to prepare the order.
  4. Processing of an order may be successful, or it may result in a failure to meet the product requirements. The ratio of successful completions to all orders processed over a period of time is called the yield of the operation.
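
Two of the quantities defined in this model, the inter-arrival time and the yield, can be computed directly from a log of orders. The following is a minimal sketch under stated assumptions: the `Order` record, its field names and the sample data are hypothetical and only serve to illustrate the definitions.

```python
from dataclasses import dataclass


@dataclass
class Order:
    arrival_time: float  # when the order was received (hours; hypothetical unit)
    succeeded: bool      # True if the finished order met the product requirements


def inter_arrival_times(orders):
    """Time elapsed between each pair of consecutive order arrivals."""
    arrivals = sorted(o.arrival_time for o in orders)
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]


def process_yield(orders):
    """Fraction of processed orders that completed successfully."""
    if not orders:
        raise ValueError("yield is undefined for an empty set of orders")
    return sum(o.succeeded for o in orders) / len(orders)


# Hypothetical sample: four orders, one of which failed.
orders = [Order(0.0, True), Order(0.5, True), Order(1.5, False), Order(2.0, True)]
print(inter_arrival_times(orders))  # [0.5, 1.0, 0.5]
print(process_yield(orders))        # 0.75
```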

This very simple view is surprisingly powerful for gaining insight into a broad range of operations. A coffee shop, for example, receives orders as customers walk in, processes them under a First-In-First-Out policy, first at the cashier and then at the barista, and delivers the finished orders to the customers waiting to pick them up. Complex supply chains with multiple companies can also be analyzed in this way for a first-cut, aggregate view of their operation.

These systems have been thoroughly studied by the Operations Research and Management Science community. To a first approximation, their performance is characterized by three magnitudes related through two statistical relationships.

At the risk of getting too technical, it is worth delving into these, as they will be critical to understand the difference in behavior between pull and push systems.

  • The throughput (symbolized by λ) is the quantity of value (units, orders, money, ...) that the system generates per unit of time.
  • The cycle time, also called sojourn or lead time (symbolized by W) is the amount of time that the system takes to produce a unit of value.
  • The Work In Progress, or WIP (symbolized by L), is the amount of value (units, jobs, orders, ...) that is being processed by the system at a point in time.

These concepts will be familiar to anybody with a management accounting background, where they appear as measures like inventory turns, payable days and similar.

Little's Law [6] expresses a very strong relationship among these three quantities. It states that, for a stable system measured over a long enough period of time for the statistics to be valid:

L = λ W

That is, the long-run average WIP is the product of the throughput and the cycle time.

For a given throughput, Little's law is a well-behaved linear relationship between work in progress and cycle time.

The second relationship connects the cycle time with the throughput based on how busy a system is. Unfortunately, it is not nearly as clean and universally applicable as Little's law. Different system configurations or policies (e.g. FIFO or LIFO order releases) yield different specific values, often without a known analytical expression, so that Monte Carlo simulations are needed to obtain them. Yet, for most real-life systems, it is safe to say that their behavior follows a curve of the form:

W = K / (1-ρ)

This equation relates the long term average of cycle times with the long term average of utilization (ρ). The constant (K) is a measure of the combined variability in the system operation and its demand. Utilization is equivalent to the throughput of the system (λ) normalized against the maximum potential throughput that the system can produce (symbolized by μ, i.e. ρ = λ / μ ).

This relationship shows that the average cycle time grows without bound as the throughput of the system approaches the maximum that the system can achieve.
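
The two relationships can be written down in a few lines to see how quickly the cycle time escalates. This is a minimal sketch, assuming K = 1 (in whatever time unit K is expressed); the function names and the sample shop are illustrative and not taken from any particular system.

```python
def little_wip(throughput, cycle_time):
    """Little's law: average WIP, L = throughput (lambda) * cycle time (W)."""
    return throughput * cycle_time


def utilization(throughput, max_throughput):
    """rho = lambda / mu: throughput normalized by the maximum throughput."""
    return throughput / max_throughput


def average_cycle_time(rho, k=1.0):
    """W = K / (1 - rho); only defined for a stable system (0 <= rho < 1)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return k / (1.0 - rho)


# Hypothetical shop: 9 orders/hour against a maximum of 10 orders/hour.
rho = utilization(9.0, 10.0)
w = average_cycle_time(rho)
print(f"rho = {rho:.2f}, W = {w:.1f} x K, L = {little_wip(9.0, w):.1f}")

# The cycle time escalates sharply as utilization approaches 1.
for rho in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"utilization {rho:.2f} -> average cycle time {average_cycle_time(rho):6.1f} x K")
```

Going from 90% to 99% utilization multiplies the average cycle time by ten in this model, which is why the shape of the curve matters more than the exact value of K.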

Traditional Operations Planning

Traditional operations planning and execution, as implemented by MRP and ERP systems, is, in essence, a three-step process:

  1. Collect all the information on the demand to fulfill, the resources available and the processing instructions (e.g. process sheets, BOMs, etc.) for the required products.
  2. Create an optimized production schedule containing the required tasks to be performed and the assignment of those tasks to resources and times.
  3. Release the production schedule to the shop for execution of the assigned tasks.

The optimization of the plan can usually be expressed as:

Minimize the cost of the operation processing a given set of orders while complying with the constraints of available resources and the order demand conditions (including delivery times and quality).

This formulation leads naturally to batching orders together to accommodate transportation constraints, minimize or eliminate set up times, tool changes, re-calibrations, etc., as well as stocking enough resources to complete at least one batch. This is the familiar solution adopted by traditional MRP systems.

The core difficulty with this approach is hidden in the problem statement itself. It relies on assumptions that end up creating a number of issues:

  1. Demand is considered independent and unaffected by the characteristics of the operation. In reality, operations with poor delivery times see customers balk and lose demand over time.
  2. A set of orders is known to optimize against. In most real operations, demand will be a mix of firm orders and a demand forecast with uncertainty that depends on the industry, product mix, season, etc.
  3. The condition and availability of resources is known and relatively stable from the time planning starts until the set of orders is complete. Provisioning for rush orders, disruptions, order cancellations, re-work etc. is done outside the planning process by using safety stocks, reserved capacity, etc. that degrade the performance of the optimal solution as a whole.
  4. Processing capacities and processing times of the operation are also well known and stable over time. Note that this is a particularly tricky assumption, as processing capacity of any process is really difficult to assess in real-life conditions, yet it is an essential input to MRP systems.
  5. The yield or quality of the operation is independent of the size of the batch or how much work in progress there is at any point in time.
  6. Ramp-up and Wind-down of activities for a batch are short and do not represent a significant portion of the time resources are dedicated to a batch of orders, or alternatively two batches can overlap during these periods without significant impact to the operation.

An MRP plan is a well-optimized solution for the conditions and assumptions on which it is based. However, it is well known that MRP systems suffer from Nervousness [7], meaning that small changes to the input conditions produce large changes in the resulting plan. Put differently, the resulting plan is very fragile in the presence of changes to its conditions.

The bogeyman hiding behind these problems is that MRP plans do not handle the effects of variability on the performance of the system well, or, as Hopp & Spearman call it, The Corrupting Influence of Variability [5]. The Plan-and-assign approach to tasks is inherently problematic in the presence of variability because it lacks real-time feedback mechanisms to adjust tasks and assignments to the conditions on the shop floor and the variability of demand.

The plans, by their very goal of minimizing the cost to fulfill demand, will drive the system to operate at as high a utilization as possible; otherwise, resources would be wasted. In this situation, any deviation from the conditions of the optimization problem will result in big swings in the performance of the operation. In the simplest possible system (an M/M/1 queue [8], as shown in the figure below), when working at 92% of its capacity, a relative deviation of +/- 3% in the estimate of utilization results in an uncertainty of [-25.7% to +52.7%] in the average cycle time, making the system performance essentially unpredictable.
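
The figures quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that the 3% deviation is relative (3% of the 92% estimate) and uses the W = K / (1-ρ) curve with K = 1; the function and variable names are illustrative.

```python
def mm1_cycle_time(rho, k=1.0):
    """Average cycle time on the W = K / (1 - rho) curve."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return k / (1.0 - rho)


planned_rho = 0.92   # utilization the plan was optimized for
deviation = 0.03     # assumed relative error in the utilization estimate

base = mm1_cycle_time(planned_rho)
low = mm1_cycle_time(planned_rho * (1 - deviation))
high = mm1_cycle_time(planned_rho * (1 + deviation))

change_low = (low / base - 1) * 100
change_high = (high / base - 1) * 100

print(f"planned cycle time: {base:.2f} x K")
print(f"cycle time change:  {change_low:+.1f}% to {change_high:+.1f}%")  # -25.7% to +52.7%
```

The asymmetry of the interval comes from the curvature of the hyperbola: overestimating capacity costs far more than underestimating it saves.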

This problem is further compounded by the fact that businesses cannot make commitments to their customers based on their average cycle time, or they would miss a large share of them. They need to make commitments they expect to meet 90-95% of the time, so the cycle time that can be promised differs substantially from the average performance.
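
The gap between the average and a promise that can be kept is easy to quantify for the simplest case. The sketch below assumes a FIFO M/M/1 queue, for which the cycle time of an individual order is exponentially distributed around its mean; real operations will follow other distributions, so the multipliers are indicative only. The function names are illustrative.

```python
import math


def mm1_promise(mean_cycle_time, service_level):
    """Cycle time met by a fraction `service_level` of orders.

    Assumes exponentially distributed cycle times (FIFO M/M/1 queue).
    """
    if not 0.0 < service_level < 1.0:
        raise ValueError("service_level must be strictly between 0 and 1")
    return -mean_cycle_time * math.log(1.0 - service_level)


def fraction_within(mean_cycle_time, promised_time):
    """Fraction of orders completed within `promised_time` (same assumption)."""
    return 1.0 - math.exp(-promised_time / mean_cycle_time)


mean_w = 12.5  # average cycle time at 92% utilization with K = 1
print(f"orders meeting the average: {fraction_within(mean_w, mean_w):.1%}")
for level in (0.90, 0.95):
    promise = mm1_promise(mean_w, level)
    print(f"{level:.0%} commitment: {promise:.1f} ({promise / mean_w:.2f} x the average)")
```

Under this assumption, a 95% commitment requires promising roughly three times the average cycle time.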

Two further effects compound this problem. The first is the effect of variability either in order flow or processing times. The example above is for the very simple M/M/1 queue, which has a variability constant K = 1.0. If this constant changes, cycle times will further change as illustrated by another simplified system, the G/G/1 queue [9]:
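
One widely used way to estimate the effect of variability in a G/G/1 queue is Kingman's approximation, which scales the queueing delay by the average of the squared coefficients of variation of inter-arrival times and processing times. The sketch below assumes that approximation as a stand-in; it is not necessarily the expression behind the original illustration, and the parameter names are illustrative.

```python
def kingman_cycle_time(rho, ca2, cs2, mean_service_time=1.0):
    """Approximate average cycle time of a G/G/1 queue (Kingman's formula).

    rho: utilization (0 <= rho < 1)
    ca2: squared coefficient of variation of inter-arrival times
    cs2: squared coefficient of variation of processing times
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    variability = (ca2 + cs2) / 2.0
    queue_wait = (rho / (1.0 - rho)) * variability * mean_service_time
    return queue_wait + mean_service_time


# Same 92% utilization, three hypothetical levels of variability.
for label, cv2 in (("low", 0.25), ("M/M/1", 1.0), ("high", 2.0)):
    w = kingman_cycle_time(0.92, cv2, cv2)
    print(f"{label:>6} variability (cv^2 = {cv2}): cycle time {w:.2f}")
```

With both squared coefficients of variation equal to 1, the approximation reduces to the M/M/1 value of 12.5 used in the example above; halving or doubling the variability moves the cycle time almost proportionally at this level of utilization.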

The second effect is the instability of the system with even the slightest inaccuracy of the input parameters. If the capacity of the production process is different from the estimate, utilization will change correspondingly for a given production schedule. As shown above, even a 3% deviation from the estimated utilization of 92% may result in disastrous swings in cycle time.

When the variability can be mitigated, as is possible when cycle time commitments are not demanding and production schedules are predictable, traditional systems manage to produce and implement their production plans effectively. In the more dynamic environment of modern fulfillment, where order delivery times are a competitive advantage and flexibility in the marketplace is an asset, these systems increasingly struggle to support competitive companies.

The Pull Revolution

Toyota’s production system was developed in the late 1940s and early 1950s by Taiichi Ohno and Eiji Toyoda and rolled out at scale in Toyota at the beginning of the 1960s. It became known to other manufacturers in the late 1970s and it is now recognized as the leading approach to organize operations across many industries. Toyota’s system contains many interlocking practices that reinforce each other. The one most relevant to our discussion is the introduction of the Kanban (Card in Japanese) system to control when production tasks are initiated, in contrast with the prescribed schedule of tasks that MRP systems generate.

The core concept of the Kanban system is that a processing station will only become active and produce its deliverables when presented with a demand card, or Kanban, by a downstream station that requires them. The number of cards in circulation for a particular item is fixed during normal operation and is only changed when the company wants to change its production mix, capacity or plant layout.

Cards are attached to the Work In Progress inventory when it is produced and detached when it is consumed. They are then sent back as requests for more items when they are needed by the consuming station. For the last station in the Kanban loop, the request cards are provided by the order release process, also subject to a predetermined maximum number of cards in circulation. The common Kanban implementation uses cards between any two stations in a production process. Other variations, like CONWIP, use a global number of cards for an end-to-end process or loop.
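
The card mechanics described above can be captured in a very small model. The following is a hypothetical sketch of a single loop between a producing and a consuming station, not the design of any particular Kanban product; the class and method names are invented for illustration, and the loop is assumed to start with an empty buffer and every card waiting as a request.

```python
class KanbanLoop:
    """Hypothetical minimal model of one Kanban loop between two stations."""

    def __init__(self, total_cards):
        if total_cards < 1:
            raise ValueError("a loop needs at least one card")
        self.total_cards = total_cards
        self.request_cards = total_cards  # cards waiting at the producer as requests
        self.attached_cards = 0           # cards attached to WIP inventory

    def produce(self):
        """The producer makes one unit only if a request card is available."""
        if self.request_cards == 0:
            return False  # no demand signal: the station stays idle
        self.request_cards -= 1
        self.attached_cards += 1  # the card travels with the produced unit
        return True

    def consume(self):
        """The consumer uses one unit; its card is detached and sent back."""
        if self.attached_cards == 0:
            return False  # nothing available to consume
        self.attached_cards -= 1
        self.request_cards += 1  # the detached card becomes a new request
        return True


loop = KanbanLoop(total_cards=3)
print([loop.produce() for _ in range(5)])  # [True, True, True, False, False]
print(loop.consume(), loop.produce())      # True True
print(loop.attached_cards)                 # 3: WIP never exceeds the card count
```

Because production is gated by the cards, the work in progress in the loop can never exceed the number of cards in circulation, whatever happens upstream or downstream.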

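The Kanban system differs from traditional planning in two critical ways:

  • Tasks start in response to demand signals (cards) from downstream stations, rather than at the times prescribed by a precomputed schedule.
  • The amount of work in progress, fixed by the number of cards in circulation, is the control variable, rather than a planned throughput or utilization.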

The effects of these two differences on the real-world performance of the system are dramatic. By adapting the start times of tasks based on the downstream status of the shop floor and demand, the composite variability of processing times across multiple stations is greatly reduced, moving the operating curve of the system to the lower-variability ones in the previous figure.

Using the amount of Work In Progress as the control variable moves the system from the volatility of a hyperbolic curve (as shown above) to a linear relationship dictated by Little’s law. Consider the same example with a deviation of 3%, this time on the number of cards in the system (the WIP, L), which is the Pull system control variable:

Even larger deviations of the estimated capacity or throughput (e.g. 10%) result in much more benign changes in the cycle time of the system:
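
The same arithmetic as in the Push example can be repeated with Little's law rearranged as W = L / λ. The sketch below is only an illustration: the number of cards and the throughput are hypothetical values chosen to give the same 12.5 baseline cycle time, and the function names are invented.

```python
def cycle_time(wip, throughput):
    """Little's law rearranged: W = L / lambda."""
    return wip / throughput


def relative_change(new, base):
    """Change of `new` with respect to `base`, in percent."""
    return (new - base) / base * 100.0


cards = 100        # hypothetical number of cards in circulation (WIP, L)
throughput = 8.0   # hypothetical throughput (lambda), units per hour
base = cycle_time(cards, throughput)  # 12.5 hours

# A 3% deviation in the control variable (cards) moves cycle time by 3%.
for deviation in (-0.03, 0.03):
    w = cycle_time(cards * (1 + deviation), throughput)
    print(f"cards {deviation:+.0%}      -> cycle time {relative_change(w, base):+.1f}%")

# A 10% error in the throughput estimate stays in the same order of magnitude.
for deviation in (-0.10, 0.10):
    w = cycle_time(cards, throughput * (1 + deviation))
    print(f"throughput {deviation:+.0%} -> cycle time {relative_change(w, base):+.1f}%")
```

A 3% deviation in the number of cards moves the cycle time by exactly 3%, and a 10% error in throughput moves it by roughly 9% to 11%, compared with the -25.7% to +52.7% swing of the utilization-driven example.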

Despite the decades since the principles of Kanban and pull systems came out of Toyota, and the countless books, courses and academic papers available, the adoption of these principles is still, with notable exceptions, constrained to large companies and production facilities. Adoption of these methodologies still depends heavily on being able to afford expertise and on the ability to develop in-house systems and proprietary software. Despite marketing claims, the large ERP vendors are still largely bound to MRP methodologies. In the logistics and warehousing industry, the term that has been adopted for pull systems is Waveless Fulfillment and, except for industry leaders like Amazon and Walmart, very few operations use these methodologies. Looking at supply chains and inter-company operations, the situation is even worse. Supply Chain Management Systems still rely on the “Plan and Command” paradigm to determine inventory levels, placement, etc., despite the difficulties these systems have in managing supply chains, as explored in this previous article [10].

There are three main reasons behind the slow adoption of Pull systems despite their proven benefits.

The messy interface between the physical world and the information world

Pull systems require a close alignment between the physical reality on the shop floor and the information systems that manage it. This interface between the physical world and the information world is very difficult to implement. In automated operations, it requires sensors to detect and measure the flow of materials, progress of tasks, deviations from standards, etc., in an environment that is very hostile to the signal (EM interference, dirt, ...) as well as to the sensors themselves (vibrations, temperature, collisions, ...). When human operators are involved, the data collection needs to be ergonomic, tolerant of user input errors, and compliant with labor regulations and potential union contract terms.

Traditional systems steer clear of this complexity. ERPs produce plans and schedules, leaving their execution to other systems or manual controls.

Pull systems require different software and systems architecture

Current enterprise systems mostly follow Client-Server and WebApp architectures based on a transactional model of computation against a persistent data store, typically, but not always, a relational database. This architecture was developed in the service of transactional business processes. It is well suited to the Plan and Optimize approach that ERPs support and has enabled a wealth of capabilities in the last 30 years with the rise of commercial enterprise software (e.g. SAP, Oracle, BlueYonder, ...) and its evolution to SaaS models. It is mature and well understood, with plenty of tools, training and professionals available, making adoption and support for these systems accessible to most companies.

Pull systems, in contrast, have characteristics in common with control systems. Borrowing that terminology, Push operations are Open-Loop control systems. Pull operations, in contrast, adopt a Closed-Loop control approach, using signals from the process (e.g. the Kanban cards) to modify the inputs to the system (e.g. order releases to the shop floor). Pull systems require reactive software architectures more similar to industrial controls than to transactional business processes, with tools and skills that are more specialized and less available in the labor market.
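
The difference between the two control styles can be illustrated with a deliberately simple, deterministic simulation of a single-station shop whose real capacity is lower than the plan assumed. This is a sketch under stated assumptions: the policy names, the parameters and the one-station model are all hypothetical, and real shops add variability that this toy example leaves out.

```python
def simulate_release(policy, steps, planned_rate, actual_capacity, wip_cap):
    """Return the WIP left on the shop floor at the end of each time step."""
    wip = 0
    history = []
    for _ in range(steps):
        if policy == "open_loop":
            # Push: release what the plan says, regardless of the shop floor.
            released = planned_rate
        elif policy == "closed_loop":
            # Pull: release only what the free cards (wip_cap - wip) allow.
            released = max(0, min(planned_rate, wip_cap - wip))
        else:
            raise ValueError(f"unknown policy: {policy}")
        wip += released
        wip -= min(wip, actual_capacity)  # the shop completes what it can
        history.append(wip)
    return history


# The plan assumed 10 units per step, but the shop can only complete 9.
print(simulate_release("open_loop", 8, 10, 9, 12))    # [1, 2, 3, 4, 5, 6, 7, 8]
print(simulate_release("closed_loop", 8, 10, 9, 12))  # [1, 2, 3, 3, 3, 3, 3, 3]
```

In the open-loop case, the backlog grows by one unit every step for as long as the plan stays in force. In the closed-loop case, the feedback from the shop floor throttles the releases and the work in progress settles at a bounded level.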

Pull systems don’t generate management-friendly plans

The nature of pull systems, and their great advantage, is that they adapt the execution of tasks to the conditions on the shop floor and the real-time variations of production demand. This adaptability negates the possibility of having a set-in-stone production plan with firm dates and commitments. Organizations, in particular managers and executives, are out of their comfort zone in these situations. The plans and schedules that Push systems produce are well known to be faulty and obsolete as soon as they are generated, and any incidents, rush orders, etc. invalidate the optimization claims of those plans. Yet established organizational processes, controls and performance reviews rely on those generated plans, which in itself creates a whole other set of problems, such as the sandbagging of schedules.

The previous paragraph presents a very black-and-white picture to make the point. In reality, Push plans contain elements to cope with uncertainty, and pull systems can produce schedule estimates and targets, but the point remains largely valid. The introduction of Pull systems will not be successful without an accompanying change management process to set expectations across the affected organizations and establish new metrics, collaboration practices and organizational interfaces, as advocated by Dr. Deming in his book.

Real world

There are very few commercial systems supporting Pull operations that interface well with the physical world, are agile and reactive to the feedback from the shop floor and provide good visibility into the operations for managers and executives. Arda is a new entrant in the space that is seeing early success with small and medium complexity operations and already making inroads into larger and more complex operations. Arda is a streamlined implementation of the Kanban principles, currently focused on manual or hybrid manufacturing operations.

The central element of the system is a Card that represents a quantity of inventory or work in progress used in the production process. Cards are defined and created with a simple-to-use, web-based interface. On the shop floor, the cards can be printed on demand and attached to the physical inventory, providing an easy visual cue for operators.

Cards contain scannable codes that make it easy for those operators to double check the information in the card and to update the status of the system by simply scanning the card as it arrives or departs a station.

Configurable rules determine when to trigger purchasing or production orders as cards cycle through the system. The system allows defining as many Kanban Loops (with their associated cards) as necessary for an operation and seamlessly links the demand (e.g. coming from an e-commerce site) and replenishment (e.g. generating purchase orders) to the loops that represent finished goods and raw materials respectively.

It addresses user adoption challenges by using concepts taken from consumer e-commerce, like shopping carts, to represent the demand pull in a way already familiar to a digital-native workforce.

Finally, the system also provides Kanban-specific metrics and analysis of the operation, allowing it to predict the velocity of production and easily control it by changing the number of cards in circulation as described in the Pull Revolution section above.

Arda’s system is deceptively simple from the outside, with a powerful Kanban model at its core, which goes a long way towards overcoming the obstacles for adoption outlined above:

If it sounds too simple, it is because it really is that simple. The founders of Arda created the system in response to the needs of their own manufacturing operation and have built into their product an obsession with pragmatic results. While it is still an early product, it already beats the results of more traditional production control solutions and, given the sound principles it is built on, will be able to add scale and functionality very rapidly.

Thank you to Uriel Eisen and Kyle Henson from Arda Cards for their contribution to the Real world section and the images from Arda’s system implementations. The author does not have any financial or contractual relationship with Arda systems.


© Miguel Pinilla & Salduba Technologies. All rights reserved.

This work is licensed under the Creative Commons License CC BY-NC-SA 4.0.

