Efficient Concurrency: a Quick-Thinking Cashier

It was a busy day at a small shop, and the line at the checkout counter was getting longer. The cashier efficiently processed customers one after another. As the next customer approached the counter, they placed their items on it, and the cashier scanned them, announcing the total amount due.

The customer reached for their wallet and handed the cashier a credit card for payment. The cashier had to wait for confirmation from the bank (2G speed) to complete the credit card transaction, yet the queue of customers continued to grow.

The quick-thinking cashier decided to process other customers while waiting for the bank's confirmation. The cashier could efficiently manage the queue by handling cash payments in the meantime, then return to the first customer once the bank confirmation became available.

To stay organized, the cashier uses a method similar to maintaining a task list in a notebook: a queue of postponed customers to check on and process later. By managing this list, the cashier can efficiently revisit these marked tasks once the bank confirmations become available. This system allows the cashier to handle multiple customers sequentially without losing track of anyone.

Exploring Software Concurrency

This small story is a simplified analogy for the "green threads" mechanism found in many languages.

Here, the cashier's role is analogous to that of a thread in a software process:

  • Payment by card represents I/O-bound tasks.
  • Payment by cash represents CPU-bound tasks, since it involves counting the amount and giving change.
  • The notebook of postponed tasks represents "green threads" in modern software languages.

When an I/O-bound operation is encountered (like waiting for network data or disk I/O), the "Green Thread" can yield control back to the event loop, allowing it to handle other tasks while waiting for the I/O operation to complete.
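
To make the analogy concrete, here is a minimal sketch (in Go, which is used for all code examples below; the customer numbers and the two-second bank delay are invented) of one lightweight task waiting on simulated I/O while the main flow keeps serving cash customers:

```go
package main

import (
	"fmt"
	"time"
)

// payByCard simulates an I/O-bound task: the goroutine parks while
// "waiting for the bank", and the runtime schedules other work.
func payByCard(customer int, done chan<- string) {
	time.Sleep(2 * time.Second) // stand-in for the slow bank confirmation
	done <- fmt.Sprintf("customer %d: card payment confirmed", customer)
}

func main() {
	done := make(chan string)
	go payByCard(1, done) // note customer 1 in the "notebook" and move on

	// While waiting, the cashier keeps serving cash customers.
	for c := 2; c <= 4; c++ {
		fmt.Printf("customer %d: paid in cash\n", c)
	}

	fmt.Println(<-done) // the confirmation arrived; finish customer 1
}
```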

This relates to the concurrency models used in many languages: Node.js with Promises, Ruby with Fibers, Go with goroutines, Java with virtual threads (Project Loom), etc.

The key difference in handling concurrency between languages lies in their default behavior and the level of control given to developers. There are basically two main models, preemptive and cooperative scheduling, and there are also mixes, such as "hybrid" or "semi-preemptive" scheduling.

Cooperative model: threads voluntarily yield control to the scheduler when they encounter a blocking operation or complete their current work.

Preemptive model: the scheduler (typically the operating system) handles blocking and switching between threads automatically, without the thread's involvement. The scheduler can interrupt and preempt a running thread at any time, allowing other threads to execute.
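
A minimal Go sketch of the cooperative end of this spectrum, assuming a single execution unit: runtime.Gosched() is an explicit, voluntary yield point. In a purely preemptive model, the scheduler could instead interrupt these loops anywhere, with no yield calls at all. (Since version 1.14, Go itself can also preempt long-running goroutines asynchronously, which is why it is often described as semi-preemptive.)

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	runtime.GOMAXPROCS(1) // single execution unit, like one cashier
	var wg sync.WaitGroup

	for _, name := range []string{"A", "B"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			for i := 0; i < 3; i++ {
				fmt.Printf("goroutine %s, step %d\n", n, i)
				runtime.Gosched() // cooperative: voluntarily yield to the scheduler
			}
		}(name)
	}
	wg.Wait()
}
```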

Node.js automatically handles concurrency for I/O operations using its non-blocking I/O model and event loop, without requiring explicit concurrency constructs from developers. It does not have built-in support for "green threads" or preemptive models. [1]

In Go, developers have explicit control over concurrency and must use goroutines and channels to achieve concurrent execution. Yet Go's scheduler switches between goroutines automatically, providing a cooperative scheduling model that feels preemptive. [2]
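
A minimal sketch of this explicit model: goroutines are started with the go keyword and communicate over channels, while the scheduler decides when each one runs. The worker and job names are illustrative.

```go
package main

import "fmt"

// worker drains the jobs channel; several workers can run concurrently.
func worker(id int, jobs <-chan int, results chan<- string) {
	for j := range jobs {
		results <- fmt.Sprintf("worker %d processed job %d", id, j)
	}
}

func main() {
	jobs := make(chan int, 5)
	results := make(chan string, 5)

	for w := 1; w <= 2; w++ {
		go worker(w, jobs, results) // explicit concurrency: the developer starts goroutines
	}
	for j := 1; j <= 5; j++ {
		jobs <- j
	}
	close(jobs) // no more jobs; the workers' range loops will finish

	for i := 0; i < 5; i++ {
		fmt.Println(<-results)
	}
}
```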

In a similar way, Java's virtual threads (Project Loom) automatically yield control when they encounter a blocking operation, such as I/O or synchronization primitives. [3]

Fibers in Ruby are cooperatively scheduled, meaning that developers must manually yield control from one fiber to another using the Fiber.yield method. [4] Alternatively, a fiber scheduler (set via Fiber.set_scheduler in Ruby 3+) can make fibers switch automatically, similar to Go, when they reach a blocking operation. [5]

When we have a single thread with "green threads" (or any lightweight concurrency construct like fibers, goroutines, etc.) running on it, the underlying approach follows the general concept of the Reactor pattern or a variant of the Proactor pattern. In the case of lightweight concurrency constructs, the underlying implementation often involves a single operating system thread that multiplexes the execution of these lightweight threads or tasks. This multiplexing is typically achieved through an event loop or a similar mechanism. Depending on the language, developers have more or less control over when to switch.
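
A hedged way to approximate this single-threaded multiplexing in Go: runtime.GOMAXPROCS(1) limits the runtime to one goroutine executing Go code at a time (not strictly one OS thread, since blocked syscalls can spawn more), so all tasks are interleaved over a single execution unit, much like an event loop:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	runtime.GOMAXPROCS(1) // one "cashier": goroutines take turns on it

	var wg sync.WaitGroup
	for i := 1; i <= 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			fmt.Printf("task %d: started\n", id)
			time.Sleep(100 * time.Millisecond) // blocking wait: the task is parked
			fmt.Printf("task %d: resumed\n", id)
		}(i)
	}
	wg.Wait() // all three waits overlap despite the single execution unit
}
```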

However, it's worth noting that many languages, like Java, Go, and Ruby 3+, implement interesting solutions to achieve true parallel execution out of the box. While different languages may have varying approaches and techniques for handling parallel execution, this story focuses on the basic principles of efficiently handling tasks concurrently on a single execution unit (the cashier).

Lightweight Concurrency

The key idea behind all these lightweight concurrency mechanisms is to avoid blocking the main execution flow (the cashier). Instead, they allow switching to other tasks or execution contexts (other customers) while waiting for the I/O operation to complete.

This is similar to using lightweight concurrency mechanisms like green threads instead of creating additional OS-level threads (comparable to hiring more cashiers). Several factors motivate the choice:

Resource Efficiency: Creating and managing OS-level threads can be resource-intensive, as each thread requires its own stack, memory, and kernel resources. In contrast, lightweight constructs like green threads are far cheaper to create and manage, especially when dealing with a large number of concurrent tasks or connections (see the sketch after this list of factors).

Context Switching: Switching between OS-level threads involves costly context switching by the operating system kernel, which can introduce significant overhead, especially when dealing with a large number of threads.

Simplicity: Lightweight concurrency constructs can provide a simpler programming model compared to traditional thread-based concurrency, especially for handling I/O-bound tasks or implementing event-driven architectures.

Portability: Many modern applications are heavily I/O-bound, meaning they spend a significant amount of time waiting for the network, disk, and other external resources. Lightweight concurrency mechanisms are often implemented at the language or runtime level, making them more portable across different operating systems and hardware architectures than the low-level threading primitives provided by the OS.
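
A rough, hedged way to see the resource-efficiency point in Go: spawn 100,000 goroutines and estimate how much memory the runtime obtained from the OS per goroutine. The exact numbers vary by Go version and platform; OS threads, by contrast, typically reserve megabytes of stack each.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	const n = 100_000
	block := make(chan struct{})
	var started sync.WaitGroup
	started.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			started.Done()
			<-block // keep the goroutine parked so its stack stays allocated
		}()
	}
	started.Wait()

	runtime.ReadMemStats(&after)
	fmt.Printf("~%d KiB per goroutine (rough estimate)\n",
		(after.Sys-before.Sys)/n/1024)
	close(block)
}
```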

Concurrency != Parallelism

In an analogous manner, one might consider hiring more cashiers to avoid customer wait times. However, this approach might not always be viable or efficient. Hiring one more cashier when all customers pay by card won't significantly improve the total wait time. All of them will still wait for bank confirmation. That's the reason why executing I/O-bound tasks in parallel (adding more cashiers) doesn't give a huge performance boost since the bottleneck lies in waiting for the I/O operation to complete, not in the computational power.

However, when all customers pay by cash, having more cashiers who can work in parallel may help a lot. They can shrink the waiting queue more quickly, since handling cash payments (a CPU-bound task) doesn't involve waiting for external resources.

The ideal scenario is a mix of both card and cash tasks. In this case, having at least two cashiers can effectively utilize parallelism for cash tasks. While one cashier is busy with a cash payment, another cashier can simultaneously handle another one. This parallel execution of cash tasks (CPU-bound) improves overall throughput. Additionally, each cashier can leverage concurrency by switching between card and cash tasks within their own execution flow, further improving resource utilization.
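
A hedged sketch of this mixed checkout line in Go: two cashier goroutines drain one queue of customers, where card payments mostly wait (simulated with time.Sleep) and cash payments burn CPU (simulated with a counting loop). All names and timings are invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type payment struct {
	customer int
	byCard   bool
}

// cashier serves customers from the shared line until it is closed.
func cashier(id int, line <-chan payment, wg *sync.WaitGroup) {
	defer wg.Done()
	for p := range line {
		if p.byCard {
			time.Sleep(50 * time.Millisecond) // I/O-bound: wait for the bank
		} else {
			sum := 0
			for i := 0; i < 1_000_000; i++ { // CPU-bound: count the change
				sum += i
			}
			_ = sum
		}
		fmt.Printf("cashier %d served customer %d\n", id, p.customer)
	}
}

func main() {
	line := make(chan payment)
	var wg sync.WaitGroup

	for id := 1; id <= 2; id++ { // two cashiers working in parallel
		wg.Add(1)
		go cashier(id, line, &wg)
	}
	for c := 1; c <= 10; c++ {
		line <- payment{customer: c, byCard: c%2 == 0}
	}
	close(line)
	wg.Wait()
}
```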

Conclusions

It's important to find the right balance when adding more cashiers (threads). While having more cashiers can improve parallelism for CPU-bound tasks, it can also introduce overhead due to context switching and resource management by the operating system. It's also worth noting that the degree of parallelizability varies among CPU-bound tasks: some algorithms, such as merge sort and quicksort, can be parallelized effectively, while others may not benefit as much due to their inherently sequential nature.
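
For instance, here is a compact, hedged sketch of a parallel merge sort in Go: the two halves are sorted concurrently up to a depth limit, then merged. The tiny size cutoff is kept small only so this demo actually forks; real code would use a much larger threshold to keep goroutine overhead from dominating.

```go
package main

import (
	"fmt"
	"sort"
)

// merge combines two sorted slices into one sorted slice.
func merge(l, r []int) []int {
	out := make([]int, 0, len(l)+len(r))
	for len(l) > 0 && len(r) > 0 {
		if l[0] <= r[0] {
			out = append(out, l[0])
			l = l[1:]
		} else {
			out = append(out, r[0])
			r = r[1:]
		}
	}
	return append(append(out, l...), r...)
}

func parallelMergeSort(a []int, depth int) []int {
	// Past the depth limit or below the cutoff, sort sequentially:
	// spawning more goroutines here would add overhead, not speed.
	if depth <= 0 || len(a) < 4 {
		out := append([]int(nil), a...) // copy, then sort the copy
		sort.Ints(out)
		return out
	}
	mid := len(a) / 2
	leftCh := make(chan []int)
	go func() { leftCh <- parallelMergeSort(a[:mid], depth-1) }() // left half concurrently
	right := parallelMergeSort(a[mid:], depth-1)                  // right half on this goroutine
	return merge(<-leftCh, right)
}

func main() {
	fmt.Println(parallelMergeSort([]int{5, 3, 8, 1, 9, 2, 7, 4, 6, 0}, 2))
}
```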

The developer's role is to understand the nature of the tasks (I/O-bound or CPU-bound) and their parallelizability, and choose a middle ground that avoids having too few resources, which can lead to underutilization, or too many resources, which can introduce unnecessary overhead.

Ultimately, developers should strive to learn the underlying principles and patterns that govern efficient concurrency and task handling. This holistic understanding will empower them to make informed decisions and implement efficient solutions, just like the quick-thinking cashier who can deftly juggle multiple tasks, ensuring a smooth and efficient flow at the checkout counter.
