Power of Java Virtual Threads: A Deep Dive into Scalable Concurrency
Java 21 finalized a groundbreaking feature: virtual threads (JEP 444), designed to address the limitations of traditional threading models and make high-concurrency applications more accessible and efficient.
In this blog, we'll dive deep into the why, what, and how of virtual threads, compare them with other concurrency models, and explore practical use cases with coding examples.
What are Virtual Threads?
Virtual threads are lightweight threads managed by the Java runtime rather than the OS. They reduce the effort of writing, maintaining, and debugging high-throughput concurrent applications. They provide the same programming model as traditional threads but with much lower resource overhead, so an application can create and manage a very large number of concurrent tasks.
There are two kinds of threads: platform threads and virtual threads. Like a platform thread, a virtual thread is an instance of java.lang.Thread. However, a virtual thread isn't tied to a specific OS thread. When code running in a virtual thread calls a blocking I/O operation, the Java runtime suspends the virtual thread until it can be resumed. The OS thread that was running the suspended virtual thread is then free to perform work for other virtual threads.
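To make the "both are java.lang.Thread" point concrete, here is a minimal sketch that builds one thread of each kind and asks each what it is. The class name ThreadKinds and the describe helper are illustrative names, not part of the JDK; the builder methods Thread.ofPlatform() and Thread.ofVirtual() and the isVirtual() check are standard API from Java 21 onward.

```java
class ThreadKinds {
    // Describes a thread using only methods declared on java.lang.Thread.
    static String describe(Thread t) {
        return (t.isVirtual() ? "virtual" : "platform") + ", daemon=" + t.isDaemon();
    }

    public static void main(String[] args) throws InterruptedException {
        // Both builders return plain java.lang.Thread references.
        Thread platform = Thread.ofPlatform().unstarted(() -> {});
        Thread virtual = Thread.ofVirtual().unstarted(() -> {});

        System.out.println("platform thread: " + describe(platform));
        System.out.println("virtual thread:  " + describe(virtual));

        platform.start();
        virtual.start();
        platform.join();
        virtual.join();
    }
}
```

Note that virtual threads are always daemon threads, so a program that only starts virtual threads has to join them (or otherwise wait) before main returns.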
Why Use Virtual Threads?
Use virtual threads in high-throughput concurrent applications, especially those that consist of a great number of concurrent tasks that spend much of their time waiting.
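Traditional threads, or platform threads, in Java are mapped directly to operating system (OS) threads. While they are powerful, they come with several limitations: each one reserves a sizable stack (typically around 1 MB), the OS limits how many can exist at once, and switching between them carries OS-level context-switching overhead.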
These limitations hinder the development of highly concurrent applications, especially those that need to handle tens of thousands or even millions of concurrent tasks, such as web servers or real-time data processing systems.
Virtual threads are not faster threads; they exist to provide scale (higher throughput), not speed (lower latency). Virtual threads are suitable for running tasks that spend most of the time blocked, often waiting for I/O operations to complete. However, they aren't intended for long-running CPU-intensive operations.
How Do Virtual Threads Work?
Virtual threads decouple the application-level concurrency from the OS-level threading model. This decoupling allows the JVM to manage thousands or millions of virtual threads efficiently by multiplexing them onto a smaller number of platform threads.
When the Java runtime schedules a virtual thread, it assigns, or mounts, the virtual thread on a platform thread; the operating system then schedules that platform thread as usual. This platform thread is called a carrier. After running some code, the virtual thread can unmount from its carrier. This usually happens when the virtual thread performs a blocking I/O operation. After a virtual thread unmounts from its carrier, the carrier is free, which means that the Java runtime scheduler can mount a different virtual thread on it.
A virtual thread cannot unmount during a blocking operation while it is pinned to its carrier. In Java 21, a virtual thread is pinned when it runs code inside a synchronized block or method, or when it calls native code. For sections that block while holding a lock, it is therefore recommended to use a ReentrantLock instead of synchronized. (JDK 24 removes the pinning caused by synchronized, per JEP 491.)
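One way to observe mounting and unmounting indirectly is to start far more blocking virtual threads than there are carriers and time how long they take together. The sketch below is only an illustration under assumed numbers (10,000 threads, 200 ms of sleep each); SleepingVirtualThreads and runSleepers are made-up names. If each sleeping thread held on to its carrier, the total would be far larger than a single sleep.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

class SleepingVirtualThreads {
    // Starts `count` virtual threads that each sleep for `sleepMillis`,
    // waits for all of them, and returns the total elapsed time.
    static Duration runSleepers(int count, long sleepMillis) throws InterruptedException {
        Instant start = Instant.now();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            threads.add(Thread.startVirtualThread(() -> {
                try {
                    // Blocking call: the virtual thread unmounts from its carrier here.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread t : threads) {
            t.join();
        }
        return Duration.between(start, Instant.now());
    }

    public static void main(String[] args) throws InterruptedException {
        int count = 10_000;
        long sleepMillis = 200;
        Duration elapsed = runSleepers(count, sleepMillis);
        System.out.println(count + " sleepers finished in " + elapsed.toMillis() + " ms");
        System.out.println("Run one after another they would need " + count * sleepMillis + " ms");
    }
}
```

The exact timing depends on the machine, so no expected output is shown; the point is only that the elapsed time stays close to one sleep interval rather than growing with the thread count.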
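As a sketch of the ReentrantLock recommendation, the counter below guards its state with an explicit lock instead of a synchronized block. LockedCounter is a hypothetical class written for this illustration; the lock/try/finally shape is the standard idiom for java.util.concurrent.locks.ReentrantLock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long count;

    void increment() {
        lock.lock();          // a virtual thread waiting here can unmount
        try {
            count++;
        } finally {
            lock.unlock();    // always release, even if the body throws
        }
    }

    long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LockedCounter counter = new LockedCounter();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.execute(counter::increment);
            }
        } // close() waits for all submitted tasks to finish
        System.out.println("count = " + counter.get()); // prints "count = 10000"
    }
}
```

The critical section here is tiny and never blocks, so pinning would not hurt in this exact case; the pattern matters when the guarded code performs I/O or other blocking calls.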
Creating Virtual Threads:
Creating virtual threads in Java is straightforward. Here’s an example:
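Thread vThread = Thread.startVirtualThread(() -> {
    // Simulate some work
    System.out.println("Running in virtual thread: " + Thread.currentThread());
});
// Virtual threads are daemon threads, so wait for this one before main exits
vThread.join();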
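Virtual threads can also be created through an ExecutorService, which makes them easier to manage:
// Needs java.util.concurrent.ExecutorService, Executors, and Future
try (ExecutorService myExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
    Future<?> future = myExecutor.submit(() -> System.out.println("Running thread"));
    future.get(); // waits for the task; may throw InterruptedException or ExecutionException
    System.out.println("Task completed");
}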
Can I create 1,000,000 Virtual Threads?
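Yes. I created 1,000,000 virtual threads, and it took about 4.5 seconds to create and run them all.
import java.time.Duration;
import java.time.Instant;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// The class name is illustrative; any name works.
public class MillionVirtualThreads {
    public static void main(String[] args) throws InterruptedException {
        Instant start = Instant.now();
        // Thread-safe set: a plain HashSet is not safe for concurrent add() calls
        Set<Long> vThreadIds = ConcurrentHashMap.newKeySet();
        var vThreads = IntStream.range(0, 1_000_000)
                .mapToObj(i -> Thread.ofVirtual().unstarted(() -> {
                    // threadId() replaces getId(), which is deprecated since Java 19
                    vThreadIds.add(Thread.currentThread().threadId());
                    try {
                        Thread.sleep(100); // simulate blocking work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                })).toList();
        vThreads.forEach(Thread::start);
        for (var thread : vThreads) {
            thread.join();
        }
        Instant end = Instant.now();
        System.out.println("Time = " + Duration.between(start, end).toMillis() + " ms");
        System.out.println("Number of unique vThreads used " + vThreadIds.size());
    }
}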
Output:
Time = 4482 ms
Number of unique vThreads used 1000000
Why Can’t I Create 100,000 Normal Threads?
Creating 100,000 normal (platform) threads is not feasible because of their heavy memory consumption and the OS's limits on the number of threads. Each platform thread typically reserves around 1 MB of memory for its stack (a common default, adjustable with -Xss). Creating 100,000 threads would therefore require around 100 GB of memory just for the stacks, which is impractical for most systems.
Try: I ran 100,000 tasks on Executors.newFixedThreadPool(100000) and got an OutOfMemoryError (the pool creates its threads lazily, as tasks are submitted). Do try it; it's interesting.
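The stack estimate above is simple multiplication, sketched here so the numbers can be changed. StackMemoryEstimate and reservedBytes are illustrative names, and the 1 MiB stack size is the assumption from the text, not a value read from the running JVM.

```java
class StackMemoryEstimate {
    // Stack memory reserved by `threads` platform threads, in bytes.
    static long reservedBytes(long threads, long stackBytesPerThread) {
        return threads * stackBytesPerThread;
    }

    public static void main(String[] args) {
        long oneMiB = 1024L * 1024L;   // assumed stack size per platform thread
        long bytes = reservedBytes(100_000, oneMiB);
        System.out.println("Reserved stack bytes: " + bytes);
        System.out.println("In decimal GB: " + bytes / 1_000_000_000.0);
    }
}
```

With these inputs the total is 104,857,600,000 bytes, which is where the "around 100 GB" figure comes from. Much of this is reserved address space rather than memory that is touched immediately, but it still runs into OS and JVM limits long before 100,000 threads.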
Using Virtual Threads vs. Other Concurrency Models
Virtual Threads in One Request Per Thread Model
The "one request per thread" model is a common pattern where each incoming request is handled by a separate thread (as in Tomcat). This model is simple and intuitive but scales poorly with platform threads because of their high resource usage. Virtual threads can revolutionize this model by making it feasible to handle many thousands of concurrent requests efficiently.
Simple and Scalable: This example sets up an HTTP server where each request is handled by a new virtual thread, using Executors.newVirtualThreadPerTaskExecutor(). This approach combines the simplicity of the one request per thread model with the scalability of virtual threads.
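A minimal sketch of such a server, using the JDK's built-in com.sun.net.httpserver.HttpServer, could look like the following. The class name VirtualThreadHttpServer, the /hello path, and port 8080 are assumptions made for illustration; the key line is the call to setExecutor with Executors.newVirtualThreadPerTaskExecutor().

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;

class VirtualThreadHttpServer {
    // Starts a server on the given port (0 lets the OS pick a free port).
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);

        // "/hello" is an illustrative path.
        server.createContext("/hello", exchange -> {
            String kind = Thread.currentThread().isVirtual() ? "a virtual" : "a platform";
            byte[] body = ("Handled by " + kind + " thread\n").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        // Each incoming request is handed to a new virtual thread.
        server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Listening on http://localhost:"
                + server.getAddress().getPort() + "/hello");
    }
}
```

The handler is written in plain blocking style; nothing in it is specific to virtual threads, which is the appeal of keeping the one request per thread model.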
Low Overhead: Virtual threads allow the server to handle a massive number of concurrent connections without the resource overhead associated with platform threads.
Why Are Virtual Threads So Efficient?
Assume a scenario where we have one carrier thread(platform thread) and three virtual threads. We will focus on what happens when a virtual thread performs an I/O operation and how the other virtual threads are managed.
Virtual threads have their stacks stored in the heap, unlike platform threads that use OS-provided stack space. This allows the JVM to efficiently manage and switch stacks without the overhead of OS-level thread context switching.
Virtual threads leverage a continuation-based model. When a virtual thread is unmounted from a carrier thread, its state is captured as a continuation. This state can be stored and resumed later, possibly on a different carrier, without needing a one-to-one mapping with carrier threads.
The JVM's scheduler ensures that carrier threads are not idle when there are runnable virtual threads. This efficient scheduling minimizes the time a carrier thread is idle and maximizes CPU utilization.
What Is the Continuation-Based Model?
A continuation is a mechanism that allows a computation to be paused and resumed at a later point. In the context of virtual threads, continuations enable the JVM to suspend and resume the execution of virtual threads efficiently.
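The idea of pausing and resuming can be illustrated with a toy model. This is not how the JVM implements continuations (it uses an internal class that application code cannot call); it is only a sketch in which each task saves the number of its next step, and a single "carrier" loop takes tasks from a FIFO queue, runs one step, and requeues them. All names here (ToyContinuations, Task, runOnSingleCarrier) are invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class ToyContinuations {
    // A toy task whose only saved state is the number of the next step to run.
    static final class Task {
        final String name;
        final int totalSteps;
        int nextStep = 1; // the saved "continuation": where to resume

        Task(String name, int totalSteps) {
            this.name = name;
            this.totalSteps = totalSteps;
        }

        // Runs one step, then pauses. Returns true if more steps remain.
        boolean resume(List<String> trace) {
            trace.add(name + " step " + nextStep);
            nextStep++;
            return nextStep <= totalSteps;
        }
    }

    // One "carrier" (the calling thread) runs tasks from a FIFO queue.
    static List<String> runOnSingleCarrier(List<Task> tasks) {
        List<String> trace = new ArrayList<>();
        Queue<Task> runnable = new ArrayDeque<>(tasks);
        while (!runnable.isEmpty()) {
            Task task = runnable.poll();            // "mount" the next task
            boolean unfinished = task.resume(trace);
            if (unfinished) {
                runnable.add(task);                 // "unmount" and requeue it
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(new Task("A", 2), new Task("B", 2), new Task("C", 2));
        // Prints: A step 1, B step 1, C step 1, A step 2, B step 2, C step 2 (one per line)
        runOnSingleCarrier(tasks).forEach(System.out::println);
    }
}
```

With one carrier and three tasks, the steps interleave because each task gives up the carrier at its pause point, which mirrors what happens when a virtual thread blocks on I/O.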
The JDK's virtual thread scheduler is a work-stealing ForkJoinPool that operates in FIFO mode. The parallelism of the scheduler is the number of platform threads available for scheduling virtual threads. By default it equals the number of available processors. A virtual thread can be scheduled on different carriers over the course of its lifetime; that is, the scheduler does not maintain affinity between a virtual thread and any particular platform thread.
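To see these defaults on a given machine, a small sketch like the one below prints the processor count and what a virtual thread reports about itself while running. SchedulerInfo and describeFromInside are illustrative names. On current OpenJDK builds the string typically includes the carrier's name (a ForkJoinPool worker), but that format is an implementation detail and may differ.

```java
class SchedulerInfo {
    // Runs a task on a virtual thread and returns Thread.currentThread().toString()
    // as observed from inside that task.
    static String describeFromInside() throws InterruptedException {
        String[] holder = new String[1];
        Thread t = Thread.startVirtualThread(
                () -> holder[0] = Thread.currentThread().toString());
        t.join(); // join() makes the write to holder[0] visible here
        return holder[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Inside virtual thread: " + describeFromInside());
    }
}
```

The default parallelism can be overridden at startup with the system property jdk.virtualThreadScheduler.parallelism, for example to experiment with a single carrier.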
Conclusion: When to Use Virtual Threads
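When to Use Virtual Threads: in high-throughput applications with a great number of concurrent tasks that spend most of their time blocked, typically waiting for I/O.
When Not to Use Virtual Threads: for long-running, CPU-intensive operations. Virtual threads provide scale, not speed.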
Credits: Java, Oracle, and OpenJDK official documentation, and the Java YouTube channel.