Parallelize tasks in Node.js

Node.js is popular for its single-threaded, non-blocking I/O model. It handles I/O-bound work very quickly because blocking operations are offloaded (to the operating system or to the thread pool) while the event loop continues processing other requests. CPU-heavy tasks are a different story: they run on the main thread, the same thread that runs the event loop. While the CPU is busy with a long-running task, the event loop is blocked and cannot process any other request.
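To make the blocking effect concrete, here is a minimal sketch. The blockFor helper and the 200 ms figure are my own illustrative assumptions, not part of any Node.js API. A timer is requested for 10 ms, but it can only fire once the busy loop releases the main thread.

```javascript
// Record when the timer was scheduled so we can measure the real delay.
const scheduledAt = Date.now();
let timerDelay = null;

setTimeout(() => {
    timerDelay = Date.now() - scheduledAt;
    console.log(`Timer fired after ${timerDelay} ms (requested 10 ms)`);
}, 10);

// Hypothetical helper: keeps the CPU busy for roughly `ms` milliseconds.
function blockFor(ms) {
    const stopAt = Date.now() + ms;
    while (Date.now() < stopAt) {
        // Busy wait: nothing else can run on this thread meanwhile.
    }
}

// Simulates a CPU-intensive task running on the main thread.
blockFor(200);
```

The timer callback should report a delay of about 200 ms instead of 10 ms, because the event loop could not run it while the loop was spinning.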

To solve this problem, Node.js provides the worker_threads module. Worker threads run CPU-intensive tasks in background threads, which opens the door to parallel processing of CPU-intensive work.

Let us look at an example. Suppose we want to find all the prime numbers within a range. We can write a function like the one below:

function generatePrimes(start, end) {
    const primes = [];

    //Function to check if a number is prime or not
    function isPrime(num) {
        if (num < 2) return false;
        for (let i = 2; i <= Math.sqrt(num); i++) {
            if (num % i === 0) {
                return false;
            }
        }
        return true;
    }

    // Generate primes within the range
    for (let i = start; i <= end; i++) {
        if (isPrime(i)) {
            primes.push(i);
        }
    }
    return primes;
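}

// Export the function so other files can load it with require("./prime")
module.exports = { generatePrimes };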

Now let us calculate all the prime numbers between 0 and 100000000 on the main thread and measure how long it takes.

const { generatePrimes } = require("./prime");

console.time("PRIMES COUNT TIME");
generatePrimes(0, 100000000);
console.timeEnd("PRIMES COUNT TIME");

// PRIMES COUNT TIME: 1:06.696 (m:ss.mmm)

On my machine, it takes around 1 minute and 6 seconds. To move this work off the main thread and also gain performance by running it in parallel, we can use the worker_threads module. The procedure breaks down as follows:

  1. Make chunks of sub-ranges from the main range of numbers so that we can distribute the task to each thread on a chunk basis.
  2. Give each chunk of a sub-range of numbers to each thread for calculating.
  3. Calculate the time for the parallel processing.

The breakIntoParts function below is responsible for making the chunks:

const breakIntoParts = (number, threadCount = 1) => {
    const parts = [];
    const chunkSize = Math.ceil(number / threadCount);

    for (let i = 0; i < number; i += chunkSize) {
        const end = Math.min(i + chunkSize, number);
        parts.push({ start: i, end });
    }

    return parts;
};        
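One detail worth noting: generatePrimes treats both ends of its range as inclusive, so with the chunks above each boundary number is checked by two neighbouring threads. That is harmless for timing, but it would produce duplicates if the results were merged. Below is a sketch of one possible variant that produces inclusive, non-overlapping ranges. The name breakIntoDisjointParts is hypothetical and is not used elsewhere in this article.

```javascript
// Hypothetical variant of breakIntoParts: returns inclusive, non-overlapping
// ranges that together cover 0..number, so no value is checked twice.
const breakIntoDisjointParts = (number, threadCount = 1) => {
    const parts = [];
    // There are number + 1 values in 0..number, spread across the threads.
    const chunkSize = Math.ceil((number + 1) / threadCount);

    for (let start = 0; start <= number; start += chunkSize) {
        const end = Math.min(start + chunkSize - 1, number);
        parts.push({ start, end });
    }
    return parts;
};

console.log(breakIntoDisjointParts(10, 3));
// [ { start: 0, end: 3 }, { start: 4, end: 7 }, { start: 8, end: 10 } ]
```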

Now let's see the full code

const { Worker, isMainThread, parentPort, workerData } = require("node:worker_threads")
const { generatePrimes } = require("./prime");

const threads = new Set();
const number = 100000000;

const breakIntoParts = (number, threadCount = 1) => {
    const parts = [];
    const chunkSize = Math.ceil(number / threadCount);

    for (let i = 0; i < number; i += chunkSize) {
        const end = Math.min(i + chunkSize, number);
        parts.push({ start: i, end });
    }
    return parts;
};

const threadPromises = [];

if (isMainThread) {
    // Start the timer only on the main thread; each worker also executes this file.
    console.time("PRIMES COUNT TIME");
    const parts = breakIntoParts(number, 12);

    parts.forEach((part) => {
        const thread = new Worker(__filename, {
            workerData: {
                start: part.start,
                end: part.end
            }
        });
        threads.add(thread);

        const threadPromise = new Promise((resolve, reject) => {
            // Reject so that a worker failure surfaces through Promise.all
            thread.on("error", reject);
            thread.on("exit", () => {
                threads.delete(thread);
                console.log(`Thread exiting, ${threads.size} running...`);
                resolve();
            });
            thread.on("message", (msg) => {
                console.log(msg);
            });
        });

        threadPromises.push(threadPromise);
    });

    Promise.all(threadPromises).then(() => {
        console.timeEnd("PRIMES COUNT TIME");
    });

} else {
    const primes = generatePrimes(workerData.start, workerData.end);
    parentPort.postMessage(`Completed primes count for ${workerData.start} and ${workerData.end}`);
}        

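In the above code, the if block runs on the main thread: a forEach loop creates one worker per sub-range and passes the range to it through workerData. For each worker we create a promise that resolves when the worker exits, and we collect these promises in the threadPromises array. Once Promise.all reports that every worker has finished, we stop the timer to get the overall time. The else block runs inside each worker, which calculates the primes for its own sub-range and reports back to the main thread.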
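The script above only sends a completion message back. If we also wanted the results themselves, one option is to resolve each promise with the worker's message. The sketch below is my own illustration under a few assumptions: the helper name countPrimesInWorker is hypothetical, the worker source is passed as a string with eval: true only to keep the example in a single file, and each worker returns a count rather than the full list of primes.

```javascript
const { Worker } = require("node:worker_threads");

// Worker code as a string; a real project would more likely use a separate file.
const workerSource = `
    const { parentPort, workerData } = require("node:worker_threads");

    const isPrime = (num) => {
        if (num < 2) return false;
        for (let i = 2; i * i <= num; i++) {
            if (num % i === 0) return false;
        }
        return true;
    };

    let count = 0;
    for (let i = workerData.start; i <= workerData.end; i++) {
        if (isPrime(i)) count++;
    }
    parentPort.postMessage(count);
`;

// Hypothetical helper: resolves with the worker's message, rejects on failure.
const countPrimesInWorker = (start, end) =>
    new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
            eval: true,
            workerData: { start, end },
        });
        worker.once("message", resolve);
        worker.once("error", reject);
        worker.once("exit", (code) => {
            if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
        });
    });

// Two inclusive, non-overlapping ranges processed in parallel, then summed.
const totalPromise = Promise.all([
    countPrimesInWorker(0, 49),
    countPrimesInWorker(50, 100),
]).then((counts) => counts.reduce((sum, n) => sum + n, 0));

totalPromise.then((total) => console.log(`Primes between 0 and 100: ${total}`));
// Primes between 0 and 100: 25
```

Returning a count keeps the message small. Sending the full arrays of primes back is also possible, but the data is copied between threads, which adds overhead for large ranges.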

As my PC has 12 logical cores, I use all of them. After running the script, it takes around 19 seconds to calculate all the primes across the 12 cores, without blocking the main thread. That is roughly 3.5 times faster than the single-threaded run.
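Instead of hard-coding 12, the thread count could be read from the machine at runtime. This is a small sketch, assuming we simply want one worker per logical core. os.availableParallelism() is only present in newer Node.js releases, so the code falls back to os.cpus().length when it is missing.

```javascript
const os = require("node:os");

// Prefer os.availableParallelism() when the running Node.js version has it;
// otherwise use the number of logical CPUs. Never go below one thread.
const threadCount = Math.max(
    1,
    typeof os.availableParallelism === "function"
        ? os.availableParallelism()
        : os.cpus().length
);

console.log(`Using ${threadCount} worker threads`);
```

The resulting value could then be passed as the second argument to breakIntoParts in place of the literal 12.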

Completed primes count for 0 and 8333334
Thread exiting, 11 running...
Completed primes count for 16666668 and 25000002
Thread exiting, 10 running...
Completed primes count for 8333334 and 16666668
Thread exiting, 9 running...
Completed primes count for 33333336 and 41666670
Thread exiting, 8 running...
Completed primes count for 25000002 and 33333336
Thread exiting, 7 running...
Completed primes count for 50000004 and 58333338
Thread exiting, 6 running...
Completed primes count for 66666672 and 75000006
Thread exiting, 5 running...
Completed primes count for 41666670 and 50000004
Thread exiting, 4 running...
Completed primes count for 58333338 and 66666672
Thread exiting, 3 running...
Completed primes count for 83333340 and 91666674
Thread exiting, 2 running...
Completed primes count for 75000006 and 83333340
Thread exiting, 1 running...
Completed primes count for 91666674 and 100000000
Thread exiting, 0 running...
PRIMES COUNT TIME: 19.125s        

While the script runs, all the CPU cores are utilized at 100%.

So, we can conclude that worker threads give us a real opportunity to speed up CPU-intensive work in Node.js through parallel processing.

#nodejs #programming #knowledgesharing #softwareengineering
