Node.js Worker Threads – In a Nutshell

For the Hebrew version: https://blog.arps.co.il/node-js-worker-threads-%d7%a2%d7%9c-%d7%a7%d7%a6%d7%94-%d7%94%d7%9e%d7%96%d7%9c%d7%92/

As you may know, async and await are not magic solutions. Sometimes we need to perform tasks that can "block" the Event Loop, such as complex calculations or heavy array iterations. If you have a function that performs a complex calculation, adding async at the beginning will not solve the problem: the calculation still runs synchronously on the main thread, so it still blocks the Event Loop.
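To make this concrete, here is a minimal sketch. The function name heavySum and the events array are hypothetical, chosen only for illustration. It shows that the loop inside an async function finishes before the call even returns, and that a zero-delay timer has to wait for it:

```javascript
// Hypothetical CPU-bound helper: `async` only wraps the return value in a Promise.
const events = [];

async function heavySum(n) {
  let total = 0;
  for (let i = 0; i < n; i++) total += i; // runs synchronously, start to finish
  events.push('loop finished');
  return total;
}

// This timer cannot fire until the synchronous code below has completed.
setTimeout(() => events.push('timer fired'), 0);

const pending = heavySum(1e7);
events.push('heavySum returned');
// At this point events is ['loop finished', 'heavySum returned']:
// the whole loop ran before the call returned its Promise.

pending.then(() => events.push('promise resolved'));
```

The timer callback only gets its turn after the loop and the promise callback have run, which is exactly the blocking this post is about.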

Reminder: Node.js operates using an Event Loop, which is essentially single-threaded.

(To be precise, Node.js does use threads. The libuv library creates a pool of 4 threads by default, which allows the Event Loop to offload "heavy tasks" to the pool automatically, primarily tasks related to the file system, encryption, compression, etc. But that's another topic.)

In any case, it is our responsibility as developers to write code that uses the Event Loop in the best possible way.

So, what's the solution? One option, which does not fit every architecture but is sometimes the right choice, is Worker Threads. The key points are:

  • All threads live in one process, but each thread has its own Event Loop. Code that blocks one Worker Thread does not block the others or the parent.
  • You can share memory between threads (for example, using SharedArrayBuffer) and pass information to a thread.
  • Each thread has its own instance of V8 and its own libuv Event Loop (isolated, and yes, they consume resources).
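The shared-memory point can be sketched as follows. This is only an illustration under a few assumptions: the worker source is kept inline with the eval option so the example fits in one file, and the names counterWorkerSource, runCounterWorker and sharedTotal are hypothetical. Two workers increment the same counter through a SharedArrayBuffer:

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) only to keep the sketch in a single file.
const counterWorkerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  // Atomics.add makes each increment safe even when two threads write at once.
  for (let i = 0; i < 1000; i++) Atomics.add(counter, 0, 1);
`;

// One 32-bit integer, visible to the parent and to every worker that receives it.
const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

// Hypothetical helper: starts one worker and resolves when it exits.
function runCounterWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(counterWorkerSource, {
      eval: true,
      workerData: { shared }, // a SharedArrayBuffer is shared, not copied
    });
    worker.on('error', reject);
    worker.on('exit', resolve);
  });
}

const sharedTotal = Promise.all([runCounterWorker(), runCounterWorker()])
  .then(() => Atomics.load(counter, 0));

sharedTotal.then(total => console.log('total increments:', total));
```

Both workers write into the same four bytes, so the parent reads the combined total without any message being sent back.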

Worker Threads are mainly recommended for CPU-intensive tasks; they do not help much with I/O-heavy work, which Node.js already handles asynchronously. Let's start by writing the following code in our parent file (it could also be your app.js/ts):

// Parent Code:
const { Worker } = require('worker_threads');
const worker = new Worker('./worker.js');
        

In line 3, we load the worker with the file we want the worker to run, in our case, worker.js.

Next, add the following lines to the file:

// Subscribing for messages in our Parent.
worker.on('message', message => console.log(message));

// Sending a message to our Worker
worker.postMessage('Hello');
        

In line 2, we "subscribe" to messages from our worker, meaning when our worker sends us a message, our parent will run console.log(message).

In line 5, we send our worker a message with the value 'Hello'. Up to this point, not much has happened. We created a worker, subscribed to its messages, and sent it a message with the value 'Hello'. However, our parent still won’t print the message to the console because our worker currently does not send a message back to the parent.

In worker.js, write the following code:

const { parentPort } = require('worker_threads');
parentPort.on('message', message => 
    parentPort.postMessage({ hello: message })
);        

In line 2, the worker subscribes to messages from the parent: whenever the worker receives a message, it runs parentPort.postMessage({ hello: message }), which sends a message back to our parent.

Up to this point, we have created two-way communication between our parent and its Worker Thread. Running the parent now prints { hello: 'Hello' }. Here is the complete code:

Parent code:

const { Worker } = require('worker_threads');
const worker = new Worker('./worker.js');

// Subscribing for messages in our Parent.
worker.on('message', message => console.log(message));

// Sending a message to our Worker
worker.postMessage('Hello');
        

Worker code:

const { parentPort } = require('worker_threads');
parentPort.on('message', message => 
    parentPort.postMessage({ hello: message })
);
        
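Building on the same message round trip, here is a hedged sketch of how a CPU-heavy calculation could be moved into a worker and awaited as a Promise. The helper name fibInWorker and the inline worker source (loaded with the eval option so everything fits in one file) are assumptions for illustration; a separate worker.js works the same way:

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) only to keep the sketch in a single file.
const fibWorkerSource = `
  const { parentPort } = require('worker_threads');
  const fib = n => (n < 2 ? n : fib(n - 1) + fib(n - 2)); // deliberately slow
  parentPort.once('message', n => parentPort.postMessage(fib(n)));
`;

// Hypothetical helper: runs one calculation in a fresh worker
// and resolves with the worker's reply.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(fibWorkerSource, { eval: true });
    worker.once('message', result => {
      resolve(result);
      worker.terminate(); // this worker has done its single job
    });
    worker.once('error', reject);
    worker.postMessage(n);
  });
}

// The interval keeps ticking while the worker computes,
// because the parent's Event Loop is not blocked.
let ticks = 0;
const ticker = setInterval(() => ticks++, 10);

const fibResult = fibInWorker(30).then(value => {
  clearInterval(ticker);
  console.log('fib(30) =', value, '| parent ticks while waiting:', ticks);
  return value;
});
```

Creating a worker per call is the simplest design but not the cheapest; for many small tasks, reusing a few long-lived workers is usually a better fit.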

I almost always recommend creating a separate file for our thread (in this case, it is worker.js). However, if a separate file is not suitable for a specific case, we can use the following technique, using isMainThread to include all our code in one file:

const { Worker, isMainThread } = require('worker_threads');

// Are we running from the Parent?
if (isMainThread) {
    // Creating a new Worker and sending it the current file
    const worker = new Worker(__filename);
} else {
    // This code runs in the Worker
    console.log('Hi from your worker!');
}
        

In this case, we use isMainThread to determine if we are currently running from the Worker Thread or from our parent, and to separate the code that will run.

How do we pass initial information to our worker?

To pass data to the worker, we can use the workerData option, for example:

const { Worker, isMainThread, workerData } = require('worker_threads');

if (isMainThread) {
    const worker = new Worker(__filename, { workerData: 'Hi!' });
} else {
    console.log(workerData); // will print 'Hi!'
}
        

There are other useful options, such as SHARE_ENV and eval. I recommend checking the documentation to discover more.

Another option that I think is worth highlighting is resourceLimits:

const { Worker } = require('worker_threads');
const worker = new Worker('./worker.js', { resourceLimits: { maxOldGenerationSizeMb: 10 } });
        

In this case, we limit the worker's main heap (the old generation) to 10 MB. If the worker reaches this limit, it is terminated with an error.

What Events Exist in a Worker?

  • message: emitted when our worker sends data with parentPort.postMessage().
  • exit: emitted when our worker has stopped.
  • online: emitted when our worker starts running its code.
  • error: emitted when an uncaught exception is thrown in our worker; the worker is then terminated.
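As a rough sketch, the events above can be observed by attaching a listener to each one. The worker source is inlined with the eval option purely for brevity, and the names seen, eventsWorker and eventsDone are hypothetical:

```javascript
const { Worker } = require('worker_threads');

const seen = [];

// A tiny worker that sends one message and then finishes on its own.
const eventsWorker = new Worker(
  `require('worker_threads').parentPort.postMessage('ready');`,
  { eval: true }
);

eventsWorker.on('online', () => seen.push('online'));
eventsWorker.on('message', msg => seen.push('message: ' + msg));
eventsWorker.on('error', err => seen.push('error: ' + err.message));

// 'exit' receives the worker's exit code (0 when it finished normally).
const eventsDone = new Promise(resolve => {
  eventsWorker.on('exit', code => {
    seen.push('exit: ' + code);
    console.log(seen);
    resolve(seen);
  });
});
```

In a real application, the error and exit listeners are the ones worth wiring up first, since they tell you when a worker has died and needs replacing.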

A Note on Using postMessage:

In the examples above, we used postMessage to send data/messages between the parent and the thread. It is very important to know that when we use postMessage, the data we transfer is actually "cloned" (using the structured clone algorithm), with all the drawbacks and advantages this entails. Cloning complex data can cost a lot of CPU power: the larger and deeper the data, the more computing power the copy requires, not to mention the double RAM consumption needed to hold the copy in memory. From an architectural perspective, it's very important to pay attention to this. To learn more about this topic, you can search Google for "is postmessage slow" and get to know the transferList option. (Perhaps I'll write a separate post on this topic another time.)
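As a small illustration of the transfer-list idea, the sketch below moves an ArrayBuffer to a worker instead of cloning it. The worker source is inlined with the eval option, and the names transferWorker, payload and receivedLength are assumptions made for this example:

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true): reports how many bytes it received.
const transferWorkerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.once('message', buf => parentPort.postMessage(buf.byteLength));
`;

const transferWorker = new Worker(transferWorkerSource, { eval: true });
const payload = new ArrayBuffer(1024 * 1024); // 1 MiB of binary data

// Listing the buffer in the second argument moves it instead of copying it.
transferWorker.postMessage(payload, [payload]);

// The sender's buffer is now detached, so its length reads as 0.
const lengthAfterTransfer = payload.byteLength;

const receivedLength = new Promise((resolve, reject) => {
  transferWorker.once('message', resolve);
  transferWorker.once('error', reject);
});

receivedLength.then(bytes =>
  console.log('worker received', bytes, 'bytes; parent now sees', lengthAfterTransfer)
);
```

The trade-off is that the sender can no longer use the buffer after the transfer, so this suits data that is handed over rather than shared.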

Found a mistake? Comments? Questions? Did you manage? Stuck? Write to me in the comments!
