The Node.js case: how can a single threaded program run asynchronous operations?

Node.js is often described as a single-threaded process, which at first sounds like something unable to create threads and therefore unable to manage multiple tasks at the same time. But wait!!! Managing multiple tasks at once is exactly what Node does, so what is this witchcraft?


Slow down buddy, take a look at Async first

Let's clarify one concept: asynchronous operations.

Async (the shortened version) operations refer to any computation performed independently of another. The classic execution of a program is sequential, which basically means that instructions are executed one after another. It is easy to foresee the limit of this model, which is the interdependence of the operations: "I cannot manage task B if task A is not yet completed," cried the CPU.

Here is the problem: dealing with operations you have no control over can be very frustrating. Take a network request for example (the perfect scapegoat): no matter how skillful you are at programming, you cannot force a server to respond within an exact timeframe (if you know how, please let us simple human beings know the technique in the comments :) ). So if you have something like that in your poor sequential code, it will block the execution of the whole program until an answer arrives, making it slow and unresponsive. And God knows, everybody HATES unresponsive software, so to avoid that, asynchronous programming enters the game.
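To make that concrete, here is a minimal sketch of the non-blocking style. It makes no real network request: fakeRequest is a made-up helper that only pretends to be a slow server by resolving after a delay, and record is just a tiny logger invented for the example.

```javascript
// fakeRequest is a hypothetical stand-in for a slow network call:
// it resolves with `name` after `ms` milliseconds.
function fakeRequest(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

const order = [];

// Small helper (an assumption of this sketch) to remember and print each step.
function record(step) {
  order.push(step);
  console.log(step);
}

record('start request A');
fakeRequest('A', 50).then((name) => record(`response ${name} arrived`));
// This line runs right away: the program does not wait for the response.
record('doing other work while A is pending');
```

The "other work" line is logged before the response, even though the request was started first. That is the whole point of async: the slow task no longer holds everything else hostage.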


Okay, but what's your problem with Node.js, though?

Node.js is single-threaded, euh... wait, no: the main Node.js process (the event loop, where your JavaScript runs) is single-threaded. For those sleeping at the back of the classroom, it basically means that when you launch a Node.js program:

node program.js        

Your computer creates a single process representing the running program. Each line of program.js is executed sequentially until the end, because this is what single-threaded means: no other threads or processes are created to run your code. But to achieve async operations on a CPU, you have to create a new thread, a new process, or whatever can represent the task you want the CPU to manage asynchronously.
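You can feel that single thread with a small experiment. This is only a sketch: the 100 ms busy loop is an arbitrary stand-in for heavy synchronous work, and events is a plain array used to remember what happened in which order.

```javascript
const events = [];

// Ask for a callback "as soon as possible" (0 ms delay).
setTimeout(() => {
  events.push('timer fired');
  console.log(events);
}, 0);

// Keep the one and only JavaScript thread busy for about 100 ms.
const start = Date.now();
while (Date.now() - start < 100) {
  // busy-wait: nothing else can run in the meantime
}

events.push('sync work done');
```

Even with a 0 ms delay, the timer callback has to wait until the loop is finished, because there is nobody else around to run it. So 'sync work done' is recorded first.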


Not so rocket science after all

In reality, Node does create threads, but implicitly. A fixed number of threads is set aside for each Node process. This set of threads is known as the thread pool, and it exists to handle the async operations that come up during the main process's lifecycle. The environment variable UV_THREADPOOL_SIZE (default 4) defines the size of this thread pool. To run async operations, Node uses under the hood a C (the big OG ;) ) library called libuv. This library creates and manages these threads. Node.js uses them for tasks natively recognized as async:

- DNS resolution (for network requests)

- Read operations (on the file system)

- Encryption operations

- Other async stuff

These operations can take a long time to complete, either because they are CPU-intensive (encryption, for instance) or because they have to wait on the disk or the network. In order not to block the whole application because of them, Node.js offloads them through libuv, which runs them on the thread pool or hands them to the OS (today's operating systems are very good at multitasking) and then returns the final result to the Node.js event loop.

One question I had when reading about this topic was: how the hell does Node.js recognize which instruction refers to an async operation? The answer lies in the Node async API.
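If you want to watch the thread pool at work, here is a small sketch using crypto.pbkdf2(), one of the crypto functions that Node runs on the libuv thread pool. The password, salt, and iteration count are arbitrary values picked for the demo, and hashPassword is a helper invented for this example.

```javascript
const crypto = require('node:crypto');

// hashPassword is a hypothetical helper: it wraps crypto.pbkdf2() in a promise
// and reports how long the job took from the moment it was submitted.
function hashPassword(label) {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
      if (err) return reject(err);
      resolve({ label, ms: Date.now() - start, keyLength: key.length });
    });
  });
}

// Submit four jobs at once; the main thread is free as soon as they are queued.
const jobs = Promise.all([1, 2, 3, 4].map((n) => hashPassword(`job ${n}`)));

jobs.then((results) => {
  for (const r of results) {
    console.log(`${r.label} finished after ${r.ms} ms`);
  }
});
```

With the default pool of 4 threads and enough CPU cores, the four jobs should finish at roughly the same time. Launch the same script with UV_THREADPOOL_SIZE=1 and they should finish one after another instead. The exact timings depend on your machine, so no numbers here.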

So the guy who made Node.js (big up Ryan Dahl) knew that we, the community, would surely want to create really badass servers with it. And he also knew all the typical operations that can slow down our servers. These operations, also referred to as blocking operations, were categorized, and special APIs were created to manage them with libuv (yes, him again). We access these APIs through standard library functions that you know very well (fs.readFile(), https.request(), crypto.pbkdf2(), ...). In reality, we can say that Node does not have to recognize async operations, because their inner implementation makes their execution asynchronous.
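Here is a quick sketch of that idea with fs.readFile(). Nothing in the code says "run this asynchronously": the function simply is async by implementation. The scratch file in the temp directory is an assumption of this example, created only so there is something to read.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical scratch file, written only so the example has something to read.
const file = path.join(os.tmpdir(), `async-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello from disk');

const log = [];

const done = new Promise((resolve, reject) => {
  // fs.readFile() hands the read over to libuv and returns immediately.
  fs.readFile(file, 'utf8', (err, text) => {
    if (err) return reject(err);
    log.push(`callback: ${text}`);
    fs.unlinkSync(file); // clean up the scratch file
    resolve(log);
  });
});

// This runs before the callback above, even though it comes later in the file.
log.push('readFile() returned immediately');

done.then((entries) => console.log(entries));
```

The call returns right away and the callback runs later, once the event loop gets the result back. You did not have to create a thread or mark anything as async yourself.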


Alright, but isn't there a kind of native Async and custom Async?

If you've been focused throughout our analysis (and I know you have :) ), you will have noticed that I didn't mention the child_process or worker_threads modules, which allow us to create highly efficient programs by creating new threads or spawning processes that execute entire programs separately (even shell scripts and scripts in other languages). This is because I distinguish (and this is purely subjective, not a convention) native async operations, which get their async nature from Node.js's architecture and inner workings, from custom async operations, which are defined and managed by you and me (fellow programmers) for optimization purposes.
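Just to give a taste of the "custom" side, here is a minimal sketch with worker_threads. The worker's code is passed as a string (with the eval option) only to keep the example in a single file, and sumInWorker is a helper name made up for the occasion.

```javascript
const { Worker } = require('node:worker_threads');

// Code that runs inside the worker thread, kept as a string so the
// example fits in one file (an assumption of this sketch).
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  parentPort.postMessage(sum);
`;

// sumInWorker is a hypothetical helper: it adds the numbers 1..n on a separate thread.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const result = sumInWorker(1000000);
console.log('main thread is free while the worker counts');
result.then((sum) => console.log('sum from worker:', sum));
```

Here the thread is created explicitly by us, not by Node behind the scenes, which is exactly the distinction I make between native and custom async.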


Thanks for reading this far, you are a real warrior. I really want to know your thoughts or questions in the comments section. Maybe I've made some mistakes or said something false; if so, don't hesitate to mention it (with explanations, of course). I am always open to learning more.

In future articles I will deep dive into those "custom Async operations". By then I hope I will be done with my side project (???)

Happy coding, dear software engineers!
