Concurrency in Rust
Image source - https://analyticsindiamag.com/in-rust-we-trust/


Hi Rustaceans, welcome to another article on Rust. In this article, we will learn how to achieve concurrency. Along the way, we will learn about threads, Mutex, and Arc<T>, a smart pointer that is like a big brother to the Rc<T> we met in the earlier article on smart pointers. It would be helpful to review that article before reading further. I will patiently wait for you till then.

"Fearless concurrency" is often used to describe Rust's approach to handling concurrent programming. It refers to Rust's ability to catch many concurrency bugs at compile time: the ownership and type systems rule out data races in safe code and prevent a whole class of crashes caused by thread-related bugs. Deadlocks are still possible, so some care is needed, but far fewer mistakes survive until runtime.

So, let's see what concurrency is and how it is handled in Rust.

Concurrency

Concurrency refers to the ability of a system to make progress on multiple tasks during overlapping periods of time. Concurrency does not necessarily mean parallelism. In parallelism, tasks literally run at the same instant (for example, on different CPU cores), whereas concurrent tasks may simply take turns.

For example, in a restaurant, there are different people at work, and each has their own job. The waiter will take orders from the customer and serve them the ordered food. The chef will receive the orders from the waiter, cook them according to the recipe, and give the prepared food to the waiter.

The thing to note here is that they work on their individual jobs and do not interfere with the jobs of others. Another thing to note is that orders come in at different times; existing customers can order a new dish while new customers place their first order (the work is done concurrently, not necessarily in parallel). Efficiency here means serving all the customers in the restaurant without disappointing anyone.

This is the same thing we want to happen in our program: to handle multiple tasks concurrently and efficiently.

Achieving Concurrency

Concurrency can be achieved programmatically through techniques such as threads, processes, asynchronous programming, and parallelism. In this article, we will learn to achieve concurrency using threads.

Threads

A thread is the smallest sequence of programmed instructions that a scheduler can manage independently. The scheduler is typically part of the operating system (OS).

Since threads are provided by the OS, they are helpful for writing programs with I/O (input/output) related tasks, like reading from a disk or the network. In short, threads enable us to write more responsive, efficient, and scalable applications by taking advantage of modern hardware capabilities.

We are interested in one of the features of the thread here: Threads allow multiple tasks to be executed concurrently. With this feature in mind, we will jump to our Rust programming.

First, we will see how to work with a single thread in Rust, then hop into the multi-threaded environment.

Working with Thread in Rust

Threads are part of Rust's standard library. So, let's spawn a thread using it.

use std::thread;

fn main() {
    thread::spawn(|| {
        // print statement with a spawned thread
        println!("A new thread spawned!!");
    });
}
        

We have successfully spawned a thread. Now, we will break down what is happening under the hood. Every program is assigned a thread (the main thread) that executes the program's tasks, starting from the main() function in Rust. The program may create additional threads as needed. For example, if the program has a long-running or blocking task, it can hand that task to a new thread. This way the main thread does not have to wait for the task to complete, which could take too long.

We will discuss multi-threading in more detail when we talk about Arc<T>, so for now, let's understand our code a bit more. thread::spawn asks the operating system to create a new thread that executes the code within the closure (|| {}). But when you run this code, you might notice that nothing gets printed on the console.

But Why?

This has something to do with the lifecycle of the main thread. As we saw earlier, every program gets a main thread that starts with the main() function. When this root or main thread finishes, the whole program exits, and any spawned threads are shut down whether or not they have finished. It's not a bug; it is the expected behaviour. Why would you want to keep executing other threads if the main thread itself has terminated?

use std::thread;

fn main() {
    thread::spawn(|| {
        // print statement with a spawned thread
        println!("Printing from a spawned thread!");
    });

    println!("Printing from the main thread!");
}

Output:
Printing from the main thread!
        

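As you can see above, the main thread's print statement was executed successfully, whereas our poor spawned thread didn't even get a chance. (Since this depends on timing, the spawned thread's line may occasionally show up, but you cannot rely on it.)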

How can we fix this?

All we have to do is make the main thread wait until our spawned thread has finished executing.

use std::thread;

fn main() {
    let join_handle = thread::spawn(|| {
        // print from spawned thread
        println!("Printing from the spawned thread");
    });

    // wait for the spawned thread to execute
    join_handle.join().unwrap();

    // print from main thread
    println!("Printing from the main thread");
}

Output:
Printing from the spawned thread
Printing from the main thread
        

thread::spawn returns a JoinHandle. The JoinHandle gives our program the ability to join (wait for) the spawned thread. When a JoinHandle is dropped, the associated spawned thread is detached: it keeps running, but we can no longer wait for it. In our earlier example, the handle was dropped immediately, and the main thread finished (ending the whole program) before the detached thread could print. This is why we were not getting the print statement of the spawned thread.

So, now we make sure that the spawned thread completes execution by calling the .join() method on the returned JoinHandle (the join_handle variable), which blocks the main thread until the spawned thread finishes. Also, since we are dealing with a single spawned thread that is joined before the main thread prints, the output order here is always the same; with multiple threads, the order can vary, as we will see later.
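Besides waiting, .join() also hands back whatever the closure returned. The following is a minimal sketch of that idea; the helper name sum_on_thread is hypothetical and exists only for illustration.

```rust
use std::thread;

// Hypothetical helper: sums the numbers on a spawned thread.
fn sum_on_thread(numbers: Vec<i32>) -> i32 {
    // `move` transfers ownership of `numbers` into the closure
    let join_handle = thread::spawn(move || numbers.iter().sum::<i32>());

    // join() blocks until the thread finishes. It yields Ok(value) with the
    // closure's return value, or Err if the spawned thread panicked.
    join_handle.join().unwrap()
}

fn main() {
    let total = sum_on_thread(vec![1, 2, 3]);
    println!("Sum computed by the spawned thread = {}", total);
}
```

Returning a value through the JoinHandle is often the simplest way to get a result out of a thread when no data has to be shared while it runs.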

Working with Multi-thread in Rust

If you work with Rust in real-world projects, you will pretty much always work in a multi-threaded environment. A server handling many API calls at once is a good example. Don't worry; in many cases, the libraries and frameworks we use will manage the threads for us. In other instances, Rust provides tools for us to manage them.

In this section, we will learn how to handle multiple threads with the help of Arc and Mutex. I have tried my best to keep it as simple as possible, and by the end of the article, you shouldn't need any explanation of the final code. You should understand every line in the program (please let me know if I was able to do so).

Arc<T>

At first glance, Arc<T> is very similar to the Rc<T> smart pointer. Both are reference-counted smart pointers used to give multiple owners to the same data (allocated on the heap). The extra 'A' in Arc<T> stands for Atomic: the reference count is updated with atomic operations, which makes the shared ownership thread-safe.

Thread safety is a property of a program or a data structure that ensures it behaves correctly and as expected, even when accessed by multiple threads concurrently. In a multi-threaded environment where multiple threads access shared data simultaneously, it is easy to introduce bugs such as data races and deadlocks. These can cause unexpected results or even crashes.

With Rust and its fearless concurrency, we are much less likely to slip into these bugs. But there are still some things to consider when working with Arc. First, let's see how to use Arc.

use std::sync::Arc;

fn main() {
    let major_numbers: Arc<Vec<i32>> = Arc::new(vec![1, 2, 3]);

    // cloning only the reference of the major_numbers
    // notice the type of minor_numbers is same as major_numbers
    let minor_numbers: Arc<Vec<i32>> = Arc::clone(&major_numbers);

    println!("cloned numbers are {:?}", minor_numbers);
}        

This is similar to what we saw in the Rc<T> section. The difference is that Arc<T> updates its reference count atomically, so ownership can be shared safely across threads, while Rc<T> cannot be sent to another thread at all. Notice that the thread safety here concerns the shared ownership (the reference counting), not the actual data. Arc<T> guarantees that several threads can own the same data safely; it does not, by itself, make it safe to mutate that data.

That is why Arc<T>, like Rc<T>, provides shared ownership with read-only access to the data. If you try to mutate minor_numbers from the above example (for instance, by calling minor_numbers.push(4)), Rust will throw a compile error as follows:

error: cannot borrow data in an `Arc` as mutable
help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `Arc<Vec<i32>>`
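To see that Arc::clone copies only the pointer and bumps the owner count, rather than copying the vector, here is a small sketch using Arc::strong_count and Arc::ptr_eq from the standard library. The helper name owner_counts is hypothetical.

```rust
use std::sync::Arc;

// Hypothetical helper: returns the owner count before and after one clone.
fn owner_counts() -> (usize, usize) {
    let major_numbers = Arc::new(vec![1, 2, 3]);
    let before = Arc::strong_count(&major_numbers);

    // Only the pointer is cloned; the reference count goes up by one
    let minor_numbers = Arc::clone(&major_numbers);
    let after = Arc::strong_count(&major_numbers);

    // Both handles point at the same heap allocation
    assert!(Arc::ptr_eq(&major_numbers, &minor_numbers));

    (before, after)
}

fn main() {
    let (before, after) = owner_counts();
    println!("owners before clone = {}, after clone = {}", before, after);
}
```

The data itself is freed only when the last owner goes out of scope and the count drops to zero.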

Before getting to the mutation of shared memory, let's see how multiple threads can be handled in Rust with a simple example:

use std::thread;

fn main() {
    // to store join handle of multiple spawned threads
    let mut join_handles = vec![];

    for index in 1..10 {
        let join_handle = thread::spawn(move || {
            println!("Spawned thread - {}", index);
        });

        join_handles.push(join_handle);
    }

    // Run a loop, so that we can wait for all the spawned threads
    for join_handle in join_handles {
        join_handle.join().unwrap();
    }
}

Output:
Spawned thread - 1
Spawned thread - 4
Spawned thread - 3
Spawned thread - 5
Spawned thread - 2
Spawned thread - 7
Spawned thread - 6
Spawned thread - 8
Spawned thread - 9        

We are spawning a thread in each iteration of a for loop and storing each thread's JoinHandle in a vector. The move keyword gives each closure its own copy of index. Finally, we loop through the vector of JoinHandles and call the .join() method on each of them, so that our main thread waits until all our spawned threads have finished.

You may also notice from the output that the order of thread execution is not fixed. It may be different every time you run this code. But that is expected, because it shows that our threads are independent and work at their own pace, like the chefs and waiters from our restaurant example.
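The same loop also works when the threads need to read shared data: each thread gets its own Arc clone of the data. The sketch below assumes a hypothetical helper named sums_from_threads, in which every thread sums the same vector without modifying it.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: every spawned thread reads the same shared vector.
fn sums_from_threads(thread_count: usize) -> Vec<i32> {
    let numbers = Arc::new(vec![1, 2, 3]);
    let mut join_handles = Vec::new();

    for _ in 0..thread_count {
        // Each thread owns its own Arc handle to the same data
        let shared = Arc::clone(&numbers);
        join_handles.push(thread::spawn(move || shared.iter().sum::<i32>()));
    }

    // Joining in spawn order keeps the results in a predictable order,
    // even though the threads themselves may run in any order
    join_handles
        .into_iter()
        .map(|join_handle| join_handle.join().unwrap())
        .collect()
}

fn main() {
    println!("sums = {:?}", sums_from_threads(3));
}
```

No lock is needed here because the threads only read the data.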

Mutating a Shared Memory

Arc on its own does not give us mutable access to the data it shares. In this respect, Arc and Rc behave the same way. So, we need help in achieving interior mutability. In the case of Rc, we used RefCell for that.

You may ask: why not use the same RefCell that we used to mutate data behind an Rc? Well, the catch is that RefCell is not thread-safe either, so the compiler will not let us share it between threads. To get a completely thread-safe setup with Arc, we will use Mutex. There are other options as well, like RwLock and the atomic types, but in this article, we will achieve interior mutability using Mutex.

Mutex is a mutual exclusion primitive, helpful in protecting shared data. A thread must acquire the Mutex's lock before it can use the data. If another thread holds the lock, it simply waits until the lock becomes available. Since only one thread can access the data at a time, this rules out data races on it and avoids unexpected results.

Let's understand Mutex with a single thread example:

use std::sync::Mutex;

fn main() {
    // create a new mutex
    let counter = Mutex::new(0);

    {
        // acquire the lock (blocks until it is available)
        let mut locked_counter = counter.lock().unwrap();
        *locked_counter += 10;
    } // removes the lock over the counter, makes it available again

    println!("{}", counter.lock().unwrap());
}

Output:
10        

We create a new mutex with the new() constructor of Mutex. We then open a new inner scope and acquire the lock with the .lock() method, which blocks until the lock is available. It returns a Result because the lock can be "poisoned" if another thread panicked while holding it, hence the .unwrap(). The data can only be accessed through the RAII guards returned from the lock, which guarantees that the data is only ever accessed when the Mutex is locked.

An RAII guard in Rust is an object that encapsulates a resource and is responsible for releasing it when it goes out of scope. The term "guard" is used because these objects act as guards, ensuring that the associated resource is managed correctly and released when the guard is dropped. For Mutex, this RAII guard is MutexGuard.

After securing the lock, we mutate the counter (through locked_counter) by adding 10. Then Rust (more specifically, our MutexGuard) ensures that the lock is released once the guard goes out of scope (how convenient).
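The guard can also be released before the end of the scope by dropping it explicitly. The sketch below uses try_lock(), which never blocks, to observe whether the lock is free; the helper name lock_availability is hypothetical.

```rust
use std::sync::Mutex;

// Hypothetical helper: reports whether the lock was free while the guard
// was alive, and again after the guard was dropped.
fn lock_availability() -> (bool, bool) {
    let counter = Mutex::new(0);

    let mut locked_counter = counter.lock().unwrap();
    *locked_counter += 10;

    // try_lock() does not wait; it returns Err while the lock is held
    let free_while_held = counter.try_lock().is_ok();

    // Dropping the guard explicitly releases the lock right away
    drop(locked_counter);
    let free_after_drop = counter.try_lock().is_ok();

    (free_while_held, free_after_drop)
}

fn main() {
    let (while_held, after_drop) = lock_availability();
    println!("free while held = {}, free after drop = {}", while_held, after_drop);
}
```

Keeping the guard alive for as short a time as possible lets other threads acquire the lock sooner.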

Arc and Mutex

The question is, why can't we use only Mutex? Why do we need Arc? The answer to both questions is that a Mutex on its own has a single owner, while a thread created with thread::spawn must own everything it uses. Mutex gives us safe mutation, and Arc gives us the shared ownership that lets multiple threads hold on to the same Mutex.

Let's see both Arc and Mutex in action:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Wrap the shared mutable data in a Mutex, and the Mutex in an Arc
    let counter = Arc::new(Mutex::new(0));

    // store joinHandles for our spawn threads
    let mut join_handles = Vec::new();

    // spawn multiple threads
    for _ in 1..10 {
        // clone the counter
        let cloned_counter = Arc::clone(&counter);

        let join_handle = thread::spawn(move || {
            // Acquire the lock before accessing the shared data
            let mut locked_counter = cloned_counter.lock().unwrap();
            *locked_counter += 10;
        });

        join_handles.push(join_handle);
    }

    // wait for all the spawned threads
    for join_handle in join_handles {
        join_handle.join().unwrap();
    }

    // Get access and print the shared data
    println!("Updated counter value = {}", counter.lock().unwrap());
}

Output:
Updated counter value = 90        

I hope you didn't need any explanation for the above code example, since we learned each piece separately a few scrolls back.
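As a recap, the same Arc and Mutex pattern can be wrapped in a function so the number of threads and the increment become parameters. This is only a sketch; the name count_with_threads and its parameters are hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: `thread_count` threads each add `step` to a shared counter.
fn count_with_threads(thread_count: usize, step: i32) -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let mut join_handles = Vec::new();

    for _ in 0..thread_count {
        let cloned_counter = Arc::clone(&counter);
        join_handles.push(thread::spawn(move || {
            // The guard releases the lock when the closure ends
            let mut locked_counter = cloned_counter.lock().unwrap();
            *locked_counter += step;
        }));
    }

    // Wait for every thread before reading the final value
    for join_handle in join_handles {
        join_handle.join().unwrap();
    }

    // Copy the value out so the guard is dropped before `counter` is
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("Updated counter value = {}", count_with_threads(9, 10));
}
```

Calling it with 9 threads and a step of 10 reproduces the setup of the example above.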

With this, we have come to the end of this article. I hope it was helpful and we all learned something new today. All the code mentioned in this article is available in my Replit (category-wise):

  1. Concurrency.
  2. Smart Pointers.

You can take a look at my other articles on Rust:

  1. Ownership Model.
  2. Memory Management.
  3. Rust Terminologies (Beginner edition).
  4. Smart Pointers.

I'm also working on a project which has been around for a while. The project is about integrating Rust and GPT-4. Currently, I'm learning about web servers. You can check the posts on my progress so far:

  1. Tokio - Simple Example.
  2. Simple Actix Example.





