Sharing Data Between Threads in Rust
Luis Soares, M.Sc.
Lead Software Engineer | Blockchain & ZK Protocol Engineer | ?? Rust | C++ | Web3 | Solidity | Golang | Cryptography | Author
Rust is renowned for its emphasis on safety, particularly concurrent safety. One of the challenges in concurrent programming is the safe sharing of data between threads. In this article, we'll delve into how Rust allows for data sharing amongst threads, ensuring safety and efficiency.
The Problem: Data Races
Before we dive into the solutions, let's first understand the problem. When multiple threads can access data simultaneously, and at least one of them is writing, you have the potential for a data race. Data races lead to undefined behaviour, which is a situation Rust works hard to prevent.
The Rust Way
In Rust, the type system and the borrow checker are designed to prevent data races. There are two key concepts:
With these principles, Rust ensures that:
Sharing Data
1. Arc<T>
The Arc<T> type is an atomic reference counted pointer. When the last reference to an Arc is dropped, the data it points to is also dropped. It allows multiple threads to read shared data simultaneously.
Example:
use std::sync::Arc;
use std::thread;
fn main() {
let numbers = Arc::new(vec![1.0, 2.0, 3.0]);
let handles: Vec<_> = (0..3).map(|i| {
let numbers = Arc::clone(&numbers);
thread::spawn(move || {
println!("{:?}", numbers[i]);
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
}
In the above example, Arc::clone creates a new reference to the same data, increasing the reference count.
2. Mutex
A mutex (short for "mutual exclusion") ensures that only one thread can modify the data it protects at a time.
use std::sync::{Mutex, Arc};
use std::thread;
fn main() {
let counter = Arc::new(Mutex::new(0));
let handles: Vec<_> = (0..10).map(|_| {
let counter = Arc::clone(&counter);
thread::spawn(move || {
let mut num = counter.lock().unwrap();
*num += 1;
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
println!("Result: {}", *counter.lock().unwrap());
}
Here, each thread increments the counter. The lock method locks the mutex, and if another thread has already locked it, lock it will block the current thread until it can get access.
3. Channels
Channels in Rust are a way to send data between threads. There are two halves to a channel: the sender and the receiver.
use std::sync::mpsc;
use std::thread;
fn main() {
let (tx, rx) = mpsc::channel();
thread::spawn(move || {
let val = String::from("hi");
tx.send(val).unwrap();
});
let received = rx.recv().unwrap();
println!("Got: {}", received);
}
In this example, the spawned thread sends a message (a String) to the main thread, which receives and prints it.
The Underlying Principles of Mutex<T> and Arc<T>
Mutex Internals:
A mutex, by design, provides exclusive access to the data it wraps. But how does it guarantee this?
Atomic Reference Counting:
Arc<T>, standing for Atomic Reference Counting, is a thread-safe version of Rc<T>. The atomicity ensures that the reference count can be safely incremented or decremented across threads.
领英推荐
More on Channels: mpsc and Beyond
Rust's standard library provides a multiple-producer, single-consumer channel, often abbreviated as mpsc.
RwLock<T>
The RwLock is similar to a Mutex, but allows for more flexibility:
Barrier
A Barrier is a sync primitive that allows multiple threads to synchronize the beginning of some computation.
Unsafe and Atomics
Rust offers low-level atomic operations through the module when the provided abstractions don't offer the desired performance characteristics. They're called "atomics" because their operations are indivisible — other threads can't see an operation at an intermediate state. While powerful, they require careful handling to ensure safety.
External Crates
The Rust ecosystem offers a wealth of concurrency-related crates. Some notable ones include:
Lock-Free Data Structures
Scoped Threads in crossbeam
The Rust standard library's thread spawning doesn't allow threads to have references to the stack of the spawning thread because they can outlive their parent's stack frame. The crossbeam crate provides a way to spawn scoped threads, which can safely access the parent thread's stack.
Concurrency vs Parallelism
While they're related, it's important to differentiate concurrency from parallelism:
Rust provides tools for both. While the primitives discussed above largely deal with concurrency, tools like rayon are designed for parallelism, focusing on splitting computation across multiple cores.
Memory Model and Hardware Considerations
While Rust's abstractions shield developers from many of the intricacies of concurrent programming, underlying hardware and the memory model play a crucial role. Cache coherence, memory reordering, and atomic operations can behave differently across architectures. Rust's atomic operations, by default, use a strong memory ordering to ensure consistency across platforms. Still, they can be configured for relaxed ordering if needed and if one knows what they're doing.
Stay tuned, and happy coding!
Visit my Blog for more articles, news, and software engineering stuff!
All the best,
CTO | Tech Lead | Senior Software Engineer | Cloud Solutions Architect | Rust ?? | Golang | Java | ML AI & Statistics | Web3 & Blockchain
Developer / Programmer / Automatics, control and Robotics
8 个月man ...!, a really good article, thank you