Smart Pointers in Rust
Amit Nadiger
Polyglot (Rust, Move, C++, C, Kotlin, Java), Blockchain, Polkadot, UTXO, Substrate, Sui, Aptos, Wasm, Proxy-wasm, AndroidTV, Dvb, STB, Linux, Cas, Engineering management.
Rust is a modern programming language that prioritizes safety and performance. One of the ways Rust ensures safety is through its smart pointer types. Smart pointers are a type of data structure that wrap a value and provide additional capabilities, such as ownership or reference counting. In this article, we will explore Rust smart pointers, including their different types, the problems they solve, their advantages, limitations, and suitable scenarios for their use.
Types of Smart Pointers in Rust
Rust has no garbage collector. Instead, memory is usually managed by RAII wrapper types such as Box, Vec, Rc, or Arc. These encapsulate ownership and memory allocation in various ways, and prevent many of the memory errors that are possible in languages like C.
RAII stands for Resource Acquisition Is Initialization. It's a programming idiom used in languages with deterministic memory management, such as C++ and Rust, to manage the acquisition and release of resources (like memory, files, locks, etc.) in a safe and automatic way.
The basic idea behind RAII is that the acquisition of a resource is tied to the initialization of an object, and the release of the resource is tied to the destruction of the object. This way, the resource is automatically released when the object goes out of scope, even in the presence of exceptions, early returns, or other error conditions.
In Rust, RAII is a fundamental part of the language design, and is enforced by the ownership and borrowing system. Rust provides several built-in types that use RAII to manage memory and other resources, such as Box, Vec, String, and many more. Rust also allows users to define their own RAII types by implementing the Drop trait, which provides a custom destructor function that is called when the object is dropped.
RAII is a powerful and elegant pattern that makes it easy to write correct, robust, and efficient code in Rust, and is one of the key features that sets Rust apart from other systems programming languages.
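As a minimal sketch of the idiom described above, the following example ties a resource to an object's lifetime. The Guard type, its acquire function, and the log vector are hypothetical names invented for illustration; they are not part of the standard library.

```rust
// Hypothetical RAII guard: "acquires" a resource on construction and
// "releases" it in Drop. Both events are recorded in a borrowed log.
struct Guard<'a> {
    name: &'static str,
    log: &'a mut Vec<String>,
}

impl<'a> Guard<'a> {
    fn acquire(name: &'static str, log: &'a mut Vec<String>) -> Guard<'a> {
        log.push(format!("acquire {}", name));
        Guard { name, log }
    }
}

impl<'a> Drop for Guard<'a> {
    // Called automatically when the guard goes out of scope.
    fn drop(&mut self) {
        self.log.push(format!("release {}", self.name));
    }
}

fn raii_demo() -> Vec<String> {
    let mut log = Vec::new();
    {
        let _guard = Guard::acquire("file", &mut log);
        // ... the resource would be used here ...
    } // _guard is dropped here, so the resource is released
    log.push(String::from("after scope"));
    log
}

fn main() {
    for event in raii_demo() {
        println!("{}", event);
    }
}
```

No explicit release call appears anywhere: the release is recorded because the guard leaves its scope.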
This article covers three commonly used smart pointers:
1. Box<T>
Box is a simple smart pointer that points to data on the heap. It is used when we want to store a value on the heap rather than on the stack. When we create a Box<T> instance, the value of type T is allocated on the heap and the Box points to that memory location. Box implements the Deref trait, which allows us to access the data it points to as if it were a reference. Box is a single-owner pointer, which means that it cannot be implicitly copied, but it can be moved. Box is similar to unique_ptr<> in C++.
Here is an example of how to use a Box:
fn main() {
    let x = Box::new(5);
    // let y = x; // Uncommenting this moves x into y. Box does not implement
    //            // the Copy trait, so using x afterwards is a compilation error.
    println!("{}", *x); // x is still valid here because it has not been moved.
}
/*
Below is the compilation error when "let y = x;" is uncommented:
 BoxTest.rs:6:31
  |
2 |     let x = Box::new(5);
  |         - move occurs because `x` has type `Box<i32>`, which does not implement the `Copy` trait
3 |     let y = x;
  |             - value moved here
4 |     println!("{}", *x);
  |                    ^^ value borrowed here after move
  |
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
  |
5 |     let y = x.clone();
  |              ++++++++
error: aborting due to previous error; 1 warning emitted
*/
In this example, we create a Box<i32> that contains the value 5. We can access the value by dereferencing the Box using the * operator.
Box<T> implements the Clone trait (when T implements Clone) but not the Copy trait, because a bitwise copy would violate Rust's ownership rules: two boxes would point to the same heap allocation and both would try to free it. When a value is moved into a Box<T>, ownership is transferred to the box. Cloning a Box<T> allocates new heap memory and clones the value into it, so each box owns its own value and the rule that a value has only one owner at a time is preserved.
To create a new Box<T> with the same value as an existing Box<T>, you can call Box::new and dereference the original box with the * operator. The original box stays usable when T implements Copy, as i32 does here; for other types, *x moves the value out of the original box:
let x = Box::new(5);
let y = Box::new(*x);
This creates a new Box<i32> with the same value as x, without violating Rust's ownership rules. The clone() method achieves the same result:
fn main() {
    let x = Box::new(5);
    let y = x.clone();
    println!("x = {}, y = {}", x, y);
}
/*
Op => x = 5, y = 5
*/
In this example, Box::new(5) creates a new Box x holding the value 5 on the heap. Calling x.clone() performs a second heap allocation and clones the contents of x into the new box y.
After the second statement (let y = x.clone()), there are two separate boxes, x and y, each containing its own copy of the value 5. The two boxes are independent and can be modified separately.
Box is suitable for cases where you need to allocate memory on the heap for a fixed size data type. For example, you might use a Box to hold a large array or a struct.
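To illustrate the point about large values, here is a rough sketch showing that moving a Box hands over only a pointer while the heap data stays in place. The Frame type and the consume function are hypothetical and exist only for this example.

```rust
use std::mem::size_of;

// Hypothetical large type used only for illustration (64 KiB payload).
struct Frame {
    pixels: [u8; 65_536],
}

// Takes ownership of the box. Only the pointer is passed to this function,
// not the 64 KiB payload.
fn consume(frame: Box<Frame>) -> usize {
    let addr = &*frame as *const Frame as usize;
    // Read the data so the field is actually used.
    assert_eq!(frame.pixels[0], 1);
    addr
}

fn box_move_demo() -> (bool, usize) {
    let frame = Box::new(Frame { pixels: [1; 65_536] });
    let before = &*frame as *const Frame as usize;
    let after = consume(frame); // moves the Box; the heap data stays where it is
    (before == after, size_of::<Box<Frame>>())
}

fn main() {
    let (same_address, box_size) = box_move_demo();
    println!("same heap address after move: {}", same_address);
    println!("size of Box<Frame> on the stack: {} bytes", box_size);
}
```

The Box itself is pointer-sized, so passing it around is cheap regardless of how large the boxed value is.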
2. Rc<T>
Rust supports reference counting with Rc<T>. Rc stands for Reference Counted and is used when we want to have multiple owners of the same data on the heap. Rc stores the value together with its reference count in a single heap allocation. It keeps track of the number of references to the value, and when the last reference is dropped, the value is deallocated.
You can wrap a value of any type in Rc to make it reference counted.
Rc implements the Clone trait, which allows us to create new references to the same data. Calling clone() increments the reference count, and dropping a clone (explicitly with drop() or by letting it go out of scope) decrements it. When the reference count reaches zero, the value is dropped.
Rc is a single-threaded smart pointer, which means that it cannot be shared across multiple threads.
Here is an example of how to use an Rc:
use std::rc::Rc;
fn main() {
    let x = Rc::new(5);
    let y = x.clone();
    println!("{} {}", *x, *y);
}
In the above example, we create an Rc<i32> with the value 5. We then create a new reference y to the same data using the clone() method. Both x and y share ownership of the value 5.
The example below shows that clones of an Rc point to the same value:
use std::rc::Rc;
#[derive(Debug, Clone)]
struct Resource(String);
impl Drop for Resource {
    fn drop(&mut self) {
        println!("{} was dropped", self.0)
    }
}
fn main() {
    let s = Rc::new(Resource("Amit".to_string()));
    {
        let t = Rc::clone(&s);
        println!("I have a {:?}", t);
    }
    println!("The reference is still valid, s = {:?}", s);
}
/*
Output:
I have a Resource("Amit")
The reference is still valid, s = Resource("Amit")
Amit was dropped
*/
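To check directly that two Rc handles share one allocation, and that an explicit drop() lowers the count, a small sketch using Rc::ptr_eq and Rc::strong_count from the standard library could look like this. The function name rc_demo is made up for the example.

```rust
use std::rc::Rc;

// Returns (handles point to the same allocation, count before drop, count after drop).
fn rc_demo() -> (bool, usize, usize) {
    let a = Rc::new(String::from("Amit"));
    let b = Rc::clone(&a); // increments the strong count, no deep copy

    let same_allocation = Rc::ptr_eq(&a, &b);
    let before = Rc::strong_count(&a);

    drop(b); // explicitly give up one owner
    let after = Rc::strong_count(&a);

    (same_allocation, before, after)
}

fn main() {
    let (same_allocation, before, after) = rc_demo();
    println!("same allocation: {}", same_allocation);
    println!("strong count before drop: {}, after drop: {}", before, after);
}
```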
Use case:
We can create functions that accept Rc<Data> and pass clones to them, and all of the clones will point to the same data. You've traded a quick add/compare operation for reuse, and memory is preserved by allocating only once.
See the example below:
use std::rc::Rc;
#[derive(Debug, Clone)]
struct Resource(String);
impl Drop for Resource {
    fn drop(&mut self) {
        println!("{} was dropped", self.0)
    }
}
fn fun1(rc1: Rc<Resource>) {
    println!("in Fun1 rc1 = {:?}", rc1);
    // println!("Count of rc1 in fn1= {}", Rc::strong_count(&rc1));
}
fn fun2(rc1: Rc<Resource>) {
    println!("in Fun2 rc1 = {:?}", rc1);
}
fn fun3(rc1: Rc<Resource>) {
    println!("in Fun3 rc1 = {:?}", rc1);
}
fn main() {
    let s = Rc::new(Resource("Amit".to_string()));
    {
        let t = Rc::clone(&s);
        println!("I have a {:?}", t);
    }
    fun1(Rc::clone(&s));
    fun2(Rc::clone(&s));
    fun3(Rc::clone(&s));
}
/*
I have a Resource("Amit")
in Fun1 rc1 = Resource("Amit")
in Fun2 rc1 = Resource("Amit")
in Fun3 rc1 = Resource("Amit")
Amit was dropped
*/
Please note that an Rc value can't be passed to or shared with other threads.
Rc is a single-threaded construct. We can't pass an Rc between threads, because nothing protects the reference count itself from being modified by multiple threads at once.
Sharing an Rc with a thread results in the compilation error shown below:
use std::rc::Rc;
use std::thread;
fn main() {
    let x = Rc::new(5);
    let y = Rc::clone(&x);
    let t = thread::spawn(move || {
        println!("{}", *y);
    });
    println!("{}", *x);
    t.join().unwrap();
    println!("{} {}", *x, *y);
}
/*
Compilation error:
   |
22 |     let t = thread::spawn(move || {
   |             ------------- ^------
   |             |             |
   |  ___________|_____________within this `[closure@ARCTest.rs:22:24: 22:31]`
   | |           |
   | |           required by a bound introduced by this call
23 | |       println!("{}", *y);
24 | |     });
   | |_____^ `Rc<i32>` cannot be sent between threads safely
   |
   = help: within `[closure@ARCTest.rs:22:24: 22:31]`, the trait `Send` is not implemented for `Rc<i32>`
note: required because it's used within this closure
  --> ARCTest.rs:22:24
   |
22 |     let t = thread::spawn(move || {
   |                           ^^^^^^^
note: required by a bound in `spawn`
  --> /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/thread/mod.rs:709:1
error: aborting due to previous error; 1 warning emitted
For more information about this error, try `rustc --explain E0277`.
*/
Rc is suitable for cases where you need to share ownership of data between multiple parts of your code. For example, you might use an Rc to share a data structure between a UI and a backend system.
APIs provided by Rc:
The code below shows how the reference count is incremented and decremented.
use std::rc::Rc;
#[derive(Debug, Clone)]
struct Resource(String);
impl Drop for Resource {
    fn drop(&mut self) {
        println!("{} was dropped", self.0)
    }
}
fn main() {
    let s = Rc::new(Resource("Amit".to_string()));
    println!("Count of references before block = {}", Rc::strong_count(&s));
    {
        let a = Rc::clone(&s);
        println!("Count of references = {}", Rc::strong_count(&s));
        let b = Rc::clone(&s);
        println!("Count of references = {}", Rc::strong_count(&s));
        let c = Rc::clone(&s);
        println!("Count of references = {}", Rc::strong_count(&s));
    }
    println!("Count of references outside the inner block = {}", Rc::strong_count(&s));
}
/*
Count of references before block = 1
Count of references = 2
Count of references = 3
Count of references = 4
Count of references outside the inner block = 1
Amit was dropped
*/
3. Arc<T>
Arc stands for Atomically Reference Counted. It uses atomic operations for its reference count and is used when we want to have multiple owners of the same data that can be shared across multiple threads. Arc is similar to Rc in that it keeps track of the number of references to a value and deallocates the value when the last reference is dropped. However, Arc is thread-safe and can be shared across multiple threads. Arc implements the Clone trait, which allows us to create new references to the same data.
The main trade-off between Rc and Arc is that Arc is slightly slower, because reference count updates are atomic operations.
Here is an example of how to use an Arc:
use std::sync::Arc;
use std::thread;
fn main() {
    let x = Arc::new(101);
    let y = Arc::clone(&x);
    let t = thread::spawn(move || {
        println!("{}", *y);
    });
    println!("{}", *x);
    t.join().unwrap();
}
/*
Op =>
amit@DESKTOP-9LTOFUP:~/OmPracticeRust$ ./ARCTest
101
101
*/
In this example, we create an Arc<i32> that contains the value 101. We then create a new reference y to the same data using the Arc::clone() method. We then spawn a new thread and move y into it. In the main thread, we print the value of x. Both the main thread and the spawned thread have access to the same data and can safely read from it. Once both threads are done and the last Arc has been dropped, the value 101 is deallocated.
Arc is suitable for cases where you need to share data between multiple threads. For example, you might use an Arc to share a cache of data between multiple worker threads in a web server.
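The worker-thread scenario can be sketched roughly as follows. This is only an illustration under simple assumptions: the "cache" is a read-only HashMap with made-up entries, and lookup_all is a hypothetical helper, not an API from any web framework.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Spawns one worker per key. Every worker reads from the same cache
// through its own Arc handle.
fn lookup_all(keys: Vec<&'static str>) -> Vec<Option<u32>> {
    // Made-up cache contents for the example.
    let mut map = HashMap::new();
    map.insert("home", 200);
    map.insert("missing", 404);
    let cache = Arc::new(map);

    let handles: Vec<_> = keys
        .into_iter()
        .map(|key| {
            let cache = Arc::clone(&cache); // one handle per worker
            thread::spawn(move || cache.get(key).copied())
        })
        .collect();

    // Join in spawn order so the results line up with the keys.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let results = lookup_all(vec!["home", "missing", "other"]);
    println!("{:?}", results);
}
```

Because the cache is only read, no Mutex is needed here; Arc alone is enough for shared immutable data.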
Please consider the example below, which demonstrates the usage of Arc.
The goal is to modify a value inside a spawned thread and see the modified value in the main thread.
Let's first try to do this using a normal mutable reference:
fn thread_fun(a: &mut i32) {
    *a = 101;
    println!("in the thread a = {}", a);
}
fn main() {
    let mut a = 100;
    let handle = std::thread::spawn(move || {
        thread_fun(&mut a);
    });
    // Handle the Result returned by join()
    handle.join().unwrap_or_else(|_| {
        println!("Error joining thread");
    });
    println!("in the main thread a = {}", a);
}
/*
Op =>
in the thread a = 101
in the main thread a = 100
*/
In the above code the value of a is modified to 101 inside the thread function, but after the thread is joined, the main thread still sees a as 100. This is because i32 implements Copy, so the move closure captures its own copy of a. The thread function receives a mutable reference to that copy and modifies it inside the thread, while the variable a in the main thread is never touched.
If we want to see the modified value of a in the main thread, we need to either return it from the thread function or share a between the threads with Arc<Mutex<T>>, where Arc provides shared ownership and Mutex provides synchronized mutable access.
use std::sync::{Arc, Mutex};
use std::thread;
fn thread_fun(a: Arc<Mutex<i32>>) {
    // Lock the mutex to access the value
    let mut value = a.lock().unwrap();
    // Modify the value
    *value = 101;
    // Print the modified value
    println!("in the thread a = {}", *value);
}
fn main() {
    // Create an Arc-wrapped Mutex to share ownership of the i32 between threads
    let a = Arc::new(Mutex::new(100));
    println!("in the main thread before modification a = {}", a.lock().unwrap());
    // Spawn a new thread and move a clone of the Arc into it
    let handle = thread::spawn({
        let b = Arc::clone(&a);
        move || thread_fun(b)
    });
    // Wait for the thread to finish executing
    handle.join().unwrap();
    // Lock the mutex to access the value modified by the thread
    let value = a.lock().unwrap();
    // Print the modified value
    println!("in the main thread a = {}", *value);
}
/*
Op =>
amit@DESKTOP-9LTOFUP:~/OmPracticeRust$ ./OwnerShip
in the main thread before modification a = 100
in the thread a = 101
in the main thread a = 101
*/
In the code, we first create an Arc<Mutex<i32>> to store the integer value 100, then we clone the Arc and move the clone into the thread. Inside the thread, we first obtain a mutex guard by calling lock() on the cloned Arc, then we dereference the guard to access the inner value and set it to 101. Finally, the guard goes out of scope at the end of the function, which automatically unlocks the mutex. In the main thread, we again obtain a mutex guard by calling lock() on the original Arc, then we dereference the guard to access the updated value and print it.
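The other option mentioned earlier, returning the value from the thread, needs no shared state at all. A minimal sketch, assuming the thread function can take the value by copy and hand back the result (thread_fun and update_in_thread are illustrative names):

```rust
use std::thread;

// Computes the new value instead of mutating shared state.
fn thread_fun(a: i32) -> i32 {
    a + 1
}

// Runs thread_fun on a worker thread and returns its result.
fn update_in_thread(a: i32) -> i32 {
    let handle = thread::spawn(move || thread_fun(a));
    // join() yields whatever the closure returned.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let a = 100;
    let a = update_in_thread(a);
    println!("in the main thread a = {}", a);
}
```

This works well when the thread produces a single result at the end; Arc<Mutex<T>> is the better fit when several threads need to update the value while they run.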
APIs provided by Arc:
Please note that:
We can call the lock() method directly on an Arc<Mutex<T>>, or on a reference to one, because Arc implements Deref and the call is forwarded to the inner Mutex. Cloning the Arc is only needed to give another thread (or another owner) its own handle to the same Mutex. The lock() call returns a mutex guard, which you dereference to access the inner value.
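In practice, lock() can be called through a plain reference to an Arc<Mutex<T>>, because Arc implements Deref to the inner Mutex. A small sketch to check this (bump and lock_demo are hypothetical names):

```rust
use std::sync::{Arc, Mutex};

// Takes the Arc by reference: no clone, so the strong count is unchanged.
fn bump(counter: &Arc<Mutex<i32>>) -> usize {
    // Deref forwards lock() to the inner Mutex. The temporary guard is
    // released at the end of this statement.
    *counter.lock().unwrap() += 1;
    Arc::strong_count(counter)
}

// Returns (value after the update, number of Arc owners seen inside bump).
fn lock_demo() -> (i32, usize) {
    let counter = Arc::new(Mutex::new(100));
    let owners = bump(&counter);
    let value = *counter.lock().unwrap();
    (value, owners)
}

fn main() {
    let (value, owners) = lock_demo();
    println!("value = {}, owners = {}", value, owners);
}
```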
Problems Smart Pointers Solve
Smart pointers solve several problems related to memory management and ownership in Rust. One of the main problems that Rust solves is memory safety. Safe Rust ensures that there are no null or dangling pointers, which are common sources of bugs and security vulnerabilities. Smart pointers ensure that the values they point to are always valid and that they are properly deallocated when they are no longer needed.
Smart pointers also solve ownership problems. Rust's ownership model ensures that a value has only one owner at a time. When the owner goes out of scope, the value is deallocated. Smart pointers provide a way to share ownership of a value or to transfer ownership to another scope.
Example showcasing all three smart pointers in Rust, along with comments to explain their usage:
fn main() {
    // Create a simple struct to use as an example
    struct Person {
        name: String,
        age: u8,
    }
    // Raw pointers - these are just pointers to memory addresses
    // and are not automatically managed
    let mut p1 = Box::new(Person { name: String::from("Amit"), age: 40 });
    let p2: *const Person = &*p1; // Create an immutable raw pointer to the Person in p1
    let p3: *mut Person = &mut *p1; // Create a mutable raw pointer to the Person in p1
    // If raw pointers are not used correctly, they can lead to a
    // variety of issues such as null pointer dereferencing,
    // use-after-free, and data races.
    // Box pointers - these are managed pointers that automatically
    // allocate and deallocate memory for their contents
    let p4 = Box::new(Person { name: String::from("Vinayak"), age: 43 });
    // Uncommenting the following line moves ownership from p4 to p5.
    // Box<T> owns its contents and cannot be copied, so any later use
    // of p4 would be a compile-time error. This prevents accidental
    // double-frees and use-after-free errors.
    // let p5 = p4;
    // Rc pointers - these are reference-counted pointers that
    // allow shared ownership of their contents
    let p6 = std::rc::Rc::new(Person { name: String::from("Om"), age: 3 });
    let p7 = p6.clone(); // Both p6 and p7 now own the same Person instance
    // Since Rc<T> uses reference counting, it can lead to issues such as
    // cyclic references and memory leaks if not used carefully.
    // Arc pointers - these are atomic reference-counted pointers that
    // allow shared ownership of their contents across multiple threads
    let p8 = std::sync::Arc::new(Person { name: String::from("Anvi"), age: 1 });
    let p9 = p8.clone(); // Both p8 and p9 now own the same Person instance
    // Arc<T> is thread-safe but has a performance overhead due to the use
    // of atomic operations. Additionally, it can still lead to issues such
    // as cyclic references and memory leaks if not used carefully.
}
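The comments above warn about cyclic references with Rc and Arc. One common way to avoid such a cycle, which this article does not cover in detail, is to hold the back-reference as a std::rc::Weak. The sketch below is an illustration only: the Parent and Child types are hypothetical, and RefCell is used just so the child can be attached after the parent has been created.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical types: the parent owns its children strongly, while each
// child refers back to the parent weakly, so no ownership cycle forms.
struct Parent {
    name: String,
    children: RefCell<Vec<Rc<Child>>>,
}

struct Child {
    parent: Weak<Parent>,
}

// Returns (strong count, weak count, parent name seen from the child,
// whether the parent is gone after its last strong owner is dropped).
fn weak_demo() -> (usize, usize, Option<String>, bool) {
    let parent = Rc::new(Parent {
        name: String::from("Amit"),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Child { parent: Rc::downgrade(&parent) });
    parent.children.borrow_mut().push(Rc::clone(&child));

    let strong = Rc::strong_count(&parent); // the weak back-reference is not counted
    let weak = Rc::weak_count(&parent);
    let seen = child.parent.upgrade().map(|p| p.name.clone());

    drop(parent); // last strong owner goes away, so the Parent is freed
    let gone = child.parent.upgrade().is_none();
    (strong, weak, seen, gone)
}

fn main() {
    let (strong, weak, seen, gone) = weak_demo();
    println!("strong = {}, weak = {}, seen = {:?}, gone = {}", strong, weak, seen, gone);
}
```

Had the child held an Rc<Parent> instead, the two values would keep each other alive and neither would ever be dropped.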
Advantages of Smart Pointers
Smart pointers provide several advantages over raw pointers in Rust:
Limitations of Smart Pointers
Smart pointers have a few limitations in Rust:
Suitable Scenarios for Smart Pointers
Smart pointers are suitable for various scenarios in Rust:
Differences between Box, Rc, and Arc:
Still, there are 3 more smart pointers in Rust!!
You heard it right, yes => RefCell<T>, Cell<T>, and Cow<T>.
While Box<T>, Rc<T>, and Arc<T> are indeed more commonly used smart pointers in Rust, it doesn't diminish the importance and usefulness of RefCell<T>, Cell<T>, and Cow<T>. Each of these smart pointers serves specific purposes and can be valuable in different scenarios.
RefCell<T>:
Cell<T>:
Cow<T>:
While Box<T>, Rc<T>, and Arc<T> are more commonly used due to their broader applicability, RefCell<T>, Cell<T>, and Cow<T> are valuable tools for specific use cases that involve interior mutability (Cell<T> and RefCell<T>) or clone-on-write borrowing (Cow<T>). Understanding their strengths and use cases can help you choose the appropriate smart pointer for your specific needs.
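As a brief taste of these three types, here is a rough sketch using only standard library calls. The helper names underscore and interior_demo are invented for the example.

```rust
use std::borrow::Cow;
use std::cell::{Cell, RefCell};

// Hypothetical helper: returns the input as-is (borrowed) unless it contains
// a space, in which case a new String is allocated (owned).
fn underscore(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

// Mutates values through bindings that are not declared `mut`.
fn interior_demo() -> (u32, Vec<i32>) {
    let hits = Cell::new(0u32); // Cell: get/set for Copy values
    hits.set(hits.get() + 1);

    let items = RefCell::new(vec![1, 2]); // RefCell: borrow rules checked at runtime
    items.borrow_mut().push(3);

    (hits.get(), items.into_inner())
}

fn main() {
    println!("{:?}", interior_demo());
    println!("{}", underscore("smart"));
    println!("{}", underscore("smart pointer"));
}
```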
I will write a separate article on RefCell<T>, Cell<T>, and Cow<T>.
Thanks for reading till the end. Let's learn together!!
Happy learning!