Memory management & ownership in Rust
Amit Nadiger
Polyglot (Rust, C++ 11/14/17/20, C, Kotlin, Java), Android TV, CAS, Blockchain, Polkadot, UTXO, Substrate, Wasm, Proxy-Wasm, DVB, STB, Linux, Engineering management.
Memory management is a critical aspect of software development. Programs must allocate and deallocate memory effectively to avoid performance problems, memory leaks, and security vulnerabilities.
Rust is a modern systems programming language known for its strong memory safety guarantees. Memory management in Rust is designed to be safe and efficient, with a focus on preventing memory errors such as null pointer dereferences, buffer overflows, and dangling pointers (use-after-free errors).
In this article, we will explore Rust's memory management model in detail, including how Rust manages memory on the stack and the heap, how it implements ownership and borrowing, and how it handles lifetimes.
Memory Segments
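Before diving into Rust's memory management, it's helpful to understand the different segments of memory in a typical program. A running program usually has the following memory segments:
Text (code) segment: the compiled machine instructions.
Data segment: static and global data, including read-only data such as string literals.
Heap: dynamically allocated memory.
Stack: function call frames and local variables.
The text segment and the read-only data are typically mapped read-only and cannot be modified at runtime, and the size of the data segment is fixed when the program is built. The heap and stack are dynamic: memory in them is allocated and deallocated at runtime.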
Memory Management in Rust
Rust uses a combination of ownership and borrowing to manage memory. Ownership refers to the idea that a single owner is responsible for a particular piece of memory. When the owner goes out of scope, Rust automatically frees the memory. Ownership can be transferred from one variable to another by a move, which happens on assignment, when a value is passed to a function, or when it is returned. No special keyword is needed for this (the move keyword applies only to closures, where it makes the closure take ownership of the variables it captures).
Borrowing, on the other hand, refers to temporarily lending out a reference to a value without giving up ownership. At any given time a value can have either any number of shared (&) references or exactly one mutable (&mut) reference. This rule prevents data races and other aliasing-related problems.
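As a minimal sketch of the ownership and borrowing ideas described above (the function names total and push_one are hypothetical, chosen only for this illustration), the following program lends a vector out through shared and mutable references while the variable v stays the owner:

```rust
// Borrow a slice immutably: the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Borrow the vector mutably: only one such borrow may exist at a time.
fn push_one(values: &mut Vec<i32>) {
    values.push(1);
}

fn main() {
    let mut v = vec![1, 2, 3]; // v owns the heap buffer

    let a = &v; // any number of shared borrows may coexist
    let b = &v;
    println!("total = {}", total(a) + total(b)); // total = 12

    push_one(&mut v); // allowed: a and b are no longer used
    println!("v = {:?}", v); // v = [1, 2, 3, 1]
} // v goes out of scope here and its buffer is freed
```

If a or b were used again after the call to push_one, the compiler would reject the program, because a shared borrow and a mutable borrow would then overlap.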
Let's discuss each memory segment one by one.
1. Stack Memory
The stack is a region of memory that stores function call frames and local variables. Rust has no garbage collector or heavyweight runtime; the stack is managed by code that the compiler generates. When a function is called, a new stack frame is created on the stack to store the function's local variables and other information. When the function returns, the stack frame is destroyed, and the memory is freed.
Rust's stack management is deterministic and efficient. Allocating and freeing a frame amounts to adjusting the stack pointer, so it is fast and causes no fragmentation. However, the stack has a limited size, and programs that use a lot of stack space (deep recursion or very large local values) may run into stack overflow errors.
Here's an example of stack memory usage in Rust:
fn main() {
    let x = 5; // x is allocated on the stack
    println!("x = {}", x);
} // x is deallocated when main() returns
In this example, the variable x is allocated on the stack when main() is called, and is deallocated when main() returns.
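To make the idea of one frame per call more concrete, here is a small sketch (sum_to is a hypothetical helper, not part of the original example). Every recursive call gets its own frame with its own copy of local, and each frame is released as the call returns:

```rust
// Each call gets its own stack frame holding its own copy of `local`.
fn sum_to(n: u32) -> u32 {
    let local = n;
    if local == 0 {
        0
    } else {
        local + sum_to(local - 1)
    }
}

fn main() {
    let buffer = [0u8; 64]; // fixed-size array stored in main's frame
    println!("buffer uses {} bytes", std::mem::size_of_val(&buffer)); // 64
    println!("sum_to(4) = {}", sum_to(4)); // 10
}
```

Calling sum_to with a very large argument would create a very deep chain of frames, which is one way to hit the stack overflow mentioned above.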
----------------------------------------------------------------x
2. Heap Memory
The heap is a region of memory that is used for dynamically allocated memory. Rust's heap is managed using a combination of ownership and borrowing. In Rust, heap memory is managed using smart pointers. Smart pointers are special data types that manage a value on the heap and provide additional functionality such as reference counting, automatic memory deallocation, and data sharing.
Rust provides several types of heap-allocated data structures, such as Vec and Box.
Can I allocate the memory on the heap without smart pointers?
In Rust, heap allocation is typically done using smart pointers, such as Box, Rc, or Arc. These smart pointers provide additional functionality and guarantees over raw pointers and allow for safer and more controlled memory management.
However, if you specifically want to allocate data on the heap without using smart pointers, you can use the std::alloc module provided by the Rust standard library. This module provides low-level functions for memory allocation and deallocation.
Here's an example of how you can allocate memory on the heap using std::alloc:
The below code manually allocates memory on the heap, writes a value to the allocated memory, accesses and prints the value, and finally deallocates the memory. This low-level memory management can be useful in certain scenarios where fine-grained control over memory allocation and deallocation is required. However, it's important to note that manual memory management comes with increased complexity and potential for bugs, so it should be used with caution.
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

fn main() {
    let value = 42;
    // Create a Layout object using the Layout::new function.
    // It specifies the size and alignment requirements for an i32.
    let layout = Layout::new::<i32>();
    // Retrieve the size and alignment values from the Layout object.
    // The size is the number of bytes needed to store an i32 value,
    // and align is its alignment requirement.
    let size = layout.size();
    let align = layout.align();
    println!("size = {}, align = {}", size, align);
    // Allocate memory on the heap. The alloc function takes the Layout
    // and returns a *mut u8, which is cast to *mut i32 because the memory
    // will hold an i32 value. alloc returns a null pointer on failure.
    let ptr = unsafe { alloc(layout) as *mut i32 };
    if !ptr.is_null() {
        unsafe {
            // Writing through a raw pointer is an unsafe operation, so it
            // must be done within an unsafe block. ptr::write stores the
            // value 42 at the memory location pointed to by ptr.
            ptr::write(ptr, value);
            // Access the allocated value
            println!("Allocated value: {}", *ptr);
            // Deallocate the memory. dealloc takes the pointer and the same
            // layout that was used for the allocation. The pointer is cast
            // to *mut u8 because dealloc expects a pointer to bytes.
            dealloc(ptr as *mut u8, layout);
        }
    }
}
Note that using raw heap allocation with std::alloc requires careful management of memory, including manual deallocation. It bypasses Rust's ownership system and the safety guarantees provided by smart pointers. Therefore, it's generally recommended to use smart pointers like Box or other higher-level abstractions unless you have a specific reason to resort to raw heap allocation.
Rust has no new keyword. A value is placed on the heap by calling a constructor function such as Box::new, Vec::new, or String::from. The variable that receives the result owns the allocation and is responsible for freeing it when it goes out of scope. Rust's Box type is a smart pointer that allocates memory on the heap and automatically frees it when the Box goes out of scope.
Rust's heap management is designed to be safe and efficient. Ownership and borrowing ensure that each allocation is freed exactly once, that memory is not leaked in ordinary code, and that it is never used after it has been freed.
RAII in Rust:
RAII stands for Resource Acquisition Is Initialization. It's a programming idiom used in languages with deterministic memory management, such as C++ and Rust, to manage the acquisition and release of resources (like memory, threads, mutexes, files, locks, etc.) in a safe and automatic way.
The basic idea behind RAII is that the acquisition of a resource is tied to the initialization of an object, and the release of the resource is tied to the destruction of the object. This way, the resource is automatically released when the object goes out of scope, even in the presence of exceptions (panics, in Rust), early returns, or other error conditions.
In Rust, RAII is a fundamental part of the language design, and is enforced by the ownership and borrowing system. Rust provides several built-in types that use RAII to manage memory and other resources, such as Box, Vec, String, and many more. Rust also allows users to define their own RAII types by implementing the Drop trait, which provides a custom destructor function that is called when the object is dropped.
RAII is a powerful and elegant pattern that makes it easy to write correct, robust, and efficient code in Rust, and is one of the key features that sets Rust apart from other systems programming languages.
In Rust, memory management is usually handled by RAII wrapper types such as Box, Vec, Rc, or Arc. These encapsulate ownership and memory allocation, and prevent errors that are common with manual memory management in C, such as leaks, double frees, and use-after-free.
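RAII is not limited to memory. As an illustrative sketch (the increment function is a hypothetical name), the guard returned by the standard library's Mutex::lock holds the lock for as long as it is alive and releases it when it goes out of scope, with no explicit unlock call:

```rust
use std::sync::Mutex;

// Acquire the lock, update the value, and return the new value.
fn increment(counter: &Mutex<i32>) -> i32 {
    let mut guard = counter.lock().unwrap(); // resource acquired here
    *guard += 1;
    *guard
} // guard is dropped here, which releases the lock

fn main() {
    let counter = Mutex::new(0);
    println!("counter = {}", increment(&counter)); // counter = 1
    // The lock was released when increment returned, so it can be taken again.
    println!("lock is free again: {}", counter.try_lock().is_ok()); // true
}
```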
Smart pointers in Rust:
Rust provides three commonly used smart pointers for managing heap memory: Box<T>, Rc<T>, and Arc<T>. Box<T> is a simple pointer that owns a single heap-allocated value of type T. Rc<T> is a reference-counted pointer that lets multiple owners share a value within a single thread. Arc<T> is an atomically reference-counted pointer: it updates its count with atomic operations, so it can be shared safely across threads.
Here's an example of heap memory usage in Rust using Box<T>:
fn main() {
    let x = Box::new(5); // allocate an integer on the heap
    println!("x = {}", x);
} // x is deallocated when main() returns
In this example, the variable x is allocated on the heap using Box::new(). When main() returns, the memory used by x is deallocated automatically.
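For the shared-ownership pointers Rc<T> and Arc<T>, a comparable sketch might look like the following. The helper names rc_owner_count and arc_sum_in_threads are hypothetical and exist only for this illustration:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Two Rc handles share one heap value; the count tracks the owners.
fn rc_owner_count() -> usize {
    let a = Rc::new(String::from("shared"));
    let b = Rc::clone(&a); // increments the reference count, no deep copy
    Rc::strong_count(&b) // both a and b are alive here
}

// Arc allows the same kind of sharing across threads.
fn arc_sum_in_threads() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<i32>())
        })
        .collect();
    // Each thread computes 6; add the two results together.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("owners = {}", rc_owner_count()); // owners = 2
    println!("sum = {}", arc_sum_in_threads()); // sum = 12
}
```

In both cases the heap value is freed when the last handle is dropped.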
For details about smart pointers, see the separate article "Smart Pointers in Rust".
Let's discuss heap allocation with String as an example:
When a String is created in Rust, a fixed-sized part of the data is allocated on the stack, and a dynamically sized part of the data is allocated on the heap.
The fixed-sized part of the data includes the pointer to the heap-allocated memory where the actual string data is stored, as well as the length and capacity of the string. These are stored on the stack as part of the String struct.
The dynamically sized part of the data is the actual string data, which can be of varying length depending on the string. This is allocated on the heap and pointed to by the pointer stored in the fixed-sized part of the data.
fn main() {
    let s = String::from("OmShree");
}
The String type is represented by a block of three fields. It contains a pointer ptr which points to the actual string data that is heap-allocated. The length of the string is stored in the len field and the allocated capacity of the string is stored in the capacity field. These fields are stored on the stack as part of the String struct.
        Stack                                  Heap
+---------------------+
|          s          |
|    (String type)    |
+---------------------+
|         ptr         |              +---------------+
|   (heap address)    |------------->| O|m|S|h|r|e|e |
+---------------------+              +---------------+
|       len = 7       |
+---------------------+
|    capacity = 7     |
+---------------------+
The code below demonstrates the above points:
fn main() {
    let s = String::from("Jai Shree Ram Jai Hanuman");
    // DON'T try this in PRODUCTION CODE! JFYI only.
    // String provides no guarantees about its layout, so this could lead to
    // undefined behavior, and the order of the three fields may differ
    // between compiler versions.
    unsafe {
        let (ptr, capacity, len): (usize, usize, usize) = std::mem::transmute(s);
        println!("ptr = {}, len = {}, capacity = {}", ptr, len, capacity);
    }
}
/*
Output (the address varies from run to run) =>
ptr = 94110558509632, len = 25, capacity = 25
*/
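In the above code, the std::mem::transmute function is used to reinterpret a String as a tuple of three usize values. Here's what each of the values represents:
ptr: the address of the heap buffer that holds the string data.
len: the number of bytes of the buffer currently in use (25 here).
capacity: the number of bytes allocated for the buffer (25 here).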
More about std::mem::transmute
The std::mem::transmute function in Rust is a low-level transmutation function that allows you to convert a value of one type to another type without changing its memory representation. It is a powerful and potentially dangerous function that should be used with caution.
The transmute function is used to reinterpret the memory of the String as a tuple of usize values, which can then be assigned to the tuple on the left-hand side of the assignment.
It's worth noting that this code is using unsafe Rust, as the transmute function can be dangerous if used improperly. In this case, it's being used to extract low-level details about the string's memory representation, which is not typically necessary in Rust code.
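The signature of std::mem::transmute is essentially as follows (recent versions of the standard library name the type parameters Src and Dst and also mark the function const):
pub unsafe fn transmute<T, U>(val: T) -> U
The transmute function takes a value val of type T and reinterprets its bits as a value of type U. The compiler rejects the call unless T and U have the same size; it is the caller's responsibility to ensure that those bits form a valid value of type U.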
It is important to note that std::mem::transmute is an unsafe function, which means that it bypasses Rust's usual safety guarantees. You should only use it when you are absolutely certain about the memory representation of the types involved and understand the potential risks.
The use of std::mem::transmute requires explicit unsafe blocks, and it is our responsibility to ensure that the transmutation is safe and well-defined. Incorrect usage of transmute can lead to undefined behavior, memory corruption, or other serious issues.
It is generally recommended to use safer alternatives provided by Rust's type system and standard library whenever possible. Only resort to std::mem::transmute when you have a deep understanding of the types involved and have carefully considered the potential risks and consequences.
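As one such safer alternative, the three parts of a String can be read through its public accessor methods as_ptr(), len(), and capacity(). The sketch below wraps them in a hypothetical helper named string_parts; no unsafe code is involved:

```rust
// Read the three parts of a String using only safe, public methods.
fn string_parts(s: &String) -> (usize, usize, usize) {
    (s.as_ptr() as usize, s.len(), s.capacity())
}

fn main() {
    // Reserve more space than needed so that len and capacity differ.
    let mut s = String::with_capacity(32);
    s.push_str("Jai Shree Ram Jai Hanuman");

    let (ptr, len, capacity) = string_parts(&s);
    // The address varies from run to run; len is 25 and capacity is at least 32.
    println!("ptr = {:#x}, len = {}, capacity = {}", ptr, len, capacity);
}
```

Unlike the transmute version, this does not depend on the internal field order of String and does not consume the string.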
----------------------------------------------------------------x
3. Static and Global Data
Data Segment:
Static and global data are stored in the data and BSS segments of memory. Rust provides the static keyword for declaring global variables and the const keyword for declaring compile-time constants.
Static data has a fixed address, is laid out at compile time, and remains in memory for the duration of the program. Since static data is read-only by default, the compiler can optimize access to it. Immutable statics can also be shared between multiple threads (their type must implement Sync), and statics that hold atomics or locks are a useful tool for inter-thread communication and synchronization. For values that must be computed at runtime on first use, the lazy_static crate provides the lazy_static! macro, and newer versions of the standard library offer std::sync::OnceLock and std::sync::LazyLock.
These variables are stored in a separate section of the program's memory called the "data segment", and they are loaded into memory when the program is first started.
Static memory is read-only by default, which means that a static variable keeps the value it was initialized with and cannot be changed during the program's execution unless it is declared static mut or uses interior mutability.
A constant, by contrast, is always read-only: it has no fixed memory location, and the compiler substitutes its value at each place it is used.
Here's an example of declaring and initializing a static variable in Rust:
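static mut COUNT: i32 = 0;

fn main() {
    // Accessing a mutable static is unsafe: the compiler cannot rule out
    // data races if several threads touch it.
    unsafe {
        COUNT += 1;
        // Copy the value out first so that no reference to the mutable
        // static is created (newer compilers warn about such references).
        let current = COUNT;
        println!("Count: {}", current);
    }
}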
In this example, we declare a static variable COUNT of type i32. We use the static mut keywords to indicate that the variable is mutable, but we also have to use the unsafe keyword because modifying mutable static variables can be unsafe in Rust.
In the main function, we increment the value of COUNT and print it out. Because COUNT is a static variable, its value will persist across multiple function calls.
It's important to note that Rust's ownership and borrowing rules still apply to static variables and constants. For example, if we have a function that takes a reference to a static variable, the lifetime of the reference will be 'static.
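A counter like this can also be written without unsafe by using an atomic type from the standard library. The sketch below is one possible alternative, not the approach used above; the names bump and greeting are hypothetical. It also shows a reference with the 'static lifetime:

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// An immutable static that holds an atomic: safe to update from any thread.
static COUNT: AtomicI32 = AtomicI32::new(0);
static GREETING: &str = "hello";

// Increment the global counter and return the new value.
fn bump() -> i32 {
    COUNT.fetch_add(1, Ordering::SeqCst) + 1
}

// A reference to static data is valid for the whole program: 'static.
fn greeting() -> &'static str {
    GREETING
}

fn main() {
    bump();
    println!("Count: {}", bump()); // Count: 2
    println!("{}", greeting()); // hello
}
```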
----------------------------------------------------------------x
4. Code Segment and Read-Only Data:
The code (text) segment holds the program's compiled machine instructions. Alongside it, the .rodata section is a read-only memory segment in the executable file that contains constant data, such as string literals and other read-only values. In Rust, the .rodata section is used to store string literals and other constant data.
String literals:
String literals are immutable and are stored in the read-only data segment of the executable file. When a Rust program is compiled, the bytes of each string literal are placed in the .rodata section of the executable. When the program is loaded into memory, this section is mapped into pages that are marked read-only, so the literals can be read by the program but cannot be modified at runtime.
In Rust, string literals have a static lifetime ('static) which means that they are valid for the entire lifetime of the program.
For example, consider the following Rust code:
fn main() {
    let hello = "Hello, World!";
    println!("{}", hello);
}
In this example, the string literal "Hello, World!" is stored in the .rodata section of the executable file. When the program runs, the string literal is loaded into memory and the hello variable points to its location in memory.
The .rodata section is read-only, which means that the data stored in this section cannot be modified at runtime. This is because the memory pages containing the .rodata section are marked as read-only, and an attempt to write to this memory (possible only through unsafe code) typically results in a segmentation fault.
Constants are declared using the const keyword, and their values are determined at compile time. A constant is not a variable: the compiler substitutes its value wherever it is used, so the value may be embedded directly in the instructions or, for larger values, placed in the .rodata section. Constants are useful for defining values that are used throughout a program and that should not be changed.
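The short sketch below puts a constant and a string literal side by side. MAX_SLOTS, make_slots, and greeting are hypothetical names used only for this illustration:

```rust
// A compile-time constant: usable wherever a constant expression is required.
const MAX_SLOTS: usize = 4;

// A string literal has the 'static lifetime, so it can be returned freely.
fn greeting() -> &'static str {
    "Hello, World!"
}

// The constant determines the array length at compile time.
fn make_slots() -> [u8; MAX_SLOTS] {
    [0; MAX_SLOTS]
}

fn main() {
    println!("{}", greeting()); // Hello, World!
    println!("slots = {}", make_slots().len()); // slots = 4
}
```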
Ownership and Borrowing
This article discusses only ownership:
One of the most important concepts in Rust's memory management is ownership. Rust has a unique system for managing memory, which relies on the ownership of data. In Rust, memory safety is enforced using a system of ownership and borrowing. Every value in Rust has an owner, which is responsible for allocating and freeing the memory used by that value. When a value is passed to a function or assigned to a variable, ownership of that value is transferred to the new owner.
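The ownership rules are:
Each value in Rust has an owner.
There can be only one owner at a time.
When the owner goes out of scope, the value is dropped.
The borrowing rules are:
At any given time, a value can have either one mutable reference or any number of immutable references.
References must always be valid; they cannot outlive the value they refer to.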
No ownership transfer happens for primitive types
When you assign a primitive type, such as an integer or a boolean, to another variable, Rust performs a copy of the value rather than transferring ownership. This is because primitive types implement the Copy trait: they have a small, fixed size and own no heap memory, so their bits can simply be duplicated. The original variable therefore remains valid, and each variable owns its own copy of the value.
Here's an example:
let x = 5;
let y = x; // This performs a copy of the value, not a transfer of ownership
println!("x = {}, y = {}", x, y); // Prints "x = 5, y = 5"
In this example, x and y both contain the value 5. Each variable owns its own copy, so x remains usable after the assignment.
How ownership can be transferred:
1. Move: Ownership of a value is moved when it is assigned to another variable, passed as an argument to a function, or returned from a function. After the move, the original owner no longer has access to the value.
fn main() {
    let s1: String = String::from("JaiShreeRam");
    let s2: String = s1; // s1 is moved into s2 and can no longer be used
}
Diagram:
Before the move:
Stack                        Heap
+-------------------+        +---------------------------------------+
| s1: ptr, len, cap | -----> | "JaiShreeRam" (allocated on the heap) |
+-------------------+        +---------------------------------------+
After the move to s2 (s1 is inaccessible):
Stack                        Heap
+-------------------+
| s1: (moved)       |
+-------------------+        +---------------------------------------+
| s2: ptr, len, cap | -----> | "JaiShreeRam" (allocated on the heap) |
+-------------------+        +---------------------------------------+
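The picture can be checked with a short sketch (move_keeps_buffer is a hypothetical helper). It records the heap address before the move and compares it with the address seen through the new owner, showing that a move copies only the stack part of the String:

```rust
// Returns true if the heap buffer stays at the same address across a move.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("JaiShreeRam");
    let before = s1.as_ptr(); // address of the heap buffer

    let s2 = s1; // move: ptr, len and capacity are copied to s2
    // println!("{}", s1); // compile error: borrow of moved value `s1`

    before == s2.as_ptr()
}

fn main() {
    println!("same heap buffer after move: {}", move_keeps_buffer()); // true
}
```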
2. Clone: If a type implements the Clone trait, calling the clone() method creates a new, independent value with the same data. The new value has its own owner, and the original owner keeps its value.
#[derive(Copy, Clone, Debug)]
struct Point(i32, i32);
fn main() {
    let p1 = Point(99, 101);
    let p2 = p1.clone();
    println!("p1: {:?}", p1);
    println!("p2: {:?}", p2);
}
/*
p1: Point(99, 101)
p2: Point(99, 101)
*/
Second example:
//#[derive(Copy, Clone, Debug)] // compile error -> String is not a Copy type.
#[derive(Clone, Debug)]
struct Person(String, i32);
fn main() {
    let p1 = Person(String::from("Amit"), 101);
    let p2 = p1.clone();
    println!("p1: {:?}", p1);
    println!("p2: {:?}", p2);
}
/*
p1: Person("Amit", 101)
p2: Person("Amit", 101)
*/
3. Copy: If a type implements the Copy trait, its value is automatically copied (instead of moved) when it is assigned to a new variable or passed as an argument to a function. The original owner still has access to the value.
Primitive types can be copied by default without any extra code.
fn main() {
    let x = 101;
    let y = x;
    println!("x: {}", x); // 101
    println!("y: {}", y); // 101
}
/*
x: 101
y: 101
*/
Custom types:
#[derive(Copy, Clone, Debug)]
struct Point(i32, i32);
fn main() {
    let p1 = Point(99, 101);
    let p2 = p1;
    println!("p1: {:?}", p1);
    println!("p2: {:?}", p2);
}
Difference between copy and clone:
The Copy trait is used for values that can be duplicated by simply copying their bits from one location in memory to another. The standard library implements Copy for primitive types like integers, floats, bool, and char. A struct or enum is never Copy automatically: it has to opt in, usually with #[derive(Copy, Clone)], and that is allowed only when all of its fields are Copy and the type does not implement Drop. Types that own heap memory, such as String and Vec, cannot be Copy. When a type implements Copy, assigning or passing a value leaves the original usable.
On the other hand, the Clone trait is used for values that need to be duplicated in a more complex way, such as when the value owns heap-allocated memory. Clone is a standard library trait whose implementation can be derived or written by hand, so a type can provide its own logic for how the duplication is performed. When a type implements Clone, calling clone() creates a new value that is an independent copy of the original.
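The contrast can be sketched in a few lines. Point, Buffer, copy_is_implicit, and clone_is_deep are hypothetical names for this illustration. The Copy type is duplicated by plain assignment, while the clone of the heap-owning type gets a buffer of its own:

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
struct Point(i32, i32);

#[derive(Clone, Debug, PartialEq)]
struct Buffer {
    data: Vec<u8>,
}

// Copy: plain assignment duplicates the bits, and p1 stays usable.
fn copy_is_implicit() -> bool {
    let p1 = Point(1, 2);
    let p2 = p1;
    p1 == p2
}

// Clone: an explicit call that allocates a separate heap buffer.
fn clone_is_deep() -> bool {
    let a = Buffer { data: vec![1, 2, 3] };
    let b = a.clone();
    // Equal contents, but stored at different heap addresses.
    a == b && a.data.as_ptr() != b.data.as_ptr()
}

fn main() {
    println!("copy keeps original usable: {}", copy_is_implicit()); // true
    println!("clone has its own buffer: {}", clone_is_deep()); // true
}
```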
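Below is a list of some of the key differences:
Copy happens implicitly on assignment or when a value is passed; Clone happens only through an explicit call to clone().
Copy is always a plain bit-for-bit copy and cannot be customized; Clone can run arbitrary code, such as allocating a new heap buffer.
Every Copy type must also implement Clone, but not every Clone type can be Copy.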
4. Assigning the value of one variable to another variable in Rust normally involves ownership transfer. If the variable is of a primitive data type, such as i32 or bool, the value is simply copied, and ownership is not moved from the source variable to the destination variable.
5. If the variable is of a non-Copy type, such as a String or a vector, the assignment moves ownership: only the small stack part (pointer, length, capacity) is copied to the destination, the heap-allocated data stays where it is, and the source variable becomes invalid. To keep the source usable, duplicate the value with the clone method; to take a value out of a mutable location while leaving another value in its place, use the std::mem::replace function.
6. Passing a value to a function in Rust also involves ownership transfer. By default, the ownership of the argument is moved to the function, and the original variable becomes invalid in the calling scope. To borrow the variable instead of transferring ownership, the & operator can be used to create a reference.
fn say_hello(name: String) {
    println!("Hello {name}")
}
fn main() {
    let name = String::from("Amit");
    say_hello(name); // ownership of name moves into say_hello
    // say_hello(name); // compile error: use of moved value `name`
}
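For comparison, here is a sketch of the borrowing variant mentioned in point 6. The function greet is a hypothetical counterpart to say_hello that takes a reference, so the caller keeps ownership and can call it repeatedly:

```rust
// Takes a borrowed string slice instead of an owned String.
fn greet(name: &str) -> String {
    format!("Hello {name}")
}

fn main() {
    let name = String::from("Amit");
    println!("{}", greet(&name)); // Hello Amit
    println!("{}", greet(&name)); // still valid: name was only borrowed
}
```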
7. Returning a value from a function in Rust also involves ownership transfer. The ownership of the return value is moved to the calling scope, so the value outlives the function's stack frame. Heap-backed values such as String, Vec, or Box can be returned directly: only the small handle is moved, and the heap data is not copied. A function that takes ownership of an argument can also hand it back to the caller, for example by returning a tuple, which allows multiple values to be returned from the function.
When a value is passed to a function, assigned to a new variable, or returned from a function, ownership of the value is transferred to the new owner. This ensures that there is always exactly one owner of a value, preventing issues such as double-free or use-after-free bugs.
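Both ways of returning ownership can be sketched as follows. The names make_greeting and length_and_back are hypothetical and used only for this illustration:

```rust
// Ownership of the new String moves to the caller.
fn make_greeting() -> String {
    String::from("hello")
}

// Takes ownership of s and gives it back together with its length.
fn length_and_back(s: String) -> (String, usize) {
    let n = s.len();
    (s, n)
}

fn main() {
    let greeting = make_greeting(); // main now owns the String
    let (greeting, len) = length_and_back(greeting); // moved in, moved back out
    println!("{} has {} bytes", greeting, len); // hello has 5 bytes
}
```

In everyday code a reference parameter is usually simpler than moving a value in and out, but the tuple form shows that ownership always ends up with exactly one owner.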
Borrowing and lifetimes are discussed in a separate article.
Destructor in Rust:
In Rust, a destructor is a special function that is automatically called when an object goes out of scope. The purpose of the destructor is to free any resources that the object owns, such as memory allocated on the heap or file handles.
In Rust, destructors are implemented using the Drop trait. Any type that implements this trait can define its own destructor logic. The Drop trait contains a single method, drop(), which is called when the value is dropped.
Drop is a special trait: the compiler inserts the call to drop() automatically at the point where an owned value goes out of scope, which allows cleanup operations to be performed. The method cannot be called directly; to drop a value early, pass it to the std::mem::drop function.
For example, let's say we have a custom type MyType that needs to perform some cleanup when it is dropped. We can implement the Drop trait for MyType like this:
struct MyType {
    // some fields here
}
impl Drop for MyType {
    fn drop(&mut self) {
        // cleanup code here
    }
}
In this example, the drop() method will be called when an instance of MyType is dropped. The method can perform any necessary cleanup logic, such as freeing allocated memory or closing file handles.
By using the Drop trait, Rust provides a way to ensure that resources are properly cleaned up without relying on garbage collection or manual memory management. The compiler automatically generates code to call the drop() method for every object when it goes out of scope, ensuring that the cleanup logic is always executed.
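To see when drop() actually runs, the following sketch records each call in a shared log. Noisy and drop_order are hypothetical names used only for this illustration. It shows that a value can be dropped early with the standard drop function and that the remaining locals are dropped in reverse order of declaration:

```rust
use std::cell::RefCell;

// A type that records its name in a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the names in the order in which the values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        let c = Noisy { name: "c", log: &log };
        drop(c); // explicit early drop via std::mem::drop
    } // end of scope: _b is dropped first, then _a
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // ["c", "b", "a"]
}
```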
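Advantages:
Cleanup is automatic and deterministic: the compiler inserts the drop() call at the point where the value goes out of scope, so a resource cannot be forgotten on an early return or while a panic unwinds the stack.
Disadvantages:
There are cases where implementing the Drop trait can introduce overhead. For example, if the Drop implementation involves a large amount of computation or IO operations, this could slow down the program. Additionally, if the Drop implementation panics while the thread is already unwinding from another panic, the program aborts, which may not be desirable.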
Therefore, it's important to carefully consider whether implementing the Drop trait is necessary and to ensure that any custom behavior performed in the Drop implementation is efficient and safe. It's generally recommended to use Drop only when there are no other suitable alternatives for ensuring resource cleanup, and to keep the implementation as simple and efficient as possible.
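Suitable scenarios to implement the Drop trait include:
Freeing memory that was allocated manually, for example through std::alloc.
Closing file handles or other operating system resources owned by the type.
Releasing locks when a guard object goes out of scope.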
It is important to note that the Drop trait should be used sparingly, and only when necessary. Overuse of the Drop trait can lead to complex and difficult-to-understand code, and can also introduce performance overhead.
Thanks for reading till the end.
Let's learn together!!