Storage in Substrate - Part 1

Amit Nadiger

Polyglot (Rust, C++ 11/14/17/20, C, Kotlin, Java), Android TV, CAS, Blockchain, Polkadot, UTXO, Substrate, Wasm, Proxy-Wasm, DVB, STB, Linux, Engineering management.

Published: September 16, 2023

I am writing a three-part series on Substrate storage. Part 1 covers best practices for deciding what to include in and exclude from blockchain storage, along with transactional storage. Part 2 covers the storage data structures StorageValue, StorageMap, StorageDoubleMap, and StorageNMap. Part 3 examines the hashing algorithms used with storage maps, both cryptographic options such as the Blake2 family and non-cryptographic alternatives, and when to use which.

When developing the runtime (pallets or modules) that runs within the blockchain, one crucial aspect is deciding what information to store and how to do it efficiently.

Storing data, like saving and reading information, can be resource-intensive and slow down the blockchain. So, it's essential to be careful about what and how much you store.

Using hashes in blockchain governance helps save storage space and keeps the blockchain efficient. It's like storing a digital fingerprint of data rather than the entire data itself. This approach is particularly useful when dealing with large pieces of code or files that should only be brought onto the blockchain when necessary.

In blockchain development, it's important to be efficient with storage space. You should avoid storing temporary data that won't be needed if an operation fails. Instead, only store it when it's certain it will be used. Additionally, creating bounds or limits on the amount of data that can be stored for certain actions helps control and optimize the use of storage space on the blockchain.

Deciding What to Store on the Blockchain:

The key idea here is to be selective and efficient in what we choose to store in the blockchain's runtime. Focus on critical information that's essential for the blockchain's operation and consensus, and avoid storing data that's temporary, large, or unnecessary for maintaining the blockchain's integrity. This helps keep the blockchain lean, fast, and cost-effective.

Using Hashed Data to Reduce Storage:

When you're dealing with a blockchain system, it's crucial to be efficient with the storage of data. Storing data on a blockchain can be costly and slow down the network. One way to be efficient is by using a technique called hashing.

What is Hashing?

Hashing is like a digital fingerprint for data. A hash function takes any amount of data and turns it into a fixed-size value (the hash). With a cryptographic hash function, it is practically infeasible to find two different inputs that produce the same hash, so the hash can stand in for the data. Importantly, the size of the hash is always the same, regardless of how much data you put in.
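To make the fixed-size property concrete, here is a minimal sketch in plain Rust. It assumes the standard library's DefaultHasher as a stand-in; that hasher is not cryptographic and is used here only because it needs no extra crates, whereas a Substrate runtime would use a hasher such as Blake2. The function name fingerprint is made up for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Returns a fixed-size (64-bit) digest of `data`.
/// Assumption: DefaultHasher is a non-cryptographic stand-in for illustration only.
fn fingerprint(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}
```

Calling fingerprint on a two-byte input and on a one-megabyte input both yield a u64, so the amount of space needed to store the result does not grow with the input.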

Example in Governance:

In blockchain governance, like in Substrate's Democracy pallet, network participants often need to vote on things. Instead of voting on the entire proposal or decision (which could be very long), they vote on the hash of that proposal.

Why Hash the Proposal?

Hashing the proposal makes sense because hashes are always a fixed size, no matter how big the proposal is. So, it's more efficient to store and manage on the blockchain.

Runtime Upgrades Example:

Consider a scenario where a proposal involves a large piece of code (Wasm blob) that needs to be executed during a runtime upgrade. Storing this entire code on the blockchain would be impractical and costly. Instead, the proposal can be tied to the hash of the code.

The Benefit:

When you hash the proposal, you don't have to store all the details of the proposal immediately. This saves space on the blockchain. Only after the proposal is approved, and it's necessary to execute the code, will the full details be brought on-chain.
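The flow described above (vote on the hash, bring the full payload on-chain only at execution time) can be sketched in plain Rust. This is not the Democracy pallet's API: the Proposals type, its methods, and the use of DefaultHasher are all assumptions made for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Non-cryptographic stand-in digest (assumption; a real chain would use e.g. Blake2).
fn digest(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory stand-in for on-chain proposal storage.
#[derive(Default)]
struct Proposals {
    /// While voting is open, only the fixed-size hash and a tally are stored.
    ayes: HashMap<u64, u32>,
}

impl Proposals {
    /// Registers a proposal by its hash only.
    fn propose(&mut self, proposal_hash: u64) {
        self.ayes.entry(proposal_hash).or_insert(0);
    }

    /// Records one approving vote for a known proposal hash.
    fn vote_aye(&mut self, proposal_hash: u64) -> Result<(), &'static str> {
        let tally = self.ayes.get_mut(&proposal_hash).ok_or("unknown proposal")?;
        *tally += 1;
        Ok(())
    }

    /// The full payload is supplied only now and must hash to an approved proposal.
    /// Returns the payload length as a placeholder for "executing" it.
    fn execute(&mut self, payload: &[u8], threshold: u32) -> Result<usize, &'static str> {
        let hash = digest(payload);
        let tally = *self.ayes.get(&hash).ok_or("payload does not match any proposal")?;
        if tally < threshold {
            return Err("not enough votes");
        }
        self.ayes.remove(&hash);
        Ok(payload.len())
    }
}
```

The design point is that a 4 KB (or 4 MB) payload costs the same eight bytes of proposal storage during the voting period, and a tampered payload is rejected because its hash no longer matches.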

Minimizing On-Chain Data Using IPFS:

Another way to use hashes is with IPFS (InterPlanetary File System). Instead of storing large files directly on the blockchain, you store only the file's IPFS content identifier (CID), which is derived from a hash of the file's content and is also what IPFS uses to locate the file. This identifier is small and manageable on the blockchain.

Avoid Storing Transient Data:

In blockchain systems like Substrate, storage is valuable but limited. Storing data can consume a lot of space and slow down the network. It's important to be mindful of what data you store, especially when it's only needed temporarily during a specific action.

Example:

Imagine you have an operation that involves multiple steps, and each step needs some temporary data. Instead of storing this temporary data permanently in the blockchain's storage, you can avoid doing so. This is because, if the operation fails at some point, you won't need that temporary data anymore.

Multi-Signature Example: For instance, consider a multi-signature feature where several parties need to sign a transaction. You may want to keep track of who has signed it. However, you shouldn't store each signer's information in the blockchain storage right away. Instead, you should only record this information after all the conditions for signing are met. Until then, it's considered temporary and doesn't need to be stored.

Create Bounds for storage:

Creating bounds means setting limits on how much storage space can be used for specific data. It's like saying, "You can store this much data, but not more." This is a powerful way to control the use of storage space in the blockchain, ensuring it doesn't get overloaded.

Example: Let's go back to the multi-signature feature. If you allow an unlimited number of signatories to be tracked, it could potentially lead to excessive data storage. To prevent this, you create a limit, or a bound, on how many signatories can be tracked for a specific operation. Users are required to set this limit as a precondition before the data is stored.
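A bound like this can be sketched in plain Rust with a const generic limit. The BoundedSigners type below is hypothetical and only models the idea; in an actual pallet the usual tool for this is the BoundedVec type, whose maximum length is part of the type. Account IDs are simplified to u32 here.

```rust
/// Hypothetical bounded signer list: rejects pushes beyond MAX entries.
#[derive(Debug, Default)]
struct BoundedSigners<const MAX: usize> {
    /// Account IDs are simplified to u32 for this sketch.
    signers: Vec<u32>,
}

impl<const MAX: usize> BoundedSigners<MAX> {
    /// Adds a signer unless it is a duplicate or the bound is already reached.
    fn try_push(&mut self, who: u32) -> Result<(), &'static str> {
        if self.signers.contains(&who) {
            return Err("already signed");
        }
        if self.signers.len() >= MAX {
            return Err("too many signatories");
        }
        self.signers.push(who);
        Ok(())
    }

    /// Number of signers currently tracked.
    fn len(&self) -> usize {
        self.signers.len()
    }
}
```

Because the limit is fixed up front, the worst-case storage used by one multi-signature operation is known before any data is written.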

Using Hashed Data in Transactional Storage:

In blockchain systems, storing data efficiently is crucial to keep the network fast and cost-effective. One way to do this is by using a technique called hashing, which creates a unique fingerprint for data.

Transactional Storage Basics:

Substrate's storage architecture is designed with efficiency, data integrity, and security in mind. The runtime storage layer is where application-specific data resides, while the overlay change set manages changes before they are committed. The Merkle Trie provides an efficient structure for data organization and validation, and the key-value database offers permanent storage for blockchain data. Together, these layers enable Substrate to efficiently manage and secure data on the blockchain.

In a blockchain's runtime, there's a key-value database that keeps track of all the data (like balances, smart contract state, etc.).
When you make changes to the blockchain, these changes are first recorded in an in-memory "transactional storage" layer.
This transactional storage layer is like a temporary workspace where changes are stored until they are ready to be permanently added to the main database.

Imagine Substrate's storage layers like a stack, with each layer built on top of the previous one:

.--------------------------.
|   Runtime Storage        |
|   (Application Data)     |
| Features:                |
| - Utilizes SP-IO         |
| - Easy APIs for          |
|   data management        |
`--------------------------'

.--------------------------.
|   Overlay Change Set     |
|   (Temporary Workspace)  |
| Features:                |
| - Stages changes         |
| - Submitted once per     |
|   block                  |
| - Manages two types of   |
|   changes:               |
|   - Prospective          |
|   - Committed            |
`--------------------------'

.--------------------------.
|   Merkle Trie (Patricia) |
|   (Efficient Data Store) |
| Features:                |
| - Efficient data         |
|   organization           |
| - Used for validating    |
|   transactions and       |
|   blocks                 |
`--------------------------'

.--------------------------.
|   Key-Value Database     |
|   (On-disk Storage)      |
| Features:                |
| - Stores data on disk    |
| - Permanent storage      |
|   of blockchain data     |
`--------------------------'

1. Runtime Storage:

This is the top layer of storage in Substrate, where application-specific data is stored. It's where blockchain modules and smart contracts save their information.

Utilizes SP-IO: sp-io (Substrate Primitives I/O) is the crate that exposes host functions to the runtime, including the functions used to read and write storage. The higher-level storage APIs are built on top of it.

Easy APIs for Data Management: Substrate provides straightforward APIs for managing data, making it easier for developers to interact with and manipulate storage.

2. Overlay Change Set:

The Overlay Change Set is an intermediate layer that acts as a temporary workspace for tracking changes before they are finalized and committed to the Merkle Trie and, ultimately, the database. It plays a crucial role in managing transactions and ensuring data consistency.

Stages Changes: This layer stages changes to data, allowing them to be reviewed and verified before they become permanent.

Submitted Once per Block: Changes are submitted and finalized once per block, ensuring that updates to the blockchain state are batched and executed in an organized manner.

Two Types of Changes:

Prospective Changes: These are potential changes that are proposed but not yet committed. They are tentative and can be discarded if needed.

Committed Changes: These are changes that have been accepted into the overlay, for example because the operation that produced them completed successfully. They are written to the trie and the database when the block's changes are submitted.

3. Merkle Trie (Patricia):

Substrate uses a Merkle Patricia trie (a base-16 modified variant) to efficiently organize and store data on the blockchain. It's a critical component for validating transactions and blocks.

Efficient Data Organization: The Merkle Trie organizes data in a way that makes it efficient to prove the state of the blockchain at any given point without storing all data in a single structure.

Used for Validating Transactions and Blocks: The root hash of the trie summarizes the entire state and is included in each block header. Nodes can therefore check that they arrived at the same state after executing a block, whether the chain uses PoA or PoS consensus, which contributes to the blockchain's security and reliability.
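The idea that a single root hash commits to all of the underlying data can be illustrated with a much simpler structure than the one Substrate uses. The sketch below builds a plain binary Merkle tree, not a base-16 Patricia trie, and uses the non-cryptographic DefaultHasher as a stand-in; the function names are assumptions made for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes one leaf (stand-in hasher; not cryptographic).
fn hash_leaf(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hashes two child digests into their parent digest.
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (left, right).hash(&mut hasher);
    hasher.finish()
}

/// Root of a simple binary Merkle tree; an unpaired node is combined with itself.
/// Returns None for an empty input.
fn merkle_root(leaves: &[&[u8]]) -> Option<u64> {
    if leaves.is_empty() {
        return None;
    }
    let mut level: Vec<u64> = leaves.iter().map(|leaf| hash_leaf(leaf)).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| hash_pair(pair[0], *pair.last().unwrap()))
            .collect();
    }
    Some(level[0])
}
```

Changing any single leaf, for example one account's balance, changes the root, which is why comparing roots is enough to detect that two nodes disagree about the state.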

4. Key-Value Database (RocksDB or ParityDB):

The Key-Value Database is the bottom layer of Substrate's storage architecture. It's where the actual data is stored on disk, providing permanent storage for blockchain data.

Stores Data on Disk: This layer is responsible for persistently storing data on physical storage devices, ensuring that data remains accessible even when the system restarts.

Permanent Storage of Blockchain Data: It serves as the long-term storage solution for all blockchain data, including historical data, making it available for retrieval and validation.

Error Handling in Transactional Storage:

If, for some reason, there's an error that prevents a change from being successfully recorded, the data in the transactional storage layer is discarded. The main database remains unchanged to maintain consistency.
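The commit-or-discard behavior can be modeled in a few lines of plain Rust. This is only a conceptual sketch: the State type and its transact method are hypothetical, and the overlay here is a full copy of the data for simplicity, whereas a real overlay would track only the changed keys.

```rust
use std::collections::HashMap;

/// Hypothetical model: `db` stands in for the committed state.
struct State {
    db: HashMap<String, u64>,
}

impl State {
    /// Runs `f` against a scratch overlay.
    /// On Ok the overlay replaces the committed state; on Err it is dropped.
    fn transact<F>(&mut self, f: F) -> Result<(), String>
    where
        F: FnOnce(&mut HashMap<String, u64>) -> Result<(), String>,
    {
        // Simplification: copy everything. A real overlay records only the diffs.
        let mut overlay = self.db.clone();
        f(&mut overlay)?; // early return on error leaves `self.db` untouched
        self.db = overlay;
        Ok(())
    }
}
```

If the closure writes a new balance and then returns an error, the committed map still holds the old balance, which mirrors the consistency guarantee described above.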

Extending Transactional Storage:

We can extend this transactional storage system by using a #[transactional] macro. This allows us to create additional temporary storage layers in memory.
These extra storage layers give us control over which changes should eventually be added to the main database. We can choose to commit specific changes or not.

Nesting Transactional Layers:

We can nest these transactional storage layers. This means you can have multiple layers, one inside the other, up to a maximum limit (usually ten).
Each layer can decide whether to pass its changes to the layer below it. This nesting provides a way to control what data gets permanently stored.
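Nesting can be pictured as a stack of overlays. The following plain-Rust sketch is a conceptual model and not Substrate's implementation: LayeredStore and its methods are hypothetical, and MAX_LAYERS simply reuses the limit of ten mentioned above as an assumption.

```rust
use std::collections::HashMap;

/// Assumption: nesting limit taken from the text above.
const MAX_LAYERS: usize = 10;

/// Hypothetical model of committed state plus a stack of open layers.
#[derive(Default)]
struct LayeredStore {
    committed: HashMap<&'static str, u32>,
    layers: Vec<HashMap<&'static str, u32>>,
}

impl LayeredStore {
    /// Opens a new nested layer, up to the limit.
    fn begin(&mut self) -> Result<(), &'static str> {
        if self.layers.len() >= MAX_LAYERS {
            return Err("too many nested layers");
        }
        self.layers.push(HashMap::new());
        Ok(())
    }

    /// Writes go to the innermost open layer, or straight to committed state if none is open.
    fn set(&mut self, key: &'static str, value: u32) {
        match self.layers.last_mut() {
            Some(top) => {
                top.insert(key, value);
            }
            None => {
                self.committed.insert(key, value);
            }
        }
    }

    /// Reads see the innermost layer that holds the key, then fall back to committed state.
    fn get(&self, key: &str) -> Option<u32> {
        self.layers
            .iter()
            .rev()
            .find_map(|layer| layer.get(key))
            .or_else(|| self.committed.get(key))
            .copied()
    }

    /// Passes the innermost layer's changes to the layer below it (or to committed state).
    fn commit(&mut self) {
        if let Some(top) = self.layers.pop() {
            let target = match self.layers.last_mut() {
                Some(layer) => layer,
                None => &mut self.committed,
            };
            target.extend(top);
        }
    }

    /// Discards the innermost layer's changes.
    fn rollback(&mut self) {
        self.layers.pop();
    }
}
```

Rolling back an inner layer leaves the outer layer's pending writes intact, and those writes reach the committed state only when the outer layer itself commits.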

Dispatching a Transactional Storage Layer Call:

If you want to execute a function call within its own temporary storage layer, you can use the dispatch_with_transactional(call) function.
This creates a new transactional layer specifically for that call and allows you to handle the results within that context. It's like creating a mini workspace for a specific task.

How to Commit Changes Without a Transactional Storage Layer:

In a blockchain system, it's common to use a transactional storage layer to temporarily store changes before they are committed to the main storage. This approach helps ensure data consistency and prevents errors from affecting the main database. However, there might be cases where you want to commit changes directly to the main storage without using this temporary layer.

The #[without_transactional] macro allows blockchain developers to bypass the usual transactional storage layer and commit changes directly to the main storage overlay. However, it should be used with caution, as it can lead to data consistency issues if errors occur after storage modifications. Developers should carefully assess whether a function is safe for this approach based on the specific requirements of their blockchain application. The details follow.

Using the #[without_transactional] Macro:

To commit changes directly to the main storage overlay, you can use the #[without_transactional] macro in your code.
This macro helps you identify a function that is safe to execute without its own transactional storage layer. It essentially skips the temporary storage layer.

Example Function:

Here's an example function that uses the #[without_transactional] macro:

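/// This function is safe to execute without an additional transactional storage layer.
#[without_transactional]
fn set_value(x: u32) -> DispatchResult {
    // Validate first; nothing has been written to storage yet if this fails.
    Self::check_value(x)?;
    // Write last, so no error can occur after the storage change.
    MyStorage::set(x);
    Ok(())
}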

In this function, the #[without_transactional] macro is applied, indicating that it's safe to directly modify the main storage.

Caution with #[without_transactional]:

While using #[without_transactional] allows you to skip the temporary storage layer, it comes with a caution. Any changes made to storage within a function using this macro will directly affect the main in-memory storage overlay.
If an error occurs after these changes have been made, those changes will persist. This could potentially leave your database in an inconsistent or unexpected state.
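The risk is easy to reproduce in plain Rust, without any Substrate types. In this sketch, Counter is a hypothetical stand-in for a storage item that has no rollback layer around it, and LIMIT is an arbitrary validation rule chosen for the example.

```rust
/// Hypothetical stand-in for a storage item with no rollback layer around it.
struct Counter {
    value: u32,
}

/// Arbitrary validation rule for this example.
const LIMIT: u32 = 100;

/// Risky ordering: storage is written before the value is validated.
fn set_then_check(store: &mut Counter, x: u32) -> Result<(), &'static str> {
    store.value = x;
    if x > LIMIT {
        return Err("value too large"); // the write above has already happened
    }
    Ok(())
}

/// Safe ordering: verify first, write last.
fn check_then_set(store: &mut Counter, x: u32) -> Result<(), &'static str> {
    if x > LIMIT {
        return Err("value too large"); // nothing has been written yet
    }
    store.value = x;
    Ok(())
}
```

Both functions return the same error for an invalid input, but only the second leaves the stored value unchanged. A function is a reasonable candidate for skipping the extra layer only if it follows the second pattern.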

How to Access Runtime Storage in Substrate:

In Substrate, the blockchain state is stored using a key-value database. Developers interact with this storage using storage abstractions provided by Substrate. These abstractions simplify the process of reading and writing data in the underlying database.

Substrate provides a structured and organized way to work with blockchain state data through the FRAME Storage module. We can choose from various storage structures to efficiently store and manage data, tailoring the choice to the specific requirements of our blockchain application. These storage items become part of the blockchain's state and play a crucial role in the functionality and integrity of the blockchain.

FRAME Storage Module:

The FRAME Storage module is a part of the Substrate framework that simplifies how developers access and manipulate the blockchain's storage.
It provides a structured way to interact with storage data, making it easier to manage and maintain the state of the blockchain.

Types of Storage Structures:

StorageValue: Stores a single piece of data, such as a number (e.g., a u64 integer). It's suitable for individual values like a balance or a counter.
StorageMap: Stores key-value pairs, where each key is associated with a specific value. For example, it can map account addresses to their corresponding balance values.
StorageDoubleMap: Extends StorageMap for mappings with two keys. This is useful for scenarios where you need to remove all entries with a common first key.
StorageNMap: A more generalized version of StorageMap that allows mappings with any arbitrary number of keys. It provides flexibility for handling complex data structures.
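To show why a two-key map is convenient when entries share a first key, here is a plain-Rust model built on BTreeMap. It is not the FRAME StorageDoubleMap API; the DoubleMap type, its method names, and the u32/u64 key and value types are assumptions made for this illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of a two-key map, e.g. (owner, item) -> amount.
#[derive(Default)]
struct DoubleMap {
    entries: BTreeMap<(u32, u32), u64>,
}

impl DoubleMap {
    /// Stores `value` under the key pair (k1, k2).
    fn insert(&mut self, k1: u32, k2: u32, value: u64) {
        self.entries.insert((k1, k2), value);
    }

    /// Looks up the value stored under (k1, k2), if any.
    fn get(&self, k1: u32, k2: u32) -> Option<u64> {
        self.entries.get(&(k1, k2)).copied()
    }

    /// Removes every entry whose first key is `k1` and returns how many were removed.
    fn remove_prefix(&mut self, k1: u32) -> usize {
        let before = self.entries.len();
        self.entries.retain(|(first, _), _| *first != k1);
        before - self.entries.len()
    }
}
```

With a single-key map, removing everything that belongs to one owner would require knowing every second key in advance. Grouping by first key makes that cleanup a single operation.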

Introducing New Storage Items:

As Substrate developers, we can include these storage structures in our pallets (modules) to introduce new storage items to the blockchain's state.
These storage items can represent various pieces of information, such as balances, account details, or any data relevant to the blockchain's functionality.

Choosing the Right Storage Structure:

The choice of which storage structure to implement depends on how we intend to use the information within our runtime logic.
For example, if we need to store a simple value like a token balance, we might use StorageValue. If we need to manage a mapping of accounts to balances, StorageMap would be appropriate.

Part 2 of this series covers these storage structures in detail:

StorageValue to store any single value, such as a u64.
StorageMap to store a single key to value mapping, such as a specific account key to a specific balance value.
StorageDoubleMap to store values in a storage map with two keys as an optimization to efficiently remove all entries that have a common first key.
StorageNMap to store values in a map with any arbitrary number of keys.

Reference: Runtime storage structures | Substrate Docs
