The Problem of Data Availability
Genesis IT Lab
A question that often arises when scaling blockchain technologies is:
When a new block is produced, how do nodes make sure that all of the data in the block has actually been published to the network?
This is known as the “data availability problem”. It matters because if a block's data is not fully published, any malicious transaction hidden within that block cannot be detected, which in turn puts the security of the whole network at risk.
We will examine the data availability problem and the areas it affects in detail.
Let's begin with how nodes work.
How Do Nodes Work?
Each block in a network contains two pieces:
- A block header: a small summary of the block, including a commitment (such as a Merkle root) to its transactions.
- The transaction data: the full list of transactions, which makes up most of the block's size.
Moving on, there are two types of nodes in a blockchain network:
- Full nodes: they download and verify every transaction in every block. This takes significant resources, but full nodes do not have to trust anyone.
- Light clients: they download only the block headers and assume the transactions behind them are valid. They are cheap to run, but they cannot verify transactions themselves.
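The split between header and transaction data, and between the two node types, can be sketched in a few lines of code. This is a minimal illustration with hypothetical names, using a plain hash of the body in place of a real Merkle root:

```python
import hashlib
import json

class Block:
    """A block split into its two pieces: a small header and the full transaction data."""
    def __init__(self, transactions):
        self.transactions = transactions               # the body: full transaction data
        body_bytes = json.dumps(transactions).encode()
        self.header = {                                # the header: tiny summary of the body
            "tx_root": hashlib.sha256(body_bytes).hexdigest(),
            "tx_count": len(transactions),
        }

def full_node_verify(block):
    """A full node downloads the whole block and re-checks the header commitment."""
    body_bytes = json.dumps(block.transactions).encode()
    return hashlib.sha256(body_bytes).hexdigest() == block.header["tx_root"]

def light_client_view(block):
    """A light client downloads only the header and trusts the transactions behind it."""
    return block.header

block = Block([{"from": "alice", "to": "bob", "amount": 5}])
assert full_node_verify(block)
```

The asymmetry is the whole point: the header stays tiny no matter how large the block grows, which is what makes light clients cheap to run.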
But there is still a way for light clients to indirectly verify the transactions in a block. Instead of checking transactions themselves, they can receive a fraud proof from a full node. A fraud proof is a small proof that a particular transaction is invalid.
Here is where the data availability problem comes into play. If a block's data is not fully published, a full node cannot tell whether its transactions are valid, so it cannot generate a fraud proof, and the honesty of the network can no longer be checked.
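The fraud-proof interaction can be sketched as follows. This is a toy model with an assumed validity rule (non-negative amounts) and a simplified header that commits to a list of per-transaction hashes; real systems use Merkle proofs so the header stays constant size:

```python
import hashlib
import json

def tx_hash(tx):
    return hashlib.sha256(json.dumps(tx, sort_keys=True).encode()).hexdigest()

def is_valid_tx(tx):
    # Toy validity rule for illustration: amounts must be non-negative.
    return tx["amount"] >= 0

def make_fraud_proof(block_txs):
    """Full node: scan the full block and, if a transaction breaks the rules,
    return a small proof (index + transaction) instead of the whole block."""
    for i, tx in enumerate(block_txs):
        if not is_valid_tx(tx):
            return {"index": i, "tx": tx}
    return None  # no fraud found

def light_client_check(header_tx_hashes, proof):
    """Light client: verify the single proven transaction against the header
    commitment, without downloading the rest of the block."""
    if proof is None:
        return False
    committed = header_tx_hashes[proof["index"]] == tx_hash(proof["tx"])
    return committed and not is_valid_tx(proof["tx"])

txs = [{"amount": 5}, {"amount": -3}]      # the second transaction is invalid
header = [tx_hash(tx) for tx in txs]       # commitment published in the header
proof = make_fraud_proof(txs)
assert light_client_check(header, proof)   # light client can now reject the block
```

Note the dependency this creates: `make_fraud_proof` needs the full transaction list. If the producer withholds any of that data, no fraud proof can ever be built, which is exactly the data availability problem.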
Block producers must be compelled to publish the complete data for their blocks. One solution is to give light clients a way to check that a block's transaction data was published, without having to download the whole block, since downloading everything would defeat the purpose of a light client.
Let's dive deeper and discuss each affected area separately.
Scalability of Blocks:
Normally, in blockchains like Bitcoin, an average laptop can run a full node and verify the entire chain. This is because there is an artificial block size limit that keeps the blockchain's size in check.
What if we want to increase the block size limit?
Larger blocks would reduce the number of full nodes, because verifying the chain would outgrow the abilities of an average laptop. That starts a cycle: fewer full nodes means more light clients, and light clients cannot verify transactions themselves, so the network's security is weakened. It is also bad for decentralization, because with fewer verifying nodes it becomes easier for block producers to change the protocol's rules without being detected.
Thus there should be a way for light clients to check that all of the data in a block was actually published, without downloading the block in full.
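Some rough arithmetic shows why raising the block size limit prices out ordinary hardware. The numbers below are illustrative assumptions, not real network statistics:

```python
# Rough chain-growth arithmetic (illustrative numbers, not real network stats).
block_size_mb = 1          # assumed block size limit
block_interval_min = 10    # assumed average time between blocks

blocks_per_year = 365 * 24 * 60 // block_interval_min      # 52,560 blocks
growth_gb_per_year = blocks_per_year * block_size_mb / 1024

print(round(growth_gb_per_year, 1))  # prints 51.3 (GB of new data per year)
# At a 100 MB limit the same arithmetic gives ~5,100 GB/year:
# quickly beyond what an average laptop can store and verify.
```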
Sharding:
One way of increasing a blockchain's processing speed is to split a bigger chain into a number of smaller chains. This process is called sharding, and the smaller chains produced as a result are called shards. Each shard has its own block producers, and shards can communicate with each other to transfer tokens between them.
The idea behind sharding is to divide processing across the shards. Instead of every producer processing every transaction, each shard's producers process only a subset of the transactions.
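A common toy scheme for this split is to route each account to a shard by hashing its address. The shard count and routing rule below are assumptions for illustration only:

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for illustration

def shard_for(account):
    """Route an account to a shard by hashing its address (a toy routing rule)."""
    digest = hashlib.sha256(account.encode()).digest()
    return digest[0] % NUM_SHARDS

# Instead of every producer seeing every transaction,
# each shard's producers only process the transactions routed to it.
shards = {i: [] for i in range(NUM_SHARDS)}
for tx in [{"from": "alice"}, {"from": "bob"}, {"from": "carol"}]:
    shards[shard_for(tx["from"])].append(tx)
```

Each transaction lands in exactly one shard, so the total work is divided rather than duplicated; the trade-off, discussed next, is that each shard is now protected by only a fraction of the producers.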
Normally, a node in a sharded blockchain runs as a full node for only one or a few shards and as a light client for all the others, because running a full node for every shard would defeat the purpose of sharding.
But what if the producers in a shard are malicious and start accepting invalid transactions?
This is more likely in a sharded system, because each shard has only a few block producers and is therefore easier to attack.
To detect whether a shard has accepted an invalid transaction, you must first make sure the shard's data is fully available; only then can the invalid transaction be proven with a fraud proof.
Rollups:
Optimistic rollups are a scaling technology based on side-chains called rollups. These side-chains have their own dedicated block producers, who also handle transfers of assets between the chains.
But what if the block producers produce blocks that contain invalid transactions?
They could steal money from the side-chain's users. Fraud proofs can be used to detect this, but that takes us back to the original problem: the side-chain's users need some way of ensuring that the block data is fully available, since fraud proofs cannot be constructed otherwise.
Ethereum rollups overcome this by posting all of their block data on the Ethereum chain, relying on Ethereum as a data availability layer.
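The pattern can be sketched as follows. This is a toy model where a plain Python list stands in for the L1 data layer, and the function name is a hypothetical label, not a real API:

```python
import hashlib
import json

def post_rollup_batch(l1_data_layer, batch):
    """Toy rollup sequencer: execution happens off-chain, but the raw
    transaction batch is posted to the L1 so that anyone can reconstruct
    the rollup's state and build fraud proofs against it."""
    blob = json.dumps(batch, sort_keys=True).encode()
    l1_data_layer.append(blob)                 # L1 only stores the data
    return hashlib.sha256(blob).hexdigest()    # commitment the rollup tracks

l1 = []  # stand-in for Ethereum acting as a data availability layer
commitment = post_rollup_batch(l1, [{"transfer": "alice->bob", "amount": 2}])

# Any verifier can re-download the batch from L1 and check it against the commitment:
assert hashlib.sha256(l1[0]).hexdigest() == commitment
```

The key design choice is what the L1 is asked to do: it never executes the batch, it only guarantees that the batch's bytes were published and ordered.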
Zero-knowledge (ZK) rollups are similar to optimistic rollups, except that instead of fraud proofs they use cryptographic validity proofs to prove that a block is valid. The proofs themselves do not require data availability, but ZK rollups as a whole still do: if a node validates a block with a validity proof but the producer has not published the complete data, users cannot interact with the blockchain, because they cannot learn its state or their own balances.
Moreover, these rollups are designed to use the underlying blockchain as a data availability layer: they dump their transactions onto the chain, while all computation happens inside the rollups. This leads to an interesting conclusion: a blockchain does not need to perform any computation itself; it only needs to ensure data availability and arrange transactions in an orderly fashion.
Lazy blockchains are an example of this type of protocol. They usually perform two core functions:
- Ordering transactions into blocks.
- Guaranteeing that the data behind those blocks is available.
This makes a lazy blockchain a capable base component for systems like rollups.
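The two core functions above can be sketched in a toy class. The class and method names are hypothetical, and a single hash stands in for a real availability commitment:

```python
import hashlib

class LazyChain:
    """Toy 'lazy' chain: it orders raw transaction data and commits to it,
    but performs no execution -- computation is left to rollups built on top."""
    def __init__(self):
        self.blocks = []

    def submit(self, blobs):
        # Core function 1: fix an ordering for the submitted data.
        ordered = list(blobs)
        # Core function 2: publish the data alongside a commitment to it,
        # so anyone can check that the data is actually available.
        commitment = hashlib.sha256(b"".join(ordered)).hexdigest()
        self.blocks.append({"data": ordered, "commitment": commitment})
        return commitment

chain = LazyChain()
c = chain.submit([b"rollup-tx-1", b"rollup-tx-2"])
assert chain.blocks[0]["commitment"] == c  # data is stored and committed
```

Notice what is absent: there is no transaction execution and no state machine. Validity is someone else's job; the lazy chain only promises ordering and availability.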
Conclusion:
The data availability problem can be a major threat to a blockchain network: it can undermine security and compromise decentralization. This article covered where and how the data availability problem affects a protocol; the next article will explain plausible solutions to it.