Interpreting EIP 4844 for non-techies: DAS, danksharding, blobs, PBS, and KZGs
Amber Shi, MBA
Specializes in early-stage crypto investment and venture building; in crypto since 2019.
Ethereum’s scalability plan: Danksharding and disappearing blobs to keep the network healthy and lower the cost of running a roll-up.
EIP-4844 (blob transactions) proposes a mechanism that drastically increases the amount of data that can be put on-chain at a lower cost. This is particularly valuable to layer 2 blockchains, because layer 1 data fees dominate the cost of running a layer 2 on top. As shown in Graph 1, with this implementation, instead of putting roll-up block data in transaction calldata, roll-ups would put the data into blobs. On the consensus layer, the blobs are referenced, but not fully encoded, on the Beacon Chain. Instead of embedding the full contents on-chain, the blobs are propagated separately, as a “sidecar”.

In addition, those blobs are not available forever. Roll-ups need the data to be available for validity proofs, but not in perpetuity, so blobs can be deleted after a fixed retention window, on the order of weeks to a few months. Without these disappearing blobs, the transaction data would be held by the consensus clients on the Beacon Chain indefinitely, and the ever-growing amount of data would eventually cause a centralization problem: if the data were kept forever, these clients would bloat, and running a node would require ever larger hardware. The data is there so that clients can attest that these transactions exist and were executed on the blockchain at some point in time. If the data is instead automatically pruned from the node after the retention window, there is still ample opportunity for verification, and the “data bloating” issue is alleviated, which means less validator centralization. These disappearing blobs can be seen as introducing a form of state expiry to blockchains. The actual data can be stored off-chain by roll-up operators or users.
Graph 1
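To make the calldata-to-blob change a bit more concrete, here is a rough Python sketch of what a blob-carrying (type-3) transaction and its sidecar contain. The field names follow EIP-4844, but the classes, types, and values are simplified illustrations, not actual client code.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class BlobTransaction:
    """Simplified view of an EIP-4844 (type-3) transaction.

    The key point: the transaction itself carries only small
    *references* to the blob data (versioned hashes), not the blobs.
    """
    chain_id: int
    nonce: int
    max_fee_per_gas: int
    max_priority_fee_per_gas: int
    gas_limit: int
    to: str                           # e.g. a roll-up's inbox contract
    value: int
    data: bytes                       # normal calldata (can stay small now)
    max_fee_per_blob_gas: int         # blobs have their own fee market
    blob_versioned_hashes: List[str]  # one 32-byte hash per blob

@dataclass
class BlobSidecar:
    """The blobs travel next to the block (a 'sidecar'), not inside it.

    Consensus nodes keep this data only for a limited retention window,
    after which it can be pruned.
    """
    blobs: List[bytes]              # up to 128 KB each
    kzg_commitments: List[bytes]    # one commitment per blob
    kzg_proofs: List[bytes]         # proofs tying blobs to commitments
```

The point to notice is that the execution layer only ever sees the small blob_versioned_hashes; the heavy blobs ride along in the sidecar on the consensus layer and can be dropped once the retention window passes.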
In danksharding’s ultimate form, the design scales from a single blob to 64 blobs per block. This makes Ethereum scalable to the potential extent of more than 100,000 transactions per second, a significant step toward a real foundation for mass adoption. But, even on Ethereum, there is no free lunch: danksharding and its blob mechanism bring new challenges. In New Order’s 2023 thesis, there is a section dedicated to blob transactions, and I do believe that blob transactions alone are not going to fix all the scalability issues on Ethereum. In the following sections, we will discuss how those challenges can be solved in the future.
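As a rough back-of-the-envelope check on those numbers (every figure below is an assumption for illustration: 64 blobs of 128 KB per 12-second slot, and a compressed roll-up transaction of 10 to 16 bytes):

```python
BLOBS_PER_BLOCK = 64        # full danksharding target
BLOB_SIZE = 128 * 1024      # 128 KB per blob
SLOT_TIME = 12              # seconds per block

blob_data_per_block = BLOBS_PER_BLOCK * BLOB_SIZE        # ~8 MB of blob space per block
throughput_bytes_per_sec = blob_data_per_block / SLOT_TIME

for bytes_per_tx in (16, 10):   # assumed size of one compressed roll-up transaction
    tps = throughput_bytes_per_sec / bytes_per_tx
    print(f"{bytes_per_tx} bytes/tx -> ~{tps:,.0f} TPS of blob capacity")

# 16 bytes/tx -> ~43,691 TPS; 10 bytes/tx -> ~69,905 TPS.
# Headline figures above 100,000 TPS assume even more aggressive compression
# and further optimizations on the roll-up side.
```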
Using Data Availability Sampling (DAS) to keep the roll-up sequencers honest
Because full danksharding has 64 blobs of about 128 KB each, blocks contain a large amount of data. We need a smart verification mechanism, where validators can easily prove that all the transaction data is present and available without institutional-grade software and hardware. This will become increasingly relevant to the future of Ethereum as it adopts more roll-ups and light clients.

Roll-ups post batched transactions on Ethereum, reducing congestion and fees for users. However, it is only possible to trust roll-ups if the state change can be independently verified. Currently, such data is posted as calldata, living permanently on-chain; if anyone identifies a problem, they can generate a fraud proof and challenge the roll-up. In other words, as the first step of verification, the roll-up sequencer must make the transaction data available so that validators can check it. However, this is not sustainable: if nodes are required to download all the data dumped by the sequencer, the hardware requirements of running a node will grow over time, worsening validator centralization.

In contrast to full nodes, light clients (light nodes) use only a tiny amount of computing power, memory, and storage, so they can run on a home computer or even a cell phone. As you can imagine, these clients are important stakeholders in the Ethereum scaling roadmap. They are flexible, can verify transactions, and don’t need to download all the transaction data the way a full node would. The caveat, however, is that because they don’t see the full ledger, a malicious full node can trick them into accepting a false block.
How do we make sure that these light clients can provide the network the same level of security without doing more work? This is where data availability sampling (DAS) becomes relevant. Light clients download small random chunks of the full data set and use the samples to verify that the whole data set is available. Note that availability is not the same as validity; this distinction will matter later in the article. Data availability sampling (DAS), then, is a verification method used by light clients to attest to data availability. The underlying technology is erasure coding combined with polynomial math. Without getting into the technical details, you can think of it as a series of mathematical functions that lead to extremely high confidence that ALL the transaction data is available. What erasure coding does is extend the original data using a mathematical function. The key consequence is that to make even a small piece of the original data unavailable, at least half of the extended data must be withheld. So if any data is being hidden, each random sample has at least a 50% chance of landing on a missing chunk, and even a couple of checks will probably reveal that the data is unavailable. In practice, the number of samples can be tuned to reach statistical near-certainty that all the data is available.
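A tiny worked example of why only a handful of random samples is needed, under the assumption stated above that withholding any original data forces at least half of the erasure-extended data to be missing:

```python
def chance_of_detecting_withheld_data(num_samples: int, fraction_missing: float = 0.5) -> float:
    """Probability that at least one of `num_samples` random samples lands on a
    missing chunk, if `fraction_missing` of the extended data has been withheld."""
    prob_all_samples_look_fine = (1 - fraction_missing) ** num_samples
    return 1 - prob_all_samples_look_fine

for k in (2, 10, 30):
    print(f"{k} samples -> {chance_of_detecting_withheld_data(k):.10f} detection probability")

# 2 samples  -> 0.75
# 10 samples -> ~0.999
# 30 samples -> ~0.999999999 (roughly a one-in-a-billion chance of being fooled)
```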
How do we know the math worked? KZG polynomial commitments
KZG is a scheme that reduces the data in a blob to a small cryptographic commitment. This is where the actual math kicks in, and a large part of the reasoning is heavy polynomial algebra.
This article is written for non-technical readers to understand the meaning of EIP-4844 and the design choices in the roadmap, so we can skip the math and get right to the “why”. Ethereum’s execution clients check the validity of transactions using Merkle proofs and by re-executing the transactions in a block. By the same token, a prover needs to go through the same kind of process to prove the validity of the transactions in the blob submitted by the sequencer. This is why data availability needs to be guaranteed: the data must be available for anyone to download. That is what keeps roll-up sequencers honest, because if they lie about the state change, anyone can identify the fraud and challenge them. When the roll-up posts transaction data in a blob, it also posts a “commitment” to the data. The idea is that the commitment corresponds to a polynomial function, and a prover can use the same function to prove that the original data has not changed. If the original data changes, the polynomial changes and no longer fits the commitment.
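Real KZG commitments rely on elliptic-curve pairings and a trusted setup, which are well beyond this article. The toy Python sketch below is purely illustrative (it is not the actual KZG scheme and is not secure); it only shows the core intuition: the blob data is treated as a polynomial, the commitment is one small value derived from that polynomial, and changing any piece of the data breaks the match.

```python
# Toy illustration of a polynomial commitment (NOT real KZG: no elliptic
# curves, no trusted setup; just modular arithmetic to show the idea).

PRIME = 2**61 - 1          # a field modulus, chosen arbitrarily for the toy
SECRET_POINT = 123456789   # in real KZG this value comes from a trusted setup

def commit(blob_chunks: list[int]) -> int:
    """Treat the chunks as polynomial coefficients and evaluate the polynomial
    at a fixed point: p(s) = c0 + c1*s + c2*s^2 + ... (mod PRIME)."""
    acc, power = 0, 1
    for chunk in blob_chunks:
        acc = (acc + chunk * power) % PRIME
        power = (power * SECRET_POINT) % PRIME
    return acc

original = [10, 22, 37, 41]   # pretend these are the blob's data chunks
tampered = [10, 22, 37, 42]   # one chunk changed by a dishonest sequencer

print(commit(original) == commit(original))   # True: same data, same commitment
print(commit(original) == commit(tampered))   # False: any change breaks the match
```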
Why is proposer-builder separation a(nother) prerequisite for Danksharding? Where does PBS fit in?
In today’s Ethereum system, the roles of block builder and block proposer are performed by the same entity (the validator). This means that the party who decides which transactions to include in a block is also the one who proposes it. It’s like a company where the accountant and the auditor are the same person. Although this does not create opportunities for outright embezzlement, because transaction settlement still has to follow Ethereum consensus, it does put validators in an advantageous position: they see all the pending transactions, can pick and choose among them, and can even insert their own strategies to front-run, back-run, and sandwich users. This is commonly referred to as MEV. MEV arises from the inherent power that block builders have in determining the composition of blocks and the order in which transactions are processed. The main threat this poses to the Ethereum community is that smaller validators can’t compete with bigger, more sophisticated players who can afford quant teams. Hence, in the long run, MEV profit will lead to validator centralization.
Proposer-builder separation (PBS) fixes this by separating block construction from block proposal. You can see PBS as a new block market structure in which proposers outsource block construction to third parties. Proposers auction off the right to build a block; builders submit bids, and the highest bidder wins. The winning builder then fills the block with transactions, and the block is in turn broadcast to the network for consensus. You can assume that block proposers are more likely to be at-home stakers and smaller validators, while block builders are more specialized in ordering and slotting transactions. Builders have to pay their bid to earn the right to fill the block, but they also keep the MEV. In an efficient market with enough builders (bidders), you can in theory assume that the highest bid gets infinitely close to the value that can be extracted as MEV. The resource requirements for the proposer are low; the idea is that anyone can run as a proposer on their laptop. Although the requirements are low, the role proposers perform is mission-critical: if it is not sufficiently decentralized, it could jeopardize the integrity of the network.
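A minimal sketch of that auction, with made-up builder names and bid numbers (this is an illustration of the market structure, not how mev-boost or any real relay is implemented):

```python
from dataclasses import dataclass

@dataclass
class BuilderBid:
    builder: str
    payment_to_proposer: int   # what the builder pays for the right to build
    block: list                # the ordered transactions the builder prepared

def propose_block(bids: list) -> BuilderBid:
    """The proposer's job stays simple and cheap: accept the highest bid.
    The builder keeps whatever MEV it can extract beyond its payment, so
    competition among builders pushes the bid toward the block's extractable value."""
    return max(bids, key=lambda b: b.payment_to_proposer)

bids = [
    BuilderBid("builder_a", payment_to_proposer=90,  block=["tx1", "tx2"]),
    BuilderBid("builder_b", payment_to_proposer=120, block=["tx2", "tx1", "tx3"]),
]
winner = propose_block(bids)
print(winner.builder)   # builder_b wins; the proposer collects 120 without building anything
```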
In the context of EIP-4844 and danksharding, PBS is introduced as a prerequisite because it frees block proposers from having to generate commitments and proofs for blob data; the more specialized block builders take on that responsibility. Once the block is constructed and filled with its execution payload (transaction data), light nodes sample the data with the aforementioned DAS to ensure data availability.
Conclusion
To put the objective of Ethereum scaling in one sentence: it is all about scaling computation without increasing resource requirements. This is why DAS takes center stage in Ethereum scaling. DAS ensures that all types of validators can successfully defend Ethereum consensus without needing sophisticated software or hardware, and that no matter how much data is added to the chain, light clients can stay light and the cost for roll-ups to post data (the majority of a block) stays low. Simply put, with DAS it is possible to have a decentralized and scalable light-node network to sustain a “roll-up centric” future for Ethereum. Blob transactions serve a similar function: they are mutually beneficial to both roll-ups and the base chain, decreasing the data cost for roll-ups without constraining the network with an ever-growing amount of transaction data. And with PBS, at-home and smaller validators are further unburdened from heavy resource requirements.