Written by: 0XNATALIE

At the recent Ethereum developer meeting, a proposal to split Ethereum's Pectra hard fork into two parts was discussed. The proposal had previously been rejected over concerns that it would delay the Verkle tree upgrade. Developers revived the idea at this meeting because they hope to include more EIPs in Pectra. Under the proposal, the first fork would include all EIPs currently on Pectra Devnet 3, and the second would include EOF (EVM Object Format), PeerDAS, and other changes. To understand PeerDAS, it helps to start with the basic concept of data availability.

DA: Ensure that nodes obtain on-chain data

Data availability (DA) means that each block published by a block proposer, together with all the transaction data it contains, can actually be retrieved by other network participants. It is a key factor in blockchain security: if the data is unavailable, other nodes cannot verify a block's contents even if the block itself is valid, which can lead to consensus failures and open the door to attacks. For example, an attacker may publish only part of a block's data, making it impossible for other nodes to verify the block.

In the basic model, when a new block is broadcast, every participating node downloads and verifies the block's data. This works while the network is small, but as the chain grows, the volume of data grows with it, and so do each node's storage and hardware requirements. To let light nodes (resource-constrained devices such as mobile phones or personal computers) take part in block verification, blockchains introduced sharding.

Sharding divides the blockchain network into multiple smaller "shards". Each shard processes only its own portion of the data, so a single node only has to handle the data of its own shard rather than that of the entire chain. The drawback is that nodes in other shards no longer have direct access to the complete data. How, then, can they be sure that a shard's data is available and check that it is valid? For example, a node in a shard may publish a newly generated block but release only part of its data. If other nodes cannot obtain all of the block's data, they cannot verify whether the block is valid.

DAS: Verify overall data availability through partial data

In order to deal with the data availability problem in sharding, Data Availability Sampling (DAS) technology was proposed. Its core idea is to verify the data availability of blocks through sampling, without requiring each node to store or download the complete block data.

Data availability sampling lets a node check data availability by fetching only a small, randomly chosen portion of a block's data. If the node can fetch and verify these random fragments, it can infer with high probability that the data of the entire block is available.

To support this sampling, block data is usually RS-encoded. This encoding allows the complete data to be restored even if part of it is lost. As a result, a node that downloads only part of the block data can still infer that the entire block's data is available. Because sampling reduces the amount of data each node has to process, light nodes can also take part in block verification.
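The inference above can be made quantitative. The following is a minimal sketch under an assumption that is not stated in the article: the block is RS-extended to twice its size, so any half of the fragments is enough to rebuild it. An attacker who wants the block to be unrecoverable must then withhold more than half of the fragments, so each random sample hits a missing fragment with probability above 1/2, and the chance that k samples all succeed anyway is below (1/2)^k. The function names and parameters are illustrative only.

```python
import random

def fooled_probability_bound(samples: int, withheld_fraction: float = 0.5) -> float:
    """Upper bound on the chance that every sample succeeds even though
    at least `withheld_fraction` of the fragments is missing."""
    return (1.0 - withheld_fraction) ** samples

def simulate(total: int, withheld: int, samples: int, trials: int, seed: int = 0) -> float:
    """Estimate how often a sampler is fooled: it draws `samples` distinct
    fragments out of `total`, of which `withheld` random ones are missing."""
    rng = random.Random(seed)
    fooled = 0
    for _ in range(trials):
        missing = set(rng.sample(range(total), withheld))
        picks = rng.sample(range(total), samples)
        if not any(p in missing for p in picks):
            fooled += 1  # all samples succeeded despite the withheld data
    return fooled / trials

if __name__ == "__main__":
    for k in (5, 10, 20, 30):
        print(k, fooled_probability_bound(k))
    print(simulate(total=128, withheld=65, samples=8, trials=2000))
```

Under this assumption a few dozen samples already push the failure probability to a negligible level, which is why a light node does not need to download the whole block.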

DA layers such as Celestia are built on these techniques, mainly a combination of RS encoding, validity proofs, and DAS.

  • RS encoding (Reed-Solomon encoding): This encoding lets a node that has received only some of the data fragments reconstruct the entire data block. It is a form of error-correcting code with a built-in level of fault tolerance: even if part of the data is lost, the remaining part is sufficient to reconstruct the complete data.

  • Validity proof: A cryptographic proof, such as a zero-knowledge proof, that no errors were introduced while the data was encoded and transmitted. If the proof verifies, the data can be decoded in full without error.

  • DAS (Data Availability Sampling): Light nodes randomly sample a portion of RS-encoded fragments in a block, verify the availability of these fragments, and thus infer that the entire data block is available.
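The RS-encoding step in the list above can be illustrated with a toy example. This is a minimal sketch, not how any production DA layer implements it: it treats k data values as points on a polynomial of degree below k over a small prime field, publishes n evaluations of that polynomial as fragments, and rebuilds the original values from any k surviving fragments by Lagrange interpolation. The modulus P and the function names are assumptions chosen for illustration, and the sketch only handles missing fragments, not corrupted ones.

```python
P = 2**31 - 1  # a prime modulus, chosen only for illustration

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through `points`, with all arithmetic modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def rs_encode(data, n):
    """Treat data[i] as the polynomial's value at x = i and return
    n fragments (x, y) for x = 0..n-1. The first len(data) fragments
    carry the original values; the rest are redundancy."""
    base = list(enumerate(data))
    return [(x, lagrange_eval(base, x)) for x in range(n)]

def rs_decode(fragments, k):
    """Recover the k original values from any k distinct fragments."""
    pts = fragments[:k]
    return [lagrange_eval(pts, x) for x in range(k)]

if __name__ == "__main__":
    data = [5, 17, 42, 99]
    coded = rs_encode(data, 8)        # 4 data fragments + 4 redundant ones
    survivors = [coded[1], coded[3], coded[6], coded[7]]  # half were lost
    print(rs_decode(survivors, 4) == data)
```

Because any 4 of the 8 fragments determine the polynomial, losing up to half of them does not prevent reconstruction, which is the property that sampling relies on.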

PeerDAS: Collaborative data verification between nodes

PeerDAS is a specific implementation of DAS that performs data availability sampling over a peer-to-peer network, that is, a network of nodes that communicate directly with each other. In plain DAS, each node samples and verifies data independently. PeerDAS optimizes this process by letting nodes collaborate to share and verify block data, which improves verification efficiency. Nodes are not isolated: they can divide verification tasks among themselves, share the results, and rely on data that other nodes have already verified. No single node has to carry the whole verification workload, which further reduces the burden on each node. Collaborative verification also makes data tampering harder, because an attacker would have to compromise multiple verifying nodes at the same time to succeed.
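The collaboration described above can be sketched as follows. This is a simplified illustration, not the Ethereum specification: it assumes that a block's data is split into a fixed number of column fragments, that each peer is responsible for a few columns derived deterministically from its identifier, and that a sampling node asks its peers for randomly chosen columns. The constants, the hashing scheme, and the function names are all hypothetical.

```python
import hashlib
import random

NUM_COLUMNS = 16       # hypothetical number of column fragments per block
CUSTODY_PER_PEER = 4   # hypothetical number of columns each peer stores

def custody_columns(peer_id: str) -> set:
    """Derive a peer's columns from its id, so that any node can
    compute which peer is expected to hold which column."""
    cols = set()
    counter = 0
    while len(cols) < CUSTODY_PER_PEER:
        digest = hashlib.sha256(f"{peer_id}:{counter}".encode()).digest()
        cols.add(int.from_bytes(digest[:8], "big") % NUM_COLUMNS)
        counter += 1
    return cols

def sample_block(peers: dict, samples: int, rng: random.Random) -> bool:
    """`peers` maps a peer id to the set of columns it actually serves.
    Pick `samples` random columns and succeed only if at least one
    peer can serve each of them."""
    for column in rng.sample(range(NUM_COLUMNS), samples):
        holders = [pid for pid, served in peers.items() if column in served]
        if not holders:
            return False  # nobody serves this column: treat data as unavailable
    return True

if __name__ == "__main__":
    peers = {f"peer-{i}": custody_columns(f"peer-{i}") for i in range(20)}
    print(sample_block(peers, samples=8, rng=random.Random(1)))
```

In this sketch the storage load is spread across peers, and a sampler only fails when a column it picked is served by no one, which is the case that signals withheld data.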

Currently, according to the latest Ethereum meeting on PeerDAS, the team behind the Ethereum client Lighthouse has merged its DAS branch into the main branch and is testing it for compatibility with PeerDAS. A branch is typically used to develop and test a new feature or improvement in isolation from the core code. Merging it into the main branch means that the work is considered complete and stable enough to become part of the core code.