Possible futures of the Ethereum protocol, part 1: The Merge

Original article by Vitalik Buterin

Original translation: Deng Tong, Golden Finance

Special thanks to Justin Drake, Hsiao-wei Wang, @antonttc, and Francesco for their feedback and reviews.

Originally, the "merge" referred to the most important event in the history of the Ethereum protocol since its launch: the long-awaited and hard-won transition from proof-of-work to proof-of-stake. Ethereum has been a stable and running proof-of-stake system for nearly two years now, and this proof-of-stake has performed very well in terms of stability, performance, and avoidance of centralization risks. However, there are still some important areas for proof-of-stake to improve.

My 2023 roadmap diagram splits this into two buckets: improvements to technical features such as stability, performance, and accessibility to smaller validators, and economic changes to address centralization risks. The former took over the heading of “the Merge”, while the latter became part of “the Scourge”.

This post will focus on the “Merge” part: what improvements can still be made to the technical design of proof of stake, and what are the paths to getting there?

This is not an exhaustive list of improvements that could be made to proof of stake; rather, it is a list of ideas that are being actively considered.

Single-slot finality and democratization of staking

What problem are we solving?

Today, it takes 2-3 epochs (about 15 minutes) to finalize a block, and 32 ETH is required to become a staker. This was originally a compromise to strike a balance between three goals:

Maximizing the number of validators that can participate in staking (which directly means minimizing the minimum amount of ETH required to stake)

Minimizing the time to finality

Minimizing the overhead of running a node

These three goals are in conflict with each other: in order to achieve economic finality (i.e. an attacker would need to destroy a lot of ETH to revert a finalized block), every validator needs to sign two messages for each finalization. So if you have many validators, it either takes a long time to process all the signatures, or you need very powerful nodes to process all the signatures at the same time.
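To make the conflict concrete, here is a rough sketch of how many signatures a node would have to process per slot. The 32 slots per epoch and the 12-second slot are current mainnet parameters; the validator count of one million is a hypothetical round number, and the model deliberately ignores signature aggregation.

```python
SLOTS_PER_EPOCH = 32   # current mainnet parameter
SECONDS_PER_SLOT = 12  # current mainnet parameter


def epoch_seconds() -> int:
    """Length of one epoch in seconds."""
    return SLOTS_PER_EPOCH * SECONDS_PER_SLOT


def signatures_per_slot(num_validators: int, slots_sharing_load: int) -> float:
    """Average signatures per slot when every validator signs once and the
    signatures are spread evenly over `slots_sharing_load` slots."""
    return num_validators / slots_sharing_load


validators = 1_000_000  # hypothetical validator count
print("epoch length (s):", epoch_seconds())
print("load spread over an epoch:", signatures_per_slot(validators, SLOTS_PER_EPOCH))
print("everyone signs every slot:", signatures_per_slot(validators, 1))
```

Under these assumptions, moving from "each validator votes once per epoch" to "each validator votes every slot" multiplies the per-slot load by 32, which is why either finality time or node overhead has to give.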

Note that this all depends on a key goal of Ethereum: ensuring that even a successful attack is costly for the attacker. This is what the term "economic finality" means. If we did not have this goal, we could solve the problem by randomly selecting a committee (as Algorand does) to finalize each slot. The problem with that approach is that if an attacker does control 51% of the validators, they can attack (revert a finalized block, censor, or delay finality) at very low cost: only the portion of their nodes that sits in the committee can be detected as participating in the attack and penalized, whether through slashing or a socially coordinated soft fork. This means the attacker can attack the chain over and over again. Therefore, if we want economic finality, the simple committee-based approach does not work, and at first glance we do need the full set of validators to participate.

Ideally, we would like to preserve economic finality while improving the status quo in two ways:

Finalize blocks within one slot (ideally keeping or even reducing the current 12-second slot length) instead of 15 minutes

Allow validators to stake with 1 ETH (from 32 ETH to 1 ETH)

The first goal is justified by two objectives, both of which can be viewed as “aligning the properties of Ethereum with those of (more centralized) performance-focused L1 chains.”

First, it ensures that all Ethereum users benefit from the higher level of security achieved through finality. Today, most users do not get this security because they are unwilling to wait 15 minutes; with single-slot finality, users would see their transactions finalized almost as soon as they are confirmed. Second, it simplifies the protocol and the surrounding infrastructure if users and applications do not have to worry about the possibility of a chain rollback (except in the relatively rare case of an inactivity leak).

The second goal is motivated by a desire to support solo stakers. Poll after poll has repeatedly shown that the main factor preventing more people from staking solo is the 32 ETH minimum. Lowering the minimum to 1 ETH would solve this problem to the point where other issues become the main factor limiting solo staking.

There is a challenge: the goals of faster finality and more democratized staking both conflict with the goal of minimizing overhead. Indeed, this is the entire reason we did not start with single-slot finality in the first place. However, recent research suggests some possible ways to address the issue.

What is it and how does it work?

Single-slot finality involves using a consensus algorithm that finalizes blocks within one slot. This is not in itself a difficult goal: many algorithms (such as Tendermint consensus) already achieve it with optimal properties. One desirable property unique to Ethereum that Tendermint does not support is inactivity leaks, which allow the chain to keep running and eventually recover even if more than 1/3 of validators are offline. Fortunately, this gap has already been addressed: there are proposals that modify Tendermint-style consensus to accommodate inactivity leaks.

Leading Single-Slot Finality Proposal

The hardest part of the problem is figuring out how to make single-slot finality work at very high validator counts without incurring extremely high node operator overhead. To this end, there are several leading solutions:

Option 1: Brute Force - Work towards a better signature aggregation protocol, perhaps using ZK-SNARKs, which would essentially allow us to process signatures from millions of validators per slot.

Horn, one of the proposed designs for a better aggregation protocol.

Option 2: Orbit Committees - A new mechanism that allows a randomly selected medium-sized committee to be responsible for finalizing the chain, but in a way that preserves the attack cost properties we are looking for.

One way to think about Orbit SSF is that it opens up a space of compromise options ranging from x=0 (Algorand-style committees with no economic finality) to x=1 (status quo Ethereum). Somewhere in the middle there are points where Ethereum still has enough economic finality to be extremely secure, while gaining the efficiency advantage of only requiring a moderately sized random sample of validators to participate in each slot.

Orbit exploits pre-existing heterogeneity in validator deposit sizes to obtain as much economic finality as possible while still giving small validators a role. Additionally, Orbit uses slow committee rotation to ensure a high overlap between adjacent quorums, ensuring that its economic finality still applies across committee rotation boundaries.
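The idea of using deposit-size heterogeneity can be illustrated with a stake-weighted sampling rule. This is a simplified sketch in the spirit of Orbit-style mechanisms, not the Orbit specification: the rule "always include validators at or above a threshold, include smaller ones with probability balance / threshold", the threshold value, and the function names are all assumptions made for illustration. Slow committee rotation is not modeled.

```python
import random


def inclusion_probability(balance: float, threshold: float) -> float:
    """Assumed rule: large validators are always in, small ones proportionally."""
    return min(1.0, balance / threshold)


def sample_committee(balances, threshold, rng):
    """Return the indices of validators sampled into the committee."""
    return [
        index
        for index, balance in enumerate(balances)
        if rng.random() < inclusion_probability(balance, threshold)
    ]


def expected_committee_size(balances, threshold) -> float:
    """Expected number of committee members under the assumed rule."""
    return sum(inclusion_probability(b, threshold) for b in balances)


rng = random.Random(0)  # seeded only to make the toy example reproducible
balances = [2048.0, 1024.0] + [1.0] * 1000  # hypothetical balances in ETH
committee = sample_committee(balances, threshold=1024.0, rng=rng)
print(len(committee), expected_committee_size(balances, 1024.0))
```

The two large validators are always sampled, so most of the stake is present in every committee, while the thousand 1 ETH validators each still get an occasional turn.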

Option 3: Two-Tier Staking - A mechanism where stakers are divided into two classes, one with a higher deposit requirement and one with a lower deposit requirement. Only the tier with the higher deposit requirement would directly participate in providing economic finality. There are various proposals as to exactly what rights and responsibilities the tier with the lower deposit requirement would have (see, for example, the Rainbow staking post). Common ideas include:

The right to delegate stake to a higher-tier staker

A random sample of lower-tier stakers attesting to, and being needed to finalize, each block

The right to generate inclusion lists

What are the connections with existing research?

Path to Single Slot Finality (2022): https://notes.ethereum.org/@vbuterin/single_slot_finality

Detailed Proposal for Ethereum Single-Slot Finality Protocol (2023): https://eprint.iacr.org/2023/280

Orbit SSF: https://ethresear.ch/t/orbit-ssf-solo-staking-friendly-validator-set-management-for-ssf/19928

Further analysis of Orbit-style mechanisms: https://notes.ethereum.org/@anderselowsson/Vorbit_SSF

Horn, Signature Aggregation Protocol (2022): https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219

Signature Merging for Large-Scale Consensus (2023): https://ethresear.ch/t/signature-merging-for-large-scale-consensus/17386?u=asn

Signature aggregation protocol proposed by Khovratovich et al.: https://hackmd.io/@7dpNYqjKQGeYC7wMlPxHtQ/BykM3ggu0#/

STARK-based Signature Aggregation (2022): https://hackmd.io/@vbuterin/stark_aggregation

Rainbow Staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683

What is left to do? What are the trade-offs?

There are four main possible paths (and we can also use hybrid paths):

(1) Maintain the status quo

(2) Orbit SSF

(3) Brute-force SSF

(4) SSF with two-tiered staking

(1) means doing nothing and leaving staking as it is, but this would leave Ethereum’s security experience and staking centralization properties worse than they could otherwise be.

(2) avoids “high tech” and solves the problem by cleverly rethinking the protocol’s assumptions: we relax the “economic finality” requirement so that an attack must still be expensive, but we accept that the cost of an attack might be 10x lower than it is today (e.g., $2.5 billion instead of $25 billion). It is widely believed that Ethereum’s economic finality today is far beyond what it needs to be, and that its main security risks lie elsewhere, so this is arguably an acceptable sacrifice.

The main work is to verify that the Orbit mechanism is secure and has the properties we want, and then to fully formalize and implement it. In addition, EIP-7251 (increase the maximum effective balance) allows voluntary consolidation of validator balances, which immediately reduces chain verification overhead and serves as an effective initial stage for an Orbit rollout.
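The "10x lower" trade-off can be sketched with simple arithmetic. The sketch relies on the accountable-safety property of Casper FFG-style finality, under which reverting a finalized block requires at least one third of the finalizing stake to sign conflicting messages and become slashable. The stake figures and the 10% committee share below are hypothetical and are chosen only to reproduce a 10x ratio.

```python
def min_slashable_stake(finalizing_stake_eth: float) -> float:
    """Lower bound on the stake an attacker loses when reverting finality:
    at least one third of the stake that finalized the block."""
    return finalizing_stake_eth / 3


total_stake = 30_000_000        # hypothetical total staked ETH
committee_stake = 3_000_000     # hypothetical: committee holds 10% of the stake

full_set_cost = min_slashable_stake(total_stake)
committee_cost = min_slashable_stake(committee_stake)
print(full_set_cost, committee_cost, full_set_cost / committee_cost)
```

In this model the attack cost scales linearly with the share of stake that participates in finalizing, which is the quantity an Orbit-style design tries to keep high while keeping the committee small.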

(3) avoids clever rethinking and instead brute-forces the problem with high tech. Doing this requires collecting a very large number of signatures (more than 1 million) in a very short period of time (5-10 seconds).

(4) avoids both clever rethinking and high tech, but it creates a two-tiered staking system that still carries centralization risk. The risk depends heavily on the specific rights granted to the lower staking tier. For example:

If low-tier stakers must delegate their attestation rights to high-tier stakers, then delegation may centralize and we end up with two highly centralized staking tiers.

If a random sample of the lower tier is required to approve each block, then an attacker can spend a very small amount of ETH to block finality.

If low-tier stakers can only make inclusion lists, then the attestation layer may remain centralized, at which point a 51% attack on the attestation layer can censor the inclusion lists themselves.

It is possible to combine multiple strategies, for example:

(1 + 2): Add Orbit without enforcing single-slot finality.

(1 + 3): Use a brute force technique to reduce the minimum deposit size without enforcing single-slot finality. The amount of aggregation required is 64 times less than the pure (3) case, so the problem becomes easier.

(2 + 3): Implement Orbit SSF with conservative parameters (e.g. 128k validator committee instead of 8k or 32k) and use brute force techniques to make it super efficient.

(1 + 4): Adding rainbow staking without single-slot finality.

How does it interact with the rest of the roadmap?

Among other benefits, single-slot finality reduces the risk of certain types of multi-block MEV attacks. Additionally, in a single-slot finality world, attester-proposer separation designs and other in-protocol block production pipelines would need to be designed differently.

The weakness of brute force strategies is that they make it much harder to shorten the slot time.

Single Secret Leader Election

What problem are we trying to solve?

Today, which validator will propose the next block is known in advance. This creates a security vulnerability: an attacker can monitor the network, identify which validators correspond to which IP addresses, and launch a DoS attack on the validator when it is about to propose a block.

What is it and how does it work?

The best way to solve the DoS problem is to hide the information about which validator will produce the next block, at least until the block is actually produced. Note that this is easy if we drop the "single" requirement: one solution is to let anyone create the next block, but require the randao reveal to be less than 2^256 / N. On average, only one validator will meet this requirement, but sometimes there will be two or more, and sometimes there will be zero. Combining the "secrecy" requirement with the "single" requirement has long been the hard problem.
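The non-single variant described above is easy to simulate. In this sketch a SHA-256 hash of a toy per-validator secret and the slot number stands in for the randao reveal; that substitution, the toy secrets, and the function names are assumptions for illustration only.

```python
import hashlib


def reveal_value(secret: bytes, slot: int) -> int:
    """Stand-in for a randao reveal: a pseudorandom 256-bit integer."""
    digest = hashlib.sha256(secret + slot.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def eligible_proposers(secrets_list, slot: int):
    """Indices of validators whose value falls below 2^256 / N for this slot."""
    threshold = 2**256 // len(secrets_list)
    return [i for i, s in enumerate(secrets_list) if reveal_value(s, slot) < threshold]


def winner_counts(num_validators: int, num_slots: int):
    """Number of eligible proposers in each of `num_slots` consecutive slots."""
    toy_secrets = [i.to_bytes(4, "big") for i in range(num_validators)]
    return [len(eligible_proposers(toy_secrets, slot)) for slot in range(num_slots)]


counts = winner_counts(num_validators=100, num_slots=2000)
print("average winners per slot:", sum(counts) / len(counts))
print("slots with no winner:", counts.count(0))
print("slots with 2+ winners:", sum(1 for c in counts if c >= 2))
```

The average stays close to one eligible proposer per slot, but a substantial share of slots have either nobody or several candidates, which is exactly the gap that a single secret leader election is meant to close.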

Single secret leader election protocols solve this by using cryptography to create a "blinded" validator ID for each validator, and then giving many proposers the opportunity to shuffle and re-blind the pool of blinded IDs (this is similar to how a mixnet works). During each slot, a random blinded ID is chosen. Only the owner of that blinded ID can generate a valid proof to propose the block, but no one else knows which validator that blinded ID corresponds to.
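The shuffle-and-re-blind step can be sketched with a toy discrete-log construction. Each blinded ID is a pair (a, a^k) for the owner's secret k; raising both parts to a fresh random exponent yields a new pair that still satisfies the same relation, so only the owner can recognize it. This is an illustrative toy, not a real protocol: the small modulus, the base, the seeded random generator, and the absence of any proof that shufflers behaved honestly are all simplifying assumptions.

```python
import random

P = 2**127 - 1  # toy prime modulus (a Mersenne prime); far too small for real use
G = 3           # toy base element


def make_blinded_id(secret_k: int, rng: random.Random):
    """Create a blinded ID (a, a^k) from a random starting point a = G^r."""
    a = pow(G, rng.randrange(2, P - 1), P)
    return (a, pow(a, secret_k, P))


def shuffle_and_reblind(pool, rng: random.Random):
    """Re-blind every ID with a fresh exponent, then permute the pool."""
    reblinded = []
    for a, b in pool:
        z = rng.randrange(2, P - 1)
        reblinded.append((pow(a, z, P), pow(b, z, P)))  # (a^z, (a^z)^k)
    rng.shuffle(reblinded)
    return reblinded


def owns(blinded_id, secret_k: int) -> bool:
    """Only the holder of k can check that the pair has the form (a, a^k)."""
    a, b = blinded_id
    return pow(a, secret_k, P) == b


rng = random.Random(7)  # seeded for reproducibility; real use needs a CSPRNG
keys = [rng.randrange(2, P - 1) for _ in range(8)]
pool = [make_blinded_id(k, rng) for k in keys]
for _ in range(3):  # three successive shufflers
    pool = shuffle_and_reblind(pool, rng)

chosen = pool[0]  # the ID selected for this slot
print([i for i, k in enumerate(keys) if owns(chosen, k)])
```

After the shuffles, an observer sees only unlinkable-looking pairs, while exactly one validator can show that the chosen pair matches its secret.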

Whisk SSLE protocol

What are some links to existing research?

Dan Boneh’s paper (2020): https://eprint.iacr.org/2020/025.pdf

Whisk (Ethereum specific proposal, 2022): https://ethresear.ch/t/whisk-a-practical-shuffle-based-ssle-protocol-for-ethereum/11763

Single Secret Leader Election tag on ethresear.ch: https://ethresear.ch/tag/single-secret-leader-election

Simplified SSLE using ring signatures: https://ethresear.ch/t/simplified-ssle/12315

What is left to do? What are the trade-offs?

Really, all that remains is to find and implement a protocol that is simple enough that we can easily implement it on mainnet. We place great importance on Ethereum being a fairly simple protocol, and we don’t want the complexity to grow further. The SSLE implementations we’ve seen add hundreds of lines of code to the specification and introduce new assumptions in complex cryptography. Finding a sufficiently efficient quantum-resistant SSLE implementation is also an open problem.

It may eventually turn out that the “marginal additional complexity” of SSLE only drops low enough once we introduce general-purpose zero-knowledge provers on L1 of the Ethereum protocol for other reasons (e.g. state trees, ZK-EVM).

Another option is to not bother with SSLE at all, and instead use extra-protocol mitigations (e.g. at the p2p layer) to address the DoS problem.

How does it interact with the rest of the roadmap?

If we add an attester-proposer separation (APS) mechanism such as execution tickets, then execution blocks (i.e. blocks containing Ethereum transactions) will not need SSLE, because we can rely on specialized block builders. However, consensus blocks (i.e. blocks containing protocol messages such as attestations, perhaps pieces of inclusion lists, etc.) would still benefit from SSLE.

Faster transaction confirmation

What problem are we solving?

There is value in further reducing Ethereum’s transaction confirmation time, for example from 12 seconds to 4 seconds. Doing so would significantly improve the user experience of both L1 and based rollups, while making DeFi protocols more efficient. It would also make it easier for L2s to decentralize, because it would allow a large class of L2 applications to run on based rollups, reducing the need for L2s to build their own committee-based decentralized sequencing.

What is it and how does it work?

There are roughly two techniques here:

Reduce the slot time, e.g. to 8 seconds or 4 seconds. This does not necessarily mean 4-second finality: finality inherently requires three rounds of communication, so we can make each round a separate block that will be at least tentatively confirmed after 4 seconds.

Allow proposers to publish preconfirmations over the course of a slot. In the extreme case, a proposer could include the transactions they see in their block in real time, and immediately publish a preconfirmation message for each transaction ("My first transaction is 0x1234...", "My second transaction is 0x5678..."). The case where a proposer publishes two conflicting preconfirmations can be handled in two ways: (i) by slashing the proposer, or (ii) by using attesters to vote on which one came first.
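Detecting conflicting preconfirmations is a simple equivocation check. The sketch below assumes a hypothetical message format of (slot, position in block, transaction hash) and leaves out signatures entirely; in practice each message would be signed by the proposer so that a conflict can serve as slashing evidence.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Preconfirmation:
    """Hypothetical message: 'my transaction at `position` in `slot` is `tx_hash`'."""
    slot: int
    position: int
    tx_hash: str


def find_conflicts(messages):
    """Map each (slot, position) that was promised to more than one
    transaction to the sorted list of conflicting transaction hashes."""
    promised = defaultdict(set)
    for message in messages:
        promised[(message.slot, message.position)].add(message.tx_hash)
    return {key: sorted(hashes) for key, hashes in promised.items() if len(hashes) > 1}


messages = [
    Preconfirmation(slot=10, position=0, tx_hash="0x1234"),
    Preconfirmation(slot=10, position=1, tx_hash="0x5678"),
    Preconfirmation(slot=10, position=1, tx_hash="0x9abc"),  # conflicts with the line above
]
print(find_conflicts(messages))
```

Repeating the same preconfirmation is harmless; only two different transactions promised the same position in the same slot count as a conflict.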

What are some links to existing research?

Based preconfirmations: https://ethresear.ch/t/based-preconfirmations/17353

Protocol Enforced Proposer Commitments (PEPC): https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879

Staggered periods across parallel chains (a 2018 idea for achieving low latency): https://ethresear.ch/t/staggered-periods/1793

What’s left to do, and what are the trade-offs?

It is unclear how practical it would be to reduce slot times. Even today, stakers in many parts of the world have a hard time getting their attestations included fast enough. Attempting 4-second slot times risks centralizing the validator set and, because of latency, making it impractical to be a validator outside of a few privileged regions.

The weakness of the proposer preconfirmation approach is that it greatly improves average-case inclusion time, but not worst-case inclusion time: if the current proposer is behaving well, your transaction will be preconfirmed in 0.5 seconds instead of being included in (on average) 6 seconds, but if the current proposer is offline or behaving poorly, you will still have to wait a full 12 seconds for the next slot to start and a new proposer to be available.

Additionally, there is an open question of how to incentivize preconfirmations. Proposers have an incentive to maximize their optionality for as long as possible. If attesters sign off on the timeliness of preconfirmations, then transaction senders could make part of their fee conditional on an immediate preconfirmation, but this would place an additional burden on attesters and could make it harder for them to continue to act as neutral "dumb pipes."

On the other hand, if we don’t try to do this and keep finality times at 12 seconds (or longer), the ecosystem will place more emphasis on pre-confirmation mechanisms enacted on Layer 2, and interactions across Layer 2 will take longer.

How does it interact with the rest of the roadmap?

Proposer-based preconfirmations realistically rely on an attester-proposer separation (APS) mechanism such as execution tickets. Otherwise, the pressure to provide real-time preconfirmations may be too much for regular validators to bear.

Other research areas

51% Attack Recovery

It is often assumed that if a 51% attack occurs (including attacks that cannot be cryptographically proven, such as censorship), the community will come together to implement a minority soft fork, ensuring that the good guys win and the bad guys are inactivity-leaked or slashed. However, this degree of reliance on the social layer is arguably unhealthy. We can try to reduce our reliance on the social layer by making the recovery process as automated as possible.

Full automation is impossible, because if it were possible, it would count as a >50% fault-tolerant consensus algorithm, and we already know the (very restrictive) mathematically provable limitations of such algorithms. But we can achieve partial automation: for example, clients could automatically refuse to accept a chain as finalized, or even as the head of the fork choice, if it censors transactions that the client has seen for long enough. A key goal is to ensure that the bad guys in an attack at least cannot get a quick, clean victory.
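The client-side rule mentioned above can be sketched as follows. Everything here is hypothetical: the function names, the use of plain timestamps, and the waiting threshold are illustrative assumptions, and a real client would also need to check that each pending transaction is valid and pays a sufficient fee before treating its absence as censorship.

```python
def censored_transactions(first_seen, included, now, max_wait):
    """Transactions the client saw more than `max_wait` seconds ago
    that the candidate chain still has not included."""
    return sorted(
        tx for tx, seen_at in first_seen.items()
        if tx not in included and now - seen_at > max_wait
    )


def chain_is_acceptable(first_seen, included, now, max_wait):
    """Refuse a chain that appears to censor long-pending transactions."""
    return not censored_transactions(first_seen, included, now, max_wait)


first_seen = {"0xaa": 0, "0xbb": 90}  # tx hash -> time first seen (seconds)
print(censored_transactions(first_seen, included=set(), now=100, max_wait=60))
print(chain_is_acceptable(first_seen, included={"0xaa"}, now=100, max_wait=60))
```

A rule like this cannot decide which chain is correct on its own, but it keeps honest clients from automatically following a censoring chain while the social layer responds.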

Increase quorum threshold

Today, a block is finalized if 67% of stakers support it. Some argue that this is too aggressive. In the entire history of Ethereum, there has been only one (very brief) failure of finality. If this percentage were increased to 80%, the number of additional non-finality periods would be relatively low, but Ethereum would gain security: in particular, many of the more contentious situations would result in a temporary halt of finality. This seems much healthier than an immediate win for the “wrong side”, whether that wrong side is an attacker or a buggy client.

This also answers the question of “what is the point of solo stakers?”. Today, most stakers already stake through pools, and it seems very unlikely that solo stakers could ever reach 51% of the staked ETH. However, it does seem possible for solo stakers to reach a quorum-blocking minority if we try, especially if the quorum is 80% (so a quorum-blocking minority only needs 21%). As long as solo stakers do not go along with a 51% attack (whether finality reversion or censorship), such an attack will not get a “clean win”, and solo stakers will be motivated to help organize a minority soft fork.
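The relationship between the quorum and the blocking minority is simple arithmetic, shown here in whole percentage points to match the figures in the text. The function name is illustrative.

```python
def blocking_minority_percent(quorum_percent: int) -> int:
    """Smallest whole-percent share of stake that, by withholding support,
    keeps the remaining stake below the quorum."""
    return 100 - quorum_percent + 1


for quorum in (67, 80):
    print(quorum, "->", blocking_minority_percent(quorum))
```

With the 67% quorum a blocking minority needs 34% of the stake; raising the quorum to 80% lowers that bar to 21%, which is the figure that makes a solo-staker blocking minority look reachable.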

Quantum resistance

Metaculus forecasts currently suggest, albeit with wide error bars, that quantum computers are likely to start breaking cryptography sometime in the 2030s.

Quantum computing experts, such as Scott Aaronson, have also recently started to think more seriously about the likelihood that quantum computers will actually work in the medium term. This has implications for the entire Ethereum roadmap: it means that every part of the Ethereum protocol that currently relies on elliptic curves will need some kind of hash-based or other quantum-resistant alternative. This specifically means that we cannot assume that we will be able to forever rely on the superior properties of BLS aggregation to process signatures from large validator sets. This justifies conservatism in performance assumptions for proof-of-stake designs, and is a reason to more aggressively develop quantum-resistant alternatives.
