Vitalik's new article: The possible future of Ethereum, the Merge

Democratization of staking, faster transaction confirmation, resistance to quantum attacks... What will be the future of Ethereum?
By: Vitalik Buterin
Compiled by Alex Liu, Foresight News
Special thanks to Justin Drake, Hsiao-wei Wang, @antonttc, Anders Elowsson, and Francesco for their feedback and reviews.
Originally, "the Merge" referred to the most important event in the history of the Ethereum protocol since its launch: the long-awaited and hard-won transition from Proof-of-Work (PoW) to Proof-of-Stake (PoS). Ethereum has now been a stable and functioning Proof-of-Stake system for almost two years, and this Proof-of-Stake has performed very well in terms of stability, performance, and avoidance of centralization risks. However, there are still some important areas for Proof-of-Stake to improve.
My 2023 roadmap breaks this down into several parts: improvements to technical features such as stability, performance, and accessibility for small validators, and economic changes to address centralization risks. The former takes over the title of “the Merge”, and the latter becomes part of “the Scourge”.
This post will focus on the “Merge” part: where can the technical design of Proof of Stake be improved, and what are the ways to achieve this goal?
This is not an exhaustive list of things that can be done with proof of stake; rather, it is a list of ideas that are being actively considered.
Single-Slot Finality and Democratizing Staking
What problem are we trying to solve?
Today, it takes 2-3 epochs (about 15 minutes) to finalize a block, and 32 ETH is required to become a staker. This was originally a compromise to balance three goals:
Maximize the number of validators that can participate in staking (which directly means minimizing the minimum ETH required to stake)
Minimize time to finalization
Minimize the overhead of running a node, which in this case is the cost of downloading, verifying, and rebroadcasting all other validators’ signatures
These three goals are in conflict with each other: in order to make economic finality possible (meaning: an attacker would need to burn a lot of ETH to reverse a finalized block), you need every validator to sign two messages every time finalization happens. So if you have a lot of validators, either you need a long time to process all their signatures, or you need very powerful nodes to process all signatures at the same time.
Note that this is all conditioned on one of Ethereum’s key goals: ensuring that even a successful attack is costly to the attacker. This is what the term “economic finality” means. If we did not have this goal, we could work around the problem by randomly selecting a committee to finalize each block. Chains that don’t try to achieve economic finality, such as Algorand, often do this. But the problem with this approach is that if an attacker does control 51% of the validators, they can perform an attack (revert a finalized block, censor, or delay finalization) at a very low cost: only the portion of their nodes that sit in the committee can be detected as participating in the attack and punished, either through slashing or a socially coordinated soft fork. This means that an attacker can attack the chain many times over, losing only a small fraction of their stake during each attack. So if we want economic finality, the simple committee-based approach doesn’t work, and at first glance, we do need full validator participation.
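The cost argument above can be made concrete with a rough calculation. The sketch below is only an illustration: the total stake, the attacker's share, and the committee size are hypothetical numbers, and it assumes (as the paragraph does) that only attacker validators sitting on the finalizing committee can be identified and penalized.

```python
def stake_at_risk(total_stake_eth, attacker_share, committee_fraction):
    """ETH the attacker exposes to penalties in a single attack.

    Only the attacker's validators that are part of the finalizing
    committee can be caught, so the exposure scales with the fraction
    of all stake that sits on the committee.
    """
    return total_stake_eth * attacker_share * committee_fraction


TOTAL_STAKE_ETH = 30_000_000  # hypothetical total stake
ATTACKER_SHARE = 0.51         # attacker controls 51% of validators

# Every validator signs: the attacker's whole stake is exposed.
full_participation = stake_at_risk(TOTAL_STAKE_ETH, ATTACKER_SHARE, 1.0)

# A random committee holding 1/1000 of the stake finalizes each block.
small_committee = stake_at_risk(TOTAL_STAKE_ETH, ATTACKER_SHARE, 1 / 1000)

# How many committee-style attacks cost as much as one full-participation attack.
attacks_affordable = full_participation / small_committee

print(f"full participation: {full_participation:,.0f} ETH at risk")
print(f"small committee:    {small_committee:,.0f} ETH at risk")
print(f"attacks for the same total loss: {attacks_affordable:,.0f}")
```

Under these assumed numbers, the committee design lets the attacker repeat the attack about a thousand times before losing what a single attack would cost with full participation.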
Ideally, we would like to maintain economic finality while improving the status quo in two areas:
Finalize blocks in a single slot (ideally, maintaining or even reducing the current 12-second slot length), rather than in 15 minutes
Allow validators to stake with 1 ETH (down from 32 ETH)
The first goal is justified by two objectives, both of which can be viewed as “aligning the properties of Ethereum with those of (more centralized) performance-focused L1 chains.”
First, it ensures that all Ethereum users actually benefit from the higher level of security guarantees achieved through the finality mechanism. Today, most users do not, because they are unwilling to wait 15 minutes; with single-slot finality, users will see their transactions finalized almost as soon as they are confirmed. Second, it simplifies the protocol and the surrounding infrastructure if users and applications don't have to worry about the possibility of chain reversals (except in relatively rare cases of inactivity leaks).
The second goal is because of the desire to support solo stakers. Poll after poll has repeatedly shown that the main factor preventing more people from staking solo is the 32 ETH minimum. Reducing the minimum to 1 ETH will solve this problem, and other issues will become the dominant factors limiting solo staking.
There is a challenge here: the goals of faster finality and more democratized staking both conflict with the goal of minimizing overhead. Indeed, this is the entire reason we didn’t start with single-slot finality. However, recent research suggests some possible avenues to address this problem.
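A back-of-the-envelope calculation shows why the overhead goal is under pressure. The sketch below assumes a hypothetical total stake of 32 million ETH, takes the worst case in which all stake is split into minimum-size deposits, and uses the figure stated earlier of two signed messages per validator each time finalization happens.

```python
def max_validators(total_stake_eth: int, min_deposit_eth: int) -> int:
    """Upper bound on validator count if every deposit is minimum-sized."""
    return total_stake_eth // min_deposit_eth


def signatures_per_slot(validators: int, messages_per_finalization: int = 2) -> int:
    """Signatures a node must handle per slot if every slot is finalized."""
    return validators * messages_per_finalization


TOTAL_STAKE_ETH = 32_000_000  # hypothetical total stake

# Worst-case per-slot signature load for a 32 ETH and a 1 ETH minimum deposit.
load = {
    deposit: signatures_per_slot(max_validators(TOTAL_STAKE_ETH, deposit))
    for deposit in (32, 1)
}

for deposit, sigs in load.items():
    print(f"min deposit {deposit:>2} ETH -> up to {sigs:,} signatures per slot")
```

With these assumptions, finalizing every slot at a 32 ETH minimum already means up to two million signatures per slot, and lowering the minimum to 1 ETH raises the ceiling 32-fold.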
What is it and how does it work?
Single-slot finality involves using a consensus algorithm that finalizes blocks within one slot. This is not a difficult goal in itself: many algorithms, such as Tendermint consensus, already do this. One desired property unique to Ethereum that Tendermint does not support is inactivity leaks, which allow the chain to keep running and eventually recover even if more than 1/3 of validators are offline. Fortunately, this has been addressed: there are proposals to modify Tendermint-style consensus to accommodate inactivity leaks.
Leading Single-Slot Finality Proposal
The harder part of the problem is figuring out how to make single-slot finality work with very high validator counts without incurring extremely high node operator overhead. To this end, there are a few leading solutions:
Option 1: Brute Force - Work towards a better signature aggregation protocol, perhaps using ZK-SNARKs, which would essentially allow us to handle signatures from millions of validators per slot.
Horn, one of the designs for a better aggregation protocol
Option 2: Orbit committees — A new mechanism that allows randomly selected medium-sized committees to be responsible for finalizing the chain, but in a way that preserves the attack cost properties we are looking for.
One way to think about Orbit SSF is that it opens up a space of compromise options, ranging from x=0 (Algorand-style committees with no economic finality) to x=1 (status quo Ethereum). Points in the middle let Ethereum keep enough economic finality to be very secure while needing only a moderately sized random sample of validators to participate in each slot, which is where the efficiency benefits come from.
Orbit exploits pre-existing heterogeneity in validator deposit sizes to obtain as much economic finality as possible while still giving small validators a role. In addition, Orbit uses slow committee rotation to ensure a high overlap between adjacent quorums, ensuring that its economic finality still applies to the boundaries of committee switching.
Option 3: Two-Tier Staking — A mechanism where there are two classes of stakers, one with a higher deposit requirement and one with a lower deposit requirement. Only the higher deposit tier would directly participate in providing economic finality. There are various proposals for exactly what rights and responsibilities the lower deposit tier would have (see, for example, the Rainbow staking post). Common ideas include:
The right to delegate staking to higher-level stakers
Some random low-level stakers are required to attest and finalize each block
The right to generate inclusion lists
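Of the options above, the Orbit idea of letting deposit size drive committee membership can be illustrated with a toy sampler. This is not the Orbit specification: the rule used here (a validator is included with probability min(1, balance / threshold)) is an assumption chosen for illustration, the balances and threshold are hypothetical, and slow committee rotation is not modeled.

```python
import random


def inclusion_probability(balance_eth: float, threshold_eth: float) -> float:
    """Assumed rule: large validators always sit on the committee,
    small ones are sampled in proportion to their balance."""
    return min(1.0, balance_eth / threshold_eth)


def sample_committee(balances, threshold_eth, rng):
    """Return indices of validators selected for one committee."""
    return [
        i for i, balance in enumerate(balances)
        if rng.random() < inclusion_probability(balance, threshold_eth)
    ]


# Hypothetical validator set: 10 consolidated validators and 1000 small ones.
balances = [2048] * 10 + [32] * 1000
THRESHOLD_ETH = 1024

rng = random.Random(0)
committee = sample_committee(balances, THRESHOLD_ETH, rng)

committee_stake = sum(balances[i] for i in committee)
print(f"{len(committee)} of {len(balances)} validators selected, "
      f"holding {committee_stake / sum(balances):.0%} of total stake")
```

The point of the sketch is that a committee far smaller than the full validator set can still carry a large share of the stake when deposit sizes are heterogeneous, while small validators keep a proportional chance of participating.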
What are the connections with existing research?
Paths toward single slot finality (2022): https://notes.ethereum.org/@vbuterin/single_slot_finality
A concrete proposal for a single slot finality protocol for Ethereum (2023): https://eprint.iacr.org/2023/280
Orbit SSF: https://ethresear.ch/t/orbit-ssf-solo-staking-friendly-validator-set-management-for-ssf/19928
Further analysis on Orbit-style mechanisms: https://ethresear.ch/t/vorbit-ssf-with-circular-and-spiral-finality-validator-selection-and-distribution/20464
Horn, signature aggregation protocol (2022): https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219
Signature merging for large-scale consensus (2023): https://ethresear.ch/t/signature-merging-for-large-scale-consensus/17386?u=asn
Signature aggregation protocol proposed by Khovratovich et al: https://hackmd.io/@7dpNYqjKQGeYC7wMlPxHtQ/BykM3ggu0#/
STARK-based signature aggregation (2022): https://hackmd.io/@vbuterin/stark_aggregation
Rainbow staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683
What else needs to be done and what trade-offs need to be made?
There are four main possible paths to choose from (we can also take hybrid paths):
(1) Maintain the status quo
(2) Brute-force SSF
(3) Orbit SSF
(4) SSF with two-tier staking
(1) means doing no work and leaving staking as it is, but this leaves Ethereum’s security experience and staking centralization properties worse than they could be.
(2) brute-forces the problem with advanced technology. To achieve this, a large number of signatures (more than 1 million) need to be aggregated in a very short period of time (5-10 seconds). One way to think about this approach is that it minimizes systemic complexity by going all-out on accepting encapsulated complexity.
(3) avoids “advanced technology” and solves the problem through clever rethinking of protocol assumptions: we relax the “economic finality” requirement so that attacks must still be expensive, but we accept that the attack cost may be 10 times lower than it is today (e.g., $2.5 billion instead of $25 billion). It is widely believed that Ethereum’s economic finality today is far greater than it needs to be, and that the main security risks are elsewhere, so this is arguably an acceptable sacrifice.
The main work is to verify that the Orbit mechanism is secure and has the properties we want, and then fully formalize and implement it. In addition, EIP-7251 (increase max effective balance) allows voluntary consolidation of validator balances, which immediately reduces chain verification overhead to some extent and effectively serves as the initial stage of Orbit's rollout.
(4) It avoids clever rethinking and advanced technology, but it creates a two-tier staking system that still has centralization risks. The risk depends largely on the specific rights obtained by the lower staking level. For example:
If lower-tier stakers need to delegate their attestation power to higher-tier stakers, then delegation can become centralized, so we end up with two highly centralized staking tiers.
If a random sample of the lower tier is required to approve each block, then an attacker can spend a very small amount of ETH to prevent finality.
If lower-tier stakers can only generate inclusion lists, then the attestation layer may remain centralized, at which point a 51% attack on the attestation layer can censor the inclusion lists themselves.
It is possible to combine multiple strategies, for example:
(1 + 2): Use brute-force techniques to reduce the minimum deposit size without single-slot finality. The amount of aggregation required is 64 times less than in the pure (2) case, so the problem becomes easier.
(1 + 3): Add Orbit without single-slot finality
(2 + 3): Use conservative parameters (e.g. a 128k validator committee instead of 8k or 32k) for Orbit SSF, and use brute-force techniques to make it super efficient.
(1 + 4): Add rainbow staking without single-slot finality
How does it interact with the rest of the roadmap?
Among other benefits, single-slot finality reduces the risk of certain types of multi-block MEV attacks. Additionally, attester-proposer separation designs and the rest of the in-protocol block production pipeline would need to be designed differently in a single-slot finality world.
The weakness of brute force strategies is that they make it more difficult to reduce slot time.
Single secret leader election
What problem are we trying to solve?
Today, which validator will propose the next block is known in advance. This creates a security vulnerability: an attacker can monitor the network, identify which validators correspond to which IP addresses, and perform a DoS attack on each validator when they are about to propose a block.
What is it and how does it work?
The best way to solve the DoS problem is to hide information about which validator will produce the next block, at least until that block is actually produced. Note that this is easy if we remove the "single" requirement: one solution is to let anyone create the next block, but require that the RANDAO reveal be less than 2^256 / N, where N is the number of validators. On average, only one validator will be able to meet this requirement, but sometimes there will be two or more, and sometimes there will be zero. Combining the "secrecy" requirement with the "single" requirement has long been a hard problem.
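The "secret but not single" rule above is easy to simulate. In the sketch below, a SHA-256 hash of a per-validator secret and the slot number stands in for the RANDAO reveal; that substitution, the validator secrets, and the helper names are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def reveal_value(validator_secret: bytes, slot: int) -> int:
    """Stand-in for a RANDAO reveal, read as a 256-bit integer."""
    digest = hashlib.sha256(validator_secret + slot.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def eligible_proposers(secrets, slot):
    """Validators whose reveal falls below 2^256 / N for this slot."""
    threshold = 2**256 // len(secrets)
    return [i for i, s in enumerate(secrets) if reveal_value(s, slot) < threshold]


# Hypothetical set of 200 validators, simulated over 500 slots.
secrets = [i.to_bytes(4, "big") for i in range(200)]
SLOTS = 500
histogram = Counter(len(eligible_proposers(secrets, slot)) for slot in range(SLOTS))

for count in sorted(histogram):
    print(f"{count} eligible proposer(s): {histogram[count]} slots")
```

Nobody can tell in advance who is eligible, but the number of eligible validators per slot is roughly Poisson-distributed with mean 1, so a substantial share of slots end up with no proposer or with several. That gap is what the "single" requirement is meant to close.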
The single secret leader election protocol solves this problem by using some cryptography to create a “blinded” validator ID for each validator, and then giving many proposers the chance to shuffle and re-blind the pool of blinded IDs (this is similar to how a mixnet works). During each slot, a random blinded ID is chosen. Only the owner of that blinded ID can generate a valid proof to propose the block, but no one else knows which validator that blinded ID corresponds to.
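A toy version of the shuffle-and-re-blind idea, in the spirit of shuffle-based proposals such as Whisk, might look as follows. Everything here is an assumption for illustration: the group (integers modulo a small Mersenne prime), the generator, and the helper names are hypothetical and insecure, and the zero-knowledge proofs that a real protocol needs to show each shuffle was done honestly are omitted entirely.

```python
import random

P = 2**127 - 1  # toy prime modulus; NOT a secure choice of group
G = 3           # toy generator


def make_tracker(secret_k, rng):
    """Blinded ID: a pair (a, a^k) for a random base a."""
    a = pow(G, rng.randrange(2, P - 1), P)
    return (a, pow(a, secret_k, P))


def rerandomize(tracker, rng):
    """Raise both halves to a random power; the a -> a^k relation survives."""
    z = rng.randrange(2, P - 1)
    a, b = tracker
    return (pow(a, z, P), pow(b, z, P))


def shuffle_round(trackers, rng):
    """One proposer's turn: re-blind every tracker, then permute the pool."""
    out = [rerandomize(t, rng) for t in trackers]
    rng.shuffle(out)
    return out


def owns(tracker, secret_k):
    """Only the holder of k can recognize their own tracker."""
    a, b = tracker
    return pow(a, secret_k, P) == b


rng = random.Random(7)
keys = [rng.randrange(2, P - 1) for _ in range(8)]  # validators' secrets
trackers = [make_tracker(k, rng) for k in keys]
for _ in range(3):                                  # three proposers shuffle
    trackers = shuffle_round(trackers, rng)

chosen = trackers[rng.randrange(len(trackers))]     # randomly chosen blinded ID
owners = [i for i, k in enumerate(keys) if owns(chosen, k)]
print(f"validators able to claim the chosen ID: {len(owners)}")
```

After the shuffles, observers see only fresh-looking pairs in a new order, while each validator can still find its own tracker by testing it against its secret.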
What are the connections with existing research?
Paper by Dan Boneh (2020): https://eprint.iacr.org/2020/025.pdf
Whisk (concrete proposal for Ethereum, 2022): https://ethresear.ch/t/whisk-a-practical-shuffle-based-ssle-protocol-for-ethereum/11763
Single secret leader election tag on ethresear.ch: https://ethresear.ch/tag/single-secret-leader-election
Simplified SSLE using ring signatures: https://ethresear.ch/t/simplified-ssle/12315
What else needs to be done and what trade-offs need to be made?
Really, all that remains is to find and implement a protocol that is simple enough that we are comfortable implementing it on mainnet. We place a high value on Ethereum being a reasonably simple protocol, and we don’t want complexity to increase further. The SSLE implementations we’ve seen add hundreds of lines of spec code and introduce new assumptions in complicated cryptography. Finding a sufficiently efficient quantum-resistant SSLE implementation is also an open problem.
It may eventually be the case that once we take the plunge and introduce mechanisms for general zero-knowledge proofs in the Ethereum protocol at L1 for other reasons (e.g. state tries, ZK-EVM), the additional complexity introduced by SSLE will be low enough.
Another option is to not consider SSLE at all and use extra-protocol mitigations (e.g., at the p2p layer) to address the DoS problem.
How does it interact with the rest of the roadmap?
If we add an attester-proposer separation (APS) mechanism, e.g. execution tickets, then execution blocks (i.e. blocks containing Ethereum transactions) will not need SSLE, as we can rely on specialized block builders. However, we will still benefit from SSLE for consensus blocks (i.e. blocks containing protocol messages, such as attestations, perhaps pieces of inclusion lists, etc.).
Faster transaction confirmation
What problem are we trying to solve?
It would be valuable to reduce Ethereum's transaction confirmation time further, from 12 seconds to, for example, 4 seconds. Doing so would improve the user experience of both L1 and based rollups while making DeFi protocols more efficient. It would also make L2s more decentralized, as it would allow a large class of L2 applications to work on based rollups, reducing the need for L2s to build their own committee-based decentralized sequencing.
What is it and how does it work?
Reduce the slot time, e.g. to 8 seconds or 4 seconds. This does not necessarily mean 4-second finality: finality inherently requires three rounds of communication, so we can make each round of communication a separate block, which will at least be tentatively confirmed after 4 seconds.
Allow proposers to publish preconfirmations over the course of a slot. In the extreme case, proposers can add transactions that they see in real time to their own blocks and immediately publish a preconfirmation message for each transaction ("My first transaction is 0x1234...", "My second transaction is 0x5678..."). The situation where a proposer publishes two conflicting preconfirmations can be handled in two ways: (i) slashing the proposer, or (ii) using attesters to vote on which one came earlier.
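Detecting the conflicting-preconfirmation case described above amounts to spotting two messages from the same proposer for the same slot and position that name different transactions. The sketch below is a minimal illustration: the message fields and function name are hypothetical, and signature verification, which a real scheme would need before treating a conflict as slashable, is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preconfirmation:
    proposer: int   # validator index of the slot's proposer
    slot: int
    position: int   # claimed index of the transaction within the block
    tx_hash: str


def find_equivocations(messages):
    """Return pairs of preconfirmations that claim the same
    (proposer, slot, position) for different transactions."""
    first_seen = {}
    conflicts = []
    for msg in messages:
        key = (msg.proposer, msg.slot, msg.position)
        earlier = first_seen.setdefault(key, msg)
        if earlier.tx_hash != msg.tx_hash:
            conflicts.append((earlier, msg))
    return conflicts


messages = [
    Preconfirmation(proposer=1, slot=10, position=0, tx_hash="0x1234"),
    Preconfirmation(proposer=1, slot=10, position=1, tx_hash="0x5678"),
    Preconfirmation(proposer=1, slot=10, position=0, tx_hash="0x9999"),  # conflicts with the first
]

for earlier, later in find_equivocations(messages):
    print(f"slot {earlier.slot}, position {earlier.position}: "
          f"{earlier.tx_hash} vs {later.tx_hash}")
```

Under option (i), a pair like this would serve as the evidence for slashing; under option (ii), attesters would instead vote on which of the two messages they saw first.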
What are the connections with existing research?
Based preconfirmations: https://ethresear.ch/t/based-preconfirmations/17353
Protocol-enforced proposer commitments (PEPC): https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879
Staggered periods across parallel chains (a 2018-era idea for achieving low latency): https://ethresear.ch/t/staggered-periods/1793
What else needs to be done and what trade-offs need to be made?
It is not clear whether reducing slot times is practical. Even today, stakers in many parts of the world have difficulty getting their attestations included fast enough. Attempting 4-second slot times risks centralizing the validator set and making it impractical, because of latency, to be a validator outside of a few developed regions. Specifically, moving to 4-second slots would require reducing the network latency (“delta”) limit to two seconds.
The downside of the proposer preconfirmation approach is that it can greatly improve average-case inclusion time, but does not improve the worst-case scenario: if the current proposer is performing well, the transaction will be preconfirmed in 0.5 seconds instead of being included in 6 seconds (on average), but if the current proposer is offline or performing poorly, you still have to wait a full 12 seconds before the next slot can be started and a new proposer is available.
Furthermore, how to incentivize preconfirmations is also an open question. Proposers have an incentive to keep their options open for as long as possible. If attesters sign off on the timeliness of preconfirmations, then transaction senders can condition some of their fees on immediate preconfirmation, but this would impose an additional burden on attesters and could make it harder for them to continue operating as neutral "dumb pipes."
On the other hand, if we don’t try to do this and keep finality times at 12 seconds (or longer), the ecosystem will place more emphasis on L2 pre-confirmation mechanisms and cross-L2 interactions will take longer.
How does it interact with the rest of the roadmap?
Proposer-based pre-confirmation realistically depends on an attester-proposer separation (APS) mechanism, such as execution tickets. Otherwise, the pressure to provide real-time pre-confirmations may be too centralizing for regular validators.
How short the slot time can be also depends on the slot structure, which in turn depends heavily on which version of APS, inclusion lists, etc. we end up implementing. Some slot structures contain fewer rounds and are therefore friendlier to short slot times, but they make trade-offs elsewhere.
Other research areas
51% Attack Recovery
It is often assumed that if a 51% attack occurs (including attacks that cannot be cryptographically proven, such as censorship), the community will come together to implement a minority soft fork to ensure that the good guys win and the bad guys get inactivity leaked or slashed. However, this over-reliance on the social layer is arguably unhealthy. We can try to reduce the reliance on the social layer by making the recovery process as automated as possible.
Full automation is impossible, because if it were possible, it would amount to a >50% fault-tolerant consensus algorithm, and we already know the (very strict) mathematically provable limitations of such algorithms. But we can achieve partial automation: for example, a client could automatically refuse to accept a chain as finalized, or even as the head of the fork choice, if that chain censors transactions the client has seen for long enough. A key goal is to ensure that the bad guys in an attack at least cannot win quickly and cleanly.
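The partial-automation rule mentioned above can be sketched as a simple client-side check. This is only an illustration of the idea: the function names, the patience window, and the tolerance parameter are hypothetical, and a realistic rule would also have to account for transaction validity, fees, and available block space.

```python
def censored_transactions(seen_at, included, now, patience_seconds):
    """Transactions the client first saw at least `patience_seconds` ago
    that the candidate chain still has not included."""
    return {
        tx for tx, first_seen in seen_at.items()
        if tx not in included and now - first_seen >= patience_seconds
    }


def should_reject_chain(seen_at, included, now, patience_seconds, max_censored=0):
    """Refuse to follow a chain that ignores too many long-pending transactions."""
    overdue = censored_transactions(seen_at, included, now, patience_seconds)
    return len(overdue) > max_censored


# Hypothetical local view: tx hash -> time (seconds) the client first saw it.
seen_at = {"0xaa": 0, "0xbb": 90, "0xcc": 0}
included = {"0xcc"}  # transactions present in the candidate chain

print(censored_transactions(seen_at, included, now=100, patience_seconds=60))
print(should_reject_chain(seen_at, included, now=100, patience_seconds=60))
```

Here `0xaa` has been pending for 100 seconds without inclusion, so the client would decline to treat the candidate chain as its head, while `0xbb` is still within the patience window.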
Raising the quorum threshold
Today, a block is finalized if 67% of stakers support it. Some argue that this is too aggressive. In the entire history of Ethereum, there has only been one (very brief) failure of finality. If this percentage were increased, e.g. to 80%, the number of additional non-finality periods would be relatively low, but Ethereum would gain safety properties: in particular, many of the more contentious situations would result in a temporary halt of finality. This seems much healthier than an immediate win for the “wrong side”, whether the wrong side is an attacker or a buggy client.
This also answers the question, “What’s the point of being a solo staker?” Today, most stakers already stake via pools, and it seems unlikely that solo stakers could ever reach 51% of the staked ETH. However, if we work hard enough, it seems achievable for solo stakers to reach a quorum-blocking minority, especially if the quorum is 80% (so only 21% is needed to block it). As long as solo stakers do not go along with a 51% attack (whether finality reversion or censorship), such an attack will not be a “clean win”, and solo stakers will have an incentive to help organize a minority soft fork.
Note that there is an interaction between quorum thresholds and the Orbit mechanism: if we end up using Orbit, then exactly what “21% of stakers” means becomes a more complex question, and depends in part on the distribution of validators.
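The thresholds discussed in this section follow from simple quorum arithmetic, sketched below. The function names are hypothetical; the two formulas are the standard quorum-intersection facts (a share of stake strictly larger than 1 - q can block a quorum of q, and two conflicting quorums of size q must overlap in at least 2q - 1 of the stake).

```python
from fractions import Fraction


def blocking_threshold(quorum: Fraction) -> Fraction:
    """Finality halts once MORE than this share of stake withholds support."""
    return 1 - quorum


def min_overlap_for_conflict(quorum: Fraction) -> Fraction:
    """Minimum share of stake that must sign both sides for two
    conflicting blocks to each reach the quorum."""
    return 2 * quorum - 1


for label, quorum in (("2/3 (today)", Fraction(2, 3)), ("80%", Fraction(4, 5))):
    print(f"quorum {label}: blocking needs more than "
          f"{float(blocking_threshold(quorum)):.1%} of stake, "
          f"conflicting finality needs at least "
          f"{float(min_overlap_for_conflict(quorum)):.1%} signing both sides")
```

Raising the quorum from 2/3 to 80% lowers the share needed to halt finality from just over a third to just over a fifth, which is where the 21% figure comes from, and raises the share that must double-sign to finalize conflicting blocks from a third to 60%.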
Quantum resistance
Metaculus currently believes that quantum computers could begin to crack cryptography sometime in the 2030s, although the error bars are wide.
Quantum computing experts such as Scott Aaronson have also recently started taking the possibility of quantum computers working effectively in the medium term more seriously. This has implications for the entire Ethereum roadmap: it means that every part of the Ethereum protocol that currently relies on elliptic curves needs to have some hash-based or other quantum-resistant replacement. This specifically means that we cannot assume that we can always rely on the superior properties of BLS aggregation to handle signatures from large validator sets. This justifies the conservatism of assumptions around the performance of proof-of-stake designs, and is a reason to be more proactive in developing quantum-resistant alternatives.