Author: Ethereum founder Vitalik; Translated by: Deng Tong, Golden Finance

Note: This article is the third part of Ethereum founder Vitalik's recent series "Possible futures of the Ethereum protocol"; part 3 is titled "The Scourge". For the second part, see Golden Finance's "Vitalik: How should the Ethereum protocol develop during the Surge phase", and for the first part, see "What else can be improved in Ethereum PoS". The following is the full text of the third part:

Special thanks to Justin Drake, Caspar Schwarz-Schilling, Phil Daian, Dan Robinson, Charlie Noyes, and Max Resnick for their feedback and review, and the ethstakers community for discussion.

One of the biggest risks facing Ethereum L1 is the centralization of proof-of-stake due to economic pressure. If there are economies of scale in participating in the core proof-of-stake mechanism, this naturally leads to large stakers dominating and small stakers exiting to join large pools. This raises the risk of 51% attacks, transaction censorship, and other failure modes. In addition to the risk of centralization, there is also the risk of value extraction: a small group of people capturing value that would otherwise flow to Ethereum's users.

Our understanding of these risks has improved significantly over the past year. It is now well understood that this risk exists in two key places: (i) block construction, and (ii) staking capital provision. Larger participants can run more sophisticated algorithms ("MEV extraction") to produce blocks, giving them higher block revenue. Large participants can also deal with the inconvenience of locked capital more efficiently, by releasing it to others as liquid staking tokens (LSTs). Beyond the direct question of small stakers vs. large stakers, there is also the question of whether too much ETH is (or will be) staked.

The Scourge, 2023 roadmap

This year has seen significant progress in block construction, most notably convergence on "committee inclusion lists plus some targeted ordering solutions" as the ideal approach, as well as significant research into proof-of-stake economics, including ideas such as two-tier staking models and reducing issuance to cap the percentage of ETH staked.

Strengthening the block construction pipeline

What problem are we trying to solve?

Today, Ethereum block construction is primarily done through an out-of-protocol proposer-builder separation scheme called MEVBoost. When validators get the opportunity to propose a block, they delegate the work of selecting the block's contents to specialized participants called builders. The task of selecting block contents that maximize revenue is economies-of-scale-intensive: specialized algorithms are required to determine which transactions to include in order to extract as much value as possible from on-chain financial instruments and the transactions of users interacting with them (this is known as "MEV extraction"). Validators are left with the relatively economies-of-scale-light "dumb-pipe" task of listening for bids and accepting the highest one, along with other responsibilities such as attestation.

A stylized diagram of what MEVBoost does: specialized builders take on the tasks in red, stakers take on the tasks in blue.

There are multiple versions of this separation, including "proposer-builder separation" (PBS) and "attester-proposer separation" (APS). The difference between them has to do with fine-grained details about which responsibility falls to which of the two participants: roughly speaking, in PBS, validators still propose blocks but receive payloads from builders, while in APS the entire slot becomes the builder's responsibility. Lately, APS has been favored over PBS because it further reduces the incentive for proposers to co-locate with builders. Note that APS only applies to execution blocks containing transactions; consensus blocks containing proof-of-stake related data (such as attestations) will still be randomly assigned to validators.

This separation of powers helps keep validators decentralized, but it has an important cost: it’s easy for participants who perform “specialized” tasks to become very centralized. Here’s what Ethereum blocks look like today:

Two actors are choosing the contents of about 88% of Ethereum blocks. What if these two actors decided to censor transactions? The answer is not as bad as it seems: they cannot reorganize blocks, so you do not need 51% of block producers censoring to prevent a transaction from ever being included; you need 100%. With 88% censorship, a user would have to wait an average of nine slots to be included (technically, an average of 114 seconds, not six). For some use cases, waiting two or even five minutes for some transactions is fine. But for other use cases, such as DeFi liquidations, the ability to delay the inclusion of someone else's transaction by even a few blocks is a significant market manipulation risk.

Strategies that block builders can use to maximize revenue can also have other negative effects on users. “Sandwich attacks” can cause users who trade tokens to suffer significant losses due to slippage. Transactions introduced to clog the chain for these attacks increase gas prices for other users.

What is it and how does it work?

The leading solution is to decompose the block production task further: we put the task of selecting transactions back into the hands of the proposer (i.e. the staker), while the builder can only choose the ordering and insert some of its own transactions. This is what inclusion lists aim to do.

At time T, randomly selected stakers create an inclusion list, a list of transactions that are valid given the current state of the blockchain at that time. At time T+1, block builders (perhaps selected in advance through an in-protocol auction mechanism) create a block. The block needs to include every transaction in the inclusion list, but they can choose the order and can add their own transactions.
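The enforcement rule just described can be sketched as a simple validity check. This is a toy model with illustrative names, not actual consensus-spec code:

```python
# Toy model of inclusion-list enforcement (illustrative, not the actual spec).
# At slot T, a randomly chosen staker publishes an inclusion list; at slot T+1,
# the builder's block is valid only if every listed transaction appears in it.

def build_inclusion_list(mempool: list[str], state_valid) -> list[str]:
    """Staker at time T: pick transactions valid against the current state."""
    return [tx for tx in mempool if state_valid(tx)]

def block_is_valid(block_txs: list[str], inclusion_list: list[str]) -> bool:
    """Rule at time T+1: the builder may reorder and add its own transactions,
    but must include everything on the list."""
    block_set = set(block_txs)
    return all(tx in block_set for tx in inclusion_list)

mempool = ["tx_a", "tx_b", "tx_c"]
il = build_inclusion_list(mempool, state_valid=lambda tx: tx != "tx_c")
assert il == ["tx_a", "tx_b"]
assert block_is_valid(["builder_tx", "tx_b", "tx_a"], il)   # reordered: fine
assert not block_is_valid(["builder_tx", "tx_a"], il)       # tx_b censored: invalid
```

The key property is that the builder's freedom is reduced to ordering and insertion; outright exclusion makes the block invalid.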

Fork-Choice-enforced Inclusion List (FOCIL) proposals involve a committee of multiple inclusion list creators per block. To delay a transaction by one block, k of the k inclusion list creators (e.g. k = 16) would all have to censor it. The combination of FOCIL with an auction-selected final proposer, who is required to include the inclusion list but can reorder it and add new transactions, is often referred to as "FOCIL + APS".

Another approach to this problem is the multiple concurrent proposer (MCP) class of schemes, such as BRAID. Rather than splitting the block proposer role into a low-economies-of-scale part and a high-economies-of-scale part, BRAID tries to distribute the block production process across many participants, so that each individual proposer only needs a moderate level of sophistication to maximize their revenue. MCP works by having k parallel proposers each generate a list of transactions, and then using a deterministic algorithm (e.g., ordering by priority fee from high to low) to choose the order.
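The deterministic merge step of an MCP scheme can be sketched as follows. This is a toy model; the function name and the tie-breaking rule are illustrative assumptions, not BRAID's actual specification:

```python
# Toy model of multiple concurrent proposers (MCP): k proposers each submit a
# transaction list; the union is ordered by a deterministic rule (here,
# priority fee high-to-low, hash as tie-breaker). Illustrative only.

def merge_proposals(proposals: list[list[tuple[str, int]]]) -> list[str]:
    """Each proposal is a list of (tx_hash, priority_fee). Deduplicate across
    proposers, then sort by fee descending so no single proposer controls order."""
    seen: dict[str, int] = {}
    for proposal in proposals:
        for tx_hash, fee in proposal:
            seen[tx_hash] = max(fee, seen.get(tx_hash, 0))
    return [tx for tx, _ in sorted(seen.items(), key=lambda kv: (-kv[1], kv[0]))]

txs = merge_proposals([
    [("tx_a", 5), ("tx_b", 2)],   # proposer 1
    [("tx_b", 2), ("tx_c", 9)],   # proposer 2 (tx_b seen twice: deduplicated)
])
assert txs == ["tx_c", "tx_a", "tx_b"]
```

Because the final ordering is a pure function of the union of submitted lists, no individual proposer gains ordering power by being more sophisticated than the others.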

BRAID does not seek to achieve the goal of having a dumb-pipe block proposer running the default software being optimal. There are two well-understood reasons why it cannot do so:

  • Late-mover arbitrage attack: Assume that the average time for proposers to submit is T, and the last time you could submit and still get included is about T+1. Now, suppose that on a centralized exchange, the ETH/USDC price goes from $2500 to $2502 between T and T+1. The proposer can wait an extra second and then add an additional transaction to arbitrage the on-chain DEX, making a profit of up to $2 per ETH. Sophisticated proposers with good connections to the network are more able to do this.

  • Exclusive order flow: Users have an incentive to send trades directly to a single proposer to minimize their vulnerability to front-running and other attacks. Experienced proposers have an advantage because they can build infrastructure to accept these trades directly from users, and they have a stronger reputation so users sending them trades can trust that the proposer will not betray and front-run (this can be mitigated by using trusted hardware, but trusted hardware has its own trust assumptions.)

In BRAID, attesters can still be split out and run as a dumb-pipe function.

Beyond these two extremes, there are a range of possible designs that fall in between. For example, you could auction actors who only have the right to append to blocks, but not the right to reorder or prepend. You could even have them append or prepend, but not be able to insert themselves in the middle or reorder. The appeal of these techniques is that the winners of auction markets can be very concentrated, so reducing their authority has a lot of benefits.

Encrypted mempools

One technology that is critical to the successful implementation of many of these designs (specifically, the BRAID or APS versions where there are strict restrictions on the auction functionality) is encrypted mempools. Encrypted mempools are a technique where users broadcast their transactions in encrypted form, along with some kind of proof of validity, and the transactions are included in blocks in encrypted form without the block builders knowing their contents. The contents of the transactions are published later.

The main challenge in implementing an encrypted mempool is coming up with a design that ensures all transactions are later disclosed: a simple "commit and reveal" scheme does not work, because if disclosure is voluntary, the choice to disclose or not is itself a "last mover" form of influence on the block that can be exploited. The two main candidate techniques are (i) threshold decryption and (ii) delay encryption, a primitive closely related to verifiable delay functions (VDFs).
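The overall flow can be illustrated with a toy model. The single committee key and the XOR stream cipher here are stand-ins: real designs use threshold decryption or delay encryption precisely so that no single party holds the key:

```python
import hashlib

# Toy encrypted-mempool flow (illustrative; real designs use threshold
# decryption or delay encryption, not a single trusted key holder).
# Users broadcast ciphertexts; the builder orders ciphertexts without seeing
# their contents; plaintexts are revealed only after the block is fixed.

def keystream(key: bytes, n: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(plaintext, keystream(key, len(plaintext))))

decrypt = encrypt  # an XOR stream cipher is its own inverse

# 1. User encrypts a transaction; only the ciphertext enters the mempool.
key = b"epoch-key-held-by-threshold-committee"  # stand-in for the committee key
ct = encrypt(key, b"swap 1 ETH for USDC")
# 2. Builder commits to an ordering of ciphertexts it cannot read.
block = [ct]
# 3. After the block is fixed, the key is released and contents decrypt.
assert decrypt(key, block[0]) == b"swap 1 ETH for USDC"
```

The point of (i) and (ii) is to replace step 3's trusted key release with either a threshold committee or the mere passage of time, so disclosure is guaranteed rather than voluntary.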

What are the connections with existing research?

MEV and builder centralization explained: https://vitalik.eth.limo/general/2024/05/17/decentralization.html#mev-and-builder-dependence

MEVBoost: https://github.com/flashbots/mev-boost

Enshrined PBS (early proposed solution to these problems): https://ethresear.ch/t/why-enshrine-proposer-builder-separation-a-viable-path-to-epbs/15710

Mike Neuder's reading list on inclusion lists: https://gist.github.com/michaelneuder/dfe5699cb245bc99fbc718031c773008

Inclusion list EIP: https://eips.ethereum.org/EIPS/eip-7547

FOCIL: https://ethresear.ch/t/fork-choice-enforced-inclusion-lists-focil-a-simple-committee-based-inclusion-list-proposal/19870

Max Resnick's BRAID demo: https://www.youtube.com/watch?v=mJLERWmQ2uw

“Priority is all you need” by Dan Robinson: https://www.paradigm.xyz/2024/06/priority-is-all-you-need

About the multi-proposer gadget and protocol: https://hackmd.io/xz1UyksETR-pCsazePMAjw

VDF Research: https://vdfresearch.org/

Verifiable delay functions and attacks (focused on the RANDAO setting, but also applicable to encrypted mempools): https://ethresear.ch/t/verifiable-delay-functions-and-attacks/2365

MEV capture and decentralization in execution tickets: https://www.arxiv.org/pdf/2408.11255

APS Centralization: https://arxiv.org/abs/2408.03116

Multi-block MEV and inclusion list: https://x.com/_charlienoyes/status/1806186662327689441

What else needs to be done and what trade-offs need to be made?

We can think of all of the above as different ways of dividing up the authority involved in staking, arranged on a spectrum from lower economies of scale ("dumb-pipe") to higher economies of scale ("specialization-friendly"). Prior to 2021, all of these authorities were bundled into a single participant:

The core dilemma is this: any meaningful power that remains in the hands of stakers is likely to end up being "MEV-relevant" power. We want a highly decentralized set of actors to have as much power as possible; this means (i) handing a lot of power to stakers, and (ii) ensuring that stakers are as decentralized as possible, which means they have few economies-of-scale-driven incentives to consolidate. This is a difficult tension to navigate.

A particular challenge is multi-block MEV: in some cases, execution auction winners can make more money if they capture multiple slots in a row and do not allow any MEV-relevant transactions in any block other than the last one they control. If inclusion lists force them to include such transactions, they can try to get around this by not publishing any blocks at all during those slots. One could make inclusion lists unconditional, becoming the block directly if the builder does not provide one, but this makes the inclusion list itself MEV-relevant. The solution here may involve some compromise, including accepting some low-level incentive to bribe people to get transactions into inclusion lists.

We can view FOCIL + APS as follows: stakers retain the authority on the left side of the spectrum, while the right side of the spectrum is auctioned off to the highest bidder.

BRAID is quite different. The "staker" portion of the spectrum is larger, but it is split into two parts: light stakers and heavy stakers. Meanwhile, because transactions are ordered in decreasing order of priority fee, the choice of the top of the block is effectively auctioned off through the fee market, which can be seen as analogous to PBS.

Note that BRAID's security depends heavily on an encrypted mempool; otherwise, the top-of-block auction mechanism is vulnerable to strategy-stealing attacks (essentially: copy someone else's transaction, swap the recipient address, and pay a 0.01% higher fee). This need for pre-inclusion privacy is also what makes PBS so tricky to implement.

Finally, there are more "aggressive" versions of FOCIL + APS, e.g. a version where APS only determines the end of the block:

The main remaining tasks are (i) working to consolidate the various proposals and analyze their consequences, and (ii) combining this analysis with an understanding of the Ethereum community's goals, i.e. what forms of centralization it will tolerate. There is also work to be done on each individual proposal, such as:

  • Continue work on encrypted mempool designs, and get to the point where we have a design that is robust, reasonably simple, and plausibly ready for inclusion.

  • Optimize the design of inclusion lists to ensure that (i) no data is wasted, especially in the context of inclusion lists covering blobs, and (ii) they are friendly to stateless validators.

  • More work on optimal auction design for APS.

Additionally, it’s worth noting that these different proposals are not necessarily incompatible forks in the road. For example, implementing FOCIL + APS could easily serve as a stepping stone to implementing BRAID. An effective conservative strategy is a “wait and see” approach: first implement a solution where stakers’ authority is limited and most authority is auctioned off, then slowly increase stakers’ authority over time as we learn more about how the MEV market operates on a live network.

In particular, the centralization bottlenecks of staking are:

  • Centralization of block construction (this section)

  • Centralization of staking for economic reasons (next section)

  • Staking centralization due to the 32 ETH minimum (solved via Orbit or other techniques; see the post on the Merge)

  • Staking centralization due to hardware requirements (solved in the Verge with stateless clients and, later, ZK-EVMs)

Solving any one of these four problems will increase the benefits of solving any of the others.

Additionally, there is an interaction between block construction pipelines and single-slot finality designs, especially when trying to reduce slot times: many block construction pipeline designs end up increasing slot times, and many involve giving attesters a role at multiple steps of the process. For this reason, it is worth thinking about block construction pipelines and single-slot finality together.

Fixing staking economics

What problem are we trying to solve?

Today, about 30% of the ETH supply is actively staked. This is more than enough to protect Ethereum from a 51% attack. If the proportion of staked ETH grows much larger, researchers worry about a different scenario: the risks that arise if almost all ETH is staked. These risks include:

  • Staking goes from being a lucrative task for specialists to a duty for all ETH holders. As a result, ordinary stakers would be much less enthusiastic and would choose the easiest approach (in practice, delegating their tokens to whichever centralized operator is most convenient)

  • If almost all ETH is staked, the credibility of the slashing mechanism will be weakened

  • A single liquid staking token could take over the majority of stake, and even capture the “money” network effect of ETH itself

  • Ethereum is unnecessarily issuing ~1M additional ETH per year. In a scenario where one liquidity staking token gains dominant network effects, a large portion of that value could even be captured by LST.

What is it and how does it work?

Historically, one class of solutions has been: if nearly everyone staking is inevitable, and liquid staking tokens are inevitable, then let’s make staking friendly and have a liquid staking token that is actually trustless, neutral, and maximally decentralized. A simple approach would be to cap staking penalties at 1/8 of the stake, which would make 7/8 of staked ETH non-slashable and therefore eligible to be put into the same liquid staking token. Another option would be to explicitly create two tiers of staking: “risk-bearing” (slashable) staking and “risk-free” (non-slashable) staking.

However, one criticism of this approach is that it seems economically equivalent to something much simpler: drastically reduce issuance if stake approaches some predetermined cap. The basic argument is: if we end up in a world where the risk-taking layer has a 3.4% return and the risk-free layer (where everyone participates) has a 2.6% return, that’s effectively the same world as a world where staking ETH has a 0.8% return and just holding ETH has a 0% return. The dynamics of the risk-taking layer, including the total amount staked and the degree of centralization, are the same in both cases. So we should do the simple thing and reduce issuance.

The main rebuttal to this argument turns on whether we can give the “risk-free layer” some useful role while still carrying some degree of risk (e.g. as Dankrad proposes here).

Both of these suggestions imply changing the issuance curve so that returns become prohibitively low if the amount of stake is too high.

Left: A proposal by Justin Drake to adjust the issuance curve. Right: Another set of proposals by Anders Elowsson.
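To make the issuance-curve discussion concrete, here is a rough sketch. The constant 166.3 approximates today's reward curve as commonly cited (e.g. on issuance.wtf); the "adjusted" curve is a made-up tempering example, not the exact formula of either proposal above:

```python
import math

# Sketch of how issuance-curve proposals change staking yields. Annual
# issuance today is approximately 166.3 * sqrt(D), where D = total ETH staked;
# the "adjusted" curve is an illustrative tempering example only.

def current_annual_issuance(staked_eth: float) -> float:
    return 166.3 * math.sqrt(staked_eth)

def current_yield(staked_eth: float) -> float:
    return current_annual_issuance(staked_eth) / staked_eth

def adjusted_yield(staked_eth: float, cap: float = 2**26) -> float:
    """Hypothetical tempered curve: yield tapers to zero as stake nears a cap
    (2^26 ≈ 67M ETH here, purely illustrative)."""
    return current_yield(staked_eth) * max(0.0, 1 - staked_eth / cap)

d = 34_000_000  # roughly today's staked ETH
print(round(current_annual_issuance(d)))   # ≈ 970k ETH issued per year
print(round(100 * current_yield(d), 2))    # ≈ 2.85 (% yield)
print(round(100 * adjusted_yield(d), 2))   # ≈ 1.41 (% yield under the cap)
```

Under such a curve, the marginal return goes to zero (or negative, in some proposals) as stake approaches the cap, which is what limits the staked fraction.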

Two-tier staking, on the other hand, requires setting two return curves: (i) the return for “basic” (risk-free or low-risk) staking, and (ii) the premium for risky staking. There are multiple ways to set these parameters: for example, if you set a hard parameter that 1/8 of the stake is slashable, then market dynamics will determine the premium for the return earned by slashable stake.

Another important topic here is MEV capture. Today, MEV income (e.g. from DEX arbitrage, sandwiching, ...) goes to the proposer, i.e. the staker. This income is completely "opaque" to the protocol: the protocol has no way of knowing whether it amounts to 0.01% APY, 1% APY, or 20% APY. The existence of this income source is very inconvenient from multiple perspectives:

  • It is an unstable source of income: each individual staker only receives it when they propose a block, which currently happens about once every four months on average. This incentivizes people to join pools for a more stable income.

  • It leads to an unbalanced distribution of incentives: too much incentive to propose, too little to attest.

  • This makes a stake cap difficult to enforce: even if the "official" return is zero, MEV income alone might be enough to drive all ETH holders to stake. Therefore, a realistic stake cap proposal would actually have to make the return approach negative infinity. This introduces more risk to stakers, especially solo stakers.

We can solve these problems by finding a way to make MEV revenue legible to the protocol and capture it. The earliest proposal was Francesco's MEV smoothing; today it is widely understood that any mechanism that auctions off block proposal rights in advance (or, more generally, enough authority to capture almost all MEV) can achieve the same goal.

What are the connections with existing research?

issuance.wtf: https://issuance.wtf/

Endgame Staking Economics, a Case for Targeting: https://ethresear.ch/t/endgame-staking-economics-a-case-for-targeting/18751

Properties of Issuance Level, Anders Elowsson: https://ethresear.ch/t/properties-of-issuance-level-consensus-incentives-and-variability-across-potential-reward-curves/18448

Validators set size caps: https://notes.ethereum.org/@vbuterin/single_slot_finality?type=view#Economic-capping-of-total-deposits

Thoughts on multi-tier staking: https://notes.ethereum.org/@vbuterin/stake_2023_10?type=view

Rainbow staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683

Dankrad’s Liquid Staking Proposal: https://notes.ethereum.org/Pcq3m8B8TuWnEsuhKwCsFg

MEV smoothing, by Francesco: https://ethresear.ch/t/committee-driven-mev-smoothing/10408

MEV burn, by Justin Drake: https://ethresear.ch/t/mev-burn-a-simple-design/15590

What else needs to be done and what trade-offs need to be made?

The main tasks remaining are to either agree to take no action and accept the risk of nearly all ETH being in LST, or to finalize and agree on the details and parameters of one of the above proposals. A rough summary of the benefits and risks is as follows:

How does it interact with the rest of the roadmap?

An important intersection has to do with solo staking. Today, the cheapest VPS that can run an Ethereum node costs about $60 per month, mostly due to hard drive storage costs. For a 32 ETH staker ($84,000 at the time of writing), this reduces the APY by (60 * 12) / 84000 ~= 0.85%. If total staking returns fall below 0.85%, solo staking is not viable for many people at this scale.
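The arithmetic above can be checked directly, using the article's own figures as assumptions:

```python
# Back-of-the-envelope check of the solo-staking viability math
# (assumes the article's figures: $60/month VPS, 32 ETH worth $84,000).

monthly_cost_usd = 60
stake_eth = 32
eth_price_usd = 84_000 / 32          # implied by the article's $84,000 figure

annual_cost_usd = monthly_cost_usd * 12
stake_usd = stake_eth * eth_price_usd
breakeven_apy = annual_cost_usd / stake_usd
print(f"{breakeven_apy:.2%}")        # ≈ 0.86% (the article rounds to 0.85%)
```

Below this breakeven yield, hardware costs consume all of a solo staker's returns, which is why node operating costs matter so much for issuance-reduction proposals.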

This further highlights the need to reduce node operating costs if we want solo staking to remain viable. This will be done in the Verge: statelessness will remove the storage requirement, which may be enough on its own, and then L1 EVM validity proofs will make costs negligible.

On the other hand, MEV burn arguably helps solo stakers: although it reduces returns for everyone, more importantly it reduces variance, making staking less like a lottery.

Finally, any change to issuance interacts with other fundamental changes to staking design (e.g. rainbow staking). A particular concern: if staking rewards become very low, we must choose between (i) also lowering penalties, which weakens the disincentives against bad behavior, or (ii) keeping penalties high, which widens the range of situations in which even well-intentioned validators end up with negative returns if they are unlucky enough to hit technical issues or even attacks.

Application layer solutions

The above sections highlight changes to Ethereum L1 that address important centralization risks. However, Ethereum is more than just an L1, it is an ecosystem, and there are important application layer strategies that can help mitigate the above risks. Some examples include:

  • Specialized staking hardware solutions — Some companies (e.g. Dappnode) are selling specially designed hardware to make operating a staking node as easy as possible. One way to make this solution more effective is to ask the question: if a user has already spent the effort to get a box running and connected to the internet, what other services can it provide (to the user or to others) that would benefit from decentralization? Examples that come to mind include (i) running a locally hosted LLM for self-sovereignty and privacy reasons, and (ii) running a node for a decentralized VPN.

  • Squad Staking — This solution from Obol allows multiple people to stake together in an M-of-N format. This will likely become more popular over time as statelessness and later L1 EVM validity proofs reduce the overhead of running more nodes, and the benefits of each participant not having to worry about always being online start to dominate. This is another way to reduce the perceived overhead of staking, and ensure solo staking thrives in the future.

  • Airdrops — Starknet offers airdrops to solo stakers. Other projects that wish to have a decentralized and values-aligned user base may also consider offering airdrops or discounts to validators identified as likely solo stakers.

  • Decentralized Block Builder Marketplace — Using a combination of ZK, MPC, and TEE, it is possible to create a decentralized block builder that participates in and wins the APS auction game, but at the same time provides pre-confirmation privacy and censorship resistance guarantees to its users. This is another avenue for improving user welfare in the APS world.

  • Application layer MEV minimization — Individual applications can be built in a way that “leaks” less MEV into L1, reducing the incentive for block builders to create specialized algorithms to collect MEV. A simple but common strategy is for the contract to put all incoming operations into a queue and execute them in the next block, and auction the right to jump the queue, although this is inconvenient and destroys composability. Other more complex approaches include doing more work off-chain, e.g. as Cowswap does. Oracles can also be redesigned to minimize the value that can be extracted by the oracle.
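The queue-and-auction pattern in the last bullet can be sketched as a toy model. This is illustrative Python, not an actual contract:

```python
# Toy model of the queue-then-execute pattern described above (illustrative):
# a contract collects operations in one block, executes them in the next, and
# auctions the right to jump the queue, leaving little ordering MEV for
# block builders to extract.

class QueuedContract:
    def __init__(self):
        self.queue: list[str] = []
        self.jump_bids: dict[str, float] = {}

    def submit(self, op: str) -> None:
        self.queue.append(op)            # ordering within a block no longer matters

    def bid_to_jump(self, op: str, bid: float) -> None:
        self.jump_bids[op] = bid         # explicit, on-chain priority auction

    def execute_next_block(self) -> list[str]:
        jumpers = sorted(self.jump_bids, key=self.jump_bids.get, reverse=True)
        batch = jumpers + [op for op in self.queue if op not in self.jump_bids]
        self.queue, self.jump_bids = [], {}
        return batch

c = QueuedContract()
c.submit("swap_1"); c.submit("swap_2"); c.submit("liquidate")
c.bid_to_jump("liquidate", 0.05)
assert c.execute_next_block() == ["liquidate", "swap_1", "swap_2"]
```

The cost of this design is exactly what the text notes: a one-block delay for every operation and broken composability with contracts expecting same-block execution.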