Author: Tia, Techub News
Blockchain sacrifices efficiency due to its decentralized design, making the improvement of execution speed one of the urgent problems to solve. The 'execution layer' of the blockchain is a key part that processes each transaction and adds it to the chain. To accelerate processing capabilities, enhancing the execution layer has become one of the core strategies, with parallel execution being an important breakthrough in this regard.
Traditional blockchains typically handle transactions serially, which greatly limits transaction speed, especially in transaction-intensive networks, leading to congestion. However, through parallel execution, multiple transactions can be processed simultaneously, significantly improving execution efficiency and alleviating on-chain pressure.
To better understand what parallelism is, we will first introduce execution and take Ethereum under the PBS model after the Merge as an example to explain what execution is while demonstrating its position in the entire transaction lifecycle.
Specific steps of transaction execution
Transactions enter the memory pool and are filtered and sorted: This is the preprocessing phase after transaction submission, involving the interaction of Mempool, Searcher, and Builder to complete the filtering and sorting of transactions.
Builder constructs a block (but does not execute): The Builder arranges profitable transactions into a block to complete the packaging and sorting of transactions.
Proposer verifies and submits the block: After the block is constructed, the Builder sends the proposal of the block to the Proposer. The Proposer verifies the structure and transaction content of the block, then formally submits the block to the network to start execution.
Executing transactions: After the block is submitted, nodes execute the transactions in the block one by one. This is a critical stage for state updates, where each transaction triggers smart contract calls, account balance changes, or state changes.
Witnessing by validators: Validators witness the execution results and state root of the block and confirm it as final. This ensures the authenticity and validity of the block in the execution layer and prevents inconsistencies.
State synchronization: Each node synchronizes the execution results of the block (such as account balances, contract state updates, etc.) to its local state after executing each transaction, calculating and storing a new state root to serve as the initial state in the next block.
Of course, this is just state synchronization for transactions on a block basis. To maintain the latest on-chain state, nodes typically synchronize data block by block and continuously verify blocks and states. However, to achieve finality under the POS mechanism, aggregators need to consolidate the signatures of witnesses in each slot into a complete signature and pass it to the proposer of the next slot, and validators need to confirm the state of all blocks in that epoch based on the number of votes after an epoch, forming a temporary consensus state checkpoint. Finality of blocks and transactions is only achieved after two consecutive epochs receive majority witness support from validators.
From the entire lifecycle of transactions, execution occurs after the Proposer verifies the structure and transaction content of the block sent by the Builder. The actual execution process requires processing transactions one by one and updating the corresponding account or contract states. After all transactions are executed, the Proposer calculates a new state root (Merkle root), summarizing the execution results and final global state of all transactions in the current block. Simply put, the complete block execution process includes a series of computations needed to transition Ethereum from the previous state to the next state, from the execution of each transaction to the calculation of the Merkle root.
Sequential Execution
In contrast to parallel execution, sequential execution is the currently more common execution method in blockchain. Typically, transactions are executed step by step in order. When a transaction is completed, Ethereum updates the account state and related information (such as balances, contract storage data) in the account state tree, generating a new account state hash. Once all account state trees are updated, a root node hash known as the state Merkle root is formed. After the state Merkle root, transaction Merkle root, and receipt Merkle root are completed, the block header will undergo hash computation to generate the block hash.
In this process, the execution order of transactions is crucial. Since the Merkle tree is a binary tree of hash values, different orders will produce different Merkle root values.
Parallel Execution
In a parallel execution environment, nodes will attempt to process transactions in the block in parallel. Transactions are not executed one by one in order, but are allocated to different 'execution paths' so that they can execute simultaneously. Through parallel execution, the system can more efficiently handle transactions in the block, increasing throughput.
After all transaction executions are completed, nodes will summarize the execution results (i.e., state updates affected by transactions) to form a new block state. This state will be added to the blockchain, representing the latest global state on the chain.
State Conflict
Since parallelism processes transactions on different paths simultaneously, a major difficulty of parallelism is state conflicts. This means multiple transactions may read or write to the same piece of data (state) on the blockchain within the same timeframe. If not handled properly, this could lead to indeterminate execution results. Because the order of state updates differs, the final computation results may also differ. For example,
Suppose there are two transactions, Transaction A and Transaction B, both attempting to update the balance of the same account:
Transaction A: Increase account balance by 10.
Transaction B: Increase account balance by 20.
The initial account balance is 100.
If we execute serially, the result of the execution order is determined:
1. Execute transaction A first, then transaction B:
The account balance first increases by 10, becoming 110.
Then increases by 20, finally becoming 130.
2. Execute transaction B first, then transaction A:
The account balance first increases by 20, becoming 120.
Increased by another 10, finally becoming 130.
In both of these sequences, the final balance is 130 because the system ensures consistency in the execution order of transactions.
However, in a parallel execution environment, transaction A and transaction B may simultaneously read the initial balance of 100 and perform their calculations:
Transaction A reads the balance as 100 and updates it to 110 after calculation.
Transaction B also reads the balance as 100 and updates it to 120 after calculation.
In this case, because the transactions are executed simultaneously, the final balance is only updated to 120 instead of 130, as the operations of transactions A and B 'overlap' with each other's results, resulting in a state conflict.
Such state conflict issues are commonly referred to as 'data overwriting', where transactions simultaneously attempt to modify the same data, potentially overwriting each other's computational results, leading to incorrect final states. Another type of potential problem due to state conflicts is the inability to guarantee execution order. As multiple transactions complete operations at different times, it can lead to different execution sequences. Different sequences may produce different computational results, resulting in indeterminate outcomes.
To avoid such uncertainty, blockchain parallel execution systems typically introduce some conflict detection and rollback mechanisms or conduct dependency analysis of transactions in advance to ensure they can execute in parallel without affecting final state consistency.
Optimistic Parallelism vs Deterministic Parallelism
There are two approaches to addressing potential state conflict issues: deterministic parallelism and optimistic parallelism. These two modes each have trade-offs in terms of efficiency and design complexity.
Deterministic parallelism requires prior declaration of state access, and validators or sequencers will check the declared state access during transaction sorting. If multiple transactions attempt to write to the same state, those transactions will be marked as conflicts to avoid simultaneous execution. Different chains have different specific implementations for the early declaration of state access, but generally include the following forms:
Through contract specification constraints: Developers directly specify the state access range in smart contracts. For example, ERC-20 token transfers require access to the sender and receiver's balance fields.
By structuring data in transactions: Adding dedicated fields in transactions to indicate state access.
Through compiler analysis: Compilers for high-level languages can statically analyze contract code and automatically generate a set of state accesses.
Framework mandatory declaration: Some frameworks require developers to explicitly specify the states to be accessed when calling functions.
Optimistic parallelism would optimistically process transactions first, and when conflicts occur, affected transactions are re-executed in order. To avoid conflicts as much as possible, the core design of optimistic parallelism is to quickly predict and assume states through historical data and static analysis. That is, the system assumes that certain operations or state updates are valid without complete verification, aiming to avoid waiting for all validation processes, thereby improving performance and throughput.
Although optimistic parallelism can avoid conflicts as much as possible through quick predictions and assumptions about the state, there are still some unavoidable challenges, especially concerning contract execution or cross-chain transactions. If conflicts occur frequently, re-execution may significantly slow down system performance and increase resource consumption.
Deterministic parallelism avoids potential conflicts by performing state dependency checks before transactions, but because it requires accurately declaring state dependencies before transaction submission, it imposes higher demands on developers, thereby increasing implementation complexity.
EVM Parallel Dilemmas
Dealing with state conflicts involves not only determinism and optimism but also requires consideration from the perspective of the chain database architecture in the specific implementation of parallelism. The problem of state conflicts in parallelism is particularly challenging in the EVM under the Merkle tree architecture. The Merkle tree is a hierarchical hash structure, and every time a transaction modifies a piece of state data, the root hash of the Merkle tree must also be updated. This updating process is recursive, calculated layer by layer from the leaf nodes to the root node. Since hashing is irreversible, it can only compute the upper layers after the lower layer's data changes are completed, making it difficult to update in parallel.
If two transactions are executed in parallel and access the same state (such as account balance), it will cause conflicts in the Merkle tree nodes. Resolving such conflicts typically requires additional transaction management mechanisms to ensure a consistent root hash across multiple branches. This is not easy for the EVM, as it requires trade-offs between parallelization and state consistency.
Non-EVM Parallel Solutions
Solana
Unlike Ethereum's global state tree, Solana uses an account model. Each account is an independent storage space stored in the ledger, thus avoiding path conflict issues.
Solana is deterministic parallelism. In Solana, each transaction must explicitly declare the accounts it will access and the required access permissions (read-only or read-write) at the time of submission. This design allows blockchain nodes to analyze the resources that each transaction needs to access before executing the transaction. Because all account dependencies are clearly stated before execution begins, nodes can determine which transactions will access the same accounts and which transactions can be safely executed in parallel, thereby achieving intelligent scheduling and avoiding conflicts, thus laying the foundation for parallel scheduling.
Since each transaction has already declared the accounts and permissions needed for access before execution, Solana can check for account dependencies between transactions (Sealevel model). If there are no shared read or write accounts between transactions, the system can allocate them to different processors for parallel execution.
Aptos
Aptos's parallel execution design is significantly different from Ethereum, as it makes some key innovations in architecture and mechanisms, primarily reflected in the account model and state storage.
Ethereum frequently needs to update the global state tree (MPT) when executing transactions. The states of all accounts and contracts are stored in a shared state tree, and any transaction needs to access and update a part of this state tree. In contrast, Aptos divides accounts into independent state units, where each object is an independent key-value pair that can exist independently without affecting each other, only associating when there is an explicit reference relationship. There are no common tree paths between objects, avoiding lock contention and allowing for complete parallelism.
The underlying data structure of Aptos is the Jellyfish Merkle Tree. The state of each object is ultimately stored in the JMT as an independent key-value pair. Unlike Ethereum's MPT, the Jellyfish Merkle Tree takes the form of a complete binary tree structure, which simplifies the storage and query paths of nodes, significantly reducing verification time. Additionally, the position of each account in the tree is fixed, and the nodes in the tree are stored independently, allowing multiple accounts to be updated and queried in parallel.
Aptos is optimistic parallelism; it does not require prior declaration of all account dependencies. To this end, Aptos uses Block-STM, which estimates dependencies using a predefined transaction order to reduce the number of aborts.
Parallel EVM
Compared to non-EVM parallel systems, parallel EVM faces greater technical challenges in addressing state dependencies, conflict detection, gas management, and rollback mechanisms. To better understand this, we can refer to how some parallel EVM projects (such as Sui, Monad, Canto) solve these issues.
Sui
Like Aptos, Sui also uses an object model to handle states, treating each object (such as account, smart contract state) as an independent resource, which are differentiated by unique object identifiers. When transactions involve different objects, these transactions can be processed in parallel because they operate on different states without creating direct conflicts.
Although Sui uses an object model to manage states, in order to be compatible with EVM, Sui's architecture bridges the object model and EVM's account model through an additional adaptation layer or abstraction mechanism.
In Sui, transaction scheduling uses an optimistic parallel strategy, assuming there are no conflicts between transactions. If conflicts occur, the system uses rollback mechanisms to restore the state.
Sui uses an object model and state isolation techniques to effectively avoid state dependency issues. Each object serves as an independent resource, allowing different transactions to execute in parallel, thereby improving throughput and efficiency. However, the trade-off of this method is the complexity of the object model and the overhead of rollback mechanisms. If conflicts occur between transactions, partial states need to be rolled back, increasing the system's burden and potentially affecting the efficiency of parallel processing. Compared to non-EVM parallel systems (such as Solana), Sui requires more computational and storage resources to maintain efficient parallelism.
Monad
Like Sui, Monad also adopts optimistic parallelism. However, Monad's optimistic parallelism will predict some dependent transactions before the actual execution of transactions, mainly through Monad's static code analyzer. The prediction requires accessing the state, and the way states are stored in Ethereum's database makes accessing states very difficult. To make the process of reading states in parallel more efficient, Monad also restructured the database.
The Monad state tree is partitioned, with each partition maintaining its own state subtree. During updates, only the relevant shards need to be modified, without the need to rebuild the entire state tree. A state index table quickly locates states within partitions, reducing inter-partition interaction.
Summary
The core of parallelism is to improve execution efficiency in the execution layer through multi-path execution. To achieve multi-path execution, the chain needs to undergo a series of conflict detection and rollback mechanisms to ensure they can execute in parallel without affecting the final state consistency and make certain improvements to the database.
Of course, the improvement of execution layer efficiency is not limited to parallelism; optimization of the execution process can also be achieved by reducing the read and write operations required by a transaction on the database. The overall speed improvement of the chain involves a broader range, including the enhancement of consensus layer efficiency.
Every technology has its specific limitations. Parallelism is just one way to enhance efficiency, and the final decision to use this technology needs to consider whether it is developer-friendly and whether it can be completed without sacrificing decentralization, among other factors. The stacking of technologies is not necessarily better; at least for Ethereum, parallelism is not that attractive. From the perspective of improving efficiency, adding parallelism is not an optimal solution for Ethereum, whether considering simplicity or Ethereum's current roadmap centered around Rollup.