In this three-part series, we present technical advances that significantly improve data handling for applications running on the Internet Computer Protocol (ICP).
This upgrade is part of the Stellarator milestone on the ICP roadmap and is currently being rolled out across the network. Stellarator delivers breakthroughs in on-chain data storage and throughput, allowing each subnet to hold more than 1 TB of memory and to ingest data faster, opening up opportunities for data-rich applications that were previously constrained by storage and throughput limits.
This advancement lets developers build complex applications that require large-scale data processing, bringing a new level of practicality to blockchain technology.
In this final part of the series, Kamil Popielarz and Yvonne-Anne Pignolet share the latest progress on improving ingress message throughput on the Internet Computer. If you missed the earlier parts of the series, you can find them here and here.
Increasing ingress message throughput
If you are like us, waiting for data to upload to a dapp is not your favorite pastime. We are therefore excited to announce that optimizations to the Internet Computer Protocol that increase consensus throughput are being rolled out via the Network Nervous System (NNS).
These protocol changes reduce the bandwidth consumption and the time required to propagate blocks while preserving the security properties of the ICP. As a result, users will spend less time waiting and more time enjoying fast interactions with ICP dapps.
Background
The Internet Computer Protocol coordinates its network nodes to provide decentralized computing services, even when some nodes deviate from the protocol.
As is well known, the ICP can host arbitrary applications consisting of code and data, referred to as canisters. Canisters can process ingress messages submitted by users and interact with other canisters by exchanging and executing messages.
Rather than replicating the execution of every canister across all nodes, the network's nodes are divided into multiple shards, called subnets, each of which runs a robust consensus protocol to ensure consistent execution and state of the canisters hosted on its nodes.
The consensus protocol is responsible for creating and validating blocks, with each block containing a set of canister messages. Once agreement on the order and content of these blocks is reached, the nodes can execute the corresponding canister code deterministically and consistently, maintaining the integrity of the computing service.
Block payloads contain ingress messages submitted by users to trigger canister calls. After receiving ingress messages from users, nodes perform a series of checks (e.g., signature, size, expiry time). If these checks succeed, the nodes add the messages to their ingress pools and broadcast them to the other nodes in the subnet using the ICP's P2P protocol.
When it is a particular node's turn to create a block proposal, that node includes a set of ingress messages from its ingress pool in the block and broadcasts this proposal to its peers.
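To make this flow concrete, here is a minimal Rust sketch of how a node might validate a user-submitted ingress message, add it to its pool, and later select pooled messages for a block proposal. All names here (`IngressMessage`, `IngressPool`, `on_user_ingress`, the size constant) are illustrative assumptions, not the actual replica code.

```rust
use std::collections::HashMap;
use std::time::SystemTime;

// Hypothetical types for illustration; the real replica code is considerably more involved.
type Hash = [u8; 32];

#[derive(Clone)]
struct IngressMessage {
    hash: Hash,
    signature: Vec<u8>,
    expiry: SystemTime,
    payload: Vec<u8>,
}

struct IngressPool {
    messages: HashMap<Hash, IngressMessage>,
}

// Illustrative per-message size limit.
const MAX_INGRESS_BYTES: usize = 4 * 1024 * 1024;

impl IngressPool {
    /// Validate an incoming ingress message and, if it passes the checks,
    /// store it in the pool and return it so the caller can broadcast it to peers.
    fn on_user_ingress(&mut self, msg: IngressMessage) -> Option<IngressMessage> {
        let signature_ok = !msg.signature.is_empty();          // stand-in for a real signature check
        let size_ok = msg.payload.len() <= MAX_INGRESS_BYTES;  // size check
        let not_expired = msg.expiry > SystemTime::now();      // expiry check
        if signature_ok && size_ok && not_expired {
            self.messages.insert(msg.hash, msg.clone());
            Some(msg) // hand back to the caller for broadcast via the P2P layer
        } else {
            None
        }
    }

    /// When it is this node's turn to propose, pick a set of pooled messages
    /// that fit into the block payload.
    fn select_for_proposal(&self, max_payload_bytes: usize) -> Vec<IngressMessage> {
        let mut payload = Vec::new();
        let mut used = 0;
        for msg in self.messages.values() {
            if used + msg.payload.len() > max_payload_bytes {
                break;
            }
            used += msg.payload.len();
            payload.push(msg.clone());
        }
        payload
    }
}
```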
However, since most peers already have most of these ingress messages in their local pools, sending the full messages again in the proposal wastes bandwidth.
Another drawback of this approach is that sending a proposal containing 1000 messages of 4 KB each to all peers takes much longer than sending a proposal containing 1000 hashes.
Assuming replicas have the suggested minimum bandwidth of 300 Mbit/s, broadcasting a block proposal containing a 4 MB payload to all peers in a subnet with 13 nodes takes: 4 MB * (13 - 1) / 300 Mbit/s = 1.28 seconds.
If each hash is less than 50 bytes, the same proposal can be delivered to all nodes more than 80 times faster. On larger subnets these differences add up, making the impact even greater.
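The arithmetic behind these numbers can be checked with a few lines of Rust; the inputs (4 MB payload, 13 nodes, 300 Mbit/s, 1000 messages, 50-byte hashes) are taken straight from the example above.

```rust
fn main() {
    let peers = 13 - 1;                            // the proposer sends to 12 peers
    let bandwidth_bits_per_s = 300_000_000.0;      // 300 Mbit/s
    let full_payload_bits = 4_000_000.0 * 8.0;     // 4 MB payload of full messages
    let hash_payload_bits = 1_000.0 * 50.0 * 8.0;  // 1000 hashes of at most 50 bytes

    let full_time = full_payload_bits * peers as f64 / bandwidth_bits_per_s;
    let hash_time = hash_payload_bits * peers as f64 / bandwidth_bits_per_s;

    println!("full proposal:      {:.2} s", full_time);  // ~1.28 s
    println!("hash-only proposal: {:.3} s", hash_time);  // ~0.016 s
    println!("speedup:            ~{:.0}x", full_time / hash_time); // ~80x
}
```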
Optimization
To reduce proposal delivery time and bandwidth consumption, the protocol has been improved to let nodes include references (hashes) to ingress messages in blocks instead of the full messages. Since nodes broadcast ingress messages via the P2P protocol anyway, replicas should be able to reconstruct block proposals by looking up all the referenced ingress messages in their own ingress pools.
However, some ingress messages may be missing from a node's local ingress pool. This can happen due to poor network conditions, a full pool, node crashes, or malicious node behavior. A node needs all of a proposal's ingress messages in order to validate and/or execute it. To obtain missing ingress messages, a node can request them from the peer that sent it the proposal.
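The following Rust sketch illustrates this reconstruction path under simplified assumptions: resolve each referenced message from the local pool, and fetch only the missing ones from the block maker. The names (`reconstruct_payload`, `fetch_from_peer`) are hypothetical placeholders rather than the actual replica interfaces.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];
type PeerId = u64;

#[derive(Clone)]
struct IngressMessage {
    hash: Hash,
    payload: Vec<u8>,
}

struct IngressPool {
    messages: HashMap<Hash, IngressMessage>,
}

/// Stand-in for an RPC that asks the block maker for the messages missing
/// from the local pool; the real protocol interface looks different.
fn fetch_from_peer(_proposer: PeerId, missing: &[Hash]) -> Vec<IngressMessage> {
    missing
        .iter()
        .map(|h| IngressMessage { hash: *h, payload: Vec::new() })
        .collect()
}

/// Rebuild the full block payload from the hash references carried by a proposal.
fn reconstruct_payload(
    pool: &mut IngressPool,
    proposer: PeerId,
    references: &[Hash],
) -> Vec<IngressMessage> {
    // Determine which referenced messages are not yet in the local pool.
    let missing: Vec<Hash> = references
        .iter()
        .filter(|h| !pool.messages.contains_key(*h))
        .cloned()
        .collect();

    // Fetch only the missing messages from the proposer and add them to the pool.
    for msg in fetch_from_peer(proposer, &missing) {
        pool.messages.insert(msg.hash, msg);
    }

    // Every reference can now be resolved locally (cloning keeps the sketch simple).
    references.iter().map(|h| pool.messages[h].clone()).collect()
}
```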
To increase the chances that all ingress messages are present in every peer's ingress pool before blocks containing their hashes are proposed, certain aspects of the ingress pool implementation have been modified.
First, nodes now push ingress messages directly to their peers instead of first sending advertisements and waiting for peers to request the messages. This saves at least one round trip and lets ingress messages propagate to peers faster.
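Expressed as hypothetical message types, the difference between the old advertise-then-request flow and the new direct push looks roughly like this:

```rust
type Hash = [u8; 32];

/// Old flow: advertise the hash, then wait for interested peers to request the body.
/// This costs at least one extra round trip per message.
enum OldGossip {
    Advert { hash: Hash },          // sender -> peer
    Request { hash: Hash },         // peer -> sender
    Deliver { payload: Vec<u8> },   // sender -> peer
}

/// New flow: push the full ingress message to peers immediately, no round trip.
enum NewGossip {
    Deliver { hash: Hash, payload: Vec<u8> }, // sender -> peer
}
```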
In addition, the management of the ingress pool size has been improved. Previously, there were global limits on the number of messages in the pool and the total number of bytes they could occupy. Once these limits were exceeded, a node would reject any further ingress messages broadcast by its peers.
As a result, under very heavy load the nodes of a subnet could end up with barely overlapping ingress pools, in which case every node would have to request almost all messages from the block maker for every block proposal, increasing latency and reducing throughput.
To address this issue, the global limits have been replaced with per-peer limits. Now a node accepts ingress messages from a peer as long as there is space left in the portion of the ingress pool allotted to that peer.
Since honest replicas broadcast their ingress messages to every peer, we can be fairly confident that at any point in time, even under heavy load, nodes on the same subnet will have largely overlapping ingress pools.
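A minimal sketch of such per-peer bounding, assuming each peer is given its own byte budget instead of sharing one global cap (the constant and names are illustrative):

```rust
use std::collections::HashMap;

type PeerId = u64;
type Hash = [u8; 32];

/// Illustrative per-peer byte budget; the real limits are protocol parameters.
const PER_PEER_POOL_BYTES: usize = 8 * 1024 * 1024;

#[derive(Default)]
struct PeerSection {
    bytes_used: usize,
    messages: HashMap<Hash, Vec<u8>>,
}

#[derive(Default)]
struct IngressPool {
    // One bounded section per peer instead of a single global limit, so one
    // busy or malicious peer cannot crowd out the messages of everyone else.
    sections: HashMap<PeerId, PeerSection>,
}

impl IngressPool {
    fn insert(&mut self, from: PeerId, hash: Hash, payload: Vec<u8>) -> bool {
        let section = self.sections.entry(from).or_default();
        if section.bytes_used + payload.len() > PER_PEER_POOL_BYTES {
            return false; // only this peer's quota is exhausted; others are unaffected
        }
        section.bytes_used += payload.len();
        section.messages.insert(hash, payload);
        true
    }
}
```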
To minimize changes to the rest of the protocol, a new component has been added between P2P and consensus. It strips the ingress messages out of proposals on the sender's side and re-adds them on the receiver's side; the rest of the P2P and consensus logic remains unchanged.
Performance Evaluation
To illustrate the impact of these optimizations, we ran experiments on a testnet subnet with 13 nodes, with an imposed round-trip time of 300 ms between nodes and bandwidth limited to 300 Mbit/s.
These experiments show that with many small messages (about 4 KB each), throughput increased from 2 MB/s to 6 MB/s.
Similarly, in our experiments with large messages (slightly less than 2 MB each), throughput increased from 2 MB/s to 7 MB/s.
Note that these experiments measured consensus throughput only; the canisters receiving the messages did not perform any meaningful work.
The following figure shows the throughput from the above experiment:
We also ran experiments demonstrating that nodes joining a subnet (catching up) and node failures, as well as subnets with many canisters or many nodes, perform at least as well as before (and in many cases much better).
Conclusion
These protocol changes improve the user experience of the Internet Computer and lay the groundwork for further changes to handle more and larger messages.
These changes have already been enabled on several mainnet subnets, where you can experience firsthand that the benefits also hold under real network conditions and varying load, not just in small experiments with synthetic traffic patterns.
We would love to hear your feedback. You can share your thoughts anytime in the DFINITY Developers X channel and on the developer forum, and stay tuned for more technical roadmap updates.