Original | Odaily Planet Daily (@OdailyChina)

Author | Fu He How (@vincent31515173)

In August, TON was in dire straits.

First, the founder of Telegram was arrested in France and then released on bail. Then the TON network halted block production twice and came under heavy scrutiny. These two crises poured cold water on the increasingly hot TON ecosystem and further narrowed its future narrative space.

The market has focused mostly on the founder's arrest and paid little attention to the technical failure, yet the latter is the event that will truly affect the future development of the TON ecosystem.

Why did the TON network go down twice in quick succession? Opinions in the community differ. Odaily Planet Daily analyzes TON's white paper, related technical documents and the current state of the network to explore the reasons behind the two block production outages.

Multiple factors: too few validators and an overly complex underlying design

Event review: In the early morning of August 28, the TON network suffered its first block production outage, which took 7 hours to resolve. Less than 24 hours later, in the early morning of August 29, block production stopped a second time.

Surface reason: Block production stagnation caused by surge in DOGS transaction volume

The direct cause of the network block interruption was the surge in DOGS transaction volume.

DOGS is a meme coin that has recently been very popular on the TON network. Its total supply is 550 billion tokens, of which 72.73% was allocated to the airdrop, and the only requirement for the airdrop was a Telegram account. DOGS was recently listed on several platforms, including Binance, which caused the number of on-chain transactions to surge in a short period of time.

As a PoS public chain, TON relies on its validator nodes to process and confirm transactions and package these transactions into blocks. Under normal circumstances, the blockchain network generates new blocks at set time intervals, but when the system cannot process all pending transactions in time, the block generation process will be delayed or even interrupted.

Transaction overload is not uncommon in the blockchain field. Many networks, including well-known public chains such as Bitcoin and Ethereum, have faced similar problems. When the transaction volume exceeds the instantaneous processing capacity of the network, transaction verification slows down significantly. In the case of TON, the surge in transaction volume may have overloaded the validators, which in turn slowed overall block production. The effect is most visible during peak periods: when DOGS suddenly became popular, transaction volume exceeded the network's carrying capacity and block production was delayed.
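The overload mechanism described above can be illustrated with a minimal sketch. It is not a model of TON itself: the function name `simulate_backlog`, the one-second time step and all of the numbers are hypothetical, chosen only to show how a backlog builds once arrivals exceed a fixed processing capacity.

```python
def simulate_backlog(arrivals_per_second, capacity_per_second):
    """Return the pending-transaction backlog at the end of each second.

    Assumption: the network processes at most `capacity_per_second`
    transactions per second and queues everything else.
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_second:
        backlog += arrivals                          # new transactions join the queue
        processed = min(backlog, capacity_per_second)
        backlog -= processed                         # only `capacity` can be cleared
        history.append(backlog)
    return history


# Hypothetical traffic: steady load stays within capacity, a spike does not.
print(simulate_backlog([80, 90, 100, 90], capacity_per_second=100))
# → [0, 0, 0, 0]
print(simulate_backlog([80, 300, 300, 300, 80], capacity_per_second=100))
# → [0, 200, 400, 600, 580]
```

In the second run the backlog keeps growing for as long as the spike lasts and drains only slowly afterwards, which is the pattern the paragraph above describes as delayed block production.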

As for why the network stopped producing blocks twice, the TON Foundation explained that the flood of DOGS transactions overloaded many validators during garbage collection, and that these validators then remained out of consensus for too long.

Interestingly, TON earned a Guinness World Record certification in a public performance test at the end of November last year, reaching 104,715 TPS. Against that figure, the official explanation that DOGS transactions overloaded the network seems unconvincing.

The underlying reasons: TON network design limitations and validator issues

In fact, transaction overload is only a symptom. The root cause of the block production outages lies in TON's underlying design and validator mechanism. Looking at TON's technical architecture, its sharding mechanism and the way its validators are organized, we can examine why the network is unstable under extreme conditions from the following three perspectives.

1. Complexity of Shard Chain Architecture: Challenges of High Scalability

TON’s architecture is designed with high scalability and high performance at its core. Its unique multi-level structure of main chain, work chain and shard chain can theoretically improve the network’s processing capacity by distributing the load. However, this complex shard chain structure also brings many challenges.

Each work chain can be further split into multiple shard chains, each of which is responsible for processing the transactions of a different set of accounts. This design allows a large number of transactions to be processed in parallel on different shard chains, thereby improving the TPS of the overall network. However, when the transaction volume surges, if the load is unevenly distributed across shard chains or the validators fail to process the transactions in time, block production on the affected shard chains may slow down or even stall. Since the shard chains must stay synchronized with the main chain, a problem on one key shard chain may affect block production across the entire network.
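The uneven-load risk described above can be sketched as follows. This is a simplified illustration under stated assumptions, not TON's actual routing code: the helper names `shard_of` and `shard_loads`, the `prefix_bits` parameter and the use of SHA-256 to derive a stand-in account id are all hypothetical. The only idea taken from the text is that each shard chain handles a distinct subset of accounts.

```python
import hashlib
from collections import Counter


def shard_of(account: str, prefix_bits: int) -> str:
    """Map an account to a shard named by the leading bits of its hashed id.

    Assumption: SHA-256 is used here only to get a uniformly spread,
    stand-in account id for the sketch.
    """
    digest = hashlib.sha256(account.encode()).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:prefix_bits]


def shard_loads(destinations, prefix_bits: int) -> Counter:
    """Count transactions per shard, keyed by each transaction's destination account."""
    return Counter(shard_of(dest, prefix_bits) for dest in destinations)


# 1,000 transactions spread over 1,000 different accounts, 4 shards (2 prefix bits).
spread = shard_loads([f"account_{i}" for i in range(1000)], prefix_bits=2)
print(dict(spread))

# 1,000 transactions that all target one hypothetical hot contract.
hot = shard_loads(["hot_contract"] * 1000, prefix_bits=2)
print(dict(hot))
```

In the first case the load is shared among the shards. In the second, every transaction lands on the single shard that owns the hot account, so adding more shards does not relieve it.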

TON’s sharding approach is highly innovative: shard chains can be subdivided until each one is responsible for only a small number of accounts or smart contracts, or even for a single account or contract. However, this extreme form of sharding also increases the complexity of coordination and management. Although sharding is an effective way to improve blockchain scalability, it requires efficient and stable coordination between every shard chain and the main chain. Once a shard chain hits a bottleneck under extreme conditions, block production across the entire network may be blocked.
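To illustrate why finer sharding adds shards to coordinate without helping a single overloaded account, here is a toy sketch of load-driven splitting. It is a simplification under stated assumptions rather than TON's real split logic: `split_shards`, `max_load` and `max_depth` are hypothetical names, account ids are short binary strings, and a shard is split on the next bit of the account id whenever its load is above the threshold.

```python
def split_shards(load_by_account, max_load, max_depth):
    """Recursively split shards by account-id prefix until each is under max_load.

    Assumptions: `load_by_account` maps a binary account-id string (at least
    `max_depth` characters long) to its transaction count. A shard holding a
    single account cannot be split further. Returns {shard_prefix: total_load}.
    """
    def split(prefix, accounts):
        total = sum(accounts.values())
        if total <= max_load or len(prefix) >= max_depth or len(accounts) <= 1:
            return {prefix: total}
        result = {}
        for bit in "01":
            child = {a: n for a, n in accounts.items() if a.startswith(prefix + bit)}
            if child:
                result.update(split(prefix + bit, child))
        return result

    return split("", dict(load_by_account))


# Hypothetical load: three quiet accounts and one hot account.
loads = {"000": 10, "001": 10, "010": 10, "111": 500}
print(split_shards(loads, max_load=25, max_depth=3))
# → {'00': 20, '01': 10, '1': 500}
```

The quiet accounts end up in small shards that fit under the threshold, but the shard containing the hot account still carries 500 transactions. Each split also adds another shard chain that has to stay in step with the main chain.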

2. Insufficient number of validators: potential risk to TON’s decentralization

Another significant problem with the TON network is the insufficient number of validators. Compared with other PoS public chains, TON has significantly fewer validators. Currently, the TON network has only 318 validator nodes, while the number of validators in Ethereum has exceeded 600,000, and the number of validators in Solana far exceeds that of TON. This difference in the number of validators directly affects the degree of decentralization and network security of TON.

In a PoS network, validators are responsible for verifying transactions, reaching consensus, and packaging verified transactions into blocks. The number of validators not only determines the degree of decentralization of the network, but also directly affects the network's processing capacity under high load. TON has a small number of validators, which means that each validator needs to process more transaction requests. When the transaction volume increases sharply, the validators may not be able to process all transactions in time, resulting in block delays or even interruptions.

In addition, TON has high hardware and network requirements for validators, and a large amount of Toncoin needs to be staked to become a validator. These high threshold conditions limit the number of validators, so that only participants with sufficient resources can join the ranks of validators. This not only limits the decentralization of the TON network, but also makes the block delay problem more prominent during peak trading periods.

3. Limitations of the consensus mechanism: Challenges of the Byzantine fault-tolerant protocol under high load

The TON network uses the Catchain protocol, a consensus mechanism based on Byzantine Fault Tolerance (BFT). The protocol is designed to keep the network running even in the presence of malicious nodes. However, its efficiency suffers when the number of validators is limited and some validators cannot participate in consensus in time because they are overloaded with transactions.

The Catchain protocol works on the principle that as long as malicious nodes make up less than one-third of the validators participating in consensus, the network can reach consensus and produce blocks. However, when the number of validators is limited and the load is too high, several validators may fail to respond at the same time, which slows the consensus process or prevents consensus altogether and stalls block production.
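The threshold described above can be made concrete with a small arithmetic sketch. It uses the standard BFT requirement that strictly more than two-thirds of validators must take part, and it assumes every validator carries equal weight. Real networks, including TON, weight validators by stake, so the figures below are illustrative only. The function names are hypothetical and are not taken from the Catchain specification.

```python
def min_responsive_for_consensus(total_validators: int) -> int:
    """Smallest validator count that is strictly more than two-thirds of the total.

    Assumption: all validators have equal voting weight.
    """
    return (2 * total_validators) // 3 + 1


def can_produce_blocks(total_validators: int, unresponsive: int) -> bool:
    """True if enough validators are still responding to reach consensus."""
    responsive = total_validators - unresponsive
    return responsive >= min_responsive_for_consensus(total_validators)


# Using the 318 validators cited in this article:
print(min_responsive_for_consensus(318))  # → 213
print(can_produce_blocks(318, 105))       # → True
print(can_produce_blocks(318, 106))       # → False
```

Under these assumptions, block production stops once 106 of the 318 validators fail to respond, whether they are malicious or simply overloaded.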

Although TON's consensus mechanism is designed to be highly risk-resistant, its actual effect depends on the number and distribution of validators. When the number of validators is insufficient and the network load exceeds expectations, the efficiency of the Catchain protocol will decrease significantly, causing the network to slow down or even stagnate.

Insufficient decentralization and flaws in the underlying mechanism have become obstacles to the development of TON

TON has recently faced one challenge after another. The first is the arrest of Telegram’s founder in France, which not only makes TON’s future development uncertain but may also affect cooperation between Telegram and the TON ecosystem. Telegram’s 1 billion monthly active users were seen as a huge potential driver for the TON ecosystem, and this incident undoubtedly casts a shadow over future cooperation between the two.

In addition, the TON network itself suffered two consecutive block production outages within a short period, further exposing its limitations under high load. Although both outages were triggered by the surge in DOGS transaction volume, the root cause lies in the design of the TON network. The complexity of the shard chain architecture, the insufficient number of validators, and the reduced efficiency of the consensus mechanism under high load all indicate that the TON network has significant technical bottlenecks in handling sudden spikes. These problems not only affect TON's current stability but also threaten its long-term development.

The author believes that the TON ecosystem needs to be improved in the following aspects to ensure its stability and sustainable development.

  • TON needs to expand the number of validators, lower the threshold for becoming a validator, and attract more nodes to participate, so as to improve the degree of decentralization and the network's carrying capacity.

  • TON should optimize its shard chain architecture, improve the coordination efficiency between the shard chain and the main chain, and ensure smooth operation in a high transaction volume environment.

  • Further optimization of the consensus mechanism is also essential. TON should study how to improve the efficiency of the Catchain protocol under high load conditions to ensure that the network can still produce blocks stably under extreme conditions.

TON has faced major crises since its birth and was later reborn through community governance. In its early stages it also suffered from low popularity and a weak ecosystem. The current situation is not enough to pose a "fatal threat" to the TON ecosystem. I hope that TON can overcome the current difficulties and improve its network so that it can better meet future challenges and gradually build a stronger and more prosperous ecosystem.