During the 2024 Hong Kong Web3 Carnival, Ethereum co-founder Vitalik Buterin delivered a keynote speech entitled "Reaching the Limits of Protocol Design" at the "Web3 Scholar Summit 2024" hosted by DRK Lab.

The following is DeThings' live transcript of the talk, translated from the Chinese and lightly abridged:

The types of technologies we use to build protocols have changed a lot over the last 10 years. When Bitcoin was born in 2009, it used very simple cryptography: the only primitives you saw in the Bitcoin protocol were hashes, elliptic-curve ECDSA signatures, and proof of work (PoW), and proof of work is really just another way of using hashes. If you look at the types of technologies used to build protocols in the 2020s, you start to see a much more complex collection of techniques that have really only become practical in the last 10 years.

These things have certainly been around in theory for a long time: technically we've had the PCP theorem for decades, we've had fully homomorphic encryption since Craig Gentry's construction in 2009, and we've had garbled circuits, a form of two-party computation, for decades. But there's a difference between having these technologies in theory and having them in practice.

I actually think a lot of credit goes to the blockchain space itself for actually bringing a lot of resources to bring these technologies to the stage where you can use them in regular applications.

Blockchains built in the 2010s assumed that hashes and signatures were all you had. Protocols built in the 2020s treat all of these newer primitives as key components from the outset.

ZK-SNARKs are the first big item here. A ZK-SNARK is a technology that lets you prove that you performed a computation and got a particular output, in a way that can be verified much faster than re-running the computation yourself, and without the verifier learning the original inputs.
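To make those two properties a bit more concrete, here is a rough way to state them (a sketch only; the exact bounds depend on the proof system, and Groth16, for example, even achieves constant-size proofs):

$$|\pi| = O(\mathrm{polylog}(T)) \ \text{(or even } O(1)\text{)}, \qquad t_{\mathrm{verify}} \ll t_{\mathrm{re\text{-}execute}} = O(T),$$

where $T$ is the number of steps in the original computation and $\pi$ is the proof; the zero-knowledge property additionally means the proof reveals nothing about the private inputs beyond the fact that the statement is true.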

The difference between ZK-SNARKs in 2010 (purely theoretical), ZK-SNARKs in 2016 (first used in the Zcash protocol, which launched that year), and ZK-SNARKs today is huge, right?

So a lot of these newer forms of cryptography have gone from being something almost nobody knew about, to a niche interest, to mainstream, to now almost being the default. These things have changed and improved dramatically over the last decade.

So ZK-SNARKs are very useful for privacy, and they're very useful for scalability. What do blockchains do? Blockchains give you a lot of benefits: they give you openness, they give you permissionless access, they give you global verifiability. But all of that comes at the expense of two big things.

One is privacy, and the other is scalability. ZK-SNARKs give you privacy as well as scalability. In 2016, we saw the Zcash protocol. After that, we started seeing more and more things in the Ethereum ecosystem. Today, almost everything is starting to use ZK-SNARKs, multi-party computation, and fully homomorphic encryption. People know less about MPC and FHE today than they do about ZK-SNARKs, but there are certain things that can't be done with ZK-SNARKs alone, like private computation over many people's private data.

Voting is actually a big use case: you can get a certain level of privacy with ZK-SNARKs, but if you want the really best properties, then you have to use MPC (multi-party computation) and FHE (fully homomorphic encryption). A lot of crypto-AI applications end up using MPC and FHE as well, and both of these primitives have gotten dramatically more efficient over the last decade. BLS (Boneh-Lynn-Shacham) aggregate signatures are another interesting technique: they basically allow you to take a large batch of signatures from a large number of different participants, potentially tens of thousands of them, combine them into a single aggregate signature, and then verify that combined signature about as quickly as you could verify a single signature.

This is very powerful. BLS aggregation is actually the core technology of modern proof-of-stake consensus in Ethereum. If you look at proof-of-stake protocols built before BLS aggregation, a lot of the time the algorithms could only support a few hundred validators. Ethereum currently processes roughly 30,000 validator signatures every 12 seconds. The reason this is possible is this new form of cryptography, which has only really been optimized enough to be usable in the last 5 to 10 years. These new technologies make a lot of things possible.
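A sketch of why the aggregate check stays cheap, in the same-message setting used for an Ethereum attestation (following the common convention where public keys live in group $G_1$, signatures and the message hash in $G_2$, and $e$ is a pairing; conventions differ between schemes):

$$pk_i = sk_i \cdot g_1, \qquad \sigma_i = sk_i \cdot H(m), \qquad \sigma_{\mathrm{agg}} = \sum_i \sigma_i, \qquad pk_{\mathrm{agg}} = \sum_i pk_i,$$

$$e(g_1, \sigma_{\mathrm{agg}}) \stackrel{?}{=} e(pk_{\mathrm{agg}}, H(m)).$$

The two pairings in the final check cost the same whether ten validators signed or tens of thousands did; only the cheap group additions grow with the number of signers.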

They are rapidly becoming more powerful. Today's protocols make heavy use of all of these techniques. We've really gone through a major shift from special-purpose cryptography, where you had to understand how the cryptography worked and create a special-purpose algorithm for each special-purpose application, to general-purpose cryptography, where you don't even need to be a cryptographer to create an application that uses the things I've talked about in the last five minutes.

You just write a piece of code, say in Circom, and Circom compiles it into a prover and a verifier, and you have a ZK-SNARK application. What is the challenge here? Essentially: we have come a long way in the last 10 years, so what is left? What is the gap between the technologies we have today and the theoretical ideal? I think this is the key area where researchers and academics can make a big difference.

I think the two main issues are basically: one is efficiency, the other is security. There is also a third issue, which is, let's say, extending capabilities.

For example, one technology we haven't really mastered yet is indistinguishability obfuscation. If we had a version of that which actually works, it would be even more amazing. But actually, I think it's more important to improve the efficiency and the security of what we have today.

Let's talk about efficiency. Let's take a specific example, which is the Ethereum blockchain. In Ethereum, the slot time is 12 seconds. The average time between one block and the next is 12 seconds. The normal block verification time, which is the time it takes for any Ethereum node to verify a block, is about 400 milliseconds.

Right now, the time it takes to produce a ZK-SNARK proof of a normal Ethereum block is about 20 minutes. That's improving pretty quickly; it was 5 hours a year or two ago. But 20 minutes is the average, right? There's still a worst case. For example, if you have an Ethereum block where the entire block is full of ZK-SNARK computations, then its proving time is going to be well over 20 minutes.

Still, we are further along than we were two years ago. What is the goal now? The goal is real-time proving: when a block is created, you can get a proof of that block before the next block is created. When we have real-time proving, what do we get? Basically, every Ethereum user in the world can easily become a fully validating user of the Ethereum protocol. Today very few people run Ethereum nodes; a node needs around 2 TB of storage, so you can do it, but it's inconvenient. What if every Ethereum wallet, including browser wallets, mobile wallets, and lightweight wallets for smart contracts on other chains, could actually fully validate the Ethereum consensus rules?

Then people don't have to trust Infura, and they don't even have to trust Ethereum's proof-of-stake validators; instead they verify the rules directly and check the correctness of Ethereum blocks themselves. How can we do this with ZK-SNARKs? For this to actually work, ZK-SNARK proving needs to be real-time: there needs to be a way to prove any Ethereum block, perhaps in less than 5 seconds.
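Using the figures just quoted, the gap to close is roughly two orders of magnitude:

$$20\ \text{min} = 1200\ \text{s}, \qquad \frac{1200\ \text{s}}{12\ \text{s per slot}} = 100\times, \qquad \frac{1200\ \text{s}}{5\ \text{s target}} = 240\times.$$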

The question is, can we get there? Now, MPC and FHE have similar problems. As I mentioned before, a classic use case for MPC and FHE is voting, and it's actually already being used. There was an Ethereum event in Vietnam about three weeks ago, and at that event they actually used one of these cryptographically secure voting systems, built on MPC, to vote on hackathon projects.

The problem with these systems right now is that some of their security properties rely on a central server. Can we decentralize that trust assumption? Yes, but that requires MPC and FHE, and making those protocols efficient is expensive, especially FHE combined with ZK-SNARKs. For these protocols to become the default for normal people, it can't cost $5 of computation to prove every vote, right? It has to be fast, even real-time, for large numbers of votes.
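As a toy illustration of tallying encrypted ballots, here is a minimal additively homomorphic scheme (Paillier, with insecure toy parameters; it is additively, not fully, homomorphic, and it is not the system used at that event, just a sketch of the underlying idea):

```python
import secrets
from math import gcd

# Toy Paillier keypair with tiny primes (illustrative only, NOT secure).
p, q = 347, 349
n, n_sq = p * q, (p * q) ** 2
g = n + 1                                     # standard simplification g = n + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n_sq)), -1, n)         # precomputed decryption constant

def encrypt(m):
    while True:                               # pick r coprime to n
        r = secrets.randbelow(n - 1) + 1
        if gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    return (L(pow(c, lam, n_sq)) * mu) % n

def add_encrypted(c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts (mod n).
    return (c1 * c2) % n_sq

# Tally three encrypted yes/no ballots without decrypting any single ballot.
ballots = [encrypt(v) for v in (1, 0, 1)]
tally = ballots[0]
for b in ballots[1:]:
    tally = add_encrypted(tally, b)
print(decrypt(tally))                         # -> 2
```

Anyone can combine the ciphertexts to get an encryption of the total, and only whoever holds the decryption key, ideally an MPC committee holding key shares rather than a single server, ever learns the tally.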

So how do we get ZK-SNARKs to that goal? I think there are three big categories of efficiency gains. The first is parallelization and aggregation. Imagine that verifying an Ethereum block takes something like 10 million computational steps. You take each of those computational steps and make a proof for it. Then you do proof aggregation: take the first two proofs and make a proof of them, take the next two proofs and make a proof of them, and so on; then take the proofs of those pairs and prove those in turn, and you get a tree. After about 20 levels of this tree, you get one big proof that attests to the correctness of the entire block.
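A minimal sketch of that tree shape (the prove and aggregate functions here are hypothetical stand-ins for a real recursive SNARK prover):

```python
import math

def prove_leaf(step):
    # Hypothetical stand-in for a real SNARK prover over one computational step.
    return f"proof({step})"

def aggregate(left, right):
    # Hypothetical stand-in for recursive aggregation: one proof attesting to two proofs.
    return f"agg({left},{right})"

def prove_block(steps):
    """Binary aggregation tree: all leaf proofs can be generated in parallel,
    then each level halves the number of proofs until a single one remains."""
    layer = [prove_leaf(s) for s in steps]    # fully parallelizable
    depth = 0
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer), 2):
            if i + 1 < len(layer):
                nxt.append(aggregate(layer[i], layer[i + 1]))
            else:
                nxt.append(layer[i])          # odd proof carried up a level
        layer = nxt
        depth += 1
    return layer[0], depth

_, depth = prove_block(range(1_000))
print(depth, math.ceil(math.log2(1_000)))     # depth grows as log2(#steps) -> 10 10
```

All of the leaf proofs can be generated in parallel, and the sequential depth grows only logarithmically in the number of steps, which is why only a couple of dozen levels are needed even for millions of steps.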

This is doable with today's technology, and in principle it can prove the correctness of a block in 5 seconds. What's the problem? Basically, this requires a huge amount of parallel computing, right? It requires 10 million leaf proofs. So can we optimize it? Can we optimize the parallelization? Can we optimize the aggregation proofs? The answer is yes. There are a lot of theoretical ideas about how to do this, but they really need to be turned into something practical. This is a problem that combines algorithmic improvements, low-level engineering improvements, and hardware design improvements, so ASICs are also very important. We all saw how important ASICs were for mining, right? Remember in 2013, when ASICs first came online, we saw how quickly the Bitcoin hash rate grew.

ASICs are pretty powerful, right? For the same hardware cost and electricity cost, an ASIC can have roughly 100 times the hash rate of a GPU. The question is, can we bring the same benefit to SNARK proving? I think the answer should be yes, and there are more and more companies starting to actually build ASICs specifically for proving ZK-SNARKs. These could target zkEVMs, but they should actually be very general: you should be able to make a SNARK ASIC that can prove any type of computation. If we do this, can we go from 20 minutes to 5 seconds?

Finally, algorithmic efficiency, right? We need better ZK-SNARK algorithms. We have Groth16, we have lookup tables, we have 64-bit SNARKs, we have STARKs, we have 32-bit STARKs, all kinds of different ideas. Can we make SNARK algorithms more efficient? Can we create more SNARK-friendly hash functions, more SNARK-friendly signature algorithms? There are a lot of ideas here, and I really encourage everyone to work on them.

The main security issue is bugs, right? Bugs are one of the biggest issues; I think people don't talk about them very often, but they're very important. Basically, we have all this amazing cryptography, but if people are worried that there's some kind of flaw in the circuit, they won't trust it, right? Whether it's a ZK-SNARK circuit or a zkEVM, we're talking about something like 7,000 lines of code, and that's if it's written very compactly. On average, software has 15 to 50 bugs per thousand lines of code. In Ethereum, we work very hard to get below 15 bugs per thousand lines of code, but it's still more than zero, right? If you have these systems holding billions of dollars of assets and there's a bug in one of them, that money is lost, no matter how advanced the cryptography is.
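A rough back-of-the-envelope with the figures just cited shows the scale of the problem:

$$7{,}000\ \text{lines} \times \frac{15\ \text{to}\ 50\ \text{bugs}}{1{,}000\ \text{lines}} \approx 105\ \text{to}\ 350\ \text{bugs},$$

and even at the hard-won rate of under 15 bugs per thousand lines, a 7,000-line circuit could still contain on the order of tens of bugs.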

The question is, what can we do to actually leverage existing cryptography and reduce the bugs in it? Today, the basic technique used here is security councils: you basically get a group of people together, and if a large majority of them, say over 75%, agree that there's a bug, they can overturn whatever the proof system says. So it's a pretty centralized mechanism, but it's the best we have right now. In the near future we'll have multiple proofs. Here's a diagram from Starknet, one of the Ethereum-based rollups. The idea is that if you have multiple proof systems, you can in theory use redundancy to reduce the risk of a bug in any one of them: if you have three proof systems and one of them has a bug, then hopefully the other two won't have a bug in exactly the same place.
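A minimal sketch of the multi-proof idea (the verifier functions and the 75% council threshold here are illustrative placeholders, not any particular rollup's actual design):

```python
def accept_state_root(block, proofs, verifiers, threshold=2):
    """Accept a claimed state root only if at least `threshold` of several
    independent proof systems verify it, so a bug in any single prover or
    verifier cannot by itself finalize a bad result."""
    accepting = sum(1 for verify, proof in zip(verifiers, proofs)
                    if verify(block, proof))
    return accepting >= threshold

def resolve(block, proofs, verifiers, council_approval):
    # If the proof systems cannot reach the threshold, fall back to the
    # (centralized) security-council override described above.
    if accept_state_root(block, proofs, verifiers):
        return "finalize"
    return "finalize" if council_approval >= 0.75 else "reject"
```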

Finally, one thing I think will be interesting to look into in the future is using AI tools, and other new tools, to do formal verification, right? That means mathematically proving that something like a zkEVM has no vulnerabilities. Basically, can you actually prove, for example, that the zkEVM implementation verifies exactly the same function as the EVM code in the Ethereum implementation? Can you prove that, for any possible input, they produce the same output? If we can actually prove these things, then maybe we can reach a world with a bug-free zkEVM sometime in the future.

That sounds crazy, right? Because nobody has ever made a bug-free program of that complexity before. But in 2019, nobody thought it was possible for AI to make really beautiful pictures either, and today we've seen how far we've come; we've seen what AI is capable of. Now the question is, can we apply similar tools to real-world tasks, like automatically generating mathematical proofs of complex statements that span tens of thousands of lines of code? I think this is an interesting open challenge that people should be interested in.

Now, on the efficiency of aggregate signatures: today Ethereum aggregates on the order of 30,000 validator signatures per slot, and it's already very demanding to run a node, right? I have an Ethereum node on my laptop, and it runs, but it's not a cheap laptop, and I had to upgrade the hard drive myself. The ideal goal of Ethereum is to support as many validators as possible.

We want proof of stake to be as democratized as possible, to allow people to participate directly in validation at any scale. We want the requirements for running an Ethereum node to be very low and very easy to meet, and we want the theory and the protocol to be as simple as possible. What are the theoretical limits here? The data per participant per slot has to be at least 1 bit, because you have to broadcast who signed and who didn't.

This is the most basic limit; beyond it, there are no other hard limits. There is no comparable lower bound on computation. You can do aggregate proofs, you can do recursive proof trees, you can do all kinds of aggregate signatures. You can use STARKs, you can use lattice-based cryptography, you can use 32-bit STARKs, you can use all kinds of different techniques.
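With the roughly 30,000 signatures per slot mentioned earlier, that 1-bit floor works out to:

$$30{,}000 \times 1\ \text{bit} = 30{,}000\ \text{bits} \approx 3.75\ \text{kB per 12-second slot}.$$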

The question is, how far can we push signature aggregation? Then there is peer-to-peer security. People don't think about peer-to-peer networks enough. This is a point I want to stress: in the crypto space, people tend to build fancy structures on top of peer-to-peer networks and then just assume the peer-to-peer network works.

There are a lot of risks here, right? And I think these networks are going to get more sophisticated. In the 2010s, every node could see everything. You could certainly mount some attacks: there are eclipse attacks, there are denial-of-service attacks, there are all kinds of attacks.

But when you have a very simple network whose only job is to make sure everybody gets everything, the problem is pretty simple. The issue is that as Ethereum scales, the peer-to-peer network becomes more and more complex. The Ethereum peer-to-peer network today is split into 64 subnets, right?

To do signature aggregation, to handle 30,000 signatures per slot the way we do now, we have a peer-to-peer network that is split into 64 different subnets, and each node belongs to only one or a few of them. Then there is data availability sampling, the technique Ethereum uses to provide data space for blocks and achieve scalability.

This also depends on a more complex peer-to-peer architecture. Here you see a graph of peer nodes, and in this setup each node only downloads 1/8 of all the data. So the question is, is such a network really secure? Can you guarantee its security? Can you make the guarantees as strong as possible? How can we protect and improve the security of the peer-to-peer networks that Ethereum relies on?
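For intuition on why sampling can be safe at all, here is the standard argument (it assumes the block data is erasure-coded so that any half of the chunks is enough to reconstruct it, an assumption the talk doesn't spell out): if less than half of the chunks are actually available, the data cannot be recovered, and each uniformly random sample then fails with probability at least 1/2, so

$$\Pr[\text{all } k \text{ samples succeed} \mid \text{data unrecoverable}] \le 2^{-k},$$

which is below one in a million at around $k = 20$ samples, even though each node downloads only a small fraction of the block.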

Basically, I think what we need to focus on at this point is protocols that reach the limits of what cryptography allows. Our cryptography is already much stronger than it was a decade ago, but it can become much stronger still, and I think we really need to start looking at what the upper limit is and how we can actually reach it.

There are two equally important areas here. One is to continue to improve efficiency; we want to prove everything in real time. We want to see a world where every piece of information passed around in a decentralized protocol has a ZK-SNARK attached to it by default, proving that the information, and everything it depends on, follows the rules of the protocol.

The second frontier is improving security. Fundamentally, it's about reducing the chances of error. Let's move to a world where the actual technology that these protocols rely on can be very robust and very trustworthy, and people can rely on it as much as possible.

But as we have seen many times, multisigs can also be hacked. There are many examples among these so-called Layer 2 projects where the coins are actually controlled by a multisig, and somehow five of the nine keys were compromised at the same time, resulting in a large loss of funds. If we want to move beyond this world, we need technology that can actually be trusted and that truly enforces the rules through cryptography, rather than trusting a small group of people to ensure the rules are followed.

But to do that, the code has to be trustworthy. The question is, can we make the code trustworthy? Can we make the network trustworthy? Can we make the economics of these products, these protocols trustworthy? I think these are the core challenges, and I hope we can continue to work together and make continuous improvements. Thank you.