How zk-ASM may deliver a secure and trustless internet

Web3.com Ventures Original Research Analysis

0xFishylosopher

Note: This article is a fairly technically-dense piece, and assumes basic conceptual familiarity with zk-Proofs and/or zk-Rollups. A more general introduction to these principles can be found here.

Introduction

Zero Knowledge Proofs, in particular zk-SNARKs (Succinct Non-interactive Arguments of Knowledge) is perhaps one of the most important technologies at the frontiers of Web 3. While most of the media and investment attention in this sub-field have gone towards zk-Rollups, scaling solutions that provide magnitudes of scalability to L1 blockchains such as Ethereum, this is by no means the only application of zk-SNARKs. In this essay, I will analyze in-depth the concept of Zero-Knowledge Assembly code (or zkASM), evaluating its use-cases in both zk-Rollups and beyond, exploring its theoretical possibilities in re-inventing the Internet as we know.

Technical Principles

zk-ASM, as its name suggests, contains two main technical parts: ZK and ASM. The ZK part refers to zk-SNARKs, or Succinct Non-Interactive Arguments of Knowledge, while the ASM part refers to Assembly code. To understand the potential of zk-ASM, we must first understand the theoretical foundations of both these seemingly arcane concepts.

zk-SNARKs

zk-SNARKs are the crown jewels of zk-Proofs: they are a succinct proof that a certain statement is True where the proof reveals nothing about the data being proved. For example, consider someone asserting the statement “I know an m such that C(m) = 0”, where m is a gigabyte long message and C is a function. A zk-SNARK would be a very short proof (< 1GB) that can be quickly verified and one where nothing about m is revealed (beyond publicly available information) [1].

So what is this “C(m)”? How is it useful? This function is actually an arithmetic circuit, or a Directed Acyclic Graph (DAG) representation of a specific function that we want to carry out, as the diagram shows [2]. The “m” is essentially the entry data into the circuit, and specific “nodes” in the circuit are individual logic gates or arithmetic operations. For example, a “+” node may have “2” and “3” as inputs, and output a “5” onto the next operator. Thus, an arbitrary arithmetic or logical operation may be encoded in an “arithmetic circuit.”

Once we have this arithmetic circuit as a representation of the code we want to run a zk-SNARK on, we can begin building this zk-SNARK. Fundamentally, a zk-SNARK is possible because of the “fundamental theorem of algebra,” which states that a polynomial of degree “d” has at most “d” roots [3]. The mathematical trick is two steps: (1) to somehow convert the function “f(m)” that we want to prove into a polynomial (and stick with that), and (2) use the “fundamental theorem of algebra” to interact with the polynomial and provide a succinct proof. In technical jargon, the first part is called a “Polynomial Commitment Scheme” (PCS), and the second part is called a “Polynomial Interactive Oracle Proof” (PIOP) [4].

While the specific implementations of a PCS and PIOP are beyond the scope of this article, thus far we have derived a rough sketch for the core steps of a zk-SNARK:

  1. Have a function-of-choice (code function, math equation etc.) that you want to run a zk-SNARK

  2. Encode this function as an arithmetic circuit C(m)

  3. Run a PCS to get a polynomial representation of this arithmetic circuit

  4. Run a PIOP to get a succinct proof logarithmic in size to the original “m”

And viola, we have a custom-built zk-SNARK that can prove that someone knows a given message without revealing what that message is.

Assembly Code

The second piece of the puzzle of zk-ASM is the idea of Assembly Code. Assembly code is a class of languages containing very low-language instructions that are easy for a machine to read but fairly difficult for a human to decipher. Unlike higher-level languages, such as Python, Java, or even C, Assembly languages contain very primitive functions, such as move, compare, add, and jump on a series of data registers on the processor and hard-coded memory locations. For example, the Python code to print the numbers 1 to 9 on the screen is 123456789:

Pretty easy to understand, right? Now here’s the x86 Assembly version of it [5]:

A lot nastier, particularly for such a simple operation. So why even use Assembly language? As stated above, while these instructions may not be easy for a human to read, they are very easy to “assemble” into the 110011001 byte-code for a machine to read and execute (this is called an assembler) [6]. Comparatively speaking, higher-level languages such as Python and Java are a lot more human-friendly to read, but programs written in these languages cannot be directly executed by a processor. Instead, we need to rely on a “compiler” that chews on the Python or Java code that we write and spits out a dump of assembly code like the one above to be assembled and executed by the machine. We can expect the same piece of Python or Java to run smoothly across different processors and different operating systems because the compiler does the heavy-lifting, compiling your source code into an Assembly language specific to that processor or operating system.

Because all languages compile down to assembly code (which itself gets compiled down to executable binary), assembly is essentially like a “mother of all languages.” Now suppose that we’re able to turn all of the operands in an Assembly language (such as x86 or RISC-V) into an arithmetic circuit representation, such that we are able to provide zk-SNARK proofs of all the operands in this Assembly language. This means that we are theoretically capable of providing a zk-SNARK of any program written in an arbitrary high-level language (such as Python or Java) that compiles down to this Assembly language. And that’s why we need to care to think about zk-ASMs.

Practical Applications

zk-EVM Rollups: Polygon zk-ASM

One of the most important applications for zk-ASM is in creating Ethereum Virtual Machine-compatible zk-Rollups, or zk-EVMs. A zk-EVM is incredibly important to blockchain scalability because it allows for programmers to deploy on a zk-Rollup based L2 chain without modifying much (if any) of their code [7]. In this field, Polygon’s zk-EVM is an exemplary case study that demonstrates how zk-ASM may be used to achieve this goal.

When programmers develop on the Ethereum L1 blockchain, they usually code in Solidity, which is a high-level language akin to C. This Solidity code is compiled into a series of EVM Opcodes, such as ADD, SLOAD, and EQ, before being executed on the L1 blockchain [8]. By default, this process obviously does not create any sort of zk-Proof. Polygon’s trick is to create a method to interpret each of the EVM Opcodes into their custom-written zk-ASM, which is very zk-SNARK friendly. Then, their L2 zk-EVM will execute the zk-ASM, while also creating a zk-SNARK circuit of the ASM in order to create a zk-SNARK proof [9]. For example, the ADD opcode in the EVM will be translated into Polygon’s zk-ASM as follows [10]:

Because Polygon zk-EVM’s sleight-of-hand happens at the Assembly level, it is two levels removed from the code that the average Ethereum programmer touches, the “Solidity” level. This is the reason why most developers can port their EVM code built for the Ethereum mainnet directly over to the Polygon zk-EVM. Furthermore, because Polygon zk-EVM “keeps” the tech stack of Ethereum down to the opcode level, all of the debugging infrastructure that relies on analyzing compiled opcodes will all be kept usable and intact. This is unlike some other zk-EVM designs, such as zk-Sync, which does not provide zk-Proofs on an opcode level. Thus, even as Polygon invents and proves its own Assembly language, Vitalik writes that “it can still verify EVM code, it just uses some different internal logic to do it” [11].

Beyond Rollups: zk-WASM

zk-EVMs are by no means the only application for zk-ASMs. Recall our previous assertion that Assembly languages are essentially “the mother of all languages,” and that the creation of a zk-ASM will unlock zk-Proofs for generic programs written in any language that compiles to that Assembly language. Web Assembly, or WASM, is one of the most important emerging assembly languages. First published in 2018, the point of WASM is to create an Assembly language that increased the execution speed of Web Apps and provided an executional complement to Javascript, the primary coding language behind the Web [12].

Essentially, as the Web developed over the years, the growing size and complexity of Web Apps has meant that often it is incredibly slow for browsers to compile everything written in Javascript, and must rely on complex compile-optimize-reloading cycles [12]. WebAssembly, on the other hand, removes the need to rely on complex browser execution engines by providing a portable, modular and easily executable assembly language. Furthermore, as an Assembly language, WASM allows programmers to directly write code snippets in C, C++, Rust, Java or Ruby that runs natively in a browser. WASM has therefore become a technology of choice for “providing distributed serverless functions” [13].

So why and how do zk-SNARKs come into the picture? WASM is unique in that it is a client-side technology, able to directly interact with user inputs and data. Because oftentimes this includes sensitive data such as passwords and personal information, we need a technology that (1) ensures that the program executes correctly, and that (2) our sensitive information is not leaked. As described above, a zk-SNARK is a perfect solution to solve both these issues, and is thus an important puzzle piece in securing WASM [14].

While work on developing zk-WASM is still in its early stages, recently there have been some projects that have released prototype zk-SNARK circuits for WebAssembly. For example, Delphinus Lab’s “ZAWA” zk-SNARK Emulator presents a method to encode the operands and semantics of a WASM virtual machine into an arithmetic circuit, which allows it to conduct zk-SNARK proofs [13]. As time goes forth, zk-WASM circuits will undoubtedly be continuously optimized, thus allowing programs written in generic languages (such as C, C++, Rust and Ruby) to adopt the paradigm of zk-Proofs.

Conclusion

Throughout this essay, we have explored the theoretical underpinnings of zk-ASM as well as examined two paradigmatic case studies of zk-ASM: Polygon’s use of zk-ASM to create an opcode-level zk-EVM, as well as the application of zk-SNARKs onto WebAssembly to create zk-WASM. Ultimately, the promise of zk-ASM is to put together the interoperability and scale of Web 2 with the trustlessness and security of Web 3.

One the one hand, blockchains increasingly look to scale beyond their current throughput bottlenecks and potentially support the execution, while on the other, Web 2 methods have become increasingly under attack for inadequately protecting user data and privacy. As programmers are able to employ Web 3 design paradigms in their Web 2 code and introduce Web 2 languages and code onto the blockchain, generic zk-ASMs may represent a merging point in the world of Web 2 and Web 3 [15]. It is in this sense, that zk-ASM may allow us to reimagine a more secure and trustless Internet.

🐦 @0xfishylosopher

📅 17 December 2022

Disclaimer: the information presented above is purely educational does not constitute financial advice, and represents the views of the author alone. Delphinus Lab is a portfolio company of Web3.com Ventures.

References

[1] https://z.cash/technology/zksnarks/

[2] https://cs251.stanford.edu/lectures/lecture14.pdf

[3] https://www.britannica.com/science/fundamental-theorem-of-algebra

[4] Building efficient SNARKs: https://cs251.stanford.edu/lectures/lecture15.pdf

[5] Example from: https://www.tutorialspoint.com/assembly_programming/assembly_loops.htm

[6] https://en.wikipedia.org/wiki/Assembly_language

[7] https://www.alchemy.com/overviews/zkevm

[8] For list of opcodes: https://ethereum.org/en/developers/docs/evm/opcodes/

[9] https://wiki.polygon.technology/docs/zkEVM/zkASM/introduction

[10] https://wiki.polygon.technology/docs/zkEVM/zkASM/some-examples

[11] https://vitalik.ca/general/2022/08/04/zkevm.html

[12] https://blog.developer.adobe.com/understanding-webassembly-wasm-d5b592208ecc

[13] https://jhc.sjtu.edu.cn/~hongfeifu/manuscriptb.pdf

[14] https://hyperoracle.medium.com/zkwasm-the-next-chapter-of-zk-and-zkvm-471038b1fba6

[15] https://delphinuslab.com/zk-wasm/