Written by: Nickqiao, Faust, Shew Wang, Geek web3

Advisor: Bitlayer Research Team

Abstract: Delphi Digital recently released a research report on Bitcoin Layer 2 technology titled "The Dawn of Bitcoin Programmability: Paving the Way for Rollups". The report systematically surveys the core concepts behind Bitcoin rollups, including the BitVM family, OP_CAT and covenants, data availability (DA) layers in the Bitcoin ecosystem, bridges, and four major BitVM-based Bitcoin Layer 2 projects: Bitlayer, Citrea, Yona, and BOB.

Although the report gives a broad picture of Bitcoin Layer 2 technology, it stays at a high level and lacks detailed explanations, which can leave readers confused. Building on the Delphi report, Geek web3 has dug deeper, aiming to help more people understand BitVM and related technologies systematically.

Together with the Bitlayer research team and the BitVM Chinese community, we are launching a column series called "Approaching BTC". The series will provide ongoing educational coverage of key topics such as BitVM, OP_CAT, and Bitcoin cross-chain bridges, with the goal of demystifying Bitcoin Layer 2 technologies and lowering the barrier for enthusiasts.

A few months ago, Robin Linus, head of ZeroSync, published a paper titled "BitVM: Compute Anything on Bitcoin", formally introducing the concept of BitVM and accelerating the development of Bitcoin Layer 2 technology. It is arguably one of the most revolutionary innovations in the Bitcoin ecosystem: it ignited the entire Bitcoin Layer 2 scene, attracted high-profile projects such as Bitlayer, Citrea, and BOB, and brought new vitality to the market.

Since then, more researchers have joined the effort to improve BitVM, producing successive iterations such as BitVM1, BitVM2, BitVMX, and BitSNARK. In outline:

  1. The BitVM white paper that Robin Linus first published last year describes an implementation based on virtual logic gate circuits, referred to as BitVM0.

  2. In several later talks and interviews, Robin Linus informally introduced a BitVM design based on a virtual CPU, referred to as BitVM1. It resembles Cannon, Optimism's fraud proof system, and uses Bitcoin Script to emulate the effect of a general-purpose CPU running off-chain.

  3. Robin Linus also proposed BitVM2, a permissionless, single-step, non-interactive fraud proof protocol.

  4. Members of Rootstock Labs and Fairgate Labs released the BitVMX white paper. Like BitVM1, it aims to use Bitcoin Script to emulate the effect of a general-purpose CPU running off-chain.

The developer ecosystem around BitVM is now taking clearer shape, and the surrounding tooling is visibly improving. Compared with last year, BitVM has gone from a "castle in the air" to something whose outline is faintly visible, which has drawn more and more developers and VCs into the Bitcoin ecosystem.

For most people, however, the terminology around BitVM and Bitcoin Layer 2 is hard to follow, because it presupposes a systematic grasp of the underlying basics, especially Bitcoin Script and Taproot. The reference material currently available online is either long and padded with filler, or too shallow to leave readers with a clear understanding. We aim to fix that by explaining the concepts around Bitcoin Layer 2 in the clearest language we can, so that readers can build a systematic understanding of BitVM.

MATT and Commitment: The Basic Idea of BitVM

First of all, the basic idea behind BitVM is MATT, short for "Merkleize All The Things". It refers to representing a complex program's execution through a tree-shaped data structure such as a Merkle tree, so that fraud proofs can be verified natively on Bitcoin.

Although MATT can express a complex program and the traces of its data processing, it does not publish this data directly on the BTC chain, because the data set is very large. Under the MATT approach, the data is stored off-chain in a Merkle tree, and only the tree's top-level digest (the Merkle root) is published on-chain. The Merkle tree mainly contains three kinds of content:

  • Smart contract script code

  • Data required by the contract

  • Traces left during contract execution (records of changes to memory and CPU registers when smart contracts are executed in virtual machines such as EVM)

(A simple Merkle Tree diagram. Its Merkle Root is obtained by multi-layer hash calculation of the 8 data fragments at the bottom of the diagram)
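The construction in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not BitVM's actual encoding: it assumes single SHA-256 as the hash function and duplicates the last node when a level has an odd number of nodes, and the fragment contents are made up.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(fragments: list[bytes]) -> bytes:
    """Hash each fragment, then hash pairs upward until one node remains."""
    if not fragments:
        raise ValueError("need at least one fragment")
    level = [sha256(f) for f in fragments]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last node with itself
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Eight hypothetical data fragments, as in the diagram.
fragments = [f"data-{i}".encode() for i in range(8)]
root = merkle_root(fragments)
print(root.hex())
```

However many fragments sit at the bottom, the root is always 32 bytes, and changing any single fragment changes the root.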

Under the MATT scheme, only the tiny Merkle root is stored on-chain, while the complete data set contained in the Merkle tree is stored off-chain. This relies on an idea called "commitment", explained below.

A commitment is a compact statement, a kind of "fingerprint" obtained by compressing a large amount of data. Typically, whoever publishes a commitment on-chain is claiming that certain data stored off-chain is accurate, and that off-chain data must correspond exactly to the published commitment.

Sometimes the hash of the data serves as the commitment to the data itself. Other commitment schemes include KZG commitments and Merkle trees. In the fraud proof protocols commonly used by Layer 2s, the data publisher releases the complete data set off-chain and publishes a commitment to it on-chain. If someone finds invalid data in the off-chain data set, they can challenge the on-chain commitment.

Through commitment, the second layer can compress a large amount of data and only publish its "commitment" on the Bitcoin chain. Of course, it is also necessary to ensure that the complete data set published off-chain can be observed by the outside world.
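As a rough illustration of the simplest case mentioned above, where a plain hash serves as the commitment, consider the following sketch. The function names and the sample data are hypothetical.

```python
import hashlib

def commit(dataset: bytes) -> bytes:
    """The commitment is simply the SHA-256 digest of the whole data set."""
    return hashlib.sha256(dataset).digest()

def matches(commitment: bytes, dataset: bytes) -> bool:
    """Anyone holding the off-chain data can check it against the commitment."""
    return commit(dataset) == commitment

offchain_data = b"tx1;tx2;tx3" * 1000      # large data set, kept off-chain
onchain_commitment = commit(offchain_data)  # small value, published on-chain
print(len(offchain_data), "bytes off-chain,", len(onchain_commitment), "bytes on-chain")
```

The check only works if the off-chain data is actually available, which is why the data set must remain observable to outsiders.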

Currently, several major BitVM solutions, such as BitVM0, BitVM1, BitVM2 and BitVMX, basically adopt similar abstract structures:

1. Program decomposition and commitment: First, a complex program is compiled down into a large number of basic opcodes, and the traces those opcodes produce during execution are recorded. (A trace is the full record of state changes in the CPU and memory while a program runs.) All of this data, including the trace and the opcodes, is then organized into a data set, and a commitment to that data set is generated.

Commitment schemes can take many forms, such as Merkle trees, PIOPs (as used in various ZK algorithms), and hash functions.

2. Asset staking and pre-signing: The data publisher and the verifiers lock a certain amount of assets on-chain through pre-signed transactions, subject to spending conditions that are triggered by specific future events. If the data publisher misbehaves, a verifier can submit a proof and take the publisher's assets.

3. Data and commitment release: The data publisher releases the commitment on the chain, releases the complete data set off the chain, and the verifier retrieves the data set and checks for any errors. Each part of the off-chain data set is associated with the commitment on the chain.

4. Challenge and punishment: Once a verifier finds that some of the data provided by the publisher is wrong, it submits that piece of data on-chain for direct verification (the data must first be cut into very fine-grained pieces). This is the logic of a fraud proof. If verification shows that the publisher did provide invalid data off-chain, its assets are awarded to the verifier who raised the challenge.
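Step 1 of the list above (program decomposition and commitment) can be sketched with a toy machine. Everything here is an assumption made for illustration: the three-instruction machine, the trace row layout, and the use of a single SHA-256 hash over a JSON encoding as the commitment. Real BitVM designs use far richer instruction sets and Merkle-based commitments.

```python
import hashlib
import json

# Hypothetical program for a machine with one register and no memory.
PROGRAM = [("LOAD", 5), ("ADD", 7), ("MUL", 3)]

def step(acc: int, instr: tuple[str, int]) -> int:
    """Execute one basic opcode and return the new register value."""
    op, arg = instr
    if op == "LOAD":
        return arg
    if op == "ADD":
        return acc + arg
    if op == "MUL":
        return acc * arg
    raise ValueError(f"unknown opcode {op}")

def run_with_trace(program):
    """Return one trace row per executed instruction."""
    acc, trace = 0, []
    for pc, instr in enumerate(program):
        new_acc = step(acc, instr)
        trace.append({"pc": pc, "op": instr[0], "arg": instr[1],
                      "before": acc, "after": new_acc})
        acc = new_acc
    return trace

def commit(trace) -> str:
    """Commit to the whole trace with one hash over a canonical encoding."""
    encoded = json.dumps(trace, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

trace = run_with_trace(PROGRAM)
commitment = commit(trace)
print(trace[-1]["after"], commitment)
```

Because each row records the state before and after a single opcode, any one row can later be re-checked in isolation, which is what makes fine-grained challenges possible.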

In summary, the data publisher Alice publishes off-chain all traces generated while executing Layer 2 transactions, and publishes the corresponding commitment on-chain. To prove that a piece of that data is wrong, a challenger first proves to Bitcoin nodes that the piece is bound to the on-chain commitment, that is, that Alice herself published it, and then has the Bitcoin nodes confirm that the piece is indeed wrong.

We now have a general picture of how BitVM works; all BitVM variants follow essentially the paradigm above. Next, let us look at some of the important technologies used in this process, starting with the most basic ones: Bitcoin Script, Taproot, and pre-signing.
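The two-part argument just described (first show the data is bound to the commitment, then show it is wrong) can be sketched with a Merkle inclusion proof. This is an illustrative model only: the row format, the `row_is_valid` rule, and the four-row trace are invented, and in practice the checks would be carried out by Bitcoin Script rather than Python.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, leaf hashes first. Leaf count must be a power of two."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def prove(levels, index):
    """Collect the sibling hash at every level on the path from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path

def verify(root, leaf, index, path):
    """Recompute the root from one leaf and its sibling path."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

def row_is_valid(row: bytes) -> bool:
    """Hypothetical rule: a row 'before ADD arg after' must add up."""
    before, op, arg, after = row.split()
    return op == b"ADD" and int(before) + int(arg) == int(after)

# Alice's published trace; the row at index 2 is deliberately wrong.
rows = [b"0 ADD 5 5", b"5 ADD 7 12", b"12 ADD 1 99", b"99 ADD 1 100"]
levels = build_levels(rows)
root = levels[-1][0]          # the only value Alice puts on-chain

bad_index = 2
path = prove(levels, bad_index)
committed = verify(root, rows[bad_index], bad_index, path)   # Alice published it
fraud = committed and not row_is_valid(rows[bad_index])      # and it is wrong
print(committed, fraud)
```

The challenger only has to reveal one row and a short sibling path, not the whole trace.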

What is Bitcoin Script?

Bitcoin can be harder to understand than Ethereum. Even the most basic transfer involves a series of concepts, including the UTXO (unspent transaction output), the locking script (also known as ScriptPubKey), and the unlocking script (also known as ScriptSig). Let's go through these concepts first.

(An example of a Bitcoin Script code consisting of opcodes at a lower level than a high-level language)

Ethereum represents assets much like Alipay or WeChat Pay: each transfer simply adds to and subtracts from the balances of different accounts. This model is account-centric, and an asset balance is just a number under an account's name. Bitcoin represents assets more like gold: each piece of gold (a UTXO) is stamped with its owner, and a transfer actually destroys the old UTXO and creates a new one with a different owner.

A Bitcoin UTXO contains two key fields:

  • Amount, in satoshi (100 million satoshis equal one BTC);

  • The locking script, also known as the "ScriptPubKey", defines the unlocking conditions of the UTXO.

It should be noted that the ownership of Bitcoin UTXO is expressed through a locking script. If you want to transfer your UTXO to Sam, you can initiate a transaction to destroy one of your UTXOs and write the unlocking condition of the newly generated UTXO as "only Sam can unlock".

After that, if Sam wants to use these bitcoins, he needs to submit an unlocking script (ScriptSig), in which Sam needs to show his digital signature to prove that he is Sam himself. If the unlocking script matches the aforementioned locking script, Sam can unlock and transfer these bitcoins to others.

(The unlocking script must match the locking script)
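The "destroy the old UTXO, create a new one" behavior can be modeled very roughly as follows. This is a sketch under strong simplifications: the locking script is reduced to an owner name, the unlocking script to a `signer` string, and all names and outpoints are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UTXO:
    outpoint: str      # where the coin was created, written as "txid:index"
    amount_sats: int   # value in satoshis
    locked_to: str     # stand-in for the locking script: just an owner name

def transfer(utxo_set, coin, signer, recipient, new_outpoint):
    """Spend `coin` entirely to `recipient`; the old UTXO disappears."""
    if coin not in utxo_set:
        raise ValueError("coin is already spent or never existed")
    if signer != coin.locked_to:   # stand-in for unlocking-script validation
        raise PermissionError("unlocking condition not met")
    utxo_set.remove(coin)
    new_coin = UTXO(new_outpoint, coin.amount_sats, recipient)
    utxo_set.add(new_coin)
    return new_coin

utxos = {UTXO("tx0:0", 50_000, "Alice")}
alice_coin = next(iter(utxos))
sam_coin = transfer(utxos, alice_coin, signer="Alice", recipient="Sam",
                    new_outpoint="tx1:0")
print(utxos)
```

After the transfer, the set no longer contains Alice's coin, and only Sam can satisfy the condition on the new one.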

From the perspective of expression, each transaction on the Bitcoin chain corresponds to multiple inputs and outputs. In each input, you need to declare a UTXO that you want to unlock and submit an unlocking script to unlock and destroy the UTXO. The newly generated UTXO information will be displayed in the output, and the content of the locking script will be made public.

For example, in the inputs of a transaction you prove that you are Sam, unlock several UTXOs that others sent to you, and destroy them all; in the outputs you create several new UTXOs and declare who will be able to unlock them in the future.

Specifically, in the transaction's input data you must declare which UTXOs you want to unlock and indicate where the data for those UTXOs is stored. Bitcoin and Ethereum differ completely here. Ethereum stores data under two kinds of accounts, contract accounts and EOAs (externally owned accounts). An asset balance is recorded as a number under a contract account or an EOA, and all accounts live in a single database called the "world state". When a transfer happens, the relevant accounts can be modified directly in the world state, so the data is easy to locate.

Bitcoin has no world state. Asset data is scattered across past blocks; that is, the data of each unspent UTXO sits in the output of the transaction that created it.

To unlock a UTXO, you must indicate which past transaction's output holds it by giving that transaction's ID (its hash) together with the output's index within the transaction, so that Bitcoin nodes can look it up. And to find the Bitcoin balance of a given address from the raw chain data alone, you would have to traverse all blocks from the beginning and collect the unspent UTXOs associated with that address.

When using a Bitcoin wallet, you can quickly check the Bitcoin balance of a certain address. In many cases, this is because the wallet service itself has established an index for all addresses by scanning blocks, making it convenient for us to quickly query.

(When you generate a transaction statement to send your UTXO to someone else, you need to mark the position of the UTXO in the Bitcoin history according to the transaction hash/ID to which these UTXOs belong)
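To make the lookup and the balance index concrete, here is a small sketch. The three-transaction chain, the owner names, and the dictionary layout are all invented for illustration; real transactions reference locking scripts rather than names.

```python
from collections import defaultdict

# Hypothetical mini-chain: each transaction lists the outpoints (txid, index)
# it spends and the outputs (owner, amount in satoshis) it creates.
chain = [
    {"txid": "a1", "inputs": [], "outputs": [("Alice", 50_000)]},
    {"txid": "b2", "inputs": [("a1", 0)],
     "outputs": [("Sam", 30_000), ("Alice", 19_000)]},
    {"txid": "c3", "inputs": [("b2", 0)], "outputs": [("Bob", 29_500)]},
]

def scan_unspent(chain):
    """Replay every transaction and keep only outputs nobody has spent."""
    unspent = {}
    for tx in chain:
        for outpoint in tx["inputs"]:
            del unspent[outpoint]              # spent outputs leave the set
        for index, (owner, amount) in enumerate(tx["outputs"]):
            unspent[(tx["txid"], index)] = (owner, amount)
    return unspent

def build_balance_index(unspent):
    """What a wallet service precomputes so that balance queries are fast."""
    balances = defaultdict(int)
    for owner, amount in unspent.values():
        balances[owner] += amount
    return dict(balances)

unspent = scan_unspent(chain)
balances = build_balance_index(unspent)
print(balances)
```

Sam has no balance at the end because his only output was spent in transaction `c3`.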

Interestingly, the results of Bitcoin transactions are calculated off-chain. When users generate transactions on local devices, they must directly create all the inputs and outputs, which is equivalent to calculating the output results of the transaction. Transactions are broadcast to the Bitcoin network and verified by nodes before being put on the chain. This "off-chain calculation-on-chain verification" model is completely different from Ethereum. On Ethereum, you only need to provide transaction input parameters, and the transaction results are calculated and output by the Ethereum node.
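The division of labor described above can be sketched as two functions, one for the wallet and one for the node. This is a simplified model: script validation is omitted, the names are hypothetical, and the node's only check here is that the outputs do not exceed the inputs.

```python
def build_transaction(inputs, payments, fee_sats):
    """Wallet side: compute every output, including change, before broadcasting.

    `inputs` is a list of (outpoint, amount) pairs; `payments` is a list of
    (recipient, amount) pairs.
    """
    total_in = sum(amount for _, amount in inputs)
    total_out = sum(amount for _, amount in payments)
    change = total_in - total_out - fee_sats
    if change < 0:
        raise ValueError("insufficient funds")
    outputs = list(payments)
    if change > 0:
        outputs.append(("change-to-sender", change))
    return {"inputs": list(inputs), "outputs": outputs}

def node_accepts(tx, utxo_amounts):
    """Node side: never computes outputs, only checks the ones it was given."""
    try:
        # The node trusts its own record of each input's value, not the sender's.
        total_in = sum(utxo_amounts[outpoint] for outpoint, _ in tx["inputs"])
    except KeyError:
        return False                      # unknown or already-spent input
    total_out = sum(amount for _, amount in tx["outputs"])
    return total_out <= total_in          # the difference is the miner fee

utxo_amounts = {("a1", 0): 50_000}
tx = build_transaction([(("a1", 0), 50_000)], [("Sam", 30_000)], fee_sats=500)
print(tx["outputs"], node_accepts(tx, utxo_amounts))
```

The node does no computation on the sender's behalf; a transaction whose outputs were miscalculated is simply rejected.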

In addition, the UTXO locking script is customizable. You can make a UTXO "unlockable by the owner of a certain Bitcoin address", in which case the owner must provide a digital signature and a public key (P2PKH). With the Pay-to-Script-Hash (P2SH) transaction type, you instead put a script hash in the UTXO's locking script: whoever can submit the script preimage corresponding to that hash, and satisfy the conditions defined in that script, can unlock the UTXO. The Taproot scripts that BitVM relies on use a mechanism similar to P2SH.

How to trigger Bitcoin script

Here we first use P2PKH as an example to introduce the triggering method of Bitcoin scripts. Only by understanding its triggering method can we understand the more complex Taproot and BitVM. P2PKH stands for "Pay to Public Key Hash". In this scheme, a public key hash will be set in the UTXO locking script. When unlocking, the public key corresponding to the hash needs to be submitted, which is basically the same as the conventional Bitcoin transfer idea.

At this point, the Bitcoin node must make sure that the public key in the unlocking script matches the public key hash specified in the locking script. In other words, it must make sure that the "key" submitted by the unlocker and the "lock" preset by the UTXO match each other.

Furthermore, under the P2PKH scheme, after receiving the transaction, the Bitcoin node will concatenate the unlocking script ScriptSig provided by the user with the locking script ScriptPubkey of the UTXO to be unlocked, and execute them in the execution environment of the BTC script. The following figure shows the concatenation result before execution:

Readers may not be familiar with the execution environment of BTC scripts, so here is a brief introduction. A BTC script contains two kinds of elements: data and operation codes (opcodes). The elements are processed in order from left to right: data items are pushed onto a stack, and opcodes operate on the items already on the stack according to their defined logic, producing the final result. (We do not explain what a stack is in detail here; readers can easily look it up.)

Taking the figure above as an example, the unlocking script ScriptSig on the left, uploaded by the spender, contains his digital signature and public key, while the locking script ScriptPubKey on the right contains the opcodes and data that the UTXO's creator set when generating the UTXO (there is no need to understand each opcode here; the general idea is enough).
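A minimal sketch of such a stack machine follows. It supports only three opcodes (OP_ADD, OP_EQUAL, OP_DUP are real Bitcoin opcode names), represents data as Python integers for readability, and is not a faithful interpreter.

```python
def run_script(script):
    """Evaluate a list of items: ints are data pushes, strings are opcodes."""
    stack = []
    for item in script:
        if isinstance(item, int):
            stack.append(item)                 # data goes onto the stack
        elif item == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif item == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        elif item == "OP_DUP":
            stack.append(stack[-1])            # copy the top item
        else:
            raise ValueError(f"unsupported opcode {item}")
    return stack

# Reads left to right: push 2, push 3, add them, push 5, compare.
result = run_script([2, 3, "OP_ADD", 5, "OP_EQUAL"])
print(result)
```

A script is considered successful when it finishes with a non-zero value on top of the stack, as in this example.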

The DUP, HASH160, EQUALVERIFY and other opcodes in the locking script on the right side of the above figure are responsible for taking the hash of the public key carried in the unlocking script on the left and comparing it with the public key hash preset in the locking script. If the two are equal, it means that the public key uploaded in the unlocking script matches the public key hash preset in the locking script, which passes the first verification.

There is a problem, however. The content of a UTXO's locking script is public on-chain, so anyone can see the public key hash it contains, and anyone who knows the corresponding public key could upload it and falsely claim to be the designated recipient. Therefore, after checking the public key against the public key hash, the node must also verify that the transaction's initiator really controls that public key, which requires verifying a digital signature. The CHECKSIG opcode in the locking script is responsible for this.

To summarize, under the P2PKH scheme, the unlocking script submitted by the transaction initiator contains a public key and a digital signature. The public key must match the public key hash specified in the locking script, and the digital signature of the transaction must be correct. Only when these conditions are met can the UTXO be unlocked successfully.

(Animated schematic of script execution when unlocking under the P2PKH scheme. Source: https://learnmeabitcoin.com/technical/script )
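The P2PKH flow described in this section can be traced with a toy interpreter. Two loud assumptions: `hash160_standin` replaces the real HASH160 (RIPEMD160 of SHA256) with a truncated SHA-256 so the sketch runs on any Python build, and `toy_checksig` replaces ECDSA verification with a lookup of one known pair. Keys and signatures are placeholder byte strings.

```python
import hashlib

def hash160_standin(data: bytes) -> bytes:
    # ASSUMPTION: real HASH160 is RIPEMD160(SHA256(x)); this is only a stand-in.
    return hashlib.sha256(data).digest()[:20]

def run_p2pkh(script_sig, script_pubkey, sig_is_valid):
    """Concatenate unlocking + locking script and execute left to right."""
    stack = []
    for item in script_sig + script_pubkey:
        if isinstance(item, bytes):                 # data push
            stack.append(item)
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(hash160_standin(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False                        # public key does not match hash
        elif item == "OP_CHECKSIG":
            pubkey, sig = stack.pop(), stack.pop()
            stack.append(b"\x01" if sig_is_valid(sig, pubkey) else b"")
        else:
            raise ValueError(f"unsupported opcode {item}")
    return bool(stack) and stack[-1] != b""

sam_pubkey = b"sam-public-key"     # placeholder, not a real key
sam_sig = b"sam-signature"         # placeholder, not a real signature

def toy_checksig(sig, pubkey):
    # Stand-in for ECDSA verification: accepts exactly one known pair.
    return (sig, pubkey) == (sam_sig, sam_pubkey)

locking_script = ["OP_DUP", "OP_HASH160", hash160_standin(sam_pubkey),
                  "OP_EQUALVERIFY", "OP_CHECKSIG"]
unlocked = run_p2pkh([sam_sig, sam_pubkey], locking_script, toy_checksig)
print(unlocked)
```

Both checks are visible in the control flow: OP_EQUALVERIFY rejects a wrong public key, and OP_CHECKSIG rejects a correct public key presented with a bad signature.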

Of course, the Bitcoin network supports multiple transaction types: not only Pay to Public Key and Pay to Public Key Hash, but also P2SH (Pay to Script Hash) and others. Everything depends on how the custom locking script is set when the UTXO is created.

Note that under the P2SH scheme, a script hash is preset in the locking script, and the unlocking script must supply the full script content corresponding to that hash. The Bitcoin node checks that the supplied script matches the hash and then executes it. If the script defines multi-signature verification logic, the effect of a multi-signature wallet is achieved on the Bitcoin chain.

Of course, under the P2SH scheme, the UTXO creator must let the person who unlocks the UTXO in the future know the script content corresponding to the Script Hash in advance. As long as both parties know the content of this Script, we can implement more complex business logic than multi-signature.

One thing to note is that the Bitcoin chain (its blocks) does not directly record which UTXOs belong to which addresses. It only records the public key hash or script hash that can unlock each UTXO. The corresponding address (the random-looking string shown in a wallet interface) can, however, be computed quickly from that public key hash or script hash.

The reason a block explorer or wallet interface can show that a given address holds a certain amount of bitcoin is that the explorer or wallet has parsed this data for you: it scans all blocks, computes the "address" from the public key hash or script hash declared in each locking script, and then displays how many bitcoins sit under that address.
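For legacy address formats, the computation from a hash to an address is Base58Check encoding: a version byte, the 20-byte hash, and a 4-byte checksum, written in base 58. The sketch below covers only mainnet P2PKH (version 0x00) and P2SH (version 0x05) addresses; SegWit addresses use a different encoding (Bech32) that is not shown. The function names are our own.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(version: bytes, payload: bytes) -> str:
    data = version + payload
    # Checksum: first 4 bytes of SHA256(SHA256(version + payload)).
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is written as the character "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

def p2pkh_address(pubkey_hash: bytes) -> str:
    if len(pubkey_hash) != 20:
        raise ValueError("expected a 20-byte public key hash")
    return base58check(b"\x00", pubkey_hash)   # 0x00: mainnet P2PKH

def p2sh_address(script_hash: bytes) -> str:
    if len(script_hash) != 20:
        raise ValueError("expected a 20-byte script hash")
    return base58check(b"\x05", script_hash)   # 0x05: mainnet P2SH

# An all-zero hash is used purely as a test input.
print(p2pkh_address(bytes(20)))
print(p2sh_address(bytes(20)))
```

The version byte is why mainnet P2PKH addresses start with "1" and P2SH addresses start with "3".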

Segregated Witness and Witness

Once we understand the idea of P2SH, we are one step closer to Taproot, which BitVM relies on. Before that, though, we need to understand an important concept: the witness and Segregated Witness.

Looking back at the unlocking and locking scripts and the UTXO unlocking process described earlier, we find a problem: the transaction's digital signature is contained in the unlocking script, yet a signature cannot cover the unlocking script itself (the data being signed cannot include the signature). The signature can therefore only cover the parts of the transaction outside the unlocking script; it binds to the main body of the transaction data but not to the transaction data in its entirety.

As a result, even if a middleman slightly tampers with a transaction's unlocking script, signature verification still passes. For example, a Bitcoin node or mining pool can insert extra data into the unlocking script. This changes the transaction data slightly without affecting its validity or outcome, but the resulting transaction hash (transaction ID) changes as well. This is known as the transaction malleability problem.

The drawback is this: suppose you plan to issue several transactions in a row that depend on one another in order (for example, transaction 3 references an output of transaction 2, and transaction 2 references an output of transaction 1). Each later transaction must reference the ID (hash) of the one before it. Any middleman, such as a mining pool or a Bitcoin node, can tweak the unlocking script so that the hash of a transaction once it is on-chain differs from what you expected, and the chain of dependent transactions you prepared in advance becomes invalid.

In fact, in the DLC Bridge and BitVM2 solutions, transactions with a sequential order are built in batches, so the scenarios mentioned above are not uncommon.

Simply put, transaction malleability arises because the unlocking script is included when the transaction ID (hash) is computed, and middlemen such as Bitcoin nodes can tweak the unlocking script so that the transaction ID no longer matches the user's expectation. It is a piece of historical baggage left over from an oversight in Bitcoin's early design.

The later Segregated Witness (SegWit) upgrade completely decouples the transaction ID from the unlocking script: the unlocking script data is no longer included when the transaction hash is computed. A locking script that follows the original (version 0) SegWit rules begins with the opcode OP_0 as a marker, and the corresponding unlocking script is renamed from ScriptSig to Witness.

Under the Segregated Witness rules, the transaction malleability problem is properly solved, and you no longer need to worry about the transaction data you send to Bitcoin nodes being tweaked. There is no need to overthink it, though: P2WSH (Pay to Witness Script Hash), the SegWit counterpart of P2SH, works essentially the same way as the P2SH described above. You preset a script hash in the UTXO's locking script, and whoever later submits the unlocking data provides the script content corresponding to that hash on-chain, where it is executed.
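The difference between the two ways of computing a transaction ID can be shown with a deliberately simplified model. The "serialization" here is just byte concatenation, which is an assumption made for brevity; real transactions also contain version numbers, counts, sequence numbers, and a locktime. Only the double SHA-256 and the inclusion or exclusion of the unlocking data reflect the actual rules.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def legacy_txid(prev_outpoint: bytes, script_sig: bytes, outputs: bytes) -> str:
    # Simplified: the unlocking script is part of the hashed data.
    return double_sha256(prev_outpoint + script_sig + outputs).hex()

def segwit_txid(prev_outpoint: bytes, witness: bytes, outputs: bytes) -> str:
    # The witness travels with the transaction but is left out of the txid.
    return double_sha256(prev_outpoint + outputs).hex()

prev_outpoint = b"a1:0"                  # placeholder reference to a UTXO
outputs = b"pay 30000 sats to Sam"       # placeholder output data
unlocking = b"<sig> <pubkey>"            # placeholder unlocking data
tweaked = b"\x00" + unlocking            # a middleman pads the unlocking data

legacy_changed = (legacy_txid(prev_outpoint, unlocking, outputs)
                  != legacy_txid(prev_outpoint, tweaked, outputs))
segwit_changed = (segwit_txid(prev_outpoint, unlocking, outputs)
                  != segwit_txid(prev_outpoint, tweaked, outputs))
print(legacy_changed, segwit_changed)
```

With the second rule, a pre-built follow-up transaction that references this txid stays valid no matter how the witness bytes are altered in transit.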
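A P2WSH locking script is short enough to build by hand: OP_0 followed by a 32-byte push of the SHA-256 hash of the script. The sketch below constructs one and performs only the first spending check, namely that the revealed script hashes to the committed value. Executing the revealed script afterwards is not modeled, the helper names are our own, and the one-byte example script (OP_1, which always succeeds) is used purely for illustration.

```python
import hashlib

def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """Build OP_0 <32-byte SHA256(witness_script)>."""
    script_hash = hashlib.sha256(witness_script).digest()
    return b"\x00" + b"\x20" + script_hash   # 0x00 = OP_0, 0x20 = push 32 bytes

def reveals_committed_script(script_pubkey: bytes, revealed_script: bytes) -> bool:
    """First spending check: the revealed script must hash to the committed value."""
    if len(script_pubkey) != 34 or script_pubkey[:2] != b"\x00\x20":
        return False                          # not a P2WSH locking script
    return hashlib.sha256(revealed_script).digest() == script_pubkey[2:]

witness_script = b"\x51"                      # OP_1, illustration only
locking_script = p2wsh_script_pubkey(witness_script)
print(locking_script.hex())
print(reveals_committed_script(locking_script, witness_script))
```

However large the committed script is, the locking script stays at 34 bytes; the full script only appears on-chain when the UTXO is spent.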

But what if the script you want to implement is very large and contains a lot of code, so that the complete script cannot be submitted to the Bitcoin chain through conventional means (each block has a size limit)? This is where Taproot comes in: it reduces the amount of script content that has to appear on-chain, and BitVM is a complex solution built on Taproot.

In the next article of "Approaching BTC", we will explain in detail the other, more complex technologies related to BitVM, such as Taproot and pre-signing, so stay tuned!