Author: Chakra; Translator: 0xjs@Golden Finance
This is Part III of Chakra’s series on Bitcoin scalability.
For the first part, please refer to Golden Finance’s previous article “A Review of Bitcoin’s Native Expansion Plans: SegWit and Taproot”.
For the second part, please refer to Golden Finance’s previous article “Bitcoin Scalability: Analysis of Layer 2 Solutions and Related Projects”.
Part III follows.
Overview
Compared with Turing-complete blockchains such as Ethereum, Bitcoin Script is considered extremely restrictive: it supports only basic operations and lacks even multiplication and division. More importantly, scripts have almost no access to the blockchain's own data, which severely limits flexibility and programmability. For this reason, there has been a long-running effort to give Bitcoin Script introspection capabilities.
Introspection refers to the ability of Bitcoin scripts to inspect and constrain transaction data. This lets a script control how funds are spent based on specific transaction details, enabling more complex functionality. Currently, most Bitcoin opcodes either push user-supplied data onto the stack or manipulate data already on the stack. Introspection opcodes, by contrast, can push data from the current transaction (e.g. timestamp, amount, txid) onto the stack, allowing finer-grained control over how a UTXO is spent.
As of now, only three main opcodes in Bitcoin Script support introspection: CHECKLOCKTIMEVERIFY, CHECKSEQUENCEVERIFY, and CHECKSIG, along with the CHECKSIG variants CHECKSIGVERIFY, CHECKSIGADD, CHECKMULTISIG, and CHECKMULTISIGVERIFY.
Covenants, in simple terms, are restrictions on how coins may be transferred, letting users specify how a UTXO can be spent. Many covenants are implemented through introspection opcodes, and discussion of introspection is now filed under the Covenants topic on Bitcoin Optech.
Bitcoin currently has two covenants, CSV (CheckSequenceVerify) and CLTV (CheckLockTimeVerify). Both are time-based and underpin many scaling solutions, such as the Lightning Network. This shows how heavily Bitcoin's scaling solutions rely on introspection and covenants.
How do we attach conditions to the transfer of coins? In the cryptocurrency space, the most common method is a commitment, usually implemented with a hash. Proving that the transfer conditions are met also requires a signature mechanism for verification. Consequently, covenant proposals involve many adjustments to hashes and signatures.
Below, we describe the most widely discussed covenant opcode proposals.
CTV (CheckTemplateVerify) BIP-119
CTV (CheckTemplateVerify) is a Bitcoin upgrade proposal specified in BIP-119 that has attracted widespread attention from the community. CTV lets an output script commit to a template for the transaction that spends it, covering fields such as nVersion, nLockTime, the scriptSig hash, input count, sequences hash, output count, outputs hash, and input index. The template is enforced through a hash commitment: when the funds are spent, the script checks whether the hash of the specified fields of the spending transaction matches the hash committed in the script of the output being spent. This effectively restricts when, how, and in what amounts that UTXO can be spent.
Notably, the input TXID is excluded from this hash. The exclusion is necessary because, in both legacy and SegWit transactions, when the default SIGHASH_ALL signature type is used, the TXID depends on the value in the scriptPubKey. Including the TXID would create a circular dependency in the hash commitment, so the commitment could never be constructed.
CTV performs introspection by hashing the specified transaction fields directly and comparing the result with the commitment on the stack. This approach needs little block space but offers limited flexibility.
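The template commitment described above can be illustrated with a minimal sketch. It is not byte-compatible with BIP-119: the helper names `ser_output` and `template_hash` are hypothetical, the serialization is simplified, and the scriptSig hash is left out on the assumption that all scriptSigs are empty. The point is only that the preimage contains no input TXID, and that any change to a committed field changes the hash.

```python
import hashlib
import struct


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def ser_output(amount_sat: int, script_pubkey: bytes) -> bytes:
    # 8-byte little-endian amount, then a one-byte length prefix and the script.
    assert len(script_pubkey) < 253, "sketch only handles short scripts"
    return struct.pack("<q", amount_sat) + bytes([len(script_pubkey)]) + script_pubkey


def template_hash(version, locktime, sequences, outputs, input_index):
    # Hypothetical helper: commits to the template fields listed in the text.
    # No input TXID appears anywhere in the preimage.
    preimage = struct.pack("<i", version)
    preimage += struct.pack("<I", locktime)
    preimage += struct.pack("<I", len(sequences))
    preimage += sha256(b"".join(struct.pack("<I", s) for s in sequences))
    preimage += struct.pack("<I", len(outputs))
    preimage += sha256(b"".join(outputs))
    preimage += struct.pack("<I", input_index)
    return sha256(preimage)


payout = ser_output(50_000, bytes.fromhex("51"))
committed = template_hash(2, 0, [0xFFFFFFFF], [payout], 0)

# A spending transaction with identical template fields matches the commitment.
assert template_hash(2, 0, [0xFFFFFFFF], [payout], 0) == committed

# Changing the output amount by one satoshi breaks the match.
tampered = ser_output(50_001, bytes.fromhex("51"))
assert template_hash(2, 0, [0xFFFFFFFF], [tampered], 0) != committed
```

Because the commitment is a single 32-byte hash, the script stays small, but every committed field has to be fixed in advance, which is the inflexibility noted above.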
Bitcoin Layer 2 solutions such as the Lightning Network are built on pre-signed transactions. Pre-signing means generating and signing transactions in advance but not broadcasting them until certain conditions are met. In essence, CTV implements a stricter form of pre-signing: the commitment is published on-chain, and spending is constrained to the predefined template.
CTV was originally proposed to relieve Bitcoin congestion, a use known as congestion control. When congestion is severe, a single transaction can use CTV to commit to multiple future transactions, which avoids broadcasting them all during peak hours; the actual payments complete once the congestion eases. This could be particularly useful during a run on an exchange. Templates can also be used to implement vaults that guard against theft: because the flow of funds is predetermined, an attacker cannot redirect a CTV-encumbered UTXO to an address of their own.
CTV can significantly enhance Layer 2 networks. For example, in the Lightning Network, CTV can create timeout trees and channel factories by expanding a single UTXO into a CTV tree, allowing multiple state channels to be opened with just one transaction and one confirmation. CTV also supports atomic transactions in the Ark protocol via ATLCs.
APO (SIGHASH_ANYPREVOUT) BIP-118
BIP-118 introduces a new signature hash flag for tapscript, SIGHASH_ANYPREVOUT (APO), designed to enable more flexible spending logic. APO and CTV have much in common. To break the circular dependency between scriptPubKeys and TXIDs, APO excludes the relevant input information from the signed data and signs only the outputs, so that a transaction can be bound dynamically to different UTXOs.
Logically, the signature verification opcode OP_CHECKSIG (and its variants) performs three functions:
1. Assemble the parts of a spending transaction.
2. Hash them.
3. Verify that the hash has been signed by the given key.
The details of what gets signed are flexible: the SIGHASH flag determines which fields of the transaction the signature covers. As defined for the signature opcodes in BIP 342, the SIGHASH flags are SIGHASH_ALL, SIGHASH_NONE, SIGHASH_SINGLE, and SIGHASH_ANYONECANPAY. SIGHASH_ANYONECANPAY controls the inputs, while the other three control the outputs.
SIGHASH_ALL is the default flag and signs all outputs; SIGHASH_NONE signs no outputs; SIGHASH_SINGLE signs one specific output. SIGHASH_ANYONECANPAY can be combined with any of these three. If it is set, only the input being signed is covered; otherwise, all inputs are signed.
Clearly, none of these flags removes the dependence on inputs: even SIGHASH_ANYONECANPAY still commits to one input.
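The flag combinations above can be summarized in a small lookup. The numeric flag values are the standard ones; the function name `committed_parts` and the descriptive strings are only illustrative.

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_ANYONECANPAY = 0x80

OUTPUT_MODES = {
    SIGHASH_ALL: "all outputs",
    SIGHASH_NONE: "no outputs",
    SIGHASH_SINGLE: "one specific output",
}


def committed_parts(flag: int) -> dict:
    """Describe which inputs and outputs a signature with this flag covers."""
    base = flag & 0x7F  # strip the ANYONECANPAY bit
    if base not in OUTPUT_MODES:
        raise ValueError(f"unsupported sighash flag: {flag:#x}")
    if flag & SIGHASH_ANYONECANPAY:
        inputs = "only the input being signed"
    else:
        inputs = "all inputs"
    return {"inputs": inputs, "outputs": OUTPUT_MODES[base]}


for f in (SIGHASH_ALL, SIGHASH_SINGLE | SIGHASH_ANYONECANPAY):
    print(hex(f), committed_parts(f))
```

In every row of this table the "inputs" entry is non-empty, which is the limitation that motivates a new flag.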
BIP 118 therefore proposes SIGHASH_ANYPREVOUT. An APO signature does not commit to the UTXO being spent (the previous output, or PREVOUT) and only needs to sign the outputs, which allows more flexible control over spending. A covenant can be built by constructing a transaction in advance and creating a corresponding one-time signature and public key: funds sent to that public key's address can then only be spent through the pre-built transaction. APO's flexibility also helps with transaction repair: if a transaction is stuck because its fee is too low, a replacement with a higher fee can be created without a new signature. For multi-signature wallets, not depending on the spent input also makes operations more convenient.
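A toy digest makes the rebinding property concrete. This is not the BIP-118 digest, which covers more fields; `toy_sighash` and its parameters are hypothetical, and the only thing modeled is whether the previous outpoint is hashed.

```python
import hashlib


def toy_sighash(prevout: str, outputs: list, any_prevout: bool) -> bytes:
    """Hash the outputs, and the spent outpoint unless any_prevout is set."""
    h = hashlib.sha256()
    if not any_prevout:
        h.update(prevout.encode())
    for script, amount_sat in outputs:
        h.update(script.encode())
        h.update(amount_sat.to_bytes(8, "little"))
    return h.digest()


outputs = [("pay-to-alice", 40_000), ("pay-to-bob", 9_000)]
utxo_1 = "aaaa:0"  # placeholder outpoints, not real txids
utxo_2 = "bbbb:1"

# With an ordinary digest, a signature is tied to one specific UTXO.
assert toy_sighash(utxo_1, outputs, False) != toy_sighash(utxo_2, outputs, False)

# With the outpoint left out, the same digest (and so the same signature)
# is valid no matter which UTXO the transaction ends up spending.
assert toy_sighash(utxo_1, outputs, True) == toy_sighash(utxo_2, outputs, True)
```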
Because the cycle between scriptPubKeys and input TXIDs is eliminated, APO can perform introspection by supplying output data in the witness, although this consumes additional witness space.
For off-chain protocols such as the Lightning Network and vaults, APO reduces the need to store intermediate states, greatly lowering storage requirements and complexity. The most direct use case of APO is Eltoo (also known as LN-Symmetry), which simplifies channel factories, enables lightweight and cheap watchtowers, and allows unilateral exits without leaving erroneous states behind, improving the Lightning Network in many ways. APO can also be used to emulate CTV, although this requires participants to store signatures and pre-signed transactions, which is more expensive and less efficient than CTV.
The main criticism of APO is that it requires a new key version, so it cannot be introduced in a simply backward-compatible way. The new signature hash type may also introduce double-spending risks. After extensive community discussion, APO added regular signatures on top of the original signature mechanism to alleviate the security concerns, earning it the BIP-118 designation.
OP_VAULT BIP-345
BIP-345 proposes two new opcodes, OP_VAULT and OP_VAULT_RECOVER, which, when used together with CTV, enable a specialized covenant that lets users enforce a delay on the spending of specific coins. During this delay, a previously initiated withdrawal can be “undone” via a recovery path.
A user creates a vault by creating a specific Taproot address whose MAST contains at least two scripts: one with the OP_VAULT opcode to handle the intended withdrawal process, and another with the OP_VAULT_RECOVER opcode to ensure that the funds can be recovered at any time before a withdrawal completes.
How does OP_VAULT achieve interruptible, time-locked withdrawals? It replaces the spent OP_VAULT script with a specified script, effectively updating a single leaf of the MAST while leaving the remaining Taproot leaves unchanged. This design is similar to TLUV, except that OP_VAULT does not support updating the internal key.
Introducing a template during the script update makes it possible to restrict payments. The timelock parameter is specified by OP_VAULT, and the template of the CTV opcode restricts the set of outputs that can be paid through this script path.
BIP-345 is designed specifically for vaults. It uses OP_VAULT and OP_VAULT_RECOVER to give users a secure custody method in which highly secure keys (such as paper wallets or distributed multi-signature setups) serve as the recovery path, while regular payments are subject to a configured delay. The user's device continuously monitors spending from the vault, and the user can initiate recovery if an unexpected transfer occurs.
Implementing a vault via BIP-345 requires attention to fees, especially for recovery transactions. Possible solutions include CPFP (Child Pays For Parent), ephemeral anchors, and the new SIGHASH_GROUP signature hash flag.
TLUV (TapleafUpdateVerify)
The TLUV proposal is built around Taproot and is designed to solve the problem of exiting a shared UTXO efficiently. Its guiding principle is that when a Taproot output is spent, the internal key and the MAST (the tapscript tree) can be partially updated through cryptographic transformations that exploit the internal structure of the Taproot address, as described by the TLUV script. This makes covenant functionality possible.
The idea of TLUV is to derive a new Taproot address from the input currently being spent by introducing a new opcode, TAPLEAF_UPDATE_VERIFY. It does so through one or more of the following operations:
Update the internal public key
Prune the Merkle path
Remove the currently executing script
Add a new step at the end of the Merkle path
Specifically, TLUV takes three inputs:
A specification of how to update the internal public key.
A new step for the Merkle path.
Whether to remove the current script and/or how many steps of the Merkle path to prune.
The TLUV opcode computes the updated scriptPubKey and verifies that the output corresponding to the current input pays to this scriptPubKey.
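The Merkle side of this update can be sketched as follows. This is one possible reading of the steps above, not the proposal's exact semantics: it uses plain SHA-256 with sorted children instead of Taproot's tagged hashes, it ignores the internal key update entirely, and it assumes the new step is attached at the leaf end of the path. The names `branch`, `fold`, and `tluv_root` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def branch(a: bytes, b: bytes) -> bytes:
    # Children are sorted before hashing, so no left/right flags are needed.
    return h(min(a, b) + max(a, b))


def fold(node, path):
    # Walk from the leaf toward the root, combining with each sibling in turn.
    for sibling in path:
        node = sibling if node is None else branch(node, sibling)
    return node


def tluv_root(current_leaf, path, remove_current, prune_steps=0, new_step=None):
    """Return the Merkle root the spending output must commit to."""
    node = None if remove_current else current_leaf
    remaining = path[prune_steps:]
    if new_step is not None:
        remaining = [new_step] + remaining
    return fold(node, remaining)


# A four-leaf script tree; the spender is executing leaf A.
A, B, C, D = (h(x) for x in (b"A", b"B", b"C", b"D"))
CD = branch(C, D)
old_root = branch(branch(A, B), CD)
path_for_A = [B, CD]
assert fold(A, path_for_A) == old_root

# Exit: delete A's script, keep every other leaf spendable.
assert tluv_root(A, path_for_A, remove_current=True) == branch(B, CD)

# Replace: swap A for a new script A2, as in an OP_VAULT-style leaf update.
A2 = h(b"A2")
assert tluv_root(A, path_for_A, True, new_step=A2) == branch(branch(A2, B), CD)
```

The spender only needs the Merkle path that is already revealed when spending leaf A, so the other leaves stay hidden during the update.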
TLUV is inspired by the concept of CoinPool. Today, a joint pool can be created with nothing more than pre-signed transactions, but permissionless exits require an exponentially growing number of signatures. TLUV allows permissionless exits without any pre-signatures. For example, a group of partners can use Taproot to build a shared UTXO that pools their funds. They can move funds internally using the Taproot key, or sign jointly to make external payments. An individual can exit the shared pool at any time, removing their own payment path, while the others can still spend through the original paths, and the exit reveals no additional information about the remaining members. This model is more efficient and more private than non-pooled transactions.
The TLUV opcode enforces partial spending restrictions by updating the original Taproot tree, but it does not provide introspection of output amounts. A new opcode, IN_OUT_AMOUNT, is therefore also needed. It pushes two items onto the stack: the amount of the UTXO being spent by this input and the amount of the corresponding output. The script author using TLUV then applies arithmetic operators to verify that the funds are properly preserved in the updated scriptPubKey.
Introspection of output amounts adds complexity, because amounts in satoshis need up to 51 bits to represent, while Script only allows 32-bit arithmetic. This requires either redefining opcode behavior to upgrade the arithmetic operators in Script, or replacing IN_OUT_AMOUNT with SIGHASH_GROUP.
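The 51-bit figure follows directly from the supply cap, and the gap to Script's arithmetic range is easy to check. The variable names below are only for illustration.

```python
# Maximum supply: 21 million BTC, 100 million satoshis per BTC.
MAX_SUPPLY_SAT = 21_000_000 * 100_000_000

# Bits needed to represent the largest possible amount.
bits_needed = MAX_SUPPLY_SAT.bit_length()

# Script arithmetic operates on 4-byte signed operands.
SCRIPT_NUM_MAX = 2**31 - 1

print(bits_needed)                   # 51
print(SCRIPT_NUM_MAX / 100_000_000)  # 21.47483647 (BTC)
```

Any amount above roughly 21.47 BTC therefore cannot be handled by the existing arithmetic opcodes as a single number.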
TLUV has the potential to become a solution for decentralized Layer 2 funding pools, but the reliability of its Taproot tree adjustments still needs to be confirmed.
MATT
MATT (Merkleize All The Things) aims to achieve three goals: Merkleizing the state, Merkleizing the script, and Merkleizing the execution, thereby enabling general smart contracts.
Merkleizing the state: This involves building a Merkle tree in which each leaf represents a hash of part of the state and the Merkle root represents the overall state of the contract.
Merkleizing the script: This refers to using Tapscript to form a MAST in which each leaf represents a possible state transition path.
Merkleizing the execution: Execution is Merkleized through cryptographic commitments and a fraud challenge mechanism. For any computational function, a participant can compute it off-chain and then publish a commitment f(x)=y. If another participant finds that the result is wrong and claims f(x)=z instead, they can initiate a challenge. Arbitration is performed through binary search, similar in principle to an Optimistic Rollup.
Merkleized Execution
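The binary-search arbitration can be sketched with two execution traces. This is a simplified model: a real MATT contract would commit to Merkle roots of the trace rather than exchange full lists, and the transition function `step`, the helper `run`, and the party labels are all hypothetical.

```python
def step(x: int) -> int:
    # Hypothetical single computation step that an arbiter can re-execute cheaply.
    return (x * 3 + 1) % 1_000_003


def run(x0: int, n: int, cheat_at=None) -> list:
    """Produce the trace of n steps; optionally corrupt the result of one step."""
    trace = [x0]
    for i in range(n):
        nxt = step(trace[-1])
        if i == cheat_at:
            nxt += 1
        trace.append(nxt)
    return trace


def find_disputed_step(trace_a: list, trace_b: list) -> int:
    """Binary search for the last index where both parties still agree."""
    assert trace_a[0] == trace_b[0] and trace_a[-1] != trace_b[-1]
    lo, hi = 0, len(trace_a) - 1  # invariant: agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if trace_a[mid] == trace_b[mid]:
            lo = mid
        else:
            hi = mid
    return lo


def arbitrate(trace_a: list, trace_b: list) -> str:
    """Re-execute only the single disputed step and name the honest party."""
    i = find_disputed_step(trace_a, trace_b)
    return "A" if trace_a[i + 1] == step(trace_a[i]) else "B"


honest = run(7, 64)
cheater = run(7, 64, cheat_at=20)
assert find_disputed_step(honest, cheater) == 20
assert arbitrate(honest, cheater) == "A"
assert arbitrate(cheater, honest) == "B"
```

The dispute over a 64-step computation is narrowed to one step in six rounds, so only that single step ever needs to be verified by the arbiter.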
To implement MATT, the Bitcoin scripting language needs the following capabilities:
Force an output to have a specific script (and amount)
Attach a piece of data to an output
Read data from the current input (or another input)
The second point is crucial: dynamic data means that state can be computed from input data supplied by the spender, which makes it possible to simulate a state machine that determines the next state and any additional data. The MATT scheme achieves this through the OP_CHECKCONTRACTVERIFY (OP_CCV) opcode, which merges the previously proposed OP_CHECKOUTPUTCONTRACTVERIFY and OP_CHECKINPUTCONTRACTVERIFY opcodes and uses an additional flag parameter to specify the target of the operation.
Controlling output amounts: The most straightforward approach is direct introspection; however, output amounts are 64-bit numbers and need 64-bit arithmetic, which introduces significant complexity in Bitcoin Script. OP_CCV instead uses a deferred check, like OP_VAULT: the amounts of all inputs that CCV directs to the same output are summed, and the sum serves as a lower bound for that output's amount. The check is deferred because it is performed when the transaction as a whole is validated, not during script evaluation of the individual inputs.
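The deferred amount check can be illustrated as a transaction-level pass that runs after all input scripts have been evaluated. The function name and the data layout are hypothetical; this is a sketch of the rule as described above, not of the OP_CCV implementation.

```python
from collections import defaultdict


def check_deferred_amounts(inputs: list, outputs: list) -> bool:
    """inputs: (amount_sat, target_output_index) pairs; outputs: amounts in satoshis."""
    required = defaultdict(int)
    # Each input's script only records which output it assigns its amount to.
    for amount_sat, target in inputs:
        required[target] += amount_sat
    # Once all inputs are processed, each targeted output must carry
    # at least the sum of the amounts assigned to it.
    return all(
        idx < len(outputs) and outputs[idx] >= total
        for idx, total in required.items()
    )


# Two inputs funnel 30,000 and 20,000 sat into output 0.
assert check_deferred_amounts([(30_000, 0), (20_000, 0)], [50_000])
assert not check_deferred_amounts([(30_000, 0), (20_000, 0)], [49_999])
```

No script ever does arithmetic on a 64-bit amount here; the summation happens outside Script, which is how the approach sidesteps the arithmetic limitation.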
Given the ubiquity of fraud proofs, some variant of the MATT contract should be able to implement all types of smart contracts or Layer 2 constructions, although the additional requirements (such as locked-up funds and challenge period delays) need to be assessed carefully, and further research is needed to determine which applications can accept these trade-offs. One example is using cryptographic commitments and fraud challenge mechanisms to emulate an OP_ZK_VERIFY function and so implement trustless rollups on Bitcoin.
In practice, this has already begun. Johan Torås Halseth implemented elftrace using the OP_CHECKCONTRACTVERIFY opcode from the MATT soft fork proposal. It enables any program compiled to RISC-V to be verified on the Bitcoin blockchain, allowing a party to a contract to access funds through contract verification and thereby bridging to Bitcoin's native verification.
CSFS (OP_CHECKSIGFROMSTACK)
From the introduction to APO, we know that OP_CHECKSIG (and its related opcodes) is responsible for assembling the transaction, hashing it, and verifying the signature. However, the message these opcodes verify is always the transaction as serialized by the opcode itself; no other message can be specified. In short, the role of OP_CHECKSIG (and its related opcodes) is to verify, through the signature mechanism, that the UTXO spent as a transaction input is authorized by the signature holder, thereby protecting the security of Bitcoin.
CSFS, as the name implies, checks a signature from the stack. The CSFS opcode takes three parameters from the stack (a signature, a message, and a public key) and verifies that the signature is valid. This means that any message can be passed to the stack through the witness and verified with CSFS, which enables a number of innovations on Bitcoin.
The flexibility of CSFS lets it implement mechanisms such as payment signatures, authorization delegation, oracle contracts, and double-spend protection bonds, and, more importantly, transaction introspection. The principle of introspection with CSFS is simple: the transaction content used by OP_CHECKSIG is pushed onto the stack through the witness, and the same public key and signature are checked with both OP_CSFS and OP_CHECKSIG. If both checks pass, the arbitrary message passed to OP_CSFS must be identical to the serialized spending transaction (and other data) implicitly used by OP_CHECKSIG. The stack then holds verified transaction data, which other opcodes can use to impose restrictions on the spending transaction.
CSFS often appears together with OP_CAT, because OP_CAT can concatenate the different fields of a transaction to complete the serialization, allowing more precise selection of the transaction fields needed for introspection. Without OP_CAT, the script cannot recompute the hash from data that can be checked piece by piece, so all it can really do is check whether the hash equals a specific value, which means the coins can only be spent through one specific transaction.
CSFS can emulate opcodes such as CLTV, CSV, CTV, and APO, making it a versatile introspection opcode that also contributes to Bitcoin Layer 2 scalability solutions. Its disadvantage is that it requires a full copy of the signed transaction on the stack, which can significantly increase the size of transactions that use CSFS for introspection. Single-purpose introspection opcodes like CLTV and CSV have very little overhead by comparison, but each new special-purpose introspection opcode requires a consensus change.
TXHASH (OP_TXHASH)
OP_TXHASH is a simple introspection opcode that lets the user select specific transaction fields and push their hash onto the stack. Specifically, OP_TXHASH pops a txhash flag from the stack, computes the (tagged) txhash according to that flag, and pushes the resulting hash back onto the stack.
Due to the similarities between TXHASH and CTV, there has been a lot of discussion within the community about the two.
TXHASH can be seen as a generalized upgrade of CTV. It provides more advanced transaction templates that let users specify explicitly which parts of the spending transaction are committed, which solves many problems related to transaction fees. Unlike other covenant opcodes, TXHASH does not require a copy of the necessary data in the witness, further reducing storage requirements. Unlike CTV, TXHASH is not NOP-compatible and can only be implemented in tapscript. The combination of TXHASH and CSFS can serve as an alternative to CTV and APO.
From a contract construction perspective, TXHASH is better suited to creating "additive contracts", in which all the parts of the transaction data you wish to fix are pushed onto the stack, hashed together, and the resulting hash is verified to match a fixed value. CTV is better suited to creating "subtractive contracts", in which all the parts of the transaction data you wish to keep free are pushed onto the stack. Then, using a rolling SHA256 opcode, hashing starts from a fixed intermediate state that commits to a prefix of the transaction hash data, and the free parts are hashed into this intermediate state.
The TxFieldSelector field defined in the TXHASH specification is expected to be extended to other opcodes, such as OP_TX.
The BIP related to TXHASH is currently in Draft status on GitHub and has not yet been assigned a number.
OP_CAT
OP_CAT is a somewhat mysterious opcode: Satoshi Nakamoto originally disabled it for security reasons, but it has recently sparked heated discussion among Bitcoin core developers and has even spawned a meme culture online. The proposal has since been assigned the number BIP-347 and is widely described as the BIP most likely to be adopted in the near future.
The behavior of OP_CAT is very simple: it concatenates two elements from the stack. How does that implement covenant functionality?
The ability to concatenate two elements corresponds to a powerful cryptographic data structure: the Merkle tree. Building a Merkle tree requires only concatenation and hashing, and hash functions are already available in Bitcoin Script. With OP_CAT, we can therefore in theory verify Merkle proofs in Bitcoin Script, which is one of the most common lightweight verification methods in blockchain technology.
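To make this concrete, here is a tiny stack interpreter that verifies a Merkle proof using only concatenation, hashing, and two stack helpers. It is a toy model rather than a real Script engine: `run_script` and `merkle_proof_script` are hypothetical helpers, and in an actual spend the proof would arrive through the witness instead of being embedded in the script.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def run_script(script: list, stack: list) -> list:
    """Execute a list of opcodes (str) and data pushes (bytes) on a copy of stack."""
    stack = list(stack)
    for op in script:
        if isinstance(op, bytes):
            stack.append(op)
        elif op == "OP_CAT":
            top = stack.pop()
            below = stack.pop()
            stack.append(below + top)
        elif op == "OP_SWAP":
            top = stack.pop()
            below = stack.pop()
            stack += [top, below]
        elif op == "OP_SHA256":
            stack.append(sha256(stack.pop()))
        elif op == "OP_EQUAL":
            stack.append(b"\x01" if stack.pop() == stack.pop() else b"")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


def merkle_proof_script(proof: list, root: bytes) -> list:
    """proof: (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    script = []
    for sibling, sibling_is_left in proof:
        script.append(sibling)
        if sibling_is_left:
            script.append("OP_SWAP")  # put the sibling on the left-hand side
        script += ["OP_CAT", "OP_SHA256"]
    return script + [root, "OP_EQUAL"]


# Four-leaf tree; prove that leaf 2 is included.
leaves = [sha256(x) for x in (b"a", b"b", b"c", b"d")]
n01 = sha256(leaves[0] + leaves[1])
n23 = sha256(leaves[2] + leaves[3])
root = sha256(n01 + n23)
proof = [(leaves[3], False), (n01, True)]

assert run_script(merkle_proof_script(proof, root), [leaves[2]]) == [b"\x01"]
assert run_script(merkle_proof_script(proof, root), [leaves[0]]) == [b""]
```

Each level of the tree costs one push, one OP_CAT, and one OP_SHA256, so the script grows with the depth of the tree rather than with the number of leaves.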
As mentioned earlier, CSFS can implement the general Covenant solution with the help of OP_CAT. In fact, even without CSFS, OP_CAT itself can also implement transaction introspection using the structure of Schnorr signatures.
In a Schnorr signature, the message to be signed is assembled from the fields of the spending transaction, which contain its main elements. By placing these fields in the scriptPubKey or the witness and using OP_CAT together with OP_SHA256, we can construct a Schnorr signature and verify it with OP_CHECKSIG. If the verification passes, the stack retains the verified transaction data, which enables transaction introspection. This allows us to extract and "inspect" individual parts of a transaction, such as its inputs, outputs, destination addresses, or the amount of bitcoin involved.
For specific cryptographic principles, please refer to Andrew Poelstra’s article “CAT and Schnorr Tricks”.
In summary, OP_CAT’s versatility enables it to emulate almost any covenant opcode. Many covenant opcodes rely on OP_CAT’s functionality, which greatly raises its priority for being merged. In theory, relying solely on OP_CAT and existing Bitcoin opcodes, it may be possible to build a trust-minimized BTC ZK Rollup. Starknet, Chakra, and other ecosystem partners are actively working toward this goal.
Conclusion
Having explored the various strategies for scaling Bitcoin and enhancing its programmability, it is clear that the path forward involves a convergence of native improvements, off-chain computation, and more sophisticated scripting capabilities.
Without a flexible base layer, it is impossible to build a more flexible second layer.
Scaling through off-chain computation is the trend of the future, but Bitcoin's programmability needs a breakthrough to better support this kind of scaling and to become a truly global currency.
However, the nature of computation on Bitcoin is fundamentally different from that on Ethereum. Bitcoin supports only "verification" as a form of computation and cannot perform general computation, while Ethereum is computational by nature, with verification as a byproduct of computation. One detail makes the difference visible: Ethereum charges a gas fee even for transactions that fail to execute, while Bitcoin does not.
Covenants are a form of smart contract based on verification rather than computation. With the exception of a few Satoshi fundamentalists, nearly everyone seems to agree that covenants are a good way to improve Bitcoin. However, the community is still arguing fiercely over which approach should be used to implement them.
APO, OP_VAULT, and TLUV are geared toward direct applications; choosing one of them implements a specific use case more cheaply and efficiently. Lightning Network enthusiasts will favor APO for implementing LN-Symmetry; users who want vaults are best served by OP_VAULT; and for building CoinPool, TLUV offers better privacy and efficiency. OP_CAT and TXHASH are more general-purpose and less likely to introduce security vulnerabilities. Combined with other opcodes they can cover more use cases, though possibly at the cost of greater script complexity. CTV and CSFS change how the blockchain processes transactions: CTV implements delayed outputs and CSFS implements delayed signatures. MATT stands out for its optimistic execution and fraud proof strategy, using Merkle tree structures to implement general smart contracts, but it still needs new opcodes for introspection.
The Bitcoin community is actively discussing the possibility of enabling covenants through a soft fork. Starknet has officially announced its entry into the Bitcoin ecosystem and plans to settle on the Bitcoin network within six months of OP_CAT being merged. Chakra will continue to follow the latest developments in the Bitcoin ecosystem, promote the merging of the OP_CAT soft fork, and use the programmability that covenants bring to build a safer and more efficient Bitcoin settlement layer.