Up to this point in the book, we’ve been doing single-key transactions, or transactions with only a single private key per input. What if we wanted something a little more complicated? A company that has $100 million in bitcoin might not want the funds locked to a single private key: if that single key were lost or stolen, all funds would then be lost. What can we do to reduce the risk of this single point of failure?
The solution is multisig, or multiple signatures. This was built into Bitcoin from the beginning, but was clunky at first and so wasn’t used. As we’ll discover later in this chapter, Satoshi probably didn’t test multisig, as it has an off-by-one error (see OP_CHECKMULTISIG Off-by-One Bug). The bug has had to stay in the protocol because fixing it would require a hard fork.
Note
|
Multiple Private Keys to a Single Aggregated Public Key
It is possible to "split" a single private key into multiple private keys and use an interactive method to aggregate signatures without ever reconstructing the private key, but this is not a common practice. Schnorr signatures will make aggregating signatures easier and perhaps more common in the future. |
Bare multisig was the first attempt at creating transaction outputs that require signatures from multiple parties.
The idea is to change from a single point of failure to something a little more resilient to hacks.
To understand bare multisig, one must first understand the OP_CHECKMULTISIG
opcode.
As discussed in [chapter_script], Script has a lot of different opcodes.
OP_CHECKMULTISIG
is one of them, at 0xae
.
The opcode consumes a lot of elements from the stack and returns whether or not the required number of signatures are valid for a transaction input.
The transaction output is called "bare" multisig because it’s a long ScriptPubKey. Bare multisig ScriptPubKey shows what a ScriptPubKey for a 1-of-2 multisig looks like.
Among bare multisig ScriptPubKeys, this one is on the small end, and we can already see that it’s long. The ScriptPubKey for p2pkh is only 25 bytes, whereas this bare multisig is 101 bytes (though obviously, compressed SEC format would reduce it some), and this is a 1-of-2! Bare multisig ScriptSig shows what the ScriptSig looks like.
We only need 1 signature for this 1-of-2 multisig, so this is relatively short; something like a 5-of-7 would require 5 DER signatures and would be a lot longer (360 bytes or so). Bare multisig combined script shows how the ScriptSig and ScriptPubKey combine.
I’ve generalized here to show what an m-of-n bare multisig would look like (m and n can be anything from 1 to 20 inclusive, though the numerical opcodes only go up to OP_16
; values of 17 to 20 would require 0112
to push a number like 18 to the stack).
The starting state looks like Bare multisig start.
OP_0
will push the number 0 to the stack (Bare multisig step 1).
The signatures are elements, so they’ll be pushed directly to the stack (Bare multisig step 2).
OP_m
will push the number m to the stack, the public keys will be pushed to the stack, and OP_n
will push the number n to the stack (Bare multisig step 3).
At this point, OP_CHECKMULTISIG
will consume m + n + 3 elements (see OP_CHECKMULTISIG Off-by-One Bug) and push a 1 to the stack if m of the signatures are valid for m distinct public keys from the list of n public keys; otherwise, it pushes a 0.
Assuming that the signatures are valid, the stack has a single 1, which validates the combined script (Bare multisig end).
Note
|
OP_CHECKMULTISIG Off-by-One Bug
The stack elements consumed by The opcode does nothing with that extra element, and that extra element can be anything. As a way to combat malleability, however, most nodes on the Bitcoin network will not relay the transaction unless the extra element is |
In an m-of-n bare multisig, the stack contains n as the top element, then n pubkeys, then m, then m signatures, and finally a filler item due to the off-by-one bug.
The code for OP_CHECKMULTISIG
in op.py is mostly written here:
link:code-ch08/op.py[role=include]
-
Each DER signature is assumed to be signed with
SIGHASH_ALL
. -
We take care of the off-by-one error by consuming the only remaining element of the stack and not doing anything with the element.
-
This is the part that you will need to code for the next exercise.
Bare multisig is a bit ugly, but it is functional. It avoids the single point of failure by requiring m of n signatures to unlock a UTXO. There is plenty of utility in making outputs multisig, especially if you’re a business. However, bare multisig suffers from a few problems:
-
A bare multisig ScriptPubKey has many different public keys, and that makes the ScriptPubKey long. Unlike p2pkh or even p2pk ScriptPubKeys, these are not easily communicated using voice or even text messages.
-
Because the output is so long—5 to 20 times larger than a normal p2pkh output—it requires more resources for node software. Nodes keep track of the UTXO set, and a big ScriptPubKey is more expensive to keep track of. A large output is more expensive to keep in fast-access storage (like RAM).
-
Because the ScriptPubKey can be so big, bare multisig can and has been abused. The entire PDF of Satoshi’s original whitepaper is encoded in this transaction in block 230009:
54e48e5f5c656b26c3bca14a8c95aa583d07ebe84dde3b7dd4a78f4e4186e713
The creator of this transaction split up the whitepaper PDF into 64-byte chunks, which were then made into invalid uncompressed public keys. The whitepaper was encoded into 947 1-of-3 bare multisig outputs. These outputs are not spendable but have to be indexed in the UTXO sets of full nodes. This is a tax every full node has to pay and is in that sense abusive.
To mitigate these problems, pay-to-script-hash (p2sh) was born.
Pay-to-script-hash (p2sh) is a general solution to the long address/ScriptPubKey problem. More complicated ScriptPubKeys than bare multisig can easily be made, and they have the same problems as bare multisig.
The solution that p2sh implements is to take the hash of some Script commands and then reveal the preimage Script commands later. Pay-to-script-hash was introduced in 2011 to a lot of controversy. There were multiple proposals, but as we’ll see, p2sh is kludgy but works.
In p2sh, a special rule gets executed only when the pattern shown in Pay-to-script-hash pattern (p2sh) that executes the special rule is encountered.
If this exact command set ends with a 1 on the stack, then the RedeemScript (the top item in Pay-to-script-hash pattern (p2sh) that executes the special rule) is parsed and then added to the Script command set. This special pattern was introduced in BIP0016, and Bitcoin software that implements BIP0016 (anything post 2011) checks for the pattern. The RedeemScript does not add new Script commands for processing unless this exact sequence is encountered and ends with a 1.
If this sounds hacky, it is. But before we get to that, let’s look a little more closely at exactly how this plays out.
Let’s say we have a 2-of-2 multisig ScriptPubKey (Pay-to-script-hash (p2sh) RedeemScript).
This is a ScriptPubKey for a bare multisig. What we need to do to convert this to p2sh is to take a hash of this script and keep the script handy for when we want to redeem it. We call this the RedeemScript, because the script is only revealed during redemption. We put the hash of the RedeemScript as the ScriptPubKey (Pay-to-script-hash (p2sh) ScriptPubKey).
The hash digest here is the hash160 of the RedeemScript, or what was previously the ScriptPubKey. We’re locking the funds to the hash of the RedeemScript, which needs to be revealed at unlock time.
Creating the ScriptSig for a p2sh script involves not only revealing the RedeemScript, but also unlocking the RedeemScript. At this point, you might be wondering where the RedeemScript is stored. It’s not on the blockchain until actual redemption, so it must be stored by the creator of the p2sh address. If the RedeemScript is lost and cannot be reconstructed, the funds are lost, so it’s very important to keep track of it!
Warning
|
Importance of Keeping the RedeemScript
If you are receiving to a p2sh address, be sure to store and back up the RedeemScript! Better yet, make it easy to reconstruct! |
The ScriptSig for the 2-of-2 multisig looks like Pay-to-script-hash (p2sh) ScriptSig.
This produces the combined script in p2sh combined script.
As before, OP_0
is there because of the OP_CHECKMULTISIG
bug.
The key to understanding p2sh is the execution of the exact sequence shown in p2sh pattern that executes the special rule.
Upon execution of this sequence, if the stack is left with a 1, the RedeemScript is inserted into the Script command set. In other words, if we reveal a RedeemScript whose hash160 is the same as the hash160 in the ScriptPubKey, that RedeemScript acts like the ScriptPubKey instead. We hash the script that locks the funds and put that into the blockchain instead of the script itself. This is why we call this ScriptPubKey pay-to-script-hash.
Let’s go through exactly how this works. We start with the Script commands (p2sh start).
OP_0
will push a 0 to the stack, and the two signatures and the RedeemScript will be pushed to the stack directly, leading to p2sh step 1.
OP_HASH160
will hash the RedeemScript, which will make the stack look like p2sh step 2.
The 20-byte hash will be pushed to the stack (p2sh step 3).
And finally, OP_EQUAL
will compare the top two elements.
If the software checking this transaction is pre-BIP0016, we will end up with p2sh end if evaluating with pre-BIP0016 software.
This would end evaluation for pre-BIP0016 nodes and the result would be valid, assuming the hashes are equal.
On the other hand, BIP0016 nodes, which as of this writing are the vast majority, will parse the RedeemScript as Script commands (p2sh RedeemScript).
These go into the Script column as commands (p2sh step 4).
OP_2
pushes a 2 to the stack, the pubkeys are also pushed, and a final OP_2
pushes another 2 to the stack (p2sh step 5).
OP_CHECKMULTISIG
consumes m + n + 3 elements, which is the entire stack, and we end the same way we did for bare multisig (p2sh end for post-BIP0016 software).
The RedeemScript substitution is a bit hacky, and there’s special-cased code in Bitcoin software to handle this.
Why wasn’t something a lot less hacky and more intuitive chosen?
BIP0012 was a competing proposal at the time that used OP_EVAL
and was considered more elegant.
A ScriptPubKey like OP_EVAL would have been a command that adds additional commands based on the top element would have worked with BIP0012.
OP_EVAL
would have consumed the top element of the stack and interpreted that as Script commands to be put into the Script column.
Unfortunately, this more elegant solution comes with an unwanted side effect, namely Turing completeness. Turing completeness is undesirable as it makes the security of a smart contract much harder to guarantee (see [chapter_script]). Thus, the more hacky but more secure option of special-casing was chosen in BIP0016. BIP0016 (or p2sh) was implemented in 2011 and continues to be a part of the network today.
The special pattern of RedeemScript, OP_HASH160
, hash160, and OP_EQUAL
needs handling.
The evaluate
method in script.py is where we handle the special case:
class Script:
...
def evaluate(self, z):
...
while len(commands) > 0:
command = commands.pop(0)
if type(command) == int:
...
link:code-ch08/script.py[role=include]
-
0xa9
isOP_HASH160
,0x87
isOP_EQUAL
. We’re checking that the next three commands conform to the BIP0016 special pattern. -
We know that this is
OP_HASH160
, so we just pop it off. Similarly, we know the next command is the 20-byte hash value and the third command isOP_EQUAL
, which is what we tested for in theif
statement above it. -
We run the
OP_HASH160
, 20-byte hash push to the stack, andOP_EQUAL
as normal. -
There should be a 1 remaining, which is what
op_verify
checks for (OP_VERIFY
consumes one element and does not put anything back). -
Because we want to parse the RedeemScript, we need to prepend the length.
-
We extend the command set with the parsed commands from the RedeemScript.
The nice thing about p2sh is that the RedeemScript can be as long as the largest single element from OP_PUSHDATA2
, which is 520 bytes.
Multisig is just one possibility.
You can have scripts that define more complicated logic, like "2 of 3 of these keys or 5 of 7 of these other keys."
The main feature of p2sh is that it’s flexible and at the same time reduces the UTXO set size by pushing the burden of storing part of the script back to the user.
In [chapter_segwit], p2sh is also used to make Segwit backward compatible.
To compute p2sh addresses, we use a process similar to how we compute p2pkh addresses. The hash160 is prepended with a prefix byte and appended with a checksum.
Mainnet p2sh uses the 0x05
byte, which causes addresses to start with a 3
in Base58, while testnet p2sh uses the 0xc4
byte to cause addresses to start with a 2
.
We can calculate the address using the encode_base58_checksum
function from helper.py:
link:code-ch08/examples.py[role=include]
As with p2pkh, one of the tricky aspects of p2sh is verifying the signatures. p2sh signature verification is different from the p2pkh process covered in [chapter_tx].
Unlike with p2pkh, where there’s only one signature and one public key, we have some number of pubkeys (in SEC format in the RedeemScript) and some equal or smaller number of signatures (in DER format in the ScriptSig). Thankfully, the signatures have to be in the same order as the pubkeys or the signatures are not considered valid.
Once we have a particular signature and public key, we only need the signature hash, or z, to figure out whether the signature is valid (Validation of p2sh inputs).
As with p2pkh, finding the signature hash is the most difficult part of the p2sh signature validation process. We’ll now proceed to cover this in detail.
The first step is to empty all the ScriptSigs when checking the signature (Empty each input’s ScriptSig). The same procedure is used for creating the signature.
Each p2sh input has a RedeemScript. We take the RedeemScript and put that in place of the empty ScriptSig (Replace the ScriptSig of the input we’re checking with the RedeemScript). This is different from p2pkh in that it’s not the ScriptPubKey.
Last, we add a 4-byte hash type to the end.
This is the same as in p2pkh. The integer corresponding to SIGHASH_ALL
is 1 and this has to be encoded in little-endian over 4 bytes, which makes the transaction look like Append the hash type (SIGHASH_ALL), 01000000.
The hash256 of this interpreted as a big-endian integer is our z. The code for getting our signature hash looks like this:
link:code-ch08/examples.py[role=include]
Now that we have our z, we can grab the SEC public key and DER signature from the ScriptSig and RedeemScript (DER signature and SEC pubkey within the p2sh ScriptSig and RedeemScript).
We can now validate the signature:
link:code-ch08/examples.py[role=include]
-
z
is from the code on page 185.
We’ve verified one of the two signatures that are required to unlock this p2sh multisig.