Segregated Witness in General
Segregated Witness (SegWit) has become one of the most interesting proposals, sparking lively discussion in the community. Pieter Wuille presented it during the Scaling Bitcoin workshop in Hong Kong.
Supported by many community members, SegWit is expected to improve Bitcoin network performance in several areas at once. Some even suggest that SegWit is the long-awaited solution to the scaling issue, able to bring peace to a community pulled apart by disputes over the block size.
To fully understand how SegWit works, it is necessary to understand Bitcoin transactions at a fairly deep technical level. First of all, it is important to realize that the Bitcoin protocol essentially consists of transactions. The nodes of the p2p network do not send Bitcoins to each other; they send transaction data.
A transaction can be thought of as a set of “locks” and consists of two main components. One component “releases” Bitcoins from previous transactions using data called “inputs”. Inputs include scripts called scriptSigs, which supply the data needed to release those Bitcoins. The other component contains a set of new locks, the so-called “outputs”, which “lock” the same or a smaller number of Bitcoins. Outputs contain scripts called scriptPubKeys. Consequently, Bitcoins move from inputs to outputs within a single transaction, and from one transaction to the next.
However, this rule has one fundamental exception. When miners find a new block, they create a coinbase transaction (which, of course, has nothing to do with the company of the same name). The coinbase transaction includes the block reward, which is 25 Bitcoins at the time of writing. In addition, miners can increase their reward by any amount of Bitcoins “released” in transactions but not “locked” again, that is, the difference between inputs and outputs. This difference is the transaction fee.
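The relationship between inputs, outputs, and the miner’s reward can be illustrated with a minimal Python sketch. The `Transaction` class, the `coinbase_reward` helper, and the use of integer satoshi amounts are assumptions made for illustration; real transactions reference previous outputs rather than carrying input amounts directly.

```python
from dataclasses import dataclass
from typing import List

SATOSHI_PER_BTC = 100_000_000
BLOCK_SUBSIDY = 25 * SATOSHI_PER_BTC  # the 25 BTC figure quoted in the text


@dataclass
class Transaction:
    """Hypothetical, simplified transaction: only the amounts are modelled."""
    input_values: List[int]   # satoshis released from previous outputs
    output_values: List[int]  # satoshis locked into new outputs

    def fee(self) -> int:
        """Whatever is released but not locked again goes to the miner."""
        fee = sum(self.input_values) - sum(self.output_values)
        if fee < 0:
            raise ValueError("outputs lock more than the inputs release")
        return fee


def coinbase_reward(transactions: List[Transaction]) -> int:
    """Block subsidy plus the fees of all transactions in the block."""
    return BLOCK_SUBSIDY + sum(tx.fee() for tx in transactions)


tx = Transaction(input_values=[50_000, 30_000], output_values=[70_000])
print(tx.fee())               # 10000
print(coinbase_reward([tx]))  # 2500010000
```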
Transaction senders carry out all this “locking” and “unlocking” themselves and transmit the result as data over the network. All network nodes verify that the unlocking and locking are valid. If everything is in order, they relay the transaction data to other nodes. If a node is also a miner, it can include the transaction in a block. The miner decides whether or not to do so, which is why a fee is necessary. When verifying transactions, it is vital that all nodes follow the same rules as the vast majority of miners. If some miners start including in their blocks transactions that a node’s rules reject, that node will consider the entire block invalid. If such a node is also a miner, double spending and a fork may occur.
The consensus rules to which all nodes agree allow Bitcoins to be locked and unlocked in many different ways. As a rule, however, outputs that lock Bitcoin include a scriptPubKey that says, roughly: “prove that you own the private key corresponding to the public key behind this Bitcoin address.”
It is easy to derive a public key from a private key, but virtually impossible to do the reverse. Deriving an address from a public key is just as easy, and again the reverse is infeasible. Consequently, it is simple to derive an address from a private key, but there is no practical way to learn a private key from an address.
The address used to lock Bitcoin in the scriptPubKey is, of course, the address provided by the recipient of the transaction. Because the recipient created this address from a key known only to them, only they can create a valid scriptSig, and so they are the only person who can make a new transaction and spend the locked Bitcoin.
To prove ownership of the private key, it is theoretically possible to include the key itself in the scriptSig, but no one does so because it is dangerous. Anyone who sees the transaction could then take the private key and create a new transaction (or change the original one), stealing all the locked Bitcoins before the original transaction appears in a block. Miners in particular could steal such Bitcoins with ease, since they themselves choose which transactions are confirmed. That is why scriptPubKeys usually require the scriptSig to contain one or more signatures in order to unlock Bitcoin.
A signature is produced by a cryptographic technique that combines a private key with any other data to compute a unique sequence of numbers. The corresponding public key can be used to verify that a specific signature was created with a specific private key over specific data. Signatures therefore both prove ownership of the private key and show that the key owner approved that particular data, all without disclosing the key. In Bitcoin, private keys are usually used to sign all transaction data except the scriptSigs themselves: the references to the inputs, the scriptPubKeys, the amounts locked, and some other data. The resulting signature and the public key are then added to the transaction in the input’s scriptSig. This proves that the key owner really intended to make the transaction and guarantees that it cannot be forged.
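To make the sign-and-verify idea concrete, here is a toy ECDSA sketch over secp256k1, the curve Bitcoin uses. It is an illustration under stated assumptions, not production code: the nonce derivation is a simplified stand-in (not RFC 6979), nothing is constant-time, the demo key and message are hypothetical, and the function names are invented for this example. It needs Python 3.8 or later for `pow(x, -1, m)`.

```python
import hashlib

# secp256k1 domain parameters, written in 32-bit groups for readability.
P = 2**256 - 2**32 - 977
N = int("FFFFFFFF" "FFFFFFFF" "FFFFFFFF" "FFFFFFFE"
        "BAAEDCE6" "AF48A03B" "BFD25E8C" "D0364141", 16)
G = (int("79BE667E" "F9DCBBAC" "55A06295" "CE870B07"
         "029BFCDB" "2DCE28D9" "59F2815B" "16F81798", 16),
     int("483ADA77" "26A3C465" "5DA4FBFC" "0E1108A8"
         "FD17B448" "A6855419" "9C47D08F" "FB10D4B8", 16))


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def digest_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(private_key: int, message: bytes):
    z = digest_int(message)
    # Simplified deterministic nonce for illustration only (NOT RFC 6979).
    k = digest_int(private_key.to_bytes(32, "big") + message) % N or 1
    r = point_mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * private_key) % N
    return r, s


def verify(public_key, message: bytes, signature) -> bool:
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    z = digest_int(message)
    point = point_add(point_mul(z * w % N, G), point_mul(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = digest_int(b"hypothetical demo key") % N
public_key = point_mul(private_key, G)  # easy to derive; infeasible to reverse
message = b"all transaction data except the scriptSigs"
signature = sign(private_key, message)

print(verify(public_key, message, signature))         # True
print(verify(public_key, message + b"!", signature))  # False
```

The verifier only ever sees the public key, the message, and the signature, which is the property the paragraph above relies on.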
Next, all transaction data, this time including the inputs’ scriptSigs, are hashed together to produce the transaction ID, which serves as its identifier. If the transaction eventually appears in a block, the miner hashes its ID together with another transaction ID to obtain a new hash. This hash is hashed again, this time together with the hash of two other transaction IDs. The process continues until only one hash is left. This structure is called a Merkle tree, and the final hash is called the Merkle root. The Merkle root is placed, together with some additional data, in the block header, and the hash of the block header is used to identify the block. That block header hash is in turn included in the next block header, linking one block to the next.
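The pairwise hashing described above can be sketched as follows. The sketch uses double SHA-256 and duplicates the last hash when a level has an odd number of entries, as Bitcoin does; the raw byte strings standing in for serialized transactions and the helper names are assumptions for illustration, and byte-order conventions are ignored.

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(hashes: List[bytes]) -> bytes:
    """Hash neighbours pairwise until a single hash, the root, remains."""
    if not hashes:
        raise ValueError("at least one hash is required")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd level: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transactions; a transaction ID is the hash of the transaction data.
transactions = [b"tx-a", b"tx-b", b"tx-c"]
txids = [dsha256(tx) for tx in transactions]
root = merkle_root(txids)

# Changing any transaction changes its ID and therefore the root.
tampered = [dsha256(tx) for tx in [b"tx-a", b"tx-B", b"tx-c"]]
print(root != merkle_root(tampered))  # True
```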
Bitcoin is considered immutable because changing any part of any transaction after the fact changes the transaction ID and, through the Merkle root, the block header, so that the block no longer meets the proof-of-work requirement. And since each block header feeds into the headers of all subsequent blocks, those would no longer meet the requirement either.
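A rough sketch of this chaining follows. It is heavily simplified: a hypothetical header here is just the previous header hash plus a Merkle root, and proof of work, timestamps, and nonces are not modelled. The function and field names are assumptions for illustration.

```python
import hashlib
from typing import Dict, List, Optional


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def header_hash(prev_hash: bytes, merkle_root: bytes) -> bytes:
    """Simplified header: previous header hash followed by the Merkle root."""
    return dsha256(prev_hash + merkle_root)


def build_chain(merkle_roots: List[bytes]) -> List[Dict[str, bytes]]:
    prev = b"\x00" * 32  # placeholder "previous hash" for the first block
    chain = []
    for root in merkle_roots:
        block_hash = header_hash(prev, root)
        chain.append({"prev": prev, "root": root, "hash": block_hash})
        prev = block_hash
    return chain


def first_broken_block(chain: List[Dict[str, bytes]]) -> Optional[int]:
    """Index of the first block whose stored hash or link no longer matches."""
    prev = b"\x00" * 32
    for index, block in enumerate(chain):
        recomputed = header_hash(block["prev"], block["root"])
        if block["prev"] != prev or recomputed != block["hash"]:
            return index
        prev = block["hash"]
    return None


chain = build_chain([dsha256(b"block-0"), dsha256(b"block-1"), dsha256(b"block-2")])
print(first_broken_block(chain))  # None

# Rewriting a transaction changes that block's Merkle root ...
chain[1]["root"] = dsha256(b"rewritten history")
print(first_broken_block(chain))  # 1
```

Repairing block 1 by recomputing its hash would in turn break the link stored in block 2, which is why every later header would have to be redone as well.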
SegWit originated in the sidechains work done at Blockstream and builds on an idea from Bitcoin Core developer Luke Dashjr. The overall concept was worked out several months later in collaboration with Core developers Gregory Maxwell and Eric Lombrozo. The system is expected to be ready by the middle of the current year.
From the point of view of nodes that do not use SegWit (call them old nodes), some newly created outputs will start using a strange type of scriptPubKey. It is strange because it can hardly be considered a lock. Generally called “spend-all” (or anyone-can-spend), these scripts essentially state that no signature is needed. They also contain a piece of data that looks completely meaningless.
Old nodes will not understand these outputs. They will think that anyone can create a new scriptSig that unlocks them, meaning that they look virtually unprotected. At the same time, old nodes have no technical grounds to reject the new transactions. The scriptPubKey contents are meaningless to them but technically acceptable. Hence, old nodes will treat the transactions as valid and relay them to other nodes.
Nodes with SegWit (call them new nodes) will behave a little differently. The scriptPubKey contents will not be meaningless to them at all: they will recognize a very specific type of output.
Like earlier outputs, these new outputs will require signatures to unlock Bitcoin. Unlike them, however, they will not require the signature to be placed in the scriptSig of the next transaction. Instead, the signature has to be included in a completely new part of the transaction: the segregated witness (SegWit).
SegWit essentially works as an add-on that contains signatures and some other data. The key point is that this add-on is completely ignored by old nodes and recognized by new ones. Furthermore, its data is not hashed together with the other transaction parts to create the transaction ID.
Consequently, both old and new nodes will consider SegWit transactions valid. Old nodes will accept them because, as far as they can tell, no signatures are required at all, and new nodes will accept them because SegWit contains the required signature. As both kinds of node hash the transaction data into the same ID, they will agree on the contents of blocks, and the blockchain will not split.
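A minimal sketch of why both kinds of node arrive at the same ID: the witness is carried alongside the transaction but left out of the hash. The `SegwitTx` class and the `old_node_view` helper are hypothetical, and the byte strings stand in for real serialized data.

```python
import hashlib
from dataclasses import dataclass


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass(frozen=True)
class SegwitTx:
    base: bytes     # inputs, outputs, amounts: the part every node sees
    witness: bytes  # signatures: the add-on only new nodes look at


def txid(tx: SegwitTx) -> bytes:
    """The ID covers only the base data, never the witness."""
    return dsha256(tx.base)


def old_node_view(tx: SegwitTx) -> SegwitTx:
    """An old node receives the transaction with the witness stripped."""
    return SegwitTx(base=tx.base, witness=b"")


tx = SegwitTx(base=b"inputs and outputs", witness=b"signature bytes")
print(txid(tx) == txid(old_node_view(tx)))  # True
```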
However, one small issue still remains: if signatures do not affect the blockchain arrangement, the blockchain can no longer serve as proof that the transaction contains correct signatures.
To commit signatures to the blockchain anyway, miners using SegWit use a small trick: they build a Merkle tree not only of the transactions but also of the SegWit data, with the second tree mirroring the transaction tree. The root of the SegWit tree is included in the input field of the coinbase transaction. In this way the SegWit tree root changes the coinbase transaction data, its ID, and therefore the block header; as a result, the signatures affect the arrangement of the entire blockchain after all.
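The commitment trick can be sketched like this. It is a simplified model and not the exact serialization of the proposal: transactions are hypothetical `(base, witness)` byte pairs, the coinbase is a byte string with the witness root appended, and all helper names are assumptions.

```python
import hashlib
from typing import List, Tuple

Tx = Tuple[bytes, bytes]  # (base data, witness data)


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(hashes: List[bytes]) -> bytes:
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def block_merkle_root(coinbase_data: bytes, txs: List[Tx]) -> bytes:
    # Second tree: hashes that DO cover the witness data.
    witness_root = merkle_root([dsha256(base + witness) for base, witness in txs])
    # The witness root is placed inside the coinbase, so it changes the coinbase ID.
    coinbase_id = dsha256(coinbase_data + witness_root)
    # First tree: ordinary transaction IDs, which ignore the witness.
    tx_ids = [coinbase_id] + [dsha256(base) for base, _ in txs]
    return merkle_root(tx_ids)


txs = [(b"tx-a", b"sig-a"), (b"tx-b", b"sig-b")]
forged = [(b"tx-a", b"sig-a"), (b"tx-b", b"forged")]

original_root = block_merkle_root(b"coinbase", txs)
forged_root = block_merkle_root(b"coinbase", forged)
print(original_root != forged_root)  # True
```

The transaction IDs of `tx-a` and `tx-b` are identical in both cases; only the coinbase ID differs, and that alone is enough to change the root that ends up in the block header.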
Wuille’s proposal thus makes it possible to remove signatures from Bitcoin transactions while preserving the blockchain’s immutability and without violating any of the accepted consensus rules.
However, what drew the most attention is that moving signatures out of the 1 MB blocks can effectively increase the Bitcoin block size. More transactions per second can then be recorded in the blockchain, that is, transaction capacity increases. Moreover, none of this violates the existing maximum block size rule.
Wuille’s proposal does not set a new maximum block size directly. The formula used to calculate the limit is chosen somewhat arbitrarily: the size of the block without the witness plus one-fourth of the SegWit data must not exceed 1 MB in total. Old nodes will therefore consider all blocks to be no larger than 1 MB, because they do not see the SegWit data at all and only a quarter of it is counted against the same 1 MB. New nodes, on the other hand, will see blocks that exceed 1 MB, because the real size of the SegWit data is larger than the quarter that is counted.
The exact additional capacity depends on the types of transactions included in new blocks. The more data transactions store in SegWit, as multisignature transactions certainly do, the larger new blocks can be in total: about 1.75 MB for ordinary transactions, with 4 MB as a strict maximum that cannot be exceeded under any circumstances, even if practically all of the data sits in SegWit.
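The size rule lends itself to a short worked calculation. If a block has total size T and a fraction f of it is SegWit data, the rule reads (1 - f)·T + f·T/4 ≤ 1 MB, so T ≤ 1 MB / (1 - 0.75·f). The sketch below encodes this; the function names are hypothetical, and the 4/7 witness share used in the example is simply the value that the formula yields for 1.75 MB, not a figure from the proposal.

```python
LIMIT_BYTES = 1_000_000  # the existing 1 MB limit


def block_is_valid(base_bytes: int, witness_bytes: int) -> bool:
    """Integer form of: base size + witness size / 4 <= 1 MB."""
    return 4 * base_bytes + witness_bytes <= 4 * LIMIT_BYTES


def max_total_bytes(witness_fraction: float) -> float:
    """Largest total block size when this fraction of the block is witness data."""
    if not 0.0 <= witness_fraction <= 1.0:
        raise ValueError("witness_fraction must be between 0 and 1")
    return LIMIT_BYTES / (1.0 - 0.75 * witness_fraction)


print(max_total_bytes(0.0))           # 1000000.0 (no witness data at all)
print(round(max_total_bytes(4 / 7)))  # 1750000   (about 57% witness data)
print(max_total_bytes(1.0))           # 4000000.0 (theoretical ceiling)

# A 1.75 MB block: 750 kB of base data plus 1 MB of witness data.
print(block_is_valid(750_000, 1_000_000))  # True
print(block_is_valid(750_001, 1_000_000))  # False
```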
Nevertheless, there is another and perhaps the main advantage: SegWit is able to fix transaction malleability. In fact, it was developed for the sake of this feature.
Malleability stems from the fact that a signature can be altered without invalidating it and without changing the data it signs. This can be done even without the original private key. In Bitcoin, this means that anyone can pick up any transaction from the p2p network and replace one valid signature with another. The new signature covers the same data and verifies against the same key, so the effect of the transaction does not change at all. Yet, since the signature looks different, the transaction ID changes beyond recognition.
Transaction malleability entails two major issues. It makes it difficult for software to track a transaction’s confirmation by its ID and, more importantly, it significantly limits the more complex uses of the Bitcoin network that rely on unconfirmed transactions, such as payment channels and the Lightning Network. SegWit removes signatures from the part of the transaction that is used to create the ID. Therefore, although signatures can still be altered, this no longer matters for payment channels or the Lightning Network, and software can keep relying on the transaction ID. This makes room for additional scalability layers.
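The effect described above, usually called transaction malleability, can be sketched as follows. One well-known way to alter an ECDSA signature (r, s) without the private key is to replace s with n - s, where n is the order of the curve; the sketch takes the validity of both forms as given and does not verify signatures. The signature values are placeholder numbers, and the ID functions are simplified, hypothetical models.

```python
import hashlib
from typing import Tuple

Signature = Tuple[int, int]

# Order of the secp256k1 group, written in 32-bit groups.
SECP256K1_ORDER = int("FFFFFFFF" "FFFFFFFF" "FFFFFFFF" "FFFFFFFE"
                      "BAAEDCE6" "AF48A03B" "BFD25E8C" "D0364141", 16)


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def encode(sig: Signature) -> bytes:
    r, s = sig
    return r.to_bytes(32, "big") + s.to_bytes(32, "big")


def legacy_txid(body: bytes, sig: Signature) -> bytes:
    """Old-style ID: the signature is part of the hashed data."""
    return dsha256(body + encode(sig))


def segwit_txid(body: bytes, sig: Signature) -> bytes:
    """SegWit-style ID: the signature lives in the witness and is not hashed."""
    return dsha256(body)


def malleate(sig: Signature) -> Signature:
    """Third-party tweak: (r, s) becomes (r, n - s)."""
    r, s = sig
    return r, SECP256K1_ORDER - s


body = b"inputs, outputs and amounts"
sig = (123456789, 987654321)  # placeholder numbers, not a real signature

# An unconfirmed child transaction refers to its parent by ID.
child_spends = legacy_txid(body, sig)
confirmed_parent_id = legacy_txid(body, malleate(sig))
print(child_spends == confirmed_parent_id)  # False: the child now points nowhere

print(segwit_txid(body, sig) == segwit_txid(body, malleate(sig)))  # True
```

This is why chains of unconfirmed transactions, as used in payment channels, are fragile with old-style IDs and robust once signatures are left out of the ID.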
Script versioning is the third SegWit advantage, and one that has generated great excitement among Bitcoin developers.
Besides scripts, SegWit outputs include something else: version bytes. They precede the script and specify its type. If a node reading the version bytes recognizes the type, it knows which conditions must be met to unlock the Bitcoin. If the bytes are not recognized, the node treats the output as “spend-all”. This opens up completely new ways to lock Bitcoin in transactions. In fact, Bitcoin could be locked in any way one wishes, including ways that cannot be described yet because they have not been invented. Among the first ideas are Schnorr signatures, which can be verified much faster than the current ones, as well as more complex types of multisignature transactions and even Ethereum-like scripts. The same consensus on the rules is still preserved, but in this case only a majority of miners, rather than all nodes, has to adopt a new feature, which greatly simplifies deployment.
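The version-byte dispatch can be sketched as a lookup table of validators. Everything here is a hypothetical illustration: the two hash-lock rules are invented stand-ins for real script types, and `can_spend` is not an actual node API.

```python
import hashlib
from typing import Callable, Dict

Validator = Callable[[bytes, bytes], bool]


def hash_lock_v0(program: bytes, witness: bytes) -> bool:
    """Invented version-0 rule: the witness must hash to the program."""
    return hashlib.sha256(witness).digest() == program


def double_hash_lock_v1(program: bytes, witness: bytes) -> bool:
    """Invented version-1 rule, introduced later: a double hash must match."""
    once = hashlib.sha256(witness).digest()
    return hashlib.sha256(once).digest() == program


def can_spend(version: int, program: bytes, witness: bytes,
              known_versions: Dict[int, Validator]) -> bool:
    validator = known_versions.get(version)
    if validator is None:
        return True  # unrecognized version byte: treated as spend-all
    return validator(program, witness)


old_node = {0: hash_lock_v0}
new_node = {0: hash_lock_v0, 1: double_hash_lock_v1}

secret = b"secret"
program_v1 = hashlib.sha256(hashlib.sha256(secret).digest()).digest()

print(can_spend(1, program_v1, b"wrong", old_node))  # True: version unknown
print(can_spend(1, program_v1, b"wrong", new_node))  # False: rule enforced
print(can_spend(1, program_v1, secret, new_node))    # True
```

Anything an upgraded node accepts is also accepted by a node that does not know the version, which is what allows new script types to be added without every node upgrading at once.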
In addition, SegWit makes possible something called “fraud proofs”. Conceived by Satoshi Nakamoto, such mechanisms can significantly enhance the security of “light wallets”, that is, nodes that do not verify all network transactions and do not store the entire blockchain. To check that a transaction has been confirmed, such a node simply looks up the corresponding transaction ID in the blockchain. Finding it, the node knows that some miner has included the transaction in a block. However, such nodes do not check that the transaction complies with the rules. They therefore naively rely on miners playing fair, without checking them. In the worst case, this could tempt miners to pay light wallets with Bitcoins created out of thin air, for example by creating transactions with no inputs or by awarding themselves huge fees in a coinbase transaction.
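One common way for a light wallet to check inclusion without downloading a whole block is a Merkle branch: the hashes needed to recompute the Merkle root from a single transaction ID. The sketch below assumes that mechanism, with hypothetical helper names and placeholder transactions. Note what it does not do: it shows that a transaction is in a block, not that the transaction obeys the rules.

```python
import hashlib
from typing import List, Tuple


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_branch(txids: List[bytes], index: int) -> Tuple[List[bytes], bytes]:
    """Return the sibling hashes for txids[index] and the Merkle root."""
    branch = []
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        branch.append(level[index ^ 1])  # the neighbour in the current pair
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return branch, level[0]


def verify_branch(txid: bytes, index: int, branch: List[bytes], root: bytes) -> bool:
    """Recompute the root from one ID and its branch, as a light node would."""
    current = txid
    for sibling in branch:
        if index & 1:
            current = dsha256(sibling + current)
        else:
            current = dsha256(current + sibling)
        index //= 2
    return current == root


txids = [dsha256(b"tx-%d" % i) for i in range(5)]
branch, root = merkle_branch(txids, 3)

print(verify_branch(txids[3], 3, branch, root))          # True
print(verify_branch(dsha256(b"fake"), 3, branch, root))  # False
```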
Issues like these can be solved if miners are required to include additional data in the SegWit tree, specifying the exact origin of the Bitcoin locked in each transaction. Then, if a block contains invalid transactions, any full node can easily construct a fraud proof and send it to light nodes, which can then reject the invalid block. Even so, light nodes will not achieve the same level of security as full nodes. The solution requires a network that is free of censorship (e.g., government blocking). In addition, a light node needs to be connected to at least one full node that is willing to provide a fraud proof.
And finally, SegWit can reduce the amount of data that nodes store locally. This would also reduce the cost of running a full node and significantly shorten the time needed to synchronize with the network after a first installation. Although nodes typically store all transaction data, signatures arguably become unnecessary after some time has passed. If a transaction was accepted as valid, recorded in a block, and buried in the blockchain for, say, a year, it could have been forged only if all miners at once had kept mining the invalid chain throughout that period without anyone noticing.
To comprehend how SegWit can influence disputes about the block size, first of all, it makes sense to briefly recap the essence of the debate itself.
In essence, it is a trade-off between transaction capacity and decentralization, with a bit of economics thrown in. The current 1 MB block size allows the network to process up to about seven transactions per second. The so-called “progressors” consider this too little; a comparison with the Visa system, which can handle thousands of transactions per second, is especially popular. According to progressors, blocks that are too small will limit Bitcoin’s capacity and raise the cost of blockchain transactions to the point where only centralized services can afford them. In turn, this would cause users to leave the Bitcoin blockchain en masse in search of alternative payment solutions, and might even lead to the collapse of the entire system.
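The figure of roughly seven transactions per second follows from simple arithmetic. The sketch below assumes an average transaction size of 250 bytes, which is an illustrative value and not a number from the text, together with Bitcoin’s ten-minute target block interval.

```python
MAX_BLOCK_BYTES = 1_000_000
BLOCK_INTERVAL_SECONDS = 600  # ten-minute target between blocks
AVERAGE_TX_BYTES = 250        # assumption for illustration


def transactions_per_second(block_bytes: int, avg_tx_bytes: int,
                            interval_seconds: int = BLOCK_INTERVAL_SECONDS) -> float:
    """Transactions that fit in a block, divided by the time between blocks."""
    return block_bytes / avg_tx_bytes / interval_seconds


tps = transactions_per_second(MAX_BLOCK_BYTES, AVERAGE_TX_BYTES)
print(round(tps, 2))  # 6.67
```

Smaller average transactions push the figure toward seven, while larger ones, such as multisignature transactions, push it lower.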
From the perspective of the so-called “decentralists”, an excessive increase in the block size could make Bitcoin centralized at the protocol level. One of their concerns is that larger blocks are slower to propagate from node to node and take longer to verify on each node, which slows propagation even more. This works in favor of the miners or pools that find more blocks, as they get a head start. Decentralists fear that mining could become concentrated in a very small group of pools. Furthermore, large blocks make running a full node more expensive, since it requires more processing capacity and more storage. This would make it harder to use the Bitcoin network in a way that does not require trust, pushing users to delegate consensus validation to others and thereby centralizing the system further. Most decentralists are also convinced that a block size limit is necessary for economic reasons, so that block space does not become too abundant. With too much block space, they believe, miners would undercut each other’s fees until network transactions become practically free. If that happens, miners will not earn enough to secure the network.
Some progressors believe that fee levels will settle on their own. The more transactions are included, the larger a block becomes and the higher the chance that it is abandoned (orphaned). Miners would therefore demand a fee sufficient to compensate for this additional risk. Decentralists, in turn, object that dynamics like this would simply create another centralization risk, since in the end they would leave no room for blocks to be abandoned at all.
As decentralists see it, these conditions could lead to Bitcoin being regulated at the protocol level, damaging its resistance to censorship. Although they admit that smaller blocks limit the number of transactions processed on the blockchain, they expect the future Bitcoin network to rely on additional layers such as tree-like blockchains or the already mentioned Lightning Network. Progressors generally recognize the advantages of such layers, but not as a scaling solution. In their view, Bitcoin scaling should first take place “at the chain level”.
As should now be clear, most of SegWit’s properties, for all their advantages, do little to settle the dispute about the block size. Progressors do not consider the Lightning Network a solution to the scalability issue. Fraud proofs are useful, but light nodes will still not be as reliable as full ones. Discarding old transaction data would be welcome, but the Core developers had proposed similar solutions even before SegWit came into existence. Version bytes may come into play in the future, but it is not yet clear what will be done with them.
The question, therefore, is whether SegWit will satisfy both factions: will about 2 MB suffice for the progressors (this is roughly how much real transaction data the system is designed for), and will 4 MB seem conservative enough for the decentralists (in fact, large miners could force out their smaller competitors by starting to mine 4 MB blocks)? One good thing about SegWit is that it is entirely optional: users decide whether to upgrade their software or stick with what they have. Those who use SegWit will receive an effective “discount” on fees, since they will use less of the block space that is always in demand. Those who do not want to use it because of the increased cost of running a full node are under no obligation to install it. Thus at least one of the decentralists’ concerns (the increased cost of running a full node) is addressed. Their other concern, increased propagation time, is somewhat more complicated. But Wuille, who counts himself among the decentralists, does not believe it will cause any problems.
Running SegWit verification will take each individual node some extra time, but the overhead will likely be negligible. Propagation time will grow slightly, but Wuille’s models show that 4 MB blocks are within the current capabilities of the network.
Consequently, most decentralists are in favor of SegWit and consider it a vital part of the scaling roadmap drawn up by Gregory Maxwell. This roadmap buys some time by postponing the moment when blocks are completely full (provided SegWit works as expected) without violating existing, generally accepted rules. Decentralists want to use the time gained to find long-term solutions and to work out a more durable policy on block size, additional layers, and other optimization techniques. However, progressors do not consider 2 MB blocks sufficient: Gavin Andresen, for instance, proposes increasing the block size to 8 GB within 20 years.
Hard fork and soft fork
Furthermore, some progressors note that SegWit will become fully functional in a year at the earliest. For that reason, some developers who side with the progressors want the entire network to switch to larger blocks before SegWit is implemented, and that leads us to the main issue. Wuille’s proposal can be implemented as a soft fork, which can effectively be introduced by miners alone. Increasing the block size limit is only possible the old way, by means of a hard fork.
Because consensus has not been reached, some users may risk a hard fork anyway, probably hoping that others will adopt the changes once it is too late to object. Yet if the others do not follow, the entire network will fork. Something like this was attempted before, when Gavin Andresen and Mike Hearn introduced Bitcoin XT, but it never gained significant support. Moreover, some developers would prefer SegWit itself to be deployed as a hard fork, which has some advantages over a soft fork.
First of all, a hard fork ensures that all nodes on the network enforce the same set of rules, and under these rules full nodes validate all transactions, even those that do not involve their own Bitcoins. Secondly, a hard fork would be a “cleaner” solution. The soft-fork implementation of SegWit avoids violating existing rules by using a part of the Bitcoin protocol, the coinbase transaction input, in a way it was not designed for, and some users fear that this complication could lead to issues in the future. Thirdly, the soft-fork approach may make Bitcoin software, for example wallet software, more complicated to write. On the other hand, some developers, mostly those with more conservative views, believe that a hard fork should be a measure of last resort. Above all they want to avoid splitting the network, and if a hard fork cannot be avoided, it has to be announced well in advance, and arrangements should be made to give every user a chance to upgrade. A soft fork, in contrast, can be rolled out as soon as the code is ready and miners agree, and all other users can upgrade if and when they want.
The real question, in fact, is whether a potential hard fork would find enough support among Bitcoin users. Although many people consider consensus a rather loose concept, some already believe that, with regard to a hard fork, it has been reached.
At the moment, Maxwell’s scaling roadmap seems the most likely way forward. It is the only plan that requires no early hard fork and is supported by most Bitcoin developers. So far, it only lacks support from those who control the majority of hash power. However, a hard fork attempt, perhaps by major industry players, cannot be ruled out either. And Bitcoin XT is still around as well.