Overview of blockchain and bitcoin (Part 2)
Block
A block is where transactions are grouped, sealed and stored. Transactions are grouped together by a structure called a Merkle tree. The computer hashes every transaction in a block, then hashes the results together in pairs, level by level, to find the Merkle root. The Merkle root hash is stored in the block header.
If a hash cannot find another one to form a pair, it is paired with itself.
The Merkle root is the single hash left at the top of the tree, after every transaction in the block has been hashed into it.
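The pairing described above can be sketched in a few lines of Python. This is a minimal illustration, not bitcoin's actual implementation: the function names are made up for this example, and raw byte strings stand in for real serialized transactions. It uses SHA-256 applied twice, which is the hash bitcoin uses for its Merkle tree.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list) -> bytes:
    """Return the Merkle root of a non-empty list of raw transactions (bytes)."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    # Leaf level: hash every transaction on its own.
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        # An unpaired hash at the end of a level is paired with itself.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Changing any one transaction in the list changes the returned root, which is the property the rest of this post relies on.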
To save storage, some nodes do not store all transactions, but keep only intermediate hash results. Reducing storage by skipping transactions like this comes at a cost in security, because these lightweight nodes cannot fully verify transactions on their own and are therefore more vulnerable to attacks.
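What a lightweight node can still do with intermediate hashes is check that a transaction belongs to a block, given a Merkle root it already trusts. The sketch below assumes a hypothetical proof format, a list of (sibling hash, sibling-is-on-the-left) pairs from the bottom of the tree to the top. Real clients use their own encoding.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_inclusion(tx: bytes, proof, root: bytes) -> bool:
    """Recompute the path from one transaction up to the Merkle root.

    `proof` is a list of (sibling_hash, sibling_is_left) pairs, bottom to top.
    """
    current = double_sha256(tx)
    for sibling, sibling_is_left in proof:
        # Keep the left/right order used when the tree was built.
        pair = sibling + current if sibling_is_left else current + sibling
        current = double_sha256(pair)
    return current == root
```

This only shows that a transaction is in a block. It does not show that the transaction itself is valid, which is why such a node still depends on others.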
If any transaction in a block is altered, the Merkle root changes. If the Merkle root of a block changes, that block's header data changes. Hence that block must be re-hashed, and that re-hashing is the process we call proof-of-work.
Proof-of-work
Once all the required data is in the block header, the block is ready to have its block hash calculated.
Calculating a block hash from block header data is very easy on its own; a regular computer can finish that job in well under one second. But if that were all it took, anybody could easily rewrite the blockchain, modify some blocks, insert invalid transactions, then claim that the modified version is the valid one. So the process of calculating a block hash must be made harder and must cost a huge amount of compute power, which makes a brute-force attack slower and more costly.
The method used is a target threshold. The calculated block hash must be below a pre-defined threshold, which means the block hash starts with a run of zeros. If you look at any block in the bitcoin blockchain, you can see that its hash begins with many zeros.
The hashing process is one-way: nobody can recover the original data from the hash result. The only way to check whether a piece of data hashes to a specific value is to hash it, so finding a suitable input is a matter of brute force. Because the rest of the data in the block header is already fixed, the way to reach a hash below the target threshold is to add one more piece of data that can be varied freely. That piece of data is called the nonce.
Changing the nonce changes the block header data, so the hash result changes. The process is: add a nonce to the block data -> hash the block -> check whether the hash result is below the target threshold. If so, notify the network that a new block has been found and built. Otherwise, try a new nonce. Repeat until you find a valid hash result, or until another computer notifies you that it found one before you.
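The loop above can be sketched as a toy miner. This is only an illustration under simplifying assumptions: the header is an arbitrary byte string rather than bitcoin's real 80-byte header, the nonce is simply appended to it, and the target is chosen to be very easy so that the search finishes quickly. The names `mine` and `easy_target` are invented for this example.

```python
import hashlib


def mine(header: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until the header hash, read as an integer, is below target."""
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    # Nonce space exhausted; a real miner would change other header data.
    return None


# A deliberately easy target: the top 16 bits of the hash must be zero,
# so a valid hash starts with four zero hex digits.
easy_target = 1 << (256 - 16)
result = mine(b"toy block header", easy_target)
```

Each extra leading zero bit required by the target doubles the expected number of attempts, which is how the work can be made arbitrarily expensive.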
Block generation is tuned to take about 10 minutes per block, but in practice the computers in the network can be faster or slower than that. After every 2016 blocks (about 2 weeks), the difficulty (target threshold) is adjusted: it becomes up to 300% harder (at most 4 times as hard) if the last 2016 blocks were generated in less than 2 weeks, or up to 75% easier (down to one quarter as hard) if they took longer than 2 weeks. This keeps proof-of-work hard enough regardless of the total compute power, so adding hardware alone does not make it easy to gain control of the bitcoin system.
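The adjustment rule can be written as a small calculation. The sketch below is a simplification: it works on the target directly (a larger target means easier blocks) and leaves out details of the real client, such as the maximum allowed target and the compact encoding of the target in the header. The function name is hypothetical.

```python
TARGET_TIMESPAN = 14 * 24 * 60 * 60  # two weeks in seconds (2016 blocks * 10 minutes)


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by actual/expected time, limited to a factor of 4 either way."""
    clamped = max(TARGET_TIMESPAN // 4, min(actual_timespan, TARGET_TIMESPAN * 4))
    # Blocks that came too fast shrink the target (harder);
    # blocks that came too slowly grow it (easier).
    return old_target * clamped // TARGET_TIMESPAN
```

For example, if the last 2016 blocks took only one week, `retarget(1000, TARGET_TIMESPAN // 2)` gives 500: the target is halved, so blocks become twice as hard to find.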
Chain
After the block hash is calculated as proof-of-work, blocks are chained together. Every block (except the first one) stores the previous block's hash in its header. If any transaction or block is changed, every subsequent block must be changed, too, so the whole costly proof-of-work hashing process of every dependent block must be redone. This mechanism strengthens the blockchain and reinforces the confirmation of transactions. If someone wants to change a transaction, that person must rewrite the blockchain from the block holding that transaction onward, and must do all that hashing faster than the whole honest network combined, to form a longer chain. This is considered extremely hard if the attacker does not have a majority (51%) of the compute power.
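To see why one change breaks every later block, here is a toy chain. It is a sketch with made-up names: each block holds only a string of data and the previous block's hash, and the proof-of-work step is left out to keep the linking visible.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    prev_hash: str  # hash of the previous block ("" for the first block)
    data: str       # stand-in for the Merkle root of the block's transactions

    def hash(self) -> str:
        return hashlib.sha256((self.prev_hash + self.data).encode()).hexdigest()


def build_chain(payloads):
    """Link blocks so that each one stores the hash of the one before it."""
    chain, prev = [], ""
    for data in payloads:
        block = Block(prev, data)
        chain.append(block)
        prev = block.hash()
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i].prev_hash != chain[i - 1].hash():
            return i
    return None
```

After editing the data of an early block, `first_broken_link` points at the very next block. Repairing that link changes the next block's own hash, which breaks the block after it, and so on to the end of the chain.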
The number of confirmations is the number of blocks added to the chain since a transaction was included, counting the block that contains it. The more confirmations, the more secure that transaction is. For small transactions like buying coffee, a merchant may require only 1 confirmation after the bitcoin is transferred, to keep things fast and convenient. But for larger transactions like buying a phone, the merchant may ask for 6 confirmations. The more confirmations, the harder it is to rewrite the blockchain and undo the bitcoin transfer.
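A rough way to put numbers on this is the simplified catch-up formula from the bitcoin whitepaper: an attacker with share q of the compute power, facing honest share p = 1 - q, eventually catches up from z blocks behind with probability (q/p)^z when q < p. The function below is only a sketch of that one formula. It ignores the blocks the attacker may mine while the confirmations are accumulating, so treat its output as an illustration rather than a precise risk figure.

```python
def catch_up_probability(attacker_share: float, confirmations: int) -> float:
    """Chance that an attacker ever overtakes a chain that leads by `confirmations` blocks."""
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # with half or more of the power, catching up is certain
    return (q / p) ** confirmations
```

With 10% of the compute power, the chance falls from about 11% at 1 confirmation to roughly 2 in a million at 6 confirmations.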
The blockchain will collapse if a person or group controls 51% of the compute power of the whole system. With that much compute power, that person or group can rewrite blockchain data and insert transactions that benefit themselves, harming everyone else.
Consensus rule
What if many chains coexist at one time? (This happens often.)
Consensus rule: the longest chain, meaning the one with the most proof-of-work behind it, is treated as the honest one.
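The rule can be sketched as picking the chain with the most total work. In this toy version each chain is just a list of the targets its blocks had to meet, and the work of a block is estimated as the expected number of hashes needed to get below its target. The function names are invented for this example.

```python
def block_work(target: int) -> int:
    """Expected number of hashes needed to find a block below `target`."""
    return 2**256 // (target + 1)


def best_chain(chains):
    """Pick the chain (a list of per-block targets) with the most total work."""
    return max(chains, key=lambda chain: sum(block_work(t) for t in chain))
```

When every block has the same target, the chain with the most work is simply the one with the most blocks, which is why the rule is usually described as "longest chain".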
If someone wants to cheat the system, that person must produce a valid (proof-of-work) chain longer than the chain that all honest participants are building, and must do so faster than they can. To do so, that person must control 51% of the computing power. This is called a 51% attack. Currently, no computer system (including those of governments) is known to have that much computing power.
But we also need to know that the 5 largest mining organizations own more than 75% of total mining capacity, and that 58% of hashing power comes from China. Therefore, bitcoin mining pools are monitored closely by the community.
The bitcoin system is designed so that generating blocks honestly is more profitable than cheating. Generating blocks earns bitcoin rewards, which is better than using compute power to start a race against all the honest computers. But in the future, when the bitcoin reward has decreased (it is halved about every 4 years) or the bitcoin to USD price falls while the number of miners is still huge, nobody can guarantee that honesty will win.
Because of those design principles, together with the growth in the number of transactions and the boom in the bitcoin price, the bitcoin system has revealed a lot of limitations.