How does Blockchain Mining work?
From logistics to healthcare, from social media to real estate, from the energy sector to the global economy, Blockchain is predicted to transform almost every single industry in this decade. Often described as even more revolutionary than Artificial Intelligence, this technology is entering our lives at mind-blowing rates.
While inspiring overall, this extreme pace also comes with a negative side as education simply cannot keep up.
There are lots of concepts that allow Blockchain-powered projects and ideas to exist. But let’s keep it simple by only focusing on the most used and, at the same time, most misunderstood topic of Blockchain called Mining.
We’ve all heard about Bitcoin mining and miners. We’ve probably even used these terms. But what exactly do these miners do? What is mining all about? These are the questions I will try to answer in this article, and I will do so in three parts:
What’s a cryptographic hash?
The cryptographic puzzle
Block configuration
Part 1: What’s a cryptographic hash?
As we have already heard, a blockchain is a series of successive blocks cryptographically linked together.
But what does this mean and how is this connected to mining? Let’s have a closer look.
An individual block in a blockchain contains the following elements:
block number, data stored in the block, hash of the previous block and hash of the current block.
A hash (or cryptographic hash) is a long number which acts as a digital fingerprint of any collection of data.
In Bitcoin the SHA256 hashing function is used which generates a 64-digit hexadecimal number. For example, the cryptographic hash of the words in this paragraph is:
C019286295F2CDEC9958BEE25B9603B5F94C76B2CCC69A59CE54872ED26DC479
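As a minimal sketch, this is how such a fingerprint could be computed with Python's standard hashlib module. The sample text is an assumption chosen for illustration, so the printed value will not match the hash quoted above.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as a 64-digit hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()

digest = sha256_hex("hello blockchain")  # sample input, for illustration only
print(digest)
print(len(digest))  # 64 hexadecimal digits, i.e. 256 bits
```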
Hashing algorithms have many interesting properties, however today we are most interested in three:
1) the SHA256 function is deterministic: we will always get the same hash output if we recalculate the function with the same input
2) the SHA256 function is practically impossible to reverse-engineer, meaning that we cannot know in advance what hash value we will get until we actually calculate it
3) if we feed the SHA256 function even slightly varying inputs (for example, we change a dot for a comma), we will get wildly different outputs.
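Properties 1 and 3 are easy to check by hand. The sketch below hashes the same sentence twice and then once more with the final dot changed to a comma; the sentences themselves are made up for the example.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("Alice pays Bob 5 coins.")
b = sha256_hex("Alice pays Bob 5 coins.")   # identical input
c = sha256_hex("Alice pays Bob 5 coins,")   # dot changed to a comma

# Property 1: the same input always gives the same hash.
print(a == b)  # True

# Property 3: count how many of the 64 hex digits changed after a one-character edit.
differing = sum(1 for x, y in zip(a, c) if x != y)
print(differing, "of 64 digits differ")
```

On average about 60 of the 64 digits should differ, since two unrelated hex digits match only 1 time in 16.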
To calculate a block’s hash, we input the current block’s number, the data stored in the block and the hash of the previous block into the SHA256 function:
SHA256 (Block Number, Data, Previous Block’s Hash) -> Hash
Now we can see how the blocks are linked - not only does each block reference the previous block’s cryptographic hash but, in fact, that hash directly affects the value of the current block’s hash. Therefore, if anyone were to tamper with any given block’s data, such action would render not only that specific block’s hash invalid but also all of the following blocks’ hashes invalid.
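To make the linking concrete, here is a toy chain in Python. Joining the fields with "|" and using 64 zeros as the first block's previous hash are assumptions made for this sketch; real Bitcoin blocks are serialized differently.

```python
import hashlib

def block_hash(number: int, data: str, prev_hash: str) -> str:
    # Illustrative encoding only: the fields are joined with "|" before hashing.
    payload = f"{number}|{data}|{prev_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def build_chain(records):
    chain, prev = [], "0" * 64  # the first block has no predecessor
    for number, data in enumerate(records, start=1):
        h = block_hash(number, data, prev)
        chain.append({"number": number, "data": data, "prev_hash": prev, "hash": h})
        prev = h
    return chain

def first_invalid_block(chain):
    """Return the number of the first block whose hash or link is wrong, else None."""
    prev = "0" * 64
    for block in chain:
        recomputed = block_hash(block["number"], block["data"], block["prev_hash"])
        if block["prev_hash"] != prev or block["hash"] != recomputed:
            return block["number"]
        prev = block["hash"]
    return None

chain = build_chain(["A pays B 1", "B pays C 2", "C pays D 3"])
print(first_invalid_block(chain))  # None: the chain is consistent

chain[1]["data"] = "B pays C 200"  # tamper with block 2
print(first_invalid_block(chain))  # 2: its stored hash no longer matches

# Re-hashing block 2 to hide the change only moves the break to block 3.
chain[1]["hash"] = block_hash(2, chain[1]["data"], chain[1]["prev_hash"])
print(first_invalid_block(chain))  # 3: block 3 still points at the old hash
```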
Such a connection between blocks means that the Blockchain as a whole is much more tamper-proof than standard database structures and other record-keeping methods. And since a Blockchain is in essence a ledger of records, this tamper-proof property is known as the “Immutable Ledger” property.
Okay, great. That’s how blockchains work.
But what has this got to do with mining? The straightforward answer is that mining is all about calculating the hash value for the newest block which is being added to the chain. However, it’s not all that simple.
The thing is that the SHA256 function only takes a fraction of a second to calculate. And yet, we have heard of the numerous mining pools such as BTC.com and AntPool, and even industrial scale mines — all competing to generate the next Bitcoin block. So the question is — why do we need all that computing power?
Part 2: The Cryptographic Puzzle
This is where the complexity in Blockchain starts. Buckle up!
Blocks in the blockchain have another field which we have not spoken about yet. This field is called “the Nonce”, which stands for “number used only once”.
The Nonce is an integer, and along with the Block Number, Data and Previous Block’s Hash it serves as an input for the SHA256 function to calculate the current block’s hash:
SHA256(Block Number, Nonce, Data, Previous Block’s Hash) -> Hash
Unlike other components of a block, the Nonce is designed to be totally under our control. This means that now we have a mechanism to vary the current block’s hash while keeping the data inside it intact. Indeed, thanks to the nature of the hash function (property #3 in our discussion above), every time we select a new Nonce for the same block the resulting hash will be a different value.
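A short sketch of this effect: the block's number, data and previous hash stay fixed while only the Nonce changes. The field encoding and the sample values are assumptions made for the example.

```python
import hashlib

def block_hash(number: int, nonce: int, data: str, prev_hash: str) -> str:
    # Illustrative encoding only: the fields are joined with "|" before hashing.
    payload = f"{number}|{nonce}|{data}|{prev_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

prev_hash = "0" * 64
hashes = [block_hash(1, nonce, "A pays B 1", prev_hash) for nonce in range(5)]

# Same block contents, five different nonces, five unrelated-looking hashes.
for nonce, h in enumerate(hashes):
    print(nonce, h)
```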
Alright, that’s great. But what has any of this got to do with mining? This is where we come to the fun stuff.
There is a total of 16^64 possible SHA256 cryptographic hash values (each hexadecimal digit has 16 possible values and there are 64 of them in a hash). However, not all of them are valid hashes.
Why is that? Well, roughly every two weeks the Bitcoin network defines a target for the hash. Any hash above this target will be rejected; any hash below it will be accepted.
For example, at the time of writing the target is:
0000000000000000005d97dc0000000000000000000000000000000000000000
What is really important in the target is the number of leading zeroes. Just like in the decimal system, leading zeros in a fixed-size number will determine its magnitude. Every leading zero reduces the number’s magnitude by a factor of 16 (ten in the decimal system, but here we’re working with hexadecimals).
There are 18 leading zeros in the current target, meaning that the total number of valid hashes is roughly 16^46 (only 64-18=46 digits remain free to vary). Therefore, the probability that a randomly picked hash is valid can be calculated as:
16^46 / 16^64 = 16^(-18) ≈ 0.00000000000000000002%
In Bitcoin mining terms, this is the probability that any given Nonce value will generate a valid hash for the current block. We can clearly see that the pool of valid hashes is extremely small in comparison to the complete SHA256 pool.
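The same arithmetic can be checked in Python. The sketch below compares the leading-zeros rule of thumb with the exact ratio of the target to the full hash space, assuming a hash counts as valid when it is numerically below the target; the target is rebuilt from the value quoted above (18 zeros, then 5d97dc, then 40 zeros).

```python
from fractions import Fraction

# The target quoted in the text: 18 zeros, "5d97dc", then 40 zeros (64 hex digits).
target_hex = "0" * 18 + "5d97dc" + "0" * 40
target = int(target_hex, 16)
total_hashes = 16 ** 64  # the same number as 2 ** 256

# Rule of thumb: 18 leading zeros leave 46 digits free to vary.
rough = Fraction(16 ** 46, total_hashes)

# Exact: the share of all possible hashes that lie below the target.
exact = Fraction(target, total_hashes)

print(f"rough estimate: {float(rough):.3e}")
print(f"exact ratio:    {float(exact):.3e}")
```

The exact ratio comes out somewhat smaller than the rule of thumb because the first non-zero digits of the target (5d97dc) also restrict which hashes qualify.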
And that’s what the cryptographic puzzle is all about: miners compete to find a Nonce (also called a Golden Nonce) which will generate a valid hash for the upcoming block. Whoever finds it first is allowed to add the block to the chain and gets a reward of 12.5 Bitcoins. At the time of writing one Bitcoin is worth around $18,000 USD or 1,894,984 rupees, making mining a rather worthwhile activity.
The target is defined based on the network’s hashrate (the aggregate computational power of all Bitcoin miners). The more miners join the network, the lower the target will be, and therefore the harder it will be to find a suitable hash. The goal of this difficulty algorithm is to ensure that, on average, only one new block is added every 10 minutes. This is part of the Bitcoin monetary policy to control the total number of coins in circulation.
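A simplified sketch of how such an adjustment can work, modelled on Bitcoin's retargeting rule as I understand it: the target is rescaled every 2016 blocks (two weeks at 10 minutes per block) by the ratio of actual to expected time, limited to a factor of 4 per adjustment. The function name is hypothetical, and details such as the maximum allowed target and the compact target encoding are left out.

```python
EXPECTED_SECONDS = 2016 * 10 * 60  # 2016 blocks at 10 minutes each = two weeks

def retarget(old_target: int, actual_seconds: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # Limit a single adjustment to a factor of 4 in either direction.
    clamped = max(EXPECTED_SECONDS // 4, min(actual_seconds, EXPECTED_SECONDS * 4))
    return old_target * clamped // EXPECTED_SECONDS

old = 16 ** 46
faster = retarget(old, EXPECTED_SECONDS // 2)  # blocks arrived twice as fast
slower = retarget(old, EXPECTED_SECONDS * 2)   # blocks arrived half as fast

print(faster == old // 2)  # True: the target halves, so mining gets harder
print(slower == old * 2)   # True: the target doubles, so mining gets easier
```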
In a nutshell, that’s what the millions and millions of mining machines are doing day and night: they are simply iterating through different values of the Nonce in the hope of being the first to find a valid hash for the next block. Once a valid hash is found, the block is added to the chain and the race starts over again, this time for the next block.
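The search itself can be sketched in a few lines of Python. To keep the run time short this uses a toy target with only 4 leading zeros instead of 18, and the field encoding is again an assumption made for illustration.

```python
import hashlib
from itertools import count

def block_hash(number: int, nonce: int, data: str, prev_hash: str) -> str:
    payload = f"{number}|{nonce}|{data}|{prev_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine(number: int, data: str, prev_hash: str, target: int):
    """Try nonces 0, 1, 2, ... until the hash, read as a number, is below the target."""
    for nonce in count():
        h = block_hash(number, nonce, data, prev_hash)
        if int(h, 16) < target:
            return nonce, h

# Toy target with 4 leading zeros: roughly 1 hash in 16**4 = 65,536 is valid.
toy_target = 16 ** 60
nonce, h = mine(1, "A pays B 1", "0" * 64, toy_target)
print("golden nonce:", nonce)
print("hash:", h)
```

Each extra leading zero in the target multiplies the expected number of attempts by 16, which is why the real puzzle needs so much hardware.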
Part 3: Block configuration
Just when we thought we were done… There’s more.
The Nonce is an integer value with 32 bits of memory allocated to it, meaning that it has a limited range of around 4 billion values. This poses two problems:
First, even an average mining device can calculate up to 100 million hashes per second, and therefore will go through the Nonce range in 40 seconds. And that’s an average miner. Mining pools and industrial scale mines are able to go through the Nonce range in fractions of a second.
Secondly, the chance of finding a valid hash is so small that even with 4 billion tries the probability of success is still extremely low:
4 * 10^9 * 0.00000000000000000002% ≈ 0.0000000001%
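Both problems come down to simple arithmetic, which the following sketch reproduces using the figures from the text: a 32-bit Nonce, 100 million hashes per second, and the per-hash probability of 16^(-18) from Part 2.

```python
nonce_range = 2 ** 32        # 4,294,967,296 possible Nonce values
hash_rate = 100_000_000      # hashes per second for an average device (figure from the text)

# Problem 1: how long until every Nonce has been tried?
seconds_to_exhaust = nonce_range / hash_rate
print(f"{seconds_to_exhaust:.1f} seconds")  # 42.9 seconds

# Problem 2: how likely is it that at least one of those tries succeeds?
p_single = 16 ** -18               # chance that one given Nonce yields a valid hash
p_range = nonce_range * p_single   # expected number of valid Nonces in the whole range
print(f"{p_range * 100:.1e} %")
```

Because the probability per try is so tiny, multiplying it by the number of tries is an excellent approximation of the chance of at least one success.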
So what’s the solution?
For starters, the block contains… yet another field which we haven’t spoken about yet. This field is a timestamp representing the current Unix time (the number of seconds elapsed since 1st January 1970).
The Timestamp is also included in the SHA256 calculation for the hash of the current block that’s being mined:
SHA256(Block Number, Timestamp, Nonce, Data, Prev. Block’s Hash) -> Hash
And since the timestamp is constantly refreshing (until the block is successfully mined), this effectively resets the Nonce range every second. Why? Well, as we discussed at the very start — even if the inputs of the SHA256 function are varied slightly, this causes the hash to change.
Therefore, if we try all 4 Billion Nonce values for a fixed combination of other inputs (block number, timestamp, data, previous block’s hash) but have no luck finding a valid hash, all we have to do is wait until the timestamp increases. A change in the timestamp means that the combination is now different, so if we try all 4 Billion Nonce values again, each one will give us a brand new hash value.
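Here is a sketch of that retry loop. To make it finish quickly, the Nonce range is shrunk from 2^32 to 256 values and the target has only 3 leading zeros, and instead of actually waiting for the clock the timestamp is simply incremented by one second. All of these are assumptions made for the example.

```python
import hashlib

def block_hash(number, timestamp, nonce, data, prev_hash):
    payload = f"{number}|{timestamp}|{nonce}|{data}|{prev_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine_with_timestamp(number, data, prev_hash, target, start_time, nonce_range):
    timestamp = start_time
    while True:
        for nonce in range(nonce_range):
            h = block_hash(number, timestamp, nonce, data, prev_hash)
            if int(h, 16) < target:
                return timestamp, nonce, h
        # Nonce range exhausted without success. A real miner would wait for the
        # clock to tick; here the tick is simulated by adding one second.
        timestamp += 1

NONCE_RANGE = 2 ** 8    # shrunk from 2**32 so the range runs out quickly
toy_target = 16 ** 61   # 3 leading zeros: about 1 valid hash in 4,096
start = 1_600_000_000   # arbitrary Unix time used as the starting point

timestamp, nonce, h = mine_with_timestamp(1, "A pays B 1", "0" * 64, toy_target, start, NONCE_RANGE)
print("timestamp ticks needed:", timestamp - start)
print("nonce:", nonce, "hash:", h)
```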
The timestamp solves the problem for the average miner since it will reset before they get to the end of the Nonce range (reminder: average miner takes 40 seconds to do 4 Billion passes). However, for a mining pool or industrial scale miner even one second is too long — as we discussed, they would get through the Nonce range in fractions of a second. So how do they solve the problem? This is where block transaction configuration comes in.
Participants of the Bitcoin network transact with each other all the time. However, a new block is only added once every ten minutes. So where do the transactions go before they are added to a block? New entries are added to a staging area called the mempool. It is then the miners’ job to pick up a batch of these transactions from the mempool and add them to the new block they are mining.
Block size is limited and not all transactions from the mempool will fit into the new block. This means that miners get to pick which transactions will go into the next block. What this also means is that miners can change the configuration of transactions at will (before the block has been successfully mined).
And this is how miners get additional control over the hash. Well… Control isn’t the right word since the hash cannot be reverse engineered or predicted. Variability is a better term here: changing the configuration of transactions creates additional variability in the hash function inputs.
Similar to the timestamp situation, whenever we try out all 4 Billion possible values in the Nonce range and have no luck, all we have to do is slightly alter the combination of transactions which we have selected from the mempool.
The main difference here is that we don’t have to wait. By altering the selected transactions, we can reset our Nonce range at will, so we can do this as many times per second as we want. Of course, all of this is done algorithmically. This way even mining pools and industrial scale miners can test new hash values continuously without any idle time.
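A rough sketch of the idea: the timestamp stays fixed, and every time the (deliberately tiny) Nonce range runs out, a different selection of transactions is taken from the mempool. Walking through permutations is just one illustrative way to vary the configuration, and the mempool contents, block size and toy target are all made up for the example.

```python
import hashlib
from itertools import permutations

def block_hash(number, timestamp, nonce, data, prev_hash):
    payload = f"{number}|{timestamp}|{nonce}|{data}|{prev_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine_with_reconfiguration(number, timestamp, mempool, block_size,
                              prev_hash, target, nonce_range):
    # Each ordered selection of `block_size` transactions is a new block configuration.
    for selection in permutations(mempool, block_size):
        data = ";".join(selection)
        for nonce in range(nonce_range):
            h = block_hash(number, timestamp, nonce, data, prev_hash)
            if int(h, 16) < target:
                return selection, nonce, h
        # Nonce range exhausted: move straight on to the next configuration, no waiting.
    return None  # every configuration was tried without success

mempool = ["A pays B 1", "B pays C 2", "C pays D 3", "D pays E 4", "E pays F 5", "F pays A 6"]
toy_target = 16 ** 61  # 3 leading zeros: about 1 valid hash in 4,096

result = mine_with_reconfiguration(1, 1_600_000_000, mempool, 4, "0" * 64, toy_target, 2 ** 8)
print(result)
```

With 6 transactions and room for 4, there are 360 configurations, each giving a fresh set of 256 hashes to try without any waiting.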
So, this is how blockchain mining works …
Thank you