Blockchain Technology, Part - 04

Blockchain Technology, Part - 04

Ingredients for a Blockchain 

Hash: Fingerprint for Data

You might have come across the term 'hash', maybe as root hash or previous hash, when we were discussing the structure of a block in the previous articles. Here, we will see what a hash is.

To simplify, hashing is a method in cryptography that converts any form of data into a unique string of text. The algorithm used for conversion is called the hashing algorithm. The fixed-size output from a hashing algorithm is called the hash


How does a hashing algorithm work?

A hash is generated by using a hash function/hash generator on our required data.

The hashing algorithm is analogous to our Kitchen Blender. Say that you want to make some fruit juice. You put all your fruits in the blender and give it a stir. Once the juice is ready, you pour it in a glass of say 250ml. Your aim is to always fill up the glass. If you think the juice won’t fill up the entire glass then you add some water while blending. There! 250ml glass is just like a fixed-size hash you get from a hashing algorithm. 

No alt text provided for this image

To understand the principle of hashing, take the analogy of a human fingerprint. Each person in the world has a unique fingerprint. There is the least possibility that one person has a similar fingerprint that of others. The chances of collision (two people having the same fingerprint) are too low, making it a unique identifier. Similarly, in the digital world, every piece of data can have its unique identifier, aka hash. A hash is a random set of characters that uniquely identify the data much like the fingerprints.



Features of Hashing Algorithm

A cryptographic hash algorithm has to satisfy five properties for it to be useful.

  1. One-way: While it is easy to calculate the hash of data, it isn't possible to decode the original data from a hash. Like our example with the juice blender, we can make juice out of the fruit but not the reverse. 
  2. Deterministic: A hashing algorithm always produces the same output for the same set of input.
  3. Fast Computation: The hash function should be able to compute the hash value without taking a considerable amount of time.
  4. Avalanche Effect: Any small change in the input data would produce an unpredictably different hash as output.
  5. Withstand Collisions: Since the length of a hash is fixed, theoretically, there is a chance of collision (where two different inputs producing the same hash for a given hashing algorithm). But it is very unlikely to happen. Also, a hash algorithm should withstand artificial collisions. Artificial collision is the process of forging the input so that it produces a hash value exactly the same as the one which is generated by a different set of input.

Now that you are familiar with the concept of hashing, let us see how this is used in blockchains. Cryptography and hashing are intertwined with blockchain. There are several instances where a blockchain depends on hashing. To cite a few:

  • Create a transaction hash
  • Create a Merkle tree
  • Create a block hash
  • Create addresses
  • Link blocks to form a chain

We will see all of them in the coming units.

Meanwhile, you can experiment with the following tool to generate hash values for different texts. The hashing algorithm used here is named SHA256, which stands for Secure Hash Algorithm. SHA256 hashing is used in blockchains like Bitcoin.

Understanding Data Structures - Merkle Tree

When we talked about blocks, we mentioned the root hash and Merkle tree, and we promised to discuss it later. So here we are.

In computer science, data structures are used for organizing data in an efficient manner. The use of an efficient data structure would help in quicker retrieval, sorting, deletion, and updating operations on the stored data. The real-world systems and applications which we are using every day methodically use these data structures to optimize their performance. The Bitcoin network also extensively uses certain data structures for the management and verification of transactions stored in a block.

In computer science, data structures are mainly classified into two types:

  1. Linear data structures
  2. Non-linear data structures

However, we are not going to cover all the above data structures as they aren’t the center point of this course. But we will cover a special kind of tree data structure, known as Merkle Tree (Binary hash tree), which is used in the Bitcoin core. Merkle tree is deployed for efficient management and verification of transactions in the Bitcoin block.

Merkle tree generation

Binary Tree

In computer science, the term "Tree" is used to describe a branching data structure, which is having a "root" at the top and "leaves" at the bottom. The below diagram shows a sample tree data structure

No alt text provided for this image

In the above diagram, each element that is marked with numerals is termed as nodes. The topmost element "1" is termed as the root node. All the elements (8,9,10,11,13,14) at the bottom are termed as leaf nodes. The nodes (2,3) are the children of parent node 1, likewise (4,5) are the children of parent node 2. There are more like this in the tree. 

The above tree is also known as a binary tree since every node at most contains only two children.

Merkle Tree

Merkle Tree is named after Ralph Merkle, who patented the concept in 1979. The tree is known for its performance in the secure verification of large amounts of data.

Merkle tree is a kind of binary tree in which the data elements contain cryptographic hashes. In other words, each leaf node of the Merkle tree contains the hash of a block in the blockchain. 

Let's see how a Merkle tree is constructed theoretically. The below diagram shows a simple Merkle tree constructed out of four transactions say, A, B, C, D.

No alt text provided for this image

Let's break down the steps performed to construct the above Merkle tree.

1. The Merkle tree is constructed bottom-up. We first start with four transactions A, B, C, D which forms the leaf nodes of the Merkle tree. Here, the transaction data are not stored in the tree, but the hashes of the four transactions ( HA, HB, HC, HD) are stored in each leaf node.

Hash of a transaction is calculated as follows:   

           HA = SHA256(SHA256(Transaction A))

 2. Consecutive pairs of leaf nodes are then summarized into a parent node by concatenating the two hashes and hashing them together. 

                    HAB = SHA256(SHA256(HA+HB))

 Similarly, the hashes of other nodes are calculated.

3. The process is continued until there is only one node at the top, known as the Merkle root or Root Hash. In the above figure, 'Root Hash' is the Merkle root. The calculated Merkle root is stored in the block header of the corresponding Bitcoin block.

Note: Since the Merkle tree is a binary tree, it needs an even number of leaf nodes. If we have an odd number of transactions, then we have to duplicate the last transaction hash to create an even number of leaf nodes.

For e.g. If we have only HA, HB, and HC, then we will duplicate HC to form an even number of leaf nodes. So now we have HA, HB, HC, HC as the leaf nodes of the Merkle tree. The same steps which are discussed above will be performed to calculate the Merkle root.

The below figure shows a block header with the Merkle root (Root Hash) of transactions in that block (in this case block number 11).

No alt text provided for this image

 We will cover the need and applications of the Merkle tree in the coming unit.

 Reference: https://www.oreilly.com/library/view/mastering-Bitcoin/9781491902639/ch07.html

Summarising Transactions

Merkle trees help to summarize all the transactions in a Bitcoin block by creating the Merkle root. It can be interpreted as a digital fingerprint of the entire set of transactions in the block. If a malicious actor tries to manipulate the transaction data, the Merkle root will change since we have hashed the data elements to calculate the root. This would result in a new block hash, and the link in the chain is broken. Thus, they provide a means to prove the integrity and validity of the data.

Transaction Proof

A node can prove that a transaction K is included in the block by producing a Merkle path. A Merkle path or authentication path is a specific path which connects the specific transaction to the root of the tree. 

The below figure shows a Merkle tree with a Merkle path highlighted to prove that transaction K is present in a block.

No alt text provided for this image

To prove that the transaction K is included in the block, a node needs to produce a Merkle path. Here, the path consists of four hashes HABCDEFGH, HL, HMNOP, HIJ. With those four hashes as the Merkle path, any node can compute four additional pair-wise hashes HIJKLMNOP, HKL, HIJKL, and the Merkle tree root (dashed outline). This proves that the transaction K is included in the block.

Simplified Payment Verification (SPV)

Simplified Payment Verification (SPV) is a method of verifying if particular transactions are included in a block without downloading the entire block. SPV nodes extensively use Merkle Trees. SPV nodes are a specific type of node in a blockchain network. They don't have all the transactions and do not download full blocks, just the block headers. Without downloading all the transactions in a block, we can verify that a transaction is included in the block by making use of the Merkle path.

Since we have seen the use of hashes in creating root hash, now let's look at how hashing helps in creating a chain with the blocks.

How blocks form a chain?

As discussed, in this unit we are going to see how a chain is made with blocks. The hash value of a block in a blockchain provides its unique identity. Here, we are also going to see another component of the block header - previous hash.

The below figure shows a chain of three blocks.

No alt text provided for this image

We can observe that every block is having a field called 'Previous Hash', which actually refers to the hash value of the previous block. This is how the blocks form a chain. Blocks are cryptographically linked using hash values. Any change in the 'data' stored in a block would result in a change in the hash value of the block and the change results in breaking the chain or the cryptographic link established between the blocks. New blocks coming to this chain will be appended at the end and will become the latest block. Blockchain is an append-only ledger, where modifications cannot be done in the middle.

Note: The 'previous hash value’ field of a genesis block is either empty or filled with zeros as there is no previous block for a Genesis block. 

Now we know the basic structure of a blockchain. Let’s look at how this blockchain becomes a network.

Blockchain Network

Consider a distributed peer to peer network. Each node in this network will have the same copy of blockchain with them.

No alt text provided for this image

This is called a blockchain network.

Public and Private Key Cryptography

Now that we have seen what a blockchain network is, let's look at how to interact with this blockchain network. For this, consider online banking as an example. In order to make a transaction, you need to insert your credentials to the online banking system. Your transaction will be processed only if you provide essential information. All your login information is stored in a server.

What about a decentralized system like blockchain? Is there is anyone to control the network or to maintain your password? How can we do a transaction in the network? Here we come across another concept in cryptography - Private and Public keys.

Private Key

A private key moreover functions as a password and is inevitable in carrying any transaction in a blockchain. As the name suggests, private keys are to be guarded, confidential to its respective owner. Whereas the public key can be shared with others.  A private key is basically a random number and we can make use of wallets to generate these. A blockchain wallet is a tool that manages your keys and monitors the balance of your account address if you are involved in the exchange of digital currencies. More about them in the next unit. 

In the case of Bitcoin network, a private key is a 256-bit number, which can be represented in several ways. Here is a 256 bit private key in hexadecimal representation(256 bits in hexadecimal is 32 bytes, or 64 characters in the range 0-9 or A-F).

E9873D79C6D87DC0FB6A5778633389F4453213303DA61F20BD67FC233AA33262

Note: This private key is for demonstration purpose only, kindly do not use this for any real-world blockchain transactions.

Public Key

A public key is generated from a private key using ECC(Elliptic-curve cryptography) mathematical functions. These mathematical functions are practically irreversible, meaning its easy to calculate public key from private key but impossible to do vice versa. Therefore, it is safe to give your public key to someone without the risk of revealing your private key.

Address

With the help of this public key, we can generate an address in the blockchain. The concept of an address is similar to that of an account number in a bank. Your account and account balance are identified using the account number. Similarly, your balance in your blockchain network is identified using these addresses. An address in the Bitcoin network is generated from the hash of your public key. This is another scenario where hashing is used.

No alt text provided for this image

So now you know how the keys and addresses are generated for Bitcoin.

Blockchain Wallets

The private key of a blockchain network should be extremely secret. Once the key is lost, it cannot be retrieved. This has risen concern in the blockchain world. The best way to store a private key securely is to memorize it. However, this isn’t a feasible option as the private key is of length 256 bits and they are normally represented as 64 hexadecimal digits. Also, a person may have more than one account in a network. That means separate private keys for different accounts and you need to memorize all these keys, which is practically difficult. Another method is taking a print out of the keys and keep it somewhere safe. However, this too seems inconvenient.

So, what are our other options?

The concept of wallets comes in this context. Wallets are specific tools that help us in managing private keys and corresponding addresses. Wallets monitor the balance in these addresses. Wallets also establish our identity in the blockchain and ensure we're able to do transactions in the network. Blockchain wallets can be classified as "hot" or "cold". Hot wallets are connected to the internet while cold wallets work offline.

Wallets can be further classified as follows:

1. Web Wallets: It allows you to access blockchain through a web browser interface without downloading or installing the wallet software to your system. Web wallets store your private keys online. eg: MyEtherWallet, MetaMask.

2. Desktop Wallets: It is a software that you can download and execute locally on your computer. Your private keys will be stored locally on your hard disk. eg: Exodus, Bitcoin Core.

3. Mobile Wallets: It is specifically designed for mobile devices and also easy to use. eg: Mycelium, Electrum.

4. Hardware Wallets: It is a hardware device where keys are stored in the device itself. eg: Ledger Nano S, Trezor.

5. Paper Wallets: It is a piece of paper on which public addresses and private keys are printed and stored securely.

Note: All the examples here are for your reference and this does not mean we are promoting or recommending these tools.

Digital Signature: The Trust Layer

One of the major applications of Public Key cryptography is Digital Signatures. For example, if Alice wants to do a transaction with Bob, she signs her transaction with her private key and sends that transaction to the network. The wallet helps Alice in signing the transaction and sending it to the network. A transaction is only valid if it has a valid signature with the authorized private key. Bob can make sure that the transaction is from Alice by checking her public key.

No alt text provided for this image

With the help of a digital signature, we can prove the ownership of funds. Every transaction that we initiate is signed with our private key. Anyone having access to our public key can verify whether the transaction was initiated by us, without actually knowing the private key. 

In Bitcoin, the mathematical function used to generate the public key is ECDSA Algorithm (Elliptic Curve Digital Signature Algorithm).


要查看或添加评论,请登录

社区洞察

其他会员也浏览了