Exploring Key Distributed System Algorithms and Concepts Series: 1 - Merkle Tree

Vertisystem

Technology-driven business solutions

Published: September 4, 2023

Distributed systems are at the heart of modern computing, enabling applications to scale, be fault-tolerant, and provide high availability.

In this blog, we’ll dive into several fundamental algorithms and concepts that form the backbone of distributed systems architecture. We’ll cover their real-life usage, provide detailed examples, and explain their working logic.

Merkle Tree

The primary purpose of a Merkle tree is to provide a way to verify whether a specific piece of data is part of a larger dataset without needing to store or transmit the entire dataset. This is especially useful in scenarios where you have a large amount of data and want to ensure that it hasn’t been tampered with, such as in cryptocurrencies (like Bitcoin) for validating transactions and blocks.

Here’s how a Merkle tree works:

Data Segmentation: The dataset is divided into smaller blocks, often of a fixed size. If the last block is shorter than the others, some implementations pad it to the required size.
Hashing: Each data block is then hashed using a cryptographic hash function (such as SHA-256). This produces a fixed-size hash value, typically written as a hexadecimal string.
Pairwise Hashing: The hash values are paired up and concatenated, and each pair is hashed again. This process is repeated level by level until there’s only one hash value left. If a level has an odd number of hashes, the implementation needs a rule for the unpaired one; a common choice is to pair it with a copy of itself. The final hash value is called the “root hash” or “Merkle root.”
Verification: To verify the integrity of a specific piece of data within the dataset, you don’t need to compare the entire dataset. You hash the data block, then combine that hash with the sibling hash at each level on the path from the leaf up to the Merkle root. If the calculated Merkle root matches the trusted one you already have, the data is part of the dataset and has not been tampered with. This process is efficient because a dataset of n blocks needs only about log2(n) hash calculations instead of n.
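The first three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names (split_blocks, merkle_root), the zero-byte padding, and the rule of pairing an unpaired hash with a copy of itself are all assumptions chosen for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into block_size chunks, zero-padding the last one (assumed padding rule)."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")
    return blocks


def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash every block, then hash pairs level by level until one hash remains."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [sha256(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: an unpaired hash is paired with a copy of itself.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = split_blocks(b"some larger dataset to protect", block_size=8)
print(len(blocks), "blocks, root =", merkle_root(blocks).hex())
```

Here the hashes are concatenated as raw bytes before being hashed again. Concatenating hexadecimal strings instead would also work, but it produces a different root, so both sides must agree on the convention.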

Plain Logic

Calculations in a Merkle tree involve hashing data and building the tree structure. Let’s walk through a simplified example of how calculations are done in a Merkle tree.

Suppose we have four data blocks: A, B, C, and D. Each of these blocks is hashed using a cryptographic hash function (like SHA-256) to produce its hash value.

Hashing the Data:

Hash(A) = HashValueA
Hash(B) = HashValueB
Hash(C) = HashValueC
Hash(D) = HashValueD

Building the Tree: Now, we start building the Merkle tree structure by pairing and hashing the hash values. We calculate the hash of concatenated pairs until we reach the root hash.

Pair 1: Hash(HashValueA + HashValueB) = HashValueAB
Pair 2: Hash(HashValueC + HashValueD) = HashValueCD
Pair 3: Hash(HashValueAB + HashValueCD) = MerkleRootHash

In this example, MerkleRootHash is the final root hash of the Merkle tree.
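The same four-block walkthrough can be written out directly, one line per hash. This is only a sketch: the block contents (b"A" to b"D") are placeholders, and "+" is assumed to mean concatenation of the raw digest bytes.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()


# Placeholder contents for the four data blocks.
block_a, block_b, block_c, block_d = b"A", b"B", b"C", b"D"

# Hashing the data: one leaf hash per block.
hash_a, hash_b, hash_c, hash_d = h(block_a), h(block_b), h(block_c), h(block_d)

# Building the tree: hash each concatenated pair, then the pair of pairs.
hash_ab = h(hash_a + hash_b)
hash_cd = h(hash_c + hash_d)
merkle_root_hash = h(hash_ab + hash_cd)

for name, value in [
    ("HashValueAB", hash_ab),
    ("HashValueCD", hash_cd),
    ("MerkleRootHash", merkle_root_hash),
]:
    print(f"{name:<15}{value.hex()}")
```

Note that the order of concatenation matters: hashing HashValueB + HashValueA gives a different value than HashValueA + HashValueB, so the left and right positions in the tree must be fixed.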

Verification: Let’s say we want to verify the integrity of data block B. We do not need blocks A, C, or D themselves. We only need B, the sibling hashes along B’s path to the root (HashValueA and HashValueCD), and the expected Merkle root hash.

Calculate the hash of B = HashValueB

Calculate the hash of HashValueA + HashValueB = HashValueAB

Calculate the hash of HashValueAB + HashValueCD = MerkleRootHash

If the calculated MerkleRootHash matches the expected root hash, it means data block B is part of the Merkle tree and hasn't been tampered with.
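To make the verification of block B concrete, here is a small sketch that builds the tree, collects the sibling hashes on B's path, and recomputes the root from them. The function names (build_levels, make_proof, verify) and the proof format, a list of (sibling hash, sibling-is-left flag) pairs, are assumptions made for this illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaf hashes first and the root last."""
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pair an unpaired hash with a copy of itself.
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(block: bytes, proof: list[tuple[bytes, bool]], expected_root: bytes) -> bool:
    """Recompute the root from one block plus its sibling hashes."""
    current = h(block)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == expected_root


levels = build_levels([b"A", b"B", b"C", b"D"])
root = levels[-1][0]
proof_for_b = make_proof(levels, index=1)  # B is the second leaf

print(len(proof_for_b), "sibling hashes in the proof")
print("B verifies:", verify(b"B", proof_for_b, root))
print("modified B verifies:", verify(b"B (modified)", proof_for_b, root))
```

The proof for B contains exactly the two values used in the steps above, HashValueA (a left sibling) and HashValueCD (a right sibling), so the verifier never touches blocks A, C, or D.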

The actual Merkle tree structure might be larger and more complex, especially in blockchains (see the example below) with many transactions. However, the basic principles of pairing hash values and constructing the tree remain the same.

Merkle trees provide several benefits, including:

Efficiency: They enable efficient verification of large datasets or files without transmitting the whole data.
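Security: Changing any part of the dataset changes the hash of that block and every hash on its path to the root, so the Merkle root no longer matches. Altering data unnoticed would require finding a hash collision, which is computationally infeasible for a function such as SHA-256.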
Scalability: Merkle trees are used in blockchain technology to validate transactions and blocks efficiently, making them suitable for large-scale distributed systems.
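Two of these benefits are easy to check with a short experiment. The sketch below, which uses hypothetical block contents and assumes an unpaired hash is paired with a copy of itself, shows that modifying a single block changes the root, and that the number of sibling hashes in a proof grows only logarithmically with the number of blocks.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Pairwise-hash the block hashes until a single root remains."""
    level = [h(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the unpaired hash
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def proof_length(block_count: int) -> int:
    """Sibling hashes needed to verify one block: ceil(log2(block_count))."""
    return (block_count - 1).bit_length()


# Security: modifying one block out of eight changes the root.
blocks = [f"block-{i}".encode() for i in range(8)]
original_root = merkle_root(blocks)

tampered = list(blocks)
tampered[5] = b"block-5 (modified)"
print("root changed:", merkle_root(tampered) != original_root)

# Efficiency: proof size grows with log2(n), not n.
for n in (8, 1024, 1_000_000):
    print(f"{n:>9} blocks -> {proof_length(n)} sibling hashes per proof")
```

With a million blocks, a proof still carries only 20 sibling hashes, which is what makes verification practical without transmitting the whole dataset.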

Example

In the Bitcoin blockchain, a Merkle tree is used to represent and verify the transactions within a block. Here’s how it works:

Transaction Collection: When users send Bitcoin to each other, these transactions are collected and grouped together in a block. Each transaction contains information about the sender, recipient, and the amount of Bitcoin transferred.
Hashing Transactions: Each transaction in the block is hashed using a cryptographic hash function. Bitcoin applies SHA-256 twice (double SHA-256) to produce each transaction's hash.
Pairwise Hashing: The transaction hashes are then paired and concatenated, and each pair is hashed again. If a level contains an odd number of hashes, Bitcoin duplicates the last one so that it can be paired with itself. This process is repeated until there’s only one hash value left, known as the “Merkle root.”
Merkle Root in the Block Header: The Merkle root is included in the header of the block, along with other information like the timestamp, the previous block’s hash, and a nonce. The header is then hashed to create the block’s unique identifier or “block hash.”
Verification: If someone wants to verify a transaction within a block, they don’t need to download the entire block and all its transactions. Instead, they only need the Merkle root from the block header and the path to the specific transaction they’re interested in. By hashing the transaction hash along with the appropriate sibling hashes from the Merkle tree, they can verify that the transaction is indeed included in the block without needing to access all the transactions.
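The Bitcoin-specific rules can be sketched as follows: double SHA-256 at every step, and duplication of the last hash when a level has an odd count. This is a simplified illustration only. The transactions are hypothetical byte strings rather than real serialized Bitcoin transactions, the function names are made up for the example, and the byte-order conventions Bitcoin uses when displaying hashes are ignored.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice and return the raw digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bitcoin_style_merkle_root(tx_hashes: list[bytes]) -> bytes:
    """Combine transaction hashes pairwise with double SHA-256 until one remains."""
    if not tx_hashes:
        raise ValueError("a block contains at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd count: the last hash is duplicated and paired with itself.
            level.append(level[-1])
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical stand-ins for serialized transactions.
raw_transactions = [b"tx-1", b"tx-2", b"tx-3"]
tx_hashes = [double_sha256(tx) for tx in raw_transactions]

root = bitcoin_style_merkle_root(tx_hashes)
print("Merkle root:", root.hex())
```

A block with a single transaction needs no pairing at all, so in this sketch its Merkle root is simply that transaction's hash.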

This approach provides several benefits:

Efficiency: Verifying transactions becomes much faster since you don’t need the entire block, just the relevant part of the Merkle tree.
Security: Any tampering with a transaction would result in a different Merkle root, which would be immediately noticeable.
Scalability: As the number of transactions grows, the Merkle tree allows for efficient verification without needing to handle all transactions at once.

So, in the Bitcoin blockchain, Merkle trees play a critical role in ensuring the integrity of transactions and enabling efficient verification, which is essential for the security and functionality of the entire network.

In the next blog in this series, we will discuss Consistent Hashing.
