Merkle Tree :

A concept popular in Distributed Systems. Merkle Tree is a binary tree used for easy search and secure verification of data in time optimal and with low space for large datasets.

Each leaf has a cryptographic hash of blocks of data, so an actual block of data is available at the leaf nodes. A non-leaf node hash is formed by the combination of leaf nodes(i.e. child node) as shown below :

Applications are :

  1. git -> To manage commits and files together and quickly find out something under got changes it uses Merkle tree. As soon as the file changes, the hash change occurs and hence the hash pointer also gets changed, which helps git know that change has been occured. To show the actual changes that we see in the file something similar to the Longest Common Subsequence algorithm seems to be used.For example, the hashes stored in yourprojectfir/.git/object . To see use: ls -la .git/objects. thereafter every commit, there would be a new hash located along with the commits of new hashes. https://initialcommit.com/blog/git-bitcoin-merkle-tree
  2. DynamoDB: Merkle Trees are important to handle transient failures. Each node maintains its own Merkle tree for the range of is assigned to. So when a failure occurs to some node, the request is rerouted to another node with that key range, and the operation(eg: update) gets performed. Hence when the node gets alive again, to maintain the consistency among data through the use of Merkle trees by comparing the hashes and going down in a consistent manner towards the leaf, to sync with actual data. This will help to reduce the number of iterations to copy since now only changed data will get transferred.

https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

要查看或添加评论,请登录

Pushkal Goyal的更多文章

  • Partitioning & Replication

    Partitioning & Replication

    Partitioning: Process of dividing data into independent segments. Need of Partitioning: If the data is served without…

  • Interpreter and Compiler

    Interpreter and Compiler

    Python => interpreter & other languages like CPP => compiler-based languages. I didn't know the way I could see the…

  • Consistent Hashing

    Consistent Hashing

    It is one of the system design techniques used to optimize performance issues with horizontal hashing while scaling…

  • Circular Imports in Python

    Circular Imports in Python

    **packages : directory with __init__.py **modules : files with .

  • POSIX

    POSIX

    It stands for Portable Operating System Interface. This is a compliance introduced by the US government for procurement…

社区洞察

其他会员也浏览了