A concept popular in Distributed Systems. Merkle Tree is a binary tree used for easy search and secure verification of data in time optimal and with low space for large datasets.
Each leaf has a cryptographic hash of blocks of data, so an actual block of data is available at the leaf nodes. A non-leaf node hash is formed by the combination of leaf nodes(i.e. child node) as shown below :
- git -> To manage commits and files together and quickly find out something under got changes it uses Merkle tree. As soon as the file changes, the hash change occurs and hence the hash pointer also gets changed, which helps git know that change has been occured. To show the actual changes that we see in the file something similar to the Longest Common Subsequence algorithm seems to be used.For example, the hashes stored in yourprojectfir/.git/object . To see use: ls -la .git/objects. thereafter every commit, there would be a new hash located along with the commits of new hashes. https://initialcommit.com/blog/git-bitcoin-merkle-tree
- DynamoDB: Merkle Trees are important to handle transient failures. Each node maintains its own Merkle tree for the range of is assigned to. So when a failure occurs to some node, the request is rerouted to another node with that key range, and the operation(eg: update) gets performed. Hence when the node gets alive again, to maintain the consistency among data through the use of Merkle trees by comparing the hashes and going down in a consistent manner towards the leaf, to sync with actual data. This will help to reduce the number of iterations to copy since now only changed data will get transferred.