Bloom filter in PHP
Fuente: Wikipedia

Bloom filter in PHP

?Prefieres leerlo en castellano? https://www.oscarpascual.com/implementacion-en-php-de-un-filtro-bloom/

This time I am going to share a simple implementation of a Bloom filter in PHP. You can see this implementation on my github: https://github.com/oscarpascualbakker/bloomfilter.

A Bloom filter is a probabilistic data structure used to check whether an element is a member of a set. In this structure, elements can be added to the set, but not removed.

A query to a Bloom filter returns "possibly in the set" or "definitely not in the set".

A Bloom filter is an array of m bits (bitarray), initially all set to 0. There must also be k different hash functions defined, each of which maps or hashes to some element set to one of the m positions of the matrix, generating a uniform random distribution. Normally, k is a small constant that depends on the probability of false errors that we are going to allow.

To add an element to the set, it is necessary to perform k hash functions on the element, obtain the corresponding k positions in the filter, and set these bits to 1.

To verify if an element is in the set, it is necessary to perform the k hash functions on that element again to obtain the positions in the filter. If any of the bits in these positions are 0, the element is definitely not in the set.

If all positions are set to 1, then either the element is in the set or the bits have been set to 1 during the insertion of other elements, resulting in a false positive.

It is not possible to remove elements in a simple Bloom filter, and there is no way to distinguish between a legitimate positive and a false positive. More advanced techniques can solve this problem.

No hay texto alternativo para esta imagen


Source: Wikipedia

Utilities of a Bloom filter

Bloom filters are useful in cases where:

  • the data to be searched is large
  • available system memory is limited/low

For this reason the algorithm has many uses. Since it is a system that responds very quickly wether an element is (probably!) in a set or not, it is used to save time and space.

  • caching systems
  • Weak password detection
  • internet protocol cache
  • Safe browsing in Google Chrome
  • Bitcoin wallet synchronization (this was deprecated from Bitcoin, but was part of early versions)
  • Hash-based IP tracking
  • Cyber security like virus scan

I hope you like this algorithm (I understand you did, if you've read this far).

Thank you!

要查看或添加评论,请登录

óscar Pascual Bakker的更多文章

  • Asset Tokenization: Reinventing Ownership and Investment

    Asset Tokenization: Reinventing Ownership and Investment

    ???? ?Quieres leer este artículo en espa?ol? In a fast-moving world, the term "Asset Tokenization" is gaining…

  • How to Start Programming in Solidity

    How to Start Programming in Solidity

    ???? ?Quieres leer este artículo en espa?ol? At a recent talk in Vic, Catalonia, Spain, I found myself surrounded by…

  • Blockchain-Based Loyalty Program

    Blockchain-Based Loyalty Program

    ???? ?Quieres leer este artículo en espa?ol? The world of blockchain technology has opened up a myriad of possibilities…

  • What Is a Smart Contract and How Does It Work?

    What Is a Smart Contract and How Does It Work?

    ???? ?Quieres leer este artículo en espa?ol? In today's digital world, blockchain technology is revolutionizing the way…

    2 条评论
  • What is Blockchain?

    What is Blockchain?

    ???? ?Quieres leer este artículo en espa?ol? The term "blockchain" has become a buzzword in technology, finance, and…

  • Fiat-Shamir "Zero Knowledge Proof"

    Fiat-Shamir "Zero Knowledge Proof"

    ?Quieres leer el artículo en espa?ol? It's a true pleasure (and also a matter of pride) to share a development of the…

  • Merkle Tree o árbol de Merkle en PHP

    Merkle Tree o árbol de Merkle en PHP

    En este artículo voy a hablar de la implementación en PHP de un ?Merkle Tree?, un árbol de Merkle, una implementación…

  • Dijkstra's algorithm in PHP

    Dijkstra's algorithm in PHP

    The first time I heard about Dijkstra I was studying at the University. Dijkstra's Algorithm, by the way, was the first…

    4 条评论
  • Priority Queue in PHP

    Priority Queue in PHP

    Today it's time to publish the code of a simple Priority Queue in PHP, but with an important update: the ability to…

  • Cola de Prioridad en PHP 7

    Cola de Prioridad en PHP 7

    ?Quién me iba a decir que iba a usar esto algún día para un proyecto profesional? Una Cola de Prioridad es una…

    2 条评论

社区洞察

其他会员也浏览了