登录查看更多内容

Elasticsearch Filters and BM25 Relevancy Scoring

Ankit Thawal

IIT Madras | CSE |

发布日期: 2025年3月17日

In today's data-driven world, finding the needle in the digital haystack is more crucial than ever. As a search engineer who's spent years optimizing Elasticsearch implementations, I wanted to share some insights into how ES actually determines what's "relevant" in your search results.

Elasticsearch Filters

Elasticsearch (ES) is a powerful search and analytics engine that provides fast and efficient ways to search through large volumes of data.

Filters in Elasticsearch are binary decisions that determine whether a document matches certain criteria. Unlike queries that calculate relevance scores, filters simply include or exclude documents. This makes them significantly faster and more efficient.

Common types of filters include:

Term filters (exact matches)
Range filters (for numeric or date ranges)
Exists filters (field must exist)
Bool filters (combining multiple filters with AND/OR logic)

Filters are cached automatically by Elasticsearch, making them ideal for frequently used filtering criteria like status fields, categories, or date ranges. Filters do not take part in score calculations.

How does ES calculate relevancy score?

While filters help us narrow down our result set, they don't tell us which documents are most relevant to a user's query. This is where Elasticsearch's scoring mechanism comes into play. Let's explore how Elasticsearch determines which results should rank higher than others. Elasticsearch uses the BM25 algorithm to calculate the relevancy score. BM25 (Best Matching 25) is an improvement over the older TF-IDF scoring model and has been the default in Elasticsearch since version 5.0.

Understanding BM25

BM25 calculates scores based on three main components:

Term Frequency (TF): How often a term appears in a document
Inverse Document Frequency (IDF): How rare or common a term is across all documents
Field Length Normalization: Adjusting for document length

Term Frequency Formula

The formula used by BM-25 to calculate the TF:

TF = (freq)(k1+1) / (freq + k1 * (1 - b + b *(dl/avg_dl)))

If we use raw term frequency, it gives higher weightage to longer documents, as naturally the longer documents might contain those terms more number of times than shorter documents. But that does not mean if a certain doc contains the same terms a large number of times, that makes it more relevant.

Terms in the formula:

freq = frequency of that term in that particular document
k1 = term saturation
b = length normalization
dl = length of a document
avgdl = average length of documents in that index

How do these terms help?

The purpose of k1 is to control how quickly the importance of term saturates as its frequency increases.

k1 => controls how fast or slow it takes to saturate
k1: It's like a "brake" on how quickly TF saturates

Intuitive Explanation with Examples

Let's use an example to make this more concrete. Suppose we have a query term "apple" and three documents:

Document A: "apple" appears 1 time.
Document B: "apple" appears 5 times.
Document C: "apple" appears 10 times.

We'll calculate TFBM25 for each document using different k1 values.

Case 1: k1 = 0.5

TFBM25 = (0.5+1)?tf / (0.5+tf) = 1.5?tf / (0.5+tf)
Document A: 1.5?1 / (0.5+1) = 1.0
Document B: 1.5?5 / (0.5+5) = 1.36
Document C: 1.5?10 / (0.5+10) = 1.43

Here, the score increases only slightly from Document A to Document C. Saturation happens quickly.

Case 2: k1 = 1.2 (default)

领英推荐

AI Builders Summit Week 1 Takeaways, 10 Trending Open…

Open Data Science Conference (ODSC) 2 个月前

?? Extract Summit 2023's Recap & Future Horizons:…

Zyte 1 年前

4 Database Trends Data-Intensive Businesses Need to…

Ed Huang 2 个月前

TFBM25 = (1.2+1)?tf / (1.2+tf) = 2.2?tf / (1.2+tf)
Document A: 2.2?1 / (1.2+1) = 1.0
Document B: 2.2?5 / (1.2+5) = 1.79
Document C: 2.2?10 / (1.2+10) = 1.96

Here, the score increases more significantly from Document A to Document C. Saturation happens more slowly.

Case 3: k1 = 2.0

TFBM25 = (2.0+1)?tf / (2.0+tf) = 3.0?tf / (2.0+tf)
Document A: 3.0?1 / (2.0+1) = 1.0
Document B: 3.0?5 / (2.0+5) = 2.14
Document C: 3.0?10 / (2.0+10) = 2.5

Here, the score increases significantly even at higher term frequencies. Saturation happens very slowly.

How b helps?

b = 0.75 (default value)

b helps normalize the length of the document, it helps when we might give undue advantage to long documents.

Balanced approach:

If dl > avgdl, the denominator increases, lowering TF.
If dl < avgdl, the denominator decreases, raising TF slightly.

IDF (Inverse Document Frequency)

IDF describes how important a term is across all documents.

If a term appears in all the documents, it's not that important. If the term appears in few of the documents, it is important.

IDF Formula:

IDF = ln([ (N - n + 0.5) / (n + 0.5) ] + 1)

Where:

N = total number of documents
n = number of documents containing the term
smoothening factor = 0.5 is added to prevent the denominator from becoming 0

The 1 is added so the value inside the log always remains more than 1, ensuring the final result never becomes negative.

How is this formula useful?

Let's say a term appears in almost all the documents, so intuitively we can agree that this particular term is of no use when finding the relevant document, as it doesn't add any new value.

How does the IDF formula take care of this?

As n approaches N, ln(1) = 0, so the IDF becomes zero, indicating the term is not at all useful for the search.

Final score calculation

score = TF × IDF × boost

In most cases, boost = 1.0 unless specifically adjusted in the query.

Tuning BM25 in Elasticsearch

Elasticsearch allows you to customize the BM25 parameters to better suit your specific use case by adjusting the index settings:

PUT /my_index
{
  "settings": {
    "index": {
      "similarity": {
        "my_similarity": {
          "type": "BM25",
          "b": 0.75,
          "k1": 1.2
        }
      }
    }
  }
}

By understanding how these parameters affect the scoring, you can fine-tune your Elasticsearch queries to deliver the most relevant results for your specific use case.

#Elasticsearch #SearchEngineering #BM25 #DataScience

Note : written by me, formatted by Claude.

Elasticsearch Filters and BM25 Relevancy Scoring

Ankit Thawal

IIT Madras | CSE |

Elasticsearch Filters

How does ES calculate relevancy score?

Understanding BM25

Term Frequency Formula

How do these terms help?

Intuitive Explanation with Examples

领英推荐

How b helps?

IDF (Inverse Document Frequency)

How is this formula useful?

Final score calculation

Tuning BM25 in Elasticsearch

社区洞察

其他会员也浏览了

4 Database Trends Data-Intensive Businesses Need to Watch in 2025

DATA Pill #025 - Data Meshes Missing element, all-in-one data stack renesans and more

OSAI tl;dr 11th ed — The “Fauxpen:Open” Ratio Approaching 10:1

Boosting Recent Activity in Elasticsearch Search Results

Harnessing Beckn Protocol, Solid Pods, & Open Source Vector Databases for Personal AI??

Why Your Vector Database Needs a Formula 1 Upgrade, Information Retrieval Algorithms, and More!

Graph RAG Over Elasticsearch : Next Step in Data Search

Semantic Web

DBPedia - a case of (hidden) Freemium? Where do Linked-Data (sometimes) disappear ? "State of the art" for industrial use.

The Hidden Power of Relational Databases: Understanding the Foundations of Modern Technology and the Case for Regulation