Understanding Elasticsearch Indexing: Text vs. Keyword Fields

Picture this: Elasticsearch, the cool cat in town, analyzes Text before it hits the Inverted Index dance floor, while Keyword, the rebel, just shows up as is.

That is the key difference. A Text value is run through an analyzer before it is indexed (by default the standard analyzer, which splits the value into tokens and lowercases them), while a Keyword value is indexed as a single, unaltered term. This distinction determines how each field behaves at query time.
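To make the contrast concrete, here is a minimal Python sketch. It does not call Elasticsearch; `standard_analyze` is a hypothetical helper that only approximates the standard analyzer (split on non-alphanumeric characters, then lowercase), and `keyword_terms` is an equally hypothetical stand-in for a Keyword field.

```python
import re

def standard_analyze(value):
    """Rough approximation of the standard analyzer (an assumption, not the real tokenizer)."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]

def keyword_terms(value):
    """A Keyword field indexes the whole value as one unmodified term."""
    return [value]

sentence = "The quick brown fox jumps over the lazy dog"
text_terms = standard_analyze(sentence)
kw_terms = keyword_terms(sentence)

print("text    ->", text_terms)  # nine lowercased tokens
print("keyword ->", kw_terms)    # one term, the sentence itself
```

The Text side produces one lowercased token per word, while the Keyword side keeps the sentence, capital letter included, as a single term.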

How to Use Text and Keyword Fields

When you index a document without explicit mappings, Elasticsearch's default dynamic mapping maps each new string field as Text and adds a Keyword sub-field named keyword to it, so every string is indexed twice. To avoid that overhead, define the mapping for your specific use case before indexing begins. The requests below assume the text-vs-keyword index already exists. Here's a straightforward guide:

Keyword Mapping

curl --request PUT \
  --url https://localhost:9200/text-vs-keyword/_mapping \
  --header 'content-type: application/json' \
  --data '{
 "properties": {
  "keyword_field": {
   "type": "keyword"
  }
 }
}'        

Text Mapping

curl --request PUT \
  --url https://localhost:9200/text-vs-keyword/_mapping \
  --header 'content-type: application/json' \
  --data '{
 "properties": {
  "text_field": {
   "type": "text"
  }
 }
}'        

Multi Fields

curl --request PUT \
  --url https://localhost:9200/text-vs-keyword/_mapping \
  --header 'content-type: application/json' \
  --data '{
 "properties": {
  "text_and_keyword_mapping": {
   "type": "text",
   "fields": {
    "keyword_type": {
     "type":"keyword"
    }
   }
  }
 }
}'        
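With this multi-field mapping, the same value is indexed twice: analyzed under text_and_keyword_mapping and unaltered under text_and_keyword_mapping.keyword_type. The Python sketch below only illustrates that idea. `index_multi_field` and `standard_analyze` are hypothetical helpers, the sample value is made up, and the analyzer is a rough approximation of the standard analyzer.

```python
import re

def standard_analyze(value):
    """Rough approximation of the standard analyzer (an assumption)."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]

def index_multi_field(field, sub_field, value):
    """Return the terms indexed under each path of a text field with a keyword sub-field."""
    return {
        field: standard_analyze(value),   # analyzed tokens for full-text search
        f"{field}.{sub_field}": [value],  # exact value for filtering, sorting, aggregations
    }

indexed = index_multi_field("text_and_keyword_mapping", "keyword_type", "Quick Brown Fox")
for path, terms in indexed.items():
    print(path, "->", terms)
```

In queries, the sub-field is addressed by its full dotted path, so a Match Query can target the parent field while a Term Query targets text_and_keyword_mapping.keyword_type.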

How They Work: The Indexing Process

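Text and Keyword fields end up in the Inverted Index in different forms. For the sentence "The quick brown fox jumps over the lazy dog", a Text field with the default standard analyzer stores the lowercased tokens the, quick, brown, fox, jumps, over, lazy and dog, while a Keyword field stores the whole sentence as one term. This difference plays a significant role when querying Elasticsearch.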
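As a rough illustration of the two index shapes, the Python sketch below builds a toy inverted index (term to document ids) for two made-up documents. It is a simplified model, not how Lucene stores postings; `build_inverted_index` and `standard_analyze` are hypothetical names, and the analyzer only approximates the standard analyzer.

```python
import re
from collections import defaultdict

def standard_analyze(value):
    """Rough approximation of the standard analyzer (an assumption)."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]

def build_inverted_index(docs, analyze):
    """Map each term to the sorted list of ids of the documents that contain it."""
    postings = defaultdict(set)
    for doc_id, value in docs.items():
        for term in analyze(value):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

docs = {1: "The quick brown fox", 2: "The lazy dog"}  # made-up sample documents

text_index = build_inverted_index(docs, standard_analyze)   # Text field: analyzed
keyword_index = build_inverted_index(docs, lambda v: [v])   # Keyword field: as is

print(text_index)
print(keyword_index)
```

In the Text index both documents share the term the, and no term starts with a capital letter. In the Keyword index each document contributes exactly one term, its full original value.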

Navigating Queries: Text vs. Keyword Fields

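Elasticsearch offers two main query types for strings: Match Query and Term Query. A Match Query analyzes the search input with the target field's analyzer before looking it up, while a Term Query looks up the input exactly as given. The examples below assume one indexed document whose keyword_field and text_field both contain "The quick brown fox jumps over the lazy dog".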


  • Querying Keyword Field with Term Query:

curl --request POST \
  --url 'https://localhost:9200/text-vs-keyword/_search?size=0' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "term": {
   "keyword_field": "The quick brown fox jumps over the lazy dog"
  }
 }
}'        

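Matches the document, since neither the indexed value nor the query input is analyzed and the two are identical. With size=0 the response reports only the hit count, not the document itself.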


  • Querying Keyword Field with Match Query:

curl --request POST \
  --url https://localhost:9200/text-vs-keyword/_search \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match": {
   "keyword_field": "The quick brown fox jumps over the lazy dog"
  }
 }
}'        

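Produces a result: a Match Query analyzes its input at search time with the analyzer of the target field, and a Keyword field leaves the input unchanged, so the whole sentence is looked up as a single term and matches.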


  • Querying Text Field with Term Query:

curl --request POST \
  --url 'https://localhost:9200/text-vs-keyword/_search?pretty' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "term": {
   "text_field": "The"
  }
 }
}'        

Yields no result: a Term Query does not analyze its input, and the Inverted Index for the Text field holds the lowercased token "the", not "The".


  • Querying Text Field with Match Query:

curl --request POST \
  --url 'https://localhost:9200/text-vs-keyword/_search?pretty' \
  --header 'content-type: application/json' \
  --data '{
 "query": {
  "match": {
   "text_field": "The"
  }
 }
}'        

Produces a result: the query input is analyzed, and the resulting lowercased term "the" matches the token in the Inverted Index.
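The four cases above can be replayed against a toy model in Python. This is only a sketch under stated assumptions: `term_query`, `match_query`, `INDEX` and `ANALYZERS` are hypothetical names, the analyzer approximates the standard analyzer, and the match logic mirrors the default OR behavior of a Match Query in a very simplified way.

```python
import re

def standard_analyze(value):
    """Rough approximation of the standard analyzer (an assumption)."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]

SENTENCE = "The quick brown fox jumps over the lazy dog"

# Terms stored in the toy inverted index for the single sample document.
INDEX = {
    "keyword_field": {SENTENCE},                    # one unaltered term
    "text_field": set(standard_analyze(SENTENCE)),  # lowercased tokens
}

# Search-time analysis per field: a keyword field leaves the input untouched.
ANALYZERS = {
    "keyword_field": lambda query: [query],
    "text_field": standard_analyze,
}

def term_query(field, value):
    """Exact lookup: the query input is not analyzed."""
    return value in INDEX[field]

def match_query(field, value):
    """Analyze the input with the field's analyzer, then match on any resulting term."""
    return any(term in INDEX[field] for term in ANALYZERS[field](value))

results = {
    "term on keyword_field": term_query("keyword_field", SENTENCE),
    "match on keyword_field": match_query("keyword_field", SENTENCE),
    "term on text_field": term_query("text_field", "The"),
    "match on text_field": match_query("text_field", "The"),
}
for name, matched in results.items():
    print(f"{name}: {'hit' if matched else 'no hit'}")
```

Only the Term Query on the Text field misses, because it looks up "The" while the index holds "the".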
