ElasticSearch Query: SQL Query

ElasticSearch(ES) is a search engine based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License.

Just as in my last article, where I gave an equivalent of SQL query in Solr: https://www.dhirubhai.net/pulse/solr-query-sql-amit-chandak, I will do the same for ES. ES provides a lot of flexibility and has an equivalent for complex SQL; we will try to explore some of those here.

It is assumed that you already have installed Elasticsearch and can run JSON Query using Kibana Dev tools or any other methods(Curl,etc..).

Let us start with a table "transactions" with columns-price (number), color (varchar), make(varchar), sold(date).

Insert the sample Elastic data into "transactions" folder. The command to search data is: GET /transactions/_search

1) Select All

SQL Query: Select * from transactions

ES Query: {  "query": {"match_all": {} } }

2) Select a filter. Color: red. There are simpler versions than this but to keep the queries consistent, I have used the following format. Must is alternate of "and" in SQL and will play a role when we have more than one filter.

SQL Query : Select * from transactions where color = 'red'

ES Query: { "query":{ "constant_score" : {"filter":{ "bool":{"must" :[{ "term": { "color": "red" } } ] }} } }}

3) Select using a filter, with more than one value. Color: red, green.

SQL Query : Select * from transactions where color in ('red','green') and

ES Query: { "query":{ "constant_score" : {"filter":{ "bool":{"must" :[{ "terms": { "color": ["red","green"] } },{ "terms": { "make": ["ford","honda"] } } ] }} } }}

4) Multiple filters Boolean AND: "And" in SQL and "Must" in ES

SQL Query: Select * from transactions where color in ('red','green') and make in ('ford','honda')

ES Query: { "query":{ "constant_score" : {"filter":{ "bool":{"must" :[{ "terms": { "color": ["red","green"] } },{ "terms": { "make": ["ford","honda"] } } ] }} } }}

5) Two filters Boolean OR. "OR" in SQL , "Should" in ES

SQL Query: Select * from transactions where (color in ('red','green') or make in ('ford','honda'))

ES Query: { "query":{ "constant_score" : {"filter":{ "bool":{"should" :[{ "terms": { "color": ["red","green"] } },{ "terms": { "make": ["ford","honda"] } } ] }} } }}

6) Date filter

SQL Query:Select * from transactions where (color in ('red','green') and make in ('ford','honda')) and sold between '2014-01-01' and '2014-12-31'

ES Query: { "query":{ "constant_score" : {"filter":{ "bool":{"must" :[{ "terms": { "color": ["red","green"] } },{ "terms": { "make": ["ford","honda"] } } ] ,"filter" :[{ "range" : { "sold" : { "gte" : "2014-01-01", "lte" : "2014-12-31" } } } ] }} } }}

7) Data Aggregation. size:0 has been used to get only aggreated data.

SQL Query: Select sum(price) single_sum_price":, avg(price) single_avg_price from transactions

ES Query: { "size" : 0,   "aggs" : {"single_avg_price": { "avg" : { "field" : "price" }}, "single_sum_price": {"sum" : { "field" : "price" }}}}

8) Group By

SQL Query: Select color, sum(price) single_sum_price , avg(price) single_avg_price from transactions group by color

ES Query:{ "size": 0,"aggs": {"group_by_color": {"terms": {"field": "color.keyword"},"aggs": {"average_balance": {"avg": {"field": "price"}},"sum_balance": {"sum": {"field": "price"}}}}}}

8) More than one Group By

SQL Query: Select make,color sum(price) single_sum_price , avg(price) single_avg_price from transactions group by make,color

ES Query:{ "size": 0, "aggs": { "group_by_make": {"terms": { "field": "make.keyword" }, "aggs": {"group_by_color": { "terms": {"field": "color.keyword"},"aggs": {"average_price": {"avg": {"field": "price"}}}}}}}}

9) A Complex Query: Here we can use formulas like a/b and sum(a)/sum(b). I am using one measure and a constant in place of the other - hence instead of a/b I shall be using a/2. Using different aggegations I am doing sum(A)/count(A). I am also using period label: MTD. We can take use such labels in muti-period query.

SQL Query:Select color,make sum(price) sprice , count(price) cprice, avg(price/2) avgvalue1, sum(price)/count(price) avgvalue2 from transactions group by color,make

ES Query : {"size": 0 , "aggs": { "filter1" : {"filter":{ "bool":{"must" :[{ "terms": { "color": ["red","green"] } },{ "terms": { "make": ["ford","honda"] } } ] }} , "aggs": { "group_by_make": {"terms": { "field": "make.keyword" },"aggs": { "group_by_color": {"terms": { "field": "color.keyword" } , "aggs": {"MTD" : {"filter" :{ "range" : { "sold" : { "gte" : "2014-01-01", "lte" : "2014-12-31" } } }  , "aggs" : {"cprice" : { "value_count" :{ "field" : "price"} },"sprice" : { "sum" :{ "field" : "price"} },"avgvalue1": {"avg": { "script" : {"inline": "doc['price'].value !=0 ? doc['price'].value/2 : 0" }}}}},"avgvalue2": {"bucket_script": { "buckets_path": {"formula_1_1" : "MTD.sprice","formula_2_1" : "MTD.cprice"},"script" : "params.formula_1_1 != 0 ? params.formula_2_1/params.formula_2_1 :0" }}} }} }}}} }

Caveat - I am not an elasticsearch expert. Elastic has provided a comprehensive document on features. You can get more details from their site. If you would like to download the sample data for these queries, please write a comment.

Dileep kumar

Lead Associate at Unilog Content Solutions

7 å¹´

Hey Amit. Super. We are working on Solr Indexing as well. Its very nice... But, Indexing is a challenging thing here..

赞
回复

要查看或添加评论,请登录

Amit Chandak的更多文章

  • Learn Power BI 2025- 20 Hours Video

    Learn Power BI 2025- 20 Hours Video

    This blog provides links to all chapters and Links mentioned in the video On January 7th, 2023, I released my first…

    4 条评论
  • Microsoft Fabric: Unified Integrated Analytics

    Microsoft Fabric: Unified Integrated Analytics

    Microsoft Fabric: New Age Analytics| Lakehouse| Warehouse | Dataflow Gen2 | Power BI | Data Pipeline | Spark | Notebook…

    2 条评论
  • Learn Power BI - Beginner to Expert

    Learn Power BI - Beginner to Expert

    What is Power BI Power BI is a business analytics service provided by Microsoft that allows users to connect, analyze…

    5 条评论
  • Power BI Formatted P&L with Custom Sub Totals and Blank Rows

    Power BI Formatted P&L with Custom Sub Totals and Blank Rows

    The profit and loss statement is an important financial report that illustrates a company’s revenue, expenses, and…

    2 条评论
  • Power BI- Business Day with and Without using DAX NETWORKDAYS

    Power BI- Business Day with and Without using DAX NETWORKDAYS

    Power BI in July 2022 has released a new DAX function Networkdays. I would like to explore the same here.

    2 条评论
  • Five recent Power BI functions you should use more often

    Five recent Power BI functions you should use more often

    Power BI comes up with a monthly release. In these releases, quite a few features are released every month.

  • Solr Query: SQL Query

    Solr Query: SQL Query

    Apache Solr is an open source search platform. It can also be used for analysis of structured data.

社区洞察

其他会员也浏览了