SQL’s Order of Execution

Lasha Dolenjashvili

Data Solutions Architect @ Bank of Georgia | IIBA® Certified Business Analyst | Open to Freelance, Remote, or Relocation Opportunities

Published: October 2, 2023

SQL (Structured Query Language) is one of the most widely used tools for working with data. It is the standard language for extracting data from relational databases, so understanding how it works is essential for anyone who works with data. One concept worth mastering is SQL’s Order of Execution: the logical order in which the clauses of a query are processed. Let’s demystify this concept together in simple terms.

SQL queries read from top to bottom, but their clauses are not evaluated in the order they are written. Instead, they follow a specific logical order that can seem counter-intuitive at first.

Here is the standard order of execution:

-- SQL: Order of Execution                            @lasha-dolenjashvili
--------------------------
1. FROM & JOINS
2. WHERE
3. GROUP BY
4. Aggregation functions (SUM(), COUNT(), etc.)
5. HAVING
6. SELECT
7. DISTINCT
8. ORDER BY

This order is crucial to understanding how SQL queries work.

Deep Dive

Let’s break down each step with a real-world example. Imagine we own a bookstore and we have a database of our book sales.

FROM / JOIN: Selects the table(s) the data comes from and combines them. Here we might select our “sales” table.
WHERE: Filters out rows that don’t meet certain criteria. If we’re only interested in sales of science fiction books, our WHERE clause filters out all other genres.
GROUP BY: Groups the remaining rows by the values of certain columns. If we want to know total sales by author, we group by the “author” column.
HAVING: Filters out groups that don’t meet certain criteria. If we’re only interested in authors who’ve sold more than 100 books, the HAVING clause handles this.
SELECT: Specifies the columns to include in the result. Despite being written first, it’s executed after the clauses above. That’s why standard SQL doesn’t let us use aliases created in SELECT in the earlier clauses (some databases, such as MySQL and SQLite, relax this rule for GROUP BY and HAVING).
DISTINCT: Removes duplicate rows from the result set.
ORDER BY: Sorts the result set by one or more columns. For instance, we could sort our authors by total sales, in either ascending or descending order.
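
To make the bookstore example concrete, here is a minimal sketch that runs such a query against an in-memory SQLite database from Python. The table layout (author, genre, quantity) and the sample rows are assumptions made up for illustration; the article does not define the actual schema. The comments map each clause to its step in the order of execution listed earlier.

```python
import sqlite3

# Hypothetical bookstore data: the columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (author TEXT, genre TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Author A", "Science Fiction", 80),
        ("Author A", "Science Fiction", 40),
        ("Author B", "Science Fiction", 60),
        ("Author C", "Romance", 500),
        ("Author D", "Science Fiction", 150),
    ],
)

query = """
SELECT author, SUM(quantity) AS total_sold  -- step 6 (the SUM itself is step 4)
  FROM sales                                -- step 1: source table
 WHERE genre = 'Science Fiction'            -- step 2: drop other genres
 GROUP BY author                            -- step 3: one group per author
HAVING SUM(quantity) > 100                  -- step 5: drop small groups
 ORDER BY total_sold DESC                   -- step 8: sort the final rows
"""

bookstore_rows = conn.execute(query).fetchall()
print(bookstore_rows)  # [('Author D', 150), ('Author A', 120)]
```

Author C sells the most books overall but never reaches the grouping step, because WHERE removes the romance rows first. Author B survives WHERE but is dropped by HAVING, since 60 is not more than 100.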

Case Study

-- Table: employees
+-----+--------+--------+
| id  | name   | dept   |
+-----+--------+--------+
| 1   | Alice  | Sales  |
| 2   | Bob    | HR     |
| 3   | Charlie| Sales  |
| 4   | David  | IT     |
| 5   | Eve    | IT     |
+-----+--------+--------+

-- Table: sales
+-------------+---------+
| employee_id | amount  |
+-------------+---------+
|     1       |   100   |
|     1       |   200   |
|     3       |   300   |
|     3       |   400   |
|     5       |   250   |
+-------------+---------+

Let's analyze the following query.

SELECT name
     , SUM(amount) AS total_sales
  FROM employees
       INNER JOIN sales 
          ON employees.id = sales.employee_id
 WHERE dept = 'Sales'
 GROUP BY name
HAVING SUM(amount) > 250
 ORDER BY total_sales DESC;

Order of Execution

FROM + JOIN: The query first identifies the tables and joins them.

-- Intermediate Result
+---------+--------+-------------+--------+
| name    | dept   | employee_id | amount |
+---------+--------+-------------+--------+
| Alice   | Sales  |     1       |  100   |
| Alice   | Sales  |     1       |  200   |
| Charlie | Sales  |     3       |  300   |
| Charlie | Sales  |     3       |  400   |
| Eve     | IT     |     5       |  250   |
+---------+--------+-------------+--------+
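
The join step can be checked in isolation. The sketch below loads the two case-study tables into an in-memory SQLite database (SQLite is just an assumption here; any relational database should behave the same way for this query) and runs only the FROM and JOIN part. The extra ORDER BY is there only to make the printed row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
CREATE TABLE sales (employee_id INTEGER, amount INTEGER);
INSERT INTO employees VALUES
  (1, 'Alice', 'Sales'), (2, 'Bob', 'HR'), (3, 'Charlie', 'Sales'),
  (4, 'David', 'IT'), (5, 'Eve', 'IT');
INSERT INTO sales VALUES (1, 100), (1, 200), (3, 300), (3, 400), (5, 250);
""")

# Only the FROM + JOIN step: no filtering, grouping, or aggregation yet.
joined = conn.execute("""
SELECT name, dept, employee_id, amount
  FROM employees
       INNER JOIN sales
          ON employees.id = sales.employee_id
 ORDER BY employee_id, amount
""").fetchall()

for row in joined:
    print(row)

# Employees without any sales rows do not appear in an INNER JOIN result.
names_in_join = {row[0] for row in joined}
print(sorted(names_in_join))  # ['Alice', 'Charlie', 'Eve']
```

Bob and David have no rows in the sales table, so the INNER JOIN leaves them out before WHERE is even evaluated.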

WHERE: Filters out the rows not in the 'Sales' department.

-- Intermediate Result
+---------+--------+-------------+--------+
| name    | dept   | employee_id | amount |
+---------+--------+-------------+--------+
| Alice   | Sales  |      1      |  100   |
| Alice   | Sales  |      1      |  200   |
| Charlie | Sales  |      3      |  300   |
| Charlie | Sales  |      3      |  400   |
+---------+--------+-------------+--------+

GROUP BY + Aggregations: Groups by the name column and calculates the total sales for each name.

-- Intermediate Result
+---------+--------------+
| name    | total_sales  |
+---------+--------------+
| Alice   |    300       |
| Charlie |    700       |
+---------+--------------+

HAVING: Filters out groups whose total_sales is less than or equal to 250.

-- Intermediate Result (same in this case, since both have sales > 250)
+---------+-------------+
| name    | total_sales |
+---------+-------------+
| Alice   |    300      |
| Charlie |    700      |
+---------+-------------+
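
A quick way to see why the condition on the total belongs in HAVING rather than WHERE is to try both. The sketch below uses SQLite from Python (an assumption; other databases report the problem with different error messages). Because WHERE runs before grouping, the aggregate does not exist yet at that point and the query is rejected. The second threshold of 500 is an extra value added here only to show HAVING actually removing a group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
CREATE TABLE sales (employee_id INTEGER, amount INTEGER);
INSERT INTO employees VALUES
  (1, 'Alice', 'Sales'), (2, 'Bob', 'HR'), (3, 'Charlie', 'Sales'),
  (4, 'David', 'IT'), (5, 'Eve', 'IT');
INSERT INTO sales VALUES (1, 100), (1, 200), (3, 300), (3, 400), (5, 250);
""")

# Attempt 1: aggregate condition in WHERE. WHERE is evaluated per row,
# before any groups exist, so the database refuses the query.
try:
    conn.execute("""
        SELECT name, SUM(amount) AS total_sales
          FROM employees
               INNER JOIN sales ON employees.id = sales.employee_id
         WHERE dept = 'Sales' AND SUM(amount) > 250
         GROUP BY name
    """)
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)
print("WHERE with aggregate failed:", where_error)

# Attempt 2: the same condition in HAVING, which runs after aggregation.
having_query = """
    SELECT name, SUM(amount) AS total_sales
      FROM employees
           INNER JOIN sales ON employees.id = sales.employee_id
     WHERE dept = 'Sales'
     GROUP BY name
    HAVING SUM(amount) > ?
     ORDER BY name
"""
having_250 = conn.execute(having_query, (250,)).fetchall()
having_500 = conn.execute(having_query, (500,)).fetchall()
print(having_250)  # [('Alice', 300), ('Charlie', 700)]
print(having_500)  # [('Charlie', 700)]
```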

SELECT: This is where the output columns are actually chosen, here name and the aggregated total_sales. The earlier steps had access to every column of the tables named in the FROM and JOIN clauses, not just the ones listed in SELECT.
DISTINCT: No DISTINCT keyword in this query, so no action here.
ORDER BY: Sorts the results by total_sales in descending order.

-- Final Result
+---------+--------------+
| name    | total_sales  |
+---------+--------------+
| Charlie |    700       |
| Alice   |    300       |
+---------+--------------+
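
One way to internalize the walk-through is to rewrite the query so that its written order matches its logical order. The sketch below does this with a chain of common table expressions, one per step, and checks that it returns the same rows as the original query. The CTE names (joined, filtered, grouped, kept) are hypothetical labels chosen for illustration, and SQLite is used only because it ships with Python. Note that this mirrors the logical order; a real query optimizer is free to execute the work differently as long as the result is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
CREATE TABLE sales (employee_id INTEGER, amount INTEGER);
INSERT INTO employees VALUES
  (1, 'Alice', 'Sales'), (2, 'Bob', 'HR'), (3, 'Charlie', 'Sales'),
  (4, 'David', 'IT'), (5, 'Eve', 'IT');
INSERT INTO sales VALUES (1, 100), (1, 200), (3, 300), (3, 400), (5, 250);
""")

# The case-study query, with the HAVING condition written on the aggregate.
original = """
SELECT name, SUM(amount) AS total_sales
  FROM employees
       INNER JOIN sales ON employees.id = sales.employee_id
 WHERE dept = 'Sales'
 GROUP BY name
HAVING SUM(amount) > 250
 ORDER BY total_sales DESC
"""

# The same logic, written in the order it is logically evaluated.
staged = """
WITH joined AS (              -- step 1: FROM + JOIN
    SELECT e.name, e.dept, s.amount
      FROM employees AS e
           INNER JOIN sales AS s ON e.id = s.employee_id
),
filtered AS (                 -- step 2: WHERE
    SELECT * FROM joined WHERE dept = 'Sales'
),
grouped AS (                  -- steps 3-4: GROUP BY + aggregation
    SELECT name, SUM(amount) AS total_sales
      FROM filtered
     GROUP BY name
),
kept AS (                     -- step 5: HAVING, now an ordinary row filter
    SELECT * FROM grouped WHERE total_sales > 250
)
SELECT name, total_sales      -- step 6: SELECT
  FROM kept
 ORDER BY total_sales DESC    -- step 8: ORDER BY
"""

original_rows = conn.execute(original).fetchall()
staged_rows = conn.execute(staged).fetchall()
print(staged_rows)  # [('Charlie', 700), ('Alice', 300)]
print(staged_rows == original_rows)  # True
```

In the staged version the HAVING condition becomes a plain WHERE on the grouped CTE, and it can refer to total_sales by name, because by then the alias already exists as a column of an earlier step.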

Remember, SQL’s order of execution isn’t just trivia — it’s fundamental to writing effective, correct queries.
