SQL’s Order of Execution
Lasha Dolenjashvili
Data Solutions Architect @ Bank of Georgia | IIBA? Certified Business Analyst | Open to Freelance, Remote, or Relocation Opportunities
In the world of data, SQL (Structured Query Language) is a widely used tool. It’s a language that is crucial for extracting data, so it’s important to understand how it works if you want to work with data. One important concept to master in SQL is its Order of Execution, which determines how SQL processes and executes queries. Let’s explore and demystify this concept together using simpler terms.
SQL commands might appear linear, but they don’t run from top to bottom. Instead, they follow a specific order of execution that could seem counter-intuitive at first. Let’s explore this.
Here is the standard order of execution:
-- SQL: Order of Execution @lasha-dolenjashvili
--------------------------
1. FROM & JOINS
2. WHERE
3. GROUP BY
4. Aggregation functions (SUM(), COUNT(), etc.)
5. HAVING
6. SELECT
7. DISTINCT
8. ORDER BY
This order is crucial to understanding how SQL queries work.
Deep Dive
Let’s break down each step with a real-world example. Imagine we own a bookstore and we have a database of our book sales.
Case Study
-- Table: employees
+-----+--------+--------+
| id | name | dept |
+-----+--------+--------+
| 1 | Alice | Sales |
| 2 | Bob | HR |
| 3 | Charlie| Sales |
| 4 | David | IT |
| 5 | Eve | IT |
+-----+--------+--------+
-- Table: sales
+----------+------------+
| employee_id | amount |
+-------------+---------+
| 1 | 100 |
| 1 | 200 |
| 3 | 300 |
| 3 | 400 |
| 5 | 250 |
+-------------+---------+
Let's analyze the following query.
SELECT name
, SUM(amount) AS total_sales
FROM employees
INNER JOIN sales
ON employees.id = sales.employee_id
WHERE dept = 'Sales'
GROUP BY name
HAVING total_sales > 250
ORDER BY total_sales DESC;
领英推荐
Order of Execution
-- Intermediate Result
+---------+--------+-------------+--------+
| name | dept | employee_id | amount |
+---------+--------+-------------+--------+
| Alice | Sales | 1 | 100 |
| Alice | Sales | 1 | 200 |
| Charlie | Sales | 3 | 300 |
| Charlie | Sales | 3 | 400 |
| Eve | IT | 5 | 250 |
+---------+--------+-------------+--------+
-- Intermediate Result
+---------+--------+-------------+--------+
| name | dept | employee_id | amount |
+---------+--------+-------------+--------+
| Alice | Sales | 1 | 100 |
| Alice | Sales | 1 | 200 |
| Charlie | Sales | 3 | 300 |
| Charlie | Sales | 3 | 400 |
+---------+--------+-------------+--------+
-- Intermediate Result
+---------+--------------+
| name | total_sales |
+---------+--------------+
| Alice | 300 |
| Charlie | 700 |
+---------+--------------+
Intermediate Result (same in this case, since both have sales > 250)
+---------+-------------+
| name | total_sales |
+---------+-------------+
| Alice | 300 |
| Charlie | 700 |
+---------+-------------+
-- Final Result
+---------+--------------+
| name | total_sales |
+---------+--------------+
| Charlie | 700 |
| Alice | 300 |
+---------+--------------+
Remember, SQL’s order of execution isn’t just trivia — it’s fundamental to writing effective, correct queries.
Data Scientist
1 年Thanks for sharing Lasha Dolenjashvili. I don’t get to use SQL very often but I found it extremely useful while working with PySpark.