?? Day 36: Exploring Practical SQL Implementation - Part 8??

A. Window/Analytic Functions in SQL:-

In SQL, Window/Analytic Functions bring a new dimension to data analysis by providing insights within a specific window of data. These functions operate over a range of rows related to the current row, allowing for advanced computations.

Here is a list of common Window/Analytic Functions in SQL:

1. RANK(): - Assigns a unique rank to each distinct row within the result set based on the ORDER BY clause. Rows with equal values receive the same rank, and the next rank is skipped.

2. DENSE_RANK():- Similar to RANK(), but it doesn't skip the next rank if there are equal values. It assigns consecutive ranks to rows with equal values.

3. ROW_NUMBER():- Assigns a unique number to each row within the result set. Unlike RANK() and DENSE_RANK(), it doesn't consider equal values.

4. LEAD():- Provides access to a row at a specified physical offset following that row within the result set. It's useful for comparing a row with subsequent rows.

5. LAG():- Similar to LEAD(), but provides access to a row at a specified physical offset preceding that row within the result set. It's useful for comparing a row with preceding rows.

6. FIRST_VALUE():- Returns the value of a specified expression for the first row within the window frame.

7. LAST_VALUE():- Returns the value of a specified expression for the last row within the window frame.

8. CUME_DIST():- Calculates the cumulative distribution of a value within a group of values.

9. PERCENT_RANK():- Calculates the relative rank of a value within a group of values.

10. NTILE():- Divides the result set into a specified number of roughly equal parts, assigning a bucket number to each row.

11. SUM():- Calculates the sum of a numeric column over a specified window frame.

12. AVG():- Calculates the average of a numeric column over a specified window frame.

13. MIN():- Returns the minimum value of a column over a specified window frame.

14. MAX():- Returns the maximum value of a column over a specified window frame.

15. COUNT():- Counts the number of rows in a window frame.

These functions are powerful tools for performing complex analytical operations within specific subsets of data in SQL queries.

Let's walk through examples for each of the Window/Analytic Functions:

1. RANK():

  • Query:

SELECT student_id, first_name, last_name, gender, fees_paid,
RANK() OVER (PARTITION BY gender ORDER BY fees_paid) AS rank_val
FROM student;        

  • Output:

| student_id | first_name | last_name | gender | fees_paid | rank_val |
|------------|------------|-----------|--------|-----------|----------|
| 1          | Jignesh    | Kumar     | Male   | 1000      | 1        |
| 2          | Kriti      | Kumari    | Female | 1200      | 1        |
| 3          | Siddharth  | Kumar     | Male   | 800       | 2        |
        

  • Use Case: Ranking students based on fees_paid within each gender group.

2. DENSE_RANK():

  • Query:

SELECT department_id, salary, DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary) AS dense_rank_val
FROM employees;        

  • Output:

| department_id | salary | dense_rank_val |
|---------------|--------|-----------------|
| 1             | 5000   | 1               |
| 1             | 6000   | 2               |
| 2             | 7000   | 1               |        

  • Use Case: Assigning dense ranks based on salary within each department

3. ROW_NUMBER():

  • Query:

SELECT department_id, salary, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary) AS row_number_val
FROM employees;        

  • Output:

| department_id | salary | row_number_val |
|---------------|--------|----------------|
| 1             | 5000   | 1              |
| 1             | 6000   | 2              |
| 2             | 7000   | 1              |        

  • Use Case: Assigning row numbers based on salary within each department.

4. LEAD():

  • Query:

SELECT employee_id, salary, LEAD(salary) OVER (ORDER BY salary) AS next_salary
FROM employees;        

  • Output:

| employee_id | salary | next_salary |
|-------------|--------|-------------|
| 1           | 5000   | 6000        |
| 2           | 6000   | 7000        |
| 3           | 7000   | NULL        |
        

  • Use Case: Comparing an employee's salary with the next employee's salary.

5. LAG():

  • Query:

SELECT employee_id, salary, LAG(salary) OVER (ORDER BY salary) AS prev_salary
FROM employees;        

  • Output:

| employee_id | salary | prev_salary |
|-------------|--------|-------------|
| 1           | 5000   | NULL        |
| 2           | 6000   | 5000        |
| 3           | 7000   | 6000        |        

  • Use Case: Comparing an employee's salary with the previous employee's salary.

6. FIRST_VALUE():

  • Query:

SELECT department_id, employee_id, salary,
FIRST_VALUE(employee_id) OVER (PARTITION BY department_id ORDER BY salary) AS first_emp_id
FROM employees;        

  • Output:

| department_id | employee_id | salary | first_emp_id |
|---------------|-------------|--------|--------------|
| 1             | 1           | 5000   | 1            |
| 1             | 2           | 6000   | 1            |
| 2             | 3           | 7000   | 3            |
        

  • Use Case: Finding the first employee_id within each department based on salary.

7. LAST_VALUE():

  • Query:

SELECT employee_id, salary, LAST_VALUE(salary) OVER (ORDER BY salary) AS last_salary
FROM employees;        

  • Output:

| employee_id | salary | last_salary |
|-------------|--------|-------------|
| 1           | 5000   | 7000        |
| 2           | 6000   | 7000        |
| 3           | 7000   | 7000        |        

  • Use Case: Fetching the last salary in the ordered list.

8. CUME_DIST():

  • Query:

SELECT employee_id, salary, CUME_DIST() OVER (ORDER BY salary) AS cumulative_dist
FROM employees;        

  • Output:

| employee_id | salary | cumulative_dist |
|-------------|--------|------------------|
| 1           | 5000   | 0.3333           |
| 2           | 6000   | 0.6667           |
| 3           | 7000   | 1.0              |        

  • Use Case: Calculating the cumulative distribution of salaries.

9. PERCENT_RANK():

  • Query:

SELECT employee_id, salary, PERCENT_RANK() OVER (ORDER BY salary) AS percent_rank_val
FROM employees;        

  • Output:

| employee_id | salary | percent_rank_val |
|-------------|--------|------------------|
| 1           | 5000   | 0.0              |
| 2           | 6000   | 0.5              |
| 3           | 7000   | 1.0              |        

  • Use Case: Calculating the relative rank of each salary.

10. NTILE():

  • Query:

SELECT employee_id, salary, NTILE(2) OVER (ORDER BY salary) AS salary_bucket
FROM employees;        

  • Output:

| employee_id | salary | salary_bucket |
|-------------|--------|---------------|
| 1           | 5000   | 1             |
| 2           | 6000   | 1             |
| 3           | 7000   | 2             |        

  • Use Case: Dividing employees into two salary buckets.

11. SUM():

  • Query:

SELECT department_id, salary, SUM(salary) OVER (PARTITION BY department_id) AS total_salary
FROM employees;        

  • Output:

| department_id | salary | total_salary |
|---------------|--------|--------------|
| 1             | 5000   | 11000        |
| 1             | 6000   | 11000        |
| 2             | 7000   | 7000         |        

  • Use Case: Calculating the total salary within each department.

12. AVG():

  • Query:

SELECT department_id, salary, AVG(salary) OVER (PARTITION BY department_id) AS avg_salary
FROM employees;        

  • Output:

| department_id | salary | avg_salary |
|---------------|--------|------------|
| 1             | 5000   | 5500       |
| 1             | 6000   | 5500       |
| 2             | 7000   | 7000       |        

13. MIN():

  • Query:

SELECT department_id, salary, MIN(salary) OVER (PARTITION BY department_id) AS min_salary
FROM employees;        

  • Output:

| department_id | salary | min_salary |
|---------------|--------|------------|
| 1             | 5000   | 5000       |
| 1             | 6000   | 5000       |
| 2             | 7000   | 7000       |        

  • Use Case: Finding the minimum salary within each department.

14. MAX():

  • Query:

SELECT department_id, salary, MAX(salary) OVER (PARTITION BY department_id) AS max_salary
FROM employees;        

  • Output:

| department_id | salary | max_salary |
|---------------|--------|------------|
| 1             | 5000   | 6000       |
| 1             | 6000   | 6000       |
| 2             | 7000   | 7000       |        

  • Use Case: Finding the maximum salary within each department.

15. COUNT():

  • Query:

SELECT department_id, employee_id, COUNT(employee_id) OVER (PARTITION BY department_id) AS employee_count
FROM employees;        

  • Output:

| department_id | employee_id | employee_count |
|---------------|-------------|----------------|
| 1             | 1           | 2              |
| 1             | 2           | 2              |
| 2             | 3           | 1              |        

  • Use Case: Counting the number of employees within each department.

B. Subqueries in SQL:

Subqueries, or nested queries, are SQL queries embedded within another query. They serve various purposes, such as filtering results, performing calculations, or acting as a data source for the main query.

1. Single Row Subquery Example:

  • Query:

SELECT id, last_name, salary
FROM employee
WHERE salary = (
  SELECT MAX(salary)
  FROM employee
);        

  • Output:

| id  | last_name | salary |
|-----|-----------|--------|
| 101 | Samir     | 8000   |        

  • Use Case-Scenario: Identifying the Highest Earner
  • Explanation: The subquery finds the maximum salary in the entire employee table. The main query then retrieves details of employees whose salary matches this maximum value.

2. Multi-Row Subquery Example:

  • Query:

SELECT id, last_name, salary
FROM employee
WHERE salary IN (
  SELECT salary
  FROM employee
  WHERE last_name LIKE 'C%'
);        

  • Output:

| id  | last_name | salary |
|-----|-----------|--------|
| 103 | Mahesh    | 6500   |
| 105 | Kuldeep   | 6700   |        

  • Use Case-Scenario: Salary Analysis for 'C'-Named Employees
  • Explanation: The subquery filters employees whose last names start with 'C.' The main query then fetches details of employees whose salaries are present in this filtered subset.



  • Use Case: Assigning dense ranks based on salary within each department.

要查看或添加评论,请登录

JIGNESH KUMAR的更多文章

社区洞察

其他会员也浏览了