DIFFERENCES IN SQL
NIKHIL G R
Serving Notice Period, Cloud Data Engineer at TCS, 2x Microsoft Azure Cloud Certified, Python, Pyspark, Azure Databricks, ADLs, Azure Synapse, Azure Data factory, MySQL, Lake House, Delta Lake, Data Enthusiast
WHERE vs HAVING
WHERE and HAVING clauses are both used in SQL to filter data.
WHERE
eg) Consider a table employees with following data.
eg) Suppose consider we want to retrieve employees who belong to IT department and have a salary greater than 55000.
HAVING
eg) Suppose we want to retrieve the deparments with an average salary greater than 50000.
UNION vs UNION ALL
Both are used to combine the results of two or more SELECT queries into a single result set.
Rules
UNION
eg) Consider a table employee and manager with following data.
If we want to combine two tables using UNION then,
The output looks something like this.
UNION ALL
If we want to combine two tables using UNION ALL then,
The output looks something like this.
TRUNCATE, DELETE and DROP
TRUNCATE, DELETE and DROP are used to remove data or database objects.
TRUNCATE
TRUNCATE TABLE employees ;
DELETE
DELETE FROM employees WHERE department = 'IT' ;
DROP
DROP TABLE employees ;
COUNT(*), COUNT(1), and COUNT(column_name)
COUNT(*) and COUNT(1) includes null values while counting but, COUNT(column_name) doesn't take null values into consideration during counting.
COUNT(*)
SELECT COUNT(*) AS total_rows FROM employees ;
COUNT(1)
SELECT COUNT(1) AS total_rows FROM employees ;
领英推荐
COUNT(column_name)
SELECT COUNT(employee_id) AS non_null_employee_ids
FROM employees;
In summary, COUNT(*) and COUNT(1) are used to count all rows in the result set, while COUNT(column_name) is used to count the non-null values.
IN, EXISTS, NOT IN and NOT EXISTS
IN and EXISTS are SQL operators that are used to filter and compare values in different ways. Similarly, NOT IN and NOT EXISTS are used for negating those comparisons.
Correlated and Non-Correlated Sub Queries
Non-Correlated Subquery
eg ) Consider we have employees table and we want to find all employees whose salary is greater than the average salary of all employees. We can use a non-correlated subquery for this:
SELECT EMP_ID, EMP_NAME, DEPARTMENT, SALARY
FROM employees
WHERE SALARY > (
SELECT AVG(SALARY)
FROM employees
);
The subquery (SELECT AVG(salary) FROM employees) is evaluated only once and provides the average salary value. The outer query then uses this value to filter the employees whose salary is greater than the average.
Correlated Subquery
eg ) Consider want to find all employees whose salary is greater than the average salary of their respective departments. We can use a correlated subquery for this:
SELECT EMP_ID, EMP_NAME, DEPARTMENT, SALARY
FROM employees e1
WHERE SALARY > (
SELECT AVG(SALARY)
FROM employees e2
WHERE e1.DEPARTMENT = e2.DEPARTMENT
);
The subquery (SELECT AVG(SALARY) FROM employees e2 WHERE e1.DEPARTMENT = e2.DEPARTMENT) is correlated to the outer query by the department column. For each row processed by the outer query (e1), the subquery is re-evaluated with the specific department value from the current row of the outer query.
PRIMARY KEY and UNIQIUE KEY
Both the PRIMARY key and the UNIQUE key constraint are used to enforce uniqueness in a table's columns.
PRIMARY KEY constraint
CREATE TABLE employees(
EMP_ID INT PRIMARY KEY,
EMP_NAME VARCHAR(30),
DEPARTMENT VARCHAR(50)
);
In this example, the EMP_ID column is designated as the PRIMARY key, ensuring that each employee's ID is unique and not NULL.
UNIQUE KEY constraint
CREATE TABLE employees(
EMP_ID INT,
EMP_NAME VARCHAR(30),
EMAIL VARCHAR(100) UNIQUE
);
In this example, the EMAIL column has a UNIQUE key constraint, ensuring that each email address is unique in the table, but it can allow NULL values.
Clustered and Non Clustered Index
Both clustered and non-clustered indexes are used to improve the performance of database queries by providing quick access to data.?
Clustered Index
CREATE CLUSTERED INDEX idx_emp_id ON employees(EMP_ID) ;
The data in the employees table will be physically sorted based on the values in the EMP_ID column. Any query that uses the EMP_ID column as a search criteria will benefit from faster data retrieval.
Non Clustered Index
CREATE NONCLUSTERED INDEX idx_salary ON employees(salary) ;
The index will contain a copy of the salary column along with pointers to the actual rows in the employees table. Any query that uses the salary column in a search or join operation will benefit from faster data retrieval.
VIEW and MATERIALIZED VIEW
Both views and materialized views are database objects used to provide a logical representation of data from one or more underlying tables.
VIEW
Example of creating a VIEW:
CREATE VIEW EMPLOYEE_DETAILS AS
SELECT EMP_ID, EMP_NAME, DEPARTMENT
FROM employees;
In this example, the EMPLOYEE_DETAILS view provides a simplified representation of the data in the employees table, containing only the EMP_ID, EMP_NAME and DEPARTMENT columns.
MATERIALIZED VIEW
Example of creating a MATERIALIZED VIEW:
CREATE MATERIALIZED VIEW MV_EMPLOYEE_DETAILS AS
SELECT EMP_ID, EMP_NAME, DEPARTMENT
FROM employees;
In this example, the MV_EMPLOYEE_DETAILS materialized view stores a physical copy of the EMP_ID, EMP_NAME and DEPARTMENT columns from the employees table. The data in the materialized view must be refreshed periodically to stay up-to-date.
Credits : GROKKING THE SQL INTERVIEW