SQL Interview Questions and Answers for Beginners and Experienced Professionals
Introduction
Did you know that SQL (Structured Query Language) was developed in the 1970s at IBM by a team led by Donald D. Chamberlin and Raymond F. Boyce? Originally known as SEQUEL (Structured English Query Language), SQL was created to manage the vast amounts of data generated by business operations and financial transactions.
Today, SQL is the standard language for managing and manipulating data stored in relational databases. Regardless of your role in operations, data analysis, data engineering, or data science, a basic understanding of SQL is essential. SQL is a powerful tool for querying, manipulating, and transforming data, and it’s used by organizations of all sizes and industries.
In this article, we’ll provide an overview of the essential SQL concepts and syntax, including SELECT statements, joins, functions, constraints, and transactions. We’ll also share 89 real SQL interview questions and provide step-by-step answers to help you prepare for your next SQL interview.
Whether you’re new to SQL or an experienced professional, this article will help you enhance your SQL skills and prepare for a successful interview. So, let’s dive into the basics of SQL and learn about the must-know concepts and techniques that every SQL developer should master.
In the following sections, we’ll cover the basics of SQL, including its syntax, how to use SELECT statements to retrieve data, how to use joins to combine data from multiple tables, and how to use functions to perform calculations and transform data. We’ll also cover constraints and transactions, two essential concepts in database management that help ensure the integrity and consistency of your data.
By the end of this article, you’ll have a solid understanding of SQL and be better prepared to ace your next SQL interview. So, let’s get started!
Section 1: Basic SQL Concepts and Syntax
In this section, we’ll cover the basic concepts and syntax of SQL. This is essential knowledge for anyone who is preparing for an SQL interview.
1.1 What is SQL?
Structured Query Language, or SQL, is a programming language used to manage and manipulate relational databases. SQL is used to create, modify, and retrieve data from databases. It is widely used in various industries, such as finance, healthcare, e-commerce, and more.
1.2 SQL Syntax
SQL syntax consists of a set of rules that define how to write SQL statements. SQL statements are used to communicate with the database and perform various operations. Here are some of the most common SQL keywords:
1.3 Basic SQL Statements
In SQL, the SELECT statement is used to retrieve data from one or more tables in a database. It’s one of the most important statements in SQL and is used in almost every query you’ll write. In this section, we’ll cover some common SQL interview questions related to SELECT statements.
1. What is a SELECT statement?
A SELECT statement is used to retrieve data from one or more tables in a database. It’s the most commonly used SQL statement and is used to retrieve specific data based on one or more conditions.
2. How do you write a basic SELECT statement?
A basic SELECT statement has the following syntax:
1.3.1 SELECT Statement
The SELECT statement is used to retrieve data from a table. The basic syntax of a SELECT statement is as follows:
SELECT column1, column2, ... FROM table_name;
Here’s an example of a SELECT statement that retrieves all columns from the “customers” table:
SELECT column1, column2, ... FROM table_name ORDER BY column_name ASC/DESC;
1.3.2 WHERE Clause
The WHERE clause is used to filter data based on certain conditions. The basic syntax of a WHERE clause is as follows:
SELECT column1, column2, ... FROM table_name WHERE condition;
Here’s an example of a SELECT statement that retrieves all columns from the “customers” table where the “country” column is equal to “USA”:
SELECT * FROM customers WHERE country = 'USA';
1.3.3 ORDER BY Clause
The ORDER BY clause is used to sort the retrieved data in ascending or descending order. The basic syntax of an ORDER BY clause is as follows:
SELECT column1, column2, ... FROM table_name ORDER BY column_name ASC/DESC;
Here’s an example of a SELECT statement that retrieves all columns from the “customers” table and sorts the data by the “customer_id” column in descending order:
SELECT * FROM customers ORDER BY customer_id DESC;
1.4 SQL Operators
1.4.1 Comparison and Logical Operators
Comparison operators are used to compare two values. Here are some of the most common comparison operators:
There’s a lot to learn here, don’t forget to check out out article covering the most common SQL mistakes so you can be prepared to avoid them!
As for SQL operates, an example, we can use the AND operator to combine conditions in a WHERE clause:
In this example, we’re selecting all columns from the customers table where the age column is greater than or equal to 18 and the state column is equal to ‘CA’.
SELECT *
FROM customers
WHERE age >= 18 AND state = 'CA';
Additionally, it’s important to understand the difference between NULL and 0 in SQL. NULL is a special value that represents the absence of data, while 0 is a specific value that represents the number zero. It’s important to handle NULL values properly in SQL to avoid unexpected results in your queries.
Finally, SQL also supports the use of aliases to rename tables and columns in your queries. Aliases can make your queries more readable and can also be useful when working with complex queries that involve multiple tables or columns with long names. Here’s an example of using aliases in a SELECT statement:Another important concept in SQL is the use of operators. Operators allow us to compare values and combine conditions. Here are some commonly used operators in SQL:
Here’s an example of using aliases in a SELECT statement:
SELECT c.customer_id, c.first_name || ' ' || c.last_name AS full_name, o.order_id
FROM customers AS c
JOIN orders AS o ON c.customer_id = o.customer_id
WHERE c.state = 'CA';
In this example, we’re selecting the customer_id and a concatenated full_name column (created using the || operator) from the customers table and the order_id column from the orders table. We’re also using aliases (c for customers and o for orders) to simplify the table names in the query.
Lastly, it’s important to understand the concept of data types in SQL. Each column in a SQL table has a specific data type that defines what kind of data can be stored in that column. Common data types include VARCHAR (variable-length character strings), INTEGER (whole numbers), DECIMAL (fixed-point numbers), and DATE (date values). Understanding data types is important for designing efficient database schemas and writing correct queries that handle data properly.
By understanding these additional concepts in Section 1, you’ll have a more well-rounded foundation for working with SQL.
Section 2: SQL Joins
In Section 1, we learned how to retrieve data from a single table using the SELECT statement. But what if we need to combine data from multiple tables? That’s where SQL joins come in.
A join allows us to combine rows from two or more tables based on a common column or set of columns. There are several types of joins in SQL, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. Let’s explore each of these in more detail.
2.1 INNER JOIN
An INNER JOIN returns only the rows that have matching values in both tables being joined. Here’s an example:
In this example, we’re selecting the customer_id and order_id columns from the customers and orders tables, respectively. The INNER JOIN keyword tells SQL to match up rows from both tables where the customer_id column values match.
SELECT customers.customer_id, orders.order_id
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id;
2.2 LEFT JOIN
A LEFT JOIN returns all the rows from the left table and the matching rows from the right table. If there’s no match in the right table, the result will contain NULL values for the columns in that table. Here’s an example:
In this example, we’re selecting the customer_id and order_id columns from the customers and orders tables, respectively. The LEFT JOIN keyword tells SQL to include all rows from the customers table, along with any matching rows from the orders table. If there’s no match in the orders table, the result will contain a NULL value for the order_id column.
SELECT customers.customer_id, orders.order_id
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id;
2.3 RIGHT JOIN
A RIGHT JOIN returns all the rows from the right table and the matching rows from the left table. If there’s no match in the left table, the result will contain NULL values for the columns in that table. Here’s an example:
In this example, we’re selecting the customer_id and order_id columns from the customers and orders tables, respectively. The RIGHT JOIN keyword tells SQL to include all rows from the orders table, along with any matching rows from the customers table. If there’s no match in the customers table, the result will contain a NULL value for the customer_id column.
SELECT customers.customer_id, orders.order_id
FROM customers
RIGHT JOIN orders ON customers.customer_id = orders.customer_id;
2.4 FULL OUTER JOIN
A FULL OUTER JOIN returns all the rows from both tables, along with NULL values for any columns that don’t have a match in the other table. Here’s an example:
In this example, we’re selecting the customer_id and order_id columns from the customers and orders tables, respectively. The FULL OUTER JOIN keyword tells SQL to include all rows from both tables, regardless of whether there’s a match or not. If there’s no match in either table, the result will contain NULL values for the corresponding columns.
SELECT customers.customer_id, orders.order_id
FROM customers
FULL OUTER JOIN orders ON customers.customer_id = orders.customer_id;
2.5 SELF-JOINS
A self-join is a join in which a table is joined with itself. This can be useful for querying hierarchical or recursive data structures. Here’s an example:
Suppose we have a table called “employees” with columns for employee ID, name, and manager ID. The manager ID refers to the ID of the employee’s manager in the same table. We want to find the names of all employees and their managers.
To do this, we can use a self-join. Here’s the SQL code:
SELECT e.name AS employee_name, m.name AS manager_name
FROM employees e
JOIN employees m ON e.manager_id = m.id;
In this query, we’re joining the “employees” table to itself. We give the table two aliases — “e” for the employees being joined and “m” for the employees’ managers. We join the tables on the “manager_id” column in the “e” table and the “id” column in the “m” table.
The SELECT statement then selects the “name” column from both aliases, giving us the names of both the employees and their managers.
Step-by-step explanation:
2.6 NATURAL JOIN
A NATURAL JOIN automatically joins two tables based on their column names. For example:
This query joins the “employees” and “departments” tables based on their shared “department_id” column. The resulting table includes all columns from both tables, but only includes rows where there is a match between the two tables on the “department_id” column.
SELECT *
FROM employees NATURAL JOIN departments;
Section 3: SQL Functions
In SQL, functions are pre-defined operations that perform specific actions on the data. They can be used to manipulate and transform data to produce desired results. SQL functions can be classified into different types based on their purpose, such as mathematical functions, string functions, date/time functions, aggregate functions, and window functions. In this section, we’ll cover some of the most commonly used SQL functions and provide examples of how they can be used in SQL interview questions.
3.1 Mathematical Functions
SQL provides a wide range of mathematical functions that allow us to perform arithmetic operations on numerical data. Here are some of the most commonly used mathematical functions:
Here’s an example problem to illustrate the use of mathematical functions in SQL:
Problem: Calculate the total revenue of a company for the year, including a 5% increase due to inflation.
Solution:
SELECT SUM(revenue * 1.05) AS total_revenue
FROM company
WHERE year = '2022';
Explanation:
3.2 String Functions
String functions operate on string values (i.e., character data) and can be used to manipulate or transform them in various ways. Here are some commonly used string functions:
Here’s an example SQL problem that uses string functions:
Problem: Write a SQL query to return a list of all customers with email addresses that contain the word “gmail”, along with the length of each email address.
Answer:
SELECT customer_name, email, LENGTH(email) AS email_length
FROM customers
WHERE email LIKE '%gmail%'
Explanation:
This query would return a table with three columns: customer_name, email, and email_length. The customer_name and email columns would contain the corresponding values from any rows in the customers table where the email column contains the string “gmail”. The email_length column would contain the length of each email address.
3.3 Date/Time Functions
Date/time functions in SQL are used to manipulate and extract information from date and time values. These functions can perform various operations such as adding or subtracting time, formatting dates in different ways, or extracting specific parts of a date or time value. Being able to use date/time functions is essential for anyone who works with time-based data.
Here are some commonly used date/time functions in SQL:
Problem: Find all orders that were placed in the month of January.
To solve this problem, we need to extract the month part from the order_date column and compare it to January. Here’s how we can do it:
SELECT *
FROM orders
WHERE DATEPART(MONTH, order_date) = 1;
Explanation:
Section 4: SQL Constraints
In SQL, constraints are used to specify rules that should be enforced on data in a table. Constraints help ensure data accuracy, consistency, and integrity, and they play a critical role in maintaining database quality. In this section, we will cover the different types of SQL constraints and how to use them.
One of the most common SQL constraints is the NOT NULL constraint. It ensures that a column cannot contain any NULL values. To use this constraint, you can add the NOT NULL keyword after the column definition.
4.1 NOT NULL Constraint
Here’s an example:
CREATE TABLE Employees (
EmployeeID INT NOT NULL,
Name VARCHAR(50) NOT NULL,
Salary DECIMAL(10,2)
);
In this example, we created a table called “Employees” with three columns: “EmployeeID”, “Name”, and “Salary”. The “EmployeeID” and “Name” columns have the NOT NULL constraint, which means they cannot be NULL.
Here’s a problem that uses the NOT NULL constraint:
Problem:
Create a table called “Customers” with the following columns: “CustomerID”, “Name”, “Email”, and “Phone”. Ensure that the “Name” and “Email” columns cannot be NULL.
Answer:
CREATE TABLE Customers (
CustomerID INT NOT NULL,
Name VARCHAR(50) NOT NULL,
Email VARCHAR(50) NOT NULL,
Phone VARCHAR(20),
);
In this example, we created a table called “Customers” with four columns: “CustomerID”, “Name”, “Email”, and “Phone”. The “Name” and “Email” columns have the NOT NULL constraint, which ensures that these columns cannot contain any NULL values.
4.2 UNIQUE Constraint
The UNIQUE constraint ensures that all values in a column are unique. This means that no two rows in the table can have the same value in that column. To add a UNIQUE constraint to a column, you can use the following syntax:
Here’s an example:
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
email VARCHAR(50) UNIQUE
);
In this example, the email column has a UNIQUE constraint. This means that each email address in the table must be unique.
Here’s a problem that uses the UNIQUE constraint:
Problem:
Create a table called “Products” with the following columns: “ProductID”, “ProductName”, and “ProductCode”. Ensure that the “ProductCode” column contains only unique values.
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductName VARCHAR(50),
ProductCode VARCHAR(20) UNIQUE
);
In this example, we created a table called “Products” with three columns: “ProductID”, “ProductName”, and “ProductCode”. The “ProductCode” column has the UNIQUE constraint, which ensures that each value in this column is unique.
4.3 Primary Key
The PRIMARY KEY constraint identifies a column (or a group of columns) that uniquely identifies each row in a table. To use this constraint, you can add the PRIMARY KEY keyword after the column definition.
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE
);
In this example, we created a table called “Orders” with three columns: “OrderID”, “CustomerID”, and “OrderDate”. The “OrderID” column has the PRIMARY KEY constraint, which means that each value in this column must be unique and cannot be NULL.
Here’s a problem that uses the PRIMARY KEY constraint:
Problem: Create a table called “Students” with the following columns: “StudentID”, “Name”, and “Grade”. Ensure that the “StudentID” column is the primary key.
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
Name VARCHAR(50),
Grade
4.4 Foreign Key Constraints
A foreign key constraint is a way to link two tables together in a relational database. It is a column or a set of columns in one table that refers to the primary key of another table. This ensures referential integrity between the two tables, meaning that data is consistent across both tables.
Here’s an example of how to create a foreign key constraint:
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
In this example, we have two tables: customers and orders. The customers table has a primary key on customer_id, and the orders table has a foreign key on customer_id that references the customer_id in the customers table. This ensures that any customer_id in the orders table must also exist in the customers table, maintaining referential integrity.
4.5 Check Constraints
A check constraint is a rule that limits the values that can be inserted into a column in a table. It allows you to define conditions that must be met before a row can be added or updated.
Here’s an example of how to create a check constraint:
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(50),
age INT,
salary DECIMAL(10,2),
CONSTRAINT age_check CHECK (age >= 18),
CONSTRAINT salary_check CHECK (salary > 0)
);
In this example, we have a table called employees with columns for employee_id, name, age, and salary. The check constraint on age ensures that the age is greater than or equal to 18, and the check constraint on salary ensures that the salary is greater than 0. This ensures that no invalid data is added to the table.
Section 5: SQL Transactions
A transaction is a set of SQL statements that are executed together as a single unit of work. Transactions are used to ensure the integrity of the data in a database. If a transaction fails for any reason, all changes made during the transaction are rolled back, and the database is returned to its previous state.
Transactions are typically used in situations where multiple SQL statements need to be executed together as an atomic operation. For example, if you were transferring money from one bank account to another, you would want to ensure that both the withdrawal from the first account and the deposit into the second account were executed together as a single transaction.
There are four main properties of transactions that are commonly referred to as the ACID properties:
Let’s take a look at an example of how transactions can be used in SQL.
Suppose we have a table called bank_accounts that has the following schema:
bank_accounts(id, account_number, balance)
Suppose we want to transfer $100 from account 123 to account 456. We could do this with the following SQL statements:
BEGIN TRANSACTION;
UPDATE bank_accounts
SET balance = balance - 100
WHERE account_number = 123;
UPDATE bank_accounts
SET balance = balance + 100
WHERE account_number = 456;
COMMIT TRANSACTION;
In this example, we start a transaction using the BEGIN TRANSACTION statement. We then execute two UPDATE statements, one to decrement the balance of account 123 by $100, and another to increment the balance of account 456 by $100. Finally, we commit the transaction using the COMMIT TRANSACTION statement.
If either of the UPDATE statements fails for any reason, the transaction will be rolled back and the database will be returned to its previous state. For example, if account 123 does not have sufficient funds to transfer $100, the first UPDATE statement will fail, and the entire transaction will be rolled back, ensuring that the database remains in a consistent state.
One useful tip that can be used in conjunction with transactions is the SAVEPOINT statement. A SAVEPOINT creates a named point in a transaction to which you can later roll back. This is useful if you want to execute a set of statements within a transaction, but still be able to roll back to a specific point within the transaction if something goes wrong.
Here’s an example:
BEGIN TRANSACTION;
UPDATE bank_accounts
SET balance = balance - 100
WHERE account_number = 123;
SAVEPOINT transfer_point;
UPDATE bank_accounts
SET balance = balance + 100
WHERE account_number = 456;
-- Oops, we made a mistake, let's roll back to the transfer_point
ROLLBACK TO transfer_point;
UPDATE bank_accounts
SET balance = balance + 50
WHERE account_number = 789;
COMMIT TRANSACTION;
In this example, we create a SAVEPOINT called transfer_point after the first UPDATE statement. We then execute the second UPDATE statement, but suppose we realize that we made a mistake and want to undo the second UPDATE statement. We can do this by using the COMMIT statement will save the changes made to the database, and the ROLLBACK statement will undo the changes made during the transaction. It is important to use transactions when dealing with complex database operations that involve multiple queries or updates to ensure data integrity and consistency.
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
Section 6: Real SQL Questions
If you’re preparing for a SQL interview, it’s important to familiarize yourself with some real-world SQL questions. These questions can test your knowledge of various SQL concepts and help you understand how to apply them in practice. In this section, we’ll go through some commonly asked SQL interview questions and provide step-by-step solutions to help you prepare for your upcoming interview.
We’ll cover a variety of topics, including SELECT statements, SQL joins, and SQL functions. We’ll also discuss some more complex SQL queries that you may encounter in a data-related job. Whether you’re an aspiring data analyst or a seasoned data engineer, these SQL questions will help you sharpen your skills and demonstrate your expertise in SQL. So, let’s get started!
6.1 Write a SQL query to find the top 5 highest-paid employees in the “Sales” department.
Table: employee
+------------+-------+-----------+
|employee_id |salary |department |
+------------+-------+-----------+
|1 |3000 |Sales |
|2 |1000 |Marketing |
|3 |1500 |Sales |
|4 |2400 |Marketing |
|5 |2000 |Tech |
|6 |3000 |Tech |
+------------+------+------------+
Solution:
We can use the RANK function to rank employees by salary within each department, and then select the top 5 ranked employees in the Sales department.
Explanation:
SELECT department, employee_id, salary
FROM (
SELECT department, employee_id, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employee
) ranked_employees
WHERE department = 'Sales' AND rank <= 5;
6.2 Write a SQL query to find the employee with the highest salary.
To find the employee with the highest salary, we can use the MAX() function to get the highest salary, and then join it back to the employee table to get the employee with that salary.
Explanation:
SELECT e.employee_id, e.salary, e.department
FROM employee e
JOIN (
SELECT MAX(salary) AS max_salary
FROM employee
) max_salary_employee
ON e.salary = max_salary_employee.max_salary;
6.3 Write a SQL query to calculate the total revenue generated per year and per month from a sales table.
We need to use the GROUP BY clause with the year and month extracted from the sale_date. This can be done using the YEAR() and MONTH() functions. The SUM() function is then used to calculate the revenue per month.
Explanation:
Table: sales
+--------+----------+------+----------+
|sales_id |sale_date|amount|product |
+--------+----------+------+----------+
|1 |2022-01-01|200 |product_1 |
|2 |2022-01-05|150 |product_1 |
|3 |2022-02-02|300 |product_2 |
|4 |2022-02-05|250 |product_2 |
|5 |2022-02-07|150 |product_1 |
+--------+----------+------+----------+
SELECT YEAR(sale_date) as year,
MONTH(sale_date) as month,
SUM(amount) as revenue
FROM sales
GROUP BY YEAR(sale_date), MONTH(sale_date)
ORDER BY year, month
6.4 Write a SQL query to find the total sales amount for each product in February 2022. Order the results by the total sales amount in descending order.
We can use a GROUP BY clause to group the sales by product and use the HAVING clause to filter the sales that occurred in February 2022. Finally, we can use the SUM function to calculate the total sales amount and order the results by the total sales amount in descending order.
Explanation:
SELECT product,
SUM(amount) AS total_sales
FROM sales
WHERE sale_date >= '2022-02-01' AND sale_date < '2022-03-01'
GROUP BY product
ORDER BY total_sales DESC
6.5 Write a SQL query to find the total sales amount for each product, as well as the percentage of total sales that each product represents. Order the results by percentage in descending order.
To calculate the total sales amount for each product, we can use a simple GROUP BY clause to group the records by product and sum the amount column. To calculate the percentage of total sales, we can use a subquery to get the total sales for all products, and then divide the sales amount for each product by the total sales and multiply by 100.
Explanation:
Overall, this query provides useful information about the sales data in the table, including the total number of products sold, the total amount of sales, the average amount of sales per product, and the highest sales amount for a single product.
SELECT product,
SUM(amount) AS total_sales,
ROUND(SUM(amount) / (SELECT SUM(amount) FROM sales) * 100, 2) AS percentage
FROM sales
GROUP BY product
ORDER BY percentage DESC
Conclusion
SQL is a powerful tool for managing and analyzing large datasets, and it’s an essential skill for any data professional. In this guide, we covered a range of topics, including SQL syntax, data manipulation, subqueries, joins, constraints, transactions, and more. We also provided examples and explanations for each concept to help you understand how they work in practice.
Overall, this guide provides a comprehensive introduction to SQL and covers a range of essential topics for anyone looking to become proficient in SQL. Whether you’re just starting out or looking to expand your skills, we hope you found this guide helpful and informative.