4 logical SQL queries to retrieve data
If you need to find data in a table, it can be both simple and challenging. In this article, we will create a real database with CO2 emissions by country and test some logical queries to the data in the table.
First, we must create a table and add the information into the columns. In this case, we will work with dimension year with an attribute DATE, which is common in most business tables. The information regarding the volume of emissions is given in mln format. So, the code is the following.
CREATE TABLE co2emissions (
id INTEGER PRIMARY KEY,
country TEXT,
year DATE,
emissions INTEGER);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (1,’Germany’,’1900–01–01',326);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (2,’Germany’,’1950–01–01',510);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (3,’Germany’,’2000–01–01',899);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (4,’Germany’,’2020–01–01',644);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (5,’United_States’,’1900–01–01',625);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (6,’United_States’,’1950–01–01',254000);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (7,’United_States’,’2000–01–01',601000);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (8,’United_States’,’2020–01–01',471000);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (9,’Switzerland’,’1900–01–01',5);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (10,’Switzerland’,’1950–01–01',10);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (11,’Switzerland’,’2000–01–01',32);
INSERT INTO co2emissions (id,country,year,emissions) VALUES (12,’Switzerland’,’2020–01–01',32);
1. Which countries have the largest volume of CO2 Emissions?
We can reach the answer with a query:
SELECT * FROM co2emissions
ORDER BY emissions DESC;
As we can see, the United States produces more carbon dioxide (CO2) compared to Germany and Switzerland.
Here we have just a small amount of rows, which is why it was not difficult to figure out the countries by emissions volume. What if our table contains thousands of rows. In this situation, it makes sense to use the statistic function SUM.
SELECT country,SUM(emissions) as total_emissions from co2emissions group by country order by total_emissions DESC;
It turns out that the United States is very polluted, while Switzerland has the lowest amount of emissions.
2. How many countries are in the table?
SELECT COUNT(DISTINCT country) as quantity_countries
FROM co2emissions;
Make sure that you add DISTINCT country to remove the duplicates. Since we have some countries on the list, if you want to see the unique dimensions, it is crucial to specify them.
3. What is the volume of CO2 Emissions in Switzerland in 2022?
Data manipulations could be tricky due to the difference in SQL dialect, various functions, and the necessity to convert data to the proper datatype.
Option 1
The most simple way to find a record for a specific date could be using the WHERE clause.
SELECT country,year,emissions FROM
co2emissions WHERE year = '2020-01-01' and country = 'Switzerland';
Option 2
Using PostgreSQL, you can also extract this information with a datatype specification using double-colon. When you run this query, you can see such details as timezone and timestamp. Since it was not set in my table, you can’t see it.
领英推荐
SELECT country,year::date,emissions FROM
co2emissions WHERE year = '2020-01-01' and country = 'Switzerland';
Option 3
You can also run the query containing CAST to achieve the same results in both PostgreSQL and MySQL.
SELECT cast(year as date),country,emissions FROM
co2emissions WHERE year = '2020-01-01' and country = 'Switzerland';
4. What was the volume of CO2 Emissions in Switzerland in May 2020?
A widespread request during job interviews is to find data in a certain period of time, for instance, a month or a day. Let’s add another record to our table in the year 2020 for Switzerland.
insert into co2emissions (id,country,year,emissions) values (13,'Switzerland','2020-02-01',30);
insert into co2emissions (id,country,year,emissions) values (14,'Switzerland','2020-03-01',28);
insert into co2emissions (id,country,year,emissions) values (15,'Switzerland','2020-04-01',25);
insert into co2emissions (id,country,year,emissions) values (16,'Switzerland','2020-05-01',29);
insert into co2emissions (id,country,year,emissions) values (17,'Switzerland','2020-06-01',31);
insert into co2emissions (id,country,year,emissions) values (18,'Switzerland','2020-07-01',26);
insert into co2emissions (id,country,year,emissions) values (19,'Switzerland','2020-08-01',27);
insert into co2emissions (id,country,year,emissions) values (20,'Switzerland','2020-09-01',27);
insert into co2emissions (id,country,year,emissions) values (21,'Switzerland','2020-10-01',30);
insert into co2emissions (id,country,year,emissions) values (22,'Switzerland','2020-11-01',31);
And now we have our monthly data for 2020.
SELECT * FROM co2emissions
WHERE country = 'Switzerland' and year >= '2020-01-01'
ORDER BY year;
If we want to extract the data for this particular month, we can use the EXTRACT function.
SELECT country,extract(month from cast(year as date)) as month,year, emissions FROM
co2emissions WHERE year = '2020-05-01' and country = 'Switzerland';
4. Compare the emission amounts in Germany and Switzerland
Assume you live in Germany and you consider the possibility of moving out to Switzerland. As a person who cares about ecology, you want to know the CO2 level in this country. How can we represent all combined emissions from 1900 to 2020 in two countries?
Let’s eliminate the rows we created to demonstrate how the ‘extract’ function works.
DELETE FROM co2emissions WHERE id IN (13,14,15,16,17,18,19,20,21,22);
You can run the SELECT WHERE IN function and sort the records by country.
SELECT * FROM co2emissions
WHERE country IN(‘Switzerland’,’Germany’)
ORDER BY country;
As you can see, the data is slightly unstructured and possible but difficult to compare. To make it more precise, we can use the statistical function AVERAGE.
SELECT country, AVG(emissions) as average_emissions from co2emissions group by country;
This function allows us to quickly retrieve and count the emissions. Thus, Switzerland looks more appealing for living from an ecological standpoint.
I hope the article was helpful for you.
Stay tuned, and have a nice day.