课程: Coding Exercises: SQL Data Science
Inner joins
- The growth team is reviewing the penetration ratio of different countries. Karl Branson, who is the VP of the growth team, wants you to analyze the top 10 countries where your organization has the highest penetration ratio. Here the challenge is, that this information resides in two tables. FansPerCountry and PopStats, which is basically population stats stable. And penetration ratio is defined as the number of fans divided by the total population. Now to break it down, you need to join these two tables and compute the penetration ratio for each country to get the top 10 countries with the highest penetration ratio. You need to extract this data for the latest date, which is 2018-10-16, and then you have to order it by penetration ratio and limit the results to only 10 rows so that you can see the top 10 countries with the highest penetration ratio. So, you know the drill now. Try to write the SQL query to calculate the penetration ratio. Pause the video and once you are done, I'll show you how I solve this challenge. (upbeat music) So here I am and my query editor again. So I'm going to start writing this query using the select clause. So select. I'm going to first access the name of the country and that's going to come from the pop stats table. So, I'm going to write pop stats. So here one thing that you should note, is I'm writing this in double quotes. In different SQL flavors you use different specifiers here in Postgres in pgAdmin, I use double quotes. In some SQL flavors you'll use brackets or square brackets. But here, if there is any uppercase letter in any of the table name or the column name, then you will have to specify it or wrap it up and double quote otherwise that will give you an error. So I'm going to access country name from pop stats, and then I'm going to calculate the penetration ratio, which is number of fans divided by the population. So let's calculate that. So a number of fans is going to come from the fans per cuntry table. I'm going to give it an alias FPC. So, number of fans, then I'm going to multiply it with 100.00 because I need percentage rather than just ratios. And then I need to divide it with population, so that's going to come from pop stats stable. And this whole calculation, I'm going to name it as penetration_ratio. All right. So penetration ratio is calculated and all of this information is going to come from the table fans per country and let's give it that alias FPC. And since we're using these two tables, we need to join them. And in order to join them, we need a common column, which is going to be country code, which is present in both of these tables. So, you're going to join these two tables. So you're going to join fans per country with pop stats on this common field or common column, which is country code. So in FPC, it is denoted as country code like this and and the pop stats stable, this field is written as country_code. So you have joined these two tables now, and we need this data for the latest date. So, that will basically mean that we'll have to add where clause. So where fpc.date is equal to 2018-10-16. Now we need to order this data by penetration ratio. So, order by penetration ratio that we have calculated and that too in their descending order, because we need the top 10. And we need to limit it to only 10 rows, so that we look at the top 10 countries and we run this query and ta-da. Following are the countries with the respective penetration ratio. And you can see, we have at the top reunion with a penetration ratio of 2.41% and then we have French Polynesia, Caledonia, Mauritius, so on and so forth It is interesting how joints can help us in extracting the required information from multiple tables. Don't you agree?
随堂练习,边学边练
下载课堂讲义。学练结合,紧跟进度,轻松巩固知识。