登录查看更多内容

DAY 5: PROSQL45 CHALLENGE

Suparna Chowdhury

Data Scientist | Python, SQL, and Tableau Expert | Driving Data Insights

发布日期: 2024年11月21日

A big thanks to Ankit Bansal the year-end challenge! I’m taking it on by solving 2 complex SQL questions every day from a playlist of 79 problems.

PROBLEM 9:

This problem is sourced from Leetcode.com : https://leetcode.ca/all/1159.html?

There are three tables: Users, Orders, and Items. The Users table contains user details, the Orders table records transactions, and the Items table has details of the items sold.

Write an SQL query to determine for each user whether the brand of the second item they sold (ordered by the sale date) matches their favorite brand. If a user sold fewer than two items, return "no" for that user.

Video Solution by Ankit: https://www.youtube.com/watch?v=1ias-sP_XAY&list=PLBTZqjSKn0IeKBQDjLmzisazhqQy4iGkb&index=9

Step-by-Step Solution Approach:

I started by creating a ranked_orders CTE to rank each seller's orders to get the second sold item. I used the ROW_NUMBER() function, partitioned by o.seller_id, to order the sales by order_date for each seller. This allowed me to select the seller's ID, their favorite brand, the order date, the item's brand, and assign a rank to each order.
Next, I checked if the second item sold (where order_rank = 2) matched the seller’s favorite brand. I used a CASE statement for this—if it matched, I returned "Yes", otherwise "No". To handle cases where there was no second item or the seller hadn’t sold anything (like user_id 1), I used the COALESCE function to default to "No". The LEFT JOIN ensured that all users were included, even those who hadn’t sold a second item.
Lastly, I used the WHERE clause to filter the results, ensuring that only the second order was considered, or returned results for users who hadn’t made a second sale (using ro.order_rank IS NULL). This approach ensured that I checked each user’s second item and whether it matched their favorite brand.

CODE:

WITH ranked_orders
     AS (SELECT o.seller_id,
                o.order_date,
                i.item_brand,
                Row_number()
                  OVER (
                    partition BY o.seller_id
                    ORDER BY o.order_date) AS rn
         FROM   orders o
                LEFT JOIN items i
                       ON o.item_id = i.item_id)
SELECT u.user_id           AS seller_id,
       COALESCE(CASE WHEN u.favorite_brand = ro.item_brand 
	           THEN 'Yes' ELSE 'No' END, 'No') 
	          AS '2nd_item_fav_brand'
FROM   users u
       LEFT JOIN ranked_orders ro
              ON u.user_id = ro.seller_id
WHERE  ro.rn = 2
        OR ro.rn IS NULL ?

PROBLEM 10:

领英推荐

LeetCode Curated SQL Solutions and Discussion - Week 2

Ankur Ranjan 3 年前

How to import the Manhole Loads from every InfoSewer…

Robert Dickinson 1 年前

How to import the Manhole Loads from Only One…

Robert Dickinson 1 年前

Question: Given the tasks table with columns date_value and state, write an SQL query to find the start date, end date, and state of tasks. The start date should be the first date for each state group (e.g., the first occurrence of 'success' or 'fail'), and the end date should be the last date for each state group. The query should return the results grouped by the task's state, and for each state, the corresponding start and end dates.

Video Solution by Ankit: https://www.youtube.com/watch?v=WrToXXN7Jb4&list=PLBTZqjSKn0IeKBQDjLmzisazhqQy4iGkb&index=10

Step-by-Step Solution Approach:

The first step is to group the data by state. I calculate a row number (rn) for each row within each state, ordered by date_value. Then, I subtract the row number from the date_value to create a group_date, which acts as a common identifier for consecutive date groups. This logic is placed in a CTE (Common Table Expression). The outcome of the CTE is as follows:

The main query then selects the minimum (start_date) and maximum (end_date) date_value for each group of consecutive dates, grouped by state and group_date. Finally, the results are ordered by start_date, returning the consecutive date ranges for each state.

CODE:

With cte AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY state ORDER BY date_value) AS rn,
DATEADD(DAY, -1 * ROW_NUMBER() OVER(PARTITION BY state ORDER BY date_value), date_value) AS group_date
FROM tasks
ORDER BY date_value)
SELECT MIN(date_value) AS start_date, MAX(date_value) AS end_date, state
FROM cte
GROUP BY state, group_date
ORDER BY start_date

Let’s level up our SQL skills!

Join me on this journey as I tackle more challenges each day. Let’s improve our SQL skills together before the year ends!

Check out the full solutions on GitHub here!

#SQL #Database #SQLServer #MySQL #dataanalytics #SQLQuery #DataAnalysis #SQL #DataScience #DataAnalytics #RelationalDatabases #BusinessIntelligence #InterviewPreparation #sqlpractice #careerdevelopment #interviewquestions

Suparna Chowdhury

Data Scientist | Python, SQL, and Tableau Expert | Driving Data Insights

3 个月

Want to see the full solution? Check it out on my GitHub : https://github.com/suchow07/Complex_SQL/blob/main/DAY_5.sql

要查看或添加评论，请登录

Suparna Chowdhury的更多文章

Mastering Data Accuracy: Confidence Intervals & Data Integrity Simplified

2025年1月2日

Mastering Data Accuracy: Confidence Intervals & Data Integrity Simplified

Welcome to this edition of our newsletter! Today, we’ll explore two essential topics that will help you work smarter…
PROSQL 45: Complex SQL Questions Solution

2024年11月19日

PROSQL 45: Complex SQL Questions Solution

DAY 4: PROSQL45 CHALLENGE (19 NOVEMBER, 2024) A big thanks to Ankit Bansal the year-end challenge! I’m taking it on by…
Striking the Balance: How to Avoid Underfitting & Overfitting in ML Models

2024年11月19日

Striking the Balance: How to Avoid Underfitting & Overfitting in ML Models

Creating a good model is about finding the right balance. If the model is too simple, it misses important patterns…
PROSQL 45: Complex SQL Questions Solution

2024年11月18日

PROSQL 45: Complex SQL Questions Solution

DAY 3: PROSQL45 CHALLENGE (18 NOVEMBER, 2024) A big thanks to Ankit Bansal the year-end challenge! I’m taking it on by…
PROSQL 45: Complex SQL Questions Solution

2024年11月18日

PROSQL 45: Complex SQL Questions Solution

DAY 2: PROSQL45 CHALLENGE (17 NOVEMBER, 2024) A big thanks to Ankit Bansal the year-end challenge! I’m taking it on by…
SQL CHALLENGE DAY 1

2024年11月16日

SQL CHALLENGE DAY 1

Big thanks to Ankit Bansal for the year-end challenge! I’m taking it on by solving 2 complex SQL questions every day…
A Data Sapient Guide to Feature Engineering: Handling Missing Data

2024年10月27日

A Data Sapient Guide to Feature Engineering: Handling Missing Data

Feature engineering is the process of transforming and selecting relevant features from raw data to enhance machine…
Mastering MySQL: Key Techniques for SQL Query Optimization

2024年10月10日

Mastering MySQL: Key Techniques for SQL Query Optimization

In the fast-paced world of data management, optimizing SQL queries is crucial for enhancing performance and ensuring…
Mastering Dynamic Zone Visibility in Tableau

2024年9月20日

Mastering Dynamic Zone Visibility in Tableau

What is Dynamic Zone Visibility? Dynamic zone visibility is a powerful feature that can greatly enhance the…

See all articles

DAY 5: PROSQL45 CHALLENGE

Suparna Chowdhury

Data Scientist | Python, SQL, and Tableau Expert | Driving Data Insights

PROBLEM 10:

领英推荐

CODE:

Suparna Chowdhury的更多文章

社区洞察

其他会员也浏览了

Reading wells from SPE data repository

Write Aging Reports Using the SQL WITH Clause

Query Store Set Up SQL 2019

Swift - 4 - Core Data - Creating a Singleton Core Data & refactoring insert, update, delete operations

The story of how I optimized a SQL code to improve our data analysis in my company

Getting started with T-SQL (Part 2)

Unpacking a Dataverse Multi-Select field in Fabric.

What are the best practices for optimizing SQL queries?

SUCCESS FACTORS: HOW TO QUERY MORE THAN 1000 ROWS OF A DATASET

PROBLEM 10:

领英推荐

CODE:

Suparna Chowdhury的更多文章

Mastering Data Accuracy: Confidence Intervals & Data Integrity Simplified

PROSQL 45: Complex SQL Questions Solution

Striking the Balance: How to Avoid Underfitting & Overfitting in ML Models

PROSQL 45: Complex SQL Questions Solution

PROSQL 45: Complex SQL Questions Solution

SQL CHALLENGE DAY 1

A Data Sapient Guide to Feature Engineering: Handling Missing Data

Mastering MySQL: Key Techniques for SQL Query Optimization

Mastering Dynamic Zone Visibility in Tableau

社区洞察

其他会员也浏览了

Reading wells from SPE data repository

Write Aging Reports Using the SQL WITH Clause

Query Store Set Up SQL 2019

Swift - 4 - Core Data - Creating a Singleton Core Data & refactoring insert, update, delete operations

The story of how I optimized a SQL code to improve our data analysis in my company

Getting started with T-SQL (Part 2)

Unpacking a Dataverse Multi-Select field in Fabric.

What are the best practices for optimizing SQL queries?

SUCCESS FACTORS: HOW TO QUERY MORE THAN 1000 ROWS OF A DATASET