Monthly Sales Commission Analysis with Pandas - A Complete Workflow
Optimizing Sales Performance: A Pandas-Based Sales Commission Workflow
Tracking monthly sales commissions is essential for evaluating sales team performance. In this guide, we will merge, clean, transform, aggregate, and export sales commission data using Pandas in a real-world workflow.
By the end of this guide, you'll learn:
Dataset Overview
We will use two datasets:
1?? Toyota Sales Data (Toyota_sales_data.csv)
2?? Sales Reps Data (sales_reps_data.csv)
Goal:
You can download the datasets from the following GitHub link: GitHub Datasets
Step 1: Load the Data
import pandas as pd
# Load datasets
toyota_sales_data = pd.read_csv("data/car_sales/toyota_sales_data.csv")
sales_reps_data = pd.read_csv("data/car_sales/sales_reps_data.csv")
# Preview data
print(toyota_sales_data.head())
print(sales_reps_data.head())
Step 2: Merge Sales & Reps Data
We merge Toyota sales data and Sales reps data using rep_id as the key.
Merging DataFrames
merged_data = pd.merge(
sales_reps_data,
toyota_sales_data,
left_on="rep_id",
right_on="sale_rep_id",
how="inner"
)
print(merged_data.head()) # Check merged dataset
Ensures sales data is linked to the correct sales representatives
Step 3: Handle Missing Values
We noticed that the commission_percentage column contains NaN values.
Replace missing commission percentages with 0
merged_data = merged_data.fillna({"commission_pct": 0})
# Verify missing values are handled
print(merged_data["commission_pct"].isnull().sum()) # Should be 0
Step 4: Compute Monthly Sales Commission
Extract sale_month from sale_date
Calculate commission_earned using:
?? commission_earned = sale_amount * commission_percentage
merged_data = (
merged_data
.assign(commission_earned=lambda df: round(df["sale_amount"] * df["commission_pct"], 2)) \
.assign(sale_month=lambda df: pd.to_datetime(df["sale_date"]).dt.to_period("M")) \
)
print(merged_data.head()) # Check transformed data
? Ensures all transactions are assigned to the correct month
? Commission calculations are rounded for accuracy
Step 5: Group & Aggregate Sales Data
We will group data by sale_month and rep_id to calculate:
? Total sales per rep
? Total commissions earned
monthly_summary = (
merged_data
.groupby(["sale_month", "rep_id", "first_name", "last_name", "email"]) \
.agg(
total_sales=("sale_amount", "sum"),
total_commission=("commission_earned", "sum")
) \
.reset_index()
)
print(monthly_summary.head()) # Check grouped summary
? Efficiently computes performance metrics for each sales rep per month
领英推荐
Step 6: Export Final Report
? Save the final summary as a CSV file for stakeholders.
# Define file path
file_path = "data/car_sales/monthly_sales_commission.csv"
# Export to CSV
monthly_summary.to_csv(file_path, index=False)
print(f"Report saved successfully at: {file_path}")
? Removes the default index to keep the report clean
? Ready for further analysis in Excel, BI tools, or databases
Final Chained Workflow - Clean & Efficient
?? Complete workflow in one streamlined code block:
# Load Data
toyota_sales_data = pd.read_csv("data/car_sales/Toyota_sales_data.csv")
sales_reps_data = pd.read_csv("data/car_sales/sales_reps_data.csv")
# Process Data
monthly_summary = (
pd.merge(sales_reps_data, toyota_sales_data, left_on="rep_id", right_on="sale_rep_id", how="inner")
.fillna({"commission_pct": 0}) \
.assign(commission_earned=lambda df: round(df["sale_amount"] * df["commission_pct"], 2)) \
.assign(sale_month=lambda df: pd.to_datetime(df["sale_date"]).dt.to_period("M")) \
.groupby(["sale_month", "rep_id", "first_name", "last_name", "email"]) \
.agg(
total_sales=("sale_amount", "sum"),
total_commission=("commission_earned", "sum")
) \
.reset_index()
)
# Export Report
monthly_summary.to_csv("data/car_sales/monthly_sales_commission.csv", index=False)
print("Monthly Sales Commission Report Saved Successfully!")
? Handles missing data
? Extracts monthly data
? Computes total commissions
? Exports a ready-to-use report
Best Practices
?? Practice Assignment
?? Want to practice? Attempt the Monthly Sales Commission Calculation Using Pandas Assignment ?? Click here.
What’s Next?
In the next module, we will dive into Introduction to Fundamentals of Statistics for Data Analysis. This module will provide a strong foundation in statistical concepts that are essential for analyzing and interpreting data effectively.
Click ?? to Enroll in the Python for Beginners: Learn Python with Hands-on Projects. It only costs $10 and you can reach out to us for $10 Coupon.
Conclusion
In this module, we explored:
Conclusion: Data Transformation Using Pandas
Congratulations on completing the module on data transformation and processing using Pandas. You’ve taken a significant step toward mastering data preparation techniques, which are essential for effective data analysis.
Key Takeaways:
Throughout this module, we explored various techniques to modify, enrich, and prepare data efficiently:
Keep in Mind:
Data transformation is a critical step in preparing datasets for meaningful analysis. These techniques are widely used in:
Data transformation bridges raw datasets and actionable insights. By mastering these techniques, you are well-equipped for advanced analytics, visualization, and machine learning tasks.
?? Engage With Us!
? Authored by Siva Kalyan Geddada , Abhinav Sai Penmetsa
?? Share this article with anyone interested in data engineering, Python, or data analysis. ?? Have questions or need help? Comment below! Let's discuss.
?? Follow us for more hands-on data science tutorials!