Overview: The project analyzes Amazon sales data to improve sales management and distribution methods, with the goal of reducing costs and increasing profits. The main tasks include:
- Performing ETL (Extract-Transform-Load) on an Amazon dataset
- Analyzing sales trends on a monthly, yearly, and year-month basis
- Identifying key sales metrics and the factors that influence them
- Finding meaningful relationships between attributes
- Data Extraction and Preparation: Download the Amazon dataset from the provided link, then use Python to load and preprocess the data.
- Data Analysis: Use pandas for data manipulation and analysis. Create time-based features for month, year, and year-month, then analyze sales trends using group-by operations and aggregations.
- Visualization: Use Matplotlib or Seaborn to visualize sales trends: line plots for monthly and yearly trends, and heatmaps or grouped bar charts for the year-month analysis.
- Key Metrics and Relationships: Calculate important metrics such as total sales, average order value, and top-selling products. Use correlation analysis to find relationships between attributes, and apply statistical tests to validate the findings.
- Reporting: Create a detailed project report summarizing findings and insights, including visualizations and statistical results.
- Code Structure: Write modular, maintainable, and testable code. Organize it into functions and classes, and include comments and docstrings for clarity.
- Optional (Tableau or Power BI Integration): Create interactive dashboards for more dynamic visualizations.
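
The extraction, analysis, and metrics steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the inline sample data and the column names (`Order ID`, `Order Date`, `Item Type`, `Units Sold`, `Total Revenue`) are assumptions and would need to be adapted to the real dataset, and the function names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real dataset. The column names
# are assumptions and must be matched to the actual file.
SAMPLE_CSV = """Order ID,Order Date,Item Type,Units Sold,Total Revenue
1001,2021-01-15,Cosmetics,10,4372.00
1002,2021-01-28,Clothes,5,546.40
1003,2021-02-03,Cosmetics,2,874.40
1004,2022-01-09,Snacks,8,1220.64
1005,2022-02-17,Clothes,20,2185.60
"""


def load_sales(source) -> pd.DataFrame:
    """Extract and clean: read the CSV, parse dates, drop unusable rows."""
    df = pd.read_csv(source, parse_dates=["Order Date"])
    # Remove exact duplicate rows and rows missing a date or revenue value.
    return df.drop_duplicates().dropna(subset=["Order Date", "Total Revenue"])


def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add year, month, and year-month columns derived from the order date."""
    out = df.copy()
    out["year"] = out["Order Date"].dt.year
    out["month"] = out["Order Date"].dt.month
    out["year_month"] = out["Order Date"].dt.to_period("M").astype(str)
    return out


def sales_by(df: pd.DataFrame, key: str) -> pd.Series:
    """Sum revenue per group (for example per year or per year-month)."""
    return df.groupby(key)["Total Revenue"].sum()


def key_metrics(df: pd.DataFrame) -> dict:
    """Compute total sales, average order value, and revenue per product."""
    total = df["Total Revenue"].sum()
    per_product = (
        df.groupby("Item Type")["Total Revenue"].sum().sort_values(ascending=False)
    )
    return {
        "total_sales": total,
        "average_order_value": total / df["Order ID"].nunique(),
        "top_products": per_product,
    }


# In practice, pass the path of the downloaded file instead of the sample.
sales = add_time_features(load_sales(io.StringIO(SAMPLE_CSV)))
yearly = sales_by(sales, "year")
monthly = sales_by(sales, "month")
year_month = sales_by(sales, "year_month")
metrics = key_metrics(sales)

# Pearson correlation between two numeric attributes.
units_revenue_corr = sales["Units Sold"].corr(sales["Total Revenue"])
```

Keeping each step in its own small function makes it straightforward to unit-test, and the resulting Series (`yearly`, `monthly`, `year_month`) can be handed to Matplotlib or Seaborn for the plots described in the visualization step.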