P2-Prediction of Google App's Ratings
Amit Kumar
In this project, I analyze the ratings of Google apps.
The data is taken from Kaggle.
The following modules are used in the project: Pandas, NumPy, Seaborn, and Matplotlib.
The following steps were performed to get the desired result.
- Read the data
- Inspect the data
- check the shape and the summary statistics from describe()
- draw a boxplot to check for outliers
- draw a histogram
- use the info() function to check for null values
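The inspection steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual notebook: the small DataFrame and its column names (`App`, `Rating`, `Reviews`, `Price`) are made-up stand-ins for the Kaggle file, which would normally be loaded with `pd.read_csv`. The helper `plot_rating` is a hypothetical name, and it uses the plotting methods built into Pandas rather than Seaborn to keep the example short.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the Kaggle data.
# In practice: df = pd.read_csv("<path to the downloaded csv>")
df = pd.DataFrame({
    "App": ["A", "B", "C", "D", "E"],
    "Rating": [4.1, 3.9, np.nan, 19.0, 4.5],
    "Reviews": ["159", "967", "87", "3M", "120"],
    "Price": ["0", "$4.99", "0", "0", "$1.99"],
})

print(df.shape)       # (number of rows, number of columns)
print(df.describe())  # summary statistics for the numeric columns

# info() lists the non-null count and dtype of every column.
# Writing it to a buffer lets us keep the text as well as print it.
buf = io.StringIO()
df.info(buf=buf)
info_text = buf.getvalue()
print(info_text)


def plot_rating(frame):
    """Draw a boxplot and a histogram of the Rating column."""
    # Imported here so the numeric checks above run without Matplotlib.
    import matplotlib.pyplot as plt

    fig, (ax_box, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))
    frame.boxplot(column="Rating", ax=ax_box)   # outliers show up as isolated points
    frame["Rating"].hist(ax=ax_hist, bins=20)   # overall distribution of ratings
    ax_hist.set_xlabel("Rating")
    return fig
```

Calling `plot_rating(df)` returns a Matplotlib figure. In this sample the rating of 19.0 would stand out in the boxplot as an outlier.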
Data Cleaning
- count the missing values in the dataframe
- count the number of missing values in each column
- check how many ratings are greater than 5 (these are outliers, since 5 is the maximum rating)
- remove the outliers
- draw the boxplot and histogram again
- remove columns that are 90% empty
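One possible way to carry out these cleaning steps with Pandas is sketched below. The sample data and the column names (`App`, `Rating`, `Mostly_Empty`) are assumptions made for illustration, and the 0.9 threshold is read here as "at least 90% of the values are missing".

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the real column names may differ.
df = pd.DataFrame({
    "App": ["A", "B", "C", "D", "E"],
    "Rating": [4.1, 3.9, np.nan, 19.0, 4.5],
    "Mostly_Empty": [np.nan, np.nan, np.nan, np.nan, np.nan],
})

# Missing values in the whole DataFrame and per column.
total_missing = int(df.isnull().sum().sum())
missing_per_column = df.isnull().sum()
print(total_missing)
print(missing_per_column)

# Ratings above 5 are not valid, so count them as outliers.
n_outliers = int((df["Rating"] > 5).sum())
print(n_outliers)

# Drop the outlier rows. Rows with a missing rating are kept for now
# so that they can be imputed in a later step.
df = df[~(df["Rating"] > 5)]

# Keep only the columns whose fraction of missing values is below 0.9.
df = df.loc[:, df.isnull().mean() < 0.9]
print(df)
```

Filtering with `~(df["Rating"] > 5)` instead of `df["Rating"] <= 5` is a deliberate choice in this sketch: a comparison with NaN is always false, so the second form would silently drop the rows with a missing rating as well.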
Data Imputation and Manipulation
- fill the null values with appropriate values using aggregate functions such as the mean, median, and mode
- count the remaining null values in each column and remove them
- check and correct the format of price, reviews and ratings
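The imputation and format fixes can be sketched as below. This is an illustration under assumptions: the sample values, the `Type` column, the `$` prefix on prices, and the unparseable review count `"3M"` are made up to show the idea and are not taken from the actual dataset. The choice of median for the numeric column and mode for the categorical one is also just one reasonable option.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; values and column names are assumptions.
df = pd.DataFrame({
    "Rating": [4.1, 3.9, np.nan, 4.5],
    "Type": ["Free", "Paid", None, "Free"],
    "Reviews": ["159", "967", "87", "3M"],
    "Price": ["0", "$4.99", "0", "$1.99"],
})

# Numeric column: fill with the median (the mean is an alternative,
# but it is more sensitive to extreme values).
df["Rating"] = df["Rating"].fillna(df["Rating"].median())

# Categorical column: fill with the most frequent value (the mode).
df["Type"] = df["Type"].fillna(df["Type"].mode()[0])

# Price: strip the currency symbol, then convert the text to float.
df["Price"] = df["Price"].str.replace("$", "", regex=False).astype(float)

# Reviews: convert to numbers; entries that cannot be parsed become NaN.
df["Reviews"] = pd.to_numeric(df["Reviews"], errors="coerce")

# Count the nulls that are left in each column, then drop those rows.
remaining_nulls = df.isnull().sum()
print(remaining_nulls)
df = df.dropna()
print(df.dtypes)
```

Using `errors="coerce"` turns malformed entries into NaN instead of raising an error, so they can be counted and removed together with the other remaining nulls.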