Cleaning Car Data with Pandas Using the dropna Method
Below are all the steps for cleaning the data using the dropna method. You can find the dataset and notebook in the GitHub repo: https://github.com/MuhammadHammad02/Pandas/blob/main/Car%20Data%20Cleaning.ipynb
First, import NumPy and pandas, then load and view the CSV data.
import numpy as np
import pandas as pd
df = pd.read_csv('car_data.csv')
df
# Number of rows and columns
df.shape
# View the first 5 rows
df.head()
# View the last 5 rows
df.tail()
# View the column names
df.columns
Check the column types and non-null counts with df.info(), then list the unique values in the “year” column:
df.info()
df.year.unique()
This expression selects only the rows where the “year” column contains numeric values. Because the result is not assigned back to df, it only displays those rows and leaves df unchanged.
df[df.year.str.isnumeric()]
# View the rows where the year column holds a non-numeric value
df[~df.year.str.isnumeric()]
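The numeric check above can be tried on a small, made-up frame. This is only a sketch: the `toy` data is hypothetical, and it swaps in `pd.to_numeric(..., errors="coerce")` instead of `str.isnumeric()`. The mask it builds stays purely boolean even when the column has missing values, but it also accepts values such as "2015.5" that `isnumeric()` would reject.

```python
import pandas as pd

# Hypothetical toy data, not the real car_data.csv
toy = pd.DataFrame({"year": ["2015", "sale", None, "2009"]})

# Values that cannot be parsed as numbers (and missing values) become NaN
year_as_number = pd.to_numeric(toy["year"], errors="coerce")

# True where the value could be parsed, False otherwise
is_numeric = year_as_number.notna()

numeric_rows = toy[is_numeric]   # rows with a usable year
bad_rows = toy[~is_numeric]      # rows to inspect or drop

print(numeric_rows)
print(bad_rows)
```

To keep the filtered rows, assign the result back, for example `toy = toy[is_numeric]`.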
# Count the null values in each column
df.isnull().sum()
dropna removes every row that contains at least one missing (NaN) value. With inplace=True, df itself is modified instead of a new DataFrame being returned.
df.dropna(inplace=True)
df.info()
df.isnull().sum()
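As a small illustration of what dropna does, the sketch below uses a hypothetical three-row frame (the values are invented, not taken from the dataset). It also shows the `subset` parameter, which limits the check to chosen columns when dropping every incomplete row would be too aggressive.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in two different columns
toy = pd.DataFrame({
    "name": ["Car A", "Car B", "Car C"],
    "kms_driven": ["45,000 kms", np.nan, "12,000 kms"],
    "fuel_type": ["Petrol", "Diesel", np.nan],
})

# Default behaviour: drop a row if ANY column is missing
all_complete = toy.dropna()

# Only require kms_driven to be present; a missing fuel_type is tolerated
kms_known = toy.dropna(subset=["kms_driven"])

print(all_complete)
print(kms_known)
```

Neither call changes `toy`, because `inplace=True` is not passed.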
duplicated().sum() counts the duplicated rows in the DataFrame. drop_duplicates then removes them, keeping the first occurrence of each.
df.duplicated().sum()
df[df.duplicated()]
df.drop_duplicates(inplace=True)
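The duplicate handling can be checked on a tiny example. This is a sketch on invented data; the `reset_index(drop=True)` call is an optional extra, not something the notebook does, and is shown because dropping rows leaves gaps in the index.

```python
import pandas as pd

# Hypothetical toy data: the third row repeats the first
toy = pd.DataFrame({
    "name": ["Car A", "Car B", "Car A"],
    "Price": ["80,000", "1,20,000", "80,000"],
})

# duplicated() marks every repeat after the first occurrence
n_dupes = int(toy.duplicated().sum())

# Drop the repeats, then renumber the rows from 0
deduped = toy.drop_duplicates().reset_index(drop=True)

print(n_dupes)
print(deduped)
```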
df.info()
df.name
type(df.name)
# Keep only the first three words of each car name
df.name = df.name.str.split(" ").str[0:3].str.join(" ")
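To see what the split, slice and join chain does to a name, here is a minimal sketch on two made-up names (they are illustrative strings, not values confirmed to be in the dataset).

```python
import pandas as pd

# Hypothetical car names of different lengths
names = pd.Series(["Hyundai Santro Xing XO eRLX", "Ford Figo"])

# Split on spaces, keep at most the first three words, join them back
short = names.str.split(" ").str[0:3].str.join(" ")

print(short.tolist())
```

Names with fewer than three words pass through unchanged, because slicing a short list simply returns all of it.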
df.company.unique()
# Convert year from string to integer (this raises an error if any non-numeric value remains)
df.year = df.year.astype('int')
df.year.unique()
df.info()
df.Price.unique()
df.Price.value_counts()
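# Drop listings without a numeric price; .copy() avoids a SettingWithCopyWarning on the assignments below
df = df[df.Price != "Ask For Price"].copy()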
df.info()
df.Price
# Remove the thousands separators and convert Price to float
df.Price = df.Price.str.replace(",","").astype(float)
df.Price
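The price cleaning steps can be sketched end to end on invented values. The toy frame below is hypothetical; `regex=False` is added so the comma is treated as a literal character regardless of pandas version, and `.copy()` makes the filtered frame independent of the original before a column is assigned.

```python
import pandas as pd

# Hypothetical toy data mixing priced and unpriced listings
toy = pd.DataFrame({"Price": ["80,000", "Ask For Price", "4,25,000"]})

# Keep only rows with an actual price, as an independent copy
priced = toy[toy["Price"] != "Ask For Price"].copy()

# Strip the commas, then convert the strings to floats
priced["Price"] = priced["Price"].str.replace(",", "", regex=False).astype(float)

print(priced)
```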
df.info()
df.kms_driven.unique()
# Remove the " kms" suffix and the thousands separators, then convert to float
df.kms_driven = df.kms_driven.str.replace(" kms","").str.replace(",","").astype(float)
df.kms_driven
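If a stray non-numeric value were left in kms_driven, `astype(float)` would raise an error. A more forgiving variant, sketched here on invented values, uses `pd.to_numeric` with `errors="coerce"` so that such values become NaN and can be inspected or dropped afterwards. Whether the real column contains values like this is an assumption, not something checked here.

```python
import pandas as pd

# Hypothetical raw values; the last one is deliberately not a distance
raw = pd.Series(["45,000 kms", "1,200 kms", "Petrol"])

# Strip the unit suffix and the thousands separators as literal text
stripped = (
    raw.str.replace(" kms", "", regex=False)
       .str.replace(",", "", regex=False)
)

# Unparseable values become NaN instead of raising an error
kms = pd.to_numeric(stripped, errors="coerce")

print(kms)
```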
df.fuel_type.unique()
df
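# Save the cleaned data; index=False leaves out the row index, which has gaps after the dropped rows
df.to_csv("cleaned_data.csv", index=False)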
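One detail worth knowing when saving: by default `to_csv` also writes the row index, which shows up as an extra unnamed column when the file is read back. The sketch below demonstrates this on a hypothetical two-row frame, writing to in-memory strings rather than to disk.

```python
import io
import pandas as pd

# Hypothetical toy data with a gappy index, as left behind by dropped rows
toy = pd.DataFrame({"name": ["Car A", "Car B"], "Price": [80000.0, 425000.0]},
                   index=[3, 7])

# With no path given, to_csv returns the CSV text as a string
with_index = toy.to_csv()
without_index = toy.to_csv(index=False)

# Read both versions back to compare their columns
reloaded_with = pd.read_csv(io.StringIO(with_index))
reloaded_without = pd.read_csv(io.StringIO(without_index))

print(reloaded_with.columns.tolist())
print(reloaded_without.columns.tolist())
```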