Cleaning Car Data with Pandas Using the dropna Method

Below are all the steps for cleaning the data using the dropna method. You can find the dataset and the notebook in the GitHub repo: https://github.com/MuhammadHammad02/Pandas/blob/main/Car%20Data%20Cleaning.ipynb

First, import NumPy and pandas, then load and view the CSV data.

import numpy as np
import pandas as pd

df = pd.read_csv('car_data.csv')

df
# Number of rows and columns
df.shape

# View the first 5 rows
df.head()

# View the last 5 rows
df.tail()

# View the column names
df.columns

Check the column types, then get the unique values in the “year” column of the DataFrame:

df.info()

df.year.unique()

This shows only the rows where the “year” column contains numeric values. The result is not assigned, so df itself is unchanged:

df[df.year.str.isnumeric()]        


# View the rows where the year column holds a non-numeric value

df[~df.year.str.isnumeric()]
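
The two filters above only display rows. As a minimal sketch, assuming the goal is to keep only the numeric years before converting the column, the mask can be assigned back. The sample values below are made up for illustration and are not taken from car_data.csv.

```python
import pandas as pd

# Hypothetical sample data; the real car_data.csv may look different.
sample = pd.DataFrame({
    "name": ["Car A", "Car B", "Car C"],
    "year": ["2015", "sale", "2019"],
})

# True where the string consists only of numeric characters
numeric_mask = sample["year"].str.isnumeric()

# Keep the numeric rows, then convert the column to integers
numeric_years = sample[numeric_mask].copy()
numeric_years["year"] = numeric_years["year"].astype(int)

print(numeric_years)
```

Without a step like this, astype('int') raises an error if any non-numeric year survives the other cleaning steps.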
        
# Count the null values in each column

df.isnull().sum()        

dropna removes every row that contains a missing (NaN) value from your DataFrame:

df.dropna(inplace=True)

df.info()        
df.isnull().sum()        
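
To make the effect of dropna easier to see, here is a small sketch on made-up data. The column names mirror the ones used in this article, but the values are assumptions. It also shows the subset parameter, which limits the check to chosen columns when dropping every incomplete row would be too aggressive.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with missing values in two different columns
sample = pd.DataFrame({
    "name": ["Car A", "Car B", "Car C"],
    "kms_driven": ["45,000 kms", np.nan, "30,000 kms"],
    "fuel_type": ["Petrol", "Diesel", np.nan],
})

# Default: drop a row if ANY column is NaN
dropped_any = sample.dropna()

# Only drop rows where kms_driven is NaN
dropped_subset = sample.dropna(subset=["kms_driven"])

print(len(sample), len(dropped_any), len(dropped_subset))
```

Neither call changes sample, because inplace=True is not passed; each returns a new DataFrame.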

This calculates the total number of duplicated rows in your DataFrame:

df.duplicated().sum()

df[df.duplicated()]

df.drop_duplicates(inplace=True)
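
A short sketch of the duplicate check on made-up rows. The data is hypothetical, and reset_index is an optional extra that the steps above do not use.

```python
import pandas as pd

# Hypothetical sample: the second row repeats the first exactly
sample = pd.DataFrame({
    "name": ["Car A", "Car A", "Car B"],
    "Price": ["100", "100", "200"],
})

# duplicated() marks every repeat after the first occurrence
n_dupes = sample.duplicated().sum()

# Drop the repeats and renumber the index from 0
deduped = sample.drop_duplicates().reset_index(drop=True)

print(n_dupes)
print(deduped)
```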

df.info()        
df.name

type(df.name)

# Keep only the first three words of each car name
df.name = df.name.str.split(" ").str[0:3].str.join(" ")
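
Here is how the split, slice and join chain behaves on two made-up names. These strings are illustrative assumptions, not rows from the dataset.

```python
import pandas as pd

# Hypothetical car names: one long, one already short
names = pd.Series(["Hyundai Santro Xing XO eRLX Euro III", "Ford Figo"])

# Split on spaces, keep at most the first three words, join them back
short = names.str.split(" ").str[0:3].str.join(" ")

print(short.tolist())
```

Names with fewer than three words pass through unchanged, because slicing a shorter list simply returns the whole list.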
df.company.unique()

df.year = df.year.astype('int')

df.year.unique()

df.info()        
df.Price.unique()

df.Price.value_counts()

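# Drop rows without a real price; .copy() avoids a SettingWithCopyWarning on later assignments
df = df[df.Price != "Ask For Price"].copy()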

df.info()

df.Price

df.Price = df.Price.str.replace(",","").astype(float)
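
The price steps can be sketched end to end on a few made-up values. The sample prices are assumptions; regex=False is added here only to make the literal comma replacement explicit.

```python
import pandas as pd

# Hypothetical price strings, including the placeholder text
sample = pd.DataFrame({"Price": ["80,000", "Ask For Price", "4,25,000"]})

# Remove the placeholder rows first, working on an independent copy
priced = sample[sample["Price"] != "Ask For Price"].copy()

# Strip the thousands separators, then convert to float
priced["Price"] = priced["Price"].str.replace(",", "", regex=False).astype(float)

print(priced["Price"].tolist())
```

The order matters: converting before removing "Ask For Price" would fail, since that text cannot be parsed as a number.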

df.Price

df.info()        
df.kms_driven.unique()

df.kms_driven = df.kms_driven.str.replace(" kms","").str.replace(",","").astype(float)
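
If the kms_driven column might still contain a value that is not a distance, astype(float) stops with an error. As an alternative sketch, not the approach used above, pd.to_numeric with errors="coerce" turns such values into NaN so they can be inspected or dropped afterwards. The sample values are made up.

```python
import pandas as pd

# Hypothetical values: two distances and one stray non-numeric entry
kms = pd.Series(["45,000 kms", "1,200 kms", "Petrol"])

# Remove the unit and the thousands separators as literal text
cleaned = kms.str.replace(" kms", "", regex=False).str.replace(",", "", regex=False)

# Unparseable entries become NaN instead of raising an error
kms_numeric = pd.to_numeric(cleaned, errors="coerce")

print(kms_numeric)
```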

df.kms_driven        
df.fuel_type.unique()

df
        
df.to_csv("cleaned_data.csv")        
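
By default, to_csv also writes the DataFrame index as an unnamed first column. After dropping rows the index has gaps, so you may prefer to leave it out. The sketch below uses an in-memory buffer and made-up data instead of a real file, purely for illustration.

```python
import io

import pandas as pd

# Hypothetical cleaned data with a gappy index, as left behind by dropped rows
sample = pd.DataFrame(
    {"name": ["Car A", "Car B"], "Price": [80000.0, 425000.0]},
    index=[3, 7],
)

# Write without the index, then read the result back
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(reloaded)
```

With a real file the equivalent call would be df.to_csv("cleaned_data.csv", index=False).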

