Day15 of #100DaysOfPython

Today, we're diving into different types of techniques for handling missing values in a dataset!

Check out the Day14 article for what missing data is and its different types.

Q. What are the different types of techniques for handling missing data?

  1. Imputing missing values with mean/median/mode
  2. Random sample imputation
  3. Imputing NaN values with a new feature
  4. End of distribution imputation
  5. Arbitrary imputation
  6. Frequent categories imputation

In today's article we will start with the first technique:

1. Imputing missing values with mean/median/mode:

  • This imputation works on the assumption that the data is missing completely at random (MCAR).
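  • In this method we replace the NaN with the mean, the median, or the mode (the most frequently occurring observation) of the observed values in the feature. Mean and median apply to numerical features; mode also works for categorical ones.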

Let's dive deeper with an example below:

Link to titanic dataset

1. Reading
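
A minimal sketch of the reading step, assuming pandas. The original code for this step is not available, so the column names (Age, Fare, Survived) and the inline sample rows below are hypothetical stand-ins that let the snippet run on its own; with the real Titanic file you would pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the Titanic CSV so the snippet is self-contained.
# With the real file: pd.read_csv("titanic.csv", usecols=[...]) (path assumed).
SAMPLE_CSV = """Survived,Age,Fare
0,22.0,7.25
1,38.0,71.2833
1,,7.925
1,35.0,53.1
0,,8.05
0,54.0,51.8625
"""

# Load only the columns of interest; empty fields are parsed as NaN.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), usecols=["Age", "Fare", "Survived"])

# Number of NaN values per column, and the share of rows they represent.
missing_counts = df.isnull().sum()
missing_share = df.isnull().mean()

print(missing_counts)
print(missing_share)
```

Checking the share of missing values first helps decide whether a simple fill is reasonable for a given column.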


2. Imputing missing values in the
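
A possible way to write the imputation step, sketched under assumptions: the helper name impute_nan, the Age column, and the sample values are illustrative and not taken from the original article. The helper keeps the original column and writes the filled values to a new column so the two can be compared afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for a numerical Titanic column with gaps.
df = pd.DataFrame({"Age": [22.0, 38.0, np.nan, 35.0, np.nan, 54.0]})


def impute_nan(frame, column, strategy="median"):
    """Return a copy of frame with a new '<column>_<strategy>' column
    in which NaN values are replaced by the chosen statistic."""
    if strategy == "mean":
        fill_value = frame[column].mean()
    elif strategy == "median":
        fill_value = frame[column].median()
    elif strategy == "mode":
        # mode() can return several values; take the first one.
        fill_value = frame[column].mode().iloc[0]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    out = frame.copy()
    out[f"{column}_{strategy}"] = out[column].fillna(fill_value)
    return out


df = impute_nan(df, "Age", "median")
df = impute_nan(df, "Age", "mean")
print(df)
```

The statistic is computed from the observed values only, since pandas skips NaN by default in mean(), median(), and mode().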


3. Checking Standard deviation of the data in
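
A small sketch of the standard deviation check, again with hypothetical sample values in place of the real column. It compares the spread of the observed values with the spread after median imputation.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for a numerical column with missing values.
age = pd.Series([22.0, 38.0, np.nan, 35.0, np.nan, 54.0], name="Age")

# Fill the gaps with the median of the observed values.
age_median = age.fillna(age.median())

# std() skips NaN by default, so std_before uses only the observed values.
std_before = age.std()
std_after = age_median.std()

print(f"std before imputation: {std_before:.2f}")
print(f"std after imputation:  {std_after:.2f}")
```

Because every filled value sits at the centre of the distribution, the standard deviation after imputation comes out smaller than before in this sample.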

Imputing a feature with mean/median/mode has the following advantages and disadvantages:

Advantages:

  • Robust to outliers when the median is used (the mean, by contrast, is sensitive to extreme values)
  • A fast and simple way to obtain a complete dataset

Disadvantages:

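  • Distorts the original variance of the data, typically shrinking it, because every missing value receives the same central value
  • Distorts the correlation between the imputed feature and other features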
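
The effect on correlation can be illustrated with a toy example. The data below is made up for illustration: y is exactly 2 * x wherever it is observed, so the correlation on the observed rows is perfect, and mean imputation weakens it.

```python
import numpy as np
import pandas as pd

# Made-up data: y equals 2 * x on every row where y is observed.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.0, 4.0, np.nan, 8.0, np.nan, 12.0],
})

# Series.corr uses only the rows where both values are present.
corr_before = df["x"].corr(df["y"])

# Fill the gaps in y with the mean of its observed values.
df["y_mean"] = df["y"].fillna(df["y"].mean())
corr_after = df["x"].corr(df["y_mean"])

print(f"correlation before imputation: {corr_before:.3f}")
print(f"correlation after imputation:  {corr_after:.3f}")
```

The filled values ignore x entirely, which is why the relationship between the two columns looks weaker after imputation.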


What real-world examples can you think of where imputing missing data in a feature with mean/median/mode would be the better choice?





