Handling Missing Data
We do not always have all the data we would like for an analysis. When values are missing, how can we work around the gaps and still conduct an objective analysis?
Five Percent Threshold:
Option 1: Drop the missing values when they make up a small enough share of the data. The rule of thumb here is five percent of the total number of observations: with a sample size of 1,000, the threshold is 0.05 * 1,000 = 50, so if a column has 50 or fewer missing values, the rows containing them can be dropped.
Option 2: Substitute the missing values with the median. Why the median? It is far less sensitive to skewness and outliers than the mean, making it a robust choice for filling missing numeric values.
By carefully considering these options, you can maintain the objectivity of your analysis and avoid distorting your results.
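Here is a minimal sketch of both options in Python with pandas. It assumes the two options are combined into one rule: drop rows for columns at or below the five percent threshold, and median-fill numeric columns above it. The function name `handle_missing`, the parameter `threshold_frac`, and the toy columns `age` and `income` are made up for illustration.

```python
import numpy as np
import pandas as pd


def handle_missing(df: pd.DataFrame, threshold_frac: float = 0.05) -> pd.DataFrame:
    """Drop rows for sparsely missing columns; median-fill numeric columns above the threshold."""
    threshold = threshold_frac * len(df)   # e.g. 0.05 * 1,000 = 50
    missing = df.isna().sum()              # missing count per column
    out = df.copy()

    # Option 1: columns with few missing values -> drop the affected rows.
    cols_to_drop = list(missing[(missing > 0) & (missing <= threshold)].index)
    if cols_to_drop:
        out = out.dropna(subset=cols_to_drop)

    # Option 2: columns above the threshold -> fill numeric ones with the median.
    # The median is taken from the rows that remain after Option 1.
    # Non-numeric columns are left untouched in this sketch.
    for col in missing[missing > threshold].index:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
    return out


# Toy data: 100 rows, so the threshold is 0.05 * 100 = 5 missing values.
df = pd.DataFrame({
    "age": np.arange(100, dtype=float),
    "income": np.arange(100, dtype=float) * 10,
})
df.loc[0:2, "age"] = np.nan        # 3 missing  (<= 5): rows are dropped
df.loc[50:69, "income"] = np.nan   # 20 missing (> 5): filled with the median

clean = handle_missing(df)
print(len(clean), int(clean.isna().sum().sum()))
```

On real data, check how the missing values are distributed before dropping anything, since the five percent figure is a rule of thumb rather than a guarantee.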
#21DaysLinkedlnChallengeWithAmbibola Abimbola Arowolo
#DataThatMatters
Molecular Biologist | Microbiologist | Data Analyst | Genomics & Bioinformatics
3 weeks ago: I will take these steps the next time I have missing data. Thank you for sharing.