Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) is the process of visualizing and summarizing a data set's important characteristics in order to understand it better and extract insights from it.
In statistics, EDA is an approach to analyzing data sets that summarizes their main characteristics, often with visual methods. A statistical model may or may not be used; the primary aim of EDA is to see what the data can tell us beyond the formal modeling or hypothesis-testing task. Refer to CODEC: https://www.codecnetworks.com/
Purpose of EDA
- Check for missing data and other mistakes.
- Gain maximum insight into the data set and its underlying structure.
- Uncover a parsimonious model, one which explains the data with a minimum number of predictor variables.
- Check assumptions associated with any model fitting or hypothesis test.
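The first purpose in the list, checking for missing data and other mistakes, can be sketched in a few lines. This is a minimal illustration that assumes pandas and NumPy are installed; the DataFrame, its column names, and its values are hypothetical and are not taken from any real data set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [23, 35, np.nan, 41, 35],
    "income": [42000, 58000, 61000, np.nan, 58000],
    "city": ["Delhi", "Mumbai", "Delhi", None, "Mumbai"],
})

# Count missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Count fully duplicated rows, a common data-entry mistake.
duplicate_rows = int(df.duplicated().sum())
print("duplicate rows:", duplicate_rows)

# Inspect column types to spot columns parsed with an unexpected dtype.
print(df.dtypes)
```

On a real data set, the same three checks are usually run right after loading the file, before any plotting or modeling.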
Types of Exploratory Data Analysis
- Univariate non-graphical
- Univariate graphical
- Multivariate non-graphical
- Multivariate graphical
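The two non-graphical types can be illustrated with pandas. The sketch below assumes pandas is installed and uses a small hypothetical data set (the column names and numbers are invented): univariate methods summarize one variable at a time, while multivariate methods describe how two or more variables relate.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 70, 74, 83],
    "passed": ["no", "no", "yes", "yes", "yes", "yes"],
})

# Univariate non-graphical: summarize a single variable.
score_summary = df["exam_score"].describe()   # count, mean, std, quartiles
passed_counts = df["passed"].value_counts()   # frequency of each category

# Multivariate non-graphical: relate two or more variables.
correlation = df["hours_studied"].corr(df["exam_score"])      # Pearson correlation
mean_score_by_outcome = df.groupby("passed")["exam_score"].mean()

print(score_summary)
print(passed_counts)
print("correlation:", correlation)
print(mean_score_by_outcome)
```

The graphical counterparts show the same information as plots, for example a histogram for one variable or a scatter plot for two.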
Merging Datasets
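As one possible illustration of merging, the sketch below joins two small tables on a shared key with pandas `DataFrame.merge`. The tables, column names, and values are hypothetical. An outer join with `indicator=True` is chosen here because it also reveals rows that have no match in the other table, which is often worth knowing during EDA.

```python
import pandas as pd

# Hypothetical tables sharing a "customer_id" key.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
})
orders = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": [1, 1, 4],
    "amount": [250.0, 120.0, 90.0],
})

# Outer join keeps rows from both tables; the "_merge" column added by
# indicator=True records whether each row came from the left table,
# the right table, or both.
merged = customers.merge(orders, on="customer_id", how="outer", indicator=True)

# Rows without a partner in the other table.
unmatched = merged[merged["_merge"] != "both"]

print(merged)
print(unmatched)
```

Switching `how` to `"inner"` would keep only customers that have at least one order.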
Graphical Representations
1. Histogram
2. Box Plot
3. Scatter Plot
4. Violin Plot
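The four plot types listed above can be drawn side by side with matplotlib. This is a rough sketch that assumes matplotlib and NumPy are installed; the data is randomly generated for illustration, and the output file name `eda_plots.png` is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, generated only to have something to plot.
rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=10, size=200)
y = 2 * x + rng.normal(scale=15, size=200)

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Histogram: distribution of one numeric variable, grouped into bins.
axes[0, 0].hist(x, bins=20)
axes[0, 0].set_title("Histogram")

# Box plot: median, quartiles, and potential outliers.
axes[0, 1].boxplot(x)
axes[0, 1].set_title("Box Plot")

# Scatter plot: relationship between two numeric variables.
axes[1, 0].scatter(x, y, s=10)
axes[1, 0].set_title("Scatter Plot")

# Violin plot: box-plot-like summary combined with a density estimate.
axes[1, 1].violinplot(x, showmedians=True)
axes[1, 1].set_title("Violin Plot")

fig.tight_layout()
fig.savefig("eda_plots.png")
```

The histogram, box plot, and violin plot are univariate views of `x`, while the scatter plot is a multivariate view of `x` against `y`.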
Handling Missing Values
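Two common ways of handling missing values are dropping the affected rows and filling the gaps with a representative value. The sketch below shows both with pandas. The data is hypothetical, and the choice of median for a numeric column and most frequent value for a categorical column is one reasonable assumption, not the only option.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in both columns.
df = pd.DataFrame({
    "age": [23.0, np.nan, 41.0, 35.0],
    "city": ["Delhi", "Mumbai", None, "Delhi"],
})

# Option 1: drop every row that contains at least one missing value.
dropped = df.dropna()

# Option 2: fill the gaps instead of discarding rows.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())      # numeric: median
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])  # categorical: most frequent

print(dropped)
print(filled)
```

Dropping is simple but discards data, while filling keeps every row at the cost of introducing estimated values, so the better choice depends on how much data is missing and why.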
Heatmap
A heatmap takes a rectangular grid of data as input and assigns each cell a color intensity based on the cell's value. This makes patterns in the data, such as clusters of high or low values, easy to spot at a glance.
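A frequent use of a heatmap in EDA is to display a correlation matrix. The sketch below is one way to do that with pandas and matplotlib's `imshow`, assuming both libraries are installed. The columns `a`, `b`, and `c` are synthetic, and the file name `heatmap.png` is arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: "b" is built from "a", while "c" is independent noise.
rng = np.random.default_rng(1)
df = pd.DataFrame({"a": rng.normal(size=100)})
df["b"] = df["a"] * 0.8 + rng.normal(scale=0.5, size=100)
df["c"] = rng.normal(size=100)

# The rectangular grid fed to the heatmap: pairwise correlations.
corr = df.corr()

fig, ax = plt.subplots()
# Each cell is colored by its value; vmin/vmax fix the scale to [-1, 1].
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(image, ax=ax, label="correlation")
fig.savefig("heatmap.png")
```

The seaborn library offers a `heatmap` function that produces a similar figure with less setup; plain matplotlib is used here only to keep the dependencies small.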
Author: Riya Goel
Mentor: Vishwa Prabhakar Singh