Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) is the process of visualizing and analyzing data to extract insights from it. In other words, EDA summarizes the important characteristics of a data set in order to gain a better understanding of it.

In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Refer to CODEC: https://www.codecnetworks.com/

Purpose of EDA


  • Check for missing data and other mistakes.
  • Gain maximum insight into the data set and its underlying structure.
  • Uncover a parsimonious model, one which explains the data with a minimum number of predictor variables.
  • Check assumptions associated with any model fitting or hypothesis test.


Types of Exploratory Data Analysis


  • Univariate non-graphical
  • Univariate graphical
  • Multivariate non-graphical
  • Multivariate graphical
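
The two non-graphical types can be illustrated with a minimal pandas sketch. The DataFrame, its column names, and its values are hypothetical and invented for this illustration; they are not from the original article.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration only.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "weight_kg": [50.0, 58.0, 66.0, 74.0, 82.0],
    "group": ["a", "b", "a", "b", "a"],
})

# Univariate non-graphical: summarize one variable at a time.
height_summary = df["height_cm"].describe()   # count, mean, std, quartiles
group_counts = df["group"].value_counts()     # frequency of each category

# Multivariate non-graphical: summarize relationships between variables.
correlation = df[["height_cm", "weight_kg"]].corr()    # pairwise correlation
mean_by_group = df.groupby("group")["weight_kg"].mean()

print(height_summary)
print(group_counts)
print(correlation)
print(mean_by_group)
```

The graphical types answer the same questions with plots instead of tables of numbers.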

Merging Datasets
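
The original illustration for this step is not available, so the following is only a sketch of one common way to merge two tables, assuming pandas is the tool in use. The `customers` and `orders` tables and their columns are hypothetical.

```python
import pandas as pd

# Hypothetical tables, invented for this illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer_id": [1, 1, 4],
    "amount": [25.0, 40.0, 15.0],
})

# Inner join: keep only rows whose key appears in both tables.
inner = customers.merge(orders, on="customer_id", how="inner")

# Outer join: keep every row from both tables. indicator=True adds a
# "_merge" column saying whether each row came from the left table,
# the right table, or both.
outer = customers.merge(orders, on="customer_id", how="outer", indicator=True)

print(inner)
print(outer["_merge"].value_counts())
```

Checking the `_merge` column after an outer join is a quick way to find keys that exist in only one of the two data sets.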

Graphical Representations

1. Histogram
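
A histogram groups the values of one numeric variable into bins and shows how many values fall into each. The sketch below assumes matplotlib and NumPy. The data is synthetic, and the bin count and output file name are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, generated only for this illustration.
rng = np.random.default_rng(0)
values = rng.normal(loc=50.0, scale=10.0, size=500)

fig, ax = plt.subplots()
# hist returns the count per bin and the bin edges it used.
counts, bin_edges, _ = ax.hist(values, bins=20, edgecolor="black")
ax.set_xlabel("value")
ax.set_ylabel("count")
ax.set_title("Histogram of a synthetic variable")
fig.savefig("histogram.png")
```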

2. Box Plot
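
A box plot shows the median, the quartiles, and any points beyond the whiskers. The sketch below, which assumes matplotlib and NumPy, computes the same quantities by hand using the common 1.5 × IQR rule and then draws the plot. The sample values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample with one obvious outlier (95).
values = np.array([12, 15, 14, 10, 13, 16, 11, 14, 95], dtype=float)

# Quartiles and the interquartile range (IQR).
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the box are drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

fig, ax = plt.subplots()
box = ax.boxplot(values)  # default whiskers also use 1.5 * IQR
ax.set_ylabel("value")
ax.set_title("Box plot of a hypothetical sample")
fig.savefig("boxplot.png")

print("median:", median, "outliers:", outliers)
```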

3. Scatter Plot
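
A scatter plot places one variable on each axis to show how two numeric variables relate. The sketch below assumes matplotlib and NumPy and uses synthetic data in which `y` depends roughly linearly on `x`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: y is a noisy linear function of x.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 * x + rng.normal(0.0, 1.0, size=100)

# Pearson correlation as a numeric companion to the plot.
r = np.corrcoef(x, y)[0, 1]

fig, ax = plt.subplots()
points = ax.scatter(x, y, alpha=0.7)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Scatter plot (r = {r:.2f})")
fig.savefig("scatter.png")
```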

4. Violin Plot
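
A violin plot combines the summary of a box plot with a density estimate, so the shape of each distribution is visible. The sketch below assumes matplotlib and NumPy. The three groups and their labels are synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Three synthetic groups with different centers and spreads.
rng = np.random.default_rng(2)
groups = [
    rng.normal(0.0, 1.0, size=200),
    rng.normal(2.0, 0.5, size=200),
    rng.normal(1.0, 2.0, size=200),
]

fig, ax = plt.subplots()
# One violin per group, drawn at x positions 1, 2, 3 by default.
parts = ax.violinplot(groups, showmedians=True)
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(["A", "B", "C"])
ax.set_ylabel("value")
ax.set_title("Violin plot of three synthetic groups")
fig.savefig("violin.png")
```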

Handling Missing Values
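
The article does not say which strategy it uses for missing values, so the following is only a sketch of three common pandas options: counting, dropping, and filling. The DataFrame is hypothetical, and filling with the median or a placeholder label is an assumption for illustration, not a recommendation from the original.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps, invented for this illustration only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "city": ["Delhi", "Pune", None, "Delhi"],
    "income": [50000.0, 62000.0, np.nan, np.nan],
})

# 1. Count missing values per column.
missing_per_column = df.isna().sum()

# 2. Drop every row that has at least one missing value.
complete_rows = df.dropna()

# 3. Fill gaps instead of dropping: the median for a numeric column,
#    a placeholder label for a categorical one.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna("unknown")

print(missing_per_column)
print(complete_rows)
print(filled)
```

Dropping rows is simple but discards data (here only one complete row remains), while filling keeps every row at the cost of introducing estimated values.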

Heatmap

A heatmap takes a rectangular grid of data as input and assigns a color intensity to each cell based on that cell's value. This makes patterns such as clusters of high or low values easy to spot at a glance.
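
The idea can be sketched with matplotlib's `imshow`, which maps each cell of a 2-D array to a color. The grid below is a hypothetical correlation-style matrix with made-up values. The original image may have used a different library, such as seaborn's `heatmap`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical 3x3 grid of values in [-1, 1], invented for illustration.
grid = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, -0.3],
    [0.1, -0.3, 1.0],
])
labels = ["a", "b", "c"]

fig, ax = plt.subplots()
# Each cell's color is chosen from the colormap according to its value.
image = ax.imshow(grid, cmap="viridis", vmin=-1.0, vmax=1.0)
fig.colorbar(image, ax=ax, label="value")

ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels)
ax.set_yticks(range(len(labels)))
ax.set_yticklabels(labels)

# Write the numeric value inside each cell.
for i in range(grid.shape[0]):
    for j in range(grid.shape[1]):
        ax.text(j, i, f"{grid[i, j]:.1f}", ha="center", va="center", color="white")

fig.savefig("heatmap.png")
```

The same approach works for a boolean grid such as the result of `isna()` on a DataFrame, which shows where missing values are concentrated.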



Author: Riya Goel

Mentor: Vishwa Prabhakar Singh




 

          
