Data Visualisation: What do you want to Achieve?
While working on many kinds of data analytics projects over the last two decades, I have often wondered which visualization type is best for business users, and why. What factors should we keep in mind when deciding on one? Just a few days ago, while speaking with a close friend, I realized that there is still a gap: we should not need a certification to decide on basic visualization types. The trick is to select the one that will best represent your data’s message and story.
This is a short take on different visualization types; I am covering six of them in this write-up. There are a few others that I have used to good effect in the past, such as Radar, Population Pyramid, Geospatial, Histogram and Venn Diagram.
Before we start, let us take note of a few relevant questions: what do you want to achieve?
Now let us get going with a few of the chart/graph types.
Type: Bar Chart; Functions: Comparative, Patterns; Also known as: bar graph or column graph
A bar chart displays categorical data with rectangular bars whose length or height corresponds to the value of each data point.
The classic Bar Chart uses either horizontal or vertical bars (column chart) to show discrete, numerical comparisons across categories. One axis of the chart shows the specific categories being compared and the other axis represents a discrete value scale.
Bar charts use the length of each bar to show the differences between values. Because of this, the value axis of a bar chart should always start at zero. When it does not, users risk misjudging the difference between data values.
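To see why the zero baseline matters, here is a minimal sketch in Python. The function name `apparent_ratio` and the sales figures are made up for illustration; the idea is simply to compare the ratio of the drawn bar lengths with the ratio of the underlying values.

```python
def apparent_ratio(larger: float, smaller: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar lengths when the value axis starts at `baseline`."""
    if baseline >= smaller:
        raise ValueError("baseline must be below the smaller value")
    return (larger - baseline) / (smaller - baseline)


# Made-up values for two categories (illustrative only)
sales_a, sales_b = 100.0, 90.0

honest = apparent_ratio(sales_a, sales_b, baseline=0.0)      # axis starts at zero
truncated = apparent_ratio(sales_a, sales_b, baseline=80.0)  # axis starts at 80

print(f"Ratio of the values:      {sales_a / sales_b:.2f}")
print(f"Bar ratio, axis from 0:   {honest:.2f}")
print(f"Bar ratio, axis from 80:  {truncated:.2f}")
```

With the axis starting at 80, the first bar is drawn twice as long as the second even though the values differ by only about 11%.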
Type: Bubble Chart; Functions: Correlation, Comparisons, Data over time, Distribution, Patterns, Proportions, Relationships
A Bubble Chart is a multi-variable graph that is a cross between a Scatterplot and a Proportional Area Chart. A bubble chart consists of a series of values that are plotted on an x-axis and y-axis, with each axis representing a variable and each value represented as a dot. The third variable value is then used to proportionally scale each bubble or dot. Bubble charts often include an independent variable, such as years of education, a dependent variable, such as annual income, and a proportional variable, such as population. When the dots are plotted against these two axes, bubble charts communicate the strength, type, and proportion of the relationship that exists between these variables.
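One practical detail when scaling bubbles by the third variable: a common convention (an assumption here, not something stated above) is to make the bubble's area, rather than its radius, proportional to the value, so that large values are not visually exaggerated. A minimal Python sketch, with a hypothetical `bubble_radii` helper and made-up population figures:

```python
import math


def bubble_radii(values, max_radius=40.0):
    """Scale radii so that bubble AREA (not radius) is proportional to value."""
    largest = max(values)
    if largest <= 0:
        raise ValueError("need at least one positive value")
    # Area grows with radius squared, so take the square root of the share.
    return [max_radius * math.sqrt(max(v, 0) / largest) for v in values]


# Made-up third variable (e.g. population) for three data points
populations = [1_000_000, 4_000_000, 16_000_000]
radii = bubble_radii(populations)

for pop, r in zip(populations, radii):
    print(f"population {pop:>10,} -> radius {r:.1f}")
```

A value four times larger gets a radius only twice as large, which keeps the areas in the same proportion as the data.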
Type: Line Graph; Functions: Data over time, Patterns, and (when grouped) Comparisons; Also known as: Line Chart
Line graphs, or line charts, are a simple but effective staple for representing time-series data. They are visually similar to scatterplots, but the data points are ordered by time and joined by line segments. This allows for quick observation of features like increases (when the line goes up), decreases (when the line goes down), and volatility (when the line moves up and down erratically).
While the simple line graph shown represents a single dataset, more complex line graphs may overlay several lines to represent different data. This is useful for spotting correlations or deviation. A common example of a line graph in action is the measure of stock market behavior or resource costs over time, e.g. the price of gold over several years.
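The features a line graph makes visible can also be summarised numerically. The sketch below is only an illustration: the `describe_series` helper and the price series are made up, and using the standard deviation of the step-to-step changes as a volatility measure is my assumption, one of several reasonable choices.

```python
from statistics import pstdev


def describe_series(values):
    """Summarise the rises, falls and volatility of a time-ordered series."""
    # Change between each pair of consecutive points
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return {
        "changes": changes,
        "rises": sum(1 for c in changes if c > 0),
        "falls": sum(1 for c in changes if c < 0),
        # Assumption: volatility = population std. dev. of the changes
        "volatility": pstdev(changes),
    }


# Made-up prices at five consecutive time points
prices = [100, 104, 103, 110, 108]
summary = describe_series(prices)
print(summary)
```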
Type: Scatter Plot; Functions: Patterns, Relationships, Correlation, Distribution; Also known as: scatterplot, scatter graph, scatter chart, scattergram, scatter diagram
A scatterplot displays the relationship between two variables on an x- and y-axis. Each item of data is shown as a single point, creating the chart’s visual ‘scatter’ effect. When there are three interrelated variables (i.e., if there is a z-axis), 3D scatterplots are also possible.
Scatterplots are best used for large datasets where time is not a significant factor. For instance, a simple scatterplot might measure people’s weight against height. This would help identify any correlation between the two measures. However, because other factors affect the data (e.g., people’s weights are also related to their diet) scatterplots are best for inferring relationships between variables rather than drawing firm conclusions. Nevertheless, they are an excellent tool for hypothesis creation.
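As a companion to eyeballing a scatterplot, the strength of a linear relationship can be estimated with the Pearson correlation coefficient. This is a minimal sketch, not a full analysis: the `pearson_r` helper is written out by hand and the height and weight figures are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up sample: heights in cm and weights in kg
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 88]

r = pearson_r(heights_cm, weights_kg)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship, but as noted above it only supports a hypothesis; it does not rule out other factors such as diet.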
A common variant of the scatterplot is the bubble chart. Displaying different-sized circles (rather than single points), bubble charts represent three dimensions of data, rather than the usual two.
Type: Pie Chart; Functions: Comparative, Part to a whole, Proportions; Also known as: Circle Chart
Before using a pie chart, consider using a bar chart or displaying numeric values directly for improved usability.
Another visualization you may remember from school is the pie chart. While pie charts are similar to bar charts in that they represent categorical data, this is where the similarities end. The main difference (besides how they look) is that bar charts represent numerous categories of data, while pie charts represent a single variable, broken down into percentages or proportions.
Each ‘slice of the pie’ in a pie chart is proportional to the quantity it contributes to the whole, i.e. the entire circle. For this reason, pie charts are best suited to data that is split into about five or six categories. Add more than that and the chart quickly becomes too complex to represent the data effectively.
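The proportionality of the slices is simple arithmetic: each category's share of the total becomes its share of the 360-degree circle. A minimal sketch, with a hypothetical `pie_slices` helper and made-up regional figures:

```python
def pie_slices(values):
    """Map each category to (percent of whole, slice angle in degrees)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {
        name: (100 * value / total, 360 * value / total)
        for name, value in values.items()
    }


# Made-up sales by region
sales = {"North": 50, "South": 30, "East": 20}
slices = pie_slices(sales)

for name, (percent, angle) in slices.items():
    print(f"{name:<6} {percent:5.1f}%  {angle:6.1f} degrees")
```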
Type: Stacked Bar Chart; Functions: Comparative
A stacked bar chart is a bar chart that includes subgroups of data in each bar.
The length of each bar communicates the total value of a group, which is the sum of its subgroup values, and the length of each subgroup segment represents that subgroup's individual value. Stacked bar charts are best used to compare data between groups and between subgroups.
Simple Stacked Bar Graphs place each segment's value after the previous one. The total value of the bar is all the segment values added together. This makes them ideal for comparing the total amounts across each group/segmented bar.
100% Stacked Bar Graphs show the percentage-of-the-whole of each group and are plotted by the percentage of each value to the total amount in each group. This makes it easier to see the relative differences between quantities in each group.
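The two variants differ only in the numbers that get plotted. Here is a minimal sketch of both calculations; the helper names `stack_segments` and `to_percent` and the segment values are made up for illustration.

```python
def stack_segments(segments):
    """Simple stacked bar: each segment starts where the previous one ended."""
    positions, running_total = [], 0.0
    for value in segments:
        positions.append((running_total, running_total + value))
        running_total += value
    return positions


def to_percent(segments):
    """100% stacked bar: each segment as a percentage of its group's total."""
    total = sum(segments)
    if total <= 0:
        raise ValueError("group total must be positive")
    return [100 * value / total for value in segments]


# Made-up subgroup values for one group (one bar)
group = [30, 50, 20]

print(stack_segments(group))    # (start, end) of each segment along the bar
print(to_percent([10, 30]))     # percentage share of each segment
```

The end of the last segment in `stack_segments` is the bar's total, which is what the simple stacked variant lets you compare across groups; `to_percent` discards that total and keeps only the relative shares.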
Now for the questions that were mentioned right at the beginning, let me try to align the charts against each one of them:
I hope you were able to align the questions to the type of graph that was eventually selected. Remember, you will come across many other chart types that look really good, but business users care more about the data and how it is represented, so the simplest visualization is usually the best. Don't overcomplicate your visualizations.
For further learning and to explore more visualizations, do refer to this link: https://datavizcatalogue.com/index.html
Graphs used above in the post are from: https://datavizcatalogue.com/index.html