DATA VISUALIZATION: TABLEAU
Diksha Arvind Chaudhary
Senior Data Engineer | ETL developer | Business Intelligence | Currently on EAD looking for jobs
In this article, we will talk about data visualization, the different types of visualization, and when each should be used. We will also look at data visualization tools such as Tableau and Microsoft Power BI, and cover some basics for getting started with Tableau.
What is Data Visualization?
Data visualization is the graphical representation of data and information using visual elements such as maps and charts. Several data visualization tools are available that make it easy to see trends and outliers in data.
Data visualization shows relationships in the data through images, so that decision makers can see the analytics, grasp difficult concepts, and identify patterns.
With the rise of big data, we are expected to respond to issues more rapidly, and a visual summary of information makes it easier to find trends in the data.
In college, we are often asked to present data output visually because it helps the audience understand the information more easily than looking at a thousand rows in a spreadsheet. Insights are very difficult to grasp when the data is not represented visually.
Different types of visualizations
Line chart: Line charts are well suited to showing a trend over time. The x-axis is usually time (months, quarters, years) and the y-axis is a quantity.
Area chart: A variation of the line chart, also used to show a trend over time. It shades the area under the line, which emphasizes the size of the values. The shading should be semi-transparent so that overlapping areas remain visible.
Bar chart: Bar charts compare values across categories and can also show change over time, like line and area charts. They work especially well when there are many values to compare.
Pie chart: The best choice for visualizing percentages of a whole. The proportional slices give a clear picture of the breakdown in percentages.
Scatter plot: Used for finding correlation. If the points trend in a certain direction, there is a relationship between the x-axis and y-axis values. If the points are scattered with no pattern, there is no apparent relationship.
Bubble chart: A variation of the scatter plot in which each point is drawn as a bubble whose size encodes a further value. Not all data can be effectively visualized in this type of chart.
Gauge: Gauge charts show where a value falls within an interval, like a speedometer or thermometer. They represent a single value, such as total revenue year-to-date compared to last year. Multiple gauges can be displayed next to each other to compare different intervals.
Map: The best representation when the data contains location elements. For example, a map can display sales in different parts of a country, with lighter and darker shades depending on the numbers, which can be very useful for a business.
Heat map: A matrix representation in which each cell is colored according to its value. Colors are interpreted more easily than numbers, and the color scale often consists of red, yellow, and green.
These are the most common visualization types; many others exist.
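The relationship that a scatter plot shows visually can also be expressed as a number. The sketch below computes the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the sample figures are made up for illustration, and the coefficient only measures linear relationships. Values near 1 or -1 correspond to points that follow a clear line, and values near 0 to points with no linear pattern.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the factor 1/n cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up sample data: the values a scatter plot would place on each axis.
ad_spend = [1, 2, 3, 4, 5]    # x-axis
sales = [2, 4, 6, 8, 10]      # y-axis, rises in step with ad_spend
returns = [10, 8, 6, 4, 2]    # y-axis, falls as ad_spend rises

r_up = pearson_r(ad_spend, sales)
r_down = pearson_r(ad_spend, returns)
print(f"ad_spend vs sales:   r = {r_up:.2f}")
print(f"ad_spend vs returns: r = {r_down:.2f}")
```

A strong correlation does not by itself mean that one variable causes the other, so the number is best read alongside the chart.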
Data Visualization tools
Many data visualization tools are available on the market that help businesses with project management and business insights, including Tableau, Microsoft Power BI, Quire, Plotly, Visually, Sisense, QlikView, and more.
My personal favorite is Tableau. I would like to discuss some of the differences between Tableau and Microsoft Power BI, as both are very popular tools.
Products: Tableau offers a variety of visualization products, including Tableau Desktop, Server, Public Server, Public, Prep Builder, Mobile, and Reader. Microsoft Power BI also has several products, including Desktop, Service (online), Premium, Report Server, Pro, and Mobile.
Customers: Tableau has around 86K customers, and Microsoft Power BI has a similar or larger number.
Visualization quality: Tableau offers very powerful visualizations, with animation and a large number of options. By comparison, Power BI offers fewer options beyond the basics, but they are more than sufficient for most users.
Amount of data: Tableau can handle massive amounts of data, whereas Power BI, which builds on Excel and Access, works best with smaller amounts of data.
Data sources: Tableau supports more data sources and has an in-depth list of connectors. Power BI provides sufficient data source connectivity.
Cost: Tableau's large feature set comes at a price. Tableau may cost around $70 per user per month, while Power BI is around $10 per user per month.
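To see what the price gap means for a team, the approximate figures above can be turned into a yearly estimate. This is only a rough sketch: it assumes both prices are per user per month, uses a hypothetical team of 25 users, and ignores license tiers, discounts, and server costs.

```python
# Approximate list prices quoted in the article, in USD per user per month.
# Assumption: both figures are monthly and apply to every user equally.
TABLEAU_PER_USER_MONTH = 70
POWER_BI_PER_USER_MONTH = 10


def annual_cost(users, per_user_month):
    """Return the yearly license cost for a team at a flat monthly rate."""
    return users * per_user_month * 12


team_size = 25  # hypothetical team size

tableau_total = annual_cost(team_size, TABLEAU_PER_USER_MONTH)
power_bi_total = annual_cost(team_size, POWER_BI_PER_USER_MONTH)

print(f"Tableau:    ${tableau_total:,} per year")
print(f"Power BI:   ${power_bi_total:,} per year")
print(f"Difference: ${tableau_total - power_bi_total:,} per year")
```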
TABLEAU BASICS
Tableau has many desirable features and provides solutions for all kinds of businesses, industries, and departments. It does not require any complex setup to get started.
Tableau Desktop is easy to install and includes many features for getting started with data analysis.
Tableau Server is a centralized location where an organization manages all of its published dashboards and data sources. Administration tasks such as changing permissions, adding tags, and scheduling refreshes are done on Tableau Server.
Tableau uses several file types. Let's look at them:
- Tableau workbook (.twb): Stores the visualizations without the source data.
- Tableau data source (.tds): Stores the connection information for a data source, such as server details, along with field metadata. It does not contain the data itself.
- Tableau bookmark (.tbm): Stores a single worksheet so that it can be shared easily or reused in another Tableau workbook.
- Packaged workbook (.twbx): Contains the workbook along with a copy of the local data files.
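A packaged workbook is a zip archive, so its contents can be inspected with ordinary tools. The sketch below uses Python's standard `zipfile` module to separate the workbook from the bundled files. The helper names and the stand-in archive, including `Sales.twb` and `Data/orders.csv`, are made up for illustration so that the example runs without a real .twbx file.

```python
import io
import zipfile


def list_packaged_workbook(source):
    """Return the names of the files bundled inside a .twbx archive.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


def split_contents(names):
    """Separate workbook files (.twb) from everything else in the archive."""
    workbooks = [n for n in names if n.lower().endswith(".twb")]
    others = [n for n in names if not n.lower().endswith(".twb")]
    return workbooks, others


# Build a small stand-in archive in memory (hypothetical file names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Sales.twb", "<workbook/>")
    archive.writestr("Data/orders.csv", "order_id,amount\n1,100\n")
buffer.seek(0)

names = list_packaged_workbook(buffer)
workbooks, data_files = split_contents(names)
print("Workbooks:    ", workbooks)
print("Bundled files:", data_files)
```

With a real file, the same function can be called with a path, for example `list_packaged_workbook("my_report.twbx")`, where the file name is again hypothetical.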
Diverse Datasets
Tableau works with diverse datasets, and it is fine if users don't know how the data is stored. Some common types of data source are described here.
Spreadsheets: Records are stored as single rows of data in a flat structure, e.g. Microsoft Excel and Google Sheets.
Relational databases: Data is stored in multiple tables with unique identifiers, and users pull data from them using Structured Query Language (SQL).
Cloud data: Some organizations store their data in the cloud, for example on Amazon Web Services or Microsoft Azure.
Tableau also connects to statistical files such as R data files, spatial files such as .kml or .shp (Esri shapefiles), and many more.
After the data connection is established, Tableau automatically identifies each field's data type and role. Users can verify the types and change them if needed. These changes do not affect the underlying data source. Instead, they are written to a metadata file known as a Tableau Data Source, or .tds file.
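To make the relational case concrete, here is a minimal sketch of pulling data from two related tables with SQL. It uses Python's built-in `sqlite3` module and an in-memory database. The `customers` and `orders` tables and their rows are made up for illustration, and the example is independent of how Tableau itself issues queries.

```python
import sqlite3

# In-memory database with two hypothetical tables linked by customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 50.0);
    """
)

# Join the tables on the shared identifier and total the orders per customer.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total_amount
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
conn.close()

for name, total in rows:
    print(f"{name}: {total}")
```

The unique identifier `customer_id` is what lets the two tables be combined, which is the main difference from the flat structure of a spreadsheet.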
Building Blocks of Data
In Tableau, quantitative fields are referred to as Measures and qualitative fields as Dimensions.
Quantitative field (Measure): Numerical data that is used for calculations.
Qualitative field (Dimension): Describes the data and categorizes it. Dimensions tell us what, who, and when by slicing the quantitative data.
When Tableau connects to the data, it categorizes the fields into these two building blocks.
These are the basics of Tableau. Once your data is available on a worksheet, you are ready to start applying visualization techniques.
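The split between the two building blocks can be illustrated in a few lines of Python. The sketch below is not Tableau's actual detection logic. It applies one simple assumed rule, that numeric fields are Measures and everything else is a Dimension, and then slices a measure by a dimension. The function names and sample rows are made up for illustration.

```python
from collections import defaultdict
from numbers import Number


def classify_fields(rows):
    """Label each field 'Measure' if all its values are numeric, else 'Dimension'.

    Simplified rule for illustration only; a real tool applies more nuance
    (for example, numeric ID columns are usually better treated as dimensions).
    """
    roles = {}
    for field in rows[0]:
        values = [row[field] for row in rows if row[field] is not None]
        is_numeric = bool(values) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        roles[field] = "Measure" if is_numeric else "Dimension"
    return roles


def total_by(rows, dimension, measure):
    """Slice a measure by a dimension: sum the measure for each dimension value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


# Made-up sample records.
rows = [
    {"Region": "East", "Category": "Furniture", "Sales": 200.0, "Quantity": 2},
    {"Region": "West", "Category": "Technology", "Sales": 450.0, "Quantity": 3},
    {"Region": "East", "Category": "Technology", "Sales": 300.0, "Quantity": 1},
]

roles = classify_fields(rows)
sales_by_region = total_by(rows, "Region", "Sales")
print(roles)
print(sales_by_region)
```

Here `Region` answers the "where" question and `Sales` supplies the number being summed, which mirrors how a dimension slices a measure in a chart.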