4 simple steps to improve Tableau reports built on billions of rows from multiple data sources
One of the challenges of working with several huge datasets is that the data keeps growing exponentially, which can exhaust system memory or slow down report performance, resulting in a bad user experience. I recently worked with a dataset spanning several SQL tables with billions of rows combined, and here is the 4-step approach I use to keep reports smooth and responsive.
To keep it simple, I have used the Sample Superstore data that ships with Tableau; the process is exactly the same for larger sources.
The steps involved are:
1. Creating data models (tables and relationships)
2. Filtering the data at the source before saving extracts
3. Saving data sources as extracts
4. Limiting the use of filters/calculations in the reports
Time to deep dive!
Step 1: Creating the data model (tables and relationships)
This is the first and most important step when working with multiple large datasets. The sub-steps are:
1.1 Understand the data sources and how to access them (e.g. SQL, Excel, etc.)
1.2 Understand what the data is about by checking the column headers (here it is Superstore sales data, covering orders, returns, people and regions)
1.3 Understand how the different data sources or tables link to each other, so that joins/relationships can be applied (in this example, the Orders table can be used as the master data; it links to the Returns table on Order ID and to the People table on Region)
Understanding these relationships is essential when we connect the data in Tableau and replicate the same model there, as shown in the screenshot below. For readers who prefer code, a small sketch of the same joins follows.
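As an illustration outside Tableau, here is a minimal sketch of the same relationships in Python/pandas. The column names follow the Sample Superstore schema; the tiny inline rows are made up purely for the example.

```python
import pandas as pd

# Tiny made-up rows that mirror the Sample Superstore schema.
orders = pd.DataFrame({
    "Order ID": ["CA-001", "CA-002", "US-003"],
    "Region":   ["West", "East", "West"],
    "Sales":    [261.96, 731.94, 14.62],
})
returns = pd.DataFrame({"Order ID": ["CA-002"], "Returned": ["Yes"]})
people  = pd.DataFrame({"Region": ["West", "East"],
                        "Person": ["Anna", "Chuck"]})

# Orders is the master table: left-join Returns on Order ID
# and People on Region, mirroring the Tableau relationships.
model = (orders
         .merge(returns, on="Order ID", how="left")
         .merge(people,  on="Region",   how="left"))
print(model)
```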
Step 2: Filtering the data when connecting to data sources
Once we have understood the data sources, we can use filters to bring in only the data required for the analysis. For example, the Orders table has 7 years of sales data; if, hypothetically, the client only needs to analyse the current financial year's orders, we can filter the data as it is brought into Tableau, through the steps shared in the screenshot below, which keeps the extract small and the reports responsive.
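Conceptually, the filter is applied before the data ever reaches the report layer. Here is a minimal pandas sketch of that idea; the "Order Date" column matches Superstore, while the fiscal-year start date and the inline rows are assumptions for the example. In practice you would push the same condition into a data source filter or a Custom SQL query.

```python
import pandas as pd

# Assumed fiscal-year start; adjust to the client's calendar.
FY_START = pd.Timestamp("2024-04-01")

# Stand-in for the Orders table; imagine 7 years of rows here.
orders = pd.DataFrame({
    "Order ID":   ["CA-001", "CA-002"],
    "Order Date": pd.to_datetime(["2019-06-01", "2024-06-15"]),
    "Sales":      [261.96, 731.94],
})

# Keep only the current financial year before building the extract.
current_fy = orders[orders["Order Date"] >= FY_START]
print(current_fy)
```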
Step 3: Saving data sources as extracts
This is a critical step that saves developers a huge amount of time and improves the user experience by making the reports more efficient.
Once the data and tables are imported into Tableau, click the Extract option, open a new sheet (which triggers the extract creation), and then save the Tableau file in twb format as shown in the screenshot below. The workbook will then refer to the locally saved extract rather than querying the SQL database (or other databases) directly, which significantly improves report performance.
Note: If the reports require live connections, build them against an extract first and then switch back to the live connection. This saves a lot of time when creating the reports for the first time.
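Extracts can also be produced programmatically, which is handy for scheduled refreshes. Below is a minimal, hedged sketch using Tableau's official Hyper API for Python (tableauhyperapi); the schema, table, and column names are assumptions mirroring the Orders table, not taken from this article.

```python
from tableauhyperapi import (Connection, CreateMode, HyperProcess, Inserter,
                             SchemaName, SqlType, TableDefinition, TableName,
                             Telemetry)

# Assumed Orders-like definition; real columns would match your source.
orders_table = TableDefinition(
    table_name=TableName("Extract", "Orders"),
    columns=[
        TableDefinition.Column("Order ID", SqlType.text()),
        TableDefinition.Column("Region", SqlType.text()),
        TableDefinition.Column("Sales", SqlType.double()),
    ],
)

# Spin up a local Hyper process and write the .hyper extract file.
with HyperProcess(telemetry=Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU) as hyper:
    with Connection(endpoint=hyper.endpoint,
                    database="orders.hyper",
                    create_mode=CreateMode.CREATE_AND_REPLACE) as connection:
        connection.catalog.create_schema(SchemaName("Extract"))
        connection.catalog.create_table(orders_table)
        with Inserter(connection, orders_table) as inserter:
            inserter.add_rows(rows=[["CA-001", "West", 261.96]])
            inserter.execute()
```

The resulting orders.hyper file can then be opened in Tableau Desktop or published, just like an extract created through the UI.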
Step 4: Limiting the use of filters/calculations in the reports
Because the data was already filtered at the source in Step 2, the filters applied on the individual reports can be reduced. If that is not the scenario in your case, create a strategy for reducing the number of filters. Fewer filters make the reports feel closer to real time for the user: every filter change forces Tableau to re-query and redraw the view, which consumes a lot of memory and time. A small sketch of the same idea for calculations follows.
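The same thinking applies to calculated fields: where possible, materialise row-level calculations in the data before the extract is built, instead of having Tableau evaluate them at view time. A minimal sketch, assuming a hypothetical Profit Ratio column that is my own example, not from the article:

```python
import pandas as pd

# Stand-in for the Orders data feeding the extract.
orders = pd.DataFrame({
    "Order ID": ["CA-001", "CA-002"],
    "Sales":    [261.96, 731.94],
    "Profit":   [41.91, 219.58],
})

# Pre-compute the row-level calculation once, before the extract
# is created, so Tableau does not have to evaluate a calculated
# field for every row each time a filter changes.
orders["Profit Ratio"] = orders["Profit"] / orders["Sales"]
print(orders)
```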
Let me know in the comments if you have questions, and what data strategies you use to improve report performance!