Combining the Power of R and Tableau for Data Science
Manoj Kumar
Sr. Data Architect / Manager: Data Warehouse | ER / Dimensional Data Modeling | Data Lakehouse | ETL | Cloud | Reports & Dashboards | DBA | SQL | Database | Cross Platform Data Migration
Introduction:
Making beautiful, colorful and intuitive charts, reports and dashboards is Tableau's core strength. R is well known and respected in the Data Science and Machine Learning world for its wide variety of libraries covering all kinds of Machine Learning algorithms.
Combining the two lets us produce visualizations that are accessible and elegant while being backed by statistical models. It also becomes easier to create and reproduce Machine Learning models, which gives us the best of both worlds.
Benefit of using R:
CRAN (the Comprehensive R Archive Network) hosts more than 10,000 R packages, all of which can be easily downloaded and installed. Almost all the popular Machine Learning models can be built with a few commands using prebuilt R packages.
Benefit of using Tableau:
Tableau is an intuitive and useful tool for working with all sorts of data. The main Tableau product, Tableau Desktop, is used to connect to, import and enhance data.
It can connect to a wide range of data sources, including Hadoop, Hive, JSON files, Presto, statistical files, Amazon Redshift, Google BigQuery, Firebird and Web Data Connectors.
Live data connection and data extract features are also offered.
Many data enhancement operations can be done in Tableau, such as adding calculated fields that use arithmetic, string and date manipulations.
It also provides table calculations that can be computed across the table, down the table, or both. Examples include a running total and the difference in profit from the previous quarter.
A user can explore the data visually and publish dashboards to Tableau Server or Tableau Online for online collaboration.
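As a rough illustration of what the two table calculations mentioned above compute, here is a minimal base R sketch. The quarterly profit figures are made-up values for illustration only, and Tableau's own table calculations may differ in how they partition and order the data.

```r
# Hypothetical quarterly profit figures (illustrative values only)
profit <- c(Q1 = 100, Q2 = 120, Q3 = 90, Q4 = 150)

# Running total: cumulative sum down the quarters
running_total <- cumsum(profit)

# Difference from the previous quarter; the first quarter has no predecessor,
# so its value is NA
diff_from_prev <- c(NA, diff(profit))
names(diff_from_prev) <- names(profit)

print(running_total)
print(diff_from_prev)
```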
Integrating R with Tableau:
To use R with Tableau, only two steps are needed:
1. In R, install, load and start Rserve, and install any other package (say mvoutlier) that we want to use from Tableau.
2. In Tableau Desktop, configure the connection to R. To do this, go to Help > Settings and Performance > Manage External Service Connection, then provide the server name and port, and you are all set.
Once we have completed these steps, we can use the mvoutlier package of R inside Tableau calculated fields.
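The R side of the first step might look like the sketch below. It assumes that the Rserve and mvoutlier packages are available from CRAN for your version of R. The `--no-save` argument is an optional choice made here, not something the steps above require. Rserve listens on port 6311 by default, so that is the port to enter in Tableau, together with the server name (for example localhost, assuming R runs on the same machine as Tableau Desktop).

```r
# One-time installation of Rserve and of any package to be called from Tableau
install.packages(c("Rserve", "mvoutlier"))

# Load Rserve and start the server so that Tableau can connect to it
library(Rserve)
Rserve(args = "--no-save")
```

A package such as mvoutlier is then typically loaded with library(mvoutlier) inside the R script that the calculated field sends to Rserve.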
Benefit of combining Tableau with R:
Once Tableau is integrated with R, results can be dynamically recomputed in Tableau without manually having to recreate the data set in R each time.
Without this integration, it is common to recreate the data set in R by hand many times for different kinds of exploratory data analysis.
With R called from Tableau, this exploratory data analysis becomes faster and more interactive.
For example, we can easily perform Linear Regression, Outlier Detection, address Geo-coding (providing the latitude and longitude of addresses), K-Means Cluster analysis, etc. in Tableau.
To do this, we use the Calculated Field feature of Tableau. We create Calculated Fields that call R packages and functions.
These calculated fields can then be used in a Dashboard or Report like any other field. We can apply any Filter, Action, etc. as usual. We can also experiment by changing the calculated fields to see which algorithm or formula works well with the data.
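To make this concrete: Tableau calculated fields call R through the SCRIPT_REAL, SCRIPT_INT, SCRIPT_STR and SCRIPT_BOOL functions, which pass each aggregated field to R as a vector named .arg1, .arg2, and so on. The sketch below simulates in plain R what such a script would compute for a Linear Regression. The field names [Profit] and [Sales] and the sample numbers are hypothetical, chosen only for illustration.

```r
# A Tableau calculated field along these lines (field names are hypothetical):
#   SCRIPT_REAL("fitted(lm(.arg1 ~ .arg2))", SUM([Profit]), SUM([Sales]))
# would send the two aggregates to R as the vectors .arg1 and .arg2.

# Simulate the vectors Tableau would pass (made-up sample values)
.arg1 <- c(12, 19, 33, 41, 48)       # stands in for SUM([Profit])
.arg2 <- c(100, 150, 300, 350, 400)  # stands in for SUM([Sales])

# The R expression inside the script: regress profit on sales and
# return one fitted value per input row
predicted_profit <- fitted(lm(.arg1 ~ .arg2))

print(unname(predicted_profit))
```

Swapping the expression inside the script, for instance to a clustering call, is how different algorithms can be tried against the same view.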
Conclusion:
Integrating R with Tableau is therefore highly beneficial. It can speed up all the phases of a Data Science project, including Descriptive, Exploratory, Inferential and Predictive Data Analysis, and it keeps the analysis easily reproducible.
Copyright © 2018 Manoj Kumar & Associates, All Rights Reserved.
Data Analyst | Python | SQL | Statistics | Research Analyst | 99th percentile in SAT
6 years ago: I am writing a program in Python to make interactive candlestick charts for historical stock data analysis. Would it be possible to integrate my program with Tableau similarly?