Data visualization is one of the most important skills in Data Science, not only because it is essential for presenting results, but also because it gives you an intuitive way to communicate Data Science and Statistics to key stakeholders in Business, Research and other industries.
R is one of the best (probably the best) programming languages for Data visualization, and in this article I will summarize 15 libraries for Data Viz across different domains.
- ggplot2 - Of course this is the first library. One of the best and most popular Data Viz libraries ever. It is focused entirely on Data Viz and is applicable in almost any methodological setting or domain. Additionally, ggplot2 has some of the most advanced esthetic capabilities of any library, and almost everything can be customized for different purposes. It is also the engine behind a large number of visualizations in other packages. My favorite use - almost every segment of Data Science visualization.
- ggpubr - A ggplot2-based package for publication-ready plots. Essential for most R programmers working on research publications. One of my favorite packages for Research Data Visualization.
- PerformanceAnalytics - Not only is it very good for Time Series plots, this package is also well adapted to Econometrics, Finance, Risk analytics and many other areas.
- rayshader - This package is so advanced that it brings R visualization close to the level of top Graphical design software. Very useful in advanced GIS and geospatial data visualization.
- factoextra - This package produces ggplot2-based visualizations and is fantastic for Exploratory data analysis, PCA, Factor analysis and different types of clustering, with plots that look great. One of the best things about factoextra is that it is very easy and straightforward to use. My favorite use - Visualization in Unsupervised Machine Learning.
- lattice - A package that is very well adapted to what is needed in research-publication Data Analysis, from barplots and boxplots to histograms and regression lines. These visualizations are essential for Research Data Analysts and are very well implemented in lattice. Another area where lattice is great is making subplots (panels) within one plot.
- RColorBrewer - Making sure that even the finest details, like color palettes, are fine-tuned is another aspect of R Data Visualization. One of the best packages for this is RColorBrewer.
- rgl - This is one of my favorite packages for 3D plotting. It can be well adapted to methods like PCA, k-means, t-SNE and many others. Interactive 3D plots are well defined and esthetically great. My favorite use - 3D t-SNE visualization.
- pheatmap - A great package for heatmaps with cluster dendrograms. Esthetically fantastic and fully customizable. Very often my choice for Data visualization in Unsupervised analysis.
- EnhancedVolcano - This package is distributed through Bioconductor. Producing volcano plots is essential for Differential gene expression (DGE) analysis in Bioinformatics. EnhancedVolcano is one of my favorite libraries for DGE visualization.
- pROC - ROC curves are one of my favorite ways to visualize the performance of predictive models, and pROC is at the top of my list.
- bayesplot - Bayesian analysis is very dependent on good visualization. Quantifying uncertainty and using visuals to get an intuitive perspective on the probabilistic aspects is one of the best features of bayesplot. It is used in many other Bayesian packages and is one of my favorite packages for Bayesian Data Viz.
- Shiny - This is one of the best packages ever for R, not only because it is a very important part of a model deployment framework, but also because it offers fantastic interactive Data visualization esthetics and workflows.
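- lavaan - Structural equation modeling (SEM) is highly dependent on visualizing the relationships between variables, and when it comes to SEM in R, lavaan is my number one choice for Data Viz (path diagrams of fitted lavaan models are typically drawn with companion packages such as semPlot).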
- GenVisR - When it comes to Bioinformatics Data visualization, GenVisR is one of my favorites. Very accurate, fully customizable and publication ready, with advanced esthetics.
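
To illustrate the layered customization that the ggplot2 entry above describes, here is a minimal sketch. It assumes ggplot2 is installed and uses R's built-in mtcars dataset. The choice of variables, the "Dark2" ColorBrewer palette and theme_minimal() are illustrative assumptions, not settings recommended in this article.

```r
library(ggplot2)

# Built-in mtcars data; treat the cylinder count as a category (assumed example data)
mtcars_f <- transform(mtcars, cyl = factor(cyl))

# Each "+" adds one layer or setting: points, per-group linear fits,
# a ColorBrewer palette, labels and a theme
p <- ggplot(mtcars_f, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point(size = 2) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_brewer(palette = "Dark2") +
  labs(
    title  = "Fuel economy vs. weight",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  ) +
  theme_minimal()

# Draw the plot on the active graphics device
print(p)
```

Swapping theme_minimal() for another theme, or changing the palette name, changes the look without touching the data or the geometry layers, which is a large part of why so many other packages build on ggplot2.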
Darko Medin, A Data Scientist
Making clinical trials accessible with GCP-Service | CEO | President of AICROS
(2 years ago) Thanks for compiling this list, Darko. I love using data visualization to enhance good storytelling, so I am sure I will find some inspiration :)
International Expert Biostatistician. Digital products, CROs/Academia as a Data Scientist. Meta-analysis expert. Specialized for Cardiology, Oncology Research / other Life Science areas. Machine Learning, AI Researcher.
(2 years ago) In the next edition, scheduled for next week, the main theme will be SEM (Structural equation modeling).