R (Programming Language)- A Comprehensive Tool for Data Analytics & Statistical Computing
Photo Source: R-project.org

Introduction to R

R is a popular programming language and software environment used for statistical computing, data analysis, and graphical presentation. Its most common use is to analyze and visualize data. [1]

R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language and environment [7], which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. [2] As an interpreted language, R has a native command-line interface. In addition, several third-party graphical user interfaces are available, such as RStudio (an integrated development environment) and Jupyter (a notebook interface).

The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes

  • an effective data handling and storage facility
  • a suite of operators for calculations on arrays, in particular matrices
  • a large, coherent, integrated collection of intermediate tools for data analysis
  • graphical facilities for data analysis and display either on-screen or on hardcopy
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities [3]
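As a rough illustration of the facilities listed above, the following sketch uses only base R. The names `m`, `total`, and `factorial_rec` are hypothetical and chosen just for this example.

```r
# Operators on arrays and matrices
m <- matrix(1:4, nrow = 2)    # filled column by column: rows are (1, 3) and (2, 4)
m_doubled <- m * 2            # element-wise multiplication
m_product <- m %*% m          # matrix multiplication

# A loop with a conditional: add up the even numbers from 1 to 10
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) {
    total <- total + i
  }
}

# A user-defined recursive function (hypothetical name)
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

print(m_product)
print(total)              # 30
print(factorial_rec(5))   # 120
```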

Why Use R?

  • It is a great resource for data analysis, data visualization, data science and machine learning
  • It provides many statistical techniques (such as statistical tests, classification, clustering and data reduction)
  • It is easy to draw graphs in R, such as pie charts, histograms, box plots, and scatter plots
  • It works on different platforms (Windows, Mac, Linux)
  • It is open-source and free
  • It has a large and active user community
  • It has many packages (libraries of functions) that can be used to solve different problems [4]
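To make the points about statistical tests and graphs more concrete, here is a minimal sketch that relies only on base R and the built-in `mtcars` dataset. The choice of dataset and variables is an assumption made for illustration, not something taken from the sources cited above.

```r
# Fuel economy (mpg) for automatic (am == 0) and manual (am == 1) cars
auto_mpg   <- mtcars$mpg[mtcars$am == 0]
manual_mpg <- mtcars$mpg[mtcars$am == 1]

# Two-sample t-test (t.test() uses the Welch variant by default)
result <- t.test(auto_mpg, manual_mpg)
print(result$p.value)

# A histogram and a box plot; in RStudio these appear in the Plots pane
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")
boxplot(mpg ~ am, data = mtcars,
        names = c("Automatic", "Manual"),
        ylab = "Miles per gallon")
```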

Applications of R

  • Statistical Analysis: R is primarily used for performing complex statistical computations. It offers a vast array of statistical tests, models, and techniques that are essential for data analysis.
  • Data Visualization: R is well-known for its data visualization capabilities. Packages like ggplot2 allow users to create a wide range of static and interactive graphs, charts, and plots.
  • Data Mining: R is used in data mining to discover patterns and relationships in large datasets. It supports techniques such as clustering, classification, and regression.
  • Bioinformatics: R is extensively used in bioinformatics for analyzing and visualizing biological data, such as genomic sequences and protein structures.
  • Machine Learning: R provides tools for implementing machine learning algorithms, including decision trees, random forests, and neural networks, to predict outcomes and classify data.
  • Finance and Economics: R is used in finance for time series analysis, risk assessment, and portfolio optimization. Economists use R for econometric modeling and forecasting.
  • Social Sciences: Researchers in sociology, psychology, and other social sciences use R for survey analysis, psychometrics, and text mining.
  • Environmental Science: R is applied in environmental science for analyzing climate data, modeling ecosystems, and assessing environmental impacts.
  • Pharmaceutical Industry: In the pharmaceutical industry, R is used for clinical trial data analysis, drug development, and safety monitoring.
  • Academic Research: R is a popular tool in academia for conducting research across various disciplines, providing tools for data analysis, visualization, and reproducibility.
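Two of the techniques mentioned above, regression and clustering, can be sketched with base R alone. This is an illustrative example on the built-in `mtcars` and `iris` datasets; the variable names and the choice of three clusters are assumptions.

```r
# Regression: model fuel economy as a linear function of car weight
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Clustering: group the iris measurements into three clusters with k-means
set.seed(42)   # k-means starts from random centers, so fix the seed
clusters <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

# Compare the clusters found with the known species labels
print(table(clusters$cluster, iris$Species))
```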

Syntax of R

To output text in R, wrap it in single or double quotes and pass it to print(). RStudio is a commonly used code editor for writing R code.

Example:

INPUT:

> print("Hello, World!")

OUTPUT:

[1] "Hello, World!"
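As a small sketch of the quoting rule, the snippet below shows that single and double quotes produce the same string, and that one kind of quote can be nested inside the other. The variable names are hypothetical.

```r
# Single and double quotes create identical character strings
a <- "Hello, World!"
b <- 'Hello, World!'
same <- identical(a, b)
print(same)   # TRUE

# One kind of quote can be used inside the other
quoted <- 'She said "hi"'
print(nchar(quoted))

# cat() writes the text without the [1] index or surrounding quotes
cat(a, "\n")
```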

Built-in Functions in R

  • print() - Displays an R object on the R console
  • min() / max() - Return the minimum and the maximum of a numeric vector, respectively
  • sum() - Calculates the sum of a numeric vector
  • mean() - Calculates the mean of a numeric vector
  • range() - Returns the minimum and maximum of a numeric vector together, as a two-element vector
  • str() - Displays the structure of an R object
  • ncol() - Returns the number of columns of a matrix or a data frame
  • length() - Returns the number of items in an R object, such as a vector, a list, or a matrix
  • plot() - Draws a graph or chart of the data [6]


> v <- c(1, 3, 0.2, 1.5, 1.7)

> print(v)

[1] 1.0 3.0 0.2 1.5 1.7

> sum(v)

[1] 7.4

> mean(v)

[1] 1.48

> length(v)

[1] 5
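The remaining functions from the list can be tried in the same way. The sketch below reuses the vector `v` from the session above and adds a small data frame `df`, which is a hypothetical example invented here.

```r
v <- c(1, 3, 0.2, 1.5, 1.7)

lowest  <- min(v)     # 0.2
highest <- max(v)     # 3
both    <- range(v)   # minimum and maximum together
print(both)

# A small data frame to demonstrate str(), ncol(), and length()
df <- data.frame(id = 1:3, score = c(90, 85, 77))
str(df)               # prints the structure: 3 observations of 2 variables
print(ncol(df))       # 2
print(length(df))     # for a data frame, length() counts the columns

# plot() draws the values of v against their positions
plot(v, type = "b", main = "Values of v")
```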

RStudio Code & Output:

For the following screenshot, RStudio is used as the code editor. Code can be written in an R script file as well as directly in the Console. Text output appears in the Console pane, while graphical output such as plots, graphs, and charts appears in the Plots pane.

RStudio Code INPUT & OUTPUT

R Packages:

The tidyverse is a collection of open-source packages for the R programming language. Its core packages provide functionality to model, transform, and visualize data. [5] To use a package, a programmer first installs it with install.packages() and then loads it in each session with library().

Example: > install.packages("ggplot2")
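A package only needs to be installed once, but it has to be loaded in every new session. One common pattern, sketched here with a hypothetical helper named `ensure_package`, is to install a package only when it is missing.

```r
# Hypothetical helper: install a package only if it is not already available
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads from CRAN; needs an internet connection
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call does not download anything
ok <- ensure_package("stats")
print(ok)

# Typical usage (commented out because it may trigger a download):
# ensure_package("ggplot2")
# library(ggplot2)
```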

Eight of the core tidyverse packages are:

  1. ggplot2 - ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics
  2. dplyr - dplyr provides a grammar of data manipulation, providing a consistent set of verbs that solve the most common data manipulation challenges
  3. tidyr - tidyr provides a set of functions that help you get to tidy data
  4. readr - readr provides a fast and friendly way to read rectangular data (like csv, tsv, and fwf)
  5. purrr - purrr enhances R’s functional programming (FP) toolkit by providing a complete and consistent set of tools for working with functions and vectors
  6. tibble - tibble is a modern re-imagining of the data frame, keeping what time has proven to be effective, and throwing out what it has not
  7. stringr - stringr provides a cohesive set of functions designed to make working with strings as easy as possible
  8. forcats - forcats provides a suite of useful tools that solve common problems with factors
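As an illustration of how two of these packages are typically used, the sketch below summarises the built-in `mtcars` dataset with dplyr and draws a scatter plot with ggplot2. It assumes both packages are already installed and skips the corresponding step if they are not. The variable names are hypothetical.

```r
summary_tbl <- NULL

# dplyr: average fuel economy and number of cars per cylinder count
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  summary_tbl <- mtcars %>%
    group_by(cyl) %>%
    summarise(mean_mpg = mean(mpg), n = n())
  print(summary_tbl)
}

# ggplot2: scatter plot of weight against fuel economy
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
  print(p)
}
```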


Excel is commonly used for data cleaning, data mining, and data analysis in support of business decisions. Beyond Excel, other important data analytics tools include SQL, Tableau, Power BI, and Python. R, however, is a comprehensive tool for analyzing and visualizing data, and it can handle on its own many tasks that would otherwise be spread across several tools. Like C or Python, R has its own syntax. R programming is applicable to statistical analysis, business research, social science, bioinformatics, business and finance, and other important areas.


[1] W3Schools, R Introduction: https://www.w3schools.com/R/r_intro.asp

[2] R-project.org, What is R?: https://www.r-project.org/about.html

[3] R-project.org, What is R?: https://www.r-project.org/about.html

[4] W3Schools, R Introduction: https://www.w3schools.com/R/r_intro.asp

[5] Tidyverse, Tidyverse packages: https://www.tidyverse.org/packages/

[6] DataCamp, Using Functions in R Tutorial: https://www.datacamp.com/tutorial/functions-in-r-a-tutorial

[7] DataScientest, S Language: https://datascientest.com/en/s-language-everything-you-need-to-know-about-this-language



