Machine Learning - Data visualization with R (II)

This article continues the series on techniques for communicating data or information by encoding it in graphs.


Scatterplot Matrix

A scatter plot draws each observation as a point on a Cartesian plane, with one variable on each axis. A scatterplot matrix arranges one scatter plot for each pair of variables in a grid.

# Load the required packages, installing any that are missing first.
# RColorBrewer provides brewer.pal(), used in the last 3D example.
pkgs <- c("ggplot2", "ggExtra", "gclus", "car", "hexbin",
  "latticeExtra", "rgl", "RColorBrewer")
to_install <- setdiff(pkgs, rownames(installed.packages()))
if (length(to_install) > 0) install.packages(to_install)
invisible(lapply(pkgs, library, character.only = TRUE))

# 4 quantitative variables.
pairs(~mpg+disp+drat+wt,data=mtcars, main="Simple Scatterplot Matrix")
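# 4 quantitative variables and 1 categorical variable.
# scatterplotMatrix() replaces scatterplot.matrix(), which is no longer available in current versions of car.
scatterplotMatrix(~mpg+disp+drat+wt|cyl, data=mtcars,
  main="Three Cylinder Options")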
# 4 quantitative variables.
dta <- mtcars[c(1,3,5,6)] 
dta.r <- abs(cor(dta)) 
dta.col <- dmat.color(dta.r, cm.colors(10))
dta.o <- order.single(dta.r) 
cpairs(dta, dta.o, panel.colors=dta.col, gap=.5,
  main="Variables Ordered and Colored by Correlation" )
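A scatterplot matrix over p variables holds one panel for each distinct pair of variables, so the four variables used above give 4 x 3 / 2 = 6 distinct pairs (each drawn twice, mirrored across the diagonal). The following is a minimal sketch in base R that lists those pairs and ranks them by absolute correlation, the same quantity the gclus example uses for ordering and coloring. The names vars, var_pairs, r_abs and pair_strength are illustrative and do not come from the article.

```r
# Variables shown in the simple scatterplot matrix above.
vars <- c("mpg", "disp", "drat", "wt")

# Every unordered pair of variables: one row per pair.
var_pairs <- t(combn(vars, 2))
n_panels <- nrow(var_pairs)

# Absolute pairwise correlations, as in the gclus example.
r_abs <- abs(cor(mtcars[vars]))

# Look up the correlation of each pair by row/column name
# and sort the pairs from strongest to weakest relationship.
pair_strength <- data.frame(
  x = var_pairs[, 1],
  y = var_pairs[, 2],
  abs_cor = r_abs[var_pairs]
)
pair_strength <- pair_strength[order(-pair_strength$abs_cor), ]
print(pair_strength)
```

Reading the table next to the matrix makes it easier to spot which panels deserve a closer look.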


Simple Scatter Plots

# 2 quantitative variables.
ggplot(mtcars, aes(x=wt, y=mpg))+geom_point()
# 3 quantitative variables.
ggplot(mtcars, aes(x=wt, y=mpg, size=cyl)) +
  geom_point(shape=21, fill="red")
# 2 quantitative variables and up to 3 categorical variables
# (here the same variable, cyl, is mapped to color, fill and shape).
# Columns are referenced by name inside aes(), not as mtcars$cyl.
ggplot(mtcars, 
  aes(x=wt, y=mpg, 
    color=factor(cyl), 
    fill=factor(cyl), 
    shape=factor(cyl))) +
  geom_point()
# 3 quantitative variables and up to 3 categorical variables.
ggplot(mtcars, 
  aes(x=wt, y=mpg, size=hp,
    color=factor(cyl), 
    fill=factor(cyl), 
    shape=factor(cyl))) +
  geom_point()
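The plots above treat cyl as a categorical variable even though mtcars stores it as a number, which is why it is converted to a factor before being mapped to color, fill or shape. One possible alternative, sketched below, is to convert the column once in a copy of the data and then refer to it by name. The data frame name cars is an assumption made for this example, and the plot is only drawn if ggplot2 is installed.

```r
# cyl is stored as a number in mtcars.
stopifnot(is.numeric(mtcars$cyl))

# Convert it once, in a copy, so every later plot sees a categorical variable.
cars <- transform(mtcars, cyl = factor(cyl))
cyl_levels <- levels(cars$cyl)
print(cyl_levels)

# Draw the plot only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(cars, aes(x = wt, y = mpg, color = cyl, shape = cyl)) +
    geom_point()
  print(p)
}
```

Converting up front also keeps the legend title short ("cyl" instead of the full conversion expression).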


More 2D Scatter Plots

# Scatter plots with fitted lines (one linear fit per group) and marginal rugs.
ggplot(mtcars, 
  aes(x=wt, y=mpg, 
    color=factor(cyl), 
    fill=factor(cyl), 
    shape=factor(cyl))) +
  geom_point()+
  geom_smooth(method="lm", se=TRUE)+
  geom_rug()
# Scatter plots with normal confidence interval ellipses.
ggplot(mtcars, 
  aes(x=wt, y=mpg, 
    color=factor(cyl), 
    fill=factor(cyl), 
    shape=factor(cyl))) +
  geom_point()+
  stat_ellipse(type="norm",level=0.9)
# Density curves
ggplot(mtcars, aes(x=wt, y=mpg, colour=factor(cyl))) +
  stat_density_2d()
# Density bins
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_bin2d(bins=10)
# Density hexagons (requires the hexbin package)
ggplot(mtcars, aes(x=wt, y=mpg)) +
  stat_bin_hex(bins=10)
# Scatter plot with level plot
levelplot(cyl~wt*mpg, mtcars, 
  panel = panel.levelplot.points, cex = 1) +
  layer_(panel.2dsmoother(..., n = 200))
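The first plot in this section draws one straight line per cylinder group. As a rough illustration of what those lines represent, the sketch below fits the same kind of model, mpg as a linear function of wt, separately for each value of cyl using base R. The names by_cyl, fits and coefs are chosen for this example only, and the sketch does not reproduce the confidence bands.

```r
# Split the data into one data frame per cylinder count.
by_cyl <- split(mtcars, mtcars$cyl)

# Fit mpg ~ wt separately within each group.
fits <- lapply(by_cyl, function(d) lm(mpg ~ wt, data = d))

# One row per group: intercept and slope of the fitted line.
coefs <- t(sapply(fits, coef))
print(coefs)
```

Comparing the slopes shows how strongly fuel economy drops with weight within each group, which is the same information the plotted lines encode visually.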


3D Scatter Plots

In 3D scatter plots, each point represents three coordinates.

# 3 quantitative variables
scatter3d(
  x=iris$Sepal.Length,y=iris$Sepal.Width,z=iris$Petal.Length, 
  surface=FALSE,
  xlab="SL",ylab="SW",zlab="PL")
# 3 quantitative variables and 1 categorical variable (points colored by species)
scatter3d(
  x=iris$Sepal.Length,y=iris$Sepal.Width,z=iris$Petal.Length, 
  groups=iris$Species,
  surface=FALSE,
  xlab="SL",ylab="SW",zlab="PL")
# 3 quantitative variables with a fitted regression plane (surface=TRUE is the default).
scatter3d(x=iris$Sepal.Length, y=iris$Sepal.Width,z=iris$Petal.Length,
  xlab="SL",ylab="SW",zlab="PL")
# 3 quantitative variables and 1 categorical variable with confidence interval ellipsoid.
scatter3d(x=iris$Sepal.Length, y=iris$Sepal.Width, 
  z=iris$Petal.Length, 
  groups=iris$Species, surface=FALSE, grid=FALSE, ellipsoid=TRUE,
  surface.col=brewer.pal(n=3, name="Set1"),
  xlab="SL",ylab="SW",zlab="PL")
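The plane in the third example summarizes how one variable changes with the other two. The sketch below fits such a plane directly with lm() so its coefficients can be inspected. It assumes, as I understand the car documentation, that scatter3d treats y (here Sepal.Width) as the response and x and z as predictors; the names plane, plane_coefs, centre and height_at_centre are illustrative.

```r
# Least-squares plane: Sepal.Width as a linear function of the other two variables.
# Assumption: this mirrors the default linear surface drawn by scatter3d.
plane <- lm(Sepal.Width ~ Sepal.Length + Petal.Length, data = iris)
plane_coefs <- coef(plane)
print(plane_coefs)

# Height of the plane above the centre of the predictor cloud.
centre <- data.frame(
  Sepal.Length = mean(iris$Sepal.Length),
  Petal.Length = mean(iris$Petal.Length)
)
height_at_centre <- unname(predict(plane, newdata = centre))
print(height_at_centre)
```

Because the model includes an intercept, the plane passes through the point of means, so the height at the centre equals the mean of Sepal.Width.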

To be continued...

Share your experience and provide feedback!
