Hadley Wickham of RStudio

This man deserves to be on the list (aside from being the Chief Data Scientist of RStudio) for his background in statistics and for creating R packages such as tidyr, ggplot2, dplyr, plyr, and reshape2.

According to Wickham's "tidy" approach, each variable should be a column, each observation should be a row, and each type of observational unit should be a table.
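The three rules above can be illustrated with a small reshaping example. This is only a minimal sketch: the `wide` data frame and its values are made up for illustration, and it assumes the tidyr package is installed.

```r
library(tidyr)

# Hypothetical "wide" table: one row per country, one column per year.
# The year columns hold values of the same variable (cases), so it is not tidy.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 22),
  check.names = FALSE
)

# Tidy form: each variable (country, year, cases) is a column,
# and each observation (one country in one year) is a row.
tidy <- pivot_longer(
  wide,
  cols = -country,
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

After reshaping, the two year columns become a single `year` variable, so the table has one row per country and year.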

Wickham was named a Fellow by the American Statistical Association in 2015 for "pivotal contributions to statistical practice through innovative and pioneering research in statistical graphics and computing."

tidyverse and ggplot2

You can try it out in RStudio. By the way, aes is short for aesthetics. I had to look that one up, as I thought it was a statistical variable of some sort.

library(ggplot2)
ggplot(mpg, aes(displ, hwy, colour = class)) + geom_point()

mpg

The mpg cars data set comes bundled with ggplot2, and the plot uses the engine displacement (displ) and highway mileage (hwy) variables, coloured by type of vehicle (class).
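The same plot can be written with the aesthetic mappings named explicitly, which makes it easier to see what aes is doing. This is a sketch rather than the one true way to write it; the axis and legend labels are my own wording, not something defined by ggplot2.

```r
library(ggplot2)

# Map columns of mpg to visual properties (aesthetics):
# x position, y position, and point colour.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  # Labels below are illustrative wording, not package defaults.
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  )

print(p)
```

Keeping the plot in a variable such as `p` also makes it possible to add further layers later with `+`.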

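The data dictionary is as follows (in reverse order, with what the variables mean coming last). Note that these attribute names match the UCI Auto MPG data set rather than the columns of ggplot2's mpg, which uses shorter names such as displ, hwy and class.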

  • mpg: continuous 
  • cylinders: multi-valued discrete 
  • displacement: continuous 
  • horsepower: continuous 
  • weight: continuous 
  • acceleration: continuous 
  • model year: multi-valued discrete 
  • origin: multi-valued discrete 
  • car name: string (unique for each instance)
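To compare a data dictionary like this against the mpg data frame used in the plot, one could list its columns and their types. A minimal sketch, assuming ggplot2 is installed; `col_types` is just a name chosen here.

```r
library(ggplot2)

# Column names and storage types of the mpg data frame bundled with ggplot2.
col_types <- vapply(mpg, function(col) class(col)[1], character(1))
print(col_types)

# Number of observations (rows) and variables (columns).
print(dim(mpg))
```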

ggplot2 is over ten years old and is probably a great help to R data scientists.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++

Bert loves to write Data Science articles and is always up for some coffee and a good discussion about Stochastic Gradient Descent, L1 and L2 regularisation (Lasso and Ridge Regression), and, of course, sum of squares.

Gil Villasotes

Marketing Data Scientist | Learner

4 years ago

Nice, I first heard of him when I was taking one of my basic R training courses online via www.datacamp.com a few years back.
