Hadley Wickham of RStudio
Albert Anthony D. Gavino, MBA
Book Writer | Data Science | Cloud Solutions
This man deserves to be on the list (aside from being the Chief Scientist at RStudio) thanks to his background in statistics and the R packages he has created, such as tidyr, ggplot2, dplyr, plyr, and reshape2.
According to Wickham's "tidy" approach, each variable should be a column, each observation should be a row, and each type of observational unit should be a table.
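To make the three tidy rules concrete, here is a minimal sketch using tidyr's pivot_longer(). The small table of countries, years, and case counts is invented purely for illustration; only the function call itself comes from tidyr.

```r
library(tidyr)

# Hypothetical "wide" table: the year values are spread across column names,
# so the variable "year" is not a column of its own.
wide <- data.frame(
  country = c("A", "B"),
  `1999` = c(10, 20),
  `2000` = c(15, 25),
  check.names = FALSE
)

# Tidy version: one column per variable (country, year, cases)
# and one row per observation (a country in a given year).
long <- pivot_longer(
  wide,
  cols = c(`1999`, `2000`),
  names_to = "year",
  values_to = "cases"
)

print(long)
```

The wide table has two rows and the tidy one has four, because each country-year pair is now its own observation.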
Wickham was named a Fellow by the American Statistical Association in 2015 for "pivotal contributions to statistical practice through innovative and pioneering research in statistical graphics and computing."
You can try it out in RStudio. By the way, aes is short for aesthetics. I had to look that one up! I thought it was a statistical variable of some sort.
library(ggplot2)
ggplot(mpg, aes(displ, hwy, colour = class)) + geom_point()
The mpg cars data set ships with ggplot2, and here we plot engine displacement (displ) against highway mileage (hwy), coloured by type of vehicle (class).
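Since aes() is where columns get mapped to visual properties, it can help to spell the mapping out with named arguments. The sketch below is the same scatter plot with explicit x, y, and colour mappings plus readable labels added through labs(). The label wording is my own choice, not something the data set provides.

```r
library(ggplot2)

# Same plot as above, with each aesthetic named explicitly:
# x position <- displ, y position <- hwy, point colour <- class.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  )

print(p)
```

Moving colour = class outside aes() and setting it to a fixed value such as "blue" would colour every point the same, which is a quick way to see the difference between mapping a variable and setting a constant.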
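The data dictionary is as follows (I am doing this backwards, explaining last what the variables mean). These attributes match the UCI Auto MPG data set rather than the mpg data set bundled with ggplot2, whose columns include displ, hwy, and class: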
- mpg: continuous
- cylinders: multi-valued discrete
- displacement: continuous
- horsepower: continuous
- weight: continuous
- acceleration: continuous
- model year: multi-valued discrete
- origin: multi-valued discrete
- car name: string (unique for each instance)
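If you want to check which variables the mpg data set bundled with ggplot2 actually contains, a quick sketch like the following lists its columns and their types. It uses only base R functions on the data frame that ggplot2 provides.

```r
library(ggplot2)

# Column names of the mpg data frame bundled with ggplot2
print(names(mpg))

# Compact overview: column types and the first few values of each
str(mpg)

# Number of rows (cars) and columns (variables)
print(dim(mpg))
```

Running ?mpg in the R console also opens the help page, which documents what each column means.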
ggplot2 is over ten years old and is probably a lot of help to R data scientists.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Bert loves to write Data Science articles and is always up for some coffee and a good discussion about Stochastic Gradient Descent, L1 and L2 regularisation or Lasso and Ridge Regression, and of course the sum of squares.
Marketing Data Scientist | Learner
4 years ago: Nice, I first heard of him when I was taking one of my basic R training courses online via www.datacamp.com a few years back.