Course: Data Visualization in R with ggplot2

Introducing ggplot2

- [Instructor] ggplot2 is the most popular and fully featured data visualization package for the R programming language. The power of ggplot2 comes from the fact that it allows you to build and customize graphics in exactly the manner you'd like them to appear, using a concept known as the grammar of graphics. Now, if you've ever struggled with creating a visualization in Excel because you couldn't figure out how to tweak the graphic to appear exactly the way you'd like, ggplot2 is for you. It allows you to easily create simple visualizations while also permitting you to define the precise details of a visualization as specifically as you'd like.

ggplot2 is part of a collection of R packages designed for data science, known as the Tidyverse. Curated by Hadley Wickham, the Chief Scientist at Posit, the Tidyverse packages provide R developers with a set of tools that follow the entire data analysis lifecycle. Let's talk briefly about a few of the components of the Tidyverse and how they fit into the data analysis lifecycle.

The readr package contains a set of functions designed to import data into R in various forms. In this course, we'll use the read_csv() function from the readr package to read data files consisting of comma-separated values, but readr has many more functions available to handle tab-separated files, fixed-width files, and other common delimited text formats.

The tibble package defines a data structure called the tibble that makes it easy to manipulate data in R. It builds on the data frame structure used in base R, providing a similar data structure that's a little simpler to work with.

The dplyr package contains a set of functions to help you with data manipulation. You'll find functions that select the variables you'd like to include in your analysis, filter the rows included in your tibble, sort data, create new variables, and summarize values using aggregate functions.
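To make the readr and dplyr steps just described a little more concrete, here is a minimal sketch. The inline CSV text and its column names (model, class, cty, hwy) are made up for illustration and are not the course's data; it also assumes readr 2.0 or later, where I() marks a string as literal data rather than a file path.

```r
library(readr)
library(dplyr)

# Hypothetical data for illustration; in practice you would pass a file path
# such as read_csv("my_file.csv") instead of inline text.
csv_text <- "model,class,cty,hwy
a4,compact,18,29
civic,compact,24,32
f150,pickup,13,17
tacoma,pickup,15,20
camry,midsize,21,31
malibu,midsize,19,27
"

# read_csv() returns a tibble; I() tells readr the string is literal data.
cars <- read_csv(I(csv_text), show_col_types = FALSE)

summary_tbl <- cars %>%
  select(class, cty, hwy) %>%              # choose the variables to keep
  filter(hwy > 20) %>%                     # keep only rows with hwy above 20
  mutate(avg_mpg = (cty + hwy) / 2) %>%    # create a new variable
  group_by(class) %>%
  summarize(mean_mpg = mean(avg_mpg)) %>%  # aggregate within each class
  arrange(desc(mean_mpg))                  # sort from highest to lowest

print(summary_tbl)
```

Each dplyr verb takes a tibble and returns a tibble, which is why the steps chain together with the pipe.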
The tidyr package provides functions that help you create tidy data by making wide data sets long and making long data sets wide. You can learn more about these other Tidyverse packages in my course, "Data Wrangling in R."

In this course, we're most concerned with the ggplot2 package. ggplot2 is a set of functions that implement the grammar of graphics and allow you to visualize your data using scatter plots, bar and column graphs, line graphs, and basically any other type of visualization that you can imagine. ggplot2 is especially valuable because it's integrated with all of the other components of the Tidyverse. You can read data in using the readr package, manipulate it with dplyr and tidyr, and then visualize it with ggplot2. All of that data passes easily between packages because they're designed to be compatible. You'll see that when we get into some examples later in this course.
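As a rough sketch of how data passes from tidyr into ggplot2, the example below reshapes a small wide tibble into long form and then plots it. The data values and the names mileage_wide, road_type, and mpg are hypothetical, chosen only to illustrate the flow; they are not taken from the course files.

```r
library(tibble)
library(tidyr)
library(ggplot2)

# Hypothetical wide data: one column per road type.
mileage_wide <- tibble(
  class = c("compact", "midsize", "pickup"),
  cty   = c(21, 20, 14),
  hwy   = c(30.5, 29, 18.5)
)

# Make the wide data long: one row per class and road type.
mileage_long <- pivot_longer(
  mileage_wide,
  cols      = c(cty, hwy),
  names_to  = "road_type",
  values_to = "mpg"
)

# Map columns to aesthetics, then add a column-graph layer.
p <- ggplot(mileage_long, aes(x = class, y = mpg, fill = road_type)) +
  geom_col(position = "dodge") +
  labs(x = "Vehicle class", y = "Miles per gallon", fill = "Road type")

print(p)
```

The long layout is what lets a single fill mapping distinguish the two road types, which is one reason tidy data and ggplot2 fit together well.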

Contents