107-1 R Basic TA 4
Outline
- pipe operator
- data manipulation
- dplyr
- tidyr
- ggplot2
Pipe operator
Pipe
- a chain of data-processing stages
- mathematical point of view: composite function
- %>% in R (magrittr)
- f(x) => x %>% f
- g( f(x) ) => x %>% f %>% g
- packages: magrittr / dplyr
- Pipes in R Tutorial For Beginners
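A minimal sketch of the two rewrites above, assuming the magrittr package is installed. The vector `x` and the names `nested` and `piped` are made up for illustration.

```r
library(magrittr)  # provides %>% (dplyr re-exports it as well)

x <- c(1, 4, 9, 16)  # illustrative data, not from the slides

# f(x)  is written as  x %>% f
sqrt(x)
x %>% sqrt

# g( f(x) )  is written as  x %>% f %>% g
nested <- sum(sqrt(x))
piped  <- x %>% sqrt %>% sum

# the piped value becomes the first argument; extra arguments follow it
# round(pi, 2)  is written as  pi %>% round(2)
pi %>% round(2)
```

The piped form reads left to right in the order the stages run, which is the main reason to prefer it for longer chains.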
analytic process
- data manipulation
- data visualization
- statistical analysis / modeling
- deployment
data manipulation
- cleaning and preparing the data (often said to take about 80% of the time)
- goal: well-structured data
- Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from one "raw" form into another format, with the intent of making it more appropriate and valuable for downstream purposes such as analytics.
dplyr
- install.packages("dplyr") / library(dplyr)
- basic verbs of data manipulation (e.g. filter, select, mutate, arrange, summarise)
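A short sketch of the common dplyr verbs chained with the pipe, assuming dplyr is installed. It uses the built-in `iris` data. The threshold, the `ratio` column, and the name `iris_summary` are arbitrary choices for illustration.

```r
library(dplyr)

iris_summary <- iris %>%
  filter(Sepal.Length > 5) %>%                      # keep rows that match a condition
  select(Species, Sepal.Length, Sepal.Width) %>%    # keep only some columns
  mutate(ratio = Sepal.Length / Sepal.Width) %>%    # add a new column
  group_by(Species) %>%                             # one group per species
  summarise(n = n(), mean_ratio = mean(ratio)) %>%  # collapse each group to one row
  arrange(desc(mean_ratio))                         # sort the result

iris_summary
```

Each verb takes a data frame as its first argument and returns a data frame, which is why the verbs chain cleanly with %>%.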
tidyr
- install.packages("tidyr") / library(tidyr)
- gather / spread (similar to pivot tables in Excel)
- separate / unite
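A small sketch of the four functions, assuming tidyr is installed. The `scores` and `dates` tables are made-up examples. Newer tidyr versions recommend `pivot_longer()` / `pivot_wider()` in place of `gather()` / `spread()`, but the older functions still work.

```r
library(tidyr)

# hypothetical wide table: one row per student, one column per exam
scores <- data.frame(
  name    = c("Amy", "Bob"),
  midterm = c(80, 65),
  final   = c(90, 70),
  stringsAsFactors = FALSE
)

# gather: wide -> long (exam columns become key/value pairs)
long <- gather(scores, key = "exam", value = "score", midterm, final)

# spread: long -> wide (the inverse of gather)
wide <- spread(long, key = exam, value = score)

# separate: split one column into several
dates  <- data.frame(date = c("2018-10-01", "2018-10-15"),
                     stringsAsFactors = FALSE)
parts  <- separate(dates, date, into = c("year", "month", "day"), sep = "-")

# unite: paste several columns back into one
joined <- unite(parts, "date", year, month, day, sep = "-")
```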
ggplot2
grammar of graphics

Main parameters
- data: iris, diamonds, (your own data), ...
- aesthetic: x-y, shape, color, ...
- geometry: point, line, bar, boxplot, ...
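A minimal sketch showing how the three parameters map onto a ggplot2 call, assuming ggplot2 is installed. The choice of `iris` columns and the names `p` and `p_box` are for illustration only.

```r
library(ggplot2)

# data: iris
# aesthetic: x, y and color mapped to columns
# geometry: points
p <- ggplot(data = iris,
            mapping = aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

print(p)

# same data, different aesthetic mapping and geometry: a boxplot per species
p_box <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()
```

Changing only the geometry layer (for example `geom_point()` to `geom_line()`) keeps the data and aesthetics unchanged, which is the core idea of the grammar of graphics.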
By a136489