June 29, 2020
In this worksheet, we will use the library tidyverse:
library(tidyverse)
pivot_longer(), pivot_wider()
Consider the following two data sets, male_haireyecolor and female_haireyecolor. The data sets record the occurrence of hair and eye color phenotype combinations in a class of statistics students. Use head() to preview these data sets; are they tidy?
# download male data set
male_haireyecolor <- read_csv("https://rachaelcox.github.io/classes/datasets/male_haireyecolor.csv")
head(male_haireyecolor)
## # A tibble: 4 x 5
## Hair Brown Blue Hazel Green
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Black 32 11 10 3
## 2 Brown 53 50 25 15
## 3 Red 10 10 7 7
## 4 Blond 3 30 5 8
# download female data set
female_haireyecolor <- read_csv("https://rachaelcox.github.io/classes/datasets/female_haireyecolor.csv")
head(female_haireyecolor)
## # A tibble: 4 x 5
## Hair Brown Blue Hazel Green
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Black 36 9 5 2
## 2 Brown 66 34 29 14
## 3 Red 16 7 7 7
## 4 Blond 4 64 5 8
The data sets are not tidy, because the column names Brown, Blue, Hazel, and Green are values of a single variable, eye color, rather than variables themselves. The following versions of the tables are tidy:
# your R code here
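One possible approach uses pivot_longer(). The sketch below is an illustration under stated assumptions, not necessarily the intended solution: it rebuilds the male table with tribble() from the preview printed above so that it runs without downloading the CSV, and the column names EyeColor and Count are assumed names chosen for this example.

```r
library(tidyverse)

# Rebuild the male table from the head() output shown above
# (assumption: these four rows are the complete data set).
male_haireyecolor <- tribble(
  ~Hair,   ~Brown, ~Blue, ~Hazel, ~Green,
  "Black",     32,    11,     10,      3,
  "Brown",     53,    50,     25,     15,
  "Red",       10,    10,      7,      7,
  "Blond",      3,    30,      5,      8
)

# Gather every column except Hair into two columns:
# the old column names go to EyeColor, the cell values go to Count.
male_tidy <- male_haireyecolor %>%
  pivot_longer(
    cols = -Hair,
    names_to = "EyeColor",  # assumed name for the new key column
    values_to = "Count"     # assumed name for the new value column
  )

male_tidy
```

The same call should work on female_haireyecolor, since both tables share the same column layout. Each row of the result then describes one hair and eye color combination together with its count.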
Consider the following data set persons, which contains information about the sex, weight, and height of 200 individuals:
persons <- read_csv("https://rachaelcox.github.io/classes/datasets/persons.csv")
head(persons)
## # A tibble: 6 x 3
## subject indicator value
## <dbl> <chr> <chr>
## 1 1 sex M
## 2 1 weight 77
## 3 1 height 182
## 4 2 sex F
## 5 2 weight 58
## 6 2 height 161
Is this data set tidy? Can you rearrange it so that it has one column each for subject, sex, weight, and height?
The data set is not tidy, because neither indicator nor value is a variable: indicator holds variable names, and value holds the corresponding values. The variables are subject, sex, weight, and height. The following version of the table is tidy:
# your R code here
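One possible approach uses pivot_wider(). The sketch below is a minimal illustration, not necessarily the intended solution: it rebuilds only the six rows shown in the preview above with tribble() so that it runs without downloading the CSV, and the name persons_tidy is an assumed name for the result. Because the value column is stored as character, the sketch also converts weight and height to numbers, which is an extra step beyond the reshaping itself.

```r
library(tidyverse)

# Rebuild the first six rows from the head() output shown above
# (the full data set has 200 subjects; this is only a small stand-in).
persons <- tribble(
  ~subject, ~indicator, ~value,
         1, "sex",      "M",
         1, "weight",   "77",
         1, "height",   "182",
         2, "sex",      "F",
         2, "weight",   "58",
         2, "height",   "161"
)

# Spread the indicator/value pairs into one column per variable.
persons_tidy <- persons %>%
  pivot_wider(names_from = indicator, values_from = value) %>%
  # value was character, so weight and height arrive as text;
  # convert them to numeric for later calculations.
  mutate(
    weight = as.numeric(weight),
    height = as.numeric(height)
  )

persons_tidy
```

After reshaping, each row corresponds to one subject, with sex, weight, and height in their own columns.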