Introduction to R for Biologists, Summer 2022
This is the homepage for the introductory R course offered by the Big Data in Biology Summer School through the Center for Biomedical Research Support. All lecture slides, coding worksheets, and worksheet solutions will be posted on this page. More information regarding summer school courses can be found here.
Course information
Time: 1:30PM - 4:30PM | May 31st - June 3rd
Location: FNT 1.104
Class compute servers (see email for your account & password):
These compute servers (aka PODs) are managed by the Biomedical Research Computing Facility. PODs have powerful hardware for handling large data sets, come with many bioinformatics tools pre-installed, are regularly backed up, and feature web-based integrated development environments (IDEs) for both Python and R. You can find out more information about setting up a POD for your own research here and here.
Day 1: Introduction to R programming & the Tidyverse
- Slides (R basics): day1.pdf
- Slides (Tidyverse intro): tidy_intro.pdf
- You can download R from here: https://cran.r-project.org/
- You can download RStudio from here: https://www.rstudio.com/products/rstudio/download/
- R Markdown basics: https://rmarkdown.rstudio.com/authoring_basics.html
- Tidyverse website, tidyr vignettes: https://tidyr.tidyverse.org/
- In-class worksheet 1 (R basics):
- In-class worksheet 2 (Tidying data):
- Blank R Markdown project notebook template:
Day 2: Data visualization with ggplot2
- Slides: day2.pdf
- Tidyverse style guide: https://style.tidyverse.org/index.html
- Tidyverse website, ggplot2 vignettes: https://ggplot2.tidyverse.org/
- Guide to all functions available in ggplot2: https://ggplot2.tidyverse.org/reference/
- Default colors that R recognizes: List of all strings with example output
- Optimize your data viz for your data type: https://serialmentor.com/dataviz/directory-of-visualizations.html
- In-class worksheet:
Day 3: Data manipulation & analysis with dplyr
- Slides: day3.pdf
- Tidyverse website, dplyr vignettes: https://dplyr.tidyverse.org/
- Animated visualizations of different join() functions:
- In-class worksheet:
Day 4: Machine learning & advanced data visualization
- Slides: day4.pdf
- Guide to interactive plots using ggplotly: https://plot.ly/ggplot2/user-guide/
- Guide to ggrepel for dynamic labeling: https://ggrepel.slowkow.com/index.html
- Guide to making panels using patchwork: https://patchwork.data-imaginist.com/
- Interactive visualization of principal component analysis (PCA): http://setosa.io/ev/principal-component-analysis/
- caret documentation: http://topepo.github.io/caret/index.html
- ROC animations: https://github.com/dariyasydykova/open_projects/tree/master/ROC_animation
- In-class worksheet: