STATS 32: Introduction to R for Undergraduates

Autumn 2018/2019

Course Description

This short course runs during weeks two through five of the quarter. It is recommended for undergraduate students who want to use R in the humanities or social sciences, and for students who want to learn the basics of R programming. The goal of the course is to familiarize students with R's tools for data analysis. Lectures will be interactive, with a focus on learning by example, and assignments will be application-driven. No prior programming experience is needed. Topics covered include basic data structures, file I/O, data transformation and visualization, simple statistical tests, and some useful R packages. Prerequisite: must be an undergraduate student. Priority is given to non-engineering students. A laptop is required for use in class.

For the course syllabus, click here.


Classes

TTh 12:00 to 1:20pm, 380-380Y (Weeks 2-5 only)

Instructor

Kenneth Tay, Office Hours Friday 10am-12pm, Sequoia Hall Rm 207 (Weeks 1-6 only)

Assignments

Final project & proposal (graded)

The only graded assignments for this class are the final project (80%) and the project proposal (20%). Click here for more details on these assignments.

Programming questions (not graded)

Programming is one of those things that you can't learn just by listening to lectures: you have to practice (and practice and practice)! However, since this is a 1-credit class, I don't want to have graded assignments on top of the project. Instead, after each session I will release a few questions to test your understanding of that session's material, and I will release the answers a few days later. Responses to these questions will NOT be graded.


Course materials

There is no textbook for this class. That said, much of the material was heavily inspired by "R for Data Science" (R4DS) by Garrett Grolemund and Hadley Wickham, which is available online for free here. It is very comprehensive and well-written, and I recommend it highly to anyone who wants to do data science in R!

Course materials will be added progressively to the table below. To save a file, right-click its link and choose "Save Link As..."

Each session below lists what to do before class, the in-class material, and optional after-class material.

Session 1 (2 Oct)
  Before class:
    Install R and RStudio on your laptop. Instructions: Mac/Windows
    Required reading: Variable types.
    Optional reading:
  In-class material:
    Slides
    Code
  After class (optional):
    Programming questions
    Further R4DS reading: Ch 4, 20

Session 2 (4 Oct)
  Before class:
    Install the fueleconomy package by entering install.packages("fueleconomy") in the RStudio console.
    Required reading: Summary statistics.
  In-class material:
    Slides
    Code
  After class (optional):
    Programming questions

Session 3 (9 Oct)
  Before class:
    Install the ggplot2 package by entering install.packages("ggplot2") in the RStudio console.
    Optional reading:
  In-class material:
    Slides
    Code
  After class (optional):
    Programming questions
    Further R4DS reading: Ch 3, 28

Session 4 (11 Oct)
  Before class:
    Install the dplyr and nycflights13 packages in RStudio.
  In-class material:
    Slides
    Code
  After class (optional):
    Programming questions
    Further R4DS reading: Ch 5, 18

Session 5 (16 Oct)
  Before class:
    Make sure the readr and tidyr packages are installed on your machine. Check this by running library(readr): if it loads, it is installed already. If you get an error, install the package using install.packages("readr").
  In-class material:
    Slides
    Code
    Drought data
    Drought analysis v1.R
    Drought analysis v2.R
    Drought analysis v3.R
  After class (optional):
    Programming questions
    Further R4DS reading: Ch 6, 8, 11

Session 6 (18 Oct)
  Before class:
    Make sure the knitr package is installed on your machine. Check this by running library(knitr): if it loads, it is installed already. If you get an error, install the package using install.packages("knitr").
    Make sure that R Markdown is working on your machine. Check this by following these instructions.
    Optional reading:
  In-class material:
    Slides
    Code
    Elections data
    Elections analysis.Rmd (input)
    Elections analysis.html (output)
  After class (optional):
    Programming questions
    Further R4DS reading: Ch 27, 30

Session 7 (23 Oct)
  Before class:
    Make sure the maps package is installed on your machine. Check this by running library(maps): if it loads, it is installed already. If you get an error, install the package using install.packages("maps").
    Optional reading:
  In-class material:
    Slides
    Code
    County map data
    Elections starter.Rmd
  After class (optional):
    Programming questions

Session 8 (25 Oct)
  Before class:
    Optional reading:
  In-class material:
    Slides
    Code
    Spotify data
    Spotify starter.Rmd
    Spotify final.Rmd (input)
    Spotify final.html (output)
  After class (optional):
    Programming questions
    Further R4DS reading: Ch 23
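
The "check whether it loads, otherwise install" step described above for readr, knitr, and maps can also be done for several packages at once. The following is a minimal sketch, not part of the course materials: the helper name missing_packages is hypothetical, and it uses base R's requireNamespace() in place of library() because requireNamespace() returns TRUE or FALSE rather than stopping with an error.

```r
# Hypothetical helper (an assumption, not from the course materials):
# returns the names of the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages that Sessions 5 to 7 ask you to have ready.
needed <- c("readr", "tidyr", "knitr", "maps")

to_install <- missing_packages(needed)
if (length(to_install) == 0) {
  message("All packages are already installed.")
} else if (interactive()) {
  # In an interactive session (such as the RStudio console), install what is missing.
  install.packages(to_install)
} else {
  # In a non-interactive run, only report what is missing.
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Running this in the RStudio console before class should have the same effect as following the per-package instructions one at a time.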