This short course runs for weeks one through five of the quarter. It is recommended for undergraduate students who want to use R in the humanities or social sciences and for students who want to learn the basics of R programming. The goal of the short course is to familiarize students with R's tools for data analysis. Lectures will be interactive with a focus on learning by example, and assignments will be application-driven. No prior programming experience is needed. Topics covered include basic data structures, file I/O, data transformation and visualization, simple statistical tests, and some useful R packages. Prerequisite: undergraduate student. Priority given to non-engineering students. A laptop is required for use in class.
For the course syllabus, click here.
Kenneth Tay, Office Hours Friday 10am-12pm, Sequoia Hall Rm 105 (Weeks 1-6 only)
Note: For Week 1 ONLY, office hours will be Friday 10:30am-12:30pm (same location).
The only graded assignments for this class are the final project (80%) and the project proposal (20%). Click here for more details on these assignments.
Programming is one of those things that you can't learn just by listening to lectures: you have to practice (and practice and practice)! However, since this is a 1-credit class, I don't want to have graded assignments on top of the project. To that end, after each session I will release a few questions to test your understanding of that session's material, and will release the answers a few days later. Responses to these questions will NOT be graded.
The Piazza forum is a place for you to ask (and answer) questions about the course. This includes questions about both content and course logistics. I won't be checking the forum very often, so it is best suited to questions that your classmates might be able to answer. Assignment extensions will NOT be entertained on Piazza.
To access the forum, click on this link. You can get the access code from the first announcement on Canvas.
There is no textbook for this class. Having said that, much of the material for this class was heavily inspired by "R for Data Science" by Garrett Grolemund and Hadley Wickham, which is available online for free here. It is very comprehensive and well-written, and I recommend it highly to anyone who wants to do data science in R!
Course materials will be added progressively to the table below. To save material, right-click and click on "Save Link As..."
Session No. | Before class | In-class material | After class (optional) |
---|---|---|---|
Session 1 (24 Sep) Introduction to R | Required reading: Optional reading: | Install R and RStudio on your laptop (Mac / Windows). Install relevant R packages (instructions here). Slides; Lab | Programming questions; Programming solutions; Further R4DS reading: Ch 4, 6 |
Session 2 (26 Sep) Basic R objects | Required reading: | Slides; Lab | Programming questions; Programming solutions; Further R4DS reading: Ch 20 |
Session 3 (1 Oct) Data visualization with ggplot2 | Optional reading: | Slides; Lab | Programming questions; Programming solutions; Further R4DS reading: Ch 3 |
Session 4 (3 Oct) Data visualization with ggplot2 (continued) | | Slides; Lab | Programming questions; Programming solutions; Further R4DS reading: Ch 28 |
Session 5 (8 Oct) Data transformation with dplyr | Optional reading: | Slides; Lab | Programming questions; Programming solutions; Further R4DS reading: Ch 5, 18 |
Session 6 (10 Oct) Functions and more data transformation | | Slides; Lab | Programming questions; Programming solutions; Further R4DS reading: Ch 6, 12 |
Session 7 (15 Oct) Importing your own data and factors | | Slides; Lab; NBA data | Programming questions; Programming solutions; Further R4DS reading: Ch 8, 11, 15 |
Session 8 (17 Oct) Publishing in R Markdown | Required reading: Optional reading: | Slides; Lab; Airbnb data; Airbnb_analysis.R; Airbnb analysis (.Rmd file); Airbnb analysis (.html file) | Programming questions; Programming solutions (.Rmd file); Programming solutions (.html file); Further R4DS reading: Ch 27, 30 |
Session 9 (22 Oct) Data joining and maps | Optional reading: | Slides; Lab; Elections data; County map data; Elections analysis (.Rmd file); Elections analysis (.html file) | Programming questions; Programming solutions; Further R4DS reading: Ch 13 |
Session 10 (24 Oct) Statistical testing and linear regression | Optional reading: | Slides; Lab; Spotify data; Spotify starter R script; Spotify analysis (.Rmd file); Spotify analysis (.html file) | Programming questions; Programming solutions; Further R4DS reading: Ch 23 |