Here’s your roadmap for the semester!
Lecture: This page contains the slides (and additional readings) for the data science and data engineering lectures. These materials will typically be released right after each class and, in some cases, directly before it.
Assignment: This page contains the instructions for each assignment, i.e., the demo-lab or final research project materials.
| Orientation | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| Introduction | June 5 | Welcome to SURE 2023 | | |
| Exploratory Data Analysis | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| Data Exploring | June 5 | Demo-lab: RStudio basics | | |
| | June 6 | Exploring data: Into the tidyverse | | |
| | June 7 | The grammar of graphics and ggplot2 | | |
| Data Visualization | June 7 | Demo-lab: wrangling with dplyr | | |
| | June 8 | Visualizing 1D categorical and continuous variables | | |
| | June 8 | EDA Mini-Project 1: Requirements and Data (CMSAC mini-project, in lieu of their research project, which starts two weeks later) | | |
| | June 9 | Visualizing 2D categorical and continuous by categorical | | |
| | June 12 | Density estimation | | |
| | June 12 | Demo-lab: Data visualization practice with ggplot | | |
| | June 16 | Visualizing Geographic Data | | |
| | June 16 | Demo-lab: Exploratory data analysis case studies | | |
| Data Engineering | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| UNIX Philosophy | June 6 | Getting acquainted with the UNIX philosophy | | |
| | June 13 | Getting comfortable with the UNIX philosophy | | |
| | June 27 | Embracing the UNIX philosophy - Part 1 | | |
| | July 12 | Embracing the UNIX philosophy - Part 2 | | |
| | July 21 | A Practical Approach to SQL – Part 1 | | |
| | July 26 | A Practical Approach to SQL – Part 2 | | |
| Unsupervised Learning | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| Clustering | June 13 | K-means | | |
| | June 14 | Hierarchical clustering | | |
| | June 15 | Gaussian mixture models | | |
| | June 19 | Demo-lab: clustering (Juneteenth holiday, but the lab will be released) | | |
| Presentation Skills | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| Presenting | June 21 | Working with xaringan and xaringanthemer | | |
| Machine Learning | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| Supervised Learning | June 20 | Model assessment vs selection | | |
| | June 22 | Linear regression | | |
| | June 23 | Introduction to variable selection | | |
| | June 26 | Regularization | | |
| | June 26 | Demo-lab: Linear regression | | |
| | June 27 | Supervised and unsupervised learning with tidymodels | | |
| | June 28 | Generalized linear models (GLMs) | | |
| | June 29 | Logistic regression | | |
| | June 30 | K-nearest neighbors regression and classification | | |
| | July 6 | Decision trees | | |
| | July 7 | Random forests and gradient-boosted trees | | |
| Supervised and Unsupervised Learning | July 10 | Dimension Reduction: Principal components analysis (PCA) | | |
| | July 11 | Kernels, Smoothers, and Generalized Additive Models | | |
| | July 14 | Advanced Topics: Multinomial logistic regression (++) | | |
| Final project | Date | Title | Lecture | Assignment |
| --- | --- | --- | --- | --- |
| Final project | July 27 | Optum Research Project Specification | | |
| | July 27 | CMSAC Research Project Specification | | |
| | July 27 | SURE 2023: Report Writing Guideline | | |