Wayne's Github Page

A place to learn about statistics

Applied Statistical Methods

UN3105 - Fall 2024

This course surveys applied statistical methods beyond linear regression. The selection of topics can vary drastically depending on the instructor’s background.

| Topic | What Problems Does It Solve? |
| --- | --- |
| Bayesian Statistics | How do we introduce prior knowledge into modeling? |
| Kalman Filters + Kriging | How do we deal with temporally or spatially dependent data? |
| Sampling and data quality | How do you get data relevant to your problem? |
| Survival analysis | How do we deal with censored data? |
| Causal inference | What else can quantify the impact besides randomized controlled trials? |
| (if time allows) Sequential analysis | Can we use the data sequentially without cheating? |

- Your Job

People

Instructor: Wayne Tai Lee (wtl2109)

Teaching Assistant: Yizi Zhang (yz4123)

Timeline

I reserve the right to change the ordering and content of the course throughout the semester.

| Date | Topic | Follow-up | Before-Class |
| --- | --- | --- | --- |
| 2024-09-04 | Introductions and expectations | syllabus | |
| 2024-09-09 | Regression Refresher with R (notebook) | A Modern Approach to Regression with R | Have your R computing setup ready or refresh your R knowledge |
| 2024-09-11 | Challenges for regression (notebook) | Linear Mixed Models | Homework 0 due; Skim over Trustworthiness of crowds is gleaned in half a second; Download trustworthiness.zip and extract the files from CourseWorks; install lme4 in R |
| 2024-09-16 | Linear Mixed models | Doing Bayesian Data Analysis by John Kruschke | |
| 2024-09-18 | Crash course in Bayesian Statistics (notebook) | | |
| 2024-09-23 | Contrasting Bayesian Methods with Classical Methods (notebook) | | |
| 2024-09-25 | Introduction to Data Quality | | Homework 1 |
| 2024-09-30 | Dependent Data - Problems with Temporal Data (notebook) | Chapter 1 on this dissertation | |
| 2024-10-02 | Dependent Data Continued - Time Series and Kalman Filters | | |
| 2024-10-07 | Practice - Forecasting Temperature | | Project 1 Due |
| 2024-10-09 | Dependent Data - GIS View of Spatial Data (notebook) | | |
| 2024-10-14 | Dependent Data continued - Spatial Statistics (notebook on kriging) | For manipulating spatial data in R: rspatial.org | Homework 2 |
| 2024-10-16 | Interpolation of Spatial Data | Some Theory for Kriging - Ch 1.2 | |
| 2024-10-21 | Practice with Kriging | Twelve Million Phones, One Dataset, Zero Privacy | |
| 2024-10-23 | Sampling and practice with NHANES | Sampling: Design and Analysis Chap 1-2.2 | |
| 2024-10-28 | Sampling continued | | Homework 3 |
| 2024-10-30 | Exploratory data analysis (notebook) | | For property investors, the price of homes is still not right; What happens when Wall Street Buys Most of the homes on Your Block |
| 2024-11-04 | NO CLASS - Election day | | |
| 2024-11-06 | Discussion on EDA | Exploring characteristics of online news comments and commenters with machine learning approaches by Lee and Ryu | |
| 2024-11-11 | Survival data and the issue of censoring (notebook) | Vignette on R package survival, Vignette on time dependent survival analysis, and Survival analysis: models and applications Chapter 1 + 2.1.1 + 2.1.2 + 5.1 | Project 2 Due |
| 2024-11-13 | Survival curves and the Kaplan-Meier Estimator (notebook) | | |
| 2024-11-18 | Discussion - flaws in randomized control studies | | Randomization in the tropics revisited: a theme and eleven variations |
| 2024-11-20 | Issues with missing data | | The prevention and handling of the missing data; Homework 4 |
| 2024-11-25 | Final Project Chats | | |
| 2024-11-27 | NO CLASS - Thanksgiving Holiday | | |
| 2024-12-02 | Causal inference - AB testing in tech and traps | | Joshua D. Angrist and Jörn-Steffen Pischke (2015). Mastering ’Metrics: The Path from Cause to Effect, chapter 5 (see CourseWorks); Modern Algorithms for Matching in Observational Studies by Paul Rosenbaum, 2020 |
| 2024-12-04 | Causal Inference - matching algorithms, difference-in-differences | | Homework 5 |
| 2024-12-09 | Wrap-up | | Project 3 Due |

Logistics

Lectures: MW 2:40-4:00pm Eastern

Office Hours:

Computer Setup

Grading

If your final grade is in [93, 100], you will earn at least an A; [90, 93) will earn you at least an A-, [87, 90) at least a B+, and so on. A grading curve may be applied depending on class performance, but your grade will not be curved downwards. “At least” means that you may earn a letter grade higher than your raw percentage would imply.

An A+ will be awarded only in exceptional cases.

- Homeworks (25%)

Prerequisites

Textbooks / Supplies

There is no textbook, but references are available in the syllabus.