Wayne's Github Page

A place to learn about statistics

Applied Statistical Methods

UN3105 - Fall 2020

This course gives you a survey of various applied statistical methods. The selection of topics can vary drastically depending on the instructor’s background.

| Topic | What Problems Does It Solve? |
| --- | --- |
| Sampling and data quality | How do you get the data relevant to your problem? |
| Bayesian Statistics | How do we introduce prior knowledge into modeling? |
| Kalman Filters + Kriging | How do we deal with temporally or spatially dependent data? |
| Survival analysis | How do we deal with censored data? |
| Causal inference | What else can quantify the impact besides randomized controlled trials? |
| (if time allows) Sequential analysis | Can we use the data sequentially without cheating? |

Expectations

- Learning outcomes

- Your Job

People

Instructor: Wayne Tai Lee (wtl2109)

Teaching Assistant(s): Navid Ardeshir (na2844)

Timeline

I reserve the right to change the ordering and content of the course throughout the semester.

| Date | Topic | Follow-up | Before-Class |
| --- | --- | --- | --- |
| 2020-09-08 | Introductions and expectations | syllabus | |
| 2020-09-10 | Revisiting data collection and common errors | Sampling: Design and Analysis Chap 1-2.2 | |
| 2020-09-15 | Sampling and practice with NHANES and Discussion on Paper | Sampling: Design and Analysis Chap 1-2.2 | Homework 0 due |
| 2020-09-17 | Introduction to Data Quality | | Read Modeling Ideology and Predicting Policy Change with Social Media by Zhang and Counts |
| 2020-09-22 | How to start a problem? discussion on reading | | The Silent Sex: Gender, Deliberation, and Institutions, Mendelberg and Karpowitz, Chapter 3 |
| 2020-09-24 | Discussion on EDA with focus on NYTimes Comments | | Homework 1 - NYTimes EDA |
| 2020-09-29 | Regression Refresher with R | A Modern Approach to Regression with R | |
| 2020-10-01 | Regression with NYTimes based on Reading | | Exploring characteristics of online news comments and commenters with machine learning approaches by Lee and Ryu |
| 2020-10-06 | Crash course in Bayesian Statistics | Doing Bayesian Data Analysis by John Kruschke | Project 1 Due |
| 2020-10-08 | Contrasting Bayesian Methods with Classical Methods | | |
| 2020-10-13 | Dependent Data - Problems with Temporal Data | | Homework 2 |
| 2020-10-15 | Dependent Data Continued - Time Series and Kalman Filters | Chapter 1 on this dissertation | |
| 2020-10-20 | Practice - Forecasting Temperature | For manipulating spatial data in R: rspatial.org | |
| 2020-10-22 | Dependent Data - GIS View of Spatial Data | | |
| 2020-10-27 | Dependent Data continued - Spatial Statistics | Interpolation of Spatial Data: Some Theory for Kriging - Ch 1.2 | Homework 3 |
| 2020-10-29 | Practice with Kriging | | |
| 2020-11-03 | NO CLASS - Election day | | |
| 2020-11-05 | Discussion on spatial data privacy | | Twelve Million Phones, One Dataset, Zero Privacy |
| 2020-11-10 | Survival data and the issue of censoring | Survival analysis: models and applications Chapter 1 | |
| 2020-11-12 | Simulating challenges from censored data | Vignette on R package survival, Vignette on time dependent survival analysis, and Survival analysis: models and applications Chapter 2.1.1 + 2.1.2 + 5.1 | Project 2 Due |
| 2020-11-17 | Survival curves and the Kaplan-Meier Estimator | | |
| 2020-11-19 | Practice with survival analysis | Framingham Heart Study (Study description on Canvas) | |
| 2020-11-24 | Missing data | | The prevention and handling of the missing data; Homework 4 |
| 2020-11-26 | NO CLASS - Thanksgiving Holiday | | |
| 2020-12-01 | Discussion - flaws in randomized control studies | | Randomization in the tropics revisited: a theme and eleven variations |
| 2020-12-03 | Causal inference - AB testing in tech and traps | | Homework 5 |
| 2020-12-08 | Causal Inference - matching algorithms, difference-in-differences | | Joshua D. Angrist and Jörn-Steffen Pischke (2015). Mastering ’Metrics: The Path from Cause to Effect, chapter 5 (see Canvas); Modern Algorithms for Matching in Observational Studies by Paul Rosenbaum, 2020 |
| 2020-12-10 | Wrap-up | | Project 3 Due |
| TBD | Measure understanding | Final Exam | You! |

Logistics

Lectures: TuTh 11:40-12:55 Eastern on Zoom (links on Canvas)

Office Hours:

- No office hours for TA; use the discussion board for questions
- Instructor office hours by appointment only

Computer Setup

Grading

If your final percentage is in [93, 100], you will earn at least an A; [90, 93) earns at least an A-, [87, 90) at least a B+, and so on. A grading curve may be applied depending on class performance, but your grade will not be curved downwards. “At least” means you may earn a letter grade higher than the one your percentage alone would give.

An A+ will be awarded only in exceptional cases.
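As a rough illustration of the guaranteed-minimum rule above, here is a minimal Python sketch. The function name `minimum_letter_grade` and the list `STATED_CUTOFFS` are hypothetical, only the three cutoffs stated in the text are encoded (the remaining ones are left unspecified), and a score of exactly 100 is assumed to count as an A. Any curve could only raise the result.

```python
from typing import Optional

# Only the cutoffs stated in the grading text; lower cutoffs are not given there,
# so they are deliberately left out rather than guessed.
STATED_CUTOFFS = [(93.0, "A"), (90.0, "A-"), (87.0, "B+")]


def minimum_letter_grade(percentage: float) -> Optional[str]:
    """Return the guaranteed minimum letter grade for a final percentage.

    Returns None when the percentage falls below the lowest stated cutoff,
    because the text does not spell out those ranges.
    Assumption: exactly 100 is treated as an A.
    """
    if not 0.0 <= percentage <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    for cutoff, letter in STATED_CUTOFFS:
        if percentage >= cutoff:
            return letter
    return None


if __name__ == "__main__":
    for score in (95.0, 91.5, 87.0, 80.0):
        print(score, minimum_letter_grade(score))
```

The function returns a floor, not a final grade: a curve or an exceptional A+ would sit on top of it.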

- Homework (20%)

Prerequisites

Textbooks / Supplies

No textbook is required, but references are listed in the syllabus.