I believe students need to learn to turn ideas into mathematical definitions, then turn these definitions into executable code.
Here is my personal view of what people with different degrees should be able to do:
- A student with a minor in statistics should be able to recognize problems and know the appropriate technical terms to search for solutions, e.g., if the data are not homoskedastic, what could go wrong?
- A student with a major in statistics should be able to think of how to fix the problem, e.g., transforming the data or using weighted least squares could address issues with heteroskedasticity. This is limited to “well-known” problems.
- A student with a master's degree should be able to read papers, execute state-of-the-art methods, and perform minor tweaks on existing algorithms.
- A PhD student should be able to discover new methods for unsolved problems.
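To make the heteroskedasticity example above concrete, here is a minimal Python sketch of turning the idea into a definition and then into code. Everything in it is an assumption made for illustration: the data are simulated, the noise standard deviation is taken to be proportional to `x` and treated as known, and the helper names `fit_wls` and `simulate` are hypothetical. It compares how much the estimated slope varies across repeated samples under ordinary versus weighted least squares.

```python
import numpy as np


def fit_wls(X, y, weights):
    """Weighted least squares: minimize sum_i weights[i] * (y[i] - X[i] @ beta)**2."""
    sqrt_w = np.sqrt(weights)
    # Scaling each row by sqrt(weight) turns the problem into ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return beta


def simulate(rng, n=200):
    """Simulated data (an assumption): y = 1 + 2x + noise, with noise sd growing with x."""
    x = rng.uniform(1.0, 10.0, size=n)
    noise_sd = 0.5 * x  # heteroskedastic: the error variance depends on x
    y = 1.0 + 2.0 * x + rng.normal(0.0, noise_sd)
    X = np.column_stack([np.ones(n), x])  # intercept column plus x
    return X, y, noise_sd


rng = np.random.default_rng(0)
ols_slopes, wls_slopes = [], []
for _ in range(500):
    X, y, noise_sd = simulate(rng)
    # Equal weights reproduce ordinary least squares.
    ols_slopes.append(fit_wls(X, y, np.ones(len(y)))[1])
    # Weights proportional to 1 / variance, assuming the variance is known.
    wls_slopes.append(fit_wls(X, y, 1.0 / noise_sd**2)[1])

print("OLS slope: mean %.3f, sd %.3f" % (np.mean(ols_slopes), np.std(ols_slopes)))
print("WLS slope: mean %.3f, sd %.3f" % (np.mean(wls_slopes), np.std(wls_slopes)))
```

In this setup both estimators should center near the true slope of 2, while the weighted fit should vary less from sample to sample. In practice the variances are rarely known and would have to be modeled or estimated, which this sketch does not attempt.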
- Introductory statistics
- Introductory statistics provides the basic statistical literacy for students to understand why and how statistics bears on every problem involving data. In a way, you’ll re-learn the scientific method through the lens of statistics and probability.
- Statistical Computing
- Statistical computing focuses on data manipulation, simulation, and data visualization. The goal is to develop the necessary skills for you to verify the various statistical concepts you’ve learned and will learn in the future.
- Linear Regression
- Linear regression is the most foundational tool in machine learning and multivariate statistics. It is the simplest “model” that connects problems to data, and it already exhibits the challenges faced by fancier models.
- Applied Statistics
- There is no class that can teach you all the “applied statistical methods” out there. This is a topics course that introduces methods ranging from data collection, to dealing with dependent data and censored datasets, and finally to some topics in causal inference.
- Applied Data Mining
- Data mining aims to find insightful patterns within large amounts of data or from joining multiple data sources. This class will cover some basic machine learning, then dive into examples in data mining.
- Pre-Research Seminar
- This seminar is meant to give students who are interested in statistical research some basic experience and skills before pursuing research with research faculty (possibly non-Statistics faculty). The goal is for you to identify a professor and initiate research towards the end of the semester.
- This course was a summer bootcamp to quickly introduce some best practices when it comes to statistical/data consulting.
Ideas for future courses:
- Spatial statistics
- Data mining as a deep dive exercise, contrast with causal inference
- Data science class?
- Data quality
- Interactions between models
- Communication and consulting
- Data science for social workers
Miscellaneous Teaching Materials
In 2020, COVID-19 forced students to learn remotely. To accommodate this, I’ve made some videos using open-source software.
Self-reflection on teaching
Here are some unorganized self-reflections from teaching semester to semester. These are meant for myself but feel free to read them.