Applied machine learning

This course will expose students to a variety of data mining applications using machine learning methods.

| Date | Topic | Reference | Due |
|---|---|---|---|
| 2025-01-21 | Intro to data mining | Brazilian e-commerce on Kaggle; ISL Chapter 2.2 | |
| 2025-01-23 | Data mining with basic statistics and regression review | ISL Chapter 3 | Have RStudio installed; informal exploration with the Brazilian e-commerce dataset |
| 2025-01-28 | Regression review continued | ISL Chapter 3 | |
| 2025-01-30 | Principal Component Analysis | ISL 6.3.1 | Homework 1 due |
| 2025-02-04 | Principal Component Analysis applications | ISL 6.3.1 + ISL 10.2 | |
| 2025-02-06 | Logistic regression + Naive Bayes and notes | ISL 6.3.1 + ISL 10.2 | |
| 2025-02-11 | Beyond classification accuracy + Rise of machine learning and “wrong” models - some history | Paper on Why Biased Estimators given Stein Estimator + Gauss Markov Theorem + ISL Chapter 2.2 continued | |
| 2025-02-13 | Ridge + Lasso regression and notebook | ISL 6.2 | Homework 2 (due date delayed slightly) |
| 2025-02-18 | Tree methods and notebook | ISL 8.1 | |
| 2025-02-20 | Trees + forests with real data and notebook | ISL 8.2; bias in random forest variable importance | |
| 2025-02-25 | Ridge + Lasso simulations | ISL 6.2 | |
| 2025-02-27 | Data pipelines | ISL 8.1 | Homework 3 |
| 2025-03-04 | Data pipeline continued; optimization and objective functions | caret library; ISL Chapter 3.1.1 + 3.3.3 | |
| 2025-03-06 | Guest lecture | Paper: Fast Interpretable Greedy-Tree Sums | |
| 2025-03-11 | Resampling techniques - accuracy vs robustness | Slides 7 + Resampling from ISL | Read paper on Stability |
| 2025-03-13 | Automated model selection | Slides 7 + ISL on resampling | Project 1 |
| 2025-03-18 | Spring Break | | |
| 2025-03-20 | Spring Break | | |
| 2025-03-25 | Clustering - K-means | ISL 10.2 | |
| 2025-03-27 | Clustering - K-means continued | ISL 10.2 | |
| 2025-04-01 | K-means with real data | ISL 10.2 | |
| 2025-04-03 | Hierarchical clustering | ISL 10.2 | Homework 4 |
| 2025-04-08 | Hierarchical clustering with real data | ISL 10.2 | |
| 2025-04-10 | DBSCAN | DBSCAN from KDNuggets | |
| 2025-04-15 | Feature engineering - with text | Pre-processing Text + Speech and Language Chapter 6.5 | |
| 2025-04-17 | Working with text data continued | | |
| 2025-04-22 | Independent Component Analysis | Stanford ICA Slides | |
| 2025-04-24 | Models on text including Wordfish | | Homework 5 |
| 2025-04-29 | Going over final projects in class | | |
| 2025-05-01 | Going over final projects in class + what we didn’t teach | | Final Project |


Lectures: TuTh 2:40pm - 3:55pm, 602 Hamilton Hall

Teaching Team

Online Discussion

If your final grade is in [93, 100], you will earn at least an A; [90, 93) will earn at least an A-; [87, 90) will earn at least a B+; and so on. A grading curve may be applied depending on class performance, but I will not curve downwards. I may not give out A+’s in this class.

- Homeworks (20%)


Many of these materials are based on materials from Prof. Vincent Dorie.