Applied machine learning
This course will expose students to a variety of data mining applications using machine learning methods.
Students who finish this class should:
- Gain an intuitive understanding of basic machine learning methods
- Understand how fitting models can help explore patterns in data
- Understand how to assess models and clustering in different use cases
Prerequisites
- An introductory statistics class
  - Basic probability distributions (e.g. Gaussian and binomial distributions and their likelihoods)
  - Basic hypothesis testing (e.g. t-test)
  - Summary statistics
  - Histograms, boxplots, etc.
- A computing course involving data wrangling and visualization
- A modeling course that estimated parameters from data
Textbooks / References
- An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
- To collaborate on coding projects, here’s a bare minimum GitHub tutorial. If you ever work professionally with code, you should also look into the concepts of branches and reviews, which are not covered in the tutorial.
Timeline
I reserve the right to change the ordering and the content for the course throughout the semester.
Logistics
Lectures: TuTh 2:40pm - 3:55pm, 602 Hamilton Hall
Teaching Team
- Wayne Tai Lee (wtl2109)
  - OH: TuTh 12-2pm at 324 Uris Hall
- Matt Shen (ms7079)
  - OH: Monday 11:30-1:30pm at 324 Uris Hall
Online Discussion
The TA and grader will check the online discussion for 30 minutes each weekday. Do not expect an immediate response, so please start your work early and post your questions as clearly as possible.
Grading
If your final grade is in [93, 100], you will earn at least an A; [90, 93) will earn at least an A-; [87, 90) will earn at least a B+; etc. A grading curve may be applied depending on class performance, but I will not curve downwards. I may not give out A+’s in this class.
- Homeworks (20%)
  - Late homeworks will receive 0 credit.
  - Homeworks will receive 0 credit unless the code and write-up are submitted in both the .ipynb/.Rmd source AND the knitted PDF or HTML form.
- Projects (70%)
  - Late projects will be penalized by 50% for each day they are late.
  - Projects should be submitted on Canvas.
- Participation (10%)
  - Participation is graded through in-class activities recorded on Canvas, not through attendance.
  - If you score above 75% on these activities, you’ll receive full credit for participation.
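
As a rough illustration of how the weights and penalties above combine, here is a minimal Python sketch. It is not the official grade calculation. The function names are hypothetical, and two details are assumptions because the syllabus does not spell them out: the late penalty is treated as 50 percentage points of the project's own score per day, and participation at or below 75% is counted at its raw percentage. Only the three letter cutoffs stated above are encoded, and they are lower bounds since a curve can only raise a grade.

```python
def project_score(raw, days_late=0):
    """Project score (0-100) after the late penalty.

    Assumption: each late day removes 50% of the raw score,
    so a project two or more days late earns 0.
    """
    return raw * max(0.0, 1.0 - 0.5 * days_late)


def participation_score(activity_pct):
    """Participation score (0-100) from the in-class activity percentage.

    Above 75% earns full credit, as stated in the syllabus.
    Assumption: at or below 75%, the raw percentage is used as-is.
    """
    return 100 if activity_pct > 75 else activity_pct


def final_grade(homework, project, participation):
    """Weighted final grade: homeworks 20%, projects 70%, participation 10%."""
    return (20 * homework + 70 * project + 10 * participation) / 100


def letter_floor(grade):
    """Minimum letter grade for the cutoffs the syllabus lists explicitly.

    Returns None below 87 because the remaining cutoffs are not listed.
    """
    if grade >= 93:
        return "A"
    if grade >= 90:
        return "A-"
    if grade >= 87:
        return "B+"
    return None
```

For example, a homework average of 90, an on-time project average of 95, and 80% on in-class activities give `final_grade(90, project_score(95), participation_score(80))`, which is 94.5 and maps to at least an A.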
Acknowledgement
Many of these materials are based on materials from Prof. Vincent Dorie.