Wayne's Github Page

A place to learn about statistics

Statistical Computing and Introduction to Data Science

GR5206 / 4206 - Fall 2024

Learning outcomes

Prerequisites

Textbooks and references

Timeline

I reserve the right to change the ordering and content of the course throughout the semester.

| Date | Topic | Reference | Due |
|---|---|---|---|
| 2024-09-06 | Introduction + python as a calculator | Python concepts 1, 2, 3, 4; Software Carpentry - Python Fundamentals + Analyzing Patient Data; Python Data Science Handbook Chapter 2: Understanding Data Types in Python to The Basics of Numpy Arrays | |
| 2024-09-13 | Numpy, objects, and subsetting; AB Testing | Python concepts 5, 6, 8, 10, 11 | Set up your jupyter notebook environment with the command line |
| 2024-09-20 | For-loop, if/else, working with files; AB testing assignment | Python concepts 7, 9, 21, 23 | HW1 Due |
| 2024-09-27 | Pandas, summaries, and visualization; Exploratory data analysis | Python concepts 12, 14, 15, 16 | |
| 2024-10-04 | Nested data and data wrangling; Basic data engineering | Python concept 24 and all previous chapters | HW2 Due |
| 2024-10-11 | Regular expression and interacting with APIs | Python concepts 13, 17 | |
| 2024-10-18 | Designing data pipelines | All previous Python concepts | HW3 Due |
| 2024-10-25 | Midterm in the evening! | | |
| 2024-11-01 | Data use cases, relational data, and SQL; Data quality concepts and data engineering | Python concepts 20 | |
| 2024-11-08 | Modeling data; “Data science methods” | Python concepts 18 | |
| 2024-11-15 | Optimization; Objective functions | Python concepts 19 | HW4 Due |
| 2024-11-22 | Bootstrap, permutation, and other simulations; Model evaluation; Validation and what we don’t know | | |
| 2024-11-29 | Thanksgiving NO CLASS | | |
| 2024-12-06 | Big exam | | HW5 Due on 12/1 |
| TBD | Final | | |

Logistics

Class time: F 10:10am - 12:40pm, Location: 301 Uris

Teaching Team

See Ed for office hours.

Grading

If your final grade is in [93, 97), you will earn at least an A; [90, 93) earns at least an A-; [87, 90) earns at least a B+; and so on. A grading curve may be applied depending on class performance, but I will not curve downwards. I will not give out an A+ for this class.

- Homeworks (25%)

- Exams (70%)

- Participation (5%)
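As a rough illustration of how the weights and cutoffs above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final grade is a simple weighted average; neither assumption is stated in the syllabus. The names `WEIGHTS`, `CUTOFFS`, `final_score`, and `minimum_letter` are hypothetical, and any curve is ignored.

```python
# Weights from the syllabus: homeworks 25%, exams 70%, participation 5%.
WEIGHTS = {"homeworks": 0.25, "exams": 0.70, "participation": 0.05}

# Lower bounds listed in the syllabus, highest first. Cutoffs below B+ are
# not listed there, so this sketch does not guess them. Since no A+ is
# given, scores of 97 and above are treated as an A here (an assumption).
CUTOFFS = [(93, "A"), (90, "A-"), (87, "B+")]


def final_score(scores):
    """Weighted average of component scores, each assumed to be on 0-100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def minimum_letter(score):
    """Return the guaranteed minimum letter grade, or None if not listed."""
    for lower, letter in CUTOFFS:
        if score >= lower:
            return letter
    return None


# Made-up component scores, purely for illustration.
example = {"homeworks": 95, "exams": 90, "participation": 100}
total = final_score(example)
print(round(total, 2), minimum_letter(total))
```

Because exams carry 70% of the weight, a one-point change in the exam average moves the final score by 0.7 points, while a one-point change in the homework average moves it by 0.25.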

Exam accommodations

To receive disability-related academic accommodations for this course, students must first be registered with their school's Disability Services (DS) office. Detailed information is available online for both the Columbia and Barnard registration processes.

Refer to the appropriate website for information regarding deadlines, disability documentation requirements, and drop-in hours (Columbia) or intake sessions (Barnard).

For this course, students are not required to have testing forms or accommodation letters signed by faculty. However, students must do the following:

- The Instructor section of the form has already been completed and does not need to be signed by the professor.

- The student must complete the Student section of the form and submit the form to Disability Services.

- Master forms are available in the Disability Services office or online: https://health.columbia.edu/services/testing-accommodations

Expectations

Acknowledgement

Many of these materials are based on materials from Prof. Thibault Vatter and Prof. Gabriel Young.