Data Science Consulting
When entrepreneurs talk about their vision for data, they can be overly confident in their intuition, their data can be less than ready to tackle the problems they envisioned, they may lack tangible metrics to quantify the desired outcome, or they may simply not know about newer tools and methods. A data scientist can help shape these visions into actionable projects with tangible outcomes.
This class will take in data ideas that serve the public and translate them into concrete data science projects. Overall, students will learn to consult domain experts by quickly understanding the context, setting concrete milestones, managing expectations, defining success metrics, validating data sources, and writing out detailed project briefs. Our goal is to help realize the potential for public good with data science.
Students should know that this class emphasizes exercising creativity, observation, and communication rather than cutting-edge technologies and algorithms. You will connect weekly with the project owner to flesh out details around the project while learning practical consulting skills in class.
Learning Objectives
- Students will help others articulate their vision by formalizing their ideas with mathematical rigor backed by data
- Students will learn to set achievable and relevant milestones with reasonable deadlines for time-sensitive projects
- Students will learn to manage expectations around data science projects and to trade off perfection against speed
- Students will learn to create maintainable models and documentation meant for a wide audience with different backgrounds
Instructors
- Wayne Tai Lee (wtl2109)
- Tian Zheng (tz33)
Course Structure
- At the end of the course, you will present a project proposal that can be executed by a data scientist in collaboration with a project owner.
- Each week, two people will be randomly called to present a data science idea to the class where we jointly shape it into an executable proposal
- Each week, we will have a discussion on a special consulting topic
- Participation is mandatory; we may cold-call people during the meeting
Timeline
Date | Topic | Due |
---|---|---|
2021-05-07 | The template, the workflow, and goals of consulting | Reading: the consulting template |
2021-05-14 | Data Science Consulting - Alex Wilson | Established first contact with project owner; draft 0.0 proposal: outline + preliminary research |
2021-05-21 | Common problems with data science projects in the public sector - Director Terri Matthews, Edna Wells Handy, and Nancy Holt | draft 0.1 proposal: data validation + onboarding tutorial draft |
2021-05-28 | Consulting with non-statisticians - Xiaoyue (Maggie) Niu | draft 0.2 proposal: outcomes and follow-up |
2021-06-02 | Share initial draft with Lehman College Students | (due 6/1) 5-minute recording of the project overview |
2021-06-04 | Working with first-generation college students - Prof Jenn Laird | draft 1.0 proposal: writing - onboarding tutorial |
2021-06-11 | Presenting your proposal | Your project proposal |
Course resources:
- Template
- How to start a data science project
- How should I incorporate data into my daily routine?
- What technology/algorithms should I use?
- Is X do-able within Y days?
- I have some data and I want to see if you have any ideas about how to use it?
- Managing expectations
- stakeholders
- collaborators
- The best model vs a useful model, what trade-offs exist?
- How to create maintainable documentation and models
- Technology, versions, packages, test data
- Data science is always a joint effort
- soliciting feedback
- Communication tips
- Avoid assigning numbers to politically charged variables, e.g. party affiliation, organic vs GMO, etc.