Wayne's Github Page

A place to learn about statistics

HW3 - Comparing different models

This homework is meant to give you practice with different regression models on the same dataset.

Context: Finding a job is difficult. Job descriptions tell us what companies are looking for, so they should carry some signal about whether we are prepared enough for industry. However, there are too many job descriptions to read through by hand.

Please download the dataset, job_descrip.csv, from CourseWorks.

Q1. Exploratory Data Analysis (EDA)

Please answer the following:

Q2. Basic model setup

For the rest of the homework, our response variable will be whether the job title contains the word ‘analyst’.
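
As a rough illustration of this response variable, the sketch below builds a 0/1 indicator from a few made-up job titles. The titles are hypothetical stand-ins for the job title column of job_descrip.csv, whose column names are not given here. Matching ‘analyst’ as a whole word and ignoring case are assumptions, not requirements stated in the homework.

```python
import re

# Hypothetical job titles; the real ones would come from job_descrip.csv.
titles = [
    "Senior Data Analyst",
    "Software Engineer",
    "Business ANALYST II",
    "Psychoanalyst",
]

# Assumption: match 'analyst' as a whole word, ignoring case.
pattern = re.compile(r"\banalyst\b", flags=re.IGNORECASE)

# 1 if the title contains the word, 0 otherwise.
y = [1 if pattern.search(title) else 0 for title in titles]

print(y)
```

Under the whole-word assumption, a title such as "Psychoanalyst" is not counted, and neither is the plural "Analysts". A plain substring check would count both, so it is worth stating which rule you used.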

Q3. Fitting PCA + OLS

Q4. Fitting LASSO

Q5. Fitting Naive Bayes

Q6. Back to OLS

Q7. Some questions

Q8. Thinking about Final Project

Please write a paragraph about what you will do for your final project and where you “might” find data for it. You may need to start collecting data early if you need to scrape it on a regular schedule.