Wayne's GitHub Page

A place to learn about statistics

Applied Statistical Methods - Homework 0

Goals

Format

Please return a PDF file with your solutions on GradeScope.

Questions

Please download the data from CourseWorks at Files/Data/unemployment_cpi_unempl_2000_2020.json. This is a dataset from the Bureau of Labor Statistics on inflation and unemployment.

In economics, low unemployment is believed to lead to higher inflation. The reasoning is that when labor is hard to find, employers need to attract workers with higher salaries. With higher salaries, people are willing to pay more for goods, which leads to inflation. We will roughly validate this theory at the US national level on a very coarse time scale. You are not expected to have taken an economics class to do this assignment.

The time series data:

Q0 - Standardizing the dataset

Please read in the JSON data and export a CSV file with the following columns:

Hint: if you haven’t seen a JSON file before, here’s some sample code for R:

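library(jsonlite)                 # JSON parsing
data <- read_json("MYFILE.json")  # replace MYFILE.json with the path to your file
class(data)                       # check what kind of R object was returned
print(data[[1]])                  # inspect the first element to see its structure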
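
Once the parsed object has been inspected, one possible way to flatten a list of records into a table and export it is sketched below. The records, their field names (year, period, value), and the output path are hypothetical stand-ins; the structure of the course file may differ, so treat this as a pattern rather than a solution.

```r
# Hypothetical records standing in for the parsed JSON; the real field
# names and nesting in the course file may differ.
records <- list(
  list(year = "2000", period = "1st Half", value = "167.6"),
  list(year = "2000", period = "2nd Half", value = "170.2")
)

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(records, function(r) as.data.frame(r, stringsAsFactors = FALSE))
df <- do.call(rbind, rows)

# Values that arrive as strings need explicit conversion to numbers.
df$year <- as.integer(df$year)
df$value <- as.numeric(df$value)

# Write the table without R's row-name column (temporary file for illustration).
out_file <- tempfile(fileext = ".csv")
write.csv(df, out_file, row.names = FALSE)
```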

Q1 - Calculate the inflation

Using your results from Q0, please calculate the inflation rate and add it to your data as a column named inflation. For example, the inflation rate for 2000, 2nd Half should be (170.2 - 167.6)/167.6 * 100 = 1.55%. If the inflation rate cannot be calculated for a period, please set its value to NA.
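
The arithmetic in the example can be checked with a short sketch. It uses only the two CPI values quoted above; the vector name cpi is an assumption made for illustration.

```r
# The two CPI values quoted in the worked example.
cpi <- c(167.6, 170.2)

# Percent change relative to the previous period. The first entry has no
# previous value, so its inflation rate is NA.
inflation <- c(NA, diff(cpi) / head(cpi, -1) * 100)

print(round(inflation, 2))
```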

Q1.1 - R’s handling of NA values
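
As a brief illustration of how base R treats NA in arithmetic, summaries, and comparisons (the vector x is made up for this sketch):

```r
x <- c(1.5, NA, 2.5)

# Arithmetic propagates NA: the second element of the result is NA.
shifted <- x + 1

# Summaries return NA when any input is missing...
m_default <- mean(x)

# ...unless the function is told to drop missing values first.
m_clean <- mean(x, na.rm = TRUE)

# Comparing with == yields NA, so use is.na() to detect missing values.
cmp <- NA == NA
missing_flags <- is.na(x)
```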

Q2 - Visualizing the data

Please draw a scatter plot of inflation against the unemployment rate, with both axes labeled with units. Inflation should be on the y-axis.
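
A minimal base R sketch of a labeled scatter plot follows. The data are synthetic and generated only for illustration, and the unit "% per half-year" is an assumption based on the half-year periods mentioned in Q1; adjust the labels to match your own columns.

```r
# Synthetic values for illustration only; these are not the BLS data.
set.seed(1)
unemployment <- runif(40, min = 3, max = 10)               # percent
inflation <- 2 - 0.1 * unemployment + rnorm(40, sd = 0.5)  # percent per half-year

# The first argument goes on the x-axis, the second on the y-axis.
plot(unemployment, inflation,
     xlab = "Unemployment rate (%)",
     ylab = "Inflation rate (% per half-year)",
     main = "Inflation vs. unemployment (synthetic data)")
```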

Q3 - Fit a regression and analyze the output

Fit an OLS regression to the data plotted in Q2 and report the fitted slope and its p-value. Why is the slope relevant to our problem here (at most 3 sentences)?
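
One way to pull the slope and its p-value out of a fitted model is sketched below on synthetic data. The variable names are assumptions; the same calls should apply to a data frame holding the real columns.

```r
# Synthetic values for illustration only; these are not the BLS data.
set.seed(1)
unemployment <- runif(40, min = 3, max = 10)
inflation <- 2 - 0.1 * unemployment + rnorm(40, sd = 0.5)

# Ordinary least squares fit of inflation on unemployment.
fit <- lm(inflation ~ unemployment)

# The coefficient table has one row per term, with columns for the
# estimate, standard error, t statistic, and two-sided p-value.
coefs <- summary(fit)$coefficients
slope <- coefs["unemployment", "Estimate"]
p_value <- coefs["unemployment", "Pr(>|t|)"]

print(c(slope = slope, p_value = p_value))
```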

Q4 - Assumptions

Which assumptions were required for your p-value in Q3 to make sense?

Q5 - Test whether inflation is independent of the unemployment rate via a permutation test

Please write the code for the following:
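
As a generic sketch of a permutation test for independence, which may not match the exact steps this question asks for: shuffle one variable to break any association, recompute a test statistic many times, and compare the observed statistic with that permutation distribution. The data are synthetic, and the choice of the sample correlation as the statistic and of 2000 permutations are assumptions.

```r
# Synthetic values for illustration only; these are not the BLS data.
set.seed(1)
x <- runif(40, min = 3, max = 10)
y <- 2 - 0.1 * x + rnorm(40, sd = 0.5)

# Observed test statistic.
observed <- cor(x, y)

# Shuffling y breaks any link with x, which mimics the null hypothesis
# of independence. Repeat to build the permutation distribution.
n_perm <- 2000
perm_stats <- replicate(n_perm, cor(x, sample(y)))

# Two-sided p-value; the +1 terms count the observed statistic itself
# so the p-value can never be exactly zero.
p_value <- (1 + sum(abs(perm_stats) >= abs(observed))) / (n_perm + 1)
print(p_value)
```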

Q6 - Simulation

This is not related to the questions above. Please create a simulation in R that demonstrates that the unbiasedness of the estimated linear model parameters can fail when one of the linear regression assumptions is violated.
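
A simulation of this kind typically repeats three steps: generate data from a known model, fit the regression, and record the estimate. The sketch below sets up that loop for the baseline case where the usual assumptions hold, so the average estimate should land close to the true slope. The true slope, sample size, and number of repetitions are arbitrary assumptions; a violation would be introduced by changing the marked data-generating step.

```r
set.seed(1)
true_slope <- 2
n <- 50        # observations per simulated dataset
n_sims <- 1000 # number of simulated datasets

slopes <- replicate(n_sims, {
  x <- rnorm(n)
  eps <- rnorm(n)                 # errors with mean zero, independent of x
  y <- 1 + true_slope * x + eps   # data-generating step: change this to break an assumption
  coef(lm(y ~ x))[["x"]]          # fitted slope for this dataset
})

# Monte Carlo estimate of the bias: average estimate minus the truth.
bias <- mean(slopes) - true_slope
print(bias)
```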

Your solution should clearly state: