Wayne's Github Page

A place to learn about statistics

Homework 1: Prerequisite

Goals

Homework 1 is meant to tie together statistics and programming concepts and to get you started reading the output of the regression function in R.

Q0 - Different objective functions

The average is a useful statistic because it minimizes a particular objective, the total squared error:

\[\sum_{i=1}^n (X_i - \alpha)^2\]

where \(X_i\) is the \(i\)-th data point and \(\alpha\) is a candidate value for the statistic.
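The claim about the average can be checked numerically. The sketch below evaluates the total squared error over a grid of candidate values and compares the minimizer with mean(). The data vector x and the helper total_squared_error are made up for illustration and are not part of the homework files.

```r
# Toy data, made up for illustration (not from the homework dataset)
x <- c(2, 4, 4, 5, 10)

# Total squared error for a candidate statistic alpha
total_squared_error <- function(alpha, x) {
  sum((x - alpha)^2)
}

# Evaluate the objective on a grid of candidate values
candidates <- seq(min(x), max(x), by = 0.01)
errors <- sapply(candidates, total_squared_error, x = x)

# Grid minimizer versus the sample mean
best_on_grid <- candidates[which.min(errors)]
print(best_on_grid)
print(mean(x))

# A one-dimensional numerical optimizer should land on the same value
opt <- optimize(total_squared_error, interval = range(x), x = x)
print(opt$minimum)
```

Setting the derivative of the objective with respect to \(\alpha\) to zero gives the same answer analytically: \(-2\sum_{i=1}^n (X_i - \alpha) = 0\), so \(\alpha = \frac{1}{n}\sum_{i=1}^n X_i\).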

For this problem, please:

Q1 - Loading data and applying functions

Please calculate the following statistics for the police payroll dataset processed_nyc_payroll_2022.csv on Canvas. Please show your code and print out your final solutions (e.g. using print()). If you have not worked with data frames, the notes here may be useful.
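As a rough sketch of the load-then-apply pattern, the example below writes a small stand-in CSV to a temporary file, reads it back with read.csv(), and applies functions to its columns. The column names (title, base_salary, overtime_pay) are hypothetical and may not match the real payroll file; replace toy_path with the path to the file downloaded from Canvas.

```r
# Stand-in data: the real file is processed_nyc_payroll_2022.csv from Canvas.
# The column names below are hypothetical and may not match that file.
toy <- data.frame(
  title = c("A", "B", "A", "C"),
  base_salary = c(50000, 62000, 58000, 71000),
  overtime_pay = c(1200, 0, 3400, 800)
)
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

# Load the CSV into a data frame; swap toy_path for the path to the real file
payroll <- read.csv(toy_path)

# Inspect the column names and types before computing anything
str(payroll)

# Apply one function to a single column
print(mean(payroll$base_salary))

# Apply the same function to every numeric column
numeric_cols <- sapply(payroll, is.numeric)
print(sapply(payroll[numeric_cols], median))
```

Calling str() first is a cheap way to confirm which columns were read as numbers and which as text before applying any statistic.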

Q2 - Simulating the law of large numbers

Q3 - Logic review

A small reminder: two distributions can be of the same type without being the same distribution if they do not share the same set of parameters.
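A minimal illustration of this reminder, using two normal distributions chosen arbitrarily for the example: both are of the normal type, but they assign different probabilities to the same event because their means differ.

```r
# Two normal distributions: same type, different mean parameter
p_a <- pnorm(1, mean = 0, sd = 1)  # P(X <= 1) when X ~ Normal(0, 1)
p_b <- pnorm(1, mean = 2, sd = 1)  # P(Y <= 1) when Y ~ Normal(2, 1)
print(c(p_a, p_b))

# Same type and same parameters: the probabilities agree
p_c <- pnorm(1, mean = 0, sd = 1)
print(p_a == p_c)
```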

Q4 - Functions

We often deal with variables on different scales which can make interpreting results difficult. A useful trick is to standardize each variable such that its mean is 0 and standard deviation is 1. What transformation can do this?
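One candidate worth checking is the z-score: subtract the mean, then divide by the standard deviation. The sketch below tests it on made-up values; the helper standardize and both toy vectors are hypothetical and not part of the homework files.

```r
# Hypothetical helper: subtract the mean, then divide by the standard deviation
standardize <- function(v) {
  (v - mean(v)) / sd(v)
}

# Toy values on very different scales (made up for illustration)
heights_cm <- c(150, 160, 170, 180, 190)
incomes <- c(30000, 45000, 52000, 61000, 120000)

z_heights <- standardize(heights_cm)
z_incomes <- standardize(incomes)

# Both should have mean 0 and standard deviation 1, up to rounding error
print(c(mean(z_heights), sd(z_heights)))
print(c(mean(z_incomes), sd(z_incomes)))
```

Base R's scale() applies the same centering and scaling to each column of a matrix or data frame by default.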

Q5 - Simulation and hypothesis testing review