Homework 1


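Question 1

Suppose that we have a model $y_i = \beta x_i + \varepsilon_i$ ($i = 1, \dots, n$) where $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = 0$, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 0$, and $\varepsilon_i$ is distributed normally with mean 0 and variance $\sigma^2$; that is, $\varepsilon_i \sim N(0, \sigma^2)$.
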
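(a) The OLS estimator for $\beta$ minimizes the Sum of Squared Residuals:

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - \beta x_i)^2$$

Take the first-order condition to show that

$$\hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}.$$

(b) Show that

$$\hat{\beta} = \beta + \frac{\sum_{i=1}^{n} x_i \varepsilon_i}{\sum_{i=1}^{n} x_i^2}.$$

What are $E[\hat{\beta} \mid \beta]$ and $\mathrm{Var}(\hat{\beta} \mid \beta)$? Use this to show that, conditional on $\beta$, $\hat{\beta}$ has the following distribution:

$$\hat{\beta} \mid \beta \sim N\left(\beta, \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}\right).$$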

(c) Suppose we believe that $\beta$ is distributed normally with mean 0 and variance $\frac{\sigma^2}{\lambda}$; that is, $\beta \sim N\left(0, \frac{\sigma^2}{\lambda}\right)$. Additionally assume that $\beta$ is independent of $\varepsilon_i$. Compute the mean and variance of $\hat{\beta}$. That is, what are $E[\hat{\beta}]$ and $\mathrm{Var}(\hat{\beta})$?
(Hint you might find useful: $E[w_1] = E[E[w_1 \mid w_2]]$ and $\mathrm{Var}(w_1) = E[\mathrm{Var}(w_1 \mid w_2)] + \mathrm{Var}(E[w_1 \mid w_2])$ for any random variables $w_1$ and $w_2$.)

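(d) Since everything is normally distributed, it turns out that

$$E[\beta \mid \hat{\beta}] = E[\beta] + \frac{\mathrm{Cov}(\beta, \hat{\beta})}{\mathrm{Var}(\hat{\beta})} \cdot \left(\hat{\beta} - E[\hat{\beta}]\right).$$

Let $\hat{\beta}^{RR} = E[\beta \mid \hat{\beta}]$. Compute $\mathrm{Cov}(\beta, \hat{\beta})$ and use the value of $E[\beta]$ along with the values of $E[\hat{\beta}]$, $\mathrm{Cov}(\beta, \hat{\beta})$, and $\mathrm{Var}(\hat{\beta})$ you have computed to show that

$$\hat{\beta}^{RR} = E[\beta \mid \hat{\beta}] = \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} x_i^2 + \lambda} \cdot \hat{\beta}$$

(Hint: $\mathrm{Cov}(w_1, w_2) = E[(w_1 - E[w_1])(w_2 - E[w_2])]$ and $E[w_1 w_2] = E[w_1 E[w_2 \mid w_1]]$ for any random variables $w_1$ and $w_2$.)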

(e) Does $\hat{\beta}^{RR}$ increase or decrease as $\lambda$ increases? How does this relate to $\beta$ being distributed $N\left(0, \frac{\sigma^2}{\lambda}\right)$?
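
A simulation can serve as a sanity check on the algebra in part (d). The sketch below is not part of the assignment: the values of `n`, `sigma`, `lambda`, and `n_sims` are hypothetical, and the regressor `x` is simulated once and then treated as fixed. It draws $\beta$ from its prior, draws $\hat{\beta}$ from the conditional distribution stated in part (b), and compares the simulated ratio $\mathrm{Cov}(\beta, \hat{\beta}) / \mathrm{Var}(\hat{\beta})$ with the shrinkage factor stated in part (d).

```r
# Hypothetical simulation settings (not taken from the problem set).
set.seed(1)
n      <- 50
sigma  <- 1
lambda <- 5

# A centered regressor, simulated once and then held fixed.
x <- rnorm(n)
x <- x - mean(x)
S <- sum(x^2)

n_sims <- 20000

# Draw beta from its prior N(0, sigma^2 / lambda).
beta <- rnorm(n_sims, mean = 0, sd = sigma / sqrt(lambda))

# Part (b) states beta_hat | beta ~ N(beta, sigma^2 / S), so draw beta_hat directly.
beta_hat <- rnorm(n_sims, mean = beta, sd = sigma / sqrt(S))

# Simulated Cov(beta, beta_hat) / Var(beta_hat) versus the factor stated in part (d).
slope_sim    <- cov(beta, beta_hat) / var(beta_hat)
slope_theory <- S / (S + lambda)

print(c(simulated = slope_sim, theory = slope_theory))
```

The two printed numbers should be close if the derivation is right. Rerunning with other values of `lambda` is one way to explore part (e) numerically before arguing it analytically.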

Question 2

Let us consider the linear regression model $y_i = \beta_0 + \beta_1 x_i + u_i$ ($i = 1, \dots, n$), which satisfies Assumptions MLR.1 through MLR.5 (see Slide 7 in “Linear_regression_review” under “Modules” on Canvas).¹ The $x_i$'s ($i = 1, \dots, n$) and $\beta_0$ and $\beta_1$ are nonrandom. The randomness comes from the $u_i$'s ($i = 1, \dots, n$), where $\mathrm{Var}(u_i) = \sigma^2$. Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the usual OLS estimators (which are unbiased for $\beta_0$ and $\beta_1$, respectively) obtained from running a regression of

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix} \text{ on } \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix} \text{ (the intercept column) and } \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix}.$$

Suppose you also run a regression of

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix} \text{ on } \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix} \text{ only}$$

(excluding the intercept column) to obtain another estimator $\tilde{\beta}_1$ of $\beta_1$.

¹ The model is a simple special case of the general multiple regression model in “Linear_regression_review”. Solving this question does not require knowledge about matrix operations.

a) Give the expression of $\tilde{\beta}_1$ as a function of the $y_i$'s and $x_i$'s ($i = 1, \dots, n$).

b) Derive $E(\tilde{\beta}_1)$ in terms of $\beta_0$, $\beta_1$, and the $x_i$'s. Show that $\tilde{\beta}_1$ is unbiased for $\beta_1$ when $\beta_0 = 0$. If $\beta_0 \neq 0$, when will $\tilde{\beta}_1$ be unbiased for $\beta_1$?

c) Derive $\mathrm{Var}(\tilde{\beta}_1)$, the variance of $\tilde{\beta}_1$, in terms of $\sigma^2$ and the $x_i$'s ($i = 1, \dots, n$).

d) Show that $\mathrm{Var}(\tilde{\beta}_1)$ is no greater than $\mathrm{Var}(\hat{\beta}_1)$; that is, $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$. When do you have $\mathrm{Var}(\tilde{\beta}_1) = \mathrm{Var}(\hat{\beta}_1)$? (Hint you might find useful: use $\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar{x})^2$ where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.)

e) Choosing between $\hat{\beta}_1$ and $\tilde{\beta}_1$ leads to a tradeoff between bias and variance. Comment on this tradeoff.
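
The comparison in parts b) through e) can also be explored numerically. The following is a minimal sketch under assumed values: `beta0`, `beta1`, `sigma`, `n`, `n_sims`, and the way `x` is generated are all hypothetical choices, not part of the problem. It repeatedly simulates the model, fits it with `lm()` both with and without the intercept column, and reports the simulated bias and variance of each slope estimator.

```r
# Hypothetical parameter values (not taken from the problem set).
set.seed(2)
n     <- 30
beta0 <- 1
beta1 <- 2
sigma <- 1

# Regressors drawn once and then held fixed; their mean is not zero.
x <- runif(n, min = 0, max = 2)

n_sims <- 2000
est <- t(replicate(n_sims, {
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  c(with_intercept = unname(coef(lm(y ~ x))[2]),      # slope from y on (1, x)
    no_intercept   = unname(coef(lm(y ~ x - 1))[1]))  # slope from y on x only
}))

# Simulated bias and variance of each slope estimator.
sim_bias <- colMeans(est) - beta1
sim_var  <- apply(est, 2, var)

print(rbind(bias = sim_bias, variance = sim_var))
```

Setting `beta0 <- 0`, or centering `x` so that its mean is zero, and rerunning shows how the numbers respond to the conditions asked about in parts b) and d).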

Question 3

Let $\hat{v}$ be an estimator of the truth $v$. Show that $E[(\hat{v} - v)^2] = \mathrm{Var}(\hat{v}) + [\mathrm{Bias}(\hat{v})]^2$ where $\mathrm{Bias}(\hat{v}) = E(\hat{v}) - v$. (Hint: The randomness comes from $\hat{v}$ only and $v$ is nonrandom.)
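
The decomposition can be checked numerically before proving it. The sketch below uses an assumed setup that is not part of the problem: a hypothetical true value `v`, and a deliberately biased estimator (0.8 times a sample mean) simulated many times. The variance is computed with divisor $N$ rather than R's default $N - 1$, so the sample analogue of the identity holds up to floating-point error.

```r
# Hypothetical setup (not taken from the problem set).
set.seed(3)
v <- 2    # the "truth"
n <- 10   # sample size behind each estimate

# A deliberately biased estimator: the sample mean shrunk toward zero.
v_hat <- replicate(10000, 0.8 * mean(rnorm(n, mean = v, sd = 1)))

mse   <- mean((v_hat - v)^2)              # sample analogue of E[(v_hat - v)^2]
bias  <- mean(v_hat) - v                  # sample analogue of Bias(v_hat)
var_n <- mean((v_hat - mean(v_hat))^2)    # variance with divisor N, not N - 1

print(c(mse = mse, var_plus_bias_sq = var_n + bias^2))
```

The two printed values agree because the cross term vanishes when deviations are taken around the sample mean, which mirrors the step the proof needs.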

Applied questions (with the use of R)

For these questions you will be asked to write code in R.

Installation

  • To install R, please see https://www.r-project.org/.
  • Once you have installed R, please also install RStudio: https://rstudio.com/products/rstudio/download/.
  • You will need to use RStudio to solve the problem set.
