Chapter 1
REVIEW OF LEAST SQUARES & LIKELIHOOD METHODS
I LEAST SQUARES METHODS:
1 Model:
- We have N observations (individuals, firms, …) drawn randomly from a large population, i = 1, 2, …, N.
- On observation i we observe $Y_i$ and a K-dimensional column vector of explanatory variables $X_i = (X_{i1}, X_{i2}, \dots, X_{iK})'$, and we assume $X_{i1} = 1$ for all i = 1, 2, …, N.
- We are interested in explaining the distribution of $Y_i$ in terms of the explanatory variables $X_i$ using the linear model:

  $Y_i = X_i'\beta + \varepsilon_i$, where $\beta = (\beta_1, \beta_2, \dots, \beta_K)'$,

  or, written out,

  $Y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \dots + \beta_K X_{iK} + \varepsilon_i$.

  In matrix notation: $Y = X\beta + \varepsilon$.
Assumption 1: $\{X_i, Y_i\}_{i=1}^{n}$ are independent and identically distributed.
Assumption 2: $\varepsilon_i \mid X_i \sim N(0, \sigma^2)$.
Assumption 3: $\varepsilon_i \perp X_i$ (independence of the error and the regressors).
Assumption 4: $E[\varepsilon_i \mid X_i] = 0$ (mean independence).
Assumption 5: $E[\varepsilon_i X_i] = 0$ (orthogonality).
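As a quick illustration (not part of the original notes), here is a minimal sketch that simulates data satisfying this model and Assumptions 1-2; the sample size, coefficients, and variable names are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and parameters (assumed values, not from the notes)
n, K = 500, 3
beta_true = np.array([1.0, 2.0, -0.5])        # beta = (beta_1, beta_2, beta_3)'
sigma = 1.5

# X_i = (1, X_i2, X_i3)' with X_i1 = 1 for all i (the constant term)
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

# epsilon_i | X_i ~ N(0, sigma^2), i.i.d. across observations
eps = rng.normal(scale=sigma, size=n)

# Y_i = X_i' beta + epsilon_i, i.e. Y = X beta + epsilon in matrix form
Y = X @ beta_true + eps
```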
The Ordinary Least Squares (OLS) estimator for $\beta$ solves:

  $\hat{\beta} = \arg\min_{\beta} \frac{1}{n}\sum_{i=1}^{n} \big(Y_i - X_i'\beta\big)^2$

This leads to:

  $\hat{\beta} = \Big(\sum_{i=1}^{n} X_i X_i'\Big)^{-1}\Big(\sum_{i=1}^{n} X_i Y_i\Big) = (X'X)^{-1}(X'Y)$
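A minimal numerical check of this closed form on simulated data (the design below is our own illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.5, size=n)

# beta_hat = (X'X)^{-1} X'Y, computed by solving the normal equations (X'X) b = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)   # should be close to (1.0, 2.0, -0.5)
```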
The exact distribution of the OLS estimator under the normality assumption is:

  $\hat{\beta} \mid X \sim N\big(\beta,\ \sigma^2 (X'X)^{-1}\big)$
• Without normality of the $\varepsilon_i$, it is difficult to derive the exact distribution of $\hat{\beta}$. However, we can establish its asymptotic distribution:

  $\sqrt{n}\,\big(\hat{\beta} - \beta\big) \xrightarrow{d} N\big(0,\ \sigma^2\, \big(E[X_i X_i']\big)^{-1}\big)$
• We do not know $\sigma^2$, but we can consistently estimate it as:

  $\hat{\sigma}^2 = \frac{1}{n - K}\sum_{i=1}^{n} \big(Y_i - X_i'\hat{\beta}\big)^2$
• In practice, whether we have exact normality of the error terms or not, we will use the following approximate distribution for $\hat{\beta}$:

  $\hat{\beta} \approx N(\beta, \hat{V})$

  where $V = \sigma^2\,\big(E[X_i X_i']\big)^{-1}$ is the asymptotic variance of $\sqrt{n}(\hat{\beta} - \beta)$, and the variance of $\hat{\beta}$ itself is estimated by

  $\hat{V} = \hat{\sigma}^2 \Big(\sum_{i=1}^{n} X_i X_i'\Big)^{-1} = \hat{\sigma}^2 (X'X)^{-1}$
• If we are interested in a specific coefficient:

  $\hat{\beta}_k \approx N\big(\beta_k, \hat{V}_{kk}\big)$

  where $\hat{V}_{ij}$ is the (i, j) element of the matrix $\hat{V}$.
• The 95% confidence interval for $\beta_k$ would be:

  $\Big[\hat{\beta}_k - 1.96\sqrt{\hat{V}_{kk}},\ \ \hat{\beta}_k + 1.96\sqrt{\hat{V}_{kk}}\Big]$
• To test the hypothesis that $\beta_k = \alpha$:

  $t = \frac{\hat{\beta}_k - \alpha}{\sqrt{\hat{V}_{kk}}} \sim N(0, 1)$
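The following sketch puts these inference formulas together on simulated data ($\hat{\sigma}^2$, $\hat{V}$, the 95% interval, and the t statistic for a single coefficient); the design and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.5, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat

sigma2_hat = resid @ resid / (n - K)            # sigma_hat^2 = SSR / (n - K)
V_hat = sigma2_hat * np.linalg.inv(X.T @ X)     # V_hat = sigma_hat^2 * (X'X)^{-1}

k = 1                                           # coefficient of interest (0-based index)
se_k = np.sqrt(V_hat[k, k])
ci_95 = (beta_hat[k] - 1.96 * se_k, beta_hat[k] + 1.96 * se_k)
t_stat = (beta_hat[k] - 0.0) / se_k             # test H0: beta_k = 0
print(beta_hat[k], ci_95, t_stat)
```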
2 Robust Variances:
If we do not have the homoskedasticity assumption, then:

  $\sqrt{n}\,\big(\hat{\beta} - \beta\big) \xrightarrow{d} N\Big(0,\ \big(E[X_i X_i']\big)^{-1} E\big[\varepsilon_i^2 X_i X_i'\big]\, \big(E[X_i X_i']\big)^{-1}\Big)$

We can estimate the heteroskedasticity-consistent variance (White's estimator) as:

  $\hat{V} = \Big(\frac{1}{N}\sum_{i=1}^{N} X_i X_i'\Big)^{-1}\Big(\frac{1}{N}\sum_{i=1}^{N} \hat{\varepsilon}_i^2\, X_i X_i'\Big)\Big(\frac{1}{N}\sum_{i=1}^{N} X_i X_i'\Big)^{-1}$
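A sketch of White's sandwich formula in code; the heteroskedastic error design in the simulation is an assumption made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
eps = rng.normal(size=n) * (0.5 + np.abs(X[:, 1]))    # error variance depends on X: heteroskedastic
Y = X @ np.array([1.0, 2.0, -0.5]) + eps

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e_hat = Y - X @ beta_hat

# Sandwich: A^{-1} B A^{-1}, with A = (1/N) sum X_i X_i' and B = (1/N) sum e_i^2 X_i X_i'
A_inv = np.linalg.inv(X.T @ X / n)
B = (X * e_hat[:, None] ** 2).T @ X / n
V_robust = A_inv @ B @ A_inv                # asymptotic variance of sqrt(n)(beta_hat - beta)

robust_se = np.sqrt(np.diag(V_robust) / n)  # standard errors for beta_hat itself
print(robust_se)
```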
II MAXIMUM LIKELIHOOD ESTIMATION:
1 Introduction:
• Linear regression model:

  $Y_i = X_i'\beta + \varepsilon_i$, with $\varepsilon_i \mid X_i \sim N(0, \sigma^2)$

  The OLS estimator is:

  $\hat{\beta} = \arg\min_{\beta} \frac{1}{n}\sum_{i=1}^{n} \big(Y_i - X_i'\beta\big)^2 = \Big(\sum_{i=1}^{n} X_i X_i'\Big)^{-1}\sum_{i=1}^{n} X_i Y_i$
• Maximum likelihood estimator:

  $\big(\hat{\beta}_{MLE}, \hat{\sigma}^2_{MLE}\big) = \arg\max_{\beta, \sigma^2} L(\beta, \sigma^2)$

  where:

  $L(\beta, \sigma^2) = \sum_{i=1}^{n}\Big[-\frac{1}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\big(Y_i - X_i'\beta\big)^2\Big] = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\big(Y_i - X_i'\beta\big)^2$
Note: if $X \sim N(\mu, \sigma^2)$, the density function of X is:

  $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big(-\frac{(x - \mu)^2}{2\sigma^2}\Big)$
• This leads to the same estimator for $\beta$ as in OLS, and the MLE approach is a systematic way to deal with complex nonlinear models. The corresponding variance estimator is:

  $\hat{\sigma}^2_{MLE} = \frac{1}{n}\sum_{i=1}^{n}\big(Y_i - X_i'\hat{\beta}\big)^2$
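A sketch that maximizes this Gaussian log-likelihood numerically (using scipy.optimize.minimize, which is our choice of optimizer, not something prescribed by the notes) and checks that the resulting $\beta$ coincides with OLS:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.5, size=n)

def neg_loglik(params):
    """Negative Gaussian log-likelihood; params = (beta_1, ..., beta_K, log sigma^2)."""
    beta, log_s2 = params[:-1], params[-1]
    s2 = np.exp(log_s2)                       # reparameterize so sigma^2 > 0
    resid = Y - X @ beta
    return 0.5 * n * np.log(2 * np.pi * s2) + 0.5 * resid @ resid / s2

res = minimize(neg_loglik, x0=np.zeros(X.shape[1] + 1), method="BFGS")
beta_mle, sigma2_mle = res.x[:-1], np.exp(res.x[-1])

beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(beta_mle, beta_ols, atol=1e-4))   # beta_MLE matches OLS
print(sigma2_mle)                                   # approx (1/n) * sum of squared residuals
```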
2 Likelihood function:
• Suppose we have independent and identically distributed random variables $Z_1, \dots, Z_n$ with common density $f(Z_i, \theta)$. The likelihood function given a sample $Z_1, Z_2, \dots, Z_n$ is:

  $L(\theta) = \prod_{i=1}^{n} f(Z_i, \theta)$

• The log-likelihood function:

  $\mathcal{L}(\theta) = \ln L(\theta) = \sum_{i=1}^{n} \ln f(Z_i, \theta)$
• Building a likelihood function is the first step toward the job search theory model.
• An example of a maximum likelihood function:
  - An unemployed individual is assumed to receive job offers arriving at rate λ, such that the expected number of job offers arriving in a short interval of length dt is λdt.
  - Each offer consists of some wage rate w, drawn independently of previous wages, from a continuous distribution function $F_w(w)$.
  - If the offer is better than the reservation wage $\bar{w}$, that is, with probability $1 - F_w(\bar{w})$, the offer is accepted.
  - The reservation wage is set to maximize utility.
  - Suppose that the arrival rate is constant over time; then the optimal reservation wage is also constant over time.
  - The probability of receiving an acceptable offer in a short interval dt is θdt, with $\theta = \lambda\big(1 - F_w(\bar{w})\big)$.
  - The constant acceptance rate θ implies that the distribution of the unemployment duration is exponential with mean $1/\theta$ and density function:

    $f(y) = \theta e^{-\theta y}$

    where y is the unemployment duration (a random variable), with mean $1/\theta$ and variance $1/\theta^2$.

    $S(y) = 1 - F(y) = e^{-\theta y}$: survivor function.

    $h(y) = \lim_{dy \to 0} \dfrac{\Pr(y \le Y < y + dy \mid Y \ge y)}{dy} = \dfrac{f(y)}{S(y)} = \theta$: hazard function

    (the rate at which a job is offered and accepted).
Likelihood function:

a) If we observe the exact unemployment duration $y_i$:

  $L(\theta) = \prod_{i=1}^{n} f(y_i, \theta) = \prod_{i=1}^{n} h(y_i, \theta)\, S(y_i, \theta)$

b) We observe a number of people all becoming unemployed at the same point in time, but we only observe whether they exited unemployment before a fixed point in time, say c:

  $L(\theta) = \prod_{i=1}^{n} F(c, \theta)^{d_i}\big(1 - F(c, \theta)\big)^{1 - d_i} = \prod_{i=1}^{n} \big(1 - S(c, \theta)\big)^{d_i} S(c, \theta)^{1 - d_i}$

  where $d_i = 1$ denotes that individual i left unemployment before c and $d_i = 0$ denotes that this individual was still unemployed at time c.

c) If we observe the exact exit (failure) time when it occurs before c, but only an indicator when exit occurs after c:

  $L(\theta) = \prod_{i=1}^{n} f(y_i, \theta)^{d_i}\, S(c, \theta)^{1 - d_i}$

d) Denote by $c_i$ the specific censoring time of individual i. Letting $t_i$ denote the minimum of the exit time $y_i$ and the censoring time $c_i$, i.e. $t_i = \min(y_i, c_i)$:

  $L(\theta) = \prod_{i=1}^{n} f(y_i, \theta)^{d_i}\, S(c_i, \theta)^{1 - d_i} = \prod_{i=1}^{n} f(t_i, \theta)^{d_i}\, S(t_i, \theta)^{1 - d_i} = \prod_{i=1}^{n} h(t_i, \theta)^{d_i}\, S(t_i, \theta)$
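A sketch of case d) for the exponential duration model: simulate exit times with individual censoring, form the censored log-likelihood $\sum_i \big[d_i \ln\theta - \theta t_i\big]$ (the log of the last product above), and compare its maximizer with the closed-form solution $\hat{\theta} = \sum_i d_i / \sum_i t_i$ obtained by setting the score to zero. All numbers and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta_true = 1000, 0.4

y = rng.exponential(scale=1.0 / theta_true, size=n)   # latent exit times, mean 1/theta
c = rng.uniform(1.0, 6.0, size=n)                     # individual censoring times c_i
t = np.minimum(y, c)                                  # t_i = min(y_i, c_i)
d = (y <= c).astype(float)                            # d_i = 1 if the exit is observed

# Censored exponential log-likelihood: sum_i [d_i*ln f(t_i) + (1-d_i)*ln S(t_i)]
#                                    = sum_i [d_i*ln(theta) - theta*t_i]
def loglik(theta):
    return np.sum(d * np.log(theta) - theta * t)

theta_closed_form = d.sum() / t.sum()                 # exits divided by total exposure time

# Crude numerical maximization over a grid, just to confirm the closed form
grid = np.linspace(0.05, 2.0, 2000)
theta_grid = grid[np.argmax([loglik(th) for th in grid])]
print(theta_closed_form, theta_grid)                  # both close to theta_true = 0.4
```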
3 Properties of MLE:

  $\hat{\theta}_{MLE} = \arg\max_{\theta \in \Theta} \sum_{i=1}^{n} \ln f(Z_i, \theta)$
a Consistency:
  For all $\varepsilon > 0$:

  $\lim_{n \to \infty} \Pr\big(\big|\hat{\theta}_{MLE} - \theta\big| > \varepsilon\big) = 0$
b Asymptotic normality:

  $\sqrt{n}\,\big(\hat{\theta}_{MLE} - \theta\big) \xrightarrow{d} N\Big(0,\ \Big[-E\Big(\frac{\partial^2 \ln f(Z_i, \theta)}{\partial \theta\, \partial \theta'}\Big)\Big]^{-1}\Big)$
4 Computation of the maximum likelihood estimator:
Newton-Raphson method:
• Approximate the objective function $Q(\theta) = -L(\theta)$ around some starting value $\theta_0$ by a quadratic function and find the exact minimum of that quadratic approximation. Call this minimizer $\theta_1$.
• Redo the quadratic approximation around $\theta_1$ and iterate until the updates converge.
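A minimal Newton-Raphson sketch on the exponential duration likelihood from the previous section (gradient and Hessian of $Q(\theta) = -L(\theta)$ written out by hand; the starting value and tolerance are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.5, size=1000)   # simulated durations, true theta = 1/2.5 = 0.4

# Exponential model: L(theta) = n*ln(theta) - theta*sum(y), so Q(theta) = -L(theta)
n, s = len(y), y.sum()
grad = lambda th: -n / th + s               # Q'(theta)
hess = lambda th: n / th**2                 # Q''(theta) > 0: the local quadratic has a minimum

theta = 0.5                                 # starting value theta_0 (needs to be reasonable)
for _ in range(100):
    step = grad(theta) / hess(theta)        # shift to the minimizer of the local quadratic
    theta -= step
    if abs(step) < 1e-10:                   # stop once the update is negligible
        break

print(theta, n / s)                         # Newton-Raphson result vs. closed-form MLE n/sum(y)
```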