Class Notes in Statistics and Econometrics, Part 29


We will discuss such examples first: heteroskedastic disturbances with known relative variances, and some examples involving equicorrelated disturbances.


57.1 Cases when OLS and GLS are identical

Problem 498. From y = Xβ + ε with ε ∼ (o, σ²I) follows Py = PXβ + Pε with Pε ∼ (o, σ²PP⊤). Which conditions must P satisfy so that the generalized least squares regression of Py on PX with covariance matrix PP⊤ gives the same result as the original regression?

Problem 499. We are in the model y = Xβ + ε, ε ∼ (o, σ²Ψ). As always, we assume X has full column rank, and Ψ is nonsingular. We will discuss the special situation here in which X and Ψ are such that ΨX = XA for some A.

• a. 3 points. Show that the requirement ΨX = XA is equivalent to the requirement that R[ΨX] = R[X]. Here R[B] is the range space of a matrix B, i.e., the vector space consisting of all vectors that can be written in the form Bc for some c. Hint: for ⇒ show first that R[ΨX] ⊂ R[X], and then show that R[ΨX] has the same dimension as R[X].

Answer. ⇒: Clearly R[ΨX] ⊂ R[X], since ΨX = XA and every XAc has the form Xd with d = Ac. And since Ψ is nonsingular, and the range space is the space spanned by the column vectors, and the columns of ΨX are the columns of X premultiplied by Ψ, it follows that the range space of ΨX has the same dimension as that of X. ⇐: The ith column of ΨX lies in R[X], i.e., it can be written in the form Xaᵢ for some aᵢ; A is the matrix whose columns are all the aᵢ.

• b. 2 points. Show that A is nonsingular.


Answer. A is square, since XA = ΨX, i.e., XA has as many columns as X. Now assume Ac = o. Then XAc = o or ΨXc = o; since Ψ is nonsingular this gives Xc = o, and since X has full column rank, this gives c = o.

• c. 2 points. Show that XA⁻¹ = Ψ⁻¹X.

Answer. X = Ψ⁻¹ΨX = Ψ⁻¹XA; now postmultiply by A⁻¹.

• d. 2 points. Show that in this case (X⊤Ψ⁻¹X)⁻¹X⊤Ψ⁻¹ = (X⊤X)⁻¹X⊤, i.e., OLS is BLUE (“Kruskal’s theorem”).

Answer. Since Ψ is symmetric, part c gives X⊤Ψ⁻¹ = (Ψ⁻¹X)⊤ = (XA⁻¹)⊤ = (A⁻¹)⊤X⊤; therefore (X⊤Ψ⁻¹X)⁻¹X⊤Ψ⁻¹ = ((A⁻¹)⊤X⊤X)⁻¹(A⁻¹)⊤X⊤ = (X⊤X)⁻¹A⊤(A⁻¹)⊤X⊤ = (X⊤X)⁻¹X⊤.
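A quick numerical check of Kruskal's theorem in R (a minimal sketch; the data and the choice Ψ = I + ιι⊤ are invented for the illustration — since ι lies in R[X], ΨX = XA holds, so GLS must coincide with OLS):

    set.seed(1)
    n <- 50
    X <- cbind(1, rnorm(n))                  # the constant column iota is in R[X]
    y <- X %*% c(2, 3) + rnorm(n)
    Psi <- diag(n) + tcrossprod(rep(1, n))   # Psi = I + iota iota', so Psi X = X A
    ols <- solve(crossprod(X), crossprod(X, y))
    Pinv <- solve(Psi)
    gls <- solve(t(X) %*% Pinv %*% X, t(X) %*% Pinv %*% y)
    all.equal(c(ols), c(gls))                # TRUE: OLS and GLS give the same result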

57.2 Heteroskedastic Disturbances

Heteroskedasticity means: error terms are independent, but their variances are not equal. Ψ is diagonal, with positive diagonal elements. In a few rare cases the relative variances are known. The main example is that the observations are means of samples from a homoskedastic population with varying but known sizes.

This is a plausible example of a situation in which the relative variances are known to be proportional to an observed (positive) nonrandom variable z (which may or may not be one of the explanatory variables in the regression). Here V[ε] = σ²Ψ with a known diagonal Ψ.
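In this case GLS reduces to weighted least squares with weights 1/zₜ. A minimal R sketch with simulated data (the variable names and the data-generating process are invented; lm minimizes the weighted sum of squared residuals, so weights 1/z implement GLS here):

    set.seed(2)
    n <- 100
    z <- runif(n, 1, 4)                       # observed positive variable
    x <- rnorm(n)
    y <- 1 + 2 * x + rnorm(n, sd = sqrt(z))   # var[eps_i] proportional to z_i
    summary(lm(y ~ x, weights = 1 / z))       # GLS with known relative variances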

Problem 500. 3 points. The specification is (…). Explain how to obtain estimates of β₁, β₂, and β₃ using the γ̂. What properties do your estimates of the β's have?


Answer. Divide the original specification by xₜ to get (…). Since γₜ ∼ IID(β, σ²), one can also write it γₜ = β + δₜ, or γ = ιβ + δ with δ ∼ (o, σ²I). This model can be converted into a heteroskedastic least squares model if one defines ε = x ∗ δ (elementwise product). Then y = xβ + ε with ε ∼ (o, σ²Ψ), where Ψ = diag(x₁², …, xₙ²). Since x⊤Ψ⁻¹ = (x⁻¹)⊤ (taking the inverse element by element), and therefore x⊤Ψ⁻¹x = n, the GLS estimator β̂ = (x⊤Ψ⁻¹x)⁻¹x⊤Ψ⁻¹y = (1/n) Σₜ yₜ/xₜ is the mean of the ratios γₜ.
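A one-line check of this in R with simulated data (names and numbers invented): the GLS estimate is the sample mean of the ratios yₜ/xₜ, and it coincides with weighted least squares using weights 1/xₜ².

    set.seed(4)
    n <- 30
    x <- runif(n, 1, 3)
    y <- 2 * x + rnorm(n, sd = x)                     # var[eps_t] = sigma^2 x_t^2
    b.gls <- mean(y / x)                              # mean of the ratios
    b.wls <- coef(lm(y ~ x - 1, weights = 1 / x^2))   # same estimate via weighted lm
    c(b.gls, b.wls)                                   # identical up to rounding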


If the xₜ are realizations of a random variable x with zero mean and finite fourth moments, it follows that

(57.2.6) plim var[β̂_OLS]/var[β̂] = plim n Σₜ xₜ⁴ / (Σₜ xₜ²)² = E[x⁴]/(E[x²])².

This is the kurtosis (without subtracting the 3). Theoretically it can be anything ≥ 1; the Normal distribution has kurtosis 3, and economics time series usually have a kurtosis between 2 and 4.
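A quick Monte Carlo check of (57.2.6) in R (a sketch; the sample size and the distribution of x are arbitrary). For Normal x the efficiency ratio should be close to the kurtosis 3:

    set.seed(3)
    x <- rnorm(1e5)
    n <- length(x)
    c(n * sum(x^4) / sum(x^2)^2,    # exact finite-sample ratio var[b_OLS]/var[b_GLS]
      mean(x^4) / mean(x^2)^2)      # sample kurtosis; both are close to 3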

57.3 Equicorrelated Covariance Matrix

Problem 501. Assume yᵢ = µ + εᵢ, where µ is nonrandom, E[εᵢ] = 0, var[εᵢ] = σ², and cov[εᵢ, εⱼ] = ρσ² for i ≠ j (i.e., the εᵢ are equicorrelated).

If ρ ≥ 0, then these error terms could have been obtained as follows: ε = z + ιu, where z ∼ (o, τ²I) and u ∼ (0, ω²) independent of z.

• a. 1 point. Show that the covariance matrix of ε is V[ε] = τ²I + ω²ιι⊤.


Answer. V[ιu] = ι var[u] ι⊤ = ω²ιι⊤; add this to V[z] = τ²I.

• b. 1 point. What are the values of τ² and ω² so that ε has the above covariance structure?

Answer. To write it in the desired form, the following identities must hold: for the off-diagonal elements σ²ρ = ω², which gives the desired formula for ω²; and for the diagonal elements σ² = τ² + ω². Solving this for τ² and plugging in the formula for ω² gives τ² = σ² − ω² = σ²(1 − ρ).
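The two identities are easy to verify numerically, as in this minimal R sketch (σ², ρ, and n are arbitrary example values):

    sigma2 <- 4; rho <- 0.3; n <- 5
    omega2 <- rho * sigma2            # from the off-diagonal elements
    tau2 <- sigma2 * (1 - rho)        # from the diagonal elements
    V1 <- tau2 * diag(n) + omega2 * tcrossprod(rep(1, n))
    V2 <- sigma2 * ((1 - rho) * diag(n) + rho * matrix(1, n, n))
    all.equal(V1, V2)                 # TRUE: the same equicorrelated matrix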


Now multiply the lefthand sides and the righthand sides (use the middle term in (57.3.4)). As n → ∞ this converges towards ω², not to 0.

Problem 502. [Chr87, pp. 361–363] Assume there are 1000 families in a certain town, and denote the income of family k by zₖ. Let µ = (1/1000) Σ_{k=1}^{1000} zₖ. You pick at random 20 families without replacement, ask them what their income is, and you want to compute the BLUE of µ on the basis of this random sample. Call the incomes in the sample y₁, …, y₂₀. We are using the letters yᵢ instead of zᵢ for this sample, because y₁ is not necessarily z₁, i.e., the income of family 1; it may be, e.g., z₂₅₈. The yᵢ are random. The process of taking the sample of yᵢ is represented by a 20 × 1000 matrix of random variables q_ik (i = 1, …, 20; k = 1, …, 1000) with q_ik = 1 if family k has been picked as the ith family in the sample, and 0 otherwise. In other words, yᵢ = Σ_{k=1}^{1000} q_ik zₖ.

(…) If family k has been selected as the ith family in the sample, then it cannot be selected again as the jth family of the sample. Is q_ik independent of q_jl? I think it is not.

• b. Show that the first and second moments are

E[q_ik] = 1/1000 for all i and k; E[q_ik q_jl] = (1/1000)(1/999) for i ≠ j and k ≠ l; E[q_ik q_jl] = 0 for i ≠ j, k = l and for i = j, k ≠ l; and E[q_ik²] = 1/1000.

For these formulas you need the rules for taking expected values of discrete random variables.

Answer. Since q_ik is a zero-one variable, E[q_ik] = Pr[q_ik = 1] = 1/1000. This is obvious if i = 1, and one can use a symmetry argument that it should not depend on i. And since for a zero-one variable q_ik² = q_ik, it follows that E[q_ik²] = 1/1000 too. Now for i ≠ j, k ≠ l, E[q_ik q_jl] = Pr[q_ik = 1 ∩ q_jl = 1] = (1/1000)(1/999). Again this is obvious for i = 1 and j = 2, and can be extended by symmetry to arbitrary pairs i ≠ j. For i ≠ j, E[q_ik q_jk] = 0, since zₖ cannot be chosen twice, and for k ≠ l, E[q_ik q_il] = 0, since only one zₖ can be chosen as the ith element in the sample.

• c. Define εᵢ = yᵢ − µ and σ² = (1/1000) Σ_{k=1}^{1000} (zₖ − µ)². Show that

(57.3.9) E[εᵢ] = 0, var[εᵢ] = σ², cov[εᵢ, εⱼ] = −σ²/999 for i ≠ j.

Hint: For the covariance note that from 0 = Σ_{k=1}^{1000} (zₖ − µ) follows

(57.3.10) 0 = (Σ_{k=1}^{1000} (zₖ − µ))² = Σ_{k≠l} (zₖ − µ)(zₗ − µ) + Σ_{k=1}^{1000} (zₖ − µ)² = Σ_{k≠l} (zₖ − µ)(zₗ − µ) + 1000σ².


Answer.

(57.3.11) E[εᵢ] = Σ_{k=1}^{1000} (zₖ − µ) E[q_ik] = Σ_{k=1}^{1000} (zₖ − µ)/1000 = 0

(57.3.12) var[εᵢ] = E[εᵢ²] = Σ_{k,l=1}^{1000} (zₖ − µ)(zₗ − µ) E[q_ik q_il] = Σ_{k=1}^{1000} (zₖ − µ)²/1000 = σ²

and for i ≠ j follows, using the hint for the last equals sign,

(57.3.13) cov[εᵢ, εⱼ] = E[εᵢεⱼ] = Σ_{k,l=1}^{1000} (zₖ − µ)(zₗ − µ) E[q_ik q_jl] = Σ_{k≠l} (zₖ − µ)(zₗ − µ)/(1000 · 999) = −1000σ²/(1000 · 999) = −σ²/999.

With ι₂₀ being the 20 × 1 column vector consisting of ones, one can therefore write in matrix notation y = ι₂₀µ + ε, E[ε] = o, V[ε] = σ²Ψ, where Ψ is the equicorrelated matrix with ones in the diagonal and −1/999 everywhere off the diagonal.
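The moments (57.3.9) can be checked by simulation. A minimal R sketch (the income distribution is invented; any fixed vector of 1000 incomes works):

    set.seed(5)
    z <- rlnorm(1000, meanlog = 3)             # hypothetical incomes of 1000 families
    mu <- mean(z); sigma2 <- mean((z - mu)^2)
    eps <- replicate(1e5, sample(z, 20)) - mu  # 20 draws without replacement, repeated
    c(mean(eps[1, ]), var(eps[1, ]), sigma2)   # E[eps_i] ~ 0, var[eps_i] ~ sigma2
    c(cov(eps[1, ], eps[2, ]), -sigma2 / 999)  # cov[eps_i, eps_j] ~ -sigma2/999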


CHAPTER 58

Unknown Parameters in the Covariance Matrix

If Ψ depends on certain unknown parameters which are not, at the same time, components of β or functions thereof, and if a consistent estimate of these parameters is available, then GLS with this estimated covariance matrix, called “feasible GLS,” is usually asymptotically efficient. This is an important result: one does not need an efficient estimate of the covariance matrix to get efficient estimates of β! In this case, all the results are asymptotically valid, with Ψ̂ in the formulas instead of Ψ. These estimates are sometimes even unbiased!


58.1 Heteroskedasticity

Heteroskedasticity means: error terms are independent, but do not have equal variances. There are not enough data to get consistent estimates of all error variances; therefore we need additional information.

The simplest kind of additional information is that the sample can be partitioned into two different subsets, each subset corresponding to a different error variance, with the relative variances unknown. Write the model as (…).

To make this formula operational, we have to replace the κᵢ² by estimates. The simplest way (if each subset has at least k + 1 observations) is to use the unbiased estimates sᵢ² (i = 1, 2) from the OLS regressions on the two subsets separately. Associated with this estimation is also an easy test, the Goldfeld-Quandt test [Gre97, 551/2]: simply use an F-test on the ratio s₂²/s₁², but reject if it is too big or too small.


small If we don’t have the lower significance points, check s2/ 2 if it is > 1 and

Problem. In the model y₁ = X₁β + ε₁, y₂ = X₂β + ε₂, with V of the stacked disturbance vector block-diagonal with blocks σ₁²I and σ₂²I, in which X₁ is a 10 × 5 and X₂ a 20 × 5 matrix, you run the two regressions separately and you get s₁² = 500 and s₂² = 100. Can you reject at the 5% significance level that these variances are equal? Can you reject it at the 1% level? The enclosed tables are from [Sch59, pp. 424–33].

Answer. The distribution of the ratio of estimated variances is s₂²/s₁² ∼ F₁₅,₅, but since its observed value is smaller than 1, use instead s₁²/s₂² ∼ F₅,₁₅, with observed value 5. The upper significance point for 0.5% is F(5,15;0.005) = 5.37 (which gives a two-sided 1% significance level), for 1% it is F(5,15;0.01) = 4.56 (which gives a two-sided 2% significance level), for 2.5% it is F(5,15;0.025) = 3.58 (which gives a two-sided 5% significance level), and for 5% it is F(5,15;0.05) = 2.90 (which gives a two-sided 10% significance level). A table can be found for instance in [Sch59, pp. 428/9]. To get the upper 2.5% point one can also use the Splus command qf(1-5/200,5,15); one can get the lower significance points simply by the command qf(5/200,5,15). The test is therefore significant at the 5% level but not at the 1% level.
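The same computation in R (qf and pf are the R equivalents of the Splus commands quoted above; the numbers come from the problem):

    s1sq <- 500; df1 <- 10 - 5      # subset 1: n1 - k = 5
    s2sq <- 100; df2 <- 20 - 5      # subset 2: n2 - k = 15
    Fobs <- s1sq / s2sq             # = 5, larger variance in the numerator
    qf(c(0.975, 0.995), df1, df2)   # 3.58 and 5.37: two-sided 5% and 1% points
    2 * pf(Fobs, df1, df2, lower.tail = FALSE)   # two-sided p-value, about 0.013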


Since the so-called Kmenta-Oberhofer conditions are satisfied, i.e., since Ψ does not depend on β, the following iterative procedure converges to the maximum likelihood estimator (a code sketch follows the list):

(1) Start with some initial estimate of κ₁² and κ₂². [Gre97, p. 516] proposes to start with the assumption of homoskedasticity, i.e., κ₁² = κ₂² = 1, but if each group has enough observations to make separate estimates then I think a better starting point would be the sᵢ² of the separate regressions.

(2) Use those κᵢ² to get the feasible GLSE.

(3) Use this feasible GLSE to get a new set κᵢ² = sᵢ² (but divide by nᵢ, not nᵢ − k).

(4) Go back to (2).
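A minimal R sketch of this iteration on simulated two-group data (all names and the data-generating process are invented; rescaling the rows by 1/κᵢ and running OLS computes the feasible GLSE of step (2)):

    set.seed(7)
    n1 <- 40; n2 <- 60
    g <- rep(1:2, c(n1, n2))                    # group indicator
    X <- cbind(1, rnorm(n1 + n2))
    y <- X %*% c(1, 2) + rnorm(n1 + n2, sd = ifelse(g == 1, 3, 1))
    kap <- c(1, 1)                              # (1) start from homoskedasticity
    for (it in 1:20) {
      w <- 1 / sqrt(kap[g])
      b <- lm.fit(X * w, y * w)$coefficients    # (2) feasible GLSE
      kap <- tapply(c(y - X %*% b)^2, g, mean)  # (3) new kappa2_i, divided by n_i
    }                                           # (4) iterate until convergence
    list(beta = b, kappa2 = kap)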

Once the maximum likelihood estimates of β, σ², and the κᵢ² are computed (…). Since σ̂² = (1/n)(y − Xβ)⊤Ψ⁻¹(y − Xβ) and det[kΨ] = kⁿ det[Ψ], one can rewrite (35.0.17) as (…).


58.1.1 Logarithm of Error Variances Proportional to Unknown Linear Combination of Explanatory Variables

When we discussed heteroskedasticity with known relative variances, the main example was the prior knowledge that the error variances were proportional to some observed z. To generalize this procedure, [Har76] proposes the specification ln σₜ² = zₜ⊤α, where the matrix Z with rows z₁⊤, …, zₙ⊤ consists of observations of m nonrandom explanatory variables which include the constant “variable” ι. The variables in Z are often functions of certain variables in X, but this is not necessary for the derivation that follows.

A special case of this specification is σₜ² = σ²xₜᵖ or, after taking logarithms, ln σₜ² = ln σ² + p ln xₜ, i.e., zₜ⊤ = [1  ln xₜ] and α = [ln σ²  p]⊤.


(…) the εₜ/σₜ are i.i.d. The lefthand side of (58.1.9) is not observed, but one can take the OLS residuals ε̂ₜ; usually ln ε̂ₜ² → ln εₜ² in the probability limit.

There is only one hitch: the disturbances in regression (58.1.9) do not have zero expected value. Their expected value is an unknown constant. If one ignores that and runs a regression on (58.1.9), one gets an inconsistent estimate of the element of α which is the coefficient of the constant term in Z. This estimate really estimates the sum of the constant term plus the expected value of the disturbance. As a consequence of this inconsistency, the vector exp(Zα̂) estimates the vector of variances only up to a joint multiplicative constant; i.e., this inconsistency is such that the plim of the variance estimates is not equal but nevertheless proportional to the true variances. But proportionality is all one needs for GLS; the missing multiplicative constant is then the s² provided by the least squares formalism.

Therefore all one has to do is: run the regression (58.1.9) (if the F test does not reject, then homoskedasticity cannot be rejected), get the (inconsistent but proportional) estimates σ̂ₜ² = exp(zₜ⊤α̂), divide the tth observation of the original regression by σ̂ₜ, and re-run the original regression on the transformed data. Consistent estimates of σₜ² are then the s² from this transformed regression times the inconsistent estimates σ̂ₜ².
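A minimal R sketch of the whole procedure on simulated data (the data-generating process and the choice Z = [ι  ln x] are invented for the illustration):

    set.seed(8)
    n <- 200
    x <- runif(n, 1, 5)
    y <- 1 + 2 * x + rnorm(n, sd = sqrt(exp(0.5 + 1.2 * log(x))))
    e <- resid(lm(y ~ x))                   # OLS residuals
    aux <- lm(log(e^2) ~ log(x))            # regression (58.1.9) with Z = [1, log x]
    s2hat <- exp(fitted(aux))               # proportional to the true variances
    fit <- lm(y ~ x, weights = 1 / s2hat)   # re-run on the transformed data
    head(sigma(fit)^2 * s2hat)              # consistent estimates of sigma2_t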

58.1.2 Testing for heteroskedasticity

One test is the F-test in the procedure just described. Then there is the Goldfeld-Quandt test: if it is possible to order the observations in order of increasing error variance, run separate regressions on the portion of the data with low variance and that with high variance, perhaps leaving out some observations in the middle to increase the power of the test, and then just make an F-test with (SSE_high/d.f.)/(SSE_low/d.f.).


Look at the following simple example from [Gre97, fn. 3 on p. 547]: y = xβ + ε with var[εᵢ] = σ²zᵢ², where z is normalized so that E[z²] = 1. For the variance of the OLS estimator we need

plim [(1/n) Σᵢ xᵢ²zᵢ²] / [(1/n) Σᵢ xᵢ²] = E[x²z²]/E[x²] = (cov[x², z²] + E[x²] E[z²])/E[x²] = 1

whenever cov[x², z²] = 0, for instance if x and z are independent. I.e., if one simply runs OLS in this model, then the regression printout is not misleading. On the other hand, it is clear that always var[β̂_OLS] > var[β̂]; therefore if z is observed, then one can do better than this.
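A simulation check of this claim in R (a sketch; x and z independent standard Normal, so cov[x², z²] = 0 and E[z²] = 1):

    set.seed(11)
    n <- 2e5
    x <- rnorm(n); z <- rnorm(n)
    y <- 2 * x + rnorm(n, sd = abs(z))          # var[eps_i] = z_i^2, sigma = 1
    se.print <- sqrt(vcov(lm(y ~ x - 1)))       # standard error from the printout
    se.true <- sqrt(sum(x^2 * z^2)) / sum(x^2)  # true sd of b_OLS given x and z
    c(se.print, se.true)                        # nearly identical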


Problem 505. Someone says: the formula (…).

(…) one gets White's heteroskedasticity-consistent estimator

(58.1.18) Est.Var[β̂_OLS] = (X⊤X)⁻¹ (Σᵢ ε̂ᵢ² xᵢxᵢ⊤) (X⊤X)⁻¹.


This estimator has become very fashionable, since one does not have to bother with estimating the covariance structure, and since OLS is not too inefficient in these situations.

It has been observed, however, that this estimator gives too small confidence intervals in small samples. Therefore it is recommended in small samples to multiply the estimated variance by the factor n/(n − k), or to use ε̂ᵢ²/mᵢᵢ as the estimates of σᵢ², where mᵢᵢ is the ith diagonal element of M = I − X(X⊤X)⁻¹X⊤. See [DM93, p. 554].
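The estimator (58.1.18) and the two small-sample corrections, as a minimal R sketch (all names invented; hc1 multiplies by n/(n − k), hc2 replaces ε̂ᵢ² by ε̂ᵢ²/mᵢᵢ):

    set.seed(9)
    n <- 50
    X <- cbind(1, rnorm(n)); k <- ncol(X)
    y <- X %*% c(1, 2) + rnorm(n, sd = abs(X[, 2]))
    XtXi <- solve(crossprod(X))
    e <- c(y - X %*% XtXi %*% crossprod(X, y))   # OLS residuals
    meat <- function(u2) crossprod(X * u2, X)    # sum of u2_i x_i x_i'
    hc0 <- XtXi %*% meat(e^2) %*% XtXi           # White, (58.1.18)
    hc1 <- hc0 * n / (n - k)                     # degrees-of-freedom factor
    m <- 1 - rowSums((X %*% XtXi) * X)           # m_ii = 1 - h_ii
    hc2 <- XtXi %*% meat(e^2 / m) %*% XtXi       # uses e_i^2 / m_ii
    sqrt(cbind(hc0 = diag(hc0), hc1 = diag(hc1), hc2 = diag(hc2)))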

58.2 Autocorrelation

While heteroskedasticity is most often found with cross-sectional data, autocorrelation is more common with time series.

Properties of OLS in the presence of autocorrelation: if the correlation between the observations dies off sufficiently rapidly as the observations become further apart in time, OLS is consistent and asymptotically normal, but inefficient. There is one important exception to this rule: if the regression includes lagged dependent variables and there is autocorrelation, then OLS and also GLS are inconsistent.


Problem 506. [JHG+88, p. 577] and [Gre97, 13.4.1] Assume yₜ = βyₜ₋₁ + εₜ, where εₜ = ρεₜ₋₁ + vₜ and the vₜ are independent with mean zero and constant variance.

• a. 2 points. Show that vₜ is independent of all εₛ and yₛ for 0 ≤ s < t.

Answer. Both proofs by induction. First, independence of vₜ and εₛ: by the induction assumption, vₜ is independent of εₛ₋₁, and since t > s, i.e., t ≠ s, vₜ is also independent of vₛ; therefore vₜ is independent of εₛ = ρεₛ₋₁ + vₛ. Now independence of vₜ and yₛ: by the induction assumption, vₜ is independent of yₛ₋₁, and since t > s, vₜ is also independent of εₛ; therefore vₜ is independent of yₛ = βyₛ₋₁ + εₛ.


Answer. Do not yet compute var[εₜ₋₁] at this point, just call it σ². Assuming stationarity, i.e., cov[εₜ, yₜ₋₁] = cov[εₜ₋₁, yₜ₋₂], it follows

(58.2.7) cov[εₜ, yₜ₋₁](1 − ρβ) = ρσ²

(58.2.8) cov[εₜ, yₜ₋₁] = ρσ²/(1 − ρβ)


• e. 2 points. Show that, again under conditions of stationarity,

(58.2.12) var[yₜ₋₁] = var[yₜ] = [(1 + βρ)/(1 − βρ)] · σ²/(1 − β²)
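Formula (58.2.12) is easy to confirm by simulation in R (parameter values invented; stats::filter with method = "recursive" generates the two autoregressive recursions):

    set.seed(10)
    beta <- 0.5; rho <- 0.4
    v <- rnorm(1e6)
    eps <- filter(v, rho, method = "recursive")    # eps_t = rho eps_{t-1} + v_t
    y <- filter(eps, beta, method = "recursive")   # y_t = beta y_{t-1} + eps_t
    sig2 <- var(eps)                               # sigma^2 = var[eps_t]
    c(empirical = var(y),
      formula = (1 + beta * rho) / (1 - beta * rho) * sig2 / (1 - beta^2))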
