
THE MULTIVARIATE LINEAR REGRESSION MODEL


$$y_t = B'x_t + u_t, \qquad t \in \mathbb{T}, \tag{24.1}$$

where $y_t\colon m \times 1$, $B\colon k \times m$, $x_t\colon k \times 1$, $u_t\colon m \times 1$. The system (1) is effectively a system of $m$ linear regression equations:

$$y_{it} = \beta_i' x_t + u_{it}, \qquad i = 1, 2, \ldots, m, \tag{24.2}$$

with $B = (\beta_1, \beta_2, \ldots, \beta_m)$.

In direct analogy with the $m = 1$ case (see Chapter 19) the multivariate linear regression model will be derived from first principles based on the joint distribution of the observable random variables involved, $D(Z_t; \psi)$, where $Z_t = (y_t', x_t')'\colon (m+k) \times 1$. Assuming that $Z_t$ is an IID normally distributed vector, i.e.

$$Z_t \sim N(0, \Sigma), \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$


Moreover, by construction, $u_t$ and $\mu_t$ satisfy the following properties:

(i) $E(u_t) = E[E(u_t/X_t = x_t)] = 0$;

(ii) $E(u_t u_s') = E[E(u_t u_s'/X_t = x_t)] = \begin{cases} \Omega, & t = s, \\ 0, & t \neq s; \end{cases}$

(iii) $E(u_t \mu_t') = E[E(u_t \mu_t'/X_t = x_t)] = E[\mu_t E(u_t/X_t = x_t)'] = 0$, $t \in \mathbb{T}$,

where $\Omega = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ (compare these with the results in Section 19.2).

The similarity between the $m = 1$ case and the general case allows us to consider several loose ends left in Chapter 19. The first is the use of the joint distribution $D(Z_t; \psi)$ in defining the model instead of concentrating exclusively on $D(y_t/X_t; \psi_1)$. The loss of generality in postulating the form of the joint distribution is more than compensated for by the additional insight provided. In practice it is often easier to 'judge' the plausibility of assumptions relating to the nature of $D(Z_t; \psi)$ rather than $D(y_t/X_t; \psi_1)$.

Moreover, in misspecification analysis the relationship between the assumptions underlying the model and those underlying the random vector process $\{Z_t, t \in \mathbb{T}\}$ enhances our understanding of the nature of the possible departures. An interesting example of this is the relationship of

the assumption that $\{Z_t, t \in \mathbb{T}\}$ is a

(1) normal (N);

(2) independent (I); and

(3) identically distributed (ID) process

to the conditions

[6] (i) $D(y_t/X_t; \psi_1)$ is normal;

(ii) $E(y_t/X_t = x_t) = B'x_t$ is linear in $x_t$;

(iii) $\text{Cov}(y_t/X_t = x_t) = \Omega$ is free of $x_t$ (homoskedastic).

The question which naturally arises is whether (i)-(iii) imply (N) or not. The following lemma shows that if (i)-(iii) are supplemented by the assumption that $X_t \sim N(0, \Sigma_{22})$, $\det(\Sigma_{22}) \neq 0$, the reverse implication holds.

Lemma 24.1

$Z_t \sim N(0, \Sigma)$ for $t \in \mathbb{T}$ if and only if

(i) $X_t \sim N(0, \Sigma_{22})$, $\det(\Sigma_{22}) \neq 0$;

In matrix form the statistical GM over the whole sample period can be written as

$$Y = XB + U, \tag{24.6}$$

where $Y\colon T \times m$, $X\colon T \times k$, $B\colon k \times m$, $U\colon T \times m$. The system in (1) can be viewed as the $t$th row of (6); the $i$th regression in (2), written for all $T$ observations together, takes the form

$$y_i = X\beta_i + u_i.$$

In order to define the conditional distribution $D(Y/X; \psi_1)$ we need the special notation of Kronecker products (see Appendix 2). Using this notation the matrix distribution can be written in the form

$$\text{vec}(Y) = (I_m \otimes X)\,\text{vec}(B) + \text{vec}(U) \tag{24.9}$$

or, in an obvious notation,

$$y_* = X_*\beta_* + u_*, \qquad X_* \equiv I_m \otimes X, \quad \beta_* = \text{vec}(B), \quad u_* = \text{vec}(U).$$
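The vec/Kronecker identity in (24.9) is easy to verify numerically. Below is a minimal numpy sketch with made-up dimensions (all names and sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
T, k, m = 50, 3, 2                      # hypothetical sample size and dimensions

X = rng.normal(size=(T, k))
B = rng.normal(size=(k, m))
U = rng.normal(size=(T, m))
Y = X @ B + U                           # matrix form (24.6): Y = XB + U

def vec(A):
    # stack the columns of A into one long vector (column-major order)
    return A.reshape(-1, order="F")

# equation (24.9): vec(Y) = (I_m kron X) vec(B) + vec(U)
lhs = vec(Y)
rhs = np.kron(np.eye(m), X) @ vec(B) + vec(U)
assert np.allclose(lhs, rhs)
```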


The multivariate linear regression (MLR) model is of considerable interest in econometrics because of its direct relationship with the simultaneous equations formulation to be considered in Chapter 25. In particular, the latter formulation can be viewed as a reparametrisation of the MLR model where the statistical parameters of interest $\theta = (B, \Omega)$ do not coincide with the theoretical parameters of interest $\xi$. Instead, the two sets of parameters are related by some system of implicit equations of the form:

These equations can be interpreted as providing an alternative parametrisation for the statistical GM in terms of the theoretical parameters of interest. In view of this relationship between the two statistical models, a sound understanding of the MLR model will pave the way for the simultaneous equations formulation in Chapter 25.

24.2 Specification and estimation

In direct analogy to the linear regression model ($m = 1$) the multivariate linear regression model is specified as follows:

(I) Statistical GM: $y_t = B'x_t + u_t$, $t \in \mathbb{T}$,

$y_t\colon m \times 1$, $x_t\colon k \times 1$, $B\colon k \times m$.

[1] The systematic and non-systematic components are:

$$\mu_t = E(y_t/X_t = x_t) = B'x_t, \qquad u_t = y_t - E(y_t/X_t = x_t),$$

and by construction

$$E(u_t) = E[E(u_t/X_t = x_t)] = 0,$$

$$E(u_t\mu_t') = E[E(u_t\mu_t'/X_t = x_t)] = 0, \qquad t \in \mathbb{T}.$$

[2] The statistical parameters of interest are $\theta = (B, \Omega)$, where $B = \Sigma_{22}^{-1}\Sigma_{21}$ and $\Omega = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ (see the numerical sketch after this specification).

[3] $X_t$ is assumed to be weakly exogenous with respect to $\theta$.

[4] No a priori information on $\theta$.

[5] $\text{Rank}(X) = k$, $X = (x_1, x_2, \ldots, x_T)'\colon T \times k$, for $T > k$.


(III) Sampling model

[8] $Y \equiv (y_1, y_2, \ldots, y_T)'$ is an independent sample sequentially drawn from $D(y_t/X_t; \theta)$, $t = 1, 2, \ldots, T$, and $T \geq m + k$.
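As a numerical illustration of [2], the statistical parameters of interest can be read off any positive definite joint covariance matrix of $Z_t = (y_t', x_t')'$. A sketch with made-up numbers ($m = k = 2$; the construction of $\Sigma$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
m, k = 2, 2
A = rng.normal(size=(m + k, m + k))
Sigma = A.T @ A + 0.5 * np.eye(m + k)   # a generic positive definite joint covariance

S11, S12 = Sigma[:m, :m], Sigma[:m, m:]
S21, S22 = Sigma[m:, :m], Sigma[m:, m:]

B = np.linalg.solve(S22, S21)                     # B = S22^{-1} S21  (k x m)
Omega = S11 - S12 @ np.linalg.solve(S22, S21)     # Omega = S11 - S12 S22^{-1} S21

# Omega is a conditional covariance matrix, so it must be positive definite.
assert np.all(np.linalg.eigvalsh(Omega) > 0)
```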

The above specification is almost identical to that of the $m = 1$ case considered in Chapter 19. The discussion of the assumptions in that chapter applies to [1]-[8] above with only minor modifications due to $m > 1$. The only real change brought about by $m > 1$ is the increase in the number of statistical parameters of interest to $mk + \frac{1}{2}m(m+1)$. It should come as no surprise to learn that the similarities between the two statistical models extend to estimation, testing and prediction.

From assumptions [6] to [8] we can deduce that the likelihood function takes the form

$$L(\theta; Y) = c(Y)\prod_{t=1}^{T} D(y_t/X_t; \theta)$$

and the log likelihood is

$$\log L = \text{const} - \frac{T}{2}\log(\det \Omega) - \frac{1}{2}\sum_{t=1}^{T}(y_t - B'x_t)'\Omega^{-1}(y_t - B'x_t) \tag{24.12}$$

$$= \text{const} - \frac{1}{2}\left[T\log(\det \Omega) + \text{tr}\,\Omega^{-1}(Y - XB)'(Y - XB)\right] \tag{24.13}$$

(see exercise 1). The first-order conditions are

$$\frac{\partial \log L}{\partial B} = X'(Y - XB)\Omega^{-1} = 0, \qquad \frac{\partial \log L}{\partial \Omega} = -\frac{T}{2}\Omega^{-1} + \frac{1}{2}\Omega^{-1}(Y - XB)'(Y - XB)\Omega^{-1} = 0.$$


These first-order conditions lead to the following MLE's:

$$\hat B = (X'X)^{-1}X'Y, \qquad \hat\Omega = \frac{1}{T}\hat U'\hat U, \qquad \hat U = Y - X\hat B,$$

and $\hat\mu_t \perp \hat u_t$. This orthogonality can be used to define a goodness-of-fit measure by extending $R^2 = 1 - (\hat u'\hat u)/(y'y)$ to

$$G = I - (\hat U'\hat U)(Y'Y)^{-1} = (Y'Y - \hat U'\hat U)(Y'Y)^{-1}. \tag{24.20}$$

The matrix $G$ varies between the identity matrix when $\hat U = 0$ and zero when $Y = \hat U$ (no explanation). In order to reduce this matrix goodness-of-fit measure to a scalar we can use the trace or the determinant.
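In numpy these estimators and the measure $G$ look as follows (a sketch on simulated data; dimensions and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
T, k, m = 100, 3, 2
X = rng.normal(size=(T, k))
B = rng.normal(size=(k, m))
Y = X @ B + 0.5 * rng.normal(size=(T, m))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # MLE of B: (X'X)^{-1} X'Y
U_hat = Y - X @ B_hat                       # residuals
Omega_hat = (U_hat.T @ U_hat) / T           # MLE of Omega (divide by T-k for unbiasedness)
assert np.allclose(X.T @ U_hat, 0, atol=1e-8)   # the orthogonality behind (24.20)

# matrix goodness of fit (24.20): G = (Y'Y - U'U)(Y'Y)^{-1}
G = (Y.T @ Y - U_hat.T @ U_hat) @ np.linalg.inv(Y.T @ Y)
print(np.trace(G) / m, np.linalg.det(G))    # scalar reductions: trace and determinant
```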

where $E(\cdot)$ is relative to $D(Y/X; \theta)$.


Finite sample properties of $\hat B$ and $\hat\Omega$

From the fact that $\hat B$ and $\hat\Omega$ are MLE's we can deduce that they enjoy the invariance property of such estimators (see Chapter 13) and that they are functions of the minimal sufficient statistics, if the latter exist. Using the Lehmann-Scheffé result (see Chapter 12) we can see that the ratio

$$\frac{D(Y/X; \theta)}{D(Y_0/X; \theta)} = \exp\left\{-\tfrac{1}{2}\,\text{tr}\,\Omega^{-1}\left[Y'Y - Y_0'Y_0 - (Y - Y_0)'XB - B'X'(Y - Y_0)\right]\right\} \tag{24.24}$$

is independent of $\theta$ if $Y'Y = Y_0'Y_0$ and $Y'X = Y_0'X$. This implies that

$$\tau(Y) = (\tau_1(Y), \tau_2(Y)), \quad \text{where } \tau_1(Y) = Y'Y, \ \tau_2(Y) = Y'X,$$

defines the set of minimal sufficient statistics, and $\hat B$ and $\hat\Omega$ are indeed functions of $\tau_1(Y)$ and $\tau_2(Y)$.
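This sufficiency claim can be demonstrated numerically: transform $Y$ so that $Y'Y$ and $Y'X$ are preserved and check that the log likelihood (24.13) does not change. A sketch on hypothetical data; the Householder-reflection trick is just one way to construct such a $Y_0$:

```python
import numpy as np

rng = np.random.default_rng(3)
T, k, m = 30, 2, 2
X = rng.normal(size=(T, k))
Y = X @ rng.normal(size=(k, m)) + rng.normal(size=(T, m))

# Build Y0 != Y with Y0'Y0 = Y'Y and Y0'X = Y'X: reflect Y along a direction v
# orthogonal to the columns of X (Householder reflection Q = I - 2vv' fixes X).
v = rng.normal(size=T)
v -= X @ np.linalg.lstsq(X, v, rcond=None)[0]   # project v off the column space of X
v /= np.linalg.norm(v)
Q = np.eye(T) - 2.0 * np.outer(v, v)
Y0 = Q @ Y
assert np.allclose(Y0.T @ Y0, Y.T @ Y) and np.allclose(Y0.T @ X, Y.T @ X)

def loglik(Y, X, B, Omega):
    # log likelihood (24.13), up to an additive constant
    E = Y - X @ B
    return -0.5 * (len(Y) * np.log(np.linalg.det(Omega))
                   + np.trace(np.linalg.solve(Omega, E.T @ E)))

B0 = rng.normal(size=(k, m))
Om0 = np.array([[1.0, 0.2], [0.2, 1.0]])
# the likelihood cannot distinguish Y from Y0: it depends on the data
# only through tau_1(Y) = Y'Y and tau_2(Y) = Y'X
assert np.isclose(loglik(Y, X, B0, Om0), loglik(Y0, X, B0, Om0))
```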


It follows that $\tilde\Omega = [1/(T-k)]\hat U'\hat U$ is an unbiased estimator of $\Omega$. In view of (25)-(31) we can summarise the finite sample properties of the MLE's $\hat B$ and $\hat\Omega$:

(3) $\hat B$ is an unbiased estimator of $B$ (i.e. $E(\hat B) = B$) but $\hat\Omega$ is a biased estimator of $\Omega$; $\tilde\Omega = [1/(T-k)]\hat U'\hat U$ being unbiased.

(4) $\hat B$ is a fully efficient estimator of $B$ in view of the fact that $\text{Cov}(\hat B) = \Omega \otimes (X'X)^{-1}$ and the information matrix of $\theta = (B, \Omega)$ takes the form

(5) $\hat B$ and $\hat\Omega$ are independent, in view of the orthogonality in (19).

Asymptotic properties of $\hat B$ and $\hat\Omega$

Arguing again by analogy to the $m = 1$ case we can derive the asymptotic properties of the MLE's $\hat B$ and $\hat\Omega$ of $B$ and $\Omega$, respectively.

(i) Consistency: $\hat B \xrightarrow{P} B$, $\hat\Omega \xrightarrow{P} \Omega$.

In view of the result $\text{vec}(\hat B - B) \sim N(0, \Omega \otimes (X'X)_T^{-1})$ we can deduce that if $\lim_{T\to\infty}(X'X)_T^{-1} = 0$ then $\text{Cov}(\hat B) \to 0$ and thus $\hat B$ is a consistent estimator of $B$ (see Chapters 12 and 19). Similarly, given that $\lim_{T\to\infty} E(\hat\Omega) = \Omega$ and $\lim_{T\to\infty}\text{Cov}(\hat\Omega) = 0$, $\hat\Omega \xrightarrow{P} \Omega$.

Note that the following statements are equivalent:


where $\lambda_{\min}(X'X)_T$ and $\lambda_{\max}(X'X)_T^{-1}$ refer to the smallest and largest eigenvalues of $(X'X)_T$ and its inverse, respectively; see Amemiya (1985).

(ii) Strong consistency: $\hat B \xrightarrow{a.s.} B$.

(iii) Asymptotic normality.

From the theory of maximum likelihood estimation we know that under relatively mild conditions (see Chapter 13) the MLE $\hat\theta$ of $\theta$ satisfies $\sqrt{T}(\hat\theta - \theta) \sim N(0, I_\infty(\theta)^{-1})$. For this result to apply, however, we need the boundedness of $I_\infty(\theta) = \lim_{T\to\infty}(1/T)I_T(\theta)$ as well as its non-singularity. In the present case the asymptotic information matrix is bounded and non-singular (full rank) if $\lim_{T\to\infty}(X'X)/T = Q_x < \infty$ with $Q_x$ non-singular. Under this condition we can deduce that

$$\sqrt{T}\,\text{vec}(\hat B - B) \sim N(0, \Omega \otimes Q_x^{-1}) \tag{24.33}$$

and

$$\sqrt{T}\,\text{vec}(\hat\Omega - \Omega) \sim N(0, 2(\Omega \otimes \Omega)) \tag{24.34}$$

(see Rothenberg (1973)).
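A quick Monte Carlo sketch of (24.33) (simulated design; all sizes are arbitrary): across replications the empirical covariance of $\sqrt{T}\,\text{vec}(\hat B - B)$ should be close to $\Omega \otimes Q_x^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(4)
T, k, m, reps = 400, 2, 2, 2000
X = rng.normal(size=(T, k))                 # one fixed design for all replications
Qx = (X.T @ X) / T
B = np.array([[1.0, -0.5], [0.3, 0.8]])
Omega = np.array([[1.0, 0.4], [0.4, 0.5]])

draws = np.empty((reps, k * m))
for r in range(reps):
    U = rng.multivariate_normal(np.zeros(m), Omega, size=T)
    Y = X @ B + U
    B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    draws[r] = np.sqrt(T) * (B_hat - B).reshape(-1, order="F")   # column-major vec

emp_cov = np.cov(draws, rowvar=False)
asy_cov = np.kron(Omega, np.linalg.inv(Qx))   # Omega (x) Qx^{-1}, cf. (24.33)
print(np.abs(emp_cov - asy_cov).max())        # small for large T and many replications
```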

Note that if $\{(X'X)_T, T > k\}$ is a sequence of $k \times k$ positive definite matrices such that $(X'X)_{T+1} - (X'X)_T$ is positive semi-definite and $c'(X'X)_T c \to \infty$ as $T \to \infty$ for every $c \neq 0$, then $\lim_{T\to\infty}(X'X)_T^{-1} = 0$.

(iv) In view of (iii) we can deduce that $\hat B$ and $\hat\Omega$ are both asymptotically unbiased and efficient.

24.3 A priori information

One particularly important departure from the assumptions underlying the multivariate linear regression model is the introduction of a priori restrictions related to $\theta$. When such additional information is available, assumption [4] no longer applies and the results on estimation derived in Section 24.2 need to be modified. The importance of a priori information in the present context arises partly because it allows us to derive tests which can be usefully employed in misspecification testing and partly because it provides the link between the multivariate linear regression model and the simultaneous equations model to be considered in Chapter 25.

(1) Linear restrictions 'related' to $X_t$

The first form of restrictions to be considered is

$$D_1 B = C, \tag{24.35}$$

where $D_1\colon p \times k$ and $C\colon p \times m$ are known matrices with $\text{rank}(D_1) = p$. In vectorised form such restrictions can be written as $R\beta_* = r$, where $\beta_* = \text{vec}(B) = (\beta_1', \beta_2', \ldots, \beta_m')'\colon mk \times 1$, $R\colon p \times mk$, $r\colon p \times 1$; this form of linear restrictions is more general than (35).

One way to impose the restrictions is to 'solve' the system (35) for $B$ and substitute the 'solution' into (40). In order to do that we define two arbitrary matrices $D^*\colon (k-p) \times k$, $\text{rank}(D^*) = k-p$, and $C^*\colon (k-p) \times m$, and reformulate (35) into

Given that $L^2 = L$, $P^2 = P$ and $LP = 0$ (i.e. they are orthogonal projections) we can deduce that $P$ takes the form

apart from the constant terms, say $\beta_{1.}$, are zero. This can be expressed in the form (35) with

$$D_1 = (0, I_{k-1}), \qquad B = (\beta_{1.}, B_2')', \qquad C = 0,$$

and $H_0$ takes the form $B_2 = 0$.

(2) Linear restrictions ‘related’ to y,

The second form of restrictions to be considered is

$$B\Gamma_1 = \Delta_1, \tag{24.50}$$

where $\Gamma_1\colon m \times q$ ($q < m$) and $\Delta_1\colon k \times q$ are known matrices with $\text{rank}(\Gamma_1) = q$. The restrictions in (50) represent linear between-equations restrictions, because the $i$th row of $B$ contains the coefficients of the $i$th regressor in all $m$ equations. Interpreted in the context of (35), these restrictions are directly related to the $y$'s. This implies that if we follow the procedure used for the restrictions in (38) we have to be much more careful, because the form of the underlying probability model might be affected. Richard (1979) shows how this procedure can give rise to the restricted MLE's of $B$ and $\Omega$. For expositional purposes we will adopt the Lagrange multiplier procedure. The Lagrangian function is

$$\ell(B, \Omega, M) = -\frac{T}{2}\log(\det \Omega) - \frac{1}{2}\,\text{tr}\,\Omega^{-1}(Y - XB)'(Y - XB) - \text{tr}\,M'(B\Gamma_1 - \Delta_1),$$

where $M$ is a matrix of Lagrange multipliers.


This implies that the constrained MLE's of $B$ and $\Omega$ are given by (58) and

$$\tilde\Omega = \frac{1}{T}\tilde U'\tilde U = \hat\Omega + \frac{1}{T}(\tilde B - \hat B)'(X'X)(\tilde B - \hat B) \tag{24.59}$$

(see Richard (1979)). If we compare (58) with (48) we can see that the main difference is that $\Omega$ enters the MLE estimator of $B$, in view of the fact that the restrictions (50) affect the form of the probability model. It is interesting to note that premultiplying (58) by $\Gamma_1$ yields (54). The above formulae, (58) and (59), will be of considerable value in Chapter 25.
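The decomposition (24.59) is algebraic rather than statistical: since $X'\hat U = 0$, it holds for any estimator $\tilde B$, not only the constrained MLE. A numpy sketch with an arbitrary stand-in for $\tilde B$:

```python
import numpy as np

rng = np.random.default_rng(5)
T, k, m = 80, 3, 2
X = rng.normal(size=(T, k))
Y = X @ rng.normal(size=(k, m)) + rng.normal(size=(T, m))

B_hat = np.linalg.solve(X.T @ X, X.T @ Y)            # unrestricted MLE
Omega_hat = (Y - X @ B_hat).T @ (Y - X @ B_hat) / T

B_tilde = B_hat + 0.1 * rng.normal(size=(k, m))      # stand-in for a constrained MLE
Omega_tilde = (Y - X @ B_tilde).T @ (Y - X @ B_tilde) / T

# (24.59): Omega_tilde = Omega_hat + (1/T)(B_tilde - B_hat)' X'X (B_tilde - B_hat),
# a pure consequence of X'U_hat = 0
D = B_tilde - B_hat
assert np.allclose(Omega_tilde, Omega_hat + (D.T @ (X.T @ X) @ D) / T)
```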

(3) Linear restrictions ‘related’ to both y, and X,

A natural way to proceed is to combine the linear restrictions (38) and (50):

$$D_1 B \Gamma_1 + C = 0, \tag{24.60}$$

where $Y^* = Y\Gamma_1$, $B^* = B\Gamma_1$ and $E = U\Gamma_1$. The linear restrictions in (60) can be written in vector form as

$$\text{vec}(D_1 B\Gamma_1 + C) = (\Gamma_1' \otimes D_1)\,\text{vec}(B) + \text{vec}(C) = 0 \tag{24.66}$$

or


where $\beta_* = \text{vec}(B)$ and $r = -\text{vec}(C)$. This suggests that an obvious way to generalise this is to substitute for $(\Gamma_1' \otimes D_1)$ a general $p \times km$ matrix $R$, formulating the restrictions in the form

$$R\beta_* = r. \tag{24.68}$$

Exclusion restrictions are accommodated in the block-diagonal submatrices $R_{ii}$, with units in the positions of the excluded variables and zeros everywhere else.

Across-equations linear restrictions can be accommodated in the off-block-diagonal submatrices $R_{ij}$, $i, j = 1, 2, \ldots, m$, $i \neq j$, of $R$, with $R_{ij}$ referring to the restrictions between equations $i$ and $j$.

Let us consider the derivation of the constrained MLE's of $B$ and $\Omega$ under the linear restrictions (68). The most convenient form of the statistical GM for the sample period $t = 1, 2, \ldots, T$ is not the matrix form (6) but its vectorised form.


$$\tilde\beta_* = \hat\beta_* - \left[X_*'(\Omega^{-1} \otimes I_T)X_*\right]^{-1}R'\left\{R\left[X_*'(\Omega^{-1} \otimes I_T)X_*\right]^{-1}R'\right\}^{-1}(R\hat\beta_* - r) \tag{24.76}$$

and

If we compare these formulae with those in the $m = 1$ case (see Chapter 20) we can see that the only difference (when $\Omega$ is known) is the presence of $\Omega$. This is because in the $m > 1$ case the restrictions $R\beta_* = r$ affect the underlying probability model by restricting $y_t$. In the econometric literature the estimator (78) is known as the generalised least-squares (GLS) estimator.

In practice $\Omega$ is unknown and thus, in order to 'solve' the conditions (73)-(75), we need to resort to iterative numerical optimisation (see Harvey (1981), Quandt (1983) inter alia).

The purpose of the next section is to consider two special cases of (68) where the restrictions can be substituted directly into a reformulated statistical GM. These are the cases of exclusion and across-equations linear homogeneous restrictions. In these two cases the constrained MLE of $\beta_*$ takes a form similar to (78).

24.4 The Zellner and Malinvaud formulations

In econometric modelling two special cases of the general linear restrictions are particularly useful. These are the exclusion and across-equations linear homogeneous restrictions. In order to illustrate these, let us consider the two-equation case

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{11} & \beta_{21} & \beta_{31} \\ \beta_{12} & \beta_{22} & \beta_{32} \end{pmatrix}\begin{pmatrix} x_{1t} \\ x_{2t} \\ x_{3t} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}, \qquad t \in \mathbb{T}. \tag{24.80}$$

(i) Exclusion restrictions: $\beta_{11} = 0$, $\beta_{32} = 0$;

(ii) Across-equation linear homogeneous restrictions: $\beta_{21} = \beta_{22}$.

It turns out that in these two cases the restrictions can be accommodated directly into a reformulation of the statistical GM and no constrained optimisation is necessary. The purpose of this section is to discuss the estimation of $\beta_*$ under these two forms of restrictions and to derive explicit formulae which will prove useful in Chapter 25.

Let us consider the exclusion restrictions first. The vectorised form of the statistical GM under exclusion restrictions is

$$\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} X_1 & & 0 \\ & \ddots & \\ 0 & & X_m \end{pmatrix} \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_m \end{pmatrix} + \begin{pmatrix} u_1 \\ \vdots \\ u_m \end{pmatrix}, \tag{24.84}$$

where $X_i$ refers to the regressor data matrix for the $i$th equation and $\beta_i$ to the corresponding coefficients vector. In the case of the example in (80) with the restrictions $\beta_{11} = 0$, $\beta_{32} = 0$, (84) takes the form

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + \begin{pmatrix} u_1 \\ u_2 \end{pmatrix},$$

where $X_1 = (x_2, x_3)$, $X_2 = (x_1, x_2)$, $\beta_1 = (\beta_{21}, \beta_{31})'$ and $\beta_2 = (\beta_{12}, \beta_{22})'$.

The formulation (84) is known as the seemingly unrelated regression equations (SURE) formulation, a term coined by Zellner (1962), because the $m$ linear regression equations in (84) seem to be unrelated at first sight, but this turns out to be false. When different restrictions are placed on different equations the original statistical GM is affected and the various equations become interrelated. In particular, the covariance matrix $\Omega$ enters the estimator of $\beta^*$. As shown in the previous section, in the case where $\Omega$ is known the MLE of $\beta^*$ takes the form

$$\tilde\beta^* = \left[X^{*\prime}(\Omega^{-1} \otimes I_T)X^*\right]^{-1}X^{*\prime}(\Omega^{-1} \otimes I_T)y. \tag{24.87}$$

Otherwise, the MLE is derived using some iterative numerical procedure. For this case Zellner (1962) suggested the two-step least-squares estimator: (87) evaluated at $\hat\Omega$, where $\hat\Omega = (1/T)\hat U'\hat U$ and $\hat U = Y - X\hat B$. It is not very difficult to see that this estimator can be viewed as an approximation to the MLE defined in the previous section by the first-order conditions (73)-(75), where only two iterations were performed: one to derive $\hat\Omega$, which is then substituted into (87).
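A sketch of the two-step procedure for the example in (80) under the exclusion restrictions (simulated data; the true coefficient values and seed are made up):

```python
import numpy as np

rng = np.random.default_rng(6)
T = 200
x1, x2, x3 = rng.normal(size=(3, T))
X1 = np.column_stack([x2, x3])      # equation 1: x1 excluded (beta_11 = 0)
X2 = np.column_stack([x1, x2])      # equation 2: x3 excluded (beta_32 = 0)
Omega = np.array([[1.0, 0.6], [0.6, 1.0]])          # cross-equation error covariance
U = rng.multivariate_normal(np.zeros(2), Omega, size=T)
y1 = X1 @ np.array([1.0, -0.5]) + U[:, 0]
y2 = X2 @ np.array([0.3, 0.8]) + U[:, 1]

# Step 1: OLS equation by equation; estimate Omega from the residuals.
b1 = np.linalg.lstsq(X1, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X2, y2, rcond=None)[0]
U_hat = np.column_stack([y1 - X1 @ b1, y2 - X2 @ b2])
Omega_hat = (U_hat.T @ U_hat) / T

# Step 2: GLS on the stacked block-diagonal system, cf. (24.87).
Xs = np.zeros((2 * T, 4))
Xs[:T, :2], Xs[T:, 2:] = X1, X2
ys = np.concatenate([y1, y2])
W = np.kron(np.linalg.inv(Omega_hat), np.eye(T))    # (Omega_hat^{-1} (x) I_T)
beta_sure = np.linalg.solve(Xs.T @ W @ Xs, Xs.T @ W @ ys)
print(beta_sure)                    # stacked estimates (beta_1', beta_2')'
```

Exploiting the correlation between the two error terms is what distinguishes this estimator from equation-by-equation OLS; with $\hat\Omega$ diagonal the two coincide.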

Zellner went on to show that if


Consider now the case of across-equation linear homogeneous restrictions such as $\beta_{21} = \beta_{22}$ in example (80). Such restrictions can be accommodated into the formulation (82) directly by redefining the regressor matrix as

$$\hat\beta_{(i)} = \left(\sum_{t=1}^{T} X_t'\hat\Omega_{(i-1)}^{-1}X_t\right)^{-1}\sum_{t=1}^{T} X_t'\hat\Omega_{(i-1)}^{-1}y_t, \qquad i = 1, 2, \ldots, l, \tag{24.96}$$

where $l$ refers to the number of iterations, which is either chosen a priori or determined by some convergence criterion such as

$$\|\hat\beta_{(i+1)} - \hat\beta_{(i)}\| < \varepsilon \quad \text{for some } \varepsilon > 0, \text{ e.g. } \varepsilon = 0.001. \tag{24.97}$$

In the case where $l = 2$ the estimator defined by (96) coincides with Zellner's two-step estimator.
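A sketch of the iterative scheme (24.96)-(24.97) for the SURE formulation (the function name and stacking convention are illustrative, not from the source):

```python
import numpy as np

def iterated_fgls(X_blocks, y_list, eps=0.001, max_iter=50):
    """Iterate between Omega re-estimation and GLS until (24.97) is met."""
    m, T = len(y_list), len(y_list[0])
    cols = [Xi.shape[1] for Xi in X_blocks]
    Xs = np.zeros((m * T, sum(cols)))
    c = 0
    for i, Xi in enumerate(X_blocks):           # block-diagonal stacked design
        Xs[i * T:(i + 1) * T, c:c + cols[i]] = Xi
        c += cols[i]
    ys = np.concatenate(y_list)

    beta = np.linalg.lstsq(Xs, ys, rcond=None)[0]       # iteration 0: OLS
    for _ in range(max_iter):
        resid = (ys - Xs @ beta).reshape(m, T)          # row i = residuals of eq. i
        W = np.kron(np.linalg.inv(resid @ resid.T / T), np.eye(T))
        beta_new = np.linalg.solve(Xs.T @ W @ Xs, Xs.T @ W @ ys)
        if np.linalg.norm(beta_new - beta) < eps:       # criterion (24.97)
            return beta_new
        beta = beta_new
    return beta

# e.g. iterated_fgls([X1, X2], [y1, y2]) with the SURE data above;
# stopping after the first Omega update reproduces the two-step estimator.
```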
