
Handbook of Econometrics, Vols. 1–5, Chapter 4


DOCUMENT INFORMATION

Basic information

Title: Identification
Author: C. Hsiao
Institution: University of Toronto
Field: Econometrics
Type: Chapter
Year: 1983
City: Toronto
Pages: 61
Size: 2.94 MB


Contents



4 Dynamic models with serially correlated residuals

5 Non-linear a priori constraints and covariance restrictions

Handbook of Econometrics, Volume I, Edited by Z. Griliches and M.D. Intriligator

© North-Holland Publishing Company, 1983


C. Hsiao

1 Introduction

The study of identification has been aptly linked to the design of experiments. In the biological and physical sciences an investigator who wishes to make inferences about certain parameters can usually conduct controlled experiments to isolate relations. Presumably, in a well-designed experiment the treatment group and the control group are similar in every aspect except for the treatment. The difference in response may therefore be attributed to the treatment, and the parameters of interest are identified.

In economics and other social sciences we are less fortunate. We observe certain facts (which are usually characterized by a set of quantities) and wish to arrange them in a meaningful way. Yet we cannot replace the natural conditions by laboratory conditions. We cannot control variables and isolate relations. The data are produced by an unknown structure, where the effects of a change in this structure are the objects of our study. None of these changes was produced by an investigator as in a laboratory experiment, and often the impact of one factor is confounded with the impacts of other factors.

To reduce the complex real-world phenomena to manageable proportions, an economist has to make a theoretical abstraction. The result is a logical model presumably suited to explain the observed phenomena. That is, we assume that there exists an underlying structure which generated the observations of real-world data. However, statistical inference can relate only to characteristics of the distribution of the observed variables. Thus, a meaningful statistical interpretation of the real world through this structure can be achieved only if our assumption that real-world observations are generated by this structure, and this structure alone, is true. The problem of whether it is possible to draw inferences from the probability distribution of the observed variables to an underlying theoretical structure is the concern of the econometric literature on identification. We now illustrate this concept using the well-known demand and supply example.


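The demand and supply equations (1.1) and (1.2) are not legible in this copy; a hedged restatement, in the standard textbook form consistent with the parameter list (a, b, c, d, σ₁₁, σ₂₂, σ₁₂) used below:

```latex
\begin{aligned}
q_t &= a + b\,p_t + u_{1t}, &\text{(demand, 1.1)}\\
q_t &= c + d\,p_t + u_{2t}, &\text{(supply, 1.2)}
\end{aligned}
```

with var(u₁ₜ) = σ₁₁, var(u₂ₜ) = σ₂₂, and cov(u₁ₜ, u₂ₜ) = σ₁₂. This is a sketch of the conventional form, not the original typesetting.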

Assume u₁ₜ and u₂ₜ have an independent (over time) bivariate normal distribution. Solving for pₜ and qₜ we obtain the distribution of the observed variables, which depends only on functions of the parameters and not on any parameter itself.

As is obvious, there are infinitely many possible values of (a, b, c, d, σ₁₁, σ₂₂, and σ₁₂) which could all generate the observed data (pₜ, qₜ). Consequently, without additional information (in the form of a priori restrictions), the model specified by (1.1) and (1.2) cannot be estimated and therefore is not useful in confronting economic hypotheses with data. The study of identifiability is undertaken in order to explore the limitations of statistical inference (when working with economic data) or to specify what sort of a priori information is needed to make model parameters estimable. It is a fundamental problem concomitant with the existence of a structure. Logically it precedes all problems of estimation or of testing hypotheses.

The general formulation of the identification problem was made by Frisch (1934), Haavelmo (1944), Hurwicz (1950), Koopmans and Reiersøl (1950), Koopmans, Rubin and Leipnik (1950), Marschak (1942), Wald (1950), Working (1925, 1927) and others. An extensive study of the identifiability conditions for simultaneous equations models under various assumptions about the underlying structures was provided by Fisher (1966). In this chapter I intend to survey the development of the subject since the publication of Fisher's book, although some pre-1966 results will be briefly reviewed for the sake of completeness. Because the purpose of this chapter is expository, I shall draw freely on the work by Anderson (1972), Deistler (1975, 1976), Deistler and Schrader (1979), Drèze (1975), Fisher (1966), Hannan (1969, 1971), Hatanaka (1975), Johnston (1972), Kadane (1975), Kohn (1979), Koopmans and Reiersøl (1950), Preston and Wall (1973), Richmond (1974), Rothenberg (1971), Theil (1971), Wegge (1965), Zellner (1971), etc. without specific acknowledgement in each case.



In Section 2 we define the basic concepts of identification. Section 3 derives some identifiability criteria for contemporaneous simultaneous equation models under linear constraints; Section 4 derives some identifiability criteria for dynamic models. Section 5 discusses criteria for models subject to non-linear continuously differentiable constraints and covariance restrictions, with special emphasis on applications to errors-in-variables and variance-components models. The Bayesian view on identification and concluding remarks are given in Section 6.

2 Basic concepts¹

It is generally assumed in econometrics that economic variables, whose formation an economic theory is designed to explain, have the characteristics of random variables. Let y be a set of such observations. A structure S is a complete specification of the probability distribution function of y, P(y). The set of all a priori possible structures S is called a model. The identification problem consists in making judgements about structures, given the model S and the observations y.

In most applications, y is assumed to be generated by a parametric probability distribution function P(y|S) = P(y|α), where α is an m-dimensional real vector. The probability distribution function P is assumed known, conditional on α, but α is unknown. Hence, a structure is described by a parametric point α, and a model is a set of points A ⊂ Rᵐ. Thus, the problem of distinguishing between structures is reduced to the problem of distinguishing between parameter points.

In this framework we have the following definitions.
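The definitions referred to here are the standard ones [cf. Rothenberg (1971)]; a hedged restatement of Definitions 2.1 and 2.2 in the notation above:

```latex
\textbf{Definition 2.1.} Two parameter points (structures) $\alpha^{1}$ and $\alpha^{2}$ are said to be
\emph{observationally equivalent} if $P(y\,|\,\alpha^{1}) = P(y\,|\,\alpha^{2})$ for all $y$.

\textbf{Definition 2.2.} A structure $\alpha^{0} \in A$ is \emph{(globally) identified} if there is no other
$\alpha \in A$ which is observationally equivalent to $\alpha^{0}$.
```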

Since the set of structures is simply a subset of Rᵐ, it is possible that there may be a number of observationally equivalent structures, but that they are isolated from each other. It is natural then to consider the concept of local identification. We define this concept in terms of the distance between two structures.

¹Professor F. Fisher has pointed out to the author that "overidentification" is part of the general concept of identification and that we ought to distinguish collinearity and lack of identification. The concept of "overidentification" has been found relevant for the existence of sampling moments and the efficiency of the estimates in simultaneous-equations models; presumably these topics will be treated in the chapters on estimation and sampling distributions. The problem of collinearity is a case of lack of identification.



Definition 2.3


A structure S⁰ = α⁰ is "locally identified" if there exists an open neighborhood ω containing α⁰ such that no other α in ω is observationally equivalent to α⁰.

On many occasions a structure S may not be identifiable, yet some of its characteristics may still be uniquely determinable. Since the characteristics of a structure S are described by an m-dimensional real vector α, we define this concept of identifiability of a substructure in terms of functions of α.

Definition 2.4

Let ξ(α) be a function of α. ξ(α) is said to be (locally) identifiable if (there exists an open neighborhood ω such that) all parameter points which are observationally equivalent have the same value for ξ(α) (or lie outside ω).

A special case of ξ(α) is the coordinate functions. For instance, we may be interested in the identifiability of a subset of coordinates α₁ of α. Then the subset of coordinates α₁ of α⁰ is said to be locally identifiable if there exists an open neighborhood ω, containing α⁰, such that all parameter points observationally equivalent to α⁰ have the same value for α₁ or lie outside ω.

In this chapter, instead of deriving identifiability criteria from the probability law of y [Bowden (1973), Rothenberg (1971)], we shall focus on the first- and second-order moments of y only. If the y are normally distributed, all information is contained in the first- and second-order moments. If the y are not normally distributed, observational information apart from the first and second moments may be available [Reiersøl (1950)]. However, most estimation methods use second-order quantities only; also, if a structure is identifiable with second-order moments, then it is identifiable with a probability law [Deistler and Seifert (1978)]. We shall therefore restrict ourselves to the first and second moments of y (or identifiability in the wide sense) for the sake of simplicity. Thus, we shall view two structures as observationally equivalent if they produce identical first- and second-order moments. Consequently, all the definitions stated above should be modified such that the statements with regard to the probability law of y are replaced by corresponding statements in terms of the first and second moments of y.

3 Contemporaneous simultaneous equation models

3.1 The model

In this section we consider the identification of a contemporaneous simultaneous equation model. We first discuss conditions for two structures to be observationally equivalent. We then derive identifiability criteria by checking conditions which will ensure that no two structures, or parts of two structures, are observationally equivalent. Finally, we illustrate the use of these conditions by considering some simple examples.

For simplicity, let an economic theory specify a set of economic relations of the form

By_t + Γx_t = u_t,  t = 1,...,T,  (3.1)

where

y_t is a G × 1 vector of observed endogenous variables;

x_t is a K × 1 vector of observed exogenous variables;

B is a G × G matrix of coefficients;

Γ is a G × K matrix of coefficients; and

u_t is a G × 1 vector of unobserved disturbances.

We shall indicate how these generalizations can be made in Section 3.4.

3.2 Observationally equivalent structures

Suppose u_t has the density P(u_t|Σ); then the joint density of (u₁,...,u_T) is

∏ᵀₜ₌₁ P(u_t|Σ).  (3.2)

The joint density of (y₁,...,y_T) can be derived through the structural relation

(3.1) and the density of the u's. Conditional on x_t, we have:

∏ᵀₜ₌₁ |det B| P(By_t + Γx_t|Σ).  (3.3)

Suppose that we multiply (3.1) through by a G × G non-singular matrix F. This would involve replacing each equation of the original structure by a linear combination of the equations in that structure. The new structure may be written

FBy_t + FΓx_t = Fu_t,  (3.4)

whose density for the observed variables is identical to the density (3.3) determined from the original structure. Hence, we say that the two structures (3.1) and (3.4) are observationally equivalent.

A special case of (3.4) occurs when we set F = B⁻¹, so that the transformed structure becomes

y_t = −B⁻¹Γx_t + B⁻¹u_t = Πx_t + v_t,  (3.7)

where Π = −B⁻¹Γ and v_t = B⁻¹u_t.


Eq. (3.7) is called the "reduced form" of the "structural system" (3.1). We can alternatively write down the density of y in terms of the reduced form parameters (Π, V), giving the density (3.10).

From (3.4) and (3.6) we know that (3.3) and (3.10) yield identical density functions for the endogenous variables. Thus, if we postulate a set of structural relations (3.1) with reduced form (3.7), then all structures obtained by premultiplying the original structure by an arbitrary non-singular matrix of order G will have this same reduced form, and moreover, all these structures and the reduced forms will be observationally equivalent.²
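This invariance can be checked numerically. The following sketch (with made-up coefficient values, not taken from the text) verifies that premultiplying a structure (B, Γ, Σ) by a non-singular F leaves the reduced-form parameters Π = −B⁻¹Γ and V = B⁻¹ΣB⁻¹′ unchanged:

```python
import numpy as np

# A hypothetical 2-equation structure (B, Gamma, Sigma); values are illustrative only.
B = np.array([[1.0, 0.5],
              [0.3, 1.0]])
Gamma = np.array([[0.2, 0.0],
                  [0.0, 0.7]])
Sigma = np.array([[1.0, 0.2],
                  [0.2, 2.0]])

# An arbitrary non-singular F yields the observationally equivalent
# structure (FB, F Gamma, F Sigma F').
F = np.array([[2.0, 1.0],
              [0.0, 1.0]])
B2, Gamma2, Sigma2 = F @ B, F @ Gamma, F @ Sigma @ F.T

def reduced_form(B, Gamma, Sigma):
    # Pi = -B^{-1} Gamma and V = B^{-1} Sigma B^{-1}'
    Binv = np.linalg.inv(B)
    return -Binv @ Gamma, Binv @ Sigma @ Binv.T

Pi1, V1 = reduced_form(B, Gamma, Sigma)
Pi2, V2 = reduced_form(B2, Gamma2, Sigma2)

assert np.allclose(Pi1, Pi2) and np.allclose(V1, V2)
```

Both structures produce the same first and second moments of y given x, which is exactly the sense of observational equivalence used in Lemma 3.2.1.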

Given that we will focus only on the first and second moments, we may formally state the conditions for two structures to be observationally equivalent in the following lemma.

Lemma 3.2.1

Two structures S = (B, Γ, Σ) and S̄ = (B̄, Γ̄, Σ̄) are observationally equivalent if and only if the following equivalent conditions hold:

(i) B⁻¹Γ = B̄⁻¹Γ̄ and B⁻¹ΣB⁻¹′ = B̄⁻¹Σ̄B̄⁻¹′.

(ii) There exists a non-singular matrix F such that

B̄ = FB and Γ̄ = FΓ,  (3.11)

and

Σ̄ = FΣF′.  (3.12)

Proof

(i) follows from (3.1), (3.7), and (3.10). The probability density of the data is assumed to be completely specified by the first and second moments, i.e. the reduced-form parameters (Π, V). If S and S̄ are observationally equivalent, they must have identical reduced-form parameter matrices and variance-covariance matrices. Condition (i) is exactly the condition which must be satisfied for the two reduced forms to be equal, and thus (i) is necessary and sufficient.

Now consider (ii). Sufficiency of (ii) is easy to check using (i). To prove its necessity, suppose S and S̄ are observationally equivalent. Let F = B̄B⁻¹. Assumption 3.1 implies that F is non-singular. Then B⁻¹Γ = B̄⁻¹Γ̄ implies that (3.11) holds. Letting w_t = Fu_t, we have (3.12).

²In other words, we restrict our attention to the class of models which have identifiable reduced forms.


If there were no a priori restrictions on the parameters of the model (3.1), any non-singular F would be admissible in the sense that the transformed structure satisfies the same restrictions of the model.³ The situation would indeed be hopeless: an infinite set of structures could have generated any set of sample observations. If economic theories are available to specify a set of a priori restrictions on the model, any transformed structure must also satisfy the same a priori restrictions if the transformed structure is to belong to the model that has been specified (i.e. if the transformation is to be admissible). For instance, suppose economic theory specifies that (3.1) must obey certain restrictions; then the transformed structure (3.4) will have to obey these same restrictions. A priori information on the parameters of the model thus rules out many structures which would otherwise be observationally equivalent, i.e. it implies restrictions on the elements of F. In this section we shall assume that

Assumption 3.4

All prior information is in the form of linear restrictions on B and Γ.

We ignore the information contained in the variance-covariance matrix for the moment. We shall discuss this situation, together with non-linear a priori restrictions, in Section 5.

The identification problem thus may be stated as:

(a) If one considers the transformation matrix F used to obtain the transformed structure (3.4), do the a priori restrictions on B and Γ imply sufficient restrictions on the elements of F to make some or all of the coefficients in the original and transformed structures identical (and thus identifiable)?

Since Assumption 3.2 ensures the identifiability of Π (which is consistently estimable by the ordinary least squares method), we may state the identification problem in an alternative but equivalent fashion:

(b) Assuming the elements of Π to be known, can one then solve for some or all of the elements of B and Γ uniquely?

3.3 Identification in terms of trivial transformations

We first consider the classical identifiability conditions for single equations or a set of equations. These will be expressed in terms of the equivalence conditions (3.11) and (3.12). From Definitions 2.2 and 2.4, we can define the identification

³The word "admissible" has a different meaning in statistical decision theory. Its use in these two contexts should not be confused.



of the gth equation and the complete system as follows:⁴

Definition 3.3.1

The gth equation is identified if and only if all equivalent structures are related by admissible transformations which are trivial with respect to the gth equation; that is, every admissible transformation matrix must have a gth row whose elements are all zero except for the gth.

Let A = [B, Γ], and let a′_g be the gth row of A. We assume that there exists a (G + K) × R_g matrix φ_g and a 1 × R_g vector d_g whose elements are known constants, such that all prior information about the gth equation of the model, including the normalization rule, takes the form

a′_g φ_g = d_g.  (3.15)

⁴Strictly speaking, Definitions 3.3.1 and 3.3.2 are theorems derived from Definitions 2.2 and 2.4. However, the proof of these theorems is trivial and sheds no light on the problem. For simplicity of exposition, we treat them as definitions.


Thus, restrictions stating, say, β_{g2} = 0, β_{g3} = −β_{g4}, and β_{gg} = 1 have the form (3.16), with φ_g selecting the corresponding elements of the gth row of A.

With the restrictions written in the form of (3.15), we have the following important theorems.

Theorem 3.3.1 (rank condition)

The gth equation is identified if and only if rank(Aφ_g) = G.

Corollary 3.3.1 (order condition)

A necessary condition for the identifiability of the gth equation is that there are at least G linear restrictions including the normalization rule, or (G − 1) linear restrictions excluding the normalization rule.

Suppose the prior information about the gth equation is in the form of excluding certain variables from this equation (zero restrictions). Excluding the normalization rule, we may let this prior information take the form

φ̃_g a_g = 0,  (3.17)

where φ̃_g is an (R_g − 1) × (G + K) matrix. The elements of each row of φ̃_g are zero except for the jth, which is set to unity in order to restrict the jth element of a′_g to zero.


Corollary 3.3.2

The gth equation is identified if and only if rank(Aφ̃′_g) = G − 1, i.e. the submatrix of A obtained by taking the columns of A with prescribed zeros in the gth row has rank G − 1.

Proof

Consider two observationally equivalent structures A and Ā. We know that there must exist a non-singular F such that Ā = FA, by Lemma 3.2.1, and we wish to show that the gth row of F has zeros everywhere but in the gth entry. Without loss of generality, let g = 1, so that we are considering the first row of A or Ā. Write the first row of F as (f₁₁, f′₁), where f′₁ contains the remaining G − 1 columns, and let A₍₁₎ denote the last G − 1 rows of A.

By Lemma 3.2.1, we know that Ā = FA; hence,

ā′₁ = f₁₁a′₁ + f′₁A₍₁₎.

If Ā is equivalent to A,

ā′₁φ̃′₁ = f′₁A₍₁₎φ̃′₁ = 0,  (3.19)

because a′₁φ̃′₁ = 0. Thus, for f′₁ = 0 to be the only solution, A₍₁₎φ̃′₁ must have rank G − 1. But rank(A₍₁₎φ̃′₁) = rank(Aφ̃′₁) because a′₁φ̃′₁ = 0.  (3.21)
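Corollary 3.3.2 can be checked numerically. A sketch with a hypothetical three-equation system (the coefficient values are made up for illustration, not taken from the text):

```python
import numpy as np

# Hypothetical 3-equation system A = [B  Gamma]; columns: y1, y2, y3, x1, x2.
A = np.array([
    [1.0, 0.2, 0.0, 0.5, 0.0],   # eq 1: zeros prescribed on y3 and x2
    [0.3, 1.0, 0.4, 0.0, 0.6],   # eq 2
    [0.0, 0.7, 1.0, 0.2, 0.9],   # eq 3
])
G = A.shape[0]

# Corollary 3.3.2: equation 1 is identified iff the submatrix formed from the
# columns of A with prescribed zeros in row 1 (here columns 2 and 4) has rank G - 1.
sub = A[:, [2, 4]]
assert np.linalg.matrix_rank(sub) == G - 1   # rank 2: equation 1 is identified
```

Perturbing the coefficients so that the two excluded columns become proportional would drop the rank below G − 1 and destroy identification of equation 1, even though the order condition would still be satisfied.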

3.4 Identification in terms of “linear estimable functions”


We can also approach the identification problem by considering the reduced form (3.7). Assumptions 3.1–3.3 are sufficient to identify Π. We therefore treat the elements of Π as known and investigate the conditions which will allow some or all of the elements of B and Γ to be uniquely determined.



We approach the problem from the "theory of estimable functions" [Richmond (1974)]. The approach yields rather simple proofs of theorems on the identifiability of individual parameters, thus providing more general results than the conditions for single equation identification described above. The discussion is carried out at the level of the whole system of equations; that is to say, it is not confined to restrictions on the parameters of a particular equation, but rather is also applicable to restrictions on the parameters of different equations (cross-equation restrictions) [Kelly (1975), Mallela (1970), Wegge (1965), etc.]. Cross-equation constraints do arise naturally in a model. For instance, Hansen and Sargent (1980), Revankar (1980) and Wallis (1980) have shown that under the rational expectations hypothesis [Muth (1961)] a model containing an expected value of a variable would imply the existence of cross-equation restrictions.

To implement this more general approach, we first write (3.8) in a more convenient form. Let β′_g (g = 1,...,G) denote the gth row of B. Let γ′_g (g = 1,...,G) denote the gth row of Γ. Let δ′ = (β′₁,...,β′_G, γ′₁,...,γ′_G) be the 1 × (G + K)G constant vector. Then eq. (3.8) may be written in stacked form as

In this equation, ⊗ denotes the Kronecker product [e.g. see Theil (1972, p. 303)], I_K denotes the K × K identity matrix, and I_GK denotes the GK × GK identity matrix.

We now assume that additional prior information, including normalization rules, is available in the form of R linear restrictions on the elements of δ. We are given then:

Φδ = d,  (3.24)

where Φ is an R × G(G + K) matrix and d is an R × 1 vector with known elements.


Theorem 3.4.1

The vector δ is identified if and only if rank(W) = G(G + K).

We can obtain equivalent results in terms of structural parameters. Letting



By Theorem 3.4.1, the whole vector δ is identified if and only if rank(W) = G(G + K), hence if and only if rank(MΦ′) = G².

It will sometimes be the case that the whole vector δ is not identified, but the a priori restrictions would permit a subset of the parameters in δ to be identified. To deal with this case, we now turn to the question of identifiability of individual parameters or of a linear combination of the parameters. The problem can be conveniently put in the form of a linear parametric function ξ′δ, a linear combination of the structural parameters, where ξ′ is a 1 × G(G + K) vector with known elements. From Definition 2.4 and the identifiability of Π, it follows that

Theorem 3.4.3

A parametric function ξ′δ is identified, given Π, if it has a unique value for every δ satisfying (3.26).

Finding conditions to identify ξ′δ is equivalent to finding the conditions for ξ′δ to be estimable. We note that (3.26) is a set of linear equations. A fundamental theorem with regard to the solvability of linear equations is

The condition for identifying ξ′δ follows from this theorem.



The Jacobian matrix summarizes the information in the equation system (locally). Theorem 3.4.5 says that if ξ′δ is identifiable, then knowing the value of ξ′δ does not add new information to (3.26). An equivalent condition in terms of structural parameters is:

Theorem 3.4.6

The parametric function ξ′δ is identified if and only if there exists κ such that

MΦ′κ = Mξ.  (3.32)

But if (3.32) has a solution, then the equations

z′[I_G : I_G ⊗ (−Π)]W = 0 and z′[I_G : I_G ⊗ (−Π)]ξ = 1  (3.33)

have no solution.

Partitioning Φ = (Φ₁ : Φ₂) and ξ = (ξ₁, ξ₂), the equations

z′Φ′₁ − z′(I_G ⊗ Π)Φ′₂ = 0 and z′ξ₁ − z′(I_G ⊗ Π)ξ₂ = 1  (3.34)

have no solution. Now suppose ξ′δ is not identified. Then the equations (Φ′ : W)κ = ξ have no solution. Hence there exists a solution to


if there exists κ such that MΦ′κ = Mξ, then ξ′δ is identified.

We can easily obtain the conditions for the identifiability of individual parameters from Theorems 3.4.5 and 3.4.6 by letting ξ = e_j, where e_j is a G(G + K) × 1 vector whose elements are all zero except for the jth, which is unity.

(ii) rank(MΦ′ : m_j) = rank(MΦ′), where m_j is the jth column of M.

If all the restrictions are confined to restrictions on the parameters of a particular equation, including the normalization rule, we can obtain the conventional rank condition using Lemma 3.4.1.

Theorem 3.4.7

The gth equation of (3.1) is identified if and only if the following equivalent conditions hold:

(i) rank = G + K;

(ii) rank(Aφ_g) = G,

where φ′_g denotes the R_g × (G + K) matrix whose elements are known constants such that all prior information on the gth equation takes the form (3.15).

We note that in the derivation of the above theorems the basic assumption is that Π is identifiable. Under Assumption 3.2, plim(Σᵀₜ₌₁ x_t u′_t / T) = 0 is sufficient to ensure it. Consequently, we can relax the restrictive assumptions about our model in two ways. The first is to allow the model to contain lagged endogenous variables and keep the assumption that the u_t are serially uncorrelated. In this case, the lagged endogenous variables are included in the x_t vector and treated as



predetermined. The second relaxation is to allow serial correlation in u_t, but to keep the assumption that the x_t are exogenous. It would of course be desirable in general to permit both relaxations. However, we cannot use the theorems of this section and allow both lagged endogenous variables and autocorrelation in the disturbance term to appear, since then plim(Σᵀₜ₌₁ x_t u′_t / T) ≠ 0. The more general case, with both autocorrelated disturbances and lagged endogenous variables, needs a different treatment. We take this up in the next section after a brief illustration of the conditions of identifiability already discussed.

3.5 Examples

To illustrate the application of the conditions for identifiability we consider a two-equation system:

β₁₁y₁ₜ + β₁₂y₂ₜ + γ₁₁x₁ₜ + γ₁₂x₂ₜ = u₁ₜ,
β₂₁y₁ₜ + β₂₂y₂ₜ + γ₂₁x₁ₜ + γ₂₂x₂ₜ = u₂ₜ.  (3.37)

Without further restrictions neither equation is identified, since we can premultiply (3.37) by any 2 × 2 non-singular constant matrix F and the new equations will be observationally equivalent to (3.37). We consider the identifiability of the parameters of this model under various a priori restrictions. We normalize the equations by letting β₁₁ = 1 and β₂₂ = 1.



so that from Theorem 3.3.1 the first equation is not identified, but the second equation is.

Case 2: γ₁₂ = 0 and γ₁₁ + γ₂₁ = 0

Since we have a cross-equation constraint, we put the model in stacked form (as in Section 3.4) and then write the constraint matrices of (3.24) as

Case 3: β₂₁ = 0 and γ₂₁ = 0

The first equation is not identified, but the second equation is, because rank(Aφ₁) = 1 and rank(Aφ₂) = 2. However, the individual coefficient γ₁₁ is also identified, by Lemma 3.4.1, since

MΦ′ = [β₂₁  0    0  ;
       0    β₁₂  γ₁₁;
       0    β₂₂  γ₂₁]



and (since γ₁₁ is the fifth element of δ):

Noting that β₁₁ = β₂₂ = 1 and β₂₁ = γ₂₁ = 0, we thus have

(MΦ′ : m₅) = [1  0    0   : γ₁₁;
              0  0    0   : 0  ;
              0  β₁₂  γ₁₁ : 0 ]

so that rank(MΦ′ : m₅) = rank(MΦ′), and γ₁₁ is identified.
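A numerical sketch of Case 3, using Corollary 3.3.2 directly (the values of the unrestricted coefficients are made up; the zero pattern is the one prescribed above):

```python
import numpy as np

# Two-equation system A = [B  Gamma]; columns correspond to y1, y2, x1, x2.
# Case 3: beta_21 = 0 and gamma_21 = 0, with beta_11 = beta_22 = 1.
b12, g11, g12, g22 = 0.6, 0.4, 0.9, 0.5   # hypothetical free coefficients
A = np.array([[1.0, b12, g11, g12],   # equation 1
              [0.0, 1.0, 0.0, g22]])  # equation 2
G = A.shape[0]

# Equation 2 excludes y1 (column 0) and x1 (column 2); by Corollary 3.3.2
# it is identified iff the submatrix of those columns has rank G - 1 = 1.
assert np.linalg.matrix_rank(A[:, [0, 2]]) == G - 1   # equation 2 identified

# Equation 1 carries no zero restrictions, so the corollary gives it no
# exclusion submatrix at all: equation 1 is not identified this way.
```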

4 Dynamic models with serially correlated residuals⁶

4.1 The model

In this section we consider identification criteria for models of the form

Σᵖ_{τ=0} B_τ y_{t−τ} + Σ^q_{τ=0} Γ_τ x_{t−τ} = u_t,  (4.1)

where y_t, x_t, and u_t are G × 1, K × 1, and G × 1 vectors of observed jointly dependent variables, exogenous variables, and unobserved disturbance terms, respectively; B_τ and Γ_τ are G × G and G × K parameter matrices. Rewriting (4.1) in terms of the lag operator L, Ly_t = y_{t−1}, we have

B(L)y_t + Γ(L)x_t = u_t.  (4.2)


by {C_u}. We also assume that C_u(0) is non-singular.

Assumptions 4.1–4.3 are conventional. In fact, Assumption 4.3 implies Assumption 4.2. However, for ease of comparison between contemporaneous and dynamic models, we state them separately. Below we shall relax Assumption 4.1. Assumption 4.5 will also be modified to take into account additional information in the disturbance term. For the moment we assume we have no such information. Assumption 4.4 is a dynamic analogue of Assumptions 3.2 and 3.3 made for contemporaneous models. It is made to ensure the identifiability of the transfer

⁷A stochastic process is stationary in the wide sense if the mean Ex_t = Ex_s is a constant and the covariance E[x_t − Ex_t][x_s − Ex_s]′ depends only on the distance apart in time (t − s). A stationary process is said to be ergodic if the sample mean of every function of finite observations tends to its expectation.



function. [The transfer function is the dynamic equivalent of the matrix Π in the contemporaneous model. It is defined in Section 4.2, eq. (4.8) below.] This assumption rules out the occurrence of multicollinearity; it also implies no deterministic right-hand-side variables, so that the prediction errors about these variables will not be zero. In particular, it means that among the K components of x_t none is a constant, a trend, or a seasonal variation, which can be expressed as the solution of a linear difference equation with constant coefficients. The assumption is not as restrictive as it appears. The model can be generalized to take care of these cases, as Hannan (1971) and Hatanaka (1975) have shown.

We know that in the contemporaneous model (3.1) two structures are observationally equivalent if and only if they are connected by a G × G non-singular constant matrix (Lemma 3.2.1). However, in the dynamic model (4.1) we may premultiply by a non-singular G × G polynomial matrix F(L) to yield observationally equivalent structures. This may be illustrated by the following example of a two-equation system [Koopmans, Rubin and Leipnik (1950, p. 109)], for which the identification conditions discussed in the previous section are incomplete.

As one can see from this example, the major complication in the identification of (4.1) lies in the possibility of premultiplication by a non-singular G × G polynomial matrix F(L) while retaining observational equivalence. This additional complexity does not arise in the case of lagged dependent variables alone, since if u_t is serially uncorrelated, the left factor F(L) can be at most a non-singular constant matrix; otherwise, serial correlation will be introduced. The problem arises when both lagged dependent variables and serially correlated residuals are allowed. In what follows we first state formally the requirements for two structures to be observationally equivalent, and then illustrate how conditions may be imposed so that F(L) is again restricted to a constant matrix.


4.2 Observationally equivalent structures


We now derive conditions for two structures to be observationally equivalent when both lagged endogenous variables and serial correlation in the disturbance terms are allowed. Premultiplying (4.2) by B(L)⁻¹ we obtain the model in transfer function form (4.8).

The second-order moments of {y_t} conditional on x_t are given by the sequence

E[y_t − Ey_t][y_{t+τ} − Ey_{t+τ}]′ = E[B(L)⁻¹u_t][B(L)⁻¹u_{t+τ}]′ = Σ^∞_{i=0} Σ^∞_{j=0} D_i C_u(τ + i − j)D′_j,
τ = ..., −1, 0, 1, ..., ∞,  (4.11)

where the D_i are the coefficient matrices of B(L)⁻¹, which will be denoted by B⁻¹ * C_u * B⁻¹′. The convolution notation * will be used also for the autocovariance sequence of any other transformation of {u_t}.⁸

Lemma 4.2.1

Two structures S = [B(L), Γ(L), {C_u}] and S̄ = [B̄(L), Γ̄(L), {C̄_u}] are observationally equivalent if and only if the following equivalent conditions hold: (i)

⁸A more convenient way of representing the sequence of covariances of y (4.11) is to use the spectral representation B(e^{iλ})⁻¹ f_u B′(e^{−iλ})⁻¹, where f_u = (1/2π)Σ_τ C_u(τ)e^{−iτλ} and B(e^{iλ}) = B₀ + B₁e^{iλ} + ... + B_p e^{ipλ}. In fact we can derive the identifiability criteria more elegantly using the spectral approach. However, the spectral approach requires more advanced statistical reasoning than is assumed in this chapter. If readers find (4.11), (4.13), and (4.15) confusing, they may be ignored; they are presented for completeness of the argument and are not used in the actual derivation of the identifiability conditions.


(i) follows from (4.8), (4.10), and (4.11).

(ii) Sufficiency is easy to check using (i). To prove necessity, we suppose that S and S̄ are observationally equivalent. Let

This implies either that Π(L) = Π̄(L) or that there exist d_j such that Σ^∞_{j=0} d_j x_{t−j} = 0. However, Assumption 4.4 excludes the existence of such d_j. Therefore

Π(L) = Π̄(L).



4.3 Linear restrictions on the coefficients

Given Π(L), the model (4.2) is completely identified if it is possible to factor Π(L) into two unique polynomial matrices B(L) and Γ(L). In order to obtain a unique decomposition of Π(L) in the form −B(L)⁻¹Γ(L), given (4.8), we must eliminate the possibility that there is a common (left) factor in B(L) and Γ(L) which cancels when B(L)⁻¹Γ(L) is formed. This is usually done in three steps [Hannan (1969, 1971)]:

(i) eliminate redundancy in the specification;⁹

(ii) restrict the admissible transformations to constant matrices; and

(iii) restrict the constant transformation matrix to an identity matrix.

The first step is achieved by imposing

Condition 4.3.1

[B(L) Γ(L)] are relatively left prime.

By "relatively left prime" we mean that the greatest common left divisor is a unimodular matrix, i.e. its determinant is a non-zero constant. [For this and some useful definitions and theorems on polynomial matrices see Gantmacher (1959), MacDuffee (1956), or the Appendix.] When a left common divisor is not unimodular, there is redundancy in the specification. The following is an example of an equation with a redundancy:

[B(L) Γ(L)] = F(L)[B̄(L) Γ̄(L)],  F(L) = [ 1+ρL  0 ; 0  1 ],   (4.20)

where the common left divisor F(L) has a non-constant determinant (1 + ρL).

⁹By “redundancy” we mean that there are common (polynomial) factors appearing in B(L) and Γ(L).
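The unimodularity test behind Condition 4.3.1 is mechanical: compute the determinant of the polynomial matrix and check that it is a non-zero constant. The sketch below (Python; the helper functions and example matrices are our own illustration, not from the text) performs the check for 2×2 polynomial matrices in the lag operator L; it confirms that a diag(1+ρL, 1) divisor like the one above, with ρ = 0.5, is not unimodular, whereas [[1, L], [0, 1]] is.

```python
# Polynomials are coefficient lists: [a0, a1, ...] represents a0 + a1*L + ...

def poly_mul(p, q):
    # Multiply two polynomials by convolving their coefficient sequences.
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    # Subtract q from p, padding the shorter coefficient list with zeros.
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def det_2x2(m):
    # Determinant of a 2x2 polynomial matrix [[p11, p12], [p21, p22]].
    return poly_sub(poly_mul(m[0][0], m[1][1]), poly_mul(m[0][1], m[1][0]))

def is_unimodular(m, tol=1e-12):
    # Unimodular: the determinant is a non-zero constant polynomial.
    d = det_2x2(m)
    return abs(d[0]) > tol and all(abs(c) <= tol for c in d[1:])

# F(L) = [[1, L], [0, 1]]: det = 1, so F(L) is unimodular.
F = [[[1.0], [0.0, 1.0]],
     [[0.0], [1.0]]]

# G(L) = diag(1 + 0.5L, 1): det = 1 + 0.5L is non-constant, not unimodular.
G = [[[1.0, 0.5], [0.0]],
     [[0.0], [1.0]]]

print(is_unimodular(F))  # True
print(is_unimodular(G))  # False
```

A non-unimodular common left divisor such as G(L) is exactly the kind of factor that cancels in B(L)^{-1}Γ(L) and hence must be ruled out.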


By ruling out such redundancy, we rule out transformation matrices F(L) of the form in (4.20). From a practical point of view, too, imposing the relatively left prime condition may not be unreasonable, since if common factors are allowed there may be infinitely many structures which are observationally equivalent. Redundant specification also poses serious estimation problems [Box and Jenkins (1970), Hannan, Dunsmuir and Deistler (1980)]. Condition 4.3.1 restricts the observationally equivalent structures in the following sense:

Lemma 4.3.1

Under Condition 4.3.1, two structures S = [B(L), Γ(L), {C_τ}] and S̄ = [B̄(L), Γ̄(L), {C̄_τ}] are observationally equivalent if and only if there exists a unimodular matrix F(L) such that (4.14) and (4.15) hold.

Having removed non-unimodular factors from the model (4.1) by imposing Condition 4.3.1, the second step is to restrict F(L) to a constant matrix F. This is achieved by imposing

Condition 4.3.2

Rank[B_p Γ_q] = G.

To see that this condition constrains F to be a constant matrix, let F(L) = F_0 + F_1 L be a unimodular matrix. Condition (4.14) states that

[B̄(L) Γ̄(L)] = F(L)[B(L) Γ(L)]
= [F_0B_0 + (F_0B_1 + F_1B_0)L + ⋯ + F_1B_pL^{p+1}  F_0Γ_0 + (F_0Γ_1 + F_1Γ_0)L + ⋯ + (F_0Γ_q + F_1Γ_{q−1})L^q + F_1Γ_qL^{q+1}].

Since B̄(L) and Γ̄(L) are of degrees at most p and q, respectively, we must have F_1B_p = 0 and F_1Γ_q = 0, that is, F_1[B_p Γ_q] = 0. Condition 4.3.2 then forces F_1 = 0; it is imposed precisely to eliminate the possibility that F_1 may not be equal to 0.
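The role of Condition 4.3.2 in this argument can be checked numerically: when [B_p Γ_q] has full row rank G, the equation F_1[B_p Γ_q] = 0 has only the trivial solution F_1 = 0, because the left null space of [B_p Γ_q] is then {0}. A minimal sketch (Python/NumPy; the random matrices stand in for hypothetical leading coefficients):

```python
import numpy as np

rng = np.random.default_rng(0)
G_dim = 2  # number of equations G

# Hypothetical leading coefficient matrices B_p (G x G) and Gamma_q (G x K).
B_p = rng.standard_normal((G_dim, G_dim))
Gamma_q = rng.standard_normal((G_dim, 3))
A_lead = np.hstack([B_p, Gamma_q])  # [B_p  Gamma_q], a G x (G + K) matrix

# Condition 4.3.2: rank [B_p  Gamma_q] = G (full row rank).
assert np.linalg.matrix_rank(A_lead) == G_dim

# Any F1 with F1 @ A_lead = 0 has its rows in the left null space of A_lead.
# With full row rank, that null space is trivial, so F1 must be 0.
u, s, vt = np.linalg.svd(A_lead)
n_zero = int(np.sum(s < 1e-10))  # dimension of the left null space
print("left null space dimension:", n_zero)  # 0  ->  F1 = 0
```

Were the rank deficient, some singular value would vanish and a non-zero F_1 could be built from the corresponding left singular vectors.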

After imposing conditions to restrict the non-singular transformation matrix F(L) to a constant matrix F, we then look at conditions which will restrict F to an identity matrix and thus make the dynamic equations identified. As in Section 3, this is achieved by prescribing restrictions on the coefficient matrix A = [B_0 B_1 ⋯ B_p Γ_0 Γ_1 ⋯ Γ_q]:

Condition 4.3.3

Let at least (G − 1) zeros be prescribed in each row of A and let one element in each row of B_0 be prescribed as unity. Let the rank of each submatrix of A obtained by taking the columns of A with prescribed zeros in a certain row be (G − 1).

Then we have

Theorem 4.3.1

Under Conditions 4.3.1–4.3.3 the model (4.1) with Assumptions 4.1–4.5 is identifiable.¹⁰
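The rank part of Condition 4.3.3 lends itself to a direct numerical check: for each row, collect the columns of A in which that row has prescribed zeros and verify that the resulting submatrix has rank G − 1. A sketch (Python/NumPy; the coefficient matrix A and the pattern of prescribed zeros are invented for illustration):

```python
import numpy as np

G_dim = 3
# Hypothetical stacked coefficient matrix A = [B_0 ... B_p  Gamma_0 ... Gamma_q]
# for G = 3 equations, with the prescribed zeros already in place.
A = np.array([
    [1.0, 0.0, 0.0, 2.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0, 1.5, 3.0],
    [0.3, 0.0, 1.0, 0.0, 0.0, 2.0],
])
# For each row g, the columns of A in which row g has a prescribed zero.
zero_cols = {0: [1, 2, 5], 1: [0, 2, 3], 2: [1, 3, 4]}

def row_rank_condition(A, zero_cols, G_dim):
    # Check, row by row, that the submatrix formed by the columns with
    # prescribed zeros in that row has rank G - 1.
    ok = {}
    for g, cols in zero_cols.items():
        sub = A[:, cols]
        ok[g] = int(np.linalg.matrix_rank(sub)) == G_dim - 1
    return ok

print(row_rank_condition(A, zero_cols, G_dim))  # every row satisfies the condition
```

This is the dynamic analogue of the familiar rank condition for contemporaneous simultaneous-equation systems from Section 3.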

In the contemporaneous case discussed in Section 3, p = q = 0, and hence Condition 4.3.2 is automatically satisfied. For dynamic models this condition is sufficient,¹¹ but not necessary, since even if it is not satisfied it may not be possible to find a non-trivial unimodular factor which preserves the other a priori restrictions [Hannan (1971)]. Condition 4.3.2 may be replaced by other conditions which serve the same purpose of constraining F(L) to a constant matrix F [Hannan (1971)]. For instance, in Theorem 4.3.1 we may replace Condition 4.3.2 by

Condition 4.3.2′

The maximum degree of each column of [B(L) Γ(L)] is known a priori, and there exist G columns of [B(L) Γ(L)] such that the matrix of coefficient vectors corresponding to the maximum degrees of these G columns has rank G.

Similarly, instead of zero restrictions we may impose general linear restrictions; for example, the gth row of A, a_g, may satisfy a set of linear constraints a_g φ_g = 0, where φ_g is a known matrix of prescribed constants. We may then replace Condition 4.3.3 in Theorem 4.3.1 by

Condition 4.3.3′

For each row of A, rank(Aφ_g) = G.

On the other hand, there may be cross-equation linear constraints. We can derive a similar condition for restricting F to an identity matrix by stacking the coefficients of A in one row, vec(A) = (a_1′, a_2′, …, a_G′), and stating the restrictions as linear constraints on vec(A). For further details, see Deistler (1976).
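As a small illustration of the stacking (Python/NumPy; the coefficient matrix and the particular cross-equation restriction are hypothetical), a constraint that ties coefficients across equations becomes a single linear restriction on vec(A):

```python
import numpy as np

# Hypothetical coefficient matrix A for G = 2 equations (rows a_g).
A = np.array([[1.0, -0.5, 2.0],
              [0.0,  1.0, 2.0]])

# Stack the rows: vec(A) = (a_1', a_2')'.  Row-major reshape gives exactly
# this row-stacked vector.
vecA = A.reshape(-1)

# A cross-equation restriction: the third coefficient is equal across the
# two equations, i.e. A[0, 2] - A[1, 2] = 0, written as R @ vec(A) = 0.
R = np.array([[0.0, 0.0, 1.0, 0.0, 0.0, -1.0]])
print(R @ vecA)  # [0.]
```

Restrictions within a single equation and restrictions across equations are thus handled uniformly as rows of one restriction matrix acting on vec(A).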

It is possible to identify a dynamic model by relaxing some of these assumptions and conditions. For example, when the prior information is in the form of excluding certain variables from an equation, the identification of (4.1) may be achieved without specifying a priori the maximum order of lags (Assumption 4.1), but instead allowing the lags to be empirically determined. The restrictive Condition 4.3.2 or its variant may also be dropped. These relaxations are permitted by the following theorem.

Theorem 4.3.2

The model (4.1) with Assumptions 4.2–4.5 under Condition 4.3.1 is identified if and only if the following condition holds:

Condition 4.3.3′′′

Let A(L) = [B(L) Γ(L)]. Let at least (G − 1) zeros be prescribed in each row of A(L) and let one element in each row of B_0 be prescribed as unity. For each row, the matrix consisting of the columns of A(L) having prescribed null elements in that row is of rank (G − 1).

Proof

The sufficiency follows from the fact that under Condition 4.3.3′′′, F(L) must be diagonal. Condition 4.3.1 restricts F(L) to be unimodular, i.e. its determinant is a constant, and hence the diagonal elements of F(L) must be constants. The normalization rules then further restrict F to be an identity matrix.

To prove the necessity, suppose the structure S does not satisfy Condition 4.3.3′′′. Without loss of generality, we suppose that the matrix consisting of the columns of A(L) having prescribed null elements in the first row, a(L) say, has rank less than (G − 1). Let b(L) be the first column of B(L). Then there exists a G × 1 vector polynomial f(L) = (1, f_1(L)′)′, with f_1(L) ≢ 0, c = f(0)′b(0) ≠ 0, and f(L)′a(L) = 0′. Let χ(L) be the greatest common divisor of the elements of f(L)′A(L), with χ(0) = c. Let

F(L) = [ f(L)′/χ(L) ; e_2′ ; ⋯ ; e_G′ ],   (4.26)

i.e. the G × G matrix whose first row is f(L)′/χ(L) and whose remaining rows are the corresponding rows of the identity matrix I_G.

Then the structure [F(L)A(L), {C_τ}] is observationally equivalent to S and belongs to the model. Because F(L) ≠ I_G by construction, S is not identifiable. Although the elimination of redundancy is intuitively appealing, in practice it may be difficult to verify, since the elements of [B(L) Γ(L)] are unknown. Hatanaka (1975) has suggested that we treat a model as identified when the structures that are observationally equivalent (in the wide sense) and consistent with the a priori specifications are unique apart from the redundancy in each equation.¹² In this case the identification of a model is achieved when the only admissible transformation F(L) is a G × G non-singular diagonal matrix of rational functions of L. Condition 4.3.3′′′ is then necessary and sufficient for the identification of model (4.1) with Assumptions 4.2–4.5. For other identification conditions without the relatively left prime assumption, see Deistler and Schrader (1979).

4.4 Additional information about the disturbances

Assumption 4.5 provides us with no information about the identification of [B(L) Γ(L)]. However, if [B(L) Γ(L)] are identified through the conditions stated in Section 4.3, then by Wold's decomposition theorem [Hannan (1970) and Rozanov (1967)] we can obtain a unique representation of u_t from the covariance of y (4.11) as

u_t = Σ_{s=0}^{∞} Θ_s v_{t−s},   (4.27)

where Θ_0 = I_G, Σ_{s=0}^{∞} ‖Θ_s‖² < ∞, and the v_t are independently, identically distributed

¹²It should be noted that although in general we do not know the true order, it is essential to have a prior bound on the order. Deistler and Hannan (1980) and Hannan, Dunsmuir and Deistler (1980) have shown that the maximum likelihood estimator of the transfer function will converge to the true value provided the assumed order is not smaller than the true order, but some pathology arises so that the numerical problem of finding the unique maximum of the likelihood can be difficult.


random variables with mean 0 and covariance matrix Σ. We did not pursue such a factorization in Section 4.3 because it does not affect the estimation of [B(L) Γ(L)].
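To illustrate what a factorization like (4.27) does in the simplest scalar case: for an MA(1) disturbance u_t = v_t + θv_{t−1}, the autocovariances are c_0 = (1 + θ²)σ² and c_1 = θσ², and the invertible root |θ| < 1 can be recovered from the ratio c_1/c_0. A sketch (Python; this scalar example is ours, not from the text, and assumes c_1 ≠ 0):

```python
import math

def ma1_autocovariances(theta, sigma2=1.0):
    # For u_t = v_t + theta * v_{t-1} with var(v_t) = sigma2:
    # c_0 = (1 + theta^2) * sigma2,  c_1 = theta * sigma2,  c_k = 0 for k >= 2.
    return (1.0 + theta**2) * sigma2, theta * sigma2

def factor_ma1(c0, c1):
    # Recover the invertible root |theta| < 1 from theta/(1 + theta^2) = c1/c0
    # (assumes c1 != 0 and a valid MA(1) autocovariance pair).
    r = c1 / c0
    disc = 1.0 - 4.0 * r * r
    theta = (1.0 - math.sqrt(disc)) / (2.0 * r)
    sigma2 = c1 / theta
    return theta, sigma2

c0, c1 = ma1_autocovariances(0.5, sigma2=2.0)
theta, sigma2 = factor_ma1(c0, c1)
print(round(theta, 10), round(sigma2, 10))  # 0.5 2.0
```

The quadratic in θ has a second, non-invertible root 1/θ; uniqueness of the representation (4.27) comes from always selecting the invertible one, just as Θ_0 = I_G normalizes the multivariate factorization.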

However, if we have some information about u_t, then this will aid identification; making use of it will also improve the efficiency of the estimates [e.g. see Espasa and Sargan (1977) and Hannan and Nicholls (1972)]. In this section we illustrate how we may use this additional information to identify the model. For additional discussion, see Deistler (1975), Hannan (1971, 1976), Hatanaka (1976), Kohn (1979), Sargan (1961, 1979), etc.

Suppose instead of Assumption 4.5, we assume
