The Instrumental Variables Estimator

Part of the document A Guide to Modern Econometrics, 5th edition (pages 164-171)

If one or more explanatory variables in a regression model are endogenous, that is, correlated with the error term, the OLS estimator is biased and inconsistent. In these cases an alternative estimator is needed. In the current section, we shall discuss the instrumental variables estimator using the wage equation from Subsection 5.2.3 as motivation.

5.3.1 Estimation with a Single Endogenous Regressor and a Single Instrument

Suppose we explain an individual’s log wage $y_i$ from a number of personal characteristics, $x_{1i}$, as well as the years of schooling, $x_{2i}$, by means of a linear model

$$y_i = x_{1i}'\beta_1 + x_{2i}\beta_2 + \varepsilon_i. \tag{5.33}$$

We know from Chapter 2 that this model has no interpretation unless we make some assumptions about $\varepsilon_i$. Otherwise, we could just set $\beta_1$ and $\beta_2$ to arbitrary values and define $\varepsilon_i$ such that the equality in (5.33) holds for every observation. The most common interpretation so far is that (5.33) describes the conditional expectation or the best linear approximation of $y_i$ given $x_{1i}$ and $x_{2i}$. This requires us to impose that

$$E\{\varepsilon_i x_{1i}\} = 0 \tag{5.34}$$

$$E\{\varepsilon_i x_{2i}\} = 0, \tag{5.35}$$

which are the necessary conditions for consistency of the OLS estimator. As soon as we relax any of these conditions, the model no longer corresponds to the conditional expectation of $y_i$ given $x_{1i}$ and $x_{2i}$.

In the above wage equation, $\varepsilon_i$ includes all unobservable factors that affect a person’s wage, including things like ‘ability’ or ‘intelligence’. Typically, it is argued that years of schooling of a person also depend upon these unobserved characteristics. If this is the case, OLS is consistently estimating the conditional expected value of a person’s wage given, among other things, years of schooling, but not consistently estimating the causal effect of schooling. That is, the OLS estimate for $\beta_2$ would reflect the difference in expected wages of two arbitrary persons with the same observed characteristics in $x_{1i}$, but with $x_2$ and $x_2 + 1$ years of schooling, respectively. It does not, however, measure the expected wage difference if an arbitrary person (for some exogenous reason) decides to increase his or her schooling from $x_2$ to $x_2 + 1$ years. The reason is that, when interpreting the model as a conditional expectation, the unobservable factors affecting a person’s wage are not assumed to be constant across the two persons, whereas in the causal interpretation the unobservables are kept unchanged. Put differently, when we interpret the model as a conditional expectation, the ceteris paribus condition only refers to the included variables in $x_{1i}$, whereas for a causal interpretation it also includes the unobservables (omitted variables) in the error term.

Quite often, coefficients in a regression model are interpreted as measuring causal effects. In such cases, it makes sense to discuss the validity of conditions like (5.34) and (5.35). If $E\{\varepsilon_i x_{2i}\} \neq 0$, we say that $x_{2i}$ is endogenous (with respect to the causal effect $\beta_2$). For micro-economic wage equations, it is often argued that many explanatory variables are potentially endogenous, including education level, union status, sickness, industry and marital status. To illustrate this, it is not uncommon (for US data) to find that expected wages are about 10% higher if a person is married. Quite clearly, this does not reflect the causal effect of being married, but the consequence of differences in unobservable characteristics of married and unmarried people.

If it is no longer imposed that $E\{\varepsilon_i x_{2i}\} = 0$, the OLS method produces a biased and inconsistent estimator for the parameters in the model. The solution requires an alternative estimation method. To derive a consistent estimator, it is necessary that we make sure that our model is statistically identified. This means that we need to impose additional assumptions; otherwise the model is not identified, and any estimator is necessarily inconsistent. To see this, let us go back to the conditions (5.34) and (5.35). These conditions are so-called moment conditions, conditions in terms of expectations (moments) that are implied by the model. These conditions should be sufficient to identify the unknown parameters in the model. That is, the $K$ parameters in $\beta_1$ and $\beta_2$ should be such that the following $K$ equalities hold:

$$E\{(y_i - x_{1i}'\beta_1 - x_{2i}\beta_2)\, x_{1i}\} = 0 \tag{5.36}$$

$$E\{(y_i - x_{1i}'\beta_1 - x_{2i}\beta_2)\, x_{2i}\} = 0. \tag{5.37}$$

When estimating the model by OLS we impose these conditions on the estimator through the corresponding sample moments. That is, the OLS estimator $b = (b_1', b_2)'$ for $\beta = (\beta_1', \beta_2)'$ is solved from

$$\frac{1}{N}\sum_{i=1}^{N} (y_i - x_{1i}'b_1 - x_{2i}b_2)\, x_{1i} = 0 \tag{5.38}$$

$$\frac{1}{N}\sum_{i=1}^{N} (y_i - x_{1i}'b_1 - x_{2i}b_2)\, x_{2i} = 0. \tag{5.39}$$

In fact, these are the first-order conditions for the minimization of the least squares criterion. The number of conditions exactly equals the number of unknown parameters, so that $b_1$ and $b_2$ can be solved uniquely from (5.38) and (5.39). However, as soon as (5.35) is violated, condition (5.39) drops out, and we can no longer solve for $b_1$ and $b_2$. This means that $\beta_1$ and $\beta_2$ are no longer identified.

To identify $\beta_1$ and $\beta_2$ in the more general case, we need at least one additional moment condition. Such a moment condition is usually derived from the availability of an instrument or instrumental variable. An instrumental variable $z_{2i}$, say, is a variable that can be assumed to be uncorrelated with the model’s error $\varepsilon_i$ but correlated with the endogenous regressor $x_{2i}$.⁶ If such an instrument can be found, condition (5.37) can be replaced by

$$E\{(y_i - x_{1i}'\beta_1 - x_{2i}\beta_2)\, z_{2i}\} = 0. \tag{5.40}$$

An instrument that is uncorrelated with the equation’s error term and satisfies (5.40) is referred to as ‘exogenous’. Provided the moment condition in (5.40) is not a combination of the other ones ($z_{2i}$ is not a linear combination of the elements of $x_{1i}$), this is sufficient to identify the $K$ parameters $\beta_1$ and $\beta_2$. The condition in (5.40) is referred to as an exclusion restriction, which reflects the implicit assumption that $z_{2i}$ is validly excluded from the model of interest in (5.33).

⁶ The assumption that the instrument is correlated with $x_{2i}$ is needed for identification. If there is no correlation the additional moment does not provide any (identifying) information on $\beta_2$.

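The instrumental variables estimator $\hat\beta_{IV}$ can then be solved from

$$\frac{1}{N}\sum_{i=1}^{N} (y_i - x_{1i}'\hat\beta_{1,IV} - x_{2i}\hat\beta_{2,IV})\, x_{1i} = 0 \tag{5.41}$$

$$\frac{1}{N}\sum_{i=1}^{N} (y_i - x_{1i}'\hat\beta_{1,IV} - x_{2i}\hat\beta_{2,IV})\, z_{2i} = 0. \tag{5.42}$$

The solution can be determined analytically and leads to the following expression for the IV estimator

$$\hat\beta_{IV} = \left(\sum_{i=1}^{N} z_i x_i'\right)^{-1} \sum_{i=1}^{N} z_i y_i, \tag{5.43}$$

where $x_i' = (x_{1i}', x_{2i})$ and $z_i' = (x_{1i}', z_{2i})$. Clearly, if $z_{2i} = x_{2i}$, this expression reduces to the OLS estimator.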
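As a rough numerical illustration of (5.43), the following Python sketch simulates a wage-type equation and compares OLS with IV. Everything about the data-generating process (the coefficient values, the unobserved `ability` term, the sample size) and the helper name `iv_estimator` is an assumption made for this example and is not taken from the text.

```python
import numpy as np

def iv_estimator(y, X, Z):
    """Exactly identified IV estimator (5.43): solves (Z'X) b = Z'y."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

# Hypothetical data-generating process (all numbers are illustrative).
rng = np.random.default_rng(0)
N = 20_000
const = np.ones(N)
exper = rng.normal(size=N)       # an exogenous characteristic in x1
z2 = rng.normal(size=N)          # instrument: shifts schooling, unrelated to ability
ability = rng.normal(size=N)     # unobserved, so it ends up in the error term

x2 = 0.5 * exper + 0.8 * z2 + ability + rng.normal(size=N)  # endogenous regressor
eps = ability + rng.normal(size=N)                          # correlated with x2
beta = np.array([1.0, 0.5, 0.1])                            # (constant, exper, schooling)

X = np.column_stack([const, exper, x2])   # rows are x_i' = (x_1i', x_2i)
Z = np.column_stack([const, exper, z2])   # rows are z_i' = (x_1i', z_2i)
y = X @ beta + eps

b_ols = iv_estimator(y, X, X)   # with z_2i = x_2i the formula is OLS
b_iv = iv_estimator(y, X, Z)

print("OLS schooling coefficient:", b_ols[2])
print("IV  schooling coefficient:", b_iv[2])
```

In this design the OLS coefficient on `x2` absorbs part of the ability effect and lies well above the assumed causal value of 0.1, whereas the IV coefficient should be close to 0.1 in a sample of this size.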

Identification of the model and consistency of the IV estimator require that the moment conditions uniquely identify the parameters of interest. This requires that the $K \times K$ matrix

$$\operatorname{plim} \frac{1}{N}\sum_{i=1}^{N} z_i x_i' = \Sigma_{zx} \tag{5.44}$$

is finite and invertible. This means that the partial correlation between the instrument and the endogenous variable is nonzero. To be precise, it requires the coefficient $\pi_2$ in the reduced form equation

$$x_{2i} = x_{1i}'\pi_1 + z_{2i}\pi_2 + v_i$$

to be different from zero, which says that the endogenous regressor $x_{2i}$ and the instrument $z_{2i}$ have nonzero correlation, after netting out the effects of all other exogenous variables in the model. Note that this also requires that $z_{2i}$ is not a linear combination of the elements in $x_{1i}$. If this condition is satisfied we call the instrument ‘relevant’. The requirement that an instrument be relevant is not a trivial regularity condition and in many applications is a point of concern (see below).

The asymptotic covariance matrix of $\hat\beta_{IV}$ depends upon the assumptions we make about the distribution of $\varepsilon_i$. Under assumptions (5.36), (5.40) (valid instrument) and (5.44) (relevant instrument), and assuming $\varepsilon_i$ is $IID(0, \sigma^2)$, independently of $z_i$, it can be shown that

$$\sqrt{N}(\hat\beta_{IV} - \beta) \rightarrow \mathcal{N}\left(0,\; \sigma^2 (\Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx})^{-1}\right), \tag{5.45}$$

where the symmetric $K \times K$ matrix

$$\Sigma_{zz} \equiv \operatorname{plim} \frac{1}{N}\sum_{i=1}^{N} z_i z_i'$$

is assumed to be invertible, and $\Sigma_{zx} = \Sigma_{xz}'$. Nonsingularity of $\Sigma_{zz}$ requires that there is no multicollinearity among the $K$ elements in the vector $z_i$. In finite samples we can estimate the covariance matrix of $\hat\beta_{IV}$ by

$$\hat V\{\hat\beta_{IV}\} = \hat\sigma^2 \left( \left(\sum_{i=1}^{N} x_i z_i'\right) \left(\sum_{i=1}^{N} z_i z_i'\right)^{-1} \left(\sum_{i=1}^{N} z_i x_i'\right) \right)^{-1}, \tag{5.46}$$

where $\hat\sigma^2$ is a consistent estimator for $\sigma^2$ based upon the residual sum of squares, for example,

$$\hat\sigma^2 = \frac{1}{N-K}\sum_{i=1}^{N} (y_i - x_i'\hat\beta_{IV})^2. \tag{5.47}$$

Similarly to OLS, it is also possible to compute a heteroskedasticity-consistent covariance matrix for the IV estimator. Accordingly, it is very easy to calculate standard errors for the IV estimator that are robust to heteroskedasticity of unknown form; see Davidson and MacKinnon (2004, Section 8.5).
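The formulas (5.46) and (5.47) translate almost line by line into matrix code. The sketch below is one possible implementation under homoskedasticity; the function name `iv_with_se` and the simulated design are assumptions introduced for illustration only.

```python
import numpy as np

def iv_with_se(y, X, Z):
    """IV estimate (5.43) with standard errors from (5.46) and (5.47)."""
    N, K = X.shape
    ZX = Z.T @ X                                   # sum of z_i x_i'
    ZZ = Z.T @ Z                                   # sum of z_i z_i'
    b = np.linalg.solve(ZX, Z.T @ y)
    resid = y - X @ b                              # residuals use x_i itself
    sigma2 = resid @ resid / (N - K)               # (5.47)
    V = sigma2 * np.linalg.inv(ZX.T @ np.linalg.solve(ZZ, ZX))  # (5.46)
    return b, np.sqrt(np.diag(V))

# Hypothetical design: one exogenous regressor, one endogenous, one instrument.
rng = np.random.default_rng(1)
N = 20_000
const, exper, z2 = np.ones(N), rng.normal(size=N), rng.normal(size=N)
ability = rng.normal(size=N)                       # unobserved confounder
x2 = 0.5 * exper + 0.8 * z2 + ability + rng.normal(size=N)
y = 1.0 + 0.5 * exper + 0.1 * x2 + ability + rng.normal(size=N)

X = np.column_stack([const, exper, x2])
Z = np.column_stack([const, exper, z2])

b_iv, se_iv = iv_with_se(y, X, Z)
b_ols, se_ols = iv_with_se(y, X, X)   # Z = X gives the usual OLS formulas

print("IV :", b_iv[2], "s.e.", se_iv[2])
print("OLS:", b_ols[2], "s.e.", se_ols[2])
```

Passing `Z = X` collapses (5.46) to the familiar OLS covariance matrix, which makes the two sets of standard errors directly comparable. In this simulated design the IV standard error on `x2` exceeds the OLS one.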

The above results show that it is possible to consistently estimate the coefficients in a linear regression model when one of the regressors is correlated with the error term, provided that we can find an instrumental variable that is both relevant and exogenous.

The problem for the practitioner is that it is often far from obvious how to find variables that could serve as valid instruments, or how to establish whether a chosen instrument is indeed exogenous. The requirement that an instrument is relevant is relatively easy to check. It requires that the instrument is correlated with the endogenous regressor, conditional upon the other regressors in the equation. This correlation should be sufficiently strong to increase statistical power and to avoid a so-called weak instruments problem. If the instrument is only weakly correlated with the endogenous regressor, this means that the $R^2$ of the reduced form increases only marginally when the instrument is added. In this case, the instrumental variables estimator has poor properties (see Subsection 5.6.4). Evaluating the significance of the instrument in the reduced form is a helpful exercise. The usual rule of thumb is that an instrumental variable should have an $F$-statistic in the reduced form larger than 10, corresponding, with a single instrument, to a $t$-ratio exceeding 3.16 in absolute value (Stock and Watson, 2007, Chapter 12).
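The relevance check described above amounts to an OLS regression of the endogenous regressor on the exogenous regressors and the instrument. The following sketch computes the reduced-form $t$-ratio and the corresponding $F$-statistic for a single instrument; the function name `first_stage` and both simulated designs (one strong, one weak) are hypothetical.

```python
import numpy as np

def first_stage(x2, X1, z2):
    """OLS of x2 on (X1, z2); returns the t-ratio and F-statistic of z2."""
    W = np.column_stack([X1, z2])
    N, K = W.shape
    pi = np.linalg.lstsq(W, x2, rcond=None)[0]
    v = x2 - W @ pi                                  # reduced-form residuals
    s2 = v @ v / (N - K)
    se_pi2 = np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])
    t = pi[-1] / se_pi2
    return t, t**2   # with one instrument, F equals t squared

rng = np.random.default_rng(2)
N = 1_000
X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
z2 = rng.normal(size=N)
noise = rng.normal(size=N)

# Same instrument, two assumed reduced-form coefficients.
x2_strong = X1 @ np.array([0.0, 0.5]) + 0.80 * z2 + noise
x2_weak = X1 @ np.array([0.0, 0.5]) + 0.02 * z2 + noise

t_strong, F_strong = first_stage(x2_strong, X1, z2)
t_weak, F_weak = first_stage(x2_weak, X1, z2)

print("strong instrument: t =", t_strong, " F =", F_strong)
print("weak instrument:   t =", t_weak, " F =", F_weak)
```

With a reduced-form coefficient of 0.8 the $F$-statistic is far above 10, while a coefficient of 0.02 will typically leave it well below that threshold in a sample of this size.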

The requirement that an instrumental variable is exogenous is more complicated. As stressed by Angrist and Pischke (2009, Chapter 4) this actually requires two things. The first is that the instrument is as good as randomly assigned and cannot be influenced by the dependent variable $y_i$ (conditional upon the other regressors). The second is an ‘only through’ condition, which requires that the instrument predicts the dependent variable $y_i$ only through the instrumented variable ($x_{2i}$), conditional upon the other regressors, not directly or through a third unobserved variable. This is often called ‘an exclusion restriction’, and it requires that the instrument itself is appropriately excluded from the equation of interest.

In the above wage equation example, we require an instrumental variable that is correlated with years of schooling $x_{2i}$ but uncorrelated with wages directly or with the unobserved ‘ability’ factors that are included in $\varepsilon_i$. This requires a variable that is correlated with the costs of schooling, or the likelihood of having certain levels of schooling, while being unrelated to a person’s ability. Potential instruments relate to differences in costs due to loan policies or other subsidies that vary independently of ability or earnings potential, or to variation in institutional constraints, like changes in compulsory schooling laws; we come back to this example in Section 5.4.

Unlike the relevance condition, the exclusion or exogeneity condition cannot be tested. This is because $\varepsilon_i$ is unobserved. Essentially, when using instrumental variables we are replacing one untestable assumption, $E\{\varepsilon_i x_{2i}\} = 0$, with another untestable assumption, $E\{\varepsilon_i z_{2i}\} = 0$ (imposing $E\{\varepsilon_i x_{1i}\} = 0$ in both cases). In other words, in both cases the moment conditions we impose are identifying conditions. Accordingly, they cannot be tested statistically. The only case where moment conditions are partially testable is when there are more conditions than actually needed for identification, that is, when we have more instrumental variables than endogenous regressors. In this case, one can test the so-called overidentifying restrictions, without, however, being able to specify which of the moment conditions corresponds to these restrictions (see below). The fact that the scope for testing the validity of instruments is very limited indicates that researchers should carefully justify their instruments, drawing on theoretical arguments or institutional background. The reliability of an instrument relies on argumentation, not on empirical testing.

Another drawback of instrumental variables estimation is that the standard errors of an IV estimator are typically quite high compared to those of the OLS estimator. The most important reason for this is that instrument and regressor have a low correlation; see Wooldridge (2010, Subsection 5.2.6) for more discussion. Due to the concerns above, some authors argue that under poor conditions instrumental variable estimates are more likely to provide the wrong statistical inference than simple OLS estimates that make no correction for endogeneity (Larcker and Rusticus, 2010).

Keeping in mind the above, the endogeneity of $x_{2i}$ can be tested provided we assume that the instrument $z_{2i}$ is valid. Hausman (1978) proposes to compare the OLS and IV estimators for $\beta$. Assuming (5.44) and $E\{\varepsilon_i z_i\} = 0$, the IV estimator is consistent. If, in addition, $E\{\varepsilon_i x_{2i}\} = 0$, the OLS estimator is also consistent and should differ from the IV one by sampling error only. A computationally attractive version of the Hausman test for endogeneity (often referred to as the Durbin–Wu–Hausman test) can be based upon a simple auxiliary regression. First, estimate a regression explaining $x_{2i}$ from $x_{1i}$ and $z_{2i}$, and save the residuals, say $\hat v_i$. This is the reduced-form equation. Next, add the residuals to the model of interest and estimate

$$y_i = x_{1i}'\beta_1 + x_{2i}\beta_2 + \hat v_i\gamma + e_i$$

by OLS. This reproduces⁷ the IV estimator for $\beta_1$ and $\beta_2$, but also produces an estimate for $\gamma$. If $\gamma = 0$, $x_{2i}$ is exogenous. Consequently, we can easily test the endogeneity of $x_{2i}$ by performing a standard $t$-test on $\gamma = 0$ in the above regression. Note that the endogeneity test requires the assumption that the instrument is exogenous and therefore does not help to determine which identifying moment condition, $E\{\varepsilon_i x_{2i}\} = 0$ or $E\{\varepsilon_i z_{2i}\} = 0$, is appropriate.
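The two-step auxiliary regression can be sketched as follows. The simulated data (with an unobserved `ability` term that makes `x2` endogenous) and the helper `ols` are assumptions for illustration; the point is only to show the mechanics and to check numerically that the coefficients on the original regressors coincide with the IV estimates.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and conventional standard errors."""
    N, K = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    V = (e @ e / (N - K)) * np.linalg.inv(X.T @ X)
    return b, np.sqrt(np.diag(V))

# Hypothetical data in which x2 is endogenous by construction.
rng = np.random.default_rng(3)
N = 5_000
X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
z2 = rng.normal(size=N)
ability = rng.normal(size=N)
x2 = 0.5 * X1[:, 1] + 0.8 * z2 + ability + rng.normal(size=N)
y = 1.0 + 0.5 * X1[:, 1] + 0.1 * x2 + ability + rng.normal(size=N)

# Step 1: reduced form for x2 on (x1, z2); keep the residuals v_hat.
W = np.column_stack([X1, z2])
pi, _ = ols(x2, W)
v_hat = x2 - W @ pi

# Step 2: add v_hat to the model of interest and estimate by OLS.
X = np.column_stack([X1, x2])
b_aux, se_aux = ols(y, np.column_stack([X, v_hat]))
t_gamma = b_aux[-1] / se_aux[-1]        # t-test of gamma = 0

# IV estimator (5.43) for comparison.
b_iv = np.linalg.solve(W.T @ X, W.T @ y)

print("auxiliary regression:", b_aux[:3])
print("IV estimator:        ", b_iv)
print("t-ratio on v_hat:    ", t_gamma)
```

Because `x2` is endogenous by construction here, the $t$-ratio on `v_hat` is large and the test rejects exogeneity.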

⁷ Although the estimates for $\beta_1$ and $\beta_2$ will be identical to the IV estimates, the standard errors will not be appropriate; see Wooldridge (2010, Section 6.2).

The concerns with instrumental variables approaches, or with causal inference more generally, have received substantial attention recently, with Angrist and Pischke (2009) as a prominent example. Larcker and Rusticus (2010) are very critical of the use of instrumental variables in accounting research. After inspecting a number of recently published studies, they conclude that the variables selected as instruments seem largely arbitrary and not justified by any rigorous theoretical discussion. According to them, many IV applications in accounting are likely to produce highly misleading parameter estimates and test statistics. In a similar vein, Roberts and Whited (2013) argue that truly exogenous instruments are extremely difficult to find in corporate finance research and conclude that ‘many papers in corporate finance discuss only the relevance of the instrument and ignore any exclusion restrictions’. Sovey and Green (2011) show that many of the articles in political science do a poor job in providing argumentation for the validity of instruments. Atanasov and Black (2016) focus on shock-based instrumental variables in corporate finance and accounting, which rely on an external shock as the basis for causal inference, for example a change of governance rules imposed by governments. They conclude that only a small minority of the studies they investigated have convincing causal inference strategies. Similarly, Durlauf, Johnson and Temple (2005) state that many IV procedures in the empirical growth literature are ‘undermined by the failure to address properly the question of whether these instruments are valid, i.e., whether they may be plausibly argued to be uncorrelated with the error term in a growth regression’. Bazzi and Clemens (2013) also demonstrate that invalid and weak instruments are commonly used, even in the more recent growth literature.⁸ See Section 5.7 for an empirical illustration in this context.

5.3.2 Back to the Keynesian Model

The problem for the practitioner is thus to find suitable instruments. In most cases, this means that somehow our knowledge of economic theory has to be exploited. In a complete simultaneous equations model (that specifies relationships for all endogenous variables), this problem can be solved because any exogenous variable in the system that is not included in the equation of interest can be used as an instrument. More precisely, any exogenous variable that has an effect on the endogenous regressor can be used as an instrument. Information on this is obtained from the reduced form for the endogenous regressor. For the Keynesian model, this implies that investments $z_{2t}$ provide a valid instrument for income $x_{2t}$. The resulting instrumental variable estimator is then given by

$$\hat\beta_{IV} = \left[\sum_{t=1}^{T} \begin{pmatrix} 1 \\ z_{2t} \end{pmatrix} \begin{pmatrix} 1 & x_{2t} \end{pmatrix}\right]^{-1} \sum_{t=1}^{T} \begin{pmatrix} 1 \\ z_{2t} \end{pmatrix} y_t,$$

which we can solve for $\hat\beta_{2,IV}$ as

$$\hat\beta_{2,IV} = \frac{\sum_{t=1}^{T} (z_{2t} - \bar z_2)(y_t - \bar y)}{\sum_{t=1}^{T} (z_{2t} - \bar z_2)(x_{2t} - \bar x_2)}, \tag{5.48}$$

where $\bar z_2$, $\bar y$ and $\bar x_2$ denote the sample averages.
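A small simulation can confirm that the matrix expression and the ratio in (5.48) give the same slope. The sketch below assumes that the Keynesian model referred to in the text consists of a consumption function $y_t = \beta_1 + \beta_2 x_{2t} + \varepsilon_t$ and an income identity $x_{2t} = y_t + z_{2t}$; that structure, the parameter values and the distribution of investment are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 5_000
beta1, beta2 = 2.0, 0.8                   # illustrative parameter values
z2 = 10.0 + 2.0 * rng.normal(size=T)      # investment, assumed exogenous
eps = rng.normal(size=T)

# Assumed structure: y_t = beta1 + beta2 * x_t + eps_t and x_t = y_t + z_t.
x2 = (beta1 + z2 + eps) / (1.0 - beta2)   # reduced form for income
y = x2 - z2                               # consumption from the identity

# Matrix form with a constant: instruments (1, z_2t), regressors (1, x_2t).
Z = np.column_stack([np.ones(T), z2])
X = np.column_stack([np.ones(T), x2])
b_matrix = np.linalg.solve(Z.T @ X, Z.T @ y)

# Ratio form (5.48), using deviations from sample averages.
zc = z2 - z2.mean()
b2_ratio = (zc @ (y - y.mean())) / (zc @ (x2 - x2.mean()))

# OLS slope for comparison; income is correlated with eps by construction.
xc = x2 - x2.mean()
b2_ols = (xc @ (y - y.mean())) / (xc @ xc)

print("IV slope (matrix):", b_matrix[1])
print("IV slope (ratio): ", b2_ratio)
print("OLS slope:        ", b2_ols)
```

The two IV expressions agree up to rounding error, and in this design the OLS slope overstates the assumed value of 0.8 because income depends positively on the error term.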

An alternative way to see that the estimator (5.48) works is to start from (5.27) and take the covariance with our instrument $z_{2t}$ on both sides of the equality sign. This gives

$$\operatorname{cov}\{y_t, z_{2t}\} = \beta_2 \operatorname{cov}\{x_{2t}, z_{2t}\} + \operatorname{cov}\{\varepsilon_t, z_{2t}\}. \tag{5.49}$$

Exogeneity of the instrument $z_{2t}$ implies that the last term in this equality is zero. Further, when the instrument is relevant, $\operatorname{cov}\{x_{2t}, z_{2t}\} \neq 0$, and we can solve for $\beta_2$ as

$$\beta_2 = \frac{\operatorname{cov}\{z_{2t}, y_t\}}{\operatorname{cov}\{z_{2t}, x_{2t}\}}. \tag{5.50}$$

This relationship suggests an estimator for $\beta_2$ by replacing the population covariances with their sample counterparts. This gives the instrumental variables estimator we have seen above:

$$\hat\beta_{2,IV} = \frac{(1/T)\sum_{t=1}^{T} (z_{2t} - \bar z_2)(y_t - \bar y)}{(1/T)\sum_{t=1}^{T} (z_{2t} - \bar z_2)(x_{2t} - \bar x_2)}. \tag{5.51}$$

Consistency follows directly from the general result that, under weak regularity conditions, sample moments converge to population moments.

⁸ Most of this literature uses panel data, which we discuss in Chapter 10.

5.3.3 Back to the Measurement Error Problem

The model is given by

$$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t,$$

where (as an interpretation) $y_t$ denotes savings and $x_t$ denotes observed disposable income, which equals true disposable income plus a random measurement error. The presence of this measurement error induces correlation between $x_t$ and $\varepsilon_t$.

Given this model, no obvious instruments arise. In fact, this is a common problem in models with measurement errors due to inaccurate recording. The task is to find an observed variable that is (1) correlated with income $x_t$ but (2) not correlated with $u_t$, the measurement error in income (nor with $\varepsilon_t$). If we can find such a variable, we can apply instrumental variables estimation. Mainly because of the problem of finding suitable instruments, the problem of measurement error is often ignored in empirical work.

5.3.4 Multiple Endogenous Regressors

If more than one explanatory variable is considered to be endogenous, the dimension of $x_{2i}$ is increased accordingly, and the model reads

$$y_i = x_{1i}'\beta_1 + x_{2i}'\beta_2 + \varepsilon_i.$$

To estimate this equation, we need an instrument for each element in $x_{2i}$. This means that, if we have five endogenous regressors, we need at least five different instruments. Denoting the instruments by the vector $z_{2i}$, the instrumental variables estimator can again be written as in (5.43),

$$\hat\beta_{IV} = \left(\sum_{i=1}^{N} z_i x_i'\right)^{-1} \sum_{i=1}^{N} z_i y_i,$$

where now $x_i' = (x_{1i}', x_{2i}')$ and $z_i' = (x_{1i}', z_{2i}')$.

It is sometimes convenient to refer to the entire vector $z_i$ as the vector of instruments. If a variable in $x_i$ is assumed to be exogenous, we do not need to find an instrument for it. Alternatively and equivalently, this variable is used as its own instrument. This means that the vector of exogenous variables $x_{1i}$ is included in the $K$-dimensional vector of instruments $z_i$. If all the variables are exogenous, $z_i = x_i$ and we obtain the OLS estimator, where ‘each variable is instrumented by itself’.

In a simultaneous equations context, the exogenous variables from elsewhere in the system are candidate instrumental variables. The so-called ‘order condition for identification’ (see Greene, 2012, Section 10.6) essentially says that sufficient instruments should be available in the system. If, for example, there are five exogenous variables in the system that are not included in the equation of interest, we can have up to five endogenous regressors. If there is only one endogenous regressor, we have five different instruments to choose from. It is also possible and advisable to estimate more efficiently by using all the available instruments simultaneously. This is discussed in Section 5.6. First, however, we shall discuss an empirical illustration concerning the estimation of the causal effect of schooling on earnings.
