6.1 Estimation with Generated Regressors and Instruments
6.1.1 OLS with Generated Regressors
We often need to draw on results for OLS estimation when one or more of the regressors have been estimated from a first-stage procedure. To illustrate the issues, consider the model

y = β₀ + β₁x₁ + ⋯ + β_K x_K + γq + u    (6.1)

We observe x₁, …, x_K, but q is unobserved. However, suppose that q is related to observable data through the function q = f(w, δ), where f is a known function and w is a vector of observed variables, but the vector of parameters δ is unknown (which is why q is not observed). Often, but not always, q will be a linear function of w and δ. Suppose that we can consistently estimate δ, and let δ̂ be the estimator. For each observation i, q̂ᵢ = f(wᵢ, δ̂) effectively estimates qᵢ. Pagan (1984) calls q̂ᵢ a generated regressor. It seems reasonable that replacing qᵢ with q̂ᵢ in running the OLS regression

yᵢ on 1, xᵢ₁, xᵢ₂, …, xᵢK, q̂ᵢ,   i = 1, …, N    (6.2)

should produce consistent estimates of all parameters, including γ. The question is, What assumptions are sufficient?

While we do not cover the asymptotic theory needed for a careful proof until Chapter 12 (which treats nonlinear estimation), we can provide some intuition here. Because plim δ̂ = δ, by the law of large numbers the relevant sample averages involving q̂ᵢ converge to the same limits as those involving qᵢ, so replacing qᵢ with q̂ᵢ in an OLS regression causes no problems for consistency.
Things are not so simple when it comes to inference: the standard errors and test statistics obtained from regression (6.2) are generally invalid because they ignore the sampling variation in δ̂. Since δ̂ is also obtained using data—usually the same sample of data—uncertainty in the estimate should be accounted for in the second step. Nevertheless, there is at least one important case where the sampling variation of δ̂ can be ignored, at least asymptotically: if γ = 0 and

E[∇_δ f(w, δ)′u] = 0    (6.3)

then the first-stage estimation of δ does not affect the limiting distribution of the second-stage OLS estimators. Condition (6.3) usually holds in generated regressor contexts, because the error u is assumed to be orthogonal to all of the exogenous information, including w.
We often want to test the null hypothesis H₀: γ = 0 before including q̂ in the final regression. Fortunately, the usual t statistic on q̂ has a limiting standard normal distribution under H₀, so it can be used to test H₀. It simply requires the usual homoskedasticity assumption, E(u² | x, q) = σ². The heteroskedasticity-robust statistic works if heteroskedasticity is present in u under H₀.
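The two-step procedure and the t test under H₀: γ = 0 can be made concrete with a small simulation. This is a sketch, not from the text: the noisy first-stage indicator q* and all parameter values are illustrative assumptions.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients, residuals, and classical standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, resid, se

rng = np.random.default_rng(42)
n = 5000

# Observables: x enters the outcome; w determines the unobserved q = f(w, delta).
x = rng.normal(size=n)
w = np.column_stack([np.ones(n), rng.normal(size=n)])
delta = np.array([0.3, 1.0])

# Hypothetical first stage: a noisy indicator of q lets us estimate delta.
q_star = w @ delta + rng.normal(size=n)
delta_hat, _, _ = ols(q_star, w)
q_hat = w @ delta_hat                     # the generated regressor

# Outcome generated under H0: gamma = 0, so q is irrelevant for y.
y = 1.0 + 0.5 * x + rng.normal(size=n)

# Second stage: regress y on (1, x, q_hat); under H0 (and condition (6.3))
# the usual t statistic on q_hat is asymptotically standard normal.
X2 = np.column_stack([np.ones(n), x, q_hat])
beta2, _, se2 = ols(y, X2)
t_gamma = beta2[2] / se2[2]
print(f"gamma_hat = {beta2[2]:.3f}, t = {t_gamma:.2f}")
```

In repeated samples from this design, |t| exceeds 1.96 about 5 percent of the time, as the theory predicts.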
Even if condition (6.3) holds, if γ ≠ 0, then an adjustment, due to the estimation of δ, is needed for the asymptotic variances of all of the OLS estimators. Thus, standard t statistics, F statistics, and LM statistics will not be asymptotically valid when γ ≠ 0. Using the methods of Chapter 3, it is not difficult to derive an adjustment to the usual variance matrix estimate that accounts for the variability in δ̂ (and also allows for heteroskedasticity). It is not true that replacing qᵢ with q̂ᵢ simply introduces heteroskedasticity into the error term; this is not the correct way to think about the generated regressors issue. Accounting for the fact that δ̂ depends on the same random sample used in the second-stage estimation is much different from having heteroskedasticity in the error. Of course, we might want to use a heteroskedasticity-robust standard error for testing H₀: γ = 0, because heteroskedasticity in the population error u can always be a problem. However, just as with the usual OLS standard error, this is generally justified only under H₀: γ = 0.
A general formula for the asymptotic variance of 2SLS in the presence of generated regressors is given in the appendix to this chapter; this covers OLS with generated regressors as a special case. A general framework for handling these problems is given in Newey (1984) and Newey and McFadden (1994), but we must hold off until Chapter 14 to give a careful treatment.
6.1.2 2SLS with Generated Instruments
In later chapters we will need results on 2SLS estimation when the instruments have been estimated in a preliminary stage. Write the population model as
This greatly simplifies calculation of asymptotic standard errors and test statistics. Therefore, if we have a choice, there are practical reasons for using 2SLS with generated instruments rather than OLS with generated regressors. We will see some examples in Part IV.

One consequence of this discussion is that, if we add the 2SLS homoskedasticity assumption (Assumption 2SLS.3), the usual 2SLS standard errors and test statistics are asymptotically valid. If Assumption 2SLS.3 is violated, we simply use the heteroskedasticity-robust standard errors and test statistics. Of course, the finite sample properties of the estimator using ẑᵢ as instruments could be notably different from those using zᵢ as instruments, especially for small sample sizes. Determining whether this is the case requires either more sophisticated asymptotic approximations or simulations on a case-by-case basis.
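A sketch of 2SLS with a generated instrument, using an illustrative simulated design rather than anything from the text: the instrument ẑ = wδ̂ uses an estimated δ̂, here the coefficients from a preliminary regression of the endogenous x on w, so ẑ is the familiar first-stage fitted value, the leading case of a generated instrument.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000

# Structural model: y = b0 + b1*x + u, with x endogenous via the common shock v.
w = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(size=n)
x = w @ np.array([0.0, 1.0, -1.0]) + v
y = 1.0 + 2.0 * x + u

# Generated instrument: zhat = w @ delta_hat, with delta_hat estimated in a
# preliminary stage (here, a regression of x on w).
delta_hat, *_ = np.linalg.lstsq(w, x, rcond=None)
zhat = w @ delta_hat

# Just-identified IV using (1, zhat) as instruments for (1, x).
Z = np.column_stack([np.ones(n), zhat])
X = np.column_stack([np.ones(n), x])
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(f"IV slope:  {b_iv[1]:.3f}   OLS slope: {b_ols[1]:.3f}")  # IV near 2.0; OLS biased upward
```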
6.1.3 Generated Instruments and Regressors
We will encounter examples later where some instruments and some regressors are estimated in a first stage. Generally, the asymptotic variance needs to be adjusted because of the generated regressors, although there are some special cases where the usual variance matrix estimators are valid. As a general example, consider the model
y = xβ + γf(w, δ) + u,   E(u | z, w) = 0
and we estimate δ in a first stage. If γ = 0, then the 2SLS estimator of (β′, γ)′ in the equation that uses f(w, δ̂) in place of f(w, δ) has the same limiting distribution as the estimator based on f(w, δ) itself, under conditions (6.3) and (6.8). Therefore, the usual 2SLS t statistic for γ̂, or its heteroskedasticity-robust version, can be used to test H₀: γ = 0.
6.2 Some Specification Tests
In Chapters 4 and 5 we covered what is usually called classical hypothesis testing for OLS and 2SLS. In this section we cover some tests of the assumptions underlying either OLS or 2SLS. These are easy to compute and should be routinely reported in applications.
6.2.1 Testing for Endogeneity
We start with the linear model and a single possibly endogenous variable. For notational clarity we now denote the dependent variable by y₁ and the potentially endogenous explanatory variable by y₂. As in all 2SLS contexts, y₂ can be continuous or binary, or it may have continuous and discrete characteristics; there are no restrictions. The population model is

y₁ = z₁δ₁ + α₁y₂ + u₁    (6.9)

where z₁ is a subvector of the vector z of all exogenous variables. Hausman (1978) suggested comparing the OLS and 2SLS estimators of β₁ ≡ (δ₁′, α₁)′ as a formal test of endogeneity: if y₂ is uncorrelated with u₁, the OLS and 2SLS estimators should differ only by sampling error. This reasoning leads to the Hausman test for endogeneity.
The original form of the statistic turns out to be cumbersome to compute because the matrix appearing in the quadratic form is singular, except when no exogenous variables are present in equation (6.9). As pointed out by Hausman (1978, 1983), there is a regression-based form of the test that turns out to be asymptotically equivalent to the original form of the Hausman test. In addition, it extends easily to other situations, including some nonlinear models that we cover in Chapters 15, 16, and 19.
To derive the regression-based test, write the linear projection of y₂ on z in error form as

y₂ = zπ₂ + v₂    (6.12)

and write the linear projection of u₁ on v₂ in error form as

u₁ = ρ₁v₂ + e₁    (6.13)

where ρ₁ = E(v₂u₁)/E(v₂²), E(v₂e₁) = 0, and E(z′e₁) = 0 (since u₁ and v₂ are each orthogonal to z). Thus, y₂ is exogenous if and only if ρ₁ = 0.

Plugging equation (6.13) into equation (6.9) gives the equation

y₁ = z₁δ₁ + α₁y₂ + ρ₁v₂ + e₁    (6.14)

The key is that e₁ is uncorrelated with z₁, y₂, and v₂ by construction. Therefore, a test of H₀: ρ₁ = 0 can be done using a standard t test on the variable v₂ in an OLS regression that includes z₁ and y₂. The problem is that v₂ is not observed. Nevertheless, the reduced form parameters π₂ are easily estimated by OLS. Let v̂₂ denote the OLS residuals from the first-stage reduced form regression of y₂ on z—remember that z contains all exogenous variables. If we replace v₂ with v̂₂ we have the equation

y₁ = z₁δ₁ + α₁y₂ + ρ₁v̂₂ + error    (6.15)
and δ₁, α₁, and ρ₁ can be consistently estimated by OLS. Now we can use the results on generated regressors in Section 6.1.1: the usual OLS t statistic for ρ̂₁ is a valid test of H₀: ρ₁ = 0, provided the homoskedasticity assumption E(u₁² | z, y₂) = σ₁² is satisfied under H₀. (Remember, y₂ is exogenous under H₀.) A heteroskedasticity-robust t statistic can be used if heteroskedasticity is suspected under H₀.
As shown in Problem 5.1, the OLS estimates of δ₁ and α₁ from equation (6.15) are in fact identical to the 2SLS estimates. This fact is convenient because, along with being computationally simple, regression (6.15) allows us to compare the magnitudes of the OLS and 2SLS estimates in order to determine whether the differences are practically significant, rather than just finding statistically significant evidence of endogeneity of y₂. It also provides a way to verify that we have computed the statistic correctly.

We should remember that the OLS standard errors that would be reported from equation (6.15) are not valid unless ρ₁ = 0, because v̂₂ is a generated regressor. In practice, if we reject H₀: ρ₁ = 0, then, to get the appropriate standard errors and other test statistics, we estimate equation (6.9) by 2SLS.
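The regression-based test can be sketched in a simulated design with one endogenous variable and one excluded instrument; all names and parameter values are illustrative. The final check verifies the Problem 5.1 result that the coefficients on (z₁, y₂) from regression (6.15) are numerically identical to 2SLS.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Exogenous data: z1 enters the structural equation; z2 is the excluded instrument.
z1 = np.column_stack([np.ones(n), rng.normal(size=n)])
z2 = rng.normal(size=(n, 1))
z = np.hstack([z1, z2])

v2 = rng.normal(size=n)
u1 = 0.7 * v2 + rng.normal(size=n)        # rho1 != 0, so y2 is endogenous
y2 = z @ np.array([0.5, 1.0, 1.0]) + v2
y1 = z1 @ np.array([1.0, -0.5]) + 2.0 * y2 + u1

# Step 1: reduced-form residuals vhat2 from the regression of y2 on all of z.
pi2, *_ = np.linalg.lstsq(z, y2, rcond=None)
vhat2 = y2 - z @ pi2

# Step 2: OLS of y1 on (z1, y2, vhat2); the t statistic on vhat2 tests H0: rho1 = 0.
Xcf = np.column_stack([z1, y2, vhat2])
bcf, *_ = np.linalg.lstsq(Xcf, y1, rcond=None)
res = y1 - Xcf @ bcf
s2 = res @ res / (n - Xcf.shape[1])
se = np.sqrt(np.diag(s2 * np.linalg.inv(Xcf.T @ Xcf)))
t_rho = bcf[3] / se[3]
print(f"rho1_hat = {bcf[3]:.3f}, t = {t_rho:.1f}")   # large |t|: reject exogeneity

# The control-function coefficients on (z1, y2) equal the 2SLS estimates exactly.
X = np.column_stack([z1, y2])
Xhat = z @ np.linalg.solve(z.T @ z, z.T @ X)
b2sls = np.linalg.solve(Xhat.T @ X, Xhat.T @ y1)
print(np.allclose(bcf[:3], b2sls))                   # True
```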
Example 6.1 (Testing for Endogeneity of Education in a Wage Equation): Consider the wage equation

log(wage) = δ₀ + δ₁exper + δ₂exper² + α₁educ + u₁    (6.16)

for working women, where we believe that educ and u₁ may be correlated. The instruments for educ are parents' education and husband's education. So, we first regress educ on 1, exper, exper², motheduc, fatheduc, and huseduc and obtain the residuals, v̂₂. Then we simply include v̂₂ along with unity, exper, exper², and educ in an OLS regression and obtain the t statistic on v̂₂. Using the data in MROZ.RAW gives ρ̂₁ = .047 with t statistic 1.65. We find evidence of endogeneity of educ at the 10 percent significance level against a two-sided alternative, and so 2SLS is probably a good idea (assuming that we trust the instruments). The correct 2SLS standard errors are given in Example 5.3.
Rather than comparing the OLS and 2SLS estimates of a particular linear combination of the parameters—as the original Hausman test does—it often makes sense to compare just the estimates of the parameter of interest, which is usually α₁. If, under H₀, Assumptions 2SLS.1–2SLS.3 hold with w replacing z, where w includes all nonredundant elements of x and z, obtaining the test is straightforward. Under these assumptions it can be shown that Avar(α̂₁,2SLS − α̂₁,OLS) = Avar(α̂₁,2SLS) − Avar(α̂₁,OLS). [This conclusion essentially holds because of Theorem 5.3; Problem 6.12 asks you to show this result formally. Hausman (1978), Newey and McFadden (1994, Section 5.3), and Section 14.5.1 contain more general treatments.] Therefore, the Hausman t statistic is simply (α̂₁,2SLS − α̂₁,OLS)/{[se(α̂₁,2SLS)]² − [se(α̂₁,OLS)]²}^{1/2}, where the standard errors are the usual ones computed under homoskedasticity. The denominator in the t statistic is the standard error of (α̂₁,2SLS − α̂₁,OLS). If there is heteroskedasticity under H₀, this standard error is invalid because the asymptotic variance of the difference is no longer the difference in the asymptotic variances.
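Given standard regression output, computing this Hausman t statistic is a one-liner; the numbers below are illustrative only, not taken from the text's examples.

```python
import math

def hausman_t(a_2sls, se_2sls, a_ols, se_ols):
    """Hausman t statistic for one parameter, valid under homoskedasticity,
    where Avar of the difference equals the difference of the Avars."""
    var_diff = se_2sls**2 - se_ols**2
    if var_diff <= 0:
        raise ValueError("expected se_2sls > se_ols under the null")
    return (a_2sls - a_ols) / math.sqrt(var_diff)

# Hypothetical OLS and 2SLS estimates of a return-to-education coefficient:
t = hausman_t(a_2sls=0.080, se_2sls=0.022, a_ols=0.107, se_ols=0.014)
print(f"Hausman t = {t:.2f}")   # -1.59: no rejection at the 5 percent level
```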
Extending the regression-based Hausman test to several potentially endogenous explanatory variables is straightforward. Let y₂ denote a 1 × G₁ vector of possibly endogenous variables in the population model

y₁ = z₁δ₁ + y₂α₁ + u₁,   E(z′u₁) = 0    (6.17)

where α₁ is now G₁ × 1. Again, we assume the rank condition for 2SLS. Write the reduced form as y₂ = zΠ₂ + v₂, where Π₂ is L × G₁ and v₂ is the 1 × G₁ vector of population reduced form errors. For a generic observation let v̂₂ denote the 1 × G₁ vector of OLS residuals obtained from each reduced form. (In other words, take each element of y₂ and regress it on z to obtain the RF residuals; then collect these in the row vector v̂₂.) Now, estimate the model

y₁ = z₁δ₁ + y₂α₁ + v̂₂ρ₁ + error    (6.18)

and do a standard F test of H₀: ρ₁ = 0, which tests G₁ restrictions in the unrestricted model (6.18). The restricted model is obtained by setting ρ₁ = 0, which means we estimate the original model (6.17) by OLS. The test can be made robust to heteroskedasticity in u₁ (since u₁ = e₁ under H₀) by applying the heteroskedasticity-robust Wald statistic of Chapter 4. In some regression packages, such as Stata, the robust test is implemented as an F-type test.
An alternative to the F test is an LM-type test. Let û₁ be the OLS residuals from the regression y₁ on z₁, y₂ (the residuals obtained under the null that y₂ is exogenous). Then, obtain the usual R-squared (assuming that z₁ contains a constant), say R²ᵤ, from the regression

û₁ on z₁, y₂, v̂₂    (6.19)

and use NR²ᵤ as asymptotically χ²_{G₁}. This test again maintains homoskedasticity under H₀. The test can be made heteroskedasticity-robust using the method described in equation (4.17): take x₁ = (z₁, y₂) and x₂ = v̂₂. See also Wooldridge (1995b).

Example 6.2 (Endogeneity of Education in a Wage Equation, continued): We add the interaction term black·educ to the log(wage) equation estimated by Card (1995); see also Problem 5.4. Write the model as
log(wage) = α₁educ + α₂black·educ + z₁δ₁ + u₁    (6.20)

where z₁ contains a constant, exper, exper², black, smsa, 1966 regional dummy variables, and a 1966 SMSA indicator. If educ is correlated with u₁, then we also expect black·educ to be correlated with u₁. If nearc4, a binary indicator for whether a worker grew up near a four-year college, is valid as an instrumental variable for educ, then a natural instrumental variable for black·educ is black·nearc4. Note that black·nearc4 is uncorrelated with u₁ under the conditional mean assumption E(u₁ | z) = 0, where z contains all exogenous variables.
The equation estimated by OLS is

log(wage)^ = 4.81 + .071 educ + .018 black·educ − .419 black + ⋯
                (0.75)   (.004)         (.006)                (.079)

(standard errors in parentheses; coefficients on the remaining controls in z₁ not shown).
Therefore, the return to education is estimated to be about 1.8 percentage points higher for blacks than for nonblacks, even though wages are substantially lower for blacks at all but unrealistically high levels of education. (It takes an estimated 23.3 years of education before a black worker earns as much as a nonblack worker.)
To test whether educ is exogenous we must test whether educ and black·educ are uncorrelated with u₁. We do so by first regressing educ on all instrumental variables: those elements in z₁ plus nearc4 and black·nearc4. (The interaction black·nearc4 should be included because it might be partially correlated with educ.) Let v̂₂₁ be the OLS residuals from this regression. Similarly, regress black·educ on z₁, nearc4, and black·nearc4, and save the residuals, v̂₂₂. By the way, the fact that the dependent variable in the second reduced form regression, black·educ, is zero for a large fraction of the sample has no bearing on how we test for endogeneity.

Adding v̂₂₁ and v̂₂₂ to the OLS regression and computing the joint F test yields F = 0.54 and p-value = 0.581; thus we do not reject exogeneity of educ and black·educ. Incidentally, the reduced form regressions confirm that educ is partially correlated with nearc4 (but not black·nearc4) and that black·educ is partially correlated with black·nearc4 (but not nearc4). It is easily seen that these findings mean that the rank condition for 2SLS is satisfied—see Problem 5.15c. Even though educ does not appear to be endogenous in equation (6.20), we estimate the equation by 2SLS:

log(wage)^ = 3.84 + .127 educ + .011 black·educ − .283 black + ⋯
                (0.97)   (.057)         (.040)                (.506)

(standard errors in parentheses; coefficients on the remaining controls in z₁ not shown).
The 2SLS point estimates certainly differ from the OLS estimates, but the standard errors are so large that the 2SLS and OLS estimates are not statistically different.

6.2.2 Testing Overidentifying Restrictions
When we have more instruments than we need to identify an equation, we can test whether the additional instruments are valid in the sense that they are uncorrelated with the error. To explain the various procedures, write the equation in the form
y₁ = z₁δ₁ + y₂α₁ + u₁    (6.21)

where z₁ is 1 × L₁ and y₂ is 1 × G₁. The 1 × L vector of all exogenous variables is again z; partition this as z = (z₁, z₂), where z₂ is 1 × L₂ and L = L₁ + L₂. Because the model is overidentified, L₂ > G₁. Under the usual identification conditions we could use any 1 × G₁ subset of z₂ as instruments for y₂ in estimating equation (6.21) (remember, the elements of z₁ act as their own instruments). Following his general principle, Hausman (1978) suggested comparing the 2SLS estimator using all instruments to 2SLS using a subset that just identifies equation (6.21). If all instruments are valid, the estimates should differ only as a result of sampling error. As with testing for endogeneity, constructing the original Hausman statistic is computationally cumbersome. Instead, a simple regression-based procedure is available.
It turns out that, under homoskedasticity, a test for validity of the overidentifying restrictions is obtained as NR²ᵤ from the OLS regression

û₁ on z    (6.22)

where û₁ are the 2SLS residuals using all of the instruments z and R²ᵤ is the usual R-squared (assuming that z₁ and z contain a constant; otherwise it is the uncentered R-squared). In other words, simply estimate regression (6.21) by 2SLS and obtain the 2SLS residuals, û₁. Then regress these on all exogenous variables (including a constant). Under the null that E(z′u₁) = 0 and Assumption 2SLS.3, NR²ᵤ ~a χ²_{Q₁}, where Q₁ ≡ L₂ − G₁ is the number of overidentifying restrictions.

The usefulness of the Hausman test is that, if we reject the null hypothesis, then our logic for choosing the IVs must be reexamined. If we fail to reject the null, then we can have some confidence in the overall set of instruments used. Of course, it could also be that the test has low power for detecting endogeneity of some of the instruments.
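A sketch of the NR² overidentification test in a simulated design with three valid excluded instruments for one endogenous variable, so Q₁ = 3 − 1 = 2 (all names and parameter values are illustrative).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

z1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # included exogenous vars
z2 = rng.normal(size=(n, 3))                             # excluded instruments
z = np.hstack([z1, z2])

v2 = rng.normal(size=n)
u1 = 0.5 * v2 + rng.normal(size=n)
y2 = z @ np.array([0.2, 0.5, 1.0, 0.8, -0.6]) + v2
y1 = z1 @ np.array([1.0, 0.5]) + 1.5 * y2 + u1

# 2SLS using all instruments.
X = np.column_stack([z1, y2])
Xhat = z @ np.linalg.lstsq(z, X, rcond=None)[0]
b2sls = np.linalg.solve(Xhat.T @ X, Xhat.T @ y1)
uhat1 = y1 - X @ b2sls

# Regress the 2SLS residuals on all of z; N * R^2 ~ chi2(Q1) under H0.
fit = z @ np.linalg.lstsq(z, uhat1, rcond=None)[0]
tss = (uhat1 - uhat1.mean()) @ (uhat1 - uhat1.mean())
r2 = 1.0 - ((uhat1 - fit) @ (uhat1 - fit)) / tss
overid_stat = n * r2
print(f"overid statistic = {overid_stat:.2f}")   # compare with the chi2(2) 5% value, 5.99
```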
A heteroskedasticity-robust version is a little more complicated but is still easy to obtain. Let ŷ₂ denote the fitted values from the first-stage regressions (of each element of y₂ onto z). Now, let h₂ be any 1 × Q₁ subset of z₂. (It does not matter which elements of z₂ we choose, as long as we choose Q₁ of them.) Regress each element of h₂ onto (z₁, ŷ₂) and collect the residuals, r̂₂ (1 × Q₁). Then an asymptotic χ²_{Q₁} test statistic is obtained as N − SSR₀ from the regression 1 on û₁r̂₂, where SSR₀ is the sum of squared residuals. The proof that this method works is very similar to that for the heteroskedasticity-robust test for exclusion restrictions. See Wooldridge (1995b) for details.
Example 6.3 (Overidentifying Restrictions in the Wage Equation): In estimating equation (6.16) by 2SLS, we used (motheduc, fatheduc, huseduc) as instruments for educ. Therefore, there are two overidentifying restrictions. Letting û₁ be the 2SLS residuals from equation (6.16) using all instruments, the test statistic is N times the R-squared from the OLS regression

û₁ on 1, exper, exper², motheduc, fatheduc, huseduc

Under H₀ and homoskedasticity, NR²ᵤ ~a χ²₂. Using the data on working women in MROZ.RAW gives R²ᵤ = .0026, and so the overidentification test statistic is about 1.11. The p-value is about .574, so the overidentifying restrictions are not rejected at any reasonable level.
For the heteroskedasticity-robust version, one approach is to obtain the residuals, r̂₁ and r̂₂, from the OLS regressions motheduc on 1, exper, exper², and êduc, and fatheduc on 1, exper, exper², and êduc, where êduc denotes the first-stage fitted values from the regression educ on 1, exper, exper², motheduc, fatheduc, and huseduc. Then obtain N − SSR from the OLS regression 1 on û₁r̂₁, û₁r̂₂. Using only the 428 observations on working women to obtain r̂₁ and r̂₂, the value of the robust test statistic is about 1.04 with p-value = .595, which is similar to the p-value for the nonrobust test.
6.2.3 Testing Functional Form
Sometimes we need a test with power for detecting neglected nonlinearities in models estimated by OLS or 2SLS. A useful approach is to add nonlinear functions, such as squares and cross products, to the original model. This approach is easy when all explanatory variables are exogenous: F statistics and LM statistics for exclusion restrictions are easily obtained. It is a little tricky for models with endogenous explanatory variables because we need to choose instruments for the additional nonlinear functions of the endogenous variables. We postpone this topic until Chapter 9 when we discuss simultaneous equation models. See also Wooldridge (1995b).

Putting in squares and cross products of all exogenous variables can consume many degrees of freedom. An alternative is Ramsey's (1969) RESET, which has degrees of freedom that do not depend on K. Write the model as

y = xβ + u    (6.23)

E(u | x) = 0    (6.24)

[You should convince yourself that it makes no sense to test for functional form if we only assume that E(x′u) = 0. If equation (6.23) defines a linear projection, then, by definition, functional form is not an issue.] Under condition (6.24) we know that any function of x is uncorrelated with u (hence the previous suggestion of putting squares and cross products of x as additional regressors). In particular, if condition (6.24) holds, then (xβ)ᵖ is uncorrelated with u for any integer p. Since β is not observed, we replace it with the OLS estimator, β̂. Define ŷᵢ = xᵢβ̂ as the OLS fitted values and ûᵢ as the OLS residuals. By definition of OLS, the sample covariance between ûᵢ and ŷᵢ is zero. But we can test whether the ûᵢ are sufficiently correlated with low-order polynomials in ŷᵢ, say ŷᵢ² and ŷᵢ³, as a test for neglected nonlinearity.
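The LM form of RESET can be sketched as follows (simulated data; names and parameter values are illustrative). A second, misspecified design shows that the test has power against an omitted quadratic.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

def reset_stat(y, X, n):
    """N * R^2 from regressing the OLS residuals on (x, yhat^2, yhat^3)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    yhat = X @ b
    uhat = y - yhat
    Xa = np.column_stack([X, yhat**2, yhat**3])
    fit = Xa @ np.linalg.lstsq(Xa, uhat, rcond=None)[0]
    r2 = 1.0 - ((uhat - fit) @ (uhat - fit)) / (uhat @ uhat)
    return n * r2

# Correctly specified linear model: the statistic is chi2(2)-distributed under H0.
y_good = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)
stat_good = reset_stat(y_good, X, n)

# Misspecified model (x1^2 omitted): RESET should reject.
y_bad = X @ np.array([1.0, 0.5, -1.0]) + x1**2 + rng.normal(size=n)
stat_bad = reset_stat(y_bad, X, n)

print(f"correct model: {stat_good:.2f}   misspecified: {stat_bad:.1f}")  # compare with 5.99
```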
Example 6.4 (Testing for Neglected Nonlinearities in a Wage Equation): We use OLS and the data in NLS80.RAW to estimate the equation from Example 4.3:

log(wage) = β₀ + β₁exper + β₂tenure + β₃married + β₄south + β₅urban + β₆black + β₇educ + u

The null hypothesis is that the expected value of u given the explanatory variables in the equation is zero. The R-squared from the regression û on x, ŷ², and ŷ³ yields R²ᵤ = .0004, so the chi-square statistic is .374 with p-value ≈ .83. (Adding ŷ⁴ only increases the p-value.) Therefore, RESET provides no evidence of functional form misspecification.

Even though we already know IQ shows up very significantly in the equation (t statistic = 3.60—see Example 4.3), RESET does not, and should not be expected to, detect the omitted variable problem. It can only test whether the expected value of y given the variables actually in the regression is linear in those variables.
6.2.4 Testing for Heteroskedasticity
As we have seen for both OLS and 2SLS, heteroskedasticity does not affect the consistency of the estimators, and it is only a minor nuisance for inference. Nevertheless, sometimes we want to test for the presence of heteroskedasticity in order to justify use of the usual OLS or 2SLS statistics. If heteroskedasticity is present, more efficient estimation is possible.
We begin with the case where the explanatory variables are exogenous in the sense that u has zero mean given x:

y = β₀ + xβ + u,   E(u | x) = 0

The reason we do not assume only the weaker condition E(x′u) = 0 is that the following class of tests—which encompasses all of the widely used tests for heteroskedasticity—is not valid unless E(u | x) = 0 is maintained under H₀. Thus we maintain that the mean E(y | x) is correctly specified, and then we test the constant conditional variance assumption. If we do not assume correct specification of E(y | x), a significant heteroskedasticity test might just be detecting misspecified functional form in E(y | x); see Problem 6.4c.

Because E(u | x) = 0, the null hypothesis can be stated as H₀: E(u² | x) = σ². Under the alternative, E(u² | x) depends on x in some way. Thus it makes sense to test H₀ by looking at covariances of the form Cov[h(x), u²] for some 1 × Q vector function h(x). Under H₀, these covariances are all zero, which suggests the linear model

u² = δ₀ + hδ + v    (6.26)

where h ≡ h(x); under H₀, δ = 0, δ₀ = σ², and v = u² − σ². Thus we can apply an F test or an LM test for the null H₀: δ = 0
in equation (6.26). One thing to notice is that vᵢ cannot have a normal distribution under H₀: because vᵢ = uᵢ² − σ², we have vᵢ ≥ −σ². This does not matter for asymptotic analysis; the OLS regression from equation (6.26) gives a consistent, √N-asymptotically normal estimator of δ whether or not H₀ is true. But to apply a standard F or LM test, we must assume that, under H₀, E(vᵢ² | xᵢ) is constant: that is, the errors in equation (6.26) are homoskedastic. In terms of the original error uᵢ, this assumption implies that

E(u⁴ | x) = κ² (a constant)    (6.27)

under H₀. This is called the homokurtosis (constant conditional fourth moment) assumption. Homokurtosis always holds when u is independent of x, but there are conditional distributions for which E(u | x) = 0 and Var(u | x) = σ² but E(u⁴ | x) depends on x.
As a practical matter, we cannot test δ = 0 in equation (6.26) directly because uᵢ is not observed. Since uᵢ = yᵢ − xᵢβ and we have a consistent estimator of β, it is natural to replace uᵢ² with ûᵢ², where the ûᵢ are the OLS residuals for observation i. Doing this step and applying, say, the LM principle, we obtain NR²_c from the regression

ûᵢ² on 1, hᵢ,   i = 1, …, N    (6.28)

where R²_c is just the usual centered R-squared. Now, if the uᵢ² were used in place of the ûᵢ², we know that, under H₀ and condition (6.27), NR²_c ~a χ²_Q, where Q is the dimension of hᵢ.

What adjustment is needed because we have estimated uᵢ²? It turns out that, because of the structure of these tests, no adjustment is needed to the asymptotics. (This statement is not generally true for regressions where the dependent variable has been estimated in a first stage; the current setup is special in that regard.) After tedious algebra, it can be shown that the statistic based on the ûᵢ² has the same limiting distribution as the statistic based on the uᵢ². [The original form of the Breusch-Pagan test relies heavily on normality of the uᵢ, in particular κ² = 3σ⁴, so that Koenker's version based on NR²_c in regression (6.28) is preferred.] White's (1980b) test is obtained by taking hᵢ to be all nonconstant, unique elements of xᵢ and xᵢ′xᵢ: the levels, squares, and cross products of the regressors in the conditional mean.

The Breusch-Pagan and White tests have degrees of freedom that depend on the number of regressors in E(y | x). Sometimes we want to conserve on degrees of freedom. A test that combines features of the Breusch-Pagan and White tests, but which has only two dfs, takes ĥᵢ ≡ (ŷᵢ, ŷᵢ²), where the ŷᵢ are the OLS fitted values. (Recall that these are linear functions of the xᵢ.) To justify this test, we must be able to replace h(xᵢ) with h(xᵢ, β̂). We discussed the generated regressors problem for OLS in Section 6.1.1 and concluded that, for testing purposes, using estimates from earlier stages causes no complications. This is the case here as well: NR²_c from the regression ûᵢ² on 1, ŷᵢ, ŷᵢ² has a limiting χ²₂ distribution under H₀.
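A sketch of the NR²_c test (simulated data, illustrative design) using a White-style choice of hᵢ: the levels, squares, and cross product of two regressors, so Q = 5.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5000

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

def het_stat(y, X, h):
    """N times the centered R^2 from regressing uhat^2 on (1, h)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ b) ** 2
    H = np.column_stack([np.ones(len(y)), h])
    fit = H @ np.linalg.lstsq(H, u2, rcond=None)[0]
    tss = (u2 - u2.mean()) @ (u2 - u2.mean())
    r2 = 1.0 - ((u2 - fit) @ (u2 - fit)) / tss
    return len(y) * r2

h = np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])     # White-style h(x)

# Homoskedastic errors: the statistic is chi2(5)-distributed under H0.
y_hom = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)
stat_hom = het_stat(y_hom, X, h)

# Heteroskedastic errors: Var(u|x) increases with x1^2, so the test should reject.
y_het = X @ np.array([1.0, 0.5, -1.0]) + np.sqrt(0.2 + x1**2) * rng.normal(size=n)
stat_het = het_stat(y_het, X, h)

print(f"homoskedastic: {stat_hom:.1f}   heteroskedastic: {stat_het:.1f}")  # chi2(5) 5% value: 11.07
```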
iÞ, where the ^yi are the OLS fitted values (Recallthat these are linear functions of the xi.) To justify this test, we must be able to re-place hðxiÞ with hðxi; ^bÞ We discussed the generated regressors problem for OLS inSection 6.1.1 and concluded that, for testing purposes, using estimates from earlierstages causes no complications This is the case here as well: NR2from ^u2