It is quite clear that, on average, people with more education have higher wages. It is less clear, however, whether this positive correlation reflects a causal effect of schooling, or that individuals with a greater earnings capacity have chosen more years of schooling.
If the latter possibility is true, the OLS estimates of the returns to schooling simply reflect differences in unobserved characteristics of working individuals, and an increase in a person’s schooling owing to an exogenous shock will have no effect on this person’s wage. The problem of estimating the causal effect of schooling upon earnings has therefore attracted substantial attention in the literature; see Card (1999) for a survey.
Most studies are based upon the human capital earnings function, which says that
$$
w_i = \beta_1 + \beta_2 S_i + \beta_3 E_i + \beta_4 E_i^2 + \varepsilon_i,
$$
where $w_i$ denotes the log of individual earnings, $S_i$ denotes years of schooling and $E_i$ denotes years of experience. In the absence of information on actual experience, $E_i$ is sometimes replaced by ‘potential experience’, measured as $\text{age}_i - S_i - 6$, assuming people start school at the age of 6. This specification is usually augmented with additional explanatory variables that one wants to control for, like regional, gender and racial dummies. In addition, it is sometimes argued that the returns to education vary across individuals. With this in mind, let us reformulate the wage equation as
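$$
\begin{aligned}
w_i &= z_i \beta + \gamma_i S_i + u_i \\
    &= z_i \beta + \gamma S_i + \varepsilon_i,
\end{aligned}
\tag{5.52}
$$
where $\varepsilon_i = u_i + (\gamma_i - \gamma) S_i$, and $z_i$ includes all observable variables (except $S_i$), including the experience variables and a constant. It is assumed that $E\{\varepsilon_i z_i\} = 0$. The coefficient $\gamma$ has the interpretation of the average return to (an additional year of) schooling, $E\{\gamma_i\} = \gamma$, and is our parameter of interest. In addition, we specify a reduced form for $S_i$ as
$$
S_i = z_i \pi + v_i, \tag{5.53}
$$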
where $E\{v_i z_i\} = 0$. This reduced form is simply a best linear approximation of $S_i$ and does not necessarily have an economic interpretation. OLS estimation of $\beta$ and $\gamma$ in (5.52) is consistent only if $E\{\varepsilon_i S_i\} = E\{\varepsilon_i v_i\} = 0$. This means that there are no unobservable characteristics that both affect a person’s choice of schooling and his or her (later) earnings.
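The consistency condition can be made explicit with a short derivation. The following is a sketch under the assumptions already stated ($E\{\varepsilon_i z_i\} = 0$ and $E\{v_i z_i\} = 0$), using the standard partialling-out argument: the OLS coefficient on $S_i$ converges to the coefficient obtained by regressing $w_i$ on the part of $S_i$ that is not explained by $z_i$, which is $v_i$.

$$
\begin{aligned}
E\{\varepsilon_i S_i\} &= E\{\varepsilon_i z_i\}\pi + E\{\varepsilon_i v_i\} = E\{\varepsilon_i v_i\}, \\
\operatorname{plim} \hat{\gamma}_{OLS} &= \frac{E\{v_i w_i\}}{E\{v_i^2\}}
 = \frac{\gamma E\{v_i^2\} + E\{\varepsilon_i v_i\}}{E\{v_i^2\}}
 = \gamma + \frac{E\{\varepsilon_i v_i\}}{E\{v_i^2\}}.
\end{aligned}
$$

The sign of the inconsistency is therefore the sign of the correlation between $\varepsilon_i$ and $v_i$.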
As discussed in Card (1995), there are different reasons why schooling may be correlated with $\varepsilon_i$. An important one is ‘ability bias’ (see Griliches, 1977). Suppose that some individuals have unobserved characteristics (ability) that enable them to get higher earnings. If these individuals also have above-average schooling levels, this implies a positive correlation between $\varepsilon_i$ and $v_i$ and an OLS estimator that is upward biased.
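Ability bias is easy to reproduce in a small simulation. The sketch below uses an invented data-generating process (the true return of 0.08 and all other coefficients are hypothetical, chosen only for illustration) in which unobserved ability raises both schooling and wages, and then regresses log wages on schooling alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
gamma = 0.08  # hypothetical true return to schooling

# Unobserved ability raises both schooling and earnings (assumed values).
ability = rng.normal(size=n)
schooling = 12 + 1.5 * ability + rng.normal(scale=2.0, size=n)
log_wage = 1.0 + gamma * schooling + 0.1 * ability + rng.normal(scale=0.3, size=n)

# OLS of log wage on a constant and schooling, omitting ability.
X = np.column_stack([np.ones(n), schooling])
b_ols = np.linalg.lstsq(X, log_wage, rcond=None)[0]

# In this design the probability limit of the OLS slope is
# gamma + cov(S, 0.1*ability) / var(S) = 0.08 + 0.15 / 6.25 = 0.104.
print(f"true return: {gamma:.3f}, OLS estimate: {b_ols[1]:.3f}")
```

In this design the OLS slope settles above the true return because the omitted ability term ends up in the error and is positively correlated with schooling.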
Another reason why $\varepsilon_i$ and $v_i$ may be correlated is the existence of measurement error in the schooling measure. As discussed in Subsection 5.2.2, this induces a negative correlation between $\varepsilon_i$ and $v_i$ and, consequently, a downward bias in the OLS estimator for $\gamma$. Finally, if the individual specific returns to schooling ($\gamma_i$) are higher for individuals with low levels of schooling, the unobserved component $(\gamma_i - \gamma) S_i$ will be negatively correlated with $S_i$, which, again, induces a downward bias in the OLS estimator.
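The downward bias from measurement error can be illustrated in the same way. The sketch below assumes classical measurement error (noise that is independent of true schooling and of the wage error) and hypothetical parameter values; the helper `ols_slope` is ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
gamma = 0.08  # hypothetical true return to schooling

true_schooling = rng.normal(13.0, 2.5, size=n)
log_wage = 1.0 + gamma * true_schooling + rng.normal(scale=0.3, size=n)

# Reported schooling = true schooling + classical measurement error (sd 1).
reported_schooling = true_schooling + rng.normal(scale=1.0, size=n)

def ols_slope(y, x):
    """Slope from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

slope_true = ols_slope(log_wage, true_schooling)
slope_reported = ols_slope(log_wage, reported_schooling)

# Attenuation factor under classical measurement error:
# var(true) / (var(true) + var(error)).
attenuation = 2.5**2 / (2.5**2 + 1.0**2)
print(f"slope with true schooling:     {slope_true:.4f}")
print(f"slope with reported schooling: {slope_reported:.4f}")
print(f"gamma * attenuation factor:    {gamma * attenuation:.4f}")
```

The slope based on reported schooling is pulled towards zero by roughly the attenuation factor, which is the downward bias described above.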
In the above formulation there are no instruments available for schooling as all potential candidates are included in the wage equation. Put differently, the number of moment conditions in
$$
E\{\varepsilon_i z_i\} = E\{(w_i - z_i \beta - \gamma S_i) z_i\} = 0
$$
is one short to identify $\beta$ and $\gamma$. However, if we can think of a variable in $z_i$ ($z_{2i}$, say) that affects schooling but not wages, this variable can be excluded from the wage equation so as to reduce the number of unknown parameters by 1, thereby making the model exactly identified. In this case the instrumental variables estimator for $\beta$ and $\gamma$,⁹ using $z_{2i}$ as an instrument, is a consistent estimator.
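In the exactly identified case the instrumental variables estimator has the closed form $(Z'X)^{-1}Z'y$, where $X$ collects the regressors of the wage equation and $Z$ replaces schooling by the excluded instrument. The sketch below applies this formula to simulated data; the data-generating process, the binary cost shifter `z2` and the function name `iv_estimator` are all hypothetical and serve only to illustrate the mechanics.

```python
import numpy as np

def iv_estimator(y, X, Z):
    """Exactly identified IV estimator: solves (Z'X) b = Z'y."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

rng = np.random.default_rng(1)
n = 200_000
gamma = 0.08  # hypothetical true return to schooling

ability = rng.normal(size=n)                      # unobserved
z2 = rng.binomial(1, 0.5, size=n).astype(float)   # affects schooling, not wages
schooling = 12 + 1.0 * z2 + 1.5 * ability + rng.normal(scale=2.0, size=n)
log_wage = 1.0 + gamma * schooling + 0.1 * ability + rng.normal(scale=0.3, size=n)

const = np.ones(n)
X = np.column_stack([const, schooling])  # regressors in the wage equation
Z = np.column_stack([const, z2])         # instruments (z2 excluded from X)

b_ols = np.linalg.lstsq(X, log_wage, rcond=None)[0]
b_iv = iv_estimator(log_wage, X, Z)
print(f"OLS: {b_ols[1]:.4f}   IV: {b_iv[1]:.4f}   true: {gamma:.4f}")
```

Because `z2` is generated independently of ability, it satisfies the exclusion restriction by construction; in real data this is exactly the assumption that cannot be verified in an exactly identified model.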
A continuing discussion in labour economics is the question as to which variable can legitimately serve as an instrument. Typically, an instrument is thought of as a variable that affects the costs of schooling (and thus the choice of schooling) but not earnings.
There is a long tradition of using family background variables, for example the number of siblings or parents’ education, as instruments. As Card (1999) notes, the interest in family background is driven by the fact that children’s schooling choices are highly correlated with the characteristics of their parents. More recently, institutional factors of the schooling system are exploited as potential instruments. For example, Angrist and Krueger (1991) use an individual’s quarter of birth as an instrument for schooling. Using an extremely large data set of men born from 1930 to 1959, they find that people with birth dates earlier in the year have slightly less schooling than those born later in the year.
Assuming that quarter of birth is independent of unobservable taste and ability factors, it can be used as an instrument to estimate the returns to schooling. Card (1995) uses the presence of a nearby college as an instrument that can validly be excluded from the wage equation. Students who grow up in an area without a college face a higher cost of college education, while one would expect that higher costs, on average, reduce the years of schooling, particularly in low-income families. Evans and Montgomery (1994) and Dickson (2013), among others, use early smoking habits as an instrument for schooling.
They argue that the choice to smoke at a young age is related to an individual’s rate of time preference and therefore correlated with schooling, making the instrument relevant.
Moreover, smoking behaviour is unlikely to have a direct impact on a person’s earnings at higher ages, which – when true – would make the instrument exogenous.
⁹ Note that $z_{2i}$ is excluded from the wage equation so that the element in $\beta$ corresponding to $z_{2i}$ is set to zero.
ILLUSTRATION: ESTIMATING THE RETURNS TO SCHOOLING 159
In this section we use data on 3010 men taken from the US National Longitudinal Survey of Young Men, also employed in Card (1995). In this panel survey, a group of individuals was followed from 1966 when they were aged 14–24, and interviewed in a number of consecutive years. The labour market information that we use covers 1976.
In this year, the average years of schooling in this sample is somewhat more than 13 years, with a maximum of 18. Average experience in 1976, when this group of men was between 24 and 34 years old, is 8.86 years, while the average hourly raw wage is $5.77.
Table 5.1 reports the results of an OLS regression of an individual’s log hourly wage upon years of schooling, experience and experience-squared and three dummy variables indicating whether the individual was black, lived in a metropolitan area (smsa) and lived in the south. The OLS estimator implies estimated average returns to schooling of approximately 7.4% per year.¹⁰ The inclusion of additional variables, like region of residence in 1966 and family background characteristics, in some cases significantly improved the model but hardly affected the coefficients for the variables reported in Table 5.1 (see Card, 1995), so that we shall continue with this fairly simple specification.
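The 7.4% figure uses the approximation that a coefficient $b$ in a log-linear model corresponds to a relative difference of about $100b$ per cent. The exact relative difference is $\exp(b) - 1$, and the following few lines compare the two for the schooling coefficient in Table 5.1 (the helper name `pct_effect` is ours).

```python
import math

def pct_effect(coef):
    """Exact percentage difference implied by a log-linear coefficient."""
    return 100 * (math.exp(coef) - 1)

b_schooling = 0.0740  # OLS estimate from Table 5.1
approx_pct = 100 * b_schooling
exact_pct = pct_effect(b_schooling)
print(f"approximate: {approx_pct:.2f}%   exact: {exact_pct:.2f}%")
```

For coefficients of this size the approximation error is a fraction of a percentage point, so the simpler reading is harmless here.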
If schooling is endogenous, then experience and its square are by construction also endogenous: experience is measured as potential experience (age minus schooling minus 6), and age is not a choice variable and therefore unambiguously exogenous. This means that our linear model may suffer from three endogenous regressors so that we need (at least) three instruments. For experience and its square, age and age-squared are obvious candidates. As discussed previously, for schooling the solution is less trivial. Card (1995) argues that the presence of a nearby college in 1966 may provide a valid instrument. A necessary (but not sufficient) condition for this is that college proximity in 1966 affects the schooling variable, conditional upon the other exogenous variables. To see whether this is the case, we estimate a reduced form, where schooling is explained by age and age-squared, the three dummy variables from the wage equation and a dummy indicating whether an individual lived near a college in 1966. The results, by OLS, are reported in Table 5.2. Recall that this reduced form is not an economic or causal model to explain schooling choice. It is just a statistical reduced form corresponding to the best linear approximation of schooling.
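The endogeneity of experience can be verified in one line, assuming (as the phrase ‘by construction’ suggests) that experience is potential experience, $E_i = \text{age}_i - S_i - 6$, and that age is uncorrelated with the error term:

$$
\operatorname{cov}\{E_i, \varepsilon_i\}
 = \operatorname{cov}\{\text{age}_i, \varepsilon_i\} - \operatorname{cov}\{S_i, \varepsilon_i\}
 = -\operatorname{cov}\{S_i, \varepsilon_i\}.
$$

Experience is thus correlated with the error term whenever schooling is, with the opposite sign. Because $E_i^2$ is also a function of $S_i$, it will in general be correlated with $\varepsilon_i$ as well.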
Table 5.1  Wage equation estimated by OLS
Dependent variable: log(wage)

Variable      Estimate   Standard error   t-ratio
constant        4.7337           0.0676    70.022
schooling       0.0740           0.0035    21.113
exper           0.0836           0.0066    12.575
exper$^2$      −0.0022           0.0003    −7.050
black          −0.1896           0.0176   −10.758
smsa            0.1614           0.0156    10.365
south          −0.1249           0.0151    −8.259

$s = 0.374$, $R^2 = 0.2905$, $\bar{R}^2 = 0.2891$, $F = 204.93$

¹⁰ Because the dependent variable is in logs, a coefficient of 0.074 corresponds to a relative difference of approximately 7.4%; see Chapter 3.

Table 5.2  Reduced form for schooling, estimated by OLS
Dependent variable: schooling

Variable             Estimate   Standard error   t-ratio
constant              −1.8695           4.2984    −0.435
age                    1.0614           0.3014     3.522
age$^2$               −0.0188           0.0052    −3.386
black                 −1.4684           0.1154   −12.719
smsa                   0.8354           0.1093     7.647
south                 −0.4597           0.1024    −4.488
lived near college     0.3471           0.1070     3.244

$s = 2.5158$, $R^2 = 0.1185$, $\bar{R}^2 = 0.1168$, $F = 67.29$

The fact that the lived near college dummy is significant in this reduced form is reassuring. It indicates that, ceteris paribus, students who lived near a college in 1966 have on average 0.35 years more schooling. Recall that a valid instrument is required to be exogenous and relevant. Relevance requires that the candidate instrument is correlated with schooling but not a linear combination of the other variables of the model. This can be checked by evaluating the reduced form. Exogeneity of the instrument requires that it is uncorrelated with the error term in the wage equation and cannot be tested. It would only be possible to test for such a correlation if we have a consistent estimator for $\beta$ and $\gamma$ first, but we can only find a consistent estimator if we impose that our instrument is valid. Accordingly, the exogeneity of instruments can only be tested, to some extent, if the model is overidentified; see Section 5.6. In the present case we need to trust economic arguments, rather than statistical ones, to rely upon the instrument that is chosen.
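The relevance check can be summarized in a single number. With one excluded instrument, the F-statistic for its exclusion from the reduced form equals the squared t-ratio, so it can be computed directly from Table 5.2. The sketch below does this arithmetic. A commonly cited rule of thumb treats a value below about 10 as a warning sign of a weak instrument; this threshold is a convention rather than a formal test, and with three endogenous regressors the single-equation F is only a rough guide.

```python
# Reduced-form estimates for the 'lived near college' dummy (Table 5.2).
coef = 0.3471
std_err = 0.1070

t_ratio = coef / std_err   # t-ratio on the excluded instrument
f_stat = t_ratio ** 2      # with one excluded instrument, F = t^2

print(f"t-ratio = {t_ratio:.3f}, F = {f_stat:.2f}")
```

The resulting value lies only just above 10, which suggests that college proximity is relevant but not a particularly strong instrument.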
Using age, age-squared and the lived near college dummy as instruments for experience, experience-squared and schooling,¹¹ we obtain the estimation results reported in Table 5.3. The estimated returns to schooling are over 13%, with a relatively large standard error of somewhat more than 5%. Although the estimate is substantially higher than the OLS one, its inaccuracy is such that this difference could just be due to sampling error. Nevertheless, the value of the IV estimate is fairly robust to changes in the specification (e.g. the inclusion of regional indicators or family background variables).

Table 5.3  Wage equation estimated by IV
Dependent variable: log(wage)

Variable      Estimate   Standard error   t-ratio
constant        4.0656           0.6085     6.682
schooling       0.1329           0.0514     2.588
exper           0.0560           0.0260     2.153
exper$^2$      −0.0008           0.0013    −0.594
black          −0.1031           0.0774    −1.333
smsa            0.1080           0.0497     2.171
south          −0.0982           0.0288    −3.413

Instruments: age, age$^2$ and lived near college, used for: exper, exper$^2$ and schooling

¹¹ Although the formulation suggests otherwise, it is not the case that instruments have a one-to-one correspondence with the endogenous regressors. Implicitly, all instruments are jointly used for all variables.
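The remark that the gap between the IV and OLS estimates could be due to sampling error can be checked with some back-of-the-envelope arithmetic on the reported schooling coefficients. The sketch below computes a Hausman-type statistic for this single coefficient, assuming that under the null of exogenous schooling OLS is efficient, so that the variance of the difference equals the difference of the variances. It ignores the experience terms and should be read as a rough illustration, not as a formal test of the full model.

```python
import math

b_ols, se_ols = 0.0740, 0.0035   # schooling, Table 5.1
b_iv, se_iv = 0.1329, 0.0514     # schooling, Table 5.3

diff = b_iv - b_ols

# Under the null (and homoskedastic errors), OLS is efficient, so
# Var(b_iv - b_ols) = Var(b_iv) - Var(b_ols).
se_diff = math.sqrt(se_iv**2 - se_ols**2)
hausman_t = diff / se_diff

print(f"difference = {diff:.4f}, s.e. = {se_diff:.4f}, ratio = {hausman_t:.2f}")
```

The ratio is well below conventional critical values such as 1.96, consistent with the statement that the difference is not statistically significant.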
The fact that the IV estimator suffers from such large standard errors is due to the fairly low correlation between the instruments and the endogenous regressors. This is reflected in the $R^2$ of the reduced form for schooling, which is only 0.1185.¹² Although in general the instrumental variables estimator is less accurate than the OLS estimator (which may be inconsistent), the loss in efficiency is particularly large if the instruments are only weakly correlated with the endogenous regressors.
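The link between instrument strength and precision can be seen in a small Monte Carlo experiment. The sketch below uses a hypothetical data-generating process with one endogenous regressor and one binary instrument, and compares the sampling spread of the IV estimator when the first-stage coefficient `pi` is small and when it is large. All parameter values and function names are invented for illustration.

```python
import numpy as np

def iv_slope(y, s, z):
    """IV slope with one regressor and one instrument: cov(z,y)/cov(z,s)."""
    return np.cov(z, y)[0, 1] / np.cov(z, s)[0, 1]

def simulate_spread(pi, reps=500, n=5000, gamma=0.08, seed=0):
    """Standard deviation of the IV slope across simulated samples."""
    rng = np.random.default_rng(seed)
    draws = np.empty(reps)
    for r in range(reps):
        ability = rng.normal(size=n)                    # unobserved
        z = rng.binomial(1, 0.5, size=n).astype(float)  # instrument
        s = 12 + pi * z + ability + rng.normal(size=n)  # first stage
        y = 1.0 + gamma * s + 0.1 * ability + rng.normal(scale=0.3, size=n)
        draws[r] = iv_slope(y, s, z)
    return draws.std()

sd_weak = simulate_spread(pi=0.3)    # instrument shifts schooling a little
sd_strong = simulate_spread(pi=1.5)  # instrument shifts schooling a lot
print(f"sd of IV estimate, weaker instrument:   {sd_weak:.4f}")
print(f"sd of IV estimate, stronger instrument: {sd_strong:.4f}")
```

In this design the asymptotic standard deviation is inversely proportional to `pi`, so the weaker instrument yields a spread roughly five times as large with the same sample size.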
Table 5.3 does not report any goodness-of-fit statistics. The reason is that there is no unique definition of an $R^2$ or adjusted $R^2$ if the model is not estimated by ordinary least squares. More importantly, the fact that we estimate the model by instrumental variables methods indicates that goodness-of-fit is not what we are after. Our goal was to obtain a consistent estimator for the causal effect of schooling upon earnings, and that is exactly what instrumental variables methods are trying to do. Again, this reflects that the $R^2$ plays no role whatsoever in comparing alternative estimators.
If college proximity is to be a valid instrument for schooling, it has to be the case that it has no direct effect on earnings. As with most instruments, this is a point of discussion (see Card, 1995). For example, it is possible that families that place a strong emphasis on education choose to live near a college, while children of such families have a higher ‘ability’ or are more motivated to achieve labour market success (as measured by earnings). Unfortunately, as said before, the current, exactly identified, specification does not allow us to test the exogeneity of the instruments.
The fact that the IV estimate of the returns to schooling is higher than the OLS one suggests that OLS underestimates the true causal effect of schooling. This is at odds with the most common argument against the exogeneity of schooling, namely ‘ability bias’, but in line with the more recent empirical studies on the returns to schooling (including, for example, Angrist and Krueger, 1991). The downward bias of OLS could be due to measurement error, or – as argued by Card (1995) – to the possibility that the true returns to schooling vary across individuals, negatively related to schooling. A model where the returns to schooling are heterogeneous, and where individuals make educational choices comparing their individual returns and costs, is obviously more involved than (5.52); see Card (1999) or Heckman (2001). In such a model, an instrumental variables estimator is typically inconsistent for estimating the ‘average return to schooling’ for the entire population. However, it can be argued to estimate the average return to schooling for a person whose schooling was influenced by the instrument, that is, for a person who acquires more schooling because he or she lives near college. This is known as the ‘local average treatment effect’ (Imbens and Angrist, 1994). This interpretation, however, still requires the instrument to be both exogenous and relevant. Section 7.7 discusses the estimation of average treatment effects in more detail. Carneiro and Heckman (2002) claim that the literature on estimating the returns to schooling is plagued by bad instruments. In particular, they demonstrate that some often-used instruments, like distance to college and number of siblings, are correlated with proxies for innate ability.
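The local average treatment effect interpretation can be made concrete with a stylized simulation. In the sketch below, which relies entirely on invented numbers, 30% of individuals are ‘compliers’ who obtain one extra year of schooling if they grow up near a college, and their return to schooling is assumed to be higher than that of everyone else. The instrument is assigned at random, so it is exogenous by construction.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000

# Hypothetical population: compliers respond to college proximity.
complier = rng.random(n) < 0.3
near_college = rng.random(n) < 0.5   # instrument, randomly assigned

# Heterogeneous returns (illustrative values): higher for compliers.
gamma_i = np.where(complier, 0.12, 0.06)

base_schooling = rng.normal(12.0, 2.0, size=n)
schooling = base_schooling + (complier & near_college)  # +1 year for treated compliers
log_wage = 1.0 + gamma_i * schooling + rng.normal(scale=0.3, size=n)

# IV (Wald) estimator with a binary instrument: ratio of mean differences.
dw = log_wage[near_college].mean() - log_wage[~near_college].mean()
ds = schooling[near_college].mean() - schooling[~near_college].mean()
iv_estimate = dw / ds

average_return = gamma_i.mean()
complier_return = gamma_i[complier].mean()
print(f"IV estimate:             {iv_estimate:.3f}")
print(f"average return (all):    {average_return:.3f}")
print(f"average return (compl.): {complier_return:.3f}")
```

In this setup the IV estimate tracks the return for compliers rather than the population average, which is one way in which an IV estimate can exceed the OLS estimate without contradicting the ability bias argument.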
¹² The $R^2$s for the reduced forms for experience and experience-squared (not reported) are both larger than 0.60.