I. The regression model is linear, is correctly specified, and has an additive error term.


Yi = β0 + β1X1i + β2X2i + ⋯ + βKXKi + εi    (4.1)

The assumption that the regression model is linear¹ does not require the underlying theory to be linear. For example, an exponential function:

Yi = e^β0 · Xi^β1 · e^εi    (4.2)

where e is the base of the natural log, can be transformed by taking the natural log of both sides of the equation:

ln(Yi) = β0 + β1 ln(Xi) + εi    (4.3)

1. The Classical Assumption that the regression model is “linear” technically requires the model to be “linear in the coefficients.” You’ll learn what it means for a model to be linear in the coefficients, particularly in contrast to being linear in the variables, in Section 7.2. We’ll cover the application of regression analysis to equations that are nonlinear in the variables in that same section, but the application of regression analysis to equations that are nonlinear in the coefficients is beyond the scope of this textbook.

The Classical Assumptions

I. The regression model is linear, is correctly specified, and has an additive error term.

II. The error term has a zero population mean.

III. All explanatory variables are uncorrelated with the error term.

IV. Observations of the error term are uncorrelated with each other (no serial correlation).

V. The error term has a constant variance (no heteroskedasticity).

VI. No explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity).

VII. The error term is normally distributed (this assumption is optional but usually is invoked).



If the variables are relabeled as Yi* = ln(Yi) and Xi* = ln(Xi), then the form of the equation becomes linear:

Yi* = β0 + β1Xi* + εi    (4.4)

In Equation 4.4, the properties of the OLS estimator of the βs still hold because the equation is linear.
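
As an illustration (not from the text), the sketch below simulates data from the exponential form in Equation 4.2 and then estimates the log-transformed Equation 4.4 by ordinary least squares; the coefficient values and variable ranges are assumptions chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
beta0, beta1 = 1.5, 0.8                          # assumed "true" coefficients for the demo

X = rng.uniform(1.0, 10.0, size=n)               # explanatory variable
eps = rng.normal(0.0, 0.2, size=n)               # stochastic error term in the exponent
Y = np.exp(beta0) * X**beta1 * np.exp(eps)       # exponential form, as in Equation 4.2

# Take logs of both sides (Equation 4.3) so the model is linear in the coefficients,
# then estimate the relabeled Equation 4.4 by ordinary least squares.
Y_star, X_star = np.log(Y), np.log(X)
A = np.column_stack([np.ones(n), X_star])        # constant term plus ln(X)
coef, *_ = np.linalg.lstsq(A, Y_star, rcond=None)

print(coef)                                      # estimates close to [1.5, 0.8]
```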

Two additional properties also must hold. First, we assume that the equation is correctly specified. If an equation has an omitted variable or an incorrect functional form, the odds are against that equation working well.

Second, we assume that a stochastic error term has been added to the equation. This error term must be an additive one and cannot be multiplied by or divided into any of the variables in the equation.

II. The error term has a zero population mean. As was pointed out in Section 1.2, econometricians add a stochastic (random) error term to regression equations to account for variation in the dependent variable that is not explained by the model. The specific value of the error term for each observation is determined purely by chance. Probably the best way to picture this concept is to think of each observation of the error term as being drawn from a random variable distribution such as the one illustrated in Figure 4.1.

Classical Assumption II says that the mean of this distribution is zero.

Figure 4.1  An Error Term Distribution with a Mean of Zero

Observations of stochastic error terms are assumed to be drawn from a random variable distribution with a mean of zero. If Classical Assumption II is met, the expected value (the mean) of the error term is zero. (The figure shows the probability distribution of the error term ε, centered on zero and running from negative to positive values.)

That is, when the entire population of possible values for the stochastic error term is considered, the average value of that population is zero. For a small sample, it is not likely that the mean is exactly zero, but as the size of the sample approaches infinity, the mean of the sample approaches zero.
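
A minimal sketch of that limiting behavior (the normal distribution and its standard deviation are assumptions made only for the illustration): the sample mean of the drawn error terms is rarely exactly zero in a small sample but settles toward zero as the sample size grows.

```python
import numpy as np

rng = np.random.default_rng(42)
for n in (10, 1_000, 100_000):
    errors = rng.normal(loc=0.0, scale=2.0, size=n)   # population mean of the error term is zero
    print(n, errors.mean())                           # sample mean drifts toward 0 as n grows
```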

What happens if the mean doesn't equal zero in a sample? As long as you have a constant term in the equation, the estimate of β0 will absorb the nonzero mean. In essence, the constant term equals the fixed portion of Y that cannot be explained by the independent variables, and the error term equals the stochastic portion of the unexplained value of Y.

Although it's true that the error term can never be observed, it's instructive to pretend that we can do so to see how the constant term absorbs the nonzero mean of the error term in a sample. Consider a typical regression equation:

Yi = β0 + β1Xi + εi    (4.5)

Suppose that the mean of εi is 3 instead of 0; then² E(εi − 3) = 0. If we add 3 to the constant term and subtract it from the error term, we obtain:

Yi = (β0 + 3) + β1Xi + (εi − 3)    (4.6)

Since Equations 4.5 and 4.6 are equivalent (do you see why?), and since E(εi − 3) = 0, Equation 4.6 can be written in a form that has a zero mean for the error term εi*:

Yi = β0* + β1Xi + εi*    (4.7)

where β0* = β0 + 3 and εi* = εi − 3. As can be seen, Equation 4.7 conforms to Assumption II. In essence, if Classical Assumption II is violated in an equation that includes a constant term, then the estimate of β0 absorbs the nonzero mean of the error term, and the estimates of the other coefficients are unaffected.

2. Here, as in Chapter 1, the "E" refers to the expected value (mean) of the item in parentheses after it. Thus E(εi − 3) equals the expected value of the stochastic error term εi minus 3. In this specific example, since we've defined E(εi) = 3, we know that E(εi − 3) = 0. One way to think about expected value is as our best guess of the long-run average value a specific random variable will have.
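
The following simulation sketch mirrors Equations 4.5 through 4.7 (the coefficient values and error distribution are invented for the illustration): when the error term has a mean of 3 rather than 0, the estimated intercept is pulled up by roughly 3 while the estimated slope is essentially unaffected.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
beta0, beta1 = 2.0, 0.5                          # assumed true coefficients

X = rng.uniform(0.0, 10.0, size=n)
eps = rng.normal(loc=3.0, scale=1.0, size=n)     # violates Assumption II: E(eps) = 3, not 0
Y = beta0 + beta1 * X + eps                      # Equation 4.5 with a nonzero-mean error term

A = np.column_stack([np.ones(n), X])
(b0_hat, b1_hat), *_ = np.linalg.lstsq(A, Y, rcond=None)

print(b0_hat)   # close to beta0 + 3 = 5.0: the constant term absorbs the nonzero mean
print(b1_hat)   # still close to beta1 = 0.5: the slope estimate is unaffected
```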

III. All explanatory variables are uncorrelated with the error term. It is assumed that the observed values of the explanatory variables are independent of the values of the error term.

If an explanatory variable and the error term were instead correlated with each other, the OLS estimates would be likely to attribute to the X some of the variation in Y that actually came from the error term. If the error term and X were positively correlated, for example, then the estimated coefficient would probably be higher than it would otherwise have been (biased upward), because the OLS program would mistakenly attribute the variation in Y caused by ε to X instead. As a result, it's important to ensure that the explanatory variables are uncorrelated with the error term.

Classical Assumption III is violated most frequently when a researcher omits an important independent variable from an equation.³ As you learned in Chapter 1, one of the major components of the stochastic error term is omitted variables, so if a variable has been omitted, then the error term will change when the omitted variable changes. If this omitted variable is correlated with an included independent variable (as often happens in economics), then the error term is correlated with that independent variable as well. We have violated Assumption III! Because of this violation, OLS will attribute the impact of the omitted variable to the included variable, to the extent that the two variables are correlated.

3. Another important economic application that violates this assumption is any model that is simultaneous in nature. This will be considered in Chapter 14.
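
A hedged simulation of the omitted-variable violation described above (all variable names and coefficient values are hypothetical): the true equation contains two correlated explanatory variables, but the estimated equation leaves one out, so OLS credits the included variable with part of the omitted variable's effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
beta0, beta1, beta2 = 1.0, 2.0, 1.5              # assumed true coefficients

X1 = rng.normal(size=n)
X2 = 0.8 * X1 + rng.normal(scale=0.6, size=n)    # X2 is correlated with X1
eps = rng.normal(size=n)
Y = beta0 + beta1 * X1 + beta2 * X2 + eps        # true data-generating process

# Misspecified regression: X2 is omitted, so its effect sits in the error term,
# which is then correlated with X1, violating Assumption III.
A = np.column_stack([np.ones(n), X1])
(b0_hat, b1_hat), *_ = np.linalg.lstsq(A, Y, rcond=None)

print(b1_hat)   # well above 2.0 (near 3.2 here): OLS credits X1 with part of X2's impact
```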

IV. Observations of the error term are uncorrelated with each other. The observations of the error term are drawn independently from each other. If a systematic correlation exists between one observation of the error term and another, then OLS estimates will be less precise than estimates that account for the correlation. For example, if the fact that the ε from one observation is positive increases the probability that the ε from another observation also is positive, then the two observations of the error term are positively correlated. Such a correlation would violate Classical Assumption IV.

In economic applications, this assumption is most important in time-series models. In such a context, Assumption IV says that an increase in the error term in one time period (a random shock, for example) does not show up in or affect in any way the error term in another time period. In some cases, though, this assumption is unrealistic, since the effects of a random shock sometimes last for a number of time periods. For example, a natural disaster like the 2015 earthquake in Nepal will have a negative impact on a region long after the time period in which it was truly a random event. If, over all the observations of the sample, εt+1 is correlated with εt, then the error term is said to be serially correlated (or autocorrelated), and Assumption IV is violated.

Violations of this assumption are considered in more detail in Chapter 9.
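
As a rough sketch of what serially correlated errors look like (an AR(1) error process with an assumed persistence of 0.7, not taken from the text): each period's error term carries over part of the previous period's shock, so adjacent observations of the error term are positively correlated.

```python
import numpy as np

rng = np.random.default_rng(3)
T, rho = 500, 0.7                                # assumed persistence of the random shock

eps = np.zeros(T)
shocks = rng.normal(size=T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + shocks[t]        # this period's error keeps part of last period's

# Correlation between successive error terms is clearly positive, violating Assumption IV.
print(np.corrcoef(eps[:-1], eps[1:])[0, 1])
```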

V. The error term has a constant variance. The variance (or dispersion) of the distribution from which the observations of the error term are drawn is constant.⁴ That is, the observations of the error term are assumed to be drawn continually from identical distributions (for example, the one pictured in Figure 4.1). The alternative would be for the variance of the distribution of the error term to change for each observation or range of observations. In Figure 4.2, for example, the variance of the error term is shown to increase as the variable Z increases; such a pattern violates Classical Assumption V.

4. This is a simplification. The actual assumption (that error terms have positive finite second moments) is equivalent to this simplification in all but a few extremely rare cases.

The actual values of the error term are not directly observable, but the lack of a constant variance for the distribution of the error term causes OLS to generate inaccurate estimates of the standard error of the coefficients.⁵

For example, suppose that you're studying the amount of money that the 50 states spend on education. New York and California are more heavily populated than New Hampshire and Nevada, so it's probable that the variance of the error term for big states is larger than it is for small states. The amount of unexplained variation in educational expenditures seems likely to be larger in big states like New York than in small states like New Hampshire.

5. Because some observations have errors with a large variance, those observations are not as reliable and so should be given less weight when minimizing the sum of squares. OLS, however, gives equal weight to each observation, so it will be less precise than estimators that weigh the observations more appropriately.

Figure 4.2  An Error Term Whose Variance Increases as Z Increases

One example of Classical Assumption V not being met is when the variance of the error term increases as Z increases. In such a situation (called heteroskedasticity), the observations are on average farther from the true regression line for large values of Z than they are for small values of Z. (The figure plots Y against Z: small ε's stay close to the true regression line at small values of Z, while large ε's scatter widely around it at large values of Z.)

The violation of Assumption V is referred to as heteroskedasticity and will be discussed in more detail in Chapter 10.
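
A minimal sketch in the spirit of Figure 4.2 (the functional form and numbers are assumptions made for the demonstration): the error variance grows with Z, so the residuals fan out at large values of Z even though the rest of the equation is well behaved.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
beta0, beta1 = 1.0, 0.5                          # assumed true coefficients

Z = rng.uniform(1.0, 100.0, size=n)
eps = rng.normal(loc=0.0, scale=0.1 * Z)         # error spread grows with Z: heteroskedasticity
Y = beta0 + beta1 * Z + eps

A = np.column_stack([np.ones(n), Z])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
resid = Y - A @ coef

# Residuals are far more dispersed for large Z than for small Z.
print(resid[Z < 20].std(), resid[Z > 80].std())
```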


