The two residual variances are calculated as s₁² = û₁′û₁/(T₁ − k) and s₂² = û₂′û₂/(T₂ − k), respectively. The null hypothesis is that the variances of the disturbances are equal, which can be written H₀: σ₁² = σ₂², against a two-sided alternative. The test statistic, denoted GQ, is simply the ratio of the two residual variances, for which the larger of the two variances must be placed in the numerator (i.e. s₁² is the higher sample variance for the sample with length T₁, even if it comes from the second subsample):

GQ = s_1^2 / s_2^2

The test statistic is distributed as an F(T₁ − k, T₂ − k) under the null hypothesis, and the null of a constant variance is rejected if the test statistic exceeds the critical value.
The GQ test is simple to construct but its conclusions may be contingent upon a particular, and probably arbitrary, choice of where to split the sample. Clearly, the test is likely to be more powerful when this choice is made on theoretical grounds – for example, before and after a major structural event.
Suppose that it is thought that the variance of the disturbances is related to some observable variable z_t (which may or may not be one of the regressors); a better way to perform the test would be to order the sample according to values of z_t (rather than through time), and then to split the reordered sample into T₁ and T₂.
An alternative method that is sometimes used to sharpen the inferences from the test and to increase its power is to omit some of the observations from the centre of the sample so as to introduce a degree of separation between the two subsamples.
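To make the mechanics concrete, here is a minimal sketch of the GQ procedure in Python. The data are simulated and all names (y, X, T1) are illustrative; statsmodels and scipy are assumed to be available.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(42)
T, k = 100, 2                                    # observations; regressors incl. constant
x = rng.normal(size=T)
u = rng.normal(scale=np.linspace(0.5, 2.0, T))   # error variance rises through the sample
y = 1.0 + 0.5 * x + u
X = sm.add_constant(x)

# Split the sample (here at the mid-point; in practice the split may be chosen
# on theoretical grounds, or the observations reordered by some variable z)
T1 = T // 2
T2 = T - T1
s1 = sm.OLS(y[:T1], X[:T1]).fit().ssr / (T1 - k)   # s1^2 = u1'u1 / (T1 - k)
s2 = sm.OLS(y[T1:], X[T1:]).fit().ssr / (T2 - k)   # s2^2 = u2'u2 / (T2 - k)

# Larger variance in the numerator; GQ ~ F(numerator df, denominator df) under the null
if s1 >= s2:
    gq, df1, df2 = s1 / s2, T1 - k, T2 - k
else:
    gq, df1, df2 = s2 / s1, T2 - k, T1 - k
p_value = 2 * stats.f.sf(gq, df1, df2)             # two-sided alternative
print(f"GQ = {gq:.3f}, p = {p_value:.4f}")
```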
A further popular test is White's (1980) general test for heteroscedasticity. The test is particularly useful because it makes few assumptions about the likely form of the heteroscedasticity. The test is carried out as in box 6.1.
Box 6.1 Conducting White’s test
(1) Assume that the regression model estimated is of the standard linear form – e.g.

y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t    (6.2)

To test var(u_t) = σ², estimate the model above, obtaining the residuals, û_t.
(2) Then run the auxiliary regression

\hat{u}_t^2 = \alpha_1 + \alpha_2 x_{2t} + \alpha_3 x_{3t} + \alpha_4 x_{2t}^2 + \alpha_5 x_{3t}^2 + \alpha_6 x_{2t} x_{3t} + v_t    (6.3)

where v_t is a normally distributed disturbance term independent of u_t.
This regression is of the squared residuals on a constant, the original explanatory variables, the squares of the explanatory variables and their cross-products. To see why the squared residuals are the quantity of interest, recall that, for a random variable u_t, the variance can be written

\text{var}(u_t) = E[(u_t - E(u_t))^2]

Under the assumption that E(u_t) = 0, the second part of the RHS of this expression disappears:

\text{var}(u_t) = E[u_t^2]

The disturbances u_t are not observable, so their sample counterparts, the squared residuals, are used instead.
The reason that the auxiliary regression takes this form is that it is desirable to investigate whether the variance of the residuals (embodied in û_t²) varies systematically with any known variables relevant to the model. Relevant variables will include the original explanatory variables, their squared values and their cross-products. Note also that this regression should include a constant term, even if the original regression did not. This is a result of the fact that û_t² will always have a non-zero mean, even if û_t has a zero mean.
(3) Given the auxiliary regression, as stated above, the test can be conducted using two different approaches. First, it is possible to use the F-test framework described in chapter 5. This would involve estimating (6.3) as the unrestricted regression and then running a restricted regression of û_t² on a constant only. The RSS from each specification would then be used as inputs to the standard F-test formula.
With many diagnostic tests, an alternative approach can be adopted that does not require the estimation of a second (restricted) regression. This approach is known as a Lagrange multiplier (LM) test, which centres around the value of R² for the auxiliary regression. If one or more coefficients in (6.3) is statistically significant, the value of R² for that equation will be relatively high, whereas if none of the variables is significant R² will be relatively low. The LM test would thus operate by obtaining R² from the auxiliary regression and multiplying it by the number of observations, T. It can be shown that

TR^2 \sim \chi^2(m)

where m is the number of regressors in the auxiliary regression (excluding the constant term), equivalent to the number of restrictions that would have to be placed under the F-test approach.
(4) The test is one of the joint null hypothesis that α₂ = 0 and α₃ = 0 and α₄ = 0 and α₅ = 0 and α₆ = 0. For the LM test, if the χ² test statistic from step 3 is greater than the corresponding value from the statistical table, then reject the null hypothesis that the errors are homoscedastic.
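As an illustration of the steps in box 6.1, the following sketch runs the auxiliary regression by hand on simulated data and forms both versions of the test; all variable names are illustrative. (statsmodels also ships a packaged version of this test, het_white, in statsmodels.stats.diagnostic.)

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
T = 200
x2 = rng.normal(size=T)
x3 = rng.normal(size=T)
u = rng.normal(size=T) * (1.0 + np.abs(x2))        # heteroscedastic disturbances
y = 1.0 + 0.5 * x2 - 0.3 * x3 + u

# Step 1: estimate the original model and save the residuals
X = sm.add_constant(np.column_stack([x2, x3]))
u_hat = sm.OLS(y, X).fit().resid

# Step 2: auxiliary regression of squared residuals on a constant, the levels,
# the squares and the cross-product, as in equation (6.3)
Z = sm.add_constant(np.column_stack([x2, x3, x2**2, x3**2, x2 * x3]))
aux = sm.OLS(u_hat**2, Z).fit()

# Step 3a: F-test version -- the regression F-statistic of the auxiliary model
# is exactly the test of all slope coefficients being jointly zero
print(f"F = {aux.fvalue:.3f}, p = {aux.f_pvalue:.4f}")

# Step 3b: LM version -- TR^2 ~ chi^2(m), m = regressors excluding the constant
m = Z.shape[1] - 1
lm = T * aux.rsquared
print(f"TR^2 = {lm:.3f}, p = {stats.chi2.sf(lm, m):.4f}")
```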
For the estimated equation (6.6), the summary statistics are: R² = 0.58; adj. R² = 0.55; residual sum of squares = 1,078.26.
We apply the White test described earlier to examine whether the residuals of this equation are heteroscedastic. We first use the F-test framework. For this, we run the auxiliary regression (unrestricted) – equation (6.7) – and the restricted equation on the constant only, and we obtain the residual sums of squares from each regression (the unrestricted RSS and the restricted RSS). The results for the unrestricted and restricted auxiliary regressions are given below.
Unrestricted regression: R² = 0.24; T = 28; URSS = 61,912.21. The number of regressors k, including the constant, is six.
Restricted regression (squared residuals regressed on a constant): RRSS = 81,978.35. The number of restrictions m is five (all coefficients are assumed to equal zero except the coefficient on the constant). Applying the standard F-test formula, we obtain the test statistic

\frac{81{,}978.35 - 61{,}912.21}{61{,}912.21} \times \frac{28 - 6}{5} = 1.43

The critical value at the 5 per cent level is F(5, 22) = 2.66. The computed F-test statistic is clearly lower than the critical value at the 5 per cent level, and we therefore do not reject the null hypothesis (as an exercise, consider whether we would still reject the null hypothesis if we used a 10 per cent significance level).
On the basis of this test, we conclude that heteroscedasticity is not present in the residuals of equation (6.6). Some econometric software packages report the computed F-test statistic along with the associated probability value, in which case it is not necessary to calculate the test statistic manually. For example, suppose that we ran the test using a software package and obtained a p-value of 0.25. This probability is higher than 0.05, denoting that there is no pattern of heteroscedasticity in the residuals of equation (6.6). To reject the null, the probability should have been equal to or less than 0.05 if a 5 per cent significance level were used, or 0.10 if a 10 per cent significance level were used.
For the chi-squared version of the test, we obtain TR² = 28 × 0.24 = 6.72. This test statistic follows a χ²(5) under the null hypothesis. The 5 per cent critical value from the χ² table is 11.07. The computed test statistic is clearly less than the critical value, and hence the null hypothesis is not rejected. We conclude, as with the F-test earlier, that there is no evidence of heteroscedasticity in the residuals of equation (6.6).
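As a quick check, both test statistics in this example can be reproduced directly from the reported quantities:

```python
# Figures quoted in the example above
rrss, urss = 81_978.35, 61_912.21
T, k, m = 28, 6, 5

f_stat = (rrss - urss) / urss * (T - k) / m   # standard F-test formula
tr2 = T * 0.24                                # LM statistic, TR^2

print(f"F = {f_stat:.2f}")    # about 1.43, below the 5% F(5, 22) critical value of 2.66
print(f"TR^2 = {tr2:.2f}")    # 6.72, below the 5% chi^2(5) critical value of 11.07
```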
6.5.2 Consequences of using OLS in the presence of heteroscedasticity
What happens if the errors are heteroscedastic, but this fact is ignored and the researcher proceeds with estimation and inference? In this case, OLS estimators will still give unbiased (and also consistent) coefficient estimates, but they are no longer BLUE – that is, they no longer have the minimum variance among the class of unbiased estimators. The reason is that the error variance, σ², plays no part in the proof that the OLS estimator is consistent and unbiased, but σ² does appear in the formulae for the coefficient variances. If the errors are heteroscedastic, the formulae presented for the coefficient standard errors no longer hold. For a very accessible algebraic treatment of the consequences of heteroscedasticity, see Hill, Griffiths and Judge (1997, pp. 217–18).
The upshot is that, if OLS is still used in the presence of heteroscedasticity, the standard errors could be wrong and hence any inferences made could be misleading. In general, the OLS standard errors will be too large for the intercept when the errors are heteroscedastic. The effect of heteroscedasticity on the slope standard errors will depend on its form. For example, if the variance of the errors is positively related to the square of an explanatory variable (which is often the case in practice), the OLS standard error for the slope will be too low. On the other hand, the OLS slope standard errors will be too big when the variance of the errors is inversely related to an explanatory variable.
6.5.3 Dealing with heteroscedasticity
If the form – i.e. the cause – of the heteroscedasticity is known, then an alternative estimation method that takes this into account can be used. One possibility is called generalised least squares (GLS). For example, suppose that the error variance was related to some other variable, z_t, by the expression

\text{var}(u_t) = \sigma^2 z_t^2

All that would be required to remove the heteroscedasticity would be to divide the regression equation through by z_t. For the model in (6.2), this would give

y_t / z_t = \beta_1 (1/z_t) + \beta_2 (x_{2t}/z_t) + \beta_3 (x_{3t}/z_t) + v_t

where v_t = u_t/z_t is now a homoscedastic disturbance, since var(v_t) = var(u_t)/z_t² = σ². GLS is also known as weighted least squares (WLS), since under GLS a weighted sum of the squared residuals is minimised, whereas under OLS it is an unweighted sum.
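A minimal sketch of this form of GLS: weighting each observation by 1/z_t² is equivalent to dividing the equation through by z_t. The data are simulated and the names are illustrative; statsmodels is assumed.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = 150
x = rng.normal(size=T)
z = np.exp(rng.normal(size=T))          # positive scale variable: var(u_t) = sigma^2 z_t^2
y = 1.0 + 0.5 * x + rng.normal(size=T) * z

X = sm.add_constant(x)

# OLS ignores the heteroscedasticity; WLS with weights 1/z_t^2 is the GLS
# estimator for this error structure
ols = sm.OLS(y, X).fit()
wls = sm.WLS(y, X, weights=1.0 / z**2).fit()

print("OLS std errors:", ols.bse.round(4))
print("WLS std errors:", wls.bse.round(4))
```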
Researchers are typically unsure of the exact cause of the heteroscedasticity, however, and hence this technique is usually infeasible in practice. Two other possible 'solutions' for heteroscedasticity are shown in box 6.2.

Box 6.2 'Solutions' for heteroscedasticity

(1) Transforming the variables into logs or reducing by some other measure of 'size'.
This has the effect of rescaling the data to 'pull in' extreme observations. The regression would then be conducted upon the natural logarithms or the transformed data. Taking logarithms also has the effect of making a previously multiplicative model, such as the exponential regression model discussed above (with a multiplicative error term), into an additive one. Logarithms of a variable cannot be taken in situations in which the variable can take on zero or negative values, however – for example, when the model includes percentage changes in a variable. The log will not be defined in such cases.
(2) Using heteroscedasticity-consistent standard error estimates. Most standard econometrics software packages have an option (usually called something such as 'robust') that allows the user to employ standard error estimates that have been modified to account for the heteroscedasticity following White (1980). The effect of using the correction is that, if the variance of the errors is positively related to the square of an explanatory variable, the standard errors for the slope coefficients are increased relative to the usual OLS standard errors, which would make hypothesis testing more 'conservative', so that more evidence would be required against the null hypothesis before it can be rejected.
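A sketch of option (2), assuming statsmodels is the software package: White's heteroscedasticity-consistent standard errors are requested through the cov_type argument when fitting.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1.0 + 0.5 * x + rng.normal(size=100) * (1 + x**2)   # heteroscedastic errors
X = sm.add_constant(x)

usual = sm.OLS(y, X).fit()                  # conventional OLS standard errors
robust = sm.OLS(y, X).fit(cov_type="HC0")   # White (1980) correction

print("usual :", usual.bse.round(4))
print("robust:", robust.bse.round(4))
```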
6.6 Autocorrelation

The third assumption that is made of the CLRM's disturbance terms is that the covariance between the error terms over time (or cross-sectionally, for this type of data) is zero. In other words, it is assumed that the errors are uncorrelated with one another. If the errors are not uncorrelated with one another, it would be stated that they are 'autocorrelated' or that they are 'serially correlated'. A test of this assumption is therefore required.

Again, the population disturbances cannot be observed, so tests for autocorrelation are conducted on the residuals, û. Before one can proceed to see how formal tests for autocorrelation are formulated, the concept of the lagged value of a variable needs to be defined.
6.6.1 The concept of a lagged value
The lagged value of a variable (which may be y_t, x_t or u_t) is simply the value that the variable took during a previous period. So, for example, the one-period lag of y_t, written y_{t−1}, is the value that y took in the immediately preceding period; table 6.1 illustrates its construction.

Table 6.1 Constructing a series of lagged values and first differences
The value in the 2006M10 row and the y_{t−1} column shows the value that y_t took in the previous period, 2006M09, which was 0.8. The last column in table 6.1 shows another quantity relating to y, namely the 'first difference'. The first difference of y, also known as the change in y, and denoted Δy_t, is calculated as the difference between the values of y in this period and in the previous period. This is calculated as

\Delta y_t = y_t - y_{t-1}
Note that, when one-period lags or first differences of a variable are constructed, the first observation is lost. Thus a regression using Δy_t with the above data would begin with the October 2006 data point. It is also possible to produce two-period lags, three-period lags, and so on. These are accomplished in the obvious way.
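In practice, lags and first differences are one-liners in, say, pandas. A small sketch with hypothetical monthly data (only the 2006M09 value of 0.8 comes from the text); note the lost first observation showing up as NaN:

```python
import pandas as pd

# Hypothetical monthly series; only the 2006M09 value (0.8) is taken from table 6.1
y = pd.Series([0.8, 1.3, -0.9, 0.2],
              index=pd.period_range("2006-09", periods=4, freq="M"), name="y")

df = pd.DataFrame({"y": y,
                   "y_lag1": y.shift(1),   # y_{t-1}: one-period lag
                   "dy": y.diff()})        # first difference, y_t - y_{t-1}
print(df)                                  # first row shows NaN for the lost observation
```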
6.6.2 Graphical tests for autocorrelation
In order to test for autocorrelation, it is necessary to investigate whether any relationships exist between the current value of û, û_t, and any of its previous values, û_{t−1}, û_{t−2}, ... The first step is to consider possible relationships between the current residual and the immediately previous one, û_{t−1}, via a graphical exploration. Thus û_t is plotted against û_{t−1}, and û_t is plotted over time. Some stereotypical patterns that may be found in the residuals are discussed below.
Figures 6.3 and 6.4 show positive autocorrelation in the residuals, which is indicated by a cyclical residual plot over time. This case is known as positive autocorrelation because, on average, if the residual at time t − 1 is positive, the residual at time t is likely to be positive as well; similarly, if the residual at t − 1 is negative, the residual at t is also likely to be negative. Figure 6.3 shows that most of the dots representing observations are in the first and third quadrants.
Trang 9Figures 6.5 and 6.6 show negative autocorrelation, indicated by an
alternat-ing pattern in the residuals This case is known as negative autocorrelation
because on average, if the residual at time t− 1 is positive, the residual at
time t is likely to be negative; similarly, if the residual at t − 1 is negative,
the residual at t is likely to be positive Figure 6.5 shows that most of the dots
Finally, figures 6.7 and 6.8 show no pattern in the residuals at all: this is what it is desirable to see. In the plot of û_t against û_{t−1} (figure 6.7), the points are randomly spread across all four quadrants, and the time series plot of the residuals (figure 6.8) does not cross the x-axis either too frequently or too rarely.
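The two diagnostic plots just described are easy to produce; a sketch assuming matplotlib, with a white-noise series standing in for the residuals:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
u_hat = rng.normal(size=100)          # stand-in for a series of regression residuals

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.scatter(u_hat[:-1], u_hat[1:])    # current residual against its lag
ax1.axhline(0, color="grey"); ax1.axvline(0, color="grey")
ax1.set_xlabel("u_hat(t-1)"); ax1.set_ylabel("u_hat(t)")

ax2.plot(u_hat)                       # residuals over time
ax2.axhline(0, color="grey")
ax2.set_xlabel("t"); ax2.set_ylabel("u_hat(t)")

plt.tight_layout()
plt.show()
```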
6.6.3 Detecting autocorrelation: the Durbin–Watson test

Of course, a first step in testing whether the residual series from an estimated model are autocorrelated would be to plot the residuals as above, looking for any patterns. Graphical methods may be difficult to interpret in practice, however, and hence a formal statistical test should also be applied. The simplest test is due to Durbin and Watson (1951).
esti-DW is a test for first-order autocorrelation – i.e it tests only for a tionship between an error and its immediately previous value One way tomotivate the test and to interpret the test statistic would be in the context
rela-of a regression rela-of the time t error on its previous value,
Thus, under the null hypothesis, the errors at time t − 1 and t are
indepen-dent of one another, and if this null were rejected it would be concludedthat there was evidence of a relationship between successive residuals Infact, it is not necessary to run the regression given by (6.12), as the teststatistic can be calculated using quantities that are already available afterthe first regression has been run:
DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=2}^{T} \hat{u}_t^2}    (6.13)
The denominator of the test statistic is simply (the number of observations − 1) × the variance of the residuals. This arises since, if the average of the residuals is zero,

\text{var}(\hat{u}_t) = \frac{1}{T-1} \sum_{t=2}^{T} \hat{u}_t^2

so that the denominator equals var(û_t) × (T − 1). The numerator, meanwhile, is based on the differences between successive residuals: if there is positive autocorrelation in the errors, this difference in the numerator will be relatively small, while if there is negative autocorrelation, with the sign of the error changing very frequently, the numerator will be relatively large. No autocorrelation would result in a value for the numerator between small and large.
It is also possible to express the DW statistic as an approximate function of the estimated value of ρ:

DW \approx 2(1 - \hat{\rho})    (6.14)

where ρ̂ is the estimated correlation coefficient that would have been obtained from an estimation of (6.12). To see why this is the case, consider that the numerator of (6.13) can be written as the parts of a quadratic,
\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2 = \sum_{t=2}^{T} \hat{u}_t^2 + \sum_{t=2}^{T} \hat{u}_{t-1}^2 - 2 \sum_{t=2}^{T} \hat{u}_t \hat{u}_{t-1}    (6.15)

Consider now the first two sums on the RHS of (6.15): the first of these contains û_T² but not û₁², while the second contains û₁² but not û_T². As the sample size, T, increases towards infinity, the difference between these two will become negligible. Hence the expression in (6.15), the numerator of (6.13), is approximately

2 \sum_{t=2}^{T} \hat{u}_t^2 - 2 \sum_{t=2}^{T} \hat{u}_t \hat{u}_{t-1}
Replacing the numerator of (6.13) with this expression leads to

DW \approx \frac{2 \sum_{t=2}^{T} \hat{u}_t^2 - 2 \sum_{t=2}^{T} \hat{u}_t \hat{u}_{t-1}}{\sum_{t=2}^{T} \hat{u}_t^2} = 2 - 2\,\frac{\sum_{t=2}^{T} \hat{u}_t \hat{u}_{t-1}}{\sum_{t=2}^{T} \hat{u}_t^2}    (6.16)

The covariance between u_t and u_{t−1} can be written as E[(u_t − E(u_t))(u_{t−1} − E(u_{t−1}))]. Under the assumption that E(u_t) = 0 (and therefore that E(u_{t−1}) = 0), the covariance will be E[u_t u_{t−1}]. For the sample residuals, this covariance will be evaluated as

\frac{1}{T-1} \sum_{t=2}^{T} \hat{u}_t \hat{u}_{t-1}

The sum in the numerator of the expression on the right of (6.16) can therefore be seen as T − 1 times the covariance between û_t and û_{t−1}, while the sum in the denominator of the expression on the right of (6.16) can be seen from the previous exposition as T − 1 times the variance of û_t. Thus it follows that

DW \approx 2 - 2\,\frac{\text{cov}(\hat{u}_t, \hat{u}_{t-1})}{\text{var}(\hat{u}_t)} = 2\,(1 - \text{corr}(\hat{u}_t, \hat{u}_{t-1}))    (6.17)
so that the DW test statistic is approximately equal to 2(1 − ρ̂). Since ρ̂ is a correlation, it implies that −1 ≤ ρ̂ ≤ 1; that is, ρ̂ is bounded to lie between −1 and +1. Substituting in these limits for ρ̂ to calculate DW from (6.17) would give the corresponding limits for DW as 0 ≤ DW ≤ 4. Consider now the implication of DW taking one of three important values (zero, two and four):
● ρ̂ = 0, DW = 2. This is the case in which there is no autocorrelation in the residuals. Roughly speaking, therefore, the null hypothesis would not be rejected if DW is near two – i.e. there is little evidence of autocorrelation.
● ρ̂ = 1, DW = 0. This corresponds to the case in which there is perfect positive autocorrelation in the residuals.
● ρ̂ = −1, DW = 4. This corresponds to the case in which there is perfect negative autocorrelation in the residuals.
The DW test does not follow a standard statistical distribution, such as a t, F or χ². DW has two critical values – an upper critical value (d_U) and a lower critical value (d_L) – and there is also an intermediate region in which the null hypothesis of no autocorrelation can neither be rejected nor not rejected! The rejection, non-rejection and inconclusive regions are shown on the number line in figure 6.9.

Figure 6.9 Rejection and non-rejection regions for the DW test
To reiterate, therefore, the null hypothesis is rejected and the existence of positive autocorrelation presumed if DW is less than the lower critical value; the null hypothesis is rejected and the existence of negative autocorrelation presumed if DW is greater than four minus the lower critical value; and the null hypothesis is not rejected and no significant residual autocorrelation is presumed if DW is between the upper critical value and four minus the upper critical value.
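The DW statistic itself is a one-liner to compute; a sketch assuming statsmodels, which also provides it as durbin_watson:

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
u_hat = rng.normal(size=50)           # stand-in for regression residuals

# Direct computation following (6.13); note that statsmodels' version divides
# by the sum over all T squared residuals rather than over t = 2, ..., T, a
# difference that is negligible for reasonable sample sizes
dw_manual = np.sum(np.diff(u_hat) ** 2) / np.sum(u_hat[1:] ** 2)
print(dw_manual, durbin_watson(u_hat))
```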
6.7 Causes of residual autocorrelation
● Omitted variables. A key reason for autocorrelation is the omission of systematic influences that are reflected in the errors. The exclusion of an explanatory variable that conveys important information for the dependent variable, and that is not allowed for by the other explanatory variables, causes autocorrelation. In the real estate market, the analyst may not have at his/her disposal all the variables required for modelling – for example, economic variables at the local (city or metropolitan area) level – leading to residual autocorrelation.
● Model misspecification. We may have adopted the wrong functional form for the relationship we examine. For example, we assume a linear model but the model should be expressed in log form. We may also have models in levels when the relationship is of a cyclical nature; hence we should transform the variables to allow for the cyclicality in the series. Residuals from models using strongly trended variables are likely to exhibit autocorrelated patterns, in particular if the true relationship is more cyclical.
● Data smoothness and trends. These can be a major cause of residual autocorrelation in the real estate market. The real estate data we use are often smoothed and frequently also involve some interpolation. There has been much discussion about the smoothness in valuation data, which becomes more acute in markets with less frequent transactions and with data of lower frequency. Slow adjustments in the real estate market also give rise to autocorrelation. Smoothness and slow adjustments average the true disturbances over successive periods of time; hence successive values of the error term become interrelated. For example, a large change in GDP or employment growth in our example could be reflected in the residuals for several periods, as the successive rent values carry this effect due to smoothness and slow adjustment.
● Misspecification of the true random error. The assumption E(u_i u_j) = 0 may not represent the true pattern of the errors. Major events, such as a prolonged economic downturn or the cycles that the real estate market seems to go through (for example, it took several years for the markets to recover from the early 1990s crash), are likely to have an impact on the market that will persist for some time.
What is important from the above discussion is that the remedy for residual autocorrelation really depends on its cause.
Example 6.2
We test for first-order serial correlation in the residuals of equation (6.6) and compute the DW statistic using equation (6.14). The value of ρ̂ is 0.37, and the sign suggests positive first-order autocorrelation in the residuals. Applying formula (6.14), we get DW ≈ 2 × (1 − 0.37) = 1.26.
Equation (6.6) was estimated with twenty-eight observations (T = 28) and the number of regressors including the constant term is three (k = 3). The critical values for the test are d_L = 1.181 and d_U = 1.650 at the 1 per cent significance level. The test statistic of 1.26 lies between d_L and d_U, and hence it falls into the indecisive region.
For illustration purposes, suppose that the value of ρ̂ in the above equation were not 0.37 but −0.37, indicating negative first-order autocorrelation. Then DW would take the value 2 × (1 − (−0.37)) = 2.74. From the DW tables with k = 3 and T = 28, we compute the critical regions:

4 − d_U = 4 − 1.650 = 2.35 and 4 − d_L = 4 − 1.181 = 2.82
Again, the test statistic of 2.74 falls into the indecisive region. If it were higher than 2.82, we would have rejected the null hypothesis in favour of the alternative of first-order serial correlation, and if it were lower than 2.35 we would not have rejected it.
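For completeness, the two DW values in this example follow directly from approximation (6.14):

```python
rho_hat = 0.37
dw = 2 * (1 - rho_hat)         # 1.26, the positive-autocorrelation case
dw_neg = 2 * (1 - (-0.37))     # 2.74, the negative-autocorrelation illustration
print(dw, dw_neg)
```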
6.7.1 Conditions that must be fulfilled for DW to be a valid test
In order for the DW test to be valid for application, three conditions must be fulfilled, as described in box 6.3.
Box 6.3 Conditions for DW to be a valid test
(1) There must be a constant term in the regression.
(2) The regressors must be non-stochastic, as in assumption 4 of the CLRM (see chapter 10).
(3) There must be no lags of the dependent variable (see below) in the regression.
If the test were used in the presence of lags of the dependent variable or otherwise stochastic regressors, the test statistic would be biased towards two, suggesting that in some instances the null hypothesis of no autocorrelation would not be rejected when it should be.
6.7.2 Another test for autocorrelation: the Breusch–Godfrey test
Recall that DW is a test only of whether consecutive errors are related to one another: not only can the DW test not be applied if a certain set of circumstances is not fulfilled, there will also be many forms of residual autocorrelation that DW cannot detect. For example, if corr(û_t, û_{t−1}) = 0 but corr(û_t, û_{t−2}) ≠ 0, DW as defined above will not find any autocorrelation. One possible solution would be to replace û_{t−1} in (6.13) with û_{t−2}. Pairwise examination of the correlations (û_t, û_{t−1}), (û_t, û_{t−2}), (û_t, û_{t−3}), ... will be tedious in practice, however, and is not coded in econometrics software packages, which have been programmed to construct DW using only a one-period lag. In addition, the approximation in (6.14) will deteriorate as the difference between the two time indices increases. Consequently, the critical values should also be modified somewhat in these cases.
As a result, it is desirable to examine a joint test for autocorrelation that will allow examination of the relationship between û_t and several of its lagged values at the same time. The Breusch–Godfrey test is a more general test for autocorrelation up to the rth order. The model for the errors under this test is

u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \rho_3 u_{t-3} + \cdots + \rho_r u_{t-r} + v_t, \quad v_t \sim N(0, \sigma_v^2)
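This joint test is available in statsmodels as acorr_breusch_godfrey; a sketch with simulated data and an illustrative lag order of r = 4:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(5)
x = rng.normal(size=120)
y = 1.0 + 0.5 * x + rng.normal(size=120)
res = sm.OLS(y, sm.add_constant(x)).fit()

# Joint test of rho_1 = ... = rho_r = 0, here with r = 4 lags
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=4)
print(f"LM = {lm_stat:.3f} (p = {lm_pval:.4f}); F = {f_stat:.3f} (p = {f_pval:.4f})")
```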