
Applied Economics: Lecture 9: Autocorrelation


DOCUMENT INFORMATION

Title: Autocorrelation
Author: Nguyen Hoang Bao
Subject: Applied Econometrics
Type: Lecture notes
Year of publication: 2004
Pages: 9


Applied Econometrics

Lecture 9: Autocorrelation

“It is never possible to step twice into the same river”

1) Introduction

Autocorrelation (also called serial correlation) is a violation of the assumption that the error terms are uncorrelated, i.e., with autocorrelation E(εi εj) ≠ 0 for i ≠ j. That is, the error in period t is not independent of previous errors.

Since we do not know the population line, we do not know the actual errors (ε), but we estimate them by the residuals (e). Hence we look at the residual plot of a regression to judge whether it (i) has no autocorrelation, (ii) has positive autocorrelation, or (iii) has negative autocorrelation. Positive autocorrelation is the common problem in economics.

2) Consequences of autocorrelation

Ordinary least squares (OLS) estimates in the presence of autocorrelation will not have the desirable statistical properties. With positive autocorrelation the standard errors are too low (underestimated). This inflates the t-statistics (overestimated), so we may reject the null when it is in fact valid. Likewise the R² and the related F-statistic are likely to be overestimated.

3) Detecting autocorrelation

There are many ways to check for autocorrelation, such as (1) looking at the residual plot; (2) observing the correlogram; (3) using the runs test; and (4) using the Durbin – Watson statistic. This section presents the runs test and the Durbin – Watson test.

3.1) Runs test

Autocorrelation can show up in the residual plot. A non-autocorrelated error should jump around the mean (zero) in a random manner. With positive autocorrelation (which we are most likely to get with economic data) the error is more likely to stay above or below the mean for successive observations; with negative autocorrelation it will jump above and below the mean very frequently.

We can formalize this approach in the runs test, by counting the number of runs in the data. A run is defined as a succession of positive or negative residuals (even just one observation counts as a run).

It follows that if there is positive autocorrelation there will be fewer runs than we should expect from a series with no autocorrelation. On the other hand, if there is negative autocorrelation there will be more runs than with no autocorrelation.


The table for the runs test gives a confidence interval. If the observed number of runs lies outside this interval, we reject the null hypothesis of no autocorrelation. If the actual number of runs is less than the lower bound of the confidence interval, we reject in favor of positive autocorrelation. If it is higher than the upper bound, we reject in favor of negative autocorrelation. We may sometimes need to calculate the interval ourselves.

$$E(R) = \frac{2 N_1 N_2}{n} + 1$$

$$s_R^2 = \frac{2 N_1 N_2 \,(2 N_1 N_2 - n)}{n^2 (n - 1)}$$

where N1 is the number of positive residuals, N2 is the number of negative residuals, R is the total number of runs, and n is the number of observations (so n = N1 + N2).

The confidence interval at 5 percent level of significance is given by:

E(R) – 1.96 sR ≤ R ≤ E(R) +1.96 sR

We accept the null hypothesis of no autocorrelation if the observed number of runs falls within the confidence interval
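The runs test described above can be sketched in a few lines of Python. This is only an illustration: the function name `runs_test`, the decision to drop residuals that are exactly zero, and the example residuals are assumptions made for the sketch, not part of the lecture.

```python
import math

def runs_test(residuals, z=1.96):
    """Runs test on a sequence of residuals (5 percent level when z = 1.96)."""
    # Assumption of this sketch: residuals exactly equal to zero are ignored.
    signs = [e > 0 for e in residuals if e != 0]
    n1 = sum(signs)              # N1: number of positive residuals
    n2 = len(signs) - n1         # N2: number of negative residuals
    n = n1 + n2
    # A new run starts every time the sign changes.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    sd = math.sqrt(variance)
    lower, upper = expected - z * sd, expected + z * sd
    return {
        "runs": runs,
        "expected": expected,
        "sd": sd,
        "lower": lower,
        "upper": upper,
        "reject_h0": not (lower <= runs <= upper),
    }

# Hypothetical residuals: ten positive values followed by ten negative values,
# the pattern we would expect under strong positive autocorrelation.
example = [1.2, 0.8, 0.5, 0.9, 0.3, 0.6, 0.4, 1.1, 0.7, 0.2,
           -0.4, -0.9, -1.1, -0.6, -0.3, -0.8, -0.5, -1.0, -0.7, -0.2]
result = runs_test(example)
print(result)
```

With only two runs against an expected eleven, the observed count falls below the lower bound of the interval, which points to positive autocorrelation.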

3.2) Durbin – Watson test

A second (and the most common) test is the Durbin – Watson (DW) test. The DW statistic is defined as:

$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$

Note that d ≈ 2(1 – ρ), where –1 ≤ ρ ≤ +1.

Hence d will be zero with extreme positive autocorrelation, 4 with extreme negative autocorrelation, and 2 if there is no autocorrelation.

The null hypothesis is that DW = 2, which corresponds to no autocorrelation

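Decision zones for the DW test (dL and dU are the lower and upper critical values):

0 ≤ d < dL: Reject H0: positive autocorrelation
dL ≤ d ≤ dU: Zone of indecision
dU < d < 4 – dU: Accept H0: no autocorrelation
4 – dU ≤ d ≤ 4 – dL: Zone of indecision
4 – dL < d ≤ 4: Reject H0: negative autocorrelation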
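As a rough sketch, the DW statistic and the decision zones can be coded as follows. The function names are hypothetical. The value dL = 1.28 is the critical value quoted in the example later in these notes; dU = 1.57 is an assumed value used here only for illustration and should be read from a DW table for the actual sample size and number of regressors.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def dw_decision(d, d_lower, d_upper):
    """Map a DW statistic onto the five decision zones."""
    if d < d_lower:
        return "reject H0: positive autocorrelation"
    if d <= d_upper:
        return "zone of indecision"
    if d < 4 - d_upper:
        return "accept H0: no autocorrelation"
    if d <= 4 - d_lower:
        return "zone of indecision"
    return "reject H0: negative autocorrelation"

# Residuals that alternate in sign give a d above 2 (negative autocorrelation).
d_alternating = durbin_watson([1, -1, 1, -1])

# dL from the notes' example; dU is an assumed illustrative value.
print(dw_decision(0.96, 1.28, 1.57))
print(dw_decision(1.50, 1.28, 1.57))
```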

Testing for autocorrelation with a lagged dependent variable

If the model contains a lagged dependent variable, d will be biased towards 2 (this bias may lead us to accept the null when in fact autocorrelation is present). In such cases we must instead use Durbin’s h:


$$h = \left(1 - \frac{d}{2}\right)\sqrt{\frac{n}{1 - n\,[\mathrm{var}(b_1)]}}$$

where var(b1) is the square of the standard error of the coefficient on the lagged dependent variable and n is the number of observations. The test may not be used if n[var(b1)] is greater than one.

The runs test and the DW test are not equivalent: they may give different answers. Also, the fact that DW may frequently fall in the indecision zone means that some judgment is required. If DW is in the indecision zone, but fairly close to dU, and the runs test indicates no autocorrelation, then you can assume no autocorrelation.
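Durbin’s h is simple to compute once d, n and var(b1) are known. The sketch below uses a hypothetical function name and hypothetical inputs. It also relies on a standard result not stated in these notes: under the null, h is approximately standard normal in large samples, so |h| > 1.96 would reject at the 5 percent level.

```python
import math

def durbin_h(d, n, var_b1):
    """Durbin's h for a model with a lagged dependent variable."""
    if n * var_b1 >= 1:
        # The square root is not defined, so the test cannot be used.
        raise ValueError("h is undefined when n * var(b1) >= 1")
    return (1 - d / 2) * math.sqrt(n / (1 - n * var_b1))

# Hypothetical inputs: d = 1.5, n = 30 and a standard error of 0.1 on the
# lagged dependent variable, so var(b1) = 0.01.
h = durbin_h(1.5, 30, 0.01)
print(round(h, 3))
```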

4) Why do we get autocorrelation?

We test for autocorrelation on the residuals, but these are only a good proxy for the true errors if the model is correct. The presence of autocorrelation will very often indicate misspecification, including:

Incorrect functional form

Omitted variable(s) [1]

Structural instability

Influential points

Spurious regression

Spurious regression is a very serious problem in time-series data. A rule of thumb is that R² > d indicates that a regression is spurious (if R² > d the regression is almost certainly spurious, but if R² < d the regression may still be spurious).

A note on cross-section data: autocorrelation is essentially a time-series problem, as we can always remove autocorrelation from cross-section data by reordering the data. However, if the data are sorted by one of the independent variables, then the apparent presence of autocorrelation can still indicate misspecification. Reordering is not usually an option in time-series data, and certainly not if the equation includes any lags.

5) Remedial measures

The first thing to do is to interpret autocorrelation as a symptom of misspecification and so to carry out various specification tests (e.g. for omitted variables, structural breaks, etc.). This will nearly always cure the problem. If the autocorrelation is genuine, you can remove the autocorrelated errors by one of the following procedures:

[1] The exclusion of relevant variable(s) will bias the estimates of the coefficients of all variables included in the model (unless they happen to be orthogonal to the excluded variable). The normal t-tests cannot tell us if the model is misspecified on account of ...


5.1 The Cochrane – Orcutt procedure

Suppose we have the model: Yt = β1 + β2Xt + ut

It is usually assumed that the ut follow the first-order autoregressive scheme, namely,

ut = ρut-1 + εt

Cochrane and Orcutt (1949) then recommend the following steps to estimate ρ:

1. Estimate the two-variable model by OLS and calculate the residuals, et.

2. Run the following regression: et = ρet-1 + vt

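3. Using the estimated ρ, run the generalized difference equation:

(Yt – ρYt-1) = β1(1 – ρ) + β2(Xt – ρXt-1) + (ut – ρut-1)

or

$$Y_t^* = \beta_1^* + \beta_2^* X_t^* + e_t^*$$

where Yt* = Yt – ρYt-1, Xt* = Xt – ρXt-1 and β1* = β1(1 – ρ).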

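4. Calculate the new residuals from the original equation, using the intercept recovered as β1 = β1*/(1 – ρ):

$$e_t^{**} = Y_t - \hat{\beta}_1 - \hat{\beta}_2^* X_t$$

5. Estimate the regression:

$$e_t^{**} = \rho\, e_{t-1}^{**} + w_t$$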

This second-round estimate of ρ may not be the best estimate. We can go on to a third-round estimate, and so on. We may stop iterating when successive estimates of ρ differ by a very small amount, say by less than 0.01 or 0.005.
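The iteration above can be sketched in plain Python for the two-variable model. This is a minimal illustration under stated assumptions: the helper names, the stopping tolerance of 0.005, the cap of 50 iterations and the simulated data (true intercept 2, slope 3, ρ = 0.7) are all choices made for the sketch, not part of the lecture.

```python
import random

def ols(x, y):
    """OLS of y on a constant and one regressor x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def estimate_rho(e):
    """Slope of the no-intercept regression of e_t on e_{t-1}."""
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    den = sum(e[t - 1] ** 2 for t in range(1, len(e)))
    return num / den

def cochrane_orcutt(x, y, tol=0.005, max_iter=50):
    """Repeat steps 1 to 5 until successive estimates of rho differ by < tol."""
    b1, b2 = ols(x, y)                                   # step 1
    rho_old = None
    for _ in range(max_iter):
        # Residuals of the original (untransformed) equation.
        e = [b - b1 - b2 * a for a, b in zip(x, y)]
        rho = estimate_rho(e)                            # steps 2 and 5
        if rho_old is not None and abs(rho - rho_old) < tol:
            break
        # Step 3: generalized differences (the first observation is lost).
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        b1_star, b2 = ols(xs, ys)
        b1 = b1_star / (1 - rho)                         # recover the intercept
        rho_old = rho
    return b1, b2, rho

# Simulated data with AR(1) errors (all values here are assumptions).
random.seed(0)
n, true_rho = 200, 0.7
x = [random.gauss(0, 1) for _ in range(n)]
u = [0.0]
for _ in range(n - 1):
    u.append(true_rho * u[-1] + random.gauss(0, 1))
y = [2 + 3 * a + b for a, b in zip(x, u)]

b1, b2, rho = cochrane_orcutt(x, y)
print(b1, b2, rho)
```

In this simulated case the estimated slope and ρ should come out close to the values used to generate the data; with real data the result depends on the model being correctly specified.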

5.2 The Durbin procedure

Durbin (1960) suggested an alternative method of estimating ρ. The generalized difference equation can be written as:

Yt = β1(1 – ρ) + β2Xt – ρβ2Xt-1 + ρYt-1 + εt

Regressing Yt on Xt, Xt-1 and Yt-1 therefore gives an estimate of ρ as the coefficient on Yt-1.

Once an estimate of ρ is obtained, we regress the transformed variable Y* on X* as in

$$Y_t^* = \beta_1^* + \beta_2^* X_t^* + e_t^*$$
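A sketch of the Durbin two-step idea in plain Python follows. In the first step, ρ is taken as the coefficient on the lagged dependent variable in a regression of Y on X, lagged X and lagged Y. In the second step, the transformed regression is run. The helper names (`solve`, `ols`, `durbin_two_step`) and the simulated data are assumptions made for this illustration.

```python
import random

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(columns, y):
    """OLS coefficients via the normal equations; pass a column of ones for the constant."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)

def durbin_two_step(x, y):
    n = len(y)
    ones = [1.0] * (n - 1)
    # Step 1: regress Y_t on a constant, X_t, X_{t-1} and Y_{t-1};
    # the coefficient on Y_{t-1} is the estimate of rho.
    coefs = ols([ones, x[1:], x[:-1], y[:-1]], y[1:])
    rho = coefs[3]
    # Step 2: regress Y*_t on X*_t (generalized differences).
    ys = [y[t] - rho * y[t - 1] for t in range(1, n)]
    xs = [x[t] - rho * x[t - 1] for t in range(1, n)]
    b1_star, b2 = ols([ones, xs], ys)
    return b1_star / (1 - rho), b2, rho

# Simulated data with AR(1) errors (true slope 3, true rho 0.7; assumptions).
random.seed(1)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
u = [0.0]
for _ in range(n - 1):
    u.append(0.7 * u[-1] + random.gauss(0, 1))
y = [2 + 3 * a + b for a, b in zip(x, u)]

b1, b2, rho = durbin_two_step(x, y)
print(b1, b2, rho)
```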

5.3 The Theil – Nagar procedure

Theil and Nagar (1961) proposed an estimate of ρ based on the d statistic, for use in small samples:

$$\hat{\rho} = \frac{N^2\,(1 - d/2) + k^2}{N^2 - k^2}$$

Where N is the total number of observations, d is DW, k is the number of coefficients including the intercept
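As a worked illustration, take the figures from the crop output example in Section 6: d = 0.958, N = 30 annual observations (1961 to 1990) and k = 3 coefficients (intercept, price and fertilizer). Treating these as the inputs is an assumption made here for illustration; the notes themselves use the simpler estimate 1 – d/2 in that example.

$$\hat{\rho} = \frac{30^2\,(1 - 0.958/2) + 3^2}{30^2 - 3^2} = \frac{900 \times 0.521 + 9}{891} = \frac{477.9}{891} \approx 0.536$$

This is slightly above the value 1 – d/2 = 0.521; the two estimates move closer together as N grows relative to k.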


5.4 The Hildreth – Lu procedure

From the first-order autoregressive scheme ut = ρut-1 + εt, Hildreth and Lu (1960) recommend trying values of ρ between –1 and +1 at intervals of 0.1, transforming the data by the generalized difference equation for each value, and obtaining the associated residual sum of squares (RSS). They suggest choosing the ρ that minimizes the RSS.

The differencing procedure loses one observation. To avoid this, the first observation on Y and X is transformed as follows: Y1(1 – ρ²)^0.5 and X1(1 – ρ²)^0.5 (Prais – Winsten, 1971).
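The grid search can be sketched as follows for the two-variable model. To keep the illustration short, this version simply drops the first observation rather than applying the Prais – Winsten transformation, and it searches ρ from –0.9 to 0.9 so that the intercept can always be recovered; both are simplifying assumptions of the sketch, as are the function names and the simulated data.

```python
import random

def fit(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, rss)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return intercept, slope, rss

def hildreth_lu(x, y):
    """Pick the rho on a 0.1 grid that minimizes the RSS of the transformed regression."""
    best = None
    for i in range(-9, 10):
        rho = i / 10
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        b1_star, b2, rss = fit(xs, ys)
        if best is None or rss < best[0]:
            best = (rss, rho, b1_star / (1 - rho), b2)
    _, rho, b1, b2 = best
    return rho, b1, b2

# Simulated data with AR(1) errors (true rho 0.7; an assumption of the sketch).
random.seed(2)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
u = [0.0]
for _ in range(n - 1):
    u.append(0.7 * u[-1] + random.gauss(0, 1))
y = [2 + 3 * a + b for a, b in zip(x, u)]

rho, b1, b2 = hildreth_lu(x, y)
print(rho, b1, b2)
```

A finer grid around the selected value can be searched in a second pass if more precision is wanted.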

5.5 Detrending by including a time trend as one of the regressors

Yt = β1 + β2Xt + β3t + ut

The first-difference transformation of this model is as follows:

ΔYt = β2ΔXt + β3 + εt

There is an intercept term in the first-difference form. It signifies that there was a linear trend term in the original model. If β3 > 0, there is an upward trend in Y after removing the influence of the variable X.
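The step from the trend model to the first-difference form can be written out as follows, assuming the original model contains a linear trend term β3t. Writing the model for periods t and t – 1 and subtracting:

$$Y_t = \beta_1 + \beta_2 X_t + \beta_3 t + u_t$$

$$Y_{t-1} = \beta_1 + \beta_2 X_{t-1} + \beta_3 (t - 1) + u_{t-1}$$

$$\Delta Y_t = \beta_2 \Delta X_t + \beta_3 + (u_t - u_{t-1})$$

The constant β1 cancels and the trend coefficient β3 survives as the intercept. If ρ = 1 in the first-order scheme, the error difference is exactly εt.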

We emphasize that these techniques are only to be used if you are sure that there is no problem of misspecification.

6) An example

The regression of crop output on the price index and fertilizer input (Table 6.1) was found to be badly autocorrelated: the DW statistic was 0.96, compared with a critical value dL of 1.28. We found that the autocorrelation arose from a problem of omitted variable bias. But for illustrative purposes we shall see how the autocorrelation may be removed using the Cochrane – Orcutt correction. To do this we carry out the following steps:

(1) The estimated equation with OLS gives DW = 0.958; thus ρ = 1 – d/2 = 0.521.

(2) Calculate Qt* = Qt – 0.521Qt–1, and similarly for P* and F*, for observations 1962 to 1990. The results are shown in Table 6.1.

(3) Apply the Prais – Winsten transformation to get Q*1961 = (1 – 0.521²)^(1/2) Q1961, and similarly for the 1961 values of P* and F*. (Although we do have 1960 values for P and F, though not Q, and so could apply the Cochrane – Orcutt transformation to these two variables for 1961, the fact that we use the Prais – Winsten transformation for one variable means that we must also use it for the others.) The resulting values are shown in Table 6.1.


Table 6.1: Application of Cochrane – Orcutt correction to crop production function data

Year      Q     Q*       P     P*       F     F*
1961   40.4   34.5   106.0   90.5    99.4   84.8
1962   36.4   15.4   108.1   52.9   100.8   49.0
1963   35.4   16.5   110.3   54.0   102.1   49.6
1964   37.9   19.4   110.1   52.6   102.9   49.7
1965   34.8   15.1   108.6   51.2   103.1   49.5
1966   27.9    9.8   103.8   47.2   104.2   50.5
1967   29.8   15.3   109.5   55.4   104.6   50.3
1968   34.7   19.1   102.6   45.5   105.6   51.1
1969   38.4   20.3   101.1   47.6   106.8   51.7
1970   33.6   13.6   100.9   48.2   106.7   51.1
1971   33.6   16.1   104.7   52.2   108.3   52.7
1972   32.2   14.7   107.3   52.7   108.6   52.2
1973   35.3   18.5   103.0   47.1   110.4   53.8
1974   39.4   21.1   116.4   62.8   111.2   53.7
1975   30.6   10.1   112.7   52.0   111.1   53.1
1976   30.5   14.5   108.0   49.3   110.7   52.8
1977   33.7   17.8   103.2   46.9   110.5   52.8
1978   35.8   18.2   101.0   47.2   112.0   54.4
1979   36.0   17.3   103.6   51.0   111.6   53.2
1980   37.0   18.3   109.6   55.6   113.2   55.0
1981   30.7   11.4   105.2   48.1   114.1   55.1
1982   28.0   12.0    98.7   43.9   114.8   55.4
1983   28.4   13.8    99.2   47.7   114.8   55.0
1984   27.6   12.8    94.8   43.2   114.4   54.7
1985   32.9   18.5   100.6   51.2   114.6   54.9
1986   37.1   20.0   104.5   52.1   114.5   54.8
1987   36.0   16.7    98.9   44.5   114.2   54.6
1988   36.6   17.9   101.8   50.2   115.5   55.9
1989   38.8   19.7   105.6   52.6   116.7   56.5
1990   37.1   16.9   108.7   53.6   118.0   57.2

(4) Estimate the following model:

$$Q_t^* = \beta_1^* + \beta_2^* P_t^* + \beta_3^* F_t^* + \varepsilon_t^*$$


The regression results are given in Table 6.2 (which also repeats those for OLS estimation). Calculate the estimate of the intercept b1 = b1*/(1 – ρ), which equals –19.98.

Table 6.2: Regression results with Cochrane – Orcutt procedure

                  OLS                            Cochrane – Orcutt procedure
                  Coefficient (t – statistic)    Coefficient (t – statistic)
Intercept         10.21 (0.41)                   –9.57 (–2.02)
Price (P)         0.26 (1.26)                    0.28 (2.63)
Fertilizer (F)    –0.03 (–0.20)                  0.22 (1.58)
R²                0.12                           0.62
DW                0.96                           1.50

(5) Comparing the two regressions, we see that the DW statistic is now 1.5. This value falls towards the upper end of the zone of indecision, so the evidence for autocorrelation is much weaker than in the OLS regression, though it may be thought worthwhile to repeat the procedure (using a new ρ of 0.25, calculated from the new DW).

Comparison of the slope coefficients from the two regressions shows price to be relatively unaffected. With the Cochrane – Orcutt procedure, the fertilizer variable produces the expected positive sign, though it remains insignificant. The unexpected insignificance of fertilizer is a further indication that we should have treated the initial autocorrelation as a sign of misspecification. In this case, the Cochrane – Orcutt procedure has suppressed the symptom of misspecification, but cannot provide the cure, which is to include the omitted variables.

If you believe the model to be correctly specified and there is autocorrelation, then the Cochrane – Orcutt procedure may be used to obtain efficient estimates.
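The arithmetic in steps (1) to (5) can be checked with a few lines of Python. The sketch below uses only numbers quoted in this section (the DW statistics, the 1961 and 1962 values of Q, P and F from Table 6.1, and the transformed intercept –9.57); the variable names are illustrative. Entries for some later years in Table 6.1 may differ from a recalculation in the last decimal place, presumably because of rounding in the published data.

```python
import math

# Step (1): rho from the OLS Durbin-Watson statistic.
d_ols = 0.958
rho = round(1 - d_ols / 2, 3)

# Steps (2) and (3): transformed values for the first two years.
q_1961, q_1962 = 40.4, 36.4
p_1961, f_1961 = 106.0, 99.4
pw_weight = math.sqrt(1 - rho ** 2)        # Prais-Winsten weight for 1961
q_star_1961 = pw_weight * q_1961
p_star_1961 = pw_weight * p_1961
f_star_1961 = pw_weight * f_1961
q_star_1962 = q_1962 - rho * q_1961        # generalized difference for 1962

# Step (4): recover the original intercept from the transformed one.
b1_star = -9.57
b1 = b1_star / (1 - rho)

# Step (5): new rho implied by the Cochrane-Orcutt DW statistic of 1.50.
rho_next = 1 - 1.50 / 2

print(rho, round(q_star_1961, 1), round(q_star_1962, 1), round(b1, 2), rho_next)
```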



Workshop 9: Autocorrelation

1.1) Use the data given in the table below to regress output on the price index and fertilizer input. Draw the residual plot and count the number of runs.

Year   Output (Q)   Price Index (P)   Fertilizer input (F)   Rainfall (R)
1960      n.a.          100.0              100.0                184.2
1961      40.4          106.0               99.4                155.3
1962      36.4          108.1              100.8                107.3
1963      35.4          110.3              102.1                110.1
1964      37.9          110.1              102.9                169.3
1965      34.8          108.6              103.1                 81.9
1966      27.9          103.8              104.2                 22.0
1967      29.8          109.5              104.6                 31.5
1968      34.7          102.6              105.6                198.2
1969      38.4          101.1              106.8                147.5
1970      33.6          100.9              106.7                 76.5
1971      33.6          104.7              108.3                 90.2
1972      32.2          107.3              108.6                 54.3
1973      35.3          103.0              110.4                178.3
1974      39.4          116.4              111.2                 70.6
1975      30.6          112.7              111.1                 53.0
1976      30.5          108.0              110.7                 74.4
1977      33.7          103.2              110.5                136.4
1978      35.8          101.0              112.0                131.1
1979      36.0          103.6              111.6                109.2
1980      37.0          109.6              113.2                112.8
1981      30.7          105.2              114.1                 43.0
1982      28.0           98.7              114.8                 53.5
1983      28.4           99.2              114.8                 20.6
1984      27.6           94.8              114.4                 56.4
1985      32.9          100.6              114.6                 78.3
1986      37.1          104.5              114.5                151.8
1987      36.0           98.9              114.2                145.2
1988      36.6          101.8              115.5                112.3
1989      38.8          105.6              116.7                161.0
1990      37.1          108.7              118.0                 94.5


1.2) Run the following regressions:

Q = α0 + α1P + α2F

and

Q = β0 + β1P + β2F + β3R + β4R(–1)

where R(–1) is rainfall lagged one period. Use an F-test to test the hypothesis that the two rainfall variables may be excluded from the regression.

1.3) In the light of your results, comment on the apparent problem of autocorrelation in the regression of output on the price index and fertilizer input.

2) Compile time-series data for the terms of trade of a country of your choice. Regress both the terms of trade and the logged terms of trade on time and graph the residuals in each case. How many runs are there in each case? Comment. Can you respecify the equation to increase the number of runs?

3) Using the data given in data file SOCECON, regress the infant mortality rate on income per capita. Plot the residuals with the observations ordered: (a) alphabetically; and (b) by income per capita. Count the number of runs in each case. Comment on your results.

4) Using the Sri Lankan macroeconomic data set (SRINA), perform the simple regression of Ip on Ig and plot the residuals. Use both the runs and DW tests to check for autocorrelation. Add variables to the equation to improve the model specification and repeat the tests for autocorrelation. Use the Cochrane – Orcutt estimation procedure if you feel it is appropriate. Comment on your findings.
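For the F-test in question 1.2, the usual statistic for excluding q variables compares the residual sums of squares (RSS) of the restricted and unrestricted regressions, both estimated on the same sample. A small helper might look like the sketch below; the function name and the RSS values are hypothetical placeholders, not results from the workshop data.

```python
def exclusion_f(rss_restricted, rss_unrestricted, q, n, k):
    """F statistic for dropping q regressors.

    n is the number of observations and k the number of coefficients
    (including the intercept) in the unrestricted regression.
    """
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator

# Hypothetical RSS values; with the lagged rainfall term the usable sample is
# 1961 to 1990, so n = 30, k = 5 and q = 2.
f_stat = exclusion_f(rss_restricted=120.0, rss_unrestricted=60.0, q=2, n=30, k=5)
print(f_stat)
```

The statistic is compared with the critical value of the F distribution with q and n – k degrees of freedom.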
