invertible, it can be expressed as an AR(∞). A definition of invertibility is therefore now required.
8.5.1 The invertibility condition
An MA(q) model is typically required to have roots of the characteristic equation θ(z) = 0 greater than one in absolute value. The invertibility condition is mathematically the same as the stationarity condition, but is different in the sense that the former refers to MA rather than AR processes. This condition prevents the model from exploding under an AR(∞) representation, so that θ^{−1}(L) converges to zero. Box 8.2 shows the invertibility condition for an MA(2) model.
Box 8.2 The invertibility condition for an MA(2) model

In order to examine the shape of the pacf for moving average processes, consider the following MA(2) process for y_t:

y_t = u_t + θ_1 u_{t−1} + θ_2 u_{t−2} = θ(L)u_t    (8.40)

Provided that this process is invertible, this MA(2) can be expressed as an AR(∞), θ^{−1}(L)y_t = u_t, whose autoregressive coefficients decline geometrically with the lag. Consequently,
the partial autocorrelation function for an MA(q) model will decline geometrically, rather than dropping off to zero after q lags, as is the case for its autocorrelation function. It could therefore be stated that the acf for an AR has the same basic shape as the pacf for an MA, and the acf for an MA has the same shape as the pacf for an AR.
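The invertibility condition is easy to check numerically. Below is a minimal sketch (not from the original text) that tests invertibility of an MA(q) by computing the roots of θ(z) = 0 with numpy; the helper name is_invertible_ma is our own.

```python
import numpy as np

def is_invertible_ma(theta):
    """Invertibility check for an MA(q): every root of
    theta(z) = 1 + theta_1*z + ... + theta_q*z^q must satisfy |z| > 1."""
    # np.roots expects coefficients ordered from the highest power down
    coeffs = list(reversed([1.0] + list(theta)))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(is_invertible_ma([0.5]))   # MA(1): root at z = -2, so invertible -> True
print(is_invertible_ma([2.0]))   # MA(1): root at z = -0.5, not invertible -> False
```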
The pacf is therefore useful for distinguishing between an AR(p) process and an ARMA(p, q) process; the former will have a geometrically declining autocorrelation function, but a partial autocorrelation function that cuts off to zero after p lags, while the latter will have both autocorrelation and partial autocorrelation functions that decline geometrically.
We can now summarise the defining characteristics of AR, MA and ARMA processes.
An autoregressive process has:
● a geometrically decaying acf; and
● number of non-zero points of pacf = AR order
A moving average process has:
● number of non-zero points of acf = MA order; and
● a geometrically decaying pacf
A combination autoregressive moving average process has:
● a geometrically decaying acf; and
● a geometrically decaying pacf
In fact, the mean of an ARMA series is given by

E(y_t) = μ / (1 − φ_1 − φ_2 − · · · − φ_p)    (8.45)

where μ is the intercept and φ_1, . . ., φ_p are the autoregressive coefficients. The autocorrelation function will display combinations of behaviour
derived from the AR and MA parts but, for lags beyond q, the acf will simply be identical to that of the individual AR(p) model, with the result that the AR part will dominate in the long term. Deriving the acf and pacf for an ARMA process requires no new algebra but is tedious, and hence it is left as an exercise for interested readers.
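As a quick numerical check of (8.45), take a hypothetical ARMA model with intercept μ = 0.5 and AR coefficients φ_1 = 0.4 and φ_2 = 0.2 (the MA coefficients do not affect the mean):

E(y_t) = 0.5/(1 − 0.4 − 0.2) = 0.5/0.4 = 1.25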
[Figure 8.1: sample acf and pacf for an MA(1) model (x-axis: lag, s; legend: acf, pacf)]
8.6.1 Sample acf and pacf plots for standard processes
Figures 8.1 to 8.7 give some examples of typical processes from the ARMA family, with their characteristic autocorrelation and partial autocorrelation functions. The acf and pacf are not produced analytically from the relevant formulae for a model of this type but, rather, are estimated using 100,000 simulated observations with disturbances drawn from a normal distribution. Each figure also has 5 per cent (two-sided) rejection bands represented by dotted lines. These are based on ±1.96/√100,000 = ±0.0062, calculated in the same way as given above. Notice how, in each case, the acf and pacf are identical for the first lag.
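A minimal sketch of how such a figure can be reproduced, assuming numpy and statsmodels are available; the MA(1) coefficient of −0.5 is chosen to mirror the negative coefficient described for figure 8.1:

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

T = 100_000
rng = np.random.default_rng(1)
u = rng.standard_normal(T + 1)

# simulate an MA(1) with a negative coefficient: y_t = u_t - 0.5*u_{t-1}
y = u[1:] - 0.5 * u[:-1]

nlags = 10
sample_acf = acf(y, nlags=nlags)    # element 0 is the lag-0 value of 1
sample_pacf = pacf(y, nlags=nlags)

band = 1.96 / np.sqrt(T)            # 5% two-sided rejection band, about 0.0062
for s in range(1, nlags + 1):
    sig = "*" if abs(sample_acf[s]) > band else " "
    print(f"lag {s:2d}: acf {sample_acf[s]:+.4f}{sig} pacf {sample_pacf[s]:+.4f}")
```

As noted in the text, the acf and pacf coincide at the first lag; the theoretical lag-1 autocorrelation here is −0.5/(1 + 0.25) = −0.4.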
In figure 8.1, the MA(1) has an acf that is significant only for lag 1, while the pacf declines geometrically, and is significant until lag 7. The acf at lag 1 and all the pacfs are negative as a result of the negative coefficient in the MA-generating process.
Again, the structures of the acf and pacf in figure 8.2 are as anticipated for an MA(2). The first two autocorrelation coefficients only are significant, while the partial autocorrelation coefficients are geometrically declining. Note also that, since the second coefficient on the lagged error term in the MA is negative, the acf and pacf alternate between positive and negative. In the case of the pacf, we term this alternating and declining function a 'damped sine wave' or 'damped sinusoid'.
For the autoregressive model of order 1 with a fairly high coefficient – i.e. relatively close to one – the autocorrelation function would be expected to die away relatively slowly, and this is exactly what is observed here in figure 8.3. Again, as expected for an AR(1), only the first pacf coefficient is significant.

[Figure 8.4: acf and pacf plot (x-axis: lag, s; legend: acf, pacf)]
Figure 8.5 shows the acf and pacf for an identical AR(1) process to that used for figure 8.4, except that the autoregressive coefficient is now negative. This results in a damped sinusoidal pattern for the acf, which again becomes insignificant after around lag 5. Recalling that the autocorrelation coefficient for this AR(1) at lag s is equal to (−0.5)^s, this will be positive for even s and negative for odd s. Only the first pacf coefficient is significant (and negative).

[Figure 8.5: acf and pacf for an AR(1) model with coefficient −0.5 (x-axis: lag, s; legend: acf, pacf)]
Figure 8.6 plots the acf and pacf for a non-stationary series (see chapter 12 for an extensive discussion) that has a unit coefficient on the lagged dependent variable. The result is that shocks to y never die away, and persist indefinitely in the system. Consequently, the acf remains relatively flat at unity, even up to lag 10. In fact, even by lag 10, the autocorrelation coefficient has fallen only to 0.9989. Note also that, on some occasions, the acf does die away, rather than looking like figure 8.6, even for such a non-stationary series.

[Figure 8.6: acf and pacf for a non-stationary series with a unit root (x-axis: lag, s; legend: acf, pacf)]
Finally, figure 8.7 plots the acf and pacf for a mixed ARMA process. As one would expect of such a process, both the acf and the pacf decline geometrically: the acf as a result of the AR part and the pacf as a result of the MA part. The coefficients on the AR and MA are, however, sufficiently small that both acf and pacf coefficients have become insignificant by lag 6.
8.7 Building ARMA models: the Box–Jenkins approach
Although the existence of ARMA models pre-dates them, Box and Jenkins (1976) were the first to approach the task of estimating an ARMA model in a systematic manner. Their approach was a practical and pragmatic one, involving three steps.

Step 1
This involves determining the order of the model required to capture the dynamic features of the data. Graphical procedures are used (plotting the data over time and plotting the acf and pacf) to determine the most appropriate specification.
Step 2
This involves estimating the parameters of the model specified in step 1. This can be done using least squares or another technique, known as maximum likelihood, depending on the model.
Step 3
This involves model checking – i.e. determining whether the model specified and estimated is adequate. Box and Jenkins suggest two methods: overfitting and residual diagnostics. Overfitting involves deliberately fitting a larger model than that required to capture the dynamics of the data as identified in step 1. If the model specified at step 1 is adequate, any extra terms added to the ARMA model would be insignificant. Residual diagnostics implies checking the residuals for evidence of linear dependence, which, if present, would suggest that the model originally specified was inadequate to capture the features of the data. The acf, pacf or Ljung–Box tests can all be used.
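As an illustration of steps 2 and 3, the sketch below (our own construction, not Box and Jenkins') estimates a model with statsmodels and then applies a Ljung–Box test to the residuals; the AR(1) data are simulated placeholders for a real series.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# placeholder data: a simulated AR(1) standing in for the series under study
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.6 * y[t - 1] + u[t]

# step 2: estimate the model identified at step 1 (here an AR(1))
res = ARIMA(y, order=(1, 0, 0)).fit()

# step 3, residual diagnostics: Ljung-Box test on the residuals;
# a significant statistic signals remaining linear dependence
print(acorr_ljungbox(res.resid, lags=[12]))

# step 3, overfitting: deliberately fit a larger pure AR model
# (a pure AR avoids the common-factor problem discussed below);
# the extra term should be insignificant if the AR(1) was adequate
print(ARIMA(y, order=(2, 0, 0)).fit().summary())
```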
It is worth noting that 'diagnostic testing' in the Box–Jenkins world essentially involves only autocorrelation tests rather than the whole barrage of tests outlined in chapter 6. In addition, such approaches to determining the adequacy of the model would reveal only a model that is under-parameterised ('too small') and would not reveal a model that is over-parameterised ('too big').
Examining whether the residuals are free from autocorrelation is much more commonly used than overfitting, and this may have arisen partly because, for ARMA models, overfitting can give rise to common factors in the overfitted model that make estimation of this model difficult and the statistical tests ill-behaved. For example, if the true model is an ARMA(1,1) and we deliberately then fit an ARMA(2,2), there will be a common factor, so that not all the parameters in the latter model can be identified. This problem does not arise with pure AR or MA models, only with mixed processes.
It is usually the objective to form a parsimonious model, which is one that describes all the features of the data of interest using as few parameters – i.e. as simple a model – as possible. A parsimonious model is desirable for the following reasons.
● The residual sum of squares is inversely proportional to the number of degrees of freedom. A model that contains irrelevant lags of the variable or of the error term (and therefore unnecessary parameters) will usually lead to increased coefficient standard errors, implying that it will be more difficult to find significant relationships in the data. Whether an increase in the number of variables – i.e. a reduction in the number of degrees of freedom – will actually cause the estimated parameter standard errors to rise or fall will obviously depend on how much the RSS falls, and on the relative sizes of T and k. If T is very large relative to k, then the decrease in the RSS is likely to outweigh the reduction in T − k, so that the standard errors fall. As a result, 'large' models with many parameters are more often chosen when the sample size is large.
● Models that are profligate might be inclined to fit to data specific features that would not be replicated out of the sample. This means that the models may appear to fit the data very well, with perhaps a high value of R², but would give very inaccurate forecasts. Another interpretation of this concept, borrowed from physics, is that of the distinction between 'signal' and 'noise'. The idea is to fit a model that captures the signal (the important features of the data, or the underlying trends or patterns) but that does not try to fit a spurious model to the noise (the completely random aspect of the series).
8.7.1 Information criteria for ARMA model selection
Nowadays, the identification stage would typically not be done using graphical plots of the acf and pacf. The reason is that, when 'messy' real data are used, they rarely exhibit the simple patterns of figures 8.1 to 8.7, unfortunately. This makes the acf and pacf very hard to interpret, and thus it is difficult to specify a model for the data. Another technique, which removes some of the subjectivity involved in interpreting the acf and pacf, is to use what are known as information criteria. Information criteria embody two factors: a term that is a function of the residual sum of squares, and some penalty for the loss of degrees of freedom from adding extra parameters. As a consequence, adding a new variable or an additional lag to a model will have two competing effects on the information criteria: the RSS will fall, but the value of the penalty term will increase.
The object is to choose the number of parameters that minimises the value of the information criteria. Thus adding an extra term will reduce the value of the criteria only if the fall in the RSS is sufficient to more than outweigh the increased value of the penalty term. There are several different criteria, which vary according to how stiff the penalty term is. The three most popular information criteria are Akaike's (1974) information criterion (AIC), Schwarz's (1978) Bayesian information criterion (SBIC) and the Hannan–Quinn information criterion (HQIC). Algebraically, these are expressed, respectively, as

AIC = ln(σ̂²) + 2k/T    (8.46)
SBIC = ln(σ̂²) + (k/T) ln T    (8.47)
HQIC = ln(σ̂²) + (2k/T) ln(ln(T))    (8.48)
where σ̂² is the residual variance (also equivalent to the residual sum of squares divided by the number of observations, T), k = p + q + 1 is the total number of parameters estimated and T is the sample size. The information criteria are actually minimised subject to p ≤ p̄, q ≤ q̄ – i.e. an upper limit is specified on the number of moving average (q̄) and/or autoregressive (p̄) terms that will be considered.
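A sketch of this search, assuming statsmodels is available. Note that the criteria are computed here directly from the formulas above; many packages report a likelihood-based scaling instead, so the absolute numbers differ even though the rankings usually agree.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def select_order(y, p_bar=3, q_bar=3):
    """Grid-search (p, q) up to (p_bar, q_bar), computing the chapter's
    AIC, SBIC and HQIC: ln(sigma2_hat) + penalty, with k = p + q + 1."""
    T = len(y)
    table = {}
    for p in range(p_bar + 1):
        for q in range(q_bar + 1):
            resid = ARIMA(y, order=(p, 0, q)).fit().resid
            sigma2 = np.sum(resid ** 2) / T
            k = p + q + 1
            table[(p, q)] = (
                np.log(sigma2) + 2 * k / T,                     # AIC
                np.log(sigma2) + k * np.log(T) / T,             # SBIC
                np.log(sigma2) + 2 * k * np.log(np.log(T)) / T  # HQIC
            )
    return table

# usage: pick the order with the smallest value of the chosen criterion, e.g.
# table = select_order(y); best_aic = min(table, key=lambda pq: table[pq][0])
```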
SBIC embodies a much stiffer penalty term than AIC, while HQIC is somewhere in between. The adjusted R² measure can also be viewed as an information criterion, although it is a very soft one that would typically select the largest models of all. It is worth noting that there are several other possible criteria, but these are less popular and are mainly variants of those described above.
8.7.2 Which criterion should be preferred if they suggest different model orders?
SBIC is strongly consistent, but inefficient, and AIC is not consistent, but is generally more efficient. In other words, SBIC will asymptotically deliver the correct model order, while AIC will deliver on average too large a model, even with an infinite amount of data. On the other hand, the average variation in selected model orders from different samples within a given population will be greater in the context of SBIC than AIC. Overall, then, no criterion is definitely superior to others.
8.7.3 ARIMA modelling
ARIMA modelling, as distinct from ARMA modelling, has the additional letter 'I' in the acronym, standing for 'integrated'. An integrated autoregressive process is one whose characteristic equation has a root on the unit circle. Typically, researchers difference the variable as necessary and then build an ARMA model on those differenced variables. An ARMA(p, q) model in the variable differenced d times is equivalent to an ARIMA(p, d, q) model on the original data (see chapter 12 for further details). For the remainder of this chapter, it is assumed that the data used in model construction are stationary, or have been suitably transformed to make them stationary. Thus only ARMA models are considered further.
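The equivalence is straightforward to verify in code. A sketch, again assuming statsmodels; the random-walk data are placeholders, and the two fits coincide only up to the handling of any deterministic terms (trend="n" suppresses the constant so that the specifications match):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(rng.standard_normal(300))   # placeholder integrated series

dy = np.diff(y, n=1)                                         # difference once (d = 1)
arma_on_diff = ARIMA(dy, order=(1, 0, 1), trend="n").fit()   # ARMA(1,1) on differences
arima_on_lvl = ARIMA(y, order=(1, 1, 1)).fit()               # ARIMA(1,1,1) on levels

print(arma_on_diff.params)
print(arima_on_lvl.params)   # near-identical AR and MA estimates
```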
8.8 Exponential smoothing

Exponential smoothing is another modelling technique that uses only a linear combination of the previous values of a series for modelling it and for generating forecasts of its future values. It can be argued that the most recent observations should be the most useful in helping to forecast future values of a series. If this is accepted, a model that places more weight on recent observations than those further in the past would be desirable. On the other hand, observations a long way in the past may still contain some information useful for forecasting future values of a series, which would not be the case under a centred moving average. An exponential smoothing model will achieve this, by imposing a geometrically declining weighting scheme on the lagged values of a series. The equation for the model is

S_t = αy_t + (1 − α)S_{t−1}    (8.49)

where α is the smoothing constant, with 0 < α < 1, y_t is the current realised value and S_t is the current smoothed value.
Since α + (1 − α) = 1, S_t is modelled as a weighted average of the current observation y_t and the previous smoothed value. The model above can be rewritten to express the exponential weighting scheme more clearly. By lagging (8.49) by one and by two periods, the following expressions are obtained:

S_{t−1} = αy_{t−1} + (1 − α)S_{t−2}    (8.50)
S_{t−2} = αy_{t−2} + (1 − α)S_{t−3}    (8.51)

Substituting into (8.49) for S_{t−1} from (8.50),

S_t = αy_t + (1 − α)(αy_{t−1} + (1 − α)S_{t−2})    (8.52)
S_t = αy_t + (1 − α)αy_{t−1} + (1 − α)²S_{t−2}    (8.53)

Substituting into (8.53) for S_{t−2} from (8.51),
S_t = αy_t + (1 − α)αy_{t−1} + (1 − α)²(αy_{t−2} + (1 − α)S_{t−3})    (8.54)
S_t = αy_t + (1 − α)αy_{t−1} + (1 − α)²αy_{t−2} + (1 − α)³S_{t−3}    (8.55)
T successive substitutions of this kind would lead to

S_t = ( Σ_{i=0}^{T} α(1 − α)^i y_{t−i} ) + (1 − α)^{T+1} S_{t−1−T}    (8.56)

so that, since α > 0, the weight on each observation declines geometrically as it recedes further into the past. The forecasts from an exponential smoothing model are simply set to the current smoothed value, for any number of steps ahead, s:

f_{t,s} = S_t,  s = 1, 2, 3, . . .    (8.57)

The simple model can be extended with additional terms to model the trends (using a unit root process – see chapter 12) and the seasonalities (see later in this chapter) of the form that are typically present in real estate data.
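A minimal sketch of single exponential smoothing as defined in (8.49), with the forecast rule f_{t,s} = S_t; the data and the value of α are illustrative only:

```python
import numpy as np

def exp_smooth(y, alpha):
    """Single exponential smoothing: S_t = alpha*y_t + (1 - alpha)*S_{t-1},
    initialised here at the first observation."""
    s = np.empty(len(y))
    s[0] = y[0]
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1 - alpha) * s[t - 1]
    return s

y = np.array([5.2, 5.0, 4.9, 5.1, 4.8])   # illustrative series
smoothed = exp_smooth(y, alpha=0.3)

# forecasts for every horizon s = 1, 2, 3, ... equal the last smoothed value
print(smoothed)
print(smoothed[-1])
```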
Exponential smoothing has several advantages over the slightly more complex ARMA class of models discussed above. First, exponential smoothing is obviously very simple to use. Second, there is no decision to be made on how many parameters to estimate (assuming only single exponential smoothing is considered). A disadvantage, however, is that single exponential smoothing is equivalent to a restricted special case of the ARIMA family, and so may not be optimal for capturing any linear dependence in the data. Moreover, the forecasts from an exponential smoothing model do not converge on the long-term mean of the variable as the horizon increases. The upshot is that long-term forecasts are overly affected by recent events in the history of the series under investigation and will therefore be suboptimal.
8.9 An ARMA model for cap rates
We apply an ARMA model to the NCREIF appraisal-based cap rates for the 'all real estate' category. The capitalisation (cap) rate refers to the going-in cap rate series (or initial yield) and is the net operating income in the first year over the purchase price. This series is available from 1Q1978 and the last observation in our sample is 4Q2007. We plot the series in figure 8.8.

[Figure 8.8: NCREIF cap rates for all real estate, per cent, 1Q1978–4Q2007]

The cap rate fell steeply from 2001, with the very last observation of the sample indicating a reversal. Cap rates had also shown a downward trend in the 1980s and up to the mid-1990s, but the latest decreasing trend was steeper (apart from a few quarters in 1999 to 2000). Certainly, by the end of 2007, cap rates had reached their lowest level in our sample.
Applying an ARMA model to the original cap rates may be problematic, as the series exhibits low variation and trends are apparent over several years – e.g. a downward trend from 1995. The series is also smoothed and strongly autocorrelated, as the correlogram in figure 8.9 panel (a) demonstrates. Panel (b) shows the partial autocorrelation function.
[Figure 8.9: correlogram of the cap rate series in levels – panel (a) acf, panel (b) pacf]
The autocorrelation coefficients decline very slowly: even at lag 12, the autocorrelation coefficient is still 0.54. The computed Ljung–Box Q∗ statistic with twelve lags takes a value of 600.64 (p-value = 0.00), which is highly significant, confirming the strong autocorrelation pattern. The partial autocorrelation function shows a large peak at lag 1 with a rapid decline thereafter, which is indicative of a highly persistent autoregressive structure in the series.
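For reference, a sketch of how a Q∗ statistic such as the 600.64 above is computed from the sample autocorrelations. This is the standard Ljung–Box formula; we do not reproduce the underlying cap rate data here, so the function would be applied to whatever series is at hand:

```python
import numpy as np
from scipy import stats
from statsmodels.tsa.stattools import acf

def ljung_box_q(y, m=12):
    """Ljung-Box Q* = T(T+2) * sum_{k=1..m} tau_k^2 / (T - k),
    chi-squared with m degrees of freedom under the null of no autocorrelation."""
    T = len(y)
    tau = acf(y, nlags=m)[1:]   # sample autocorrelations, lags 1..m
    q = T * (T + 2) * np.sum(tau**2 / (T - np.arange(1, m + 1)))
    p_value = stats.chi2.sf(q, df=m)
    return q, p_value
```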
The cap rate series in levels does not have the appropriate properties to fit an ARMA model, therefore, and a transformation to first differences is required (see chapter 12, where this issue is discussed in detail). The new series of differences of the cap rate is given in figure 8.10.
The cap rate series in first differences appears to have very different properties from that in levels, and we again compute the acf and pacf for the transformed series, which are shown in figure 8.11.

[Figure 8.11: acf and pacf for cap rates in first differences]

The first-order autocorrelation coefficient is now negative, at −0.30. Both the second- and third-order coefficients are small, indicating that the transformation has made the series much less autocorrelated compared with the levels data. The Ljung–Box statistic using twelve lags is now reduced to 49.42, although it is still significant at the 1 per cent level (p = 0.00). We also observe a seasonal pattern at lags 4, 8 and 12, when the size of the autocorrelation coefficients increases. This is also the case for the pacf. For the moment we ignore this characteristic of the data (the strong autocorrelation at lags 4, 8 and 12), and we proceed to fit an ARMA model to the first differences of the cap rate series. We apply AIC and SBIC to select the model order. Table 8.1 shows different combinations of ARMA specifications and the estimated AIC and SBIC values.

Table 8.1 Selecting the ARMA specification for cap rates
Order of AR, MA terms | AIC | SBIC
[table body not recovered]
Table 8.2 Estimation of ARMA(3,3)
[table body not recovered]

[Figure 8.12: Actual and fitted values for cap rates in first differences]
Interestingly, both AIC and SBIC select an ARMA(3,3). Despite the fact that AIC often tends to select higher-order ARMAs, in our example there is a consensus across the two criteria. The estimated ARMA(3,3) is presented in table 8.2.

All the AR and MA terms are highly significant at the 1 per cent level. This ARMA model explains approximately 31 per cent of the changes in the cap rate. This is a satisfactory performance if we consider the quarterly volatility of the changes in the cap rate. Figure 8.12 illustrates this volatility and gives the actual and fitted values.
The fitted series exhibits some volatility, which tends to match that of the actual series in the 1980s. The two spikes in 1Q2000 and 3Q2001
Table 8.3 Actual and forecast cap rates (forecast periods 1Q06–4Q06 and 1Q07–4Q07)
[table body not recovered]
The forecast performance of the ARMA(3,3) is examined next. There are, of course, different ways to evaluate the model's forecasts, as we outline in chapter 9. The application of ARMA models in economics and finance suggests that they are good predictors in the short run. We use the ARMA model in our example to produce two sets of four-quarter forecasts. We obtain the forecasts from this model for the four quarters of 2006 and the next four quarters (that is, for 2006 and for 2007). In the first case we estimate the full-sample specification up to 4Q2005 and we generate forecasts for 1Q2006 to 4Q2006. We then repeat the analysis for the next four-quarter period – i.e. we estimate the ARMA to 4Q2006 and we produce forecasts for the period 1Q2007 to 4Q2007. From the ARMA model we obtain forecasts for the first differences in the cap rate, which we then use to obtain the forecast for the actual level of the cap rates. Table 8.3 summarises the forecasts and figure 8.13 plots them.
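The final step, turning forecasts of the first differences back into cap rate levels, is just a cumulative sum added to the last observed level. A sketch with purely illustrative numbers (these are not the values in table 8.3):

```python
import numpy as np

last_level = 6.50                                        # illustrative cap rate (%) at 4Q2006
diff_forecasts = np.array([-0.10, -0.05, -0.08, 0.02])   # illustrative forecasts of changes

# undo the differencing: cumulate the forecast changes onto the last level
level_forecasts = last_level + np.cumsum(diff_forecasts)
print(level_forecasts)                                   # forecast levels for 1Q2007-4Q2007
```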
Before discussing the forecasts, it is worth noting that all the terms in the ARMA(3,3) over the two estimation periods retain their statistical significance at the 1 per cent level. In the first three quarters of 2007 cap rates fell by over forty bps (figure 8.13, panel (a)). The ARMA model produces a