Table 9.15 Mean forecast errors for the changes in rents series

Steps ahead             1        2        3        4        5        6        7        8

(a) LaSalle Investment Management rents series
VAR(1)             −1.141   −2.844   −3.908   −4.729   −5.407   −5.912   −6.158   −6.586
VAR(2)             −0.799   −1.556   −2.652   −3.388   −4.155   −4.663   −4.895   −5.505
AR(2)              −0.595   −0.960   −1.310   −1.563   −1.720   −1.819   −1.748   −1.876
Long-term mean     −2.398   −3.137   −3.843   −4.573   −5.093   −5.520   −5.677   −6.049
Random walk         0.466   −0.246   −0.923   −1.625   −2.113   −2.505   −2.624   −2.955

(b) CB Hillier Parker rents series
VAR(1)             −1.447   −3.584   −5.458   −7.031   −8.445   −9.902  −11.146  −12.657
AR(2)              −1.845   −2.548   −2.534   −1.979   −1.642   −1.425   −1.204   −1.239
Long-term mean     −3.725   −5.000   −6.036   −6.728   −7.280   −7.772   −8.050   −8.481
Random walk         1.126   −0.108   −1.102   −1.748   −2.254   −2.696   −2.920   −3.292
forecast is made in 1Q97 for the period 2Q97 to 1Q99). In this way, forty-four one-quarter forecasts, forty-four two-quarter forecasts, and so forth are calculated.

The forty-four one-quarter forecasts are compared with the realised data for each of the four methodologies. This is repeated for the two-quarter-, three-quarter-, ..., and eight-quarter-ahead computed values. This comparison reveals how closely rent predictions track the corresponding historical rent changes over the different lengths of the forecast horizon (one to eight quarters). The mean forecast error, the mean squared forecast error and the percentage of correct sign predictions are the criteria employed to select the best performing models.
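The three criteria can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the four data points are invented here, and the forecast error is taken to be the realised value minus the forecast, which is the convention under which a negative mean error corresponds to over-prediction.

```python
def forecast_errors(actual, forecast):
    """Forecast errors, defined as realised value minus forecast."""
    return [a - f for a, f in zip(actual, forecast)]


def mean_forecast_error(actual, forecast):
    errors = forecast_errors(actual, forecast)
    return sum(errors) / len(errors)


def mean_squared_forecast_error(actual, forecast):
    errors = forecast_errors(actual, forecast)
    return sum(e * e for e in errors) / len(errors)


def pct_correct_sign(actual, forecast):
    """Percentage of forecasts whose sign matches the realised change."""
    hits = sum(1 for a, f in zip(actual, forecast) if a * f > 0)
    return 100.0 * hits / len(actual)


# Hypothetical rent changes and forecasts, invented purely for illustration.
actual = [1.0, -2.0, 0.5, -1.5]
forecast = [1.5, -1.0, -0.5, -1.0]

print(mean_forecast_error(actual, forecast))   # negative => over-prediction on average
print(mean_squared_forecast_error(actual, forecast))
print(pct_correct_sign(actual, forecast))
```

A forecast of exactly zero never counts as a correct sign under this definition, which is why a random walk in levels cannot be scored on this criterion.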
Ex ante forecasts of retail rents based on all methods are also made for eight quarters from the last available observation at the time that the study was written. Forecasts of real retail rents are therefore made for the periods 1999 quarter two to 2001 quarter one. An evaluation of the forecasts obtained from the different methodologies is presented in tables 9.15 to 9.17. Table 9.15 reports the MFE.
As noted earlier, a good forecasting model should have a mean forecasting error of zero. The first observation that can be made is that, on average, all mean errors are negative for all models and forecast horizons. This means that all models over-predict, except for the one-quarter-ahead CBHP forecast using the random walk. This bias could reflect non-economic influences during the forecast period. The continuous fall in rents in the period 1990 to 1995, which constitutes much of the out-of-sample period, may to some extent explain this over-prediction, however. Reasons that the authors put forward include the contention that supply increases had greater effects during this period, when retailers were struggling, than in the overall sample period, and the fact that retailers benefited less than the growth in GDP at that time suggested, as people were indebted and seeking to save more to reduce indebtedness.

Table 9.16 Mean squared forecast errors for the changes in rents series

VAR(1)              78.69   117.28   170.41   236.70   360.34   467.90   658.41   867.72
Long-term mean     209.55   163.42   139.88   137.20   139.98   143.91   150.20   154.84
Random walk        198.16   132.86   123.71   149.78   132.94   148.79   149.62   158.13
Of the two VAR models used for LIM rents, the VAR(2) model – i.e. a VAR with a lag length of two – produces more accurate forecasts. This is not surprising, given that the VAR(1) model of changes in LIM rents is a poor performer compared with the VAR(2) model. The forecasts produced by the random walk model appear to be the most successful when forecasts up to three quarters ahead are considered, however. Then the AR model becomes the best performer. The same conclusion can be reached for CBHP rents, but here the random walk model is superior to the AR(2) model for the first four quarter-ahead forecasts.
Table 9.16 shows the results based on the MSFE, an overall accuracy measure. The computations of the MSFE for all eight time horizons in the CBHP case show that the AR(2) model has the smallest MSFEs. The VAR model appears to be the second-best-performing methodology when forecasts up to two quarters ahead are considered, but, as the forecast time horizon lengthens, the performance of the VAR deteriorates. In the case of LIM retail rents, the VAR(2) model performs best up to four quarters ahead, but when longer-term forecasts are considered the AR process appears to generate the most accurate forecasts. Overall, the long-term mean procedure outperforms the random walk model in the first two quarters of the forecast period for both series, but this is reversed when the forecast period extends beyond four quarters. Therefore, based on the MSFE criterion, the VAR(2) is the most appropriate model to forecast changes in LIM rents up to four quarters, but then the AR(2) model performs better. This criterion also suggests that changes in CBHP rents are best forecast using a pure autoregressive model across all forecasting horizons.

Table 9.17 Percentage of correct sign predictions for the changes in rents series

Note: The random walk in levels model cannot, by definition, produce sign predictions, since the predicted change is always zero.
Table 9.17 displays the percentage of correct predictions of the sign for changes in rent from each model for forecasts up to eight periods ahead. While the VAR model’s performance can almost match that of the AR specification for the shortest horizon, the latter model dominates as the models forecast further into the future. From these results, the authors conclude that rent changes have substantial memory for (at least) two periods. Hence useful information for predicting rents is contained in their own lags. The predictive capacity of the other aggregates within the VAR model is limited. There is some predictive ability for one period, but it quickly disappears thereafter. Overall, then, the autoregressive approach is to be preferred.
Key concepts

The key terms to be able to define and explain from this chapter are

● root mean squared error
● Theil’s U1 statistic
● bias, variance and covariance proportions
● Theil’s U2 statistic
● out-of-sample forecasts
● forecast encompassing
Multi-equation structural models

Learning outcomes

In this chapter, you will learn how to

● compare and contrast single-equation and systems-based approaches to building models;
● discuss the cause, consequence and solution to simultaneous equations bias;
● derive the reduced-form equations from a structural model;
● describe and apply several methods for estimating simultaneous equations models; and
● conduct a test for exogeneity.
All the structural models we have considered thus far are single-equation models of the general form

$$y = X\beta + u \tag{10.1}$$
In chapter 7, we constructed a single-equation model for rents. The rent equation could instead be one of several equations in a more general model built to describe the market, however. In the context of figure 7.1, one could specify four equations – for demand (absorption or take-up), vacancy, rent and construction. Rent variation is then explained within this system of equations. Multi-equation models represent alternative and competitive methodologies to single-equation specifications, which have been the main empirical frameworks in existing studies and in practice. It should be noted that, even if single equations fit the historical data very well, they can still be combined to construct multi-equation models when theory suggests that causal relationships should be bidirectional or multidirectional. Such systems are also used by private practices even though their performance may be poorer. This is because the dynamic structure of a multi-equation system may affect the ability of an individual equation to reproduce the properties of an historical series. Multi-equation systems are frameworks of importance to real estate forecasters.
Multi-equation frameworks usually take the form of simultaneous-equation structures. These simultaneous models come with particular conditions that need to be satisfied for their estimation and, in general, their treatment and estimation require the study of specific econometric issues. There is also another family of models that, although they resemble simultaneous-equations models, are actually not. These models, which are termed recursive or triangular systems, are also commonly encountered in the real estate field.

This chapter has four objectives. First, to explain the nature of simultaneous-equations models and to study the conditions that need to be fulfilled for their estimation. Second, to describe the available estimation techniques for these models. Third, to draw a distinction between simultaneous and recursive multi-equation models. Fourth, to illustrate the estimation of a systems model.

10.1 Simultaneous-equation models
Systems of equations constitute one of the important circumstances under which the assumption of non-stochastic explanatory variables can be violated. Remember that this is one of the assumptions of the classical linear regression model. There are various ways of stating this condition, differing slightly in terms of strictness, but they all have the same broad implication. It can also be stated that all the variables contained in the X matrix are assumed to be exogenous – that is, their values are determined outside the equation. This is a rather simplistic working definition of exogeneity, although several alternatives are possible; this issue is revisited later in this chapter. Another way to state this is that the model is ‘conditioned on’ the variables in X, or that the variables in the X matrix are assumed not to have a probability distribution. Note also that causality in this model runs from X to y, and not vice versa – i.e. changes in the values of the explanatory variables cause changes in the values of y, but changes in the value of y will not impact upon the explanatory variables. On the other hand, y is an endogenous variable – that is, its value is determined by (10.1).
To illustrate a situation in which this assumption is not satisfied, consider the following equations, which describe a possible model for the demand and supply of new office space in a metropolitan area:

$$Q_{dt} = \alpha + \beta R_t + \gamma EMP_t + u_t \tag{10.2}$$

$$Q_{st} = \lambda + \mu R_t + \kappa INT_t + v_t \tag{10.3}$$

$$Q_{dt} = Q_{st} \tag{10.4}$$

where $Q_{dt}$ = quantity of new office space demanded at time t, $Q_{st}$ = quantity of new office space supplied (newly completed) at time t, $R_t$ = rent level prevailing at time t, $EMP_t$ = office-using employment at time t, $INT_t$ = interest rate at time t, and $u_t$ and $v_t$ are the error terms.

Equation (10.2) is an equation for modelling the demand for new office space, and (10.3) is a specification for the supply of new office space. (10.4) is an equilibrium condition for there to be no excess demand (firms requiring more new space to let but unable to obtain it) and no excess supply (empty office space due to lack of demand for a given structural vacancy rate in the market).¹ Assuming that the market always clears – that is, that the market is always in equilibrium – (10.2) to (10.4) can be written

$$Q_t = \alpha + \beta R_t + \gamma EMP_t + u_t \tag{10.5}$$

$$Q_t = \lambda + \mu R_t + \kappa INT_t + v_t \tag{10.6}$$
Equations (10.5) and (10.6) together comprise a simultaneous structural form of the model, or a set of structural equations. These are the equations incorporating the variables that real estate theory suggests should be related to one another in a relationship of this form. The researcher may, of course, adopt different specifications that are consistent with theory, but any structure that resembles equations (10.5) and (10.6) represents a simultaneous multi-equation model. The point to emphasise here is that price and quantity are determined simultaneously: rent affects the quantity of office space and office space affects rent. Thus, in order to construct and rent more office space, everything else equal, the developers will have to lower the price. Equally, in order to achieve higher rents per square metre, developers need to construct and place in the market less floor space. R and Q are endogenous variables, while EMP and INT are exogenous.
1 Of course, one could argue here that such contemporaneous relationships are unrealistic. For example, interest rates will have affected supply in the past when developers were making plans for development. This is true, although on several occasions the contemporaneous term appears more important even if theory supports a lag structure. To an extent, this owes to the linkages of economic and monetary data in successive periods. Hence the current interest rate gives an idea of the interest rate in the recent past. For the sake of illustrating simultaneous-equations models, however, let us assume the presence of relationships such as (10.2) and (10.3).
A set of reduced-form equations corresponding to (10.5) and (10.6) can be obtained by solving (10.5) and (10.6) for R and Q separately. There will be a reduced-form equation for each endogenous variable in the system, which will contain only exogenous variables.

Multiplying (10.8) through by βµ and rearranging,

$$\mu Q_t - \mu\alpha - \mu\gamma EMP_t - \mu u_t = \beta Q_t - \beta\lambda - \beta\kappa INT_t - \beta v_t \tag{10.12}$$

$$\mu Q_t - \beta Q_t = \mu\alpha - \beta\lambda - \beta\kappa INT_t + \mu\gamma EMP_t + \mu u_t - \beta v_t \tag{10.13}$$

$$(\mu - \beta) Q_t = (\mu\alpha - \beta\lambda) - \beta\kappa INT_t + \mu\gamma EMP_t + (\mu u_t - \beta v_t) \tag{10.14}$$
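The steps surrounding (10.12) to (10.14) can be sketched as follows. This is a reconstruction under a stated assumption, namely that the two structural equations are the demand and supply relations in $R_t$, $EMP_t$ and $INT_t$ implied term by term by (10.12); the second line below is then the relation that is multiplied through by βµ, and the last two lines are the reduced forms for $Q_t$ and $R_t$.

```latex
\begin{align*}
% Assumed structural form, consistent with (10.12)-(10.14)
Q_t &= \alpha + \beta R_t + \gamma EMP_t + u_t,
\qquad
Q_t = \lambda + \mu R_t + \kappa INT_t + v_t \\[4pt]
% Solve each equation for R_t and equate the two expressions
\frac{Q_t}{\beta} - \frac{\alpha}{\beta} - \frac{\gamma EMP_t}{\beta} - \frac{u_t}{\beta}
  &= \frac{Q_t}{\mu} - \frac{\lambda}{\mu} - \frac{\kappa INT_t}{\mu} - \frac{v_t}{\mu} \\[4pt]
% Divide (10.14) by (\mu - \beta): reduced form for Q_t
Q_t &= \frac{\mu\alpha - \beta\lambda}{\mu - \beta}
     - \frac{\beta\kappa}{\mu - \beta} INT_t
     + \frac{\mu\gamma}{\mu - \beta} EMP_t
     + \frac{\mu u_t - \beta v_t}{\mu - \beta} \\[4pt]
% Equate the two right-hand sides of the structural form and collect R_t
(\beta - \mu) R_t &= (\lambda - \alpha) + \kappa INT_t - \gamma EMP_t + (v_t - u_t) \\[4pt]
R_t &= \frac{\lambda - \alpha}{\beta - \mu}
     + \frac{\kappa}{\beta - \mu} INT_t
     - \frac{\gamma}{\beta - \mu} EMP_t
     + \frac{v_t - u_t}{\beta - \mu}
\end{align*}
```

Both reduced forms contain only the exogenous variables and the structural errors on the right-hand side, and both require β ≠ µ.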
10.2 Simultaneous equations bias
It would not be possible to estimate (10.5) and (10.6) validly using OLS, as they are related to one another because they both contain R and Q, and OLS would require them to be estimated separately. What would have happened, however, if a researcher had estimated them separately using OLS? Both equations depend on R. One of the CLRM assumptions was that X and u are independent (when X is a matrix containing all the variables on the RHS of the equation), and, given the additional assumption that E(u) = 0, then E(X′u) = 0 (i.e. the errors are uncorrelated with the explanatory variables). It is clear from (10.11), however, that R is related to the errors in (10.5) and (10.6) – i.e. it is stochastic. This assumption has therefore been violated.
What would the consequences be for the OLS estimator, $\hat{\beta}$, if the simultaneity were ignored? Recall that

$$\hat{\beta} = (X'X)^{-1}X'y$$

and that $y = X\beta + u$, so that, substituting for $y$ and taking expectations,

$$E(\hat{\beta}) = \beta + E[(X'X)^{-1}X'u] \tag{10.22}$$

If the Xs are non-stochastic, $E[(X'X)^{-1}X'u] = (X'X)^{-1}X'E[u] = 0$, which would be the case in a single-equation system, so that $E(\hat{\beta}) = \beta$ in (10.22). The implication is that the OLS estimator, $\hat{\beta}$, would be unbiased.

If the equation is part of a system, however, then $E[(X'X)^{-1}X'u] \neq 0$, in general, so the last term in (10.22) will not drop out, and it can therefore be concluded that the application of OLS to structural equations that are part of a simultaneous system will lead to biased coefficient estimates. This is known as simultaneity bias or simultaneous equations bias.

Is the OLS estimator still consistent, even though it is biased? No, in fact, the estimator is inconsistent as well, so that the coefficient estimates would still be biased even if an infinite amount of data were available, although proving this would require a level of algebra beyond the scope of this book.
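A small simulation can make the inconsistency tangible. The sketch below is illustrative only: it uses a simplified demand equation without the employment term, and every parameter value, the function names and the sample size are assumptions chosen for the example. With these values Cov(R, u)/Var(R) = 0.5/1.5, so the OLS slope should settle near −0.67 rather than the true demand slope of −1, however large the sample.

```python
import random


def simulate_market(n, seed=0):
    """Simulate n market-clearing observations (Q, R, INT) from a toy system."""
    rng = random.Random(seed)
    alpha, beta = 10.0, -1.0         # demand: Q = alpha + beta*R + u
    lam, mu, kappa = 2.0, 1.0, -2.0  # supply: Q = lam + mu*R + kappa*INT + v
    q, r, intr = [], [], []
    for _ in range(n):
        i, u, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Rent that equates demand and supply (the reduced form for R).
        rent = ((lam - alpha) + kappa * i + (v - u)) / (beta - mu)
        q.append(alpha + beta * rent + u)
        r.append(rent)
        intr.append(i)
    return q, r, intr


def cov(x, y):
    """Sample covariance (divisor n) of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


q, r, intr = simulate_market(20000)
beta_ols = cov(q, r) / cov(r, r)  # OLS slope of Q on R, ignoring simultaneity
print(round(beta_ols, 3))         # compare with the true demand slope of -1
```

Raising the sample size tightens the estimate around the wrong value, which is what inconsistency means in practice.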
10.3 How can simultaneous-equation models be estimated?
Taking (10.11) and (10.15) – i.e. the reduced-form equations – they can be rewritten as

$$R_t = \pi_{10} + \pi_{11} INT_t + \pi_{12} EMP_t + \varepsilon_{1t} \tag{10.23}$$

$$Q_t = \pi_{20} + \pi_{21} INT_t + \pi_{22} EMP_t + \varepsilon_{2t} \tag{10.24}$$

where the π coefficients in the reduced form are simply combinations of the original coefficients.

OLS can be applied validly to (10.23) and (10.24), since all their RHS variables are exogenous (provided that there are no other misspecifications). Estimates of the $\pi_{ij}$ coefficients will thus be obtained. The values of the π coefficients are probably not of much interest, however; what we wanted were the original parameters in the structural equations – α, β, γ, λ, µ and κ. The latter are the parameters whose values determine how the variables are related to one another according to economic and real estate theory.
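Written out, the combinations take the following form. This is a sketch that assumes the structural equations are $Q_t = \alpha + \beta R_t + \gamma EMP_t + u_t$ and $Q_t = \lambda + \mu R_t + \kappa INT_t + v_t$, as implied by (10.12) to (10.14); the inversion in the last two lines is specific to that assumed system.

```latex
\begin{align*}
\pi_{10} &= \frac{\lambda - \alpha}{\beta - \mu}, &
\pi_{11} &= \frac{\kappa}{\beta - \mu}, &
\pi_{12} &= \frac{-\gamma}{\beta - \mu}, \\[4pt]
\pi_{20} &= \frac{\mu\alpha - \beta\lambda}{\mu - \beta}, &
\pi_{21} &= \frac{-\beta\kappa}{\mu - \beta}, &
\pi_{22} &= \frac{\mu\gamma}{\mu - \beta}, \\[4pt]
% Working back from the reduced form to the structural parameters
\beta &= \frac{\pi_{21}}{\pi_{11}}, &
\mu &= \frac{\pi_{22}}{\pi_{12}}, &
\kappa &= \pi_{11}(\beta - \mu), \\[4pt]
\gamma &= -\pi_{12}(\beta - \mu), &
\alpha &= \pi_{20} - \beta\pi_{10}, &
\lambda &= \pi_{20} - \mu\pi_{10}
\end{align*}
```

Six reduced-form coefficients map into six structural parameters here, which is why a unique solution exists in this case.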
10.4 Can the original coefficients be retrieved from the π s?
The short answer to this question is ‘Sometimes’, depending upon whether the equations are identified. Identification is the issue of whether there is enough information in the reduced-form equations to enable the structural-form coefficients to be calculated. Consider the following demand and supply equations:

$$Q_t = \alpha + \beta R_t \tag{10.25}$$

$$Q_t = \lambda + \mu R_t \tag{10.26}$$

It is impossible to say which equation is which, so, if a real estate analyst simply observed some space rented and the price at which it was rented, it would not be possible to obtain the estimates of α, β, λ and µ. This arises because there is insufficient information from the equations to estimate four parameters. Only two parameters can be estimated here, although each would be some combination of demand and supply parameters, and so neither would be of any use. In this case, it would be stated that both equations are unidentified (or not identified or under-identified). Notice that this problem would not have arisen with (10.5) and (10.6), since they have different exogenous variables.
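The counting argument can be made explicit with a short derivation. It is a sketch that assumes the two equations are a demand and a supply relation in $Q_t$ and $R_t$ only, with error terms $u_t$ and $v_t$ appended here for illustration.

```latex
\begin{align*}
% Assumed structural form: no exogenous variables in either equation
Q_t &= \alpha + \beta R_t + u_t, \qquad Q_t = \lambda + \mu R_t + v_t \\[4pt]
% Reduced forms: each is a constant plus a composite error
R_t &= \frac{\lambda - \alpha}{\beta - \mu} + \frac{v_t - u_t}{\beta - \mu}, \qquad
Q_t = \frac{\mu\alpha - \beta\lambda}{\mu - \beta} + \frac{\mu u_t - \beta v_t}{\mu - \beta}
\end{align*}
```

The data can pin down only the two constants, which are in effect the means of $R_t$ and $Q_t$, and two numbers cannot be solved for the four unknowns α, β, λ and µ.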
10.4.1 What determines whether an equation is identified or not?
Any one of three possible situations could arise, as shown in box 10.1.

Box 10.1 Determining whether an equation is identified

(1) An equation such as (10.25) or (10.26) is unidentified. In the case of an unidentified equation, structural coefficients cannot be obtained from the reduced-form estimates by any means.
(2) An equation such as (10.5) or (10.6) is exactly identified (just identified). In the case of a just identified equation, unique structural-form coefficient estimates can be obtained by substitution from the reduced-form equations.
(3) If an equation is over-identified, more than one set of structural coefficients can be obtained from the reduced form. An example of this is presented later in this chapter.
How can it be determined whether an equation is identified or not? Broadly, the answer to this question depends upon how many and which variables are present in each structural equation. There are two conditions that can be examined to determine whether a given equation from a system is identified – the order condition and the rank condition.

● The order condition is a necessary but not sufficient condition for an equation to be identified. That is, even if the order condition is satisfied, the equation might still not be identified.
● The rank condition is a necessary and sufficient condition for identification. The structural equations are specified in a matrix form and the rank of a coefficient matrix of all the variables excluded from a particular equation is examined. An examination of the rank condition requires some technical algebra beyond the scope of this text.

Even though the order condition is not sufficient to ensure the identification of an equation from a system, the rank condition is not considered further here. For relatively simple systems of equations, the two rules would lead to the same conclusions. In addition, most systems of equations in economics and real estate are in fact over-identified, with the result that under-identification is not a big issue in practice.

10.4.2 Statement of the order condition
There are a number of different ways of stating the order condition; that employed here is an intuitive one (taken from Ramanathan, 1995, p. 666, and slightly modified):

Let G denote the number of structural equations. An equation is just identified if the number of variables excluded from an equation is G − 1, where ‘excluded’ means the number of all endogenous and exogenous variables that are not present in this particular equation. If more than G − 1 are absent, it is over-identified. If less than G − 1 are absent, it is not identified.
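The counting rule is mechanical enough to sketch in code. The function below is a hypothetical helper, not part of any library; each equation is described simply by the list of variables that appear in it, and the two systems are the demand and supply pairs discussed earlier, with their variable lists entered by hand as assumptions.

```python
def order_condition(equations):
    """Classify each equation by counting the variables it excludes.

    `equations` maps an equation label to the list of variables (endogenous
    and exogenous) that appear in it. The order condition is necessary but
    not sufficient, so the labels are indicative only.
    """
    all_vars = set().union(*equations.values())
    g = len(equations)  # number of structural equations
    result = {}
    for label, included in equations.items():
        excluded = len(all_vars - set(included))
        if excluded == g - 1:
            result[label] = "just identified"
        elif excluded > g - 1:
            result[label] = "over-identified"
        else:
            result[label] = "not identified"
    return result


# Demand and supply with one distinct exogenous variable each.
with_exog = {"demand": ["Q", "R", "EMP"], "supply": ["Q", "R", "INT"]}
# Demand and supply containing only quantity and rent.
without_exog = {"demand": ["Q", "R"], "supply": ["Q", "R"]}

print(order_condition(with_exog))
print(order_condition(without_exog))
```

The first system excludes exactly one variable from each equation (G − 1 = 1), while the second excludes none.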
One obvious implication of this rule is that equations in a system can have differing degrees of identification, as illustrated by the following example.

Example 10.1 Determining whether equations are identified

Let us determine whether each equation is over-identified, under-identified or just identified in the following system of equations.

$$ABS_t = \alpha_0 + \alpha_1 R_t + \alpha_2 Q_{st} + \alpha_3 EMP_t + \alpha_4 USG_t + u_{1t} \tag{10.27}$$

$$R_t = \beta_0 + \beta_1 Q_{st} + \beta_2 EMP_t + u_{2t} \tag{10.28}$$

$$Q_{st} = \gamma_0 + \gamma_1 R_t + u_{3t} \tag{10.29}$$

where $ABS_t$ = quantity of office space absorbed at time t, $R_t$ = rent level prevailing at time t, $Q_{st}$ = quantity of new office space supplied at time t, $EMP_t$ = office-using employment at time t, $USG_t$ = the usage ratio (that is, a measure of the square metres per employee) at time t, and $u_{1t}$, $u_{2t}$ and $u_{3t}$ are the error terms at time t.

In this case, there are G = 3 equations and three endogenous variables (Q, ABS and R). EMP and USG are exogenous, so we have five variables in total. According to the order condition, if the number of excluded variables is exactly two, the equation is just identified. If the number of excluded variables is more than two, the equation is over-identified. If the number of excluded variables is fewer than two, the equation is not identified.

Applying the order condition to (10.27) to (10.29) produces the following results.

● Equation (10.27): contains all the variables, with none excluded, so it is not identified.
● Equation (10.28): two variables (ABS and USG) are excluded, and so it is just identified.
● Equation (10.29): three variables (ABS, EMP and USG) are excluded, and so it is over-identified.

It is possible to classify two forms of exogeneity: predeterminedness and strict exogeneity.
● A predetermined variable is one that is independent of all contemporaneous and future errors in that equation.
● A strictly exogenous variable is one that is independent of all contemporaneous, future and past errors in that equation.
10.5.1 Tests for exogeneity
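Consider again (10.27) to (10.29). Equation (10.27) contains R and Q – but are separate equations required for them, or could the variables R and Q be treated as exogenous? This can be formally investigated using a Hausman (1978) test, which is calculated as shown below.

(1) Obtain the reduced-form equations corresponding to (10.27) to (10.29). Substituting in (10.28) for $Q_{st}$ from (10.29) and solving for $R_t$,

$$R_t = \frac{\beta_0 + \beta_1\gamma_0}{1 - \beta_1\gamma_1} + \frac{\beta_2 EMP_t}{1 - \beta_1\gamma_1} + \frac{u_{2t} + \beta_1 u_{3t}}{1 - \beta_1\gamma_1} \tag{10.33}$$

(10.33) is the reduced-form equation for $R_t$, since there are no endogenous variables on the RHS. Substituting in (10.27) for $Q_{st}$ from (10.29), and then for $R_t$ from (10.33),

$$ABS_t = \alpha_0 + \alpha_2\gamma_0 + \frac{(\alpha_1 + \alpha_2\gamma_1)(\beta_0 + \beta_1\gamma_0)}{1 - \beta_1\gamma_1} + \left(\frac{(\alpha_1 + \alpha_2\gamma_1)\beta_2}{1 - \beta_1\gamma_1} + \alpha_3\right) EMP_t + \alpha_4 USG_t + \frac{(\alpha_1 + \alpha_2\gamma_1)(u_{2t} + \beta_1 u_{3t})}{1 - \beta_1\gamma_1} + (u_{1t} + \alpha_2 u_{3t}) \tag{10.39}$$

(10.39) is the reduced-form equation for $ABS_t$. Finally, to obtain the reduced-form equation for $Q_{st}$, substitute in (10.29) for $R_t$ from (10.33):

$$Q_{st} = \gamma_0 + \frac{\gamma_1(\beta_0 + \beta_1\gamma_0)}{1 - \beta_1\gamma_1} + \frac{\gamma_1\beta_2 EMP_t}{1 - \beta_1\gamma_1} + \frac{\gamma_1(u_{2t} + \beta_1 u_{3t})}{1 - \beta_1\gamma_1} + u_{3t} \tag{10.40}$$

Thus the reduced-form equations corresponding to (10.27) to (10.29) are, respectively, given by (10.39), (10.33) and (10.40). These three equations can also be expressed using $\pi_{ij}$ for the coefficients, as discussed above:

$$ABS_t = \pi_{10} + \pi_{11} EMP_t + \pi_{12} USG_t + v_1 \tag{10.41}$$

$$R_t = \pi_{20} + \pi_{21} EMP_t + v_2 \tag{10.42}$$

$$Q_{st} = \pi_{30} + \pi_{31} EMP_t + v_3 \tag{10.43}$$

Estimate the reduced-form equations (10.41) to (10.43) using OLS, and obtain the fitted values, $\widehat{ABS}^1_t$, $\hat{R}^1_t$, $\hat{Q}^1_{st}$, where the superfluous superscript 1 denotes the fitted values from the reduced-form equations.

(2) Run the regression corresponding to (10.27) – i.e. the structural-form equation – at this stage ignoring any possible simultaneity.

(3) Run the regression (10.27) again, but now also including the fitted values from the reduced-form equations, $\hat{R}^1_t$ and $\hat{Q}^1_{st}$, as additional regressors:

$$ABS_t = \alpha_0 + \alpha_1 R_t + \alpha_2 Q_{st} + \alpha_3 EMP_t + \alpha_4 USG_t + \lambda_2 \hat{R}^1_t + \lambda_3 \hat{Q}^1_{st} + \varepsilon_t$$

(4) Use an F-test to test the joint restriction that $\lambda_2 = 0$ and $\lambda_3 = 0$. If the null hypothesis is rejected, $R_t$ and $Q_{st}$ should be treated as endogenous. If $\lambda_2$ and $\lambda_3$ are significantly different from zero, there is extra important information for modelling $ABS_t$ from the reduced-form equations. On the other hand, if the null is not rejected, $R_t$ and $Q_{st}$ can be treated as exogenous for $ABS_t$, and there is no useful additional information available for $ABS_t$ from modelling $R_t$ and $Q_{st}$ as endogenous variables.

Steps 2 to 4 would then be repeated for (10.28) and (10.29).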
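The four steps can be sketched on simulated data. The example below is an illustration under explicit assumptions rather than a replication of the system above: it uses a simpler two-equation demand and supply model (chosen so that the fitted value is not an exact linear function of the regressors already in the structural equation), invented parameter values, a single suspect variable and therefore a single restriction, and a small hand-written OLS routine so that no external library is needed.

```python
import random


def ols(X, y):
    """OLS coefficients and residual sum of squares via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [a / pivot for a in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return beta, rss


def simulate(n, seed=0):
    """Toy system (assumed values): demand Q = 1 - R + 0.5*EMP + u,
    supply Q = 0.5 + R - 2*INT + v."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        emp, intr = rng.gauss(0, 1), rng.gauss(0, 1)
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        r = ((0.5 - 1.0) - 2.0 * intr - 0.5 * emp + (v - u)) / (-1.0 - 1.0)
        q = 1.0 - 1.0 * r + 0.5 * emp + u
        rows.append((q, r, emp, intr))
    return rows


rows = simulate(2000)
q = [row[0] for row in rows]
r = [row[1] for row in rows]

# Step 1: reduced form for R on all exogenous variables; keep fitted values.
Z = [[1.0, emp, intr] for _, _, emp, intr in rows]
pi, _ = ols(Z, r)
r_hat = [sum(p * z for p, z in zip(pi, zrow)) for zrow in Z]

# Step 2: structural demand equation, ignoring any simultaneity.
X_restricted = [[1.0, ri, emp] for _, ri, emp, _ in rows]
_, rss_r = ols(X_restricted, q)

# Step 3: the same regression with the fitted values added.
X_unrestricted = [x + [rh] for x, rh in zip(X_restricted, r_hat)]
coef_u, rss_u = ols(X_unrestricted, q)

# Step 4: F-statistic for the single restriction that the added coefficient is zero.
n, k = len(q), len(X_unrestricted[0])
f_stat = (rss_r - rss_u) / (rss_u / (n - k))
print(round(f_stat, 1))  # compare with the 5% critical value of about 3.84
```

Because rent is endogenous by construction in this simulation, the statistic should lie far above the critical value; with more than one suspect variable, the numerator would be divided by the number of restrictions.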
Trang 1510.6 Estimation procedures for simultaneous equations systems
Each equation that is part of a recursive system (see section 10.8 below) can be estimated separately using OLS. In practice, though, not all systems of equations will be recursive, so a direct way to address the estimation of equations that are from a true simultaneous system must be sought. In fact, there are potentially many methods that can be used, three of which – indirect least squares (ILS), two-stage least squares (2SLS or TSLS) and instrumental variables – are detailed here. Each of these is discussed below.
10.6.1 Indirect least squares
Although it is not possible to use OLS directly on the structural equations, it is possible to apply OLS validly to the reduced-form equations. If the system is just identified, ILS involves estimating the reduced-form equations using OLS, and then using them to substitute back to obtain the structural parameters. ILS is intuitive to understand in principle, but it is not widely applied, for the following reasons.

(1) Solving back to get the structural parameters can be tedious. For a large system, the equations may be set up in a matrix form, and to solve them may therefore require the inversion of a large matrix.
(2) Most simultaneous equations systems are over-identified, and ILS can be used to obtain coefficients only for just identified equations. For over-identified systems, ILS would not yield unique structural form estimates.
ILS estimators are consistent and asymptotically efficient, but in general they are biased, so that in finite samples ILS will deliver biased structural-form estimates. In a nutshell, the bias arises from the fact that the structural-form coefficients under ILS estimation are transformations of the reduced-form coefficients. When expectations are taken to test for unbiasedness, it is, in general, not the case that the expected value of a (non-linear) combination of reduced-form coefficients will be equal to the combination of their expected values (see Gujarati, 2009, for a proof).
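A minimal ILS sketch follows, assuming a just identified demand equation of the form Q = α + βR + u whose only excluded exogenous variable is the interest rate in the supply equation. The data-generating values and helper names are invented for illustration. Both reduced forms are estimated by OLS and the structural coefficients are then recovered as β = π21/π11 and α = π20 − βπ10, a non-linear transformation of the reduced-form estimates.

```python
import random


def simulate_market(n, seed=0):
    """Simulate (Q, R, INT) from a toy market-clearing system (assumed values)."""
    rng = random.Random(seed)
    alpha, beta = 10.0, -1.0         # demand: Q = alpha + beta*R + u
    lam, mu, kappa = 2.0, 1.0, -2.0  # supply: Q = lam + mu*R + kappa*INT + v
    q, r, intr = [], [], []
    for _ in range(n):
        i, u, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        rent = ((lam - alpha) + kappa * i + (v - u)) / (beta - mu)
        q.append(alpha + beta * rent + u)
        r.append(rent)
        intr.append(i)
    return q, r, intr


def ols_line(y, x):
    """Intercept and slope from a simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


q, r, intr = simulate_market(20000)
pi10, pi11 = ols_line(r, intr)  # reduced form for R: R = pi10 + pi11*INT + e1
pi20, pi21 = ols_line(q, intr)  # reduced form for Q: Q = pi20 + pi21*INT + e2

beta_ils = pi21 / pi11              # structural demand slope
alpha_ils = pi20 - beta_ils * pi10  # structural demand intercept
print(round(alpha_ils, 2), round(beta_ils, 2))
```

Because the slope is a ratio of two estimated coefficients, its expectation is not the ratio of their expectations, which is the source of the finite-sample bias described above.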
10.6.2 Estimation of just identified and over-identified systems using 2SLS
This technique is applicable for the estimation of over-identified systems, for which ILS cannot be used. It can also be employed for estimating the coefficients of just identified systems, in which case the method would yield asymptotically equivalent estimates to those obtained from ILS.

Two-stage least squares estimation is done in two stages.

● Stage 1. Obtain and estimate the reduced-form equations using OLS. Save the fitted values for the dependent variables.
● Stage 2. Estimate the structural equations using OLS, but replace any RHS endogenous variables with their stage 1 fitted values.
Example 10.2
Suppose that estimates of (10.27) to (10.29) are required. 2SLS would involve the following two steps (with time subscripts suppressed for ease of exposition).

● Stage 1. Estimate the reduced-form equations (10.41) to (10.43) individually by OLS and obtain the fitted values, and denote them $\widehat{ABS}^1$, $\hat{R}^1$, $\hat{Q}_s^1$, where the superfluous superscript 1 indicates that these are the fitted values from the first stage.
● Stage 2. Replace the RHS endogenous variables with their stage 1 estimated values.
In a simultaneous equations framework, it is still of concern whether the usual assumptions of the CLRM are valid or not, although some of the test statistics require modifications to be applicable in the systems context. Most econometrics packages will automatically make any required changes. To illustrate one potential consequence of the violation of the CLRM assumptions, if the disturbances in the structural equations are autocorrelated, the 2SLS estimator is not even consistent.

The standard error estimates also need to be modified compared with their OLS counterparts (again, econometrics software will usually do this automatically), but, once this has been done, the usual t-tests can be used to test hypotheses about the structural-form coefficients. This modification arises as a result of the use of the reduced-form fitted values on the RHS rather than actual variables, which implies that a modification to the error variance is required.
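The two stages and the error-variance modification can be sketched together. The example assumes a toy demand equation Q = α + βR + u with the interest rate as the only excluded exogenous variable; all parameter values and helper names are invented for illustration. The point of the last few lines is that the residuals used to estimate the error variance should be computed with the actual rent series, not with the stage 1 fitted values used to obtain the coefficients.

```python
import random


def simulate_market(n, seed=0):
    """Simulate (Q, R, INT) from a toy market-clearing system (assumed values)."""
    rng = random.Random(seed)
    alpha, beta = 10.0, -1.0         # demand: Q = alpha + beta*R + u, Var(u) = 1
    lam, mu, kappa = 2.0, 1.0, -2.0  # supply: Q = lam + mu*R + kappa*INT + v
    q, r, intr = [], [], []
    for _ in range(n):
        i, u, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        rent = ((lam - alpha) + kappa * i + (v - u)) / (beta - mu)
        q.append(alpha + beta * rent + u)
        r.append(rent)
        intr.append(i)
    return q, r, intr


def ols_line(y, x):
    """Intercept and slope from a simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


q, r, intr = simulate_market(20000)
n = len(q)

# Stage 1: reduced form for the RHS endogenous variable; save fitted values.
pi0, pi1 = ols_line(r, intr)
r_hat = [pi0 + pi1 * i for i in intr]

# Stage 2: structural equation with R replaced by its fitted values.
alpha_2sls, beta_2sls = ols_line(q, r_hat)

# Error variance: naive version uses fitted R, corrected version uses actual R.
s2_naive = sum((qi - alpha_2sls - beta_2sls * rh) ** 2
               for qi, rh in zip(q, r_hat)) / (n - 2)
s2_corrected = sum((qi - alpha_2sls - beta_2sls * ri) ** 2
                   for qi, ri in zip(q, r)) / (n - 2)
print(round(beta_2sls, 2), round(s2_naive, 2), round(s2_corrected, 2))
```

In this simulation the structural error variance is 1 by construction, so only the corrected figure should be close to it; in general the naive figure can lie on either side of the correct one.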