FIGURE 5.2 Stochastic chaos process for different initial conditions
TABLE 5.1 In-Sample Diagnostics: Stochastic Chaos Model (Structure: 4 Lags, 3 Neurons)
Diagnostic    Linear Model (Network Model)
∗marginal significance levels
network model, appearing in parentheses, explains 53%. The Hannan-Quinn information criterion favors, not surprisingly, the network model. The significance test of the Q statistic shows that we cannot reject serial independence of the regression residuals. By all other criteria, the linear specification suffers from serious specification error. There is evidence of serial correlation in squared errors, as well as non-normality, asymmetry, and neglected nonlinearity in the residuals. Such indicators would suggest the use of nonlinear models as alternatives to the linear autoregressive structure.
FIGURE 5.3 In-sample errors: stochastic chaos model
Figure 5.3 pictures the error paths predicted by the linear and network models. The linear model errors are given by the solid curve and the network errors by dotted paths. As expected, we see that the dotted curves generally are closer to zero.
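The in-sample diagnostics discussed above can be sketched in a few lines. The following is a minimal illustration, not the code used to produce Table 5.1. It assumes one common form of the Hannan-Quinn criterion, ln(SSE/T) + k ln(ln T)/T, and the Ljung-Box form of the Q statistic; the function names are hypothetical.

```python
import math

def r_squared(actual, fitted):
    """Multiple correlation coefficient: 1 - SSE/SST."""
    mean = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean) ** 2 for a in actual)
    return 1.0 - sse / sst

def hannan_quinn(residuals, n_params):
    """Assumed form: ln(SSE/T) + k*ln(ln(T))/T. Lower is better."""
    t = len(residuals)
    sse = sum(e * e for e in residuals)
    return math.log(sse / t) + n_params * math.log(math.log(t)) / t

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for serial correlation up to max_lag."""
    t = len(residuals)
    mean = sum(residuals) / t
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k
        rho_k = sum(dev[i] * dev[i - k] for i in range(k, t)) / denom
        q += rho_k ** 2 / (t - k)
    return t * (t + 2) * q
```

Applying `ljung_box_q` to the squared residuals gives the corresponding test for serial correlation in squared errors. The penalty term in `hannan_quinn` is what makes a network with many more parameters earn its place through a sufficiently lower sum of squared errors.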
5.2.2 Out-of-Sample Performance
The path of the out-of-sample prediction errors appears in Figure 5.4. The solid path represents the forecast error of the linear model, while the dotted curves are for the network forecast errors. This shows the improved performance of the network relative to the linear model, in the sense that its errors are usually closer to zero.
Table 5.2 summarizes the out-of-sample statistics. These are the root mean squared error statistics (RMSQ), the Diebold-Mariano statistics for lags zero through four (DM-0 to DM-4), the success ratio for percentage of correct sign predictions (SR), and the bootstrap ratio (B-Ratio), which is the ratio of the network bootstrap error statistic to the linear bootstrap error measure. A value less than one, of course, represents a gain for network estimation.
FIGURE 5.4 Out-of-sample prediction errors: stochastic chaos model
TABLE 5.2 Forecast Tests: Stochastic Chaos Model (Structure: 5 Lags, 4 Neurons)
∗marginal significance levels
The Diebold-Mariano tests with lags zero through four are all significant. The success ratio for both models is perfect, since all of the returns in the stochastic chaos model are positive. The final statistic is the bootstrap ratio, the ratio of the network bootstrap error relative to the linear bootstrap error. We see that the network reduces the bootstrap error by almost 13%.
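As a rough illustration of three of these forecast statistics, the sketch below computes the root mean squared error, the success ratio, and a Diebold-Mariano statistic from two series of forecast errors. It assumes a squared-error loss differential and an unweighted sum of autocovariances up to the chosen lag for the long-run variance; the function names are hypothetical, and the bootstrap ratio is not covered.

```python
import math

def rmse(errors):
    """Root mean squared error of a series of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def success_ratio(actual, predicted):
    """Share of observations whose sign is predicted correctly."""
    hits = sum(1 for a, p in zip(actual, predicted) if a * p > 0)
    return hits / len(actual)

def diebold_mariano(e_bench, e_alt, max_lag=0):
    """DM statistic on the squared-error loss differential.

    Positive values mean the benchmark (e.g., linear) errors are larger
    than the alternative (e.g., network) errors. With max_lag > 0 the
    unweighted variance estimate can turn negative in small samples.
    """
    d = [a * a - b * b for a, b in zip(e_bench, e_alt)]
    t = len(d)
    d_bar = sum(d) / t
    dev = [x - d_bar for x in d]

    def autocov(k):
        return sum(dev[i] * dev[i - k] for i in range(k, t)) / t

    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, max_lag + 1))
    return d_bar / math.sqrt(long_run_var / t)
```

Under the null of equal forecast accuracy the statistic is compared with the standard normal distribution, so DM-0 through DM-4 correspond to calling `diebold_mariano` with `max_lag` from 0 to 4.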
Clearly, if underlying data were generated by a stochastic chaos process, networks are to be preferred over linear models.
5.3 Stochastic Volatility/Jump Diffusion Model
The stochastic volatility/jump diffusion (SVJD) model is widely used for representing highly volatile asset returns in emerging markets such as Russia or Brazil during periods of extreme macroeconomic instability. The model combines a stochastic volatility component, which is a time-varying variance of the error term, and a jump diffusion component, which is a Poisson jump process. Both the stochastic volatility component and the Poisson jump component directly affect the mean of the asset return process. They are realistic parametric representations of the way many asset returns behave, particularly in volatile emerging-market economies.
Following Bates (1996) and Craine, Lochester, and Syrtveit (1999), we present this process in continuous time by the following equations:
φ represents the normal distribution. The advantage of the continuous time representation is that the time interval can become arbitrarily small and approximate real time changes.
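For reference, the system is commonly written in the following form in the literature following Bates (1996). This is a sketch under stated assumptions rather than a restatement of the text's own equations: the symbols μ (expected rate of appreciation), λ (annual frequency of jumps), k (random percentage jump), and k̄ (expected jump) are taken from that standard form, and the drift of dV is assumed to take the (α − βV) form.

```latex
\begin{align}
\frac{dS}{S} &= (\mu - \lambda \bar{k})\,dt + \sqrt{V}\,dZ + k\,dq \\
dV &= (\alpha - \beta V)\,dt + \sigma_v \sqrt{V}\,dZ_v \\
\operatorname{Corr}(dZ, dZ_v) &= \rho \\
\Pr(dq = 1) &= \lambda\,dt \\
\ln(1 + k) &\sim \phi\!\left(\ln(1 + \bar{k}) - 0.5\,\kappa^{2},\ \kappa^{2}\right)
\end{align}
```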
TABLE 5.3 Parameters for SVJD Process
Parameter                                Symbol   Value
Mean reversion of volatility             β        .7024
Standard deviation of percentage jump    κ        .0281
Correlation of Wiener processes          ρ        .6
The instantaneous conditional variance V follows a mean-reverting square root process. The parameter α is the mean of the conditional variance, while β is the mean-reversion coefficient. The coefficient $\sigma_v$ is the variance of the volatility process, while the noise terms $dZ$ and $dZ_v$ are the standard continuous-time white noise Wiener processes, with correlation coefficient ρ.
Bates (1996) points out that this process has two major advantages. First, it allows systematic volatility risk, and second, it generates an “analytically tractable method” for pricing options without sacrificing accuracy or imposing unnecessary restrictions. This model is especially useful for option pricing in emerging markets.
The parameters used to generate the SVJD process appear in Table 5.3.
In this model, $S_{t+1}$ is equal to $S_t + [S_t \cdot (\mu - \lambda k)] \cdot dt$, and for a small value of $dt$ will be unit-root nonstationary. After first-differencing, the model will be driven by the components of $dV$ and $k \cdot dq$, which are random terms. We should not expect the linear or neural network model to do particularly well. Put another way, we should be suspicious if the network model significantly outperforms a rather poor linear model.
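One way to generate such a series is a simple Euler discretization. The sketch below is illustrative only: β, κ, and ρ follow Table 5.3, while the values of `mu`, `lam`, `k_bar`, `alpha`, `sigma_v`, `dt`, and the starting values are hypothetical placeholders, and the square-root variance process with correlated shocks is assumed to take the standard Bates (1996) form.

```python
import math
import random

def simulate_svjd(n_steps=500, dt=1.0 / 250, mu=0.05, lam=2.0, k_bar=0.0,
                  alpha=0.01, beta=0.7024, sigma_v=0.1, kappa=0.0281,
                  rho=0.6, s0=1.0, v0=0.015, seed=0):
    """Euler scheme for a stochastic volatility/jump diffusion price path."""
    rng = random.Random(seed)
    s, v = s0, v0
    path = [s]
    for _ in range(n_steps):
        # Two standard normal shocks with correlation rho
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Poisson jump: occurs with probability lam*dt in each interval
        jump = 0.0
        if rng.random() < lam * dt:
            log_jump = rng.gauss(math.log(1.0 + k_bar) - 0.5 * kappa ** 2, kappa)
            jump = math.exp(log_jump) - 1.0
        ret = (mu - lam * k_bar) * dt + math.sqrt(v * dt) * z1 + jump
        s *= 1.0 + ret
        # Mean-reverting square root variance, floored to stay positive
        v += (alpha - beta * v) * dt + sigma_v * math.sqrt(v * dt) * z2
        v = max(v, 1e-12)
        path.append(s)
    return path

levels = simulate_svjd()
# First differences, the series passed to the linear and network models
diffs = [b - a for a, b in zip(levels[:-1], levels[1:])]
```

Because the differenced series is driven almost entirely by the random volatility and jump terms, there is little lagged structure for either model to exploit.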
One realization of the SVJD process, after first-differencing, appears in Figure 5.5. As in the case of the stochastic chaos model, there are periods of high volatility followed by more tranquil periods. Unlike the stochastic chaos model, however, the periods of tranquility are not perfectly flat. We also notice that the returns in the SVJD model are both positive and negative.
5.3.1 In-Sample Performance
Table 5.4 gives the in-sample regression diagnostics of the linear model. Clearly, the linear approach suffers serious specification error in the error structure. Although the network multiple correlation coefficient is higher than that of the linear model, the Hannan-Quinn information criterion only slightly favors the network model. The slight improvement of the R2 statistic only barely outweighs the increase in complexity due to the larger number of parameters to be estimated. While the Lee-White-Granger test does not turn up evidence of neglected nonlinearity, the BDS test does. Figure 5.6 gives in-sample errors for the SVJD realizations. We do not see much difference.
FIGURE 5.5 Stochastic volatility/jump diffusion process
TABLE 5.4 In-Sample Diagnostics: First-Differenced SVJD Model (Structure: 4 Lags, 3 Neurons)
Diagnostic    Linear Model (Network Model)
∗marginal significance levels
FIGURE 5.6 In-sample errors: SVJD model
5.3.2 Out-of-Sample Performance
Figure 5.7 pictures the out-of-sample errors of the two models. As expected, we do not see much difference in the two paths.
The out-of-sample statistics appearing in Table 5.5 indicate that the network model does slightly worse, but not significantly worse, than the linear model, based on the Diebold-Mariano statistic. Both models do about equally well in terms of the success ratio for correct sign predictions, with slightly better performance by the network model. The bootstrap ratio favors the network model, which reduces the bootstrap error of the linear model by slightly more than 3%.
FIGURE 5.7 Out-of-sample prediction errors: SVJD model
TABLE 5.5 Forecast Tests: SVJD Model (Structure:
∗marginal significance levels
5.4 Markov Regime Switching Model
The Markov regime switching model is widely used in time-series analysis of aggregate macro data such as GDP growth rates. The basic idea of the regime switching model is that the underlying process is linear. However, the process follows different regimes when the economy is growing and when the economy is shrinking. Originally due to Hamilton (1990), it was applied to GDP growth rates in the United States.
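Following Tsay (2002, pp. 135–137), we simulate the following model representing the rate of growth of GDP for the U.S. economy for two states in the economy, $S_1$ and $S_2$:

$$x_t = c_1 + \sum_{i=1}^{p} \phi_{1,i}\, x_{t-i} + \varepsilon_{1,t}, \quad \varepsilon_1 \sim \phi(0, \sigma_1^2) \quad \text{if } S = S_1 \qquad (5.6)$$

$$x_t = c_2 + \sum_{i=1}^{p} \phi_{2,i}\, x_{t-i} + \varepsilon_{2,t}, \quad \varepsilon_2 \sim \phi(0, \sigma_2^2) \quad \text{if } S = S_2 \qquad (5.7)$$

where φ represents the Gaussian density function. These states have the following transition matrix, P, describing the probability of moving from one state to the next, from time (t − 1) to time t: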
of nonlinearity in this system. The parameters used for generating 500 realizations of the MRS model appear in Table 5.6.
Notice that in the specification of the transition probabilities, as Tsay (2002) points out, “it is more likely for the U.S. GDP to get out of a contraction period than to jump into one” [Tsay (2002), p. 137]. In our simulation of the model, the regime transitions are drawn from a uniform random number generator. If, for example, in state S = S1, a random value of .1 is drawn, the regime will switch to the second state, S = S2. If a value greater than .118 is drawn, then the regime will remain in the first state, S = S1.
FIGURE 5.8 Markov switching process
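The switching rule just described can be sketched as follows. Only the .118 threshold for leaving the first state comes from the text. The probability of leaving the second state, the intercepts, the autoregressive coefficients, and the standard deviations are hypothetical placeholders, and a single lag is assumed for brevity.

```python
import random

def simulate_mrs(n_obs=500, p_leave_s1=0.118, p_leave_s2=0.286,
                 c=(0.9, -0.4), phi=(0.3, 0.3), sigma=(0.8, 1.0), seed=0):
    """Two-state Markov regime switching AR(1); state 0 is S1, state 1 is S2."""
    rng = random.Random(seed)
    state, x_prev = 0, 0.0
    xs, states = [], []
    for _ in range(n_obs):
        # Uniform draw below the threshold triggers a switch of regime
        u = rng.random()
        p_leave = p_leave_s1 if state == 0 else p_leave_s2
        if u < p_leave:
            state = 1 - state
        # Linear dynamics with regime-specific coefficients and shock variance
        x = c[state] + phi[state] * x_prev + rng.gauss(0.0, sigma[state])
        xs.append(x)
        states.append(state)
        x_prev = x
    return xs, states
```

Setting `p_leave_s2` above `p_leave_s1` reproduces the asymmetry in the Tsay quotation when the second state is read as the contraction regime: it is easier to leave than to enter.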
The process $\{x_t\}$ exhibits periodic regime changes, with different dynamics in each regime or state. Since the representative forecasting agent does not know that the true data-generating mechanism for $\{x_t\}$ is a Markov regime switching model, a unit root test for this variable cannot reject an I(1) or nonstationary process. However, work by Lumsdaine and Papell (1997) and Cook (2001) has drawn attention to the bias of unit root tests when structural breaks take place. We thus approximate the process $\{x_t\}$ as a stationary process.
The underlying data-generating mechanism is, of course, near linear, so we should not expect great improvement from neural network approximation. One realization, for 500 observations, appears in Figure 5.8.
TABLE 5.7 In-Sample Diagnostics: MRS Model (Structure: 4 Lags, 3 Neurons)
Diagnostic    Linear Model (Network Model) Estimate
R2            .35 (.38)
HQIF          3291 (3268)
∗marginal significance levels
reject normality in the distribution of the residuals. The BDS test shows some evidence of neglected nonlinearity, but the LWG test does not. Figure 5.9 pictures the error paths generated by the linear and neural net models. While the overall explanatory power or R2 statistic of the neural net is slightly higher and the Hannan-Quinn information criterion indicates that the network model should be selected, there is not much noticeable difference in the two paths relative to the actual series.
Diagnostic    Linear    Neural Net
∗marginal significance levels
5.4.2 Out-of-Sample Performance
The forecast statistics appear in Table 5.8. We see that the root mean squared error is slightly higher for the network, but the Diebold-Mariano statistics indicate that the difference in the prediction errors is not statistically significant. The bootstrap error ratio shows that the network model gives a marginal improvement relative to the linear benchmark.
The paths of the linear and network out-of-sample errors appear in Figure 5.10.
We see, not surprisingly, that both the linear and network models deliver about the same accuracy in out-of-sample forecasting. Since the MRS is basically a linear model with a small probability of a switch in the coefficients of the linear data-generating process, the network simply does about as well as the linear model.
What will be more interesting is the forecasting of the switches in volatility, rather than the return itself, in this series. We return to this subject in the following section.
FIGURE 5.10 Out-of-sample prediction errors: MRS model
5.5 Volatility Regime Switching Model
Building on the stochastic volatility and Markov regime switching models and following Tsay [(2002), p. 133], we use a simple autoregressive model with a regime switching mechanism for its volatility, rather than the return process itself. Specifically, we simulate the following model, similar to the one Tsay estimated as a process representing the daily log returns, including dividend payments, of IBM stock:2

$$r_t = .043 - .022\, r_{t-1} + \sigma_t + u_t \qquad (5.9)$$

$$u_t = \sigma_t \varepsilon_t, \quad \varepsilon_t \sim \phi(0, 1) \qquad (5.10)$$

$$\sigma_t^2 = \begin{cases} .098\, u_{t-1}^2 + .954\, \sigma_{t-1}^2 & \text{if } u_{t-1} \le 0 \\ .060 + .046\, u_{t-1}^2 + .8854\, \sigma_{t-1}^2 & \text{if } u_{t-1} > 0 \end{cases} \qquad (5.11)$$
where φ(0, 1) is the standard normal or Gaussian density. Notice that this VRS model will have drift in its volatility when the shocks are positive, but not when the shocks are negative. However, as Tsay points out, the model essentially follows an IGARCH (integrated GARCH) when shocks are negative, since the coefficients sum to a value greater than unity.
2 Tsay (2002) omits the GARCH-in-Mean term $.5\sigma_t$ in his specification of the returns $r_t$.
FIGURE 5.11 First-differenced returns and volatility of the VRS model
Figure 5.11 pictures the first-differenced series of $\{r_t\}$, since we could not reject a unit-root process, as well as the volatility process $\{\sigma_t^2\}$.
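A minimal simulation of equations (5.9) through (5.11) might look as follows. The coefficients are those of the equations. The starting values, the seed, and the `gim` argument (the coefficient on the GARCH-in-Mean term, set to one as in equation 5.9 as printed) are assumptions made for illustration.

```python
import math
import random

def next_variance(u_prev, s2_prev):
    """Regime-dependent conditional variance, equation (5.11)."""
    if u_prev <= 0:
        return 0.098 * u_prev ** 2 + 0.954 * s2_prev
    return 0.060 + 0.046 * u_prev ** 2 + 0.8854 * s2_prev

def simulate_vrs(n_obs=500, gim=1.0, sigma2_0=1.0, r0=0.0, seed=0):
    """Simulate returns r_t and conditional variances sigma_t^2."""
    rng = random.Random(seed)
    r_prev, u_prev, s2_prev = r0, 0.0, sigma2_0
    returns, variances = [], []
    for _ in range(n_obs):
        s2 = next_variance(u_prev, s2_prev)
        sigma = math.sqrt(s2)
        u = sigma * rng.gauss(0.0, 1.0)                   # equation (5.10)
        r = 0.043 - 0.022 * r_prev + gim * sigma + u      # equation (5.9)
        returns.append(r)
        variances.append(s2)
        r_prev, u_prev, s2_prev = r, u, s2
    return returns, variances
```

With a unit shock and unit lagged variance, the negative-shock regime gives .098 + .954 = 1.052, which is the sum greater than unity behind the IGARCH remark, while the positive-shock regime gives .060 + .046 + .8854 = .9914.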
5.5.1 In-Sample Performance
Table 5.9 gives the linear regression results for the returns. We see that the in-sample explanatory power of both models is about the same. While the tests for serial dependence in the residuals and squared residuals, as well as for symmetry and normality in the residuals, are not significant, the BDS test for neglected nonlinearity is significant. Figure 5.12 pictures the in-sample error paths of the two models.
TABLE 5.9 In-Sample Diagnostics: VRS Model (Structure: 4 Lags, 3 Neurons)
Diagnostic    Linear Model (Network Model)
FIGURE 5.12 In-sample errors: VRS model
5.5.2 Out-of-Sample Performance
Figure 5.13 and Table 5.10 show the out-of-sample performance of the two models. Again, there is not much to recommend the network model for return forecasting, but in its favor, it does not perform worse in any noticeable way than the linear model.
While these results do not show overwhelming support for the superiority of network forecasting for the volatility regime switching model, they do show improved out-of-sample performance both by the root mean squared error and the bootstrap criteria. It should be noted once more that the return process is highly linear by design. While the network does not do significantly better by the Diebold-Mariano test, it does buy a forecasting improvement at little cost.
FIGURE 5.13 Out-of-sample prediction errors: VRS model
TABLE 5.10 Forecast Tests: VRS Model (Structure: 4 Lags, 3 Neurons)
Diagnostic    Linear    Neural Net
∗marginal significance levels
5.6 Distorted Long-Memory Model
Originally put forward by Kantz and Schreiber (1997), the distorted long-memory (DLM) model was recently analyzed for stochastic neural network approximation by Lai and Wong (2001). The model has the following form:

$$y_t = x_{t-1}^2\, x_t \qquad (5.12)$$

$$x_t = .99\, x_{t-1} + \varepsilon_t \qquad (5.13)$$

Following Lai and Wong, we specify σ = .5 and $x_0$ = .5. One realization appears in Figure 5.14. It pictures a market or economy subject to bubbles. Since we can reject a unit root in this series, we analyze it in levels rather than in first differences.3
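Equations (5.12) and (5.13) are straightforward to simulate. The sketch below assumes that the shock in (5.13) is Gaussian with mean zero and standard deviation σ; the function and argument names are hypothetical.

```python
import random

def simulate_dlm(n_obs=500, rho=0.99, sigma=0.5, x0=0.5, seed=0):
    """Distorted long-memory series y_t = x_{t-1}^2 * x_t."""
    rng = random.Random(seed)
    x_prev = x0
    ys = []
    for _ in range(n_obs):
        x = rho * x_prev + rng.gauss(0.0, sigma)   # equation (5.13)
        ys.append(x_prev ** 2 * x)                 # equation (5.12)
        x_prev = x
    return ys
```

Because the latent series is close to a random walk and enters in cubic form, long stretches near zero alternate with sharp run-ups, which is the bubble-like pattern described above.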
FIGURE 5.14 Returns of DLM model
3 We note, however, the unit root tests are designed for variables emanating from a linear data-generating process.
Diagnostic    Linear Model
FIGURE 5.15 Actual and in-sample predictions: DLM model
5.6.1 In-Sample Performance
The in-sample statistics and time paths appear in Table 5.11 and Figure 5.15, respectively. We see that the in-sample explanatory power of the linear model is quite high. That of the network model is slightly higher, and it is favored by the Hannan-Quinn criterion. Apart from the tests for serial dependence in the residuals, which are insignificant, the diagnostics all indicate misspecification, in terms of serial correlation of the squared errors, as well as non-normality, asymmetry, and neglected nonlinearity (given by the BDS test result). Since the in-sample predictions of the linear and neural network models so closely track the actual path of the dependent variable, we cannot differentiate the movements of these variables in Figure 5.15.
5.6.2 Out-of-Sample Performance
The relevant out-of-sample statistics appear in Table 5.12 and the prediction error paths are in Figure 5.16. We see that the root mean squared error of the network is significantly lower, while the success ratios for the sign predictions are perfect for both models. The network bootstrap error is also practically identical to that of the linear model. Thus, the network gives a significantly improved performance over the linear alternative, on the basis of the Diebold-Mariano statistics, even when the linear alternative gives a very high in-sample fit.
TABLE 5.12 Forecast Tests: DLM Model (Structure: 4 Lags, 3 Neurons)
Diagnostic    Linear    Neural Net
∗marginal significance levels
Volatility Forecasting
The Black-Scholes (1973) option pricing model is a well-known method for calculating arbitrage-free prices for options. As Peter Bernstein (1998) points out, this formula was widely in use by practitioners before it was recognized through publication in academic journals.