Volatility Modeling Using Intraday Data
Elements of Financial Risk Management
Chapter 5
Peter Christoffersen
• Our goal is to harness the information in intraday prices for computing daily volatility
• Let us estimate the mean of returns using a long sample of daily observations:
    $$\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T}\left(\ln S_t - \ln S_{t-1}\right) = \frac{1}{T}\left(\ln S_T - \ln S_0\right)$$
• When estimating the mean of returns, only the first and the last observations matter, as all the intermediate terms cancel out
• When estimating the mean, we therefore need a long time span of data
• The start and end points $S_0$ and $S_T$ will be the same irrespective of the sampling frequency of returns
• Consider now instead estimating variance on a sample of daily returns:
    $$\hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}\left(\ln S_t - \ln S_{t-1} - \hat{\mu}\right)^2$$
• In the variance estimator, the intermediate prices do not cancel out
• All return observations now matter because they are
squared before they are summed in the average
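A minimal sketch of the contrast between the two estimators, using simulated daily log prices. The data, the seed, and all variable names are illustrative assumptions and are not taken from the slides. The mean of log returns collapses to the two endpoint prices, while the variance estimator needs every return.

```python
import math
import random

random.seed(0)  # illustrative seed

# Hypothetical daily log prices following a random walk with drift.
log_prices = [math.log(100.0)]
for _ in range(250):
    log_prices.append(log_prices[-1] + random.gauss(0.0005, 0.01))

T = len(log_prices) - 1
returns = [log_prices[t] - log_prices[t - 1] for t in range(1, T + 1)]

# Mean: the intermediate prices cancel, so only S_0 and S_T matter.
mean_from_returns = sum(returns) / T
mean_from_endpoints = (log_prices[-1] - log_prices[0]) / T

# Variance: every return is squared before averaging, so nothing cancels.
variance = sum((r - mean_from_returns) ** 2 for r in returns) / T

print(mean_from_returns, mean_from_endpoints, variance)
```

The two mean estimates agree up to floating-point rounding, which is why sampling more often between the same two endpoints cannot improve the mean estimate.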
• Suppose now that we have price observations at the end of every hour instead of every day, and that the market for the asset at hand is open 24 hours a day
• We then have $24 \cdot T$ observations to estimate $\sigma^2$, and we can get a much more precise estimate than when using just the T daily returns
• An implication of this high-frequency sampling is that, just as we can use 21 daily prices to estimate a monthly volatility, we can also use 24 hourly observations to estimate a daily volatility
Realized Variance: Four Stylized Facts
• Assume that we are monitoring an asset that trades 24 hours per day and that it is extremely liquid, so that bid-ask spreads are virtually zero and new information is reflected in the price immediately
• Let m be the number of observations per day on an asset. If we have 24-hour trading and 1-minute observations, then m = 24 * 60 = 1,440
• Let the jth observation on day t+1 be denoted $S_{t+j/m}$. Then the closing price on day t+1 is $S_{t+m/m} = S_{t+1}$, and the jth 1-minute return is
    $$R_{t+j/m} = \ln\left(S_{t+j/m}\right) - \ln\left(S_{t+(j-1)/m}\right)$$
• With m observations daily, we can calculate an estimate of the daily variance from the intraday squared returns simply as
    $$RV^m_{t+1} = \sum_{j=1}^{m} R^2_{t+j/m}$$
• Here we do not divide the sum of squared returns by m. If we did, we would get a 1-minute variance
• The sum of squared returns is the total variance for a 24-hour period
• We also do not subtract the mean of the 1-minute returns, as it is so small that it will not impact the variance
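The realized variance calculation can be sketched as follows. The simulated price path, the assumed 1-minute volatility, and the function name `realized_variance` are illustrative assumptions. The function sums squared intraday log returns without dividing by m and without subtracting the mean.

```python
import math
import random

def realized_variance(prices):
    """Sum of squared intraday log returns for one day of prices (no division by m)."""
    log_p = [math.log(p) for p in prices]
    return sum((log_p[j] - log_p[j - 1]) ** 2 for j in range(1, len(log_p)))

random.seed(1)
m = 24 * 60            # 1-minute observations in a 24-hour day
sigma_1min = 0.0003    # assumed 1-minute return volatility (illustrative)

# Simulated day: the previous close followed by m one-minute prices.
prices = [100.0]
for _ in range(m):
    prices.append(prices[-1] * math.exp(random.gauss(0.0, sigma_1min)))

rv = realized_variance(prices)
true_daily_variance = m * sigma_1min ** 2
print(rv, true_daily_variance)
```

With 1,440 squared returns in the sum, the estimate should land close to the true daily variance of the simulated process.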
• The top panel of Figure 5.1 shows the time series of daily realized S&P 500 variance computed from intraday squared returns, and the bottom panel shows the daily squared returns
• The squared returns in the bottom panel are much noisier than the realized variances in the top panel
• Figure 5.1 illustrates the first stylized fact of RV: RVs are much more precise indicators of daily variance than are daily squared returns
Figure 5.1: Realized Variance (top) and Squared Returns (bottom) of the S&P 500
• The top panel of Figure 5.2 shows the autocorrelation
function of the S&P 500 RV series from Figure 5.1
• The bottom panel shows the corresponding ACF computed from daily squared returns
• Notice how much more striking the evidence of variance persistence is in the top panel
• Figure 5.2 illustrates the second stylized fact of RV: RV is extremely persistent, which suggests that volatility may be forecastable at horizons beyond a few months as long as the information in intraday returns is used
Figure 5.2: Autocorrelation of Realized Variance and Autocorrelation of Squared Returns
• The top panel of Figure 5.3 shows a histogram of the RVs from Figure 5.1
• The bottom panel of Figure 5.3 shows the histogram of the natural logarithm of RV
• Figure 5.3 shows that the logarithm of RV is very close to normally distributed whereas the level of RV is strongly positively skewed with a long right tail
Figure 5.3: Histograms of Realized Variance (top) and of log Realized Variance (bottom)
• The approximate log normal property of RV is the third stylized fact. We can write
    $$\ln\left(RV^m_{t+1}\right) \sim N\left(\mu_{RV}, \sigma^2_{RV}\right)$$
• The fourth stylized fact of RV is that the daily return divided by the square root of RV is very close to following an i.i.d. (independently and identically distributed) standard normal distribution:
    $$R_{t+1} / \sqrt{RV^m_{t+1}} \sim \text{i.i.d. } N(0, 1)$$
• $RV^m_{t+1}$ can only be computed at the end of day t+1, so this result is not immediately useful for forecasting purposes
• If a good forecast $RV^m_{t+1|t}$ can be made using information available at time t, then a normal distribution assumption for $R_{t+1} / \sqrt{RV^m_{t+1|t}}$ will be a decent first modeling strategy. Approximately:
    $$R_{t+1} / \sqrt{RV^m_{t+1|t}} \sim \text{i.i.d. } N(0, 1)$$
• where we have now standardized the return with the RV forecast
• When constructing a good forecast for $RV^m_{t+1}$, we need to keep in mind the four stylized facts of RV:
– RV is a more precise indicator of daily variance than is the daily squared return
– RV has large positive autocorrelations for many lags
– The log of RV is approximately normally distributed
– The daily return divided by the square root of RV is close to i.i.d. standard normal
Forecasting Realized Variance
• Realized variances are very persistent
• So we need to consider forecasting models that
allow for current RV to matter for future RV
Simple ARMA Models of Realized Variance
• An AR(1) model allows for persistence in a time series
• If we treat the estimated $RV^m_t$ as an observed time series, then we can assume the AR(1) forecasting model
    $$RV^m_{t+1} = \phi_0 + \phi_1 RV^m_t + \varepsilon_{t+1}$$
• where $\varepsilon_{t+1}$ is assumed to be uncorrelated over time and have zero mean
• The parameters $\phi_0$ and $\phi_1$ can easily be estimated using OLS
• The one-day-ahead forecast of RV is then
    $$RV^m_{t+1|t} = \phi_0 + \phi_1 RV^m_t$$
• Since the log of RV is close to normally distributed, we may be better off modeling the RV in logs rather than levels. We can therefore assume
    $$\ln\left(RV^m_{t+1}\right) = \phi_0 + \phi_1 \ln\left(RV^m_t\right) + \varepsilon_{t+1}$$
• The normal property of $\ln(RV^m_{t+1})$ will make the OLS estimates of $\phi_0$ and $\phi_1$ better than those in the AR(1) model for $RV^m_{t+1}$
• In the AR(1) model for the level of RV, the errors $\varepsilon_{t+1}$ are likely to have fat tails, which in turn yield noisy parameter estimates
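A minimal sketch of estimating an AR(1) for log RV by OLS, using only the standard library. The simulated series, the parameter values, and the helper name `ols_ar1` are assumptions made for illustration; with real data the input would be the series of daily log realized variances.

```python
import random

def ols_ar1(y):
    """OLS estimates (phi0, phi1) for y[t+1] = phi0 + phi1 * y[t] + error."""
    x, z = y[:-1], y[1:]
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    phi1 = sxz / sxx
    phi0 = z_bar - phi1 * x_bar
    return phi0, phi1

# Simulated log RV series with assumed (illustrative) parameters.
random.seed(2)
true_phi0, true_phi1, sigma_eps = -0.9, 0.9, 0.5
log_rv = [true_phi0 / (1.0 - true_phi1)]   # start at the unconditional mean
for _ in range(5000):
    log_rv.append(true_phi0 + true_phi1 * log_rv[-1] + random.gauss(0.0, sigma_eps))

phi0_hat, phi1_hat = ols_ar1(log_rv)
print(phi0_hat, phi1_hat)
```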
• As we have estimated it from intraday squared returns, $RV^m_{t+1}$ is not truly an observed time series, but it can be viewed as the true RV observed with a measurement error
• If the true RV is AR(1) but we observe the true RV plus an i.i.d. measurement error, then an ARMA(1,1) model is likely to provide a good fit to the observed RV. We can write
    $$\ln\left(RV^m_{t+1}\right) = \phi_0 + \phi_1 \ln\left(RV^m_t\right) + \theta_1 \varepsilon_t + \varepsilon_{t+1}$$
• which due to the MA term must be estimated using maximum likelihood techniques
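• As the exponential function is not linear, we have in the log RV model that
    $$RV^m_{t+1|t} = E_t\left[\exp\left(\ln\left(RV^m_{t+1}\right)\right)\right] \neq \exp\left(E_t\left[\ln\left(RV^m_{t+1}\right)\right]\right)$$
• Assuming normality of the error term we can use:
    $$\varepsilon_{t+1} \sim N\left(0, \sigma^2_\varepsilon\right) \Rightarrow E\left[\exp\left(\varepsilon_{t+1}\right)\right] = \exp\left(\sigma^2_\varepsilon / 2\right)$$
• In the AR(1) model the forecast for tomorrow is
    $$RV^m_{t+1|t} = \exp\left(\phi_0 + \phi_1 \ln\left(RV^m_t\right)\right)\exp\left(\sigma^2_\varepsilon / 2\right) = \left(RV^m_t\right)^{\phi_1}\exp\left(\phi_0 + \sigma^2_\varepsilon / 2\right)$$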
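• And for the ARMA(1,1) model we get
    $$RV^m_{t+1|t} = \exp\left(\phi_0 + \phi_1 \ln\left(RV^m_t\right) + \theta_1 \varepsilon_t\right)\exp\left(\sigma^2_\varepsilon / 2\right) = \left(RV^m_t\right)^{\phi_1}\exp\left(\phi_0 + \theta_1 \varepsilon_t + \sigma^2_\varepsilon / 2\right)$$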
• More sophisticated models such as long-memory ARMA
models can be used to model realized variance
• These models may yield better longer horizon variance
forecasts than the short-memory ARMA models
considered here
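The step from a log-RV forecast back to a level forecast can be sketched as below, assuming an AR(1) for ln(RV) with normally distributed errors. The function name and the parameter values in the example are hypothetical. The key point is the extra factor exp(sigma_eps^2 / 2): simply exponentiating the log forecast would understate the expected level of RV.

```python
import math

def forecast_rv_from_log_ar1(rv_today, phi0, phi1, sigma_eps):
    """One-day-ahead RV forecast implied by an AR(1) for ln(RV) with normal errors."""
    naive = math.exp(phi0 + phi1 * math.log(rv_today))   # exp of the log forecast
    return naive * math.exp(0.5 * sigma_eps ** 2)        # lognormal mean correction

# Illustrative (assumed) parameter values.
rv_today = 1.2e-4
naive_forecast = forecast_rv_from_log_ar1(rv_today, -0.9, 0.9, 0.0)
corrected_forecast = forecast_rv_from_log_ar1(rv_today, -0.9, 0.9, 0.5)
print(naive_forecast, corrected_forecast)
```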
Heterogeneous Autoregressions (HAR)
• The mixed-frequency or heterogeneous autoregression (HAR) model helps us to parsimoniously and easily model the long-memory features of realized volatility
• Define the h-day RV from the 1-day RVs as follows:
    $$RV_{t-h+1,t} = \frac{1}{h}\left(RV_{t-h+1} + RV_{t-h+2} + \cdots + RV_t\right)$$
• where dividing by h makes $RV_{t-h+1,t}$ interpretable as the average total variance starting with day t-h+1 and through day t
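• Consider forecasting tomorrow's RV using daily, weekly, and monthly RV defined by the simple moving averages
    $$RV_{D,t} \equiv RV_t$$
    $$RV_{W,t} \equiv RV_{t-4,t} = \frac{1}{5}\left(RV_{t-4} + RV_{t-3} + RV_{t-2} + RV_{t-1} + RV_t\right)$$
    $$RV_{M,t} \equiv RV_{t-20,t} = \frac{1}{21}\left(RV_{t-20} + RV_{t-19} + \cdots + RV_t\right)$$
• where we have assumed five trading days in a week and 21 trading days in a month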
• The simplest way to forecast RV with these variables is via the regression
    $$RV_{t+1} = \phi_0 + \phi_D RV_{D,t} + \phi_W RV_{W,t} + \phi_M RV_{M,t} + \varepsilon_{t+1}$$
  which defines the HAR model
• Note that HAR can be estimated by OLS because all
variables are observed and because the model is linear in the parameters
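A minimal sketch of estimating a HAR regression by OLS, assuming the daily, weekly (5-day), and monthly (21-day) moving averages of RV as regressors. It uses only the standard library to stay self-contained; the helper names `ols` and `har_design`, the simulated data, and the coefficient values are all illustrative assumptions.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [a[i] + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for j in range(c, k + 1):
                aug[r][j] -= f * aug[c][j]
    beta = [0.0] * k
    for c in reversed(range(k)):
        tail = sum(aug[c][j] * beta[j] for j in range(c + 1, k))
        beta[c] = (aug[c][k] - tail) / aug[c][c]
    return beta

def har_design(rv):
    """Rows [1, RV_D, RV_W, RV_M] at day t, paired with the target RV at day t+1."""
    rows, y = [], []
    for t in range(20, len(rv) - 1):
        weekly = sum(rv[t - 4 : t + 1]) / 5
        monthly = sum(rv[t - 20 : t + 1]) / 21
        rows.append([1.0, rv[t], weekly, monthly])
        y.append(rv[t + 1])
    return rows, y

# Simulated RV series from a HAR process with assumed (illustrative) coefficients.
random.seed(4)
true_phi = [0.1, 0.3, 0.3, 0.2]
rv = [0.5] * 21
for _ in range(3000):
    t = len(rv) - 1
    x = [1.0, rv[t], sum(rv[t - 4 : t + 1]) / 5, sum(rv[t - 20 : t + 1]) / 21]
    rv.append(sum(c * v for c, v in zip(true_phi, x)) + random.gauss(0.0, 0.05))

phi_hat = ols(*har_design(rv))
print(phi_hat)   # estimates of [phi_0, phi_D, phi_W, phi_M]
```

Because the three regressors overlap heavily, the individual coefficient estimates can be noisy even when their sum is estimated well.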
• The HAR will be able to capture long-memory-like dynamics because of the 21 lags of daily RV
• The model is parsimonious because the 21 lags of daily RV do not have 21 different autoregressive coefficients
• The coefficients are restricted to be $(\phi_D + \phi_W/5 + \phi_M/21)$ on today's RV, $(\phi_W/5 + \phi_M/21)$ on the past four days of RV, and $\phi_M/21$ on the RVs for days t-20 through t-5
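• Given the log normal property of RV, we can consider HAR models of the log transformation of RV
    $$\ln\left(RV_{t+1}\right) = \phi_0 + \phi_D \ln\left(RV_{D,t}\right) + \phi_W \ln\left(RV_{W,t}\right) + \phi_M \ln\left(RV_{M,t}\right) + \varepsilon_{t+1}$$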
• The advantage of this log specification is again that the
parameters will be estimated more precisely when using OLS
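• The forecasting involves undoing the log transformation so that
    $$RV_{t+1|t} = \exp\left(\phi_0 + \phi_D \ln\left(RV_{D,t}\right) + \phi_W \ln\left(RV_{W,t}\right) + \phi_M \ln\left(RV_{M,t}\right)\right)\exp\left(\sigma^2_\varepsilon / 2\right) = \left(RV_{D,t}\right)^{\phi_D}\left(RV_{W,t}\right)^{\phi_W}\left(RV_{M,t}\right)^{\phi_M}\exp\left(\phi_0 + \sigma^2_\varepsilon / 2\right)$$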
• Note that the HAR idea generalizes to longer-horizon
forecasting
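• If we want to forecast RV over the next K days then we can estimate the model
    $$\ln\left(RV_{t+1,t+K}\right) = \phi_{0,K} + \phi_{D,K} \ln\left(RV_{D,t}\right) + \phi_{W,K} \ln\left(RV_{W,t}\right) + \phi_{M,K} \ln\left(RV_{M,t}\right) + \varepsilon_{t+1,t+K}$$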
[Figure: S&P 500 Volatility]
• The HAR model can capture the leverage effect by simply including the return on the right-hand side
• In the daily log HAR we can write
    $$\ln\left(RV_{t+1}\right) = \phi_0 + \phi_D \ln\left(RV_{D,t}\right) + \phi_W \ln\left(RV_{W,t}\right) + \phi_M \ln\left(RV_{M,t}\right) + \phi_R R_t + \varepsilon_{t+1}$$
• This can also easily be estimated using OLS
• Notice that because the model is written in logs, the variance forecast will not go negative: $\exp(\cdot)$ will always be a positive number
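• The stylized facts of RV suggested that we can assume
    $$R_{t+1} = \sqrt{RV^m_{t+1|t}}\, z_{t+1}, \quad z_{t+1} \sim \text{i.i.d. } N(0, 1)$$
• where $RV^m_{t+1|t}$ is provided by the ARMA or HAR forecasting models
• Under this assumption, we can compute Value-at-Risk by
    $$VaR^p_{t+1} = -\sqrt{RV^m_{t+1|t}}\, \Phi^{-1}_p$$
• Expected Shortfall is computed by
    $$ES^p_{t+1} = \sqrt{RV^m_{t+1|t}}\, \frac{\phi\left(\Phi^{-1}_p\right)}{p}$$
• Here $\Phi^{-1}_p$ is the p-quantile of the standard normal distribution and $\phi(\cdot)$ is its density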
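A minimal sketch of the risk measures under the normality assumption, taking an RV forecast as given. The function name, the coverage rate p = 1%, and the example forecast are illustrative assumptions. Value-at-Risk is minus the volatility forecast times the p-quantile of the standard normal, and Expected Shortfall is the volatility forecast times the normal density at that quantile divided by p.

```python
from math import sqrt
from statistics import NormalDist

def var_es_from_rv_forecast(rv_forecast, p=0.01):
    """Normal-based VaR and ES (as positive numbers) from a one-day RV forecast."""
    std_normal = NormalDist()
    quantile = std_normal.inv_cdf(p)          # negative for small p
    vol = sqrt(rv_forecast)
    value_at_risk = -vol * quantile
    expected_shortfall = vol * std_normal.pdf(quantile) / p
    return value_at_risk, expected_shortfall

# Example: an assumed RV forecast of 0.0001, that is a 1% daily volatility.
var_1pct, es_1pct = var_es_from_rv_forecast(1e-4, p=0.01)
print(var_1pct, es_1pct)
```

Expected Shortfall always exceeds Value-at-Risk at the same p, since it averages the losses beyond the VaR.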
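Combining GARCH and RV
• Here we try to incorporate the rich information in RV into a GARCH modeling framework
• Consider the basic GARCH model:
    $$R_{t+1} = \sigma_{t+1} z_{t+1}, \quad \sigma^2_{t+1} = \omega + \alpha R_t^2 + \beta \sigma^2_t$$
• Given the information on daily RV we could augment the GARCH model with RV as follows:
    $$\sigma^2_{t+1} = \omega + \alpha R_t^2 + \beta \sigma^2_t + \gamma RV^m_t$$
• In this GARCH-X model, RV is the explanatory variable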
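A minimal sketch of filtering the conditional variance in a GARCH-X model, assuming the recursion adds a term gamma * RV_t to the standard GARCH(1,1) update. The function name, parameter values, and inputs are hypothetical; in practice the parameters would be estimated by maximum likelihood.

```python
def garch_x_variance(returns, rv, omega, alpha, beta, gamma, sigma2_init):
    """Filter sigma2[t+1] = omega + alpha*R[t]^2 + beta*sigma2[t] + gamma*RV[t]."""
    sigma2 = [sigma2_init]
    for r_t, rv_t in zip(returns, rv):
        sigma2.append(omega + alpha * r_t ** 2 + beta * sigma2[-1] + gamma * rv_t)
    return sigma2

# Illustrative inputs: three days of returns and realized variances.
daily_returns = [0.01, -0.02, 0.005]
daily_rv = [2e-4, 3e-4, 1e-4]
path = garch_x_variance(daily_returns, daily_rv, omega=1e-6, alpha=0.05,
                        beta=0.9, gamma=0.1, sigma2_init=1e-4)
print(path)   # the last element is the variance forecast for the next day
```

Setting gamma to zero recovers the basic GARCH(1,1) recursion.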
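• A shortcoming of the GARCH-X approach is that a model for RV is not specified
• This means that we cannot use the model to forecast volatility beyond one day ahead
• The more general so-called Realized GARCH model is defined by
    $$R_{t+1} = \sigma_{t+1} z_{t+1}$$
    $$\sigma^2_{t+1} = \omega + \alpha R_t^2 + \beta \sigma^2_t + \gamma RV^m_t$$
    $$RV^m_t = \omega_{RV} + \beta_{RV} \sigma^2_t + \varepsilon_t$$
• where $\varepsilon_t$ is the innovation to RV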
• This model can be estimated by MLE when assuming that $R_t$ and $\varepsilon_t$ have a joint normal distribution
• The Realized GARCH model can be augmented to include a leverage effect as well
• In the Realized GARCH model the VaR and ES would simply be
    $$VaR^p_{t+1} = -\sigma_{t+1} \Phi^{-1}_p$$
• and
    $$ES^p_{t+1} = \sigma_{t+1} \frac{\phi\left(\Phi^{-1}_p\right)}{p}$$
The All RV Estimator
• As discussed before, in the ideal case with ultra-high liquidity we have m = 24 * 60 observations available within a day
• We can calculate an estimate of the daily variance from the intraday squared returns simply as
    $$RV^m_{t+1} = \sum_{j=1}^{m} R^2_{t+j/m}$$
• This estimator is sometimes known as the All RV
estimator because it uses all the prices on the
1-minute grid
• Figure 5.5 uses simulated data to illustrate one of the problems caused by illiquidity when estimating asset price volatility
• We assume the fundamental asset price, $S^{Fund}$, follows the simple random walk process with constant variance
    $$\ln\left(S^{Fund}_{t+j/m}\right) = \ln\left(S^{Fund}_{t+(j-1)/m}\right) + e_{t+j/m}, \quad e_{t+j/m} \sim \text{i.i.d. } N\left(0, \sigma^2_e\right)$$
• where $\sigma_e = 0.001$ in Figure 5.5
• The observed price fluctuates randomly around the bid
and ask quotes that are posted by the market maker
• We observe
    $$S^{Obs}_{t+j/m} = B_{t+j/m} I_{t+j/m} + A_{t+j/m}\left(1 - I_{t+j/m}\right)$$
• where $B_{t+j/m}$ is the bid price, which is the fundamental price rounded down to the nearest tenth of a dollar
• $A_{t+j/m}$ is the ask price, which is the fundamental price rounded up to the nearest tenth of a dollar
• $I_{t+j/m}$ is an i.i.d. random variable, which takes the values 1 and 0 each with probability 1/2
• $I_{t+j/m}$ is thus an indicator variable of whether the observed price is a bid or an ask price
[Figure 5.5: fundamental price and observed price with Bid-Ask Bounces]
• Figure 5.5 shows that the observed intraday price can be very noisy compared with the smooth but unobserved fundamental price
• The bid-ask spread adds a layer of noise on top of the fundamental price
• If we compute $RV^m_{t+1}$ from the high-frequency $S^{Obs}_{t+j/m}$ then we will get an estimate of $\sigma^2$ that is higher than the true value because of the inclusion of the bid-ask volatility in the estimate
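The upward bias from bid-ask bounce can be illustrated with a small simulation in the spirit of the setup described for Figure 5.5. The starting price of 40, the seed, and the variable names are assumptions made here; the 1-minute volatility of 0.001 and the $1/10 rounding follow the text. The fundamental price follows a random walk in logs, and each observed price is the bid or the ask with equal probability.

```python
import math
import random

def realized_variance(prices):
    """Sum of squared log returns over consecutive prices."""
    log_p = [math.log(p) for p in prices]
    return sum((log_p[j] - log_p[j - 1]) ** 2 for j in range(1, len(log_p)))

random.seed(3)
m = 24 * 60          # 1-minute grid over 24 hours
sigma_e = 0.001      # 1-minute volatility of the fundamental log price
tick = 0.10          # quotes are rounded to the nearest $1/10

log_fund = math.log(40.0)    # hypothetical starting price
fund, obs = [], []
for _ in range(m + 1):
    price = math.exp(log_fund)
    bid = math.floor(price / tick) * tick    # fundamental rounded down
    ask = math.ceil(price / tick) * tick     # fundamental rounded up
    fund.append(price)
    obs.append(bid if random.random() < 0.5 else ask)
    log_fund += random.gauss(0.0, sigma_e)

rv_fund = realized_variance(fund)
rv_obs = realized_variance(obs)
print(rv_fund, rv_obs)
```

The RV computed from observed prices comes out well above the RV of the fundamental price, because every switch between bid and ask adds a squared return that has nothing to do with fundamental volatility.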
The Sparse RV Estimator
• Here we try to construct an s-minute grid (where s ≥ 1) instead of a 1-minute grid so that our new RV estimator, computed from s-minute returns, would be
    $$RV^s_{t+1} = \sum_{j=1}^{m/s} R^2_{t+js/m}$$
• It is sometimes denoted the Sparse RV estimator as opposed to the previous All RV estimator
• The question is how to choose the parameter s
• The larger the s, the less likely we are to get a biased estimate of volatility
• But the larger the s, the fewer observations we are using and so the noisier our estimate will be
• The choice of s clearly depends on the specific asset
• For very liquid assets we should use an s close to 1 and for illiquid assets s should be much larger
• If liquidity effects manifest themselves as a bias in
estimated RVs when using a high sampling frequency then that bias should disappear when the sampling frequency is
lowered (when s is increased)
• Volatility signature plots provide a convenient graphical tool for choosing s:
• First, compute $RV^s_{t+1}$ for values of s going from 1 to 120 minutes
• Second, scatter plot the average RV across days
on the vertical axis against s on the horizontal
axis
• Third, look for the smallest s such that the average
RV does not change much for values of s larger
than this number
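The data behind a volatility signature plot can be sketched as follows. This is a simplified illustration: prices are simulated with additive i.i.d. noise on the log price rather than taken from a market, only a handful of grid sizes between 1 and 120 minutes are used, and all names and parameter values are assumptions. For each s, the Sparse RV is computed every day and then averaged across days.

```python
import random

def sparse_rv(log_prices, s):
    """RV from every s-th log price on the 1-minute grid."""
    grid = log_prices[::s]
    return sum((grid[j] - grid[j - 1]) ** 2 for j in range(1, len(grid)))

random.seed(5)
m, n_days = 24 * 60, 50
sigma_e, sigma_u = 0.001, 0.0005                 # assumed fundamental and noise volatilities
grid_sizes = [1, 2, 3, 5, 10, 15, 30, 60, 120]   # all divide m evenly

totals = {s: 0.0 for s in grid_sizes}
for day in range(n_days):
    fund = [0.0]
    for minute in range(m):
        fund.append(fund[-1] + random.gauss(0.0, sigma_e))
    observed = [x + random.gauss(0.0, sigma_u) for x in fund]
    for s in grid_sizes:
        totals[s] += sparse_rv(observed, s)

# Average RV across days for each s: the points of the signature plot.
signature = {s: totals[s] / n_days for s in grid_sizes}
for s in grid_sizes:
    print(s, signature[s])
```

In this simulated setting the average RV falls as s increases and then levels off near the true daily variance, which is the pattern one looks for when picking the smallest stable s.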
• In markets with wide bid–ask spreads the average
RV in the volatility signature plot will be
downward sloping for small s
• But for larger s the average RV will stabilize at the
true long run volatility level
• We want to choose the smallest s for which the average RV is stable. This will avoid bias and minimize variance
• In markets where trading is thin, new information
is only slowly incorporated into the price
• Intraday returns will have positive autocorrelation
resulting in an upward sloping volatility signature
plot
• To compute RV, choose the smallest s for which
the average RV has stabilized
The Average RV Estimator
• Let us use the volatility signature plot to choose s = 15 in the Sparse RV so that we are using a 15-minute grid for prices and squared returns to compute RV
• The first Sparse RV will use a 15-minute grid starting with the 15-minute return at midnight, call it $RV^{s,1}_{t+1}$
• The second will also use a 15-minute grid but this one will be starting one minute past midnight, call it $RV^{s,2}_{t+1}$, and so on until the 15th Sparse RV, which uses a 15-minute grid starting at 14 minutes past midnight, call it $RV^{s,15}_{t+1}$
• We thus use the fine 1-minute grid to compute 15 Sparse RVs at the 15-minute frequency
• We have used the 1-minute grid, but we have used it to compute 15 different RV estimates, each based on 15-minute returns, and none of which are materially affected by illiquidity bias
• By simply averaging the 15 sparse RVs we get the Average RV estimator
    $$RV^{Avr}_{t+1} = \frac{1}{s}\sum_{i=1}^{s} RV^{s,i}_{t+1}$$
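A minimal sketch of the averaging step, assuming one day of log prices on the 1-minute grid. The function name and the simulated data are illustrative. As a simplification, each offset grid only uses the prices available within the day, so grids that start after midnight contain one fewer return than the first grid.

```python
import random

def average_rv(log_prices, s):
    """Average of the s Sparse RVs built on s-minute grids offset by 0, 1, ..., s-1 minutes."""
    total = 0.0
    for offset in range(s):
        grid = log_prices[offset::s]
        total += sum((grid[j] - grid[j - 1]) ** 2 for j in range(1, len(grid)))
    return total / s

# Simulated day of 1-minute log prices (illustrative volatility).
random.seed(6)
m = 24 * 60
log_prices = [0.0]
for _ in range(m):
    log_prices.append(log_prices[-1] + random.gauss(0.0, 0.001))

rv_average = average_rv(log_prices, 15)
print(rv_average)
```

With s = 1 the function reduces to the All RV estimator, since there is only one grid and it contains every price.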
RV Estimators with Autocovariance Adjustments
• To avoid RV bias we can try to model and then correct for the autocorrelations in intraday returns that are driving the volatility bias
• Assume that the fundamental log price is observed with an additive i.i.d. error term, u, caused by illiquidity so that
    $$\ln\left(S^{Obs}_{t+j/m}\right) = \ln\left(S^{Fund}_{t+j/m}\right) + u_{t+j/m}$$
• In this case the observed log return will equal the true fundamental return plus an MA(1) error:
    $$R^{Obs}_{t+j/m} = R^{Fund}_{t+j/m} + u_{t+j/m} - u_{t+(j-1)/m}$$
• Due to the MA(1) measurement error our simple squared return All RV estimate will be biased. The All RV in this case is defined by