VIETNAM NATIONAL UNIVERSITY
UNIVERSITY OF SCIENCE
FACULTY OF MATHEMATICS, MECHANICS AND INFORMATICS
Dao Minh Phuong
EXTREME VALUE THEORY AND APPLICATIONS TO FINANCIAL MARKET
Undergraduate Thesis
Advanced Undergraduate Program in Mathematics
Thesis advisor: Dr Luu Hoang Duc
Hanoi - 2012
Chapter 1 Extreme Value Theory
1.1 Introduction
1.2 Block Maxima method
1.2.1 Limiting Behavior of Maxima and Extrema
1.2.2 Fisher-Tippett Theorem (1928)
1.3 The Peaks-over-Thresholds (POT) method
1.3.1 The Generalized Pareto Distribution (GPD)
1.3.2 The POT method
1.3.3 Pickands-Balkema-de Haan theorem
Chapter 2 Applications: Some theoretical computations
2.1 Block Maxima Method
2.1.1 Maximum Likelihood Estimation
2.1.2 Hill estimator
2.1.3 Value at Risk
2.1.4 Block Maxima Method Approach to VaR
2.1.5 Multiperiod VaR
2.1.6 Return Levels
2.2 The POT method
2.2.1 The selection of threshold u
2.2.2 Expected Shortfall
2.2.3 POT method approach to VaR and Expected Shortfall
Chapter 3 Applications: Empirical computations
3.1 Stock market
3.2 Case study: the Crash in 1987
3.3 Risk measure computations using R: case study for Coca-Cola stock
3.3.1 Block Maxima Method
3.3.2 POT method
I am grateful to all those who gave their time and support to this thesis. Foremost among them is my advisor and instructor, Dr Luu Hoang Duc. Thank you for your dedication, patience, enthusiasm, motivation and immense knowledge. I highly appreciate your encouragement throughout the thesis.
Second, my sincere thanks go to all professors and lecturers of the Faculty of Mathematics, Mechanics and Informatics for their help throughout my university life at Hanoi University of Science.
Last but not least, I would like to thank my family and my friends from the K53 Advanced Mathematics program, who always supported me and gave me advice during my university life.
Conceptually, mathematics and finance have a lot in common. Both deal daily with changes in the economy and with the factors that lead to those changes. The subject "financial mathematics" was created to apply mathematical theories and results in order to describe and predict the quantitative changes which might lead to serious damage in the financial world and in economies, and to suggest the best responses to those phenomena.
The purpose of this thesis is to highlight the significant conceptual overlap between mathematics and finance and to present the possibilities for advancing financial research through the application of a mathematical method called "Extreme Value Theory". Extreme Value Theory is a branch of statistical mathematics that studies extreme deviations from the median of probability distributions. It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any previously observed. These values are important since they usually describe the times of greatest gains or losses. The theory provides the solid foundation needed for the statistical modeling of such events and for the computation of extreme risk measures in finance. Devastating floods, tornadoes, earthquakes, market crashes, etc. are all real-world phenomena that can be modeled by extreme value theory.
This thesis consists of three chapters. The first chapter presents Extreme Value Theory and some theoretical results, namely the block maxima method and the Peaks-over-Thresholds method. The next chapter introduces some applications to financial markets, such as Value at Risk, return levels and the choice of threshold. The last chapter contains some empirical computations using R.
Chapter 1 Extreme Value Theory
1.1 Introduction
Let us consider a random variable representing daily returns. Basically, there are two methods for identifying extremes in real data. The first method considers the maximum of the variable in consecutive periods, such as months or years. These selected observations form the extreme events, called block maxima. In the left panel of Figure 1.1, the observations $X_2, X_5, X_7, X_{11}$ are the block maxima for four periods of three observations each. In contrast, the second method focuses on the realizations exceeding a given threshold. The observations $X_1, X_2, X_7, X_8, X_9, X_{11}$ in the right panel all exceed the threshold $u$ and constitute the extreme events.
The block maxima method is the traditional approach used to analyze data with occasional extremes, as for instance hydrological data. Nevertheless, the Peaks-over-Thresholds (POT) method uses data more efficiently, so it has become popular in recent applications.
In the following sections, the block maxima and POT methods are presented. They are mostly based on Embrechts [2].
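The two ways of selecting extremes can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `block_maxima` and `exceedances` and the twelve sample values are made up for illustration, chosen so that the selected observations match the indices quoted for Figure 1.1.

```python
def block_maxima(values, block_size):
    """Return the maximum of each complete, non-overlapping block."""
    n_blocks = len(values) // block_size
    return [max(values[i * block_size:(i + 1) * block_size])
            for i in range(n_blocks)]


def exceedances(values, u):
    """Return the observations that are strictly greater than the threshold u."""
    return [v for v in values if v > u]


# Hypothetical observations X1, ..., X12 (made-up numbers, not real returns).
x = [2.1, 3.4, 1.0, 0.7, 1.8, 1.5, 3.8, 3.1, 2.6, 0.4, 2.2, 1.3]

print(block_maxima(x, 3))    # one maximum per block of three: X2, X5, X7, X11
print(exceedances(x, 2.0))   # everything above u = 2.0: X1, X2, X7, X8, X9, X11
```

Note that the block maxima method keeps X5 although it is small, and discards X8 and X9 although they are large; this is the sense in which the POT method uses the data more efficiently.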
1.2 Block Maxima method
In this part, we consider the so-called Generalized Extreme Value distribution (GEV for short) and the most important result: the Fisher-Tippett theorem. In order to do that, let us first consider the limiting behaviour of maxima.
Figure 1.1: Block maxima (left panel) and peaks over threshold u (right panel)
1.2.1 Limiting Behavior of Maxima and Extrema
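Let $X_1, X_2, \dots$ be iid random variables with distribution function (df) $F$. In risk management applications, these may represent operational losses or financial losses. Let $M_n = \max(X_1, \dots, X_n)$ denote the maximum of the first $n$ observations. By independence,
$$P(M_n \le x) = P(X_1 \le x, \dots, X_n \le x) = F^n(x).$$
It can be shown that, almost surely, $M_n \to x_F$ as $n \to \infty$, where $x_F := \sup\{x \in \mathbb{R} : F(x) < 1\} \le \infty$ is the right endpoint of $F$.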
In practice, the distribution $F(x)$ is unknown and therefore the cumulative distribution function (cdf) of $M_n$ is also unknown. Moreover, as $n$ goes to infinity, $F^n(x) \to 0$ if $x < x_F$ and $F^n(x) \to 1$ if $x \ge x_F$; in other words, $F^n(x)$ becomes degenerate. This degenerate cdf has no practical value. Hence, extreme value theory is interested in finding sequences of real numbers $a_n > 0$ and $b_n$ such that the sequence of normalized maxima $(M_n - b_n)/a_n$ converges in distribution to some non-degenerate distribution function $H(x)$:
$$\lim_{n \to \infty} P\left(\frac{M_n - b_n}{a_n} \le x\right) = \lim_{n \to \infty} F^n(a_n x + b_n) = H(x).$$
If this condition holds, then $F$ is said to be in the maximum domain of attraction of $H$, and we write $F \in MDA(H)$. The limit $H$ is determined only up to the location sequence $b_n$ and the scale sequence $a_n$, so it defines a unique type of distribution.
For more details and a comprehensive treatment of extreme value theory, we refer to Embrechts [2].
Let $M_n^* = (M_n - b_n)/a_n$. Under the independence assumption, the limiting distribution of the normalized maxima $M_n^*$ is given by
$$H_\xi(x) = \begin{cases} \exp\left(-(1+\xi x)^{-1/\xi}\right), & \xi \neq 0, \\ \exp\left(-e^{-x}\right), & \xi = 0, \end{cases}$$
where $1 + \xi x > 0$. The three cases are:
$\xi > 0$: $H_\xi$ corresponds to the classical Fréchet df
$\xi = 0$: $H_\xi$ corresponds to the classical Gumbel df
$\xi < 0$: $H_\xi$ corresponds to the classical Weibull df
Below, we present the most important and fundamental result, the Fisher-Tippett theorem.
Figure 1.2: GEV: distribution functions for various ξ
1.2.2 Fisher-Tippett Theorem (1928)
Theorem 1.1. If appropriately normalized maxima converge in distribution to a non-degenerate limit, then the limit distribution must be an extreme value distribution. In short: if $F \in MDA(H)$, then $H$ is of type $H_\xi$ for some $\xi$, where $H_\xi$ is the Generalized Extreme Value distribution.
The Fisher-Tippett theorem essentially says that the GEV is the only possible limiting distribution for normalized block maxima.
One of the main steps in applying the Fisher-Tippett theorem is to determine in which case $F \in MDA(H_\xi)$ holds. The following remarks help us to do that.
Figure 1.3: GEV: densities for various ξ. The solid line is the Gumbel distribution, the dashed line is the Fréchet distribution with ξ = 0.9 and the dotted line is the Weibull distribution with ξ = −0.5
1 Fréchet case: ($\xi > 0$)
Gnedenko (1943) pointed out that for $\xi > 0$
$$F \in MDA(H_\xi) \iff 1 - F(x) = x^{-1/\xi} L(x)$$
for some slowly varying function $L(x)$.
A function $L$ on $(0, \infty)$ is called slowly varying if
$$\lim_{x \to \infty} \frac{L(tx)}{L(x)} = 1 \quad \text{for all } t > 0.$$
Example 1.3. A typical example is the Pareto distribution,
$$F(x) = 1 - \left(\frac{K}{K + x}\right)^{\alpha}, \quad \alpha, K > 0, \ x \ge 0,$$
which is in $MDA(H_{1/\alpha})$ (Fréchet case) if we take $a_n = K n^{1/\alpha}/\alpha$, $b_n = K n^{1/\alpha} - K$.
Other heavy-tailed distributions such as the Burr, Cauchy, log-gamma and t-distributions and various mixture models also belong to the Fréchet family. Some moments can be infinite.
2 Gumbel case: $F \in MDA(H_0)$
The characterization of this class is more complex. Generally it includes distributions whose tails decay roughly exponentially, and we call these thin-tailed or light-tailed distributions. All moments exist for distributions in the Gumbel class.
Example 1.4. The exponential distribution: $F(x) = 1 - e^{-\lambda x}$, $\lambda > 0$, $x \ge 0$. If we take $a_n = 1/\lambda$ and $b_n = (\log n)/\lambda$, then $F$ is in $MDA(H_0)$, i.e. $\xi = 0$.
3 Weibull case: ($\xi < 0$)
The Weibull distribution is the asymptotic distribution of the maxima of distributions with a finite right endpoint. Examples are the uniform and beta distributions.
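The norming constants quoted in the Pareto and exponential examples can be checked numerically by comparing $F^n(a_n x + b_n)$ with the limiting GEV df for a large $n$. The sketch below is illustrative only: the helper names and the parameter values ($\alpha = 2$, $K = 1.5$, $\lambda = 0.5$, $n = 10^6$, $x = 1$) are arbitrary choices of ours.

```python
import math


def gev_cdf(x, xi):
    """Standard GEV df H_xi(x); xi = 0 is the Gumbel case."""
    if xi == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + xi * x
    if t <= 0.0:
        # Outside the support: 0 on the left (xi > 0), 1 on the right (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def pareto_cdf(x, alpha, K):
    return 1.0 - (K / (K + x)) ** alpha


def exponential_cdf(x, lam):
    return 1.0 - math.exp(-lam * x)


def normalized_max_cdf(cdf, a_n, b_n, n, x):
    """P((M_n - b_n) / a_n <= x) = F(a_n * x + b_n) ** n for iid data."""
    return cdf(a_n * x + b_n) ** n


alpha, K, lam, n, x = 2.0, 1.5, 0.5, 10**6, 1.0

# Pareto example: a_n = K n^(1/alpha) / alpha, b_n = K n^(1/alpha) - K.
a_par = K * n ** (1.0 / alpha) / alpha
b_par = K * n ** (1.0 / alpha) - K
pareto_limit = normalized_max_cdf(lambda t: pareto_cdf(t, alpha, K), a_par, b_par, n, x)

# Exponential example: a_n = 1 / lambda, b_n = log(n) / lambda.
expo_limit = normalized_max_cdf(lambda t: exponential_cdf(t, lam),
                                1.0 / lam, math.log(n) / lam, n, x)

print(pareto_limit, gev_cdf(x, 1.0 / alpha))   # finite-n value vs Frechet limit
print(expo_limit, gev_cdf(x, 0.0))             # finite-n value vs Gumbel limit
```

In both cases the finite-$n$ value and the limit should agree to several decimal places, which is what membership in the respective maximum domain of attraction means in practice.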
Remark 1.5.
• Essentially all commonly encountered continuous distributions are in the maximum domain of attraction of an extreme value distribution.
• From Figure 1.3, we also see that the right tail of the distribution decays exponentially in the Gumbel case and like a power function in the Fréchet case, and that it is finite in the Weibull case. In risk measurement, we are mostly interested in the Fréchet family, which includes the stable and Student-t distributions.
• The normalizing sequences $a_n$ and $b_n$ can always be chosen so that the limit $H_\xi$ has the standard form, without rescaling or relocation.
The Fisher-Tippett theorem has two important implications. First, the tail behaviour of the cdf $F(x)$ determines the limiting distribution $H_\xi(x)$ of the normalized maxima. Therefore, extreme value theory is applicable to a wide range of distributions. Second, the tail index $\xi$ does not depend on the time interval over which the data are observed; that is, it is stable under time aggregation.
Next, we introduce location and scale parameters $\mu$ and $\sigma > 0$, which stand for the unknown norming constants, and work with
$$H_{\xi,\mu,\sigma}(x) := H_\xi\left(\frac{x - \mu}{\sigma}\right).$$
Obviously, $H_{\xi,\mu,\sigma}$ is of type $H_\xi$. The parameter $\xi$, also called the tail index, represents the thickness of the tail of the distribution. The Fréchet distribution corresponds to fat-tailed distributions and has been found to be the most suitable for fat-tailed financial data. This result is very useful since, whenever a non-degenerate limit exists, the asymptotic distribution of the maximum is one of these three distributions, no matter what the original distribution is.
In practice, we need to estimate $\xi$, $\mu$ and $\sigma$. We collect data on block maxima and then fit the three-parameter form of the GEV. This requires a lot of raw data in order to form sufficiently many, sufficiently large blocks. Methods used to estimate those parameters will be described in the next chapter.
1.3 The Peaks-over-Thresholds (POT) method
From what we have discussed above, we see that the disadvantage of the block maxima method is that, if we observe data over a period of a few years and then take only the maximum value for each period, we might lose potential extreme events. For instance, suppose we are interested in modeling the rainfall of Vietnam. Given that heavy rain often occurs during the summer, we expect the most extreme rainfall to take place over a few months in summer. However, if we divide a year into twelve months and choose the highest point of each month, we will lose some high rainfall occurring in summer but still keep low rainfall from the winter. Hence, we seek an alternative approach to help us handle this kind of situation.
Another method, called the Peaks-over-Thresholds (POT) method, is to set a threshold above which data are regarded as extreme and then to collect the exceedances over that threshold. We model these data using the Generalized Pareto distribution, which gives the probability of recording extreme events that surpass the threshold.
1.3.1 The Generalized Pareto Distribution (GPD)
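The GPD is a two-parameter distribution with distribution function
$$G_{\xi,\beta}(x) = \begin{cases} 1 - (1 + \xi x/\beta)^{-1/\xi}, & \xi \neq 0, \\ 1 - \exp(-x/\beta), & \xi = 0, \end{cases}$$
where $\beta > 0$, and the support is $x \ge 0$ if $\xi \ge 0$ and $x \in [0, -\beta/\xi]$ if $\xi < 0$.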
The figure below shows the shape of the GPD $G_{\xi,\sigma}(x)$, where $\xi$, called the tail index or shape parameter, takes positive, negative and zero values. The scale parameter $\sigma$ (denoted $\beta$ above) is set to 1.
Figure 1.4: Shape of GPD $G_{\xi,\sigma}$ for $\sigma = 1$
The tail index $\xi$ indicates the heaviness of the tail: the larger $\xi$, the heavier the tail. For $\xi > 0$, $E(X^k)$ does not exist for $k \ge 1/\xi$. Since in general we cannot fix an upper bound for financial losses, only distributions with shape parameter $\xi \ge 0$ are suited to model financial return distributions.
ξ >0 Pareto (parametrized version)
ξ =0 Exponential
ξ <0 Pareto type II
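The three cases can be handled by a single function. The following sketch implements the GPD df and its inverse (the quantile function), which also gives a simple way to simulate GPD data by inverse transform sampling. The function names and the parameter values in the usage lines are illustrative assumptions, not part of the thesis.

```python
import math
import random


def gpd_cdf(x, xi, beta):
    """GPD df G_{xi,beta}(x); the support is x >= 0 (and x <= -beta/xi if xi < 0)."""
    if x <= 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-x / beta)
    t = 1.0 + xi * x / beta
    if t <= 0.0:
        # Only possible when xi < 0: x lies beyond the upper endpoint -beta/xi.
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def gpd_quantile(p, xi, beta):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if xi == 0.0:
        return -beta * math.log(1.0 - p)
    return beta / xi * ((1.0 - p) ** (-xi) - 1.0)


# Inverse transform sampling: plug uniform random numbers into the quantile function.
rng = random.Random(0)
sample = [gpd_quantile(rng.random(), 0.25, 1.0) for _ in range(5)]
print(sample)

# The 90% quantile grows quickly with xi, reflecting the heavier tail.
for xi in (-0.5, 0.0, 0.5):
    print(xi, gpd_quantile(0.9, xi, 1.0))
```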
1.3.2 The POT method
In this part, we give the theoretical foundations for the second approach. The key ingredients are the excess distribution and the Pickands-Balkema-de Haan theorem.
The excess distribution: Consider an unknown distribution function $F$ of a random variable $X$. Let $u$ be a high threshold. We are interested in estimating the distribution function $F_u$ of the values of $X$ above the threshold $u$.
Figure 1.5: Distribution function $F$ and conditional distribution function $F_u$
The distribution function $F_u$ is called the conditional excess distribution function and is defined as
$$F_u(x) = P(X - u \le x \mid X > u) = \frac{F(x + u) - F(u)}{1 - F(u)}, \qquad (9)$$
for $0 \le x \le x_F - u$, where $x_F \le \infty$ is the right endpoint of $F$.
The realizations of the random variable $X$ lie mainly between 0 and $u$, and thus the estimation of $F$ in this interval generally does not pose any problem. The estimation of the portion $F_u$, however, can be hard, as we generally have very few observations in this area.
Example 1.6. 1 The exponential distribution
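For the exponential case, the conditional excess distribution can be worked out directly from the definition of $F_u$. The short derivation below is our own sketch rather than the original computation; it assumes $F(x) = 1 - e^{-\lambda x}$ with rate $\lambda > 0$, borrowing the symbol $\lambda$ from Example 1.4.

```latex
F_u(x) = \frac{F(x+u) - F(u)}{1 - F(u)}
       = \frac{e^{-\lambda u} - e^{-\lambda (x+u)}}{e^{-\lambda u}}
       = 1 - e^{-\lambda x}, \qquad x \ge 0 .
```

So the excess distribution over any threshold $u$ is again exponential with the same rate (the memoryless property). In GPD terms this is the case $\xi = 0$ with $\beta = 1/\lambda$, and here the GPD form holds exactly rather than only approximately.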
1.3.3 Pickands-Balkema-de Haan theorem
Extreme value theory is very helpful as it provides a powerful result about the conditional excess distribution function, which is stated in the following theorem:
Theorem 1.7. (Pickands (1975), Balkema and de Haan (1974)): For a large class of underlying distribution functions $F$, the conditional excess distribution function $F_u(x)$, for $u$ large, is well approximated by a GPD. More precisely, we can find a function $\beta(u)$ such that
$$\lim_{u \to x_F} \ \sup_{0 \le x < x_F - u} \left| F_u(x) - G_{\xi,\beta(u)}(x) \right| = 0$$
if and only if $F \in MDA(H_\xi)$, $\xi \in \mathbb{R}$.
Basically, all the common continuous distributions used in risk management or insurance mathematics are in $MDA(H_\xi)$ for some value of $\xi$.
The Pickands-Balkema-de Haan theorem explains the importance of the Generalized Pareto Distribution (GPD): the GPD is the natural model for the unknown excess distribution over sufficiently high thresholds. For a large class of underlying distribution functions $F$, the conditional excess distribution function $F_u(x)$, for $u$ large, is well approximated by
$$F_u(x) \approx G_{\xi,\beta}(x),$$
for some $\xi$ and $\beta$. To estimate these parameters we fit the GPD to the excess amounts over the threshold $u$. Standard properties of maximum likelihood estimators apply if $\xi > -0.5$.
In order to implement the POT method, we have to choose a suitable threshold $u$. There are data-analytic tools such as the mean excess plot to help us, although later simulations will suggest that inference is often robust to the choice of threshold.
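As an illustration of the mean excess plot, the sketch below computes the sample mean excess, i.e. the average amount by which observations above $u$ exceed $u$, for several thresholds. For a GPD with $\xi < 1$ the theoretical mean excess function is linear in the threshold, $e(u) = (\beta + \xi u)/(1 - \xi)$, so one usually looks for the threshold above which the plotted values become roughly linear. The function name `mean_excess` and the simulated exponential losses are assumptions made for this example only.

```python
import random


def mean_excess(values, u):
    """Sample mean excess e_n(u): average of (x - u) over observations x > u."""
    excesses = [x - u for x in values if x > u]
    if not excesses:
        raise ValueError("no observation exceeds the threshold")
    return sum(excesses) / len(excesses)


# Simulated exponential losses with rate 2 (not real data). For exponential data
# the mean excess is flat at 1 / rate = 0.5, which corresponds to xi = 0.
rng = random.Random(42)
losses = [rng.expovariate(2.0) for _ in range(20000)]

for u in (0.0, 0.5, 1.0):
    print(u, round(mean_excess(losses, u), 3))
```

With real loss data one would plot these values against $u$; an upward-sloping straight line suggests $\xi > 0$, i.e. a heavy tail.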
Chapter 2 Applications: Some theoretical computations
This chapter is mostly based on McNeil [6] and Tsay [8].
2.1 Block Maxima Method
2.1.1 Maximum Likelihood Estimation
Maximum likelihood estimation is a method of estimating the parameters of a statistical model. Suppose we are given block maxima data $y = (M_n^{(1)}, \dots, M_n^{(m)})'$ from $m$ blocks of size $n$, and we need to estimate $\theta = (\xi, \mu, \sigma)'$. We build a log-likelihood by assuming that we have independent observations from a GEV with density $h_\theta$:
$$l(\theta; y) = \sum_{i=1}^{m} \log h_\theta\left(M_n^{(i)}\right).$$
Procedure to find the MLE:
• Define the likelihood function $L(\theta)$.
• Take the natural logarithm $\ln L(\theta)$.
• Differentiate $\ln L(\theta)$ with respect to $\theta$ and set the derivative equal to 0.
• Solve for the parameter $\theta$ to obtain $\hat\theta$.
• Check whether the solution is a local or a global maximizer.
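Let us consider examples with some common distributions to understand how it works.
Example 2.1.
• For one parameter, let us consider the geometric distribution, which has the probability mass function
$$P(X = x) = (1 - p)^{x - 1} p, \quad x = 1, 2, \dots$$
For observations $x_1, \dots, x_n$, the likelihood and log-likelihood are
$$L(p) = \prod_{i=1}^{n} (1 - p)^{x_i - 1} p = p^n (1 - p)^{\sum_{i=1}^{n} x_i - n}, \qquad \ln L(p) = n \ln p + \left(\sum_{i=1}^{n} x_i - n\right) \ln(1 - p).$$
Equating $\frac{d \ln L}{dp} = \frac{n}{p} - \frac{\sum_{i=1}^{n} x_i - n}{1 - p}$ to zero and solving for $p$, we get
$$p = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}.$$
Hence, we get the maximum likelihood estimator of $p$ as
$$\hat{p} = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}}.$$
• For two parameters, let us consider the normal distribution $N(\mu, \sigma^2)$. The density function of a normal variable is given by
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).$$
Hence, the likelihood function is
$$L(\mu, \sigma^2) = \prod_{i=1}^{n} f(x_i) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2\right).$$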
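Both examples have closed-form maximum likelihood estimators, which makes them easy to check numerically. The sketch below assumes the geometric distribution is supported on $\{1, 2, \dots\}$ with $P(X = x) = (1-p)^{x-1} p$ (the form consistent with $\hat p = 1/\bar X$), and uses the standard result that the normal MLEs are the sample mean and the variance with divisor $n$. Function names and data are made up for illustration.

```python
import math


def geometric_mle(xs):
    """MLE of p for P(X = x) = (1 - p)**(x - 1) * p, x = 1, 2, ..."""
    return len(xs) / sum(xs)


def geometric_loglik(p, xs):
    """ln L(p) = n ln p + (sum(x_i) - n) ln(1 - p)."""
    n = len(xs)
    return n * math.log(p) + (sum(xs) - n) * math.log(1.0 - p)


def normal_mle(xs):
    """MLE of (mu, sigma^2) for iid normal data; note the divisor n, not n - 1."""
    n = len(xs)
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2


xs = [3, 1, 4, 1, 5, 2]            # made-up geometric observations
p_hat = geometric_mle(xs)

# The log-likelihood at p_hat should be at least as large as at nearby values.
for p in (p_hat - 0.05, p_hat, p_hat + 0.05):
    print(round(p, 3), round(geometric_loglik(p, xs), 4))

print(normal_mle([1.0, 2.0, 3.0, 4.0]))
```

Evaluating the log-likelihood around $\hat p$ is a cheap version of the last step of the procedure, the check that the stationary point is indeed a maximizer.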
When applying this to the GEV distribution, bias and variance obviously have to be balanced when defining the blocks. We reduce bias by increasing the block size $n$; we reduce variance by increasing the number of blocks $m$.
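For the GEV there is no closed-form estimator, so the log-likelihood has to be maximized numerically. The sketch below only shows the objective function, the negative log-likelihood built from the GEV density $h_\theta(x) = \sigma^{-1} t^{-1/\xi - 1} \exp(-t^{-1/\xi})$ with $t = 1 + \xi (x - \mu)/\sigma$, together with a simulated sanity check. The function names, the simulated data and the parameter values are assumptions for illustration; in practice one would hand this function to a numerical optimizer.

```python
import math
import random


def gev_neg_loglik(theta, maxima):
    """Negative log-likelihood of iid GEV(xi, mu, sigma) observations."""
    xi, mu, sigma = theta
    if sigma <= 0.0:
        return math.inf
    total = 0.0
    for m in maxima:
        z = (m - mu) / sigma
        if abs(xi) < 1e-12:                      # Gumbel case (xi = 0)
            total += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0.0:                         # observation outside the support
                return math.inf
            total += (math.log(sigma) + (1.0 + 1.0 / xi) * math.log(t)
                      + t ** (-1.0 / xi))
    return total


def gev_sample(xi, mu, sigma, size, rng):
    """Inverse transform sampling from GEV(xi, mu, sigma), for xi != 0."""
    out = []
    for _ in range(size):
        p = rng.random() or 0.5     # guard against the (very unlikely) exact 0.0
        out.append(mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1.0))
    return out


# Simulated "block maxima" with known parameters (not real data).
maxima = gev_sample(0.2, 0.0, 1.0, 2000, random.Random(1))
nll_true = gev_neg_loglik((0.2, 0.0, 1.0), maxima)

# Wrong location or scale should give a clearly worse (larger) value.
print(nll_true)
print(gev_neg_loglik((0.2, 1.0, 1.0), maxima))
print(gev_neg_loglik((0.2, 0.0, 2.0), maxima))
```

Returning infinity outside the admissible region is one simple way of keeping an optimizer away from parameter values for which some observation falls outside the support.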
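2.1.2 Hill estimator
For the Hill estimator $\xi_h(k)$, computed from the $k$ largest observations, $\sqrt{k}\,[\xi_h(k) - \xi]$ is asymptotically normal with mean 0 and variance $\xi^2$.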
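A small simulation can illustrate this asymptotic result. The sketch below assumes the usual textbook form of the Hill estimator, the average of the log-spacings between the $k$ largest observations and the $(k+1)$-th largest one, which requires positive data and $\xi > 0$. The function name, the Pareto sample and the choice $k = 2000$ are illustrative.

```python
import math
import random


def hill_estimator(values, k):
    """Hill estimate of xi from the k largest observations (they must be > 0)."""
    xs = sorted(values, reverse=True)          # xs[0] is the largest observation
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < len(values)")
    if xs[k] <= 0.0:
        raise ValueError("the k + 1 largest observations must be positive")
    log_ref = math.log(xs[k])                  # (k+1)-th largest as reference level
    return sum(math.log(x) - log_ref for x in xs[:k]) / k


# Simulated Pareto data with tail 1 - F(x) = x**(-alpha), x >= 1, so xi = 1/alpha.
rng = random.Random(7)
alpha = 2.0
sample = [rng.paretovariate(alpha) for _ in range(50000)]

k = 2000
xi_hat = hill_estimator(sample, k)
# Approximate 95% interval from the asymptotic variance xi^2 / k.
half_width = 1.96 * xi_hat / math.sqrt(k)
print(xi_hat, (xi_hat - half_width, xi_hat + half_width))
```

In practice the estimate is usually plotted against $k$ (a Hill plot), since a small $k$ gives a noisy estimate and a large $k$ brings in observations that do not belong to the tail.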
2.1.3 Value at Risk
The VaR estimate can be used by financial companies to assess their risks or by a regulatory committee to set margin requirements. In both cases, VaR is used to ensure that financial firms can still stay in business after a devastating event. From the viewpoint of a financial company, VaR can be defined as the maximal loss of a financial position during a given time period for a given probability. From this perspective, one considers VaR as a measure of the loss associated with a rare event under normal market conditions. From the viewpoint of a regulatory committee, VaR can be defined as the minimal loss under extraordinary market conditions. Both definitions lead to the same VaR estimate despite the different concepts.
Assume that a random variable $X$ with continuous distribution function $F$ models losses or negative returns on a certain financial instrument over a certain time horizon. Then $VaR_q$ is defined as the $q$-th quantile of the distribution $F$:
$$VaR_q = F^{-1}(q) = \inf\{x \in \mathbb{R} : F(x) \ge q\}, \qquad (2.4)$$
where $F^{-1}$ is called the quantile function.
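When $F$ is replaced by the empirical distribution function of observed losses, the infimum in the definition becomes an order statistic. The following sketch shows this empirical version; the function name `empirical_var` and the toy data are our own, and this is the plain historical quantile, not yet an extreme value method.

```python
import math


def empirical_var(losses, q):
    """Smallest observed loss x whose empirical df F_n(x) is at least q."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    xs = sorted(losses)
    # F_n(xs[i]) >= (i + 1) / n, so we need the smallest i with (i + 1) / n >= q.
    index = math.ceil(q * len(xs)) - 1
    return xs[index]


losses = [3.0, 1.0, 2.0, 4.0]          # toy losses
print(empirical_var(losses, 0.3))      # F_n(1) = 0.25 < 0.3 <= F_n(2) = 0.5
print(empirical_var(losses, 1.0))      # the largest observed loss
```

The weakness of this estimator for high $q$ is that it can never exceed the largest observation, which is precisely why the tail is modeled parametrically in what follows.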
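Let us recall the conditional excess distribution function:
$$F_u(y) = P(X - u \le y \mid X > u) = \frac{F(u + y) - F(u)}{1 - F(u)}.$$
Writing $\bar{F} = 1 - F$ and $\bar{F}_u = 1 - F_u$ for the corresponding tail functions, we have
$$\bar{F}(u + y) = P(X - u > y \mid X > u) \cdot P(X > u), \qquad (2.7)$$
$$\bar{F}(x) = \bar{F}(u)\, \bar{F}_u(x - u), \quad \text{where } x = u + y. \qquad (2.8)$$
Besides, R. Smith (1987) introduced a tail estimator based on the GPD approximation to the excess distribution. Let
$$\hat{\bar{F}}(u) = \frac{N_u}{n},$$
where $n$ is the total number of observations and $N_u$ is the number of observations exceeding $u$. For a sufficiently high threshold $u$, $F_u(x - u) \approx G_{\xi,\beta(u)}(x - u)$. Hence, we get the tail estimator
$$\hat{\bar{F}}(x) = \frac{N_u}{n} \left(1 + \hat{\xi}\, \frac{x - u}{\hat{\beta}}\right)^{-1/\hat{\xi}},$$
where $x > u$. A high $u$ reduces bias in estimating the excess function. A low $u$ reduces variance in estimating the excess function and $F(u)$.
Inverting the tail estimation formula with $q > F(u)$, we obtain
$$\widehat{VaR}_q = u + \frac{\hat{\beta}}{\hat{\xi}} \left[\left(\frac{n}{N_u}(1 - q)\right)^{-\hat{\xi}} - 1\right].$$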
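The tail estimator and its inversion translate directly into code. In the sketch below the estimated tail probability is $(N_u/n)(1 + \xi (x-u)/\beta)^{-1/\xi}$ for $x > u$, and the VaR is the level $x$ at which this probability equals $1 - q$. The function names and all numerical inputs (threshold, fitted $\hat\xi$ and $\hat\beta$, sample sizes) are hypothetical placeholders, not results from the thesis, and the case $\xi = 0$ is not handled.

```python
def pot_tail_probability(x, u, xi, beta, n, n_u):
    """Estimated P(X > x) for x > u, based on a GPD fit to the excesses (xi != 0)."""
    return n_u / n * (1.0 + xi * (x - u) / beta) ** (-1.0 / xi)


def pot_var(q, u, xi, beta, n, n_u):
    """Level x with estimated P(X > x) = 1 - q, i.e. the inverted tail estimator."""
    if q <= 1.0 - n_u / n:
        raise ValueError("q must exceed the empirical F(u) = 1 - N_u / n")
    return u + beta / xi * ((n / n_u * (1.0 - q)) ** (-xi) - 1.0)


# Hypothetical inputs: threshold u, fitted GPD parameters, and sample counts.
u, xi, beta, n, n_u = 2.0, 0.2, 0.8, 5000, 250

var_99 = pot_var(0.99, u, xi, beta, n, n_u)
print(var_99)
# Plugging the VaR back into the tail estimator should return 1 - q = 0.01.
print(pot_tail_probability(var_99, u, xi, beta, n, n_u))
```

The guard in `pot_var` reflects the condition $q > F(u)$: the formula is only meant for quantiles that lie above the threshold.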
2.1.4 Block Maxima Method Approach to VaR
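Assume that there are $T$ observations of an asset return available in the sample period. We divide the sample period into $k$ non-overlapping blocks of length $n$. If $T = kn + m$ with $1 \le m < n$, then we delete the first $m$ observations from the sample. Now, plugging the maximum likelihood estimates into equation (1.5) with $x = (r - \mu_n)/\sigma_n$, we get the quantile for a given probability of the generalized extreme value distribution. Let $p^*$ be a small upper tail probability that expresses the potential loss and let $r_n^*$ be the $(1 - p^*)$-th quantile of the block maxima under the limiting generalized extreme value distribution. Then we have
$$1 - p^* = \begin{cases} \exp\left[-\left(1 + \dfrac{\xi_n (r_n^* - \mu_n)}{\sigma_n}\right)^{-1/\xi_n}\right], & \xi_n \neq 0, \\[2ex] \exp\left[-\exp\left(-\dfrac{r_n^* - \mu_n}{\sigma_n}\right)\right], & \xi_n = 0. \end{cases}$$
Under the independence assumption, the block maximum does not exceed $r_n^*$ exactly when none of the $n$ returns in the block does, so
$$1 - p^* = [P(r_t \le r_n^*)]^n. \qquad (2.16)$$
This relationship between probabilities allows us to obtain VaR from the original asset return series $r_t$. For a small upper tail probability $p$, the $(1 - p)$-th quantile of $r_t$ is $r_n^*$ if the upper tail probability $p^*$ of the block maxima is chosen based on equation (2.16), where $P(r_t \le r_n^*) = 1 - p$. Therefore, for a given small upper tail probability $p$, the VaR of a financial position with log return $r_t$ is
$$VaR = \begin{cases} \mu_n - \dfrac{\sigma_n}{\xi_n} \left\{1 - [-n \ln(1 - p)]^{-\xi_n}\right\}, & \xi_n \neq 0, \\[2ex] \mu_n - \sigma_n \ln[-n \ln(1 - p)], & \xi_n = 0. \end{cases} \qquad (2.17)$$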
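A compact way to see how this works is to code the quantile and check it against the GEV df. The sketch assumes independent returns, so that the block maximum of $n$ returns satisfies $1 - p^* = (1 - p)^n$, and then solves the GEV equation for the quantile, which gives $\mu_n - (\sigma_n/\xi_n)\{1 - [-n \ln(1-p)]^{-\xi_n}\}$ for $\xi_n \neq 0$ and $\mu_n - \sigma_n \ln[-n \ln(1-p)]$ for $\xi_n = 0$. The function names and the parameter values (block length 21, $\xi = 0.3$, $\mu = 2.0$, $\sigma = 0.9$) are hypothetical, not fitted values from the thesis.

```python
import math


def gev_cdf(x, xi, mu, sigma):
    """GEV df H_{xi,mu,sigma}(x)."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def block_maxima_var(p, n, xi, mu, sigma):
    """(1 - p)-quantile of r_t implied by a GEV fit to maxima of blocks of length n."""
    c = -n * math.log(1.0 - p)      # equals -ln(1 - p*), since 1 - p* = (1 - p)**n
    if xi == 0.0:
        return mu - sigma * math.log(c)
    return mu - sigma / xi * (1.0 - c ** (-xi))


# Hypothetical GEV estimates for maxima of blocks of n = 21 daily returns.
p, n, xi, mu, sigma = 0.01, 21, 0.3, 2.0, 0.9
var = block_maxima_var(p, n, xi, mu, sigma)
print(var)
# Consistency check: the GEV df at the VaR level should equal (1 - p)**n.
print(gev_cdf(var, xi, mu, sigma), (1.0 - p) ** n)
```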
2.1.5 Multiperiod VaR
The square root of time rule of the RiskMetrics methodology becomes a special case under extreme value theory. The proper relationship between $l$-day and 1-day horizons is
$$VaR(l) = l^{1/\sigma}\, VaR = l^{\xi}\, VaR. \qquad (2.18)$$
Here, $\sigma$ is the tail index and $\xi$ is the shape parameter of the extreme value distribution. This relationship is referred to as the $\sigma$-root of time rule; the square root of time rule corresponds to $\sigma = 2$, i.e. $\xi = 1/2$. Note that here $\sigma = 1/\xi$; it is not the scale parameter $\sigma_n$ in equation (2.17).
2.1.6 Return Levels
When examining extreme data, we are often concerned with questions such as: "How often can we expect to observe extreme events?", "How often do we expect stock prices to fall?" or "How low will they fall?" In order to answer these questions, we need to estimate the return levels for a given return period. The return period is the amount of time we expect to wait before recording an extreme event, and the return level expresses the intensity of the event that occurs within that period. $R_{n,k}$, the $k$ $n$-block return level, is a quantile of the distribution of the block maximum $M_n$, defined by
$$P(M_n > R_{n,k}) = \frac{1}{k}.$$
It is the level expected to be exceeded in one out of $k$ periods of length $n$ on average.
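If the block maxima are modeled by a fitted GEV distribution $H_{\xi,\mu,\sigma}$, the return level is simply its $(1 - 1/k)$-quantile, which can be written in closed form. The sketch below assumes exactly this model; the function name `return_level` and the parameter values are hypothetical.

```python
import math


def return_level(k, xi, mu, sigma):
    """Level exceeded by the block maximum in one block out of k on average (k > 1)."""
    y = -math.log(1.0 - 1.0 / k)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Hypothetical GEV parameters for, e.g., maxima of monthly blocks.
xi, mu, sigma = 0.3, 2.0, 0.9
r20 = return_level(20, xi, mu, sigma)
r40 = return_level(40, xi, mu, sigma)
print(r20, r40)

# Check: the GEV df evaluated at the 20-block return level should be 1 - 1/20.
t = 1.0 + xi * (r20 - mu) / sigma
print(math.exp(-t ** (-1.0 / xi)))
```

A longer return period always gives a higher return level, and for $\xi > 0$ the level grows like a power of $k$, which is another way of seeing the heavy tail.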