Supervisor: A. E. Kyprianou
Abstract

The aim of this paper is to introduce the reader to Lévy processes in a formal and rigorous manner. The paper will be analysis based and no probability knowledge is required, though it will certainly be a tough read in that case. We aim to prove some important theorems that define the structure of Lévy processes.

The first two chapters are to reacquaint the reader with measure theory and characteristic functions, after which the topic will swiftly move on to infinitely divisible random variables. We will prove the Lévy canonical representation. Then we will go on to prove the existence of Brownian motion and some of its properties, after which we will briefly talk about Poisson processes and measures.

The final chapter is dedicated to Lévy processes, in which we will prove three important theorems: the Lévy-Khintchine representation, the Lévy-Itô decomposition, and the points of increase for Lévy processes.

Keywords: Brownian Motion, Poisson Processes, Lévy Processes, Infinitely Divisible Distributions, Lévy-Itô Decomposition, Lévy-Khintchine Representation, Points of Increase
Contents

Acknowledgements
Introduction
1 Preliminaries
  1.1 Measure Theory
  1.2 Integration
  1.3 Convergence
2 Characteristic Functions
  2.1 Basic Properties
  2.2 Examples
3 Infinitely Divisible Random Variables
  3.1 Definitions
  3.2 Properties
4 Brownian Motion
  4.1 Definition and Construction
    4.1.1 Interval [0, 1]
    4.1.2 Extension to [0, ∞)
  4.2 Properties
5 Poisson Processes
  5.1 Poisson Processes
  5.2 Poisson Measures
6 Lévy Processes
  6.1 Definitions
  6.2 Representations
    6.2.1 Lévy-Khintchine representation
    6.2.2 Lévy-Itô decomposition
  6.3 Strong Markov Property
  6.4 Points of Increase
Acknowledgements

First and foremost, my deepest admiration and gratitude go to Andreas Kyprianou, to whom I owe all of my current knowledge in probability. His support and enthusiasm have always been a source of inspiration for me, and I doubt I can find a measure space which his support would be in. I hope that this project has done justice to the effort he has invested in me.

I would also like to thank Juan Carlos for being very supportive and taking time out to talk to me about his research and interests. He has pointed me towards interesting areas of probability and Lévy processes.

I also have to thank Andrew MacPherson, Daria Gromyko and Laura Hewitt for putting up with me constantly talking about my project and giving me valuable feedback.

Last but not least, I wish to express my gratitude to Akira Sakai and Adam Kinnison for being a huge source of inspiration for me. If it were not for these people, I would not have been studying probability.
Introduction

The study of Lévy processes began in 1930, though the name did not come along until later in the century. These processes are a generalisation of many stochastic processes that are around, prominent examples being Brownian motion, the Cauchy process and the compound Poisson process. These have some common features: they are all right continuous with left limits, and they all have stationary independent increments. These properties give a rich underlying understanding of the processes and also allow very general statements to be made about many of the familiar stochastic processes.

The field owes many things to the early works of Paul Lévy, Alexander Khintchine, Kiyosi Itô and Andrey Kolmogorov. There is a lot of active research in Lévy processes, and this paper leads naturally to subjects such as fluctuation theory, self-similar Markov processes and stable processes.

We will assume no prior knowledge of probability throughout the paper. The reader is assumed to be comfortable with analysis, in particular L^p spaces and measure theory. The first chapter will brush over these as a reminder.
Notation
x_n ↓ x will denote a decreasing sequence x_1 ≥ x_2 ≥ ⋯ such that x_n → x, and similarly x_n ↑ x will denote an increasing sequence x_1 ≤ x_2 ≤ ⋯ with x_n → x. For a function f, f(x+) will be shorthand for lim_{y↓x} f(y) and f(x−) will mean lim_{y↑x} f(y). By R_+ we mean the set of non-negative real numbers, and R̄ = R ∪ {∞, −∞} is the extended real line. We will also be using the convention that inf ∅ = ∞.

We will denote the power set of a set Ω by P(Ω). The order (or usual) topology on R is the topology generated by sets of the form (a, b). We will often abbreviate limit suprema: lim sup_n A_n := lim_{n↑∞} sup_{k≥n} A_k. The notation ∂B, where B is a set, will be used to mean the boundary of B.

A càdlàg (continue à droite, limite à gauche) function is one that is right continuous with left limits. Unless specified otherwise, we will follow the convention that N, L, B (or W) will be Poisson, Lévy, and Wiener processes respectively. We will use X when we are talking about a general process or random variable.
Chapter 1
Preliminaries
“The theory of probability as a mathematical discipline can and should be developed from axioms in exactly the same way as geometry and algebra.”
- Andrey Kolmogorov
The aim of this chapter is to familiarise the reader with the basic aspects of measure theory. We will not rely heavily on measure theory in this paper; it is, however, essential to have a basic grasp of the concepts in order to do probability.

1.1 Measure Theory
Definition 1.1.1. A σ-algebra F on a set Ω is a collection of subsets of Ω such that

(i) ∅ ∈ F and Ω ∈ F;

(ii) A ∈ F ⟹ A^c ∈ F;

(iii) {A_n}_{n∈N} ⊂ F ⟹ ⋃_{n∈N} A_n ∈ F.

We call the pair (Ω, F) a measurable space.
From this we can use de Morgan's laws to deduce that a σ-algebra is also closed under countable intersection. The elements of a σ-algebra can be viewed as events, Ω being the complete event (in the sense that it is the event “something happens”). It is clear that if we have an event A, then we also have the event of A not happening. Finite intersections and unions may also be justified in terms of events; the sole reason for the countable unions and intersections is, however, the purposes of analysis.
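Explicitly, for {A_n}_{n∈N} ⊂ F,

\[
\bigcap_{n \in \mathbb{N}} A_n \;=\; \Bigl( \bigcup_{n \in \mathbb{N}} A_n^{c} \Bigr)^{c} \in \mathcal{F},
\]

since each A_n^c lies in F by (ii), the countable union then lies in F by (iii), and the final complement lies in F by (ii) again.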
A simple question would be how to obtain a σ-algebra from a given collection of subsets.

Proposition 1.1.2. Let T be a collection of subsets of Ω; then there exists a smallest σ-algebra B such that T ⊂ B.

Proof. Take the intersection of all the σ-algebras that contain T (there is at least one such σ-algebra, namely P(Ω)). This intersection is also a σ-algebra (a fact that the reader may want to confirm for themselves) and thus the smallest containing T.
Definition 1.1.3. A Borel set B ∈ B(X) is an element of the smallest σ-algebra on X generated by the open sets of X.

We wish to somehow assign a likelihood to each event. To do so we must define a map from the σ-algebra to the reals.
Definition 1.1.4. A measure on a measurable space (Ω, F) is a function µ : F → R_+ ∪ {∞} such that if A_1, A_2, … are disjoint elements of F then

\[
\mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) \;=\; \sum_{n=1}^{\infty} \mu(A_n).
\]
A finite measure is a measure µ such that µ(Ω) < ∞, and a σ-finite measure is a measure µ for which there exists a sequence {Ω_n}_{n=1}^∞ with Ω_n ↑ Ω such that µ(Ω_n) < ∞ for each n ∈ N.

A probability measure P is a measure with P(Ω) = 1.
Definition 1.1.5. A measure space (Ω, F, µ) is a measurable space (Ω, F) with a measure µ defined on it.

A probability space (Ω, F, P) is a measurable space (Ω, F) with a probability measure P defined on it.

A µ-null set of a measure space is a set A ∈ F such that µ(A) = 0. We will sometimes simply say null set where the measure is obvious from the context.

In a measure space a property holds almost everywhere if the set of points at which the property does not hold is a µ-null set. In probability spaces this is also known as almost surely, which is the same as saying the event happens with probability one.
Definition 1.1.6. We say that A, B ∈ F are independent on a probability space (Ω, F, P) if P(A ∩ B) = P(A)P(B).

Now we look at a basic theorem about measures.
Theorem 1 (Monotone Convergence Theorem for Measures). Suppose that (Ω, F, µ) is a measure space and {B_n}_{n=1}^∞ ⊂ F is a monotone sequence of sets converging to B (in the decreasing case we additionally assume µ(B_1) < ∞); then

\[
\mu(B) \;=\; \lim_{n \to \infty} \mu(B_n).
\]
The term we shall use is infinitely often, abbreviated to i.o. This is a shorthand way of saying lim sup, i.e. {A_n i.o.} = lim sup_n A_n. The reason for this terminology is that an element of the lim sup must occur in infinitely many of the sets A_n.
Using the Monotone Convergence Theorem, we will prove a very important theorem. This will be in heavy use in dealing with Brownian motion when we prove things to do with limits.

Theorem 2 (Borel-Cantelli Lemma). On a probability space (Ω, F, P) let A_1, A_2, … ∈ F. If

\[
\sum_{n=1}^{\infty} \mathbb{P}(A_n) < \infty,
\]

then P(lim sup_n A_n) = 0.

Proof. Notice that lim sup A_n = ⋂_{i=1}^∞ ⋃_{n=i}^∞ A_n. Define B_i = ⋃_{n=i}^∞ A_n. Now from the subadditivity of the measure we have that P(B_i) ≤ Σ_{n=i}^∞ P(A_n). By the assumption Σ_{n=1}^∞ P(A_n) < ∞, and therefore P(B_i) → 0 as i → ∞. Hence as n → ∞, P(⋂_{i=1}^n B_i) → P(⋂_{i=1}^∞ B_i) = 0 by the Monotone Convergence Theorem.
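As a quick numerical illustration of the lemma (a sketch of ours, not part of the formal development; it assumes Python with numpy, and the events A_n = {U_n < n^{−2}} are our own choice): with independent uniforms U_n we have Σ_n P(A_n) = Σ_n n^{−2} < ∞, so almost surely only finitely many A_n occur, however far we look.

```python
import numpy as np

rng = np.random.default_rng(0)

# Events A_n = {U_n < 1/n^2} with independent uniforms U_n.
# sum_n P(A_n) = sum_n 1/n^2 < infinity, so by Borel-Cantelli only
# finitely many A_n occur almost surely.
n_max = 10_000
n = np.arange(1, n_max + 1)

trials = 1_000
counts = np.empty(trials)
for t in range(trials):
    u = rng.random(n_max)                        # U_1, ..., U_{n_max}
    counts[t] = np.count_nonzero(u < 1.0 / n**2)

# The number of occurrences stays small and does not grow with n_max;
# its mean is close to sum_n 1/n^2 = pi^2/6 ~ 1.64.
print("mean occurrences:", counts.mean())
print("max  occurrences:", counts.max())
```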
Now that we have a framework for probability, we need to look at more interesting things than just events. The following is a formal definition of a random variable.
Definition 1.1.7. A function f is said to be measurable if f : Ω → Y, where (Ω, F) is a measurable space, Y is a topological space, and for any open set U ⊂ Y we have that f^{−1}(U) ∈ F.

Definition 1.1.8. A random variable X on a probability space (Ω, F, P) is a measurable function.

Note that this is a very general definition. In all our cases, the random variables will be R^d valued; that is, they will map to R^d with the usual topology. Measurability is an important concept, as it allows us to assign probabilities to random variables.

Measurability is not really as strong a notion as we would like. Sets such as (a, b] are not open in R, hence we do not know whether their pre-images are in the σ-algebra. The next definition will become very useful for us.
Definition 1.1.9. A function is said to be Borel measurable if f : Ω → Y, where (Ω, F) is a measurable space, Y is a topological space, and for any Borel set B ∈ B(Y) we have that f^{−1}(B) ∈ F.

We will always be assuming our random variables are Borel measurable.

Notice that a random variable X induces a measure on (R^d, B(R^d)) by the composition P ∘ X^{−1}, as X^{−1} : B(R^d) → F and P : F → R. This is known as the distribution or law of X. Now we introduce some probabilistic abuses of notation, which are usually the most confusing part of probability. For a random variable X, P(X ∈ B) is shorthand for P(X^{−1}(B)) where B ∈ B(R^d). The distribution, unless otherwise specified, will be denoted by P(X ∈ dx).
The following are some examples of important random variables. These will play an important role later on, so it is essential to become familiar with them.
Example 1.1.10. An R^d valued Normal or Gaussian random variable² X on (Ω, F, P) has a distribution of the form

\[
\mathbb{P}(X \in dx) \;=\; \frac{1}{(2\pi)^{d/2} \, |\Sigma|^{1/2}} \exp\Bigl( -\tfrac{1}{2} (x - \mu)^{T} \Sigma^{-1} (x - \mu) \Bigr) \, dx,
\]

where µ ∈ R^d and Σ is a positive definite real d × d matrix. It is denoted N_d(µ, Σ).

In the case of R (which we will be using) it is of the form

\[
\mathbb{P}(X \in dx) \;=\; \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl( -\frac{(x - \mu)^2}{2\sigma^2} \Bigr) \, dx,
\]

where µ, σ ∈ R. This is denoted N(µ, σ²).
We can also have discrete measure spaces, which give rise to discrete random variables.

Example 1.1.11. A Poisson random variable N is a discrete random variable on a discrete measure space (Ω, F, P). It can be described by

\[
\mathbb{P}(N \in \{k\}) \;=\; e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots,
\]

where λ > 0 is called the parameter. The measure of the Poisson random variable is atomic; that is, it assigns values to singleton sets. A Poisson random variable with parameter λ is commonly denoted Pois(λ).
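To make the two examples concrete, here is a small simulation sketch (ours; it assumes Python with numpy, and the parameter values are arbitrary) checking the stated parameters against empirical moments:

```python
import numpy as np

rng = np.random.default_rng(42)

# N(mu, sigma^2): empirical mean and variance should match mu and sigma^2.
mu, sigma = 1.5, 2.0
x = rng.normal(loc=mu, scale=sigma, size=1_000_000)
print(x.mean(), x.var())   # ~ 1.5 and ~ 4.0

# Pois(lam): for a Poisson random variable both the mean and the variance
# equal the parameter lambda.
lam = 3.0
k = rng.poisson(lam=lam, size=1_000_000)
print(k.mean(), k.var())   # ~ 3.0 and ~ 3.0
```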
We can also collect together random variables to model how something evolves with time. This yields the next definition.

Definition 1.1.12. A stochastic process is a family of random variables {X_t, t ∈ I}.

Examples of stochastic processes will be the main concern over the next few chapters of the paper.
1.2 Integration
We will brush over some integration theory; for a detailed outline the reader is referred to Ash (1972) or Billingsley (1979), which are two of the many books that deal with this subject. The next theorem will become useful later on when we look at integration over product spaces. The theorem will not be proved; a proof can be found in any modern probability or measure theory book.

Theorem 3 (Fubini's Theorem). Let (Ω_1, F_1, µ_1) and (Ω_2, F_2, µ_2) be σ-finite measure spaces and let f be a measurable function on the product space with ∫ |f| d(µ_1 × µ_2) < ∞. Then

\[
\int_{\Omega_1} \int_{\Omega_2} f \, d\mu_2 \, d\mu_1
\;=\; \int_{\Omega_2} \int_{\Omega_1} f \, d\mu_1 \, d\mu_2
\;=\; \int_{\Omega_1 \times \Omega_2} f \, d(\mu_1 \times \mu_2).
\]

² This is usually called the multivariate normal distribution.
Now we define some central operators on probability spaces.

Definition 1.2.1. The expectation of an R^d valued random variable X, denoted E[X], is defined by

\[
\mathbb{E}[X] \;=\; \int_{\mathbb{R}^d} x \, \mathbb{P}(X \in dx).
\]

The covariance of two random variables X, Y on R^d is defined as

\[
\mathrm{Cov}(X, Y) \;=\; \mathbb{E}\bigl[ (\mathbb{E}[X] - X)(\mathbb{E}[Y] - Y) \bigr].
\]

The variance of X is Var(X) = Cov(X, X).

Intuitively, expectation is what is usually referred to as the average. Variance is the amount by which the random variable is spread around the mean; low variance means that the spread is tight around the mean. Notice that E is a linear function, and also that if two random variables are independent, then they have zero covariance.
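The last claim is a one-line computation. Expanding the product and using linearity,

\[
\mathrm{Cov}(X, Y) \;=\; \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y],
\]

and for independent X and Y the expectation factorises as E[XY] = E[X]E[Y], so the covariance vanishes. (The converse is false in general: zero covariance does not imply independence.)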
1.3 Convergence
Recall the standard modes of convergence: X_n → X almost surely if P(lim_{n→∞} X_n = X) = 1; X_n → X in probability if P(|X_n − X| > ε) → 0 for every ε > 0; and X_n → X in distribution (written X_n →^D X) if P(X_n ≤ x) → P(X ≤ x) at every continuity point x of the limit. There is a subtle difference between almost sure convergence and convergence in probability; almost sure convergence is the stronger statement. The reader can verify that almost sure convergence implies convergence in probability, which in turn implies convergence in distribution.
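A standard counterexample separating the first two modes (our illustration; it uses the converse half of the Borel-Cantelli lemma, which we have not stated): let {A_n} be independent events with P(A_n) = 1/n and set X_n = 1_{A_n}. Then for any ε ∈ (0, 1),

\[
\mathbb{P}(|X_n| > \varepsilon) \;=\; \mathbb{P}(A_n) \;=\; \tfrac{1}{n} \;\to\; 0,
\]

so X_n → 0 in probability; but Σ_n P(A_n) = ∞, and for independent events with divergent probability sums we have X_n = 1 infinitely often almost surely, so X_n does not converge to 0 almost surely.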
Now we define convergence of measures. This will play an important part in Chapter 3, where we discuss infinitely divisible measures.

Definition. A sequence of probability measures {µ_n} on a metric space Ω converges weakly to µ if any of the following equivalent conditions hold:

(i) For each bounded continuous f : Ω → R, ∫ f dµ_n → ∫ f dµ.

(ii) For each closed F ⊂ Ω, lim sup µ_n(F) ≤ µ(F).

(iii) For each open U ⊂ Ω, lim inf µ_n(U) ≥ µ(U).

With the basic tools we have, we may begin to characterise probability spaces and random variables.
Chapter 2
Characteristic Functions
“Pure mathematics is the world's best game. It is more absorbing than chess, more of a gamble than poker, and lasts longer than Monopoly. It's free. It can be played anywhere - Archimedes did it in a bathtub.”
- Richard J. Trudeau
2.1 Basic Properties
In this section we will be assuming that X is a random variable on a probability space (Ω, F, P). The aim of this chapter is to give a basic introduction to characteristic functions. We shall not be proving most statements here; for a formal approach to this subject, we refer the reader to Lukacs (1970) or Moran (1984).

In mathematics, Fourier transforms can reduce complicated tasks to simpler ones. We can likewise use the Fourier transform of a distribution function to simplify expressions. For some random variables the distribution cannot be known explicitly, whereas the characteristic function often can.
Definition 2.1.1. The characteristic function ψ of X is defined by

\[
\psi(\theta) \;=\; \int_{\mathbb{R}} e^{i\theta x} \, \mathbb{P}(X \in dx),
\]

and the function log ψ is referred to as the characteristic exponent of X.
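As a quick worked example (ours; it will implicitly reappear in Chapter 3): for a Pois(λ) random variable N the integral reduces to a sum, which telescopes into an exponential,

\[
\psi_N(\theta) \;=\; \sum_{k=0}^{\infty} e^{i\theta k} \, e^{-\lambda} \frac{\lambda^k}{k!}
\;=\; e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{i\theta})^k}{k!}
\;=\; e^{\lambda (e^{i\theta} - 1)},
\]

so the characteristic exponent is λ(e^{iθ} − 1).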
This next theorem will play an important role in this paper. It describes the basic properties of sequences of characteristic functions. We will be using it heavily in the forthcoming chapters, so it is important to keep in mind the equivalences stated in this theorem.
Theorem 4 (Lévy Continuity Theorem). Let {X_n : n = 1, 2, …} be a sequence of random variables (not necessarily on the same probability space) with characteristic functions ψ_n. If ψ_n → ψ pointwise, then the following are equivalent:

(i) ψ is the characteristic function of some random variable X;

(ii) ψ is the characteristic function of X, where X_n →^D X;

(iii) ψ is continuous;

(iv) ψ is continuous in some neighbourhood of 0.

For the proof see Fristedt and Gray (1997).
To see which functions are characteristic functions we need a theorem from analysis.

Theorem 5 (Bochner). A function ψ : R → C is a characteristic function if and only if the following hold:

(i) ψ(0) = 1;

(ii) ψ is continuous;

(iii) ψ is positive semi-definite, that is, Σ_{j,k=1}^n ψ(θ_j − θ_k) z_j z̄_k ≥ 0 for all n ∈ N, θ_1, …, θ_n ∈ R and z_1, …, z_n ∈ C.

2.2 Examples
We will be using characteristic functions in the next chapter to work with a general class of random variables. It is essential that we get familiar with some solid examples of characteristic functions beforehand. These will be for the most common random variables.
Example 2.2.1. The characteristic function of a N(µ, σ²) random variable X is

\[
\psi(\theta) \;=\; \exp\Bigl( i\mu\theta - \frac{\sigma^2 \theta^2}{2} \Bigr).
\]

This follows by completing the square in the exponent of the Gaussian density: the exponent of the integrand can be written as

\[
-\frac{\bigl( x - (\mu + i\sigma^2\theta) \bigr)^2}{2\sigma^2} \;+\; i\mu\theta - \frac{\sigma^2\theta^2}{2},
\]

and the first term integrates out against the normalising constant.
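This formula is easy to sanity-check by simulation (a sketch of ours, assuming Python with numpy; the parameter values are arbitrary), comparing the empirical characteristic function with exp(iµθ − σ²θ²/2):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, theta = 0.5, 1.3, 0.9

# Empirical characteristic function E[exp(i * theta * X)] by Monte Carlo.
x = rng.normal(mu, sigma, size=500_000)
empirical = np.mean(np.exp(1j * theta * x))

predicted = np.exp(1j * mu * theta - sigma**2 * theta**2 / 2)
print(empirical)   # agrees with predicted up to Monte Carlo error
print(predicted)
```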
Example 2.2.2. Let N be a Pois(λ) random variable and {ξ_n}_{n≥1} an i.i.d.¹ sequence of random variables with common distribution F, independent of N. We can define a new random variable X by

\[
X \;=\; \sum_{n=1}^{N} \xi_n.
\]

Now we can compute the characteristic function of X. Conditioning on N,²

\[
\mathbb{E}\bigl[ e^{i\theta X} \mid N \bigr]
\;=\; \mathbb{E}\Bigl[ e^{i\theta \sum_{n=1}^{N} \xi_n} \,\Big|\, N \Bigr]
\;=\; \mathbb{E}\bigl[ e^{i\theta \xi_1} \bigr]^{N},
\]

so that E[e^{iθX}] = E[ E[e^{iθξ_1}]^N ]. The proof of this is simple and is left as an exercise to the reader. Now we need the probability generating function of the Poisson random variable in order to obtain an analytic expression for the characteristic function of X. The probability generating function is given by E[s^N] = e^{λ(s−1)}, whence

\[
\mathbb{E}\bigl[ e^{i\theta X} \bigr]
\;=\; e^{\lambda \left( \mathbb{E}[e^{i\theta \xi_1}] - 1 \right)}
\;=\; e^{\lambda \int_{\mathbb{R}} (e^{i\theta x} - 1) \, F(dx)}.
\]

This will play an important role later when we introduce the compound Poisson process.
¹ Independent, identically distributed, i.e. they all have the same distribution and are (pairwise) independent.

² We have not yet defined conditional expectation; we will deal with it later. The expectation here can be defined as E[A | B] = E[A 1_B]/P(B).
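The compound Poisson formula is just as easy to check numerically. The following sketch (ours; it assumes Python with numpy and takes F to be the standard normal distribution) compares the empirical characteristic function of X = Σ_{n≤N} ξ_n with e^{λ(E[e^{iθξ}]−1)}:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, theta = 2.5, 0.7
samples = 200_000

# X = xi_1 + ... + xi_N with N ~ Pois(lam) and xi_n i.i.d. N(0, 1).
# Given N = n, a sum of n i.i.d. N(0, 1) variables is N(0, n), so we
# may sample X in one shot as sqrt(N) * Z with Z standard normal.
N = rng.poisson(lam, size=samples)
X = np.sqrt(N) * rng.standard_normal(samples)

empirical = np.mean(np.exp(1j * theta * X))

# E[e^{i theta xi}] = exp(-theta^2 / 2) for xi ~ N(0, 1), hence:
predicted = np.exp(lam * (np.exp(-theta**2 / 2) - 1.0))

print(empirical)   # ~ predicted up to Monte Carlo error
print(predicted)
```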
Chapter 3
Infinitely Divisible Random Variables

3.1 Definitions

Definition 3.1.1. A random variable X is said to be infinitely divisible if for each n ∈ N there exists a collection {X_i^{(n)}}_{i=1}^n of i.i.d. random variables such that

\[
X \;\stackrel{d}{=}\; X_1^{(n)} + X_2^{(n)} + \cdots + X_n^{(n)}.
\]

Alternatively, one may define infinite divisibility through the law µ of a random variable:

\[
\mu \;=\; \bigl( \lambda^{(n)} \bigr)^{*n} \;=\; \underbrace{\lambda^{(n)} * \lambda^{(n)} * \cdots * \lambda^{(n)}}_{n \ \text{times}},
\]

where λ^{(n)} is the law of some random variable.
Most distributions one encounters in everyday life are infinitely divisible. This class of random variables covers a wide range of properties. A prominent example is the normal random variable.
Example 3.1.2. A N(µ, σ²) random variable X is infinitely divisible, with the distributions X^{(n)} ∼ N(µ/n, σ²/n). This is easily seen from the characteristic function of X:

\[
\exp\bigl( i\theta\mu - \theta^2\sigma^2/2 \bigr) \;=\; \Bigl( \exp\bigl( i\theta\mu/n - \theta^2\sigma^2/2n \bigr) \Bigr)^{n}.
\]
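The Poisson distribution gives an equally clean example: using the Poisson characteristic function e^{λ(e^{iθ}−1)} computed in Chapter 2,

\[
e^{\lambda (e^{i\theta} - 1)} \;=\; \Bigl( e^{(\lambda/n)(e^{i\theta} - 1)} \Bigr)^{n},
\]

so a Pois(λ) random variable is equal in distribution to a sum of n i.i.d. Pois(λ/n) random variables.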
3.2 Properties
As it turns out, infinitely divisible random variables can be represented in an elegant way through their characteristic functions. The aim of this section is to establish this result. We will be combining the approaches of Moran (1984) and Lukacs (1970).

We approach the problem in the same manner as Paul Lévy did in 1934. The construction is done via a sequence of Poisson-like random variables whose limit gives the characteristic function of infinitely divisible random variables.
To obtain this result we wish to write the characteristic function of an infinitely divisible distribution as ψ = e^{log ψ}, which is valid as long as ψ is never zero (as the log function is not defined at zero). After this we can go on to find that the characteristic function will be the limit of

\[
e^{k \left( \psi^{1/k} - 1 \right)}
\]

by the definition of the logarithm.
Proposition 3.2.1. The characteristic function of an infinitely divisible random variable has no zeros.

Proof. Let ψ be the characteristic function and, for each n, let ψ^{(n)} be a characteristic function with (ψ^{(n)})^n = ψ. Then φ_n := |ψ^{(n)}|² = |ψ|^{2/n} is again a characteristic function, and φ_n converges pointwise to φ = 1_{{ψ ≠ 0}}. Since ψ(0) = 1 and ψ is continuous, φ = 1 in a neighborhood of 0. Applying the Lévy continuity theorem we conclude that φ is continuous and thus φ ≡ 1, hence ψ has no zeros.
Now we can go on to a theorem that will be very crucial in proving the main result.

Theorem 6. A characteristic function that is the limit of characteristic functions of infinitely divisible distributions is itself infinitely divisible.

Proof. Let {f^{(n)}} be a sequence of infinitely divisible characteristic functions that converge to a characteristic function f. Then for each n, k ∈ N there exists f_k^{(n)} such that (f_k^{(n)})^k = f^{(n)}. Each f_k^{(n)} is also infinitely divisible and so by Proposition 3.2.1 has no zeros. Hence we may infer that

\[
f_k^{(n)} \;=\; e^{\frac{1}{k} \log f^{(n)}}.
\]

So we have that lim_{n→∞} f_k^{(n)} = e^{(1/k) log f} = f^{1/k} is the limit of characteristic functions, and as f is continuous so is f^{1/k}; the Lévy Continuity Theorem then tells us that f^{1/k} is a characteristic function. Thus f is infinitely divisible.
Theorem 7 (De Finetti’s Theorem) The characteristic function ψ of a random variable isinfinitely divisible if and only if
ψ(θ) = lim
n→∞epn (g n (θ)−1)
for some pn> 0 and gn, where gn are characteristic functions
Proof Suppose that ψ(θ) is infinitely divisible Let pn= n and gn = ψ1n, then as ψ has no zeros(by Proposition 3.2.1) and it follows that
is a characteristic function of an infinitely divisible distribution for each n ∈ N
Hence passing to the limit as n tends to infinity gives
Trang 163.2 Properties Infinitely Divisible Random Variables 10
Corollary 3.2.2. A characteristic function is that of an infinitely divisible distribution if and only if it is the limit of Poisson-like characteristic functions.

Proof. From De Finetti's Theorem we have that ψ is the characteristic function of an infinitely divisible distribution if and only if

\[
\psi(\theta) \;=\; \lim_{n \to \infty} e^{n \int_{\mathbb{R}} (e^{i\theta x} - 1) \, G_n(dx)},
\]

where G_n is the measure of a Poisson-like random variable.

Theorem 8 (Lévy Canonical Representation). A function ψ is the characteristic function of an infinitely divisible distribution if and only if

\[
\psi(\theta) \;=\; \exp\biggl( i a \theta + \int_{\mathbb{R}} \Bigl( e^{i\theta x} - 1 - \frac{i\theta x}{1 + x^2} \Bigr) \frac{1 + x^2}{x^2} \, H(dx) \biggr)
\]

for some a ∈ R and some bounded, non-decreasing function H.

Proof. For the necessity, write ψ_n(θ) = e^{n ∫_R (e^{iθx} − 1) G_n(dx)} and λ_n = log ψ_n, and define

\[
a_n \;=\; n \int_{\mathbb{R}} \frac{x}{1 + x^2} \, G_n(dx),
\qquad
H_n(x) \;=\; n \int_{-\infty}^{x} \frac{y^2}{1 + y^2} \, G_n(dy),
\]

together with the measure R_n given by the smoothed exponent

\[
\lambda_n(\theta) - \frac{1}{2} \int_{-1}^{1} \lambda_n(\theta + u) \, du \;=\; \int_{\mathbb{R}} e^{i\theta x} \, R_n(dx).
\]
Now we can reverse the order of integration using Fubini's Theorem to get

\[
R_n(dx) \;=\; \Bigl( 1 - \frac{\sin x}{x} \Bigr) \frac{1 + x^2}{x^2} \, H_n(dx).
\]
As ψ is continuous, we have that λ_n converges to a continuous function. Hence we can conclude¹ that R_n converges weakly to a bounded and non-decreasing function R; that is to say, for all f ∈ C_#,²

\[
\int_{\mathbb{R}} f \, dR_n \;\to\; \int_{\mathbb{R}} f \, dR.
\]

In particular, for each g ∈ C_#, the function (1 − sin x/x)^{−1} · x²/(1 + x²) · g(x) is continuous and also vanishes at ∞ and −∞.³ Hence H_n converges weakly to some distribution function H, and by the same argument nG_n converges weakly to some G. Thus we have that

\[
a_n \;=\; n \int_{\mathbb{R}} \frac{x}{1 + x^2} \, G_n(dx)
\]

converges to some a ∈ R. This satisfies (3.2.1), (3.2.2) and (3.2.3).

The sufficiency is an application of Corollary 3.2.2 to (3.2.4).
The representation in the last theorem is unique up to distribution by the properties of the Fourier transform. We will for now leave this as it is and return to it in Chapter 6, where we will be talking about Lévy processes.
¹ For a proof of this see (Moran, 1984, p. 252, Theorem 6.3).

² C_# is the set of R-valued continuous functions that vanish at ∞ and −∞.

³ Notice that (1 − sin x/x) → 1 and x²/(1 + x²) → 1 as x tends to ±∞; thus the function defined vanishes at ±∞ as g vanishes.
Chapter 4
Brownian Motion

“One cannot escape the feeling that these mathematical formulas have an independent existence and an intelligence of their own, that they are wiser than we are, wiser even than their discoverers.”
- Heinrich Hertz
4.1 Definition and Construction
Brownian motion is one of the most interesting stochastic processes around. It possesses a wealth of properties and hence has been the focus of study for a long time. The idea of Brownian motion, sometimes known as the Wiener process, is modelled after the physical phenomenon of a smoke particle moving about in air. It was Brown who discovered that it was the air particles that produced this seemingly random motion, and Norbert Wiener who mathematically formalised it. Imagine a particle of smoke being bombarded by particles of air; the seemingly random motion that this smoke particle exhibits is called Brownian motion.
Definition 4.1.1 (Brownian Motion). A stochastic process W_t is called a Brownian motion or a Wiener process if it satisfies the following properties:

(i) W_0 = 0 almost surely;

(ii) for 0 ≤ t_1 ≤ ⋯ ≤ t_n, the increments W_{t_2} − W_{t_1}, …, W_{t_n} − W_{t_{n−1}} are independent;

(iii) for s < t, W_t − W_s is distributed N(0, t − s);

(iv) t ↦ W_t is continuous almost surely.
We will be proving that a Brownian motion does indeed exist, along with some of the basic properties it possesses. It is useful to first work on the interval [0, 1], as anything proved there is easily extended to [0, ∞) and to R^d valued processes.
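Before the formal construction, a quick simulation may help intuition (a sketch of ours, assuming Python with numpy; step size and seed are arbitrary). Properties (i)-(iii) already pin down the finite-dimensional behaviour, so a discretised path can be sampled from independent N(0, Δt) increments:

```python
import numpy as np

rng = np.random.default_rng(7)

# Discretise [0, 1] and build a path from independent N(0, dt) increments:
# W_0 = 0 and W_{t + dt} - W_t ~ N(0, dt), matching properties (i)-(iii).
n_steps = 1_000
dt = 1.0 / n_steps
increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
W = np.concatenate([[0.0], np.cumsum(increments)])   # W[k] ~ W_{k * dt}

# Sanity check of property (iii): W_1 - W_0 should be N(0, 1).
terminal = rng.normal(0.0, np.sqrt(dt), size=(10_000, n_steps)).sum(axis=1)
print(terminal.mean(), terminal.var())   # ~ 0.0 and ~ 1.0
```

Continuity, property (iv), is exactly what such a discretisation can neither exhibit nor verify; this is the delicate point the construction below addresses.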
We will need a way of extending a consistent family of finite-dimensional distributions to a stochastic process. This is Kolmogorov's extension theorem, and it is similar in spirit to Carathéodory's extension theorem.¹

Suppose that for all 0 < t_1 < ⋯ < t_n we have a measure µ_{t_1,…,t_n} on (R^d)^n. If for any permutation σ of {1, …, n} and Borel sets A_1, …, A_n ∈ B(R^d) these measures satisfy the consistency conditions

\[
\mu_{t_{\sigma(1)}, \ldots, t_{\sigma(n)}}\bigl( A_{\sigma(1)} \times \cdots \times A_{\sigma(n)} \bigr) \;=\; \mu_{t_1, \ldots, t_n}\bigl( A_1 \times \cdots \times A_n \bigr)
\]

and

\[
\mu_{t_1, \ldots, t_n}\bigl( A_1 \times \cdots \times A_{n-1} \times \mathbb{R}^d \bigr) \;=\; \mu_{t_1, \ldots, t_{n-1}}\bigl( A_1 \times \cdots \times A_{n-1} \bigr),
\]

then there exist a probability space (Ω, F, P) and a stochastic process {X_t} on it whose finite-dimensional distributions are the µ_{t_1,…,t_n}.

¹ Carathéodory's extension theorem states that a countably additive measure on a ring of sets can be extended uniquely to a measure on the σ-algebra generated by this ring.
For 0 < t_1 < ⋯ < t_n ≤ 1 we define a measure µ on R^n as follows:

\[
\mu_{t_1, \ldots, t_n}(A_1 \times \cdots \times A_n) \;=\; \int_{A_1} \!\!\cdots\! \int_{A_n} p(t_1, 0, x_1) \, p(t_2 - t_1, x_1, x_2) \cdots p(t_n - t_{n-1}, x_{n-1}, x_n) \, dx_n \cdots dx_1,
\]

where p(t, x, y) = (2πt)^{−1/2} exp(−(y − x)²/2t) is the Gaussian kernel.

The assumptions of Kolmogorov's extension theorem can easily be verified in this instance. Therefore this extends to give a measure on [0, 1]. Kolmogorov's extension theorem does not, however, guarantee the continuity of the paths. It is not at all obvious why we need continuity in the definition; to see its importance, consider the following example.
Example 4.1.2. Let us for a second assume that Brownian motion exists on [0, 1]; call it B_t. Let U be a uniform random variable on [0, 1] independent of B_t. Now we define a new process

\[
\tilde{B}_t \;=\; B_t + \mathbf{1}_{\{t = U\}}.
\]

For any fixed times t_1, …, t_n we have P(U = t_i) = 0, so B̃_t has the same finite-dimensional distributions as B_t; yet every path of B̃_t has a jump at U, so B̃_t is almost surely discontinuous.
Now, assured that our efforts to prove continuity are not in vain, we may continue. We will approach the problem in a manner very similar to that of Norbert Wiener. An alternative approach, via the Polish space² C([0, ∞), R^d), is given in Stroock and Varadhan (1979).
We will do a direct construction using the lemma below. The construction will be of a sequence of stochastic processes which converge uniformly almost surely.

Lemma 4.1.3. The uniform limit of a sequence of continuous functions is continuous.

The proof of this is just an application of the definitions, which we leave to the unsure reader.
We define a sequence of processes {W^n_t} on [0, 1] by³ partial sums of tent functions F_{k,n}, peaked at the dyadic points k/2^n, with independent standard normal coefficients.

² A complete separable metric space.

³ We will be switching between notations for W^n, referring to its value for a particular ω ∈ Ω by W^n(t, ω).
Proof. First we see from (4.1.3) that, for each ω ∈ Ω, we need to analyse Y_n(t, ω) for each n ∈ N. Notice that by (4.1.4), and the fact that F_{k,n} is maximal at t = k/2^n, we obtain a bound

for some constant C. Notice that applying the ratio test to 2^{−3n/2}