
Genetic Genealogical Models in Rare Event Analysis

Frédéric Cérou, Pierre Del Moral, François LeGland and Pascal Lezaud

IRISA / INRIA, Campus de Beaulieu, 35042 RENNES Cédex, France

E-mail address: Frederic.Cerou@inria.fr

Laboratoire J.A. Dieudonné, Université Nice, Sophia-Antipolis, Parc Valrose, 06108 NICE Cédex 2, France

E-mail address: delmoral@math.unice.fr

IRISA / INRIA, Campus de Beaulieu, 35042 RENNES Cédex, France

E-mail address: legland@irisa.fr

Centre d'Etudes de la Navigation Aérienne, 7 avenue Edouard Belin, 31055 TOULOUSE Cédex 4, France

E-mail address: lezaud@cena.fr

Abstract. We present in this article a genetic type interacting particle systems algorithm and a genealogical model for estimating a class of rare events arising in physics and network analysis. We represent the distribution of a Markov process hitting a rare target in terms of a Feynman–Kac model in path space. We show how these branching particle models described in previous works can be used to estimate the probability of the corresponding rare events as well as the distribution of the process in this regime.

1 Introduction

Let $X = \{X_t,\ t \ge 0\}$ be a continuous-time strong Markov process taking values in some Polish state space $S$. For a given target Borel set $B \subset S$ we define the hitting time

\[ T_B = \inf\{t \ge 0 : X_t \in B\}, \]

as the first time when the process $X$ hits $B$. Let us assume that $X$ has almost surely right continuous, left limited (RCLL) trajectories, and that $B$ is closed. Then we have $X_{T_B} \in B$.

Received by the editors August 31, 2005, accepted April 5, 2006.

2000 Mathematics Subject Classification. 65C35.

Key words and phrases. Interacting particle systems, rare events, Feynman-Kac models, genetic algorithms, genealogical trees.

Second version in which misprints have been corrected.

In many applications, the set $B$ is the (super) level set of a scalar measurable function $\phi$ defined on $S$, i.e.

\[ B = \{x \in S : \phi(x) \ge \lambda_B\}. \]

In this case, we will assume that $\phi$ is upper semi-continuous, which ensures that $B$ is closed. For any real interval $I$ we denote by $D(I, S)$ the set of RCLL trajectories in $S$ indexed by $I$. We always take the convention $\inf \emptyset = \infty$, so that $T_B = \infty$ if $X$ never reaches the desired target $B$. It may happen that most of the realizations of $X$ never reach the set $B$. The corresponding rare event probabilities are extremely difficult to analyze. In particular one would like to estimate the quantities

\[ \mathbb{P}(T_B \le T) \quad\text{and}\quad \mathrm{Law}(X_t,\ 0 \le t \le T_B \mid T_B \le T), \tag{1.1} \]

where $T$ is either

• a deterministic finite time,

• a $\mathbb{P}$-almost surely finite stopping time, for instance the hitting time of a recurrent Borel set $R \subset S$, i.e. $T = T_R$ with

\[ T_R = \inf\{t \ge 0 : X_t \in R\} \quad\text{and}\quad \mathbb{P}(T_R < \infty) = 1. \]

The second case covers the two "dual" situations

• Suppose the state space $S = A \cup R$ is decomposed into two separate regions $A$ and $R$. The process $X$ starts in $A$ and we want to estimate the probability of entering a target $B \subset A$ before exiting $A$. In this context the conditional distribution (1.1) represents the law of the process in this "ballistic" regime.

• Suppose the state space $S = B \cup C$ is decomposed into two separate regions $B$ and $C$. The process $X$ evolves in the region $C$, which contains a collection of "hard obstacles" represented by a subset $R \subset C$. The particle is killed as soon as it enters the "hard obstacle" set $R$. In this context the two quantities (1.1) represent respectively the probability of exiting the pocket of obstacles $C$ without being killed and the distribution of the process which succeeds in escaping from this region.

Throughout the sequel, $\mathbb{P}(T_B \le T)$ is of course unknown, but is nevertheless assumed to be strictly positive.

The estimation of these quantities arises in many research areas such as physics and engineering problems. In network analysis, such as in advanced telecommunication systems studies, $X$ traditionally represents the length of service centers in an open/closed queueing network processing jobs. In this context the two quantities (1.1) represent respectively the probability of buffer overflows and the distribution of the queueing process in this overflow regime.

Several numerical methods have been proposed in the literature to estimate the entrance probability into a rare set. We refer the reader to the excellent paper Glasserman et al. (1999), which contains a precise review of these methods as well as a detailed list of references. For the convenience of the reader we present hereafter a brief description of the two leading ideas.

The first one is based on changing the reference probability so that rare sets become less rare. This probabilistic approach requires finding the right change of measure, a step that is usually done using large deviations techniques. Another, more physical, approach consists in splitting the state space into a sequence of sub-levels the particle needs to pass before it reaches the rare target. This splitting stage is based on a precise physical description of the evolution of the process between each level leading to the rare set. The next step is to introduce a system of particles evolving in this level decomposition of the state space, in which each particle branches as soon as it enters a higher level.

The purpose of the present article is to connect the multilevel splitting techniques with the branching and interacting particle systems approximations of Feynman–Kac distributions studied in previous articles. This work has been influenced by the three referenced papers Del Moral and Miclo (2000, 2001) and Glasserman et al. (1999).

Our objective is twofold. First we propose a Feynman–Kac representation of the quantities (1.1). The general idea behind our construction is to consider the level crossing Markov chain in path space associated with the splitting of the state space. The concatenation of the corresponding states contains all information on the way the process passes each level before entering the final target rare set. Based on this modeling we introduce a natural mean field type genetic particle approximation of the desired quantities (1.1). More interestingly, we also show how the genealogical structure of the particles at each level can be used to find the distribution of the process during its excursions to the rare target.

When the state space is split into $m$ levels the particle model evolves in $m$ steps. At time $n = 0$ the algorithm starts with $N$ independent copies of the process $X$. The particles which enter the recurrent set $R$ are killed, and instantly a particle having reached the first level produces an offspring. If the whole system is killed the algorithm is stopped. Otherwise, by construction of the birth and death rule, we obtain $N$ particles at the first level. At time $n = 1$ the $N$ particles in the first level evolve according to the same rule as the process $X$. Here again particles which enter the recurrent set $R$ are killed, and instantly a particle having reached the second level produces an offspring, and so on.
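The level-by-level scheme just described can be illustrated on a toy model. The sketch below is not the authors' implementation. It assumes a nearest-neighbour random walk on the integers that is killed at 0 (playing the role of the recurrent set $R$) and levels of the form $\{x \ge \ell\}$, and it resamples all $N$ particles uniformly among the successful ones at each level. All function names and parameters are hypothetical. The classical gambler's ruin formula provides an exact value for comparison.

```python
import random

def run_until(x, level, p_up, rng):
    """Run a +/-1 random walk from x until it reaches `level` or the killing set {0}."""
    while 0 < x < level:
        x += 1 if rng.random() < p_up else -1
    return x

def splitting_estimate(levels, n_particles, p_up, x0=1, seed=0):
    """Estimate P(reach levels[-1] before 0) with a level-by-level particle scheme."""
    rng = random.Random(seed)
    particles = [x0] * n_particles
    estimate = 1.0
    for level in levels:
        # Mutation: every particle evolves until it hits the next level or is killed.
        moved = [run_until(x, level, p_up, rng) for x in particles]
        survivors = [x for x in moved if x >= level]
        if not survivors:
            return 0.0  # the whole system is killed: the algorithm stops
        # Fraction of successful particles estimates the conditional probability.
        estimate *= len(survivors) / n_particles
        # Selection: draw N new particles uniformly among the successful ones.
        particles = [rng.choice(survivors) for _ in range(n_particles)]
    return estimate

def exact_ruin_probability(a, b, p_up):
    """Gambler's ruin: probability to hit b before 0 starting from a (p_up != 1/2)."""
    r = (1 - p_up) / p_up
    return (r**a - 1) / (r**b - 1)

if __name__ == "__main__":
    levels = [2, 4, 6, 8, 10, 12]
    print("particle estimate:", splitting_estimate(levels, 2000, 0.4))
    print("exact value      :", exact_ruin_probability(1, 12, 0.4))
```

In this toy model every successful particle sits exactly on the level it has just reached, so the selection step is trivial. For a general process the entrance points differ, and resampling among them is what carries information from one level to the next.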

From this brief description we see that the former particle scheme follows the same splitting strategies as the one discussed in Glasserman et al. (1999). The new mathematical models presented here make it possible to calibrate with precision the asymptotic behavior of these particle techniques as the size of the system tends to infinity.

In addition, and in contrast to previous articles on the subject, the Feynman–Kac analysis in path space presented hereafter makes it possible to study the genealogical structure of these splitting particle models. We will show that the empirical measures associated with the corresponding historical processes converge as $N \to \infty$ to the distribution of the whole path process between each level.

An empirical method called restart (Villén-Altamirano and Villén-Altamirano (1991); Tuffin and Trivedi (2000)) can also be used to compute rare transient events and the probability of rare events in steady state, not only the probability of reaching the target before coming back to a recurrent set. It was developed to compute the rate of lost packets through a network in a steady-state regime. From a mathematical point of view, this is equivalent to the fraction of time that the trajectory spends in a particular set $B$, asymptotically as $t \to +\infty$, provided we assume that the system is ergodic. In order to be able both to simulate the system over a long time and to see frequent visits to the rare event, the algorithm splits the trajectories crossing the levels "upwards" (getting closer to the rare event), and cancels those crossing "downwards", except one of them called the master trajectory. So the main purpose of the algorithm studied here is quite different from that of restart. The restart method is used by practitioners, but it requires some mathematical approximations which are not yet well understood. Moreover this method is not covered by the previous formalism. An extension of the former particle scheme to cover restart is therefore left for further work.

A short description of the paper is as follows. Section 2 sets out the Feynman–Kac representation of the quantities (1.1). Section 3 provides the description of the corresponding genetic-type interacting particle system approximating model. Section 4 introduces a path-valued interacting particle systems model for the historical process associated with the previous genetic algorithm. Finally, Section 5 deals with a numerical example, based on the Ornstein-Uhlenbeck process. Estimation of exit times for diffusions controlled by potentials is also suggested, given the lack of exact calculations, even though some heuristics may be applied.

2 Multi-level Feynman–Kac formulae

In practice the process $X$, before visiting $R$ or entering the desired set $B$, passes through a decreasing sequence of closed sets

\[ B = B_{m+1} \subset B_m \subset \cdots \subset B_1 \subset B_0. \]

The parameter $m$ and the sequence of level sets depend on the problem at hand. We choose the level sets to be nested to ensure that the process $X$ cannot enter $B_n$ before visiting $B_{n-1}$. The choice of the recurrent set $R$ depends on the nature of the underlying process $X$. To visualize these level sets we propose hereafter the two "dual" constructions corresponding to the two "dual" interpretations presented in the introduction.

• In the ballistic regime the decreasing sequence $B = B_{m+1} \subset B_m \subset \cdots \subset B_1 \subset B_0$ represents the physical levels the process $X$ needs to pass before

To capture the behavior of $X$ between the different levels $B = B_{m+1} \subset B_m \subset \cdots \subset B_1 \subset B_0$ we introduce the discrete event-driven stochastic sequence

\[ \mathcal{X}_n = (X_t,\ T_{n-1} \wedge T \le t \le T_n \wedge T) \in E \quad\text{with}\quad E = \bigcup_{t' \le t''} D([t', t''], S) \]

for any $1 \le n \le m+1$, where $T_n$ represents the first time $X$ reaches $B_n$, that is

\[ T_n = \inf\{t \ge 0 : X_t \in B_n\} \]

with the convention $\inf \emptyset = \infty$. At this point we need to endow $E$ with a $\sigma$-algebra. First we extend all the trajectories $\mathcal{X}$ by 0 so that they are defined on the whole real line. We denote by $\widetilde{\mathcal{X}}$ the corresponding extended trajectory. They are then elements of $D(\mathbb{R}, S)$, on which we consider the $\sigma$-algebra generated by the Skorohod metric. Then we consider the product space $\widetilde{E} = D(\mathbb{R}, S) \times \overline{\mathbb{R}}_+ \times \overline{\mathbb{R}}_+$ endowed with the product $\sigma$-algebra. Finally, to any element $\mathcal{X} \in E$, defined on an interval $[s, t]$, we associate $(\widetilde{\mathcal{X}}, s, t) \in \widetilde{E}$. So we have embedded $E$ in $\widetilde{E}$ in such a way that all the standard functionals (sup, inf, ...) have good measurability properties. We denote by $\mathcal{B}_b(E)$ the measurable bounded functions from $E$ (or equivalently its image in

• finally, if $T_n \le T$, then $\mathcal{X}_n = (X_t,\ T_{n-1} \le t \le T_n)$ represents the path of $X$ between the successive levels $B_{n-1}$ and $B_n$, and $X_{T_n \wedge T} = X_{T_n} \in B_n$.

Consequently, $X_{T_n \wedge T} \in B_n$ if and only if $T_n \le T$. By construction we also notice that

\[ T_0 = 0 \le T_1 \le \cdots \le T_m \le T_{m+1} = T_B \]

and for each $n$

\[ (T_{n-1} > T) \Rightarrow (T_p > T \text{ and } \mathcal{X}_p = \{X_T\} \not\subset B_p, \text{ for all } p \ge n). \]

From these observations we can alternatively define the times $T_n$ by the inductive formula

\[ T_n = \inf\{t \ge T_{n-1} : X_t \in B_n\} \]

with the convention $\inf \emptyset = \infty$, so that $T_n > T$ if either $T_{n-1} > T$ or if, starting in $B_{n-1}$ at time $T_{n-1}$, the process never reaches $B_n$ before time $T$. We also observe that

\[ (T_B \le T) \Leftrightarrow (T_{m+1} \le T) \Leftrightarrow (T_1 \le T, \cdots, T_{m+1} \le T). \]
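As an illustration of the inductive definition of the times $T_n$ and of the path pieces $\mathcal{X}_n$, here is a minimal discrete-time sketch. It assumes a path stored as a list indexed by integer times, levels of the form $B_n = \{x \ge \lambda_n\}$ with increasing thresholds, and a deterministic horizon $T$. The helper names are hypothetical, and the sketch only covers $n \ge 1$.

```python
import math

def level_hitting_times(path, thresholds, horizon):
    """T_n = first t >= T_{n-1} with path[t] >= thresholds[n-1]; math.inf if not reached by `horizon`."""
    times = []
    start = 0
    for lam in thresholds:
        t_hit = next((t for t in range(start, horizon + 1) if path[t] >= lam), math.inf)
        times.append(t_hit)
        # Once a level is missed, the later (nested) levels cannot be reached either.
        start = horizon + 1 if t_hit == math.inf else t_hit
    return times

def excursions(path, thresholds, horizon):
    """Pieces (X_t, T_{n-1} ^ T <= t <= T_n ^ T) for n = 1, ..., len(thresholds)."""
    times = [0] + level_hitting_times(path, thresholds, horizon)
    pieces = []
    for prev, cur in zip(times, times[1:]):
        a, b = min(prev, horizon), min(cur, horizon)
        pieces.append(path[a:b + 1])
    return pieces

def potentials(pieces, thresholds):
    """Indicator that the terminal point of the n-th piece lies in B_n."""
    return [1 if piece[-1] >= lam else 0 for piece, lam in zip(pieces, thresholds)]

if __name__ == "__main__":
    path = [0, 1, 0, 2, 3, 2, 4]
    thresholds = [1, 3, 5]
    pieces = excursions(path, thresholds, horizon=6)
    print(level_hitting_times(path, thresholds, 6))
    print(pieces)
    print(potentials(pieces, thresholds))
```

In this example the first two levels are reached within the horizon and the third is not, so the last piece ends at the horizon outside $B_3$ and the product of the indicators vanishes, in agreement with the equivalence displayed above.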

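By the strong Markov property we check that the stochastic sequence $(\mathcal{X}_0, \cdots, \mathcal{X}_{m+1})$ forms an $E$-valued Markov chain. One way to check whether the path has succeeded in reaching the desired $n$-th level is to consider the potential functions $g_n$ on $E$ defined for each $x = (x_t,\ t' \le t \le t'') \in D([t', t''], S)$, with $t' \le t''$, by $g_n(x) = 1(x_{t''} \in B_n)$. With this notation we have

\[ 1(T_n \le T) = \prod_{k=0}^{n} g_k(\mathcal{X}_k), \]

and

\[ f(\mathcal{X}_n)\, 1(T_n \le T) = f(X_t,\ T_{n-1} \le t \le T_n)\, 1(T_n \le T). \]

For later purposes, we introduce the following notation

\[ (\mathcal{X}_0, \cdots, \mathcal{X}_n) = (X_0, (X_t,\ 0 \le t \le T_1 \wedge T), \cdots, (X_t,\ T_{n-1} \wedge T \le t \le T_n \wedge T)) = [X_t,\ 0 \le t \le T_n \wedge T]. \]

Introducing the Feynman–Kac distribution $\eta_n$ defined by

\[ \eta_n(f) = \frac{\gamma_n(f)}{\gamma_n(1)} \quad\text{with}\quad \gamma_n(f) = \mathbb{E}\Big(f(\mathcal{X}_n) \prod_{k=0}^{n-1} g_k(\mathcal{X}_k)\Big), \]

for any bounded measurable function $f$ defined on $E$, we are now able to state the following Feynman–Kac representation of the quantities (1.1).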


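Theorem 2.1 (Multilevel Feynman–Kac formula). For any $n$ and for any $f \in \mathcal{B}_b(E)$ we have

\[ \mathbb{E}(f(X_t,\ T_{n-1} \le t \le T_n) \mid T_n \le T) = \frac{\mathbb{E}\Big(f(\mathcal{X}_n) \prod_{p=0}^{n} g_p(\mathcal{X}_p)\Big)}{\mathbb{E}\Big(\prod_{p=0}^{n} g_p(\mathcal{X}_p)\Big)} \]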

and

\[ \mathbb{P}(T_n \le T) = \mathbb{E}\Big(\prod_{p=0}^{n} g_p(\mathcal{X}_p)\Big). \]

The straightforward formula

\[ \mathbb{P}[T_n \le T] = \prod_{k=0}^{n} \mathbb{P}[T_k \le T \mid T_{k-1} \le T], \]

which shows how the very small probability of a rare event can be decomposed into the product of reasonably small but not too small conditional probabilities, each of which corresponds to the transition between two events, can also be derived from the well-known identity

\[ \gamma_{n+1}(1) = \prod_{k=0}^{n} \eta_k(g_k), \]

and will provide the basis for the efficient numerical approximation in terms of an interacting particle system. These conditional probabilities are not known in advance, and are learned by the algorithm as well.
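For readers who want to see how the two product formulas are linked, the following short derivation may help. It uses only the definitions of $\gamma_n$ and $\eta_n$ given above, the identity $1(T_n \le T) = \prod_{k=0}^{n} g_k(\mathcal{X}_k)$, and the fact that $T_0 = 0 \le T$.

```latex
\begin{align*}
\gamma_{k+1}(1)
  &= \mathbb{E}\Big(\prod_{p=0}^{k} g_p(\mathcal{X}_p)\Big)
   = \gamma_k(g_k)
   = \eta_k(g_k)\,\gamma_k(1),
   \qquad \gamma_0(1) = 1, \\
\gamma_{n+1}(1)
  &= \prod_{k=0}^{n} \eta_k(g_k)
   \qquad \text{(by iterating the first line)}, \\
\eta_k(g_k)
  &= \frac{\gamma_k(g_k)}{\gamma_k(1)}
   = \frac{\mathbb{P}(T_k \le T)}{\mathbb{P}(T_{k-1} \le T)}
   = \mathbb{P}(T_k \le T \mid T_{k-1} \le T),
   \qquad k \ge 1, \\
\eta_0(g_0) &= \mathbb{P}(T_0 \le T) = 1 .
\end{align*}
```

As a purely illustrative order of magnitude (assumed values, not taken from the paper): if $n = 6$ and each of the six conditional probabilities equals $10^{-1}$, the product gives $\mathbb{P}(T_6 \le T) = 10^{-6}$, although no single factor is itself rare.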

3 Genetic approximating models

In previous studies, Del Moral and Miclo (2000, 2001), we designed a collection of branching and interacting particle systems approximating models for solving a general class of Feynman–Kac models. These particle techniques can be used to solve the formulae presented in Theorem 2.1. We first focus on a simple mutation/selection genetic algorithm.

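3.1 Classical scheme. To describe this particle approximating model we first recall that the Feynman–Kac distribution flow $\eta_n \in \mathcal{P}(E)$ defined by

\[ \eta_n(f) = \frac{\gamma_n(f)}{\gamma_n(1)} \quad\text{with}\quad \gamma_n(f) = \mathbb{E}\Big(f(\mathcal{X}_n) \prod_{p=0}^{n-1} g_p(\mathcal{X}_p)\Big) \]

is solution of the following measure valued dynamical system

\[ \eta_{n+1} = \Phi_{n+1}(\eta_n). \tag{3.1} \]

The mappings $\Phi_{n+1}$ from the set of measures

\[ \mathcal{P}_n(E) = \{\eta \in \mathcal{P}(E),\ \eta(g_n) > 0\} \]

into $\mathcal{P}(E)$ are defined by

\[ \Phi_{n+1}(\eta)(dx') = (\Psi_n(\eta) K_{n+1})(dx') = \int_E \Psi_n(\eta)(dx)\, K_{n+1}(x, dx'). \]

The Markov kernels $K_n(x, dx')$ represent the Markov transitions of the chain $\mathcal{X}_n$. The updating mappings $\Psi_n$ are defined from $\mathcal{P}_n(E)$ into $\mathcal{P}_n(E)$, for any $\eta \in \mathcal{P}_n(E)$ and $f \in \mathcal{B}_b(E)$, by the formula

\[ \Psi_n(\eta)(f) = \eta(f g_n) / \eta(g_n). \]

Thus we see that the recursion (3.1) involves two separate selection / mutation transitions

\[ \eta_n \in \mathcal{P}(E) \xrightarrow{\ \text{selection}\ } \widehat{\eta}_n = \Psi_n(\eta_n) \in \mathcal{P}(E) \xrightarrow{\ \text{mutation}\ } \eta_{n+1} = \widehat{\eta}_n K_{n+1} \in \mathcal{P}(E). \]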
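The updating mapping $\Psi_n$ is easy to visualize on a discrete measure. The sketch below assumes a measure given by a finite list of points with weights; the function names are hypothetical. It reweights by the potential and renormalizes, and it raises an error when $\eta(g) = 0$, that is when $\eta \notin \mathcal{P}_n(E)$.

```python
def boltzmann_gibbs(points, weights, g):
    """Weights of Psi(eta) for eta = sum_i weights[i] * delta_{points[i]}."""
    reweighted = [w * g(x) for x, w in zip(points, weights)]
    total = sum(reweighted)  # this is eta(g)
    if total == 0:
        raise ValueError("eta(g) = 0: the measure is not in P_n(E)")
    return [w / total for w in reweighted]

def integrate(points, weights, f):
    """eta(f) for the discrete measure eta."""
    return sum(w * f(x) for x, w in zip(points, weights))

if __name__ == "__main__":
    # Terminal positions of four particles; the potential is the indicator of {x >= 1}.
    points = [0.2, 1.5, 0.7, 2.1]
    weights = [0.25, 0.25, 0.25, 0.25]
    g = lambda x: 1.0 if x >= 1.0 else 0.0
    selected = boltzmann_gibbs(points, weights, g)
    print(selected)
    # Check Psi(eta)(f) = eta(f g) / eta(g) for f(x) = x.
    lhs = integrate(points, selected, lambda x: x)
    rhs = integrate(points, weights, lambda x: x * g(x)) / integrate(points, weights, g)
    print(lhs, rhs)
```

With an indicator potential, the selection step simply spreads the mass uniformly over the particles that have reached the level.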

ηp(gp)

In these notations we readily observe that

\[ \gamma_n(g_n) = \mathbb{P}(T_n \le T) \]

and

$\cup\, \{\Delta\}$, where $\Delta$ stands for a cemetery or coffin point. Its transitions are defined as follows. For any configuration $x = (x^1, \cdots, x^N) \in E^N$ such that $\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i} \in \mathcal{P}_n(E)$ we set

\[ \mathbb{P}(\xi_{n+1} \in dy \mid \xi_n = x) = \prod_{p=1}^{N} \Phi_{n+1}\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i}\Big)(dy^p), \tag{3.3} \]

where $dy = dy^1 \times \cdots \times dy^N$ is an infinitesimal neighborhood of $y = (y^1, \cdots, y^N) \in E^N$. When the system arrives in some configuration $\xi_n = x$ such that

\[ \frac{1}{N} \sum_{i=1}^{N} \delta_{x^i} \notin \mathcal{P}_n(E) \]

the particle algorithm is stopped and we set $\xi_{n+1} = \Delta$. The initial system of particles $\xi_0 = (\xi_0^1, \cdots, \xi_0^N)$ consists of $N$ independent random variables with common law $\eta_0 = \mathrm{Law}(\mathcal{X}_0) = \mathrm{Law}(X_0)$. The superscript $i = 1, \cdots, N$ represents the label of the particle, and the parameter $N$ is the size of the system and controls the precision of the algorithm.


Next we describe in more detail the genetic evolution of the path-particles. At time $n = 0$ the initial configuration consists of $N$ independent and identically distributed $S$-valued random variables $\xi_0^i$ with common law $\eta_0$. Since we have $g_0(u) = 1$ for $\eta_0$-almost every $u \in S$, we may discard the selection at time $n = 0$ and set $\widehat{\xi}_0^i$

$\xi_n = \Delta$ we set $\xi_{n+1} = \Delta$. Otherwise during mutation, independently of each other, each selected path-particle

$n$ at time $T_{n+1}^{-,i} = \widehat{T}_n^{+,i}$, and let it evolve randomly as a copy $\{\xi_{n+1}^i(s),\ s \ge T_{n+1}^{-,i}\}$ of the process $\{X_s,\ s \ge T_{n+1}^{-,i}\}$, until the stopping time $T_{n+1}^{+,i}$, which is either

\[ T_{n+1}^{+,i} = \inf\{t \ge T_{n+1}^{-,i} : \xi_{n+1}^i(t) \in B_{n+1} \cup R\}, \]

in case of a recurrent set to be avoided, or

\[ T_{n+1}^{+,i} = T \wedge \inf\{t \ge T_{n+1}^{-,i} : \xi_{n+1}^i(t) \in B_{n+1}\}, \]

in case of a deterministic final time, depending on the problem at hand.
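The two stopping rules for the mutation step can be sketched in discrete time as follows. This is a minimal illustration under stated assumptions: time is integer valued, the dynamics are supplied by a user-defined one-step function, and the names `mutate_until`, `step`, `in_level`, `in_killing_set` and `horizon` are all hypothetical. At least one of the killing set or the horizon should be supplied, otherwise the loop may never terminate.

```python
def mutate_until(x, t, step, in_level, in_killing_set=None, horizon=None):
    """Extend a particle from state x at time t until it enters the next level,
    the killing set (if given), or the deterministic horizon (if given).

    Returns the piece of trajectory as a list of (time, state) pairs.
    """
    piece = [(t, x)]
    while not in_level(x):
        if in_killing_set is not None and in_killing_set(x):
            break  # the particle entered the recurrent set R
        if horizon is not None and t >= horizon:
            break  # the deterministic final time T is reached
        x = step(x)
        t += 1
        piece.append((t, x))
    return piece

if __name__ == "__main__":
    # Deterministic dynamics are used here only to keep the example transparent.
    print(mutate_until(0, 0, lambda x: x + 1, lambda x: x >= 3))
    print(mutate_until(0, 0, lambda x: x + 1, lambda x: x >= 3, horizon=2))
    print(mutate_until(0, 0, lambda x: x - 1, lambda x: x >= 3,
                       in_killing_set=lambda x: x <= -2))
```

The terminal point of the returned piece tells whether the particle succeeded: it lies in the next level in the first call and outside it in the other two.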

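The selection transition $\xi_{n+1} \to \widehat{\xi}_{n+1}$ is defined as follows. From the previous mutation transition we obtain $N$ path-particles

\[ \xi_{n+1}^i = (\xi_{n+1}^i(t),\ T_{n+1}^{-,i} \le t \le T_{n+1}^{+,i}). \]

Only some of these particles have succeeded in reaching the desired set $B_{n+1}$ and the other ones have failed. We denote by $I_{n+1}^N$ the labels of the particles having succeeded in reaching the $(n+1)$-th level

\[ I_{n+1}^N = \{i = 1, \cdots, N : \xi_{n+1}^i(T_{n+1}^{+,i}) \in B_{n+1}\}. \]

When $I_{n+1}^N = \emptyset$, that is when

\[ \sum_{i=1}^{N} g_{n+1}(\xi_{n+1}^i) = 0 \iff \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_{n+1}^i} \notin \mathcal{P}_{n+1}(E), \]

we see that in this situation the algorithm is stopped and $\widehat{\xi}_{n+1} = \Delta$. Otherwise the selection transitions of the $N$-particle models (3.3) and (3.5) are defined as follows. In the first situation the system $\widehat{\xi}_{n+1} = (\widehat{\xi}_{n+1}^1, \cdots, \widehat{\xi}_{n+1}^N)$ consists of $N$ independent (given the past until the last mutation) random variables

\[ \widehat{\xi}_{n+1}^i = (\widehat{\xi}_{n+1}^i(t),\ \widehat{T}_{n+1}^{-,i} \le t \le \widehat{T}_{n+1}^{+,i}) \]

with common distribution

\[ \Psi_{n+1}\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_{n+1}^i}\Big) = \sum_{i=1}^{N} \frac{g_{n+1}(\xi_{n+1}^i)}{\sum_{j=1}^{N} g_{n+1}(\xi_{n+1}^j)}\, \delta_{\xi_{n+1}^i} = \frac{1}{|I_{n+1}^N|} \sum_{i \in I_{n+1}^N} \delta_{(\xi_{n+1}^i(t),\ T_{n+1}^{-,i} \le t \le T_{n+1}^{+,i})}. \]

In simple words, we draw them uniformly among the successful pieces of trajectories $\{\xi_{n+1}^i,\ i \in I_{n+1}^N\}$.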

3.2 Alternate scheme. As mentioned above, the choice of the $N$-particle approximating model of (3.1) is not unique. Below, we propose an alternative scheme which contains in some sense less randomness, see Del Moral et al. (2001b). The key idea is to notice that the updating mapping $\Psi_n : \mathcal{P}_n(E) \to \mathcal{P}_n(E)$ can be rewritten in the following form

\[ \Psi_n(\eta)(dx') = (\eta\, S_n(\eta))(dx') = \int_E \eta(dx)\, S_n(\eta)(x, dx'), \tag{3.4} \]

with the collection of Markov transition kernels $S_n(\eta)(x, dx')$ on $E$ defined by

\[ S_n(\eta)(x, dx') = (1 - g_n(x))\, \Psi_n(\eta)(dx') + g_n(x)\, \delta_x(dx'), \]

where

\[ g_n(x) = 1(g_n(x) = 1) = 1(x \in g_n^{-1}(1)), \]

and where $g_n^{-1}(1)$ stands for the set of paths in $E$ entering the level $B_n$, that is

\[ g_n^{-1}(1) = \{x \in E : g_n(x) = 1\} = \{x \in D([t', t''], S),\ t' \le t'' : x_{t''} \in B_n\}. \]

Indeed

\[ (\eta\, S_n(\eta))(dx') = \Psi_n(\eta)(dx')\, (1 - \eta(g_n)) + \int_E \eta(dx)\, g_n(x)\, \delta_x(dx'), \]

hence

\[ (\eta\, S_n(\eta))(f) = \Psi_n(\eta)(f)\, (1 - \eta(g_n)) + \eta(f g_n) = \Psi_n(\eta)(f), \]

for any bounded measurable function $f$ defined on $E$, which proves (3.4). In this notation, (3.1) can be rewritten as

\[ \eta_{n+1} = \eta_n\, \mathcal{K}_{n+1}(\eta_n), \]

with the composite Markov transition kernel $\mathcal{K}_{n+1}(\eta)$ defined by

\[ \mathcal{K}_{n+1}(\eta)(x, dx') = (S_n(\eta) K_{n+1})(x, dx') = \int_E S_n(\eta)(x, dx'')\, K_{n+1}(x'', dx'). \]

The alternative $N$-particle model associated with this new description is defined as before by replacing (3.3) by

\[ \mathbb{P}(\xi_{n+1} \in dy \mid \xi_n = x) = \prod_{p=1}^{N} \mathcal{K}_{n+1}\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i}\Big)(x^p, dy^p). \tag{3.5} \]
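The selection kernel $S_n(\eta)$ applied to an empirical measure has a simple algorithmic reading: a particle that has reached the level is kept, and a particle that has failed is replaced by a uniformly chosen successful one. The following sketch illustrates this reading only. Particles are represented by arbitrary Python objects, `reached` stands for the indicator potential, and the function name is hypothetical.

```python
import random

def select_alternate(particles, reached, rng):
    """Selection step of the alternate scheme on a list of particles.

    Successful particles stay where they are; each failed particle jumps onto a
    uniformly chosen successful one. Returns None (cemetery state) if no
    particle has reached the level.
    """
    winners = [x for x in particles if reached(x)]
    if not winners:
        return None
    return [x if reached(x) else rng.choice(winners) for x in particles]

if __name__ == "__main__":
    rng = random.Random(1)
    # States 3 and 5 have reached the level {x > 0}; the two zeros have failed.
    print(select_alternate([3, 0, 5, 0], lambda x: x > 0, rng))
```

Compared with redrawing all $N$ particles among the successful ones, this step leaves the successful particles untouched, which is one way to read the remark that the alternate scheme contains less randomness.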


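By definition of $\Phi_{n+1}$ and $\mathcal{K}_{n+1}(\eta)$ we have, for any configuration $x = (x^1, \cdots, x^N)$ in $E^N$ with $\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i} \in \mathcal{P}_n(E)$,

\[ \Phi_{n+1}\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i}\Big)(dv) = \sum_{i=1}^{N} \frac{g_n(x^i)}{\sum_{j=1}^{N} g_n(x^j)}\, K_{n+1}(x^i, dv). \]

In much the same way we find that

\[ \mathcal{K}_{n+1}\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i}\Big) = S_n\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i}\Big) K_{n+1} \]

with the selection transition

\[ \Psi_n\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{x^i}\Big) = \sum_{i=1}^{N} \frac{g_n(x^i)}{\sum_{j=1}^{N} g_n(x^j)}\, \delta_{x^i}. \]

Thus, we see that the transition $\xi_n \to \xi_{n+1}$ of the former Markov models splits up into two separate genetic type mechanisms

\[ \xi_n \in E^N \cup \{\Delta\} \xrightarrow{\ \text{selection}\ } \widehat{\xi}_n = (\widehat{\xi}_n^i)_{1 \le i \le N} \in E^N \cup \{\Delta\} \xrightarrow{\ \text{mutation}\ } \xi_{n+1} \in E^N \cup \{\Delta\}. \]

By construction we notice that

\[ \xi_n = \Delta \implies \forall p \ge n \quad \xi_p = \Delta \text{ and } \widehat{\xi}_p = \Delta. \]

By definition of the path valued Markov chain $\mathcal{X}_n$ this genetic model consists of $N$ path valued particles

\[ \xi_n^i = (\xi_n^i(t),\ T_n^{-,i} \le t \le T_n^{+,i}) \in D([T_n^{-,i}, T_n^{+,i}], S), \]

\[ \widehat{\xi}_n^i = (\widehat{\xi}_n^i(t),\ \widehat{T}_n^{-,i} \le t \le \widehat{T}_n^{+,i}) \in D([\widehat{T}_n^{-,i}, \widehat{T}_n^{+,i}], S). \]

The random time-pairs $(T^{-,i}$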


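is sampled according to the selection distribution

\[
\begin{aligned}
S_{n+1}\Big(\frac{1}{N} \sum_{j=1}^{N} \delta_{\xi_{n+1}^j}\Big)(\xi_{n+1}^i, dv)
&= (1 - g_{n+1}(\xi_{n+1}^i))\, \Psi_{n+1}\Big(\frac{1}{N} \sum_{j=1}^{N} \delta_{\xi_{n+1}^j}\Big)(dv) + g_{n+1}(\xi_{n+1}^i)\, \delta_{\xi_{n+1}^i}(dv) \\
&= 1(\xi_{n+1}^i(T_{n+1}^{+,i}) \notin B_{n+1})\, \Psi_{n+1}\Big(\frac{1}{N} \sum_{j=1}^{N} \delta_{\xi_{n+1}^j}\Big)(dv) + 1(\xi_{n+1}^i(T_{n+1}^{+,i}) \in B_{n+1})\, \delta_{\xi_{n+1}^i}(dv).
\end{aligned}
\]

More precisely we have

\[ \xi_{n+1}^i(T_{n+1}^{+,i}) \in B_{n+1} \implies \widehat{\xi}_{n+1}^i = \xi_{n+1}^i. \]

In the opposite case we have $\xi_{n+1}^i(T_{n+1}^{+,i}) \notin B_{n+1}$ when the particle has not succeeded in reaching the $(n+1)$-th level. In this case $\widehat{\xi}_{n+1}^i$ is chosen randomly and uniformly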

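We denote by $\tau^N$ the lifetime of the $N$-genetic model

\[ \tau^N = \inf\Big\{n \ge 0 : \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_n^i} \notin \mathcal{P}_n(E)\Big\}. \]

For each time $n < \tau^N$ we denote by $\eta_n^N$ and $\widehat{\eta}_n^N$ the particle density profiles associated with the $N$-particle model

\[ \eta_n^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_n^i} \quad\text{and}\quad \widehat{\eta}_n^N = \Psi_n(\eta_n^N). \]

For each time $n < \tau^N$ the $N$-particle approximating measures $\gamma_n^N$ associated with $\gamma_n$ are defined for any $f \in \mathcal{B}_b(E)$ by

\[ \gamma_n^N(f) = \eta_n^N(f) \prod_{p=0}^{n-1} \eta_p^N(g_p). \]

Note that

\[ \gamma_{n+1}^N(1) = \gamma_n^N(g_n) = \prod_{p=0}^{n} \eta_p^N(g_p) = \prod_{p=1}^{n} \frac{|I_p^N|}{N}, \]

and

\[ \widehat{\eta}_n^N = \frac{1}{|I_n^N|} \sum_{i \in I_n^N} \delta_{(\xi_n^i(t),\ T_n^{-,i} \le t \le T_n^{+,i})}. \]

The asymptotic behavior as $N \to \infty$ of the interacting particle model we have constructed has been studied in many works. We refer the reader to the survey paper Del Moral and Miclo (2000) in the case of strictly positive potentials $g_n$, and to Cérou et al. (2002); Del Moral et al. (2001a) for non negative potentials. For the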
