
DOCUMENT INFORMATION

Title: Performance Analysis of Communications Networks and Systems
Author: Piet Van Mieghem
Institution: Delft University of Technology
Field: Communications Networks and Systems
Type: Book
Publication year: 2006
City: Delft
Pages: 543
Size: 10.69 MB


Contents



PERFORMANCE ANALYSIS OF COMMUNICATIONS NETWORKS AND SYSTEMS

PIET VAN MIEGHEM

Delft University of Technology


Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge CB2 2RU, UK
Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521855150

© Cambridge University Press 2006

This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2006

ISBN-13 978-0-511-16917-5 eBook (NetLibrary)
ISBN-10 0-511-16917-5 eBook (NetLibrary)
ISBN-13 978-0-521-85515-0 hardback
ISBN-10 0-521-85515-2 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.


to my father

to my wife Saskia

and my sons Vincent, Nathan and Laurens


Preface xi

4.1 Generation of correlated Gaussian random variables 61
4.4 Examples of the non-linear transformation method 74
4.5 Linear combination of independent auxiliary random variables
5.3 Inequalities deduced from the Mean Value Theorem 86
5.7 The dominant pole approximation and large deviations 94
9.2 Discrete-time Markov chain 158
10.5 The transitions in a continuous-time Markov chain 193
10.6 Example: the two-state Markov chain in continuous-time 195
11.1 Discrete Markov chains and independent random variables
12.2 The limit W of the scaled random variables W_n 233
12.3 The Probability of Extinction of a Branching Process 237
16.1 The shortest path and the link weight structure 348
16.2 The shortest path tree in K_N with exponential link weights
16.8 The proof of the degree Theorem 16.6.1 of the URT 380
17.4 The Chuang–Sirbu law 404
17.7 Proof of Theorem 17.3.1: g_N(m) for n-ary trees 414
18.6 The performance measure in exponentially growing


Performance analysis belongs to the domain of applied mathematics. The major domain of application in this book concerns telecommunications systems and networks. We will mainly use stochastic analysis and probability theory to address problems in the performance evaluation of telecommunications systems and networks. The first chapter provides a motivation and a statement of several problems.

This book aims to present methods rigorously, hence mathematically, with minimal resorting to intuition. It is my belief that intuition is often gained after the result is known and rarely before the problem is solved, unless the problem is simple. Techniques and terminologies of axiomatic probability (such as definitions of probability spaces, filtration, measures, etc.) have been omitted and a more direct, less abstract approach has been adopted.

In addition, most of the important formulas are interpreted in the sense of "What does this mathematical expression teach me?" This last step justifies the word "applied", since most mathematical treatises do not interpret: interpretation carries the risk of being imprecise and incomplete.

The field of stochastic processes is much too large to be covered in a single book and only a selected number of topics has been chosen. Most of the topics are considered classical. Perhaps the largest omission is a treatment of Brownian processes and the many related applications. A weak excuse for this omission (besides the considerable mathematical complexity) is that Brownian theory applies more to physics (analogue fields) than to system theory (discrete components). The list of omissions is rather long and only the most noteworthy are summarized: recent concepts such as martingales and the coupling theory of stochastic variables, queueing networks, scheduling rules, and the theory of long-range dependent random variables that currently governs in the Internet. The confinement to stochastic analysis also excludes the recent new framework, called Network Calculus by Le Boudec and Thiran (2001). Network calculus is based on min-plus algebra and has been applied to (Inter)network problems in a deterministic setting.

As prerequisites, familiarity with elementary probability and knowledge of the theory of functions of a complex variable are assumed. Parts in the text in small font refer to more advanced topics or to computations that can be skipped at first reading. Part I (Chapters 2–6) reviews probability theory and is included to make the remainder self-contained. The book essentially starts with Chapter 7 (Part II) on Poisson processes. The Poisson process (independent increments and discontinuous sample paths) and Brownian motion (independent increments but continuous sample paths) are considered to be the most important basic stochastic processes. We briefly touch upon renewal theory to move to Markov processes. The theory of Markov processes is regarded as a fundament for many applications in telecommunications systems, in particular queueing theory. A large part of the book is devoted to Markov processes and their applications. The last chapters of Part II dive into queueing theory. Inspired by intriguing problems in telephony at the beginning of the twentieth century, Erlang pushed queueing theory onto the scene of the sciences. Since his investigations, queueing theory has grown considerably. Especially during the last decade, with the advent of the Asynchronous Transfer Mode (ATM) and the worldwide Internet, many early ideas have been refined (e.g. discrete-time queueing theory, large deviation theory, scheduling control of prioritized flows of packets) and new concepts (self-similar or fractal processes) have been proposed. Part III covers current research on the physics of networks. This Part III is undoubtedly the least mature and complete. In contrast to most books, I have chosen to include the solutions to the problems in an Appendix to support self-study.

I am grateful to colleagues and students whose input has greatly improved this text. Fernando Kuipers and Stijn van Langen have corrected a large number of misprints. Together with Fernando, Milena Janic and Almerima Jamakovic have supplied me with exercises. Gerard Hooghiemstra has made valuable comments and was always available for discussions about my viewpoints. Bart Steyaert eagerly gave the finer details of the generating function approach to the GI/D/m queue. Jan Van Mieghem has given overall comments and suggestions besides his input with the computation of correlations. Finally, I thank David Hemsley for his scrupulous corrections in the original manuscript.

Although this book is intended to be of practical use, in the course of writing it I became more and more persuaded that mathematical rigor has ample virtues of its own.

Per aspera ad astra


The aim of this first chapter is to motivate why stochastic processes and probability theory are useful to solve problems in the domain of telecommunications systems and networks.

In any system, or for any transmission of information, there is always a non-zero probability of failure or of error penetration. A lot of problems in quantifying the failure rate, bit error rate or the computation of redundancy to recover from hazards are successfully treated by probability theory. Often in communications we deal with a large variety of signals, calls, source-destination pairs, messages, the number of customers per region, and so on. And, most often, precise information at any time is not available or, if it is available, deterministic studies or simulations are simply not feasible due to the large number of different parameters involved. For such problems, a stochastic approach is often a powerful vehicle, as has been demonstrated in the field of physics.

Perhaps the first impressive result of a stochastic approach was Boltzmann's and Maxwell's statistical theory. They studied the behavior of particles in an ideal gas and described how macroscopic quantities such as pressure and temperature can be related to the microscopic motion of the huge number of individual particles. Boltzmann also introduced the stochastic notion of the thermodynamic concept of entropy S,

S = k log W

where W denotes the total number of ways in which the ensemble of particles can be distributed in thermal equilibrium and where k is a proportionality factor, afterwards attributed to Boltzmann as the Boltzmann constant. The pioneering work of these early physicists such as Boltzmann, Maxwell and others was the germ of a large number of breakthroughs in science. Shortly after their introduction of stochastic theory in classical physics, the theory of quantum mechanics (see e.g. Cohen-Tannoudji et al., 1977) was established. This theory proposes that the elementary building blocks of nature, the atom and electrons, can only be described in a probabilistic sense. The conceptually difficult notion of a wave function, whose squared modulus expresses the probability that a set of particles is in a certain state, and Heisenberg's uncertainty relation exclude in a dramatic way our deterministic, macroscopic view on nature at the fine atomic scale.

At about the same time as the theory of quantum mechanics was being created, Erlang applied probability theory to the field of telecommunications. Erlang succeeded in determining the number of telephone input lines m of a switch needed to serve subscribers with a certain probability of blocking p. Perhaps his most used formula is the Erlang B formula (14.17), derived in Section 14.2.2,

B(m; ρ) = (ρ^m / m!) / Σ_{k=0}^{m} ρ^k / k!

Given a blocking probability of, say, p = 10^-4, the number of input lines m can be computed for each offered load ρ. Due to its importance, books with tables relating p, ρ and m were published. Another pioneer in the field of communications who deserves to be mentioned is Shannon. Shannon explored the concept of entropy. He introduced (see e.g. Walrand, 1998) the notion of the Shannon capacity of a channel, the maximum rate at which bits can be transmitted with arbitrarily small (but non-zero) probability of error, and the concept of the entropy rate of a source, which is the minimum average number of bits per symbol required to encode the output of a source. Many others have extended his basic ideas and so it is fair to say that Shannon founded the field of information theory.
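The Erlang blocking computation referenced above can be sketched numerically. The following is an illustrative sketch (not from the book): it evaluates the blocking probability with the standard numerically stable recursion B_0 = 1, B_m = ρB_{m-1}/(m + ρB_{m-1}), which is algebraically equivalent to the closed form but avoids large factorials, and then inverts it to find the smallest number of lines for a target blocking probability. The load value ρ = 20 is an arbitrary example.

```python
def erlang_b(m, rho):
    """Blocking probability of the Erlang B formula,
    B = (rho^m / m!) / sum_{k=0}^{m} rho^k / k!,
    computed with the stable recursion B_0 = 1,
    B_k = rho*B_{k-1} / (k + rho*B_{k-1})."""
    b = 1.0
    for k in range(1, m + 1):
        b = rho * b / (k + rho * b)
    return b

def lines_needed(rho, target):
    """Smallest number of input lines m with blocking <= target."""
    m, b = 0, 1.0
    while b > target:
        m += 1
        b = rho * b / (m + rho * b)
    return m

# Example: offered load rho = 20 Erlang, target blocking p = 1e-4.
m = lines_needed(20.0, 1e-4)
```

The recursion is the usual way the tabulated values mentioned in the text are generated today, since the factorials in the closed form overflow quickly for large m.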

A recent important driver in telecommunication is the concept of quality of service (QoS). Customers can use the network to transmit different types of information such as pictures, files, voice, etc., by requiring a specific level of service depending on the type of transmitted information. For example, a telephone conversation requires that the voice packets arrive at the receiver D ms later, while a file transfer is mostly not time critical but requires an extremely low information loss probability. The value of the mouth-to-ear delay D is clearly related to the perceived quality of the voice conversation. As long as D < 150 ms, the voice conversation has toll quality, which is, roughly speaking, the quality that we are used to in classical telephony. When D exceeds 150 ms, rapid degradation is experienced, and when D > 300 ms, most of the test persons have great difficulty in understanding the conversation. However, perceived quality may change from person to person and is difficult to determine, even for telephony. For example, if the test person knows a priori that the conversation is transmitted over a mobile or wireless channel as in GSM, he or she is willing to tolerate a lower quality. Therefore, quality of service is both related to the nature of the information and to the individual desire and perception.

In future Internetworking, it is believed that customers may request a certain QoS for each type of information. Depending on the level of stringency, the network may either admit or refuse the customer. Since customers will also pay an amount related to this QoS stringency, the network function that determines whether to accept or refuse a call for service will be of crucial interest to any network operator. Let us now state the connection admission control (CAC) problem for a voice conversation to illustrate the relation to stochastic analysis: "How many customers m are allowed in order to guarantee that the ensemble of all voice packets reaches the destination within D ms with probability p?" This problem is exceptionally difficult because it depends on the voice codecs used, the specifics of the network topology, the capacity of the individual network elements, the arrival process of calls from the customers, the duration of the conversation and other details. Therefore, we will simplify the question. Let us first assume that the delay is only caused by the waiting time of a voice packet in the queue of a router (or switch). As we will see in Chapter 13, this waiting time W of voice packets in a single queueing system depends on (a) the arrival process: the way voice packets arrive, and (b) the service process: how they are processed. Let us assume that the arrival process, specified by the average arrival rate λ, and the service process, specified by the average service rate μ, are known. Clearly, the arrival rate λ is connected to the number of customers m. A simplified statement of the CAC problem is, "What is the maximum allowed λ such that Pr[W > D] < ε?" In essence, the CAC problem consists in computing the tail probability of a quantity that depends on parameters of interest. We have elaborated on the CAC problem because it is a basic design problem that appears under several disguises. A related dimensioning problem is the determination of the buffer size in a router in order not to lose more than a certain number of packets with probability p, given the arrival and service process. The above-mentioned problem of Erlang is a third example. Another example, treated in Chapter 18, is the server placement problem: "How many replicated servers m are needed to guarantee that any user can access the information within k hops with probability Pr[h_N(m) > k] ≤ ε?", where ε is a certain level of stringency and h_N(m) is the number of hops towards the most nearby of the m servers in a network with N routers.
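To make the simplified tail-probability question concrete, suppose, as an illustrative assumption going beyond the text (which treats general queueing systems in Chapter 13), that the router behaves as an M/M/1 queue. Then the classical waiting-time tail is Pr[W > t] = ρ e^{-(μ-λ)t} with ρ = λ/μ, which increases with λ, so the maximum allowed arrival rate can be found by bisection. All parameter values below are arbitrary examples.

```python
import math

def mm1_wait_tail(lam, mu, t):
    """Pr[W > t] for the FIFO waiting time in an M/M/1 queue:
    Pr[W > t] = (lam/mu) * exp(-(mu - lam) * t), valid for lam < mu."""
    rho = lam / mu
    return rho * math.exp(-(mu - lam) * t)

def max_arrival_rate(mu, d, eps, tol=1e-12):
    """Largest lam with Pr[W > d] <= eps, found by bisection;
    the tail is increasing in lam on (0, mu)."""
    lo, hi = 0.0, mu
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mm1_wait_tail(mid, mu, d) <= eps:
            lo = mid     # mid still satisfies the QoS constraint
        else:
            hi = mid
    return lo

# Example: service rate mu = 1000 packets/s, delay bound d = 0.15 s, eps = 1e-3.
lam_max = max_arrival_rate(1000.0, 0.15, 1e-3)
```

This is exactly the shape of the CAC computation: a tail probability as a function of the design parameter λ, inverted against a stringency level ε.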

The popularity of the Internet results in a number of new challenges. The traditional mathematical models such as the Erlang B formula assume "smooth" traffic flows (small correlation and Markovian in nature). However, TCP/IP traffic has been shown to be "bursty" (long-range dependent, self-similar and even chaotic, non-Markovian (Veres and Boda, 2000)). As a consequence, many traditional dimensioning and control problems ask for a new solution. The self-similar and long-range dependent TCP/IP traffic is mainly caused by new complex interactions between protocols and technologies (e.g. TCP/IP/ATM/SDH) and by other information transported than voice. It is observed that the content size of information in the Internet varies considerably in size, causing the "Noah effect": although immense floods are extremely rare, their occurrence impacts Internet behavior significantly on a global scale. Unfortunately, the mathematics to cope with self-similar and long-range dependent processes turns out to be fairly complex and beyond the scope of this book.

Finally, we mention the current interest in understanding and modeling complex networks such as the Internet, biological networks, social networks and utility infrastructures for water, gas, electricity and transport (cars, goods, trains). Since these networks consist of a huge number of nodes N and links L, classical and algebraic graph theory is often not suited to produce even approximate results. The beginning of probabilistic graph theory is commonly attributed to the appearance of papers by Erdös and Rényi in the late 1940s. They investigated a particularly simple growing model for a graph: start from N nodes and connect in each step an arbitrary random, not yet connected pair of nodes until all L links are used. After about N/2 steps, as shown in Section 16.7.1, they observed the birth of a giant component that, in subsequent steps, swallows the smaller ones at a high rate. This phenomenon is called a phase transition and often occurs in nature. In physics it is studied in, for example, percolation theory. To some extent, the Internet's graph bears some resemblance to the Erdös–Rényi random graph. The Internet is best regarded as a dynamic and growing network, whose graph is continuously changing. Yet, in order to deploy services over the Internet, an accurate graph model that captures the relevant structural properties is desirable. As shown in Part III, a probabilistic approach based on random graphs seems an efficient way to learn about the Internet's intriguing behavior. Although the Internet's topology is not a simple Erdös–Rényi random graph, results such as the hopcount of the shortest path and the size of a multicast tree deduced from the simple random graphs provide a first order estimate for the Internet. Moreover, analytic formulas based on other classes of graphs than the simple random graph prove difficult to obtain. This observation is similar to queueing theory, where, besides the M/G/x class of queues, hardly any closed expressions exist.
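The phase transition around N/2 random links is easy to observe in simulation. The sketch below is an illustration only, and it simplifies the growing model described above by allowing repeated or self-pair draws (which changes nothing qualitatively): it adds random links to N = 10000 nodes and tracks the largest connected component with a union-find structure.

```python
import random

def find(parent, x):
    # Path-compressing find for the union-find structure.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def largest_component(n, links, seed=1):
    """Add `links` random node pairs among n nodes and return
    the size of the largest connected component."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n
    biggest = 1
    for _ in range(links):
        a, b = rng.randrange(n), rng.randrange(n)
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb
            size[rb] += size[ra]
            biggest = max(biggest, size[rb])
    return biggest

n = 10000
before = largest_component(n, n // 4)   # below the transition: small components
after = largest_component(n, n)         # above the transition: a giant component
```

Below the critical number of links the largest component stays tiny compared with n, while with n links a component containing more than half of the nodes has already emerged.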

We hope that this brief overview motivates sufficiently to surmount the mathematical barriers. Skill with probability theory is deemed necessary to understand complex phenomena in telecommunications. Once mastered, the power and beauty of mathematics will be appreciated.


Probability theory


Random variables

This chapter reviews basic concepts from probability theory. A random variable (rv) is a variable that takes certain values by chance. Throughout this book, this imprecise and intuitive definition suffices. The precise definition involves axiomatic probability theory (Billingsley, 1995).

Here, a distinction between discrete and continuous random variables is made, although a unified approach, including also mixed cases via the Stieltjes integral ∫ g(x) dF(x) (Hardy et al., 1999, pp. 152–157), is possible. In general, the distribution F_X(x) = Pr[X ≤ x] holds in both cases. In most practical situations, the Stieltjes integral reduces to the Riemann integral; otherwise, Lebesgue's theory of integration and measure theory (Royden, 1988) is required.

2.1 Probability theory and set theory

Pascal (1623–1662) is commonly regarded as one of the founders of probability theory. In his days, there was much interest in games of chance(1) and the likelihood of winning a game. In most of these games, there was a finite number n of possible outcomes and each of them was equally likely. The probability of the event A of interest was defined as

Pr[A] = n_A / n

where n_A is the number of favorable outcomes (sample points of A). If the number of outcomes of an experiment is not finite, this classical definition of probability does not suffice anymore. In order to establish a coherent and precise theory, probability theory employs concepts of group or set theory. The set of all possible outcomes of an experiment is called the sample space. An outcome of the experiment is a sample point, that is, an element of the sample space. An event A is a set of sample points. The complement A^c of an event A consists of all sample points of the sample space that do not belong to A. Pr[A] = 1 means that the event A is certain to occur. If Pr[A] = p with 0 < p < 1, the event A has probability p to occur.

(1) "La règle des partis", a chapter in Pascal's mathematical work (Pascal, 1954), consists of a series of letters to Fermat that discuss the following problem (together with a more complex question that is essentially a variant of the probability of gambler's ruin treated in Section 11.2.1): Consider the game in which 2 dice are thrown n times. How many times n do we have to throw the 2 dice to throw double six with probability p = 1/2?

If the events A and B have no sample points in common, A ∩ B = ∅, the events A and B are called mutually exclusive events. As an example, an event and its complement are mutually exclusive because A ∩ A^c = ∅. Axiom 2 of a probability measure is that for mutually exclusive events A and B it holds that Pr[A ∪ B] = Pr[A] + Pr[B]. The definition of a probability measure and the two axioms are sufficient to build a consistent framework on which probability theory is founded. Since Pr[∅] = 0 (which follows from Axiom 2 because A ∩ ∅ = ∅ and A = A ∪ ∅), for mutually exclusive events A and B it holds that Pr[A ∩ B] = 0.

(2) A field F possesses the properties that, if A ∈ F and B ∈ F, then also A ∪ B ∈ F and A^c ∈ F, which can be shown to also imply that A ∩ B ∈ F.

As a classical example that explains the formal definitions, let us consider the experiment of throwing a fair die. The sample space consists of all possible outcomes: Ω = {1, 2, 3, 4, 5, 6}. The decomposition of the union A ∪ B into mutually exclusive events is immediately understood by drawing a Venn diagram as in Fig. 2.1.

Fig. 2.1 A Venn diagram illustrating the union A ∪ B.

Taking the probability measure of the union yields

Pr[A ∪ B] = Pr[(A ∩ B) ∪ (A^c ∩ B) ∪ (A ∩ B^c)]
          = Pr[A ∩ B] + Pr[A^c ∩ B] + Pr[A ∩ B^c]   (2.1)

where the last relation follows from Axiom 2. Figure 2.1 shows that A = (A ∩ B) ∪ (A ∩ B^c) and B = (A ∩ B) ∪ (A^c ∩ B). Since the events are mutually exclusive, Axiom 2 states that

Pr[A] = Pr[A ∩ B] + Pr[A ∩ B^c]
Pr[B] = Pr[A ∩ B] + Pr[A^c ∩ B]

Substitution into (2.1) yields the important relation

Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B]   (2.2)

Although derived for the measure Pr[·], relation (2.2) also holds for other measures, for example, the cardinality (the number of elements) of a set.
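Relation (2.2) can be checked directly on the fair-die sample space; the events A (an even outcome) and B (an outcome of at least 4) are example choices of ours, not from the text. Exact rational arithmetic keeps the check free of rounding.

```python
from fractions import Fraction

# Sample space of a fair die; each outcome has probability 1/6.
omega = {1, 2, 3, 4, 5, 6}

def pr(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}       # "even outcome"
B = {4, 5, 6}       # "outcome at least 4"

lhs = pr(A | B)                     # Pr[A ∪ B]
rhs = pr(A) + pr(B) - pr(A & B)     # Pr[A] + Pr[B] − Pr[A ∩ B]
# Both sides equal 2/3 for this choice of A and B.
```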


2.1.1 The inclusion-exclusion formula

A generalization of the relation (2.2) is the inclusion-exclusion formula,

Pr[∪_{k=1}^{n} A_k] = Σ_{k1=1}^{n} Pr[A_{k1}] − Σ_{k1=1}^{n} Σ_{k2=k1+1}^{n} Pr[A_{k1} ∩ A_{k2}]
  + Σ_{k1=1}^{n} Σ_{k2=k1+1}^{n} Σ_{k3=k2+1}^{n} Pr[A_{k1} ∩ A_{k2} ∩ A_{k3}]
  + · · · + (−1)^{n−1} Σ_{k1=1}^{n} Σ_{k2=k1+1}^{n} · · · Σ_{kn=k_{n−1}+1}^{n} Pr[∩_{j=1}^{n} A_{kj}]   (2.3)

The formula shows that the probability of the union consists of the sum of the probabilities of the individual events (first term). Since sample points can belong to more than one event A_k, the first term possesses double countings. The second term removes all probabilities of sample points that belong to precisely two event sets. However, by doing so (draw a Venn diagram), we also subtract the probabilities of sample points that belong to three event sets more than needed. The third term adds these again, and so on. The inclusion-exclusion formula can be written more compactly as (2.4).
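A quick numerical confirmation of the inclusion-exclusion formula: for arbitrary events in a finite sample space, the alternating sum over all non-empty intersections must equal the probability of the union computed directly. The sample space and the randomly drawn events below are illustrative choices of ours.

```python
from fractions import Fraction
from itertools import combinations
import random

def pr_union_direct(events, space):
    """Pr of the union, computed by forming the union itself."""
    union = set().union(*events)
    return Fraction(len(union), len(space))

def pr_union_inclusion_exclusion(events, space):
    """Evaluate the alternating sum of (2.3): for each non-empty
    subset of the events, add or subtract Pr of its intersection."""
    total = Fraction(0)
    for m in range(1, len(events) + 1):
        sign = (-1) ** (m - 1)
        for combo in combinations(events, m):
            inter = set.intersection(*combo)
            total += sign * Fraction(len(inter), len(space))
    return total

space = set(range(1, 37))   # e.g. the 36 outcomes of two dice, coded 1..36
rng = random.Random(7)
events = [set(rng.sample(sorted(space), 12)) for _ in range(4)]
```

Both evaluations agree exactly, since the arithmetic is done in rationals.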

Proof of the inclusion-exclusion formula(3): Let A = ∪_{k=1}^{n−1} A_k and B = A_n, such that relation (2.2) gives

Pr[∪_{k=1}^{n} A_k] = Pr[∪_{k=1}^{n−1} A_k] + Pr[A_n] − Pr[∪_{k=1}^{n−1} (A_k ∩ A_n)]   (2.7)

Similarly, in a next iteration, we use the same relation, after suitable modification, in the right-hand side of (2.7) to lower the upper index in the unions, producing terms such as Pr[∪_{k=1}^{n−2} (A_k ∩ A_{n−1})] and Pr[∪_{k=1}^{n−2} (A_k ∩ A_n ∩ A_{n−1})]. Iterating until only intersections of the events A_k remain and, after suitable grouping of the terms, substituting the lower-order expansions of (2.3) completes the induction.

(3) Another proof (Grimmett and Stirzaker, 2001, p. 56) uses the indicator function defined in Section 2.2.1. A useful indicator function relation is 1_{∪_{k=1}^{n} A_k} = 1 − Π_{k=1}^{n} (1 − 1_{A_k}). Multiplying out and taking the expectations using (2.13) leads to (2.3).

An application of the latter formula to multicast can be found in Chapter 17 and many others are in Feller (1970, Chapter IV). Sometimes it is useful to reason with the complement of the union, (∪_{k=1}^{n} A_k)^c = ∩_{k=1}^{n} A_k^c. Applying Axiom 2 to (∪_{k=1}^{n} A_k)^c ∪ (∪_{k=1}^{n} A_k) = Ω yields

Pr[(∪_{k=1}^{n} A_k)^c] = 1 − Pr[∪_{k=1}^{n} A_k]

and using Axiom 1 and the inclusion-exclusion formula (2.5), we obtain the corresponding expansion for Pr[∩_{k=1}^{n} A_k^c].


with the convention that S_0 = 1. Boole's inequalities follow from truncating the alternating series of the inclusion-exclusion formula.

The inclusion-exclusion formula is of a more general nature and also applies to other measures on sets than Pr[·], for example to the cardinality as mentioned above. For the cardinality of a set A, which is usually denoted by |A|, the inclusion-exclusion variant of (2.8) is

|(∪_{k=1}^{n} A_k)^c| = |Ω| − Σ_{k1=1}^{n} |A_{k1}| + Σ_{k1=1}^{n} Σ_{k2=k1+1}^{n} |A_{k1} ∩ A_{k2}| − · · · + (−1)^n |∩_{k=1}^{n} A_k|   (2.10)

A nice illustration of the above formula (2.10) applies to the sieve of Eratosthenes (Hardy and Wright, 1968, p. 4), a procedure to construct the table of prime numbers(4) up to N. Consider the increasing sequence of integers 2, 3, 4, ..., N and remove successively all multiples of 2 (even numbers starting from 4, 6, ...), all multiples of 3 (starting from 3² and not yet removed previously), all multiples of 5, all multiples of the next number larger than 5 and still in the list (which is the prime 7) and so on, up to all multiples of the largest possible prime divisor that is equal to or smaller than [√N]. Here [x] is the largest integer smaller than or equal to x. The remaining numbers in the list are prime numbers. Let us now compute the number of primes π(N) smaller than or equal to N by using the inclusion-exclusion formula (2.10).

(4) An integer number p is prime if p > 1 and p has no other integer divisors than 1 and itself. The sequence of the first primes is 2, 3, 5, 7, 11, 13, etc. If a and b are divisors of n, then n = ab, from which it follows that a and b cannot both exceed √n. Hence, any composite number n is divisible by a prime p that does not exceed √n.


The number of primes smaller than or equal to a real number x is π(x) and, evidently, if p_n denotes the n-th prime, then π(p_n) = n. Let A_k denote the set of the multiples of p_k in {2, 3, ..., N}. The largest prime used in the sieve of Eratosthenes is equal to the largest prime number p_n smaller than or equal to [√N], hence, n = π(√N). If t ∈ (∪_{k=1}^{n} A_k)^c, this means that t is not divisible by any prime number smaller than or equal to p_n and that t is a prime number lying in √N < t ≤ N. The cardinality of the set (∪_{k=1}^{n} A_k)^c, the number of primes in √N < t ≤ N, is

|(∪_{k=1}^{n} A_k)^c| = π(N) − π(√N)

On the other hand, if u ∈ ∩_{j=1}^{m} A_{kj} for 1 ≤ k1 < k2 < · · · < km ≤ n, then u is a multiple of p_{k1} p_{k2} · · · p_{km}, and the number of multiples of the integer p_{k1} p_{k2} · · · p_{km} in {2, 3, ..., N} is

[ N / (p_{k1} p_{k2} · · · p_{km}) ] = |∩_{j=1}^{m} A_{kj}|

Applying the inclusion-exclusion formula (2.10) with |Ω| = N − 1 yields

π(N) − π(√N) = N − 1 − Σ_{k1=1}^{n} [N/p_{k1}] + Σ_{k1=1}^{n} Σ_{k2=k1+1}^{n} [N/(p_{k1} p_{k2})] − · · · + (−1)^n [N/(p_1 p_2 · · · p_n)]

The knowledge of the prime numbers smaller than or equal to [√N], i.e. the first n = π(√N) primes, suffices to compute the number of primes π(N) smaller than or equal to N without explicitly knowing the primes t lying between √N and N.
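The count just derived is easy to verify in a few lines. The sketch below implements the alternating sum over products of the primes up to √N (products exceeding N contribute zero and are skipped) and checks it against a direct sieve; N = 100 is an illustrative choice.

```python
from itertools import combinations

def primes_upto(n):
    """Sieve of Eratosthenes, returning the primes <= n."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, n + 1, p):
                is_prime[q] = False
    return [p for p in range(2, n + 1) if is_prime[p]]

def pi_by_inclusion_exclusion(n):
    """pi(n) via pi(n) - pi(sqrt(n)) = (n - 1) - sum [n/p]
    + sum [n/(p q)] - ..., over the primes p <= sqrt(n)."""
    small = primes_upto(int(n ** 0.5))
    count = n - 1
    for m in range(1, len(small) + 1):
        for combo in combinations(small, m):
            prod = 1
            for p in combo:
                prod *= p
            if prod > n:
                continue          # [n/prod] = 0, nothing to add
            count += (-1) ** m * (n // prod)
    return count + len(small)     # add back pi(sqrt(n))

# pi(100) should equal 25.
```

The subset enumeration grows as 2^π(√N), so this is a demonstration of the identity rather than a practical prime-counting method; Legendre-style recursions organize the same sum far more efficiently.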


A discrete random variable X maps each outcome of the experiment onto a value x. The set of values x can be finite or countably infinite and constitutes the discrete probability space. The function Pr[X = x] satisfies:

(i) 0 ≤ Pr[X = x] ≤ 1
(ii) Σ_x Pr[X = x] = 1

In the classical example of throwing a die, the discrete probability space is {1, 2, ..., 6} and, since each outcome is equally possible, Pr[X = x] = 1/6 for each x.

The variance of X with mean μ = E[X] is defined as

Var[X] = E[(X − μ)²]   (2.15)


The variance is always non-negative. Using the linearity of the expectation operator and μ = E[X], we rewrite (2.15) as

Var[X] = E[X²] − μ²

Often the notation σ² = Var[X] is used. An interesting variational principle of the variance follows, for a variable u, from

E[(X − u)²] = E[(X − μ)²] + (u − μ)²

which is minimized at u = μ = E[X] with value Var[X]. Hence, the best least-squares approximation of the random variable X is the number E[X].
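The variational identity above can be checked numerically; the fair die is an example distribution of our choosing.

```python
# Check E[(X - u)^2] = Var[X] + (u - mu)^2 for a fair die.
outcomes = [1, 2, 3, 4, 5, 6]
mu = sum(outcomes) / 6                            # E[X] = 3.5
var = sum((x - mu) ** 2 for x in outcomes) / 6    # Var[X] = 35/12

def mean_square_error(u):
    """E[(X - u)^2] under the uniform die distribution."""
    return sum((x - u) ** 2 for x in outcomes) / 6

# The identity holds for every u, and u = mu gives the minimum.
gaps = [mean_square_error(u) - (var + (u - mu) ** 2)
        for u in (0.0, 2.0, 3.5, 10.0)]
```

Every gap is zero up to floating-point noise, and any u other than μ strictly increases the mean squared error, which is the least-squares statement in the text.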

2.2.2 The probability generating function

The probability generating function (pgf) of a discrete random variable X is defined, for complex z, as

φ_X(z) = E[z^X]   (2.16)

Concentrating on non-negative integer random variables X,

φ_X(z) = Σ_{k=0}^{∞} Pr[X = k] z^k   (2.17)

On the unit circle, z = e^{it} is used such that (2.17) expresses the Fourier series of φ_X(e^{it}). The importance of the pgf mainly lies in the fact that the theory of functions can be applied. Numerous examples of the power of analysis will be illustrated.


Beside the distribution itself, the knowledge of the pgf results in a complete alternative description. The derivatives at z = 0 return the probabilities,

Pr[X = k] = (1/k!) d^k φ_X(z)/dz^k |_{z=0}

while the derivatives at z = 1 return the factorial moments,

d^n φ_X(z)/dz^n |_{z=1} = E[X(X − 1) · · · (X − n + 1)]

such that

E[ (X choose n) ] = (1/n!) d^n φ_X(z)/dz^n |_{z=1}

In particular, E[X] = φ′_X(1) and Var[X] = φ″_X(1) + φ′_X(1) − (φ′_X(1))². These first few derivatives are interesting because they are related directly to probabilistic quantities.
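These relations can be verified exactly for a fair die, whose pgf is the polynomial φ(z) = (z + z² + ··· + z⁶)/6 (an example of our choosing). Representing φ as a coefficient list makes differentiation and evaluation a few lines of exact rational arithmetic.

```python
from fractions import Fraction

# pgf of a fair die: coef[k] = Pr[X = k], so phi(z) = sum coef[k] z^k.
coef = [Fraction(0)] + [Fraction(1, 6)] * 6

def derivative(c):
    """Coefficient list of the derivative polynomial."""
    return [k * c[k] for k in range(1, len(c))]

def evaluate(c, z):
    return sum(ck * z ** k for k, ck in enumerate(c))

phi1 = derivative(coef)    # phi'
phi2 = derivative(phi1)    # phi''

mean = evaluate(phi1, 1)                        # E[X] = phi'(1) = 7/2
var = evaluate(phi2, 1) + mean - mean ** 2      # phi''(1) + phi'(1) - phi'(1)^2
# Derivatives at z = 0 recover the probabilities: Pr[X = k] = phi^(k)(0)/k!.
```

Here the exact values E[X] = 7/2 and Var[X] = 35/12 come out of the derivative formulas with no approximation, which is the point of treating the pgf with the theory of functions.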


at the left, a < X. Hence, F_X(x) is not necessarily continuous at the left, which implies that F_X(x) may possess jumps. But even if F_X(x) is continuous, the pdf is not necessarily continuous(6).

The pdf of a continuous random variable X is defined as

f_X(x) = dF_X(x)/dx   (2.30)

(6) ... (p. 407) is another classical, noteworthy function with peculiar properties.


Assuming that F_X(x) is differentiable at x, from (2.29) we have, for small positive Δx,

Pr[x < X ≤ x + Δx] = F_X(x + Δx) − F_X(x) = (dF_X(x)/dx) Δx + O((Δx)²)

Using the definition (2.30) indicates that, if F_X(x) is differentiable at x, the probability of any individual point vanishes, such that

Pr[a < X ≤ b] = Pr[a ≤ X ≤ b] = Pr[a ≤ X < b] = Pr[a < X < b]

If f_X(x) is not finite, then F_X(x) is not differentiable at x, such that

lim_{Δx→0} ( F_X(x + Δx) − F_X(x) ) = ΔF_X(x) ≠ 0

This means that F_X(x) jumps upwards at x by ΔF_X(x). In that case, there is a probability mass with magnitude ΔF_X(x) at the point x. Although the second definition (2.31) is, strictly speaking, not valid in that case, one sometimes denotes the pdf at y = x by f_X(y) = ΔF_X(x) δ(y − x), where δ(x) is the Dirac impulse or delta function with the basic property that ∫_{−∞}^{+∞} δ(y − x) dy = 1. Even apart from the above-mentioned difficulties for certain classes of non-differentiable but continuous functions, the fact that probabilities are always confined to the region [0, 1] may suggest that 0 ≤ f_X(x) ≤ 1. However, the second definition (2.31) shows that f_X(x) can be much larger than 1. For example, if X is a Gaussian random variable with mean μ and variance σ² (see Section 3.2.3), then f_X(μ) = 1/(σ√(2π)) can be made arbitrarily large.

(7) In Lebesgue measure theory (Titchmarsh, 1964; Billingsley, 1995), it is said that a countable, finite or enumerable set (i.e. function evaluations at individual points) is measurable, but its measure is zero.


2.3.1 Transformation of random variables

It frequently appears useful to know how to compute F_Y(x) for Y = g(X). Only if the inverse function g^{−1} exists is the event {g(X) ≤ x} equivalent to the event {X ≤ g^{−1}(x)}. If g is increasing, then f_Y(y) dy = f_X(x) dx, while if g is decreasing, we find that f_Y(y) dy = −f_X(x) dx. Thus, if g^{−1} and g′ exist, the relation between the pdf of a well-behaved continuous random variable X and that of the transformed random variable Y = g(X) is

f_Y(y) = f_X(g^{−1}(y)) | d g^{−1}(y)/dy |
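The pdf transformation rule can be checked on a concrete example of our choosing: take X uniform on (0, 1), so f_X = 1, and the increasing transform g(x) = e^x, so g^{-1}(y) = log y and f_Y(y) = 1/y on (1, e). Integrating this f_Y over an interval must reproduce the probability computed from X directly.

```python
import math

def f_Y(y):
    """pdf of Y = exp(X), X uniform(0,1), via the transformation rule
    f_Y(y) = f_X(g^{-1}(y)) * |d g^{-1}(y)/dy|."""
    ginv = math.log(y)                       # g^{-1}(y)
    dginv = 1.0 / y                          # (g^{-1})'(y)
    f_X = 1.0 if 0.0 < ginv < 1.0 else 0.0   # uniform(0,1) density
    return f_X * dginv

def integral(a, b, steps=100000):
    """Midpoint-rule integral of f_Y over (a, b)."""
    h = (b - a) / steps
    return sum(f_Y(a + (i + 0.5) * h) for i in range(steps)) * h

# Pr[1 < Y <= 2] = Pr[X <= log 2] = log 2, and the integral of f_Y agrees.
prob = integral(1.0, 2.0)
```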

2.3.2 The expectation

Analogously to the discrete case, we define the expectation of a continuous random variable(8) as

E[X] = ∫_{−∞}^{∞} x f_X(x) dx

If X is a continuous random variable and g is a continuous function, then

(8) This requirement is borrowed from measure theory and Lebesgue integration (Titchmarsh, 1964, Chapter X; Royden, 1988, Chapter 4), where a measurable function f is said to be integrable (in the Lebesgue sense) over A if f⁺ = max(f(x), 0) and f⁻ = max(−f(x), 0) are both integrable over A. Although this restriction seems only of theoretical interest, in some applications (see the ...


Y = g(X) is also a continuous random variable, with expectation E[Y] equal to

E[Y] = ∫_{−∞}^{∞} g(x) f_X(x) dx

(9) ... (which is a standard exercise in contour integration), but this integral does not exist in the Lebesgue sense. Only for improper integrals (where the integration interval is infinite) may Riemann integration exist where Lebesgue integration does not. However, in most other cases (integration over a finite interval), Lebesgue integration is more general. For instance, if f(x) = 1_{x is rational}, then ∫₀¹ f(u) du does not exist in the Riemann sense (since upper and lower sums do not converge to each other). However, ∫₀¹ f(u) du = 0 in the Lebesgue sense (since the set on which f differs from 0, namely all rational numbers in [0, 1], has measure zero). In probability theory and measure theory, Lebesgue integration is assumed.


or the mean of a discrete random variable X expressed in tail probabilities,

E[X] = Σ_{k=0}^{∞} Pr[X > k]
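The tail-probability expression for the mean is easy to confirm for a small explicit distribution; the fair die below is an example of our choosing.

```python
from fractions import Fraction

# Check E[X] = sum_{k>=0} Pr[X > k] for a non-negative integer rv.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}   # fair die

mean_direct = sum(k * p for k, p in pmf.items())

def tail(k):
    """Pr[X > k]."""
    return sum(p for j, p in pmf.items() if j > k)

# Pr[X > k] vanishes for k >= 6, so the sum is finite here.
mean_tail = sum(tail(k) for k in range(0, 6))
```

Both computations give 7/2: the tails 1, 5/6, 4/6, 3/6, 2/6, 1/6 sum to 21/6.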

2.3.3 The probability generating function

The probability generating function (pgf) of a continuous random variable X is defined, for complex z, as the Laplace transform

φ_X(z) = E[e^{−zX}] = ∫_{0}^{∞} e^{−zx} f_X(x) dx

the continuous counterpart of E[z^X] (discrete). Since the exponential is an entire function(10) with power series around z = 0, e^{−zX} = Σ_{k=0}^{∞} ((−1)^k X^k / k!) z^k, the expectation and summation can be reversed, leading to

φ_X(z) = Σ_{k=0}^{∞} ((−1)^k E[X^k] / k!) z^k   (2.40)

The corresponding expansion in powers of (z − 1), with coefficients of the form Σ_{k=j}^{∞} (k choose j) Pr[X = k], is by (2.18) expressed in terms of probabilities of X.

(11) The Landau big-O notation specifies the "order of a function" when the argument tends to some limit. Most often the limit is infinity, but the O-notation can also be used to characterize the behavior of a function around some finite point. Formally, f(x) = O(g(x)) for x → ∞ means that there exist positive numbers c and x₀ for which |f(x)| ≤ c|g(x)| for x > x₀.

(12) The lognormal distribution defined by (3.43) is an example where the summation (2.40) diverges for any z ≠ 0.


from which L_X(0) = 0 because φ_X(0) = 1. Further, analogous to the discrete case, we see that L′_X(0) = −E[X]. However, the difference with the discrete case lies in the higher moments,

E[X^n] = (−1)^n d^n φ_X(z)/dz^n |_{z=0}

2.4 The conditional probability

The conditional probability of the event A given the event B (or on the hypothesis B) is defined as

Pr[A|B] = Pr[A ∩ B] / Pr[B]   (2.44)

The definition implicitly assumes that the event B has positive probability; otherwise the conditional probability remains undefined. We quote Feller (1970, p. 116):

Taking conditional probabilities of various events with respect to a particular hypothesis B amounts to choosing B as a new sample space with probabilities proportional to the original ones; the proportionality factor Pr[B] is necessary in order to reduce the total probability of the new sample space to unity. This formulation shows that all general theorems on probabilities are valid for conditional probabilities with respect to any particular hypothesis. For example, the law Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B] takes the form

Pr[A ∪ B|C] = Pr[A|C] + Pr[B|C] − Pr[A ∩ B|C]

The formula (2.44) is often rewritten in the form

Pr[A ∩ B] = Pr[A|B] Pr[B]   (2.45)

which easily generalizes to more events. For example, denote A = A₁ and B = A₂ ∩ A₃; then

Pr[A₁ ∩ A₂ ∩ A₃] = Pr[A₁|A₂ ∩ A₃] Pr[A₂ ∩ A₃]
                 = Pr[A₁|A₂ ∩ A₃] Pr[A₂|A₃] Pr[A₃]

Another application of the conditional probability occurs when a partitioning of the sample space Ω into events B_k is considered, where the B_k are mutually exclusive, which means that B_k ∩ B_j = ∅ for any k and j ≠ k. Then, with (2.45), Pr[A ∩ B_k] = Pr[A|B_k] Pr[B_k].

The event A_k = {A ∩ B_k} is a decomposition (or projection) of the event A onto the basis event B_k, analogous to the decomposition of a vector in terms of a set of orthogonal basis vectors that span the total state space. Indeed, using the associative property A ∩ {B ∩ C} = A ∩ B ∩ C and A ∩ A = A, the intersection A_k ∩ A_j = {A ∩ B_k} ∩ {A ∩ B_j} = A ∩ {B_k ∩ B_j} = ∅, which implies mutual exclusivity (or orthogonality). Using the distributive property A ∩ {B_k ∪ B_j} = {A ∩ B_k} ∪ {A ∩ B_j}, we observe that

A = A ∩ {∪_k B_k} = ∪_k {A ∩ B_k} = ∪_k A_k

Finally, since all events A_k are mutually exclusive, Pr[A] = Σ_k Pr[A_k] = Σ_k Pr[A ∩ B_k]. Thus, if Ω = ∪_k B_k and, in addition, for any pair j ≠ k it holds that B_k ∩ B_j = ∅, we have proved the law of total probability or decomposability,

Pr[A] = Σ_k Pr[A|B_k] Pr[B_k]   (2.46)

with the events B_k defined above. Using the definition (2.44) followed by (2.45),
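The product rule (2.45) and the law of total probability can be checked on the fair-die sample space; the event A and the partition below are example choices of ours.

```python
from fractions import Fraction

omega = set(range(1, 7))                  # fair-die sample space

def pr(e):
    return Fraction(len(e & omega), len(omega))

def pr_cond(a, b):
    """Pr[A|B] = Pr[A ∩ B] / Pr[B], per definition (2.44)."""
    return pr(a & b) / pr(b)

A = {2, 4, 6}                             # "even outcome"
partition = [{1, 2}, {3, 4}, {5, 6}]      # mutually exclusive B_k with union omega

# Law of total probability: Pr[A] = sum_k Pr[A|B_k] Pr[B_k].
total = sum(pr_cond(A, B) * pr(B) for B in partition)
```

Here each term contributes (1/2)(1/3), and the three terms sum to Pr[A] = 1/2, as (2.46) requires.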
