Random variables and probability distributions
developed much further than stating certain properties of P(·) and introducing the idea of conditional probability. This is because the model based on (S, F, P(·)) does not provide us with a flexible enough framework. The main purpose of this section is to change this probability space by mapping it into a much more flexible one using the concept of a random variable.
The basic idea underlying the construction of (S, F, P(·)) was to set up a framework for studying probabilities of events as a prelude to analysing problems involving uncertainty. The probability space was proposed as a formalisation of the concept of a random experiment ℰ. One facet of ℰ which can help us suggest a more flexible probability space is the fact that when the experiment is performed the outcome is often considered in relation to some quantifiable attribute, i.e. an attribute which can be represented by numbers. Real-world outcomes are more often than not expressed in numbers. It turns out that assigning numbers to qualitative outcomes makes possible a much more flexible formulation of probability theory. This suggests that if we could find a consistent way to assign numbers to outcomes we might be able to change (S, F, P(·)) to something more easily handled. The concept of a random variable is designed to do just that, without changing the underlying probabilistic structure of (S, F, P(·)).
4.1 The concept of a random variable
Fig. 4.1 illustrates the mathematical model (S, F, P(·)) for the coin-tossing example discussed in Chapter 3, with the σ-field of interest being F = {S, ∅, {(HH)}, {(TT)}, {(HH),(TT)}, {(TH),(HT)}, {(HT),(TH),(HH)}, {(HT),(TH),(TT)}}. The probability set function P(·) is defined on F and takes values in the interval [0, 1], i.e. P(·) assigns probabilities to the events in F. As can be seen, various combinations of the elementary events in S define the σ-field F (ensure that it is a σ-field!) and the probability set function P(·) assigns probabilities to the elements of F.
The main problem with the mathematical model (S, F, P(·)) is that the general nature of S and F, being defined as arbitrary sets, makes the mathematical manipulation of P(·) very difficult, its domain being a σ-field of arbitrary sets. For example, in order to define P(·) we will often have to derive all the elements of F and tabulate them (a daunting task for large or infinite F's), to say nothing of the differentiation or integration of such a set function.
Let us consider the possibility of defining a function X(·) which maps S directly into the real line R, that is, X(·): S → R. The question which arises is whether any such function provides us with a consistent way of attaching numbers to elementary events; consistent in the sense of preserving the event structure of the probability space (S, F, P(·)). The answer, unsurprisingly, is certainly not. This is because, although X is a function defined on S, probabilities are assigned to events in F, and the issue we have to face is how to define the values taken by X for the different elements of S in a way which preserves the event structure of F. In order to illustrate this let us return to the earlier example, with X defined as the number of heads. To each value of X, equal to 0, 1 and 2, there corresponds some subset of S, i.e. X⁻¹(0) = {(TT)}, X⁻¹(1) = {(HT),(TH)}, X⁻¹(2) = {(HH)};
the inverse mapping X⁻¹(·) preserves the set-theoretic operations, that is, it preserves unions, intersections and complements (e.g. X⁻¹(A ∪ B) = X⁻¹(A) ∪ X⁻¹(B)). In other words,
for each subset N of R, the inverse image X⁻¹(N) must be an event in F. Looking at X as defined above we can see that X⁻¹({0}) ∈ F, X⁻¹({1} ∪ {2}) ∈ F, that is, X(·) does indeed preserve the event structure of F. On the other hand, the function Y(·): S → R, defined by Y({HT}) = Y({HH}) = 1, Y({TH}) = Y({TT}) = 0, does not preserve the event structure of F, since Y⁻¹(0) ∉ F and Y⁻¹(1) ∉ F. This prompts us to define a random variable X to be any such function satisfying this event-preserving condition in relation to some σ-field defined on R; for generality we always take the Borel field B on R.
Three important features of this definition are worth emphasising:
(i) A random variable is always defined relative to some specific σ-field F.
(ii) In deciding whether some function Y(·): S → R is a random variable we proceed from the elements of the Borel field B to those of the σ-field F, and not the other way around.
(iii) A random variable is neither 'random' nor 'a variable'.
Let us consider these important features in some more detail in order to enhance our understanding of the concept of a random variable; undoubtedly the most important concept in the present book.
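The event-preserving condition can be made concrete for the coin-tossing example. The following is an illustrative sketch only (the data structures and function names are mine, not the text's): it checks by brute force whether a function from S to R is a random variable relative to the σ-field F of Fig. 4.1.

```python
# Illustrative sketch: checking the event-preserving condition for the
# coin-tossing example of Fig. 4.1.
S = {"HH", "HT", "TH", "TT"}

# The sigma-field F of Fig. 4.1, written as a set of frozensets.
F = {frozenset(e) for e in [
    S, set(),
    {"HH"}, {"TT"}, {"HH", "TT"}, {"TH", "HT"},
    {"HT", "TH", "HH"}, {"HT", "TH", "TT"},
]}

def is_random_variable(func):
    """True if the preimage of every value of func is an event in F; on a
    finite outcome set this settles the question for every Borel set of
    values, since F is closed under unions and complements."""
    return all(
        frozenset(s for s in S if func[s] == v) in F
        for v in set(func.values())
    )

X = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}  # X = number of heads
Y = {"HH": 1, "HT": 1, "TH": 0, "TT": 0}  # the Y of the text

print(is_random_variable(X))  # True
print(is_random_variable(Y))  # False: Y^{-1}(1) = {HH, HT} is not in F
```

The brute-force check is only feasible because S is finite; the point it illustrates is exactly feature (ii): we start from sets of values and ask whether their inverse images land in F.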
The question 'is X(·): S → R a random variable?' does not make any sense unless some σ-field F is also specified. In the case of the function X, the number of heads in the coin-tossing example, we see that it is a random variable relative to the σ-field F as defined in Fig. 4.1. On the other hand, Y, as defined above, is not a random variable relative to F. This, however, does not preclude Y from being a random variable with respect to some other σ-field F₁; for instance F₁ = {S, ∅, {(HH),(HT)}, {(TH),(TT)}}. Intuition suggests that for any real-valued function X(·): S → R we should be able to define a σ-field F₁ on S such that X is a random variable. In the previous section we considered the σ-field generated by some set of events C. Similarly, we can generate σ-fields by functions X(·): S → R which turn X(·) into a random variable. Indeed, F₁ above is the minimal σ-field generated by Y, denoted by σ(Y). The way to generate such a minimal σ-field is to start from the set of events of the inverse mapping Y⁻¹(·), i.e. {(HT),(HH)} = Y⁻¹(1) and {(TH),(TT)} = Y⁻¹(0), and generate a σ-field by taking unions, intersections and complements. In the same way we can see
that the minimal σ-field generated by X, the number of heads, σ(X), coincides with the σ-field F of Fig. 4.2; verify this assertion. In general, however, the σ-field F associated with S on which a random variable X is defined does not necessarily coincide with σ(X). Consider the function
The above example is a special case of an important general result, where X₁, X₂, …, Xₙ are random variables on the same probability space (S, F, P(·)) and we define the new random variables
Y₁ = X₁, Y₂ = X₁ + X₂, Y₃ = X₁ + X₂ + X₃, …, Yₙ = X₁ + X₂ + ⋯ + Xₙ.
If σ(Y₁), σ(Y₂), …, σ(Yₙ) denote the minimal σ-fields generated by Y₁, Y₂, …, Yₙ, respectively, we can show that
σ(Y₁) ⊂ σ(Y₂) ⊂ ⋯ ⊂ σ(Yₙ) ⊂ F,
i.e. σ(Y₁), …, σ(Yₙ) form an increasing sequence of σ-fields in F. In the above example we can see that if we define a new random variable X₂(·): S → R by
X₂({HH}) = 1, X₂({HT}) = X₂({TH}) = X₂({TT}) = 0,
then X = X₁ + X₂ (see Table 4.1) is also a random variable relative to σ(X); X is defined as the number of Hs (see Table 4.1).
Note that X₁ is defined as 'at least one H' and X₂ as 'two Hs'.
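The generation of minimal σ-fields, and the increasing-sequence result above, can be sketched computationally for a finite S. This is an illustrative construction of my own, not from the text: it closes the preimages of a function under complement and union, which on a finite outcome set yields the minimal σ-field.

```python
# Illustrative sketch: generating sigma(X) on a finite outcome set by
# closing the preimages of X under complement and (finite) union.
from itertools import combinations

S = frozenset({"HH", "HT", "TH", "TT"})

def preimages(func):
    """The sets X^{-1}(v) for each value v in the range of func."""
    return {frozenset(s for s in S if func[s] == v) for v in set(func.values())}

def generated_sigma_field(func):
    """Close the preimages under complement and union; intersections come
    for free via De Morgan.  On a finite S this is sigma(X)."""
    events = preimages(func) | {S, frozenset()}
    changed = True
    while changed:
        changed = False
        for a in list(events):
            if S - a not in events:
                events.add(S - a)
                changed = True
        for a, b in combinations(list(events), 2):
            if a | b not in events:
                events.add(a | b)
                changed = True
    return events

X1 = {"HH": 1, "HT": 1, "TH": 1, "TT": 0}   # 'at least one H'
X2 = {"HH": 1, "HT": 0, "TH": 0, "TT": 0}   # 'two Hs'
X  = {s: X1[s] + X2[s] for s in S}          # number of Hs

# sigma(X1) is a sub-sigma-field of sigma(X1 + X2): an increasing sequence.
assert generated_sigma_field(X1) <= generated_sigma_field(X)
print(len(generated_sigma_field(X1)), len(generated_sigma_field(X)))  # 4 8
```

Here σ(X₁) has the four events {∅, S, {TT}, {HH,HT,TH}}, while σ(X) is the eight-event σ-field F of the example, illustrating σ(Y₁) ⊂ σ(Y₂).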
The above concept of σ-fields generated by random variables will prove very useful in the discussion of conditional expectation and martingales (see Chapters 7 and 8). The concept of a σ-field generated by a random variable enables us to concentrate on particular aspects of an experiment without having to consider everything associated with the experiment at the same time. Hence, when we choose to define a r.v. and the associated σ-field we make an implicit choice about the features of the random experiment we are interested in.
How do we decide that some function X(·): S → R is a random variable relative to a given σ-field F? From the above discussion of the concept of a
random variable it seems that if we want to decide whether a function X is a random variable with respect to F we have to consider the Borel field B on R, or at least the Borel field Bₓ on Rₓ; a daunting task. It turns out, however, that this is not necessary. From the discussion of the σ-field σ(J) generated by the set J = {Bₓ: x ∈ R}, where Bₓ = (−∞, x], we know that B = σ(J), and if X(·) is such that
X⁻¹((−∞, x]) = {s: X(s) ∈ (−∞, x], s ∈ S} ∈ F for all (−∞, x] ∈ J, (4.6)
then X(·) is a random variable with respect to F.
In other words, when we want to establish that X is a random variable, or define Pₓ(·), we have to look no further than the half-closed intervals (−∞, x] and the σ-field σ(J) they generate, whatever the range Rₓ. Let us use the shorthand notation {X(s) ≤ x} instead of {s: X(s) ∈ (−∞, x], s ∈ S} to consider the above argument in the case of X, the number of Hs, with respect to F in Fig. 4.2.
we can see that X⁻¹((−∞, x]) ∈ F for all x ∈ R, and thus X(·) is a random variable with respect to F. On the other hand, for Y as defined above, Y⁻¹((−∞, 0]) = {(TH),(TT)} ∉ F, and thus Y is not a random variable with respect to F.
The term random variable is rather unfortunate because, as can be seen from the above definition, X is neither 'random' nor a 'variable'; it is a real-valued function, and the notion of probability does not enter its definition. Probability enters the picture after the random variable has been defined, in an attempt to complete the mathematical model induced by X.
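Criterion (4.6) can also be sketched computationally for the coin-tossing example. This is illustrative code of my own (not the author's): on a finite range only the cut points at the values taken by the function matter, so a handful of half-closed intervals settle the question.

```python
# Illustrative sketch of criterion (4.6): only the preimages of the
# half-closed intervals (-inf, x] need to lie in F, and on a finite
# range only the cut points at the values of the function matter.
F = {frozenset(e) for e in [
    {"HH", "HT", "TH", "TT"}, set(),
    {"HH"}, {"TT"}, {"HH", "TT"}, {"TH", "HT"},
    {"HT", "TH", "HH"}, {"HT", "TH", "TT"},
]}

def half_interval_preimage(func, x):
    """{s : func(s) <= x}, i.e. the inverse image of (-inf, x]."""
    return frozenset(s for s, v in func.items() if v <= x)

X = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}  # number of heads
Y = {"HH": 1, "HT": 1, "TH": 0, "TT": 0}

print(all(half_interval_preimage(X, x) in F for x in (0, 1, 2)))  # True
print(half_interval_preimage(Y, 0) in F)  # False: {TH, TT} is not in F
```

Between consecutive values of X the preimage does not change, so checking x = 0, 1, 2 covers every half-closed interval, in line with the shorthand {X(s) ≤ x} above.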
be consistent with the probabilities assigned to the corresponding events in F. Formally, we need to define a set function Pₓ(·): B → [0, 1] such that
Pₓ(B) = P(X⁻¹(B)) = P(s: X(s) ∈ B, s ∈ S) for all B ∈ B. (4.10)
For example, in the case illustrated in Table 4.1,
Pₓ({0}) = ¼, Pₓ({1}) = ½, Pₓ({2}) = ¼, Pₓ({0} ∪ {1} ∪ {2}) = 1, Pₓ(∅) = 0, etc.
The question which arises is whether, in order to define the set function Pₓ(·), we need to consider all the elements of the Borel field B. The answer is that we do not need to do that because, as argued above, any such element of B can be expressed in terms of the semi-closed intervals (−∞, x]. This implies that by choosing such semi-closed intervals 'intelligently', we can define Pₓ(·) with the minimum of effort. For example, Pₓ(·) for X, as defined in Table 4.1, can be defined as follows:
Pₓ((−∞, 0]) = ¼, Pₓ((−∞, 1]) = ¾, Pₓ((−∞, 2]) = 1.
As we can see, the semi-closed intervals were chosen to divide the real line at the points corresponding to the values taken by X. This way of defining the semi-closed intervals is clearly non-unique, but it will prove very convenient in the next section.
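The definition of Pₓ(·) on the chosen semi-closed intervals can be sketched as follows (a fair coin is assumed for the numbers; the code is an illustration, not the book's):

```python
# Illustrative sketch: P_X((-inf, x]) obtained from P on S via the
# inverse image of X, for the two-toss fair-coin example.
P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}  # P on outcomes
X = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}              # number of heads

def P_X(x):
    """P_X((-inf, x]) = P({s : X(s) <= x})."""
    return sum(p for s, p in P.items() if X[s] <= x)

# Dividing the real line at the values 0, 1, 2 taken by X:
print(P_X(0), P_X(1), P_X(2))  # 0.25 0.75 1.0
```

The three printed values are exactly the Pₓ((−∞, 0]), Pₓ((−∞, 1]), Pₓ((−∞, 2]) of the example above.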
The discerning reader will have noted that since we introduced the concept of a random variable X(·) on (S, F, P(·)) we have in effect
developed an alternative but equivalent probability space (R, B, Pₓ(·)) induced by X. The event and probability structure of (S, F, P(·)) is preserved in the induced probability space (R, B, Pₓ(·)), and the latter has a much 'easier to handle' mathematical structure; we traded S, a set of arbitrary elements, for R, the real line; F, a σ-field of subsets of S, for B, the Borel field on the real line; and P(·), a set function defined on arbitrary sets, for Pₓ(·), a set function on semi-closed intervals of the real line. In order to illustrate the transition from the probability space (S, F, P(·)) to (R, B, Pₓ(·)) let us return to Fig. 4.1 and consider the probability space induced by the random variable X, the number of heads, defined above. As can be seen from Fig. 4.3, the random variable X(·) maps S into {0, 1, 2}. Choosing the semi-closed intervals (−∞, 0], (−∞, 1], (−∞, 2] we can generate a Borel field on R which forms the domain of Pₓ(·). The concept of a random variable enables us to assign numbers to arbitrary elements of a set (S), and we choose to assign semi-closed intervals to events in F as induced by X. By defining Pₓ(·) over these semi-closed intervals we complete the procedure of assigning probabilities, which is consistent with the one used in Fig. 4.1. The important advantage of the latter procedure is that the mathematical structure of the probability space (R, B, Pₓ(·)) is a lot more flexible as a framework for developing a probability model. The purpose of what follows in this part of the book is to develop such a flexible mathematical framework. It must be stressed, however, that the original probability space (S, F, P(·)) has a role to play in the new mathematical framework, both as a reference point and as the basis of the probability model we propose to build. Any new concept to be introduced has to be related to (S, F, P(·)) to ensure that it makes sense in its context.
Fig. 4.3 The change from (S, F, P(·)) to (R, B, Pₓ(·)) induced by X.
4.2 The distribution and density functions
In the previous section the introduction of the concept of a random variable (r.v.), X, enabled us to trade the probability space (S, F, P(·)) for (R, B, Pₓ(·)), which has a much more convenient mathematical structure. The latter probability space, however, is not as yet simple enough, because Pₓ(·) is still a set function, albeit on real-line intervals. In order to simplify it we need to transform it into a point function (a function from a point to a point), with which we are so familiar.
The first step in transforming Pₓ(·) into a point function comes in the form of the result discussed in the previous section, that Pₓ(·) need only be defined on semi-closed intervals (−∞, x], x ∈ R, because the Borel field B can be viewed as the minimal σ-field generated by such intervals. With this in mind we can proceed to argue that, in view of the fact that all such intervals have a common starting point (−∞), we could conceivably define a point function
F(x) = Pₓ((−∞, x]), for all x ∈ R,
which is, seemingly, only a function of x. In effect, however, this function will do exactly the same job as Pₓ(·). Heuristically, this is achieved by defining F(·) as a point function by
Pₓ((−∞, x]) = F(x) − F(−∞), for all x ∈ R, (4.13)
and assigning the value zero to F(−∞). Moreover, given that as x increases the interval it implicitly represents becomes bigger, we need to ensure that F(x) is a non-decreasing function with one being its maximum value (i.e. F(x₁) ≤ F(x₂) if x₁ < x₂, and lim_{x→∞} F(x) = 1). For mathematical reasons we also require F(·) to be continuous from the right.
(iii) F(x) is continuous from the right (i.e. lim_{h↓0} F(x + h) = F(x), for all x ∈ R). (4.17)
It can be shown (see Chung (1974)) that this defines a unique point function for every set function Pₓ(·).
The great advantage of F(·) over P(·) and Pₓ(·) is that the former is a point function and can be represented in the form of an algebraic formula: the kind of function we are so familiar with from elementary mathematics. This will provide us with a very convenient way of attributing probabilities to events.
Fig. 4.4 represents the graph of the DF of the r.v. X in the coin-tossing example discussed in the previous section, illustrating its properties in the case of a discrete r.v., X, the number of Hs.
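The step-function DF of Fig. 4.4 can be written down directly. The sketch below assumes the fair-coin probabilities of the example; it is illustrative code, not the book's:

```python
# Illustrative sketch: the DF of the discrete r.v. X = number of heads
# in two tosses of a fair coin -- the step function of Fig. 4.4.
PROBS = {0: 0.25, 1: 0.5, 2: 0.25}

def F(x):
    """F(x) = P_X((-inf, x]): sum the point probabilities up to x."""
    return sum(p for v, p in PROBS.items() if v <= x)

# Non-decreasing, 0 below the smallest value, 1 above the largest,
# and constant between the jump points:
print(F(-1), F(0), F(0.5), F(1), F(2))  # 0 0.25 0.25 0.75 1.0
```

The jumps at 0, 1, 2 have heights ¼, ½, ¼; between the jump points F is flat, and it is continuous from the right at each jump, as required by (4.17).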
Definition 3
A random variable X is called discrete if its range Rₓ is some subset of the set of integers Z = {0, ±1, ±2, …}.
In this book we shall restrict ourselves to only two types of random variables, namely, discrete and (absolutely) continuous.
Definition 4
A random variable X is called (absolutely) continuous if its distribution function F(x) is continuous for all x ∈ R and there exists a non-negative function f(·) on the real line such that
F(x) = ∫_{−∞}^{x} f(u) du, for all x ∈ R.
It must be stressed that, for X to be continuous, it is not enough for the distribution function F(x) to be continuous. The above definition postulates that F(x) must also be derivable by integrating some non-negative function f(x). So far the examples used to illustrate the various concepts referred to discrete random variables. From now on, however, emphasis will be placed almost exclusively on continuous random variables. The reason for this is that continuous random variables (r.v.'s) are susceptible to a more flexible mathematical treatment than discrete r.v.'s, and this helps in the construction of probability models and facilitates the mathematical and statistical analysis.
In defining the concept of a continuous r.v. we introduced the function f(x), which is directly related to F(x); at every point where F(x) is differentiable,
f(x) = dF(x)/dx,
and f(·) is said to be the (probability) density function (pdf) of X.
In the coin-tossing example, f(0) = ¼, f(1) = ½, and f(2) = ¼ (see Fig. 4.5). In order to compare the F(x) and f(x) of a discrete r.v. with those of a continuous r.v., let us consider the case where X takes values in the interval [a, b] and all values of X are attributed the same probability; we express this by saying that X is uniformly distributed on the interval [a, b], and we write X ~ U(a, b). The DF of X takes the form
F(x) = 0 for x < a, F(x) = (x − a)/(b − a) for a ≤ x ≤ b, F(x) = 1 for x > b.
Comparing Figs. 4.4 and 4.5 with 4.6 and 4.7, we can see that in the case of a discrete random variable the DF is a step function and the density function attributes probabilities at discrete points. On the other hand, for a continuous r.v. the density function cannot be interpreted as attributing probabilities because, by definition, if X is a continuous r.v., P(X = x) = 0 for all x ∈ R. This can be seen from the definition of f(x) at every continuity point of F(x).
we gain in simplicity and added intuition. It enhances intuition to view density functions as distributing probability mass over the range of X. The density function satisfies the following properties:
(i) f(x) ≥ 0, for all x ∈ R;
(ii) ∫_{−∞}^{∞} f(x) dx = 1;
(iii) F(b) − F(a) = ∫_{a}^{b} f(x) dx. (4.28)
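These properties can be checked numerically for the uniform density of the previous example. The sketch below is illustrative only; the choices a = 0, b = 2 are mine:

```python
# Illustrative sketch: checking the density properties for the uniform
# density f(x) = 1/(b - a) on [a, b], with a = 0, b = 2 assumed.
a, b = 0.0, 2.0

def f(x):
    """Uniform density: non-negative everywhere (property (i))."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def F(x):
    """Closed-form DF of U(a, b)."""
    return min(max((x - a) / (b - a), 0.0), 1.0)

# (ii) the density integrates to one (left Riemann sum over [a, b]):
n = 100_000
h = (b - a) / n
print(round(sum(f(a + i * h) for i in range(n)) * h, 6))  # 1.0

# (iii) interval probabilities as differences of the DF:
print(F(1.5) - F(0.5))  # 0.5
```

The same checks for a discrete r.v. would replace the Riemann sum by a plain sum over the points of the range, as noted in the text.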
Properties (ii) and (iii) can be translated for discrete r.v.'s by substituting 'Σ' for '∫ … dx'. It must be noted that a continuous r.v. is not simply one with a continuous DF F(·); continuity refers to the condition that also requires the existence of a non-negative function f(·) such that F(x) = ∫_{−∞}^{x} f(u) du.
4.3 The notion of a probability model
Let us summarise the discussion so far in order to put it in perspective. The axiomatic approach to probability, formalising the concept of a random experiment ℰ, proposed the probability space (S, F, P(·)), where S represents the set of all possible outcomes, F is the set of events, and P(·) assigns probabilities to events in F. The uncertainty relating to the outcome of a particular performance of ℰ is formalised in P(·). The concept of a random variable X enabled us to map S into the real line R and construct an equivalent probability space induced by X, (R, B, Pₓ(·)), which has a much 'easier to handle' mathematical structure, being defined on the real line. Although Pₓ(·) is simpler than P(·), it is still a set function, albeit on the Borel field B. Using the idea of σ-fields generated by particular sets of events we defined Pₓ(·) on semi-closed intervals of the form (−∞, x] and managed to define the point function F(·), the three being related by
P(s: X(s) ∈ (−∞, x], s ∈ S) = Pₓ((−∞, x]) = F(x). (4.30)
The distribution function F(x) was simplified even further by introducing the density function f(x) via F(x) = ∫_{−∞}^{x} f(u) du. This introduced further flexibility into the probability model because f(x) is definable in closed algebraic form. This enables us to transform the original uncertainty related to ℰ to uncertainty related to unknown parameters θ of f(·); in order to emphasise this we write the pdf as f(x; θ). We are now in a position to define our probability model in the form of a parametric family of density functions, which we denote by
Φ = {f(x; θ), θ ∈ Θ}.
Φ represents a set of density functions indexed by the unknown parameter(s) θ, which are assumed to belong to a parameter space Θ (usually a subset of the real line). In order to illustrate these concepts let us consider an example
Fig. 4.8 The density function of a Pareto distributed random variable for different values of the parameter θ.
of a parametric family of density functions, the Pareto distribution:
Φ = {f(x; θ) = θ x₀^θ x^{−(θ+1)}, x ≥ x₀, θ ∈ Θ}.
The shape of f(x; θ) for different values of θ can be seen from Fig. 4.8.
When such a probability model is postulated it is intended as a description of the chance mechanism generating the observed data. For example, the model in Fig. 4.8 is commonly postulated in modelling personal incomes exceeding a certain level x₀. If we compare the above graph with the histogram of personal income data in Chapter 2 for incomes over £4500, we can see that postulating a Pareto probability density seems to be a reasonable model. In practice there are numerous such parametric families of densities we can choose from, some of which will be considered in the next section. The choice of one such family, when modelling a particular real phenomenon, is usually determined by previous experience in modelling similar phenomena or by a preliminary study of the data.
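A parametric family such as the Pareto can be coded as one function of x indexed by θ. The sketch below is illustrative; the threshold x₀ = 1 and the θ values are my choices, and the standard Pareto pdf and DF are used:

```python
# Illustrative sketch: the Pareto family Phi = {f(x; theta)}, theta > 0,
# with threshold x0 = 1 assumed.  Each theta picks out one density in Phi.
x0 = 1.0

def pareto_pdf(x, theta):
    """f(x; theta) = theta * x0**theta / x**(theta + 1) for x >= x0."""
    return theta * x0**theta / x**(theta + 1) if x >= x0 else 0.0

def pareto_cdf(x, theta):
    """F(x; theta) = 1 - (x0 / x)**theta for x >= x0."""
    return 1.0 - (x0 / x)**theta if x >= x0 else 0.0

# Larger theta concentrates the probability mass nearer the threshold x0,
# as Fig. 4.8 illustrates:
for theta in (1.0, 2.0, 4.0):
    print(theta, pareto_pdf(x0, theta), round(pareto_cdf(2.0, theta), 4))
```

Estimating θ from observed incomes then amounts to choosing one member of Φ, which is exactly the sense in which the uncertainty about ℰ is transferred to the unknown parameter.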
When a particular parametric family of densities Φ is chosen as the appropriate probability model for modelling a real phenomenon, we are in effect assuming that the observed data available were generated by the 'chance mechanism' described by one of the densities in Φ. The original uncertainty relating to the outcome of a particular trial of the experiment