CHAPTER 5
Random vectors and their distributions
The probability model formulated in the previous chapter was in the form of a parametric family of densities associated with a random variable (r.v.) X:

$\Phi = \{f(x; \theta), \ \theta \in \Theta\}.$

In practice, however, there are many observable phenomena where the outcome comes in the form of several quantitative attributes. For example, data on personal income might be related to number of children, social class, type of occupation, age class, etc. In order to be able to model such real phenomena we need to extend the above framework for a single r.v. to one for multidimensional r.v.'s or random vectors, that is,
$X = (X_1, X_2, \ldots, X_n),$
where each $X_i$, $i = 1, 2, \ldots, n$, measures a particular quantifiable attribute of the random experiment's ($\mathcal{E}$) outcomes.
For expositional purposes we shall restrict attention to the two-dimensional (bivariate) case, which is quite adequate for a proper understanding of the concepts involved, giving only scanty references to the n-dimensional random vector case (just for notational purposes). In the next section we consider the concept of a random vector and its joint distribution and density functions in direct analogy to the random variable case. In Sections 5.3 and 5.4 we consider two very important forms of the joint density function, the marginal and conditional densities respectively. These forms of the joint density function will play a very important role in Part IV.
5.1 Joint distribution and density functions
Consider the random experiment $\mathcal{E}$ of tossing a fair coin twice. The sample
space takes the form $S = \{(HT), (TH), (HH), (TT)\}$. Define the function
$X_1(\cdot)$ to be the number of 'heads' and $X_2(\cdot)$ to be the number of 'tails'. Both of these functions map $S$ into the real line $\mathbb{R}$ in the form
$(X_1(HT), X_2(HT)) = (1, 1),$
$(X_1(TH), X_2(TH)) = (1, 1),$
$(X_1(HH), X_2(HH)) = (2, 0),$
$(X_1(TT), X_2(TT)) = (0, 2).$
This is shown in Fig. 5.1. The function $(X_1(\cdot), X_2(\cdot)): S \to \mathbb{R}^2$ is a two-dimensional vector function which assigns to each element $s$ of $S$ the pair of ordered numbers $(x_1, x_2)$, where $x_1 = X_1(s)$, $x_2 = X_2(s)$. As in the one-dimensional case, for the vector function to define a random vector it has to satisfy certain conditions which ensure that the probabilistic and event structure of $(S, \mathcal{F}, P(\cdot))$ is preserved. In direct analogy with the single variable case we say that the mapping
$X(\cdot) = (X_1(\cdot), X_2(\cdot)): S \to \mathbb{R}^2 \qquad (5.1)$

defines a random vector if for each event in the Borel field product $\mathcal{B} \times \mathcal{B} = \mathcal{B}^2$, say $B = B_1 \times B_2$, the event defined by

$X^{-1}(B) = \{s: X_1(s) \in B_1,\ X_2(s) \in B_2,\ s \in S\} \qquad (5.2)$

belongs to $\mathcal{F}$.
Fig. 5.1 A bivariate random vector $X(\cdot) = (X_1(\cdot), X_2(\cdot))$.
Extending the result that $\mathcal{B}$ can be profitably seen as being the $\sigma$-field generated by half-closed intervals of the form $(-\infty, x]$ to the case of the direct product $\mathcal{B} \times \mathcal{B}$, we can show that the random vector $X(\cdot)$ satisfying

$X^{-1}((-\infty, \mathbf{x}]) \in \mathcal{F}$ for all $\mathbf{x} \in \mathbb{R}^2$

implies

$X^{-1}(B) \in \mathcal{F}$ for all $B \in \mathcal{B}^2. \qquad (5.3)$
This allows us to define a random vector as follows:
Definition 1
A random vector $X(\cdot): S \to \mathbb{R}^2$ is a vector function such that for any two real numbers $(x_1, x_2) = \mathbf{x}$, the event

$X^{-1}((-\infty, \mathbf{x}]) = \{s: -\infty < X_1(s) \le x_1,\ -\infty < X_2(s) \le x_2,\ s \in S\} \in \mathcal{F}.$
Note. $(-\infty, \mathbf{x}] = (-\infty, x_1] \times (-\infty, x_2]$ represents an infinite rectangle (see Fig. 5.2). The random vector (as in the case of a single random variable) induces a probability space $(\mathbb{R}^2, \mathcal{B}^2, P_X(\cdot))$, where $\mathcal{B}^2$ are the Borel subsets of the plane and $P_X(\cdot)$ is a probability set function defined over events in $\mathcal{B}^2$, in a way which preserves the probability structure of the original probability
Fig. 5.2 The infinite rectangle $(-\infty, \mathbf{x}^*]$, $\mathbf{x}^* = (x_1^*, x_2^*)$.
space $(S, \mathcal{F}, P(\cdot))$. This is achieved by attributing to each $B \in \mathcal{B}^2$ the probability

$P_X(B) = P(X^{-1}(B)) = P(\{s: X(s) \in B\}), \quad B \in \mathcal{B}^2.$

This enables us to reduce $P_X(\cdot)$ to a point function $F(x_1, x_2)$, which we call the joint (cumulative) distribution function.
Definition 2
Let $X = (X_1, X_2)$ be a random vector defined on $(S, \mathcal{F}, P(\cdot))$. The function $F(\cdot, \cdot): \mathbb{R}^2 \to [0, 1]$ defined by

$F(\mathbf{x}) = F(x_1, x_2) = P_X((-\infty, \mathbf{x}]) = P(X_1 \le x_1, X_2 \le x_2) = \Pr(X \le \mathbf{x})$

is said to be the joint distribution function of $X$.
In the coin-tossing example above, the random vector $X(\cdot)$ takes the values $(1, 1)$, $(2, 0)$, $(0, 2)$ with probabilities $\frac{1}{2}$, $\frac{1}{4}$ and $\frac{1}{4}$ respectively. In order to derive the joint distribution function (DF) we have to define all the events of the form $\{s: X_1(s) \le x_1,\ X_2(s) \le x_2,\ s \in S\}$ for all $(x_1, x_2) \in \mathbb{R}^2$:
$\{s: X_1(s) \le x_1,\ X_2(s) \le x_2,\ s \in S\} = \begin{cases} \emptyset, & x_1 < 0 \text{ or } x_2 < 0, \\ \{(HT),(TH)\}, & 0 \le x_1 < 2,\ 0 \le x_2 < 2, \\ \{(HT),(TH),(HH)\}, & x_1 \ge 2,\ 0 \le x_2 < 2, \\ \{(HT),(TH),(TT)\}, & 0 \le x_1 < 2,\ x_2 \ge 2, \\ S, & x_1 \ge 2,\ x_2 \ge 2. \end{cases}$
Note the degree of arbitrariness in choosing the infinite rectangles $(-\infty, \mathbf{x}]$. The joint DF of $X_1$ and $X_2$ is given by
$F(x_1, x_2) = \begin{cases} 0, & x_1 < 0 \text{ or } x_2 < 0, \\ \frac{1}{2}, & 0 \le x_1 < 2,\ 0 \le x_2 < 2, \\ \frac{3}{4}, & x_1 \ge 2,\ 0 \le x_2 < 2, \\ \frac{3}{4}, & 0 \le x_1 < 2,\ x_2 \ge 2, \\ 1, & x_1 \ge 2,\ x_2 \ge 2. \end{cases}$
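To make the derivation concrete, here is a minimal Python sketch (assuming nothing beyond the fair-coin setup above) which enumerates the sample space, builds the joint density of $(X_1, X_2)$, and evaluates the joint DF by direct summation:

```python
from itertools import product
from fractions import Fraction

# Sample space of two fair coin tosses; each outcome has probability 1/4.
S = list(product("HT", repeat=2))
P = Fraction(1, len(S))

def X(s):
    # The random vector X(s) = (X1(s), X2(s)) = (no. of heads, no. of tails).
    return (s.count("H"), s.count("T"))

# Joint density: f(x1, x2) = P(X1 = x1, X2 = x2).
f = {}
for s in S:
    f[X(s)] = f.get(X(s), Fraction(0)) + P

def F(x1, x2):
    # Joint DF: F(x1, x2) = P(X1 <= x1, X2 <= x2).
    return sum(p for (a, b), p in f.items() if a <= x1 and b <= x2)

print(f[(1, 1)], f[(2, 0)], f[(0, 2)])   # 1/2 1/4 1/4
print(F(1, 1), F(2, 1), F(2, 2))         # 1/2 3/4 1
```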
Table 5.1 Joint density function of $(X_1, X_2)$

             x1 = 0    x1 = 1    x1 = 2
    x2 = 0      0         0        1/4
    x2 = 1      0        1/2        0
    x2 = 2     1/4        0         0
From the definition of the joint DF we can deduce that $F(x_1, x_2)$ is a monotone non-decreasing function in each variable separately, and

$\lim_{x_1 \to -\infty,\, x_2 \to -\infty} F(x_1, x_2) = 0, \qquad \lim_{x_1 \to \infty,\, x_2 \to \infty} F(x_1, x_2) = 1.$
As in the case of one r.v., we concentrate exclusively on discrete and continuous joint DFs; singular distributions are not considered.
Definition 3
The joint DF of $X_1$ and $X_2$ is called a discrete distribution if there exists a density function $f(\cdot, \cdot)$ such that

$f(x_1, x_2) \ge 0, \quad (x_1, x_2) \in \mathbb{R}^2, \qquad (5.10)$

and it takes the value zero everywhere except at a finite or countably infinite set of points in the plane, with

$\sum_{x_1} \sum_{x_2} f(x_1, x_2) = 1. \qquad (5.11)$
In the coin-tossing example the density function is represented in rectangular array form in Table 5.1. Fig. 5.3 represents the graph of the joint density function of $X = (X_1, X_2)$. The joint DF is obtained from $f(x_1, x_2)$ via the relation

$F(x_1, x_2) = \sum_{u \le x_1} \sum_{v \le x_2} f(u, v). \qquad (5.12)$
Definition 4
The joint DF of $X_1$ and $X_2$ is called (absolutely) continuous if there exists a non-negative function $f(x_1, x_2)$ such that

$F(x_1, x_2) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} f(u, v)\, \mathrm{d}v\, \mathrm{d}u. \qquad (5.13)$
Fig. 5.3 The bivariate density function of Table 5.1.
$f(x_1, x_2)$ is called the joint (probability) density function of $X_1, X_2$. This definition implies the following properties for $f(x_1, x_2)$:

(F1) $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x_1, x_2)\, \mathrm{d}x_1\, \mathrm{d}x_2 = 1; \qquad (5.14)$

(F2) $\dfrac{\partial^2 F(x_1, x_2)}{\partial x_1\, \partial x_2} = f(x_1, x_2) \qquad (5.15)$

if $f(\cdot, \cdot)$ is continuous at $(x_1, x_2)$.
5.2 Some bivariate distributions
(1) Bivariate normal distribution
$f(x_1, x_2) = \dfrac{1}{2\pi \sigma_1 \sigma_2 (1 - \rho^2)^{1/2}} \exp\left\{ -\dfrac{1}{2(1 - \rho^2)} \left[ \left(\dfrac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\dfrac{x_1 - \mu_1}{\sigma_1}\right)\left(\dfrac{x_2 - \mu_2}{\sigma_2}\right) + \left(\dfrac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\},$

$\theta = (\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho) \in \mathbb{R}^2 \times \mathbb{R}_+^2 \times [-1, 1].$
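As an illustration (a sketch only, with arbitrary parameter values), the density above can be coded directly and checked against scipy's multivariate_normal, whose covariance matrix has $\sigma_1^2$, $\sigma_2^2$ on the diagonal and $\rho\sigma_1\sigma_2$ off it:

```python
import numpy as np
from scipy.stats import multivariate_normal

def biv_normal_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    # Direct transcription of the bivariate normal density above.
    z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
    q = z1**2 - 2 * rho * z1 * z2 + z2**2
    return np.exp(-q / (2 * (1 - rho**2))) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

mu1, mu2, s1, s2, rho = 0.0, 0.0, 1.0, 1.0, 0.5   # illustrative values
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
rv = multivariate_normal(mean=[mu1, mu2], cov=cov)
print(biv_normal_pdf(0.3, -0.7, mu1, mu2, s1, s2, rho))  # agrees with ...
print(rv.pdf([0.3, -0.7]))                               # ... the library value
```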
Fig. 5.4 The density function of a standard bivariate normal distribution.
It is interesting to note that the expression inside the square brackets, when set equal to a constant $c$,

$\left(\dfrac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\dfrac{x_1 - \mu_1}{\sigma_1}\right)\left(\dfrac{x_2 - \mu_2}{\sigma_2}\right) + \left(\dfrac{x_2 - \mu_2}{\sigma_2}\right)^2 = c,$

defines a sequence of ellipses of points with equal probability, which can be viewed as map-like contours of the graph of $f(x_1, x_2)$ represented in Fig. 5.4.
(2) Bivariate Pareto distribution
$f(x_1, x_2) = \lambda(\lambda + 1)(a_1 a_2)^{\lambda + 1} (a_2 x_1 + a_1 x_2 - a_1 a_2)^{-(\lambda + 2)}, \qquad (5.19)$

$(\lambda > 0,\ x_1 > a_1 > 0,\ x_2 > a_2 > 0),$

$\theta = (\lambda, a_1, a_2).$
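A quick numerical sanity check, assuming the form of (5.19) as reconstructed above and illustrative parameter values: the density should integrate to one over its support $x_1 > a_1$, $x_2 > a_2$.

```python
import numpy as np
from scipy.integrate import dblquad

lam, a1, a2 = 2.0, 1.0, 1.0   # illustrative parameter values

def pareto_pdf(x1, x2):
    # Bivariate Pareto density (5.19) as given above.
    return lam * (lam + 1) * (a1 * a2) ** (lam + 1) \
        * (a2 * x1 + a1 * x2 - a1 * a2) ** (-(lam + 2))

# dblquad integrates func(y, x): inner variable x2, outer variable x1.
total, _ = dblquad(lambda x2, x1: pareto_pdf(x1, x2), a1, np.inf, a2, np.inf)
print(total)   # approximately 1.0
```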
(3) Bivariate binomial distribution
$f(x_1, x_2) = \dfrac{n!}{x_1!\, x_2!}\, p_1^{x_1} p_2^{x_2}, \quad x_1 + x_2 = n, \quad p_1 + p_2 = 1. \qquad (5.20)$
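Summing (5.20) over the admissible pairs gives $(p_1 + p_2)^n = 1$, which the following short check confirms for illustrative $n$ and $(p_1, p_2)$:

```python
from math import factorial

n, p1, p2 = 5, 0.3, 0.7   # illustrative values with p1 + p2 = 1

def f(x1, x2):
    # Bivariate binomial density (5.20), defined on pairs with x1 + x2 = n.
    return factorial(n) / (factorial(x1) * factorial(x2)) * p1**x1 * p2**x2

print(sum(f(x1, n - x1) for x1 in range(n + 1)))   # 1.0
```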
The extension of the concept of a random variable $X$ to that of a random vector $X = (X_1, X_2, \ldots, X_n)$ enables us to generalise the probability model
to that of a parametric family of joint density functions

$\Phi = \{f(x_1, x_2, \ldots, x_n; \theta), \ \theta \in \Theta\}. \qquad (5.21)$
This is a very important generalisation since in most applied disciplines, including econometrics, the real phenomena to be modelled are usually multidimensional in the sense that there is more than one quantifiable feature to be considered.
5.3 Marginal distributions
Let $X = (X_1, X_2)$ be a bivariate random vector defined on $(S, \mathcal{F}, P(\cdot))$ with a joint distribution function $F(x_1, x_2)$. The question which naturally arises is whether we could separate $X_1$ and $X_2$ and consider them as individual random variables. The answer to this question leads us to the concept of a marginal distribution. The marginal distribution functions of $X_1$ and $X_2$ are defined by

$F_1(x_1) = \lim_{x_2 \to \infty} F(x_1, x_2), \qquad F_2(x_2) = \lim_{x_1 \to \infty} F(x_1, x_2).$
Having separated $X_1$ and $X_2$ we need to see whether they can be considered as single r.v.'s defined on the same probability space. In defining a random vector we imposed the condition that

$X^{-1}((-\infty, \mathbf{x}]) \in \mathcal{F}$ for all $\mathbf{x} \in \mathbb{R}^2.$

In the definition of the marginal distribution function we used the event

$\{s: X_1(s) \le x_1,\ X_2(s) < \infty,\ s \in S\},$

which we know belongs to $\mathcal{F}$. This event, however, can be written as the intersection of two sets of the form

$\{s: X_1(s) \le x_1,\ s \in S\} \cap \{s: X_2(s) < \infty,\ s \in S\},$

but the second set is $S$, i.e. $\{s: X_2(s) < \infty,\ s \in S\} = S$, which implies that

$\{s: X_1(s) \le x_1,\ X_2(s) < \infty,\ s \in S\} = \{s: X_1(s) \le x_1,\ s \in S\}, \qquad (5.28)$

which indeed belongs to $\mathcal{F}$, and this is the condition needed for $X_1$ to be a r.v. with a distribution function $F_1(x_1)$; the same is true for $X_2$. In order to see
this, consider the joint distribution function

$F(x_1, x_2) = 1 - e^{-\theta x_1} - e^{-\theta x_2} + \exp\{-\theta(x_1 + x_2)\}, \quad (x_1, x_2) \in \mathbb{R}_+^2, \qquad (5.29)$

$F_1(x_1) = \lim_{x_2 \to \infty} F(x_1, x_2) = 1 - e^{-\theta x_1}, \quad x_1 \in \mathbb{R}_+,$

since $\lim_{x_2 \to \infty} (e^{-\theta x_2}) = 0$. Similarly,

$F_2(x_2) = 1 - e^{-\theta x_2}, \quad x_2 \in \mathbb{R}_+.$
Note that $F_1(x_1)$ and $F_2(x_2)$ are proper distribution functions.

Given that the probability model has been defined in terms of the joint density functions, it is important to consider the above operation of marginalisation in terms of these density functions. The marginal density functions of $X_1$ and $X_2$ are defined by
$f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\, \mathrm{d}x_2, \qquad f_2(x_2) = \int_{-\infty}^{\infty} f(x_1, x_2)\, \mathrm{d}x_1,$
that is, the marginal density of $X_i$ $(i = 1, 2)$ is derived by integrating out $X_j$ $(j \ne i)$ from the joint density. In the discrete case this amounts to summing out with respect to the other variable:

$f_1(x_1) = \sum_{x_2} f(x_1, x_2), \qquad f_2(x_2) = \sum_{x_1} f(x_1, x_2).$
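For instance, here is a short sketch of the discrete case using the coin-tossing density of Table 5.1: the marginals are obtained by summing the joint density over the other variable.

```python
# Joint density of (X1, X2) from Table 5.1.
f = {(1, 1): 0.5, (2, 0): 0.25, (0, 2): 0.25}

f1, f2 = {}, {}
for (x1, x2), p in f.items():
    f1[x1] = f1.get(x1, 0.0) + p   # f1(x1) = sum over x2 of f(x1, x2)
    f2[x2] = f2.get(x2, 0.0) + p   # f2(x2) = sum over x1 of f(x1, x2)

print(f1)   # {1: 0.5, 2: 0.25, 0: 0.25}
print(f2)   # {1: 0.5, 0: 0.25, 2: 0.25}
```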
Example
Consider the working population of the UK classified by income and age as follows:
Income: £2000–4000, £4000–8000, £8000–12 000, £12 000–20 000, £20 000–50 000, over £50 000.

Age: young, middle-aged, senior.

Define the random variables $X_1$, income class, taking values 1–6, and $X_2$, age class, taking values 1–3. Let the joint density be as in Table 5.2:
Table 5.2 Joint density of $(X_1, X_2)$
The marginal density function of $X_1$ is shown in the column representing row totals, and it refers to the probabilities that a randomly selected person will belong to the various income classes. The marginal density of $X_2$ is the row representing column totals, and it refers to the probabilities that a randomly selected person will belong to the various age classes. That is, the marginal distribution of $X_1$ ($X_2$) incorporates no information relating to $X_2$ ($X_1$). Moreover, it is quite obvious that knowing the joint density function of $X_1$ and $X_2$ we can derive their marginal density functions; the reverse, however, is not true in general. Knowledge of $f_1(x_1)$ and $f_2(x_2)$ is enough to derive $f(x_1, x_2)$ only when

$f(x_1, x_2) = f_1(x_1) f_2(x_2) \quad \text{for all } (x_1, x_2) \in \mathbb{R}^2,$

in which case we say that $X_1$ and $X_2$ are independent r.v.'s. Independence in terms of the distribution functions takes the same form:

$F(x_1, x_2) = F_1(x_1) F_2(x_2) \quad \text{for all } (x_1, x_2) \in \mathbb{R}^2.$
In the case of the income–age example it is clear that

$f(x_1, x_2) \ne f_1(x_1) f_2(x_2),$

e.g.

$0.250 \ne (0.275)(0.4),$
and hence $X_1$ and $X_2$ are not independent r.v.'s, i.e. income and age are related in some probabilistic sense; it is more probable to be middle-aged and rich than young and rich!
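Since the body of Table 5.2 is not reproduced above, the following sketch illustrates the independence check $f(x_1, x_2) = f_1(x_1) f_2(x_2)$ on the coin-tossing density of Table 5.1 instead, which is likewise a dependent pair:

```python
f = {(1, 1): 0.5, (2, 0): 0.25, (0, 2): 0.25}   # joint density (Table 5.1)
f1 = {0: 0.25, 1: 0.5, 2: 0.25}                  # marginal of X1
f2 = {0: 0.25, 1: 0.5, 2: 0.25}                  # marginal of X2

# X1 and X2 are independent iff f(x1, x2) = f1(x1) * f2(x2) for every pair.
independent = all(abs(f.get((x1, x2), 0.0) - f1[x1] * f2[x2]) < 1e-12
                  for x1 in f1 for x2 in f2)
print(independent)   # False: f(1, 1) = 0.5 while f1(1) * f2(1) = 0.25
```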
In the continuous r.v.'s example we can easily verify that

$F_1(x_1) F_2(x_2) = (1 - e^{-\theta x_1})(1 - e^{-\theta x_2}) = F(x_1, x_2), \qquad (5.34)$

and thus $X_1$ and $X_2$ are indeed independent.
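The factorisation in (5.34) is easy to verify symbolically; a minimal sympy sketch:

```python
import sympy as sp

x1, x2, theta = sp.symbols('x1 x2 theta', positive=True)

# Joint DF (5.29) and its marginals.
F = 1 - sp.exp(-theta * x1) - sp.exp(-theta * x2) + sp.exp(-theta * (x1 + x2))
F1 = 1 - sp.exp(-theta * x1)
F2 = 1 - sp.exp(-theta * x2)

print(sp.simplify(F - F1 * F2))   # 0, confirming F = F1 * F2
```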
Note that two events, $A_1$ and $A_2$, in the context of the probability space $(S, \mathcal{F}, P(\cdot))$ are said to be independent (see Section 3.3) if

$P(A_1 \cap A_2) = P(A_1) \cdot P(A_2).$
It must be stressed that marginal density functions are proper density functions satisfying all the properties of such functions. In the income–age example it can be seen that $f_1(x_{1i}) \ge 0$, $f_2(x_{2j}) \ge 0$, and $\sum_i f_1(x_{1i}) = 1$ and $\sum_j f_2(x_{2j}) = 1$.
Because of its importance in what follows let us consider the marginal density functions in the case of the bivariate normal density:
$f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\, \mathrm{d}x_2 = \dfrac{1}{\sigma_1 \sqrt{2\pi}} \exp\left\{ -\dfrac{(x_1 - \mu_1)^2}{2\sigma_1^2} \right\} \int_{-\infty}^{\infty} \dfrac{1}{\sigma_2 \sqrt{2\pi(1 - \rho^2)}} \exp\left\{ -\dfrac{\left[x_2 - \mu_2 - \rho \frac{\sigma_2}{\sigma_1}(x_1 - \mu_1)\right]^2}{2\sigma_2^2 (1 - \rho^2)} \right\} \mathrm{d}x_2 = \dfrac{1}{\sigma_1 \sqrt{2\pi}} \exp\left\{ -\dfrac{(x_1 - \mu_1)^2}{2\sigma_1^2} \right\},$

since the integral equals one, the integrand being a proper conditional density function (see Section 5.4 below).
Similarly, we can show that

$f_2(x_2) = \dfrac{1}{\sigma_2 \sqrt{2\pi}} \exp\left\{ -\dfrac{(x_2 - \mu_2)^2}{2\sigma_2^2} \right\}.$

Hence, the marginal density functions of jointly normal r.v.'s are univariate normal.
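A numerical illustration of this result (a sketch with arbitrary parameter values): integrating the bivariate normal density over $x_2$ should reproduce the univariate $N(\mu_1, \sigma_1^2)$ density at any point $x_1$.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.integrate import quad

mu1, mu2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.6   # illustrative values
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
joint = multivariate_normal(mean=[mu1, mu2], cov=cov)

x1 = 0.7
marginal_at_x1, _ = quad(lambda x2: joint.pdf([x1, x2]), -np.inf, np.inf)
print(marginal_at_x1)            # integral of the joint density over x2
print(norm.pdf(x1, mu1, s1))     # univariate N(mu1, s1^2) density: same value
```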
In conclusion we observe that marginalisation provides us with ways to simplify a probability model, when such a model is defined in terms of joint density functions, by 'taking out' any unwanted random variables. In general, the marginal density of the r.v.'s of interest $X_1, X_2, \ldots, X_k$ can be derived from the joint density function of $X_1, X_2, \ldots, X_k, X_{k+1}, \ldots, X_n$ via

$f(x_1, x_2, \ldots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_n)\, \mathrm{d}x_{k+1} \cdots \mathrm{d}x_n.$
In the income–age example, if age is not relevant in our investigation, we can simplify the probability model by marginalising out $X_2$, as sketched below.
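In the discrete case the general formula amounts to summing a multidimensional array over the axes of the unwanted variables; a small sketch with a hypothetical 2x2x2 joint density:

```python
import numpy as np

# Hypothetical joint density of (X1, X2, X3) as a 2x2x2 array summing to 1.
joint = np.array([[[0.10, 0.05], [0.15, 0.10]],
                  [[0.20, 0.10], [0.05, 0.25]]])

f12 = joint.sum(axis=2)   # marginalise out X3, leaving the density of (X1, X2)
print(f12)
print(f12.sum())          # 1.0: the marginal is a proper density
```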
5.4 Conditional distributions
In the previous section we considered the question of simplifying probability models of the form (5.21) by marginalising out some subset of the r.v.'s $X_1, X_2, \ldots, X_n$. This amounts to 'throwing away' the information related to the r.v.'s integrated out, as being irrelevant. In this section we consider the question of simplifying $\Phi$ by conditioning with respect to some subset of the r.v.'s.
In the context of the probability space $(S, \mathcal{F}, P(\cdot))$ the conditional probability of event $A_1$ given event $A_2$ is defined by (see Section 3.3):

$P(A_1 \mid A_2) = \dfrac{P(A_1 \cap A_2)}{P(A_2)}, \quad P(A_2) > 0. \qquad (5.41)$
By choosing $A_1 = \{s: X_1(s) \le x_1\}$ we could use the above formula to derive an analogous definition in terms of distribution functions, that is,

$F(x_1 \mid A_2) = P(X_1 \le x_1 \mid A_2), \qquad (5.42)$

where

$P(X_1 \le x_1 \mid A_2) = \dfrac{P(\{s: X_1(s) \le x_1\} \cap A_2)}{P(A_2)}. \qquad (5.43)$
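In the discrete case (5.43) is unproblematic; a sketch using the coin-tossing density of Table 5.1, with $A_2 = \{X_2 = x_2\}$:

```python
f = {(1, 1): 0.5, (2, 0): 0.25, (0, 2): 0.25}   # joint density (Table 5.1)

def F_cond(x1, x2):
    # F(x1 | X2 = x2) = P(X1 <= x1, X2 = x2) / P(X2 = x2), as in (5.43).
    num = sum(p for (a, b), p in f.items() if a <= x1 and b == x2)
    den = sum(p for (a, b), p in f.items() if b == x2)
    return num / den

print(F_cond(1, 1))   # 1.0: given exactly one tail, X1 <= 1 with certainty
print(F_cond(1, 0))   # 0.0: given no tails, X1 must equal 2
```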
As far as event $A_2$ is concerned, there are two related forms we are particularly interested in: $A_2 = \{X_2 = x_2\}$, where $x_2$ is a specific value taken by $X_2$, and $A_2 = \sigma(X_2)$, i.e. the $\sigma$-field generated by $X_2$. In the case where $A_2 = \sigma(X_2)$, there are no particular problems arising in the definition of the conditional distribution function, since $\sigma(X_2) \subset \mathcal{F}$, although it is not particularly clear what form $F(x_1 \mid \sigma(X_2))$ will take. In the case where $A_2 = \{s: X_2(s) = x_2\}$, however, it is immediately obvious that since $P(s: X_2(s) = x_2) = 0$ when $X_2$ is a continuous r.v., there will