CHAPTER 6
Functions of random variables
One of the most important problems in probability theory and statistical
inference is to derive the distribution of a function h(X₁, X₂, ..., Xₙ) when
the distribution of the random vector X = (X₁, X₂, ..., Xₙ) is known. This
problem is important for at least two reasons:
(i) it is often the case that in modelling observable phenomena we are
primarily interested in functions of random variables; and
(ii) in statistical inference the quantities of primary interest are
commonly functions of random variables.
It is no exaggeration to say that the whole of statistical inference is based on
our ability to derive the distribution of various functions of r.v.'s. In the first
subsection we are going to consider the distribution of functions of a single
r.v. and then consider the case of functions of random vectors.
6.1 Functions of one random variable
Let X be a r.v. on the probability space (S, ℱ, P(·)). By definition, X(·):
S → ℝ, i.e. X is a real-valued function on S. Suppose that h(·): ℝ → ℝ, where
h is a continuous function with at most a countable number of
discontinuities. More formally, we need h(·) to be a Borel function.

Definition 1
A function h(·): ℝ → ℝ is said to be a Borel function if for any a ∈ ℝ
the set B_a = {x: h(x) ≤ a} is a Borel set, i.e. B_a ∈ ℬ, where ℬ
is the Borel field on ℝ (see Section 3.2).
Requiring that h(·) be a Borel function is a natural condition to impose,
given that we need h(X) to be a random variable itself. We know that X is a
function from S to ℝ, and thus h(X(·)) can be considered a function from S to ℝ;
the Borel condition ensures that the composite function h(X): S → ℝ is indeed a
random variable, i.e. the set A = {s: h(X(s)) ∈ B_a} belongs to ℱ for any
B_a ∈ ℬ (see Fig. 6.1).

[Fig. 6.1 A Borel function of a random variable]

Let us denote the r.v. h(X) by Y; then Y induces a probability set function
P_Y(·) such that P_Y(B_a) = P_X(h⁻¹(B_a)) = P(A), in order to preserve the
probability structure of the original (S, ℱ, P(·)). Note that the reason we
need h(·) to be a Borel function is to preserve the event structure of ℱ.
Having ensured that the function h(·) of the r.v. X is itself a r.v. Y = h(X),
we want to derive the distribution of Y when the distribution of X is known.
Let us consider the discrete case first. When X is a discrete r.v., then
Y = h(X) is again a discrete r.v., and all we need to do is give the set of
values of Y and the corresponding probabilities. Consider the coin-tossing
example where X is the r.v. defined by X = (number of H's) − (number of T's);
then, since S = {HT, TH, HH, TT}, X(HT) = X(TH) = 0, X(HH) = 2, X(TT) = −2,
and the probability function is

x          −2    0    2
P(X = x)   1/4   1/2  1/4

Let Y = X²; then Y takes the values (−2)² = 4, 0² = 0, 2² = 4 with the same
probabilities as X, but since 4 occurs twice we add the probabilities, i.e.

y          0    4
P(Y = y)   1/2  1/2
In general, the distribution function of Y is defined as

F_Y(y) = P(s: Y(s) ≤ y) = P(s: X(s) ∈ h⁻¹((−∞, y])),   (6.1)

where the inverse function h⁻¹(·) need not be unique.
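In the discrete case, (6.1) reduces to summing probabilities over each preimage, as in the coin-tossing example above. A minimal sketch of this operation, ours and purely illustrative, assuming a standard Python environment:

    # Distribution of Y = h(X) for discrete X: sum P(X = x) over the
    # preimage of each value y, i.e. over all x with h(x) = y.
    from collections import defaultdict

    pmf_X = {-2: 0.25, 0: 0.5, 2: 0.25}    # the coin-tossing example

    def pmf_of_function(pmf, h):
        pmf_Y = defaultdict(float)
        for x, p in pmf.items():
            pmf_Y[h(x)] += p               # 4 receives P(X=-2) + P(X=2)
        return dict(pmf_Y)

    print(pmf_of_function(pmf_X, lambda x: x ** 2))   # {4: 0.5, 0: 0.5}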
In the case where X is a continuous r.v., deriving the distribution of Y =
h(X) is not as simple as in the discrete case because, firstly, Y is not always a
continuous r.v. as well and, secondly, the solution to the problem depends
crucially on the nature of h(·). A sufficient condition for Y to be a
continuous r.v. as well is given by the following lemma.
Lemma 1
Let X be a continuous r.v. and Y = h(X), where h(x) is differentiable
for all x ∈ ℝ and either dh(x)/dx > 0 for all x or dh(x)/dx < 0 for all x. Then
the density function of Y is given by

f_Y(y) = f_X(h⁻¹(y)) · |d h⁻¹(y)/dy|   for a < y < b,   (6.2)

where |·| stands for the absolute value and a and b refer to the
smallest and biggest values y can take, respectively.
Example 1
Let X ~ N(μ, σ²) and Y = (X − μ)/σ, which implies that dh(x)/dx = 1/σ >
0 for all x ∈ ℝ, since σ > 0 by definition; h⁻¹(y) = σy + μ and dh⁻¹(y)/dy = σ.
Thus, since

f_Y(y) = f_X(σy + μ) · σ = σ · (1/(σ√(2π))) exp{−(σy + μ − μ)²/(2σ²)} = (1/√(2π)) exp{−½y²},

Y ~ N(0, 1), the standard normal distribution.
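Formula (6.2) can be checked numerically for this example. The following sketch is ours, assuming Python with numpy and scipy available:

    # Check f_Y(y) = f_X(h^{-1}(y)) |dh^{-1}(y)/dy| for Y = (X - mu)/sigma:
    # the right-hand side should equal the N(0,1) density.
    import numpy as np
    from scipy import stats

    mu, sigma = 3.0, 2.0
    y = np.linspace(-4, 4, 17)
    lhs = stats.norm.pdf(y)                                    # N(0,1) density
    rhs = stats.norm.pdf(sigma * y + mu, loc=mu, scale=sigma) * sigma
    print(np.allclose(lhs, rhs))                               # True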
In cases where the conditions of Lemma 1 are not satisfied we need to
derive the distribution from the relationship

F_Y(y) = Pr(h(X) ≤ y) = Pr(X ∈ h⁻¹((−∞, y])).   (6.3)
Example 2
Let X ~ N(0, 1) and Y = X² (see Fig. 6.2). Since dh(x)/dx = 2x, we can see
that h(x) is monotonically increasing for x > 0 and monotonically
decreasing for x < 0, and Lemma 1 is not satisfied. However, for y > 0,

F_Y(y) = Pr(h(X) ≤ y) = Pr(X ∈ h⁻¹((−∞, y]))
       = Pr(−√y ≤ X ≤ √y) = F_X(√y) − F_X(−√y)

[Fig. 6.2 The function Y = X² where X is normally distributed]

(see Fig. 6.2). In this form we can apply the above lemma, with dh⁻¹(y)/dy =
1/(2√y), for x > 0 and x < 0 separately, to get

f_Y(y) = f_X(√y)·(1/(2√y)) + f_X(−√y)·(1/(2√y))   for y > 0
       = ½(2π)^(−1/2) exp{−½y}·y^(−1/2) + ½(2π)^(−1/2) exp{−½y}·y^(−1/2)
       = (1/(2^(1/2)Γ(½))) y^(−1/2) exp(−½y),   y > 0,

using Γ(½) = √π.
That is, f_Y(y) is the so-called gamma density, where Γ(·) is the gamma function (Γ(n) = ∫₀^∞ v^(n−1) e^(−v) dv). A gamma r.v., denoted by Y ~ G(r, p), has a density of the form f(y) = [p/Γ(r)](py)^(r−1) exp(−py), y > 0. The above distribution is G(½, ½) and is known as the chi-square distribution with one degree of freedom, an important distribution in statistical inference; see Appendix 6.1.
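The derivation can be corroborated by simulation. A sketch, ours, again assuming numpy and scipy:

    # Example 2 numerically: squares of N(0,1) draws should follow the
    # chi-square(1) density derived above.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    y = rng.standard_normal(200_000) ** 2

    grid = np.linspace(0.05, 6, 50)
    derived = (2 * np.pi) ** -0.5 * grid ** -0.5 * np.exp(-grid / 2)
    print(np.allclose(derived, stats.chi2.pdf(grid, df=1)))   # True
    print(stats.kstest(y, stats.chi2(df=1).cdf).pvalue)       # not small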
6.2* Functions of several random variables
As in the case of a single r.v., for a Borel function h(·): ℝⁿ → ℝ and a random
vector X = (X₁, X₂, ..., Xₙ), h(X) is a random variable. Let us consider
certain commonly used functions of random variables, concentrating on the two-variable case for convenience of exposition.
[Fig. 6.3 The function Y = X₁ + X₂]
(1) The distribution of X₁ + X₂
By definition the distribution function of Y = X₁ + X₂ (see Fig. 6.3) is

F_Y(y) = Pr(X₁ + X₂ ≤ y),

so that the density function takes the form

f_Y(y) = ∫_{−∞}^{∞} f(y − x₂, x₂) dx₂,   y ∈ ℝ.   (6.4)

In particular, if X₁ and X₂ are independent, then

f_Y(y) = ∫_{−∞}^{∞} f₁(y − x₂) f₂(x₂) dx₂ = ∫_{−∞}^{∞} f₁(x₁) f₂(y − x₁) dx₁,   (6.5)
by symmetry; this is the convolution formula. Using an analogous argument we can show that for Y = X₁ − X₂,

f_Y(y) = ∫_{−∞}^{∞} f(y + x₂, x₂) dx₂,

and for X₁ and X₂ independent,

f_Y(y) = ∫_{−∞}^{∞} f₁(y + x₂) f₂(x₂) dx₂.
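The convolution formula (6.5) can also be evaluated numerically by discretising the densities on a grid; the following sketch (ours, assuming numpy) anticipates Example 4 below:

    # Numerical convolution: f_Y(y) = integral f1(y - x) f2(x) dx for the
    # sum of two independent U(-1,1) r.v.'s; the result is triangular.
    import numpy as np

    dx = 0.001
    x = np.arange(-1, 1, dx)
    f1 = np.full_like(x, 0.5)            # U(-1,1) density on its support
    f2 = f1.copy()

    f_sum = np.convolve(f1, f2) * dx     # density of X1 + X2 on (-2, 2)
    y = np.arange(f_sum.size) * dx - 2.0
    print(f_sum[np.argmin(np.abs(y))])   # approx. 0.5, the peak at y = 0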
Example 3
Let X₁ ~ N(μ₁, σ₁²), X₂ ~ N(μ₂, σ₂²), X₁ and X₂ being independent r.v.'s. Define
Y = X₁ + X₂; then, from (6.5),

f_Y(y) = ∫_{−∞}^{∞} (1/(2πσ₁σ₂)) exp{−(y − x₂ − μ₁)²/(2σ₁²) − (x₂ − μ₂)²/(2σ₂²)} dx₂
       = (1/√(2π(σ₁² + σ₂²))) exp{−(y − μ₁ − μ₂)²/(2(σ₁² + σ₂²))}.

Hence, Y ~ N(μ₁ + μ₂, σ₁² + σ₂²). In general, if X₁, X₂, ..., Xₙ are independent r.v.'s with Xᵢ ~ N(μᵢ, σᵢ²), then

Y = Σᵢ₌₁ⁿ Xᵢ ~ N(Σᵢ₌₁ⁿ μᵢ, Σᵢ₌₁ⁿ σᵢ²).
Example 4
Let Xᵢ ~ U(−1, 1), i = 1, 2 (uniformly distributed independent r.v.'s), and define Y = X₁ + X₂. Using Fig. 6.3 we can show that

f_Y(y) = (2 − |y|)/4 for |y| ≤ 2,   f_Y(y) = 0 for |y| > 2

(see Fig. 6.4).

[Fig. 6.4 The density function of Y = X₁ + X₂ where X₁ and X₂ are uniformly distributed]

For Xᵢ ~ U(−1, 1), i = 1, 2, 3, and Y = X₁ + X₂ + X₃ we can show that

f_Y(y) = (3 − y²)/8 for 0 ≤ |y| ≤ 1,
f_Y(y) = (3 − |y|)²/16 for 1 ≤ |y| ≤ 3,
f_Y(y) = 0 for |y| ≥ 3.

[Fig. 6.5 The density function of Y = X₁ + X₂ + X₃ where Xᵢ, i = 1, 2, 3, are uniformly distributed]

This density function is shown in Fig. 6.5 and, as can be seen, it is not
only continuous but also differentiable everywhere. The shape of the curve
is very much like the normal density. This reflects a general result: for
Xᵢ ~ U(−1, 1), i = 1, 2, ..., n, independent uniformly distributed r.v.'s,
Y = Σᵢ₌₁ⁿ Xᵢ has a distribution which is closer to a normal distribution the
greater the value of n; a particular case of the central limit theorem (see
Chapter 9).
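This convergence is easy to visualise by simulation. A sketch, ours, assuming numpy and scipy:

    # The standardised sum of n independent U(-1,1) r.v.'s approaches
    # the normal: the Kolmogorov-Smirnov distance shrinks as n grows.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    for n in (1, 2, 3, 12):
        y = rng.uniform(-1, 1, size=(100_000, n)).sum(axis=1)
        z = y / np.sqrt(n / 3.0)          # Var(U(-1,1)) = 1/3
        print(n, stats.kstest(z, "norm").statistic)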
(2) The distribution of X₁/X₂
Consider two r.v.'s X₁ and X₂ and let Y = X₁/X₂. The distribution of Y
takes the form

F_Y(y) = ∫∫_{x₁/x₂ ≤ y} f(x₁, x₂) dx₁ dx₂,   (6.6)

as suggested by Fig. 6.6. Changing variables to u = x₁/x₂,

F_Y(y) = ∫_{−∞}^{∞} ∫_{−∞}^{y} |x₂| f(ux₂, x₂) du dx₂,

so that

f_Y(y) = ∫_{−∞}^{∞} |x₂| f(yx₂, x₂) dx₂,   y ∈ ℝ.   (6.7)

[Fig. 6.6 The function Y = X₁/X₂ for Y < 0 and Y > 0]
In the case where X₁ and X₂ are independent this becomes

f_Y(y) = ∫_{−∞}^{∞} |x₂| f₁(yx₂) f₂(x₂) dx₂,   y ∈ ℝ.   (6.8)
Example 5 (the mathematical manipulations are not important!)
Let X₁ ~ N(0, 1) and X₂ ~ χ²(n), chi-square with n degrees of freedom, X₁ and X₂ being independent. Define Y = X₁/√(X₂/n) and let us derive its distribution. The density function of the denominator Z = √(X₂/n) is given by

f_Z(z) = (2(n/2)^(n/2)/Γ(n/2)) z^(n−1) exp{−nz²/2},   z > 0.

Since f(x₁, z) = f₁(x₁)·f_Z(z), and f_Z takes values only for z > 0, (6.8) implies that

f_Y(y) = ∫₀^∞ z f₁(yz) f_Z(z) dz
       = (Γ[(n + 1)/2]/(√(nπ) Γ(n/2))) (1 + y²/n)^(−(n+1)/2),   y ∈ ℝ.

This is the density of Student's t-distribution with n degrees of freedom.
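A simulation check of this result (a sketch of ours, assuming numpy and scipy):

    # X1 ~ N(0,1), X2 ~ chi-square(n) independent:
    # Y = X1/sqrt(X2/n) should follow Student's t with n d.f.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n = 5
    x1 = rng.standard_normal(200_000)
    x2 = rng.chisquare(n, 200_000)
    y = x1 / np.sqrt(x2 / n)
    print(stats.kstest(y, stats.t(df=n).cdf).pvalue)   # not small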
Example 6
Let X₁ ~ χ²(n₁) and X₂ ~ χ²(n₂) be two independent r.v.'s and define

Y = (X₁/n₁)/(X₂/n₂) = (n₂/n₁)(X₁/X₂).

Using the same argument as in Example 5 we can deduce that

f_Y(y) = (Γ[(n₁ + n₂)/2]/(Γ(n₁/2)Γ(n₂/2))) (n₁/n₂)^(n₁/2) y^((n₁/2)−1) [1 + (n₁/n₂)y]^(−(n₁+n₂)/2),   y > 0.

This represents the density of Fisher's F-distribution with n₁ and n₂ degrees
of freedom.
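The derived density can be checked against a library implementation; the following comparison is ours, assuming numpy and scipy:

    # The F(n1, n2) density derived in Example 6 versus scipy's.
    import numpy as np
    from scipy import stats
    from scipy.special import gamma

    n1, n2 = 4, 9
    u = np.linspace(0.1, 5, 40)
    derived = (gamma((n1 + n2) / 2) / (gamma(n1 / 2) * gamma(n2 / 2))
               * (n1 / n2) ** (n1 / 2) * u ** (n1 / 2 - 1)
               * (1 + (n1 / n2) * u) ** (-(n1 + n2) / 2))
    print(np.allclose(derived, stats.f.pdf(u, n1, n2)))   # True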
Example 7
Let X₁ ~ N(0, σ₁²), X₂ ~ N(0, σ₂²), X₁ and X₂ independent r.v.'s, and define
Y = X₁/X₂. Using (6.8), the density function of Y takes the form

f_Y(y) = ∫_{−∞}^{∞} |x₂| (1/(2πσ₁σ₂)) exp{−(x₂²/2)(y²/σ₁² + 1/σ₂²)} dx₂
       = (1/(πσ₁σ₂)) ∫₀^∞ x₂ exp{−(x₂²/2)(y²/σ₁² + 1/σ₂²)} dx₂
       = (1/(πσ₁σ₂)) (y²/σ₁² + 1/σ₂²)^(−1)   (substituting u = (x₂²/2)(y²/σ₁² + 1/σ₂²))
       = σ₁σ₂/(π(σ₂²y² + σ₁²)),   y ∈ ℝ.

The density of Y is known as the Cauchy density function (with location 0 and scale σ₁/σ₂).
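Again simulation corroborates the result; a sketch of ours, assuming numpy and scipy:

    # The ratio of two independent centred normals is Cauchy with
    # scale sigma1/sigma2.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    s1, s2 = 2.0, 0.5
    y = rng.normal(0, s1, 200_000) / rng.normal(0, s2, 200_000)
    print(stats.kstest(y, stats.cauchy(scale=s1 / s2).cdf).pvalue)   # not small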
(3) The distribution of Y = min(X₁, X₂)
The distribution function of Y = min(X₁, X₂) for two r.v.'s X₁, X₂ takes the
general form

F_Y(y) = Pr(min(X₁, X₂) ≤ y) = 1 − Pr(min(X₁, X₂) > y) = 1 − Pr(X₁ > y, X₂ > y),

illustrated in Fig. 6.7. In the case where X₁ and X₂ are independent,

F_Y(y) = 1 − [1 − F₁(y)][1 − F₂(y)].

[Fig. 6.7 The function Y = min(X₁, X₂)]
Example 8
Let X₁ and X₂ be independent r.v.'s with Weibull distribution functions Fᵢ(x) = 1 − exp(−θᵢxᵅ), x > 0, i = 1, 2. Then F_Y(y) = 1 − exp(−(θ₁ + θ₂)yᵅ), i.e. Y = min(X₁, X₂) again has the Weibull distribution function.
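A numerical check for the exponential special case (taking α = 1 above, an assumption of ours), assuming numpy and scipy:

    # The minimum of two independent exponentials with rates a and b is
    # exponential with rate a + b, by F_Y = 1 - (1 - F1)(1 - F2).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    a, b = 1.0, 3.0
    y = np.minimum(rng.exponential(1 / a, 100_000),
                   rng.exponential(1 / b, 100_000))
    print(stats.kstest(y, stats.expon(scale=1 / (a + b)).cdf).pvalue)  # not small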
Having considered various simple functions of r.v.'s separately, let us now
consider them together. Let (X₁, X₂, ..., Xₙ) be a random vector with
joint probability density function f(x₁, x₂, ..., xₙ) and define the one-to-one
transformation

y₁ = h₁(x₁, x₂, ..., xₙ)
y₂ = h₂(x₁, x₂, ..., xₙ)
...
yₙ = hₙ(x₁, x₂, ..., xₙ)   (6.10)

whose inverse takes the form hᵢ⁻¹(·) = gᵢ(·), i = 1, 2, ..., n:

x₁ = g₁(y₁, y₂, ..., yₙ)
...
xₙ = gₙ(y₁, y₂, ..., yₙ)   (6.11)

Assume:
(i) hᵢ(·) and gᵢ(·) are continuous;
(ii) the partial derivatives ∂xᵢ/∂yⱼ, i, j = 1, 2, ..., n, exist and are
continuous; and
(iii) the Jacobian of the inverse transformation

J = det(∂(x₁, ..., xₙ)/∂(y₁, ..., yₙ)) ≠ 0.

These assumptions enable us to deduce that

f_Y(y₁, ..., yₙ) = |J| · f_X(g₁(y₁, ..., yₙ), ..., gₙ(y₁, ..., yₙ)).   (6.12)
Example 9
Let Xᵢ ~ N(0, 1), i = 1, 2, be two independent r.v.'s and

Y₁ = h₁(X₁, X₂) = X₁ + X₂,   Y₂ = h₂(X₁, X₂) = X₁/X₂.

Since

x₁ = y₁y₂/(1 + y₂),   x₂ = y₁/(1 + y₂),   J = −y₁/(1 + y₂)²,

this implies that

f(y₁, y₂) = (|y₁|/(2π(1 + y₂)²)) exp{−y₁²(1 + y₂²)/(2(1 + y₂)²)}.
The main drawback of this approach is well demonstrated by the above
example. The method provides us with a way to derive the joint density
function of the Yᵢ's and not the marginal density functions. These can be
derived by integrating out the other variables; for instance,

f_{Y₁}(y₁) = ∫_{−∞}^{∞} f(y₁, y₂) dy₂,

and in the above example the marginals take the form

f_{Y₁}(y₁) = (1/(2√π)) exp{−y₁²/4}   (the N(0, 2) density),

f_{Y₂}(y₂) = 1/(π(1 + y₂²))   (the Cauchy density).

The derivations of these marginal density functions, however, involve some
complicated mathematical manipulations.
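Both marginals can be verified by simulation; a sketch of ours, assuming numpy and scipy:

    # Example 9: Y1 = X1 + X2 should be N(0, 2) and Y2 = X1/X2 standard
    # Cauchy when X1, X2 ~ N(0,1) are independent.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    x1 = rng.standard_normal(200_000)
    x2 = rng.standard_normal(200_000)
    y1, y2 = x1 + x2, x1 / x2
    print(stats.kstest(y1, stats.norm(scale=np.sqrt(2)).cdf).pvalue)  # not small
    print(stats.kstest(y2, stats.cauchy().cdf).pvalue)                # not small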
6.3 Functions of normally distributed random variables, a summary
The above examples on functions of random variables show clearly that
deriving the distribution of h(X₁, ..., Xₙ) when f(x₁, ..., xₙ) is known is
not an easy exercise. Indeed, this is one of the most difficult problems in
probability theory, as argued below. Some of the above results, although
involved (as far as mathematical manipulations are concerned), have been
included because they play a very important role in statistical inference.
Because of their importance, generalisations of these results are
summarised below for reference purposes.
Lemma 6.1
If Xᵢ ~ N(μᵢ, σᵢ²), i = 1, 2, ..., n, are independent r.v.'s, then
(Σᵢ₌₁ⁿ Xᵢ) ~ N(Σᵢ₌₁ⁿ μᵢ, Σᵢ₌₁ⁿ σᵢ²) — normal.

Lemma 6.2
If Xᵢ ~ N(0, 1), i = 1, 2, ..., n, are independent r.v.'s, then (Σᵢ₌₁ⁿ Xᵢ²) ~
χ²(n) — chi-square with n degrees of freedom.

Lemma 6.2*
If Xᵢ ~ N(μᵢ, σ²), i = 1, 2, ..., n, are independent r.v.'s, then
(Σᵢ₌₁ⁿ Xᵢ²/σ²) ~ χ²(n; δ) — non-central chi-square with non-centrality
parameter δ = Σᵢ₌₁ⁿ μᵢ²/σ².

Lemma 6.3
If X₁ ~ N(0, 1), X₂ ~ χ²(n), and X₁, X₂ are independent r.v.'s, then
X₁/[√(X₂/n)] ~ t(n) — Student's t with n degrees of freedom.

Lemma 6.3*
If X₁ ~ N(μ, σ²), X₂ ~ σ²χ²(n), and X₁, X₂ are independent r.v.'s, then
X₁/[√(X₂/n)] ~ t(n; δ) — non-central t with non-centrality
parameter δ = μ/σ.

Lemma 6.4
If X₁ ~ χ²(n₁), X₂ ~ χ²(n₂), and X₁, X₂ are independent r.v.'s, then
(X₁/n₁)/(X₂/n₂) ~ F(n₁, n₂) — Fisher's F with n₁ and n₂ degrees of
freedom.

Lemma 6.4*
If X₁ ~ χ²(n₁; δ), X₂ ~ χ²(n₂), X₁, X₂ being independent r.v.'s, then
(X₁/n₁)/(X₂/n₂) ~ F(n₁, n₂; δ) — non-central F, δ being the non-
centrality parameter.
[Fig. 6.8 The normal and related distributions]
Lemma 6.5
If Xᵢ ~ N(0, 1), i = 1, 2, are two independent r.v.'s, then (X₁/X₂) ~
C(0, 1) — Cauchy distribution.

The relationships among the distributions referred to in these lemmas are
depicted in Fig. 6.8. For a summary of these distributions see Appendix 6.1
below; for a more extensive discussion see the excellent book by Johnson
and Kotz (1970).

Note that if X ~ t(n), then Y = X² ~ F(1, n), and for n = 1, t(1) = C(0, 1).
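These last two relationships are easy to verify numerically through quantiles; a sketch of ours, assuming numpy and scipy:

    # t(n)^2 = F(1, n) and t(1) = Cauchy(0, 1), checked via quantiles.
    import numpy as np
    from scipy import stats

    n = 7
    q = np.linspace(0.55, 0.99, 9)
    # If X ~ t(n), P(X^2 <= y) = 2 F_t(sqrt(y)) - 1, so quantiles satisfy:
    print(np.allclose(stats.t.ppf((1 + q) / 2, n) ** 2,
                      stats.f.ppf(q, 1, n)))                     # True
    print(np.allclose(stats.t.ppf(q, 1), stats.cauchy.ppf(q)))   # True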
In this chapter we considered the distribution of functions of random
variables. Although the mathematical manipulations are in general rather
involved, this is a very important facet of probability theory for two reasons:
(i) It often occurs in practice that the probability model is not defined
in terms of the original r.v.'s but in terms of some functions of these.
(ii) Statistical inference is crucially dependent on the distribution of
functions of random variables. Estimators and test statistics are
functions of r.v.'s of the form h(X₁, X₂, ..., Xₙ) and the distribution
of such functions is the basis of any inference related to the
unknown parameters θ.
From the above discussion it is obvious that determining the distribution
of h(X₁, X₂, ..., Xₙ) is by no means a trivial exercise. It turns out that more
often than not we cannot determine the distribution exactly. Because of the
importance of the problem, however, we are forced to develop
approximations; this is the subject matter of Chapter 10.
It is no exaggeration to say that most of the results derived in the context
of the various statistical models in econometrics, discussed in Part IV,
depend crucially on the results summarised in Section 6.3 above.
Estimation, testing and prediction in the context of these models are based on
the results related to functions of normally distributed random variables,
and the normal, Student's t, Fisher's F and chi-square distributions are
used extensively in Part IV.
Appendix 6.1 — The normal and related distributions
(1) Univariate normal — X ~ N(μ, σ²)

f(x; μ, σ²) = (1/(σ√(2π))) exp{−(x − μ)²/(2σ²)},   x ∈ ℝ.

E(X) = μ, Var(X) = σ², skewness α₃ = 0, kurtosis α₄ = 3.
Higher moments:

E(X − μ)^r = 0 for r odd,   E(X − μ)^r = σ^r r!/(2^(r/2)(r/2)!) for r even.

Characteristic function: φ(t) = exp(iμt − ½σ²t²).
Cumulants: κ₁ = μ, κ₂ = σ², κᵣ = 0, r = 3, 4, ...
Some properties:
(a) Z = [(X − μ)/σ] ~ N(0, 1) — the standard normal distribution.
(b) Reproductive property: if Xᵢ ~ N(μᵢ, σᵢ²), i = 1, 2, ..., n, are independent
r.v.'s, then (Σᵢ₌₁ⁿ Xᵢ) ~ N(Σᵢ₌₁ⁿ μᵢ, Σᵢ₌₁ⁿ σᵢ²).
[Fig. 6.9 The density functions of a central and non-central chi-square]
(2) Chi-square distribution — Y ~ χ²(n)

f(y; n) = (1/(2^(n/2)Γ(n/2))) y^((n/2)−1) e^(−y/2),   y > 0,   n = 1, 2, ...

E(Y) = n (the degrees of freedom), Var(Y) = 2n.
The density function is illustrated for several values of n in Fig. 6.9.
Reproductive property: if Y₁, Y₂, ..., Y_k are independent r.v.'s, Yᵢ ~ χ²(nᵢ), i = 1, 2, ..., k, then
(Σᵢ₌₁^k Yᵢ) ~ χ²(n₁ + n₂ + ··· + n_k).
(3) Non-central chi-square distribution — Y ~ χ²(n; δ)

f(y; n, δ) = Σ_{k=0}^∞ [e^(−δ/2)(δ/2)^k/k!] (1/(2^((n/2)+k)Γ((n/2)+k))) y^((n/2)+k−1) e^(−y/2),
   y > 0,   δ > 0,   n = 1, 2, ...

E(Y) = n + δ, Var(Y) = 2(n + 2δ). Hence, the important difference from the central chi-square is that the
density function is shifted to the right and the variance increases.
Reproductive property: if Y₁, Y₂, ..., Y_k are independent r.v.'s, Yᵢ ~ χ²(nᵢ; δᵢ), i = 1, 2, ..., k, then
(Σᵢ₌₁^k Yᵢ) ~ χ²(Σᵢ₌₁^k nᵢ; Σᵢ₌₁^k δᵢ).
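The first two moments can be checked via the defining representation (a sum of squared unit-variance normals); a sketch of ours, assuming numpy and scipy:

    # Non-central chi-square from its definition: a sum of squared
    # N(mu_i, 1) r.v.'s has df = n and nc = delta = sum(mu_i^2).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    mu = np.array([0.5, -1.0, 2.0])
    n, delta = mu.size, float((mu ** 2).sum())
    y = ((rng.standard_normal((100_000, n)) + mu) ** 2).sum(axis=1)
    print(y.mean(), stats.ncx2(df=n, nc=delta).mean())   # both approx n + delta
    print(y.var(), 2 * (n + 2 * delta))                  # both approx 2(n + 2*delta)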
[Fig. 6.10 Comparison of a t and standard normal density]
(4) Student's t-distribution — W ~ t(n)

f(w; n) = (Γ[(n + 1)/2]/(√(nπ)Γ(n/2))) [1 + w²/n]^(−(n+1)/2),   w ∈ ℝ.

E(W) = 0 (n > 1),   Var(W) = n/(n − 2), n > 2,   α₃ = 0,   α₄ = 3 + 6/(n − 4), n > 4.

These moments show that for large n the t-distribution is very close to the
normal (see Fig. 6.10).
(5) Non-central t-distribution — W ~ t(n; δ), δ > 0
If X₁ ~ N(δ, 1) and X₂ ~ χ²(n) are independent, then W = X₁/√(X₂/n) ~ t(n; δ).
The density has a complicated series form; see Johnson and Kotz (1970). For large n,

E(W) ≈ δ   and   Var(W) ≈ 1,

i.e. W is approximately distributed as N(δ, 1).
(6) Fisher's F-distribution — U ~ F(n₁, n₂)

f(u; n₁, n₂) = (Γ[(n₁ + n₂)/2]/(Γ(n₁/2)Γ(n₂/2))) (n₁/n₂)^(n₁/2) u^((n₁/2)−1) [1 + (n₁/n₂)u]^(−(n₁+n₂)/2),   u > 0.

E(U) = n₂/(n₂ − 2), n₂ > 2,   Var(U) = (2n₂²(n₁ + n₂ − 2))/(n₁(n₂ − 2)²(n₂ − 4)), n₂ > 4.

The central and non-central F-distribution density functions are shown in
Fig. 6.11 for purposes of comparison.
(7) Non-central F-distribution — U ~ F(n₁, n₂; δ), δ > 0
The density can be written as a Poisson mixture of central F densities:

f(u; n₁, n₂; δ) = Σ_{k=0}^∞ [e^(−δ/2)(δ/2)^k/k!] (n₁/(n₁ + 2k)) f_F(un₁/(n₁ + 2k); n₁ + 2k, n₂),   u > 0,

where f_F(·; n₁ + 2k, n₂) denotes the central F(n₁ + 2k, n₂) density.

E(U) = (n₂(n₁ + δ))/(n₁(n₂ − 2)),   n₂ > 2,

Var(U) = 2(n₂/n₁)² [(n₁ + δ)² + (n₁ + 2δ)(n₂ − 2)]/((n₂ − 2)²(n₂ − 4)),   n₂ > 4.
[Fig. 6.11 Central and non-central F density functions]
Important concepts
Borel functions, distribution of a Borel function of a r.v., normal and related
distributions, Student's t, chi-square, Fisher's F and Cauchy distributions.
Questions
1. Why should we be interested in Borel functions of r.v.'s and their
distributions?
2. 'A Borel function is nothing more than a r.v. relative to the Borel field ℬ
on the real line.' Discuss.
3. Explain intuitively why a Borel function of a r.v. is a r.v. itself.
4. Explain the relationships between the normal, chi-square, Student's t,
Fisher's F and Cauchy distributions.
5. What is the difference between central and non-central chi-square and
F-distributions?
Exercises
1. Let X be a r.v. with density function f(x). Derive the density functions of
(i) Y = X²;
(ii) Y = e^X.
2. Let the density function of the r.v. X be f(x) = e^(−x), x > 0. Find the
distribution of Y = logₑ X.
3. Let the joint density function of X₁ and X₂ be f(x₁, x₂). Derive the
distribution of
(ii) Y = min(X₁, X₂).
4. Let X ~ N(0, 1); derive the distribution of Y = X².
Additional references
Clarke (1975); Cramér (1946); Giri (1974); Mood, Graybill and Boes (1974); Pfeiffer (1978); Rao (1973); Rohatgi (1976).