A Course in Mathematical Statistics, Part 5


Relations (6) and (7) imply that $\lim_{n\to\infty} P(X_n/Y_n \le z)$ exists and is equal to


P X

n n

REMARK 12 Theorem 8 is known as Slutsky's theorem.

Now, if $X_j$, $j = 1, \dots, n$, are i.i.d. r.v.'s, we have seen that the sample

and also in probability. On the other hand, $\bar{X}_n^2 \xrightarrow[n\to\infty]{} \mu^2$ a.s. and also in probability, and hence $\bar{X}_n^2$

P n

d n

n n

d n


by Theorem 3, and

n n

−1( −μ)

converges in distribution to $N(0, 1)$ as $n \to \infty$, by Theorem 9. ▲

The following result is based on theorems established above, and it is of significant importance.

For $n = 1, 2, \dots$, let $X_n$ and $X$ be r.v.'s, let $g: \mathbb{R} \to \mathbb{R}$ be differentiable, and let its derivative $g'(x)$ be continuous at a point $d$. Finally, let $c_n$ be constants such that $0 \ne c_n \to \infty$, and let $c_n(X_n - d) \xrightarrow{d} X$ as $n \to \infty$. Then $c_n[g(X_n) - g(d)] \xrightarrow{d} g'(d)X$ as $n \to \infty$.
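The statement above can be checked numerically. The following is only a sketch under assumptions that are not in the text: $X_n$ is taken to be the mean of $n$ Negative Exponential(1) draws, $d = 1$, $c_n = \sqrt{n}$ and $g(x) = x^2$, so that $c_n(X_n - d)$ is approximately $N(0, 1)$ and the limit $g'(d)X$ should be approximately $N(0, 4)$. The function name `delta_method_demo` is hypothetical.

```python
import math
import random
import statistics

def delta_method_demo(n=200, reps=2000, seed=0):
    """Simulate c_n [g(X_n) - g(d)] where X_n is the mean of n Exp(1) draws.

    Assumed setup (illustrative, not from the text):
    d = 1, c_n = sqrt(n), g(x) = x**2, hence g'(d) = 2.
    """
    rng = random.Random(seed)
    d, c_n = 1.0, math.sqrt(n)
    g = lambda x: x * x
    values = []
    for _ in range(reps):
        x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        values.append(c_n * (g(x_bar) - g(d)))
    # Sample mean and standard deviation of the simulated statistic.
    return statistics.mean(values), statistics.stdev(values)

mean, sd = delta_method_demo()
print(f"mean ~ {mean:.3f}, sd ~ {sd:.3f} (the limit suggests 0 and |g'(d)| = 2)")
```

For finite $n$ the simulated mean is slightly positive (of order $1/\sqrt{n}$), which is consistent with a limit statement rather than an exact identity.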

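PROOF In this proof, all limits are taken as $n \to \infty$. By assumption, $c_n(X_n - d) \xrightarrow{d} X$ and $c_n^{-1} \to 0$. Then, by Theorem 8(ii), $X_n - d \xrightarrow{d} 0$, or equivalently, $X_n - d \xrightarrow{P} 0$, and hence, by Theorem 7′(i),

$$X_n \xrightarrow{P} d. \tag{8}$$

Next, expanding $g(X_n)$ around $d$ gives $g(X_n) = g(d) + (X_n - d)\,g'(X_n^*)$, where $X_n^*$ is an r.v. lying between $d$ and $X_n$. Hence

$$c_n[g(X_n) - g(d)] = c_n(X_n - d)\,g'(X_n^*). \tag{9}$$

However, $|X_n^* - d| \le |X_n - d| \xrightarrow{P} 0$ by (8), so that $X_n^* \xrightarrow{P} d$, and therefore, by Theorem 7′(i) again,

$$g'(X_n^*) \xrightarrow{P} g'(d). \tag{10}$$

By assumption, convergence (10) and Theorem 8(ii), we have $c_n(X_n - d)\,g'(X_n^*) \xrightarrow{d} g'(d)X$. This result and relation (9) complete the proof of the theorem. ▲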

8.6* Pólya’s Lemma and Alternative Proof of the WLLN

The following lemma is an analytical result of interest in its own right. It was used in the corollary to Theorem 3 to conclude uniform convergence.

(Pólya) Let $F$ and $\{F_n\}$ be d.f.'s such that $F_n(x) \xrightarrow[n\to\infty]{} F(x)$, $x \in \mathbb{R}$, and let $F$ be continuous. Then the convergence is uniform in $x \in \mathbb{R}$. That is, for every $\varepsilon > 0$ there exists $N(\varepsilon) > 0$ such that $n \ge N(\varepsilon)$ implies that $|F_n(x) - F(x)| < \varepsilon$ for every $x \in \mathbb{R}$.
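As a numerical illustration of the lemma, the sketch below approximates $\sup_x |F_n(x) - F(x)|$ on a finite grid. The choice of $F$ as the $N(0, 1)$ d.f. and $F_n$ as the $N(0, (1 + 1/n)^2)$ d.f. is an assumption made only for this example, and `sup_distance` is a hypothetical helper name.

```python
from statistics import NormalDist

def sup_distance(n, grid_points=4001, lo=-10.0, hi=10.0):
    """Approximate sup_x |F_n(x) - F(x)| on an equally spaced grid.

    Assumed example: F is the N(0, 1) d.f. and F_n is the d.f. of
    N(0, (1 + 1/n)^2), so F_n(x) -> F(x) for every x.
    """
    F = NormalDist(0.0, 1.0)
    F_n = NormalDist(0.0, 1.0 + 1.0 / n)
    step = (hi - lo) / (grid_points - 1)
    return max(abs(F_n.cdf(lo + i * step) - F.cdf(lo + i * step))
               for i in range(grid_points))

for n in (1, 10, 100, 1000):
    print(n, round(sup_distance(n), 5))
```

The printed distances shrink as $n$ grows, which is what uniform convergence requires; a grid maximum is of course only a proxy for the supremum over all of $\mathbb{R}$.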

PROOF Since F(x) → 0 as x → −∞, and F(x) → 1, as x → ∞, there exists an

interval [α, β] such that

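$$0 \le F(x) + \varepsilon - F_n(x) \le F(x_{j+1}) + \varepsilon - F(x_j) + \frac{\varepsilon}{2} < 2\varepsilon$$

and therefore $|F_n(x) - F(x)| < \varepsilon$. Thus for $n \ge N(\varepsilon)$, we have

$$|F_n(x) - F(x)| < \varepsilon \quad \text{for every } x \in \mathbb{R}. \tag{17}$$

Relation (17) concludes the proof of the lemma. ▲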

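Below, a proof of the WLLN (Theorem 5) is presented without using ch.f.'s. The basic idea is that of suitably truncating the r.v.'s involved, and is due to Khintchine; it was also used by Markov.

ALTERNATIVE PROOF OF THEOREM 5 We proceed as follows: For any $\delta > 0$, we define

$$Y_j = \begin{cases} X_j, & \text{if } |X_j| \le \delta n \\ 0, & \text{if } |X_j| > \delta n \end{cases} \qquad Z_j = \begin{cases} 0, & \text{if } |X_j| \le \delta n \\ X_j, & \text{if } |X_j| > \delta n \end{cases} \qquad j = 1, \dots, n.$$

Then, clearly, $X_j = Y_j + Z_j$, $j = 1, \dots, n$. Let us restrict ourselves to the continuous case and let $f$ be the (common) p.d.f. of the $X$'s. Then,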


for n sufﬁciently large Thus,


This section is concluded with a result relating convergence in probability and a.s. convergence. More precisely, in Remark 3, it was stated that if $X_n \xrightarrow[n\to\infty]{P} X$, then there is a subsequence $\{n_k\}$ of $\{n\}$ (that is, $n_k \uparrow \infty$, $k \to \infty$) such that $X_{n_k} \to X$ a.s. as $k \to \infty$.

8.6.1 Use Theorem 11 in order to prove Theorem 7′(i).

8.6.2 Do likewise in order to establish part (ii) of Theorem 7′.

Chapter 9 Transformations of Random Variables and Random Vectors

9.1 The Univariate Case

The problem we are concerned with in this section in its simplest form is the following:

Let $X$ be an r.v. and let $h$ be a (measurable) function on $\mathbb{R}$ into $\mathbb{R}$, so that $Y = h(X)$ is an r.v. Given the distribution of $X$, we want to determine the distribution of $Y$. Let $P_X$, $P_Y$ be the distributions of $X$ and $Y$, respectively. That is, $P_X(B) = P(X \in B)$, $P_Y(B) = P(Y \in B)$, $B$ (Borel) subset of $\mathbb{R}$. Now $(Y \in B) = [h(X) \in B] = (X \in A)$, where $A = h^{-1}(B) = \{x \in \mathbb{R};\ h(x) \in B\}$. Therefore $P_Y(B) = P(Y \in B) = P(X \in A) = P_X(A)$. Thus we have the following theorem.

THEOREM 1 Let $X$ be an r.v. and let $h: \mathbb{R} \to \mathbb{R}$ be a (measurable) function, so that $Y = h(X)$ is an r.v. Then the distribution $P_Y$ of the r.v. $Y$ is determined by the distribution $P_X$ of the r.v. $X$ as follows: for any (Borel) subset $B$ of $\mathbb{R}$, $P_Y(B) = P_X(A)$, where $A = h^{-1}(B)$.

Discrete Random Variables

Let $X$ be a discrete r.v. taking the values $x_j$, $j = 1, 2, \dots$, and let $Y = h(X)$. Then $Y$ is also a discrete r.v. taking the values $y_j$, $j = 1, 2, \dots$. We wish to determine $f_Y(y_j) = P(Y = y_j)$, $j = 1, 2, \dots$. By taking $B = \{y_j\}$, we have $A = \{x_i;\ h(x_i) = y_j\}$, and hence

$$f_Y(y_j) = P(Y = y_j) = P_Y(\{y_j\}) = P_X(A) = \sum_{x_i \in A} f_X(x_i), \quad \text{where } f_X(x_i) = P(X = x_i).$$

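EXAMPLE 1 Let $X$ take on the values $-n, \dots, -1, 1, \dots, n$ each with probability $1/2n$, and let $Y = X^2$. Then $Y$ takes on the values $1, 4, \dots, n^2$ with probability found as follows: if $B = \{r^2\}$, $r = 1, \dots, n$, then $A = h^{-1}(B) = \{-r, r\}$, so that $P(Y = r^2) = P(X = -r) + P(X = r) = \frac{1}{2n} + \frac{1}{2n} = \frac{1}{n}$. That is, $P(Y = r^2) = 1/n$, $r = 1, \dots, n$.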
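The discrete recipe, summing $f_X(x_i)$ over all $x_i$ with $h(x_i) = y_j$, can be sketched in a few lines. The helper name `pushforward_pmf` and the choice $n = 3$ are illustrative assumptions; exact fractions are used so that the probabilities are not blurred by rounding.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward_pmf(f_X, h):
    """Return f_Y for Y = h(X): f_Y(y) is the sum of f_X(x) over {x : h(x) = y}."""
    f_Y = defaultdict(Fraction)  # Fraction() is 0, so missing keys start at 0
    for x, p in f_X.items():
        f_Y[h(x)] += p
    return dict(f_Y)

n = 3  # illustrative choice
# X takes the values -n, ..., -1, 1, ..., n, each with probability 1/(2n).
f_X = {x: Fraction(1, 2 * n) for x in range(-n, n + 1) if x != 0}
f_Y = pushforward_pmf(f_X, lambda x: x * x)
print(f_Y)
```

Each value $r^2$ collects the mass of the two points $-r$ and $r$, in line with the uniform distribution of $Y$ on $\{1, 4, \dots, n^2\}$.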

$x^2 + 2x - 3 = y$, we get $x^2 + 2x - (y + 3) = 0$, so that $x = -1 \pm \sqrt{y + 4}$. Hence $x = -1 + \sqrt{y + 4}$, the root $-1 - \sqrt{y + 4}$ being rejected, since it is negative. Thus, if $B = \{y\}$, then

It is a fact, proved in advanced probability courses, that the distribution $P_X$ of an r.v. $X$ is uniquely determined by its d.f. $F_X$. The same is true for r. vectors. (A first indication that such a result is feasible is provided by Lemma 3 in Chapter 7.) Thus, in determining the distribution $P_Y$ of the r.v. $Y$ above, it suffices to determine its d.f., $F_Y$. This is easily done if the transformation $h$ is one-to-one from $S$ onto $T$ and monotone (increasing or decreasing), where $S$ is the set of values of $X$ for which $f_X$ is positive and $T$ is the image of $S$ under $h$: that is, the set to which $S$ is transformed by $h$. By "one-to-one" it is meant that for each $y \in T$, there is only one $x \in S$ such that $h(x) = y$. Then the inverse transformation, $h^{-1}$, exists and, of course, $h^{-1}[h(x)] = x$. For such a transformation, if $h$ is increasing we have

$$F_Y(y) = P[h(X) \le y] = P\{h^{-1}[h(X)] \le h^{-1}(y)\} = P(X \le x) = F_X(x), \quad x = h^{-1}(y),$$

and if $h$ is decreasing we have

$$F_Y(y) = P[h(X) \le y] = P\{h^{-1}[h(X)] \ge h^{-1}(y)\} = P(X \ge x) = 1 - F_X(x-),$$

where $F_X(x-)$ is the limit from the left of $F_X$ at $x$; $F_X(x-) = \lim F_X(y)$, $y \uparrow x$.

REMARK 1 Figure 9.1 points out why the direction of the inequality is reversed when $h^{-1}$ is applied if $h$ is monotone decreasing.

Thus we have the following corollary to Theorem 1.

COROLLARY Let $h: S \to T$ be one-to-one and monotone. Then $F_Y(y) = F_X(x)$ if $h$ is increasing, and $F_Y(y) = 1 - F_X(x-)$ if $h$ is decreasing, where $x = h^{-1}(y)$ in either case.
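Both cases of the corollary can be compared with an empirical d.f. The sketch below assumes, purely for illustration, that $X$ is Negative Exponential with parameter 1, and uses $h(x) = x^2$ (increasing on $S = (0, \infty)$) and $h(x) = e^{-x}$ (decreasing on $S$); the function names are hypothetical.

```python
import math
import random

def F_X(x):
    """d.f. of the assumed example X ~ Negative Exponential(1)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def F_Y_increasing(y):
    """h(x) = x**2 on S = (0, inf): F_Y(y) = F_X(h^{-1}(y))."""
    return F_X(math.sqrt(y))

def F_Y_decreasing(y):
    """h(x) = exp(-x) on S = (0, inf): F_Y(y) = 1 - F_X(h^{-1}(y)-).

    F_X is continuous here, so the left limit equals F_X itself.
    """
    return 1.0 - F_X(-math.log(y))

def empirical_df(h, y, reps=50_000, seed=1):
    """Fraction of simulated values h(X) that are <= y."""
    rng = random.Random(seed)
    return sum(h(rng.expovariate(1.0)) <= y for _ in range(reps)) / reps

print(F_Y_increasing(0.5), empirical_df(lambda x: x * x, 0.5))
print(F_Y_decreasing(0.5), empirical_df(lambda x: math.exp(-x), 0.5))
```

In the decreasing case the formula reduces to $F_Y(y) = y$ on $(0, 1)$, so $Y = e^{-X}$ is $U(0, 1)$ under this assumed setup.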

REMARK 2 Of course, it is possible that the d.f. $F_Y$ of $Y$ can be expressed in terms of the d.f. $F_X$ of $X$ even though $h$ does not satisfy the requirements of the corollary above. Here is an example of such a case.

We will now focus attention on the case that $X$ has a p.d.f. and we will determine the p.d.f. of $Y = h(X)$, under appropriate conditions.

One way of going about this problem would be to find the d.f. $F_Y$ of the r.v. $Y$ by Theorem 1 (take $B = (-\infty, y]$, $y \in \mathbb{R}$), and then determine the p.d.f. $f_Y$ of $Y$, provided it exists, by differentiating (for the continuous case) $F_Y$ at continuity points of $f_Y$. The following example illustrates the procedure.

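EXAMPLE 4 In Example 3, assume that $X$ is $N(0, 1)$, so that

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$

Differentiating $F_Y$ then gives

$$f_Y(y) = \frac{1}{\sqrt{2\pi}}\, y^{-1/2} e^{-y/2}$$

for $y \ge 0$ and zero otherwise. We recognize it as being the p.d.f. of a $\chi^2_1$ distributed r.v., which agrees with Theorem 3, Chapter 4.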

Another approach to the same problem is the following.

THEOREM 2 Let $X$ be an r.v. whose p.d.f. $f_X$ is continuous on the set $S$ of positivity of $f_X$. Let $y = h(x)$ be a (measurable) transformation defined on $\mathbb{R}$ into $\mathbb{R}$ which is one-to-one on the set $S$ onto the set $T$ (the image of $S$ under $h$). Then the inverse transformation $x = h^{-1}(y)$ exists for $y \in T$. It is further assumed that $h^{-1}$ is differentiable and its derivative is continuous and different from zero on $T$. Set $Y = h(X)$, so that $Y$ is an r.v. Under the above assumptions, the p.d.f. $f_Y$ of $Y$ is given by the following expression:

$$f_Y(y) = \begin{cases} f_X[h^{-1}(y)] \left| \dfrac{d}{dy} h^{-1}(y) \right|, & y \in T \\ 0, & \text{otherwise.} \end{cases}$$

Mathematical Analysis, Addison-Wesley, 1957, pp 216 and 270–271) and

∈( )= [ ]−( ) −( )

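EXAMPLE 5 Let $X$ be $N(\mu, \sigma^2)$ and let $y = h(x) = ax + b$, where $a, b \in \mathbb{R}$, $a \ne 0$, are constants, so that $Y = aX + b$. We wish to determine the p.d.f. of the r.v. $Y$.

Here the transformation $h: \mathbb{R} \to \mathbb{R}$, clearly, satisfies the conditions of Theorem 2. We have

$$h^{-1}(y) = \frac{y - b}{a} \quad \text{and} \quad \frac{d}{dy} h^{-1}(y) = \frac{1}{a}.$$

Therefore,

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{\left(\frac{y - b}{a} - \mu\right)^2}{2\sigma^2}\right] \cdot \frac{1}{|a|} = \frac{1}{\sqrt{2\pi}\,|a|\sigma} \exp\left[-\frac{[y - (a\mu + b)]^2}{2a^2\sigma^2}\right],$$

which is the p.d.f. of a normally distributed r.v. with mean $a\mu + b$ and variance $a^2\sigma^2$.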
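The change-of-variable formula $f_Y(y) = f_X[h^{-1}(y)]\,|\frac{d}{dy}h^{-1}(y)|$ for $h(x) = ax + b$ can be checked against the normal p.d.f. with mean $a\mu + b$ and standard deviation $|a|\sigma$. This is a small sketch; the numerical values of $a$, $b$, $\mu$, $\sigma$ are arbitrary illustrative choices.

```python
from statistics import NormalDist

def f_Y(y, a, b, mu, sigma):
    """Theorem 2 applied to h(x) = a*x + b: f_X(h^{-1}(y)) * |d/dy h^{-1}(y)|."""
    f_X = NormalDist(mu, sigma).pdf
    return f_X((y - b) / a) * abs(1.0 / a)

a, b, mu, sigma = -2.0, 3.0, 1.0, 0.5  # illustrative values, not from the text
target = NormalDist(a * mu + b, abs(a) * sigma)
for y in (-1.0, 0.0, 1.0, 2.5):
    print(y, f_Y(y, a, b, mu, sigma), target.pdf(y))
```

A negative $a$ is used on purpose, to show that the absolute value of the derivative is what enters the formula.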

Now it may happen that the transformation $h$ satisfies all the requirements of Theorem 2 except that it is not one-to-one from $S$ onto $T$. Instead, the following might happen: There is a (finite) partition of $S$, which we denote by $\{S_j,\ j = 1, \dots, r\}$, and there are $r$ subsets of $T$, which we denote by $T_j$, $j = 1, \dots, r$ (note that $\bigcup_{j=1}^{r} T_j = T$, but the $T_j$'s need not be disjoint), such that $h: S_j \to T_j$, $j = 1, \dots, r$, is one-to-one. Then by an argument similar to the one used in proving Theorem 2, we can establish the following theorem.

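THEOREM 3 Let the r.v. $X$ have a continuous p.d.f. $f_X$ on the set $S$ on which it is positive, and let $y = h(x)$ be a (measurable) transformation defined on $\mathbb{R}$ into $\mathbb{R}$, so that $Y = h(X)$ is an r.v. Suppose that there is a partition $\{S_j,\ j = 1, \dots, r\}$ of $S$ and subsets $T_j$, $j = 1, \dots, r$, of $T$ (the image of $S$ under $h$), which need not be distinct or disjoint, such that $\bigcup_{j=1}^{r} T_j = T$ and that $h$ defined on each one of $S_j$ onto $T_j$, $j = 1, \dots, r$, is one-to-one. Let $h_j$ be the restriction of the transformation $h$ to $S_j$ and let $h_j^{-1}$ be its inverse, $j = 1, \dots, r$. Assume that $h_j^{-1}$ is differentiable and its derivative is continuous and $\ne 0$ on $T_j$, $j = 1, \dots, r$. Then the p.d.f. $f_Y$ of $Y$ is given by

$$f_Y(y) = \begin{cases} \sum_{j=1}^{r} \delta_j(y)\, f_{Y_j}(y), & y \in T \\ 0, & \text{otherwise,} \end{cases}$$

where $f_{Y_j}(y) = f_X[h_j^{-1}(y)] \left| \frac{d}{dy} h_j^{-1}(y) \right|$ for $y \in T_j$, and $\delta_j(y) = 1$ if $y \in T_j$ and $\delta_j(y) = 0$ otherwise, $j = 1, \dots, r$.

This result simply says that for each one of the $r$ pairs of regions $(S_j, T_j)$, $j = 1, \dots, r$, we work as we did in Theorem 2 in order to find

$$f_{Y_j}(y) = f_X[h_j^{-1}(y)] \left| \frac{d}{dy} h_j^{-1}(y) \right|;$$

then if a $y$ in $T$ belongs to $k$ of the regions $T_j$, $j = 1, \dots, r$ $(0 \le k \le r)$, we find $f_Y(y)$ by summing up the corresponding $f_{Y_j}(y)$'s. The following example will serve to illustrate the point.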

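EXAMPLE 6 Consider the r.v. $X$ and let $Y = h(X) = X^2$. We want to determine the p.d.f. $f_Y$ of the r.v. $Y$. Here the conditions of Theorem 3 are clearly satisfied with

$$S_1 = (-\infty, 0], \quad S_2 = (0, \infty), \quad T_1 = [0, \infty), \quad T_2 = (0, \infty)$$

by assuming that $f_X(x) > 0$ for every $x \in \mathbb{R}$. Next,

$$h_1^{-1}(y) = -\sqrt{y}, \quad h_2^{-1}(y) = \sqrt{y}, \quad \text{so that} \quad \frac{d}{dy} h_1^{-1}(y) = -\frac{1}{2\sqrt{y}}, \quad \frac{d}{dy} h_2^{-1}(y) = \frac{1}{2\sqrt{y}}, \quad y > 0.$$

Therefore,

$$f_{Y_1}(y) = f_X(-\sqrt{y})\, \frac{1}{2\sqrt{y}}, \quad f_{Y_2}(y) = f_X(\sqrt{y})\, \frac{1}{2\sqrt{y}},$$

and for $y > 0$, we then get

$$f_Y(y) = \frac{1}{2\sqrt{y}} \left[ f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right],$$

provided $\pm\sqrt{y}$ are continuity points of $f_X$. In particular, if $X$ is $N(0, 1)$, we arrive at the conclusion that $f_Y(y)$ is the p.d.f. of a $\chi^2_1$ r.v., as we also saw in Example 4.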
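The two-branch computation for $Y = X^2$ can be checked numerically when $X$ is $N(0, 1)$: adding the contributions of the inverses $-\sqrt{y}$ and $\sqrt{y}$ should reproduce the $\chi^2_1$ p.d.f. The sketch below uses hypothetical function names and the standard closed form $e^{-y/2}/\sqrt{2\pi y}$ for the $\chi^2_1$ density.

```python
import math
from statistics import NormalDist

def f_Y(y, f_X):
    """Theorem 3 for h(x) = x**2: add the contributions of the two branches
    h_1^{-1}(y) = -sqrt(y) and h_2^{-1}(y) = +sqrt(y); each inverse has a
    derivative of absolute value 1 / (2 sqrt(y))."""
    if y <= 0:
        return 0.0
    root = math.sqrt(y)
    return (f_X(-root) + f_X(root)) / (2.0 * root)

def chi2_1_pdf(y):
    """p.d.f. of the chi-square distribution with 1 degree of freedom (y > 0)."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

phi = NormalDist().pdf  # p.d.f. of N(0, 1)
for y in (0.1, 1.0, 2.5):
    print(y, f_Y(y, phi), chi2_1_pdf(y))
```

Passing `f_X` as an argument keeps the sketch usable for any p.d.f. that is positive on the whole line, which is the assumption made in the example.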

i) Express the p.d.f. of $Y$ in terms of that of $X$, and notice that $Y$ is a discrete r.v. whereas $X$ is an r.v. of the continuous type;

ii) If $n = 3$, $X$ is $N(99, 5)$ and $B_1 = (95, 105)$, $B_2 = (92, 95) + (105, 107)$, $B_3 = (-\infty, 92] + [107, \infty)$, determine the distribution of the r.v. $Y$ defined above;

iii) If $X$ is interpreted as a specified measurement taken on each item of a product made by a certain manufacturing process and $c_j$, $j = 1, 2, 3$, are the profit (in dollars) realized by selling one item under the condition that $X \in B_j$, $j = 1, 2, 3$, respectively, find the expected profit from the sale of one item.

9.1.3 Let $X$, $Y$ be r.v.'s representing the temperature of a certain object in degrees Celsius and Fahrenheit, respectively. Then it is known that $Y = \frac{9}{5}X + 32$. If $X$ is distributed as $N(\mu, \sigma^2)$, determine the p.d.f. of $Y$, first by determining its d.f., and secondly directly.

9.1.4 If the r.v. $X$ is distributed as Negative Exponential with parameter $\lambda$, find the p.d.f. of each one of the r.v.'s $Y$, $Z$, where $Y = e^X$

ii) What do the p.d.f.'s in part (i) become for $\alpha = 0$ and $\beta = 1$?

iii) For $\alpha = 0$ and $\beta = 1$, let $Y = \log X$ and suppose that the r.v.'s $Y_j$, $j = 1, \dots, n$, are independent and distributed as the r.v. $Y$. Use the ch.f. approach to determine the p.d.f. of $-\sum_{j=1}^{n} Y_j$.

9.1.6 If the r.v. $X$ is distributed as $U(-\frac{1}{2}\pi, \frac{1}{2}\pi)$, show that the r.v. $Y = \tan X$ is distributed as Cauchy. Also find the distribution of the r.v. $Z = \sin X$.

9.1.7 If the r.v. $X$ has the Gamma distribution with parameters $\alpha$, $\beta$, and $Y = 2X/\beta$, show that $Y \sim \chi^2_{2\alpha}$

and show that the r.v. $Y = 1/X$ is distributed as $N(0, 1)$.

9.1.11 Suppose that the velocity $X$ of a molecule of mass $m$ is an r.v. with p.d.f. $f$ given in Exercise 3.3.13(ii) of Chapter 3. Derive the distribution of the r.v. $Y = \frac{1}{2}mX^2$ (which is the kinetic energy of the molecule).

9.1.12 If the r.v. $X$ is distributed as $N(\mu, \sigma^2)$, show, by means of a transformation, that the r.v. $Y = [(X - \mu)/\sigma]^2$ is distributed as $\chi^2_1$.

9.2 The Multivariate Case

What has been discussed in the previous section carries over to the multidimensional case with the appropriate modifications.

THEOREM 1′ Let $\mathbf{X} = (X_1, \dots, X_k)'$ be a $k$-dimensional r. vector and let $h: \mathbb{R}^k \to \mathbb{R}^m$ be a (measurable) function, so that $\mathbf{Y} = h(\mathbf{X})$ is an r. vector. Then the distribution $P_{\mathbf{Y}}$ of the r. vector $\mathbf{Y}$ is determined by the distribution $P_{\mathbf{X}}$ of the r. vector $\mathbf{X}$ as follows: For any (Borel) subset $B$ of $\mathbb{R}^m$, $P_{\mathbf{Y}}(B) = P_{\mathbf{X}}(A)$, where $A = h^{-1}(B)$.

The proof of this theorem is carried out in exactly the same way as that of Theorem 1. As in the univariate case, the distribution $P_{\mathbf{Y}}$ of the r. vector $\mathbf{Y}$ is uniquely determined by its d.f. $F_{\mathbf{Y}}$.

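EXAMPLE 7 Let $X_1$, $X_2$ be independent r.v.'s distributed as $U(\alpha, \beta)$. We wish to determine the d.f. of the r.v. $Y = X_1 + X_2$. We have

$$F_Y(y) = P(X_1 + X_2 \le y) = \iint_{\{x_1 + x_2 \le y\}} f_{X_1, X_2}(x_1, x_2)\, dx_1\, dx_2 = \frac{1}{(\beta - \alpha)^2} \cdot A,$$

where $A$ is the area of that part of the square $(\alpha, \beta) \times (\alpha, \beta)$ lying to the left of the line $x_1 + x_2 = y$. Now $A = 0$ for $y \le 2\alpha$, $A = (y - 2\alpha)^2/2$ for $2\alpha < y \le \alpha + \beta$, $A = (\beta - \alpha)^2 - (2\beta - y)^2/2$ for $\alpha + \beta < y \le 2\beta$, and $A = (\beta - \alpha)^2$ for $y > 2\beta$. Thus

$$F_Y(y) = \begin{cases} 0, & y \le 2\alpha \\[4pt] \dfrac{(y - 2\alpha)^2}{2(\beta - \alpha)^2}, & 2\alpha < y \le \alpha + \beta \\[8pt] 1 - \dfrac{(2\beta - y)^2}{2(\beta - \alpha)^2}, & \alpha + \beta < y \le 2\beta \\[8pt] 1, & y > 2\beta. \end{cases}$$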
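The area argument for the d.f. of $X_1 + X_2$ can be compared with a simulation. The closed form coded below follows from elementary geometry of the square $(\alpha, \beta) \times (\alpha, \beta)$ cut by the line $x_1 + x_2 = y$; the function names, the sample size and the seed are illustrative assumptions.

```python
import random

def F_sum_uniform(y, alpha, beta):
    """d.f. of X1 + X2 for independent U(alpha, beta): the area A of the square
    lying to the left of the line x1 + x2 = y, divided by (beta - alpha)**2."""
    side = beta - alpha
    if y <= 2 * alpha:
        return 0.0
    if y <= alpha + beta:
        return (y - 2 * alpha) ** 2 / (2 * side ** 2)
    if y <= 2 * beta:
        return 1.0 - (2 * beta - y) ** 2 / (2 * side ** 2)
    return 1.0

def empirical(y, alpha, beta, reps=100_000, seed=2):
    """Fraction of simulated sums X1 + X2 that are <= y."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(alpha, beta) + rng.uniform(alpha, beta) <= y
               for _ in range(reps))
    return hits / reps

for y in (2.5, 3.0, 3.6):
    print(y, F_sum_uniform(y, 1.0, 2.0), empirical(y, 1.0, 2.0))
```

The two columns should agree to about two decimal places at this sample size.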

REMARK 3 The d.f. of $X_1 + X_2$ for any two independent r.v.'s (not necessarily $U(\alpha, \beta)$ distributed) is called the convolution of the d.f.'s of $X_1$, $X_2$ and is denoted by $F_{X_1 + X_2} = F_{X_1} * F_{X_2}$. We also write $f_{X_1 + X_2} = f_{X_1} * f_{X_2}$ for the corresponding p.d.f.'s. These concepts generalize to any (finite) number of r.v.'s.

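EXAMPLE 8 Let $X_1$ be $B(n_1, p)$, $X_2$ be $B(n_2, p)$ and independent. Let $Y_1 = X_1 + X_2$ and $Y_2 = X_2$. We want to find the joint p.d.f. of $Y_1$, $Y_2$ and also the marginal p.d.f. of $Y_1$, and the conditional p.d.f. of $Y_2$, given $Y_1 = y_1$.

With $q = 1 - p$, we have

$$f_{Y_1, Y_2}(y_1, y_2) = P(X_1 = y_1 - y_2,\ X_2 = y_2) = \binom{n_1}{y_1 - y_2} \binom{n_2}{y_2} p^{y_1} q^{(n_1 + n_2) - y_1},$$

for $0 \le y_1 \le n_1 + n_2$ and $u = \max(0, y_1 - n_1) \le y_2 \le \min(y_1, n_2) = \upsilon$. Thus

$$f_{Y_1}(y_1) = p^{y_1} q^{(n_1 + n_2) - y_1} \sum_{y_2 = u}^{\upsilon} \binom{n_1}{y_1 - y_2} \binom{n_2}{y_2}.$$

Next, for the four possible values of the pair $(u, \upsilon)$, we have

$$\sum_{y_2 = u}^{\upsilon} \binom{n_1}{y_1 - y_2} \binom{n_2}{y_2} = \binom{n_1 + n_2}{y_1},$$

so that $Y_1$ is $B(n_1 + n_2, p)$. Finally, the conditional p.d.f. of $Y_2$, given $Y_1 = y_1$, is

$$\frac{\binom{n_1}{y_1 - y_2} \binom{n_2}{y_2}}{\binom{n_1 + n_2}{y_1}}, \quad u \le y_2 \le \upsilon,$$

the hypergeometric p.d.f., independent of $p$.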
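The claim that the conditional p.d.f. of $Y_2$ given $Y_1 = y_1$ is hypergeometric and does not depend on $p$ can be checked by brute force for small $n_1$, $n_2$. The sketch below uses the fact that $(Y_1 = y_1, Y_2 = y_2)$ is the event $(X_1 = y_1 - y_2, X_2 = y_2)$; the function names and the numerical values are illustrative assumptions.

```python
from math import comb

def joint(y1, y2, n1, n2, p):
    """P(Y1 = y1, Y2 = y2) = P(X1 = y1 - y2) P(X2 = y2) for independent binomials."""
    if not (0 <= y2 <= n2 and 0 <= y1 - y2 <= n1):
        return 0.0
    q = 1.0 - p
    return comb(n1, y1 - y2) * comb(n2, y2) * p ** y1 * q ** (n1 + n2 - y1)

def conditional(y2, y1, n1, n2, p):
    """P(Y2 = y2 | Y1 = y1), computed directly from the joint p.d.f."""
    marginal = sum(joint(y1, v, n1, n2, p) for v in range(n2 + 1))
    return joint(y1, y2, n1, n2, p) / marginal

def hypergeometric(y2, y1, n1, n2):
    """Hypergeometric p.d.f.; call only with 0 <= y2 <= n2 and 0 <= y1 - y2 <= n1."""
    return comb(n1, y1 - y2) * comb(n2, y2) / comb(n1 + n2, y1)

n1, n2, y1 = 4, 3, 3  # illustrative values
for p in (0.2, 0.7):
    print(p, [round(conditional(y2, y1, n1, n2, p), 6) for y2 in range(4)])
print([round(hypergeometric(y2, y1, n1, n2), 6) for y2 in range(4)])
```

The rows printed for the two values of $p$ should coincide with each other and with the hypergeometric row, up to floating-point rounding.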

We next have two theorems analogous to Theorems 2 and 3 in Section 1. That is,

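THEOREM 2′ Let the $k$-dimensional r. vector $\mathbf{X}$ have continuous p.d.f. $f_{\mathbf{X}}$ on the set $S$ on which it is positive, and let $\mathbf{y} = h(\mathbf{x}) = (h_1(\mathbf{x}), \dots, h_k(\mathbf{x}))'$ be a (measurable) transformation defined on $\mathbb{R}^k$ into $\mathbb{R}^k$, so that $\mathbf{Y} = h(\mathbf{X})$ is a $k$-dimensional r. vector. Suppose that $h$ is one-to-one on $S$ onto $T$ (the image of $S$ under $h$), so that the inverse transformation $\mathbf{x} = h^{-1}(\mathbf{y}) = (g_1(\mathbf{y}), \dots, g_k(\mathbf{y}))'$ exists for $\mathbf{y} \in T$. It is further assumed that the partial derivatives $\partial g_j / \partial y_i$, $i, j = 1, \dots, k$, exist and are continuous on $T$, and that the Jacobian $J$, the determinant of the $k \times k$ matrix of these partial derivatives, is $\ne 0$ on $T$. Then the p.d.f. $f_{\mathbf{Y}}$ of $\mathbf{Y}$ is given by

$$f_{\mathbf{Y}}(\mathbf{y}) = \begin{cases} f_{\mathbf{X}}[h^{-1}(\mathbf{y})]\, |J|, & \mathbf{y} \in T \\ 0, & \text{otherwise.} \end{cases}$$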

REMARK 4 In Theorem 2′, the transformation $h$ transforms the $k$-dimensional r. vector $\mathbf{X}$ to the $k$-dimensional r. vector $\mathbf{Y}$. In many applications, however, the dimensionality $m$ of $\mathbf{Y}$ is less than $k$. Then in order to determine the p.d.f. of $\mathbf{Y}$, we work as follows. Let $\mathbf{y} = (h_1(\mathbf{x}), \dots, h_m(\mathbf{x}))'$ and choose another $k - m$ transformations defined on $\mathbb{R}^k$ into $\mathbb{R}$, $h_{m+j}$, $j = 1, \dots, k - m$, say, so that they are of the simplest possible form and such that the transformation

$$h = (h_1, \dots, h_m, h_{m+1}, \dots, h_k)'$$

satisfies the assumptions of Theorem 2′. Set $\mathbf{Z} = (Y_1, \dots, Y_m, Y_{m+1}, \dots, Y_k)'$, where $\mathbf{Y} = (Y_1, \dots, Y_m)'$ and $Y_{m+j} = h_{m+j}(\mathbf{X})$, $j = 1, \dots, k - m$. Then by applying Theorem 2′, we obtain the p.d.f. $f_{\mathbf{Z}}$ of $\mathbf{Z}$ and then integrating out the last $k - m$ arguments $y_{m+j}$, $j = 1, \dots, k - m$, we have the p.d.f. of $\mathbf{Y}$.

A number of examples will be presented to illustrate the application of Theorem 2′ as well as of the preceding remark.

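EXAMPLE 9 Let $X_1$, $X_2$ be i.i.d. r.v.'s distributed as $U(\alpha, \beta)$. Set $Y_1 = X_1 + X_2$ and find the p.d.f. of $Y_1$.

Set $Y_2 = X_2$ and consider the transformation $y_1 = x_1 + x_2$, $y_2 = x_2$, so that $x_1 = y_1 - y_2$, $x_2 = y_2$ and $J = 1$. Then $f_{Y_1, Y_2}(y_1, y_2) = 1/(\beta - \alpha)^2$ on $T$, the image of the square under the transformation. From $\alpha < x_1, x_2 < \beta$, we have $2\alpha < y_1 < 2\beta$ and also $\alpha < y_2 < \beta$. Since $y_1 - y_2 = x_1$, $\alpha < x_1 < \beta$, we have $\alpha < y_1 - y_2 < \beta$. Thus the limits of $y_1$, $y_2$ are specified by $\alpha < y_2 < \beta$, $\alpha < y_1 - y_2 < \beta$. (See Figs. 9.3 and 9.4.) Integrating out $y_2$ gives

$$f_{Y_1}(y_1) = \begin{cases} \dfrac{y_1 - 2\alpha}{(\beta - \alpha)^2}, & 2\alpha < y_1 \le \alpha + \beta \\[8pt] \dfrac{2\beta - y_1}{(\beta - \alpha)^2}, & \alpha + \beta < y_1 < 2\beta \\[8pt] 0, & \text{otherwise.} \end{cases}$$

REMARK 5 This density is known as the triangular p.d.f.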

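EXAMPLE 10 Let $X_1$, $X_2$ be i.i.d. r.v.'s from $U(1, \beta)$. Set $Y_1 = X_1 X_2$ and find the p.d.f. of $Y_1$. Consider the transformation

$$h: \quad y_1 = x_1 x_2, \quad y_2 = x_2.$$

From $h$, we get

$$x_1 = \frac{y_1}{y_2}, \quad x_2 = y_2, \quad J = \begin{vmatrix} \dfrac{1}{y_2} & -\dfrac{y_1}{y_2^2} \\[6pt] 0 & 1 \end{vmatrix} = \frac{1}{y_2}.$$

Since $1 < x_1, x_2 < \beta$, the set $T$ is specified by $1 < y_2 < \beta$, $y_2 < y_1 < \beta y_2$, and

$$f_{Y_1, Y_2}(y_1, y_2) = \frac{1}{(\beta - 1)^2} \cdot \frac{1}{y_2}, \quad (y_1, y_2)' \in T.$$

Integrating out $y_2$, we obtain

$$f_{Y_1}(y_1) = \begin{cases} \dfrac{\log y_1}{(\beta - 1)^2}, & 1 < y_1 < \beta \\[8pt] \dfrac{2\log\beta - \log y_1}{(\beta - 1)^2}, & \beta \le y_1 < \beta^2 \\[8pt] 0, & \text{otherwise.} \end{cases}$$

[Figure 9.5: graph of $f_{Y_1}(y_1)$]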

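EXAMPLE 11 Let $X_1$, $X_2$ be i.i.d. r.v.'s from $N(0, 1)$. Show that the p.d.f. of the r.v. $Y_1 = X_1/X_2$ is Cauchy with $\mu = 0$, $\sigma = 1$; that is,

$$f_{Y_1}(y_1) = \frac{1}{\pi} \cdot \frac{1}{1 + y_1^2}, \quad y_1 \in \mathbb{R}.$$

Set $Y_2 = X_2$ and consider the transformation $y_1 = x_1/x_2$, $y_2 = x_2$ $(x_2 \ne 0)$. Then $x_1 = y_1 y_2$, $x_2 = y_2$ and

$$J = \begin{vmatrix} y_2 & y_1 \\ 0 & 1 \end{vmatrix} = y_2, \quad \text{so that} \quad |J| = |y_2|.$$

Since $-\infty < x_1, x_2 < \infty$ implies $-\infty < y_1, y_2 < \infty$, we have

$$f_{Y_1, Y_2}(y_1, y_2) = f_{X_1, X_2}(y_1 y_2, y_2)\, |y_2| = \frac{1}{2\pi} \exp\left[-\frac{(y_1^2 + 1)\, y_2^2}{2}\right] |y_2|$$

and therefore

$$f_{Y_1}(y_1) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\left[-\frac{(y_1^2 + 1)\, y_2^2}{2}\right] |y_2|\, dy_2 = \frac{1}{\pi} \int_{0}^{\infty} \exp\left[-\frac{(y_1^2 + 1)\, y_2^2}{2}\right] y_2\, dy_2 = \frac{1}{\pi} \cdot \frac{1}{1 + y_1^2}.$$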
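The marginalization step for $Y_1 = X_1/X_2$ can be verified by numerical integration: integrating the joint density $f_{X_1, X_2}(y_1 y_2, y_2)\,|y_2|$ over $y_2$ should give the Cauchy p.d.f. This is a sketch using a plain midpoint rule; the integration range and step count are arbitrary choices, and the function names are hypothetical.

```python
import math

def joint_density(y1, y2):
    """f_{Y1,Y2}(y1, y2) = f_{X1,X2}(y1*y2, y2) |y2| for i.i.d. N(0, 1) inputs."""
    return math.exp(-(y1 * y1 + 1.0) * y2 * y2 / 2.0) * abs(y2) / (2.0 * math.pi)

def marginal_y1(y1, half_width=12.0, steps=24_000):
    """Midpoint-rule approximation of the integral over y2 in [-half_width, half_width].

    An even number of steps keeps the kink of |y2| at 0 on a cell boundary.
    """
    step = 2.0 * half_width / steps
    return step * sum(joint_density(y1, -half_width + (i + 0.5) * step)
                      for i in range(steps))

def cauchy_pdf(y1):
    """Cauchy p.d.f. with mu = 0, sigma = 1."""
    return 1.0 / (math.pi * (1.0 + y1 * y1))

for y1 in (0.0, 0.5, 2.0):
    print(y1, marginal_y1(y1), cauchy_pdf(y1))
```

The truncation at $\pm 12$ is harmless here because the integrand decays like $e^{-y_2^2/2}$ or faster.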

as Beta with parameters α, β

We set Y2= X1+ X2 and consider the transformation:

EXAMPLE 12


and prove that Y1 is U(0, 1), Y3 is distributed as Gamma with α = 3, β = 1, and

Now from the transformation, it follows that $x_1, x_2, x_3 \in (0, \infty)$ implies that $y_1 \in (0, 1)$, $y_2 \in (0, 1)$, $y_3 \in (0, \infty)$. Thus

the independence of $Y_1$, $Y_2$, $Y_3$ is established. The functional forms of $f_{Y_1}$, $f_{Y_3}$ verify the rest.

9.2.1 Application 2: The t and F Distributions

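The density of the t distribution with $r$ degrees of freedom ($t_r$). Let the independent r.v.'s $X$ and $Y$ be distributed as $N(0, 1)$ and $\chi^2_r$, respectively, and set $T = X/\sqrt{Y/r}$. The r.v. $T$ is said to have the (Student's) t-distribution with $r$ degrees of freedom (d.f.) and is often denoted by $t_r$. We want to find its p.d.f. We have:

$$f_{X, Y}(x, y) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \cdot \frac{1}{\Gamma(r/2)\, 2^{r/2}}\, y^{(r/2) - 1} e^{-y/2}, \quad x \in \mathbb{R},\ y > 0.$$

Set $U = Y$ and consider the transformation $t = x/\sqrt{y/r}$, $u = y$, so that $x = t\sqrt{u/r}$, $y = u$ and $J = \sqrt{u/r}$. Applying Theorem 2′ and integrating out $u$ over $(0, \infty)$, we obtain

$$f_T(t) = \frac{\Gamma\left[\frac{1}{2}(r + 1)\right]}{\sqrt{\pi r}\; \Gamma(r/2)} \cdot \frac{1}{\left[1 + (t^2/r)\right]^{(r + 1)/2}}, \quad t \in \mathbb{R}.$$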
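As a sanity check on the $t_r$ density, the sketch below codes the standard closed form $\Gamma[(r+1)/2] / (\sqrt{\pi r}\,\Gamma(r/2)) \cdot [1 + t^2/r]^{-(r+1)/2}$ and verifies two properties numerically: it integrates to approximately 1, and for $r = 1$ it reduces to the Cauchy p.d.f. The function names and the integration settings are illustrative assumptions.

```python
import math

def t_pdf(t, r):
    """p.d.f. of Student's t distribution with r degrees of freedom."""
    const = math.gamma((r + 1) / 2.0) / (math.sqrt(math.pi * r) * math.gamma(r / 2.0))
    return const * (1.0 + t * t / r) ** (-(r + 1) / 2.0)

def integrate(f, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    step = (hi - lo) / steps
    return step * sum(f(lo + (i + 0.5) * step) for i in range(steps))

r = 5  # illustrative choice
print(integrate(lambda t: t_pdf(t, r), -100.0, 100.0))
print(t_pdf(0.0, 1), 1.0 / math.pi)  # r = 1 gives the Cauchy p.d.f. at 0
```

The first printed value is close to, but slightly below, 1 because the heavy tails beyond $\pm 100$ are cut off.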

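The density of the F distribution with $r_1$, $r_2$ degrees of freedom ($F_{r_1, r_2}$). Let the independent r.v.'s $X$ and $Y$ be distributed as $\chi^2_{r_1}$ and $\chi^2_{r_2}$, respectively, and set $F = (X/r_1)/(Y/r_2)$. The r.v. $F$ is said to have the F distribution with $r_1$, $r_2$ degrees of freedom (d.f.) and is often denoted by $F_{r_1, r_2}$. We want to find its p.d.f. We have:

$$f_{X, Y}(x, y) = \frac{1}{\Gamma(r_1/2)\, \Gamma(r_2/2)\, 2^{(r_1 + r_2)/2}}\, x^{(r_1/2) - 1} y^{(r_2/2) - 1} e^{-(x + y)/2}, \quad x, y > 0.$$

Set $Z = Y$ and consider the transformation $f = (x/r_1)/(y/r_2)$, $z = y$, so that $x = (r_1/r_2) f z$, $y = z$ and $J = (r_1/r_2) z$. Applying Theorem 2′ and integrating out $z$ over $(0, \infty)$, we obtain

$$f_F(f) = \begin{cases} \dfrac{\Gamma\left[\frac{1}{2}(r_1 + r_2)\right] (r_1/r_2)^{r_1/2}}{\Gamma(r_1/2)\, \Gamma(r_2/2)} \cdot \dfrac{f^{(r_1/2) - 1}}{\left[1 + (r_1/r_2) f\right]^{(r_1 + r_2)/2}}, & f > 0 \\[8pt] 0, & f \le 0. \end{cases}$$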

Title: Further Limit Theorems
Subject: Mathematical Statistics
Type: Lecture Notes

Format
Pages: 59
Size: 374.05 KB