Relations (6) and (7) imply that $\lim_{n\to\infty} P(X_n/Y_n \le z)$ exists and is equal to [...]
REMARK 12 Theorem 8 is known as Slutsky's theorem.
Now, if $X_j$, $j = 1, \dots, n$, are i.i.d. r.v.'s, we have seen that the sample [...] and also in probability. On the other hand, $\bar{X}_n^2 \xrightarrow[n\to\infty]{} \mu^2$ a.s. and also in probability, and hence [...] by Theorem 3, and [...] converges in distribution to $N(0, 1)$ as $n \to \infty$, by Theorem 9. ▲
The following result is based on theorems established above and it is of significant importance.
For $n = 1, 2, \dots$, let $X_n$ and $X$ be r.v.'s, let $g: \mathbb{R} \to \mathbb{R}$ be differentiable, and let its derivative $g'(x)$ be continuous at a point $d$. Finally, let $c_n$ be constants such that $0 \ne c_n \to \infty$, and let $c_n(X_n - d) \xrightarrow{d} X$ as $n \to \infty$. Then $c_n[g(X_n) - g(d)] \xrightarrow{d} g'(d)X$ as $n \to \infty$.
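The statement above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes $X_n$ is the mean of $n$ i.i.d. Exponential(1) r.v.'s (so $d = 1$, $c_n = \sqrt{n}$ and the limit $X$ is $N(0, 1)$) and takes $g(x) = x^2$, so the theorem predicts a limiting standard deviation of $|g'(d)| = 2$. The function name `delta_method_draws` and all parameter values are illustrative choices.

```python
import math
import random
import statistics

def delta_method_draws(n, reps, g, d, seed=0):
    """Draw c_n * (g(X_n) - g(d)) with c_n = sqrt(n), where X_n is the mean of
    n i.i.d. Exponential(1) r.v.'s (an assumed example, for which d = 1)."""
    rng = random.Random(seed)
    c_n = math.sqrt(n)
    draws = []
    for _ in range(reps):
        # The sum of n Exponential(1) r.v.'s is Gamma(shape n, scale 1); divide by n for the mean.
        x_n = rng.gammavariate(n, 1.0) / n
        draws.append(c_n * (g(x_n) - g(d)))
    return draws

def g(x):
    return x * x                      # g'(x) = 2x, so g'(d) = 2 at d = 1

draws = delta_method_draws(n=400, reps=4000, g=g, d=1.0)
sample_sd = statistics.pstdev(draws)  # the theorem suggests a value near 2
print(round(sample_sd, 3))
```

The simulated standard deviation should be close to 2 for large $n$; the agreement is only approximate because both $n$ and the number of replications are finite.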
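PROOF In this proof, all limits are taken as $n \to \infty$. By assumption, $c_n(X_n - d) \xrightarrow{d} X$ and $c_n^{-1} \to 0$. Then, by Theorem 8(ii), $X_n - d \xrightarrow{d} 0$, or equivalently, $X_n - d \xrightarrow{P} 0$, and hence, by Theorem 7′(i),

$$|X_n - d| \xrightarrow{P} 0. \tag{8}$$

Next, expanding $g(X_n)$ around $d$ gives $g(X_n) = g(d) + (X_n - d)\,g'(X_n^*)$, where $X_n^*$ is an r.v. lying between $d$ and $X_n$. Hence

$$c_n[g(X_n) - g(d)] = c_n(X_n - d)\,g'(X_n^*). \tag{9}$$

However, $|X_n^* - d| \le |X_n - d| \xrightarrow{P} 0$ by (8), so that $X_n^* \xrightarrow{P} d$, and therefore, by Theorem 7′(i) again,

$$g'(X_n^*) \xrightarrow{P} g'(d). \tag{10}$$

By assumption, convergence (10) and Theorem 8(ii), we have $c_n(X_n - d)\,g'(X_n^*) \xrightarrow{d} g'(d)X$. This result and relation (9) complete the proof of the theorem. ▲

8.6* Pólya's Lemma and Alternative Proof of the WLLN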
The following lemma is an analytical result of interest in its own right. It was used in the corollary to Theorem 3 to conclude uniform convergence.
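(Pólya) Let $F$ and $\{F_n\}$ be d.f.'s such that $F_n(x) \xrightarrow[n\to\infty]{} F(x)$, $x \in \mathbb{R}$, and let $F$ be continuous. Then the convergence is uniform in $x \in \mathbb{R}$. That is, for every $\varepsilon > 0$ there exists $N(\varepsilon) > 0$ such that $n \ge N(\varepsilon)$ implies that $|F_n(x) - F(x)| < \varepsilon$ for every $x \in \mathbb{R}$.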
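As an illustration of the lemma (a sketch under assumptions that are not in the text), take $F$ to be the $N(0, 1)$ d.f. and $F_n$ the $N(1/n, 1)$ d.f., so that $F_n(x) \to F(x)$ for every $x$ and $F$ is continuous. The code approximates $\sup_x |F_n(x) - F(x)|$ on a grid; the grid limits and step are arbitrary choices.

```python
import math

def phi_cdf(x):
    """Standard normal d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sup_distance(n, lo=-8.0, hi=8.0, steps=3200):
    """Grid approximation of sup_x |F_n(x) - F(x)|, with F_n the d.f. of
    N(1/n, 1) and F the d.f. of N(0, 1) (an assumed example)."""
    best = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        best = max(best, abs(phi_cdf(x - 1.0 / n) - phi_cdf(x)))
    return best

sup_by_n = {n: sup_distance(n) for n in (1, 10, 100, 1000)}
for n, value in sup_by_n.items():
    print(n, value)
```

The printed suprema shrink as $n$ grows, which is what uniform convergence requires: a single $N(\varepsilon)$ works for all $x$ at once.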
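PROOF Since $F(x) \to 0$ as $x \to -\infty$, and $F(x) \to 1$ as $x \to \infty$, there exists an interval $[\alpha, \beta]$ such that [...]

$$0 \le F(x) + \varepsilon - F_n(x) \le F(x_{j+1}) + \varepsilon - F(x_j) + \frac{\varepsilon}{2} < 2\varepsilon$$

and therefore $|F_n(x) - F(x)| < \varepsilon$. Thus for $n \ge N(\varepsilon)$, we have

$$|F_n(x) - F(x)| < \varepsilon \quad \text{for every } x \in \mathbb{R}. \tag{17}$$

Relation (17) concludes the proof of the lemma. ▲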
Below, a proof of the WLLN (Theorem 5) is presented without using ch.f.'s. The basic idea is that of suitably truncating the r.v.'s involved, and is due to Khintchine; it was also used by Markov.
ALTERNATIVE PROOF OF THEOREM 5 We proceed as follows: For any $\delta > 0$, we define

$$Y_j(n) = Y_j = \begin{cases} X_j, & \text{if } |X_j| \le \delta n \\ 0, & \text{if } |X_j| > \delta n \end{cases} \qquad\text{and}\qquad Z_j(n) = Z_j = \begin{cases} 0, & \text{if } |X_j| \le \delta n \\ X_j, & \text{if } |X_j| > \delta n. \end{cases}$$

Then, clearly, $X_j = Y_j + Z_j$, $j = 1, \dots, n$. Let us restrict ourselves to the continuous case and let $f$ be the (common) p.d.f. of the $X$'s. Then, [...] for $n$ sufficiently large. Thus, [...]
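The truncation idea behind this proof can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes the truncation level to be $\delta n$ (so $Y_j = X_j$ when $|X_j| \le \delta n$ and $0$ otherwise, with $Z_j = X_j - Y_j$), and it uses Pareto r.v.'s with shape 1.5, which have mean 3 but infinite variance. The distribution, $\delta$, $\varepsilon$ and the sample sizes are all illustrative choices, not taken from the text.

```python
import random

def truncate(xs, delta):
    """Split each X_j into Y_j + Z_j using the level delta * n, n = len(xs)."""
    bound = delta * len(xs)
    ys = [x if abs(x) <= bound else 0.0 for x in xs]
    zs = [0.0 if abs(x) <= bound else x for x in xs]
    return ys, zs

def deviation_frequency(n, eps, reps, rng, mean=3.0):
    """Fraction of replications with |sample mean - mean| > eps,
    for i.i.d. Pareto(shape 1.5) draws (finite mean 3, infinite variance)."""
    count = 0
    for _ in range(reps):
        xs = [rng.paretovariate(1.5) for _ in range(n)]
        if abs(sum(xs) / n - mean) > eps:
            count += 1
    return count / reps

rng = random.Random(1)
xs = [rng.paretovariate(1.5) for _ in range(1000)]
ys, zs = truncate(xs, delta=0.05)
freq = {n: deviation_frequency(n, eps=0.5, reps=200, rng=rng) for n in (10, 100, 2000)}
print(freq)
```

The decomposition $X_j = Y_j + Z_j$ holds exactly, and the estimated probability of a deviation larger than $\varepsilon$ is much smaller for the largest $n$ than for the smallest, as the WLLN suggests even though no variance exists.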
This section is concluded with a result relating convergence in probability and a.s. convergence. More precisely, in Remark 3, it was stated that [...] If $X_n \xrightarrow[n\to\infty]{P} X$, then there is a subsequence $\{n_k\}$ of $\{n\}$ (that is, $n_k \uparrow \infty$, $k \to \infty$) such that $X_{n_k} \xrightarrow[k\to\infty]{\text{a.s.}} X$.
8.6.1 Use Theorem 11 in order to prove Theorem 7′(i).
8.6.2 Do likewise in order to establish part (ii) of Theorem 7′.
Chapter 9 Transformations of Random Variables and Random Vectors
9.1 The Univariate Case
The problem we are concerned with in this section, in its simplest form, is the following:
Let $X$ be an r.v. and let $h$ be a (measurable) function on $\mathbb{R}$ into $\mathbb{R}$, so that $Y = h(X)$ is an r.v. Given the distribution of $X$, we want to determine the distribution of $Y$. Let $P_X$, $P_Y$ be the distributions of $X$ and $Y$, respectively. That is, $P_X(B) = P(X \in B)$, $P_Y(B) = P(Y \in B)$, $B$ (Borel) subset of $\mathbb{R}$. Now $(Y \in B) = [h(X) \in B] = (X \in A)$, where $A = h^{-1}(B) = \{x \in \mathbb{R};\ h(x) \in B\}$. Therefore $P_Y(B) = P(Y \in B) = P(X \in A) = P_X(A)$. Thus we have the following theorem.
Let $X$ be an r.v. and let $h: \mathbb{R} \to \mathbb{R}$ be a (measurable) function, so that $Y = h(X)$ is an r.v. Then the distribution $P_Y$ of the r.v. $Y$ is determined by the distribution $P_X$ of the r.v. $X$ as follows: for any (Borel) subset $B$ of $\mathbb{R}$, $P_Y(B) = P_X(A)$, where $A = h^{-1}(B)$.
Discrete Random Variables
Let $X$ be a discrete r.v. taking the values $x_j$, $j = 1, 2, \dots$, and let $Y = h(X)$. Then $Y$ is also a discrete r.v. taking the values $y_j$, $j = 1, 2, \dots$. We wish to determine $f_Y(y_j) = P(Y = y_j)$, $j = 1, 2, \dots$. By taking $B = \{y_j\}$, we have $A = \{x_i;\ h(x_i) = y_j\}$, and hence

$$f_Y(y_j) = P(Y = y_j) = P_Y(\{y_j\}) = P_X(A) = \sum_{x_i \in A} f_X(x_i), \quad\text{where } f_X(x_i) = P(X = x_i).$$
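EXAMPLE 1 Let $X$ take on the values $-n, \dots, -1, 1, \dots, n$ each with probability $1/(2n)$, and let $Y = X^2$. Then $Y$ takes on the values $1, 4, \dots, n^2$ with probability found as follows: if $B = \{r^2\}$, $r = 1, \dots, n$, then $A = h^{-1}(B) = \{-r, r\}$ and $P_Y(B) = P_X(A) = \frac{1}{2n} + \frac{1}{2n} = \frac{1}{n}$. That is, $P(Y = r^2) = 1/n$, $r = 1, \dots, n$.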
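The discrete recipe, $f_Y(y) = \sum_{x:\,h(x) = y} f_X(x)$, is easy to mechanize. The following sketch (the helper name `pushforward_pmf` is hypothetical) applies it to an r.v. taking the values $-n, \dots, -1, 1, \dots, n$ with probability $1/(2n)$ each and to $h(x) = x^2$, here with $n = 4$ chosen for illustration; exact fractions avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward_pmf(pmf_x, h):
    """Return the p.d.f. of Y = h(X) for discrete X:
    f_Y(y) is the sum of f_X(x) over all x with h(x) = y."""
    pmf_y = defaultdict(Fraction)
    for x, p in pmf_x.items():
        pmf_y[h(x)] += p
    return dict(pmf_y)

n = 4
pmf_x = {x: Fraction(1, 2 * n) for x in range(-n, n + 1) if x != 0}
pmf_y = pushforward_pmf(pmf_x, lambda x: x * x)
print(pmf_y)
```

Each value $r^2$ collects the mass of the two points $\pm r$, giving probability $1/n$.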
EXAMPLE 2 [...] $x^2 + 2x - 3 = y$, we get $x^2 + 2x - (y + 3) = 0$, so that $x = -1 \pm \sqrt{y + 4}$. Hence $x = -1 + \sqrt{y + 4}$, the root $-1 - \sqrt{y + 4}$ being rejected, since it is negative. Thus, if $B = \{y\}$, then [...]
It is a fact, proved in advanced probability courses, that the distribution $P_X$ of an r.v. $X$ is uniquely determined by its d.f. $F_X$. The same is true for r. vectors. (A first indication that such a result is feasible is provided by Lemma 3 in Chapter 7.) Thus, in determining the distribution $P_Y$ of the r.v. $Y$ above, it suffices to determine its d.f., $F_Y$. This is easily done if the transformation $h$ is one-to-one from $S$ onto $T$ and monotone (increasing or decreasing), where $S$ is the set of values of $X$ for which $f_X$ is positive and $T$ is the image of $S$ under $h$: that is, the set to which $S$ is transformed by $h$. By "one-to-one" it is meant that for each $y \in T$, there is only one $x \in S$ such that $h(x) = y$. Then the inverse transformation, $h^{-1}$, exists and, of course, $h^{-1}[h(x)] = x$. For such a transformation, if $h$ is increasing, we have

$$F_Y(y) = P[h(X) \le y] = P[X \le h^{-1}(y)] = F_X(x), \quad x = h^{-1}(y),$$

and if $h$ is decreasing,

$$F_Y(y) = P[h(X) \le y] = P[X \ge h^{-1}(y)] = 1 - P(X < x) = 1 - F_X(x-),$$

where $F_X(x-)$ is the limit from the left of $F_X$ at $x$; $F_X(x-) = \lim F_X(y)$, $y \uparrow x$.
REMARK 1 Figure 9.1 points out why the direction of the inequality is reversed when $h^{-1}$ is applied if $h$ is monotone decreasing.
Thus we have the following corollary to Theorem 1.
Let $h: S \to T$ be one-to-one and monotone. Then $F_Y(y) = F_X(x)$ if $h$ is increasing, and $F_Y(y) = 1 - F_X(x-)$ if $h$ is decreasing, where $x = h^{-1}(y)$ in either case.
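A small sketch of the corollary, under assumptions chosen only for illustration: $X$ is Negative Exponential with parameter 1, so $F_X(x) = 1 - e^{-x}$ for $x > 0$ (continuous, hence $F_X(x-) = F_X(x)$), and $h$ is either the decreasing map $h(x) = e^{-x}$ or the increasing map $h(x) = x^2$ on $S = (0, \infty)$. The function names are hypothetical.

```python
import math

def cdf_exp(x):
    """d.f. of the Negative Exponential distribution with parameter 1;
    it is continuous, so the left limit F(x-) equals F(x)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def cdf_of_transform(y, h_inverse, increasing, cdf_x=cdf_exp):
    """F_Y(y) for Y = h(X) with h one-to-one and monotone on S, for y in T."""
    x = h_inverse(y)
    return cdf_x(x) if increasing else 1.0 - cdf_x(x)

# Decreasing h(x) = exp(-x) maps S = (0, inf) onto T = (0, 1); here F_Y(y) = y.
f_dec = [cdf_of_transform(y, lambda t: -math.log(t), increasing=False)
         for y in (0.1, 0.5, 0.9)]
# Increasing h(x) = x**2 on S = (0, inf); here F_Y(y) = 1 - exp(-sqrt(y)).
f_inc = cdf_of_transform(4.0, math.sqrt, increasing=True)
print(f_dec, f_inc)
```

In the decreasing case the output reproduces $F_Y(y) = y$, that is, $e^{-X}$ is $U(0, 1)$ under the stated assumption on $X$.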
REMARK 2 Of course, it is possible that the d.f. $F_Y$ of $Y$ can be expressed in terms of the d.f. $F_X$ of $X$ even though $h$ does not satisfy the requirements of the corollary above. Here is an example of such a case. [...]
We will now focus attention on the case that $X$ has a p.d.f. and we will determine the p.d.f. of $Y = h(X)$, under appropriate conditions.
One way of going about this problem would be to find the d.f. $F_Y$ of the r.v. $Y$ by Theorem 1 (take $B = (-\infty, y]$, $y \in \mathbb{R}$), and then determine the p.d.f. $f_Y$ of $Y$, provided it exists, by differentiating (for the continuous case) $F_Y$ at continuity points of $f_Y$. The following example illustrates the procedure.
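EXAMPLE 4 In Example 3, assume that $X$ is $N(0, 1)$, so that

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$

[...]

$$f_Y(y) = \frac{1}{\sqrt{2\pi}}\, y^{-1/2} e^{-y/2},$$

$y \ge 0$ and zero otherwise. We recognize it as being the p.d.f. of a $\chi^2_1$ distributed r.v., which agrees with Theorem 3, Chapter 4.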
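The d.f. route can be checked numerically. The sketch below assumes $Y = X^2$ with $X$ distributed as $N(0, 1)$, writes $F_Y(y) = F_X(\sqrt{y}) - F_X(-\sqrt{y}) = \operatorname{erf}(\sqrt{y/2})$, differentiates it by a central difference, and compares with the $\chi^2_1$ p.d.f. The step size and evaluation points are arbitrary.

```python
import math

def cdf_y(y):
    """F_Y(y) = P(X**2 <= y) = F_X(sqrt(y)) - F_X(-sqrt(y)) for X ~ N(0, 1)."""
    return math.erf(math.sqrt(y / 2.0)) if y > 0 else 0.0

def pdf_chi2_1(y):
    """p.d.f. of the chi-square distribution with 1 degree of freedom, y > 0."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

def numeric_derivative(f, y, step=1e-5):
    """Central-difference approximation of f'(y)."""
    return (f(y + step) - f(y - step)) / (2.0 * step)

checks = [(y, numeric_derivative(cdf_y, y), pdf_chi2_1(y)) for y in (0.5, 1.0, 3.0)]
for y, approx, exact in checks:
    print(y, approx, exact)
```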
Another approach to the same problem is the following.
THEOREM 2 Let $X$ be an r.v. whose p.d.f. $f_X$ is continuous on the set $S$ of positivity of $f_X$. Let $y = h(x)$ be a (measurable) transformation defined on $\mathbb{R}$ into $\mathbb{R}$ which is one-to-one on the set $S$ onto the set $T$ (the image of $S$ under $h$). Then the inverse transformation $x = h^{-1}(y)$ exists for $y \in T$. It is further assumed that $h^{-1}$ is differentiable and its derivative is continuous and different from zero on $T$. Set $Y = h(X)$, so that $Y$ is an r.v. Under the above assumptions, the p.d.f. $f_Y$ of $Y$ is given by the following expression:

$$f_Y(y) = \begin{cases} f_X[h^{-1}(y)]\,\left|\dfrac{d}{dy}h^{-1}(y)\right|, & y \in T \\ 0, & \text{otherwise.} \end{cases}$$

([...] Mathematical Analysis, Addison-Wesley, 1957, pp. 216 and 270–271) and [...]
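EXAMPLE 5 Let $X$ be $N(\mu, \sigma^2)$ and let $y = h(x) = ax + b$, where $a, b \in \mathbb{R}$, $a \ne 0$, are constants, so that $Y = aX + b$. We wish to determine the p.d.f. of the r.v. $Y$.
Here the transformation $h: \mathbb{R} \to \mathbb{R}$, clearly, satisfies the conditions of Theorem 2. We have $h^{-1}(y) = (y - b)/a$ and $\frac{d}{dy}h^{-1}(y) = 1/a$. Therefore

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma\,|a|} \exp\left[-\frac{\big(y - (a\mu + b)\big)^2}{2a^2\sigma^2}\right],$$

which is the p.d.f. of a normally distributed r.v. with mean $a\mu + b$ and variance $a^2\sigma^2$.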
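The one-to-one formula $f_Y(y) = f_X[h^{-1}(y)]\,|\frac{d}{dy}h^{-1}(y)|$ can be sketched directly for the linear map $h(x) = ax + b$ applied to a normal r.v. The parameter values below ($a = -2$, $b = 3$, $\mu = 1$, $\sigma = 0.5$) are arbitrary illustrations, and the comparison target $N(a\mu + b, a^2\sigma^2)$ is the standard result for linear maps of normal r.v.'s.

```python
import math

def normal_pdf(x, mu, sigma):
    """p.d.f. of N(mu, sigma**2)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def pdf_linear(y, a, b, mu, sigma):
    """f_Y(y) = f_X(h^{-1}(y)) * |d/dy h^{-1}(y)| for h(x) = a*x + b, a != 0."""
    x = (y - b) / a                     # h^{-1}(y)
    return normal_pdf(x, mu, sigma) * abs(1.0 / a)

a, b, mu, sigma = -2.0, 3.0, 1.0, 0.5
ys = (-1.0, 1.0, 2.5)
via_theorem = [pdf_linear(y, a, b, mu, sigma) for y in ys]
closed_form = [normal_pdf(y, a * mu + b, abs(a) * sigma) for y in ys]
print(via_theorem)
print(closed_form)
```

The two lists agree, including for negative $a$, which is why the absolute value of the derivative appears in the formula.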
Now it may happen that the transformation $h$ satisfies all the requirements of Theorem 2 except that it is not one-to-one from $S$ onto $T$. Instead, the following might happen: There is a (finite) partition of $S$, which we denote by $\{S_j,\ j = 1, \dots, r\}$, and there are $r$ subsets of $T$, which we denote by $T_j$, $j = 1, \dots, r$ (note that $\bigcup_{j=1}^{r} T_j = T$, but the $T_j$'s need not be disjoint), such that $h: S_j \to T_j$, $j = 1, \dots, r$, is one-to-one. Then by an argument similar to the one used in proving Theorem 2, we can establish the following theorem.
THEOREM 3 Let the r.v. $X$ have a continuous p.d.f. $f_X$ on the set $S$ on which it is positive, and let $y = h(x)$ be a (measurable) transformation defined on $\mathbb{R}$ into $\mathbb{R}$, so that $Y = h(X)$ is an r.v. Suppose that there is a partition $\{S_j,\ j = 1, \dots, r\}$ of $S$ and subsets $T_j$, $j = 1, \dots, r$, of $T$ (the image of $S$ under $h$), which need not be distinct or disjoint, such that $\bigcup_{j=1}^{r} T_j = T$ and that $h$ defined on each one of $S_j$ onto $T_j$, $j = 1, \dots, r$, is one-to-one. Let $h_j$ be the restriction of the transformation $h$ to $S_j$ and let $h_j^{-1}$ be its inverse, $j = 1, \dots, r$. Assume that $h_j^{-1}$ is differentiable and its derivative is continuous and $\ne 0$ on $T_j$, $j = 1, \dots, r$. Then the p.d.f. $f_Y$ of $Y$ is given by

$$f_Y(y) = \sum_{j:\ y \in T_j} f_X[h_j^{-1}(y)]\,\left|\frac{d}{dy}h_j^{-1}(y)\right|, \quad y \in T,$$

and $f_Y(y) = 0$ otherwise.
This result simply says that for each one of the $r$ pairs of regions $(S_j, T_j)$, $j = 1, \dots, r$, we work as we did in Theorem 2 in order to find

$$f_{Y_j}(y) = f_X[h_j^{-1}(y)]\,\left|\frac{d}{dy}h_j^{-1}(y)\right|;$$

then if a $y$ in $T$ belongs to $k$ of the regions $T_j$, $j = 1, \dots, r$ ($0 \le k \le r$), we find $f_Y(y)$ by summing up the corresponding $f_{Y_j}(y)$'s. The following example will serve to illustrate the point.
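EXAMPLE 6 Consider the r.v. $X$ and let $Y = h(X) = X^2$. We want to determine the p.d.f. $f_Y$ of the r.v. $Y$. Here the conditions of Theorem 3 are clearly satisfied with

$$S_1 = (-\infty, 0], \quad S_2 = (0, \infty), \quad T_1 = [0, \infty), \quad T_2 = (0, \infty)$$

by assuming that $f_X(x) > 0$ for every $x \in \mathbb{R}$. Next,

$$h_1^{-1}(y) = -\sqrt{y}, \quad h_2^{-1}(y) = \sqrt{y}, \quad\text{so that}\quad \frac{d}{dy}h_1^{-1}(y) = -\frac{1}{2\sqrt{y}}, \quad \frac{d}{dy}h_2^{-1}(y) = \frac{1}{2\sqrt{y}}, \quad y > 0.$$

Therefore, for $y > 0$,

$$f_Y(y) = \frac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right],$$

provided $\pm\sqrt{y}$ are continuity points of $f_X$. In particular, if $X$ is $N(0, 1)$, we arrive at the conclusion that $f_Y(y)$ is the p.d.f. of a $\chi^2_1$ r.v., as we also saw in Example [...]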
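The branch-summing rule can be sketched for $Y = X^2$ with the two inverse branches $h_1^{-1}(y) = -\sqrt{y}$ and $h_2^{-1}(y) = \sqrt{y}$. As an assumption made only for this illustration, $X$ is taken to be $N(1, 1)$, a non-symmetric choice so that the two branches contribute different amounts; the result is compared with a numerical derivative of $F_Y(y) = P(-\sqrt{y} \le X \le \sqrt{y})$.

```python
import math

def phi_cdf(x):
    """Standard normal d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pdf_x(x, mu=1.0):
    """p.d.f. of N(mu, 1) (assumed distribution of X for this sketch)."""
    return math.exp(-((x - mu) ** 2) / 2.0) / math.sqrt(2.0 * math.pi)

def pdf_square(y):
    """f_Y(y) for Y = X**2, summing over the branches -sqrt(y) and +sqrt(y)."""
    if y <= 0:
        return 0.0
    root = math.sqrt(y)
    jac = 1.0 / (2.0 * root)          # |d/dy h_j^{-1}(y)| is the same on both branches
    return (pdf_x(-root) + pdf_x(root)) * jac

def cdf_square(y, mu=1.0):
    """F_Y(y) = P(-sqrt(y) <= X <= sqrt(y)) for X ~ N(mu, 1)."""
    root = math.sqrt(y)
    return phi_cdf(root - mu) - phi_cdf(-root - mu)

step = 1e-5
pairs = [(pdf_square(y), (cdf_square(y + step) - cdf_square(y - step)) / (2 * step))
         for y in (0.25, 1.0, 4.0)]
for by_branches, by_derivative in pairs:
    print(by_branches, by_derivative)
```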
i) Express the p.d.f. of $Y$ in terms of that of $X$, and notice that $Y$ is a discrete r.v. whereas $X$ is an r.v. of the continuous type;
ii) If $n = 3$, $X$ is $N(99, 5)$ and $B_1 = (95, 105)$, $B_2 = (92, 95) + (105, 107)$, $B_3 = (-\infty, 92] + [107, \infty)$, determine the distribution of the r.v. $Y$ defined above;
iii) If $X$ is interpreted as a specified measurement taken on each item of a product made by a certain manufacturing process and $c_j$, $j = 1, 2, 3$, are the profit (in dollars) realized by selling one item under the condition that $X \in B_j$, $j = 1, 2, 3$, respectively, find the expected profit from the sale of one item.
9.1.3 Let $X$, $Y$ be r.v.'s representing the temperature of a certain object in degrees Celsius and Fahrenheit, respectively. Then it is known that $Y = \frac{9}{5}X + 32$. If $X$ is distributed as $N(\mu, \sigma^2)$, determine the p.d.f. of $Y$, first by determining its d.f., and secondly directly.
9.1.4 If the r.v. $X$ is distributed as Negative Exponential with parameter $\lambda$, find the p.d.f. of each one of the r.v.'s $Y$, $Z$, where $Y = e^X$ [...]
ii) What do the p.d.f.'s in part (i) become for $\alpha = 0$ and $\beta = 1$?
iii) For $\alpha = 0$ and $\beta = 1$, let $Y = \log X$ and suppose that the r.v.'s $Y_j$, $j = 1, \dots, n$, are independent and distributed as the r.v. $Y$. Use the ch.f. approach to determine the p.d.f. of $-\sum_{j=1}^{n} Y_j$.
9.1.6 If the r.v. $X$ is distributed as $U(-\frac{1}{2}\pi, \frac{1}{2}\pi)$, show that the r.v. $Y = \tan X$ is distributed as Cauchy. Also find the distribution of the r.v. $Z = \sin X$.
9.1.7 If the r.v. $X$ has the Gamma distribution with parameters $\alpha$, $\beta$, and $Y = 2X/\beta$, show that $Y \sim \chi^2_{2\alpha}$ [...]
[...] and show that the r.v. $Y = 1/X$ is distributed as $N(0, 1)$.
9.1.11 Suppose that the velocity $X$ of a molecule of mass $m$ is an r.v. with p.d.f. $f$ given in Exercise 3.3.13(ii) of Chapter 3. Derive the distribution of the r.v. $Y = \frac{1}{2}mX^2$ (which is the kinetic energy of the molecule).
9.1.12 If the r.v. $X$ is distributed as $N(\mu, \sigma^2)$, show, by means of a transformation, that the r.v. $Y = [(X - \mu)/\sigma]^2$ is distributed as $\chi^2_1$.
9.2 The Multivariate Case
What has been discussed in the previous section carries over to the multidimensional case with the appropriate modifications.
Let $\mathbf{X} = (X_1, \dots, X_k)'$ be a $k$-dimensional r. vector and let $h: \mathbb{R}^k \to \mathbb{R}^m$ be a (measurable) function, so that $\mathbf{Y} = h(\mathbf{X})$ is an r. vector. Then the distribution $P_{\mathbf{Y}}$ of the r. vector $\mathbf{Y}$ is determined by the distribution $P_{\mathbf{X}}$ of the r. vector $\mathbf{X}$ as follows: For any (Borel) subset $B$ of $\mathbb{R}^m$, $P_{\mathbf{Y}}(B) = P_{\mathbf{X}}(A)$, where $A = h^{-1}(B)$.
The proof of this theorem is carried out in exactly the same way as that of Theorem 1. As in the univariate case, the distribution $P_{\mathbf{Y}}$ of the r. vector $\mathbf{Y}$ is uniquely determined by its d.f. $F_{\mathbf{Y}}$.
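Let $X_1$, $X_2$ be independent r.v.'s distributed as $U(\alpha, \beta)$. We wish to determine the d.f. of the r.v. $Y = X_1 + X_2$. We have

$$F_Y(y) = P(X_1 + X_2 \le y) = \frac{1}{(\beta - \alpha)^2}\,A,$$

where $A$ is the area of that part of the square $(\alpha, \beta) \times (\alpha, \beta)$ lying to the left of the line $x_1 + x_2 = y$. Computing $A$ in each case gives

$$F_Y(y) = \begin{cases} 0, & y \le 2\alpha \\[4pt] \dfrac{(y - 2\alpha)^2}{2(\beta - \alpha)^2}, & 2\alpha < y \le \alpha + \beta \\[8pt] 1 - \dfrac{(2\beta - y)^2}{2(\beta - \alpha)^2}, & \alpha + \beta < y \le 2\beta \\[8pt] 1, & y > 2\beta. \end{cases}$$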
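The area argument for the d.f. of $X_1 + X_2$, with $X_1$, $X_2$ independent $U(\alpha, \beta)$, can be compared with a simulation. The sketch below encodes the piecewise d.f. obtained from the areas of the triangles cut off by the line $x_1 + x_2 = y$ and checks it against empirical frequencies; $\alpha = 1$, $\beta = 3$, the seed and the number of replications are illustrative choices.

```python
import random

def cdf_sum_uniform(y, alpha, beta):
    """d.f. of X1 + X2 for independent U(alpha, beta) r.v.'s,
    computed as (area to the left of the line x1 + x2 = y) / (beta - alpha)**2."""
    side_sq = (beta - alpha) ** 2
    if y <= 2 * alpha:
        return 0.0
    if y <= alpha + beta:
        return (y - 2 * alpha) ** 2 / (2 * side_sq)
    if y <= 2 * beta:
        return 1.0 - (2 * beta - y) ** 2 / (2 * side_sq)
    return 1.0

def empirical_cdf(y, alpha, beta, reps=20000, seed=0):
    """Monte Carlo estimate of P(X1 + X2 <= y)."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(alpha, beta) + rng.uniform(alpha, beta) <= y for _ in range(reps))
    return hits / reps

alpha, beta = 1.0, 3.0
comparison = [(y, cdf_sum_uniform(y, alpha, beta), empirical_cdf(y, alpha, beta))
              for y in (3.0, 4.0, 5.5)]
for row in comparison:
    print(row)
```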
REMARK 3 The d.f. of $X_1 + X_2$ for any two independent r.v.'s (not necessarily $U(\alpha, \beta)$ distributed) is called the convolution of the d.f.'s of $X_1$, $X_2$ and is denoted by $F_{X_1+X_2} = F_{X_1} * F_{X_2}$. We also write $f_{X_1+X_2} = f_{X_1} * f_{X_2}$ for the corresponding p.d.f.'s. These concepts generalize to any (finite) number of r.v.'s.
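EXAMPLE 8 Let $X_1$ be $B(n_1, p)$, $X_2$ be $B(n_2, p)$ and independent. Let $Y_1 = X_1 + X_2$ and $Y_2 = X_2$. We want to find the joint p.d.f. of $Y_1$, $Y_2$ and also the marginal p.d.f. of $Y_1$, and the conditional p.d.f. of $Y_2$, given $Y_1 = y_1$. With $q = 1 - p$,

$$f_{Y_1,Y_2}(y_1, y_2) = P(X_1 = y_1 - y_2,\ X_2 = y_2) = \binom{n_1}{y_1 - y_2}\binom{n_2}{y_2}\, p^{y_1} q^{\,n_1 + n_2 - y_1},$$

for $0 \le y_1 \le n_1 + n_2$ and $u = \max(0, y_1 - n_1) \le y_2 \le \min(y_1, n_2) = \upsilon$. Next, for the four possible values of the pair $(u, \upsilon)$, we have

$$\sum_{y_2 = u}^{\upsilon} \binom{n_1}{y_1 - y_2}\binom{n_2}{y_2} = \binom{n_1 + n_2}{y_1},$$

so that $f_{Y_1}(y_1) = \binom{n_1 + n_2}{y_1} p^{y_1} q^{\,n_1 + n_2 - y_1}$; that is, $Y_1$ is $B(n_1 + n_2, p)$. Finally,

$$f_{Y_2 \mid Y_1}(y_2 \mid y_1) = \frac{\binom{n_1}{y_1 - y_2}\binom{n_2}{y_2}}{\binom{n_1 + n_2}{y_1}},$$

the hypergeometric p.d.f., independent of $p$!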
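The claim that the conditional p.d.f. of $Y_2 = X_2$ given $Y_1 = X_1 + X_2 = y_1$ is hypergeometric and free of $p$ can be verified exactly. The sketch below assumes the joint p.d.f. $P(X_1 = y_1 - y_2)\,P(X_2 = y_2)$ implied by independence of the two binomials; the values $n_1 = 4$, $n_2 = 3$, $y_1 = 5$ and the two values of $p$ are arbitrary, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import comb

def joint_pmf(y1, y2, n1, n2, p):
    """P(Y1 = y1, Y2 = y2) for Y1 = X1 + X2, Y2 = X2,
    with X1 ~ B(n1, p) and X2 ~ B(n2, p) independent."""
    x1 = y1 - y2
    if not (0 <= x1 <= n1 and 0 <= y2 <= n2):
        return Fraction(0)
    q = 1 - p
    return comb(n1, x1) * comb(n2, y2) * p ** y1 * q ** (n1 + n2 - y1)

def conditional_pmf(y2, y1, n1, n2, p):
    """P(Y2 = y2 | Y1 = y1): joint p.d.f. divided by the marginal of Y1."""
    marginal = sum(joint_pmf(y1, v, n1, n2, p) for v in range(n2 + 1))
    return joint_pmf(y1, y2, n1, n2, p) / marginal

n1, n2, y1 = 4, 3, 5
cond_a = [conditional_pmf(v, y1, n1, n2, Fraction(1, 3)) for v in range(n2 + 1)]
cond_b = [conditional_pmf(v, y1, n1, n2, Fraction(3, 4)) for v in range(n2 + 1)]
hyper = [Fraction(comb(n1, y1 - v) * comb(n2, v), comb(n1 + n2, y1)) if 0 <= y1 - v <= n1
         else Fraction(0) for v in range(n2 + 1)]
print(cond_a)
```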
We next have two theorems analogous to Theorems 2 and 3 in Section 1. That is,
THEOREM 2′ Let the $k$-dimensional r. vector $\mathbf{X}$ have continuous p.d.f. $f_{\mathbf{X}}$ on the set $S$ on which it is positive, and let [...] $k$-dimensional r. vector. Suppose that $h$ is one-to-one on $S$ onto $T$ (the image of $S$ under $h$), so that the inverse transformation [...]
REMARK 4 In Theorem 2′, the transformation $h$ transforms the $k$-dimensional r. vector $\mathbf{X}$ to the $k$-dimensional r. vector $\mathbf{Y}$. In many applications, however, the dimensionality $m$ of $\mathbf{Y}$ is less than $k$. Then in order to determine the p.d.f. of $\mathbf{Y}$, we work as follows. Let $\mathbf{y} = (h_1(\mathbf{x}), \dots, h_m(\mathbf{x}))'$ and choose another $k - m$ transformations defined on $\mathbb{R}^k$ into $\mathbb{R}$, $h_{m+j}$, $j = 1, \dots, k - m$, say, so that they are of the simplest possible form and such that the transformation

$$h = (h_1, \dots, h_m, h_{m+1}, \dots, h_k)'$$

satisfies the assumptions of Theorem 2′. Set $\mathbf{Z} = (Y_1, \dots, Y_m, Y_{m+1}, \dots, Y_k)'$, where $\mathbf{Y} = (Y_1, \dots, Y_m)'$ and $Y_{m+j} = h_{m+j}(\mathbf{X})$, $j = 1, \dots, k - m$. Then by applying Theorem 2′, we obtain the p.d.f. $f_{\mathbf{Z}}$ of $\mathbf{Z}$ and then integrating out the last $k - m$ arguments $y_{m+j}$, $j = 1, \dots, k - m$, we have the p.d.f. of $\mathbf{Y}$.
A number of examples will be presented to illustrate the application of Theorem 2′ as well as of the preceding remark.
Let $X_1$, $X_2$ be i.i.d. r.v.'s distributed as $U(\alpha, \beta)$. Set $Y_1 = X_1 + X_2$ and find the [...] and also $\alpha < y_2 < \beta$. Since $y_1 - y_2 = x_1$, $\alpha < x_1 < \beta$, we have $\alpha < y_1 - y_2 < \beta$. Thus the limits of $y_1$, $y_2$ are specified by $\alpha < y_2 < \beta$, $\alpha < y_1 - y_2 < \beta$. (See Figs. 9.3 and 9.4.) [...]
REMARK 5 This density is known as the triangular p.d.f.
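The procedure of adding an auxiliary variable and then integrating it out can be sketched numerically for $Y_1 = X_1 + X_2$ with $X_1$, $X_2$ i.i.d. $U(\alpha, \beta)$. The code assumes the auxiliary choice $Y_2 = X_2$, for which the inverse is $x_1 = y_1 - y_2$, $x_2 = y_2$ with Jacobian 1, integrates the joint p.d.f. over $y_2$ by the midpoint rule, and compares the result with the usual closed form of the triangular p.d.f. on $(2\alpha, 2\beta)$. The values $\alpha = 0$, $\beta = 2$ and the grid size are arbitrary.

```python
def joint_pdf(y1, y2, alpha, beta):
    """f_{Y1,Y2}(y1, y2) = f_{X1,X2}(y1 - y2, y2) * |J|, with |J| = 1
    for the transformation y1 = x1 + x2, y2 = x2."""
    x1, x2 = y1 - y2, y2
    inside = alpha < x1 < beta and alpha < x2 < beta
    return 1.0 / (beta - alpha) ** 2 if inside else 0.0

def marginal_pdf(y1, alpha, beta, steps=20000):
    """Integrate y2 out over (alpha, beta) with the midpoint rule."""
    width = (beta - alpha) / steps
    total = sum(joint_pdf(y1, alpha + (i + 0.5) * width, alpha, beta) for i in range(steps))
    return total * width

def triangular_pdf(y1, alpha, beta):
    """Closed form of the triangular p.d.f. on (2*alpha, 2*beta)."""
    if 2 * alpha < y1 <= alpha + beta:
        return (y1 - 2 * alpha) / (beta - alpha) ** 2
    if alpha + beta < y1 < 2 * beta:
        return (2 * beta - y1) / (beta - alpha) ** 2
    return 0.0

alpha, beta = 0.0, 2.0
table = [(y1, marginal_pdf(y1, alpha, beta), triangular_pdf(y1, alpha, beta))
         for y1 in (0.5, 2.0, 3.2)]
for row in table:
    print(row)
```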
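Let $X_1$, $X_2$ be i.i.d. r.v.'s from $U(1, \beta)$. Set $Y_1 = X_1 X_2$ and find the p.d.f. of $Y_1$. Consider the transformation

$$h: \quad y_1 = x_1 x_2, \quad y_2 = x_2.$$

From $h$, we get

$$x_1 = \frac{y_1}{y_2}, \quad x_2 = y_2, \quad J = \begin{vmatrix} \dfrac{1}{y_2} & -\dfrac{y_1}{y_2^2} \\[6pt] 0 & 1 \end{vmatrix} = \frac{1}{y_2}.$$

Since $1 < x_1, x_2 < \beta$, the set $T$ is specified by $1 < y_2 < \beta$ and $y_2 < y_1 < \beta y_2$, and

$$f_{Y_1,Y_2}(y_1, y_2) = \begin{cases} \dfrac{1}{(\beta - 1)^2}\,\dfrac{1}{y_2}, & (y_1, y_2)' \in T \\[6pt] 0, & \text{otherwise.} \end{cases}$$

Integrating out $y_2$ gives

$$f_{Y_1}(y_1) = \begin{cases} \dfrac{\log y_1}{(\beta - 1)^2}, & 1 < y_1 < \beta \\[8pt] \dfrac{2\log\beta - \log y_1}{(\beta - 1)^2}, & \beta \le y_1 < \beta^2 \\[8pt] 0, & \text{otherwise.} \end{cases}$$

[Figure 9.5: graph of $f_{Y_1}(y_1)$]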
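A quick numerical check of the product example is sketched below. It assumes that, for $Y_1 = X_1 X_2$ with $X_1$, $X_2$ i.i.d. $U(1, \beta)$ and auxiliary variable $Y_2 = X_2$, integrating $1/[(\beta - 1)^2 y_2]$ over the admissible $y_2$ gives $\log y_1/(\beta - 1)^2$ on $(1, \beta)$ and $(2\log\beta - \log y_1)/(\beta - 1)^2$ on $[\beta, \beta^2)$. The code checks that this integrates to 1 and agrees with a simulation; $\beta = 3$, the seed and the sample size are arbitrary.

```python
import math
import random

def pdf_product(y1, beta):
    """p.d.f. of Y1 = X1 * X2 for i.i.d. U(1, beta) r.v.'s (see the lead-in for the assumption)."""
    scale = (beta - 1.0) ** 2
    if 1.0 < y1 < beta:
        return math.log(y1) / scale
    if beta <= y1 < beta ** 2:
        return (2.0 * math.log(beta) - math.log(y1)) / scale
    return 0.0

def cdf_numeric(y, beta, steps=20000):
    """Midpoint-rule integral of the p.d.f. from 1 to y."""
    width = (y - 1.0) / steps
    return sum(pdf_product(1.0 + (i + 0.5) * width, beta) for i in range(steps)) * width

beta = 3.0
total_mass = cdf_numeric(beta ** 2, beta)     # should be close to 1
cdf_at_4 = cdf_numeric(4.0, beta)

rng = random.Random(0)
reps = 20000
empirical = sum(rng.uniform(1, beta) * rng.uniform(1, beta) <= 4.0 for _ in range(reps)) / reps
print(total_mass, cdf_at_4, empirical)
```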
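Let $X_1$, $X_2$ be i.i.d. r.v.'s from $N(0, 1)$. Show that the p.d.f. of the r.v. $Y_1 = X_1/X_2$ is Cauchy with $\mu = 0$, $\sigma = 1$; that is,

$$f_{Y_1}(y_1) = \frac{1}{\pi}\cdot\frac{1}{1 + y_1^2}, \quad y_1 \in \mathbb{R}.$$

Set $Y_2 = X_2$ and consider the transformation $y_1 = x_1/x_2$, $y_2 = x_2$ ($x_2 \ne 0$). Then $x_1 = y_1 y_2$, $x_2 = y_2$ and

$$J = \begin{vmatrix} y_2 & y_1 \\ 0 & 1 \end{vmatrix} = y_2, \quad\text{so that}\quad |J| = |y_2|.$$

Since $-\infty < x_1, x_2 < \infty$ implies $-\infty < y_1, y_2 < \infty$, we have

$$f_{Y_1,Y_2}(y_1, y_2) = f_{X_1,X_2}(y_1 y_2, y_2)\,|y_2| = \frac{1}{2\pi}\exp\left[-\frac{(y_1^2 + 1)\,y_2^2}{2}\right]|y_2|,$$

and therefore

$$f_{Y_1}(y_1) = \frac{1}{\pi}\int_0^{\infty} \exp\left[-\frac{(y_1^2 + 1)\,y_2^2}{2}\right] y_2\, dy_2 = \frac{1}{\pi}\cdot\frac{1}{1 + y_1^2}.$$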
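The Cauchy result for the ratio of two independent $N(0, 1)$ r.v.'s can be checked by simulation. The sketch below compares empirical frequencies of $X_1/X_2 \le y$ with the Cauchy d.f. $\frac{1}{2} + \frac{1}{\pi}\arctan y$ (the d.f. corresponding to the p.d.f. $1/[\pi(1 + y^2)]$); the evaluation points, seed and sample size are arbitrary.

```python
import math
import random

def cauchy_cdf(y):
    """d.f. of the Cauchy distribution with mu = 0, sigma = 1."""
    return 0.5 + math.atan(y) / math.pi

def empirical_ratio_cdf(points, reps=40000, seed=0):
    """Empirical d.f. of X1 / X2 for independent N(0, 1) draws, at the given points."""
    rng = random.Random(seed)
    ratios = [rng.gauss(0.0, 1.0) / rng.gauss(0.0, 1.0) for _ in range(reps)]
    return [sum(r <= y for r in ratios) / reps for y in points]

points = (-2.0, 0.0, 1.0, 5.0)
empirical = empirical_ratio_cdf(points)
exact = [cauchy_cdf(y) for y in points]
for y, e, c in zip(points, empirical, exact):
    print(y, e, c)
```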
EXAMPLE 12 [...] as Beta with parameters $\alpha$, $\beta$.
We set $Y_2 = X_1 + X_2$ and consider the transformation: [...]
[...] and prove that $Y_1$ is $U(0, 1)$, $Y_3$ is distributed as Gamma with $\alpha = 3$, $\beta = 1$, and [...]
Now from the transformation, it follows that $x_1, x_2, x_3 \in (0, \infty)$ implies that

$$y_1 \in (0, 1), \quad y_2 \in (0, 1), \quad y_3 \in (0, \infty).$$

Thus [...] the independence of $Y_1$, $Y_2$, $Y_3$ is established. The functional forms of $f_{Y_1}$, $f_{Y_3}$ verify the rest.
9.2.1 Application 2: The t and F Distributions
The density of the t distribution with r degrees of freedom ($t_r$). Let the independent r.v.'s $X$ and $Y$ be distributed as $N(0, 1)$ and $\chi^2_r$, respectively, and set $T = X/\sqrt{Y/r}$. The r.v. $T$ is said to have the (Student's) t-distribution with $r$ degrees of freedom (d.f.) and is often denoted by $t_r$. We want to find its p.d.f. We have: [...]

$$f_T(t) = \frac{\Gamma\left[\frac{1}{2}(r + 1)\right]}{\sqrt{\pi r}\;\Gamma\left(\frac{r}{2}\right)} \cdot \frac{1}{\left(1 + \dfrac{t^2}{r}\right)^{(r+1)/2}}, \quad t \in \mathbb{R}.$$
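The construction $T = X/\sqrt{Y/r}$, with $X$ distributed as $N(0, 1)$ and $Y$ as $\chi^2_r$ independently, can be simulated and compared with the standard $t_r$ density $\Gamma[(r+1)/2]\,/\,[\sqrt{\pi r}\,\Gamma(r/2)] \cdot (1 + t^2/r)^{-(r+1)/2}$. In the sketch below, $r = 5$, the integration range and the sample size are arbitrary, and $\chi^2_r$ is drawn as a Gamma r.v. with shape $r/2$ and scale 2.

```python
import math
import random

def t_pdf(t, r):
    """p.d.f. of the t distribution with r degrees of freedom."""
    const = math.gamma((r + 1) / 2.0) / (math.sqrt(math.pi * r) * math.gamma(r / 2.0))
    return const * (1.0 + t * t / r) ** (-(r + 1) / 2.0)

def integrate(f, lo, hi, steps):
    """Midpoint-rule integral of f over (lo, hi)."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

r = 5
mass = integrate(lambda t: t_pdf(t, r), -200.0, 200.0, steps=80000)
prob_exact = integrate(lambda t: t_pdf(t, r), -200.0, 1.0, steps=80000)

rng = random.Random(0)
reps = 20000
# chi-square with r degrees of freedom drawn as Gamma(shape r/2, scale 2)
draws = [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(r / 2.0, 2.0) / r)
         for _ in range(reps)]
prob_sim = sum(d <= 1.0 for d in draws) / reps
print(mass, prob_exact, prob_sim)
```

The density integrates to approximately 1, and the simulated value of $P(T \le 1)$ is close to the value obtained by integrating the density.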
Let the independent r.v.'s $X$ and $Y$ be distributed as $\chi^2_{r_1}$ and $\chi^2_{r_2}$, respectively, and set $F = (X/r_1)/(Y/r_2)$. The r.v. $F$ is said to have the F distribution with $r_1$, $r_2$ degrees of freedom (d.f.) and is often denoted by $F_{r_1, r_2}$. We want to find its p.d.f. We have: [...]