Independent And Stationary Sequences Of Random Variables - Chapter 15 pptx


Chapter 15

APPROXIMATION OF DISTRIBUTIONS OF SUMS OF INDEPENDENT COMPONENTS BY INFINITELY DIVISIBLE DISTRIBUTIONS

§ 1. Statement of the problem

We here consider the general problem of the limiting behaviour of the distribution function $F_n(x)$ of the sum

$$S_n = X_1 + X_2 + \dots + X_n \tag{15.1.1}$$

of independent random variables with the same distribution $F$, when no further assumptions are made about $F$. It follows from § 2.6 that it is not in general possible to choose normalising constants $A_n$, $B_n$ such that the distribution of $(S_n - A_n)/B_n$ converges to any non-degenerate distribution. Even more is true, for there are distributions for which no subsequence $(S_{n_k} - A_{n_k})/B_{n_k}$ converges in distribution. One such example is the (infinitely divisible) distribution with characteristic function [48]

f (t) = exp ~ J _ e (cos tx -1) d 1 +

4 log IxI )

~-©©

. 1 +

J (cos tx-1) d 4 log x

Although the sequence $F_n(x)$ in general diverges, we can ask the question: does there exist a sequence $D_n(x)$ of infinitely divisible distributions such that, in some sense, $F_n$ and $D_n$ are close for large $n$? The answer is affirmative, and is given by the following theorem.

Theorem 15.1.1. There exists an absolute constant $C$ such that, for any distribution $F$ and any $n$, there exists an infinitely divisible distribution $D_n$ with

$$\sup_x |D_n(x) - F_n(x)| < C n^{-1/3}. \tag{15.1.2}$$


This chapter is devoted to the proof of this theorem, which is completed in § 4, §§ 2, 3 being devoted to some auxiliary propositions which are necessary for the proof.

If $F$ and $G$ are distribution functions, we write

$$|F - G| = \sup_x |F(x) - G(x)| \tag{15.1.3}$$

for the distance used to define strong convergence in § 1.3. Then Theorem 15.1.1 is just the assertion that, for all $F$,

$$\inf_D |F_n - D| < C n^{-1/3}, \tag{15.1.4}$$

or, since $C$ does not depend on $F$,

$$\sup_F \inf_D |F_n - D| < C n^{-1/3}. \tag{15.1.5}$$

The left-hand side of (15.1.5) may be regarded as the greatest distance (in the sense of (15.1.3)) of the set of $n$-fold convolutions $F_n$ from the set of infinitely divisible distributions.
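The distance (15.1.3) is easy to evaluate numerically for lattice distributions. The following Python sketch is an illustration only. It assumes that $F$ is a Bernoulli law with parameter p, so that $F_n$ is binomial, and it compares $F_n$ with the Poisson law of mean np, which is infinitely divisible. The Poisson comparison law and all function names are assumptions made for this example; the law $D_n$ of Theorem 15.1.1 is constructed differently.

```python
import math

def binom_pmf(n, p, k):
    """P(S_n = k) for S_n a sum of n independent Bernoulli(p) variables, 0 < p < 1."""
    log_w = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_w)

def poisson_pmf(lam, k):
    """P(N = k) for N Poisson with mean lam (an infinitely divisible law)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def uniform_distance(n, p):
    """sup_x |F_n(x) - D(x)| with F_n = Binomial(n, p) and D = Poisson(n p)."""
    lam = n * p
    cdf_f = cdf_d = 0.0
    worst = 0.0
    # Both distribution functions jump only at the integers, so it is
    # enough to compare them there; beyond k = n the gap only shrinks.
    for k in range(n + 1):
        cdf_f += binom_pmf(n, p, k)
        cdf_d += poisson_pmf(lam, k)
        worst = max(worst, abs(cdf_f - cdf_d))
    return worst

# Same mean n p = 5 in both cases, but a smaller p in the second.
d_coarse = uniform_distance(100, 0.05)
d_fine = uniform_distance(400, 0.0125)
```

With the mean np held fixed, the computed distance shrinks as p decreases, which is the qualitative behaviour one expects from a Poisson approximation.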

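Throughout this chapter, we shall write

$$\Phi_{\sigma^2}(x) = \frac{1}{\sigma (2\pi)^{1/2}} \int_{-\infty}^{x} e^{-z^2/2\sigma^2}\,dz, \qquad E(x) = \Phi_0(x) = \begin{cases} 0 & (x < 0), \\ 1 & (x > 0); \end{cases}$$

$C_1, C_2, \dots$ will denote absolute constants.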

§ 2. Concentration functions

The concentration function of a random variable $X$ is the function

$$Q_X(l) = Q(l) = \sup_x P(x \le X \le x + l).$$

As a function of $l > 0$, this is non-decreasing and right-continuous.

If $X$, $Y$ are independent random variables, the concentration function of $X + Y$ is not greater than that of either of them. In fact, for all $u$,

$$P(x \le X + Y \le x + l \mid X = u) \le Q_Y(l),$$

and taking expectations,

$$P(x \le X + Y \le x + l) \le Q_Y(l).$$

We shall however need a more precise estimate for the decrease in the concentration function of a sum of independent random variables. Write $S_n = X_1 + X_2 + \dots + X_n$, where the $X_i$ are independent,

$$Q_i(l) = Q_{X_i}(l), \qquad Q(l) = Q_{S_n}(l), \qquad s = \sum_{i=1}^{n} \{1 - Q_i(l)\}.$$

Theorem 15.2.1. There exists an absolute constant $C_1$ such that, for all $L \ge l$,

$$Q(L) \le \frac{C_1 L}{l s^{1/2}}. \tag{15.2.1}$$

The proof of this theorem requires a number of auxiliary results.

Lemma 15.2.1. Let $\mathfrak{A}$ be a set of $n$ elements, and $K$ a class of subsets of $\mathfrak{A}$ such that no member of $K$ is contained in any other member. Then the number $\nu$ of members of $K$ does not exceed $\binom{n}{[\frac12 n]}$.

Proof. Among the classes $K$ satisfying the conditions of the lemma, we can choose one, $K_0$, with the greatest possible size (number of elements). Assume for the sake of argument that $n$ is even ($n = 2m$); the argument for odd $n$ is similar. We show that all the subsets in $K_0$ have the same size $m$.

Suppose if possible that $K_0$ contains $r \ge 1$ sets of size $k \ge m + 1 > \tfrac12 (n+1)$, denoted by $A_1, A_2, \dots, A_r$, and none of size $> k$. Each $A_i$ has $k$ subsets $A_{i1}, A_{i2}, \dots, A_{ik}$ of size $(k-1)$, but the collection $\{A_{ij};\ i = 1, 2, \dots, r;\ j = 1, 2, \dots, k\}$ may contain a given set more than once; enumerate the distinct members of the collection as $B_1, B_2, \dots, B_s$. Each $B_\alpha$ can be a subset of at most $(n - k + 1)$ of the $A_i$, and so can appear at most $(n - k + 1)$ times in the collection $\{A_{ij}\}$. Thus

$$s(n - k + 1) \ge rk,$$

and since $k > \tfrac12 (n+1)$, this implies that

$$s > r.$$


Thus the class $K'$ obtained from $K_0$ by replacing $A_1, \dots, A_r$ by $B_1, \dots, B_s$ is larger than $K_0$ and satisfies the conditions of the lemma, and this contradicts the assumption that $K_0$ is maximal.

The contradiction shows that the members of $K_0$ each have size $\le m$. An exactly similar argument shows that the members of $K_0$ all have size $\ge m$. Thus the number of members of $K_0$ cannot exceed the number of subsets of $\mathfrak{A}$ of size $m$, which is $\binom{n}{m}$.
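For very small $n$ the bound of Lemma 15.2.1 can be confirmed by exhaustive search. The Python sketch below is an illustration under stated assumptions, not part of the proof: subsets of $\{0, \dots, n-1\}$ are encoded as bitmasks, every class of subsets is enumerated, and the helper names are hypothetical. The search is exponential in $2^n$, so it is only practical for $n \le 4$.

```python
from math import comb

def is_antichain(family):
    """True if no member of the family is contained in another member."""
    # For bitmasks, a is a subset of b exactly when a & b == a.
    return not any(a != b and (a & b) == a for a in family for b in family)

def largest_antichain(n):
    """Size of the largest class of subsets of an n-set with no inclusions."""
    subsets = list(range(1 << n))          # each integer encodes one subset
    best = 0
    for mask in range(1 << len(subsets)):  # each mask selects one class K
        family = [s for s in subsets if (mask >> s) & 1]
        if len(family) > best and is_antichain(family):
            best = len(family)
    return best

sizes = {n: largest_antichain(n) for n in range(1, 5)}
```

For $n = 1, 2, 3, 4$ the search returns the binomial coefficient $\binom{n}{[\frac12 n]}$, so the bound of the lemma is attained in these cases.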

Lemma 15.2.2. If the random variables $X_i$ in Theorem 15.2.1 have distributions given by

$$P(X_i = a_i) = P(X_i = -a_i) = \tfrac12,$$

where $a_i \ge l$, then

$$Q(2l - 0) \le \binom{n}{[\frac12 n]} 2^{-n}. \tag{15.2.2}$$

Proof. The probability of $\{x < S_n < x + 2l\}$ is equal to $2^{-n}$ times the number of sums of the form

$$\sum_{k=1}^{n} \varepsilon_k a_k \qquad (\varepsilon_k = \pm 1)$$

falling in the interval $(x, x + 2l)$. For any such sum, consider the subset of $\{1, 2, \dots, n\}$ consisting of those $k$ for which $\varepsilon_k = 1$. Then this collection of subsets satisfies the conditions of Lemma 15.2.1, for if the subset corresponding to $\sum \varepsilon_k a_k$ is contained in that corresponding to $\sum \varepsilon_k' a_k$ we clearly have

$$\sum \varepsilon_k' a_k - \sum \varepsilon_k a_k \ge 2l,$$

which gives a contradiction. Thus Lemma 15.2.1 shows that there can be at most $\binom{n}{[\frac12 n]}$ sums of the form $\sum \varepsilon_k a_k$ lying in any interval $(x, x + 2l)$.

Lemma 15.2.3. Under the conditions of the previous lemma,

$$Q(L) \le \frac{C_2 L}{l n^{1/2}}. \tag{15.2.3}$$

Proof. By Lemma 15.2.2,

$$Q(l) \le Q(2l - 0) \le \binom{n}{[\frac12 n]} 2^{-n},$$

and by Stirling's formula,

$$\binom{n}{[\frac12 n]} 2^{-n} \le C_3 n^{-1/2}.$$

Hence

$$Q(L) = \sup_x P(x \le S_n \le x + L) \le \sum_{j=0}^{[L/l]} \sup_x P(x + jl \le S_n \le x + (j+1)l) = Q(l)([L/l] + 1) \le \frac{2 C_3 L}{l n^{1/2}}.$$
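The counting step in Lemma 15.2.2 can be checked directly for small $n$. The Python sketch below enumerates all $2^n$ signed sums and finds the largest number that fit in an open interval of length $2l$. It is a numerical illustration only; the integer values chosen for the $a_k$, the value $l = 3$ and the function name are assumptions made for the example.

```python
from itertools import product
from math import comb
import random

def max_sums_in_open_interval(a, width):
    """Largest number of signed sums sum(eps_k * a_k), eps_k = +-1,
    that lie in a single open interval (x, x + width)."""
    sums = sorted(sum(e * v for e, v in zip(eps, a))
                  for eps in product((-1, 1), repeat=len(a)))
    best, lo = 0, 0
    for hi, top in enumerate(sums):
        # Points fit in an open interval of this width exactly when
        # the largest minus the smallest is strictly below the width.
        while top - sums[lo] >= width:
            lo += 1
        best = max(best, hi - lo + 1)
    return best

l, n = 3, 8
random.seed(0)
counts = [max_sums_in_open_interval([random.randint(l, 10) for _ in range(n)], 2 * l)
          for _ in range(20)]
equal_case = max_sums_in_open_interval([l] * n, 2 * l)
```

Dividing a count by $2^n$ gives the corresponding probability. When all the $a_k$ equal $l$, the count equals $\binom{n}{[\frac12 n]}$ exactly, so the bound (15.2.2) cannot be improved in general.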

Corollary 15.2.1. If

$$P(X_i = a_i + \beta_i) = P(X_i = -a_i + \beta_i) = \tfrac12,$$

then

$$Q(L) \le \frac{C_2 L}{l n^{1/2}}.$$

Proof. The concentration functions of $S_n$ and $S_n - \sum \beta_i$ coincide.

Proof of Theorem 15.2.1. First suppose that the distribution functions $F_i(x)$ of the $X_i$ are continuous and strictly increasing, so that the inverse functions $F_i^{-1}(\xi)$ are well-defined. The variable $\xi_i = F_i(X_i)$ has

$$P(\xi_i < y) = P\{F_i(X_i) < y\} = P\{X_i < F_i^{-1}(y)\} = F_i\{F_i^{-1}(y)\} = y,$$

so that

$$X_i = F_i^{-1}(\xi_i),$$

where $\xi_1, \xi_2, \dots, \xi_n$ are independent and uniformly distributed on $(0, 1)$. Write $1 - Q_i(l) = 4\varepsilon_i$, $x_i' = F_i^{-1}(\varepsilon_i)$, $x_i'' = F_i^{-1}(1 - \varepsilon_i)$, and note that

$$F_i(x_i'') - F_i(x_i') = 1 - 2\varepsilon_i = \tfrac12 \{1 + Q_i(l)\} \ge Q_i(l),$$

so that $x_i'' - x_i' \ge l$. We consider the random subset $\{i_1, i_2, \dots, i_m\}$ of $\{1, 2, \dots, n\}$ consisting of those $i$ for which $\xi_i < \varepsilon_i$ or $\xi_i > 1 - \varepsilon_i$, and write, for such $i$,

$$Z_i = \begin{cases} \xi_i & \text{if } \xi_i < \varepsilon_i, \\ 1 - \xi_i & \text{if } \xi_i > 1 - \varepsilon_i. \end{cases}$$


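Probabilities conditional on fixed values of $i_1, \dots, i_m$, $Z_{i_1}, \dots, Z_{i_m}$ will be denoted by $\tilde P$. It is clear that, under $\tilde P$, the variables $X_{i_1}, \dots, X_{i_m}$ are independent, with

$$\tilde P(X_k = a_k + x_k) = \tilde P(X_k = a_k - x_k) = \tfrac12,$$

where

$$x_k = \tfrac12 \{F_k^{-1}(1 - Z_k) - F_k^{-1}(Z_k)\}, \qquad a_k = \tfrac12 \{F_k^{-1}(1 - Z_k) + F_k^{-1}(Z_k)\} \qquad (k = i_1, i_2, \dots, i_m).$$

As remarked above,

$$\tilde Q(L) \le Q_1(L), \tag{15.2.4}$$

where $\tilde Q(L)$ and $Q_1(L)$ are the concentration functions, under $\tilde P$, of $S_n$ and of

$$\sum_{r=1}^{m} X_{i_r}$$

respectively. The latter can be estimated using Corollary 15.2.1; since $Z_k < \varepsilon_k$ we have $\min x_k \ge \tfrac12 l$, and so, for $L \ge l$,

$$Q_1(L) \le \frac{2 C_2 L}{l m^{1/2}}. \tag{15.2.5}$$

From this and (15.2.4) we have

$$Q(L) \le E \tilde Q(L) \le E Q_1(L) \le P(m \le \tfrac14 s) \cdot 1 + \frac{4 C_2 L}{l s^{1/2}}. \tag{15.2.6}$$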

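It remains to evaluate $P(m \le \tfrac14 s)$, which we do by noting that $m$ can be regarded as the number of successes in $n$ trials, when the probability of success at the $k$th trial is

$$P(\xi_k < \varepsilon_k) + P(\xi_k > 1 - \varepsilon_k) = 2\varepsilon_k.$$

Thus

$$E(m) = \sum_{i=1}^{n} 2\varepsilon_i = \tfrac12 s, \qquad V(m) = \sum_{i=1}^{n} 2\varepsilon_i (1 - 2\varepsilon_i) \le \tfrac12 s,$$

so that Chebyshev's inequality gives

$$P(m \le \tfrac14 s) \le P(|m - Em| \ge \tfrac14 s) \le 16 s^{-2} V(m) \le 8 s^{-1}. \tag{15.2.7}$$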
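The Chebyshev step (15.2.7) can be compared with the exact tail of $m$, which is a sum of independent indicators with success probabilities $2\varepsilon_k$. The Python sketch below computes that tail by direct convolution. It is an illustration only; the particular values of $\varepsilon_k$ (which must lie in $[0, \tfrac14]$) and the function names are assumptions made for the example.

```python
def success_count_pmf(probs):
    """Exact distribution of the number of successes in independent
    trials with the given success probabilities."""
    pmf = [1.0]
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, w in enumerate(pmf):
            nxt[k] += w * (1.0 - q)      # trial fails
            nxt[k + 1] += w * q          # trial succeeds
        pmf = nxt
    return pmf

def chebyshev_check(eps):
    """Return (P(m <= s/4), 8/s), where trial k succeeds with
    probability 2*eps[k] and s = sum of 4*eps[k]."""
    s = sum(4.0 * e for e in eps)
    pmf = success_count_pmf([2.0 * e for e in eps])
    left_tail = sum(w for k, w in enumerate(pmf) if k <= s / 4.0)
    return left_tail, 8.0 / s

# Hypothetical values of eps_k between 0.02 and 0.20.
eps = [0.02 * (1 + k % 10) for k in range(200)]
tail, bound = chebyshev_check(eps)
```

In such examples the exact tail is far smaller than $8 s^{-1}$; the crude Chebyshev bound is all that the proof requires.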

Combining this with (15.2.6) we have

$$Q(L) \le 8 s^{-1} + \frac{4 C_2 L}{l s^{1/2}} \le \frac{C_1 L}{l s^{1/2}}. \tag{15.2.8}$$


The theorem is therefore proved for variables $X_i$ whose distribution functions are continuous and strictly increasing. The general case can easily be deduced as follows. Replace $X_i$ by $X_i' = X_i + \eta_i$, where the $\eta_i$ are independent of each other and of the $X_i$, having the normal distribution $N(0, \sigma)$. The distribution functions of the $X_i'$ are continuous and strictly increasing (§ 1.2), and so

$$Q'(L) \le \frac{C_1 L}{l s'^{1/2}}, \tag{15.2.9}$$

where

$$Q'(L) = Q_{X_1' + \dots + X_n'}(L), \qquad s' = \sum_{i=1}^{n} \{1 - Q_{X_i'}(l)\}.$$

As $\sigma \to 0$, $Q_{X_i'} \to Q_i$ and $Q' \to Q$ at points of continuity of $Q_i$ and $Q$, which form dense sets. Since $C_1$ is absolute, it follows that

$$Q(L) \le \frac{C_1 L}{l s^{1/2}}.$$
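For integer-valued summands the quantities in Theorem 15.2.1 can be computed exactly, which gives a feeling for the size of the ratio $Q(L)\, l\, s^{1/2} / L$. The Python sketch below is an illustration under stated assumptions: the summands are identically distributed on $\{0, 1, 2, 3\}$, $l$ and $L$ are integers, and the function names are hypothetical. It does not determine the constant $C_1$.

```python
import math

def convolve(f, g):
    """Distribution of X + Y for independent lattice variables with pmfs f, g."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def concentration(pmf, length):
    """Q(length) = max_x P(x <= X <= x + length) for X on 0..len(pmf)-1
    and an integer length (a closed window covers length + 1 points)."""
    width = length + 1
    window = sum(pmf[:width])
    best = window
    for i in range(width, len(pmf)):
        window += pmf[i] - pmf[i - width]   # slide the window one step
        best = max(best, window)
    return best

def bound_ratio(summand, n, l, L):
    """Q(L) * l * sqrt(s) / L for the sum of n independent copies of `summand`."""
    s = n * (1.0 - concentration(summand, l))
    total = [1.0]
    for _ in range(n):
        total = convolve(total, summand)
    return concentration(total, L) * l * math.sqrt(s) / L

uniform4 = [0.25] * 4
ratios = [bound_ratio(uniform4, 50, 1, L) for L in (1, 2, 5, 10, 20)]
```

In this example the ratio stays well below 1 for every $L$ tried, in line with a bound of the form (15.2.1).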

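Corollary 15.2.2. If the distributions of the $X_k$ satisfy

$$P(X_k \le x_k^-) = P(X_k \ge x_k^+) = \tfrac12, \qquad x_k^+ - x_k^- \ge l,$$

then

$$Q(L) \le \frac{C_4 L}{l n^{1/2}}. \tag{15.2.10}$$

Proof. Under the condition stated, $Q_i(l) \le \tfrac12$, so that

$$s = \sum_{i=1}^{n} \{1 - Q_i(l)\} \ge \tfrac12 n,$$

and

$$Q(L) \le \frac{C_1 L}{l s^{1/2}} \le \frac{2^{1/2} C_1 L}{l n^{1/2}} = \frac{C_4 L}{l n^{1/2}}.$$

§ 3. Auxiliary propositions

Lemma 15.3.1. If $\sigma > 0$, $\sigma_1 > 0$, then

$$|\Phi_{\sigma^2}(x) - \Phi_{\sigma_1^2}(x)| \le C_5 \left| \frac{\sigma_1}{\sigma} - 1 \right|. \tag{15.3.1}$$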

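Proof. Without loss of generality we can suppose that $|(\sigma_1/\sigma) - 1| < \tfrac12$ and that $\sigma > \sigma_1$, $x > 0$. Then

$$|\Phi_{\sigma^2}(x) - \Phi_{\sigma_1^2}(x)| = (2\pi)^{-1/2} \left| \int_{-\infty}^{x} \left( \sigma_1^{-1} e^{-z^2/2\sigma_1^2} - \sigma^{-1} e^{-z^2/2\sigma^2} \right) dz \right|$$

$$\le (2\pi\sigma_1^2)^{-1/2} \int_{-\infty}^{\infty} e^{-z^2/2\sigma^2} \left| e^{-\frac12 z^2 (\sigma_1^{-2} - \sigma^{-2})} - \frac{\sigma_1}{\sigma} \right| dz$$

$$\le (2\pi\sigma_1^2)^{-1/2} \int_{-\infty}^{\infty} e^{-z^2/2\sigma^2} \left\{ \left(1 - \frac{\sigma_1}{\sigma}\right) + \tfrac12 z^2 (\sigma_1^{-2} - \sigma^{-2}) \right\} dz \le C_5 \left| \frac{\sigma_1}{\sigma} - 1 \right|.$$

Lemma 15.3.2. If the distribution function $F(x)$ satisfies

$$\int_{-\infty}^{\infty} x\,dF(x) = 0, \qquad \int_{-\infty}^{\infty} x^2\,dF(x) = \sigma^2 > 0,$$

then, for $h \ge \sigma$,

$$\sum_{r=-\infty}^{\infty} \sup_{rh < x \le (r+1)h} |F(x) - E(x)| \le C_6. \tag{15.3.2}$$

Proof. If $\xi$ is a random variable with distribution $F$, then by Chebyshev's inequality,

$$|F(x) - E(x)| \le P(|\xi| \ge |x|) \le \sigma^2/x^2,$$

so that the sum in (15.3.2) is bounded by

$$2 + \sum_{r \ne 0} \frac{\sigma^2}{h^2 r^2} \le C_6.$$

Lemma 15.3.3. Let $X_1, X_2, \dots, X_n$ be independent and identically distributed with

$$E(X_k) = 0, \qquad V(X_k) = \sigma^2 > 0,$$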


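and let $H$ be the distribution function of $X_1 + X_2 + \dots + X_n$. If $|X_k| \le l$, $h \ge \sigma n^{1/2}$, then

$$\sum_{r=-\infty}^{\infty} \sup_{rh \le x \le (r+1)h} |H(x) - \Phi_{n\sigma^2}(x)| \le \frac{C_7 l}{\sigma n^{1/2}}. \tag{15.3.3}$$

Proof. If $H_n(x)$ denotes the distribution function of

$$Z_n = (X_1 + X_2 + \dots + X_n)/\sigma n^{1/2},$$

then by Theorem 3.6.2,

$$|H_n(x) - \Phi_1(x)| \le \frac{C_8\, E|X_1|^3}{\sigma^3 n^{1/2} (1 + x^2)}.$$

Since $|X_1| \le l$, $E|X_1|^3 \le l E|X_1|^2 = l\sigma^2$, so that

$$|H_n(x) - \Phi_1(x)| \le \frac{C_8 l}{\sigma n^{1/2} (1 + x^2)}, \tag{15.3.4}$$

and thus

$$\sum_{r=-\infty}^{\infty} \sup_{rh < x \le (r+1)h} |H(x) - \Phi_{n\sigma^2}(x)| = \sum_{r=-\infty}^{\infty} \sup_{rh/\sigma n^{1/2} < y \le (r+1)h/\sigma n^{1/2}} |H_n(y) - \Phi_1(y)|$$

$$\le \frac{C_8 l}{\sigma n^{1/2}} \sum_{r=-\infty}^{\infty} \left( 1 + \frac{r^2 h^2}{\sigma^2 n} \right)^{-1} \le \frac{C_8 l}{\sigma n^{1/2}} \sum_{r=-\infty}^{\infty} \frac{1}{1 + r^2} = \frac{C_7 l}{\sigma n^{1/2}}.$$

Lemma 15.3.4. For any integer $n$ and $0 < p < 1$,

$$\sum_{k=0}^{\infty} \left| \binom{n}{k} p^k (1-p)^{n-k} - \frac{(np)^k}{k!} e^{-np} \right| \le C_9\, p. \tag{15.3.5}$$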

Proof. This lemma is a strengthening of Poisson's well-known approximation to the binomial distribution [47]. Let $\xi_1, \xi_2, \dots, \xi_n$ be independent variables taking only the two values 1 (with probability $p$) and 0 (with probability $q = 1 - p$), and write

$$\zeta = \xi_1 + \xi_2 + \dots + \xi_n,$$

so that

$$P(\zeta = k) = P_n(k) = \binom{n}{k} p^k (1-p)^{n-k}.$$

Let $\eta_\lambda$ be a variable with a Poisson distribution with parameter $\lambda$, write

$$\Pi_\lambda(k) = P(\eta_\lambda = k) = \frac{\lambda^k}{k!} e^{-\lambda},$$

and

$$\eta_{np} = \eta, \qquad \Pi_{np}(k) = \Pi(k).$$

Then the left-hand side of (15.3.5) is equal to

$$\sum_{k=0}^{\infty} |P_n(k) - \Pi(k)|,$$

the variation distance between the distributions of $\eta$ and $\zeta$.

We recall that

$$E(\zeta) = np, \qquad V(\zeta) = np(1-p), \tag{15.3.6}$$

and that

$$E(\eta_\lambda) = V(\eta_\lambda) = \lambda,$$

so that

$$E(\eta) = V(\eta) = np, \qquad E(\eta^2) = (np)^2 + np. \tag{15.3.7}$$

There is no loss of generality in taking $p \le \tfrac14$. We write $x_k = k - np$, and denote by $\Sigma'$ and $\Sigma''$ summation respectively over $|x_k| < \tfrac12 n^{1/2}$ and $|x_k| \ge \tfrac12 n^{1/2}$. Then

$$\sum |P_n(k) - \Pi(k)| = \Sigma' |P_n(k) - \Pi(k)| + \Sigma'' |P_n(k) - \Pi(k)|,$$

and we examine separately the two sums $\Sigma'$ and $\Sigma''$.

(I) $\Sigma''$. From (15.3.6) and (15.3.7) and Chebyshev's inequality,

$$\Sigma'' |P_n(k) - \Pi(k)| \le \Sigma'' P_n(k) + \Sigma'' \Pi(k) \le P(|\zeta - E\zeta| \ge \tfrac12 n^{1/2}) + P(|\eta - E\eta| \ge \tfrac12 n^{1/2}) \le 4 n^{-1} (V\zeta + V\eta) \le 8p. \tag{15.3.8}$$


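(II) $\Sigma'$. There is no loss of generality in assuming $n \ge 4$. Then

$$\Sigma' |P_n(k) - \Pi(k)| = \Sigma'\, \Pi(k)\, |d(k) - 1|,$$

where $d(k) = P_n(k)/\Pi(k)$. It is easy to see that

$$d(k) = d_1(k)\, d_2(k),$$

where

$$d_1(k) = \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right), \qquad d_2(k) = (1-p)^{n-k} e^{np}.$$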

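We first show that $d(k) = d(k, n, p)$ is bounded. We have

$$\log d_1(k) = \sum_{s=1}^{k-1} \log\left(1 - \frac{s}{n}\right) = -\sum_{s=1}^{k-1} \sum_{r=1}^{\infty} \frac{s^r}{r n^r} = -\sum_{r=1}^{\infty} \frac{S_r}{r n^r}, \tag{15.3.9}$$

where

$$S_r = \sum_{s=1}^{k-1} s^r.$$

Setting $f_r(x) = x^r$, we have

$$\frac{k^{r+1}}{r+1} - S_r = k^{r+1} \left[ \int_0^1 f_r(x)\,dx - \frac{1}{k} \sum_{s=0}^{k-1} f_r\!\left(\frac{s}{k}\right) \right] = k^{r+1} \sum_{s=0}^{k-1} \int_{s/k}^{(s+1)/k} \left[ f_r(x) - f_r(s/k) \right] dx \le k^{r-1} \sum_{s=0}^{k-1} \max_{0 \le x \le 1} f_r'(x) = r k^r,$$

so that

$$S_r = \frac{k^{r+1}}{r+1} + \theta r k^r, \qquad |\theta| \le 1.$$

In particular,

$$S_1 = \frac{k^2}{2} - \frac{k}{2}.$$

Substituting into (15.3.9) and remembering that $k < \tfrac12 n$, we have

$$\log d_1(k) = -\sum_{r=1}^{\infty} \frac{k^{r+1}}{r(r+1) n^r} + 2\theta \frac{k}{n}. \tag{15.3.10}$$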

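Moreover,

$$\log d_2(k) = n[(1-p) \log(1-p) + p] - x_k \log(1-p) = n \sum_{r=1}^{\infty} \frac{p^{r+1}}{r(r+1)} + x_k \sum_{r=1}^{\infty} \frac{p^r}{r}. \tag{15.3.11}$$

It follows from (15.3.10) and (15.3.11) that

$$|\log d(k)| = \left| \sum_{r=1}^{\infty} \frac{k^{r+1} - (np)^{r+1} - (r+1)(np)^r x_k}{r(r+1) n^r} + 2\theta \frac{k}{n} \right|.$$

By Taylor's theorem,

$$k^{r+1} = (np + x_k)^{r+1} = (np)^{r+1} + x_k (r+1) (np)^r + \tfrac12 x_k^2 (r+1) r\, \theta \max(np, k)^{r-1},$$

and therefore, since $\max(np, k) < \tfrac12 n$,

$$|\log d(k)| \le \frac{x_k^2}{n} + \frac{2k}{n} < 3. \tag{15.3.12}$$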

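Using the inequality $|e^x - 1| \le |x| e^{|x|}$, together with (15.3.7) and (15.3.12), we obtain

$$\Sigma' |P_n(k) - \Pi(k)| = \Sigma'\, \Pi(k)\, |e^{\log d(k)} - 1| \le e^3 \left[ \frac{1}{n} \sum x_k^2\, \Pi(k) + \frac{2}{n} \sum k\, \Pi(k) \right] = 3 e^3 p \le 4 e^3 p.$$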
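The left-hand side of (15.3.5) can be evaluated numerically and compared with $p$. The Python sketch below is an illustration only: it works with logarithms of the probabilities to avoid overflow, the parameter pairs are arbitrary choices, and the function name is hypothetical. It does not compute the constant $C_9$.

```python
import math

def variation_sum(n, p):
    """sum over k >= 0 of |C(n,k) p^k (1-p)^(n-k) - (np)^k e^(-np) / k!|,
    for 0 < p < 1."""
    lam = n * p
    total = 0.0
    poisson_mass = 0.0
    for k in range(n + 1):
        log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                     + k * math.log(p) + (n - k) * math.log1p(-p))
        log_poisson = -lam + k * math.log(lam) - math.lgamma(k + 1)
        b, q = math.exp(log_binom), math.exp(log_poisson)
        total += abs(b - q)
        poisson_mass += q
    # For k > n the binomial term vanishes, leaving the Poisson tail.
    return total + max(0.0, 1.0 - poisson_mass)

cases = [(10, 0.3), (100, 0.05), (1000, 0.01), (50, 0.5)]
ratios = {(n, p): variation_sum(n, p) / p for n, p in cases}
```

In these runs the ratio of the sum to $p$ stays below 2, which is consistent with a bound of the form $C_9 p$ holding uniformly in $n$.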

§ 4. Proof of Theorem 15.1.1

In this section we conclude the proof of Theorem 15.1.1. The necessary arguments are rather complicated, and we separate the proof into several parts.

(I) Preliminary construction

Until part (IV) we shall assume that the distribution function $F(x)$ of the variables $X_j$ is continuous and strictly increasing. As in § 2, $X_j = F^{-1}(\xi_j)$, where $(\xi_j)$ is a sequence of independent random variables, uniformly distributed over $(0, 1)$.


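We write

$$p = n^{-1/3}, \qquad \varphi_j = \begin{cases} 0 & \text{if } \tfrac12 p < \xi_j < 1 - \tfrac12 p, \\ 1 & \text{otherwise}, \end{cases} \qquad \varphi = \sum_{j=1}^{n} \varphi_j,$$

$$A(x) = P(X_j < x \mid \varphi_j = 0), \qquad B(x) = P(X_j < x \mid \varphi_j = 1), \tag{15.4.1}$$

$$a = E(X_j \mid \varphi_j = 0) = \int_{-\infty}^{\infty} x\,dA(x), \qquad \sigma^2 = V(X_j \mid \varphi_j = 0) = \int_{-\infty}^{\infty} (x - a)^2\,dA(x).$$

Clearly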

$$F(x) = pB(x) + (1 - p)A(x). \tag{15.4.2}$$

This construction expands $F$ as a combination of two distributions. One of them, $A(x)$, is concentrated on the interval $[x^-, x^+]$, where $x^- = F^{-1}(\tfrac12 p)$, $x^+ = F^{-1}(1 - \tfrac12 p)$, of length $\lambda$. Consequently $A(x)$ can be examined using the results of § 3, notably Lemmas 15.3.2 and 15.3.3. The distribution $B$ on the other hand is concentrated on the half-lines $(-\infty, x^-]$ and $[x^+, \infty)$, each with probability $\tfrac12$. For the powers $B^m = B^{*m}$ (in this section powers of distributions are always to be understood in the sense of convolution) we can use Corollary 15.2.2, which leads to the inequality (with $\lambda = x^+ - x^-$),

$$Q_{B^k}(\lambda) \le C_4 k^{-1/2}, \tag{15.4.3}$$

where $Q_G$ denotes the concentration function of the distribution $G$.

There is no loss of generality in supposing that $a = 0$, since otherwise we can replace $X_j$ by $X_j' = X_j - a$; if the distribution function of $\sum X_j'$ can be approximated by the infinitely divisible distribution function $D'(x)$, then that of $\sum X_j$ is approximated to the same accuracy by $D(x) = D'(x - na)$.
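The decomposition of $F$ into a central part $A$ and a tail part $B$ can be written out explicitly once $F$ is known. The Python sketch below does this for a standard normal $F$ and checks the mixture identity on a grid. It is an illustration under stated assumptions: the normal law, the value $p = 0.1$ and the function name are choices made for the example, and the identity itself holds for any $0 < p < 1$.

```python
from statistics import NormalDist

def split_distribution(dist, p):
    """Return (A, B, x_minus, x_plus): the conditional distribution
    functions of X = F^{-1}(xi) given phi = 0 and phi = 1, where phi = 0
    exactly when p/2 < xi < 1 - p/2."""
    x_minus = dist.inv_cdf(p / 2.0)
    x_plus = dist.inv_cdf(1.0 - p / 2.0)

    def A(x):
        # F restricted to [x_minus, x_plus] and renormalised to mass 1.
        clipped = min(max(dist.cdf(x), p / 2.0), 1.0 - p / 2.0)
        return (clipped - p / 2.0) / (1.0 - p)

    def B(x):
        # The two tails of F, each of mass p/2, renormalised to mass 1.
        u = dist.cdf(x)
        return (min(u, p / 2.0) + max(u - (1.0 - p / 2.0), 0.0)) / p

    return A, B, x_minus, x_plus

F = NormalDist()          # assumed example of a continuous, increasing F
p = 0.1
A, B, x_minus, x_plus = split_distribution(F, p)
grid = [-4.0 + 0.25 * i for i in range(33)]
max_error = max(abs(p * B(x) + (1.0 - p) * A(x) - F.cdf(x)) for x in grid)
```

Here `A` rises from 0 to 1 across $[x^-, x^+]$, while `B` equals $\tfrac12$ throughout that interval, reflecting the two half-lines of equal mass.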

We shall expand $F_n$ as a sum

$$F_n = F^n = \{pB + (1-p)A\}^n = \sum_{j=0}^{n} \binom{n}{j} p^j (1-p)^{n-j}\, B^j * A^{n-j},$$

and examine separately the two cases $\lambda \ge \sigma n^{1/2}$ and $\lambda < \sigma n^{1/2}$.
