On the Berry-Esseen bound for a combinatorial central limit theorem
Thành Lê Văn∗
Abstract. The main finding of this note is an improvement of the Chen-Goldstein-Shao proof of the Berry-Esseen bound for the combinatorial central limit theorem. A bound of the correct order in terms of third-moment type quantities with a small explicit constant is obtained. Moreover, our approach does not need to use a truncation step as in Chen-Goldstein-Shao. An example is also given to illustrate the optimality of the bound.
Key Words and Phrases: Berry-Esseen bound, combinatorial central limit theorem, zero-bias coupling, Stein’s method
2010 Mathematics Subject Classifications: 60F05, 60D05
1 Introduction and result
Let $n \ge 2$ and $A = \{a_{ij}, 1 \le i, j \le n\}$ be an array of real numbers. In this note, we study the combinatorial central limit theorem, that is, the central limit theorem for random variables of the form
\[ Y = Y_A = \sum_{i=1}^{n} a_{i\pi(i)}, \tag{1.1} \]
where $\pi$ is a random permutation with the uniform distribution over the symmetric group of all permutations of $\{1, \dots, n\}$.
The central limit theorem for $Y_A$ was proved by Wald and Wolfowitz [17] when the factorization $a_{ij} = b_i c_j$ holds, and by Hoeffding [11] for general arrays. Bounds on the error in the normal approximation were later considered by a number of authors. Ho and Chen [10] used a concentration inequality approach and Stein’s method for exchangeable pairs [16], which yield the optimal rate only under the condition that $\sup_{i,j}|a_{ij}| \le C$. Bolthausen [1] also used Stein’s method with an inductive approach, and obtained a bound of the correct order in terms of third-moment type quantities, but with an unspecified constant. Goldstein [7], employing the zero bias version of Stein’s method, obtained bounds with an explicit constant, but in terms of $\sup_{i,j}|a_{ij}|$. Recently, Chen, Goldstein and Shao [5, Theorem 6.2] used the zero bias variation of Ghosh [6] on the inductive method in Bolthausen [1] to prove a bound depending on a third-moment type quantity of the matrix, but with an unspecified constant like Bolthausen [1]. In this note, we give an improvement of the Chen-Goldstein-Shao proof and obtain a bound of the correct order in terms of third-moment type quantities with a constant $c = 90$. Moreover, our approach does not need to use the truncation step as in [1, 5, 6]. We also give an example to illustrate the optimality of the bound.

∗ Department of Mathematics, Vinh University, Nghe An 42118, Vietnam. A part of research of the second author is also supported by the Vietnam Institute for Advanced Study in Mathematics (VIASM) and the Vietnam National Foundation of Sciences and Technology Development (NAFOSTED). Email: levt@vinhuni.edu.vn
As far as we are aware, for bounds depending on a third-moment type quantity of the matrix, the best absolute constant is $c = 447$, which was obtained very recently by Chen and Fang [4]. (Neammanee and Suntornchost [15] obtained a constant $c = 198$. However, Chen and Fang [4] showed that the proof in [15] is incorrect.) In both [4] and [15], the authors used the concentration inequality approach and the method of exchangeable pairs, and considered the case where the elements of $A$ are independent random variables.
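We denote the mean and variance of $Y_A$ by $\mu_A$ and $\sigma_A^2$, and use the following notation:
\[ a_{i\cdot} = \frac{1}{n}\sum_{j=1}^{n} a_{ij},\ 1 \le i \le n, \qquad a_{\cdot j} = \frac{1}{n}\sum_{i=1}^{n} a_{ij},\ 1 \le j \le n, \qquad \text{and} \qquad a_{\cdot\cdot} = \frac{1}{n^2}\sum_{i,j=1}^{n} a_{ij}. \]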
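It is known that
\[ \mu_A = n a_{\cdot\cdot} = \sum_{i=1}^{n} a_{i\cdot} = \sum_{i=1}^{n} a_{\cdot\pi(i)}, \tag{1.2} \]
and
\[ \sigma_A^2 = \frac{1}{n-1}\sum_{i,j=1}^{n}\left(a_{ij}^2 - a_{i\cdot}^2 - a_{\cdot j}^2 + a_{\cdot\cdot}^2\right) = \frac{1}{n-1}\sum_{i,j=1}^{n}\left(a_{ij} - a_{i\cdot} - a_{\cdot j} + a_{\cdot\cdot}\right)^2. \tag{1.3} \]
From (1.2), we automatically get that
\[ Y_A - EY_A = \sum_{i=1}^{n}\left(a_{i\pi(i)} - a_{i\cdot} - a_{\cdot\pi(i)} + a_{\cdot\cdot}\right). \tag{1.4} \]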
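As a quick sanity check of (1.2) and (1.3), the following Python sketch enumerates all permutations for a small array and compares the exact mean and variance of $Y_A$ with the closed-form expressions. The array `A` and the function names are illustrative choices, not part of the original note; exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import permutations


def exact_moments(a):
    """Mean and variance of Y_A = sum_i a[i][pi(i)], by brute-force enumeration."""
    n = len(a)
    values = [sum(a[i][p[i]] for i in range(n)) for p in permutations(range(n))]
    mean = Fraction(sum(values), len(values))
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def formula_moments(a):
    """Mean and variance of Y_A from the closed forms (1.2) and (1.3)."""
    n = len(a)
    row = [Fraction(sum(a[i]), n) for i in range(n)]                       # a_{i.}
    col = [Fraction(sum(a[i][j] for i in range(n)), n) for j in range(n)]  # a_{.j}
    tot = Fraction(sum(map(sum, a)), n * n)                                # a_{..}
    mu = n * tot
    var = sum(
        (a[i][j] - row[i] - col[j] + tot) ** 2 for i in range(n) for j in range(n)
    ) / (n - 1)
    return mu, var


# Hypothetical 4 x 4 integer array, chosen only for illustration.
A = [[1, 4, 2, 0], [3, 3, 5, 1], [0, 2, 7, 2], [6, 1, 1, 4]]
print(exact_moments(A) == formula_moments(A))
```

Enumeration costs $n!$ evaluations, so this check is only practical for very small $n$.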
We also denote $W_A = (Y_A - \mu_A)/\sigma_A$ and
\[ \beta_A = \frac{\sum_{i,j=1}^{n}\left|a_{ij} - a_{i\cdot} - a_{\cdot j} + a_{\cdot\cdot}\right|^3}{\sigma_A^3}. \tag{1.5} \]
Throughout this note, $Z$ is the standard normal random variable, and $\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp(-t^2/2)\,dt$ is the distribution function of $Z$. For $n \ge 1$, let $S_n$ denote the symmetric group of all permutations of $\{1, \dots, n\}$, and let $\pi$ denote a random permutation with the uniform distribution over $S_n$. For a set $S$, the indicator function of $S$ is denoted by $1(S)$ and the cardinality of $S$ is denoted by $|S|$. The following theorem will be proved in this note.
Theorem 1.1. We have
\[ \sup_{x\in\mathbb{R}}\left|P(W_A \le x) - \Phi(x)\right| \le \frac{90\beta_A}{n}. \tag{1.6} \]
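Let $a_{ij} = a_j$ for $1 \le i \le m$ and $a_{ij} = 0$ for $m < i \le n$, where $1 \le m < n$ and $A = \{a_1, \dots, a_n\}$ is a set. Then $Y_A = X_1 + \cdots + X_m$, where $\{X_1, \dots, X_m\}$ is a random sample without replacement of size $m$ from $A$. In this case, it is easy to see that with $\bar{a} = \sum_{j=1}^{n} a_j/n$,
\[ \frac{\beta_A}{n} = \frac{m(n-m)\left[(n-m)^2 + m^2\right]\sum_{j=1}^{n}|a_j - \bar{a}|^3}{n^4(\mathrm{Var}X)^{3/2}} \le \frac{m(n-m)\sum_{j=1}^{n}|a_j - \bar{a}|^3}{n^2(\mathrm{Var}X)^{3/2}}. \]
Therefore, we have the following corollary.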
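Corollary 1.2. Let $X = X_1 + \cdots + X_m$, where $\{X_1, \dots, X_m\}$ is a random sample without replacement of size $m$ from $A = \{a_1, \dots, a_n\}$. Then
\[ \sup_{x\in\mathbb{R}}\left|P\left(\frac{X - EX}{\sqrt{\mathrm{Var}X}} \le x\right) - \Phi(x)\right| \le \frac{90\,m(n-m)\sum_{j=1}^{n}|a_j - \bar{a}|^3}{n^2(\mathrm{Var}X)^{3/2}}. \]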
Höglund [12] proved this corollary with the same bound but without an explicit constant by Fourier analysis. More recently, Goldstein [8] obtained a bound on the Wasserstein distance to the normal distribution, and Hu, Robinson, and Wang [13] proved Cramér-type large deviations for $X$.
2 Proof
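In view of (1.4) and (1.5), we may replace $a_{ij}$ by
\[ \frac{a_{ij} - a_{i\cdot} - a_{\cdot j} + a_{\cdot\cdot}}{\sigma_A} \]
and assume $a_{\cdot\cdot} = a_{i\cdot} = a_{\cdot j} = 0$, $\sigma_A^2 = 1$.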
It was shown by Goldstein and Reinert [9] that for any mean zero random variable $W$ with finite variance $\sigma^2$, there exists a random variable $W^*$ which satisfies $EWf(W) = \sigma^2 Ef'(W^*)$ for all absolutely continuous $f$ with $E|Wf(W)| < \infty$. We say that such a $W^*$ has the $W$-zero biased distribution. Before presenting the proof of the theorem, we recall the zero-bias coupling construction for $Y = Y_A$ in Goldstein [8] as follows.
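Choose $I^\dagger, J^\dagger, K^\dagger, L^\dagger$ independently of the remaining random variables, with distribution
\[ P(I^\dagger = i, J^\dagger = j, K^\dagger = k, L^\dagger = l) = \frac{(a_{ik} + a_{jl} - a_{il} - a_{jk})^2}{4n^2(n-1)\sigma_A^2}. \]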
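The weights $(a_{ik} + a_{jl} - a_{il} - a_{jk})^2$ define a probability distribution only after normalization. The following Python sketch checks, for a small array, that dividing by $4n^2(n-1)\sigma_A^2$ makes the weights sum to one, and that index tuples with $i = j$ or $k = l$ receive zero mass. The array `A` and the helper names are hypothetical, and $\sigma_A^2$ is computed from (1.3); this is an illustration under those assumptions rather than part of the original construction.

```python
from fractions import Fraction
from itertools import product


def sigma2(a):
    """Variance of Y_A computed from formula (1.3)."""
    n = len(a)
    row = [Fraction(sum(a[i]), n) for i in range(n)]
    col = [Fraction(sum(a[i][j] for i in range(n)), n) for j in range(n)]
    tot = Fraction(sum(map(sum, a)), n * n)
    return sum(
        (a[i][j] - row[i] - col[j] + tot) ** 2 for i in range(n) for j in range(n)
    ) / (n - 1)


def index_distribution(a):
    """Joint law of (I, J, K, L): squared double differences over 4 n^2 (n-1) sigma_A^2."""
    n = len(a)
    norm = 4 * n * n * (n - 1) * sigma2(a)
    return {
        (i, j, k, l): Fraction((a[i][k] + a[j][l] - a[i][l] - a[j][k]) ** 2) / norm
        for i, j, k, l in product(range(n), repeat=4)
    }


# Hypothetical array, used only to exercise the normalization.
A = [[1, 4, 2, 0], [3, 3, 5, 1], [0, 2, 7, 2], [6, 1, 1, 4]]
dist = index_distribution(A)
print(sum(dist.values()) == 1)
```

The double differences are unchanged when the array is centered by rows and columns, so the check applies to any array with $\sigma_A^2 > 0$, not only to a standardized one.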
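For $1 \le i, j \le n$, let $\tau_{ij}$ be the permutation which transposes $i$ and $j$. Set
\[ \pi^\dagger = \begin{cases} \pi\tau_{\pi^{-1}(K^\dagger), J^\dagger} & \text{if } L^\dagger = \pi(I^\dagger),\ K^\dagger \ne \pi(J^\dagger), \\ \pi\tau_{\pi^{-1}(L^\dagger), I^\dagger} & \text{if } L^\dagger \ne \pi(I^\dagger),\ K^\dagger = \pi(J^\dagger), \\ \pi\tau_{\pi^{-1}(K^\dagger), I^\dagger}\tau_{\pi^{-1}(L^\dagger), J^\dagger} & \text{otherwise}, \end{cases} \]
and $\pi^\ddagger = \pi^\dagger\tau_{I^\dagger, J^\dagger}$. Then $\{\pi^\dagger(I^\dagger), \pi^\dagger(J^\dagger)\} = \{\pi^\ddagger(I^\dagger), \pi^\ddagger(J^\dagger)\} = \{K^\dagger, L^\dagger\}$. Let
\[ \mathcal{I} = \{I^\dagger, J^\dagger, \pi^{-1}(K^\dagger), \pi^{-1}(L^\dagger)\}, \]
and let $Y^\dagger$ and $Y^\ddagger$ be random variables given by (1.1) with $\pi$ replaced by $\pi^\dagger$ and $\pi^\ddagger$, respectively. Then, with $U$ a random variable uniformly distributed on $[0, 1]$ and independent of the remaining random variables, Goldstein [8] showed that
\[ Y^* = UY^\dagger + (1 - U)Y^\ddagger \tag{2.1} \]
has the $Y$-zero biased distribution, and
\[ Y = S + T, \qquad Y^* = S + UT^\dagger + (1 - U)T^\ddagger, \tag{2.2} \]
where
\[ S = \sum_{i\notin\mathcal{I}} a_{i\pi(i)}, \quad T = \sum_{i\in\mathcal{I}} a_{i\pi(i)}, \quad T^\dagger = \sum_{i\in\mathcal{I}} a_{i\pi^\dagger(i)} \quad \text{and} \quad T^\ddagger = \sum_{i\in\mathcal{I}} a_{i\pi^\ddagger(i)}. \tag{2.3} \]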
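The defining property of the coupling, $\{\pi^\dagger(I^\dagger), \pi^\dagger(J^\dagger)\} = \{\pi^\ddagger(I^\dagger), \pi^\ddagger(J^\dagger)\} = \{K^\dagger, L^\dagger\}$, can be checked exhaustively for small $n$. The Python sketch below does this under two assumptions that are not stated explicitly in the text: a product of permutations such as $\pi\tau$ is read as composition with $\tau$ applied first, and only index tuples with $I^\dagger \ne J^\dagger$ and $K^\dagger \ne L^\dagger$ are considered, since the other tuples have probability zero. All function names are illustrative.

```python
from itertools import permutations, product


def transposition(n, i, j):
    """The permutation of {0, ..., n-1} that swaps i and j."""
    t = list(range(n))
    t[i], t[j] = t[j], t[i]
    return t


def compose(p, q):
    """Composition p q, with q applied first: (p q)(i) = p(q(i))."""
    return [p[q[i]] for i in range(len(p))]


def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return inv


def pi_dagger(pi, I, J, K, L):
    """The three-case definition of pi-dagger from the coupling construction."""
    n, inv = len(pi), inverse(pi)
    if L == pi[I] and K != pi[J]:
        return compose(pi, transposition(n, inv[K], J))
    if L != pi[I] and K == pi[J]:
        return compose(pi, transposition(n, inv[L], I))
    return compose(compose(pi, transposition(n, inv[K], I)), transposition(n, inv[L], J))


def coupling_property_holds(n):
    """Check {pi†(I), pi†(J)} = {pi‡(I), pi‡(J)} = {K, L} for every admissible input."""
    for pi in permutations(range(n)):
        for I, J, K, L in product(range(n), repeat=4):
            if I == J or K == L:  # zero-probability index tuples
                continue
            pd = pi_dagger(list(pi), I, J, K, L)
            pdd = compose(pd, transposition(n, I, J))  # pi-double-dagger
            if {pd[I], pd[J]} != {K, L} or {pdd[I], pdd[J]} != {K, L}:
                return False
    return True


print(coupling_property_holds(4))
```

Since $\pi^\dagger$ differs from $\pi$ only through transpositions of indices in $\mathcal{I}$, the same kind of exhaustive check could be extended to confirm that $\pi^\dagger$ and $\pi$ agree outside $\mathcal{I}$.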
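Let $2 \le l \le 4$, and let $D = \{d_{ij}, 1 \le i, j \le n - l\}$ be the $(n - l) \times (n - l)$ array formed by removing the $l$ rows $R \subset \{1, \dots, n\}$ and $l$ columns $C \subset \{1, \dots, n\}$ from $A$. Let $E = \{e_{ij}, 1 \le i, j \le n - l\}$ be a matrix with
\[ e_{ij} = (d_{ij} - d_{i\cdot} - d_{\cdot j} + d_{\cdot\cdot})/\sigma_D. \]
It follows that $e_{i\cdot} = e_{\cdot j} = e_{\cdot\cdot} = \mu_E = 0$, $\sigma_E^2 = 1$, $\beta_E = \beta_D$.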
For $x > -1$ and $0 < \alpha < 1$, let $h_{x,\alpha}(\omega)$ be the function which is 1 for $\omega \le x$, then drops linearly to the value 0 at $x + \alpha$, and is 0 for $\omega \ge x + \alpha$. Let $g(\omega) = (\omega f(\omega))'$, where $f = f_{x,\alpha}$ is the unique bounded solution of the Stein equation
\[ f'(\omega) - \omega f(\omega) = h(\omega) - Eh(Z). \]
For an arbitrary random variable $X$ and $a \le b$, we will use the following simple fact:
\[ P(a \le X \le b) \le \frac{b - a}{\sqrt{2\pi}} + 2\sup_{x\in\mathbb{R}}|P(X \le x) - \Phi(x)|. \tag{2.4} \]
The proof of the following lemma is easy, and will be presented in the Appendix.
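Lemma 2.1. Assume that $n > 25000$ and
\[ \frac{\beta_A}{n} < \frac{1}{160}. \]
Then
\[ \sigma_D^2 \ge 0.93 \quad \text{and} \quad \beta_D \le 1.153\beta_A. \tag{2.6} \]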
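Proof of Theorem 1.1. For $\beta > 0$, set
\[ M(\beta, n) = \left\{A \in \mathbb{R}^{n\times n} : a_{i\cdot} = a_{\cdot j} = 0,\ \sigma_A^2 = 1,\ \beta_A \le \beta\right\}, \]
\[ \delta(\beta, n) = \sup\left\{|P(W_A \le x) - \Phi(x)| : x \in \mathbb{R},\ A \in M(\beta, n)\right\}. \]
If $A \in M(\beta, n)$, then $-A \in M(\beta, n)$ and
\[ |P(W_{-A} \le x) - \Phi(x)| = |P(W_A \ge -x) - (1 - \Phi(-x))| = |P(W_A < -x) - \Phi(-x)|. \]
Therefore
\[ \delta(\beta, n) = \sup\left\{|P(W_A \le x) - \Phi(x)| : x \ge 0,\ A \in M(\beta, n)\right\}. \]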
We then follow the computation in Chen and Shao [2, p 246] to get
δ(β, n) ≤ sup
x≥0
1
It suffices to prove that
\[ \sup_{\beta > 0}\frac{n\delta(\beta, n)}{\beta} \le 90. \tag{2.8} \]
If $2 \le n \le 25000$, then for all $\beta > 0$ and $A \in M(\beta, n)$, it is easy to see that (see, e.g., [5, p. 171])
βA
3/2
Combining (2.7) and (2.9), we see that (2.8) holds for $2 \le n \le 25000$. Assuming that $n > 25000$ and (2.8) holds for all $m \le n - 1$, we will prove that it also holds for $n$.
Fix $\beta > 0$. By (2.7), we may assume that $\beta/n < 1/160$. Let $A \in M(\beta, n)$ be arbitrary. We will prove that
\[ \sup_{x \ge 0}|P(W_A \le x) - \Phi(x)| \le \frac{90\beta_A}{n}. \tag{2.10} \]
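Firstly, we consider $x$ such that $(1 + x)\beta_A/n > 1/84$. In this case, we have
\[ \begin{aligned} |P(Y_A \le x) - \Phi(x)| &= |P(Y_A > x) - (1 - \Phi(x))| \\ &\le \max\{P(Y_A > x),\ 1 - \Phi(x)\} \\ &\le \max\left\{\frac{E(Y_A + 1)^2}{(1 + x)^2},\ \frac{1}{1 + x}\right\} \\ &= \max\left\{\frac{2}{(1 + x)^2},\ \frac{1}{1 + x}\right\} \\ &\le \frac{1.05}{1 + x} \le \frac{89\beta_A}{n}. \end{aligned} \]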
It remains to consider the case $(1 + x)\beta_A/n \le 1/84$. Let $h = h_{x,\alpha}$ be as defined earlier and let $f$ be the solution to the Stein equation for this $h$. Let $Y_A^\dagger, Y_A^\ddagger, Y_A^*, S, T, T^\dagger, T^\ddagger$ be as in (2.1), (2.2) and (2.3). We have
\[ \begin{aligned} |Eh(Y_A) - Eh(Z)| &= |Ef'(Y_A) - EY_Af(Y_A)| = |E(f'(Y_A) - f'(Y_A^*))| \\ &\le |E(Y_Af(Y_A) - Y_A^*f(Y_A^*))| + E|h(Y_A^*) - h(Y_A)| \\ &= \left|E\left[(S + T)f(S + T) - (S + UT^\dagger + (1 - U)T^\ddagger)f(S + UT^\dagger + (1 - U)T^\ddagger)\right]\right| \\ &\quad + \frac{1}{\alpha}E\left[|Y_A^* - Y_A|\int_0^1 I_{[x, x+\alpha]}(Y_A + r(Y_A^* - Y_A))\,dr\right] := R_1 + R_2. \end{aligned} \tag{2.11} \]
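Let $\mathbf{I} = (I^\dagger, J^\dagger, \pi^{-1}(K^\dagger), \pi^{-1}(L^\dagger), \pi(I^\dagger), \pi(J^\dagger), K^\dagger, L^\dagger)$ and $\mathcal{B} = \sigma(\mathbf{I}, U)$. By (2.3), we see that $T, T^\dagger, T^\ddagger$ are measurable with respect to $\mathcal{B}$. Therefore
\[ \begin{aligned} R_1 &= \left|E\int_{UT^\dagger + (1-U)T^\ddagger}^{T} g(S + u)\,du\right| \\ &= \left|E\int_{UT^\dagger + (1-U)T^\ddagger}^{T} E^{\mathcal{B}}g(S + u)\,du\right| \\ &= \left|E\int_{UT^\dagger + (1-U)T^\ddagger}^{T} E^{\mathbf{I}}g(S + u)\,du\right| \quad \text{(by the independence of $U$ from $\{\mathbf{I}, S\}$)}. \end{aligned} \tag{2.12} \]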
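Let $R = \mathcal{I} = \{I^\dagger, J^\dagger, \pi^{-1}(K^\dagger), \pi^{-1}(L^\dagger)\}$, $C = \{\pi(I^\dagger), \pi(J^\dagger), K^\dagger, L^\dagger\}$, $l = |\mathcal{I}|$. Denote $Y_D = \sum_{i=1}^{n-l} d_{i\theta(i)}$, where $\theta$ is a random permutation with the uniform distribution over $S_{n-l}$. Since $S = \sum_{i\notin\mathcal{I}} a_{i\pi(i)}$ and $\pi$ is chosen uniformly from $S_n$, we have
\[ \mathcal{L}(S \mid \mathbf{I} = \mathbf{i}) = \mathcal{L}(Y_D). \tag{2.13} \]
By (2.13) and Lemma 2.2,
\[ E^{\{\mathbf{I} = \mathbf{i}\}}g(S + u) = Eg(Y_D + u) \le 4 \quad \text{for all $\mathbf{i}$ and $u$}. \tag{2.14} \]
Since (2.14) holds for all $\mathbf{i}$,
\[ E^{\mathbf{I}}g(S + u) \le 4 \quad \text{for all $u$}. \tag{2.15} \]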
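By Theorem 6.1 of Goldstein [8] and the assumption that $n > 25000$,
\[ E|Y_A^* - Y_A| \le \frac{\beta_A}{n - 1}\left(8 + \frac{28}{n - 1} + \frac{4}{(n - 1)^2}\right) \le \frac{8.01\beta_A}{n}. \tag{2.16} \]
Combining (2.12), (2.15) and (2.16), we obtain
\[ R_1 \le 4E|T - UT^\dagger - (1 - U)T^\ddagger| = 4E|Y_A^* - Y_A| \le \frac{32.04\beta_A}{n}. \tag{2.17} \]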
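Now we bound $R_2$. By the computation in [5, p. 173-174],
\[ R_2 = \frac{1}{\alpha}E\left[|Y_A - Y_A^*|\int_0^1 P(S \in [x - q_r, x + \alpha - q_r] \mid \mathcal{B})\,dr\right], \tag{2.18} \]
where $q_r = rUT^\dagger + r(1 - U)T^\ddagger + (1 - r)T$. Since $q_r$ is measurable with respect to $\mathcal{B}$ for all $r$, it follows from (2.18) that
\[ \begin{aligned} R_2 &\le \frac{1}{\alpha}E\left[|Y_A - Y_A^*|\int_0^1 \sup_{u\in\mathbb{R}} P(S \in [x - u, x + \alpha - u] \mid \mathcal{B})\,dr\right] \\ &= \frac{1}{\alpha}E\left[|Y_A - Y_A^*|\sup_{u\in\mathbb{R}} P(S \in [x - u, x + \alpha - u] \mid \mathcal{B})\right] \\ &= \frac{1}{\alpha}E\left[|Y_A - Y_A^*|\sup_{u\in\mathbb{R}} P(S \in [x - u, x + \alpha - u] \mid \mathbf{I})\right], \end{aligned} \tag{2.19} \]
where in the last equality we have used the independence of $U$ from $\{S, \mathbf{I}\}$. We have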
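\[ \begin{aligned} \sup_{u\in\mathbb{R}} P(S \in [x - u, x + \alpha - u] \mid \mathbf{I} = \mathbf{i}) &= \sup_{u\in\mathbb{R}} P(Y_D \in [x - u, x + \alpha - u]) \quad \text{(by (2.13))} \\ &= \sup_{u\in\mathbb{R}} P\left(\frac{x - u - \mu_D}{\sigma_D} \le Y_E \le \frac{x + \alpha - u - \mu_D}{\sigma_D}\right) \\ &\le \frac{\alpha}{\sqrt{2\pi}\sigma_D} + 2\delta(\beta_E, n - l) \quad \text{(by (2.4))} \\ &\le \frac{\alpha}{\sqrt{2\pi}\sigma_D} + \frac{180\beta_E}{n - l} \quad \text{(by the inductive hypothesis)} \\ &\le \frac{\alpha}{\sqrt{2\pi}\sigma_D} + \frac{208\beta_A}{n}. \end{aligned} \tag{2.20} \]
As the last expression in (2.20) does not depend on $\mathbf{i}$ or $u$, it implies
\[ \sup_{u\in\mathbb{R}} P(S \in [x - u, x + \alpha - u] \mid \mathbf{I}) \le \frac{\alpha}{\sqrt{2\pi}\sigma_D} + \frac{208\beta_A}{n}. \tag{2.21} \]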
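Combining (2.16), (2.19) and (2.21), we obtain
\[ \begin{aligned} R_2 &\le \frac{1}{\alpha}\left(\frac{\alpha}{\sqrt{2\pi}\sigma_D} + \frac{208\beta_A}{n}\right)E|Y_A^* - Y_A| \\ &\le \frac{1}{\alpha}\left(\frac{\alpha}{\sqrt{2\pi}\sigma_D} + \frac{208\beta_A}{n}\right)\frac{8.01\beta_A}{n} \\ &\le \frac{3.32\beta_A}{n} + \frac{1667\beta_A^2}{\alpha n^2}. \end{aligned} \tag{2.22} \]
From (2.11), (2.17) and (2.22), we obtain
\[ \sup_{x > -1}|Eh_{x,\alpha}(Y_A) - Eh_{x,\alpha}(Z)| \le \frac{35.5\beta_A}{n} + \frac{1667\beta_A^2}{\alpha n^2}. \tag{2.23} \]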
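Now, let $\alpha = 19\sqrt{2\pi}\beta_A/n$. We have from (2.23) that
\[ \sup_{x > -1}|Eh_{x,\alpha}(Y_A) - Eh_{x,\alpha}(Z)| \le \frac{71\beta_A}{n}. \tag{2.24} \]
For $x \ge 0$, we have $x - \alpha > -1$. It thus follows from (2.24) that
\[ \sup_{x \ge 0}|Eh_{x-\alpha,\alpha}(Y_A) - Eh_{x-\alpha,\alpha}(Z)| \le \frac{71\beta_A}{n}. \tag{2.25} \]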
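By (2.24), (2.25) and the definition of $h$, we have for all $x \ge 0$
\[ \begin{aligned} P(Y_A \le x) - \Phi(x) &\le Eh_{x,\alpha}(Y_A) - Eh_{x,\alpha}(Z) + Eh_{x,\alpha}(Z) - \Phi(x) \\ &\le |Eh_{x,\alpha}(Y_A) - Eh_{x,\alpha}(Z)| + P(x < Z \le x + \alpha) \\ &\le \frac{71\beta_A}{n} + \frac{\alpha}{\sqrt{2\pi}} = \frac{90\beta_A}{n}, \end{aligned} \tag{2.26} \]
and
\[ \begin{aligned} P(Y_A \le x) - \Phi(x) &\ge Eh_{x-\alpha,\alpha}(Y_A) - Eh_{x-\alpha,\alpha}(Z) + Eh_{x-\alpha,\alpha}(Z) - \Phi(x) \\ &\ge -|Eh_{x-\alpha,\alpha}(Y_A) - Eh_{x-\alpha,\alpha}(Z)| - P(x - \alpha < Z \le x) \\ &\ge -\frac{71\beta_A}{n} - \frac{\alpha}{\sqrt{2\pi}} = -\frac{90\beta_A}{n}. \end{aligned} \tag{2.27} \]
Combining (2.26) and (2.27), we have $|P(Y_A \le x) - \Phi(x)| \le 90\beta_A/n$ for all $x \ge 0$, i.e., (2.10) holds. Taking the supremum over $A \in M(\beta, n)$ and then taking the supremum over $\beta > 0$, we conclude that (2.8) holds for $n$.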
It remains to state and prove Lemma 2.2, which we have used above. The proof will be presented in the Appendix.
Lemma 2.2 Let x > −1 and 0 < α < 1 and the inductive hypothesis in the proof of Theorem 1.1 holds If n > 25000, and
then
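Example 2.3. Let $1 \le k, m < n$ and let $A = \{a_1, \dots, a_n\}$ be a set with $a_j = 0$ or $1$ and $|\{a_j : a_j \in A, a_j = 1\}| = k$. Let $A = (a_{ij})_{n\times n}$ be such that $a_{ij} = a_j$ for $1 \le i \le m$, $1 \le j \le n$ and $a_{ij} = 0$ for $m < i \le n$, $1 \le j \le n$. Then $W_A$ has the hypergeometric distribution with parameters $m, k, n$. Let $p = k/n$ and $f = m/n$. We have
\[ |a_{ij} - a_{i\cdot} - a_{\cdot j} + a_{\cdot\cdot}| = \begin{cases} (1 - f)|a_j - p| & \text{if } 1 \le i \le m, \\ f|a_j - p| & \text{if } m < i \le n. \end{cases} \]
It is easy to show that $\mu_A = mp$, $\sigma_A^2 = n^2p(1 - p)f(1 - f)/(n - 1)$, and
\[ \frac{\beta_A}{n} = \frac{np(1 - p)f(1 - f)[p^2 + (1 - p)^2][f^2 + (1 - f)^2]}{\sigma_A^3} = \frac{(n - 1)[p^2 + (1 - p)^2][f^2 + (1 - f)^2]}{n\sigma_A} \le \frac{1}{\sigma_A}. \]
By Theorem 2.2 in Lahiri and Chatterjee [14], there exists a constant $c_0 > 0$ such that
\[ \sup_{x\in\mathbb{R}}\left|P\left(\frac{W_A - \mu_A}{\sigma_A} \le x\right) - \Phi(x)\right| \ge \frac{c_0}{\sigma_A} \ge \frac{c_0\beta_A}{n}. \]
Thus, the bound in (1.6) is optimal.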
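To make Example 2.3 concrete, the following Python sketch computes, for one hypothetical parameter choice, the exact Kolmogorov distance between the standardized hypergeometric distribution and the standard normal, together with the quantity $\beta_A/n$ from the closed form above. The parameter values and the function names are assumptions made for illustration only. The distance is evaluated at each atom and at its left limit, which is where the supremum over $x$ is attained for a discrete distribution.

```python
from math import comb, erf, sqrt


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def hypergeometric_example(n, m, k):
    """Return (Kolmogorov distance, beta_A / n, sigma_A) for Example 2.3."""
    p, f = k / n, m / n
    mu = m * p
    sigma = sqrt(n * n * p * (1 - p) * f * (1 - f) / (n - 1))
    beta_over_n = (n - 1) * (p**2 + (1 - p) ** 2) * (f**2 + (1 - f) ** 2) / (n * sigma)

    total = comb(n, m)
    cdf, dist = 0.0, 0.0
    for y in range(max(0, m + k - n), min(m, k) + 1):
        z = (y - mu) / sigma
        dist = max(dist, abs(cdf - Phi(z)))  # left limit of the CDF at the atom
        cdf += comb(k, y) * comb(n - k, m - y) / total
        dist = max(dist, abs(cdf - Phi(z)))  # value of the CDF at the atom
    return dist, beta_over_n, sigma


# Hypothetical parameters: population n = 200, sample size m = 80, k = 60 ones.
dist, beta_over_n, sigma = hypergeometric_example(200, 80, 60)
print(dist, 90 * beta_over_n, 1 / sigma)
```

For moderate $n$ the bound $90\beta_A/n$ exceeds 1 and is therefore trivial; the sketch is meant to show how the quantities in the example relate to each other, not to suggest that the constant is sharp.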
A Appendix
In this Section, we will prove Lemma 2.1 and Lemma 2.2
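Proof of Lemma 2.1. Firstly, we estimate $\sigma_D$. By (6.50) and (6.51) in [5], we get
\[ \sum_{i=1}^{n-l}|d_{i\cdot}|^3 \le \frac{l^2\beta_A}{(n - l)^3}, \qquad \sum_{j=1}^{n-l}|d_{\cdot j}|^3 \le \frac{l^2\beta_A}{(n - l)^3}, \qquad |d_{\cdot\cdot}|^3 \le \frac{5l^2\beta_A}{(n - l)^4}. \tag{A.1} \]
Therefore
\[ \sum_{i=1}^{n-l} d_{i\cdot}^2 \le (n - l)^{1/3}\left(\sum_{i=1}^{n-l}|d_{i\cdot}|^3\right)^{2/3} \le \frac{l^{4/3}\beta_A^{2/3}}{(n - l)^{5/3}}, \]
and
\[ \sum_{j=1}^{n-l} d_{\cdot j}^2 \le (n - l)^{1/3}\left(\sum_{j=1}^{n-l}|d_{\cdot j}|^3\right)^{2/3} \le \frac{l^{4/3}\beta_A^{2/3}}{(n - l)^{5/3}}. \]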
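From (1.3), we have
\[ (n - l - 1)\sigma_D^2 = \sum_{i,j} d_{ij}^2 - (n - l)\sum_{i=1}^{n-l} d_{i\cdot}^2 - (n - l)\sum_{j=1}^{n-l} d_{\cdot j}^2 + (n - l)^2 d_{\cdot\cdot}^2. \]
It thus follows that
\[ \begin{aligned} \sigma_D^2 &\ge \frac{\sum_{i,j=1}^{n} a_{ij}^2 - \sum_{\{i\in R\}\cup\{j\in C\}} a_{ij}^2 - (n - l)\sum_i d_{i\cdot}^2 - (n - l)\sum_j d_{\cdot j}^2}{n - l - 1} \\ &\ge \frac{1}{n - l - 1}\left(n - 1 - \sum_{\{i\in R\}\cup\{j\in C\}} a_{ij}^2 - \frac{l^{4/3}\beta_A^{2/3}}{(n - l)^{2/3}} - \frac{l^{4/3}\beta_A^{2/3}}{(n - l)^{2/3}}\right) \\ &\ge \frac{1}{n - l - 1}\left(n - 1 - \left(2n + \frac{2l^{4/3}n^{2/3}}{(n - l)^{2/3}}\right)\left(\frac{\beta_A}{n}\right)^{2/3}\right) \\ &\ge 0.93, \end{aligned} \]
where we have used the fact that
\[ \sum_{\{i\in R\}\cup\{j\in C\}} a_{ij}^2 \le (2nl - l^2)^{1/3}\left(\sum_{\{i\in R\}\cup\{j\in C\}}|a_{ij}|^3\right)^{2/3} \le (2nl)^{1/3}\beta_A^{2/3} \le 2n^{1/3}\beta_A^{2/3}. \]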
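We now prove the second half of (2.6). It follows from the first half of (2.6) and (A.1) that
\[ \begin{aligned} \beta_D &= \sigma_D^{-3}\sum_{i,j=1}^{n-l}|d_{ij} - d_{i\cdot} - d_{\cdot j} + d_{\cdot\cdot}|^3 \\ &\le 101^2\sigma_D^{-3}\sum_{i,j=1}^{n-l}\left(\frac{|d_{ij}|^3}{100^2} + |d_{i\cdot} + d_{\cdot j} - d_{\cdot\cdot}|^3\right) \\ &\le 101^2\left(\frac{100}{93}\right)^{3/2}\sum_{i,j=1}^{n-l}\left(\frac{|d_{ij}|^3}{100^2} + 16\left(|d_{i\cdot}|^3 + |d_{\cdot j}|^3\right) + 4|d_{\cdot\cdot}|^3\right) \\ &\le \left(\frac{100}{93}\right)^{3/2}\left(\frac{101^2}{100^2} + \frac{101^2\cdot 52l^2}{(n - l)^2}\right)\beta_A \le 1.153\beta_A, \end{aligned} \]
where in the second inequality, we have used the fact that
\[ |x + y|^3 = \left|\frac{x}{100} + \cdots + \frac{x}{100} + y\right|^3 \le 101^2\left(\frac{|x|^3}{100^2} + |y|^3\right). \]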
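Proof of Lemma 2.2. We will use the following formulas, which are given by Chen and Shao [3, p. 2025]:
\[ Nh_{x,\alpha} = Eh_{x,\alpha}(Z) = \Phi(x) + \frac{\alpha}{\sqrt{2\pi}}\int_0^1 se^{-(x + \alpha - \alpha s)^2/2}\,ds, \tag{A.2} \]
\[ f_{x,\alpha}(\omega) = \begin{cases} \sqrt{2\pi}e^{\omega^2/2}\Phi(\omega)(1 - Nh_{x,\alpha}) & \text{if } \omega \le x, \\ \sqrt{2\pi}e^{\omega^2/2}(1 - \Phi(\omega))Nh_{x,\alpha} - \alpha e^{\omega^2/2}\int_0^{1 + (x - \omega)/\alpha} se^{-(x + \alpha - \alpha s)^2/2}\,ds & \text{if } x < \omega \le x + \alpha, \\ \sqrt{2\pi}e^{\omega^2/2}(1 - \Phi(\omega))Nh_{x,\alpha} & \text{if } \omega > x + \alpha, \end{cases} \tag{A.3} \]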
and
gx,α(ω) =
√
2π(1 + ω2)eω2/2Φ(ω) + ω(1 − N hx,α) if ω < x,
√
2π(1 + ω2)eω 2 /2(1 − Φ(ω)) − ωN hx,α+ rx,α(ω) if x ≤ ω ≤ x + α,
√
2π(1 + ω2)eω 2 /2(1 − Φ(ω)) − ωN hx,α if ω > x + α,
(A.4)
where
rx,α(ω) = −ωeω2/2
ω
1 +x − s α
e−s2/2ds +1 + x − ω
α
Chen and Shao [3] also proved that 0 ≤ rx,α(ω) ≤ 1 for x ≤ ω ≤ x + α and
2π(1 + ω2)eω2/2(1 − Φ(ω)) − ω ≤ 2
We have 0 ≤ f (ω) ≤ 1, |f0(ω)| ≤ 1 (see Lemma 2.5 in [5]) Therefore, for all ω
For −1 < x ≤ 3, using (A.4)-(A.6), it is not hard to prove that gx,α(ω) ≤ 4 for all ω, so that (2.29) holds
For x > 3, using (A.2)-(A.5) and the fact that 1 − Φ(x) ≤ e−x2/2/x√
2π, we have g(x − 2) ≤√
2π(x2− 4x + 5)e(x−2)2/2Φ(x − 2) + x − 2(1 − Φ(x))
≤ (x − 4 + 5
x)e
−2x+2+ 1
x√ 2π(x − 2)e
and
xfx,α(x) ≤√
2πxex2/2Φ(x)(1 − Φ(x))
2
Furthermore, we have g ≥ 0, g(ω) ≤ 2(1 − Φ(x)) ≤ 2(1 − Φ(3)) for ω ≤ 0, g(ω) ≤ 2/(1 + 33) + 1(x <
ω < x + α) for ω ≥ x and g is increasing for 0 ≤ ω < x (see Chen and Shao [3, p 2025]) Therefore Eg(YD+ u) = Eg(YD+ u)1(YD+ u ≤ 0) + Eg(YD+ u)1(0 < YD+ u ≤ x − 2)
+Eg(YD+ u)1(YD+ u ≥ x) + Eg(YD+ u)1(x − 2 < YD+ u < x)
≤ 2(1 − Φ(3)) + g(x − 2) + 2/(1 + 33) + P (x < YD+ u < x + α) +Eg(YD+ u)1(x − 2 < YD+ u < x)
≤ 1.09 + Eg(YD+ u)1(x − 2 < YD+ u < x)
(A.9)