Central limit theorem of linear spectral statistics for large dimensional random matrices



CENTRAL LIMIT THEOREM OF LINEAR SPECTRAL STATISTICS FOR LARGE DIMENSIONAL RANDOM MATRICES

WANG XIAOYING

NATIONAL UNIVERSITY OF SINGAPORE

2009


CENTRAL LIMIT THEOREM OF LINEAR SPECTRAL STATISTICS FOR LARGE DIMENSIONAL RANDOM MATRICES

WANG XIAOYING (B.Sc., Northeast Normal University, China)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY

NATIONAL UNIVERSITY OF SINGAPORE

2009


I would like to express my deep and sincere gratitude to my supervisors, Professor Bai Zhidong and Associate Professor Zhou Wang. Their valuable guidance and continuous support have been crucial to the completion of this thesis. I do appreciate all the time and effort they have spent in helping me to solve the problems I encountered. I have learned many things from them, especially regarding academic research and character building.

Special acknowledgement is also due to Assistant Professor Pan Guangming and Mr Wang Xiping for discussions on various topics of large dimensional random matrices theory.

It is a great pleasure to record my thanks to my dear friends Ms Zhao Wanting, Ms Zhao Jingyuan, Ms Zhang Rongli, Ms Li Hua, Ms Zhang Xiaoe, Ms Li Xiang, Mr Khang Tsung Fei, Mr Li Mengxin, Mr Deng Niantao, Mr Su Yue, Mr Wang Daqing, and Mr Loke Chok Kang, who have given me much help not only in my study but also in my daily life. Sincere thanks to all my friends who helped me in one way or another, for their friendship and encouragement.

On a personal note, I thank my parents, husband, sisters and brother for their endless love and continuous support during the entire period of my PhD programme. I also thank my baby for giving me a lot of happy times and a sense of responsibility.

Finally, I would like to attribute the completion of this thesis to the other members and staff of the department, for their help in various ways and for providing such a pleasant studying environment. I also wish to express my gratitude to the university and the department for supporting me through an NUS research scholarship.


1 Introduction

1.1 Large Dimensional Random Matrices
1.2 Spectral Analysis of LDRM
1.3 Methodologies
1.3.1 Moment Method
1.3.2 Stieltjes Transform
1.3.3 Orthogonal Polynomial Decomposition
1.4 Organization of the Thesis

2 Literature Review

2.1 Limiting Spectral Distribution (LSD) of LDRM
2.1.1 Wigner Matrix
2.1.2 Sample Covariance Matrix
2.1.3 Product of Two Random Matrices
2.2 Limits of Extreme Eigenvalues
2.3 Convergence Rate of ESD
2.4 CLT of Linear Spectral Statistics (LSS)

3 CLT of LSS for Wigner Matrices

3.1 Introduction and Main Result
3.2 Bernstein Polynomial Approximation
3.3 Truncation and Preliminary Formulae
3.3.1 Simplification by Truncation
3.3.2 Preliminary Formulae
3.4 The Mean Function of LSS
3.5 Convergence of ∆ − E∆

4 CLT of LSS for Sample Covariance Matrices

4.1 Introduction and Main Result
4.2 Bernstein Polynomial Approximations
4.3 Simplification by Truncation and Normalization
4.4 Convergence of ∆ − E∆
4.5 The Mean Function of LSS

5 Conclusion and Further Research

5.1 Conclusion and Discussion
5.2 Future Research


With the rapid development of computer science, large dimensional data have become increasingly common in various disciplines. These data resist conventional multivariate analysis that relies on large sample theory, since the number of variables for each observation can be very large and comparable to the sample size. Classical multivariate analysis appears to have intolerable errors in dealing with large dimensional data. Consequently, a new approach, large dimensional random matrices (LDRM) theory, has been proposed to replace the classical large sample theory.

Spectral analysis of LDRM plays an important role in large dimensional data analysis. After finding the limiting spectral distribution (LSD) of the empirical spectral distribution (ESD, i.e. the empirical distribution of the eigenvalues) of LDRM, one can easily derive the limit of the corresponding linear spectral statistics (LSS). Then, in order to conduct further statistical inference, it is important to find the limiting distribution of the LSS of LDRM.

A general conjecture about the convergence rate of the ESD to the LSD puts it at the order of O(n^{−1}). If this is true, then it seems natural to consider the asymptotic properties of the empirical process Gn(x) = n(Fn(x) − F(x)). Unfortunately, many lines of evidence show that the process Gn(x) cannot converge in any metric space. As an alternative, we turn to finding the limiting distribution of the LSS of the process Gn(x). In this thesis, using the Bernstein polynomial approximation and the Stieltjes transform method, under suitable moment conditions, we prove the central limit theorem (CLT) of LSS with a generalized regular class C4 of kernel functions for large dimensional Wigner matrices and sample covariance matrices. These asymptotic properties of LSS suggest that more efficient statistical inferences, such as hypothesis testing and constructing confidence intervals or regions, on a class of population parameters are possible. The improved criteria on the constraint conditions of the kernel functions in our results should also provide a better understanding of the asymptotic properties of the ESD of the corresponding LDRM.

Notation

Wn(k) — the submatrix extracted from Wn by removing its k-th row and k-th column
γm — the contour formed by the boundary of the rectangle with vertices ±a ± i/√m
γmh, γmv — the union of the two horizontal (respectively vertical) parts of γm

Chapter 1

Introduction

1.1 Large Dimensional Random Matrices

In classical multivariate analysis, large sample theory assumes that the data dimension p is very small and fixed, while the number of observations, the sample size n, is large or tends to infinity. However, over the recent four or five decades, with the rapid development of computer science, this is not always the case. For some contemporary data, the dimension p is also large and comparable to the sample size n, and in some cases even larger than the sample size. These phenomena are commonplace in various fields, such as finance, genetics, bioinformatics, wireless communications, signal processing, and environmental science. Hence, the new features of contemporary data bring a series of new tasks to statisticians, for example: how to properly describe these new features of data, and whether the classical limiting theory is suitable for analyzing large dimensional data and, if not, how to amend it.

In the 1972 Wald Memorial Lecture, Huber (1973) proposed a new, more reasonable asymptotic setup for large sample theory. After summarizing and analyzing several possibilities for the growth of p as n tends to infinity, he strongly suggested studying the situation where the dimension p increases together with n in linear regression analysis. For LDRM, it is the convention to exploit the simple asymptotic setup in which p tends to infinity proportionally to n, that is, p/n → y ∈ (0, ∞). This new setup leads to two kinds of limiting results: classical limiting theory (for fixed dimension p) and large dimensional limiting theory (for large dimension p; also called LDRM theory). Therefore, it is natural to ask which one is closer to reality, that is, which kind of limiting theory should be applied to a particular problem.

Bai and Saranadasa (1996) encouraged statisticians to reexamine classical statistical approaches when dealing with high dimensional data. As an example of the effect of high dimension, a two sample problem was investigated. They showed that both Dempster's non-exact test (Dempster, 1958) and their asymptotically normally distributed test have higher power than the classical Hotelling test when the data dimension is proportionally close to the within-sample degrees of freedom. Another example was presented in Bai and Silverstein (2004). When p increases proportionally to n, an important statistic in multivariate analysis, Ln = ln(det Sn), behaves in a completely different manner than it does on data of low dimension with large sample size. (Here Sn is the sample covariance matrix of n samples from a p-dimensional mean zero random vector with population covariance matrix I.) Thus, when p is large, any test which assumes asymptotic normality of Ln, i.e. employs the classical limiting theory, will result in a serious error.

These examples show that some classical limiting theory might no longer be suitable for dealing with large dimensional data. LDRM theory offers one possible method to analyze large dimensional data and has attracted much interest from statisticians. At the international conference "Mathematical Challenges of the 21st Century", Donoho (2000) stated that "we can say with complete confidence that in the coming century, high-dimensional data analysis will be a very significant activity, and completely new methods of high-dimensional data analysis will be developed; we just don't know what they are yet."

1.2 Spectral Analysis of LDRM

A major part of LDRM theory is the spectral analysis of LDRM. Suppose A is an n × n matrix with eigenvalues λj, j = 1, 2, ..., n. If all these eigenvalues are real, e.g., if A is symmetric (or Hermitian for complex entries), we can define a one-dimensional distribution function

F^A(x) = (1/n) · #{ j ≤ n : λj ≤ x }.

Otherwise, for complex eigenvalues, we define the two-dimensional ESD of the matrix A:

F^A(x, y) = (1/n) · #{ j ≤ n : Re(λj) ≤ x, Im(λj) ≤ y } = (1/n) Σ_{j=1}^{n} I{Re(λj) ≤ x, Im(λj) ≤ y},

where Re(λj) and Im(λj) denote the real and imaginary parts of the complex number λj, respectively, and I{·} denotes the indicator function.
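Numerically, the ESD is just eigenvalue counting. The following sketch (using NumPy; the 2 × 2 test matrix is an arbitrary illustration, not taken from the thesis) evaluates the one-dimensional F^A(x):

```python
import numpy as np

def esd(A, x):
    """One-dimensional ESD F^A(x) = (1/n) #{j : lambda_j <= x} of a
    symmetric/Hermitian matrix A, evaluated at a point x."""
    eigenvalues = np.linalg.eigvalsh(A)  # real eigenvalues of a Hermitian matrix
    return np.mean(eigenvalues <= x)

# A small symmetric example: the eigenvalues of [[0, 1], [1, 0]] are -1 and +1,
# so F^A jumps by 1/2 at each of them.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
print(esd(A, -2.0), esd(A, 0.0), esd(A, 2.0))  # 0.0 0.5 1.0
```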

One of the original motivations of spectral analysis of LDRM arose in nuclear physics during the 1950's. In a quantum mechanical system, the energy levels of the quantum system cannot be observed directly, but they can be represented by the eigenvalues of a matrix of physical measurements or observations. Furthermore, most nuclei have thousands of energy levels, which are too complex to be described exactly. Since the 1950's, a large number of physicists and statisticians have shown keen interest in the spectral analysis of LDRM and have obtained many research results. Many theorems and applications of LDRM theory in quantum mechanics and other related areas were well summarized by Mehta (1990).

In multivariate statistical inference, the motivation for spectral analysis of LDRM is due to the fact that many important statistics can be expressed as functionals of the spectral distribution of some random matrices. For concrete applications, the reader may refer to Bai (1999).

Two key problems in the spectral analysis of LDRM are to investigate the convergence, and its rate, of the sequence of empirical spectral distributions F^{An} for a given sequence of random matrices {An}. The limit distribution F (possibly defective), which is usually non-random, is called the Limiting Spectral Distribution (LSD) of the matrix sequence {An}. The following section will review the three main methodologies for finding the limits and improving the convergence rates.

1.3 Methodologies

It is known that the eigenvalues of a matrix are continuous functions of the matrix entries. However, the explicit forms of the eigenvalues are too complex to calculate when the matrix dimension is larger than 4. In order to investigate the spectral distributions of LDRM, three primary methods have been employed in the literature: the moment method, the Stieltjes transform, and the orthogonal polynomial decomposition of the exact density of the eigenvalues.

1.3.1 Moment Method

The moment method is one of the most popular methods in LDRM theory; it is based on the Fréchet–Shohat Moment Convergence Theorem (MCT, see Loève, 1977). Suppose {Fn} denotes a sequence of distribution functions with finite moments of all orders. The MCT investigates under what conditions the convergence of moments of all fixed orders implies the weak convergence of the sequence of distributions Fn. Sufficient conditions are precisely described in the following three lemmas. One may refer to Bai and Silverstein (2006, Appendix B) for their detailed proofs.

Let

βnk = βk(Fn) := ∫ x^k dFn(x)

be the k-th moment of the distribution Fn.

Lemma 1.3.1 (Unique Limit) A sequence of distribution functions {Fn} converges weakly to a limit if the following conditions are satisfied:

(1) Each Fn has finite moments of all orders.

(2) For each fixed integer k ≥ 0, βnk converges to a finite limit βk as n → ∞.

(3) If two right-continuous nondecreasing functions F, G have the same moment sequence {βk}, then F = G + constant.

One can prove Lemma 1.3.1 by using Helly's selection theorem and the properties of distribution functions. When applying this lemma, besides verifying conditions (1) and (2), we need to check condition (3). The following two lemmas give sufficient conditions which imply condition (3) of Lemma 1.3.1.

Lemma 1.3.2 (Carleman) Let {βk = βk(F)} be the sequence of moments of the distribution function F. If the following Carleman condition is satisfied:

Σ_{k=1}^{∞} β_{2k}^{−1/(2k)} = ∞,

then F is uniquely determined by the moment sequence {βk, k = 0, 1, ...}.

The following Lemma 1.3.3 is a corollary of the Carleman condition in Lemma 1.3.2. The Riesz condition, referring to (1.1), is easy to check and is powerful enough in the spectral analysis of LDRM.

Lemma 1.3.3 (M. Riesz) Let {βk} be the sequence of moments of the distribution function F. If the Riesz condition (1.1) is satisfied, then F is uniquely determined by the moment sequence {βk, k = 0, 1, ...}.

In LDRM theory, the k-th moment of F^A can be written as

βk(F^A) = ∫ x^k dF^A(x) = (1/n) tr(A^k),

i.e., as the normalized trace of the k-th power of the matrix A.
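As a quick numerical illustration of this trace identity (a simulation sketch with Gaussian entries and an arbitrary matrix size, not part of the thesis): for a Wigner matrix normalized as Wn/√n, the even moments (1/n) tr((Wn/√n)^{2k}) should be close to the Catalan numbers 1, 2, 5, ..., the even moments of the semicircular law reviewed in Chapter 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Real symmetric Wigner matrix with mean-zero, variance-1 off-diagonal entries.
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2)
A = W / np.sqrt(n)

def moment(A, k):
    """k-th moment of the ESD of A via the identity beta_k = (1/n) tr(A^k)."""
    return np.trace(np.linalg.matrix_power(A, k)) / A.shape[0]

# Even moments approach the Catalan numbers C_1 = 1, C_2 = 2, C_3 = 5.
print(moment(A, 2), moment(A, 4), moment(A, 6))
```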

1.3.2 Stieltjes Transform

For a function G of bounded variation on the real line, its Stieltjes transform is defined by

sG(z) = ∫ 1/(x − z) dG(x), z ∈ C⁺ = {z : Im(z) > 0}.

One of the important advantages of the Stieltjes transform is that it always exists for all functions of bounded variation defined on the real line. The following lemmas in this section, on the properties and inequalities of the Stieltjes transform, are well summarized and proved in Bai and Silverstein (2006, Appendix B).

Lemma 1.3.4 (Inversion formula) If G is a distribution function, then for any continuity points a < b of G,

G(b) − G(a) = lim_{v↓0} (1/π) ∫_a^b Im(sG(u + iv)) du.

This lemma provides a one-to-one correspondence between distribution functions and their Stieltjes transforms. Furthermore, it offers an easy way to recover a distribution function when its Stieltjes transform is known.

Lemma 1.3.5 (Continuity) Assume that {Gn} is a sequence of functions of bounded variation with Gn(−∞) = 0 for all n. Then

lim_{n→∞} sGn(z) = s(z) for every z ∈ C⁺

if and only if there is a function of bounded variation G with G(−∞) = 0 and Stieltjes transform s(z) such that Gn → G vaguely.

This lemma describes a continuity property between the family of bounded variation functions and the family of their Stieltjes transforms.

Lemma 1.3.6 (Differentiability) Let G be a function of bounded variation and x0 ∈ R. Suppose that Im sG(x0) = lim_{z∈C⁺, z→x0} Im sG(z) exists. Then G is differentiable at x0, and its derivative is (1/π) Im sG(x0).

From this lemma one can see another important advantage of the Stieltjes transform: the density function of a distribution can be obtained from its Stieltjes transform.

In LDRM theory, if A is an n × n Hermitian matrix, the Stieltjes transform of F^A has the following expression:

sF^A(z) = (1/n) tr(A − zI)^{−1},

where I is the identity matrix. Applying the inverse matrix formula, we have

sF^A(z) = (1/n) Σ_{k=1}^{n} 1 / (a_kk − z − α_k* (A_k − zI)^{−1} α_k),

where A_k denotes the submatrix of A obtained by removing its k-th row and column, and α_k is the k-th column of A with its k-th entry removed.
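The trace form of the Stieltjes transform is easy to check numerically. The following sketch (a simulation under Gaussian assumptions and an arbitrary matrix size, not part of the thesis) compares (1/n) tr(A − zI)^{−1} for a normalized Wigner matrix with the known Stieltjes transform of the semicircular law, s(z) = (−z + √(z² − 4))/2, taking the square root branch with Im s(z) > 0 for z ∈ C⁺. It also illustrates Lemma 1.3.6: the density at a real point x is (1/π) Im s(x + i0).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
G = rng.standard_normal((n, n))
A = (G + G.T) / np.sqrt(2 * n)        # normalized Wigner matrix W_n / sqrt(n)

def stieltjes_esd(A, z):
    """s_{F^A}(z) = (1/n) tr (A - zI)^{-1}, computed from the eigenvalues."""
    eigs = np.linalg.eigvalsh(A)
    return np.mean(1.0 / (eigs - z))

def stieltjes_semicircle(z):
    """Stieltjes transform of the semicircular law on [-2, 2]; the square root
    branch is chosen so that Im s(z) > 0 when Im z > 0."""
    r = np.sqrt(z * z - 4.0 + 0j)
    if r.imag * z.imag < 0:           # flip to the branch matching C+
        r = -r
    return (-z + r) / 2.0

z = 0.3 + 1.0j
print(abs(stieltjes_esd(A, z) - stieltjes_semicircle(z)))  # small, O(1/n)

# Lemma 1.3.6 in action: the semicircle density at 0 is (1/pi) Im s(0 + i0) = 1/pi.
print(stieltjes_semicircle(1e-9j).imag / np.pi)
```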

The following three lemmas describe the distance between distributions in terms of their Stieltjes transforms, and pave the way for estimating the convergence rate of the ESD of LDRM to its LSD.

Lemma 1.3.7 Let F be a distribution function and let G be a function of bounded variation satisfying ∫ |F(x) − G(x)| dx < ∞. Denote their Stieltjes transforms by f(z) and g(z), respectively. Then we have

where z = u + iv, v > 0, and a and γ are constants related to each other by

Lemma 1.3.8 Under the assumptions of Lemma 1.3.7, we have

where A and B are positive constants such that A > B and κ = 4B / (π(A − B)(2γ − 1)) < 1.

The following Lemma 1.3.9 is an immediate corollary of Lemma 1.3.8.

Lemma 1.3.9 In addition to the assumptions of Lemma 1.3.8, assume further that, for some constant B > 0, F([−B, B]) = 1 and |G|((−∞, −B)) = |G|((B, ∞)) = 0, where |G|((a, b)) denotes the total variation of the signed measure G on the interval (a, b). Then we have

where A, B and κ are defined in Lemma 1.3.8.

1.3.3 Orthogonal Polynomial Decomposition

If the elements of the matrix A have a joint density p_n(A) = H(λ1, ..., λn), then the joint density of the eigenvalues is given by

f(λ1, ..., λn) = c J(λ1, ..., λn) H(λ1, ..., λn),

where J is the integral of the Jacobian of the transform from the matrix space to its eigenvalue–eigenvector space.

Generally, it is assumed that H has the form H(λ1, ..., λn) = Π_{k=1}^{n} g(λk) and J has the form J(λ1, ..., λn) = Π_{i<j} (λi − λj)^β Π_{k=1}^{n} hn(λk). For example, β = 1 and hn = 1 for the real Gaussian matrix, β = 2 and hn = 1 for the complex Gaussian matrix, β = 4 and hn = 1 for the quaternion Gaussian matrix, and β = 1 and hn(x) = x^{n−p} for the real Wishart matrix with n ≥ p.

Note that the orthogonal polynomial decomposition can only be applied under the assumption that the exact density of the eigenvalues is known. However, in this thesis we will not assume the existence of density functions, which is too restrictive. Instead, we consider a general situation: the underlying distribution of the elements of the matrices could be discrete. Hence, a detailed discussion of the orthogonal polynomial decomposition is beyond the scope of this study.

1.4 Organization of the Thesis

This thesis consists of five chapters and is organized as follows. In this chapter, Chapter 1, we have provided a general introduction to the motivation of LDRM theory and of spectral analysis, as well as to the three main methodologies in this field.

In Chapter 2, we present a detailed review of the spectral analysis of LDRM.

Chapters 3 and 4 are the main parts of this thesis, in which we prove our main results, Theorem 3.1.1 and Theorem 4.1.1.

In the last chapter, Chapter 5, we discuss some applications and possible future research.

Chapter 2

Literature Review

2.1 Limiting Spectral Distribution (LSD) of LDRM

2.1.1 Wigner Matrix

Definition 2.1.1 (Wigner Matrix) A Wigner matrix is a symmetric random matrix (Hermitian in the complex case) whose entries on or above the diagonal are independent.

The Wigner matrix is named after the famous physicist Eugene Wigner, and it plays an important role in nuclear physics (see Mehta (1990)). It also has strong statistical meaning in multivariate analysis, as it is the limit of the normalized Wishart matrix.

The study of the spectral analysis of large dimensional Wigner matrices dates back to Eugene Wigner's (1955, 1958) famous semicircular law. He proved that the expected ESD of an n × n standard Gaussian matrix Wn, normalized by 1/√n, converges to the semicircular law F with the density

F′(x) = (1/(2π)) √(4 − x²), if |x| ≤ 2, and 0 otherwise.    (2.1)

Grenander (1963) proved that ||F^{(1/√n)Wn} − F|| → 0 in probability. This was further generalized by Arnold (1967, 1971) in various aspects. Bai (1999) derived the almost sure version using both the moment method and the Stieltjes transform method. This result is presented in the following theorem.

Theorem 2.1.1 Suppose Wn = (xij) is an n × n generalized Wigner matrix whose entries above the diagonal are i.i.d. complex random variables with variance σ², and whose diagonal entries are i.i.d. real random variables (without any moment requirement). Then, as n → ∞, with probability 1, the ESD of (1/√n)Wn tends to the semicircle law with scale parameter σ, whose density is given by

F′(x) = (1/(2πσ²)) √(4σ² − x²), if |x| ≤ 2σ, and 0 otherwise.    (2.2)

The following theorem is a generalized result for the non-i.i.d. case, proved by Bai and Silverstein (2006, page 23).

Theorem 2.1.2 Suppose that Wn is a Wigner matrix whose entries on or above the diagonal are independent, but may depend on n and need not be identically distributed. Assume that all the entries of Wn are of mean zero and variance 1 and satisfy the following condition: for any constant η > 0,

lim_{n→∞} (1/n²) Σ_{i,j} E|xij|² I{|xij| ≥ η√n} = 0.

Then the ESD of Wn converges to the semicircular law almost surely.

Definition 2.1.2 (Sample Covariance Matrix) Let Xn = (xij)_{p×n}, 1 ≤ i ≤ p, 1 ≤ j ≤ n, be an observation matrix of size n from a certain p-dimensional population distribution, and let xj = (x1j, ..., xpj)ᵗ be the j-th column of Xn. Then the sample covariance matrix is

Sn = (1/(n − 1)) Σ_{j=1}^{n} (xj − x̄)(xj − x̄)*,

where x̄ = (1/n) Σ_{j=1}^{n} xj and A* denotes the complex conjugate transpose of a matrix A.

In the spectral analysis of large dimensional sample covariance matrices, it is usual to study the simplified sample covariance matrix, which is given by

Bn = (1/n) Xn Xn*.

The first success in finding the LSD of the sample covariance matrix is attributed to Marčenko and Pastur (1967). They found the limiting distribution, presently known as the Marčenko–Pastur law (MP law). Subsequent work was done in Grenander and Silverstein (1977), Jonsson (1982), Silverstein (1995), Wachter (1978) and Yin (1986). The following theorem in Bai (1999) for the complex case is a generalized version of Yin (1986), where the real case was studied.

Theorem 2.1.3 Suppose that {xij, 1 ≤ i ≤ p, 1 ≤ j ≤ n} is a double array of i.i.d. complex random variables with mean zero and variance σ², and that p/n → y ∈ (0, ∞). Then, with probability 1, the ESD of Bn tends to a limiting distribution with the density

F′(x) = (1/(2πxyσ²)) √((b − x)(x − a)), if a ≤ x ≤ b, and 0 otherwise,

and a point mass 1 − 1/y at the origin if y > 1, where a = σ²(1 − √y)² and b = σ²(1 + √y)²; the constant y is the dimension-to-sample-size ratio index.

The limiting distribution in Theorem 2.1.3 is called the Marčenko–Pastur law with ratio index y and scale parameter σ². If σ² = 1, the MP law is known as the standard MP law.
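The MP law is straightforward to verify by simulation. The sketch below (assuming standard Gaussian entries and the arbitrary ratio y = 0.25; the theorem only requires i.i.d. mean-zero entries) compares the fraction of eigenvalues of Bn below 1 with the numerically integrated MP density, and checks the first moment, which equals σ² = 1:

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 500, 2000                       # ratio y = p/n = 0.25, sigma^2 = 1
X = rng.standard_normal((p, n))
eigs = np.linalg.eigvalsh(X @ X.T / n)  # eigenvalues of B_n = (1/n) X X^T

y = p / n
a, b = (1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2   # support edges 0.25, 2.25

def mp_density(x):
    """Marcenko-Pastur density with ratio index y and sigma^2 = 1."""
    return np.sqrt((b - x) * (x - a)) / (2 * np.pi * x * y)

# Fraction of eigenvalues below 1 vs. the integral of the MP density over [a, 1].
grid = np.linspace(a, 1.0, 20001)
dx = grid[1] - grid[0]
theoretical = float(np.sum(mp_density(grid)) * dx)
empirical = float(np.mean(eigs <= 1.0))
print(empirical, theoretical)          # the two agree closely
print(eigs.mean())                     # first moment of the MP law is sigma^2 = 1
```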

The following theorem extends the above result to the non-i.i.d. case for sample covariance matrices, and was proposed in Bai and Silverstein (2006, page 46).

Theorem 2.1.4 Suppose that for each n the entries of Xn are independent complex variables with a common mean µ and variance σ². Assume that p/n → y ∈ (0, ∞) and that, for any η > 0,

lim_{n→∞} (1/(np)) Σ_{i,j} E|xij|² I{|xij| ≥ η√n} = 0.

Then, with probability one, the ESD of the sample covariance matrix, F^{Bn}, converges to the MP law with ratio index y and scale parameter σ².

The study of a product of two random matrices originates from two areas. The first is the investigation of the LSD of a sample covariance matrix ST when the population covariance matrix T is not a multiple of the identity matrix. The second is the study of the LSD of a multivariate F-matrix F = S1 S2^{−1}, which is the product of a sample covariance matrix and the inverse of another sample covariance matrix, the two being independent of each other.

Yin and Krishnaiah (1983) investigated the limiting distribution of the product of a Wishart matrix S and a positive definite matrix T. Other variations of the product were considered by Bai, Yin and Krishnaiah (1986). Silverstein and Bai (1995) showed the existence of the ESD of the generalized version B = A + (1/n) X*TX. The setup of the matrix B originated from nuclear physics, but it is also encountered in multivariate statistics.

As for the F-matrix, pioneering work was done by Wachter (1980), who considered the LSD of F when S1 and S2 are independent Wishart matrices. Yin, Bai and Krishnaiah (1983) also showed the existence of the LSD of the multivariate F-matrix. The explicit form of the LSD of multivariate F-matrices was derived in Bai, Yin, and Krishnaiah (1987) and Silverstein (1985a). Under the same structure, Bai, Yin, and Krishnaiah (1986) established the existence of the LSD when the underlying distribution of S is isotropic.

2.2 Limits of Extreme Eigenvalues

In multivariate analysis, many statistics generated from a random matrix can be written as functions of integrals with respect to the ESD of the random matrix. When the LSD is known, approximate values of these statistics can be obtained by using the Helly–Bray theorem (see Loève (1977), pp. 184–186), which is not applicable unless we can prove that the extreme eigenvalues of the random matrix remain in certain bounded intervals.

The investigation of the limits of extreme eigenvalues is important not only in the above aspect, but also in many other areas, such as signal processing, pattern recognition, edge detection, and numerical analysis.

The first work in this direction is attributed to Geman (1980), who proved that the largest eigenvalue of the large dimensional sample covariance matrix tends to b = (1 + √y)² as p/n → y ∈ (0, ∞), under a restriction on the growth rate of the moments of the underlying distribution:

E|X11|^k ≤ M k^{αk},

for some M > 0, α > 0 and all k ≥ 3. This result was further generalized by Yin, Bai and Krishnaiah (1988) under the assumption of the existence of the fourth moment of the underlying distribution. The fourth moment condition was further proved to be necessary in Bai, Silverstein and Yin (1988). Silverstein (1989) showed that the necessary and sufficient conditions for the weak convergence of the largest eigenvalue of a sample covariance matrix are EX11 = 0 and n² P(|X11| ≥ √n) → 0. In Bai and Yin (1988), the necessary and sufficient condition for the almost sure convergence of the largest eigenvalue of the Wigner matrix was obtained. Jiang (2004) proved that the almost sure limit of the largest eigenvalue of the sample correlation matrix is the same as that of the largest eigenvalue of the sample covariance matrix.

A relatively difficult problem is to find the limit of the smallest eigenvalue of the sample covariance matrix. Yin, Bai and Krishnaiah (1983) proved that the lower limit of the smallest eigenvalue of a Wishart matrix has a positive lower bound if p/n → y ∈ (0, 1/2). Silverstein (1984) extended this work to y ∈ (0, 1). He later (1985b) proved that the smallest eigenvalue of a standard Wishart matrix tends to a = (1 − √y)² if p/n → y ∈ (0, 1). However, it is hard to use his approach to obtain a general result, as his method depends heavily on the normality assumption. A breakthrough was made in Bai and Yin (1993), who used a unified approach to establish the strong limits of both the largest and the smallest eigenvalues of the sample covariance matrix simultaneously, under the existence of the fourth moment of the underlying distribution. In fact, the strong limit of the smallest eigenvalue was proven to be a = (1 − √y)².
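Both edge limits, a = (1 − √y)² and b = (1 + √y)², can be observed directly in simulation. The sketch below uses Gaussian entries (which satisfy the fourth moment condition; the particular sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)

def extreme_eigs(p, n, rng):
    """Smallest and largest eigenvalue of the sample covariance matrix (1/n) X X^T."""
    X = rng.standard_normal((p, n))
    eigs = np.linalg.eigvalsh(X @ X.T / n)   # ascending order
    return eigs[0], eigs[-1]

p, n = 400, 1600                        # ratio y = p/n = 0.25
lam_min, lam_max = extreme_eigs(p, n, rng)
y = p / n
print(lam_min, (1 - np.sqrt(y)) ** 2)   # smallest eigenvalue near a = 0.25
print(lam_max, (1 + np.sqrt(y)) ** 2)   # largest eigenvalue near b = 2.25
```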

2.3 Convergence Rate of ESD

The convergence rate of the ESD is of practical interest, but it had been an open problem for decades, since there were no suitable tools. The first great breakthrough in estimating the convergence rate was made in Bai (1993a), in which a Berry–Esseen type inequality for the difference of two ESDs was established in terms of their Stieltjes transforms. Through this tool, Bai offered a way to establish the convergence rate and proved that a convergence rate for the expected ESD of the large dimensional Wigner matrix is O(n^{−1/4}). Applying this inequality, Bai, Miao and Tsay (1997) first showed that the ESD itself converges to the Wigner semicircular law in probability with the rate O(n^{−1/4}), under the assumption of a finite fourth moment. Later, Bai, Miao and Tsay (1999) improved the rate to Op(n^{−1/4}). In 2002, they further derived that, under the eighth moment condition, the convergence rate of the expected ESD is O(n^{−1/2}) and that of the ESD itself is Op(n^{−2/5}).

For large dimensional sample covariance matrices, under the finite fourth moment condition, Bai (1993b) showed that the convergence rate for the expectation of the ESD is O(n^{−1/4}) if the ratio of the dimension to the degrees of freedom stays away from 1, and is O(n^{−5/48}) if the ratio is close to 1. Bai, Miao and Tsay (1997) proved the same rates of convergence in probability for the ESD itself.

Using the Stieltjes transform, Bai, Miao and Yao (2003) proved that the expected spectral distribution converges to the Marčenko–Pastur law at the rate O(n^{−1/2}) if the ratio of dimension to sample size, y = yn = p/n, stays away from 0 and 1, under the assumption that the entries have a finite eighth moment. Furthermore, the rates for convergence in probability and for almost sure convergence are shown to be Op(n^{−2/5}) and oa.s.(n^{−2/5+η}), respectively, when y is away from 1. It is interesting that the rate in all senses is O(n^{−1/8}) when y is close to 1. However, the exact convergence rate and the optimal condition of convergence for Wigner and sample covariance matrices are still open.

2.4 CLT of Linear Spectral Statistics (LSS)

As mentioned in the introduction, many important statistics in multivariate analysis can be expressed as functionals of the ESD of some random matrices. Indeed, a parameter θ of the population can often be expressed as

θ = ∫ f(x) dF(x),

which is naturally estimated by ˆθ = ∫ f(x) dFn(x). In order to conduct statistical inference on θ,


we need to know the limiting distribution of

Gn(f) = αn(ˆθ − θ) = ∫ f(x) dGn(x),

where Gn(x) = αn(Fn(x) − F(x)) and αn → ∞ is a suitably chosen normalizer such that Gn(f) tends to a non-degenerate distribution.
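For a concrete toy case (a sketch with Gaussian Wigner matrices and the hypothetical choice f(x) = x², neither of which is prescribed here): with αn = n, Gn(f) = Σj μj² − n, where μj are the eigenvalues of W/√n and the semicircular law has second moment 1. The simulation below shows that this centered statistic stays O(1) as n grows, which is exactly what makes αn = n the right normalizer for such LSS.

```python
import numpy as np

rng = np.random.default_rng(5)

def g_n(n, rng):
    """G_n(f) = n * integral of f d(F_n - F) for f(x) = x^2, where F_n is the
    ESD of W/sqrt(n) and F is the semicircular law (second moment 1)."""
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2)           # real Wigner matrix
    mu = np.linalg.eigvalsh(W / np.sqrt(n))
    return np.sum(mu ** 2) - n

# Despite each eigenvalue sum being of order n, the centered statistic
# fluctuates on the O(1) scale, replication after replication.
samples = np.array([g_n(300, rng) for _ in range(100)])
print(samples.mean(), samples.std())
```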

It seems natural to pursue the properties of linear functionals by considering the asymptotics of the empirical process Gn(x) = αn(Fn(x) − F(x)), viewed as a random element in C space or D space, the metric spaces of functions equipped with the Skorokhod metric. If, for some choice of αn, Gn(x) tended to a limiting process G(x), then the limiting distribution of all LSS could be derived. Unfortunately, many lines of evidence show that Gn(x) cannot tend to a limiting process in any metric space. The work in Bai and Silverstein (2004) showed that Gn(x) cannot converge weakly to any non-trivial process for any choice of αn. This phenomenon appears in other random matrix ensembles as well. When Fn is the empirical distribution of the angles of the eigenvalues of an n × n Haar matrix, Diaconis and Evans (2001) proved that all finite dimensional distributions of Gn(x) converge in distribution to independent Gaussian variables when αn = n/√(log n). This shows that, for αn = n/√(log n), the process Gn cannot be tight in D space.

Therefore, we have to abandon the attempt to find a limiting process for Gn(x). Instead, we consider the convergence of the empirical process Gn(f) with suitable αn and f.

The first work in this direction was done by Jonsson (1982), in which he proved the CLT for the centralized sum of the r-th power of the eigenvalues of a normalized Wishart matrix. Similar work for the Wigner matrix was obtained in Sinai and Soshnikov (1998). Later, Johansson (1998) proved the CLT of linear spectral statistics of the Wigner matrix under a density assumption.

In Bai and Silverstein (2004), the normalization constant αn for large dimensional sample covariance matrices was found to be n, by showing that the limiting distribution of Gn(f) = n ∫ f(x) d(Fn(x) − F(x)) is Gaussian under certain assumptions, where f is any function analytic on a certain open set including the support of the MP law. In Bai and Yao (2005), the Wigner matrix case was considered: under a fourth moment condition, they proved that Gn(f) converges to a Gaussian limit. For the CLT for other types of matrices, one can refer to Anderson and Zeitouni (2006).

In Bai and Silverstein (2004) and Bai and Yao (2005), the test functions f are analytic on an open set including the support of the corresponding limit distribution. However, the requirement that the functions be analytic is stringent, because some of the functions observed in real-life situations do not satisfy this condition. As such, it would be more useful to relax it.

The aim of this thesis is to relax this condition. We only require that the test functions have continuous fourth-order derivatives on an open interval including the support of the corresponding limiting spectral distribution. We prove that the LSS for sample covariance matrices and Wigner matrices converge weakly to Gaussian processes under certain moment conditions. We also provide explicit formulae for the mean and covariance functions of the limiting Gaussian processes.


Chapter 3

CLT of LSS for Wigner Matrices

3.1 Introduction and Main Result

A real Wigner matrix of size n is a real symmetric matrix Wn = (xij)_{1≤i,j≤n} whose upper-triangle entries (xij)_{1≤i≤j≤n} are independent, zero-mean, real-valued random variables satisfying the following moment conditions:

(1) ∀i, E|xii|² = σ² > 0; (2) ∀i < j, E|xij|² = 1.

The set of these real Wigner matrices is called the Real Wigner Ensemble (RWE)

A complex Wigner matrix of size n is a Hermitian matrix Wn = (xij)_{1≤i,j≤n} whose upper-triangle entries (xij)_{1≤i≤j≤n} are independent, zero-mean, complex-valued random variables satisfying the following moment conditions:

(1) ∀i, E|xii|² = σ² > 0; (2) ∀i < j, E|xij|² = 1 and E xij² = 0.

The set of these complex Wigner matrices is called the Complex Wigner Ensemble (CWE).

The empirical distribution Fn generated by the n eigenvalues of the normalized Wigner matrix n^{−1/2}Wn is called the empirical spectral distribution (ESD) of the Wigner matrix. The semicircular law states that Fn converges a.s. to the distribution F with the density

F′(x) = (1/(2π)) √(4 − x²), x ∈ [−2, 2].

Its various modes of convergence were investigated later.

Clearly, as stated in the introduction, one method of refining the above approximation is to establish the rate of convergence, which was studied in Bai (1993a), Costin and Lebowitz (1995), Johansson (1998), Khorunzhy, Khoruzhenko and Pastur (1996), Sinai and Soshnikov (1998), and Bai, Miao and Tsay (1997, 1999, 2002). The convergence rate was improved gradually from O(n^{−1/4}) to O(n^{−2/5}). Although the exact convergence rate remains unknown for Wigner matrices, Bai and Yao (2005) proved that the LSS of Wigner matrices, indexed by a set of functions analytic on an open domain of the complex plane including the support of the semicircular law, converge to a Gaussian process with rate n, under a finite fourth moment and a Lindeberg type condition.

Naturally, one may ask whether it is possible to derive the convergence of the LSS of Wigner matrices indexed by a larger class of functions. In other words, can we relax the analyticity condition on the test functions?

In this thesis, we consider the LSS of Wigner matrices indexed by a set of functions with continuous fourth-order derivatives on an open interval of the real line including the support of the semicircular law. More precisely, let C4(U) denote the set of functions f : U → C which have continuous fourth-order derivatives, where the open set U of the real line includes the interval [−2, 2], the support of F(x). The empirical process Gn ≜ {Gn(f)} indexed by C4(U) is given by

Gn(f) = n ∫ f(x) d(Fn(x) − F(x)), f ∈ C4(U).

Theorem 3.1.1 Suppose that

E|xij|⁶ ≤ M for all i, j.    (3.1)

Then the empirical process Gn = {Gn(f) : f ∈ C4(U)} converges weakly in finite dimensions to a Gaussian process G := {G(f) : f ∈ C4(U)} with mean function

EG(f) = ((κ − 1)/4) [f(2) + f(−2)] − ((κ − 1)/2) τ0(f) + (σ² − κ) τ2(f) + β τ4(f)

and covariance function

c(f, g) ≜ E[{G(f) − EG(f)}{G(g) − EG(g)}],

where f, g ∈ C4(U) and

V(t, s) = (σ² − κ + (1/2)βts) √((4 − t²)(4 − s²)) + κ log((4 − ts + √((4 − t²)(4 − s²))) / (4 − ts − √((4 − t²)(4 − s²)))).

Here {Tl, l ≥ 0} is the family of Chebyshev polynomials.

The strategy of the proof is to use Bernstein polynomials to approximate the functions in C4(U); this is done in Section 3.2, after which the problem is reduced to the analytic case. The truncation and re-normalization steps are given in Section 3.3. We derive the mean function of the limiting process in Section 3.4, and the convergence of the empirical processes is proved in Section 3.5.

3.2 Bernstein Polynomial Approximation

It is well known that if f̃(y) is a continuous function on the interval [0, 1], the Bernstein polynomials

Bm(f̃)(y) = Σ_{k=0}^{m} C(m, k) y^k (1 − y)^{m−k} f̃(k/m)

converge to f̃(y) uniformly on [0, 1] as m → ∞.
