

Infeasibility and Efficiency of Working Correlation Matrix

in Generalized Estimating Equations

NG WEE TECK

NATIONAL UNIVERSITY OF SINGAPORE

2005


Infeasibility and Efficiency of Working Correlation Matrix

in Generalized Estimating Equations

NG WEE TECK (B.Sc.(Hons), National University of Singapore)

A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE

DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY

NATIONAL UNIVERSITY OF SINGAPORE

2005


Contents

1 Introduction
1.1 Organization of this thesis
1.2 Preliminaries
1.3 Generalized Estimating Equations
1.4 Estimation of α using the moment method

2 Problems with Estimation of α

3 Methods for Estimating α
3.1 Quasi-Least Squares
3.2 Pseudolikelihood (Gaussian Estimation)
3.3 Cholesky Decomposition
3.4 Covariance of the estimates
3.5 Computation of Estimates
3.5.1 Algorithm for Quasi-Least Squares corrected for Bias
3.5.2 Algorithm for Gaussian (Pseudo-Likelihood) Method and Cholesky Method

4 Common Patterned Correlation Structures
4.1 Exchangeable/Equicorrelation Structure
4.2 AR(1) Structure
4.3 One-dependence Structure
4.4 Generalized Markov Correlation Structure

5 Simulation Studies
5.1 Conclusion


Acknowledgements

I would like to thank my advisor, Associate Professor Wang You-Gan, for his guidance over these two years. Without his patience and understanding when I was down, this piece of work would probably never have been completed.

My thanks also go out to all the professors in the department who have imparted knowledge in one way or another throughout my undergraduate and graduate days. They are an exceptional bunch of folks who have taught me not just academic matters but sometimes about life in general as well. I only hope that in future I will have more opportunities to learn from them.

This thesis would never have been completed without the help of some of my fellow students, and Yvonne's help with computer matters. To my dear friends from


PRC, I wish to thank you guys for helping me brush up my Chinese and teaching me mathematical terms in the Chinese language.

Lastly, I would like to dedicate this piece of work to my family, who have always been there for me at every step in the journey of life.

Carpe Diem

Ng Wee Teck
August 2005


List of Figures

5.1 Estimated MSE of $\hat{\alpha}$ and $\hat{\beta}_1$ for normal unbalanced data, True EXC, Working AR(1), K=25
5.2 Estimated MSE of $\hat{\alpha}$ and $\hat{\beta}_1$ for normal unbalanced data, True EXC, Working AR(1), K=100
5.3 Estimated MSE of $\hat{\alpha}$ and $\hat{\beta}_1$ for normal balanced data, True MA(1), Working AR(1), K=25
5.4 Estimated MSE of $\hat{\alpha}$ and $\hat{\beta}_1$ for normal balanced data, True MA(1), Working AR(1), K=100
5.5 Estimated MSE of $\hat{\alpha}$ and $\hat{\beta}_1$ for normal balanced data, True AR(1), Working EXC, K=25


5.6 Estimated MSE of $\hat{\alpha}$ and $\hat{\beta}_1$ for normal balanced data, True AR(1), Working EXC, K=100


List of Tables

5.1 Percentage of times infeasible answers occur, true correlation EXC and working AR(1)
5.2 Percentage of times infeasible answers occur, true correlation EXC and working AR(1)
5.3 Percentage of times infeasible answers occur, true correlation MA(1) and working AR(1)
5.4 Percentage of times infeasible answers occur, true correlation MA(1) and working AR(1)
5.5 Estimated MSE of $\hat{\alpha}$ for correlated Poisson data, True AR(1), Working AR(1)


5.6 Estimated MSE of $\hat{\beta}_1$ for correlated Poisson data, True AR(1), Working AR(1)
5.7 Estimated MSE of $\hat{\alpha}$ for correlated binary data, True EXC, Working AR(1)
5.8 Estimated MSE of $\hat{\beta}_1$ for correlated binary data, True EXC, Working AR(1)


Summary

Generalized estimating equations (GEEs) have been used extensively in the analysis of clustered and longitudinal data since the seminal paper by Liang & Zeger (1986). A key attraction of GEEs is that the regression parameter estimates remain consistent even when the 'working' correlation is misspecified. However, Crowder (1995) pointed out that there are problems with the estimation of the correlation parameters when the structure is misspecified, and this affects the consistency of the regression parameters as well. This issue has been addressed to a certain extent in a paper by Chaganty & Shults (1999); however, their estimates are asymptotically biased.

In this thesis, we aim to clarify some of these issues. Firstly, the feasibility of the estimators for the correlation parameters under misspecification, and secondly, the efficiency of the various methods of estimating the correlation parameters under misspecification, are investigated. Analytic expressions for the estimating functions using the decoupled Gaussian and Cholesky decomposition methods proposed by Wang & Carey (2004) are also provided for common correlation structures such as exchangeable, AR(1) and MA(1).


Chapter 1

Introduction

1.1 Organization of this thesis

The main objective of this thesis is to study the impact of misspecification of the correlation matrix on both the regression and correlation parameters in a Generalized Estimating Equations (GEE) setup. The structure is as follows. In Chapter 1 we give a brief introduction to GEE and also introduce some estimates for common correlation structures using the moment approach.

In Chapter 2, we describe the main problem and present examples of when the infeasibility problem sets in and breaks down the robustness property of GEE.

Chapter 3 describes other techniques for obtaining estimates of the correlation parameters using estimating equations; in particular, the three methods are quasi-least squares, the Gaussian (pseudo-likelihood) method and the Cholesky decomposition method.


1.2 Preliminaries

In this thesis, we assume the usual set-up for GEE: each response vector $y_i = (y_{i1}, \ldots, y_{in_i})'$ measured on subject $i = 1, \ldots, K$ is assumed to be independent between subjects. The vector of responses $y_i$ is measured at times $t_i = (t_{i1}, \ldots, t_{in_i})'$. For subject $i$ at time $j$, the response is denoted by $y_{ij}$ and has covariates $x_{ij}' = (x_{ij1}, \ldots, x_{ijp})$, $p$ being the number of regression parameters. We denote the expected value $E(y_{ij}) = \mu_{ij}$, which is linked to the covariates through $\mu_{ij} = g^{-1}(x_{ij}'\beta)$, where $\beta = (\beta_1, \ldots, \beta_p)'$ is the vector of regression parameters. The variance of an observation is $\mathrm{var}(y_{ij}) = \phi\sigma_{ij}^2$, where $\phi$ is an unknown dispersion parameter. The covariance matrix of $y_i$, denoted by $V_i$, is assumed to be of the form $\phi A_i^{1/2} R_i A_i^{1/2}$, with $A_i = \mathrm{diag}(\sigma_{ij}^2)$ and $R_i$ the correlation matrix.
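As a concrete illustration of this covariance decomposition, the sketch below (NumPy; the dispersion, variances and correlation values are all hypothetical) assembles $V_i = \phi A_i^{1/2} R_i A_i^{1/2}$ for one subject:

```python
import numpy as np

# Hypothetical ingredients for one subject with n_i = 3 observations
phi = 1.5                              # dispersion parameter phi
sigma2 = np.array([1.0, 2.0, 4.0])     # variances sigma_ij^2
rho = 0.4
R = np.array([[1.0, rho, rho],         # an exchangeable correlation R_i
              [rho, 1.0, rho],
              [rho, rho, 1.0]])

A_half = np.diag(np.sqrt(sigma2))      # A_i^{1/2}
V = phi * A_half @ R @ A_half          # V_i = phi * A^{1/2} R A^{1/2}

# The diagonal of V_i recovers the marginal variances phi * sigma_ij^2
print(np.allclose(np.diag(V), phi * sigma2))  # True
```

The off-diagonal entries are $\phi\,\rho_{jk}\,\sigma_{ij}\sigma_{ik}$, so the decomposition cleanly separates the variance model from the correlation model.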


The estimates of the correlation parameters will be denoted $\hat{\alpha}_{\text{method},\text{structure}}$, where method is a single letter describing the method used and structure identifies the correlation structure under study. For example, $\hat{\alpha}_{M,\mathrm{AR}(1)}$ indicates a moment estimator for an AR(1) structure.

1.3 Generalized Estimating Equations

In a seminal paper, Liang & Zeger (1986) introduced Generalized Estimating Equations (GEE), which extend Generalized Linear Models (GLM; McCullagh & Nelder (1989)) to a multivariate setting with correlated data. An important contribution of Liang & Zeger (1986) is that they incorporated the information inherently present in the correlation structure of longitudinal data into the estimating functions. The theoretical justifications and asymptotic properties of the resulting estimators are also presented in that paper.

One of the key features that has encouraged the use of GEEs in clustered and longitudinal data analysis is that the regression parameter estimates ($\hat{\beta}$) remain consistent even if the 'working' correlation or covariance structure is misspecified. What is meant by a 'working' correlation matrix is the following: in practice we do not know the true correlation structure of the data. However, in the GEE


framework, we only need to specify some structure that is a good approximation, and we call that a 'working' correlation structure. There is no need to have complete knowledge of the true correlation; a 'working' correlation structure suffices for estimating the regression parameters. Throughout this thesis, the true correlation structure will be denoted by $R_i$ and the 'working' correlation by $\bar{R}_i$. Although the theory of GEE indicates that we only need a 'working' correlation structure, we can expect that if the correlation or covariance structure is modeled accurately, statistical inference on the regression parameters will be improved in terms of smaller standard errors or improved asymptotic relative efficiency (Wang & Carey (2003)).

The results obtained in Liang & Zeger (1986) are asymptotic; thus in a finite-sample setting, or when the number of subjects available is small, there is an obvious need to model the correlation structure properly owing to the lack of information. Furthermore, rather than regarding the correlation and covariance parameters as nuisance parameters, there are instances when these parameters are of scientific interest, e.g. in genetic studies. Lastly, we emphasize the importance of proper modelling of the correlation parameters: a gross misspecification of the structure may lead to infeasible results. This is in fact the main concern of this thesis and is explained in further detail in Chapter 2.


We next describe briefly the optimality of GEEs along the lines of the classic Gauss-Markov theorem. Suppose we have i.i.d. observations $y_i$ with $E(y_i) = \mu$ and $\mathrm{Var}(y_i) = \sigma^2$. The Gauss-Markov theorem states that the estimated regression parameter vector is the best linear unbiased estimator (BLUE). For example, if $y = X\beta + \varepsilon$ with $E(y) = X\beta$ and $\mathrm{Cov}(y) = \sigma^2 I$, then $\hat{\beta}_{BLUE} = (X'X)^{-1}X'y$ has minimum variance among all linear unbiased estimators of $\beta$. Another way to look at this problem is that we are interested in finding a matrix $A$ such that $E(Ay) = \beta$ and $\mathrm{Cov}(Ay)$ has minimum variance among all estimators of this type. It can be shown that $A = (X'X)^{-1}X'$ satisfies the two conditions above.
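The closed form above is easy to check numerically; a small sketch with simulated data (the design, coefficients and noise scale are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated linear model y = X beta + eps with Cov(y) = sigma^2 I
K, p = 200, 3
X = np.column_stack([np.ones(K), rng.normal(size=(K, p - 1))])
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(scale=0.1, size=K)

# BLUE: beta_hat = (X'X)^{-1} X' y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# A = (X'X)^{-1} X' satisfies A X = I, hence E(Ay) = beta (unbiasedness)
A = np.linalg.solve(X.T @ X, X.T)
print(np.allclose(A @ X, np.eye(p)))  # True
```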

Under the independence assumption, suppose we have $E(y_{ij}) = \mu_{ij}(\beta)$, $\mathrm{Var}(y_{ij}) = \nu_{ij}/\phi$, and the design matrix is $X_i = (x_{i1}, \ldots, x_{in_i})'$. The score equations from a likelihood analysis have the form
$$U_I(\beta) = \sum_{i=1}^{K} X_i' \Delta_i A_i^{-1} (y_i - \mu_i) = 0, \qquad (1.1)$$
where $\Delta_i = \mathrm{diag}(d\mu_{ij}/d\eta_{ij})$. Denote the solution of (1.1) by $\hat{\beta}_I$.

It can be shown that the asymptotic variance of $\hat{\beta}_I$ has the sandwich form
$$\mathrm{Cov}(\hat{\beta}_I) = M_0^{-1}\left(\sum_{i=1}^{K} X_i'\Delta_i A_i^{-1}\,\mathrm{Cov}(y_i)\,A_i^{-1}\Delta_i X_i\right) M_0^{-1}, \qquad M_0 = \sum_{i=1}^{K} X_i'\Delta_i A_i^{-1}\Delta_i X_i.$$


As an extension to (1.1), the GEE approach involves solving the estimating function
$$U(\beta, \alpha) = \sum_{i=1}^{K} X_i' \Delta_i V_i^{-1} (y_i - \mu_i) = 0, \qquad (1.2)$$
where $V_i = \phi A_i^{1/2} \bar{R}_i A_i^{1/2}$ and $\bar{R}_i$ is the 'working' correlation matrix of the $i$th subject.

The key difference between the independence and GEE setups is the extension of the uncorrelated response vector in GLM to the multivariate response in GEE: GEE includes the information from the correlation matrix $R_i$, which models the correlation within each subject/cluster. Note that GEE reduces to the independence equation when we specify $R_i = I$. This approach is also similar to the function derived from the quasi-likelihood approach proposed by Wedderburn (1974) and McCullagh (1983). The optimality of estimators arising from quasi-likelihood functions is also shown in those two papers; in particular, McCullagh (1983) shows there is a close connection between the asymptotic optimality of quasi-likelihood estimators and the Gauss-Markov theorem.

In general, since $R_i$ is unknown, we use $\bar{R}_i(\alpha)$ as a 'working' correlation matrix, where $\alpha$ is a $q \times 1$ vector of unknown parameters on which the correlation matrix depends. We write the matrix $\bar{R}$ as a function of $\alpha$ because we cannot be sure that we have the correct model; thus it is appropriate to write it as a function of


$\alpha$; $\alpha_i$ itself is some function of the true correlation parameter $\rho_i$. Note that $q$ can take values from 0 (independence model) to $n_i(n_i-1)/2$ in the case of a fully unstructured correlation matrix.

If $R = I$ (independence model), GEE reduces to the usual GLM setup. Below are some common choices for $R$; the motivation for these structures can be found in Crowder & Hand (1990).

1. Independence: $R = I$, the $n_i \times n_i$ identity matrix. This structure implies that the measurements on the $i$th subject are independent within the subject itself, i.e., $y_{ij}$ is independent of $y_{ik}$ for all $j \neq k$.

2. AR(1). For lattice-based observations, we can sometimes expect the correlations between observations within the same subject to decrease over time. A


simple way to model such a phenomenon is to allow the correlations to decrease geometrically, at a rate of $\rho$ per time point, so that the lag-$k$ correlation is $\rho^{k}$.

3. Exchangeable (EXC) or compound symmetry. In clustered data, for example in teratological studies, we expect the offspring of a female rat in the same litter to share the same correlation $\rho$ for the traits being measured; thus this structure comes in handy.


4. MA(m) or m-dependence. Suppose we observe that the correlations decrease at each time point depending on how far apart the observations are (typically $\rho_m < \cdots < \rho_1$), but the correlation drops to zero when observations are more than $m$ time points apart. This phenomenon can be modeled using an MA(m) structure, which is essentially a banded matrix with bandwidth $m$. For example, when $m = 1$, we have ones on the diagonal and $\rho$ on the off-diagonals immediately above and below the main diagonal.
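The structures above can be generated directly; a sketch (NumPy; the dimensions and parameter values are illustrative):

```python
import numpy as np

def exchangeable(n, rho):
    """EXC / compound symmetry: ones on the diagonal, rho everywhere else."""
    return (1 - rho) * np.eye(n) + rho * np.ones((n, n))

def ar1(n, rho):
    """AR(1): correlations decay geometrically, corr(y_j, y_k) = rho**|j-k|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def ma1(n, rho):
    """MA(1) / 1-dependence: banded matrix with bandwidth 1."""
    return np.eye(n) + rho * (np.eye(n, k=1) + np.eye(n, k=-1))

# The independence structure is simply np.eye(n).
print(np.allclose(ar1(4, 0.5)[0], [1.0, 0.5, 0.25, 0.125]))  # True
```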


1.4 Estimation of α using the moment method

To estimate $\alpha$, Liang & Zeger (1986) proposed the moment approach. Let $\hat{\varepsilon}_i = A_i^{-1/2}(y_i - \mu_i(\hat{\beta}))$, where $\hat{\beta}$ is the vector of estimated regression coefficients, and let $\hat{\varepsilon}_{ij}$ be the $j$th element of $\hat{\varepsilon}_i$. We have
$$\hat{\varepsilon}_{ij} = \frac{y_{ij} - \mu_{ij}}{\sigma_{ij}(\hat{\beta})} \sim N(0, \phi),$$
and a general moment-based approach (Wang & Carey (2003)) is to solve the corresponding moment equations (1.3). For the AR(1) model, the moment method is to solve equation (1.4), where $\hat{\phi} = (N-p)^{-1}\sum_{i=1}^{K}\sum_{j=1}^{n_i}\hat{\varepsilon}_{ij}^2$ (based on the usual Pearson residuals), $N = \sum_{i=1}^{K} n_i$, $K$ is the number of subjects, $p$ is the number of covariates and $n_i$ is the number of observations per subject/cluster. However, there are problems with this method (Crowder (1995)) when the correlation structure is misspecified; details will be discussed in Chapter 2.


In fact, for the AR(1) or MA(1) model, we can estimate $\alpha$ using all pairs lagged by one unit of observation time, resulting in the estimate (1.5). This is akin to the Burg-type estimator in econometric time series analysis, and it is well documented in the time-series literature (Pourahmadi (2001)). Its derivation is along the lines of information theory, and it is a maximum entropy estimator of $\rho$.

We next show that the moment estimator in (1.5) is always well defined. Without loss of generality, we only need to consider the inner summation.
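Equation (1.5) itself is not reproduced in this excerpt, but the standard Burg-type lag-1 form (which may differ in detail from the thesis's (1.5)) illustrates the well-definedness claim: by $|2ab| \le a^2 + b^2$, the ratio always lies in $[-1, 1]$. A sketch with simulated AR(1) residuals (all simulation settings hypothetical):

```python
import numpy as np

def burg_lag1(residuals):
    """Burg-type lag-1 estimate of rho from per-subject Pearson residuals.

    residuals: list of 1-D arrays, one array per subject
    (subjects may have different numbers of observations).
    """
    num, den = 0.0, 0.0
    for e in residuals:
        num += 2.0 * np.sum(e[:-1] * e[1:])      # lag-1 cross-products
        den += np.sum(e[:-1] ** 2 + e[1:] ** 2)  # always >= |num| by AM-GM
    return num / den

rng = np.random.default_rng(1)
rho, K, n = 0.6, 500, 10                         # true lag-1 correlation
res = []
for _ in range(K):
    e = np.empty(n)
    e[0] = rng.normal()
    for j in range(1, n):                        # simulate AR(1) residuals
        e[j] = rho * e[j - 1] + rng.normal(scale=np.sqrt(1 - rho**2))
    res.append(e)

rho_hat = burg_lag1(res)
print(abs(rho_hat) <= 1.0)  # True: the estimate is always feasible
```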


Chapter 2

Problems with Estimation of α

Liang & Zeger (1986) proved that $\hat{\beta}$ is consistent even if the 'working' correlation structure differs from the true correlation structure, provided that the mean structure is correctly modeled and $\hat{\alpha}$ and $\hat{\phi}$ are consistently estimated. However, Crowder (1995) pointed out that the working correlation $\bar{R}_i(\alpha)$ may lead to infeasible results if specified incorrectly; this leads to a breakdown of the asymptotic theory for $\hat{\beta}$. The following two examples illustrate how the lack of consistency of $\hat{\alpha}$ under misspecification can affect the estimation of $\hat{\beta}$.

Example 1

To illustrate the pitfalls of the moment method considered by Liang & Zeger (1986), consider (1.4) with $n_i = 3$, where the working correlation is AR(1) and the true correlation is exchangeable. To find $\hat{\alpha}$, we have to solve the following equation:


This quantity can be thought of as an estimator of $\rho$ when the correlation structure is correctly specified. Note that for an exchangeable correlation structure of dimension 3, $\rho \in (-1/2, 1)$. Hence a problem arises when $-1/2 < \rho < -1/3$, as the solution of (2.2) is then infeasible. Even if all the roots of equation (1.4) lie in the feasible range, there remains the problem of choosing the 'correct' solution.
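The dimension-3 feasible range quoted above can be verified numerically by checking positive definiteness of the exchangeable matrix (a sketch; the probe values of $\rho$ are arbitrary):

```python
import numpy as np

def exc3(rho):
    """Dimension-3 exchangeable correlation matrix."""
    return (1 - rho) * np.eye(3) + rho * np.ones((3, 3))

def is_feasible(rho):
    """Positive definite iff the smallest eigenvalue is positive."""
    return np.linalg.eigvalsh(exc3(rho)).min() > 0

# Eigenvalues are 1 + 2*rho (once) and 1 - rho (twice),
# so the matrix is positive definite exactly for -1/2 < rho < 1.
print(is_feasible(-0.4))   # True:  inside (-1/2, 1)
print(is_feasible(-0.6))   # False: outside the feasible range
```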

Example 2

Assume now that the working correlation is AR(1), the true correlation is MA(1), and $n_i = 3$. Using (2.1) and taking expectations again, we have


the feasible range
$$\frac{1}{-2\cos(\pi/4)} \le \rho \le \frac{1}{-2\cos(3\pi/4)}, \quad \text{i.e.,} \quad -\frac{1}{\sqrt{2}} \le \rho \le \frac{1}{\sqrt{2}}.$$
Therefore, if the true $\rho$ is less than $-1/2$, there is a positive probability that $-1/\sqrt{2} \le \hat{\rho} \le -1/2$, and we run into the same problem as in Example 1, where the estimator $\hat{\alpha}$ would be undefined.

Example 3

In this example, we assume that the true structure is autoregressive and the working correlation is exchangeable. Using the moment estimators in Liang & Zeger (1986), we have

$$\hat{\rho}_{jk} = \frac{1}{(n-p)\hat{\phi}} \sum_{i} \hat{\varepsilon}_{ij}\hat{\varepsilon}_{ik},$$
obtained by approximating $\varepsilon_{ij}\varepsilon_{ik}$ by its expectation when $n$ is large. Assuming that $\hat{\phi}$ tends to $\phi$ and using the average correlation to estimate $\alpha$,


where $d = (n-p)n(n-1)/2$. Observe that $\hat{\alpha} \to n/(n-p) \approx 1$ as $\rho \to 1$. Thus, although $\hat{\alpha}$ exists and converges as $n$ tends to infinity, it is not consistent for any recognizable 'true parameter' underlying the stochastic structure, because of its dependence on the sample size $n$.


Chapter 3

Methods for Estimating α

Apart from the moment method described in Chapter 1, various authors have used estimating equations to estimate the correlation parameters. In this chapter, we present three such methods from the literature. The estimating functions introduced in this chapter are denoted $U_Q$, $U_G$ and $U_C$ for the Quasi-Least Squares, Gaussian and Cholesky methods respectively.


3.1 Quasi-Least Squares

Consider writing $y_i = \mu_i(\beta) + A_i^{1/2}\varepsilon_i$, where $E(\varepsilon_i) = 0$ and $E(\varepsilon_i\varepsilon_i') = \phi R_i$.

Thus we have the following estimating equation, and the objective is to find the $\hat{\alpha}_q$ ($q$ for Quasi-Least Squares) satisfying it:

In that paper, the author showed that for commonly used correlation structures such as exchangeable, tridiagonal, AR(1) and unstructured $R_i$'s, there exists a solution in the space where the correlation matrix is positive definite. However, the drawback is that $\hat{\alpha}$ is asymptotically biased even when the correlation structure is correctly specified. The estimating equation for $\hat{\beta}$ is identical to that proposed in Liang & Zeger (1986); thus $\hat{\beta}$ is asymptotically consistent. If the investigator is only interested in the regression coefficients, QLS offers a feasible method for obtaining sensible results.

In a follow-up paper, Chaganty & Shults (1999) modified their QLS method to obtain a consistent estimator of $\rho$. Suppose the relationship between $\alpha$ and $\rho$


is $\alpha = f(\rho)$, where $f$ is a continuous, one-to-one function. Denote by $\hat{\alpha}_q$ the Quasi-Least Squares estimate of $\alpha$; then $\hat{\rho}_q = f^{-1}(\hat{\alpha}_q)$ is a consistent estimate of $\rho$. Note that the working matrix $\bar{R}_i$ can be of any structure, but clearly the number of parameters in $\bar{R}_i$ must equal the number in the true $R_i$. This technique also assumes that the working correlation is correctly specified; in this thesis we carry out studies to investigate the impact of misspecification of the working correlation matrix.

In the Chaganty & Shults (1999) paper, the authors also noted that the limiting value of $\hat{\alpha}$ under an AR(1) working correlation takes the same known form regardless of whether the true correlation structure $R$ is AR(1), MA(1), exchangeable or independent. Thus, they propose setting the working correlation matrix to AR(1) and using the bias-corrected estimate (denoted by $cq$, for Corrected Quasi-Least Squares)


$$\hat{\alpha}_{cq} = \frac{2\hat{\alpha}_q}{1 + \hat{\alpha}_q^2},$$
since AR(1), MA(1), exchangeable and independence are the most commonly used correlation models for analyzing balanced and equally spaced data.
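The correction is a one-line transformation of the QLS estimate; a quick sketch (the input value below is hypothetical):

```python
def corrected_qls(alpha_q: float) -> float:
    """Bias-corrected QLS estimate under a working AR(1):
    alpha_cq = 2*alpha_q / (1 + alpha_q**2)."""
    return 2.0 * alpha_q / (1.0 + alpha_q ** 2)

alpha_q = 0.25                    # hypothetical uncorrected QLS estimate
print(corrected_qls(alpha_q))     # 2*0.25 / 1.0625, i.e. about 0.4706
```

Note that the map fixes 0 and $\pm 1$ and is monotone on $[-1, 1]$, so corrected estimates stay within the valid correlation range.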

In the case of data that are unbalanced, with $n_i$ measurements per subject at irregularly spaced times $t_{i1}, t_{i2}, \ldots, t_{in_i}$ for $i = 1, \ldots, K$, the authors suggested using Markov and generalized Markov structures for modelling the correlation. The bias-corrected estimates can be obtained analogously, but there are no closed-form solutions and the estimates have to be computed numerically.

3.2 Pseudolikelihood (Gaussian Estimation)

To correct for the bias in the QLS method, we evaluate the bias of (3.1) as,


For a given $\hat{\beta}$, by minimising
$$\sum_i \left\{ \phi^{-1}\, \hat{\varepsilon}_i' R_i^{-1} \hat{\varepsilon}_i + \log |\phi R_i| \right\},$$
we can see the relation of estimating function (3.3) to the Gaussian distribution: it can be obtained by minimising $-2$ times the Gaussian log-likelihood. It can be shown that this estimating function is unbiased even though $\hat{\varepsilon}$ is not Gaussian (Crowder (1985)).

Another way to derive (3.3) is through the generalized least squares method, in which $\beta$ is treated as known in the covariance (weighting) function. Thus, the bias-corrected version of Quasi-Least Squares, together with $U(\beta, \alpha)$, can be viewed as (Gaussian) pseudo-likelihood (Carroll & Ruppert, 1988, §3.2; Davidian & Giltinan, 1995, §2.2-2.3).

Since the parameter $\beta$ appears in both the mean and variance functions, we treat the $\beta$ in the variance function as known (or distinct from the $\beta$ in the mean function) to avoid complications in minimising the log-likelihood function; this technique is known as "decoupling". Another advantage of decoupling the parameters is that $\hat{\beta}$ remains consistent even when the working correlation is misspecified, which is not the case when they are not decoupled. It is clear from the estimating function that we also have to estimate the scale parameter $\phi$.

More generally, instead of using the Gaussian likelihood as a vehicle for estimation, it might be of interest to try non-Gaussian distributions in the


estimation procedure. Possible candidates include the multivariate $t$ (Lange, Little & Taylor (1989)) and the multivariate skew-normal distribution (Azzalini, …).


3.3 Cholesky Decomposition

Thus, we can use $U_{C1}(\alpha_j, \beta)$ as another unbiased estimating function for $\alpha_j$. Analogous expressions to $U_{C1}$ and $U_{\Gamma 1}$ can be obtained by decomposing $R_i^{-1} = U_i' D_i^{*} U_i$, where $U_i$ is an upper triangular matrix and $D_i^{*}$ a diagonal matrix. Denote the estimating functions derived from this decomposition by $U_{C2}$ and $U_{\Gamma 2}$, and let $U_C = U_{C1} + U_{C2}$. The performance of these estimating functions in a finite-sample setting will be investigated in Chapter 5.

3.4 Covariance of the estimates

Let $U(\theta) = (U(\beta), U(\alpha))'$ be the joint estimating function, with covariance


For the Cholesky decomposition method,


3.5.2 Algorithm for Gaussian (Pseudo-Likelihood) Method and Cholesky Method

I. Initialization. Denote the estimating function by $U_W$, where $W$ is either $G$ for the Gaussian method or $C$ for the Cholesky method. $\hat{\beta}^{(0)}$ is computed under a working independence model ($R_i = I_{n_i}$) by setting $\hat{\alpha}^{(0)} = 0$.

II. Computation of estimates

(a) Compute $\hat{\varepsilon}_i^{(k)}$ using $\hat{\beta}^{(k-1)}$.


(b) Solve $U_W(\alpha, \hat{\beta}^{(k-1)}) = 0$ (equation 3.3 or 3.4) to obtain $\hat{\alpha}^{(k)}$.

(c) Find $\hat{\beta}^{(k)}$ by solving $U(\beta, \hat{\alpha}^{(k)}) = 0$.

(d) If $\|(\hat{\beta}^{(k)}, \hat{\alpha}^{(k)})' - (\hat{\beta}^{(k-1)}, \hat{\alpha}^{(k-1)})'\|$ is less than the specified tolerance, stop; otherwise continue until convergence.

(e) Suppose convergence is achieved at iteration $n$; stop the iteration and return $(\hat{\beta}^{(n)}, \hat{\alpha}^{(n)})'$.
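The alternating scheme above can be sketched as follows. Here `solve_beta` and `solve_alpha` are hypothetical placeholders for root-finders of $U$ and $U_W$ (in practice each would be a Newton or fixed-point step); the toy "solvers" used in the demonstration are a made-up contraction mapping, chosen only so the loop converges:

```python
import numpy as np

def fit_gee(solve_beta, solve_alpha, tol=1e-8, max_iter=100):
    """Alternating scheme of Section 3.5.2 (sketch).

    solve_beta(alpha) -> beta solving U(beta, alpha) = 0
    solve_alpha(beta) -> alpha solving U_W(alpha, beta) = 0
    Both solvers are supplied by the caller (placeholders here).
    """
    beta = solve_beta(0.0)                 # step I: working independence
    alpha = 0.0
    for _ in range(max_iter):
        alpha_new = solve_alpha(beta)      # step II(b)
        beta_new = solve_beta(alpha_new)   # step II(c)
        delta = np.linalg.norm(np.append(beta_new - beta, alpha_new - alpha))
        beta, alpha = beta_new, alpha_new
        if delta < tol:                    # step II(d): convergence check
            break
    return beta, alpha                     # step II(e)

# Toy illustration with hypothetical contraction-mapping "solvers"
beta_hat, alpha_hat = fit_gee(
    solve_beta=lambda a: np.array([1.0 + 0.1 * a, 2.0]),
    solve_alpha=lambda b: 0.5 * float(np.mean(b)),
)
print(round(alpha_hat, 6))  # 0.769231 (fixed point 0.75/0.975)
```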


Chapter 4

Common Patterned Correlation Structures

4.1 Exchangeable/Equicorrelation Structure

$R(\alpha) = (1-\rho)I_n + \rho\,\mathbf{1}\mathbf{1}'$, where $\rho \in (-1/(n-1), 1)$ and $\mathbf{1}$ is an $n$-vector of ones. Next, we will show that
$$R^{-1}(\alpha) = \frac{1}{1-\rho}\, I_n - \frac{\rho}{(1-\rho)(1+(n-1)\rho)}\, \mathbf{1}\mathbf{1}'.$$
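The claimed closed-form inverse is easy to verify numerically (the values of $n$ and $\rho$ below are arbitrary):

```python
import numpy as np

n, rho = 5, 0.3
I = np.eye(n)
J = np.ones((n, n))                  # J = 1 1'

R = (1 - rho) * I + rho * J          # exchangeable R(alpha)
R_inv = I / (1 - rho) - rho * J / ((1 - rho) * (1 + (n - 1) * rho))

print(np.allclose(R @ R_inv, I))     # True: the closed form is the inverse
```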
