
Longitudinal data analysis


DOCUMENT INFORMATION

Basic information

Title: Longitudinal Data Analysis
Authors: Garrett Fitzmaurice, Marie Davidian, Geert Verbeke, Geert Molenberghs
Institution: Harvard School of Public Health
Field: Biostatistics
Type: edited book
Year of publication: 2008
City: Boston
Format
Pages: 633
File size: 9.02 MB
Attachment: 142.LONGITUDINAL WITH STATA.rar (3 MB)


Longitudinal Data Analysis


The objective of the series is to provide high-quality volumes covering the state-of-the-art in the theory and applications of statistical methodology. The books in the series are thoroughly edited and present comprehensive, coherent, and unified summaries of specific methodological topics from statistics. The chapters are written by the leading researchers in the field, and present a good balance of theory and application through a synthesis of the key methodological developments and examples and case studies using real data.

The scope of the series is wide, covering topics of statistical methodology that are well developed and find application in a range of scientific disciplines. The volumes are primarily of interest to researchers and graduate students from statistics and biostatistics, but also appeal to scientists from fields where the methodology is applied to real problems, including medical research, epidemiology and public health, engineering, biological science, environmental science, and the social sciences.

Chapman & Hall/CRC Handbooks of Modern Statistical Methods

Longitudinal Data Analysis

Edited by Garrett Fitzmaurice, Marie Davidian, Geert Verbeke, and Geert Molenberghs


Chapman & Hall/CRC

Taylor & Francis Group

6000 Broken Sound Parkway NW, Suite 300

Boca Raton, FL 33487-2742

© 2009 by Taylor & Francis Group, LLC

Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper

10 9 8 7 6 5 4 3 2 1

International Standard Book Number-13: 978-1-58488-658-7 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Longitudinal data analysis / editors, Garrett Fitzmaurice [et al.].

p. cm. (Chapman and Hall/CRC series of handbooks of modern statistical methods) Includes bibliographical references and index.

ISBN 978-1-58488-658-7 (hardback : alk. paper)

1. Longitudinal method. 2. Multivariate analysis. 3. Regression analysis. I. Fitzmaurice, Garrett M., 1962- II. Title. III. Series.

Contents

Preface

Editors

Contributors

PART I: Introduction and Historical Overview

Chapter 1 Advances in longitudinal data analysis: An historical perspective

Garrett Fitzmaurice and Geert Molenberghs

PART II: Parametric Modeling of Longitudinal Data

Introduction and overview

Garrett Fitzmaurice and Geert Verbeke

for longitudinal data analysis

Stuart Lipsitz and Garrett Fitzmaurice

Chapter 4 Generalized linear mixed-effects models

Sophia Rabe-Hesketh and Anders Skrondal

Chapter 5 Non-linear mixed-effects models

Marie Davidian

with non-Gaussian random effects

Bengt Muthén and Tihomir Asparouhov

Chapter 7 Targets of inference in hierarchical models for longitudinal data

Stephen W. Raudenbush

PART III: Non-Parametric and Semi-Parametric Methods for Longitudinal Data

methods: Introduction and overview

Xihong Lin and Raymond J. Carroll

methods for longitudinal data

Xihong Lin and Raymond J. Carroll

Chapter 12 Penalized spline models for longitudinal data

Babette A. Brumback, Lyndia C. Brumback, and Mary J. Lindstrom

PART IV: Joint Models for Longitudinal Data

and overview

Geert Verbeke and Marie Davidian

longitudinal data

Christel Faes, Helena Geys, and Paul Catalano

of repeated-measurement and time-to-event outcomes

Peter Diggle, Robin Henderson, and Peter Philipson

Chapter 16 Joint models for high-dimensional longitudinal data

Steffen Fieuws and Geert Verbeke

PART V: Incomplete Data

Chapter 17 Incomplete data: Introduction and overview

Geert Molenberghs and Garrett Fitzmaurice

Chapter 18 Selection and pattern-mixture models

Roderick Little

Chapter 19 Shared-parameter models

Paul S. Albert and Dean A. Follmann

Chapter 20 Inverse probability weighted methods

Andrea Rotnitzky

Chapter 21 Multiple imputation

Michael G. Kenward and James R. Carpenter

Chapter 22 Sensitivity analysis for incomplete data

Geert Molenberghs, Geert Verbeke, and Michael G. Kenward

exposures

Author Index

Subject Index

Preface

Longitudinal studies play a prominent role in the health, social, and behavioral sciences, as well as in public health, biological and agricultural sciences, education, economics, and marketing. They are indispensable to the study of change in an outcome over time. By measuring study participants repeatedly through time, longitudinal studies allow the direct study of temporal changes within individuals and the factors that influence change. Because the study of change is so fundamental to almost every discipline, there has been a steady growth in the number of studies using longitudinal designs. Moreover, the designs of many recent longitudinal studies have become increasingly complex.

There is a wide variety of challenges that arise in analyzing longitudinal data. By their very nature, the repeated measures arising from longitudinal studies are multivariate and have a complex random-error structure that must be appropriately accounted for in the analysis. Longitudinal studies also vary in the types of outcomes of interest. Although linear models have been the dominant approach for the analysis of longitudinal data when the outcome is continuous, in many applications the pattern of change is more faithfully characterized by a function that is non-linear in the parameters. In other settings, parametric models for longitudinal data are not sufficiently flexible to adequately capture the complex patterns of change in the outcome and their relationships to covariates; instead, more flexible functional forms are required. When the outcome of interest is discrete, there are broad classes of longitudinal models that may be suitable for analysis. However, there are distinctions among these models not only in approach, but in their relative targets of inference as well. As a result, greater care is required in the modeling of discrete longitudinal data. Another issue that complicates the analysis is the inclusion of time-varying covariates in models for longitudinal data. Longitudinal studies permit repeated measures not only of the outcome, but also of the covariates. The incorporation of covariates that change stochastically over time poses many intricate and complex analytic issues. Finally, longitudinal studies are also more prone to problems of missing data and attrition. The appropriate handling of missing data continues to pose one of the greatest challenges for the analysis of longitudinal data. These, and many other issues, increase the complexity of longitudinal data analysis.

The last 20 years have seen many remarkable advances in statistical methodology for analyzing longitudinal data. Although there are a number of books describing statistical models and methods for the analysis of longitudinal data, to date there is no volume that provides a comprehensive, coherent, unified, and up-to-date summary of the major advances. This has provided the main impetus for Longitudinal Data Analysis. This book constitutes a carefully edited collection of chapters that synthesize the state of the art in the theory and application of longitudinal data analysis. The book is comprised of 23 expository chapters, dealing with five broad themes. These chapters have been written by many of the world’s leading experts in the field. Each chapter integrates and illustrates important research threads in the statistical literature, rather than focusing on a narrowly defined topic. Each part of the book begins with an introductory chapter that provides useful background material and a broad overview to set the stage for subsequent chapters. The book combines a good blend of theory and applications; many of the chapters include examples and case studies using data sets drawn from various disciplines. Many of the data sets used to illustrate methods can be downloaded from the Web site for the book (http://www.biostat.harvard.edu/~fitzmaur/lda), as can sample source code for fitting certain models.

Although our coverage of topics in the book is quite broad, it is certainly not complete. Our selection of topics required judicious choices to be made; we have decided to place greater emphasis on statistical models and methods that we think likely to endure. The book is intended to have a broad appeal. It should be of interest to all statisticians involved either in the development of methodology or the application of new and advanced methods to longitudinal research. We anticipate that the book will also be of interest to quantitatively oriented researchers from various disciplines.

Finally, the compilation of this book would not have been possible without the willingness, persistence, and dedication of each of the contributing authors; we thank them wholeheartedly for their tremendous efforts and the excellent quality of the chapters they have written. We would also like to thank the many friends and colleagues who have helped us produce this book. A special word of thanks to Butch Tsiatis and Nan Laird who reviewed several chapters and provided insightful feedback. Last, but not least, we thank Rob Calver, Acquiring Editor at Chapman & Hall/CRC Press of Taylor & Francis, for encouragement to undertake this project. The original seeds of this book arose from conversations Rob Calver had with a number of distinguished colleagues. We are grateful to all, most particularly to Rob, for his strong belief in the project and his enthusiasm and perseverance to see the project through from beginning to end.

Editors

Garrett Fitzmaurice is Associate Professor of Psychiatry (Biostatistics) at the Harvard Medical School, Associate Professor in the Department of Biostatistics at the Harvard School of Public Health, and Foreign Adjunct Professor of Biostatistics at the Karolinska Institute, Sweden. He is a Fellow of the American Statistical Association and a member of the International Statistical Institute. He has served as Associate Editor for Biometrics, the Journal of the Royal Statistical Society, Series B, and Biostatistics; currently, he is Statistics Editor for the journal Nutrition. His research and teaching interests are in methods for analyzing longitudinal and repeated measures data. A major focus of his methodological research has been on the development of statistical methods for analyzing repeated binary data and for handling the problem of attrition in longitudinal studies. Much of his collaborative research has concentrated on applications to mental health research, broadly defined. He has co-authored the textbook Applied Longitudinal Analysis (Wiley, 2004) and received the American Statistical Association’s Excellence in Continuing Education Award for a short course on longitudinal analysis at the Joint Statistical Meetings in 2006.

Marie Davidian is William Neal Reynolds Distinguished Professor of Statistics at North Carolina State University and Adjunct Professor of Biostatistics and Bioinformatics at Duke University. She is a Fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the American Association for the Advancement of Science. She has served as an Associate Editor for Biometrics and the Journal of the American Statistical Association, and was Coordinating Editor of Biometrics in 2000–2002. She is currently Executive Editor of Biometrics. Her research interests include the development of methods for analysis of longitudinal data arising in contexts such as pharmacokinetics, where non-linear, often mechanistically based models for individual behavior are used; and for joint modeling and analysis of longitudinal data and time-to-event outcomes.

Geert Verbeke is a Professor of Biostatistics at the Biostatistical Centre of the Katholieke Universiteit Leuven in Belgium. He has published a number of methodological articles on various aspects of models for longitudinal data analyses, with particular emphasis on mixed models. He held a visiting position in the Department of Biostatistics of the Johns Hopkins University in Baltimore, MD, as well as in the affiliated Institute of Gerontology. He is Past President of the Belgian Region of the International Biometric Society, International Program Chair for the International Biometric Conference in Montreal (2006), and Joint Editor of the Journal of the Royal Statistical Society, Series A (2005–2008). He has served as Associate Editor for several journals, including Biometrics and Applied Statistics. He is a Fellow of the American Statistical Association.

Geert Molenberghs is Professor of Biostatistics at the Universiteit Hasselt in Belgium. He received a B.S. degree in mathematics (1988) and a Ph.D. in biostatistics (1993) from the Universiteit Antwerpen. He published methodological work on surrogate markers in clinical trials, categorical data, longitudinal data analysis, and the analysis of non-response in clinical and epidemiological studies. He served as Joint Editor for Applied Statistics (2001–2004) and as Associate Editor for several journals, including Biometrics and Biostatistics. He was President of the International Biometric Society (2004–2005) and later Vice-President (2006). He was elected a Fellow of the American Statistical Association and received the Guy Medal in Bronze from the Royal Statistical Society. He is an elected member of the International Statistical Institute. He has held visiting positions at the Harvard School of Public Health (Boston). He has co-authored a book on surrogate marker evaluation in clinical trials (Springer, 2005) and on incomplete data in clinical studies (Wiley, 2007).

Geert Molenberghs and Geert Verbeke have co-authored monographs on linear mixed models for longitudinal data (Springer, 2000) and on models for discrete longitudinal data (Springer, 2005). They received the American Statistical Association’s Excellence in Continuing Education Award, based on short courses on longitudinal and incomplete data at the Joint Statistical Meetings of 2002, 2004, and 2005.

Contributors

Rockville, Maryland

Tihomir Asparouhov, Muthén & Muthén

Los Angeles, California

Babette A. Brumback, Division of Biostatistics, University of Florida

Raymond Carroll, Department of Statistics, Texas A&M University

College Station, Texas

Boston, Massachusetts

University

United Kingdom

Diepenbeek, Belgium

Leuven, Belgium

Dean A. Follmann, Biostatistics Research Branch, National Institute of Allergy & Infectious Diseases

Bethesda, Maryland

Michael G. Kenward, Medical Statistics Unit, London School of Hygiene & Tropical Medicine

United Kingdom

Boston, Massachusetts

Mary J. Lindstrom, Department of Biostatistics & Medical Informatics, University of Wisconsin

Madison, Wisconsin

Boston, Massachusetts

Roderick Little, Department of Biostatistics, University of Michigan

Ann Arbor, Michigan

Hans-Georg Müller, Department of Statistics, University of California

Davis, California

Bengt Muthén, Graduate School of Education & Information Studies, University of California

Los Angeles, California

Peter Philipson, School of Mathematics and Statistics, University of Newcastle

Buenos Aires, Argentina

United Kingdom

Harpenden, United Kingdom

PART I

Introduction and Historical Overview


CHAPTER 1

Advances in longitudinal data analysis:

An historical perspective

Garrett Fitzmaurice and Geert Molenberghs

Contents

1.1 Introduction

1.2 Early origins of linear models for longitudinal data analysis

1.3 Linear mixed-effects model for longitudinal data

1.4 Models for non-Gaussian longitudinal data

1.4.1 Marginal or population-averaged models

1.4.2 Generalized linear mixed models

1.4.3 Conditional and transition models

1.5 Concluding remarks

Acknowledgments

References

1.1 Introduction

There have been remarkable developments in statistical methodology for longitudinal data analysis in the past 25 to 30 years. Statisticians and empirical researchers now have access to an increasingly sophisticated toolbox of methods. As might be expected, there has been a lag between the recent developments that have appeared in the statistical journals and their widespread application to substantive problems. At least part of the reason why these advances have been somewhat slow to move into the mainstream is their limited implementation in widely available standard computer software. Recently, however, the introduction of new programs for analyzing multivariate and longitudinal data has made many of these methods far more accessible to statisticians and empirical researchers alike. Also, because statistical software is constantly evolving, we can anticipate that many of the more recent advances will soon be implemented. Thus, the outlook is bright that modern methods for longitudinal analysis will be applied more widely and across a broader spectrum of disciplines.

In this chapter, we take an historical perspective and review many of the key advances that have been made, especially in the past 30 years. Our review will be somewhat selective, and omissions are inevitable; our main goal is to highlight important and enduring developments in methodology. No attempt is made to assign priority to these methods. Our review will set the stage for the remaining chapters of the book, where the focus is on the current state of the art of longitudinal data analysis.

1.2 Early origins of linear models for longitudinal data analysis

The analysis of change is a fundamental component of so many research endeavors in almost every discipline. Many of the earliest statistical methods for the analysis of change were based on the analysis of variance (ANOVA) paradigm, as originally developed by R. A. Fisher. One of the earliest methods proposed for analyzing longitudinal data was a mixed-effects ANOVA, with a single random subject effect. The inclusion of a random subject effect induced positive correlation among the repeated measurements on the same subject. Note that throughout this chapter we use the terms subjects and individuals interchangeably to refer to the participants in a longitudinal study. Interestingly, it was the British astronomer George Biddell Airy who laid the foundations for the linear mixed-model formulation (Airy, 1861), before it was put on a more formal theoretical footing in the seminal work of R. A. Fisher (see, for example, Fisher, 1918, 1925). Airy’s work on a model for errors of observation in astronomy predated Fisher’s more systematic study of related issues within the ANOVA paradigm (e.g., Fisher’s [1921, 1925] writings on the intraclass correlation). Scheffé (1956) provides a fascinating discussion of the early contributions of 19th century astronomers to the development of the theory of random-effects models. As such, it can be argued that statistical methods for the analysis of longitudinal data, in common with classical linear regression and the method of least squares, have their earliest origins in the field of astronomy.

The mixed-effects ANOVA model has a long history of use for analyzing longitudinal data, where it is often referred to as the univariate repeated-measures ANOVA. Statisticians recognized that a longitudinal data structure, with $N$ individuals and $n$ repeated measurements, has striking similarities to data collected in a randomized block design, or the closely related split-plot design. So it seemed natural to apply ANOVA methods developed for these designs (e.g., Yates, 1935; Scheffé, 1959) to the repeated-measures data collected from longitudinal studies. In doing so, the individuals in the study are regarded as the blocks or main plots. The univariate repeated-measures ANOVA model can be written as

$$Y_{ij} = X_{ij}'\beta + b_i + e_{ij}, \quad i = 1, \ldots, N; \; j = 1, \ldots, n,$$

where $Y_{ij}$ is the outcome of interest, $X_{ij}$ is a design vector, $\beta$ is a vector of regression parameters, $b_i \sim N(0, \sigma_b^2)$, and $e_{ij} \sim N(0, \sigma_e^2)$. In this model, the blocks or plot effects are regarded as random rather than fixed effects. The random effect, $b_i$, represents an aggregation of all the unobserved or unmeasured factors that make individuals respond differently. The consequence of including a single, individual-specific random effect is that it induces positive correlation among the repeated measurements, albeit with the following highly restrictive “compound symmetry” structure for the covariance: constant variance $\mathrm{Var}(Y_{ij}) = \sigma_b^2 + \sigma_e^2$ and constant covariance $\mathrm{Cov}(Y_{ij}, Y_{ik}) = \sigma_b^2$.
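To make the compound symmetry structure concrete, the following minimal sketch builds the covariance matrix that the random-intercept model implies for the n repeated measures on one individual. The function name and the numerical values are illustrative assumptions and do not come from the text.

```python
import numpy as np

def compound_symmetry_cov(n, sigma_b2, sigma_e2):
    """Covariance of (Y_i1, ..., Y_in) under the random-intercept model.

    Hypothetical helper: sigma_b2 is Var(b_i), sigma_e2 is Var(e_ij).
    """
    # The shared random effect b_i contributes sigma_b2 to every entry;
    # the independent errors e_ij contribute sigma_e2 on the diagonal only.
    return sigma_b2 * np.ones((n, n)) + sigma_e2 * np.eye(n)

cov = compound_symmetry_cov(n=4, sigma_b2=2.0, sigma_e2=1.0)
# Dividing by the common variance gives the correlation matrix: every
# off-diagonal entry equals sigma_b2 / (sigma_b2 + sigma_e2).
corr = cov / cov[0, 0]
```

In this sketch the correlation between any two occasions is the same regardless of how far apart they are in time, which is what makes the structure so restrictive.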

On the one hand, the univariate repeated-measures ANOVA model provided a natural generalization of Student’s (1908) paired t-test to handle more than two repeated measurements, in addition to various between-subject factors. On the other hand, it can be argued that this model was a Procrustean bed for longitudinal data because the blocks or plots were random rather than fixed by design and there is no sense in which measurement occasions can ever be randomized. Importantly, it is only when the within-subject factor is randomly allocated to individuals that randomization arguments can be made to justify the “compound symmetry” structure for the covariance. There is no basis for this randomization argument in the case of longitudinal data, where the within-subject factor is the measurement occasions. Recognizing that the compound symmetry assumption is restrictive, and to accommodate more general covariance structures for the repeated measures, Greenhouse and Geisser (1959) suggested a correction to the numerator and denominator degrees of freedom of tests derived from the univariate repeated-measures ANOVA (see also Huynh and Feldt, 1976).
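The text does not give the form of the degrees-of-freedom correction. As a hedged illustration, the sketch below computes the sphericity measure usually denoted ε in the repeated-measures literature, by which both degrees of freedom are multiplied; the formula ε = tr(CSC')² / ((n − 1) tr((CSC')²)), with C an orthonormal contrast matrix, is an assumption brought in from that literature, and the function names are hypothetical.

```python
import numpy as np

def orthonormal_contrasts(n):
    """(n-1) x n matrix with orthonormal rows, each orthogonal to the constant vector."""
    # QR of the columns 1, t, t^2, ...: the first column of Q is proportional
    # to the constant vector, so the remaining columns are contrasts.
    V = np.vander(np.arange(n, dtype=float), N=n, increasing=True)
    Q, _ = np.linalg.qr(V)
    return Q[:, 1:].T

def sphericity_epsilon(S):
    """Epsilon for an n x n covariance matrix S of the repeated measures."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    C = orthonormal_contrasts(n)
    M = C @ S @ C.T
    return np.trace(M) ** 2 / ((n - 1) * np.trace(M @ M))

# Compound symmetry: no correction is needed (epsilon equals 1).
eps_cs = sphericity_epsilon(2.0 * np.ones((3, 3)) + np.eye(3))

# Correlations that decay with time separation: epsilon falls below 1.
ar1 = np.array([[1.00, 0.50, 0.25],
                [0.50, 1.00, 0.50],
                [0.25, 0.50, 1.00]])
eps_ar1 = sphericity_epsilon(ar1)
```

Under these assumptions ε lies between 1/(n − 1) and 1, so the corrected test is never more liberal than the uncorrected one.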

In spite of its restrictive assumptions, and many obvious shortcomings, the univariate repeated-measures ANOVA model can be considered a forerunner of more versatile regression models for longitudinal data. As we will discuss later, the notion of allowing effects to vary randomly from one individual to another is the basis of many modern regression models for longitudinal data analysis. Also, it must be remembered that the elegant computational formulae for balanced designs meant that the calculation of ANOVA tables was relatively straightforward, albeit somewhat laborious. For balanced data, estimates of variance components could be readily obtained in closed form by equating ANOVA mean squares to their expectations; sometime later, Henderson (1963) developed a related approach for unbalanced data. So, from an historical perspective, an undoubted appeal of the repeated-measures ANOVA was that it was one of the few models that could realistically be fit to longitudinal data at a time when computing was in its infancy. This explains why, in those days, the key issue perceived to arise with incomplete data was lack of balance.
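The closed-form estimation mentioned above can be sketched for the simplest balanced case. The example assumes a constant mean over occasions (no covariates), so that the expected between-subject mean square is σ_e² + nσ_b² and the expected within-subject mean square is σ_e²; the function name and data are hypothetical.

```python
import numpy as np

def anova_variance_components(Y):
    """Moment estimates of (sigma_b2, sigma_e2) from a balanced N x n array.

    Assumes Y_ij = mu + b_i + e_ij, i.e. no occasion or covariate effects.
    """
    Y = np.asarray(Y, dtype=float)
    N, n = Y.shape
    subject_means = Y.mean(axis=1)
    grand_mean = Y.mean()
    # Between-subject mean square, N - 1 degrees of freedom.
    ms_between = n * np.sum((subject_means - grand_mean) ** 2) / (N - 1)
    # Within-subject mean square, N(n - 1) degrees of freedom.
    ms_within = np.sum((Y - subject_means[:, None]) ** 2) / (N * (n - 1))
    # Equate the mean squares to their expectations and solve.
    sigma_e2 = ms_within
    sigma_b2 = (ms_between - ms_within) / n
    return sigma_b2, sigma_e2

sigma_b2_hat, sigma_e2_hat = anova_variance_components([[1.0, 2.0, 3.0],
                                                        [4.0, 5.0, 6.0]])
```

Note that this moment estimator of σ_b² can be negative when the between-subject mean square is smaller than the within-subject mean square.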

A related approach for the analysis of longitudinal data with an equally long history, but requiring somewhat more advanced computations, is the repeated-measures multivariate analysis of variance (MANOVA). While the univariate repeated-measures ANOVA is conceptualized as a model for a single response variable, allowing for positive correlation among the repeated measures on the same individual via the inclusion of a random subject effect, MANOVA is a model for multivariable responses. As originally developed, MANOVA was intended for the simultaneous analysis of a single measure of a multivariate vector of substantively distinct response variables. In contrast, while longitudinal data are multivariate, the vector of responses are commensurate, being repeated measures of the same response variable over time. So, although MANOVA was developed for multiple, but distinct, response variables, statisticians recognized that such data share a common feature with longitudinal data, namely, that they are correlated. This led to the development of a very specific variant of MANOVA, known as repeated-measures analysis by MANOVA (or sometimes referred to as multivariate repeated-measures ANOVA).

A special case of the repeated-measures analysis by MANOVA is a general approach known as profile analysis (Box, 1950; see also Geisser and Greenhouse, 1958; Greenhouse and Geisser, 1959). It proceeds by constructing a set of derived variables, based on a linear combination of the original sequence of repeated measures, and using relevant subsets of these to address questions about longitudinal change and its relation to between-subject factors. These derived variables provide information about the mean level of the response, averaged over all measurement occasions, and also about change in the response over time. For the most part, the primary interest in a longitudinal analysis is in the analysis of the latter derived variables. The multiple derived variables representing the effects of measurement occasions are then analyzed by MANOVA.

Box (1950) provided one of the earliest descriptions of this approach, proposing the construction of derived variables that represent polynomial contrasts of the measurement occasions; closely related work can be found in Danford, Hughes, and McNee (1960), Geisser (1963), Potthoff and Roy (1964), Cole and Grizzle (1966), and Grizzle and Allen (1969). Alternative transformations can be used, as the MANOVA test statistics are invariant to how change over time is characterized in the transformation of the original repeated measures. Although the MANOVA approach is computationally more demanding than the univariate repeated-measures ANOVA, an appealing feature of the method is that it allows assumptions on the structure of the covariance among repeated measures to be relaxed. In standard applications of the method, no explicit structure is assumed for the covariance among repeated measures (other than homogeneity of covariance across different individuals).

There is a final related approach to longitudinal data analysis based on the ANOVA paradigm that has a long history and remains in widespread use. In this approach, the sequence of repeated measures for each individual is reduced to a single summary value (or, in certain cases, a set of summary values). The major motivation behind the use of this approach is that, if the sequence of repeated measures can be reduced to a single number summary, then ANOVA methods (or, alternatively, non-parametric methods) for the analysis of a univariate response can be applied. For example, the area under the curve (AUC) is one common measure that is frequently used to summarize the sequence of repeated measures on any individual. The AUC, usually approximated by the area of the trapezoids joining adjacent repeated measurements, can then be related to covariates (e.g., treatment or intervention groups) using ANOVA. Wishart (1938) provided one of the earliest descriptions of this approach in a paper with the almost unforgettable title “Growth-rate determinations in nutrition studies with the bacon pig, and their analysis”; closely related methods can be found in Box (1950) and Rao (1958).
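The trapezoidal approximation to the AUC described above can be sketched as follows. The function name and the example measurements are hypothetical.

```python
import numpy as np

def auc_trapezoid(times, y):
    """Area under one individual's response profile by the trapezoid rule."""
    t = np.asarray(times, dtype=float)
    y = np.asarray(y, dtype=float)
    widths = np.diff(t)                      # time between adjacent occasions
    mean_heights = 0.5 * (y[:-1] + y[1:])    # average of adjacent responses
    return float(np.sum(widths * mean_heights))

# Three measurements at unequally spaced times 0, 1, and 3.
auc = auc_trapezoid([0, 1, 3], [2.0, 4.0, 4.0])
```

Each individual's AUC would then serve as the single response in a standard univariate ANOVA.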

Within a limited context, the three ANOVA-based approaches discussed thus far provided the basis for a longitudinal analysis. However, all of these methods had shortcomings that limited their usefulness in applications. The univariate repeated-measures ANOVA made very restrictive assumptions about the covariance structure for repeated measures on the same individual. The assumed compound symmetry form for the covariance is not appropriate for longitudinal data for at least two reasons. First, the constraint on the correlation among repeated measurements is somewhat unappealing for longitudinal data, where the correlations are expected to decay with increasing separation in time. Second, the assumption of constant variance across time is often unrealistic. In many longitudinal studies the variability of the response at the beginning of the study is discernibly different from the variability toward the completion of the study; this is especially the case when the first repeated measurement represents a “baseline” response. Finally, as originally conceived, the repeated-measures ANOVA model was developed for the analysis of data from designed experiments, where the repeated measures are obtained at a set of occasions common to all individuals, the covariates are discrete factors (e.g., representing treatment group and time), and the data are complete. As a result, early implementations of the repeated-measures ANOVA could not be readily applied to longitudinal data that were irregularly spaced or incomplete, or when it was of interest to include quantitative covariates in the analysis.

In contrast, the repeated-measures analysis by MANOVA did not make restrictive assumptions on the covariance among the longitudinal responses on the same individual. As a result, the correlations could assume any pattern and the variability could change over time. However, MANOVA had a number of features that also limited its usefulness. In particular, the MANOVA formulation forced the within-subject covariates to be the same for all individuals. There are at least two practical consequences of this constraint. First, repeated-measures MANOVA cannot be used when the design is unbalanced over time (i.e., when the vectors of repeated measures are of different lengths and/or obtained at different sequences of time). Second, the repeated-measures MANOVA (at least as implemented in existing statistical software packages) did not allow for general missing-data patterns to arise. Thus, if any individual has even a single missing response at any occasion, the entire data vector from that individual must be excluded from the analysis. This so-called “listwise” deletion of missing data from the analysis often results in dramatically reduced sample size and very inefficient use of the available data. Listwise deletion of missing data can also produce biased estimators of change in the mean response over time when the so-called “completers” (i.e., those with no missing data) are not a random sample from the target population. Furthermore, balance between treatment groups is destroyed, hence the early attraction of so-called imputation methods.
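The cost of listwise deletion can be illustrated with a small sketch. The data below are entirely hypothetical (six subjects, four planned occasions, `None` marking a missing response); the point is only to count how much observed information is discarded.

```python
# Hypothetical data: 6 subjects, 4 planned occasions; None marks a missing response.
responses = {
    "s1": [5.1, 5.4, 5.9, 6.3],
    "s2": [4.8, None, 5.5, 5.9],
    "s3": [5.0, 5.2, None, None],
    "s4": [5.3, 5.6, 6.0, 6.4],
    "s5": [4.9, 5.1, 5.6, None],
    "s6": [5.2, 5.5, 5.8, 6.1],
}

# Listwise deletion keeps only subjects with no missing response.
completers = {sid: y for sid, y in responses.items() if None not in y}

observed_total = sum(v is not None for y in responses.values() for v in y)
observed_kept = sum(len(y) for y in completers.values())

print("subjects kept:", len(completers), "of", len(responses))
print("observations used:", observed_kept, "of", observed_total, "observed")
```

In this made-up example half of the subjects are dropped, and 8 of the 20 responses that were actually observed never enter the analysis.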

Finally, although the analysis of summary measures had a certain appeal due to the simplicity of the method, it too had a number of distinct drawbacks. By definition, it forces the data analyst to focus on only a single aspect of the repeated measures over time; when $n$ repeated measures are replaced by a single-number summary, there must necessarily be some loss of information. Also, individuals with discernibly different response profiles can produce the same summary measure. A second potential drawback is that the covariates must be time-invariant; the method cannot be applied when covariates are time-varying. Furthermore, many of the simple summary measures are not so well defined when there are missing data or irregularly spaced repeated measures. Even in cases where the summary measure can be defined, the resulting analysis is not fully efficient. In particular, when some individuals have missing data or different numbers of repeated measures, the derived summary measures no longer have the same variance, thereby violating the fundamental assumption of homogeneity of variance for standard ANOVA models.

In summary, the origins of the statistical analysis of change can be traced back to the ANOVA paradigm. ANOVA methods have a long and extensive history of use in the analysis of longitudinal data. While ANOVA methods can provide a reasonable basis for a longitudinal analysis in cases where the study design is very simple, they have many shortcomings that have limited their usefulness in applications. In many longitudinal studies there is considerable variation among individuals in both the number and timing of measurements. The resulting data are highly unbalanced and not readily amenable to ANOVA methods developed for balanced designs. It was these features of longitudinal data that provided the impetus for statisticians to develop far more versatile techniques that can handle the commonly encountered problems of data that are unbalanced and incomplete, mistimed measurements, time-varying and time-invariant covariates, and responses that are discrete rather than continuous.

1.3 Linear mixed-effects model for longitudinal data

The linear mixed-effects model is probably the most widely used method for analyzing longitudinal data. Although the early development of mixed-effects models for hierarchical or clustered data can be traced back to the ANOVA paradigm (see, for example, Scheffé, 1959) and to the seminal paper by Harville (1977), their usefulness for analyzing longitudinal data, especially in the life sciences, was highlighted in the 1980s in a widely cited paper by Laird and Ware (1982). Goldstein (1979) is often seen as the counterpart for the humanities. The idea of allowing certain regression coefficients to vary randomly across individuals was also a recurring theme in the early contributions to growth curve analysis by Wishart (1938), Box (1950), Rao (1958), Potthoff and Roy (1964), and Grizzle and Allen (1969); these early contributions to growth curve modeling laid the foundation for the linear mixed-effects model. The idea of randomly varying regression coefficients was also a common thread in the so-called two-stage approach to analyzing longitudinal data. In the two-stage formulation, the repeated measurements on each individual are assumed to follow a regression model with distinct regression parameters for each individual. The distribution of these individual-specific regression parameters, or “random effects,” is modeled in the second stage. A version of the two-stage formulation was popularized by biostatisticians working at the U.S. National Institutes of Health (NIH). They proposed a method for analyzing repeated-measures data where, in the first stage, subject-specific regression coefficients are estimated using ordinary least-squares regression. In the second stage, the estimated regression coefficients are then analyzed as summary measures using standard parametric (or non-parametric) methods. Interestingly, this method for analyzing repeated-measures data became known as the “NIH method.” Although it is difficult to attribute the popularization of the NIH method to any single biostatistician at NIH, Sam Greenhouse, Max Halperin, and Jerry Cornfield introduced many biostatisticians to this technique. In the agricultural sciences, a similar approach was popularized in a highly cited paper by Rowell and Walters (1976). Rao (1965) put this two-stage approach on a more formal footing by specifying a parametric growth curve model that assumed normally distributed random growth curve parameters.
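The two stages of the “NIH method” described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the historical implementation: the data, the group labels, and the helper `ols_slope` are all hypothetical, and the second stage is reduced to a comparison of mean slopes rather than a formal test.

```python
from statistics import mean


def ols_slope(times, y):
    """Stage 1: ordinary least-squares slope for one subject."""
    t_bar, y_bar = mean(times), mean(y)
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


# Hypothetical data: (group, measurement times, responses) per subject.
subjects = [
    ("control", [0, 1, 2, 3], [10.0, 10.5, 11.0, 11.5]),
    ("control", [0, 1, 2, 3], [9.0, 9.7, 10.1, 10.8]),
    ("treated", [0, 1, 2, 3], [10.0, 11.0, 12.0, 13.0]),
    ("treated", [0, 1, 2, 3], [9.5, 10.7, 11.4, 12.8]),
]

# Stage 1: one estimated slope per subject.
slopes = {"control": [], "treated": []}
for group, times, y in subjects:
    slopes[group].append(ols_slope(times, y))

# Stage 2: treat the slopes as summary measures and compare group means.
mean_slope = {g: mean(s) for g, s in slopes.items()}
difference = mean_slope["treated"] - mean_slope["control"]
print(mean_slope, round(difference, 3))
```

Because each subject is reduced to a single estimated slope, the second stage can use any standard two-sample method, which is exactly what made the approach attractive in practice.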

Although remarkably simple and useful, the two-stage formulation of the linear mixed-effects model introduced some unnecessary restrictions. Specifically, in the first stage, the covariates were restricted to be time-varying (with the exception of the column of 1s for the intercept); between-subject (or time-invariant) covariates could only be introduced in the second stage, where the individual-specific regression coefficients were modeled as a linear function of these covariates. The two-stage formulation placed unnecessary, and often inconvenient, constraints on the choice of the design matrix for the fixed effects. But, from an historical perspective, it provided motivation for the main ideas and concepts underlying linear mixed-effects models. The method can be viewed as based on summaries and consequently it shares the disadvantages of such methods.

In the early 1980s, Laird and Ware (1982), drawing upon a general class of mixed models introduced earlier by Harville (1977), proposed a flexible class of linear mixed-effects models for longitudinal data. These models could handle the complications of mistimed and incomplete measurements in a very natural way. The linear mixed-effects model is given by

$$Y_{ij} = X_{ij}'\beta + Z_{ij}'b_i + e_{ij},$$

where $Z_{ij}$ is a design vector for the random effects, $b_i \sim N(0, G)$, and the vector of errors $e_i \sim N(0, V_i)$.

Commonly, it is assumed that $V_i = \sigma^2 I$, although additional correlation among the errors can be accommodated by allowing more general covariance structures for $V_i$ (e.g., autoregressive). In addition, alternative distributions for the random effects can be entertained. The linear mixed-effects model proposed by Laird and Ware (1982) included the univariate repeated-measures ANOVA and growth curve models for longitudinal data as special cases. In addition, the Laird and Ware (1982) formulation of the model had two desirable features: first, there were fewer restrictions on the design matrices for the fixed and random effects; second, the model parameters could be estimated efficiently via likelihood-based methods. Previously, difficulties with estimation of mixed-effects models had held back their widespread application to longitudinal data. Laird and Ware (1982) showed how the expectation–maximization (EM) algorithm (Dempster, Laird, and Rubin, 1977) could be used to fit this general class of models for longitudinal data. Soon after, Jennrich and Schluchter (1986) proposed a variety of alternative algorithms, including Fisher scoring and Newton–Raphson. Currently, maximum likelihood and restricted maximum likelihood estimation, the latter devised to diminish the small-sample bias of maximum likelihood, are the most frequently employed routes for estimation and inference (Verbeke and Molenberghs, 2000; Fitzmaurice, Laird, and Ware, 2004).
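One way to see why the univariate repeated-measures ANOVA is a special case is to look at the covariance the model implies for one subject's responses. Assuming the random effects and errors are independent and $V_i = \sigma^2 I$, that covariance is $Z_i G Z_i' + \sigma^2 I$, where $Z_i$ stacks the design vectors $Z_{ij}'$ as rows. The sketch below computes it for hypothetical values of $G$, $\sigma^2$, and the measurement times; the function name is an assumption made for this example.

```python
def marginal_covariance(Z, G, sigma2):
    """Return Z G Z' + sigma2 * I for one subject (lists of lists)."""
    n, q = len(Z), len(G)
    cov = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            cov[j][k] = sum(Z[j][a] * G[a][b] * Z[k][b]
                            for a in range(q) for b in range(q))
            if j == k:
                cov[j][k] += sigma2
    return cov


times = [0.0, 1.0, 2.0]

# Random intercept only: Z is a column of 1s, G is 1 x 1.
cov_intercept = marginal_covariance([[1.0] for _ in times], [[2.0]], 1.0)

# Random intercept and slope: Z has columns (1, t_ij).
Z = [[1.0, t] for t in times]
G = [[2.0, 0.2], [0.2, 0.5]]
cov_slope = marginal_covariance(Z, G, 1.0)

for row in cov_intercept:
    print(row)
for row in cov_slope:
    print([round(v, 2) for v in row])
```

With a random intercept alone, every variance is equal and every covariance is equal, which is the compound symmetry structure of the univariate ANOVA. Adding a random slope lets both the variances and the covariances change with time.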

So, by the mid-1980s, a very general class of linear models for longitudinal data had been proposed that could handle issues of unbalanced data, due to either mistimed measurement or missing data, could handle both time-varying and time-invariant covariates, and provided a flexible, yet parsimonious, model for the covariance. Moreover, these developments appeared at a time when there were great advances in computing power. It was not too long before these methods were available at the desktop and were being applied to longitudinal data in a wide variety of disciplines. Nevertheless, many of the simple and simplifying procedures stuck, out of habit and/or because they have become part of standard operating procedures.

1.4 Models for non-Gaussian longitudinal data

The advances in methods for longitudinal data analysis discussed so far have been based on linear models for continuous responses that may be approximately normally distributed. Next, we consider some of the parallel developments when the response variable is discrete. The developments in methods for analyzing a continuous longitudinal response span more than a century, from the early work on simple random-effects models by the British astronomer Airy (1861) through the landmark paper on linear mixed-effects models for longitudinal data by Laird and Ware (1982). In contrast, many of the advances in methods for discrete longitudinal data have been concentrated in the last 25 to 30 years, harnessing the high-speed computing resources available at the desktop.

When the longitudinal response is discrete, linear models are no longer appropriate for relating changes in the mean response to covariates. Instead, statisticians have developed extensions of generalized linear models (Nelder and Wedderburn, 1972) for longitudinal data. Generalized linear models provide a unified class of models for regression analysis of independent observations of a discrete or continuous response. A characteristic feature of generalized linear models is that a suitable non-linear transformation of the mean response is assumed to be a linear function of the covariates. As we will discuss, this non-linearity raises some additional issues concerning the interpretation of the regression coefficients in models for longitudinal data. Statisticians have extended generalized linear models to handle longitudinal observations in a number of different ways; here we consider three broad, but quite distinct, classes of regression models for longitudinal data: (i) marginal or population-averaged models, (ii) random-effects or subject-specific models, and (iii) transition or response conditional models. These models not only differ in how the correlation among the repeated measures is accounted for, but also have regression parameters with discernibly different interpretations. These differences in interpretation reflect the different targets of inference of these models. Here we sketch some of the early developments of these models from an historical perspective; later chapters of this book will discuss many of these models in much greater detail. Because binary data are so common, we focus much of our review on models for longitudinal binary data. Most of the developments apply to, say, categorical data and counts equally well.

1.4.1 Marginal or population-averaged models

As mentioned above, the extensions of generalized linear models from the univariate to the multivariate response setting have followed a number of different research threads. In this section we consider an approach for extending generalized linear models to longitudinal data that leads to a class of regression models known as marginal or population-averaged models (see Chapter 3 of this volume). It must be admitted from the outset that the former term is potentially confusing; nonetheless it has endured faute de mieux. The term marginal in this context is used to emphasize that the model for the mean response at each occasion depends only on the covariates of interest, and not on any random effects or previous responses. This is in contrast to mixed-effects models, where the mean response depends not only on covariates but also on a vector of random effects, and to transition or, more generally, conditional models (e.g., Markov models), where the mean response depends also on previous responses.

Marginal models provide a straightforward way to extend generalized linear models to longitudinal data. They directly model the mean response at each occasion, $E(Y_{ij} \mid X_{ij})$, using an appropriate link function. Because the focus is on the marginal mean and its dependence on the covariates, marginal models do not necessarily require full distributional assumptions for the vector of repeated responses, only a regression model for the mean response. As we will discuss later, this can be advantageous, as there are few tractable likelihoods for marginal models for discrete longitudinal data.

Typically, a marginal model for longitudinal data has the following three-part specification:

1. The mean of each response, $E(Y_{ij} \mid X_{ij}) = \mu_{ij}$, is assumed to depend on the covariates through a known link function,

$$h^{-1}(\mu_{ij}) = X_{ij}'\beta.$$

2. The variance of each $Y_{ij}$, given the covariates, is assumed to depend on the mean according to

$$\mathrm{Var}(Y_{ij} \mid X_{ij}) = \phi\, v(\mu_{ij}),$$

where $v(\mu_{ij})$ is a known variance function and $\phi$ is a scale parameter that may be known or may need to be estimated.

3. The conditional within-subject association among the vector of repeated responses, given the covariates, is assumed to be a function of an additional set of association parameters, $\alpha$ (and may also depend upon the means, $\mu_{ij}$).
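The three parts can be made concrete for a binary response. The sketch below is one possible instance, assuming a logit link (so that $\mu_{ij}$ is the inverse logit of $X_{ij}'\beta$), the Bernoulli variance function $v(\mu) = \mu(1 - \mu)$, and a single common pairwise correlation $\alpha$ as the association model. All numerical values and function names are hypothetical.

```python
import math


def inverse_logit(eta):
    """Mean for a binary response under a logit link: mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_moments(X, beta, alpha, phi=1.0):
    """Means, variances and covariance matrix for one subject.

    X     : list of covariate vectors X_ij, one per occasion
    beta  : regression parameters of the mean model (part 1)
    phi   : scale parameter; v(mu) = mu * (1 - mu) for binary data (part 2)
    alpha : common pairwise correlation, an exchangeable choice (part 3)
    """
    mu = [inverse_logit(sum(x * b for x, b in zip(row, beta))) for row in X]
    var = [phi * m * (1.0 - m) for m in mu]
    n = len(mu)
    cov = [[var[j] if j == k else alpha * math.sqrt(var[j] * var[k])
            for k in range(n)] for j in range(n)]
    return mu, var, cov


# Hypothetical subject: intercept and time at three occasions.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
mu, var, cov = marginal_moments(X, beta=[-1.0, 0.5], alpha=0.3)
print([round(m, 3) for m in mu])
print([round(v, 3) for v in var])
```

Note that only `beta` enters the means: changing `alpha` alters the covariance matrix but leaves `mu` untouched, which mirrors the separation between the mean model and the association model in the specification above.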

Of the three, the first is the key component of a marginal model and specifies the model for the mean response at each occasion, $E(Y_{ij} \mid X_{ij})$, and its dependence on the covariates. However, there is an implicit assumption in the first component that is often overlooked. Marginal models assume that the conditional mean of the $j$th response, given $X_{i1}, \ldots, X_{in}$, depends only on $X_{ij}$, that is,

$$E(Y_{ij} \mid X_i) = E(Y_{ij} \mid X_{i1}, \ldots, X_{in}) = E(Y_{ij} \mid X_{ij}),$$

where obviously $X_i = (X_{i1}, \ldots, X_{in})$; see Fitzmaurice, Laird, and Rotnitzky (1993) and Pepe and Anderson (1994) for a discussion of the implications of this assumption. With time-invariant covariates, this assumption necessarily holds. Likewise, with time-varying covariates that are fixed by design of the study (e.g., time since baseline, treatment group indicator in a crossover trial), the assumption holds, as values of the covariates are determined a priori by study design and in a manner unrelated to the longitudinal response. However, when a time-varying covariate varies randomly over time, the assumption may no longer hold. As a result, somewhat greater care is required when fitting marginal models with time-varying covariates that are not fixed by design of the study. This problem has long been recognized by econometricians (see, for example, Engle, Hendry, and Richard, 1983), and there is now an extensive statistical literature on this topic (see, for example, Robins, Greenland, and Hu, 1999).

The second component specifies the marginal variance at each occasion, with the choice of variance function depending upon the type of response. For balanced longitudinal designs, a separate scale parameter, $\phi_j$, can be specified at each occasion; alternatively, the scale parameter could depend on the times of measurement, with $\phi(t_{ij})$ being some parametric function of $t_{ij}$. Restriction to a single unknown parameter $\phi$ is especially limiting in the analysis of continuous responses where the variance of the repeated measurements is often not constant over the duration of the study.

The first two components of a marginal model specify the mean and variance of $Y_{ij}$, closely following the standard formulation of a generalized linear model. The only minor difference is that marginal models typically specify a common link function relating the vector of mean responses to the covariates. It is the third component that recognizes the characteristic lack of independence among longitudinal data by modeling the within-subject association among the repeated responses from the same individual. In describing this third component, we have been careful to avoid the use of the term correlation for two reasons. First, with a continuous response variable, the correlation is a natural measure of the linear dependence among the repeated responses and is variation independent of the mean response. However, this is not the case with discrete responses. With discrete responses, the correlations are constrained by the mean responses, and vice versa. The most extreme example of this arises when the response variable is binary. For binary responses, the correlations are heavily restricted to ranges that are determined by the means (or probabilities of success) of the responses. As a result, the correlation is not the most natural measure of within-subject association with discrete responses. For example, for two associated binary outcomes with probabilities of success equal to 0.2 and 0.8, the correlation can be no larger than 0.25. Instead, the odds ratio is a preferable metric for association among pairs of binary responses. The odds ratio is not restricted by the means for two outcomes, and the restrictions are mild for longer sequences of repeated measures. Second, for a continuous response that has a multivariate normal distribution, the correlations, along with the variances and the means, completely specify the joint distribution of the vector of longitudinal responses. This is not the case with discrete data. The vector of means and the covariance matrix do not, in general, completely specify the joint distribution of discrete longitudinal responses. Instead, the joint distribution requires specification of pairwise and higher-order associations among the responses.

This three-part specification of a marginal model makes transparent the extension of generalized linear models to longitudinal data. The first two parts of the marginal model correspond to the standard generalized linear model, albeit with no explicit distributional assumptions about the responses. It is the third component, the incorporation of a model for the within-subject association among the repeated responses from the same individual, that represents the main extension of generalized linear models to longitudinal data. A crucial aspect of marginal models is that the mean response and within-subject association are modeled separately. This separation of the modeling of the mean response and the association among responses has important implications for interpretation of the regression parameters $\beta$. In particular, the regression parameters have population-averaged interpretations. They describe how the mean response in the population changes over time and how these changes are related to covariates. Note that the interpretation of $\beta$ is not altered in any way by the assumptions made about the nature or magnitude of the within-subject association.
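The bound of 0.25 quoted above for success probabilities 0.2 and 0.8 can be checked directly. The joint probability that both binary outcomes equal 1 must lie between $\max(0, p_1 + p_2 - 1)$ and $\min(p_1, p_2)$, and the correlation is a linear function of that joint probability. The function name below is an assumption made for this sketch.

```python
import math


def correlation_bounds(p1, p2):
    """Smallest and largest correlation attainable by two binary variables
    with success probabilities p1 and p2."""
    # The joint probability Pr(Y1 = 1, Y2 = 1) must lie in this range.
    p11_min = max(0.0, p1 + p2 - 1.0)
    p11_max = min(p1, p2)
    sd = math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    return (p11_min - p1 * p2) / sd, (p11_max - p1 * p2) / sd


low, high = correlation_bounds(0.2, 0.8)
print(round(low, 3), round(high, 3))  # prints -1.0 0.25
```

The upper bound reaches 1 only when the two success probabilities are equal, which is why correlation parameters and mean parameters cannot be varied freely of one another for binary data.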

From an historical perspective, it is difficult to pinpoint the origins of marginal models. In the case of linear models, the earliest approaches based on the ANOVA paradigm fit squarely within the framework of marginal models. In a certain sense, the necessity to distinguish marginal models from other classes of models becomes critical only for discrete responses. The development of marginal models for discrete longitudinal data has its origins in likelihood-based approaches, where the three-part specification given above is extended by making full distributional assumptions about the $n \times 1$ vector of responses, … of the issues that have complicated the application of marginal models to discrete data, leading to the widespread use of alternative, semi-parametric methods.

At least three main research threads can be distinguished in the development of likelihood-based marginal models for discrete longitudinal data. Because binary data are so common, we focus much of this review on models for longitudinal binary data. One of the earliest likelihood-based approaches was proposed by Gumbel (1961), who posited a latent-variable model for multivariate binary data. In this approach, there is a vector of unobserved latent variables, say $L_{i1}, \ldots, L_{in}$, and each of these is related to the observed binary responses via

$$Y_{ij} = \begin{cases} 1, & L_{ij} \le X_{ij}'\beta, \\ 0, & L_{ij} > X_{ij}'\beta. \end{cases}$$

Assuming a multivariate joint distribution for $L_{i1}, \ldots, L_{in}$ identifies the joint distribution for $Y_{i1}, \ldots, Y_{in}$, with

$$\Pr(Y_{i1} = 1, Y_{i2} = 1, \ldots, Y_{in} = 1) = \Pr(L_{i1} \le X_{i1}'\beta, L_{i2} \le X_{i2}'\beta, \ldots, L_{in} \le X_{in}'\beta) = F(X_{i1}'\beta, X_{i2}'\beta, \ldots, X_{in}'\beta),$$

where $F(\cdot)$ denotes the joint cumulative distribution function of the latent variables. Furthermore, any dependence among the $L_{ij}$ induces dependence among the $Y_{ij}$. For example, a bivariate logistic distribution for any $L_{ij}$ and $L_{ik}$ induces marginally a logistic regression model for $Y_{ij}$ and $Y_{ik}$,

$$\log\left\{\frac{\Pr(Y_{ij} = 1)}{\Pr(Y_{ij} = 0)}\right\} = X_{ij}'\beta,$$

and similarly for $Y_{ik}$.

Although Gumbel’s (1961) model can accommodate more than two responses, the marginal covariance among the $Y_{ij}$ becomes quite complicated. Other multivariate distributions for the latent variables, with arbitrary marginals (e.g., logistic or probit) for the $Y_{ij}$, can be derived, but in general the joint distribution for $Y_{i1}, \ldots, Y_{in}$ is relatively complicated, as is the marginal covariance structure. As a result, these models were not widely adopted for the analysis of discrete longitudinal data. Closely related work, assuming a multivariate normal distribution for the latent variables, appeared in Ashford and Sowden (1970), Cox (1972), and Ochi and Prentice (1984). In the latter model, the $Y_{ij}$ marginally follow a probit model,

$$\Pr(Y_{ij} = 1) = \Phi(X_{ij}'\beta),$$

where $\Phi(\cdot)$ denotes the normal cumulative distribution function, and the model allows both positive and negative correlation among the repeated binary responses, depending on the sign of the correlation among the underlying latent variables. This model is often referred to as the “multivariate probit model.” Interestingly, the multivariate probit model can also be motivated through the introduction of random effects (see the discussion of generalized linear mixed models in Section 1.4.2).

One of the main drawbacks of the latent-variable model formulations that limited their application to longitudinal data is that they require $n$-dimensional integration over the joint distribution of the latent variables. In general, it can be computationally intensive to calculate or even approximate these integrals. In addition, the simple correlation structure assumed for the latent variables may be satisfactory for many types of clustered data but is somewhat less appealing for longitudinal data. In principle, however, a more complex covariance structure for the latent variables could be assumed.

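At around the same time as Gumbel (1961) proposed his latent-variable formulation, a second approach to likelihood-based inferences was proposed by Bahadur (1961). Bahadur (1961) proposed an elegant expansion for an arbitrary probability mass function for a vector of responses $Y_{i1}, \ldots, Y_{in}$. The expansion for repeated binary responses is of the form

$$f(y_i) = \prod_{j=1}^{n} \pi_{ij}^{y_{ij}} (1 - \pi_{ij})^{1 - y_{ij}} \left\{ 1 + \sum_{j<k} \rho_{ijk} z_{ij} z_{ik} + \sum_{j<k<l} \rho_{ijkl} z_{ij} z_{ik} z_{il} + \cdots + \rho_{i1 \ldots n} z_{i1} z_{i2} \cdots z_{in} \right\},$$

where $z_{ij} = (y_{ij} - \pi_{ij}) / \sqrt{\pi_{ij}(1 - \pi_{ij})}$ is the realized value of the standardized response $Z_{ij}$, $\pi_{ij} = E(Y_{ij})$, and $\rho_{ijk} = E(Z_{ij} Z_{ik})$, $\ldots$, $\rho_{i1 \ldots n} = E(Z_{i1} \cdots Z_{in})$. Here, $\rho_{ijk}$ is the pairwise or second-order correlation and the additional parameters relate to third- and higher-order correlations among the responses.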

The Bahadur expansion has a particularly appealing property, shared with the multivariate probit model and many other marginal models, of being “reproducible” or “upwardly compatible” in the sense that the same model holds for any subset of the vector of responses. In addition, the multinomial probabilities for the vector of binary responses are relatively straightforward to obtain given the model parameters. Kupper and Haseman (1978) and Altham (1978) discussed applications of this model, albeit with very simple pairwise correlation structure and assuming higher-order terms are zero. The chief drawback of the Bahadur expansion that has limited its application to longitudinal data is its parameterization of the higher-order associations in terms of correlation parameters. As noted earlier, for discrete data there are severe restrictions on the correlations and dependence of the correlations on the means. Thus, for discrete data, the Bahadur model requires a complicated set of inequality constraints on the model parameters that make maximization of the likelihood very difficult. Except in very simple settings with a small number of repeated measures, the Bahadur model has not been widely applied to longitudinal data.
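The inequality constraints can be seen in a small sketch of the second-order Bahadur representation, in which the probabilities under independence are multiplied by one plus a sum of pairwise correlation terms and all third- and higher-order terms are set to zero. The function name and the parameter values are assumptions chosen for illustration.

```python
import math
from itertools import product


def bahadur_pmf(pi, rho):
    """Second-order Bahadur probabilities for a vector of binary responses.

    pi  : success probabilities pi_j
    rho : dict mapping pairs (j, k), j < k, to pairwise correlations
    Third- and higher-order terms are set to zero.
    """
    pmf = {}
    for y in product((0, 1), repeat=len(pi)):
        independent = math.prod(p if v else 1.0 - p for p, v in zip(pi, y))
        z = [(v - p) / math.sqrt(p * (1.0 - p)) for p, v in zip(pi, y)]
        correction = 1.0 + sum(r * z[j] * z[k] for (j, k), r in rho.items())
        pmf[y] = independent * correction
    return pmf


ok = bahadur_pmf([0.2, 0.8], {(0, 1): 0.2})
bad = bahadur_pmf([0.2, 0.8], {(0, 1): 0.5})

print(round(sum(ok.values()), 6), min(ok.values()) >= 0)
print(round(sum(bad.values()), 6), min(bad.values()) >= 0)
```

Both sets of values sum to one, but with means 0.2 and 0.8 a correlation of 0.5 exceeds the attainable maximum and yields a negative "probability" for one response pattern. A likelihood maximization therefore has to keep the parameters inside a region that itself depends on the means.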

Because of the restrictions on the correlations, alternative multinomial models for the joint distribution of the vector of discrete responses have recently been proposed where the within-subject association is parameterized in terms of other metrics of association. For example, Dale (1984), McCullagh and Nelder (1989), Lipsitz, Laird, and Harrington (1990), Liang, Zeger, and Qaqish (1992), Becker and Balagtas (1993), Molenberghs and Lesaffre (1994), Lang and Agresti (1994), Glonek and McCullagh (1995), and others have proposed full likelihood approaches where the higher-order moments are parameterized in terms of marginal odds ratios. In closely related work, Ekholm (1991) parameterizes the association directly in terms of the higher-order marginal probabilities (see also Ekholm, Smith, and McDonald, 1995). An alternative approach is to parameterize the within-subject association in terms of conditional associations, leading to so-called “mixed-parameter” models (Fitzmaurice and Laird, 1993; Glonek, 1996; Molenberghs and Ritter, 1996). However, except in certain special cases (e.g., Markov models), these conditional association parameters have somewhat less appealing interpretations in the longitudinal setting; moreover, their interpretation is straightforward only in balanced longitudinal designs.

In virtually all of these later advances, the application of the methodology has been hampered by at least three main factors. First, unlike in the Bahadur model, there are no simple expressions for the joint probabilities in terms of the model parameters. This makes maximization of the likelihood somewhat difficult. Second, even with the current advances in computing, these models are difficult to fit except when the number of repeated measures is relatively small. Finally, many of these models are not robust to misspecification of the higher-order moments. That is, many of the likelihood-based methods require that the entire joint distribution be correctly specified. Thus, if the marginal model for the mean responses has been correctly specified but the model for any of the higher-order moments has not, then the maximum likelihood estimators of the marginal mean parameters will fail to converge in probability to the true mean parameters. The “mixed-parameter” models are an exception to the rule; however, even these models lose this robustness property when there are missing data.

A third approach to likelihood-based marginal models is to specify the entire multinomial distribution of the vector of repeated categorical responses and estimate the multinomial probabilities non-parametrically. This was the approach first proposed in Grizzle, Starmer, and Koch (1969). Specifically, they proposed a weighted least-squares (WLS) method for fitting a general family of models for categorical data; in recognition of its developers, the method is often referred to as the “GSK method.” Koch and Reinfurt (1971) and Koch et al. (1977) later recognized how these models could be applied to discrete longitudinal data; Stanish, Gillings, and Koch (1978), Stanish and Koch (1984), and Woolson and Clarke (1984) further developed the methodology for longitudinal analysis.

The GSK method provides a very general family of models for repeated categorical data, allowing non-linear link functions to relate the marginal expectations to covariates. The GSK method stratifies individuals according to values of the covariates and fully specifies the multinomial distribution of the vector of repeated categorical responses within each stratum. This method, for example, allows the fitting of logistic regression models to repeated binary data, albeit with the restrictions that the longitudinal study design be balanced on time, all covariates be categorical, and there be sufficient numbers of individuals within covariate strata to estimate the multinomial probabilities non-parametrically as the sample proportions. The method requires the estimation of the covariance among the repeated responses, within strata defined by covariate values; the covariance follows directly from the properties of the multinomial distribution. Asymptotically, the GSK method is equivalent to maximum likelihood estimation; thus, this approach was appealing for analyzing discrete longitudinal data when all of the conditions required for its use were met.

Although the GSK method was a landmark technique for the analysis of repeated categorical data, it had many restrictions that limited its usefulness. Specifically, it required that all covariates be categorical and that sample sizes be large enough to allow for stratification and separate estimation of the multinomial covariance in each covariate stratum. However, as the number of categorical covariates in the model increases, sparse data problems quickly arise due to Bellman’s (1961) “curse of dimensionality.” Furthermore, missing data are not easily handled by the GSK method because they require additional stratification by patterns of missingness. Thus, the GSK method was restricted to balanced designs with categorical covariates and relatively large sample sizes. It suffered from many of the same limitations as were noted for the repeated measures by MANOVA in Section 1.2.

Generally speaking, there have been a number of impediments to the application of likelihood-based marginal models for the analysis of discrete longitudinal data. The latent-variable model formulations, first proposed by Gumbel (1961), require high-dimensional integration over the joint distribution of the latent variables that is computationally too difficult. In contrast, methods that fully specify the multinomial probabilities for the response vector, such as the GSK method, are relatively straightforward to implement. However, the conditions for the use of the GSK method are typically not satisfied in many longitudinal settings. Alternative likelihood-based approaches that place more restrictions on the multinomial probabilities have proven to be substantially more difficult to implement. While the latter approaches do not require stratification and can incorporate a mixture of discrete and continuous covariates, they can be applied only in relatively simple cases.

For the most part, all of the likelihood-based approaches that have been proposed have been hampered by various combinations of the following factors. The first is the lack of a convenient joint distribution for discrete multivariate responses, with similar properties to the multivariate normal. Paradoxically, the joint distributions for discrete longitudinal data require specification of many higher-order moments despite the fact that there is, in some sense, substantially less information in the discrete than in the continuous data case. Second, with the exception of the Bahadur expansion, for many multinomial models there is no closed-form expression for the joint probabilities in terms of the model parameters. As such, there is no analog of the multivariate normal distribution for repeated categorical data that has simple and convenient properties. This makes maximization of the likelihood difficult. Third, all likelihood-based approaches face difficulties with sparseness of data once the number of repeated measures exceeds 5 or 6. Recall that a vector of repeated measures on a categorical response with $C$ categories requires specification of $C^n - 1$ multinomial probabilities. For example, with a binary response measured at 10 occasions, there are $2^{10} - 1$ (or 1023) non-redundant multinomial probabilities. So, while data on 200 subjects may be more than adequate for estimation of the marginal probabilities at each occasion, they can be wholly inadequate for estimation of the joint probabilities of the vector of responses due to the curse of dimensionality. Finally, many of these difficulties are compounded by the fact that likelihood-based estimates of the interest parameters, β, are quite sensitive to misspecification of the higher-order moments. Many of the proposed methods require that the entire joint distribution be correctly specified. For example, if the model for the mean response has been correctly specified but the model for the higher-order moments is incorrect, then the maximum likelihood estimator will fail to converge in probability to the true value of β. Although some of the computational difficulties previously mentioned can be ameliorated by faster and more powerful computers, many of the other problems reflect the curse of dimensionality and cannot be easily handled with the typical amount of data collected in many longitudinal studies.
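The growth in the number of joint probabilities is easy to verify numerically. The following is a minimal Python sketch; the helper name `n_joint_probabilities` is hypothetical and introduced only for illustration.

```python
def n_joint_probabilities(n_categories: int, n_occasions: int) -> int:
    """Number of non-redundant multinomial probabilities, C**n - 1."""
    return n_categories ** n_occasions - 1


# Binary response measured at 10 occasions: 2**10 - 1.
print(n_joint_probabilities(2, 10))  # 1023

# The count roughly doubles with every additional occasion for a binary response.
for n in (2, 4, 6, 8, 10):
    print(n, n_joint_probabilities(2, n))
```

Compared with the 200 subjects mentioned above, the 1023 cells of the joint distribution cannot all be populated, which is the sparseness problem in concrete terms.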

In the mid-1980s, remarkable advances in methodology for analyzing discrete longitudinal data were made when Liang and Zeger (1986) proposed the generalized estimating equations (GEE) approach. Because marginal models separately parameterize the model for the mean responses from the model for the within-subject association, Liang and Zeger (1986) recognized that it is possible to estimate the regression parameters in the former without making full distributional assumptions. The avoidance of distributional assumptions is potentially advantageous because, as we have discussed, there is no convenient and generally accepted specification of the joint multivariate distribution of $Y_i$ for marginal models when the responses are discrete. The appeal of the GEE approach is that it only requires specification of that part of the probability mechanism that is of scientific interest, the marginal means. By avoiding full distributional assumptions for $Y_i$, the GEE approach provided a remarkably convenient alternative to maximum likelihood estimation of multinomial models for repeated categorical data, without many of the inherent complications of the latter. Chapter 3 provides a comprehensive account of the GEE methodology.

The GEE approach advocated in Liang and Zeger (1986) was a natural extension of the quasi-likelihood approach (Wedderburn, 1974) for generalized linear models to the multivariate response setting, where an additional set of nuisance parameters for the within-subject association must be incorporated. The foundation for the GEE approach relied on the theory of optimal estimating functions developed by Godambe (1960) and Durbin (1960). Liang and Zeger (1986) highlighted how the GEE provides a unified approach to the formulation and fitting of generalized linear models to longitudinal and clustered data. They demonstrated the versatility of the GEE method in handling unbalanced data, mixtures of discrete and continuous covariates, and arbitrary patterns of missingness. Until the publication of their landmark paper (Liang and Zeger, 1986), methods for the analysis of discrete longitudinal data had lagged behind corresponding methods for continuous responses. Soon after, marginal models were being widely applied to address substantive questions about longitudinal change across a broad spectrum of disciplines. Their work also generated much additional theoretical and applied research on the use of this methodology for analyzing longitudinal data. For example, to improve upon efficiency, Prentice (1988) proposed joint estimating equations for both the main regression parameters, β, and the nuisance association parameters, α.

The essential idea behind the GEE approach is to extend quasi-likelihood methods, originally developed for a univariate response, by incorporating additional nuisance parameters for the covariance matrix of the vector of responses. For example, given a model for the pairwise correlations, the corresponding covariance matrix can be constructed as the product of the standard deviations and correlations

$$V_i = A_i^{1/2} \, \mathrm{Corr}(Y_i) \, A_i^{1/2},$$

where $A_i$ is a diagonal matrix with $\mathrm{Var}(Y_{ij}) = \phi \, v(\mu_{ij})$ along the diagonal, and $\mathrm{Corr}(Y_i)$ is a correlation matrix (here a function of α). In the GEE approach, $V_i$ is referred to as a “working” covariance matrix to distinguish it from the true underlying covariance matrix of $Y_i$. The term “working” in this context acknowledges uncertainty about the assumed model for the variances and within-subject associations. Because the GEE depend on both β and α, an iterative two-stage estimation procedure is required; this has been implemented in many widely available software packages. As noted by Crowder (1995), ambiguity concerning the definition of the working covariance matrix can, in certain cases, result in a breakdown of this estimation procedure.
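To make the construction of the working covariance concrete, the sketch below builds $V_i$ for a single subject with binary responses. It assumes, purely for illustration, the Bernoulli variance function $v(\mu) = \mu(1 - \mu)$, an exchangeable working correlation with a single parameter α, and made-up marginal means; the function name `working_covariance` is hypothetical.

```python
import math


def working_covariance(mu, alpha, phi=1.0):
    """Exchangeable working covariance V = A^{1/2} R(alpha) A^{1/2}.

    Assumes binary responses, so v(mu) = mu * (1 - mu), and a correlation
    matrix R(alpha) with ones on the diagonal and alpha elsewhere.
    """
    # Diagonal of A^{1/2}: the working standard deviations.
    sd = [math.sqrt(phi * m * (1.0 - m)) for m in mu]
    n = len(mu)
    return [
        [sd[j] * (1.0 if j == k else alpha) * sd[k] for k in range(n)]
        for j in range(n)
    ]


mu_i = [0.2, 0.5, 0.8]  # hypothetical marginal means for one subject
V_i = working_covariance(mu_i, alpha=0.3)
for row in V_i:
    print([round(x, 4) for x in row])
```

Other working correlation structures (for example, first-order autoregressive or unstructured) would only change how the off-diagonal correlations are filled in.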

In summary, the GEE approach has a number of appealing properties for estimation of the regression parameters in marginal models. First, in many longitudinal designs the GEE estimator of β is almost efficient when compared to the maximum likelihood estimator. For example, it can be shown that the GEE has a similar expression to the likelihood equations for β in a linear model for continuous responses that are assumed to have a multivariate normal distribution. The GEE also has an expression similar to the likelihood equations for β in certain models for discrete longitudinal data. As a result, for many longitudinal designs, there is relatively little loss of precision when the GEE approach is adopted as an alternative to maximum likelihood. Second, the GEE estimator has a very appealing robustness property, yielding a consistent estimator of β even if the within-subject associations among the repeated measures have been misspecified. It only requires that the model for the mean response be correct. This robustness property of GEE is important because the usual focus of a longitudinal study is on changes in the mean response. Although the GEE approach yields a consistent estimator of β under misspecification of the within-subject associations, the usual standard errors obtained under the misspecified model for the within-subject association are not valid. However, valid standard errors for the resulting estimator $\hat{\beta}$ can be obtained using the empirical or so-called sandwich estimator of $\mathrm{Cov}(\hat{\beta})$. The sandwich estimator is also robust in the sense that, with sufficiently large samples, it provides valid standard errors when the assumed model for the covariances among the repeated measures is not correct.
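The contrast between model-based and sandwich standard errors can be illustrated in the simplest possible setting. The sketch below assumes an intercept-only marginal model with identity link and an independence working correlation, so that the GEE estimate is the overall mean; the data are invented and the variable names are hypothetical. It is a toy illustration of the idea, not a general GEE implementation.

```python
import math

# Hypothetical clustered binary data: one list of repeated measures per subject.
clusters = [
    [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0],
    [0, 0, 0, 1], [1, 1, 1, 1], [0, 0, 0, 0],
]

n_total = sum(len(y) for y in clusters)

# With an intercept only, identity link and independence working correlation,
# the estimating equation sum_i sum_j (y_ij - beta) = 0 is solved by the mean.
beta_hat = sum(sum(y) for y in clusters) / n_total

# Model-based ("naive") variance: treats all observations as independent.
rss = sum((y_ij - beta_hat) ** 2 for y in clusters for y_ij in y)
phi_hat = rss / (n_total - 1)
naive_var = phi_hat / n_total

# Sandwich variance B^{-1} M B^{-1}. Here D_i is a vector of ones and
# V_i = phi * I, so the scale phi cancels and the expression reduces to
# sum_i (sum_j residual_ij)^2 / n_total^2.
meat = sum(sum(y_ij - beta_hat for y_ij in y) ** 2 for y in clusters)
sandwich_var = meat / n_total ** 2

print(beta_hat, math.sqrt(naive_var), math.sqrt(sandwich_var))
```

With these strongly clustered data the sandwich standard error is noticeably larger than the naive one, reflecting the positive within-subject association that the independence working model ignores.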

1.4.2 Generalized linear mixed models

In the previous section, we discussed how marginal models can be considered an extension of generalized linear models that directly incorporate the within-subject association among the repeated measurements. In a certain sense, marginal models account for the consequences of the correlation among the repeated measures but do not provide any explanation for its potential source. An alternative approach for accounting for the within-subject association, and one that provides a source for the within-subject association, is via the introduction of random effects in the model for the mean response. Following the same basic ideas as in linear mixed-effects models, generalized linear models can be extended to longitudinal data by allowing a subset of the regression coefficients to vary randomly from one individual to another. These models are known as generalized linear mixed (effects) models (GLMMs), and they extend in a natural way the conceptual approach represented by the linear mixed-effects models discussed in Section 1.3; see Chapter 4 for a detailed overview. In GLMMs the model for the mean response is conditional upon both measured covariates and unobserved random effects; it is the inclusion of the latter that induces correlation among the repeated responses marginally, when averaged over the distribution of the random effects. However, as we discuss later, with non-linear link functions, the introduction of random effects has important ramifications for the interpretation of the “fixed-effects” regression parameters.

The generalized linear mixed model can be formulated using the following two-part specification:

1. Given a $q \times 1$ vector of random effects $b_i$, the $Y_{ij}$ are assumed to be conditionally independent and to have exponential family distributions with conditional mean depending upon both fixed and random effects,

$$h^{-1}\{E(Y_{ij} \mid b_i)\} = X_{ij}' \beta + Z_{ij}' b_i,$$

for some known link function, $h^{-1}(\cdot)$. The conditional variance is assumed to depend on the conditional mean according to $\mathrm{Var}(Y_{ij} \mid b_i) = \phi \, v\{E(Y_{ij} \mid b_i)\}$, where $v(\cdot)$ is a known variance function and φ is a scale parameter that may be known or may need to be estimated.

2. The random effects, $b_i$, are assumed to be independent of the covariates, $X_{ij}$, and to have a multivariate normal distribution, with zero mean and $q \times q$ covariance matrix $G$.

These two components completely specify a broad class of generalized linear mixed models. In principle, the conditional independence assumption in the first component is not necessary, but is commonly made. Similarly, any multivariate distribution can be assumed for the $b_i$; in practice, however, it is common to assume that the $b_i$ have a multivariate normal distribution.

Generalized linear mixed models have their foundation in simple random-effects models for binary and count data. The early literature on random-effects models for discrete data can be traced back to the development of random compounding models that introduced random effects on the response scale. For example, Greenwood and Yule (1920) introduced the negative binomial distribution as a compound Poisson distribution for count data, while Skellam (1948) provided an early discussion of the beta-binomial distribution for binary data. The beta-binomial model can be conceptualized as a two-stage model, where in the first stage the binary responses, $Y_{i1}, \ldots, Y_{in}$, are assumed to be conditionally independent with common success probability $p_i$, where $\Pr(Y_{ij} = 1 \mid p_i) = E(Y_{ij} \mid p_i) = p_i$. In the second stage, the success probabilities, $p_1, \ldots, p_N$, are assumed to be independently distributed with a beta density. The mean of the success probabilities can be related to covariates via an appropriate link function, such as a logit or probit link function. Although the beta-binomial model accounts for overdispersion relative to the usual binomial variance, the model is somewhat more natural for clustered rather than longitudinal data. As a result, the model has been used in a wide variety of different clustered data applications (e.g., Chatfield and Goodhardt, 1970; Griffiths, 1973; Williams, 1975; Kupper and Haseman, 1978; Crowder, 1978, 1979; Otake and Prentice, 1984; Aerts et al., 2002).

The main feature of the beta-binomial model that has limited its usefulness for analyzing longitudinal data is that it produces the same marginal distribution at each measurement occasion. While this may not be so problematic in certain clustered data settings (e.g., in study designs where $X_{i1} = X_{i2} = \cdots = X_{in}$), in a longitudinal study where interest is primarily in changes in the marginal means over time, this restriction on the marginal distributions is very unappealing. Nonetheless, the beta-binomial and other random compounding models motivated the later development of more versatile random-effects models. Recall that in the beta-binomial model, it is assumed that success probabilities vary randomly about a mean and the latter can be related to covariates via an appropriate link function, such as a logit link function. In contrast to this formulation, Pierce and Sands (1975) proposed an alternative model where the logit of $p_i$ is assumed to vary about an expectation given by $X_{ij}' \beta$,

$$\mathrm{logit}\{E(Y_{ij} \mid b_i)\} = X_{ij}' \beta + b_i,$$

where $b_i$ has a normal distribution with zero mean and constant variance. The appealing feature of the model proposed by Pierce and Sands (1975) is that the fixed and random effects are combined together on the same logistic scale. This model is often referred to as the simple “logit-normal model” and is very similar in spirit to the random intercept model for continuous outcomes discussed in Section 1.2. Although this model was remarkably simple, it proved to be difficult to fit at the time because maximum likelihood estimation required maximization of the marginal likelihood, averaged over the distribution of the random effect. This required integration, and no analytic solutions were available. The fact that the integral cannot be evaluated in a closed form limited the application of this model.
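The integral in question can be approximated numerically for a single random intercept. The following sketch evaluates one subject's contribution to the marginal likelihood of the logit-normal model using a simple midpoint rule; the linear predictor values, the random-effect standard deviation, and the function names are assumptions made for illustration, and practical software would typically use Gaussian quadrature instead.

```python
import math
from itertools import product


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_likelihood(y, eta, sigma, n_grid=2001, width=8.0):
    """Approximate the integral over b of
    prod_j Pr(Y_j = y_j | b) * N(b; 0, sigma^2),
    where logit Pr(Y_j = 1 | b) = eta_j + b, by a midpoint rule
    on the interval [-width * sigma, width * sigma]."""
    lo = -width * sigma
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        b = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        cond = 1.0
        for y_j, eta_j in zip(y, eta):
            p = expit(eta_j + b)
            cond *= p if y_j == 1 else 1.0 - p
        total += cond * density * step
    return total


eta_i = [-0.5, 0.0, 0.5]  # hypothetical values of X_ij' beta at three occasions
sigma_b = 1.5             # hypothetical standard deviation of the random intercept

print(marginal_likelihood([1, 1, 0], eta_i, sigma_b))

# Sanity check: the marginal probabilities of all 2**3 response patterns sum to 1.
check = sum(marginal_likelihood(list(pattern), eta_i, sigma_b)
            for pattern in product((0, 1), repeat=3))
print(check)
```

Maximum likelihood estimation would repeat this calculation for every subject at each trial value of the parameters, which is what made the model costly to fit with the computing resources of the time.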

In closely related work, Ashford and Sowden (1970) proposed a very similar model, except with probit rather than logit link function. Interestingly, Ashford and Sowden’s (1970) model with random intercept and probit link function was equivalent to the equicorrelated latent-variable model discussed in Section 1.4.1, leading to identical inferences provided the correlation is positive. Despite the fact that maximum likelihood estimation for even the simple logit-normal model was computationally demanding with the computer resources available at the time, Korn and Whittemore (1979) proposed a far more ambitious version of the model, where

$$\mathrm{logit}\{E(Y_{ij} \mid b_i)\} = X_{ij}' \beta + Z_{ij}' b_i,$$

with $Z_{ij} = X_{ij}$. Although their model was very general and avoided some of the obvious drawbacks of the simple logit-normal model, it was difficult to fit and required a very long sequence of repeated measures on each subject.

From an historical perspective, the papers by Ashford and Sowden (1970), Pierce and Sands (1975), and Korn and Whittemore (1979) laid the conceptual foundations for generalized linear mixed models; much of the work that followed focused on the thorny problem of estimation. In GLMMs the marginal likelihood is used as the basis for inferences for the fixed-effects parameters, complemented with empirical Bayes estimation for the random effects. In general, evaluation and maximization of the marginal likelihood for GLMMs requires integration over the distribution of the random effects. While this is, strictly speaking, true for the linear mixed-effects model as well, there the integration can be done analytically, so effectively a closed form for the marginal likelihood function arises, in which case the application of maximum or restricted maximum likelihood is straightforward. In the absence of an analytical solution, and because high-dimensional numerical integration can be very trying, a variety of approaches has been suggested for tackling this problem.

Because no simple analytic solutions were available, Stiratelli, Laird, and Ware (1984) proposed an approximate method of estimation for the logit-normal model, based on empirical Bayes ideas, that circumvented the need for numerical integration. Specifically, they avoided the need for numerical integration by approximating the integrands with simple expansions whose integrals have closed forms. The paper by Stiratelli, Laird, and Ware (1984) led to the development of a general approach for fitting GLMMs, known as penalized quasi-likelihood (PQL). Various authors (e.g., Schall, 1991; Breslow and Clayton, 1993; Wolfinger, 1993) motivated PQL as a Laplace approximation to the marginal likelihood for GLMMs. Despite the generality of this method, and its implementation in a variety of commercially available software packages, the PQL method can often yield quite biased estimators of the variance components, which in turn leads to biased estimators of β, especially for longitudinal binary data. This motivated research on bias corrections (e.g., Breslow and Lin, 1995) and on more accurate approximations based on higher-order Laplace approximations (e.g., Raudenbush, Yang, and Yosef, 2000). In general, the inclusion of higher-order terms for PQL has been shown to improve estimation. Breslow and Clayton (1993) also considered an alternative approach, related to PQL, known as marginal quasi-likelihood (MQL). MQL differs from PQL by being based on an expansion around the current estimates of the fixed effects and around $b_i = 0$. In general, MQL yields severely biased estimators of the variance components, providing a good approximation only when the variance of the random effects is relatively small.

There has also been much recent research on alternative methods, including approaches based on numerical integration (e.g., adaptive Gaussian quadrature) and Markov chain Monte Carlo algorithms. In particular, adaptive Gaussian quadrature, with the numerical integration centered around the empirical Bayes estimates of the random effects, permits maximization of the marginal likelihood with any desired degree of accuracy (e.g., Anderson and Aitkin, 1985; Hedeker and Gibbons, 1994, 1996). Adaptive Gaussian quadrature is especially appealing for longitudinal data where the dimension of the random effects is often relatively low. Monte Carlo approaches to integration, for example Monte Carlo EM (McCulloch, 1997; Booth and Hobert, 1999) and Monte Carlo Newton–Raphson algorithms (Kuk and Cheng, 1997), have been proposed. The hierarchical formulation of GLMMs also makes Bayesian approaches quite appealing. For example, Zeger and Karim (1991) have proposed the use of Monte Carlo integration, via Gibbs sampling, to calculate the posterior distribution.

The normality assumption for the random effects in GLMMs leads, in general, to intractable likelihood functions, except in the case of the linear mixed model for continuous data. This is because the normal random-effects distribution is conjugate to the normal distribution for the outcome, conditional on the random effects. Lee and Nelder (1996, 2001, 2003) have extended this idea and propose using conjugate random-effects distributions in contexts other than the classical normal linear model.

Finally, it is worth emphasizing some differences between GLMMs and the marginal models discussed in Section 1.4.1. Although the introduction of random effects can simply be thought of as a means of accounting for and explaining the potential sources of the correlation among longitudinal responses, it has important implications for the interpretation of the regression coefficients in GLMMs. The fixed effects, β, have somewhat different interpretations than the corresponding regression parameters in marginal models. In GLMMs the regression parameters have “subject-specific” interpretations. They represent the effects of covariates on changes in an individual’s possibly transformed mean response per unit change in the covariate, while controlling for all other covariates and the random effects. This interpretation for β can be better appreciated by considering the following example of a simple logit-normal model given by

$$\mathrm{logit}\{E(Y_{ij} \mid b_i)\} = X_{ij}' \beta + b_i,$$

where $b_i$ is assumed to have a univariate normal distribution with zero mean and constant variance. The interpretation of a component of β, say $\beta_k$, is in terms of changes in any given individual’s log odds of response for a unit change in the corresponding covariate, say $X_{ijk}$. Because $\beta_k$ has an interpretation that depends upon holding $b_i$ fixed, it is referred to as a subject-specific effect. Note that this subject-specific interpretation of $\beta_k$ is far more natural for a covariate that varies within an individual (i.e., a time-varying covariate). With a time-invariant covariate, problems of interpretation arise because a change in the value of the covariate also requires a change in the index $i$ of $X_{ijk}$ to, say, $X_{i'jk}$ (for $i \neq i'$). However, $\beta_k$ then becomes confounded with differences between $b_i$ and $b_{i'}$. One way around this is to think of the population, defined by all subjects sharing the same value of the random effect $b_i$. The effect of a covariate is then conditional on changing $X_{ijk}$ within the population so defined.

For the special case of linear models, where an identity link function has been adopted, the fixed effects in the model for the conditional means,

$$E(Y_{ij} \mid b_i) = X_{ij}' \beta + Z_{ij}' b_i,$$

also happen to have interpretation in terms of the population means because

$$E(Y_{ij}) = X_{ij}' \beta$$

when averaged over the distribution of the random effects. However, in general, for the non-linear link functions usually adopted for discrete data, this relationship no longer holds, and if

$$h^{-1}\{E(Y_{ij} \mid b_i)\} = X_{ij}' \beta + Z_{ij}' b_i,$$

then, in general, $h^{-1}\{E(Y_{ij})\} \neq X_{ij}' \beta$.
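A small numerical check illustrates why the two sets of parameters differ under a logit link. Assuming a random-intercept logistic model with hypothetical values for the intercept, slope, and random-effect standard deviation, the sketch averages the conditional probabilities over the random-effects distribution (with a simple midpoint rule) and compares the resulting population-averaged log odds ratio with the subject-specific one.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_mean(eta, sigma, n_grid=4001, width=8.0):
    """E{expit(eta + b)} for b ~ N(0, sigma^2), by a midpoint rule."""
    lo = -width * sigma
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        b = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += expit(eta + b) * density * step
    return total


# Hypothetical subject-specific model: logit E(Y | b) = beta0 + beta1 * x + b.
beta0, beta1, sigma = -1.0, 1.0, 2.0

cond_effect = beta1  # subject-specific log odds ratio for a unit change in x
marg_effect = (logit(marginal_mean(beta0 + beta1, sigma))
               - logit(marginal_mean(beta0, sigma)))

print(cond_effect, marg_effect)
```

Under these assumed values the population-averaged log odds ratio is closer to zero than the subject-specific coefficient, and the gap widens as the random-effect variance grows.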

1.4.3 Conditional and transition models

There is a third way in which generalized linear models can be extended to handle longitudinal data. This is accomplished by modeling the mean and time dependence simultaneously via conditioning an outcome on other outcomes or on a subset of other outcomes (see, for example, Molenberghs and Verbeke, 2005, Part III). A particular case is given by so-called transition, or Markov, models. Transition models are appealing due to the sequential nature of longitudinal data. In transition models, the conditional distribution of each response is expressed as an explicit function of the past responses and the covariates. Transition models can be considered conditional models in the sense of modeling the conditional distribution of the response at any occasion given the previous responses and the covariates. The dependence among the repeated measures is thought of as arising due to past values of the response influencing the present observation. In transition models, it is assumed that

$$h^{-1}\{E(Y_{ij} \mid H_{ij})\} = X_{ij}' \beta + \sum_{r=1}^{s} \alpha_r f_r(H_{ij}), \qquad (1.1)$$

where $H_{ij} = (Y_{i1}, \ldots, Y_{i,j-1})$ denotes the history of the past responses at the $j$th occasion and the $f_r(H_{ij})$ are known functions of that history. For example, a first-order autoregressive model, AR(1), is obtained when

$$\sum_{r=1}^{s} \alpha_r f_r(H_{ij}) = \alpha_1 f_1(H_{ij}) = \alpha_1 Y_{i,j-1}.$$

A more general autoregressive model of order $s$, say, AR($s$), is obtained by incorporating the $s$ previously generated values of the response. In general, models where the conditional distribution of the response at the $j$th occasion, given $H_{ij}$, depends only on the $s$ immediately prior responses are known as Markov models of order $s$. When the response variable is discrete, these models are referred to as Markov chain models. With discrete data and non-identity link functions, it may be necessary to transform the history of the past responses, $H_{ij}$, in a manner similar to the transformation of the conditional mean, for example, $f_r(H_{ij}) = h^{-1}(H_{ij})$. Also, upon closer examination, the model given by (1.1) appears to make a strong assumption that the effects of the covariates are the same regardless of the actual history of past responses. However, this assumption can be relaxed by including interactions between relevant covariates and $H_{ij}$.

There is an extensive history to the use of Markov chains to model equally spaced discrete longitudinal data with a finite number of states or categories (e.g., Anderson and Goodman, 1957; Cox, 1958; Billingsley, 1961). In the simplest of models for longitudinal data, a first-order Markov chain, the transition probabilities are assumed to be the same for each time interval. The resulting Markov chain can then be described in terms of the initial state and the set of transition probabilities. The transition probabilities are the conditional probabilities of going into each state, given the immediately preceding state. In a first-order Markov chain, there is dependence on the immediately preceding state but not on earlier outcomes. In the more general model given by (1.1), higher-order sequential dependence can be incorporated, with dependence on more than the immediately preceding state, and the transition probabilities can be allowed to vary over time. Moreover, the time dependence need not necessarily be a linear function of the history. Among others, Cox (1972), Korn and Whittemore (1979), Zeger, Liang, and Self (1985), and Ware, Lipsitz, and Speizer (1988) discuss transition models applicable to longitudinal data.
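For the simplest case described above, a first-order chain with transition probabilities that are the same for each time interval, the transition probabilities can be estimated by the observed proportions of each type of transition. The sketch below does this for a handful of invented binary sequences; the function name and the data are hypothetical.

```python
from collections import Counter


def transition_probabilities(sequences, states):
    """Estimate first-order transition probabilities Pr(Y_j = b | Y_{j-1} = a)
    by row-normalizing the counts of observed transitions a -> b."""
    counts = Counter()
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[(prev, curr)] += 1
    probs = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        probs[a] = {
            b: (counts[(a, b)] / row_total if row_total else float("nan"))
            for b in states
        }
    return probs


# Hypothetical binary response sequences for four subjects.
sequences = [[0, 0, 1, 1], [1, 1, 1, 0], [0, 1, 1, 1], [0, 0, 0, 1]]
probs = transition_probabilities(sequences, states=(0, 1))
print(probs)
```

Allowing the transition probabilities to depend on covariates or to vary over time leads to the regression formulation in (1.1) rather than a single table of proportions.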

One appealing aspect of transition models is that the joint distribution of the vector of responses can be expressed as the product of a sequence of conditional distributions, that is,

$$f(Y_{i1}, \ldots, Y_{in}) = \prod_{j=1}^{n} f(Y_{ij} \mid H_{ij}).$$

Strictly speaking, for an $s$th-order Markov chain model, this is a conditional likelihood, given a set of $s$ initial values. In the specification of the transition model given by (1.1), initial values of the responses are assumed to be incorporated into the covariates. In general, the unconditional distribution of the initial responses cannot be determined from the conditional distributions specified by (1.1). There are two ways to handle this initial value problem. The first is to treat the initial responses as a set of given constants rather than random variables and base estimation on the conditional likelihood, ignoring the contribution of the unconditional distribution of the initial responses. Maximization of the resulting likelihood is relatively straightforward; indeed, standard software for univariate generalized linear models can be used when $f_r(H_{ij})$ does not depend on β. Alternatively, the initial responses can be assigned the equilibrium distribution of the sequence of longitudinal responses. In general, the latter will yield more efficient estimates of the regression coefficient, β.
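The first option, conditioning on the initial response, can be sketched for a binary first-order transition model with a logit link. The linear predictor values, the dependence parameter `alpha1`, and the function names below are assumptions for illustration only.

```python
import math
from itertools import product


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def conditional_likelihood(y, eta, alpha1):
    """Product over occasions j >= 2 of Pr(Y_j = y_j | Y_{j-1} = y_{j-1}),
    treating the first response y[0] as a fixed constant, with
    logit Pr(Y_j = 1 | Y_{j-1}) = eta[j] + alpha1 * y[j-1]."""
    lik = 1.0
    for j in range(1, len(y)):
        p = expit(eta[j] + alpha1 * y[j - 1])
        lik *= p if y[j] == 1 else 1.0 - p
    return lik


eta_i = [0.0, -0.5, 0.2, 0.4]  # hypothetical X_ij' beta at four occasions
alpha1 = 1.2                   # hypothetical first-order dependence parameter

print(conditional_likelihood((1, 1, 0, 1), eta_i, alpha1))

# Given the initial response, the conditional probabilities of all possible
# continuations of the sequence sum to 1.
y_first = 1
total = sum(conditional_likelihood((y_first,) + rest, eta_i, alpha1)
            for rest in product((0, 1), repeat=3))
print(total)
```

Because each factor is an ordinary logistic regression probability with the lagged response as an extra covariate, the product can be maximized with standard logistic regression software, as noted above.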

Although Markov and autoregressive models have a long and extensive history of use for the analyses of time series data, their application to longitudinal data has been somewhat more limited. There are a number of features of transition models that limit their usefulness for the analysis of longitudinal data. In general, transition models have been developed for repeated measures that are equally separated in time; these models are more difficult to apply when there are missing data, mistimed measurements, and non-equidistant intervals between measurement occasions. In addition, estimation of the regression parameters β is very sensitive to assumptions concerning the time dependence; moreover, the interpretation of β changes with the order of the serial dependence. Finally, in many longitudinal studies β is not the usual target of inference because conditioning on the history of past responses may lead to attenuation of the effects of covariates of interest. That is, when a covariate is expected to influence the mean response at all occasions, its effect may be somewhat diminished if there is conditioning on the past history of the responses.

1.5 Concluding remarks

In the preceding sections, we traced the development of a very general and versatile class of linear mixed-effects models for longitudinal data when the response is continuous. These models can handle issues of unbalanced data, due to either mistimed measurement or missing data, time-varying and time-invariant covariates, and modeling of the covariance, in a flexible way. Linear mixed-effects models rely on assumptions of multivariate normality, and likelihood-based inferences for both the fixed and random effects are relatively straightforward. In contrast, when the longitudinal response is discrete, we have seen that there is more than one way to extend generalized linear models to the longitudinal setting. This has led to the development of “marginal” and “conditional” models for non-Gaussian longitudinal data; in the former, there is no conditioning on past responses or random effects, while in the latter there is conditioning on either the response history or a set of random effects. Although this classification is useful for pedagogical purposes, it should be recognized that this distinction between classes of models is somewhat artificial and is made, in part, to emphasize certain aspects of interpretation that arise when analyzing discrete longitudinal data. In contrast to the situation for linear models, conditioning on past responses or random effects has important implications for the regression parameters in models for discrete longitudinal data. Not only does their interpretation change in all cases, but the parameter estimates also differ because the underlying estimands cannot be compared directly. However, it is possible to combine features of these models, thereby blurring the distinctions. For example, Conaway (1989, 1990) has suggested extending mixed-effects models to include lagged responses, while, more recently, Heagerty and Zeger (2000) have developed “conditionally specified” marginal model formulations.

In general, we have seen that likelihood-based approaches are somewhat more difficult to formulate in the non-Gaussian data setting than is the case with continuous responses. This has led to various avenues of research where more tractable approximations have been developed (e.g., PQL methods) and where likelihood-based approaches have been abandoned altogether in favor of semi-parametric methods (e.g., GEE approaches).

Our review of the developments of regression models for longitudinal data has focused exclusively on extensions of generalized linear models. Limitations of space have precluded a discussion of non-linear models (i.e., models where the relationship between the mean and covariates is non-linear in the regression parameters) for longitudinal data; see Chapter 5 of this volume and Davidian and Giltinan (1995) for a comprehensive and unified treatment of this topic. Perhaps not surprisingly, the development of non-linear regression models for longitudinal data has faced many of the challenges and issues that were discussed in Section 1.4.2.

We conclude this chapter by noting that in almost every discipline there is increased awareness of the importance of longitudinal studies for studying change over time and the factors that influence change. This has led to a steady growth in the availability of longitudinal data, often arising from relatively complex study designs. The analysis of longitudinal data continues to pose many interesting methodological challenges and is likely to do so for the foreseeable future. The goal of the remaining chapters in this book is to highlight the current state of the art of longitudinal data analysis and to provide a glimpse of future directions.

Acknowledgments

This work was supported by the following grants from the U.S. National Institutes of Health: R01GM029745 and R01MH054693.

References

Aerts, M., Geys, H., Molenberghs, G., and Ryan, L. (2002). Topics in Modelling of Clustered Data. Boca Raton, FL: Chapman & Hall/CRC.

Airy, G. B. (1861). On the Algebraical and Numerical Theory of Errors of Observation and the Combination of Observations. London: Macmillan.

Altham, P. M. E. (1978). Two generalizations of the binomial distribution. Applied Statistics 27, 162–167.

Trang 38

REFERENCES 23Anderson, D A and Aitkin, M (1985) Variance components models with binary response: Inter-

viewer variability Journal of the Royal Statistical Society, Series B 47, 203–210.

Anderson, T W and Goodman, L A (1957) Statistical inference about Markov chains Annals

of Mathematical Statistics 28, 89–110.

Ashford, J R and Sowden, R R (1970) Multivariate probit analysis Biometrics 26, 535–546.

Bahadur, R R (1961) A representation of the joint distribution of responses to n dichotomous items In H Solomon (ed.), Studies in Item Analysis and Prediction, pp 158–168 Palo Alto,

CA: Stanford University Press

Becker, M P and Balagtas, C C (1993) Marginal modeling of binary cross-over data Biometrics

49, 997–1009.

Bellman, R E (1961) Adaptive Control Processes Princeton, NJ: Princeton University Press.

Billingsley, P (1961) Statistical methods in Markov chains Annals of Mathematical Statistics 32,

12–40

Booth, J G and Hobert, J P (1999) Maximizing generalized linear mixed model likelihoods with

an automated Monte Carlo EM algorithm Journal of the Royal Statistical Society, Series B 61,

265–285

Box, G E P (1950) Problems in the analysis of growth and wear data Biometrics 6, 362–389.

Breslow, N E and Clayton, D G (1993) Approximate inference in generalized linear mixed

models Journal of the American Statistical Association 88, 9–25.

Breslow, N E and Lin, X (1995) Bias correction in generalized linear models with a single

com-ponent of dispersion Biometrika 82, 81–91.

Chatfield, C and Goodhardt, G J (1970) The beta-binomial model for consumer purchasing

behaviour Applied Statistics 19, 240–250.

Cole, J W L and Grizzle, J E (1966) Applications of multivariate analysis of variance to repeated

measurements experiments Biometrics 22, 810–828.

Conaway, M (1989) Conditional likelihood methods for repeated categorical responses Journal of

the American Statistical Association 84, 53–62.

Conaway, M (1990) A random effects model for binary data Biometrics 46, 317–328.

Cox, D R (1958) The regression analysis of binary sequences (with discussion) Journal of the

Royal Statistical Society, Series B 20, 215–242

Cox, D R (1972) The analysis of multivariate binary data Applied Statistics 21, 113–120

Crowder, M J (1978) Beta-binomial Anova for proportions Applied Statistics 27, 34–37.

Crowder, M J (1979) Inference about the intra-class correlation coefficient in the beta-binomial

ANOVA for proportions Journal of the Royal Statistical Society, Series B 41, 230–234.

Crowder, M J (1995) On the use of a working correlation matrix in using generalised linear models

for repeated measures Biometrika 82, 407–410.

Dale, J R (1984) Local versus global association for bivariate ordered responses Biometrika 71,

507–514

Danford, M B., Hughes, H M., and McNee, R C (1960) On the analysis of repeated measurements

experiments Biometrics 16, 547–565.

Davidian, M and Giltinan, D M (1995) Nonlinear Models for Repeated Measurement Data London: Chapman & Hall

Dempster, A P., Laird, N M., and Rubin, D B (1977) Maximum likelihood from incomplete data

via the EM algorithm (with discussion) Journal of the Royal Statistical Society, Series B 39,

1–38

Durbin, J (1960) Estimation of parameters in time-series regression models Biometrika 47, 139–153

Ekholm, A (1991) Fitting regression models to a multivariate binary response In G Rosenqvist, K Juselius, K Nordström, and J Palmgren (eds.), A Spectrum of Statistical Thought: Essays in Statistical Theory, Economics, and Population Genetics in Honour of Johan Fellman, pp. 19–32 Helsingfors: Swedish School of Economics and Business Administration

Ekholm, A., Smith, P W F., and McDonald, J W (1995) Marginal regression analysis of a

multivariate binary response Biometrika 82, 847–854.

Engle, R F., Hendry, D F., and Richard, J F (1983) Exogeneity Econometrica 51, 277–304.

Fisher, R A (1918) The correlation between relatives on the supposition of Mendelian inheritance

Transactions of the Royal Society of Edinburgh 52, 399–433.

Fisher, R A (1921) On the probable error of a coefficient of correlation deduced from a small

sample Metron 1, 3–32.

Fisher, R A (1925) Statistical Methods for Research Workers Edinburgh: Oliver and Boyd.

Fitzmaurice, G M and Laird, N M (1993) A likelihood-based method for analysing longitudinal

binary responses Biometrika 80, 141–151.

Fitzmaurice, G M., Laird, N M., and Rotnitzky, A G (1993) Regression models for discrete

longitudinal responses (with discussion) Statistical Science 8, 248–309.

Fitzmaurice, G M., Laird, N M., and Ware, J H (2004) Applied Longitudinal Analysis Hoboken,

NJ: Wiley

Geisser, S (1963) Multivariate analysis of variance for a special covariance case Journal of the

American Statistical Association 58, 660–669.

Geisser, S and Greenhouse, S W (1958) An extension of Box’s results on the use of the F

distribution in multivariate analysis Annals of Mathematical Statistics 29, 885–891.

Glonek, G F V (1996) A class of regression models for multivariate categorical responses

Biometrika 83, 15–28.

Glonek, G F V and McCullagh, P (1995) Multivariate logistic models Journal of the Royal

Statistical Society, Series B 57, 533–546.

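Godambe, V P (1960) An optimum property of regular maximum likelihood estimation Annals of Mathematical Statistics 31, 1208–1212.

Greenwood, M and Yule, G U (1920) An inquiry into the nature of frequency distributions representative of multiple happenings with particular reference to the occurrence of multiple attacks of disease or of repeated accidents Journal of the Royal Statistical Society 83, 255–279.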

Griffiths, D A (1973) Maximum likelihood estimation for the beta-binomial distribution and an application to the household distribution of the total number of cases of a disease Biometrics 29, 637–648.

Harville, D A (1977) Maximum likelihood approaches to variance component estimation and to

related problems Journal of the American Statistical Association 72, 320–338.

Heagerty, P J and Zeger, S L (2000) Marginalized multilevel models and likelihood inference

(with discussion) Statistical Science 15, 1–26.

Hedeker, D and Gibbons, R D (1994) A random-effects ordinal regression model for multilevel

analysis Biometrics 50, 933–944.

Hedeker, D and Gibbons, R D (1996) MIXOR: A computer program for mixed-effects ordinal

regression analysis Computer Methods and Programs in Biomedicine 49, 157–176.

Henderson, C R (1963) Selection index and expected genetic advance In W D Hanson and H F Robinson (eds.), Statistical Genetics and Plant Breeding Washington, D.C.: National Academy of Sciences–National Research Council

Huynh, H and Feldt, L S (1976) Estimation of the Box correction for degrees of freedom from

sample data in the randomized block and split plot designs Journal of Educational Statistics 1,

69–82

Jennrich, R I and Schluchter, M D (1986) Unbalanced repeated-measures models with structured

covariance matrices Biometrics 42, 805–820.

Koch, G G and Reinfurt, D W (1971) The analysis of categorical data from mixed models

Biometrics 27, 157–173.

Koch, G G., Landis, J R., Freeman, J L., Freeman, D H., and Lehnen, R G (1977) A general methodology for the analysis of experiments with repeated measurement of categorical data Biometrics 33, 133–158.

Korn, E L and Whittemore, A S (1979) Methods for analyzing panel studies of acute health

effects of air pollution Biometrics 35, 795–802.

Kuk, A Y C and Cheng, Y W (1997) The Monte Carlo Newton-Raphson algorithm Journal of

Statistical Computation and Simulation 59, 233–250.

Kupper, L L and Haseman, J K (1978) The use of a correlated binomial model for the analysis

of certain toxicological experiments Biometrics 34, 69–76.

Laird, N M and Ware, J H (1982) Random effects models for longitudinal data Biometrics 38,

963–974

Lang, J B and Agresti, A (1994) Simultaneously modeling joint and marginal distributions

of multivariate categorical responses Journal of the American Statistical Association 89,

625–632

Lee, Y and Nelder, J A (1996) Hierarchical generalized linear models (with discussion) Journal

of the Royal Statistical Society, Series B 58, 619–678.

Lee, Y and Nelder, J A (2001) Hierarchical generalized linear models: a synthesis of generalized linear models, random-effect models and structured dispersions Biometrika 88, 987–1006

Lee, Y and Nelder, J A (2003) Extended-REML estimators Journal of Applied Statistics 30, 845–856

Liang, K.-Y and Zeger, S L (1986) Longitudinal data analysis using generalized linear models

Biometrika 73, 13–22.

Liang, K.-Y., Zeger, S L., and Qaqish, B (1992) Multivariate regression analyses for categorical

data (with discussion) Journal of the Royal Statistical Society, Series B 54, 2–24.

Lipsitz, S R., Laird, N M., and Harrington, D P (1990) Maximum likelihood regression methods

for paired binary data Statistics in Medicine 9, 1417–1425.

McCullagh, P and Nelder, J (1989) Generalized Linear Models, 2nd ed London: Chapman & Hall.

McCulloch, C E (1997) Maximum likelihood algorithms for generalized linear mixed models

Journal of the American Statistical Association 92, 162–170.

Molenberghs, G and Lesaffre, E (1994) Marginal modeling of correlated ordinal data using a

multivariate Plackett distribution Journal of the American Statistical Association 89, 633–644.

Molenberghs, G and Ritter, L (1996) Methods for analyzing multivariate binary data, with the

association between outcomes of interest Biometrics 52, 1121–1133.

Molenberghs, G and Verbeke, G (2005) Models for Discrete Longitudinal Data New York:

Springer

Nelder, J A and Wedderburn, R W M (1972) Generalized linear models Journal of the Royal

Statistical Society, Series A 135, 370–384.

Ochi, Y and Prentice, R L (1984) Likelihood inference in a correlated probit regression model

Biometrika 71, 531–543.

