
Joint modelling of longitudinal data and survival data


Joint modelling of longitudinal and survival data in Stata

Prof Paul C Lambert1,2 &

1 Department of Health Sciences, University of Leicester, UK

2 Department of Medical Epidemiology and Biostatistics,

Karolinska Institutet, Stockholm, Sweden


11.00-11.30 Welcome and introduction
11.30-12.30 Introduction to survival analysis and longitudinal analysis
9.00-9.15 Lab review of day 1
9.15-10.15 Joint models of longitudinal and survival data
15.30-16.00 Wrap-up session - further topics
16.00 Tea/Coffee and farewell


What is JM? & Terminology

• 2 broad inter-linked processes:
– Biomarker process (longitudinal) [mixed] model
– Time to clinical outcome process (survival/time-to-event) model
• Focus may be …
– Estimating biomarker profile/trajectory allowing for informative dropout, e.g. death
– Estimating relationship between underlying [adjusting for measurement error] biomarker profile/trajectory and clinical outcome


What does JM add? – 2

[Figure: longitudinal prediction of logb (including BLUPs) against follow-up time (years); Panel 2]

Why the need for JM? – 1

• Technology is very rapidly evolving … more

biomarkers are being used/collected …

• e-Health agenda means that more routinely

collected biomarker data is being linked to

outcome data

• For example, CPRD now links primary care

records with Hospital Episode Statistics (HES)

data, cancer registry and mortality data


• Clinical (and health policy) decision making is not now just about who will (or will not) benefit from a particular treatment, e.g. Cetuximab, bevacizumab and panitumumab for the treatment of metastatic colorectal cancer after first-line chemotherapy [NICE TA
medicine – try patient on a treatment and see if they “respond” or not, BUT …

Why the need for JM? – 3

• requires quick, reliable and valid (i.e. linked to clinical outcomes) surrogates that can be monitored repeatedly and routinely, e.g. biomarkers – so patients can stop asap if not responding
• For example, PSA-defined response in previously treated metastatic prostate cancer [NICE TA 259]


– Obesity reduction/physical activity – CVD

Digested Technology (IT)

BUT … back to today (and tomorrow)!

• Survival models

• Longitudinal models

• Survival models with time-varying covariates

• 2-stage approaches to Joint Models

• Fully Joint Models

sub-models using association structures

• Prediction


Lecture 2: Introduction to survival analysis

and longitudinal analysis

Karolinska Institutet, Stockholm, Sweden

paul.lambert@le.ac.uk


Introduction

Survival Data

Longitudinal Data

University of Leicester Joint Modelling in Stata 22nd-23rd April 2015 2 / 49


• This course is essentially about simultaneously fitting a survival model and a longitudinal model
• First, we will review key features of both of these types of model separately
• In particular we will discuss their use in Stata
consider both outcomes simultaneously

Clinical example

• 312 patients with primary biliary cirrhosis
• Cirrhosis is a slowly progressing disease in which healthy liver tissue is replaced with scar tissue, eventually preventing the liver from functioning properly
liver function
died
Research question: How does serum bilirubin change over time, and are those changes associated with survival?
In this session we will consider the survival model and longitudinal model separately


Survival Data

• Interest in time to an event
  – Time from diagnosis to death
  – Time from randomisation to disease progression
  – Time from hospital admission to discharge
• Unlikely that all subjects will have the event before end of study. We have censored data
but not when their time of death is

Schematic Graph with censoring


The hazard function

• h(t) gives the event rate as a function of t
• Note it is the event rate conditional on still being at risk
• Useful for understanding natural history of disease
• Many survival models estimate (relative) differences in hazard rates
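For reference, the "event rate conditional on still being at risk" is usually written as below. These are the standard textbook definitions rather than anything taken from the slides; T denotes the event time, f(t) its density and S(t) the survival function.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)},
\qquad
S(t) = \exp\left( -\int_0^t h(u)\, du \right)
```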


Example of a hazard rate 1[1]

[Figure: estimated hazard rates by group. Surviving caption text: "… diagnosis to delivery, estimated from an unadjusted flexible parametric survival model. The curves are only plotted until the last event time for each group."]

Example of a hazard rate 2[2]

[Figure: rate of first distant metastasis by age group against time since diagnosis (years)]
• Rate of first distant metastasis: women with first invasive breast cancer

Likelihood for survival data
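The body of this slide did not survive extraction. As a hedged reminder of the standard result for right-censored data, each subject contributes d·log h(t) + log S(t) to the log-likelihood: an observed event adds the log hazard at its event time, while a censored subject contributes only the log survival. The sketch below is in Python rather than Stata so that it runs without the course data; it assumes the Weibull form S(t) = exp(−λt^γ) used in this course, and the three observations and parameter values are made up.

```python
import math

def weibull_loglik(times, events, lam, gamma):
    """Log-likelihood for right-censored data under a Weibull model.

    Each subject contributes d * log h(t) + log S(t), where d is 1 for an
    observed event and 0 for a censored observation.
    """
    total = 0.0
    for t, d in zip(times, events):
        log_h = math.log(lam) + math.log(gamma) + (gamma - 1.0) * math.log(t)
        log_s = -lam * t ** gamma
        total += d * log_h + log_s
    return total

# Hypothetical data: three subjects, the second one is censored (d = 0).
times = [2.0, 5.0, 1.5]
events = [1, 0, 1]
ll = weibull_loglik(times, events, lam=0.1, gamma=1.0)
print(f"log-likelihood = {ll:.4f}")
```

With gamma fixed at 1 the Weibull reduces to the exponential model, which makes the result easy to check by hand: two events each add log(0.1), and all three subjects add −0.1 times their follow-up time.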


Why not the Cox model?

our understanding of the disease process
• Under proportional hazards we get (almost) identical estimates of the hazard ratios if fitting a reasonable model (see Rutherford et al. [3])
• Easier for (out of sample) predictions
• See our work on flexible parametric models for further discussion of this [4, 5]

Using stset in Stata

• In Stata you declare the structure of your survival data using stset. This creates some internal variables
internal variables
• For example, sts graph, stcox, streg, stjm
• With the bilirubin data we have survival time recorded in
• The key internal variables created after using stset are

Example of stset

use pbc_baseline
(Example dataset for baseline analysis)
stset stime, failure(died==1)
     failure event:  died == 1
obs. time interval:  (0, stime]
 exit on or before:  failure
        312  total observations
          0  exclusions
        312  observations remaining, representing
        140  failures in single-record/single-failure data
   2000.307  total analysis time at risk and under observation
             last observed exit t = 14.30566
list id stime died _t0 _t _d _st in 1/5, noobs

Kaplan-Meier plot by treatment (sts graph)

sts graph, by(trt) risktable

[Figure: Kaplan-Meier survival estimates by treatment group, with risk table]

Kaplan-Meier plot by bilirubin (grouped)

[Figure: Kaplan-Meier survival estimates by grouped bilirubin]

Parametric models

form. For example, the Weibull distribution
$S(t) = \exp(-\lambda t^{\gamma}), \quad h(t) = \lambda \gamma t^{\gamma - 1}, \quad f(t) = \lambda \gamma t^{\gamma - 1} \exp(-\lambda t^{\gamma})$
• Often standard parametric distributions are not flexible enough to capture the shape of the underlying hazard function, e.g. the Weibull hazard is monotonic
splines to model the baseline hazard
• In this course we will restrict attention to the Weibull survival model, but Michael has programmed a number of distributions including the use of splines [6]
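The three Weibull functions are tied together by f(t) = h(t)S(t), and the hazard can only move in one direction over time. A minimal Python sketch of both points follows (Python rather than Stata so it runs without the course data); the values of λ and γ are hypothetical and chosen only for illustration.

```python
import math

def weibull_survival(t, lam, gamma):
    """S(t) = exp(-lam * t**gamma)."""
    return math.exp(-lam * t ** gamma)

def weibull_hazard(t, lam, gamma):
    """h(t) = lam * gamma * t**(gamma - 1)."""
    return lam * gamma * t ** (gamma - 1.0)

def weibull_density(t, lam, gamma):
    """f(t) = h(t) * S(t)."""
    return weibull_hazard(t, lam, gamma) * weibull_survival(t, lam, gamma)

lam, gamma = 0.1, 1.5   # hypothetical values; gamma > 1 gives a rising hazard
grid = [0.5, 1.0, 2.0, 4.0, 8.0]
hazards = [weibull_hazard(t, lam, gamma) for t in grid]

# A Weibull hazard is monotonic: with gamma > 1 it increases at every step.
is_increasing = all(a < b for a, b in zip(hazards, hazards[1:]))
print("hazard increasing over the grid:", is_increasing)
```

Setting gamma below 1 would give a hazard that always decreases, and gamma equal to 1 a constant hazard; no choice of parameters produces a hazard that rises and then falls, which is the limitation the slide points to.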


Proportional hazards models

the proportional hazards model
$h_i(t) = h_0(t) \exp(\phi^T v_i)$
• $h_i(t)$ is the hazard for the $i$th subject
• $h_0(t)$ is the baseline hazard (all covariates equal zero)
• $v_i$ is a vector of baseline covariates for the $i$th subject
• $\phi$ is a vector of parameters (log hazard ratios)
• With a Weibull model, $h_0(t) = \lambda \gamma t^{\gamma - 1}$

The Cox Model

$h_i(t) = h_0(t) \exp(\phi^T v_i)$
• However, when using partial likelihood, $h_0(t)$ is not directly estimated
• Thus relative effects (hazard ratios) are estimated, but not the baseline hazard
• We are interested in various predictions after fitting a model. It is far easier to do this with a parametric framework [4]

Fitting a Cox model

stcox trt logb, nolog noshow

Cox regression Breslow method for ties

Time at risk = 2000.30665


Fitting a Weibull model

streg trt logb, dist(weibull) nolog noshow

Weibull regression log relative-hazard form

Time at risk = 2000.30665

• The HR for trt is 1.10. This means that the mortality rate for those on treatment is 10% greater than for those on placebo.
• As this is a proportional hazards model, the 10% relative increase is assumed to be the same at all points in follow-up time (e.g. 1 month, 6 months, 2 years, 5 years).
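The constant-ratio assumption can be seen directly from the Weibull proportional hazards form h(t | x) = λγt^(γ−1) exp(φx). The Python sketch below is illustrative only: the hazard ratio of 1.10 is the value quoted above, while the baseline parameters λ and γ are hypothetical, not the fitted values from the course output.

```python
import math

def weibull_ph_hazard(t, lam, gamma, log_hr, x):
    """h(t | x) = lam * gamma * t**(gamma - 1) * exp(log_hr * x)."""
    return lam * gamma * t ** (gamma - 1.0) * math.exp(log_hr * x)

lam, gamma = 0.05, 1.2      # hypothetical baseline parameters
log_hr = math.log(1.10)     # hazard ratio for trt quoted on the slide

ratios = []
for t in (1 / 12, 0.5, 2.0, 5.0):   # 1 month, 6 months, 2 years, 5 years
    treated = weibull_ph_hazard(t, lam, gamma, log_hr, x=1)
    placebo = weibull_ph_hazard(t, lam, gamma, log_hr, x=0)
    ratios.append(treated / placebo)
    print(f"t = {t:5.2f} years  hazard ratio = {treated / placebo:.2f}")
```

The two hazards both change with time, but the baseline term cancels in the ratio, so the ratio is exp(φ) at every time point regardless of the baseline parameters chosen.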


Interpreting the effect of (log) bilirubin

• The interpretation of bilirubin is more complicated. For every unit increase in log bilirubin the mortality rate increases by a factor of 2.83
• In order to understand, define a reference point for (log) bilirubin. We will take the median value of 1.35 [units?]
• We can predict the hazard ratio relative to this point
partpred hr_logb, for(logb) ref(logb `=ln(1.35)') eform ci(hr_logb_lci hr_logb_uci)
• We can then plot against log bilirubin or bilirubin
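The arithmetic behind a hazard ratio relative to a reference value can be checked by hand. The sketch below is plain Python, not the partpred command, and it ignores confidence intervals. It uses only the two numbers quoted on the slide (a hazard ratio of 2.83 per unit of log bilirubin and a reference bilirubin of 1.35) and assumes the model is linear in log bilirubin; the bilirubin values in the loop are arbitrary.

```python
import math

log_hr_per_unit = math.log(2.83)  # HR per unit of log bilirubin, from the slide
reference = 1.35                  # median bilirubin used as the reference point

def hr_vs_reference(bilirubin):
    """Hazard ratio at `bilirubin` relative to the reference value."""
    return math.exp(log_hr_per_unit * (math.log(bilirubin) - math.log(reference)))

for b in (0.5, 1.35, 2.7, 5.0, 10.0):   # arbitrary illustrative values
    print(f"bilirubin = {b:5.2f}  HR vs {reference} = {hr_vs_reference(b):.2f}")
```

On the original scale a one-unit step in log bilirubin is a multiplication by e (about 2.72), so doubling bilirubin multiplies the hazard by 2.83 raised to the power ln 2, not by 2.83 itself.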


Hazard Ratio for log Bilirubin


Hazard Ratio for Bilirubin


Predicting survival (stcurve, surv)


Predicting survival (stcurve, surv)

• Predicted survival in placebo group at lower quartile, median and upper quartile of bilirubin
[Figure: predicted survival curves from the Weibull regression]

Other parametric models

are exponential, Weibull, log-normal, log-logistic, Gompertz, generalized gamma
models may not be flexible enough to capture the shape of the underlying hazard/survival functions
• In these cases we often use splines to model the underlying baseline using stpm2 [5]
• The joint models you will hear about later incorporate exponential, Weibull, Gompertz, mixture Weibull, splines on the log-hazard scale and splines on the log cumulative hazard scale

Summary

baseline covariates, i.e. their values are assumed not to change over time
• Michael will talk later about time-varying covariates
baseline (Weibull), but the ideas in this course extend to more complex models
This assumption can be relaxed by fitting interactions with time


Longitudinal Data

bilirubin over time
• Potential issue of informative drop-out, but for the moment we will ignore it
• We are interested in the profile of (log) bilirubin over time. Does this vary between treatment groups?

List of longitudinal data

list id time age trt serbilir logb in 1/15, sepby(id) noobs


Random intercept model: schematic plot


Random intercept model

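• Consider the following model (subject $i$, observation $j$),
$y_{ij} = \beta_{0i} + \beta_1 t_{ij} + e_{ij}$
$\beta_{0i} \sim N(\beta_0, \sigma_0^2), \quad e_{ij} \sim N(0, \sigma_e^2)$
• $\beta_0$ is the mean intercept
• $\beta_1$ is the slope, which is the same for all subjects
• $\beta_{0i}$ is the subject-specific intercept
• $\sigma_0^2$ is the between-subject variance
• $\sigma_e^2$ is the within-subject variance
Equivalently, writing $\beta_{0i} = \beta_0 + b_i$:
$y_{ij} = \beta_0 + \beta_1 t_{ij} + b_i + e_{ij}$
$b_i \sim N(0, \sigma_0^2), \quad e_{ij} \sim N(0, \sigma_e^2)$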
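One way to get a feel for the two variance components is to simulate from the model. The Python sketch below draws a subject-specific intercept for each subject and then adds independent measurement noise at each visit. All parameter values, the number of subjects and the visit times are hypothetical.

```python
import random
import statistics

random.seed(2015)

beta0, beta1 = 0.5, 0.1              # hypothetical mean intercept and common slope
sd_between, sd_within = 1.0, 0.3     # hypothetical sigma_0 and sigma_e
visit_times = [0.0, 1.0, 2.0, 3.0, 4.0]

subject_intercepts = []
data = []                            # rows of (id, time, y)
for i in range(2000):
    beta0_i = random.gauss(beta0, sd_between)    # subject-specific intercept
    subject_intercepts.append(beta0_i)
    for t in visit_times:
        y = beta0_i + beta1 * t + random.gauss(0.0, sd_within)
        data.append((i, t, y))

# Spread of the intercepts reflects sigma_0; scatter of each observation
# around the subject's own line reflects sigma_e.
within_resid = [y - (subject_intercepts[i] + beta1 * t) for i, t, y in data]
print("between-subject SD:", round(statistics.pstdev(subject_intercepts), 2))
print("within-subject SD: ", round(statistics.pstdev(within_resid), 2))
```

In real data the subject-specific intercepts are not observed, which is why a mixed model is needed to separate the two sources of variation; here they are known only because the data were simulated.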


Random slope model: schematic plot


Random intercept and slope

• $\beta_0$ is the mean intercept
• $\beta_1$ is the mean slope
• $\sigma_0^2$ is the between-subject intercept variance
• $\sigma_1^2$ is the between-subject slope variance
• $\sigma_e^2$ is the within-subject variance
$y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i}) t_{ij} + e_{ij}$


General longitudinal model

• We have only considered simple fixed linear effects and simple random effects
$y_i(t) = X_i^T(t) \beta + Z_i^T(t) b_i + e_i(t), \quad e_i(t) \sim N(0, \sigma^2)$
where for subject $i$,
• $X_i^T(t)$ is the design matrix for the fixed effects
• $\beta$ are the fixed effects parameters
• $Z_i^T(t)$ is the design matrix for the random effects
distribution)

Trend for first 30 patients


Using mixed in Stata

regression models
subject variation (level 2) and within subject variation (level 1)
• If the subject identifier (level 2) is stored in id, the following will fit a random intercept model
mixed y time || id:
• Note the use of the colon
following,
estimated

Random intercept model

mixed logb time || id:, nolog

LR test vs linear regression: chibar2(01) = 2149.19 Prob >= chibar2 = 0.0000


Random intercept and slope model

mixed logb time || id: time, nolog cov(unstructured)


What have we estimated? Fixed effects


What have we estimated? Random effects


Postestimation after mixed in Stata

• the linear predictor of the fixed coefficients (xb)
• the best linear unbiased predictions (BLUPs) of the random effects (reffects)
• the fitted values, fixed + random effects (fitted)
• the within subject residuals

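* linear predictor from the fixed effects only
predict xb, xb
* BLUPs of the random effects: slope on time first, then the intercept
predict b1 b0, reffects
* fitted values (fixed + random effects), then the same by hand as a check
predict fitted, fitted
gen fitted2 = xb + b0 + b1*time
* within-subject residuals, then the same by hand as a check
predict resid, residuals
gen resid2 = logb - fitted
format %5.4f xb b0 b1 fitted fitted2 resid resid2
list id xb b0 b1 fitted fitted2 resid resid2 if id==1, sepby(id) noobs compress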
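The two hand-computed checks (fitted2 and resid2) are simple arithmetic on the predicted quantities. The short Python sketch below reproduces that arithmetic for a single subject. The fixed effects, the BLUPs and the observed values are all hypothetical and are not output from the PBC model.

```python
# Hypothetical estimates, for illustration only.
beta0, beta1 = 0.49, 0.18   # fixed intercept and slope
b0, b1 = -0.25, 0.05        # BLUPs of one subject's random intercept and slope

observations = [(0.0, 0.30), (0.5, 0.41), (1.0, 0.38)]   # (time, observed logb)

rows = []
for time, logb in observations:
    xb = beta0 + beta1 * time        # fixed-effects linear predictor
    fitted = xb + b0 + b1 * time     # fixed + random effects
    resid = logb - fitted            # within-subject residual
    rows.append((time, xb, fitted, resid))
    print(f"time={time:.1f}  xb={xb:.3f}  fitted={fitted:.3f}  resid={resid:.3f}")
```

The linear predictor xb is the population-average line, while fitted shifts and tilts that line by the subject's own random intercept and slope.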


Complexity of the longitudinal profile and random effects

• We want to model the subject-specific profiles
• If this is too simplistic we will not get good subject level estimates
• We need to consider non-linear effects (e.g. using polynomials, splines or fractional polynomials)
• The following plots show the difference in the subject-specific estimates between a random intercept and a random intercept + slope model


Random intercept and slope


• Multilevel random effects models are a useful way to model longitudinal data
• We have ignored drop-out here. If higher values of our biomarker are associated with increased mortality then those remaining as time goes on will tend to have lower biomarker values
• In this situation, we may see a trend at the study population level, when none exists at the individual level
• This will be covered in more detail in the next lecture
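The drop-out effect described above is easy to reproduce by simulation. In the Python sketch below every subject has a completely flat true biomarker level, but the hazard of death rises with that level, so the mean among those still alive drifts downwards. The hazard model, the association of 1.0 on the log scale, the visit times and the sample size are all assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(42)

def simulate(n_subjects=5000, visits=(0, 1, 2, 3, 4), log_hr=1.0):
    """Flat individual trajectories; death hazard rises with the biomarker."""
    observed = {t: [] for t in visits}
    for _ in range(n_subjects):
        level = random.gauss(0.0, 1.0)           # subject's constant true level
        rate = 0.2 * math.exp(log_hr * level)    # exponential death hazard
        death_time = random.expovariate(rate)
        for t in visits:
            if t < death_time:                   # only measured while alive
                observed[t].append(level + random.gauss(0.0, 0.2))
    return {t: statistics.mean(vals) for t, vals in observed.items()}

means = simulate()
for t, m in means.items():
    print(f"visit {t}: mean biomarker among survivors = {m:.2f}")
```

The falling means are produced entirely by who remains under observation, not by any change within individuals, which is the motivation for modelling the longitudinal and survival processes jointly.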


References I

[1] Johansson ALV, Andersson TML, Hsieh CC, Cnattingius S, Lambe M. Increased mortality in women with breast cancer detected during pregnancy and different periods postpartum. Cancer Epidemiol Biomarkers Prev Sep 2011; 20(9):1865–1872, doi:10.1158/1055-9965.EPI-11-0515. URL http://dx.doi.org/10.1158/1055-9965.EPI-11-0515.
[2] Colzani E, Johansson ALV, Liljegren A, Foukakis T, Clements M, Adolfsson J, Hall P, Czene K. Time-dependent risk of developing distant metastasis in breast cancer patients according to treatment, age and tumour characteristics. Br J Cancer Mar 2014; 110(5):1378–1384, doi:10.1038/bjc.2014.5. URL http://dx.doi.org/10.1038/bjc.2014.5.
[3] Rutherford MJ, Crowther MJ, Lambert PC. The use of restricted cubic splines to approximate complex hazard functions in the analysis of time-to-event data: a simulation study. J Statist Comput Simulation

Lecture 3: Survival analysis with time-varying covariates and two-stage models

2 Department of Medical Epidemiology and Biostatistics,

Karolinska Institutet, Stockholm, Sweden

∗ michael.crowther@le.ac.uk

Introduction
Survival analysis with a time-varying covariate
Two-stage models
Summary
References

Brief review

This morning we looked at how to model a continuous outcome, such as blood pressure, over time, using a linear mixed effects model
$y_i(t) = X_i^T(t) \beta + Z_i^T(t) b_i + e_i(t), \quad e_i(t) \sim N(0, \sigma^2)$
You also learned how to model a time-to-event outcome, such as time to death, using a proportional hazards model
$h_i(t) = h_0(t) \exp(\phi^T v_i)$


Brief review

In this afternoon’s lecture and practical we will start looking at what we can do if the longitudinal and survival outcomes are related. This gives rise to such research questions as:
• What if the trajectory of blood pressure, i.e. how it changes over time, impacts the risk of death?
• If patients with higher blood pressure are more likely to die, will this affect our estimates of the trajectory of BP over time?

Biomarkers are often collected repeatedly over time, in parallel to the time to an event of interest. Some examples from the clinical literature include:
progression to AIDS
• Prostate specific antigen and risk of prostate cancer recurrence
• Serum bilirubin and primary biliary cirrhosis of the liver

Background

What is important to note here is that we have subject-level covariates which are measured at multiple time points
• If we just used baseline values of biomarkers, we are throwing away a lot of (statistically and clinically) useful information
• Interest may lie in whether a change in the biomarker is associated with poorer/improved prognosis
prognosis?

Clinical example

• 312 patients with primary biliary cirrhosis
• Cirrhosis is a slowly progressing disease in which healthy liver tissue is replaced with scar tissue, eventually preventing the liver from functioning properly
liver function
died
Research question: How does serum bilirubin change over time, and are those changes associated with survival?

In this morning’s lecture and the previous practical, you fitted

survival models adjusting for covariates measured at baseline

This is what is most often conducted in clinical research, for


The dataset you fitted survival models to earlier consists of one observation per subject
use "C:\JM_Course\Data\pbc_baseline.dta", clear
stset stime, f(died==1)
list id logb trt _t0 _t _d if id==4 | id==5, table noobs
id   logb       trt         _t0   _t          _d
4    .5877866   D-penicil   0     5.2705069   1
5    1.223776   Placebo     0     4.1205783   0


Problems with using only the observed baseline biomarker value:
1. We are throwing away a lot of potentially useful information by using only baseline observations
with error
These are the two issues we are going to begin to address in this lecture

In the original study, serum bilirubin was measured at multiple time-points throughout follow-up. The full data looks like this:
list id logb trt time stime died if id==4 | id==5, ///
    table sepby(id) noobs
id logb trt time stime died

Survival analysis with a time-varying covariate

We now want to fit a survival model where our covariate of interest changes value over time
$h_i(t) = h_0(t) \exp(\phi^T v_i + \alpha y_i(t))$
where $y_i(t)$ is the observed biomarker value for the $i$th patient at time $t$


Consider a hypothetical patient who had measurements taken at baseline, 1.2 and 3.5 years, and died at 4.5 years. We can use start/stop notation to set up our data:

id biomarker start stop status
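The rows of this table are missing from the extracted text. As a minimal sketch of how such rows could be built, the Python function below (a hypothetical helper, not part of Stata or the course materials) splits follow-up into one interval per measurement. The measurement times and the death time are those given in the text; the biomarker values are made up.

```python
def to_start_stop(subject_id, measurements, exit_time, died):
    """Split follow-up into (start, stop] intervals, one per measurement.

    measurements: list of (time, biomarker) pairs sorted by time.
    The status is 1 only on the final interval, and only if the subject died.
    """
    rows = []
    for k, (start, value) in enumerate(measurements):
        is_last = k == len(measurements) - 1
        stop = exit_time if is_last else measurements[k + 1][0]
        status = int(died and is_last)
        rows.append((subject_id, value, start, stop, status))
    return rows

# Times are from the text; the biomarker values are hypothetical.
rows = to_start_stop(1, [(0.0, 1.1), (1.2, 1.4), (3.5, 2.0)],
                     exit_time=4.5, died=True)
print("id biomarker start stop status")
for row in rows:
    print(*row)
```

Each biomarker value is carried forward until the next measurement, and the event is attached only to the interval in which it occurs, so the earlier intervals are treated as censored at their stop time.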
