
Statistical Methods for Survival Data Analysis, Third Edition (part 4)


Figure 6.13 Lognormal probability plot of the survival time in months plus 4 of 162 male patients with chronic myelocytic leukemia. (From Feinleib and MacMahon, 1960. Reproduced by permission of the publisher.)

Figure 6.14 Lognormal probability plot of the survival time in months plus 4 of female patients with two types of leukemia. (From Feinleib and MacMahon, 1960. Reproduced by permission of the publisher.)

Suppose that failure or death takes place in n stages, that is, as soon as n subfailures have happened. At the end of the first stage, after time $T_1$, the first subfailure occurs; after that the second stage begins and the second subfailure occurs after time $T_2$; and so on. Total failure or death occurs at the end of the nth stage, when the nth subfailure happens. The survival time, T, is then $T_1 + T_2 + \cdots + T_n$. The times $T_1, T_2, \ldots, T_n$ spent in each stage are assumed to be independently exponentially distributed with probability density function $\lambda \exp(-\lambda t_i)$, $i = 1, \ldots, n$. That is, the subfailures occur independently at a constant rate $\lambda$. The distribution of T is then called the Erlangian distribution. There is no need for the stages to have physical significance since we can always assume that death occurs in the n-stage process just described. This idea, introduced by A. K. Erlang in his study of congestion in telephone systems, has been used widely in queuing theory and life processes.

A natural generalization of the Erlangian distribution is to replace the parameter n, restricted to the integers 1, 2, ..., by a parameter $\gamma$ taking any real positive value. We then obtain the gamma distribution.

Figure 6.15 Gamma hazard functions with $\lambda = 1$.

The gamma distribution is characterized by two parameters, $\gamma$ and $\lambda$. When $0 < \gamma < 1$, there is negative aging and the hazard rate decreases monotonically from infinity to $\lambda$ as time increases from 0 to infinity. When $\gamma > 1$, there is positive aging and the hazard rate increases monotonically from 0 to $\lambda$ as time increases from 0 to infinity. When $\gamma = 1$, the hazard rate equals $\lambda$, a constant, as in the exponential case. Figure 6.15 illustrates the gamma hazard function for $\lambda = 1$ and $\gamma < 1$, $\gamma = 1, 2, 4$. Thus, the gamma distribution describes a different type of survival pattern, in which the hazard rate decreases or increases to a constant value as time approaches infinity.

The probability density function of a gamma distribution is

$$f(t) = \frac{\lambda}{\Gamma(\gamma)} (\lambda t)^{\gamma-1} e^{-\lambda t}, \qquad t > 0,\ \gamma > 0,\ \lambda > 0 \tag{6.4.1}$$

where $\Gamma(\gamma)$ is defined as in (6.2.9). Figures 6.16 and 6.17 show the gamma density function with various values of $\gamma$ and $\lambda$. Varying $\gamma$ changes the shape of the distribution, while varying $\lambda$ changes only the scaling. Consequently, $\gamma$ and $\lambda$ are shape and scale parameters, respectively. When $\gamma > 1$, there is a single peak at $t = (\gamma - 1)/\lambda$.

Figure 6.16 Gamma density functions with $\lambda = 1$.

Figure 6.17 Gamma density functions with $\gamma = 3$.

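The cumulative distribution function F(t) has a more complex form:

$$F(t) = \int_0^t \frac{\lambda}{\Gamma(\gamma)} (\lambda x)^{\gamma-1} e^{-\lambda x}\,dx = \frac{1}{\Gamma(\gamma)} \int_0^{\lambda t} u^{\gamma-1} e^{-u}\,du$$

which involves the incomplete gamma function. For the Erlangian distribution, it can be shown that

$$F(t) = 1 - e^{-\lambda t} \sum_{k=0}^{n-1} \frac{(\lambda t)^k}{k!}$$

Thus, the survivorship function $S(t) = 1 - F(t)$ is

$$S(t) = e^{-\lambda t} \sum_{k=0}^{n-1} \frac{(\lambda t)^k}{k!} \tag{6.4.7}$$

for the Erlangian distribution.

Since the hazard function is the ratio of f(t) to S(t), it can be calculated from (6.4.1) and (6.4.7). When $\gamma$ is an integer n,

$$h(t) = \frac{\lambda (\lambda t)^{n-1}}{(n-1)! \sum_{k=0}^{n-1} (1/k!)(\lambda t)^k} \tag{6.4.8}$$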
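As a rough numerical check of (6.4.8), the following Python sketch evaluates the Erlangian hazard for integer n, assuming the hazard is the ratio of the Erlangian density to its survivorship function. The function name `erlang_hazard` and the argument names `n` and `lam` are illustrative choices, not notation from the text.

```python
import math

def erlang_hazard(t, n, lam):
    """Hazard of the Erlangian distribution with n stages and rate lam."""
    x = lam * t
    numerator = lam * x ** (n - 1)
    # (n - 1)! times the partial sum of (lam * t)^k / k! for k = 0, ..., n - 1
    denominator = math.factorial(n - 1) * sum(
        x ** k / math.factorial(k) for k in range(n)
    )
    return numerator / denominator

# n = 1 is the exponential case: the hazard is constant and equal to lam.
print(erlang_hazard(2.0, 1, 0.5))

# For n > 1 the hazard rises from 0 toward lam as t grows.
for t in (0.5, 2.0, 10.0, 50.0):
    print(t, round(erlang_hazard(t, 3, 1.0), 4))
```

With n = 3 and a rate of 1, the printed values increase with t and stay below 1, in line with the positive-aging behavior described for a shape parameter greater than 1.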

When $\gamma = 1$, the distribution is exponential. When $\lambda = \tfrac{1}{2}$ and $\gamma = \tfrac{1}{2}\nu$, where $\nu$ is an integer, the distribution is chi-square with $\nu$ degrees of freedom. The mean and variance of the standard gamma distribution are, respectively, $\gamma/\lambda$ and $\gamma/\lambda^2$, so that the coefficient of variation is $1/\sqrt{\gamma}$.

Many survival distributions can be represented, at least roughly, by suitable choice of the parameters $\gamma$ and $\lambda$. In many cases, there is an advantage in using the Erlangian distribution, that is, in taking $\gamma$ to be an integer.

The exponential, Weibull, lognormal, and gamma distributions are special cases of a generalized gamma distribution with three parameters, $\lambda$, $\alpha$, and $\gamma$, whose density function is defined as

$$f(t) = \frac{\alpha \lambda^{\alpha\gamma}}{\Gamma(\gamma)} t^{\alpha\gamma-1} \exp[-(\lambda t)^{\alpha}], \qquad t > 0,\ \gamma > 0,\ \lambda > 0,\ \alpha > 0 \tag{6.4.9}$$

It is easily seen that this generalized gamma distribution is the exponential distribution if $\alpha = \gamma = 1$, the Weibull distribution if $\gamma = 1$, the lognormal distribution if $\gamma \to \infty$, and the gamma distribution if $\alpha = 1$.
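The special cases can be checked numerically. The sketch below assumes the generalized gamma density has the form $\alpha\lambda^{\alpha\gamma} t^{\alpha\gamma-1}\exp[-(\lambda t)^{\alpha}]/\Gamma(\gamma)$ and compares it with the usual Weibull and gamma densities; the function names and the evaluation point are illustrative assumptions.

```python
import math

def gen_gamma_pdf(t, lam, alpha, gamma_):
    """Generalized gamma density (assumed three-parameter form)."""
    return (alpha * lam ** (alpha * gamma_) / math.gamma(gamma_)
            * t ** (alpha * gamma_ - 1) * math.exp(-(lam * t) ** alpha))

def weibull_pdf(t, lam, shape):
    """Weibull density with scale parameter lam and the given shape."""
    return shape * lam * (lam * t) ** (shape - 1) * math.exp(-(lam * t) ** shape)

def gamma_pdf(t, lam, gamma_):
    """Gamma density with shape gamma_ and scale parameter lam."""
    return lam / math.gamma(gamma_) * (lam * t) ** (gamma_ - 1) * math.exp(-lam * t)

t, lam = 1.7, 0.4
# gamma_ = 1 gives the Weibull density; alpha = 1 gives the gamma density.
print(gen_gamma_pdf(t, lam, 2.5, 1.0), weibull_pdf(t, lam, 2.5))
print(gen_gamma_pdf(t, lam, 1.0, 3.0), gamma_pdf(t, lam, 3.0))
# alpha = gamma_ = 1 gives the exponential density lam * exp(-lam * t).
print(gen_gamma_pdf(t, lam, 1.0, 1.0), lam * math.exp(-lam * t))
```

Each printed pair should agree up to floating-point rounding. The lognormal limit is not checked here because it is a limiting case rather than a parameter value.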

In later chapters (e.g., Chapters 7 and 9), we discuss several parametric procedures for estimation and hypothesis testing. To use available computer software such as SAS to carry out the computation, we use the distributions adopted by the software. One of the very few software packages that include the gamma or generalized gamma distribution is SAS. In SAS, the generalized gamma distribution is defined as having the following density function:

$$f(t) = \frac{|\alpha| \lambda^{\alpha\gamma}}{\Gamma(\gamma)} t^{\alpha\gamma-1} \exp[-(\lambda t)^{\alpha}], \qquad t > 0,\ \gamma > 0,\ \lambda > 0 \tag{6.4.10}$$

To differentiate this form of the generalized gamma distribution from the generalized gamma in (6.4.9), we refer to this distribution as the extended generalized gamma distribution. It can be shown that the extended generalized gamma distribution reduces to the Weibull distribution when $\alpha > 0$ and $\gamma = 1$, the lognormal distribution when $\gamma \to \infty$, the gamma distribution when $\alpha = 1$, and the exponential distribution when $\alpha = \gamma = 1$.

Example 6.4 Birnbaum and Saunders (1958) applied the gamma distribution to the lifetime of aluminum coupon. In their study, 17 sets of six strips were placed in a specially designed machine. Periodic loading was applied to the strips with a frequency of 18 hertz and a maximum stress of 21,000 pounds per square inch. The 102 strips were run until all of them failed. One of the 102 strips tested had to be discarded for an extraneous reason, yielding 101 observations. The lifetime data are given in Table 6.4 in ascending order. From the data the two parameters of the gamma distribution were estimated (estimation methods are discussed in Chapter 7). They obtained $\hat{\gamma} = 11.8$ and $\hat{\lambda} = 1/(118.76 \times 10^3)$.

Table 6.4 Lifetimes of 101 Strips of Aluminum Coupon

1293 1300 1310 1313 1315 1330 1355 1390 1416 1419 1420 1420 1450 1452 1475 1478 1481 1485 1502 1505 1513

1522 1522 1530 1540 1560 1567 1578 1594 1602 1604 1608 1630 1642 1674 1730 1750 1750 1763 1768 1781 1782

1792 1820 1868 1881 1890 1893 1895 1910 1923 1940 1945 2023 2100 2130 2215 2268 2440

Source: Birnbaum and Saunders (1958).

A graphical comparison of the observed and fitted cumulative distribution function is given in Figure 6.18, which shows very good agreement. A chi-square goodness-of-fit test (discussed in Chapter 9) yielded a value of 4.49 for 6 degrees of freedom, corresponding to a significance level between 0.5 and 0.6. Thus, it was concluded that the gamma distribution was an adequate model for the life length of some materials.

Figure 6.18 Graphical comparison of observed and fitted cumulative distribution functions of data in Example 6.4. (From Birnbaum and Saunders, 1958.)

6.5 LOG-LOGISTIC DISTRIBUTION

The survival time T has a log-logistic distribution if log(T) has a logistic distribution. The density, survivorship, hazard, and cumulative hazard functions of the log-logistic distribution are, respectively,

$$f(t) = \frac{\alpha\gamma t^{\gamma-1}}{(1 + \alpha t^{\gamma})^2}$$

$$S(t) = \frac{1}{1 + \alpha t^{\gamma}}$$

$$h(t) = \frac{\alpha\gamma t^{\gamma-1}}{1 + \alpha t^{\gamma}}$$

$$H(t) = \log(1 + \alpha t^{\gamma})$$

where $t \ge 0$, $\alpha > 0$, and $\gamma > 0$.

When $\gamma > 1$, the log-logistic hazard has the value 0 at time 0, increases to a peak at $t = (\gamma - 1)^{1/\gamma}/\alpha^{1/\gamma}$, and then declines, which is similar to the lognormal hazard. When $\gamma = 1$, the hazard starts at $\alpha$ and then declines monotonically. When $\gamma < 1$, the hazard starts at infinity and then declines, which is similar to the Weibull distribution. The hazard function declines toward 0 as t approaches infinity. Thus, the log-logistic distribution may be used to describe a first increasing and then decreasing hazard or a monotonically decreasing hazard.

The log-logistic distribution has been used to describe the rate of spread of HIV between 1978 and 1986. Between 1978 and 1980, over 6700 homosexual and bisexual men in San Francisco were enrolled in studies of the prevalence and incidence of sexually transmitted hepatitis B virus (HBV) infections. Blood specimens were collected from the participants. Four hundred and eighty-eight men who were HBV-seronegative were randomly selected to participate in a later study of HIV infection. These men agreed to allow the investigators to test the specimens collected previously together with a current specimen. For those who converted to positive, the infection time is known only to have occurred between the previous negative test and the first positive one; the exact time is unknown. The time to infection is therefore interval censored. The investigators tried to fit several distributions to the interval-censored data, including the Weibull and log-logistic, by maximum likelihood methods (discussed in Chapter 7). Based on the Akaike information criterion (discussed in Chapter 9), the log-logistic distribution was found to provide the best fit to the data. The maximum likelihood estimates of the two parameters are $\hat{\alpha} = 0.003757$ and $\hat{\gamma} = 1.424328$. Based on the log-logistic model, the median infection time is estimated to be 50.4 months, and the hazard function reaches its peak at 27.6 months.
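The reported median and peak can be reproduced from the two estimates. The sketch below assumes the parameterization $S(t) = 1/(1 + \alpha t^{\gamma})$, under which the median solves $\alpha t^{\gamma} = 1$ and the hazard peaks where $\alpha t^{\gamma} = \gamma - 1$; the variable and function names are illustrative.

```python
alpha_hat = 0.003757
gamma_hat = 1.424328

def loglogistic_hazard(t, alpha, gamma_):
    """Log-logistic hazard under the assumed parameterization."""
    return alpha * gamma_ * t ** (gamma_ - 1) / (1 + alpha * t ** gamma_)

# Median: S(t) = 1/2, that is, alpha * t**gamma = 1.
median = (1 / alpha_hat) ** (1 / gamma_hat)

# Peak of the hazard (needs gamma > 1): alpha * t**gamma = gamma - 1.
peak = ((gamma_hat - 1) / alpha_hat) ** (1 / gamma_hat)

print(round(median, 1), round(peak, 1))
print(loglogistic_hazard(peak, alpha_hat, gamma_hat))
```

Both values agree with the 50.4 and 27.6 months quoted in the example, which supports reading the estimates in this parameterization.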

6.6 OTHER SURVIVAL DISTRIBUTIONS

Many other distributions can be used as models of survival time, three of which we discuss briefly in this section: the linear exponential, the Gompertz (1825), and a distribution whose hazard rate is a step function. The linear-exponential model and the Gompertz distribution are extensions of the exponential distribution. Both describe survival patterns that have a constant initial hazard rate. The hazard rate varies as a linear function of time or age in the linear-exponential model and as an exponential function of time or age in the Gompertz distribution.

In demonstrating the use of the linear-exponential model, Broadbent (1958) uses as an example the service of milk bottles that are filled in a dairy, circulated to customers, and returned empty to the dairy. The model was also used by Carbone et al. (1967) to describe the survival pattern of patients with plasmacytic myeloma. The hazard function of the linear-exponential distribution is

$$h(t) = \lambda + \gamma t$$

where $\lambda$ and $\gamma$ can be values such that h(t) is nonnegative. The hazard rate increases from $\lambda$ with time if $\gamma > 0$, decreases if $\gamma < 0$, and remains constant (an exponential case) if $\gamma = 0$, as depicted in Figure 6.20.

Figure 6.20 Hazard function of linear-exponential model.

The probability density function and the survivorship function are, respectively,

$$f(t) = (\lambda + \gamma t) \exp[-(\lambda t + \tfrac{1}{2}\gamma t^2)]$$

$$S(t) = \exp[-(\lambda t + \tfrac{1}{2}\gamma t^2)]$$

Table 6.5 Values of L(x) and G(x)

The function L(x) is tabulated in Table 6.5. A special case of the linear-exponential distribution is the Rayleigh distribution (Kodlin, 1967), whose hazard function is likewise linear in t.

The Gompertz distribution is also characterized by two parameters, $\lambda$ and $\gamma$. The hazard function,

$$h(t) = \exp(\lambda + \gamma t)$$

is plotted in Figure 6.21. When $\gamma > 0$, there is positive aging starting from $e^{\lambda}$; when $\gamma < 0$, there is negative aging; and when $\gamma = 0$, h(t) reduces to a constant, $e^{\lambda}$. The survivorship function of the Gompertz distribution is

$$S(t) = \exp\left[-\frac{e^{\lambda}}{\gamma}(e^{\gamma t} - 1)\right]$$

and the probability density function, from (6.6.4) and (2.2.5), is then

$$f(t) = \exp\left[(\lambda + \gamma t) - \frac{1}{\gamma}(e^{\lambda + \gamma t} - e^{\lambda})\right] \tag{6.6.6}$$

The mean of the Gompertz distribution is $G(e^{\lambda}/\gamma)/e^{\lambda}$, where G(x) is tabulated in Table 6.5.

Figure 6.21 Gompertz hazard function.

Figure 6.22 Step hazard function.

The probability density function f(t) can then be obtained from (6.6.7) and (6.6.8) using (2.2.5).

The nine distributions described above are, among others, reasonable models for survival time distribution. All have been designed by considering a biological failure, a death process, or an aging property. They may or may not be appropriate for many practical situations, but the objective here is to illustrate the various possible techniques, assumptions, and arguments that can be used to choose the most appropriate model. If none of these distributions fits the data, investigators might have to derive an original model to suit the particular data, perhaps by using some of the ideas presented here.

Bibliographical Remarks

In addition to the papers on the distributions cited in this chapter, Mann et al. (1974), Hahn and Shapiro (1967), Johnson and Kotz (1970a, b), Elandt-Johnson and Johnson (1980), Lawless (1982), Nelson (1982), Cox and Oakes (1984), Gertsbakh (1989), and Klein and Moeschberger (1997) also discuss statistical failure models, including the exponential, Weibull, gamma, lognormal, generalized gamma, and log-logistic distributions. Applications of survival distributions can be found easily in medical and epidemiological journals. The following are a few examples: Dharmalingam et al. (2000), Riffenburgh and Johnstone (2001), and Mafart et al. (2002).

6.2 Suppose that the survival distribution of a group of patients follows the exponential distribution with G = 0 (year), $\lambda = 0.65$. Plot the survivorship function and find:

(a) The mean survival time

(b) The median survival time

(c) The probability of surviving 1.5 years or more

6.3 Suppose that the survival distribution of a group of patients follows the exponential distribution with G = 5 (years) and $\lambda = 0.25$. Plot the survivorship function and find:

(a) The mean survival time

(b) The median survival time

(c) The probability of surviving 6 years or more

6.4 Consider the following two Weibull distributions as survival models:

(c) The coefficient of variation

Which distribution gives the larger probability of surviving at least 3 units

(e) The mode

6.6 Suppose that pain relief time follows the gamma distribution with $\gamma = 1$, $\lambda = 0.5$. Find:

(a) The mean

(b) The variance

(c) The coefficient of variation

6.7 Suppose that the survival distribution is (1) Gompertz and (2) linear-exponential, and $\lambda = 1$, $\gamma = 2.0$. Plot the hazard function and find:

(a) The mean

(b) The probability of surviving longer than 1 unit of time

6.8 Consider the survival times of hypernephroma patients given in Exercise Table 3.1. From the plot you obtained in Exercise 4.5, suggest a distribution that might fit the data.

CHAPTER 7

Estimation Procedures for Parametric Survival Distributions without Covariates

In this chapter we discuss some analytical procedures for estimating the most commonly used survival distributions discussed in Chapter 6. We introduce the maximum likelihood estimates (MLEs) of the parameters of these distributions. The general asymptotic likelihood inference results that are most widely used for these distributions are given in Section 7.1. We begin to use the general symbol $\mathbf{b} = (b_1, b_2, \ldots, b_p)$ to denote a set of parameters. For example, in discussing the Weibull distribution, $b_1$ could be $\lambda$ and $b_2$ could be $\gamma$, and p = 2. $\mathbf{b}$ is called a vector in linear algebra. Readers who are not familiar with linear algebra or are not interested in the mathematical details may skip this section and proceed to Section 7.2 without loss of continuity. In Sections 7.2 to 7.7 we introduce the MLEs for the parameters of the exponential, Weibull, lognormal, gamma, log-logistic, and Gompertz distributions for data with and without censored observations. The related BMDP or SAS programming codes that may be used to obtain the MLE are given in the respective sections.

7.1 GENERAL MAXIMUM LIKELIHOOD ESTIMATION PROCEDURE

7.1.1 Estimation Procedures for Data with Right-Censored Observations

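Suppose that n persons were followed to death or censored in a study. Let $t_1, t_2, \ldots, t_r, t_{r+1}^+, \ldots, t_n^+$ be the survival times observed from the n individuals, with r exact times and (n − r) right-censored times. Assume that the survival times follow a distribution with the density function $f(t, \mathbf{b})$ and survivorship function $S(t, \mathbf{b})$, where $\mathbf{b} = (b_1, \ldots, b_p)$ denotes the p unknown parameters $b_1, \ldots, b_p$ in the distribution. As shown in Chapter 6, an exponential distribution has one (p = 1) parameter $\lambda$, the Weibull distribution has two (p = 2) parameters $\lambda$ and $\gamma$, and so on. If the survival time is discrete (i.e., it is observed at discrete times only), $f(t, \mathbf{b})$ represents the probability of observing t and $S(t, \mathbf{b})$ represents the probability that the survival or event time is greater than t. In other words, $f(t, \mathbf{b})$ and $S(t, \mathbf{b})$ represent the information that can be obtained from an observed uncensored survival time and an observed right-censored survival time, respectively. Therefore, the product $\prod_{i=1}^{r} f(t_i, \mathbf{b})$ represents the joint probability of observing the uncensored survival times, and $\prod_{i=r+1}^{n} S(t_i^+, \mathbf{b})$ represents the joint probability of those right-censored survival times. The product of these two probabilities, denoted by $L(\mathbf{b})$,

$$L(\mathbf{b}) = \prod_{i=1}^{r} f(t_i, \mathbf{b}) \prod_{i=r+1}^{n} S(t_i^+, \mathbf{b})$$

represents the joint probability of observing $t_1, t_2, \ldots, t_r, t_{r+1}^+, \ldots, t_n^+$ for a specific set of parameters $\mathbf{b}$. The method of the MLE is to find an estimator of $\mathbf{b}$ that maximizes $L(\mathbf{b})$, or in other words, which is "most likely" to have produced the observed data $t_1, t_2, \ldots, t_r, t_{r+1}^+, \ldots, t_n^+$. Take the logarithm of $L(\mathbf{b})$ and denote it by $l(\mathbf{b})$:

$$l(\mathbf{b}) = \log L(\mathbf{b}) = \sum_{i=1}^{r} \log[f(t_i, \mathbf{b})] + \sum_{i=r+1}^{n} \log[S(t_i^+, \mathbf{b})] \tag{7.1.1}$$

Then the MLE $\hat{\mathbf{b}}$ of $\mathbf{b}$ is the set of values $b_1, \ldots, b_p$ that maximizes $l(\mathbf{b})$.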

It is clear that $\hat{\mathbf{b}}$ is a solution of the following simultaneous equations, which are obtained by taking the derivative of $l(\mathbf{b})$ with respect to each $b_j$:

$$\frac{\partial l(\mathbf{b})}{\partial b_j} = 0, \qquad j = 1, 2, \ldots, p \tag{7.1.2}$$

The exact forms of (7.1.2) for the parametric survival distributions discussed in Chapter 6 are given in Sections 7.2 to 7.7. Often, there is no closed-form solution for the MLE $\hat{\mathbf{b}}$ from (7.1.2). To obtain the MLE $\hat{\mathbf{b}}$, one can use a numerical method. A commonly used numerical method is the Newton-Raphson iterative procedure, which can be summarized as follows.

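1. Let the initial values of $b_1, \ldots, b_p$ be zero; that is, let $\mathbf{b}^{(0)} = \mathbf{0}$.

2. The change for $\mathbf{b}$ at each subsequent step, denoted by $\Delta^{(j)}$, is obtained from the first and second derivatives of the log-likelihood function:

$$\Delta^{(j)} = \left[-\frac{\partial^2 l(\mathbf{b}^{(j-1)})}{\partial \mathbf{b}\,\partial \mathbf{b}'}\right]^{-1} \frac{\partial l(\mathbf{b}^{(j-1)})}{\partial \mathbf{b}}$$

3. Using $\Delta^{(j)}$, the value of $\mathbf{b}$ at the jth step is $\mathbf{b}^{(j)} = \mathbf{b}^{(j-1)} + \Delta^{(j)}$, j = 1, 2, ....

The iteration terminates at, say, the mth step if $\|\Delta^{(m)}\| < \delta$, where $\delta$ is a given precision, usually a very small value. Then the MLE $\hat{\mathbf{b}}$ is defined as the value of $\mathbf{b}$ at the final step. The estimated covariance matrix of the MLE $\hat{\mathbf{b}}$ is

$$\hat{V}(\hat{\mathbf{b}}) = \left[-\frac{\partial^2 l(\hat{\mathbf{b}})}{\partial \mathbf{b}\,\partial \mathbf{b}'}\right]^{-1} \tag{7.1.5}$$

The estimated 100(1 − α)% confidence interval for any parameter $b_i$ is

$$(\hat{b}_i - Z_{\alpha/2}\sqrt{v_{ii}},\ \hat{b}_i + Z_{\alpha/2}\sqrt{v_{ii}}) \tag{7.1.6}$$

where $v_{ii}$ is the ith diagonal element of $\hat{V}(\hat{\mathbf{b}})$ and $Z_{\alpha/2}$ is the 100(1 − α/2) percentile point of the standard normal distribution [$P(Z > Z_{\alpha/2}) = \alpha/2$]. For a function $g(b_i)$ of $b_i$, the estimated 100(1 − α)% confidence interval for $g(b_i)$ is its respective range R on the confidence interval (7.1.6), that is,

$$R = \{g(b_i) : b_i \in (\hat{b}_i - Z_{\alpha/2}\sqrt{v_{ii}},\ \hat{b}_i + Z_{\alpha/2}\sqrt{v_{ii}})\} \tag{7.1.7}$$

If $g(b_i)$ is monotone in $b_i$, the estimated 100(1 − α)% confidence interval for $g(b_i)$ is

$$[g(\hat{b}_i - Z_{\alpha/2}\sqrt{v_{ii}}),\ g(\hat{b}_i + Z_{\alpha/2}\sqrt{v_{ii}})] \tag{7.1.8}$$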
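The iteration and the interval construction can be sketched for the simplest case, a single parameter. The example below is an illustration under stated assumptions rather than the procedure of any particular package: it fits an exponential model to right-censored data, works with $b = \log\lambda$ so that starting from zero is well defined (this reparameterization is a choice made here, not something the text prescribes), and uses hypothetical data.

```python
import math

def fit_exponential_newton(event_times, censored_times, tol=1e-10, max_iter=50):
    """Newton-Raphson for b = log(lambda) under right censoring.

    Log-likelihood: l(b) = r*b - exp(b)*T, with r deaths and total time T.
    """
    r = len(event_times)
    total_time = sum(event_times) + sum(censored_times)
    b = 0.0                                    # step 1: start from zero
    for _ in range(max_iter):
        score = r - math.exp(b) * total_time   # first derivative of l(b)
        info = math.exp(b) * total_time        # minus the second derivative
        delta = score / info                   # step 2: Newton change
        b += delta                             # step 3: update
        if abs(delta) < tol:                   # stop when the change is tiny
            break
    variance = 1.0 / (math.exp(b) * total_time)  # inverse of minus l''(b)
    return b, variance

# Hypothetical data: five deaths and three right-censored times.
events = [3, 7, 12, 18, 25]
censored = [10, 20, 30]
b_hat, v = fit_exponential_newton(events, censored)

z = 1.96  # approximate upper 2.5% point of the standard normal
lam_hat = math.exp(b_hat)
# exp is monotone, so the interval for lambda is the image of the one for b.
ci_lam = (math.exp(b_hat - z * math.sqrt(v)), math.exp(b_hat + z * math.sqrt(v)))
print(lam_hat, ci_lam)
```

For these data the iteration settles at the closed-form answer, 5 deaths divided by 125 units of total observed time, which makes the sketch easy to verify by hand.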


7.1.2 Estimation Procedures for Data with Right-, Left-, and Interval-Censored Observations

If the survival times $t_1, t_2, \ldots, t_n$ observed for the n persons consist of uncensored, left-, right-, and interval-censored observations, the estimation procedures are similar. Assume that the survival times follow a distribution with the density function $f(t, \mathbf{b})$ and the survivorship function $S(t, \mathbf{b})$, where $\mathbf{b}$ denotes all unknown parameters of the distribution. Then the log-likelihood function is

$$l(\mathbf{b}) = \log L(\mathbf{b}) = \sum \log[f(t_i, \mathbf{b})] + \sum \log[S(t_i, \mathbf{b})] + \sum \log[1 - S(t_i, \mathbf{b})] + \sum \log[S(v_i, \mathbf{b}) - S(t_i, \mathbf{b})] \tag{7.1.9}$$

where the first sum is over the uncensored observations, the second sum over the right-censored observations, the third sum over the left-censored observations, and the last sum over the interval-censored observations, with $v_i$ as the lower end of a censoring interval. The other steps for obtaining the MLE $\hat{\mathbf{b}}$ of $\mathbf{b}$ are similar to the steps shown in Section 7.1.1, with the log-likelihood function in (7.1.9) substituted for the one defined in (7.1.1).

The computation of the MLE $\hat{\mathbf{b}}$ and its estimated covariance matrix is tedious. The following example gives the general procedure for using SAS to carry out the computation.

To handle data that include uncensored, right-, left-, and interval-censored observations, one needs to create a new data set from the observed data to use SAS to obtain the estimates of the parameters in the distribution. For an observed survival time t (uncensored, right-, or left-censored), we define two variables LB and UB as follows: If t is uncensored, take LB = UB = t; if t is left-censored, LB = . and UB = t; and if t is right-censored, then LB = t and UB = ., where "." means "missing" in SAS. If a survival time is interval-censored [i.e., one observed two numbers $t_1$ and $t_2$, $t_1 < t_2$, and the survival time is in the interval $(t_1, t_2)$], let LB = $t_1$ and UB = $t_2$.

Assume that the new data set (in terms of LB and UB) has been saved in "C:\EXAMPLEA.DAT" as a text file, which contains two columns (LB in the first column and UB in the second column) separated by a space.
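To make the LB/UB coding concrete, here is a small Python sketch that evaluates the log-likelihood (7.1.9) from (LB, UB) pairs. It is not the SAS procedure: it assumes an exponential model for brevity (the text's example uses the Weibull), represents the SAS missing value by `None`, uses a crude grid search in place of Newton-Raphson, and runs on hypothetical data.

```python
import math

def exp_log_likelihood(rows, lam):
    """Log-likelihood (7.1.9) for an exponential model with rate lam.

    rows holds (LB, UB) pairs; None stands in for the SAS missing value '.'.
    """
    def surv(t):
        return math.exp(-lam * t)

    total = 0.0
    for lb, ub in rows:
        if lb is None:                 # left-censored at ub
            total += math.log(1.0 - surv(ub))
        elif ub is None:               # right-censored at lb
            total += math.log(surv(lb))
        elif lb == ub:                 # uncensored
            total += math.log(lam) + math.log(surv(lb))
        else:                          # interval-censored in (lb, ub)
            total += math.log(surv(lb) - surv(ub))
    return total

def grid_mle(rows, grid_size=2000, upper=2.0):
    """Crude maximizer: evaluate the log-likelihood on an evenly spaced grid."""
    candidates = [upper * k / grid_size for k in range(1, grid_size + 1)]
    return max(candidates, key=lambda lam: exp_log_likelihood(rows, lam))

# Hypothetical observations covering all four types of record.
rows = [(5, 5), (8, 8), (None, 3), (12, None), (4, 9), (15, None)]
print(grid_mle(rows))
```

The branch order mirrors the coding rules: a missing LB marks left censoring, a missing UB marks right censoring, equal bounds mark an exact time, and anything else is an interval.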

As an example, SAS code can be used to obtain the estimated covariance matrix defined in (7.1.5) and the MLE of the parameters of the Weibull distribution for the survival data observed in the text file "C:\EXAMPLEA.DAT". One can replace d = weibull in that code with the respective distribution in Sections 7.2 to 7.6 (see the SAS code in these sections for details) to obtain the estimate.

7.2 EXPONENTIAL DISTRIBUTION

7.2.1 One-Parameter Exponential Distribution

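The one-parameter exponential distribution has the following density function:

$$f(t) = \lambda e^{-\lambda t} \tag{7.2.1}$$

and survivorship function

$$S(t) = e^{-\lambda t} \tag{7.2.2}$$

where $t \ge 0$ and $\lambda > 0$.

Suppose that there are n persons in the study and everyone is followed to death or failure. Let $t_1, t_2, \ldots, t_n$ be the exact survival times of the n people. The likelihood function, using (7.2.1) and (7.1.1), is

$$L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda t_i}$$

so that the log-likelihood is $l(\lambda) = n \log \lambda - \lambda \sum_{i=1}^{n} t_i$, and the MLE of $\lambda$ is

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i} \tag{7.2.5}$$

The mean survival time $\mu = 1/\lambda$ is then estimated by $\hat{\mu} = 1/\hat{\lambda}$.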

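It can be shown that $2n\lambda/\hat{\lambda}$ has an exact chi-square distribution with 2n degrees of freedom (Epstein and Sobel, 1953). Since $\hat{\lambda} = 1/\hat{\mu}$ and $\lambda = 1/\mu$, an exact 100(1 − α)% confidence interval for $\lambda$ is

$$\frac{\hat{\lambda}\,\chi^2_{2n,1-\alpha/2}}{2n} < \lambda < \frac{\hat{\lambda}\,\chi^2_{2n,\alpha/2}}{2n}$$

where $\chi^2_{2n,\alpha}$ is the 100α percentage point of the chi-square distribution with 2n degrees of freedom, that is, $P(\chi^2_{2n} > \chi^2_{2n,\alpha}) = \alpha$. When n is large ($n \ge 25$, say), $\hat{\lambda}$ is approximately normally distributed with mean $\lambda$ and variance $\lambda^2/n$. Thus, an approximate 100(1 − α)% confidence interval for $\lambda$ is

$$\hat{\lambda} - \frac{Z_{\alpha/2}\,\hat{\lambda}}{\sqrt{n}} < \lambda < \hat{\lambda} + \frac{Z_{\alpha/2}\,\hat{\lambda}}{\sqrt{n}}$$

where $Z_{\alpha/2}$ is the 100α/2 percentage point of the standard normal distribution (Table B-1).

Since $2n\hat{\mu}/\mu$ has an exact chi-square distribution with 2n degrees of freedom, an exact 100(1 − α)% confidence interval for the mean survival time is

$$\frac{2n\hat{\mu}}{\chi^2_{2n,\alpha/2}} < \mu < \frac{2n\hat{\mu}}{\chi^2_{2n,1-\alpha/2}} \tag{7.2.9}$$

The following example illustrates the procedures.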

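Consider the remission times, in weeks, of 21 patients with acute leukemia: 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 8, 8, 9, 10, 10, 12, 14, 16, 20, 24, and 34. Assume that remission duration follows the exponential distribution. Let us estimate the parameter $\lambda$ by using the formulas given above.

According to (7.2.5), the MLE of the relapse rate, $\lambda$, is

$$\hat{\lambda} = \frac{21}{198} = 0.106 \text{ per week}$$

The mean remission time $\mu$ is then estimated by $\hat{\mu} = 198/21 = 9.429$ weeks. A 95% confidence interval for $\mu$, following (7.2.9), is

$$\frac{(42)(9.429)}{59.342} < \mu < \frac{(42)(9.429)}{24.433}$$

or (6.673, 16.208).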
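The arithmetic of this example can be reproduced with a few lines of Python. The sketch assumes that the MLE of the relapse rate is the number of patients divided by the total remission time, and it takes the two chi-square percentage points for 42 degrees of freedom (59.342 and 24.433) directly from the interval quoted above rather than computing them; the variable names are illustrative.

```python
remission_weeks = [1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 8, 8, 9, 10, 10,
                   12, 14, 16, 20, 24, 34]

n = len(remission_weeks)
lam_hat = n / sum(remission_weeks)   # relapse rate per week
mu_hat = 1 / lam_hat                 # mean remission time in weeks

# Chi-square percentage points with 2n = 42 degrees of freedom,
# copied from the worked example (upper 2.5% and upper 97.5% points).
chi2_upper, chi2_lower = 59.342, 24.433
ci_mu = (2 * n * mu_hat / chi2_upper, 2 * n * mu_hat / chi2_lower)

print(round(lam_hat, 3), round(mu_hat, 3))
print(tuple(round(x, 3) for x in ci_mu))
```

The larger percentage point produces the lower limit for the mean because it appears in the denominator.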

Once the parameter $\lambda$ is estimated, other estimates can be obtained. For example, the probability of staying in remission for at least 20 weeks, estimated from (7.2.2), is $\hat{S}(20) = \exp[-0.106(20)] = 0.120$. Any percentile of survival time $t_p$ may be estimated by equating S(t) to p and solving for $t_p$, that is, $\hat{t}_p = -\log p/\hat{\lambda}$. For example, the median (50th percentile) survival time can be estimated by $\hat{t}_{0.5} = -\log 0.5/\hat{\lambda} = 6.539$ weeks.

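We first consider singly censored and then progressively censored data. Suppose that, without loss of generality, the study or experiment begins at time 0 with a total of n subjects. Survival times are recorded and the data become available when the subjects die one after the other in such a way that the shortest survival time comes first, the second shortest second, and so on. Suppose that the investigator has decided to terminate the study after r out of the n subjects have died and to sacrifice the remaining n − r subjects at that time. Then the survival times for the n subjects are

$$t_{(1)} \le t_{(2)} \le \cdots \le t_{(r)} = t_{(r+1)}^+ = \cdots = t_{(n)}^+$$

where a superscript plus indicates a sacrificed subject, and thus $t_{(i)}^+$ is a censored observation. In this case, n and r are fixed values and all of the n − r censored observations are equal.

The likelihood function, using (7.1.1), (7.2.1), and (7.2.2), is proportional to

$$\lambda^r \exp\left[-\lambda\left(\sum_{i=1}^{r} t_{(i)} + \sum_{i=r+1}^{n} t_{(i)}^+\right)\right]$$

and the MLEs of $\lambda$ and of the mean survival time $\mu$ are

$$\hat{\lambda} = \frac{r}{\sum_{i=1}^{r} t_{(i)} + \sum_{i=r+1}^{n} t_{(i)}^+} \tag{7.2.10}$$

$$\hat{\mu} = \frac{1}{\hat{\lambda}} \tag{7.2.11}$$

It can be shown that $2r\lambda/\hat{\lambda}$ has a chi-square distribution with 2r degrees of freedom. The mean and variance of $\hat{\lambda}$ are $r\lambda/(r - 1)$ and $\lambda^2/(r - 1)$, respectively. The 100(1 − α)% confidence interval for $\lambda$ is

$$\frac{\hat{\lambda}\,\chi^2_{2r,1-\alpha/2}}{2r} < \lambda < \frac{\hat{\lambda}\,\chi^2_{2r,\alpha/2}}{2r} \tag{7.2.12}$$

and that for $\mu$ is

$$\frac{2r\hat{\mu}}{\chi^2_{2r,\alpha/2}} < \mu < \frac{2r\hat{\mu}}{\chi^2_{2r,1-\alpha/2}} \tag{7.2.13}$$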

Suppose that, in a laboratory experiment, 10 mice are exposed to carcinogens. The experimenter decides to terminate the study after half of the mice are dead and to sacrifice the other half at that time. The survival times of the five dead mice are 4, 5, 8, 9, and 10 weeks. The survival data of the 10 mice are 4, 5, 8, 9, 10, 10+, 10+, 10+, 10+, and 10+. Assuming that the failure of these mice follows an exponential distribution, the hazard rate $\lambda$ and mean survival time $\mu$ are estimated, respectively, according to (7.2.10) and (7.2.11) by

$$\hat{\lambda} = \frac{5}{36 + 50} = 0.058 \text{ per week}$$

and $\hat{\mu} = 1/0.058 = 17.241$ weeks. A 95% confidence interval for $\lambda$ by (7.2.12) is

$$\frac{(0.058)(3.247)}{(2)(5)} < \lambda < \frac{(0.058)(20.483)}{(2)(5)}$$

or (0.019, 0.119). A 95% confidence interval for $\mu$ following (7.2.13) is

$$\frac{2(5)(17.241)}{20.483} < \mu < \frac{2(5)(17.241)}{3.247}$$

or (8.417, 53.098).

The probability of surviving a given time for the mice can be estimated from (7.2.2). For example, the probability that a mouse exposed to the same carcinogen will survive longer than 8 weeks is

$$\hat{S}(8) = \exp[-0.058(8)] = 0.629$$

The probability of dying in 8 weeks is then 1 − 0.629 = 0.371.
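The same calculation can be sketched in Python. The chi-square percentage points for 10 degrees of freedom (3.247 and 20.483) are copied from the example; the rate estimate is taken to be the number of deaths divided by the total observed time, which matches the 5/(36 + 50) shown above. Results differ slightly from the printed values because the text rounds the rate to 0.058 before continuing, while the code carries full precision.

```python
import math

deaths = [4, 5, 8, 9, 10]          # weeks; the other five mice are sacrificed
n, r = 10, len(deaths)
total_time = sum(deaths) + (n - r) * deaths[-1]   # 36 + 50 weeks

lam_hat = r / total_time
mu_hat = 1 / lam_hat

# Chi-square percentage points with 2r = 10 degrees of freedom (from the text).
chi2_low, chi2_high = 3.247, 20.483
ci_lam = (lam_hat * chi2_low / (2 * r), lam_hat * chi2_high / (2 * r))
ci_mu = (2 * r * mu_hat / chi2_high, 2 * r * mu_hat / chi2_low)

s8 = math.exp(-lam_hat * 8)        # probability of surviving beyond 8 weeks

print(round(lam_hat, 3), round(mu_hat, 2))
print(ci_lam, ci_mu)
print(round(s8, 3), round(1 - s8, 3))
```

The interval for the mean is the reciprocal of the interval for the rate, which is a quick consistency check on the two formulas.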

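A slightly different situation may arise in laboratory experiments. Instead of terminating the study after the rth death, the experimenter may stop after a period of time T, which may be six months or a year. If we denote the number of deaths between 0 and T as r, the survival data may look as follows:

$$t_{(1)} \le t_{(2)} \le \cdots \le t_{(r)} \le t_{(r+1)}^+ = \cdots = t_{(n)}^+ = T$$

Mathematical derivations of the MLE of $\lambda$ and $\mu$ are exactly the same and (7.2.10) can still be used. The sampling distribution of $\hat{\lambda}$ for singly censored data is also discussed by Bartholomew (1963).

Progressively censored data come more frequently from clinical studies where patients are entered at different times and the study lasts a predetermined period of time. Suppose that the study begins at time 0 and terminates at time T and there are a total of n people entered. Let r be the number of patients who die before or at time T and n − r the number of patients who are lost to follow-up during the study period or remain alive at time T. The data look as follows: $t_1, t_2, \ldots, t_r, t_{r+1}^+, t_{r+2}^+, \ldots, t_n^+$. Ordering the r uncensored observations according to their magnitude, we have

$$t_{(1)} \le t_{(2)} \le \cdots \le t_{(r)},\ t_{r+1}^+, \ldots, t_n^+$$

Using (7.1.1), the log-likelihood function is

$$l(\lambda) = r \log \lambda - \lambda\left(\sum_{i=1}^{r} t_{(i)} + \sum_{i=r+1}^{n} t_i^+\right)$$

and from (7.1.2), the MLE of the parameter $\lambda$ is

$$\hat{\lambda} = \frac{r}{\sum_{i=1}^{r} t_{(i)} + \sum_{i=r+1}^{n} t_i^+} \tag{7.2.16}$$

The mean survival time is estimated by

$$\hat{\mu} = \frac{1}{\hat{\lambda}} \tag{7.2.17}$$

When all of the observations are censored (r = 0), the estimate is defined as

$$\hat{\mu} = \sum_{i=1}^{n} t_i^+$$

In practice, this estimate has little value.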

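Distributions of the estimators are discussed by Bartholomew (1957). The distribution of $\hat{\lambda}$ for large n is approximately normal with mean $\lambda$ and variance

$$\mathrm{Var}(\hat{\lambda}) = \frac{\lambda^2}{\sum_{i=1}^{n} (1 - e^{-\lambda T_i})} \tag{7.2.19}$$

where $T_i$ is the time that the ith person is under observation. In other words, $T_i$ is computed from the time the ith person enters the study to the end of the study. If the observation times $T_i$ are not known, the following quick estimate of $\mathrm{Var}(\hat{\lambda})$ can be used:

$$\widehat{\mathrm{Var}}(\hat{\lambda}) = \frac{\hat{\lambda}^2}{r} \tag{7.2.20}$$

Thus an approximate 100(1 − α)% confidence interval for $\lambda$ is, by (7.1.6),

$$\hat{\lambda} - Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{\lambda})} < \lambda < \hat{\lambda} + Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{\lambda})} \tag{7.2.21}$$

The distribution of $\hat{\mu}$ is approximately normal with mean $\mu$ and variance

$$\mathrm{Var}(\hat{\mu}) = \frac{\mu^2}{\sum_{i=1}^{n} (1 - e^{-\lambda T_i})} \tag{7.2.22}$$

Again, a quick estimate is

$$\widehat{\mathrm{Var}}(\hat{\mu}) = \frac{\hat{\mu}^2}{r} \tag{7.2.23}$$

An approximate 100(1 − α)% confidence interval for $\mu$ is then

$$\hat{\mu} - Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{\mu})} < \mu < \hat{\mu} + Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat{\mu})} \tag{7.2.24}$$

The exact distribution of the estimator derived by Bartholomew (1963) is too cumbersome for general use and thus is not included here.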

Consider the remission data of the 21 leukemia patients receiving 6-MP in Example 3.3. The remission times in weeks were 6, 6, 6, 7, 10, 13, 16, 22, 23, 6+, 9+, 10+, 11+, 17+, 19+, 20+, 25+, 32+, 32+, 34+, and 35+. The hazard plot given in Figure 3.6 shows that the exponential distribution fits the data very well. Maximum likelihood estimates of the relapse rate and the mean remission time can be obtained, respectively, from (7.2.16) and (7.2.17):

$$\hat{\lambda} = \frac{9}{109 + 250} = 0.025 \text{ per week}, \qquad \hat{\mu} = \frac{1}{0.025} = 40 \text{ weeks}$$

The graphical estimate of $\lambda$ obtained in Example 3.3 is 0.027, which is very close to the MLE. Thus, the remission duration of leukemia patients receiving 6-MP can be described by an exponential distribution with a constant weekly relapse rate of 2.5% and a mean remission time of 40 weeks. The probability of staying in remission for one year (or 52 weeks) or more is estimated by

$$\hat{S}(52) = \exp[-0.025(52)] = 0.273$$

Using (7.2.20) and (7.2.23) for the variance of $\hat{\lambda}$ and $\hat{\mu}$, the 95% confidence intervals for $\lambda$ and $\mu$ are, respectively, (0.009, 0.041) and (13.867, 66.133).
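A short Python sketch reproduces these figures. It assumes the quick variance estimates are the squared estimate divided by the number of relapses, an inference from the fact that this choice recovers the printed intervals, and it uses 1.96 as the normal percentage point. Small differences from the printed values arise because the text rounds the rate to 0.025 and the mean to 40 before continuing.

```python
import math

relapses = [6, 6, 6, 7, 10, 13, 16, 22, 23]                    # exact times
censored = [6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35]      # censored times

r = len(relapses)
lam_hat = r / (sum(relapses) + sum(censored))   # relapse rate per week
mu_hat = 1 / lam_hat                            # mean remission time

z = 1.96
se_lam = lam_hat / math.sqrt(r)   # square root of the assumed quick variance
se_mu = mu_hat / math.sqrt(r)
ci_lam = (lam_hat - z * se_lam, lam_hat + z * se_lam)
ci_mu = (mu_hat - z * se_mu, mu_hat + z * se_mu)

s52 = math.exp(-lam_hat * 52)     # probability of a remission of 52+ weeks

print(round(lam_hat, 3), round(mu_hat, 1), round(s52, 3))
print(ci_lam, ci_mu)
```

Only the nine exact times count as events in the numerator, while all 21 times contribute to the total observation time in the denominator.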

The MLE can also be obtained by using available statistical software. Let t denote the observed survival time (exact or censored) and CENS be an index (or dummy) variable with CENS = 0 if t is censored and 1 otherwise. Assume that the data have been saved in "C:\EXAMPLE.DAT" as a text file, which contains two columns (t in the first column and CENS in the second column for the same study subject), separated by a space.

SAS code for procedure LIFEREG can be used to obtain the estimated covariance matrix defined in (7.1.5) and the MLE of the parameter of the exponential distribution for the observed survival data in "C:\EXAMPLE.DAT".

The respective BMDP code for program 2L begins with

/input file = 'c:\example.dat'

The MLE of $\lambda$ is $\hat{\lambda} = \exp(-\text{CONSTANT})$, where CONSTANT is given by the program.

7.2.2 Two-Parameter Exponential Distribution

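In the case where a two-parameter exponential distribution is more appropriate for the data (Zelen, 1966), the density and survivorship functions are defined, respectively, as

$$f(t) = \begin{cases} \lambda \exp[-\lambda(t - G)] & t \ge G \\ 0 & t < G \end{cases} \tag{7.2.25}$$

$$S(t) = \begin{cases} \exp[-\lambda(t - G)] & t \ge G \\ 1 & t < G \end{cases} \tag{7.2.26}$$

Estimation of $\lambda$ and G for Data without Censored Observations

If $t_1, t_2, \ldots, t_n$ are the survival times of the n patients, using (7.1.1), (7.1.2), (7.2.25), and (7.2.26), the MLE of $\lambda$ is

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} (t_i - \hat{G})} \tag{7.2.27}$$

where the guarantee time G is estimated by the smallest observation, $\hat{G} = \min(t_1, t_2, \ldots, t_n)$, and the mean survival time is estimated by $\hat{\mu} = \hat{G} + 1/\hat{\lambda}$.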

Consider the survival times, in months, of 11 patients following initial pulmonary metastasis from osteogenic sarcoma considered by Burdette and Gehan (1970). The data were 11, 13, 13, 13, 13, 13, 14, 14, 15, 15, and 17. Suppose that the two-parameter exponential distribution is selected. The guarantee time G is estimated by the smallest observation (i.e., $\hat{G} = 11$), and the hazard rate estimated by (7.2.27) is

$$\hat{\lambda} = \frac{11}{(11 - 11) + (13 - 11) + \cdots + (17 - 11)} = 0.367$$

Thus, the exponential model tells us that the minimum survival time is 11 months, and after that the chance of death per month is 0.367. Similarly, the probability of surviving a given amount of time can then be estimated from (7.2.26). For example, the estimated probability of surviving 18 months or longer is

$$\hat{S}(18) = \exp[-0.367(18 - 11)] = 0.077$$
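These numbers can be checked with a brief Python sketch. It assumes the estimates described in the example: the guarantee time is the smallest observation, and the hazard rate is the sample size divided by the sum of the excesses over that minimum. Variable names are illustrative.

```python
import math

months = [11, 13, 13, 13, 13, 13, 14, 14, 15, 15, 17]

g_hat = min(months)                                       # guarantee time
lam_hat = len(months) / sum(t - g_hat for t in months)    # hazard after g_hat
mu_hat = g_hat + 1 / lam_hat                              # mean survival time

s18 = math.exp(-lam_hat * (18 - g_hat))   # survival beyond 18 months

print(g_hat, round(lam_hat, 3), round(mu_hat, 2), round(s18, 3))
```

The excesses over the minimum sum to 30 months across the 11 patients, which gives the rate of 0.367 per month quoted above.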

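We first consider singly censored data. Suppose that an experiment begins with n animals and terminates as soon as the first r deaths occur. For this case, we introduce the estimation procedures derived by Epstein (1960a).

Let the first r survival times be $t_{(1)} \le t_{(2)} \le \cdots \le t_{(r)}$ and let $T^*$ be the total survival observed between the first and the rth death:

$$\begin{aligned} T^* &= (n - 1)(t_{(2)} - t_{(1)}) + (n - 2)(t_{(3)} - t_{(2)}) + \cdots + (n - r + 1)(t_{(r)} - t_{(r-1)}) \\ &= -(n - 1)t_{(1)} + t_{(2)} + t_{(3)} + \cdots + t_{(r-1)} + (n - r + 1)t_{(r)} \\ &= \sum_{i=1}^{r} t_{(i)} - n t_{(1)} + (n - r)t_{(r)} \end{aligned}$$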
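The two ways of writing $T^*$ can be compared numerically. The sketch below computes the weighted sum of gaps between successive deaths and the expanded form, and checks that they agree; the function names and the data (five deaths among ten animals) are hypothetical.

```python
def total_survival_gaps(first_r_times, n):
    """T* as the sum of (n - i + 1) * (t_(i) - t_(i-1)) for i = 2, ..., r."""
    t = sorted(first_r_times)
    # With 0-based index j = i - 1, the weight (n - i + 1) becomes (n - j).
    return sum((n - j) * (t[j] - t[j - 1]) for j in range(1, len(t)))

def total_survival_expanded(first_r_times, n):
    """T* as -(n - 1) t_(1) + t_(2) + ... + t_(r-1) + (n - r + 1) t_(r)."""
    t = sorted(first_r_times)
    r = len(t)
    return -(n - 1) * t[0] + sum(t[1:r - 1]) + (n - r + 1) * t[r - 1]

# Hypothetical experiment: n = 10 animals, stopped at the fifth death.
times = [4, 5, 8, 9, 10]
print(total_survival_gaps(times, 10), total_survival_expanded(times, 10))
```

Both functions need at least two deaths, since $T^*$ measures time accumulated after the first death.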

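The best estimates for G and $\mu$, in the sense that they are unbiased and have minimum variance, are given by

$$\hat{G} = t_{(1)} - \frac{\hat{\mu}}{n}$$

and

$$\hat{\mu} = \frac{T^*}{r - 1}$$

Then $\lambda$ can be estimated by $\hat{\lambda} = 1/\hat{\mu}$.

Confidence intervals for the mean survival time $\mu$ are easy to obtain from the fact that $2(r - 1)\hat{\mu}/\mu = 2T^*/\mu$ has a chi-square distribution with 2(r − 1) degrees of freedom. Thus, for r > 1, a 100(1 − α)% confidence interval for $\mu$ is

$$\frac{2T^*}{\chi^2_{2(r-1),\alpha/2}} < \mu < \frac{2T^*}{\chi^2_{2(r-1),1-\alpha/2}}$$

A 100(1 − α)% confidence interval for G is

$$t_{(1)} - \frac{T^*}{n(r - 1)} F_{2,2(r-1),\alpha} < G < t_{(1)} \tag{7.2.36}$$

Epstein and Sobel (1953) show that this interval is the shortest in the class of intervals being used. If for some particular values of r and α, $F_{2,2(r-1),\alpha}$ is not tabulated in the F-table, Epstein (1960a) suggests an alternative formula.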

Posted: 14/08/2014
