DESCRIPTIVE AND INFERENTIAL STATISTICS

Part of the book DESIGN AND ANALYSIS OF CLINICAL TRIALS (pages 79 - 85)

Dietrich and Kearns (1986) divided statistics into two broad areas, namely descriptive and inferential statistics. Descriptive statistics is the science of summarizing or describing data, while inferential statistics is the science of interpreting data in order to make estimates, test hypotheses, make predictions, or reach decisions about the targeted population on the basis of samples.

Figure 2.4.5 Interaction of two subgroups. (Source: Gail and Simon, 1985.)

In clinical trials, data are usually collected through case report forms, which are designed to capture clinical information from the studies. The information on the case report forms is then entered into the database. The raw database is usually messy, though it does contain valuable clinical information from the study. In practice, it is often of interest to summarize the raw database by a graphical presentation (e.g., a data plot) or by descriptive (or summary) statistics. Descriptive statistics are simple sample statistics such as means and standard deviations (or standard errors) of clinical variables or endpoints. Note that the standard deviation describes the variability of a distribution, either a population distribution or a sample distribution, whereas the standard error describes the variability of a sample statistic (e.g., the sample mean or sample variance). Descriptive statistics are often used to describe the targeted population before and after the study. For example, at baseline, descriptive statistics are often employed to describe the comparability between treatment groups. After the completion of the study, descriptive statistics are useful tools to reveal possible clinical differences (or effects) or trends of study drugs. As an example, Table 2.5.1 provides a partial listing of individual patient demographics and baseline characteristics from a study comparing the effects of captopril and enalapril on quality of life in older hypertensive patients (Testa et al., 1993). As can be seen from Table 2.5.1, although the patient listing as a whole gives a detailed description of the characteristics of individual patients, it does not provide much summary information regarding the study population. By contrast, descriptive statistics for demographic and baseline information describe not only the characteristics of the study population but also the comparability between treatment groups (see Table 2.5.2). In addition, for descriptive purposes, Table 2.5.3 groups patients into low, medium, and high categories according to the ranking of their scores on the baseline quality of life scale. It can be seen that there is a potential difference in treatment effect among the three groups with regard to the change from baseline in quality of life. These differences were confirmed to be statistically significant by valid statistical tests. Therefore a preliminary investigation of descriptive statistics of primary clinical endpoints may reveal a potential drug effect.
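To make the distinction between the standard deviation and the standard error concrete, the following sketch computes both for the 20 diastolic blood pressures in the partial listing of Table 2.5.1. The values are transcribed from that listing and the variable names are illustrative only. The standard error of the mean is taken here as the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics

# Diastolic blood pressures (mmHg) transcribed from the partial listing
# in Table 2.5.1 (20 patients).
diastolic = [94, 94, 91, 97, 100, 95, 99, 109, 104, 91,
             97, 103, 91, 91, 101, 109, 99, 99, 96, 114]

n = len(diastolic)
mean = statistics.mean(diastolic)   # sample mean
sd = statistics.stdev(diastolic)    # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)              # standard error of the sample mean

print(f"n = {n}, mean = {mean:.1f}, SD = {sd:.2f}, SE = {se:.2f}")
```

The standard deviation summarizes how much individual patients differ from one another, whereas the smaller standard error summarizes how precisely the mean itself is estimated, and it shrinks as the sample size grows.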

When we observe some potential differences (effects) or trends, it is necessary to further confirm with certain assurance that the differences (effects) or trends indeed exist and are not due to chance alone. For this purpose it is necessary to provide inferential statistics for the observed differences (effects) or trends. Inferential statistics such as confidence intervals and hypothesis testing are often performed to provide statistical inference on the possible differences (effects) or trends that can be detected based on descriptive statistics. For the rest of this section, we will focus on confidence intervals (or interval estimates). Hypothesis testing will be discussed in more detail in the following section.

Clinical endpoints are often used to assess the efficacy and safety of drug products. For example, diastolic blood pressure is one of the primary clinical endpoints for the study of ACE inhibitor agents in the treatment of hypertensive patients. The purpose of measuring diastolic blood pressure in hypertensive patients is to compare their average diastolic blood pressure with the norm for ordinary healthy subjects. However, the average diastolic blood pressure for the hypertensive patients is unknown. We will need to estimate the average diastolic blood pressure based on the observed diastolic pressures obtained from the hypertensive patients. The observed diastolic blood pressures and the average of these diastolic blood pressures are the sample and sample mean of the study. The sample mean is an estimate of the unknown population average diastolic blood pressure. A point estimate alone, however, may not be of practical use. For example, suppose that the sample mean is 98 mmHg. It is then important to know whether the population average for the hypertensive patients could reasonably be 90 mmHg given that the sample average turned out to be 98 mmHg. This kind of information depends on knowledge of the standard error, not merely of the point estimate itself.
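The role of the standard error in this example can be illustrated with a short sketch. The sample mean of 98 mmHg and the candidate value of 90 mmHg come from the text; the two standard errors are hypothetical values chosen only for illustration, and the multiplier of 1.96 assumes an approximately normal sampling distribution of the mean.

```python
def interval_95(sample_mean, standard_error, multiplier=1.96):
    """Return the (lower, upper) limits of an approximate 95% interval."""
    half_width = multiplier * standard_error
    return sample_mean - half_width, sample_mean + half_width

sample_mean = 98.0   # mmHg, from the example in the text
candidate = 90.0     # mmHg, the population average being questioned

# Two hypothetical standard errors, chosen only for illustration.
for se in (1.5, 5.0):
    lower, upper = interval_95(sample_mean, se)
    plausible = lower <= candidate <= upper
    print(f"SE = {se}: ({lower:.2f}, {upper:.2f}); 90 mmHg plausible? {plausible}")
```

With the smaller standard error the interval excludes 90 mmHg, while with the larger one it does not, even though the point estimate is the same in both cases.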

The observed diastolic blood pressures are usually scattered around the sample mean. Based on these observed diastolic blood pressures, the standard error of the sample mean can be obtained. If the distribution of the diastolic blood pressure appears to be bell shaped and the sample size is moderate, then we can be about 95% confident that the unknown average diastolic blood pressure of the targeted population lies within approximately two (more precisely, 1.96) standard errors below and above the sample mean. The lower and upper limits of this range constitute an interval estimate for the unknown population average diastolic blood pressure. An interval estimate is usually referred to as a confidence interval with a desired confidence level,

Table 2.5.1 Partial Data of Capoten Quality of Life Study

                      Age      Height    Weight    Heart Rate    Systolic BP  Diastolic BP  Alcohol      Tobacco
Patient  Race         (years)  (inches)  (pounds)  (per minute)  (mmHg)       (mmHg)        Consumption  Consumption
C01-003  Caucasian    69       70        195       72            146          94            No           No
C01-004  Caucasian    69       70        188       72            170          94            Yes          Yes
C01-006  Caucasian    62       76        231.5     84            158          91            No           No
C01-008  Caucasian    56       72        244       76            140          97            No           No
C01-010  Caucasian    55       75        258       60            139          100           Yes          No
C01-012  Caucasian    58       74        191       86            138          95            Yes          No
C02-004  Black        66       70        220       64            141          99            No           No
C02-006  Black        55       65        244.5     64            171          109           No           No
C02-008  Black        61       69        281       76            173          104           No           No
C02-009  Black        61       68        190.5     60            150          91            Yes          No
C02-013  Black        71       74        193       88            140          97            No           Yes
C03-001  Caucasian    55       74        303       76            151          103           Yes          No
C03-004  Caucasian    65       71        243       78            156          91            No           No
C03-005  Caucasian    55       69        178       100           150          91            Yes          No
C03-006  Caucasian    59       65        174       64            157          101           Yes          Yes
C03-010  Caucasian    74       67        171       80            188          109           No           No
C03-012  Caucasian    65       65        150       58            169          99            Yes          No
C04-003  Caucasian    64       72        194       96            161          99            Yes          No
C04-005  Black        59       69        201       76            179          96            No           No
C04-009  Caucasian    58       78        334       80            159          114           No           No


such as 95%. Unlike a point estimate, a confidence interval provides a whole interval as an estimate for a population parameter instead of just a single value. A 95% confidence interval is a random interval that is calculated according to a certain procedure that would produce a different interval for each sample upon repeated sampling from the population, and 95% of these intervals would contain the unknown fixed population parameter. A 95%

Table 2.5.2 Demographic, Clinical, and Quality of Life Variables at the Baseline

Variable                                    Captopril (N = 192)   Enalapril (N = 187)
Demographic
  Age (yr)                                  64.2 ± 5.5            64.6 ± 6.4
  Education (%)
    No high school                          9                     7
    Some high school                        36                    30
    Some college                            49                    58
    Postgraduate degree                     6                     5
  Income (%)
    < $15,000                               15                    12
    $15,000–40,999                          41                    47
    $41,000–80,000                          33                    35
    > $80,000                               11                    6
  Percent married                           87                    88
  Occupational status (%)
    Employed full-time                      40                    36
    Employed part-time                      14                    15
    Retired                                 43                    49
    Unemployed                              3                     1
  Race (%)
    White                                   84                    82
    Black                                   14                    18
    Other                                   3                     1
Clinical
  Weight (lb)                               197.8 ± 36.4          198.7 ± 37.9
  Body-mass index                           28.8 ± 4.7            28.6 ± 5.0
  Blood pressure (mmHg)
    Systolic                                155.0 ± 14.8          154.6 ± 15.8
    Diastolic                               97.3 ± 5.8            97.3 ± 5.9
  Previous antihypertensive therapy (%)     89                    87
Quality-of-life scales
  Psychological well-being                  462 ± 78              452 ± 80
  Psychological distress                    526 ± 64              521 ± 59
  General perceived health                  493 ± 77              495 ± 67
  Well-being at work or in daily routine    485 ± 61              480 ± 62
  Sexual-symptom distress                   518 ± 144             503 ± 162
Distress and stress indexes
  Side effects and symptoms distress        24 ± 40               25 ± 35
  Life events                               30 ± 40               28 ± 40
  Stress                                    274 ± 144             296 ± 142

Source: Testa et al. (1993).

Table 2.5.3 Changes from the Baseline to End Point in Quality of Life According to Scores on the Quality of Life Scale at the Baseline

                           Baseline Scores for
                           Randomized Patients    Captopril (N = 184)                     Enalapril (N = 178)
                                                  Low          Medium       High          Low          Medium       High
Scale                      Low   Medium  High     (N = 53)     (N = 60)     (N = 71)      (N = 60)     (N = 65)     (N = 53)
General perceived health   420   506     552      21.0 ± 6.6   2.7 ± 6.3    1.6 ± 3.2     1.7 ± 7.3    10.4 ± 5.3   15.8 ± 6.1
Psychological well-being   374   469     524      19.3 ± 8.8   1.4 ± 7.3    8.2 ± 3.0     17.7 ± 9.0   1.3 ± 6.1    7.9 ± 6.0
Psychological distress     457   533     577      19.7 ± 6.8   6.2 ± 5.9    3.5 ± 2.6     7.2 ± 6.5    0.9 ± 5.1    10.8 ± 4.5
Overall quality of life    427   502     545      18.1 ± 5.3   6.8 ± 5.4    0.5 ± 2.4     5.9 ± 6.1    4.3 ± 4.5    10.7 ± 4.6

Source: Testa et al. (1993).
Note: Mean ± SD change from baseline score.


confidence interval is not an interval that will contain 95% of the sample averages that would be obtained on repeating the sampling procedure, nor is it a particular interval within which the population average will fall 95% of the time. It should be noted that the population average is an unknown constant and does not vary, while a confidence interval is random. For any particular computed interval, the population average is either in the interval or not.
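The repeated-sampling interpretation can be checked with a small simulation. This is a minimal sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 98 mmHg and standard deviation of 6 mmHg, each sample has 50 observations, and the interval is the sample mean plus or minus 1.96 standard errors. The function names and all parameter values are illustrative and do not come from the study.

```python
import math
import random
import statistics

def ci_95(sample):
    """Approximate 95% interval: sample mean plus or minus 1.96 standard errors."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se

def coverage(true_mean=98.0, true_sd=6.0, n=50, trials=2000, seed=1):
    """Fraction of simulated intervals that contain the fixed true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample; the true mean stays fixed, the interval changes.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = ci_95(sample)
        hits += lower <= true_mean <= upper
    return hits / trials

print(f"Estimated coverage: {coverage():.3f}")
```

The estimated coverage should be close to 0.95. Each simulated interval either contains the fixed population mean or does not; the 95% refers to the long-run proportion of intervals produced by the procedure.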

A classical confidence interval for the population average of a clinical variable is symmetric about the observed sample mean of the observed responses of the clinical variable. This classical confidence interval is sometimes called the shortest confidence interval because its width is the shortest among all confidence intervals of the same confidence level obtained by other statistical procedures. In some situations, it may be of interest to obtain a confidence interval that is symmetric with respect to a fixed number. For example, in bioequivalence trials it is of interest to obtain a confidence interval for the difference in a pharmacokinetic parameter such as the area under the blood or plasma concentration-time curve (AUC) between the test and reference drug products. If the 90% confidence interval falls within ±20% of the average of the reference product, then we conclude that the test product is bioequivalent to the reference product (e.g., Chow and Liu, 2000). Since the limits are ±20%, which are symmetric about 0%, Westlake (1976) proposed considering a confidence interval symmetric with respect to 0 rather than the shortest confidence interval, which is symmetric about the observed difference in the sample means. Note that, as indicated in Chow and Liu (2000), the most common criticisms of Westlake's symmetric confidence interval are that it has shifted away from the direction in which the sample difference was observed and that the tail probabilities associated with it are not symmetric. As a result, Westlake's symmetric confidence interval moves from a two-sided to a one-sided approach as the true difference and the random error increase.
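The decision rule for the classical (shortest) interval can be sketched as follows. This is not Westlake's symmetric interval. It is a simplified illustration that assumes a normal approximation for the estimated difference, whereas in practice the interval would usually be based on a t distribution appropriate to the study design. The function name and the AUC summary values are hypothetical.

```python
from statistics import NormalDist

def ci_within_20_percent(diff, se, reference_mean, level=0.90):
    """Check whether a normal-approximation CI for the mean difference
    lies entirely within +/-20% of the reference mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.645 for a 90% interval
    lower, upper = diff - z * se, diff + z * se
    limit = 0.20 * reference_mean
    return (lower, upper), (-limit < lower and upper < limit)

# Hypothetical AUC summary values, for illustration only.
(lower, upper), equivalent = ci_within_20_percent(diff=4.0, se=3.0, reference_mean=100.0)
print(f"90% CI: ({lower:.2f}, {upper:.2f}); within +/-20%: {equivalent}")
```

The interval here is centered on the observed difference, so it can sit well to one side of 0 while still falling inside the ±20% limits.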

The confidence level is the degree of certainty that the interval actually contains the unknown population parameter value. It provides the degree of assurance or confidence that the statement regarding the population parameter is correct. The more certainty we want, the wider the interval will have to be. A very wide interval estimate may not be of practical use because it fails to identify the population parameter closely. In practice, it is more usual to use 90%, 95%, or 99% as confidence levels. Table 2.5.4 summarizes the multiples of the standard error that are needed for various confidence levels, such as 68%, 95%, and 99%. Therefore we will have approximately 68%, 95%, and more than 99% confidence that the population parameter will fall within one, two, and three standard errors of the observed value, respectively.

Table 2.5.4 Confidence Levels with Various Standard Errors

Standard Errors    Confidence Level
0.5                0.3830
0.675              0.5000
1.0                0.6826
1.5                0.8664
1.96               0.9500
2.0                0.9544
2.5                0.9876
3.0                0.9974
4.0                1.0000
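The entries of Table 2.5.4 can be reproduced, under the assumption that the sampling distribution is normal, as the probability that a standard normal variable lies within k standard errors of its mean, that is, 2Φ(k) − 1. The following sketch uses Python's standard library; the function name is illustrative.

```python
from statistics import NormalDist

def confidence_level(k):
    """Probability that a standard normal variable lies within k standard errors of its mean."""
    return 2 * NormalDist().cdf(k) - 1

for k in (0.5, 0.675, 1.0, 1.5, 1.96, 2.0, 2.5, 3.0, 4.0):
    print(f"{k:<6} {confidence_level(k):.4f}")
```

The computed values agree with the tabulated ones up to rounding in the fourth decimal place.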

When we claim that a drug product is effective and safe with 95% assurance, it is expected that we will observe consistent significant results 95% of the time if the clinical trial were repeatedly carried out with the same protocol. However, current FDA regulation only requires that two adequate and well-controlled clinical trials be conducted to provide substantial evidence of efficacy and safety. It is therefore of interest to estimate the probability that the drug is effective and safe based on the clinical results obtained from the two adequate and well-controlled trials.
