Statistical Tests and Confidence Intervals

Excerpt from Statistical modeling for medical researcher (pages 134 - 137)

In this section we briefly introduce three fundamental types of statistical tests, which we will use in this and later chapters: likelihood ratio tests, score tests and Wald tests. Each of these tests involves a statistic whose distribution is approximately normal or chi-squared. The accuracy of these approximations increases with increasing study sample size. We will illustrate these tests using the AIDS example from Section 4.8.

4.9.1. Likelihood Ratio Tests

Suppose that we wish to test the null hypothesis that $\pi = \pi_0$. Let $L[\pi]$ denote the likelihood function for $\pi$ given the observed data. We look at the likelihood ratio $L[\pi_0]/L[\hat{\pi}]$. If this ratio is small then we would be much more likely to have observed the data that were actually obtained if the true value of $\pi$ were $\hat{\pi}$ rather than $\pi_0$. Hence, small values of $L[\pi_0]/L[\hat{\pi}]$ provide evidence that $\pi \neq \pi_0$. Moreover, it can be shown that if the null hypothesis is true, then

$$\chi^2 = -2\log\bigl[L[\pi_0]/L[\hat{\pi}]\bigr] \tag{4.9}$$

has an approximately chi-squared distribution with one degree of freedom.

Equation (4.9) is an example of a likelihood ratio test. The P value associated with this test is the probability that a chi-squared distribution with one degree of freedom exceeds the value of this test statistic.

In our AIDS example, the likelihood ratio is

$$L[\pi_0]/L[\hat{\pi}] = \frac{\pi_0^5(1-\pi_0)^{45}}{\hat{\pi}^5(1-\hat{\pi})^{45}}.$$

Suppose that we wished to test the null hypothesis that $\pi_0 = 0.2$. Now since $\hat{\pi} = 0.1$, equation (4.9) gives us that

$$\chi^2 = -2\log\bigl[(0.2^5 \times 0.8^{45})/(0.1^5 \times 0.9^{45})\bigr] = 3.67.$$

The probability that a chi-squared distribution with one degree of freedom exceeds 3.67 is $P = 0.055$.
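The calculation above can be reproduced in a few lines. The following is a minimal sketch, assuming the AIDS example consists of 5 events among 50 patients (as implied by the exponents 5 and 45 in the likelihood); the function names are illustrative and do not come from the text. The P value uses the fact that a chi-squared statistic with one degree of freedom is the square of a standard normal variable, so only the standard library is needed.

```python
import math


def chi2_1df_sf(x: float) -> float:
    """P[chi-squared with 1 df > x], using chi2_1 = Z**2 for standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))


def log_likelihood(pi: float, events: int = 5, n: int = 50) -> float:
    """Binomial log likelihood, up to an additive constant."""
    return events * math.log(pi) + (n - events) * math.log(1.0 - pi)


def likelihood_ratio_test(pi0: float, events: int = 5, n: int = 50):
    """Return the statistic of equation (4.9) and its P value."""
    pi_hat = events / n  # maximum likelihood estimate (assumed counts)
    stat = -2.0 * (log_likelihood(pi0, events, n) - log_likelihood(pi_hat, events, n))
    return stat, chi2_1df_sf(stat)


stat, p = likelihood_ratio_test(0.2)
print(f"chi2 = {stat:.2f}, P = {p:.3f}")  # chi2 = 3.67, P = 0.055
```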


4.9.2. Quadratic Approximations to the Log Likelihood Ratio Function

Consider quadratic equations of the form $f[x] = -a(x - b)^2$, where $a \geq 0$. Note that all equations of this form achieve a maximum value of 0 at $x = b$.

Suppose that $g[x]$ is any smooth function that has negative curvature at $x_0$. Then it can be shown that there is a unique equation of the form $f[x] = -a(x - b)^2$ such that $f$ and $g$ have the same slope and curvature at $x_0$. Let

$$q[\pi] = \log\bigl[L[\pi]/L[\hat{\pi}]\bigr] \tag{4.10}$$

equal the logarithm of the likelihood ratio at $\pi$ relative to $\hat{\pi}$. Suppose that we wish to test the null hypothesis that $\pi = \pi_0$. Then the likelihood ratio test statistic is given by $-2q[\pi_0]$ (see equation (4.9)). In many practical situations, equation (4.10) is difficult to calculate. For this reason $q[\pi]$ is often approximated by a quadratic equation. The maximum value of $q[\pi]$ is $q[\hat{\pi}] = \log[L[\hat{\pi}]/L[\hat{\pi}]] = 0$. We will consider approximating $q[\pi]$ by quadratic equations that also have a maximum value of 0. Let

$f_s[\pi]$ be the quadratic equation that has the same slope and curvature as $q[\pi]$ at $\pi_0$ and achieves a maximum value of 0,

$f_w[\pi]$ be the quadratic equation that has the same slope and curvature as $q[\pi]$ at $\hat{\pi}$ and achieves a maximum value of 0.

Tests that approximate $q[\pi]$ by $f_s[\pi]$ are called score tests. Tests that approximate $q[\pi]$ by $f_w[\pi]$ are called Wald tests. We will introduce these two types of tests in the next two sections.
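The matching quadratic has a simple closed form. Since $f'[x] = -2a(x - b)$ and $f''[x] = -2a$, equating these to a slope $s$ and a curvature $c$ at $x_0$ gives $a = -c/2$ and $b = x_0 - s/c$. A minimal sketch follows; the helper name `matching_quadratic` and the example function $g[x] = \log x$ are illustrative assumptions, not taken from the text.

```python
def matching_quadratic(x0: float, slope: float, curvature: float) -> tuple[float, float]:
    """Return (a, b) so that f[x] = -a * (x - b)**2 has the given
    slope and curvature at x0. The curvature must be negative."""
    if curvature >= 0:
        raise ValueError("curvature at x0 must be negative")
    a = -curvature / 2.0        # from f''[x] = -2a
    b = x0 - slope / curvature  # from f'[x0] = -2a(x0 - b)
    return a, b


# Hypothetical example: g[x] = log(x) at x0 = 2 has slope 1/2 and curvature -1/4.
a, b = matching_quadratic(2.0, 0.5, -0.25)
print(a, b)  # 0.125 4.0

# Check that f really has the requested slope and curvature at x0.
assert -2.0 * a * (2.0 - b) == 0.5
assert -2.0 * a == -0.25
```

Note that the quadratic matches only the slope and curvature of $g$ at $x_0$, not its value; its maximum is always 0.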

4.9.3. Score Tests

Suppose we again wish to test the null hypothesis that $\pi = \pi_0$. If the null hypothesis is true then it can be shown that

$$\chi^2 = -2f_s[\pi_0] \tag{4.11}$$

has an approximately chi-squared distribution with one degree of freedom.

Equation (4.11) is an example of a score test. Score tests are identical to likelihood ratio tests except that a likelihood ratio test is based on the true log likelihood ratio function $q[\pi]$ while a score test approximates $q[\pi]$ by $f_s[\pi]$.

In the AIDS example, $\hat{\pi} = 0.1$ and

$$q[\pi] = \log\bigl[(\pi/0.1)^5((1-\pi)/0.9)^{45}\bigr].$$

It can be shown that $q[\pi]$ has slope

$$\frac{5}{\pi} - \frac{45}{1-\pi}$$

and curvature

$$-\frac{5}{\pi^2} - \frac{45}{(1-\pi)^2}.$$

We wish to test the null hypothesis that $\pi_0 = 0.2$. The slope and curvature of $q[\pi]$ at $\pi = 0.2$ are $-31.25$ and $-195.3125$, respectively. It can be shown that $f_s[\pi] = -97.65625(\pi - 0.04)^2$ also has this slope and curvature at $\pi = 0.2$. Therefore, if the true value of $\pi$ is 0.2, then $-2f_s[0.2] = 2 \times 97.65625 \times (0.2 - 0.04)^2 = 5$ has an approximately chi-squared distribution with one degree of freedom. The P value associated with this score statistic is $P = 0.025$, which is lower than the P value from the corresponding likelihood ratio test.
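The numbers in this example can be checked directly from the slope and curvature formulas. The sketch below assumes 5 events among 50 patients and follows the text in using the curvature of $q[\pi]$ at $\pi_0$; the function names are illustrative.

```python
import math


def q_slope(pi: float, events: int = 5, n: int = 50) -> float:
    """Slope of the log likelihood ratio q[pi]."""
    return events / pi - (n - events) / (1.0 - pi)


def q_curvature(pi: float, events: int = 5, n: int = 50) -> float:
    """Curvature (second derivative) of q[pi]."""
    return -events / pi**2 - (n - events) / (1.0 - pi) ** 2


def score_test(pi0: float, events: int = 5, n: int = 50):
    """Return (a, b, statistic, P) for f_s[pi] = -a * (pi - b)**2."""
    s = q_slope(pi0, events, n)
    c = q_curvature(pi0, events, n)
    a = -c / 2.0      # matches the curvature of q at pi0
    b = pi0 - s / c   # matches the slope of q at pi0
    stat = 2.0 * a * (pi0 - b) ** 2  # -2 * f_s[pi0], which simplifies to -s**2 / c
    p = math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-squared, 1 df
    return a, b, stat, p


a, b, stat, p = score_test(0.2)
print(f"a = {a:.5f}, b = {b:.2f}, chi2 = {stat:.2f}, P = {p:.3f}")
# a = 97.65625, b = 0.04, chi2 = 5.00, P = 0.025
```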

4.9.4. Wald Tests and Confidence Intervals

If the null hypothesis that $\pi = \pi_0$ is true, then

$$\chi^2 = -2f_w[\pi_0] \tag{4.12}$$

also has an approximately chi-squared distribution with one degree of freedom. Equation (4.12) is an example of a Wald test. It is identical to the likelihood ratio test except that a likelihood ratio test is based on the true log likelihood ratio function $q[\pi]$, while a Wald test approximates $q[\pi]$ by $f_w[\pi]$. In Section 4.8.1 we said that the variance of a maximum likelihood estimate can be approximated by $\mathrm{var}[\hat{\pi}] = -1/C$. It can be shown that

$$-2f_w[\pi_0] = (\pi_0 - \hat{\pi})^2/\mathrm{var}[\hat{\pi}]. \tag{4.13}$$

The standard error of $\hat{\pi}$ is approximated by $\mathrm{se}[\hat{\pi}] = \sqrt{-1/C}$. Recall that a chi-squared statistic with one degree of freedom equals the square of a standard normal random variable. Hence, an equivalent way of performing a Wald test is to calculate

$$z = (\hat{\pi} - \pi_0)/\mathrm{se}[\hat{\pi}], \tag{4.14}$$

which has an approximately standard normal distribution. An approximate 95% confidence interval for $\pi$ is given by

$$\hat{\pi} \pm 1.96\,\mathrm{se}[\hat{\pi}]. \tag{4.15}$$

Equation (4.15) is known as a Wald confidence interval.

In the AIDS example, $\hat{\pi} = 0.1$ and $\mathrm{se}[\hat{\pi}] = 0.0424$. Consider the null hypothesis that $\pi_0 = 0.2$. Equation (4.14) gives that $z = (0.1 - 0.2)/0.0424 = -2.36$. The probability that a $z$ statistic is less than $-2.36$ or greater than 2.36 is $P = 0.018$. The 95% confidence interval for $\pi$ is $0.1 \pm 1.96 \times 0.0424 = (0.017, 0.183)$.
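As a check on this arithmetic, the Wald statistic and confidence interval can be computed as follows. This is a sketch under two assumptions: that the AIDS example has 5 events among 50 patients, and that $C$ is the curvature of the log likelihood evaluated at $\hat{\pi}$ (which reproduces the standard error of 0.0424 quoted above). The function name is illustrative.

```python
import math


def wald_test(pi0: float, events: int = 5, n: int = 50, z_crit: float = 1.96):
    """Return (se, z, two-sided P, confidence interval) for a binomial proportion."""
    pi_hat = events / n
    # Curvature C of the log likelihood at pi_hat (assumed meaning of C).
    c = -events / pi_hat**2 - (n - events) / (1.0 - pi_hat) ** 2
    se = math.sqrt(-1.0 / c)                 # se[pi_hat] = sqrt(-1/C)
    z = (pi_hat - pi0) / se                  # equation (4.14)
    p = math.erfc(abs(z) / math.sqrt(2.0))   # P[|Z| > |z|] for standard normal Z
    ci = (pi_hat - z_crit * se, pi_hat + z_crit * se)  # equation (4.15)
    return se, z, p, ci


se, z, p, ci = wald_test(0.2)
print(f"se = {se:.4f}, z = {z:.2f}, P = {p:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
# se = 0.0424, z = -2.36, P = 0.018, 95% CI = (0.017, 0.183)
```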

4.9.5. Which Test Should You Use?

The three tests outlined above all generalize to more complicated situations. Given a sufficiently large sample size all of these methods are equivalent. However, likelihood ratio tests and score tests are more accurate than Wald tests for most problems that are encountered in practice. For this reason, you should use a likelihood ratio or score test whenever they are available.

The likelihood ratio test has the property that it is unaffected by transformations of the parameter of interest and is preferred over the score test for this reason. The Wald test is much easier to calculate than the other two, which are often not given by statistical software packages. It is common practice to use Wald tests when they are the only ones that can be easily calculated.

Wide divergence between these three tests can result when the log likelihood function is poorly approximated by a quadratic curve. In this case it is desirable to transform the parameter in such a way as to give the log likelihood function a more quadratic shape.
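To illustrate how the choice of parameter scale affects these tests, the sketch below repeats the likelihood ratio and Wald tests for the AIDS example after re-expressing $\pi$ on the log odds scale, $\phi = \log[\pi/(1-\pi)]$. This transformation is chosen here purely for illustration and is not prescribed by the text. The sketch again assumes 5 events among 50 patients, and it takes the standard error of $\hat{\phi}$ from the curvature of the log likelihood in $\phi$, which is $-n\hat{\pi}(1-\hat{\pi})$ for this binomial model.

```python
import math

EVENTS, N = 5, 50                 # assumed counts for the AIDS example
PI_HAT, PI_0 = EVENTS / N, 0.2


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(phi: float) -> float:
    return 1.0 / (1.0 + math.exp(-phi))


def log_lik(pi: float) -> float:
    """Binomial log likelihood, up to an additive constant."""
    return EVENTS * math.log(pi) + (N - EVENTS) * math.log(1.0 - pi)


# Likelihood ratio statistic, parameterized by pi and by phi = logit(pi).
lr_pi = -2.0 * (log_lik(PI_0) - log_lik(PI_HAT))
lr_phi = -2.0 * (log_lik(expit(logit(PI_0))) - log_lik(expit(logit(PI_HAT))))

# Wald statistic on the probability scale: var = -1/C = pi_hat(1 - pi_hat)/n.
se_pi = math.sqrt(PI_HAT * (1.0 - PI_HAT) / N)
wald_pi = ((PI_HAT - PI_0) / se_pi) ** 2

# Wald statistic on the log odds scale: var = 1 / (n pi_hat (1 - pi_hat)).
se_phi = math.sqrt(1.0 / (N * PI_HAT * (1.0 - PI_HAT)))
wald_phi = ((logit(PI_HAT) - logit(PI_0)) / se_phi) ** 2

print(f"LR:   pi scale {lr_pi:.2f}, log odds scale {lr_phi:.2f}")
print(f"Wald: pi scale {wald_pi:.2f}, log odds scale {wald_phi:.2f}")
```

Under these assumptions the likelihood ratio statistic is about 3.67 on both scales, while the Wald statistic is about 5.56 for $\pi$ and about 2.96 for the log odds, which is closer to the likelihood ratio result.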

In this text, the most important example of a score test is the logrank test, which is discussed in Chapter 6. In Chapters 5, 7 and 9 we will look at changes in model deviance as a means of selecting the best model for our data. Tests based on these changes in deviance are an important example of likelihood ratio tests. All of the confidence intervals in this text that are derived from logistic regression, survival or Poisson regression models are Wald intervals. Tests of statistical significance in these models that are derived directly from the parameter estimates are Wald tests.
