Researchers considering using a case-control study approach must become familiar and comfortable with the concepts of odds and odds ratios. The odds ratio is the measure of association that
readers will expect to be reported for a case-control study. Odds are most familiar from their connection with betting. A horse with an equal chance of winning a race (50% likely to win) or of losing a race (50% likely to lose) is said to have
“even odds,” or odds of 1 (50%/50%). Similarly, a case-control study compares the likelihood of having had a particular exposure to not having had it (Figure 10-3). If 50% of the participants in a study report a history of exposure and 50% report no past exposure, then the odds of exposure are 50%/50%, or 1. If 25% report having the exposure and 75% do not, then the odds are 25%/75%, or 0.33. If 2% report being exposed in the past and 98% report not being exposed, then the odds are 2%/98%, or 0.02.
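The arithmetic in these examples can be sketched in a few lines of Python. This is only an illustration: the function name `odds` is a label chosen here, not something from the chapter, and the three proportions are the ones used in the paragraph above.

```python
def odds(proportion_exposed: float) -> float:
    """Convert a proportion p (0 <= p < 1) into odds, p / (1 - p)."""
    if not 0 <= proportion_exposed < 1:
        raise ValueError("proportion must be at least 0 and less than 1")
    return proportion_exposed / (1 - proportion_exposed)


# The three exposure proportions discussed in the text.
for p in (0.50, 0.25, 0.02):
    print(f"{p:.0%} exposed -> odds = {odds(p):.2f}")
```

Rounded to two decimal places, the three results are 1.00, 0.33, and 0.02, matching the values in the text.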
The main measure of association for case-control studies compares the odds of exposure among cases to the odds of exposure among controls.
This is called an odds ratio (OR). A contingency table (sometimes called a crosstab) is a row-by-column table that displays the counts of how often various combinations of events happen. Two-by-two (2×2) tables are used in case-control studies to compare two dichotomous (yes/no) variables.
Figure 10-4 shows a sample 2×2 table for a case-control study. In the 2×2 table for an unmatched
case-control study, the columns are for disease status (case = yes, and control = no) and the rows are for exposure status (exposed = yes, and
unexposed = no). All of the participants in the study are assigned to one of the four resulting boxes: (a) cases with an exposure history, (b) controls with an exposure history, (c) cases with no exposure
history, and (d) controls with no exposure history.
As a check, the total number of cases in the study should be a + c, the total number of controls in the study should be b + d, and the total number of participants should be a + b + c + d.
FIGURE 10-3 Odds
FIGURE 10-4 Odds Ratio (Point Estimate)
The odds of exposure in cases are the number of cases with the exposure (a) divided by the number of cases without the exposure (c). The odds of exposure in controls are the number of controls with the exposure (b) divided by the number of controls without the exposure (d). Basic algebra shows that the equation for the odds ratio, (a ÷ c)/(b ÷ d), can be simplified to OR = ad/bc.
This calculation is the point estimate for the odds ratio, and it provides a starting point for
understanding the relationship between the disease and exposure status in the study population.
OR = 1: the odds of exposure were the same for cases and controls.
OR > 1: cases had higher odds of exposure than controls, implying that the exposure was risky.
OR < 1: cases had lower odds of exposure than controls, implying that the exposure was
protective.
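As a rough sketch, the point estimate and its reading can be computed directly from the four cell counts, using the fact that (a ÷ c)/(b ÷ d) equals ad/bc. The counts below are hypothetical and chosen only for illustration, and the function names are assumptions rather than anything defined in the chapter.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Point estimate of the OR for an unmatched 2x2 table.

    a = cases with exposure,    b = controls with exposure,
    c = cases without exposure, d = controls without exposure.
    """
    if b == 0 or c == 0:
        raise ZeroDivisionError("the OR is undefined when b or c is zero")
    return (a * d) / (b * c)


def describe(or_value: float) -> str:
    """Translate a point estimate into the wording used for case-control studies."""
    if or_value > 1:
        return "cases had higher odds of exposure than controls"
    if or_value < 1:
        return "cases had lower odds of exposure than controls"
    return "the odds of exposure were the same for cases and controls"


# Hypothetical counts, not taken from the text.
a, b, c, d = 40, 20, 60, 80
estimate = odds_ratio(a, b, c, d)
print(f"OR = {estimate:.3f}: {describe(estimate)}")
```

The point estimate alone says nothing about statistical significance; that requires the confidence interval.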
The 95% confidence interval shows whether an OR is statistically significant (Figure 10-5). (Chapter 17 provides additional information about how to
interpret confidence intervals and how they are related to sample size.)
FIGURE 10-5 Interpretation of the Odds Ratio Based on Its 95% Confidence Interval
When the entire 95% confidence interval is less than 1, the OR is statistically significant, and the exposure is deemed to be protective in the
study population.
When the entire 95% confidence interval is greater than 1, the OR is statistically significant, and the exposure is deemed to be risky in the study population.
When the 95% confidence interval (95% CI) includes OR = 1, the OR is said to be not statistically significant in the study population.
This is because the lower end of the confidence interval is less than 1, suggesting protection, while the higher end of the confidence interval is greater than 1, suggesting risk. In this situation, the study shows no statistically significant association between the exposure and the disease in the study population. This may reflect a true absence of a relationship between the exposure and the disease, but it may also indicate that the sample size was too small.
Power calculations can be used to verify
whether the sample size was sufficient to detect differences in the odds among cases and
controls if a difference really did exist.
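One common way to obtain an approximate 95% confidence interval from the four cell counts is the log-based (Woolf) method. The text does not say which method its software uses, so treat the following as a sketch under that assumption. The counts are hypothetical and the function names are chosen here for illustration. The method requires all four cells to be greater than zero.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Return (OR, lower, upper) using the log-based (Woolf) approximation.

    Assumes every cell count is greater than zero; z = 1.96 gives a 95% CI.
    """
    estimate = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


def interpret_ci(lower: float, upper: float) -> str:
    """Apply the three interpretation rules described in the text."""
    if upper < 1:
        return "statistically significant: entire CI below 1 (protective)"
    if lower > 1:
        return "statistically significant: entire CI above 1 (risky)"
    return "not statistically significant: CI includes 1"


# Hypothetical counts, not taken from the text.
estimate, lower, upper = odds_ratio_ci(40, 20, 60, 80)
print(f"OR = {estimate:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
print(interpret_ci(lower, upper))
```

Other interval methods (for example, exact methods) can give slightly different limits, so results from a statistical package may not match this approximation to the last decimal place.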
Chi-square tests are derived from the same tables used to calculate odds ratios. When the 95%
confidence interval does not include 1, the p-value for the Chi-square test will generally be p < 0.05, which is statistically significant. When the 95% confidence interval for an OR includes the number 1, the p-value for the Chi-square test will generally be p > 0.05, which indicates no statistically significant association.
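To illustrate how a Chi-square test is derived from the same 2×2 table, the sketch below computes the uncorrected Pearson Chi-square statistic and its p-value with one degree of freedom. Whether a continuity correction is applied varies by software and is not stated in the text, so the uncorrected form is an assumption here. The counts are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Uncorrected Pearson Chi-square test for a 2x2 table (1 degree of freedom).

    Returns (chi-square statistic, p-value). Assumes no row or column total is zero.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not taken from the text.
chi2, p_value = chi_square_2x2(40, 20, 60, 80)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
print("p < 0.05" if p_value < 0.05 else "p >= 0.05")
```

Because the confidence interval and the Chi-square test rest on different approximations, they can occasionally disagree when a result sits very close to the 0.05 threshold.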
Once the counts for a, b, c, and d are known,
computer- and Internet-based statistical programs
(such as statistical software packages or the OpenEpi.com website) can be used to calculate the point estimate for the OR (the value of ad/bc), along with its corresponding 95% confidence interval. Sample output is shown in Figure 10-6.
One example has an odds ratio of 1.588 and a 95%
confidence interval of (1.027, 2.453), implying that the exposure was risky since the entire 95% CI is greater than 1. The Chi-square p-value of p < 0.05 confirms this conclusion. The other example has an odds ratio of 1.158 (0.650, 2.066). Since the 95% CI overlaps 1, the association is not statistically
significant. The appropriate conclusion in this example is that the study found no statistically significant association between the
exposure and the disease. The Chi-square p-value of p > 0.05 confirms this conclusion. Logistic
regression models can be used to calculate odds ratios that adjust for possible confounding
variables.
For a case-control study, it is incorrect to say that
“the exposed had a higher (or lower) rate of disease than the unexposed” because the rates of disease in exposed and unexposed participants are not known. Case-control studies recruit participants because they have or do not have a disease.
Usually about 50% of participants in a case-control study are cases even if cases make up less than
1% of the community from which the study
population was drawn. As a result, the prevalence of disease among exposed persons in the study population could be 70% even when the
prevalence of disease among exposed persons in the community from which participants were drawn is less than 1%. Because the study population is usually not representative of the community as a whole, case-control studies are unable to estimate rates of disease among the exposed and
unexposed.
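A small numerical sketch can make this point concrete. All counts below are hypothetical. The community has 100,000 people, of whom 500 have the disease. The study enrolls all 500 cases and 500 controls, and the controls are idealized so that their exposure distribution exactly mirrors that of community members without the disease.

```python
def proportion_diseased_among_exposed(a: int, b: int) -> float:
    """Share of exposed people who have the disease: a / (a + b)."""
    return a / (a + b)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio ad/bc for a 2x2 table."""
    return (a * d) / (b * c)


# a = diseased and exposed, b = not diseased and exposed,
# c = diseased and unexposed, d = not diseased and unexposed.
community = {"a": 200, "b": 19_900, "c": 300, "d": 79_600}  # hypothetical
study = {"a": 200, "b": 100, "c": 300, "d": 400}            # hypothetical

for label, table in (("community", community), ("study", study)):
    share = proportion_diseased_among_exposed(table["a"], table["b"])
    print(f"{label}: {share:.1%} of exposed are diseased, "
          f"OR = {odds_ratio(**table):.3f}")
```

Under these assumptions, the proportion of exposed persons with the disease is under 1% in the community but about two-thirds in the study population, while the odds ratio is the same in both tables. This is why the odds ratio, rather than a rate, is the appropriate measure to report.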
FIGURE 10-6 Examples of Odds Ratio Calculations
Case-control studies are, however, able to examine odds of exposure among the diseased and the not diseased. For case-control studies, the orientation
should always be from disease status to exposure history, and from odds rather than risks or rates.
Accordingly, the results should always be phrased to indicate that cases had greater (or lesser) odds of exposure than controls.