Results drawn from samples are susceptible to two types of error: type I error and type II error. A type I error is one in which we conclude that a difference exists when in truth it does not (the observed difference was due to chance). From probability distributions, we can estimate the probability of making this type of error, which is referred to as alpha. Alpha is the threshold against which the P value is judged, the P value being the probability of observing a result at least as extreme as ours if no true difference exists. Such errors are evident when a P value is statistically significant but there is no true difference or association. In a given study, we may conduct many tests of comparison and association, and each time we are willing to accept a 5% chance of a type I error. The more tests we do, the more likely we are to make a type I error, since by definition 5% of our tests may reach the threshold of a P value < .05 by chance or random error alone. This is the challenge of doing multiple tests or comparisons. To avoid making this error, we could lower our threshold for defining statistical significance, or we could perform adjustments that take into account the number of tests or comparisons being made. That said, for many observational studies it is appropriate to take advantage of the opportunity to examine data in multiple ways, and interesting findings should not necessarily be rejected simply because multiple comparisons were made.
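One common adjustment is the Bonferroni correction, which divides the overall alpha by the number of tests performed. A minimal sketch in Python (the P values below are hypothetical, chosen only for illustration):

```python
# Sketch: Bonferroni adjustment for multiple comparisons.
# The P values are hypothetical, assumed for illustration.

def bonferroni_significant(p_values, alpha=0.05):
    """Return which P values remain significant after dividing the
    overall alpha by the number of tests performed."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Five comparisons from the same study; only the smallest survives
# the stricter per-test threshold of 0.05 / 5 = 0.01.
p_values = [0.002, 0.03, 0.04, 0.20, 0.60]
print(bonferroni_significant(p_values))  # → [True, False, False, False, False]
```

Note that two results significant at the conventional 0.05 level (0.03 and 0.04) no longer qualify once the number of comparisons is taken into account.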
In a type II error, we conclude from the results in our sample that there is no difference or association when in truth one actually exists. We can also determine the probability of making this type of error, called beta, from the probability distribution. Beta is most strongly influenced by the number of subjects or observations in the study, with a greater number of subjects giving a lower beta. We can use beta to calculate power, which is 1 − beta. Power is the probability of concluding from the results in our sample that a difference or association exists when in truth it does exist. It is a useful calculation to make when the P value of a result is nonsignificant and we are not confident that the observed result is due only to chance or random error. Before we conclude that there is no difference or association, we must be sure that we had sufficient power to detect an important difference or association reliably. As a general rule, a study reporting a negative result should have a power of at least 80%. This means that we are 80% sure that the negative result is not due to a failure to detect a difference when one truly exists; we are accepting a 20% chance of making a type II error. Power is affected by several factors, including the beta (or chance of a type II error) that we set, the alpha that we set (a higher acceptable alpha raises power), the sample size (a bigger sample increases power), and the effect size being measured (a larger effect is easier to detect and therefore increases power).
Comparing Two or More Groups or Categories
Comparing two or more groups defined by a particular characteristic or treatment is a very common application of statistics. Using the concept of independent and dependent variables, the group assignment represents the independent variable, and we seek to determine differences in outcomes, which are the dependent variables. The type of variable being measured dictates the test used for a comparison.
If we are comparing a categorical variable (dichotomous or otherwise) between two or more groups, a chi-square test is commonly used. This test can be applied if there are more than two categories for the dependent variable, the independent variable, or both. Chi-square testing uses the distance between the observed frequency of a variable and the frequency that would be expected under the null hypothesis to determine significance. A related test, called Fisher's exact test, is used when the number of subjects being compared is small. If the categories of the dependent variable are ordinal in nature, then a special type of chi-square test, the Mantel-Haenszel test, can be used as an indicator of trend.
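Both tests are available in SciPy. A sketch on a hypothetical 2x2 table of counts (the numbers are invented for illustration):

```python
from scipy import stats

# Sketch: chi-square and Fisher's exact tests on a hypothetical
# 2x2 table (rows: treatment groups; columns: outcome yes/no).
table = [[30, 10],   # group A: 30 with the outcome, 10 without
         [18, 22]]   # group B: 18 with the outcome, 22 without

# Chi-square compares observed counts with those expected under
# the null hypothesis of no association.
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)
print(f"chi-square P = {p_chi2:.4f}")

# With small cell counts, Fisher's exact test is preferred.
odds_ratio, p_fisher = stats.fisher_exact(table)
print(f"Fisher exact P = {p_fisher:.4f}")
```

Here both tests agree that the difference in outcome frequency between the groups is unlikely to be due to chance alone.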
When the dependent variable is continuous and has a relatively normal distribution and the independent categorical variable has only two categories or groups, then Student's t-test is applied. The probability of the observed difference relative to a hypothesis that there is no difference is derived from a unique probability distribution called the t-distribution. When there are more than two groups, an analysis of variance, or ANOVA, is applied, with use of the F-distribution. Importantly, if the P value for an ANOVA is significant, there is not a clear way to tell where the difference among the multiple groups occurred. This is a case in which making multiple two-group comparisons is appropriate and useful.
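A sketch of both tests on simulated data; the group means, standard deviation, and sample sizes below are assumptions chosen only to make the example concrete:

```python
import numpy as np
from scipy import stats

# Sketch: Student's t-test for two groups and one-way ANOVA for
# three, on simulated normally distributed data (the means and
# spread are illustrative assumptions).
rng = np.random.default_rng(42)
group_a = rng.normal(120, 10, 30)   # e.g. a measured outcome, group A
group_b = rng.normal(135, 10, 30)   # group B: true mean 15 higher
group_c = rng.normal(150, 10, 30)   # group C: higher still

t_stat, p_t = stats.ttest_ind(group_a, group_b)
print(f"t-test P (A vs B) = {p_t:.4g}")

f_stat, p_f = stats.f_oneway(group_a, group_b, group_c)
print(f"ANOVA P (A, B, C) = {p_f:.4g}")
# A significant ANOVA alone does not say WHICH groups differ;
# pairwise two-group comparisons would follow.
```

The follow-up pairwise t-tests after a significant ANOVA are exactly the situation in which multiple-comparison adjustment becomes relevant.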
If the dependent variable is a skewed continuous variable, then a nonparametric analysis may be needed that utilizes ranks rather than actual values. A rank is the ordinal position of a value in a dataset arranged in some particular order (typically from low absolute value to high absolute value). The Wilcoxon rank-sum test, also called the Mann-Whitney U test, compares two groups using the rank of each value in the set of measures rather than the magnitude of the actual measure itself. When there are more than two groups to compare, the Kruskal-Wallis test can be used.
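Both rank-based tests are available in SciPy. A small sketch with hypothetical skewed data (imagined hospital lengths of stay, in days, where a single outlier would distort a mean-based comparison):

```python
from scipy import stats

# Sketch: rank-based tests for skewed data. The values are
# hypothetical lengths of stay, in days.
stay_a = [2, 3, 3, 4, 5, 6, 30]        # skewed by one outlier
stay_b = [5, 6, 7, 8, 9, 10, 12]
stay_c = [9, 11, 12, 14, 15, 18, 21]

# Wilcoxon rank-sum / Mann-Whitney U: two groups, compares ranks,
# so the outlier of 30 contributes only its rank, not its size.
u_stat, p_mw = stats.mannwhitneyu(stay_a, stay_b)
print(f"Mann-Whitney U P = {p_mw:.4f}")

# Kruskal-Wallis: the rank-based analogue for three or more groups.
h_stat, p_kw = stats.kruskal(stay_a, stay_b, stay_c)
print(f"Kruskal-Wallis P = {p_kw:.4f}")
```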
Sometimes a statistical test is aimed not at comparing two groups but at characterizing the extent to which two variables are associated with each other. Correlations estimate the extent to which change in one variable is associated with change in a second variable. Correlations are unable to assess any cause-and-effect relationship, only associations. The strength of the association is represented by the correlation coefficient r, which can range from −1 to 1. An r value of −1 represents a perfect inverse correlation, in which every observation falls exactly on a line of negative slope, so that an increase in one variable is always accompanied by a decrease in the other. Conversely, an r value of 1 is a perfect positive correlation, in which every observation falls exactly on a line of positive slope. An r value of zero indicates that a change in one variable is not at all associated with a change in the other.
There are many types of measures of correlation. For two ordinal variables, the Spearman rank correlation is used; for two continuous variables, the Pearson correlation is used. For instance, if we were studying the association of body mass index with maximal VO2, we might find a Pearson correlation of −0.5, indicating a moderate inverse association in which higher body mass index tends to accompany lower VO2. Note that r reflects the strength of the association, not the number of units one variable changes per unit change in the other; that quantity is a regression slope.
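A sketch of both correlation measures on simulated data; the BMI and VO2 values below are generated with an assumed inverse relationship, not real measurements:

```python
import numpy as np
from scipy import stats

# Sketch: Pearson and Spearman correlation on simulated data with
# a built-in inverse association (all values are hypothetical).
rng = np.random.default_rng(7)
bmi = rng.normal(27, 4, 100)                   # body mass index
vo2 = 60 - 1.2 * bmi + rng.normal(0, 5, 100)   # maximal VO2 + noise

r_pearson, p_pearson = stats.pearsonr(bmi, vo2)
rho_spearman, p_spearman = stats.spearmanr(bmi, vo2)
print(f"Pearson r = {r_pearson:.2f} (P = {p_pearson:.2g})")
print(f"Spearman rho = {rho_spearman:.2f}")
# r comes out strongly negative here; it measures the strength of
# the association, while the unit-for-unit change (-1.2 in this
# simulation) is the regression slope, a different quantity.
```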
Matched Pairs and Measures of Agreement
When measurements are made in two groups of subjects composed of separate individuals that bear no relationship to one another, we use nonmatched or unpaired statistics. Alternatively, the two groups may not be independent but may have an individual-level relationship, such as a group of subjects and a group of their siblings, or the systolic blood pressure measurement in an individual before and after antihypertensive medication. When this is the case, we must use statistical testing that takes into account the fact that the two groups are not independent. If the dependent variable is categorical, we would use a McNemar chi-square test. If the dependent variable is ordinal, we would use an appropriate nonparametric type of test, such as the Wilcoxon signed-rank test. If the dependent variable is continuous, we would use a paired t-test.
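A sketch of these paired tests in Python. The before/after blood pressure values and the discordant-pair counts are hypothetical; for the McNemar test, the exact version reduces to a binomial test on the discordant pairs:

```python
from scipy import stats

# Sketch: paired tests on hypothetical before/after measurements
# from the same 10 subjects (systolic BP on vs. off medication).
before = [150, 142, 161, 155, 148, 159, 146, 152, 149, 158]
after  = [141, 138, 150, 149, 140, 152, 143, 144, 141, 150]

# Continuous paired outcome: paired t-test on the within-subject
# differences.
t_stat, p_paired = stats.ttest_rel(before, after)
print(f"paired t-test P = {p_paired:.4g}")

# Ordinal or non-normal paired outcome: Wilcoxon signed-rank test.
w_stat, p_wilcoxon = stats.wilcoxon(before, after)
print(f"Wilcoxon signed-rank P = {p_wilcoxon:.4g}")

# Paired categorical outcome: McNemar test, which looks only at the
# discordant pairs. b = improved on treatment only, c = improved
# off treatment only (hypothetical counts); the exact test asks
# whether the split of b vs. c departs from 50:50.
b, c = 15, 4
p_mcnemar = stats.binomtest(b, b + c, 0.5).pvalue
print(f"McNemar exact P = {p_mcnemar:.4g}")
```

In each case, the test is applied to the within-pair comparison rather than to the two groups as if they were independent samples.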
Linear and Logistic Regression