Analysis of Variance
Chapter 26
26.1 Comparing Several Groups
Did agricultural yield go up this year because
of more fertilizer or more rain? Or is it the result of temperature or type of seed used?
Use regression analysis with dummy variables to compare the averages of several groups
This approach is also known as analysis of
variance
Which Wheat Variety Should a Farmer Plant?
Five varieties of wheat are compared: Endurance, Hatcher, NuHills, RonL, and Ripper.
Yield was measured in bushels per acre.
The experiment is balanced: it has the same number of observations for each treatment.
Steps to Follow in the Analysis
Plot the data to find patterns
Propose a regression model for the data
Check conditions associated with the model
Test hypotheses and draw a conclusion
Comparing Groups in Plots –
Boxplots of Yield
Comparing Groups in Plots –
Summary Statistics
Relating the t-Test to Regression
Is there a significant difference between the
average yield of Endurance and the others?
Since the variances among groups appear
similar, use the two-sample t-test and pool the
variances.
Relating the t-Test to Regression
The t-statistic and p-value show that Endurance
has a significantly higher mean yield per acre
than the combination of other varieties
Relating the t-Test to Regression
The t-test can be formulated as a regression with
a dummy variable D(Endurance) that is coded 1 if
plot is seeded with Endurance and 0 otherwise
Relating the t-Test to Regression
The slope $b_1 = 5.53$ matches the estimate for the difference between means.
Testing the slope is equivalent to a pooled
two-sample t-test of the difference between means
(the t-statistic and p-value are the same).
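The equivalence between the pooled two-sample t-test and the test of the dummy-variable slope can be checked numerically. The sketch below uses made-up yields (not the wheat data from the slides) and hypothetical helper names `pooled_t` and `slope_t`; it compares only the estimates and t-statistics, since p-values would need the t distribution.

```python
from math import sqrt
from statistics import mean

# Hypothetical yields in bushels per acre; NOT the data from the wheat experiment.
endurance = [21.0, 18.5, 19.2, 22.1, 17.9]
others = [14.2, 12.8, 16.0, 13.5, 15.1, 11.9, 14.8]

def pooled_t(a, b):
    """Difference in means and pooled two-sample t-statistic."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return ma - mb, (ma - mb) / se

def slope_t(x, y):
    """Least squares slope and its t-statistic in a simple regression of y on x."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = sqrt(sse / (n - 2) / sxx)
    return b1, b1 / se_b1

# Dummy variable: 1 for Endurance plots, 0 otherwise.
d = [1] * len(endurance) + [0] * len(others)
y = endurance + others

diff, t_pooled = pooled_t(endurance, others)
b1, t_slope = slope_t(d, y)
print(diff, b1)          # the two estimates agree
print(t_pooled, t_slope)  # the two t-statistics agree
```

Both approaches use the same residual sum of squares and the same degrees of freedom (n - 2), which is why the statistics coincide.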
Comparing Several Groups Using Regression
Define the following dummy variables:
D(Endurance) = 1 if plot grows Endurance, 0 otherwise.
D(Hatcher) = 1 if plot grows Hatcher, 0 otherwise.
D(NuHills) = 1 if plot grows NuHills, 0 otherwise.
D(Ripper) = 1 if plot grows Ripper, 0 otherwise.
J-1 dummy variables are needed to represent J
categories
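As a rough illustration of the J-1 rule, the following sketch builds 0/1 dummy columns for every category except a chosen baseline. The function name `dummy_code` and the plot labels are hypothetical, introduced only for this example.

```python
def dummy_code(labels, baseline):
    """Return one 0/1 column per category, skipping the baseline category."""
    categories = sorted(set(labels))
    if baseline not in categories:
        raise ValueError("baseline must be one of the observed categories")
    return {
        f"D({c})": [1 if lab == c else 0 for lab in labels]
        for c in categories
        if c != baseline
    }

# Hypothetical plot labels, one per plot (not the layout of the actual experiment).
plots = ["Endurance", "Hatcher", "NuHills", "RonL", "Ripper", "RonL", "Endurance"]
dummies = dummy_code(plots, baseline="RonL")

for name, column in dummies.items():
    print(name, column)
```

With J = 5 varieties this yields four columns; a RonL plot is identified by zeros in all four.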
Comparing Several Groups Using Regression
The variety RonL is the baseline category
(defined by all zeros for the dummy variables)
Analysis of variance (ANOVA): the comparison of two or more averages using a regression model
in which all of the explanatory variables are dummy variables
Interpreting the Estimates
The slope of each dummy variable compares the
average response of its category to the average of the baseline category.
If D(Endurance) = 1, we find $\hat{y} = b_0 + b_1 = 11.68 + 7.90 = 19.58$ bushels per acre.
Since $b_0 = 11.68$ is the mean yield for RonL, the slope for D(Endurance), $b_1 = 7.90$, is the
difference between the average yields of Endurance and RonL.
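The interpretation of the estimates can be illustrated with a small sketch. It relies on a standard property of least squares with only dummy variables: the intercept equals the baseline mean and each slope equals a group mean minus the baseline mean. The yields below are made up (three varieties only, for brevity), and the check at the end confirms that the residuals sum to zero within every group, which is what the least squares normal equations require.

```python
from statistics import mean

# Hypothetical yields (bushels per acre); RonL plays the role of the baseline.
yields = {
    "RonL": [11.0, 12.5, 10.9],
    "Endurance": [19.0, 20.5, 18.8],
    "Hatcher": [17.2, 16.9, 18.4],
}
baseline = "RonL"

# Intercept = baseline mean; slope of each dummy = group mean - baseline mean.
b0 = mean(yields[baseline])
slopes = {g: mean(v) - b0 for g, v in yields.items() if g != baseline}

def fitted(group):
    """Fitted value for a plot of the given variety (its dummy is 1, others 0)."""
    return b0 + slopes.get(group, 0.0)

# Residuals must sum to zero within each group for a least squares fit.
residual_sums = {g: sum(y - fitted(g) for y in v) for g, v in yields.items()}

print("b0 =", b0)
for g, b in slopes.items():
    print(f"slope for D({g}) =", b)
```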
ANOVA Regression Model
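The equation of the MRM for the wheat example
can be written in terms of the population means, with RonL as the baseline:
$$E(y) = \mu_{RonL} + (\mu_{Endurance} - \mu_{RonL})\,D(Endurance) + (\mu_{Hatcher} - \mu_{RonL})\,D(Hatcher) + (\mu_{NuHills} - \mu_{RonL})\,D(NuHills) + (\mu_{Ripper} - \mu_{RonL})\,D(Ripper)$$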
ANOVA Regression Model
One-Way Analysis of Variance: This regression
model compares the averages of the groups
defined by J levels of a categorical variable. The observations in each group are a sample from the associated population.
Equation: $y_{ij} = \mu_j + \varepsilon_{ij}$, where $y_{ij}$ is the $i$th observation in group $j$
Assumptions: Errors are independent, have equal
variances, and are normally distributed
26.2 Inference in ANOVA Regression Models
Checking Conditions
Linear association: automatic for ANOVA
No obvious lurking variable: automatic if data are from a randomized experiment (e.g., the wheat
example)
Check the remaining conditions (independence,
similar variances, and normality) with appropriate residual plots
Checking Conditions
If the IQRs of the groups are similar (within a factor of 3 to 1 for up to five groups), the similar variances condition is met.
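The IQR rule of thumb can be sketched as a small check. The function names and the data are hypothetical, and the quartile method is the default of Python's `statistics.quantiles`; other software may compute quartiles slightly differently, so borderline cases can come out differently.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range using the default quartile method of statistics.quantiles."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

def similar_variances(groups, max_ratio=3.0):
    """True if the largest group IQR is at most max_ratio times the smallest."""
    iqrs = [iqr(v) for v in groups.values()]
    return max(iqrs) / min(iqrs) <= max_ratio

# Hypothetical data: group B is twice as spread out as A, group C four times.
a = [1, 2, 3, 4, 5, 6, 7, 8]
ok = similar_variances({"A": a, "B": [2 * x for x in a]})
too_different = similar_variances({"A": a, "C": [4 * x for x in a]})
print(ok, too_different)
```

The 3-to-1 threshold is the one stated for up to five groups; the sketch does not guard against a group whose IQR is zero.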
Checking Conditions
Residuals appear nearly normal
F-Test for the Difference among Means
H0: µ1 = µ2 = µ3 = µ4 = µ5
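For reference, the F-statistic for this null hypothesis can be written out as a ratio of between-group to within-group variation. The notation here is an assumption chosen for the illustration, not taken from the slides: $n_j$ is the size of group $j$, $\bar{y}_j$ its mean, $\bar{y}$ the overall mean, $N$ the total number of observations, and $J$ the number of groups ($J = 5$ for the wheat example).

```latex
F = \frac{\sum_{j=1}^{J} n_j \left(\bar{y}_j - \bar{y}\right)^2 / (J - 1)}
         {\sum_{j=1}^{J} \sum_{i=1}^{n_j} \left(y_{ij} - \bar{y}_j\right)^2 / (N - J)}
```

Under H0 and the model assumptions, this statistic follows an F distribution with $J - 1$ and $N - J$ degrees of freedom; large values are evidence against equal means.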
Understanding the F-Test
Consider the following hypothetical means:
Are these averages statistically significantly different? To answer this question, we need to know the variance within each group.
Understanding the F-Test
Both plots show groups with the same averages but different
within-group variances. No significant differences in averages in the right plot.
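The role of within-group variance can be seen in a small numerical sketch. The two made-up data sets below have identical group means (10, 12, 14), but the second has deviations five times larger; `f_statistic` is a hypothetical helper that computes the one-way ANOVA F-statistic directly from the sums of squares.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    num_groups, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_between / (num_groups - 1)) / (ss_within / (n_total - num_groups))

# Hypothetical data: same group means, small within-group spread.
tight = [[9.0, 10.0, 11.0], [11.0, 12.0, 13.0], [13.0, 14.0, 15.0]]
# Same group means, within-group deviations multiplied by 5.
spread = [[5.0, 10.0, 15.0], [7.0, 12.0, 17.0], [9.0, 14.0, 19.0]]

f_tight = f_statistic(tight)
f_spread = f_statistic(spread)
print(f_tight, f_spread)
```

The between-group sum of squares is the same in both cases; only the denominator changes, so the F-statistic shrinks as the within-group variance grows. Turning F into a p-value requires the F distribution, which this sketch leaves out.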
Confidence Intervals
Since the F-test shows that the mean yields among varieties of wheat are not the same, which
variety is best?
26.3 Multiple Comparisons
Tukey Confidence Intervals
These intervals hold the chance of a Type I error to 5% over the entire collection of
pairwise comparisons.
They replace the t-percentile in confidence intervals with a larger multiple of the standard error
(obtained from a special table).
Tukey Confidence Intervals - Wheat Example
The 95% Tukey confidence interval for the difference between the two best varieties of wheat (Endurance and Hatcher): 2.04 ± 2.875 × 2.11 = 2.04 ± 6.07 bushels/acre
This difference is not statistically significant since the Tukey interval includes 0.
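The arithmetic behind this interval can be retraced in a few lines. The three input numbers come from the slide (difference 2.04, Tukey multiplier 2.875, standard error 2.11); the variable names are only illustrative.

```python
# Values quoted on the slide for Endurance versus Hatcher.
diff = 2.04               # difference in mean yield, bushels per acre
tukey_multiplier = 2.875  # replaces the t-percentile
se_diff = 2.11            # standard error of the difference

margin = tukey_multiplier * se_diff
lower, upper = diff - margin, diff + margin

# The difference is statistically significant only if the interval excludes 0.
includes_zero = lower <= 0.0 <= upper
print(round(margin, 2), round(lower, 2), round(upper, 2), includes_zero)
```

Because the interval runs from a negative to a positive value, it contains 0, which matches the conclusion that the two varieties are not significantly different.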
Tukey Confidence Intervals - Wheat Example
Note that the width of the 95% Tukey confidence interval is the same for any pairwise comparison
The difference in yield between any two varieties compared must be more than 6.07 bushels/acre
in order to be statistically significant
Bonferroni Confidence Intervals
These intervals adjust for multiple comparisons by changing the α level used in the standard interval
to α/M for M intervals.
For the wheat example (M = 10 pairwise comparisons among five varieties), Bonferroni confidence intervals reduce α = 0.05 to
α/10 = 0.005 and replace t = 2.08 with t = 3.00.
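A short sketch of the Bonferroni bookkeeping for five groups follows. The multiplier 3.00 and the standard error 2.11 are the values quoted in the slides for the wheat example; everything else is plain arithmetic, and the variable names are only illustrative.

```python
from math import comb

num_groups = 5
alpha = 0.05

# Number of pairwise comparisons among the groups.
num_comparisons = comb(num_groups, 2)
alpha_each = alpha / num_comparisons

# Margins of error using the multipliers and standard error quoted in the slides.
se_diff = 2.11
bonferroni_margin = 3.00 * se_diff
tukey_margin = 2.875 * se_diff

print(num_comparisons, alpha_each)
print(round(bonferroni_margin, 2), round(tukey_margin, 2))
```

With these numbers the Bonferroni margin comes out slightly wider than the Tukey margin, so Bonferroni intervals are a little more conservative for this set of pairwise comparisons.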
26.4 Groups of Different Size
With groups of different sizes, unbalanced data
produce confidence intervals of different widths
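Compute the estimated standard error for a
pairwise comparison using the following formula with the relevant sample sizes:
$$se(\bar{y}_1 - \bar{y}_2) = s_e \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where $s_e$ is the standard deviation of the residuals and $n_1$, $n_2$ are the sizes of the two groups being compared.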
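To see how group sizes change the width of an interval, the sketch below evaluates the standard error of a difference, the residual standard deviation times the square root of 1/n1 + 1/n2, for a balanced and an unbalanced pair. The residual standard deviation of 4.0 and the group sizes are hypothetical, as is the helper name `se_difference`.

```python
from math import sqrt

def se_difference(s_e, n1, n2):
    """Standard error of the difference between two group means."""
    return s_e * sqrt(1 / n1 + 1 / n2)

s_e = 4.0  # hypothetical residual standard deviation

balanced = se_difference(s_e, 8, 8)     # 16 observations split evenly
unbalanced = se_difference(s_e, 4, 12)  # same 16 observations split unevenly
print(balanced, unbalanced)
```

For a fixed total number of observations, the even split gives the smaller standard error, which is one reason balanced designs are convenient.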
4M Example 26.1: JUDGING THE CREDIBILITY OF ADVERTISEMENTS
Motivation
Advertising executives want to compare four
commercials for a retail item that make claims of varying strengths. Specifically, they want to know how over-the-top an ad can be before customers turn away in disbelief.
Method
The data consist of reactions for a sample of 80
customers who viewed commercials with claims
in one of four categories: Tame, Plausible,
Stretch, and Outrageous. Each customer was
randomly assigned to a commercial. The
response variable is Credibility, obtained from
customers’ responses to items on a questionnaire they completed after viewing the ad.
Method
Use regression with three dummy variables to
capture the four types of claims made in the
commercials. Check the conditions for ANOVA. Linearity is not an issue, and there are no obvious lurking variables because randomization was
used in designing the study.
Mechanics – Results
The F-test has a p-value of 0.0251; reject H0. The
mean credibility of the four commercials is not
equal. Performing pairwise comparisons using
Tukey intervals, the difference between average credibility scores must be more than 3.25 to be
statistically significant.
Message
Based on the Tukey intervals, there is only one
statistically significant pairwise difference
(between commercials making tame claims and
those making outrageous claims). Customers place
less credibility in ads that make outrageous
claims than in ads that make tame claims.
Best Practices
when using ANOVA regression
Use Tukey or Bonferroni confidence intervals to
identify groups that are significantly different
Recognize the cost of snooping in the data to
choose hypotheses
Pitfalls
Don’t compare the means of several groups using
lots of t-tests
Don’t forget confounding factors
Never pretend you have only two groups
Pitfalls (Continued)
Do not add or subtract standard errors
Do not use a one-way ANOVA to analyze data
with repeated measurements
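The warning against running lots of t-tests can be made concrete with a rough calculation. The sketch assumes ten tests at α = 0.05 (the number of pairwise comparisons among five groups) and, as a simplifying assumption that does not hold exactly for pairwise comparisons sharing the same data, treats the tests as independent.

```python
alpha = 0.05
num_tests = 10  # pairwise comparisons among five groups

# Chance of at least one false positive if the tests were independent (an assumption).
fwer_if_independent = 1 - (1 - alpha) ** num_tests

# Bonferroni upper bound on that chance, valid without the independence assumption.
bonferroni_bound = min(1.0, num_tests * alpha)

print(round(fwer_if_independent, 4), bonferroni_bound)
```

Under these assumptions the chance of at least one spurious "significant" difference is far above the nominal 5%, which is the problem that Tukey and Bonferroni intervals are designed to control.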