7.4 Comparisons based on data from more than two processes
7.4.3 Are the means equal?
7.4.3.1 1-Way ANOVA overview
Overview and principles
This section gives an overview of the one-way ANOVA. First we explain the principles involved in the 1-way ANOVA.
Partition response into components
In an analysis of variance the variation in the response measurements is partitioned into components that correspond to different sources of variation.
The goal in this procedure is to split the total variation in the data into
a portion due to random error and portions due to changes in the values of the independent variable(s).
Variance of n measurements
The variance of n measurements is given by

$$ s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1} $$

where $\bar{y}$ is the mean of the n measurements.
Sums of squares and degrees of freedom
The numerator is called the sum of squares of deviations from the mean, and the denominator is called the degrees of freedom.
The variance, after some algebra, can be rewritten as:

$$ s^2 = \frac{\sum_{i=1}^{n} y_i^2 - \frac{\left( \sum_{i=1}^{n} y_i \right)^2}{n}}{n - 1} $$
The first term in the numerator is called the "raw sum of squares" and the second term is called the "correction term for the mean". Another name for the numerator is the "corrected sum of squares", and this is usually abbreviated by Total SS or SS(Total).
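As a quick numeric check of this identity, here is a minimal Python sketch; the five data values are made up for illustration and are not taken from the handbook:

    # Corrected sum of squares computed two ways; the data values
    # are made-up illustration values, not from the handbook.
    y = [3.1, 4.2, 5.0, 4.7, 3.9]
    n = len(y)
    ybar = sum(y) / n

    # Deviation form: sum of squared deviations from the mean.
    ss_dev = sum((yi - ybar) ** 2 for yi in y)

    # Raw sum of squares minus the correction term for the mean.
    raw_ss = sum(yi ** 2 for yi in y)
    correction = sum(y) ** 2 / n
    ss_corrected = raw_ss - correction

    print(ss_dev, ss_corrected)                 # agree up to rounding
    print("variance:", ss_corrected / (n - 1))  # divide by degrees of freedom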
The SS in a 1-way ANOVA can be split into two components, called the "sum of squares of treatments" and "sum of squares of error", abbreviated as SST and SSE, respectively.
The guiding principle behind ANOVA is the decomposition of the sums of squares, or Total SS
Algebraically, this is expressed by

$$ SS(Total) = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{\bar{y}} \right)^2 = \sum_{i=1}^{k} n_i \left( \bar{y}_{i\cdot} - \bar{\bar{y}} \right)^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_{i\cdot} \right)^2 = SST + SSE $$
where k is the number of treatments and the double bar over the y denotes the "grand" or "overall" mean. Each $n_i$ is the number of observations for treatment i. The total number of observations is N (the sum of the $n_i$).
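The decomposition is easy to verify numerically. The following sketch uses three hypothetical treatment groups of unequal sizes (not the handbook's data) and checks that SS(Total) = SST + SSE:

    # Verify the partition SS(Total) = SST + SSE on hypothetical data.
    groups = [[3.1, 4.2, 5.0], [6.3, 5.8, 7.1, 6.6], [4.9, 5.5, 5.2]]

    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / len(all_obs)
    means = [sum(g) / len(g) for g in groups]

    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    print(ss_total, sst + sse)  # equal up to floating-point rounding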
Note on subscripting
Don't be alarmed by the double subscripting. The total SS can be written single or double subscripted. The double subscript stems from the way the data are arranged in the data table. The table is usually a rectangular array with k columns and each column consists of $n_i$ rows (however, the lengths of the rows, or the $n_i$, may be unequal).
Definition of "Treatment"
We introduced the concept of treatment. The definition is: A treatment is a specific combination of factor levels whose effect is to be compared with other treatments.
7.4.3.2 The 1-way ANOVA model and assumptions
7.4.3.3 The ANOVA table and tests of hypotheses about means

ANOVA table
Source             SS          DF     MS            F
Treatments         SST         k-1    SST / (k-1)   MST / MSE
Error              SSE         N-k    SSE / (N-k)
Total (corrected)  SS(Total)   N-1

The word "source" stands for source of variation. Some authors prefer to use "between" and "within" instead of "treatments" and "error", respectively.
ANOVA Table Example
A numerical example
The data below resulted from measuring the difference in resistance resulting from subjecting identical resistors to three different temperatures for a period of 24 hours. The sample size of each group was 5. In the language of Design of Experiments, we have an experiment in which each of three treatments was replicated 5 times.
Level 1   Level 2   Level 3
  6.9       8.3       8.0
  5.4       6.8      10.5
  5.8       7.8       8.1
  4.6       9.2       6.9
  4.0       6.5       9.3
The resulting ANOVA table is

Source              SS        DF     MS       F
Treatments          27.897     2    13.949   9.59
Error               17.452    12     1.454
Total (corrected)   45.349    14
Correction Factor  779.041     1
Interpretation of the ANOVA table
The test statistic is the F value of 9.59. Using an α of .05, we have that F.05; 2, 12 = 3.89 (see the F distribution table in Chapter 1). Since the test statistic is much larger than the critical value, we reject the null hypothesis of equal population means and conclude that there is a (statistically) significant difference among the population means. The p-value for 9.59 is .00325, so the test statistic is significant at that level.
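Both the critical value and the p-value quoted above can be reproduced from the F distribution. A sketch, assuming SciPy is available:

    # Reproduce the critical value and p-value for F = 9.59
    # with 2 and 12 degrees of freedom.
    from scipy.stats import f

    f_stat, df_treat, df_err = 9.59, 2, 12

    critical = f.ppf(0.95, df_treat, df_err)  # F(.05; 2, 12)
    p_value = f.sf(f_stat, df_treat, df_err)  # upper-tail probability

    print(round(critical, 2))  # 3.89
    print(round(p_value, 5))   # about 0.00325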
Techniques for further analysis
The populations here are resistor readings while operating under the three different temperatures. What we do not know at this point is whether the three means are all different or which of the three means is different from the other two, and by how much.
There are several techniques we might use to further analyze the differences. These are:

●   constructing confidence intervals around the difference of two means
●   estimating combinations of factor levels with confidence bounds
●   multiple comparisons of combinations of factor levels tested simultaneously
7.4.3.4 1-Way ANOVA calculations

The 824.390 SS is called the "raw" or "uncorrected" sum of squares.
Step 3: compute SST
STEP 3 Compute SST, the treatment sum of squares.
First we compute the total (sum) for each treatment:

T1 = (6.9) + (5.4) + ... + (4.0) = 26.7
T2 = (8.3) + (6.8) + ... + (6.5) = 38.6
T3 = (8.0) + (10.5) + ... + (9.3) = 42.8

Then

$$ SST = \frac{T_1^2 + T_2^2 + T_3^2}{5} - CM = \frac{(26.7)^2 + (38.6)^2 + (42.8)^2}{5} - 779.041 = 27.897 $$
Step 4: compute SSE
STEP 4 Compute SSE, the error sum of squares.
Here we utilize the property that the treatment sum of squares plus the error sum of squares equals the total sum of squares. Hence,

SSE = SS(Total) - SST = 45.349 - 27.897 = 17.452
Step 5: compute MST, MSE, and F
STEP 5 Compute MST, MSE and their ratio, F.
MST is the mean square of treatments; MSE is the mean square of error (MSE is also frequently denoted by $\hat{\sigma}^2$).
MST = SST / (k-1) = 27.897 / 2 = 13.949
MSE = SSE / (N-k) = 17.452 / 12 = 1.454

where N is the total number of observations and k is the number of treatments. Finally, compute F as

F = MST / MSE = 9.59
That is it. These numbers are the quantities that are assembled in the ANOVA table that was shown previously.
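Collecting the steps, the quantities in the ANOVA table can be reproduced with a short Python sketch; only the treatment totals, the sample sizes, and the raw sum of squares quoted above are used:

    # Assemble the 1-way ANOVA quantities from the treatment totals
    # and the raw (uncorrected) sum of squares quoted above.
    totals = [26.7, 38.6, 42.8]  # T1, T2, T3
    n_i = [5, 5, 5]              # observations per treatment
    raw_ss = 824.390             # raw (uncorrected) sum of squares

    N, k = sum(n_i), len(totals)

    cm = sum(totals) ** 2 / N                                # 779.041
    ss_total = raw_ss - cm                                   # 45.349
    sst = sum(t * t / n for t, n in zip(totals, n_i)) - cm   # 27.897
    sse = ss_total - sst                                     # 17.452

    mst = sst / (k - 1)                # 13.949
    mse = sse / (N - k)                # 1.454
    print("F =", round(mst / mse, 2))  # 9.59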
7.4.3.5 Confidence intervals for the difference of treatment means

Contrasts discussed later

Later on, the topic of estimating more general linear combinations of means (primarily contrasts) will be discussed, including how to put confidence bounds around contrasts.
7.4.3.6 Assessing the response from any factor combination

Confidence intervals for the factor level means
It can be shown that:

$$ \frac{\bar{y}_{i\cdot} - \mu_i}{\sqrt{MSE / n_i}} $$

has a t-distribution with (N - k) degrees of freedom for the ANOVA model under consideration, where N is the total number of observations and k is the number of factor levels or groups. The degrees of freedom are the same as were used to calculate the MSE in the ANOVA table. That is: dfe (degrees of freedom for error) = N - k. From this we can calculate (1 - α)100% confidence limits for each $\mu_i$. These are given by:

$$ \bar{y}_{i\cdot} \pm t_{\alpha/2,\, N-k} \sqrt{\frac{MSE}{n_i}} $$
Example 1
Example for a 4-level treatment (or 4 different treatments)
The data in the accompanying table resulted from an experiment run in a completely randomized design in which each of four treatments was replicated five times.
ANOVA table layout
This experiment can be illustrated by the table layout for this 1-way ANOVA experiment shown below:

Level i   Observations                  Total   Mean    Sample size
1         y11  y12  y13  y14  y15       T1      ȳ1      n1
2         y21  y22  y23  y24  y25       T2      ȳ2      n2
3         y31  y32  y33  y34  y35       T3      ȳ3      n3
4         y41  y42  y43  y44  y45       T4      ȳ4      n4
                                                        N = Σ ni
ANOVA table

The resulting ANOVA table is
Source              SS        DF     MS       F
Treatments          38.820     3    12.940   9.724
Error               21.292    16     1.331
Total (Corrected)   60.112    19
Total (Raw)        979.480    20
The estimate for the mean of group 1 is 5.34, and the sample size is n1 = 5.
Computing the confidence interval
Since the confidence interval is two-sided, the entry α/2 value for the t-table is .5(1 - .95) = .025, and the associated degrees of freedom are N - 4, or 20 - 4 = 16.
From the t-table in Chapter 1, we obtain t.025;16 = 2.120.
Next we need the standard error of the mean for group 1:

$$ \sqrt{\frac{MSE}{n_1}} = \sqrt{\frac{1.331}{5}} = 0.5159 $$
Hence, we obtain confidence limits 5.34 ± 2.120(0.5159) and the confidence interval is

$$ 4.246 \leq \mu_1 \leq 6.434 $$
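The same interval can be reproduced programmatically. A sketch, assuming SciPy is available:

    # 95% confidence interval for the group 1 mean, using the
    # MSE and error degrees of freedom from the ANOVA table above.
    from math import sqrt
    from scipy.stats import t

    ybar1, n1 = 5.34, 5          # group 1 mean and sample size
    mse, df_err = 1.331, 16      # MSE and error df from the ANOVA table

    t_crit = t.ppf(0.975, df_err)  # t(.025;16) = 2.120
    se = sqrt(mse / n1)            # standard error: 0.5159

    print(ybar1 - t_crit * se, ybar1 + t_crit * se)  # about (4.25, 6.43)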
Definition and Estimation of Contrasts
Definition of contrasts and orthogonal contrasts

Definitions
A contrast is a linear combination of 2 or more factor level means with coefficients that sum to zero.
Two contrasts are orthogonal if the sum of the products of corresponding coefficients (i.e., coefficients for the same means) adds
to zero.
Formally, the definition of a contrast is expressed below, using the notation $\mu_i$ for the i-th treatment mean:

$$ C = c_1 \mu_1 + c_2 \mu_2 + \cdots + c_j \mu_j + \cdots + c_k \mu_k $$

where

$$ c_1 + c_2 + \cdots + c_j + \cdots + c_k = \sum_{i=1}^{k} c_i = 0 $$

Simple contrasts include the case of the difference between two factor means, such as $\mu_1 - \mu_2$. If one wishes to compare treatments 1 and 2 with treatment 3, one way of expressing this is by: $\mu_1 + \mu_2 - 2\mu_3$. Note that

$\mu_1 - \mu_2$ has coefficients +1, -1
$\mu_1 + \mu_2 - 2\mu_3$ has coefficients +1, +1, -2

These coefficients sum to zero.
An example of orthogonal contrasts
As an example of orthogonal contrasts, note the three contrasts defined by the table below, where the rows denote coefficients for the column treatment means.

       μ1    μ2    μ3    μ4
c1     +1    -1     0     0
c2      0     0    +1    -1
c3     +1    +1    -1    -1
Properties of orthogonal contrasts
The following is true:

1. The sum of the coefficients for each contrast is zero.
2. The sum of the products of coefficients of each pair of contrasts is also 0 (orthogonality property).
3. The first two contrasts are simply pairwise comparisons, the third one involves all the treatments.
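These properties can be checked mechanically. A sketch that tests them for the coefficient rows tabulated above:

    # Check the contrast and orthogonality properties for the
    # coefficient rows shown in the table above.
    contrasts = [
        [1, -1,  0,  0],   # pairwise comparison of treatments 1 and 2
        [0,  0,  1, -1],   # pairwise comparison of treatments 3 and 4
        [1,  1, -1, -1],   # involves all four treatments
    ]

    # Property 1: each set of coefficients sums to zero.
    assert all(sum(c) == 0 for c in contrasts)

    # Property 2: each pair of contrasts has a zero dot product.
    for i in range(len(contrasts)):
        for j in range(i + 1, len(contrasts)):
            assert sum(a * b for a, b in zip(contrasts[i], contrasts[j])) == 0

    print("all contrasts sum to zero and are mutually orthogonal")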
Estimation of contrasts
As might be expected, contrasts are estimated by taking the same linear combination of treatment mean estimators. In other words:

$$ \hat{C} = \sum_{i=1}^{k} c_i \bar{y}_{i\cdot} $$

and

$$ \hat{\sigma}_{\hat{C}}^2 = \sum_{i=1}^{k} \frac{c_i^2}{n_i} \, \hat{\sigma}^2 = MSE \sum_{i=1}^{k} \frac{c_i^2}{n_i} $$

Note: These formulas hold for any linear combination of treatment means, not just for contrasts.
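As a small illustration of the estimation formula, the sketch below forms a contrast estimate from a set of treatment means; the mean values used (other than ȳ1 = 5.34) are hypothetical placeholders, not the handbook's:

    # Estimate a contrast as the same linear combination of the
    # treatment mean estimates; the means below (other than the
    # first) are hypothetical placeholders, not from the handbook.
    c = [0.5, 0.5, -0.5, -0.5]       # contrast coefficients (sum to zero)
    ybar = [5.34, 6.10, 5.80, 7.20]  # treatment mean estimates

    C_hat = sum(ci * yi for ci, yi in zip(c, ybar))
    print("estimated contrast:", C_hat)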
Confidence Interval for a Contrast
Confidence intervals for contrasts
An unbiased estimator for a contrast C is given by

$$ \hat{C} = \sum_{i=1}^{k} c_i \bar{y}_{i\cdot} $$

The estimator of $\sigma_{\hat{C}}^2$ is

$$ \hat{\sigma}_{\hat{C}}^2 = MSE \sum_{i=1}^{k} \frac{c_i^2}{n_i} $$
The estimator is normally distributed because it is a linear combination of independent normal random variables. It can be shown that:

$$ \frac{\hat{C} - C}{\hat{\sigma}_{\hat{C}}} $$

is distributed as $t_{N-r}$ for the one-way ANOVA model under discussion. Therefore, the $1 - \alpha$ confidence limits for C are:

$$ \hat{C} \pm t_{1-\alpha/2,\, N-r} \, \hat{\sigma}_{\hat{C}} $$
Example 2 (estimating a contrast)
Contrast to estimate
We wish to estimate, in our previous example, the following contrast:

$$ C = \frac{\mu_1 + \mu_2}{2} - \frac{\mu_3 + \mu_4}{2} $$

and construct a 95 percent confidence interval for C.
Computing the point estimate and standard error
The point estimate is:

$$ \hat{C} = \frac{\bar{y}_1 + \bar{y}_2}{2} - \frac{\bar{y}_3 + \bar{y}_4}{2} $$

Applying the formulas above we obtain

$$ \sum_{i=1}^{4} \frac{c_i^2}{n_i} = \frac{4 (1/2)^2}{5} = \frac{1}{5} $$

and

$$ \hat{\sigma}_{\hat{C}}^2 = \frac{1.331}{5} = 0.2662 $$

and the standard error is $\sqrt{0.2662} = 0.5159$.
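The standard error (and the half-width of the 95 percent interval) can be reproduced from the ANOVA quantities. A sketch, assuming SciPy is available; the contrast coefficients follow the contrast defined above:

    # Standard error and 95% half-width for the contrast, using
    # MSE = 1.331 and 16 error df from the ANOVA table above.
    from math import sqrt
    from scipy.stats import t

    c = [0.5, 0.5, -0.5, -0.5]   # contrast coefficients
    n_i = [5, 5, 5, 5]           # replicates per treatment
    mse, df_err = 1.331, 16

    se = sqrt(mse * sum(ci ** 2 / n for ci, n in zip(c, n_i)))  # 0.5159
    half_width = t.ppf(0.975, df_err) * se                      # 2.120 * 0.5159

    print(round(se, 4), round(half_width, 4))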