© The McGraw-Hill Companies, Inc., 2008. McGraw-Hill/Irwin
Non-parametric:
Analysis of Ranked Data
Chapter 18
GOALS
Conduct the sign test for dependent samples using the binomial and standard normal distributions as the test statistics.
Conduct a test of hypothesis for dependent samples using the Wilcoxon signed-rank test.
Conduct and interpret the Wilcoxon rank-sum test for independent samples.
Conduct a test of hypothesis to determine whether the correlation among the ranks in the population is different from zero.
The Sign Test
The Sign Test is based on the sign of a difference between two related observations.
No assumption is necessary regarding the shape of the population of differences.
The binomial distribution is the test statistic for small
samples and the standard normal (z) for large samples.
The test requires dependent (related) samples.
Procedure to conduct the test:
Determine the sign (+ or -) of the difference between related pairs.
Determine the number of usable pairs.
Compare the number of positive (or negative) differences to the critical value.
n is the number of usable pairs (without ties), X is the number of pluses or minuses, and the binomial probability π = .5.
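The small-sample procedure above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the choice of 14 usable pairs with a one-tailed test at the .10 level is an assumed input, not a figure stated on the slides.

```python
from math import comb


def upper_tail(n: int, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5), the null distribution of plus signs."""
    return sum(comb(n, x) for x in range(k, n + 1)) / 2 ** n


def upper_critical_value(n: int, alpha: float) -> int:
    """Smallest count c with P(X >= c) <= alpha under H0: pi = .5."""
    for c in range(n + 1):
        if upper_tail(n, c) <= alpha:
            return c
    return n + 1  # no count is extreme enough at this alpha


# Assumed illustration: 14 usable pairs, one-tailed test at the .10 level.
n_usable = 14
critical = upper_critical_value(n_usable, 0.10)
print(critical, round(upper_tail(n_usable, critical), 4))
```

With these assumed inputs the rejection region begins at 10 plus signs, because the probability of 10 or more plus signs is the first upper-tail probability that does not exceed .10.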
The Sign Test - Example
The director of information systems at Samuelson Chemicals recommended that an in-plant training program be instituted for managers. The objective is to improve the knowledge of database usage in accounting, procurement, production, and so on. A sample of 15 managers was selected at random. A panel of database experts determined the general level of competence of each manager with respect to using the database. Their competence and understanding were rated as being either outstanding, excellent, good, fair, or poor. After the three-month training program, the same panel of information systems experts rated each manager again. The two ratings (before and after) are shown along with the sign of the difference. A “+” sign indicates improvement, and a “-” sign indicates that the manager’s competence using databases had declined after the training program.
Did the in-plant training program effectively increase the competence of the managers using the company’s database?
Step 1: State the Null and Alternative Hypotheses
H0: π ≤ .5 (There is no increase in competence as a result of the in-plant training program.)
H1: π > .5 (There is an increase in competence as a result of the in-plant training program.)
Step 2: Select a level of significance
We chose the .10 level.
Step 3: Decide on the test statistic
It is the number of plus signs resulting from the experiment.
Step 4: Formulate a decision rule.
Hence, the decision rule for a two-tailed test would be to reject the null hypothesis if there are 3 or fewer plus signs, or 11 or more plus signs.
The number of plus signs is in the rejection region, which starts at 10, so H0 is rejected.
We conclude that the three-month training course was effective. It increased the database competency of the managers.
Normal Approximation - Example
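For large samples the sign test uses the standard normal (z) distribution. A minimal sketch follows, assuming the usual continuity-corrected normal approximation to the binomial with π = .5 (mean .5n, standard deviation .5√n). The function name and the sample counts are hypothetical and are not taken from the slides.

```python
from math import sqrt


def sign_test_z(x: int, n: int) -> float:
    """Continuity-corrected z for x plus signs among n usable pairs (pi = .5)."""
    mean = 0.5 * n
    std = 0.5 * sqrt(n)
    # Move x half a unit toward the mean before standardizing.
    if x > mean:
        correction = -0.5
    elif x < mean:
        correction = 0.5
    else:
        correction = 0.0
    return (x + correction - mean) / std


# Assumed illustration: 60 plus signs among 100 usable pairs.
print(round(sign_test_z(60, 100), 2))
```

The computed z is then compared with the critical value of the standard normal distribution for the chosen significance level.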
Wilcoxon Signed-Rank Test for Dependent Samples
If the assumption of normality is violated for the paired t test, use the Wilcoxon signed-rank test.
The test requires the ordinal scale of measurement.
The observations must be related or dependent.
Wilcoxon Signed-Rank Test
The steps for the test are:
Compute the differences between related observations.
Rank the absolute differences from low to high.
Return the signs to the ranks and sum positive and negative ranks.
Compare the smaller of the two rank sums with the T value, obtained from Appendix B.7.
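The first three steps can be sketched as follows. Two assumptions are made that the steps above do not spell out: pairs with a zero difference are dropped, and tied absolute differences share the average of the ranks they occupy. The function name and the ratings are hypothetical.

```python
def signed_rank_sums(before, after):
    """Return (R_plus, R_minus) for paired observations."""
    # Step 1: differences; zero differences are dropped (assumption).
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Step 2: rank absolute differences low to high, averaging tied ranks.
    ordered = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    # Step 3: return the signs to the ranks and sum each group.
    r_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    r_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return r_plus, r_minus


# Hypothetical ratings for six participants.
before = [10, 12, 9, 15, 11, 8]
after = [13, 11, 9, 19, 16, 6]
r_plus, r_minus = signed_rank_sums(before, after)
t_stat = min(r_plus, r_minus)  # compared with the tabled T value
print(r_plus, r_minus, t_stat)
```

The smaller rank sum, here `t_stat`, is the value that would be compared with the critical value from the table.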
Fricker’s is a family restaurant chain located primarily in the southeastern part of the United States. It offers a full dinner menu, but its specialty is chicken. Recently, Fricker, the owner and founder, developed a new spicy flavor for the batter in which the chicken is cooked. Before replacing the current flavor, he wants to conduct some tests to be sure that patrons will like the spicy flavor better. To begin, Bernie selects a random sample of 15 customers. Each sampled customer is given a small piece of the current chicken and asked to rate its overall taste on a scale of 1 to 20. A value near 20 indicates the participant liked the flavor, whereas a score near 0 indicates they did not like the flavor.
Next, the same 15 participants are given a sample of the new chicken with the spicier flavor and again asked to rate its taste on a scale of 1 to 20.
The results are reported in the table on the right. Is it reasonable to conclude that the spicy flavor is preferred? Use the .05 significance level.
Wilcoxon Signed-Rank Test for Dependent Samples - Example
Each assigned rank in column 6 is then given the same sign as the original difference, and the results are reported in column 7. For example, the second participant has a difference score of 8 and a rank of 6. This value is located in the R+ section of column 7.
The R+ and R- columns are totaled. The sum of the positive ranks is 75 and the sum of the negative ranks is 30.
The smaller of the two rank sums is used as the test statistic and referred to as T.
The critical values for the Wilcoxon signed-rank test are located in Appendix B.7. A portion of that table is shown below.
The value at the intersection is 25, so the critical value is 25.
The value obtained from Appendix B.7 is the largest value in the rejection region. To put it another way, our decision rule is to reject the null hypothesis if the smaller of the two rank sums is 25 or less.
In this case the smaller rank sum is 30, so the decision is not to reject the null hypothesis. We cannot conclude there is a difference in the flavor ratings between the current and the spicy flavors.
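The arithmetic behind this decision can be checked with a short sketch. The rank sums (75 and 30) and the critical value (25) are the figures reported in the example. The count of nonzero differences is not stated in the text, so it is inferred here from the identity that the two rank sums must total n(n + 1)/2.

```python
# Rank sums and critical value as reported in the example.
r_plus, r_minus = 75, 30
critical_t = 25

# The two rank sums must add up to n(n + 1)/2 for n nonzero differences.
total = r_plus + r_minus
n = next(k for k in range(1, 100) if k * (k + 1) // 2 == total)

t_stat = min(r_plus, r_minus)        # test statistic T
reject_h0 = t_stat <= critical_t     # reject only if T is 25 or less
print(n, t_stat, reject_h0)
```

The inferred count of 14 is a consistency check only. The decision itself depends solely on comparing T with the tabled value.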
Wilcoxon Rank-Sum Test
The Wilcoxon rank-sum test is used to determine if two independent samples came from the same or equal populations.
No assumption about the shape of the population is required.
The data must be at least ordinal scale.
Each sample must contain at least eight observations.
To determine the value of the test statistic W, all data values are ranked from low to high as if they were from a single population.
The sum of ranks for each of the two samples is determined.
If the null hypothesis is true, then the ranks will be about evenly distributed between the two samples, and the sum of the ranks for the two samples will be about the same
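The pooled ranking can be sketched as below. The function name and the data are hypothetical, and tied values are assumed to share the average of the ranks they occupy. The samples are kept small for readability, so they do not meet the eight-observation guideline.

```python
def rank_sums(sample_1, sample_2):
    """Rank the pooled data low to high and return each sample's rank sum."""
    pooled = sorted(list(sample_1) + list(sample_2))
    # Average rank for each distinct value (ties share the mean of their ranks).
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    w_1 = sum(rank_of[v] for v in sample_1)
    w_2 = sum(rank_of[v] for v in sample_2)
    return w_1, w_2


# Hypothetical data, shortened for illustration.
w_1, w_2 = rank_sums([12, 15, 15, 20], [8, 9, 15, 11])
print(w_1, w_2)
```

When the two rank sums differ widely relative to the sample sizes, as in this made-up data, the samples are less likely to have come from the same population.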
Dan Thompson, the president of CEO Airlines, recently noted an increase in the number of no-shows for flights out of Atlanta. He is particularly interested in determining whether there are more no-shows for flights that originate from Atlanta compared with flights leaving Chicago. A sample of nine flights from Atlanta and eight from Chicago are reported in Table 18–4. At the .05 significance level, can we conclude that there are more no-shows for the flights originating in Atlanta?
Wilcoxon Rank-Sum Test for Independent Samples - Example
Mr. Thompson believes there are more no-shows for Atlanta flights. Thus, a one-tailed test is appropriate, with the rejection region located in the upper tail.
The null and alternate hypotheses are:
H0: The population distribution of no-shows is the same or less for Atlanta than for Chicago.
H1: The population distribution of no-shows is larger for Atlanta than for Chicago.
The test statistic follows the standard normal distribution. At the .05 significance level, we find from Appendix B.1 the critical value of z is 1.65.
The null hypothesis is rejected if the computed value of z is greater than 1.65.
The Chicago flight with only 8 no-shows had the fewest, so it is assigned a rank of 1. The Chicago flight with 9 no-shows is ranked 2, and so on.
The value of W is calculated for the Atlanta group
and is found to be 96.5, which is the sum of the ranks for the no-shows for the Atlanta flights.
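The step from W to z can be sketched as follows, assuming the usual large-sample formula for the rank-sum test: z = (W − n1(n1 + n2 + 1)/2) / √(n1·n2·(n1 + n2 + 1)/12). The inputs W = 96.5, nine Atlanta flights, eight Chicago flights, and the critical value 1.65 come from the example. The function name is hypothetical.

```python
from math import sqrt


def rank_sum_z(w: float, n1: int, n2: int) -> float:
    """z statistic for the rank sum w of the sample of size n1."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    std_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean_w) / std_w


z = rank_sum_z(96.5, n1=9, n2=8)   # Atlanta is sample 1
reject_h0 = z > 1.65               # upper-tail decision rule
print(round(z, 2), reject_h0)
```

Under this formula z is about 1.49, which is below 1.65, so the null hypothesis would not be rejected at the .05 level.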
Kruskal-Wallis Test:
Analysis of Variance by Ranks
This is used to compare three or more samples to determine if they came from equal populations.
The ordinal scale of measurement is required.
It is an alternative to the one-way ANOVA.
The chi-square distribution is the test statistic.
Each sample should have at least five
observations.
The sample data is ranked from low to high as if
it were a single group.
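Once the pooled ranks have been summed for each sample, the test statistic can be computed. The sketch below assumes the standard Kruskal-Wallis formula without a tie correction, H = 12/(N(N + 1)) · Σ(R_i²/n_i) − 3(N + 1), where R_i is the rank sum of sample i, n_i is its size, and N is the total number of observations. The function name and the rank sums are hypothetical.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), no tie correction."""
    n_total = sum(sizes)
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical case: three samples of three observations with no overlap,
# so the samples hold ranks 1-3, 4-6, and 7-9 respectively.
h = kruskal_wallis_h([6, 15, 24], [3, 3, 3])
print(round(h, 3))
```

H is then compared with the chi-square critical value for k − 1 degrees of freedom, where k is the number of samples. The tiny samples here fall short of the five-observation guideline and serve only to keep the arithmetic easy to follow.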
A management seminar consists of executives from manufacturing, finance, and engineering. Before scheduling the seminar sessions, the seminar leader is interested in whether the three groups are equally knowledgeable about management principles. Plans are to take samples of the executives in manufacturing, in finance, and in engineering and to administer a test to each executive. If there is no difference in the scores for the three distributions, the seminar leader will conduct just one session. However, if there is a difference in the scores, separate sessions will be given. We will use the Kruskal-Wallis test instead of ANOVA because the seminar leader is unwilling to assume that (1) the populations of management scores follow the normal distribution or (2) the population standard deviations are the same.
Kruskal-Wallis Test:
Analysis of Variance by Ranks - Example
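Step 1: H0: The population distributions of the management scores for the populations of executives in manufacturing, finance, and engineering are the same.
H1: The population distributions of the management scores for the populations of executives in manufacturing, finance, and engineering are NOT the same.
Step 2: H0 is rejected if χ2 is greater than 5.991. There are k − 1 = 3 − 1 = 2 degrees of freedom at the .05 significance level.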
The next step is to select random samples from the three populations. A sample of seven manufacturing, eight finance, and six engineering executives was selected. Their scores on the test are recorded below.
Considering the scores as a single population, the engineering executive with a score of 35 is the lowest, so it is ranked 1. There are two scores of 38. To resolve this tie, each score is given a rank of 2.5, found by (2+3)/2. This process is continued for all scores. The highest score is 107, and that finance executive is given a rank of 21. The scores, the ranks, and the sum of the ranks for each of the three samples are given in the table below.
Because the computed value of H (5.736) is less than the critical value of 5.991, the null hypothesis is not rejected. There is not enough evidence to conclude there is a difference among the executives from manufacturing, finance, and engineering with respect to their typical knowledge of management principles. From a practical standpoint, the seminar leader should consider offering only one session including executives from all areas.
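The critical value of 5.991 can be reproduced without a table. With three populations there are k − 1 = 2 degrees of freedom, and for 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(−x/2). The sketch below relies on that special case only, so it does not generalize to other degrees of freedom. The value H = 5.736 is the one reported in the example.

```python
from math import exp, log

alpha = 0.05
k = 3                      # number of populations compared
df = k - 1                 # degrees of freedom for the Kruskal-Wallis test

# For 2 degrees of freedom the chi-square upper tail is exp(-x / 2),
# so the critical value has the closed form -2 ln(alpha).
critical = -2 * log(alpha)

h = 5.736                  # computed value of H reported in the example
p_value = exp(-h / 2)      # valid only because df == 2
reject_h0 = h > critical
print(round(critical, 3), reject_h0)
```

Because H falls just short of the critical value, the corresponding p-value lies slightly above .05, which agrees with the decision not to reject.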
The coefficient of rank correlation can range from –1.00 up to 1.00.
It is similar to Pearson’s coefficient of correlation, but is based on ranked data.
It is computed using the formula:
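The formula itself did not survive in the text. A minimal sketch follows, assuming the standard Spearman coefficient of rank correlation, r_s = 1 − 6Σd²/(n(n² − 1)), where d is the difference between the two ranks of each observation and n is the number of paired observations. The function name and the ranks are hypothetical.

```python
def rank_correlation(ranks_x, ranks_y):
    """r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), d = difference in paired ranks."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for five observations.
print(rank_correlation([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Identical rankings give 1.00 and exactly reversed rankings give –1.00, the two ends of the range stated above. This form of the formula is exact only when there are no tied ranks.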
Lorrenger Plastics, Inc., recruits management trainees at colleges and universities throughout the United States. Each trainee is given a rating by the recruiter during the on-campus interview. This rating is an expression of future potential and may range from 0 to 15, with the higher score indicating more potential. The recent college graduate then enters an in-plant training program and is given another composite rating based on tests, opinions of group leaders, training officers, and so on. The on-campus rating and the in-plant training ratings are given in the table on the right.
Rank-Order Correlation - Example
State the null hypothesis: Rank correlation in population is 0.
State the alternate hypothesis: Rank correlation in population is not 0.
For a sample of 10 or more, the significance of the rank correlation coefficient is determined by computing t using the following formula. The sampling distribution of the coefficient follows the t distribution with n - 2 degrees of freedom.
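The formula is missing from the text. The sketch below assumes the standard test statistic for a rank correlation coefficient, t = r_s·√((n − 2)/(1 − r_s²)), where r_s is the sample coefficient and n is the number of paired observations. The function name and the input values are hypothetical.

```python
from math import sqrt


def rank_correlation_t(r_s: float, n: int) -> float:
    """t = r_s * sqrt((n - 2) / (1 - r_s^2)), with n - 2 degrees of freedom."""
    return r_s * sqrt((n - 2) / (1 - r_s ** 2))


# Assumed illustration: a coefficient of .60 from 18 paired observations.
t = rank_correlation_t(0.6, 18)
print(round(t, 3))
```

The computed t is compared with the critical value of the t distribution with n − 2 degrees of freedom, using both tails because the alternate hypothesis is that the rank correlation is not 0.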
End of Chapter 18