
Ebook Business statistics - A decision - making approach (9th edition): Part 2



Estimation and Hypothesis Testing for Two Population Parameters

From Chapter 10 of Business Statistics: A Decision-Making Approach, Ninth Edition. David F. Groebner, Patrick W. Shannon, and Phillip C. Fry. Copyright © 2014 by Pearson Education, Inc. All rights reserved.




Outcome 1 Discuss the logic behind and demonstrate the techniques for using independent samples to test hypotheses and develop interval estimates for the difference between two population means.

Outcome 2 Develop confidence interval estimates and conduct hypothesis tests for the difference between two population means for paired samples.

Outcome 3 Carry out hypothesis tests and establish interval estimates, using sample data, for the difference between two population proportions.

1 Estimation for Two Population Means Using Independent Samples

2 Hypothesis Tests for Two Population Means Using Independent Samples

3 Interval Estimation and Hypothesis Tests for Paired Samples

4 Estimation and Hypothesis Tests for Two Population Proportions

• Review material on calculating and interpreting sample means and standard deviations.

• Review the normal distribution.

• Review the concepts associated with sampling distributions for x̄ and p̄.

• Review the steps for developing confidence interval estimates for a single population mean and a single population proportion.

• Review the methods for testing hypotheses about single population means and single population proportions.

Why you need to know

In many business decision-making situations, managers must decide between two or more alternatives. For example, fleet managers in large companies must decide which model and make of car to purchase next year. Airlines must decide whether to purchase replacement planes from Boeing or Airbus. When deciding on a new advertising campaign, a company may need to evaluate proposals from competing advertising agencies. Hiring decisions may require a personnel director to select one employee from a list of applicants. Production managers are often confronted with decisions concerning whether to change a production process or leave it alone. Each day, consumers purchase a product from among several competing brands.

Fortunately, there are statistical procedures that can help decision makers use sample information to compare different populations. In this text, we introduce these procedures and techniques by discussing methods that can be used to make statistical comparisons between two populations. Later, we will discuss some methods to extend this comparison to more than two populations. Whether we are discussing cases involving two populations or those with more than two populations, the techniques we present are all extensions of the statistical tools involving a single population parameter.


1 Estimation for Two Population Means Using Independent Samples

In this section, we examine situations in which we are interested in the difference between two population means, looking first at the case in which the samples from the two populations are independent.

We will introduce techniques for estimating the difference between the means of two populations in the following situations:

1. The population standard deviations are known and the samples are independent.

2. The population standard deviations are unknown and the samples are independent.

Estimating the Difference between Two Population Means When σ₁ and σ₂ Are Known, Using Independent Samples

Recall that standard normal distribution z-values were used to establish the critical value and develop the interval estimate when the population standard deviation was assumed known and the population was assumed to be normally distributed.¹ The general format for a confidence interval estimate is shown in Equation 1.

Independent Samples

Samples selected from two or more populations in such a way that the occurrence of values in one sample has no influence on the probability of the occurrence of values in the other sample(s).

Chapter Outcome 1

¹ If the samples from the two populations are large (n ≥ 30), the normal distribution assumption is not required.

Confidence Interval, General Format

$$\text{Point estimate} \pm (\text{Critical value})(\text{Standard error}) \tag{1}$$

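In business situations, you will often need to estimate the difference between two population means. For instance, you may wish to estimate the difference in mean starting salaries between males and females, the difference in mean production output in union and nonunion factories, or the difference in mean service times at two different fast-food businesses. In these situations, the best point estimate for μ₁ − μ₂ is x̄₁ − x̄₂, the difference between the two sample means. When the population standard deviations are known, the standard error of this point estimate is given by Equation 2.

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \tag{2}$$

where:

σ₁², σ₂² = Variances of populations 1 and 2

n₁, n₂ = Sample sizes from populations 1 and 2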

Further, the critical value for determining the confidence interval will be a z-value from the standard normal distribution. In these circumstances, the confidence interval estimate for μ₁ − μ₂ is found by using Equation 3.

$$(\bar{x}_1 - \bar{x}_2) \pm z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \tag{3}$$




The z-values for several of the most commonly used confidence levels are

Confidence Level    Critical z-value
80%                 z = 1.28
90%                 z = 1.645
95%                 z = 1.96
99%                 z = 2.576

EXAMPLE 1 CONFIDENCE INTERVAL ESTIMATE FOR μ₁ − μ₂ WHEN σ₁ AND σ₂ ARE KNOWN, USING INDEPENDENT SAMPLES

Axiom Fitness Axiom Fitness is a small chain of fitness centers located primarily in the South but with some clubs scattered in other parts of the U.S. and Canada. Recently, the club in Winston-Salem, North Carolina, worked with a business class from a local university on a project in which a team of students observed Axiom customers with respect to their club usage. As part of the study, the students measured the time that customers spent in the club during a visit. The objective is to estimate the difference in mean time spent per visit for male and female customers. Previous studies indicate that the standard deviation is 11 minutes for males and 16 minutes for females. To develop a 95% confidence interval estimate for the difference in mean times, the following steps are taken:

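Step 1 Define the population parameter of interest and select independent samples from the two populations.

In this case, the company is interested in estimating the difference in mean time spent in the club between males and females. The measure of interest is μ₁ − μ₂. The student team has selected simple random samples of 100 males and 100 females at different times in the Winston-Salem club.

Step 2 Specify the desired confidence level.

The plan is to develop a 95% confidence interval estimate.

Step 3 Compute the point estimate.

The resulting sample means are

Males: x̄₁ = 34.5 minutes    Females: x̄₂ = 42.4 minutes

The point estimate is

$$\bar{x}_1 - \bar{x}_2 = 34.5 - 42.4 = -7.9 \text{ minutes}$$

Women in the sample spent an average of 7.9 minutes longer in the club.

Step 4 Determine the standard error of the sampling distribution.

The standard error is calculated as

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{11^2}{100} + \frac{16^2}{100}} = 1.9416$$

Step 5 Determine the critical value, z, from the standard normal table.

The interval estimate will be developed using a 95% confidence level. Because the population standard deviations are known, the critical value is a z-value from the standard normal table. The critical value is z = 1.96.

Step 6 Develop the confidence interval estimate using Equation 3.

$$(\bar{x}_1 - \bar{x}_2) \pm z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = -7.9 \pm 1.96(1.9416) = -7.9 \pm 3.81$$

$$-11.71 \le (\mu_1 - \mu_2) \le -4.09$$

Thus, based on the sample data and the specified confidence level, women spend on average between 4.09 and 11.71 minutes longer at this Axiom Fitness Center.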

> END EXAMPLE

TRY PROBLEM 4
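The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `z_interval_two_means` is a hypothetical helper introduced here for illustration, not something from the text. It assumes both population standard deviations are known, as in the example.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_two_means(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Interval for mu1 - mu2 when both population sigmas are known (Equation 3)."""
    point_estimate = x1 - x2
    # Standard error of the difference between two independent sample means.
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Two-tailed critical value: (1 - confidence) / 2 is left in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin, standard_error


# Axiom Fitness figures from Example 1 (males = population 1, females = population 2).
lower, upper, se = z_interval_two_means(34.5, 42.4, 11, 16, 100, 100)
print(f"standard error = {se:.4f}")
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

The script computes the critical value from the normal distribution rather than reading it from a table, so the last digits of the margin can differ slightly from a hand calculation that uses z = 1.96.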

Estimating the Difference between Two Means When σ₁ and σ₂ Are Unknown, Using Independent Samples

When estimating a single population mean when the population standard deviation is unknown, the critical value is a t-value from the t-distribution. This is also the case when you are interested in estimating the difference between two population means, if the following assumptions hold:

Chapter Outcome 1

Assumptions

• The populations are normally distributed.

• The populations have equal variances.

• The samples are independent.

The following application illustrates how a confidence interval estimate is developed using the t-distribution.

BUSINESS APPLICATION ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION MEANS

RETIREMENT INVESTING A major political issue for the past decade has focused on the long-term future of the U.S. Social Security system. Many people who have entered the workforce in the past 20 years believe the system will not be solvent when they retire, so they are actively investing in their own retirement accounts. One investment alternative is a tax-sheltered annuity (TSA) marketed by life insurance companies. Certain people, depending on occupation, qualify to invest part of their paychecks in a TSA and to pay no federal income tax on this money until it is withdrawn. While the money is invested, the insurance companies invest it in either stock or bond portfolios. A second alternative open to many people is a plan known as a 401(k), in which employees contribute a portion of their paychecks to purchase stocks, bonds, or mutual funds. In some cases, employers match all or part of the employee contributions. In many 401(k) systems, the employees can control how their funds are invested.

A recent study was conducted in North Dakota to estimate the difference in mean annual contributions for individuals covered by the two plans [TSA or 401(k)]. A simple random sample of 15 people from the population of adults who are eligible for a TSA investment was selected. A second sample of 15 people was selected from the population of adults in North Dakota who have 401(k) plans. The variable of interest is the dollar amount of money invested in the retirement plan during the previous year. Specifically, we are interested in estimating μ₁ − μ₂ using a 95% confidence interval estimate, where:

μ₁ = Mean dollars invested by the TSA-eligible population during the past year

μ₂ = Mean dollars invested by the 401(k)-eligible population during the past year




TSA–Eligible       401(k)–Eligible

x̄₁ = $2,119.70    x̄₂ = $1,777.70

s₁ = $709.70       s₂ = $593.90

Before applying the t-distribution, we need to determine whether the assumptions are likely to be satisfied. First, the samples are considered independent because the amount invested by one group should have no influence on the likelihood that any specific amount will be found for the second sample.

Next, Figure 1 shows the sample data and the box and whisker plots for the two samples. These plots exhibit characteristics that are reasonably consistent with those associated with normal distributions and approximately equal variances. Although using a box and whisker plot to check the t-distribution assumptions may seem to be imprecise, studies have shown the t-distribution to be applicable even when there are small violations of the assumptions. This is particularly the case when the sample sizes are approximately equal.

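Equation 4 can be used to develop the confidence interval estimate for the difference between two population means when you have small independent samples.

Confidence Interval Estimate for μ₁ − μ₂ When σ₁ and σ₂ Are Unknown, Independent Samples

$$(\bar{x}_1 - \bar{x}_2) \pm t\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \tag{4}$$

where:

s_p = Pooled standard deviation

t = Critical t-value from the t-distribution, with degrees of freedom equal to n₁ + n₂ − 2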

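FIGURE 1 Sample data and box and whisker plot summaries for the TSA and 401(k) samples

Five-Number Summary    TSA      401(k)
Minimum                951      334
First Quartile         1,572    1,465
Median                 2,318    1,773
Third Quartile         2,641    2,234
Maximum                3,253    2,676

Sample data (dollars invested during the previous year)

TSA: 3,122 3,253 2,021 2,479 2,318 1,407 2,641 1,648 2,439 1,059 2,799 1,714 951 2,372 1,572

401(k): 1,781 2,594 1,615 334 2,322 2,234 2,022 1,603 1,395 1,604 2,676 1,773 1,156 2,092 1,465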

To use Equation 4, we must compute the pooled standard deviation, s_p. If the equal-variance assumption holds, then both s₁² and s₂² are estimators of the same population variance, σ². To use only one of these, say s₁², to estimate σ² would be disregarding the information obtained from the other sample. To use the average of s₁² and s₂², if the sample sizes were different, would ignore the fact that more information about σ² is obtained from the sample having the larger sample size. We therefore use a weighted average of s₁² and s₂², denoted as s_p², to estimate σ², where the weights are the degrees of freedom associated with each sample. The square root of s_p² is known as the pooled standard deviation and is computed using

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

To calculate s_p, we must first calculate s₁² and s₂². This requires that we estimate μ₁ and μ₂ using x̄₁ and x̄₂, respectively. The degrees of freedom are equal to the sample size minus the number of parameters estimated before the variance estimate is obtained. Therefore, our degrees of freedom must equal n₁ + n₂ − 2.

For the retirement investing samples, the pooled standard deviation is

$$s_p = \sqrt{\frac{(15 - 1)(709.70)^2 + (15 - 1)(593.90)^2}{15 + 15 - 2}} = 654.37$$

With 15 + 15 − 2 = 28 degrees of freedom, the critical t for 95% confidence is 2.0484, and Equation 4 gives

$$(2{,}119.70 - 1{,}777.70) \pm 2.0484(654.37)\sqrt{\frac{1}{15} + \frac{1}{15}} = 342 \pm 489.45$$

Thus, the 95% confidence interval estimate for the difference in mean dollars for people who invest in a TSA versus those who invest in a 401(k) is

$$-\$147.45 \le (\mu_1 - \mu_2) \le \$831.45$$

This confidence interval estimate crosses zero and therefore indicates there may be no difference between the mean contributions to TSA accounts and to 401(k) accounts by adults in North Dakota. The implication of this result is that we cannot conclude that the average amount invested by individuals in pretax TSA programs is either more or less than that invested by those participating in after-tax 401(k) programs. Based on this result, there may be an opportunity to encourage the TSA investors to increase deposits.
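The pooled-variance interval above can be reproduced from the summary statistics alone. The following is a minimal sketch, not the authors' procedure: `pooled_t_interval` is a hypothetical helper name, and the critical value 2.0484 (95% confidence, 28 degrees of freedom) is passed in as a table lookup because the Python standard library has no t-distribution quantile function.

```python
from math import sqrt


def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu1 - mu2 assuming equal population variances (Equation 4)."""
    df = n1 + n2 - 2
    # Pooled standard deviation: sample variances weighted by their degrees of freedom.
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp, diff - margin, diff + margin


# Critical t for 95% confidence and 28 degrees of freedom, taken from a t-table.
T_CRIT_28_DF = 2.0484

# Summary statistics for the TSA (population 1) and 401(k) (population 2) samples.
sp, lower, upper = pooled_t_interval(
    2119.70, 709.70, 15, 1777.70, 593.90, 15, T_CRIT_28_DF
)
print(f"pooled standard deviation = {sp:.2f}")
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

Because the two samples have the same size here, the pooled variance is simply the average of the two sample variances; the degrees-of-freedom weighting only changes the result when n₁ and n₂ differ.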

EXAMPLE 2 CONFIDENCE INTERVAL ESTIMATE FOR μ₁ − μ₂ WHEN σ₁ AND σ₂ ARE UNKNOWN, USING INDEPENDENT SAMPLES

Andreason Marketing, Inc. Andreason Marketing, Inc. has been hired by a major newspaper in the U.S. to estimate the difference in mean time that newspaper subscribers spend reading the Saturday newspaper when subscribers age 50 and under are compared with those more than 50 years old. A simple random sample of six people age 50 or younger and eight people over 50 participated in the study. The estimate can be developed using the following steps:




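Step 1 Define the population parameter of interest and select independent samples from the two populations.

The objective here is to estimate the difference between the two age groups with respect to the mean time spent reading the Saturday edition of the newspaper. The parameter of interest is μ₁ − μ₂.

The marketing company has selected simple random samples of six “younger” and eight “older” people. Because the reading time by one person does not influence the reading time for any other person, the samples are independent.

Step 2 Specify the desired confidence level.

The marketing firm wishes to have a 95% confidence interval estimate.

Step 3 Compute the point estimate.

The resulting sample means and sample standard deviations for the two groups are

Age ≤ 50: x̄₁ = 13.6 minutes    s₁ = 3.1 minutes    n₁ = 6

Age > 50: x̄₂ = 11.2 minutes    s₂ = 5.0 minutes    n₂ = 8

The point estimate is

$$\bar{x}_1 - \bar{x}_2 = 13.6 - 11.2 = 2.4 \text{ minutes}$$

Step 4 Determine the standard error of the sampling distribution.

The pooled standard deviation is computed using

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(6 - 1)(3.1)^2 + (8 - 1)(5.0)^2}{6 + 8 - 2}} = 4.31$$

Then the standard error is

$$s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 4.31\sqrt{\frac{1}{6} + \frac{1}{8}} = 2.3277$$

Step 5 Determine the critical value, t, from the t-distribution.

Because the population standard deviations are unknown, the critical value will be a t-value from the t-distribution as long as the population variances are equal and the populations are assumed to be normally distributed. The critical t for 95% confidence and 6 + 8 − 2 = 12 degrees of freedom is t = 2.1788.

Step 6 Develop the confidence interval estimate using Equation 4.

$$(\bar{x}_1 - \bar{x}_2) \pm t\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 2.4 \pm 2.1788(2.3277) = 2.4 \pm 5.0715$$

$$-2.6715 \le (\mu_1 - \mu_2) \le 7.4715$$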




Because the interval crosses zero, we cannot conclude that a difference exists between the age groups with respect to the mean reading time for the Saturday edition Thus, with respect to this factor, it does not seem to matter whether the person is 50 or younger or over 50.

> END EXAMPLE

TRY PROBLEM 1

What If the Population Variances Are Not Equal? If you have reason to believe that the population variances are substantially different, Equation 4 is not appropriate for computing the confidence interval. Instead of computing the pooled standard deviation as part of the confidence interval formula, we use Equations 5 and 6.

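Confidence Interval for μ₁ − μ₂ When σ₁ and σ₂ Are Unknown and Not Equal, Independent Samples

$$(\bar{x}_1 - \bar{x}_2) \pm t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \tag{5}$$

where:

t is from the t-distribution with degrees of freedom computed using

Degrees of Freedom for Estimating Difference between Population Means When σ₁ and σ₂ Are Not Equal

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \tag{6}$$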

EXAMPLE 3 ESTIMATING μ₁ − μ₂ WHEN THE POPULATION VARIANCES ARE NOT EQUAL

Citibank The marketing managers at Citibank are planning to roll out a new marketing campaign aimed at increasing bank card use. As one part of the campaign, the company will be offering a low interest rate incentive to induce people to spend more money using its charge cards. However, the company is concerned whether this plan will have a different impact on married card holders than on unmarried card holders. So, prior to starting the marketing campaign nationwide, the company tests it on a random sample of 30 unmarried and 25 married customers. The managers wish to estimate the difference in mean credit card spending for unmarried versus married customers for a two-week period immediately after being exposed to the marketing campaign. Based on past data, the managers have reason to believe the spending distributions for unmarried and married customers will be approximately normally distributed, but they are unwilling to conclude the population variances for spending are equal for the two populations.

A 95% confidence interval estimate for the difference in population means can be developed using the following steps:

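Step 1 Define the population parameter of interest.

The parameter of interest is the difference between the mean dollars spent on credit cards by unmarried versus married customers in the two-week period after being exposed to Citi’s new marketing program.

Step 2 Specify the desired confidence level.

The research manager wishes to have a 95% confidence interval estimate.

Step 3 Select the samples and compute the point estimate.

Independent samples of 30 unmarried and 25 married customers were taken, and the credit card spending for each sampled customer during the two-week period was recorded. The following sample results were observed:

Unmarried: x̄₁ = $455.10    s₁ = $102.40    n₁ = 30

Married: x̄₂ = $268.90    s₂ = $77.25    n₂ = 25

The point estimate is

$$\bar{x}_1 - \bar{x}_2 = \$455.10 - \$268.90 = \$186.20$$

Step 4 Determine the standard error of the sampling distribution.

The standard error is calculated as

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{102.40^2}{30} + \frac{77.25^2}{25}} = 24.25$$

Step 5 Determine the critical value, t, from the t-distribution.

Because we are unable to assume the population variances are equal, we must first use Equation 6 to calculate the degrees of freedom for the t-distribution. This is done as follows:

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} = \frac{\left(\dfrac{102.40^2}{30} + \dfrac{77.25^2}{25}\right)^2}{\dfrac{(102.40^2/30)^2}{29} + \dfrac{(77.25^2/25)^2}{24}} = 52.53$$

Thus, the degrees of freedom (rounded down) will be 52. At the 95% confidence level, using the t-distribution table, the approximate t-value is 2.0086. Note that because there is no entry for 52 degrees of freedom in the table, we have selected the t-value associated with 95% confidence and 50 degrees of freedom, which provides a slightly larger t-value than would have been the case for 52 degrees of freedom. Thus, the interval estimate will be slightly wider than strictly necessary, which errs on the conservative side.

Step 6 Develop the confidence interval estimate using Equation 5.

The confidence interval estimate is computed using

$$(\bar{x}_1 - \bar{x}_2) \pm t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Then the interval estimate is

$$(\$455.10 - \$268.90) \pm 2.0086\sqrt{\frac{102.40^2}{30} + \frac{77.25^2}{25}} = \$186.20 \pm \$48.71$$

$$\$137.49 \le (\mu_1 - \mu_2) \le \$234.91$$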

> END EXAMPLE

TRY PROBLEM 6
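The degrees-of-freedom calculation in Equation 6 is easy to get wrong by hand, so a short check may help. This is a minimal sketch under stated assumptions: `welch_interval` is a hypothetical helper name, and the critical value 2.0086 is the table value for 50 degrees of freedom that Example 3 uses in place of 52.

```python
from math import floor, sqrt


def welch_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu1 - mu2 without assuming equal variances (Equations 5 and 6)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    standard_error = sqrt(v1 + v2)
    # Equation 6: approximate degrees of freedom for unequal variances.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    margin = t_crit * standard_error
    diff = x1 - x2
    return df, standard_error, diff - margin, diff + margin


# Table value for 95% confidence and 50 degrees of freedom, as used in Example 3.
T_CRIT_50_DF = 2.0086

# Citibank figures: unmarried = population 1, married = population 2.
df, se, lower, upper = welch_interval(
    455.10, 102.40, 30, 268.90, 77.25, 25, T_CRIT_50_DF
)
print(f"degrees of freedom = {df:.2f} (rounded down: {floor(df)})")
print(f"standard error = {se:.2f}")
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

The script carries the unrounded standard error through the calculation, so its interval endpoints can differ by about a cent from a hand calculation that rounds the standard error to 24.25 first.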




10-1 The following information is based on independent random samples taken from two normally distributed populations having equal variances:

n₁ = 15    n₂ = 13
x̄₁ = 50    x̄₂ = 53
s₁ = 5     s₂ = 6

Based on the sample information, determine the 90% confidence interval estimate for the difference between the two population means.

10-2 The following information is based on independent random samples taken from two normally distributed populations having equal variances:

n₁ = 24    n₂ = 28
x̄₁ = 130   x̄₂ = 125
s₁ = 19    s₂ = 17.5

Based on the sample information, determine the 95% confidence interval estimate for the difference between the two population means.

10-3 Construct a 90% confidence interval estimate for the difference between two population means given the following sample data selected from two normally distributed populations with equal variances:

Sample 1    Sample 2

10-4 Construct a 95% confidence interval estimate for the difference between two population means based on the

10-5 Construct a 95% confidence interval for the difference between two population means using the following sample data that have been selected from normally distributed populations with different

10-6 Two random samples were selected independently from populations having normal distributions. The following statistics were extracted from the samples:

x̄₁ = 42.3    x̄₂ = 32.4

a. If σ₁ = 3 and σ₂ = 2 and the sample sizes are n₁ = 50 and n₂ = 50, construct a 95% confidence interval for the difference between the two population means.

b. If σ₁ = σ₂, s₁ = 3, and s₂ = 2, and the sample sizes are n₁ = 10 and n₂ = 10, construct a 95% confidence interval for the difference between the two population means.

c. If σ₁ ≠ σ₂, s₁ = 3, and s₂ = 2, and the sample sizes are n₁ = 10 and n₂ = 10, construct a 95% confidence interval for the difference between the two population means.

10-7 Amax Industries operates two manufacturing facilities that specialize in doing custom manufacturing work for the semiconductor industry. The facility in Denton, Texas, is highly automated, whereas the facility in Lincoln, Nebraska, has more manual functions. For the past few months, both facilities have been working on a large order for a specialized product. The vice president of operations is interested in estimating the difference in mean time it takes to complete a part on the two lines. To do this, he has requested that a random sample of 15 parts at each facility be tracked from start to finish and the time required be recorded. The following sample data were recorded:

Denton, Texas        Lincoln, Nebraska
x̄₁ = 56.7 hours     x̄₂ = 70.4 hours
s₁ = 7.1 hours       s₂ = 8.3 hours

Assuming that the populations are normally distributed with equal population variances, construct and interpret a 95% confidence interval estimate.

10-8 A credit card company operates two customer service centers: one in Boise and one in Richmond. Callers to the service centers dial a single number, and a computer program routes callers to the center having the fewest calls waiting. As part of a customer service review program, the credit card center would like to determine whether the average length of a call (not including hold time) is different for the two centers. The managers of the customer service centers are willing to assume that the populations of interest are normally distributed with equal variances. Suppose a random sample of phone calls to the two centers is selected and the following results are reported:

                            Boise    Richmond
Sample Mean (seconds)       195      216
Sample St. Dev. (seconds)   35.10    37.80

a. Using the sample results, develop a 90% confidence interval estimate for the difference between the two population means.

b. Based on the confidence interval constructed in part a, what can be said about the difference between the average call times at the two centers?

10-9 A pet food producer manufactures and then fills 25-pound bags of dog food on two different production lines located in separate cities. In an effort to determine whether differences exist between the average fill rates for the two lines, a random sample of 19 bags from line 1 and a random sample of 23 bags from line 2 were recently selected. Each bag’s weight was measured and the following summary measures from the samples were reported:

                               Production Line 1    Production Line 2
Sample Mean, x̄                24.96                25.01
Sample Standard Deviation, s   0.07                 0.08

Management believes that the fill rates of the two lines are normally distributed with equal variances.

a. Calculate the point estimate for the difference between the population means of the two lines.

b. Develop a 95% confidence interval estimate of the true mean difference between the two lines.

c. Based on the 95% confidence interval estimate calculated in part b, what can the managers of the production lines conclude about the differences between the average fill rates for the two lines?

10-10 Two companies that manufacture batteries for electronics products have submitted their products to an independent testing agency. The agency tested 200 of each company’s batteries and recorded the length of time the batteries lasted before failure. The following results were determined:

Company A          Company B
x̄ = 41.5 hours    x̄ = 39.0 hours

a. Based on these data, determine the 95% confidence interval to estimate the difference in average life of the batteries for the two companies. Do these data indicate that one company’s batteries will outlast the other company’s batteries on average? Explain.

b. Suppose the manufacturers of each of these batteries wished to warranty their batteries. One small company to which they both ship batteries receives shipments of 200 batteries weekly. If the average length of time to failure of the batteries is less than a specified number, the manufacturer will refund the company’s purchase price of that set of batteries. What value should each manufacturer set if they wish to refund money on at most 5% of the shipments?

10-11 Wilson Construction and Concrete Company is known as a very progressive company that is willing to try new ideas to improve its products and service. One of the key factors of importance in concrete work is the time it takes for the concrete to “set up.” The company is considering a new additive that can be put in the concrete mix to help reduce the setup time. Before going ahead with the additive, the company plans to test it against the current additive. To do this, 14 batches of concrete are mixed using each of the additives. The following results were observed:

Old Additive       New Additive
x̄ = 17.2 hours    x̄ = 15.9 hours
s = 2.5 hours      s = 1.8 hours

a. Use these sample data to construct a 90% confidence interval estimate for the difference in mean setup time for the two concrete additives. On the basis of the confidence interval produced, do you agree that the new additive helps reduce the setup time for cement? (Assume the populations are normally distributed.) Explain your answer.

b. Assuming that the new additive is slightly more expensive than the old additive, do the data support switching to the new additive if the managers of the company are primarily interested in reducing average setup time?

10-12 A working paper (Mark Aguiar and Erik Hurst, “Measuring Trends in Leisure: The Allocation of Time over Five Decades,” 2006) for the Federal Reserve Bank of Boston concluded that average leisure time spent per week by women in 2003 was 33.80 hours and 37.56 hours for men. The sample standard deviations were 40 and 70, respectively. These results were obtained from samples of women and men of size 8,492 and 6,752, respectively. In this study, leisure refers to the time individuals spent socializing, in passive leisure, in active leisure, volunteering, in pet care, gardening, and recreational child care. Assume that the amounts of leisure time spent by men and women have normal distributions with equal population variances.

a. Determine the pooled estimate of the common populations’ standard deviation.

b. Produce the margin of error to estimate the difference of the two population means with a confidence level of 95%.


c. Calculate a 95% confidence interval for the difference in the average leisure time between women and men.

d. Do your results in part c indicate that the average amount of men’s leisure time was larger than that of women in 2003? Support your assertions.

e. Would your conclusion in part d change if you did not assume the population variances were equal?

10-13 The Graduate Management Admission Council reported a shift in the job-hunting strategies among second-year masters of business administration (MBA) candidates. Even though their prospective base salary has increased from $81,900 to $93,770 from 2002 to 2005, it appears that MBA candidates are submitting fewer job applications. Data obtained from online surveys of 1,442 MBA candidates at 30 business school programs indicate that in 2002 the average number of job applications per candidate was 38.9 and 2.0 in 2005. The sample variances were 64 and 0.32, respectively.

a. Examine the sample variances. Conjecture whether this sample evidence indicates that the two population variances are equal to each other. Support your assertion.

b. On the basis of your answer in part a, construct a 99% confidence interval for the difference in the average number of job applications submitted by MBA candidates between 2002 and 2005.

c. Using your result in part b, is it plausible that the difference in the average number of job applications submitted is 36.5? Is it plausible that the difference in the average number of job applications submitted is 37? Are your answers to these two questions contradictory? Explain.

10-14 Logston Enterprises operates a variety of businesses in and around the St. Paul, Minnesota, area. Recently, the company was notified by the law firm representing several female employees that a lawsuit was going to be filed claiming that males were given preferential treatment when it came to pay raises by the company. The Logston human resources manager has requested that an estimate be made of the difference between mean percentage raises granted to males versus females. Sample data are contained in the file Logston Enterprises. She wants you to develop and interpret a 95% confidence interval estimate. She further states that the distribution of percentage raises can be assumed approximately normal, and she expects the population variances to be about equal.

10-15 The owner of the A.J. Fitness Center is interested in estimating the difference in mean years that female members have been with the club compared with male members. He wishes to develop a 95% confidence interval estimate. Sample data are in the file called AJ Fitness. Assuming that the sample data are approximately normal and that the two populations have equal variances, develop and interpret the confidence interval estimate. Discuss the result.

10-16 Platinum Billiards, Inc., based in Jacksonville, Florida, is a retailer of billiard supplies. It stands out among billiard suppliers because of the research it does to assure its products are top notch. One experiment was conducted to measure the speed attained by a cue ball struck by various weighted pool cues. The conjecture is that a light cue generates faster speeds while breaking the balls at the beginning of a game of pool. Anecdotal experience has indicated that a billiard cue weighing less than 19 ounces generates faster speeds. Platinum used a robotic arm to investigate this claim. The research generated the data given in the file titled Breakcue.

a. Calculate the sample standard deviation and mean speed produced by cues in the two weight categories: (1) under 19 ounces and (2) at or above 19 ounces.

b. Calculate a 95% confidence interval for the difference in the average speed of a cue ball generated by each of the weight categories.

c. Is the anecdotal evidence correct? Support your assertion.

d. What assumptions are required so that your results in part b would be valid?

10-17 The Federal Reserve reported in its comprehensive Survey of Consumer Finances, released every three years, that the average income of families in the United States declined from 2001 to 2004. This was the first decline since 1989–1992. A sample of incomes was taken in 2001 and repeated in 2004. After adjusting for inflation, the data that arise from these samples are given in a file titled Federal Reserve.

a. Determine the percentage decline indicated by the two samples.

b. Using these samples, produce a 90% confidence interval for the difference in the average family income between 2001 and 2004.

c. Is it plausible that there has been no decline in the average income of U.S. families? Support your assertion.

d. How large an error could you have made by using the difference in the sample means to estimate the difference in the population means?

END EXERCISES 10-1




2 Hypothesis Tests for Two Population Means Using Independent Samples

You are going to encounter situations that will require you to test whether two populations have equal means or whether one population mean is larger (or smaller) than another. These hypothesis-testing applications are just an extension of the hypothesis-testing process for a single population mean. They also build directly on the estimation process introduced in Section 1.

In this section, we will introduce hypothesis-testing techniques for the difference between the means of two populations in the following situations:

1. The population standard deviations are known and the samples are independent.

2. The population standard deviations are unknown and the samples are independent.

The remainder of this section presents examples of hypothesis tests for these different situations.

Testing for μ1 − μ2 When σ1 and σ2 Are Known, Using Independent Samples

Samples are considered to be independent when the samples from the two populations are taken in such a way that the occurrence of values in one sample has no influence on the probability of occurrence of the values in the second sample. In special cases in which the population standard deviations are known and the samples are independent, the test statistic is a z-value computed using Equation 7.

$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \qquad (7) $$

where:

(μ1 − μ2) = Hypothesized difference in population means

x̄1 and x̄2 = Sample means from populations 1 and 2

σ1 and σ2 = Known population standard deviations

n1 and n2 = Sample sizes from the two populations

If the z-value calculated using Equation 7 falls in the rejection region defined by the critical z-value from the standard normal distribution, the null hypothesis is rejected. Example 4 illustrates the use of this test statistic.

EXAMPLE 4 HYPOTHESIS TEST FOR μ1 − μ2 WHEN σ1 AND σ2 ARE KNOWN, INDEPENDENT SAMPLES

Brooklyn Brick, Inc. Brooklyn Brick, Inc. is a Pennsylvania-based company that makes bricks and concrete blocks for the building industry. One product is a brick facing material that looks like a real brick but is much thinner. The ideal thickness is 0.50 inches. The bricks that the company makes must be very uniform in their dimensions so brickmasons can build straight walls. The company has two plants that produce brick facing products, and the technology used at the two plants is slightly different. At plant 1, the standard deviation in the thickness of brick facing products is known to be 0.025 inches, and the standard deviation at plant 2 is 0.034 inches. These are known values. However, the company is interested in determining whether there is a difference in the average thickness of brick facing products made at the two plants. Specifically, the company wishes to know whether plant 2 provides brick facing products that have a greater mean thickness than the products produced at plant 1. If the test determines that plant 2 does provide thicker materials than plant 1, the managers will have the maintenance department attempt to adjust the process to reduce the mean thickness.

To test this, you can use the following steps:

Step 1 Specify the population parameter of interest.

This is μ1 − μ2, the difference in the two population means.

Step 2 Formulate the appropriate null and alternative hypotheses.

We are interested in determining whether the mean thickness for plant 2 exceeds that for plant 1. The following null and alternative hypotheses are specified:

H0: μ1 − μ2 ≥ 0.0    or    H0: μ1 ≥ μ2
HA: μ1 − μ2 < 0.0    or    HA: μ1 < μ2

Step 3 Specify the significance level for the test.

The test will be conducted using α = 0.05.

Step 4 Determine the rejection region and develop the decision rule.

Because the population standard deviations are assumed to be known, the critical value is a z-value from the standard normal distribution. This test is a one-tailed lower-tail test, with α = 0.05. From the standard normal distribution, the critical z-value is

−z0.05 = −1.645

The decision rule compares the test statistic found in Step 5 to the critical z-value.

If z < −1.645, reject the null hypothesis;
Otherwise, do not reject the null hypothesis.

Alternatively, you can state the decision rule in terms of a p-value, as follows:

If p-value < α = 0.05, reject the null hypothesis;
Otherwise, do not reject the null hypothesis.

Step 5 Compute the test statistic.

Select simple random samples of brick facing pieces from the two populations and compute the sample means. A simple random sample of 100 brick facing pieces is selected from plant 1's production, and another simple random sample of 100 brick facing pieces is selected from plant 2's production. The samples are independent because the thicknesses of the brick pieces made by one plant can in no way influence the thicknesses of the bricks made by the other plant. The means computed from the samples are

x̄1 = 0.501 inches and x̄2 = 0.509 inches

The test statistic is obtained using Equation 7:

$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(0.501 - 0.509) - 0}{\sqrt{\dfrac{0.025^2}{100} + \dfrac{0.034^2}{100}}} = -1.90 $$

Step 6 Reach a decision.

The critical value is −z0.05 = −1.645, and the test statistic value was computed to be z = −1.90. Applying the decision rule:

Because z = −1.90 < −1.645, reject the null hypothesis.

Figure 2 illustrates this hypothesis test

How to do it (Example 4)

The Hypothesis-Testing Process for Two Population Means

The hypothesis-testing process for tests involving two population means introduced in this section is essentially the same as for a single population mean. The process is composed of the following steps:

1. Specify the population parameter of interest.

2. Formulate the appropriate null and alternative hypotheses. The null hypothesis should contain the equality. Possible formats for hypotheses testing concerning two population means are

   H0: μ1 − μ2 = c    H0: μ1 − μ2 ≤ c    H0: μ1 − μ2 ≥ c
   HA: μ1 − μ2 ≠ c    HA: μ1 − μ2 > c    HA: μ1 − μ2 < c

   where c = any specified number.

3. Specify the significance level (α) for testing the hypothesis. Alpha is the maximum allowable probability of committing a Type I statistical error.

4. Determine the rejection region and develop the decision rule.

5. Compute the test statistic or the p-value. Of course, you must first select simple random samples from each population and compute the sample means.

6. Reach a decision. Apply the decision rule to determine whether to reject the null hypothesis.

7. Draw a conclusion.




FIGURE 2 | Example 4 Hypothesis Test

[Figure: standard normal distribution with the lower-tail critical value −z0.05 = −1.645. Test statistic: z = [(0.501 − 0.509) − 0] / √(0.025²/100 + 0.034²/100) = −1.90. Decision rule: since z = −1.90 < −z0.05 = −1.645, reject H0. Conclude that the brick facings made by plant 2 have a larger mean thickness than those made by plant 1.]

Step 7 Draw a conclusion.

There is statistical evidence to conclude that the brick facings made by plant 2 have a larger mean thickness than those made by plant 1. Thus, the managers of Brooklyn Brick, Inc. need to take action to reduce the mean thickness at plant 2.

> END EXAMPLE

TRY PROBLEM 21
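The Example 4 calculation can be sketched in a few lines of Python. This is a minimal illustration rather than part of the text's Excel or Minitab workflow: the function name `two_sample_z` and its parameter names are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` instead of a printed normal table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """z statistic for mu1 - mu2 when both population standard deviations are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / standard_error


# Example 4 inputs: brick facing thickness (inches) at plant 1 and plant 2
z = two_sample_z(0.501, 0.509, 0.025, 0.034, 100, 100)

alpha = 0.05
z_critical = NormalDist().inv_cdf(alpha)  # lower-tail critical value, about -1.645
reject_h0 = z < z_critical                # lower-tail decision rule

print(f"z = {z:.2f}, critical z = {z_critical:.3f}, reject H0: {reject_h0}")
```

With the Example 4 inputs, the computed statistic rounds to the z = −1.90 reported in the example and falls below the critical value, so the sketch reaches the same decision.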

Using p-Values The z-test statistic computed in Example 4 indicates that the difference in sample means is 1.90 standard errors below the hypothesized difference of zero. Because this falls below the z critical level of −1.645, the null hypothesis is rejected. You could have also tested this hypothesis using the p-value approach. The p-value for this one-tailed test is the probability of a z-value in a standard normal distribution being less than −1.90. From the standard normal table, the probability between z = 0 and z = −1.90 is 0.4713. Then the p-value is

p-value = 0.5000 − 0.4713 = 0.0287

The decision rule to use with p-values is

If p-value < α, reject the null hypothesis;
Otherwise, do not reject the null hypothesis.

Because

p-value = 0.0287 < α = 0.05

reject the null hypothesis and conclude that the mean brick facing thickness for plant 2 is greater than the mean thickness for products produced by plant 1.
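As a cross-check on the table lookup, the same lower-tail p-value can be obtained from the standard normal cumulative distribution function. The sketch below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

z = -1.90      # test statistic reported in Example 4
alpha = 0.05   # significance level used in Example 4

# Lower-tail p-value: P(Z < z) for a standard normal Z
p_value = NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The result agrees with the 0.5000 − 0.4713 = 0.0287 obtained from the table, because the table value 0.4713 is the area between 0 and 1.90 and the normal curve is symmetric.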

Testing for μ1 − μ2 When σ1 and σ2 Are Unknown, Using Independent Samples

In Section 1 we showed that to develop a confidence interval estimate for the difference between two population means when the standard deviations are unknown, we used the t-distribution to obtain the critical value. As you might suspect, this same approach is taken for hypothesis-testing situations. Equation 8 shows the t-test statistic that will be used when σ1 and σ2 are unknown.




t-Test Statistic for μ1 − μ2 When σ1 and σ2 Are Unknown and Assumed Equal, Independent Samples

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad df = n_1 + n_2 - 2 \qquad (8) $$

where:

x̄1 and x̄2 = Sample means from populations 1 and 2

μ1 − μ2 = Hypothesized difference between population means

n1 and n2 = Sample sizes from the two populations

s_p = Pooled standard deviation (see Equation 4)

The test statistic in Equation 8 is based on three assumptions:

Assumptions

- Each population has a normal distribution.3
- The two population variances, σ1² and σ2², are equal.
- The samples are independent.

Notice that in Equation 8, we are using the pooled estimate for the common population standard deviation that we developed in Section 1.

BUSINESS APPLICATION HYPOTHESIS TEST FOR THE DIFFERENCE BETWEEN TWO POPULATION MEANS

RETIREMENT INVESTING (CONTINUED) Recall the earlier example discussing a study in North Dakota involving retirement investing. The leaders of the study are interested in determining whether there is a difference in mean annual contributions for individuals covered by TSAs and those with 401(k) retirement programs. A simple random sample of 15 people from the population of adults who are eligible for a TSA investment was selected. A second sample of 15 people was selected from the population of adults in North Dakota who have 401(k) plans. The variables of interest are the dollars invested in the two retirement plans during the previous year.

Specifically, we are interested in testing the following null and alternative hypotheses:

H0: μ1 − μ2 = 0.0    or    H0: μ1 = μ2
HA: μ1 − μ2 ≠ 0.0    or    HA: μ1 ≠ μ2

μ1 = Mean dollars invested by the TSA-eligible population during the past year

μ2 = Mean dollars invested by the 401(k)-eligible population during the past year

The leaders of the study select a significance level of α = 0.05. The sample results, as reported in Figure 3, include sample means of x̄1 = 2,119.70 and x̄2 = 1,777.70 from samples of n1 = n2 = 15.




We are now in a position to complete the hypothesis test to determine whether the mean dollar amount invested by TSA employees is different from the mean amount invested by 401(k) employees. We first determine the critical values with degrees of freedom equal to

n1 + n2 − 2 = 15 + 15 − 2 = 28

and α = 0.05 for the two-tailed test.4 The appropriate t-values are

±t0.025 = ±2.0484

The test statistic, computed using Equation 8, is t = 1.4313. Because t = 1.4313 falls between −2.0484 and 2.0484, the null hypothesis is not rejected, and the difference in sample means is attributed to sampling error. Figure 3 summarizes this hypothesis test. Based on the sample data, there is no statistical justification to believe that the mean annual investment by individuals eligible for the TSA option is different from that of individuals eligible for the 401(k) plan.
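The two-tailed decision in this application can be written out as a short Python sketch. The helper names `pooled_df` and `two_tailed_decision` are hypothetical; the test statistic (1.4313) and critical value (2.0484) are the values reported in Figure 3 rather than values computed here.

```python
def pooled_df(n1, n2):
    """Degrees of freedom for the pooled-variance t-test."""
    return n1 + n2 - 2


def two_tailed_decision(t_stat, t_critical):
    """Reject H0 only if the statistic lies beyond either critical value."""
    return "reject H0" if abs(t_stat) > t_critical else "do not reject H0"


df = pooled_df(15, 15)                           # 15 TSA and 15 401(k) participants
decision = two_tailed_decision(1.4313, 2.0484)   # values reported in Figure 3

print(f"df = {df}, decision: {decision}")
```

Comparing the absolute value of the statistic with the positive critical value is equivalent to checking both tails separately, which is why a single comparison suffices for a two-tailed test.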

4 You can also use Excel's T.INV.2T function (=T.INV.2T(0.05,28)).

FIGURE 3 | Hypothesis Test for the Equality of the Two Population Means for the North Dakota

[Figure: t-distribution with df = n1 + n2 − 2 = 15 + 15 − 2 = 28; rejection regions of α/2 = 0.025 beyond −t0.025 = −2.0484 and t0.025 = 2.0484; x̄1 − x̄2 = 2,119.70 − 1,777.70 = 342.00; test statistic t = 1.4313.]




BUSINESS APPLICATION USING EXCEL TO TEST FOR THE DIFFERENCE BETWEEN TWO POPULATION MEANS

SUV VEHICLE MILEAGE Excel has a procedure for performing the necessary calculations to test hypotheses involving two population means. Consider a national car rental company that is interested in testing to determine whether there is a difference in mean mileage for sport utility vehicles (SUVs) driven in town versus those driven on the highway. Based on its experience with regular automobiles, the company believes the mean highway mileage will exceed the mean city mileage.

To test this belief, the company has randomly selected 25 SUV rentals driven only on the highway and another random sample of 25 SUV rentals driven only in the city. The vehicles were filled with 14 gallons of gasoline. The company then asked each customer to drive the car until it ran out of gasoline. At that point, the elapsed miles were noted and the miles per gallon (mpg) were recorded. For their trouble, the customers received free use of the SUV and a coupon valid for one week's free rental. The results of the experiment are contained in the file Mileage.

Excel can be used to perform the calculations required to determine whether the manager's belief about SUV highway mileage is justified. We first formulate the null and alternative hypotheses to be tested:

H0: μ1 − μ2 ≤ 0.0    or    H0: μ1 ≤ μ2
HA: μ1 − μ2 > 0.0    or    HA: μ1 > μ2

Population 1 represents highway mileage, and population 2 represents city mileage. The test is conducted using a significance level of α = 0.05.

Figure 4 shows the descriptive statistics for the two independent samples.

FIGURE 4 | Excel 2010 Output: SUV Mileage Descriptive Statistics

Excel 2010 Instructions:

1. Open file: Mileage.xlsx.
2. Select Data > Data Analysis.
3. Select Descriptive Statistics.
4. Define the data range for all variables to be

Figure 5 displays the Excel box and whisker plots for the two samples. Based on these plots, the normal distribution and equal variance assumptions appear reasonable. We will proceed with the test of means assuming normal distributions and equal variances.

Figure 6 shows the Excel output for the hypothesis test. The mean highway mileage is 19.6468 mpg, whereas the mean for city driving is 16.146 mpg. At issue is whether this difference in sample means (19.6468 − 16.146 = 3.5008 mpg) is sufficient to conclude the mean highway mileage exceeds the mean city mileage. The one-tail t critical value for α = 0.05 is shown in Figure 6 to be

t0.05 = 1.6772

Figure 6 shows that the "t-Stat" value from Excel, which is the calculated test statistic (or t-value, based on Equation 8), is equal to

t = 2.52

The difference in sample means (3.5008 mpg) is 2.52 standard errors larger than the hypothesized difference of zero. Because the test statistic

t = 2.52 > t0.05 = 1.6772

we reject the null hypothesis. Thus, the sample data do provide sufficient evidence to conclude that mean SUV highway mileage exceeds mean SUV city mileage, and this study confirms the expectations of the rental company managers. This will factor into the company's fuel pricing.

The output shown in Figure 6 also provides the p-value for the one-tailed test, which can also be used to test the null hypothesis. Recall, if the calculated p-value is less than alpha, the null hypothesis should be rejected. The decision rule is

If p-value < 0.05, reject H0;
Otherwise, do not reject H0.

The p-value for the one-tailed test is 0.0075. Because 0.0075 < 0.05, the null hypothesis is rejected. This is the same conclusion as the one we reached using the test statistic approach.
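For readers without Excel, a one-tailed p-value for a t statistic can be approximated with the Python standard library alone. The sketch below is an illustration, not the method Excel uses: it integrates the Student's t density numerically with Simpson's rule, and the function names `t_pdf` and `t_upper_tail` are hypothetical. It uses the rounded statistic t = 2.52 with df = 25 + 25 − 2 = 48, so the result should be close to, but not necessarily identical to, the 0.0075 reported in Figure 6.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t_stat, df, intervals=2000):
    """Approximate P(T > t_stat) using Simpson's rule on [0, |t_stat|]."""
    if t_stat < 0:
        return 1.0 - t_upper_tail(-t_stat, df, intervals)
    h = t_stat / intervals  # intervals must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 0.5 - area_zero_to_t  # half the distribution lies above zero


alpha = 0.05
p_value = t_upper_tail(2.52, 48)  # SUV mileage test: upper-tail test, df = 48
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

A quick sanity check on the sketch is that `t_upper_tail(2.0484, 28)` returns approximately 0.025, matching the two-tailed critical value used in the retirement investing application.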

FIGURE 5 | Excel 2010 Output (PHStat Add-in) Box and Whisker Plot: SUV Mileage Test

Minitab Instructions (for similar results):

1. Open file: Mileage.MTW.
2. Choose Graph > Boxplot.
3. Under Multiple Ys, select Simple.

EXAMPLE 5 HYPOTHESIS TEST FOR μ1 − μ2 WHEN σ1 AND σ2 ARE UNKNOWN, USING INDEPENDENT SAMPLES

Color Printer Ink Cartridges A recent Associated Press news story out of Brussels, Belgium, indicated the European Union was considering a probe of computer makers after consumers complained that they were being overcharged for ink cartridges. Companies such as Canon, Hewlett-Packard, and Epson are the printer market leaders and make most of their printer-related profits by selling replacement ink cartridges. Suppose an independent test agency wishes to conduct a test to determine whether name-brand ink cartridges generate more color pages on average than competing generic ink cartridges. The test can be conducted using the following steps:

Step 1 Specify the population parameter of interest.

We are interested in determining whether the mean number of pages printed by name-brand cartridges (population 1) exceeds the mean pages printed by generic cartridges (population 2).

Step 2 Formulate the appropriate null and alternative hypotheses.

The following null and alternative hypotheses are specified:

H0: μ1 − μ2 ≤ 0.0    or    H0: μ1 ≤ μ2
HA: μ1 − μ2 > 0.0    or    HA: μ1 > μ2

FIGURE 6 | Excel 2010 Output for the SUV Mileage t-Test for Two Population Means

Excel 2010 Instructions:

1. Open file: Mileage.xlsx.
2. Select Data > Data Analysis.
3. Select t-test: Two Sample Assuming Equal Variances.
4. Define data ranges for the two variables of
9. Click the Home tab and adjust decimal points in output.

Minitab Instructions (for similar results):

1. Open file: Mileage.MTW.
2. Choose Stat > Basic Statistics > 2-Sample t.
3. Choose Samples in different columns.
4. In First, enter the first data column.
5. In Second, enter the other data column.
6. Check Assume equal variances.
7. Click Options and enter 1 –


Step 3 Specify the significance level for the test.

The test will be conducted using α = 0.05.

Step 4 Determine the rejection region and develop the decision rule.

When the populations have standard deviations that are unknown, the critical value is a t-value from the t-distribution if the populations are assumed to be normally distributed and the population variances are assumed to be equal.

A simple random sample of 10 users was selected, and the users were given a name-brand cartridge. A second sample of 8 users was given generic cartridges. Both groups used their printers until the ink ran out. The number of pages printed was recorded. The samples are independent because the pages printed by users in one group did not in any way influence the pages printed by users in the second group. The means computed from the samples are

x̄1 = 322.5 pages and x̄2 = 298.3 pages

Because we do not know the population standard deviations, these values are computed from the sample data and are

s1 = 48.3 pages and s2 = 53.3 pages

Suppose previous studies have shown that the number of pages printed by both types of cartridge tends to be approximately normal with equal variances.

Based on a one-tailed test with α = 0.05, the critical value is a t-value from the t-distribution with 10 + 8 − 2 = 16 degrees of freedom. From the t-table, the critical t-value is

t0.05 = 1.7459 = Critical value

The calculated test statistic from Step 5 is compared to the critical t-value to form the decision rule. The decision rule is

If t > 1.7459, reject the null hypothesis;
Otherwise, do not reject the null hypothesis.

Step 5 Compute the test statistic.

The test statistic is computed using Equation 8, with the pooled standard deviation

$$ s_p = \sqrt{\frac{(10 - 1)(48.3)^2 + (8 - 1)(53.3)^2}{10 + 8 - 2}} = 50.55 $$

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(322.5 - 298.3) - 0.0}{50.55\sqrt{\dfrac{1}{10} + \dfrac{1}{8}}} = 1.0093 $$

Step 6 Reach a decision.

Because

t = 1.0093 < t0.05 = 1.7459

do not reject the null hypothesis.

Figure 7 illustrates the hypothesis test.

Step 7 Draw a conclusion.

Based on these sample data, there is insufficient evidence to conclude that the mean number of pages produced by name-brand ink cartridges exceeds the mean for generic cartridges.

> > END EXAMPLE

TRY PROBLEM 20
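The pooled-variance calculation in Example 5 can be reproduced with a short Python sketch. The function names `pooled_std` and `pooled_t` are hypothetical, the pooled standard deviation uses the standard pooled-variance formula that the text refers to as Equation 4, and the critical value 1.7459 is copied from the example's t-table lookup rather than computed.

```python
from math import sqrt


def pooled_std(s1, s2, n1, n2):
    """Pooled standard deviation, weighting each sample variance by its degrees of freedom."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_t(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """t statistic for mu1 - mu2 assuming equal population variances (Equation 8)."""
    sp = pooled_std(s1, s2, n1, n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / (sp * sqrt(1 / n1 + 1 / n2))


# Example 5 inputs: pages printed by name-brand (1) and generic (2) cartridges
sp = pooled_std(48.3, 53.3, 10, 8)
t = pooled_t(322.5, 298.3, 48.3, 53.3, 10, 8)

t_critical = 1.7459          # t0.05 with 16 degrees of freedom, from the example
reject_h0 = t > t_critical   # upper-tail decision rule

print(f"sp = {sp:.2f}, t = {t:.4f}, reject H0: {reject_h0}")
```

The sketch gives a pooled standard deviation of about 50.55 and a test statistic of about 1.0093, so the null hypothesis is not rejected, in line with the example.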




What If the Population Variances Are Not Equal? In the previous examples, we assumed that the population variances were equal, and we carried out the hypothesis test for two population means using Equation 8. Even in cases in which the population variances are not equal, the t-test as specified in Equation 8 is generally considered to be appropriate as long as the sample sizes are equal.5 However, if the sample sizes are not equal and if the sample data lead us to suspect that the variances are not equal, the t-test statistic must be approximated using Equation 9.6 In cases in which the variances are not equal, the degrees of freedom are computed using Equation 10.


5 Studies show that when the sample sizes are equal or almost equal, the t distribution is appropriate even when one population variance is twice the size of the other.

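t-Test Statistic for μ1 − μ2 When Population Variances Are Unknown and Not Assumed Equal

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \qquad (9) $$

Degrees of Freedom for t-Test Statistic When Population Variances Are Not Equal

$$ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \qquad (10) $$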
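Equations 9 and 10 can be sketched together in Python. This is an illustration only: the function name `unequal_variance_t` is hypothetical, and the Example 5 summary statistics are reused purely to show the mechanics (the example itself assumes equal variances and uses Equation 8).

```python
from math import sqrt


def unequal_variance_t(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Return (t, df) using Equation 9 for t and Equation 10 for the degrees of freedom."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t = ((xbar1 - xbar2) - hypothesized_diff) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Illustration with the Example 5 summary statistics
t, df = unequal_variance_t(322.5, 298.3, 48.3, 53.3, 10, 8)

print(f"t = {t:.4f}, df = {df:.2f}")
```

The degrees of freedom from Equation 10 are generally not a whole number. When a printed t-table is used, a common conservative choice is to round down to the nearest integer before looking up the critical value.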




My Stat Lab

10-18 A decision maker wishes to test the following null and alternative hypotheses using an alpha level equal to 0.05:

H0: μ1 − μ2 = 0
HA: μ1 − μ2 ≠ 0

The population standard deviations are assumed to be known. After collecting the sample data, the test statistic is computed to be

z = 1.78

a. Using the test statistic approach, what conclusion should be reached about the null hypothesis?

b. Using the p-value approach, what decision should be reached about the null hypothesis?

c. Will the two approaches (test statistic and p-value) ever provide different conclusions based on the same sample data? Explain.

10-19 The following null and alternative hypotheses have been stated:

H0: μ1 − μ2 = 0
HA: μ1 − μ2 ≠ 0

To test the null hypothesis, random samples have been selected from the two normally distributed populations with equal variances. The following sample data were

a. Assuming that the populations are normally distributed with equal variances, test at the 0.10 level of significance whether you would reject the null hypothesis based on the sample information. Use the test statistic approach.

b. Assuming that the populations are normally distributed with equal variances, test at the 0.05 level of significance whether you would reject the null hypothesis based on the sample information. Use the test statistic approach.

10-21 Given the following null and alternative hypotheses, conduct a hypothesis test using an alpha equal to 0.05. (Note: The population standard deviations are assumed

10-22 The following statistics were obtained from independent samples from populations that have normal distributions:

b. Determine the p-value for the test described in

10-24 Consider the following two independently chosen samples whose population variances are not equal to each other.

Sample 1: 12.1 13.4 11.7 10.7 14.0
Sample 2: 10.5 9.5 8.2 7.8 11.1




a. Using a significance level of 0.025, test the null hypothesis that μ1 − μ2 ≤ 0.

b. Calculate the p-value.

10-25 Descent, Inc., produces a variety of climbing and mountaineering equipment. One of its products is a traditional three-strand climbing rope. An important characteristic of any climbing rope is its tensile strength. Descent produces the three-strand rope on two separate production lines: one in Bozeman and the other in Challis. The Bozeman line has recently installed new production equipment. Descent regularly tests the tensile strength of its ropes by randomly selecting ropes from production and subjecting them to various tests. The most recent random sample of ropes, taken after the new equipment was installed at the Bozeman plant, revealed the following:

Bozeman            Challis
x̄1 = 7,200 lb      x̄2 = 7,087 lb
s1 = 425           s2 = 415
n1 = 25            n2 = 20

Descent's production managers are willing to assume that the population of tensile strengths for each plant is approximately normally distributed with equal variances. Based on the sample results, can Descent's managers conclude that there is a difference between the mean tensile strengths of ropes produced in Bozeman and Challis? Conduct the appropriate hypothesis test at the 0.05 level of significance.

10-26 The management of the Seaside Golf Club regularly monitors the golfers on its course for speed of play. Suppose a random sample of golfers was taken in 2005 and another random sample of golfers was selected in 2006. The results of the two samples are as

Based on the sample results, can the management of the Seaside Golf Club conclude that average speed of play was different in 2006 than in 2005? Conduct the appropriate hypothesis test at the 0.10 level of significance. Assume that the management of the club is willing to accept the assumption that the populations of playing times for each year are approximately normally distributed with equal variances.

10-27 The marketing manager for a major retail grocery chain is wondering about the location of the stores' dairy products. She believes that the mean amount spent by customers on dairy products per visit is higher in stores in which the dairy section is in the central part of the store compared with stores that have the dairy section at the rear of the store. To consider relocating the dairy products, the manager feels that the increase in the mean amount spent by customers must be at least 25 cents. To determine whether relocation is justified, her staff selected a random sample of 25 customers at stores in which the dairy section is central in the store. A second sample of 25 customers was selected in stores with the dairy section at the rear of the store. The following sample results were observed:

Central Dairy      Rear Dairy
x̄1 = $3.74         x̄2 = $3.26
s1 = $0.87         s2 = $0.79

a. Conduct a hypothesis test with a significance level of 0.05 to determine if the manager should relocate the dairy products in those stores displaying their dairy products in the rear of the store.

b. If a statistical error associated with hypothesis testing was made in this hypothesis test, what error could it have been? Explain.

10-28 Sherwin-Williams is a major paint manufacturer. Recently, the research and development (R&D) department came out with a new paint product designed to be used in areas that are particularly prone to periodic moisture and hot sun. They believe that this new paint will be superior to anything that Sherwin-Williams or its competitors currently offer. However, they are concerned about the coverage area that a gallon of the new paint will provide compared to their current products. The R&D department set up a test in which two random samples of paint were selected. The first sample consisted of 25 one-gallon containers of the company's best-selling paint, and the second sample consisted of 15 one-gallon containers of the new paint under consideration. The following statistics were computed from each sample and refer to the number of square feet that each gallon will cover:

Best-Selling Paint      New Paint Product

on a significance level equal to 0.01?

10-29 Albertsons was once one of the largest grocery chains in the United States, with more than 1,100 grocery stores, but in the early 2000s, the company began to feel the competitive pinch from companies like Wal-Mart and Costco. In January 2006, the company announced that it would be sold to SuperValu, Inc.,




headquartered in Minneapolis. Prior to the sale, Albertsons had attempted to lower prices to regain its competitive edge. In an effort to maintain its profit margins, Albertsons took several steps to lower costs. One was to replace some of the traditional checkout stands with automated self-checkout facilities. After making the change in some test stores, the company performed a study to determine whether the average purchase through a self-checkout facility was less than the average purchase at the traditional checkout stand. To conduct the test, a random sample of 125 customer transactions at the self-checkout was obtained, and a second random sample of 125 transactions from customers using the traditional checkout process was obtained. The following statistics were computed from each sample:

Self-Checkout      Traditional Checkout
x̄1 = $45.68        x̄2 = $78.49
s1 = $58.20        s2 = $62.45

Based on these sample data, what should be concluded with respect to the average transaction amount for the two checkout processes? Test using an α = 0.05 level.

10-30 The Washington Post Weekly Edition quoted an Urban Institute study that stated that about 80% of the estimated $200 billion of federal housing subsidies consisted of tax breaks (mainly deductions for mortgage interest payments). Samples indicated that the federal housing benefits average was $8,268 for those with incomes between $200,000 and $500,000 and only $365 for those with incomes of $40,000 to $50,000. The respective standard deviations were $2,100 and $150. They were obtained from sample sizes of 150

a. Examine the sample standard deviations. What do these suggest is the relationship between the two population standard deviations? Support your assertion.

b. Conduct a hypothesis test to determine if the average federal housing benefits are at least $7,750 more for those in the $200,000 to $500,000 income range. Use a 0.05 significance level.

c. Having reached your decision in part b, state the type of statistical error that you could have made.

d. Is there any way to determine whether you were in error in the hypothesis selection you made in part b? Support your answer.

10-31 Although not all students have debt after graduating from college, more than half do. The College Board's 2008 Trends in Student Aid addresses, among other topics, the difference in the average college debt accumulated by undergraduate bachelor of arts degree recipients by type of college for the 2006–2007 academic year. Samples might have been used to determine this difference in which the private, for-profit colleges' average was $38,300 and the public college average was $11,800. Suppose the respective standard deviations were $2,050 and $2,084. The sample sizes were 75 and 205, respectively.

a. Examine the sample standard deviations. What do these suggest is the relationship between the two population standard deviations? Support your assertion.

b. Conduct a hypothesis test to determine if the average college debt for bachelor of arts degree recipients is at least $25,000 more for graduates from private colleges than from public colleges. Use a 0.01 significance level and a p-value approach for this hypothesis test.

10-32 Suppose a professional job-placement firm that monitors salaries in professional fields is interested in determining if the fluctuating price of oil and the outsourcing of computer-related jobs have had an effect on the starting salaries of chemical and electrical engineering graduates. Specifically, the job-placement firm would like to know if the 2007 average starting salary for chemical engineering majors is higher than the 2007 average starting salary for electrical engineering majors. To conduct its test, the job-placement firm has selected a random sample of 124 electrical engineering majors and 110 chemical engineering majors who graduated and received jobs in 2007. Each graduate was asked to report his or her starting salary. The results of the survey are contained in the file Starting Salaries.

a. Conduct a hypothesis test to determine whether the mean starting salary for 2007 graduates in chemical engineering is higher than the mean starting salary for 2007 graduates in electrical engineering. Conduct the test at the 0.05 level of significance. Be sure to state a conclusion. (Assume that the firm believes the two populations from which the samples were taken are approximately normally distributed with equal variances.)

b. Suppose the job-placement firm is unwilling to assume that the two populations are normally distributed with equal variances. Conduct the appropriate hypothesis test to determine whether a difference exists between the mean starting salaries for the two groups. Use a level of significance of 0.05. What conclusion should the job-placement firm reach based on the hypothesis test?

10-33 A USA Today editorial addressed the growth of compensation for corporate CEOs. Quoting a study made by BusinessWeek, USA Today indicated that the pay packages for CEOs have increased almost sevenfold on average from 1994 to 2004. The file titled CEODough contains the salaries of CEOs in 1994 and in 2004, adjusted for inflation.

a. Determine the ratio of the average salary for 1994 and 2004. Does it appear that BusinessWeek was correct? Explain your answer.




b Based on these sample data and a 0.05 level of significance, what conclusion should be made about the average length of stay at these two hotel chains?

10-35 Airlines were severely affected by the oil price

increases of 2008 Even Southwest lost money, the first time ever, during that time Many airlines began charging for services that had previously been free, such as baggage and meals One national airline had as

an objective of getting an additional $5 to $10 per trip from its customers Surveys could be used to determine the success of the company’s actions The file titled

AirRevenue contains results of samples gathered

before and after the company implemented its changes

a Produce a 95% confidence interval for the difference

in the average fares paid by passengers before and after the change in policy Based on the confidence interval, is it possible that revenue per passenger increased by at least $10? Explain your response

b Conduct a test of hypothesis to answer the question posed in part a Use a significance level of 0.025

c Did you reach the same conclusion in both parts a and b? Is this a coincidence or will it always be so? Explain your response

b. Examine the sample standard deviations. What do these suggest is the relationship between the two population standard deviations? Support your assertion.

c. Based on your response to part b, conduct a test of hypothesis to determine if the difference in the average CEO salary between 1994 and 2004 is more than $9.8 million. Use a p-value approach with a significance level of 0.025.

10-34 The Marriott Corporation operates the largest chain of hotel and motel properties in the world. The Fairfield Inn and the Residence Inn are just two of the hotel brands that Marriott owns. At a recent managers’ meeting, a question was posed regarding whether the average length of stay was different at these two properties in the United States. A summer intern was assigned the task of testing to see if there is a difference. She started by selecting a simple random sample of 100 hotel reservations from Fairfield Inn. Next, she selected a simple random sample of 100 hotel reservations from Residence Inn. In both cases, she recorded the number of nights’ stay for each reservation. The resulting data are in the file called Marriott.

a. State the appropriate null and alternative hypotheses.

END EXERCISES 10-2

Interval Estimation and Hypothesis Tests for Paired Samples

Sections 1 and 2 introduced the methods by which decision makers can estimate and test the hypotheses for the difference between the means for two populations when the two samples are independent. In each example, the samples were independent because the sample values from one population did not have the potential to influence the probability that values would be selected from the second population. However, there are instances in business in which you would want to use paired samples to control for sources of variation that might otherwise distort the conclusions of a study.

Why Use Paired Samples?

There are many situations in business in which using paired samples should be considered. For instance, a paint manufacturer might be interested in comparing the area that a new paint mix will cover per gallon with that of an existing paint mixture. One approach would be to have one random sample of painters apply a gallon of the new paint mixture. A second sample of painters would be given the existing mix. In both cases, the number of square feet that were covered by the gallon of paint would be recorded. In this case, the samples would be independent because the area covered by painters using the new mixture would not be in any way affected by the area covered by painters using the existing mixture.

This would be a fine way to do the study unless the painters themselves could influence the area that the paint will cover. For instance, suppose some painters, because of their technique or experience, are able to cover more area from a gallon of paint than other painters regardless of the type of paint used. Then, if by chance most of these “good” painters happened to get assigned to the new mix, the results might show that the new mix covers more area, not because it is a better paint, but because the painters that used it during the test were better.

To combat this potential problem, the company might want to use paired samples. To do this, one group of painters would be selected and each painter would use one gallon of each paint mix. We would measure the area covered by each painter for both paint mixes. Doing this controls for the effect of the painters’ ability or technique. The following application, which involves testing gasoline supplemented with ethanol, is one in which paired samples would most likely be warranted.

Paired Samples

Samples that are selected in such a way that values in one sample are matched with the values in the second sample for the purpose of controlling for extraneous factors. Another term for paired samples is dependent samples.

Chapter Outcome 2




BUSINESS APPLICATION ESTIMATION USING PAIRED SAMPLES

TESTING ETHANOL MIXED GASOLINE A major oil company wanted to estimate the difference in average mileage for cars using a regular gasoline compared with cars using a gasoline-and-ethanol mixture. The company used a paired-sample approach to control for any variation in mileage arising because of different cars and drivers. A random sample of 10 motorists (and their cars) was selected. Each car was filled with regular gasoline. The car was driven 200 miles on a specified route. The car then was filled again with gasoline and the miles per gallon were computed. After the 10 cars completed this process, the same steps were performed using the gasoline mixed with ethanol. Because the same cars and drivers tested both types of fuel, the miles-per-gallon measurements for the ethanol mixture and regular gasoline will most likely be related. The two samples are not independent but are instead considered paired samples. Thus, we will compute the paired difference between the values from each sample, using Equation 11.

Paired Difference

d = x1 - x2 (11)

where:

d = Paired difference

x1 and x2 = Values from samples 1 and 2, respectively

Point Estimate for the Population Mean Paired Difference, $\mu_d$

$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} \tag{12}$$

where:

$d_i$ = ith paired difference value

$n$ = Number of paired differences

Figure 8 shows the Excel spreadsheet for this mileage study with the paired differences computed. The data are in the file called Ethanol-Gas.

The first step to develop the interval estimate is to compute the mean paired difference, $\bar{d}$, using Equation 12. This value is the best point estimate for the population mean paired difference, $\mu_d$.

FIGURE 8 | Excel 2010 Worksheet for Ethanol Mixed Gasoline Study

Excel 2010 Instructions:

1. Open file: Ethanol-Gas.xlsx.

Using Equation 12, we determine $\bar{d}$ as follows:

$$\bar{d} = \frac{\sum d}{n} = \frac{22.7}{10} = 2.27$$

The next step is to compute the sample standard deviation for the paired differences using Equation 13.

Sample Standard Deviation for Paired Differences

$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}} \tag{13}$$

where:

$d_i$ = ith paired difference

$\bar{d}$ = Mean paired difference
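Equations 11 through 13 can be sketched in a few lines of Python. This is only an illustration: the function name `paired_summary` and the mileage readings below are hypothetical and are not the values in the Ethanol-Gas file.

```python
import math

def paired_summary(sample1, sample2):
    """Return (differences, mean difference, sample std dev of differences)."""
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    # Equation 11: d = x1 - x2 for each matched pair
    d = [x1 - x2 for x1, x2 in zip(sample1, sample2)]
    n = len(d)
    # Equation 12: mean paired difference
    d_bar = sum(d) / n
    # Equation 13: sample standard deviation with n - 1 in the denominator
    s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    return d, d_bar, s_d

# Hypothetical mpg readings for four cars (illustrative only)
regular = [25.0, 30.0, 22.0, 28.0]
ethanol = [24.0, 27.0, 23.0, 24.0]
d, d_bar, s_d = paired_summary(regular, ethanol)
print(d, round(d_bar, 2), round(s_d, 4))
```

Note that the standard deviation is computed on the differences themselves, not on the two samples separately; that is what removes the car-to-car and driver-to-driver variation.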




The sample standard deviation for the paired differences is $s_d = 4.38$ mpg.

Assuming that the population of paired differences is normally distributed, the confidence interval estimate for the population mean paired difference is computed using Equation 14.

Confidence Interval Estimate for Population Mean Paired Difference, $\mu_d$

$$\bar{d} \pm t\frac{s_d}{\sqrt{n}} \tag{14}$$

where:

$t$ = Critical t value from the t-distribution with $n - 1$ degrees of freedom

$\bar{d}$ = Sample mean paired difference

$s_d$ = Sample standard deviation of paired differences

$n$ = Number of paired differences (sample size)

For a 95% confidence interval with 10 - 1 = 9 degrees of freedom, we use a critical t from the t-distribution of $t = 2.2622$. The interval estimate is

$$2.27 \pm 2.2622\frac{4.38}{\sqrt{10}}$$

$$2.27 \pm 3.13$$

$$-0.86 \text{ mpg} \le \mu_d \le 5.40 \text{ mpg}$$

Because the interval estimate contains zero, there may be no difference in the average mileage when either regular gasoline or the ethanol mixture is used.
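The interval arithmetic for the ethanol study can be checked with a short Python sketch. The function name `paired_ci` is hypothetical, and the critical value 2.2622 is passed in as read from the t table rather than computed.

```python
import math

def paired_ci(d_bar, s_d, n, t_crit):
    """Confidence interval for the mean paired difference (Equation 14)."""
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

# Summary values from the ethanol mixed gasoline study
lower, upper = paired_ci(d_bar=2.27, s_d=4.38, n=10, t_crit=2.2622)
print(f"{lower:.2f} mpg to {upper:.2f} mpg")
```

Because the lower limit is negative and the upper limit is positive, the interval contains zero, which matches the conclusion in the text.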

EXAMPLE 6 CONFIDENCE INTERVAL ESTIMATE FOR THE DIFFERENCE BETWEEN POPULATION MEANS, PAIRED SAMPLES

PGA of America Testing Center Technology has done more to change golf than possibly any other group in recent years. Titanium woods, hybrid irons, and new golf ball designs have impacted professional and amateur golfers alike. PGA of America is the association that only professional golfers can belong to. The association provides many services for golf professionals, including operating an equipment training center in Florida. Recently, a maker of golf balls developed a new ball technology, and PGA of America is interested in estimating the mean difference in driving distance for this new ball versus the existing best-seller. To conduct the test, the PGA of America staff selected six professional golfers and had each golfer hit each ball one time. Here are the steps necessary to develop a confidence interval estimate for the difference in population means for paired samples:

Because the same golfers hit each golf ball, the company is controlling for the variation in the golfers’ ability to hit a golf ball. The samples are paired, and the population value of interest is $\mu_d$, the mean paired difference in distance.

We assume that the population of paired differences is normally distributed.




Step 2 Specify the desired confidence level and determine the appropriate critical value.

The research director wishes to have a 95% confidence interval estimate.

The sample data, paired differences, are shown as follows:

Golfer Existing Ball New Ball d

$$\bar{d} = \frac{\sum d}{n} = \frac{13}{6} = 2.17 \text{ yards}$$

The standard deviation for the paired differences is computed using Equation 13.

The critical t for 95% confidence and 6 - 1 = 5 degrees of freedom is $t = 2.5706$.

> END EXAMPLE

TRY PROBLEM 38

The key in deciding whether to use paired samples is to determine whether a factor exists that might adversely influence the results of the estimation. In the ethanol mixed gasoline test example, we controlled for potential outside influence by using the same cars and drivers to test both gasolines. In Example 6, we controlled for golfer ability by having the same golfers hit both golf balls. If you determine there is no need to control for an outside source of variation, then independent samples should be used, as discussed earlier.

Hypothesis Testing for Paired Samples

As we just illustrated, there will be instances in which paired samples can be used to control for an outside source of variation. For instance, in Example 5, involving the ink cartridges, the original test of whether name-brand cartridges yield a higher mean number of printed pages than generic cartridges involved different users for the two types of cartridges, so the samples





EXAMPLE 7 HYPOTHESIS TEST FOR $\mu_d$, PAIRED SAMPLES

Color Printer Ink Cartridges Referring to Example 5, suppose the experiment regarding ink cartridges is conducted differently. Instead of having different samples of people use name-brand and generic cartridges, the test is done using paired samples. This means that the same people will use both types of cartridges, and the pages printed in each case will be recorded. The test under this paired-sample scenario can be conducted using the following steps. Six randomly selected people have agreed to participate.

In this case, we will form paired differences by subtracting the generic pages from the name-brand pages. We are interested in determining whether name-brand cartridges produce more printed pages, on average, than generic cartridges, so we would expect the paired difference to be positive. We assume that the paired differences are normally distributed.

The null and alternative hypotheses are

$H_0: \mu_d \le 0.0$

$H_A: \mu_d > 0.0$

The test will be conducted using $\alpha = 0.01$.

The critical value is a t-value from the t-distribution, with $\alpha = 0.01$ and 6 - 1 = 5 degrees of freedom. The critical value is

$t_{0.01} = 3.3649$

The decision rule is

If $t > 3.3649$, reject the null hypothesis;

otherwise, do not reject the null hypothesis.

Select the random sample and compute the mean and standard deviation for the paired differences.

were independent. However, different users may use more or less ink as a rule; therefore, we could control for that source of variation by having a sample of people use both types of cartridges in a paired test format.

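If a paired-sample experiment is used, the test statistic is computed using Equation 15.

t-Test Statistic for Paired-Sample Test

$$t = \frac{\bar{d} - \mu_d}{\dfrac{s_d}{\sqrt{n}}} \qquad df = n - 1 \tag{15}$$

where:

$\bar{d}$ = Mean paired difference $= \dfrac{\sum d}{n}$

$\mu_d$ = Hypothesized population mean paired difference

$s_d$ = Sample standard deviation for paired differences $= \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}}$

$n$ = Number of paired values in the sample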




Skill Development

10-36 The following dependent samples were randomly selected. Use the sample data to construct a 95% confidence interval estimate for the population mean

10-37 The following paired sample data have been obtained from normally distributed populations. Construct a 90% confidence interval estimate for the mean paired difference between the two population means.

Sample # Population 1 Population 2

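The mean paired difference is

$$\bar{d} = \frac{\sum d}{n} = \frac{73}{6} = 12.17$$

The standard deviation for the paired differences is $s_d = 18.02$.

The test statistic is

$$t = \frac{\bar{d} - \mu_d}{\dfrac{s_d}{\sqrt{n}}} = \frac{12.17 - 0.0}{\dfrac{18.02}{\sqrt{6}}} = 1.6543$$

Because $t = 1.6543 < t_{0.01} = 3.3649$, do not reject the null hypothesis.

Based on these sample data, there is insufficient evidence to conclude that name-brand ink cartridges produce more pages on average than generic brands.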
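The test statistic in Example 7 can be reproduced from the summary values with a brief Python sketch. The function name `paired_t_statistic` is hypothetical; the critical value 3.3649 is taken from the example rather than computed.

```python
import math

def paired_t_statistic(d_bar, s_d, n, mu_d0=0.0):
    """t statistic for a paired-sample test (Equation 15), with n - 1 df."""
    return (d_bar - mu_d0) / (s_d / math.sqrt(n))

# Summary values from Example 7 (ink cartridges)
t_stat = paired_t_statistic(d_bar=12.17, s_d=18.02, n=6)
t_crit = 3.3649            # upper-tail critical t for alpha = 0.01, 5 df
reject = t_stat > t_crit   # upper-tail decision rule from the example
print(round(t_stat, 4), reject)
```

The large standard deviation of the differences relative to the sample size of six is what keeps the statistic well below the critical value.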

> END EXAMPLE

TRY PROBLEM 39




10-42 Consider the following set of samples obtained from two normally distributed populations whose variances are equal:

Sample 1: 11.2 11.2 7.4 8.7 8.5 13.5 4.5 11.9

Sample 2: 11.7 9.5 15.6 16.5 11.3 17.6 17.0 8.5

a. Suppose that the samples were independent. Perform a test of hypothesis to determine if there is a difference in the two population means. Use a significance level of 0.05.

b. Now suppose that the samples were paired samples. Perform a test of hypothesis to determine if there is a difference in the two population means.

c. How do you account for the difference in the outcomes of part a and part b? Support your assertions with a statistical rationale.

10-43 One of the advances that helped to diminish carpal tunnel syndrome is ergonomic keyboards. The ergonomic keyboards may also increase typing speed. Ten administrative assistants were chosen to type on both standard and ergonomic keyboards. The resulting word-per-minute typing speeds follow:

per minute attained while typing. Use a p-value approach with a significance level of 0.01.

10-44 Production engineers at Sinotron believe that a modified layout on its assembly lines might increase average worker productivity (measured in the number of units produced per hour). However, before the engineers are ready to install the revised layout officially across the entire firm’s production lines, they would like to study the modified line’s effects on output. The following data represent the average hourly production output of 12 randomly sampled employees before and after the line was modified:

10-45 The United Way raises money for community charity activities. Recently, in one community, the fundraising committee was concerned about whether there is a difference in the proportion of employees who give to United Way depending on whether the employer is a private business or a government agency. A random

a. Construct and interpret a 99% confidence interval estimate for the paired difference in mean values.

b. Construct and interpret a 90% confidence interval estimate for the paired difference in mean values.

10-39 The following sample data have been collected from a paired sample from two populations. The claim is that the first population mean will be at least as large as the mean of the second population. This claim will be assumed to be true unless the data strongly suggest

b. Based on the sample data, what should you conclude about the null hypothesis? Test using $\alpha = 0.10$.

c. Calculate a 90% confidence interval for the difference in the population means. Are the results from the confidence interval consistent with the outcome of your hypothesis test? Explain why or why not.

10-40 A paired sample study has been conducted to determine whether two populations have equal means. Twenty paired samples were obtained with the following sample results:

$\bar{d} = 12.45 \qquad s_d = 11.0$

Based on these sample data and a significance level of 0.05, what conclusion should be made about the population means?

10-41 The following samples are observations taken from the same elements at two different times:

Unit Sample 1 Sample 2

a. Assume that the populations are normally distributed and construct a 90% confidence interval for the difference in the means of the distribution at the times in which the samples were taken.

b. Perform a test of hypothesis to determine if the difference in the means of the distribution at the first time period is 10 units larger than at the second time period. Use a level of significance equal to 0.10.




a. Discuss the appropriateness of the way this study was designed and conducted. Why didn’t the consumer group select two samples with different drivers in each and have one group use the acetone and the other group not use it? Discuss.

b. Using a significance level of 0.05, what conclusion should be reached based on these sample data? Discuss.

10-47 An article in The American Statistician (M L R Ernst, et al., “Scatterplots for Unordered Pairs,” 50 (1996), pp 260–265) reports on the difference in the measurements by two evaluators of the cardiac output of 23 patients using Doppler echocardiography. Both observers took measurements from the same patients. The measured outcomes were as follows:

Patient       1    2    3    4    5    6    7    8     9    10   11   12
Evaluator 1   4.8  5.6  6.0  6.4  6.5  6.6  6.8  7.0   7.0  7.2  7.4  7.6
Evaluator 2   5.8  6.1  7.7  7.8  7.6  8.1  8.0  8.21  6.6  8.1  9.5  9.6

Patient       13   14   15   16    17   18    19    20    21    22    23
Evaluator 1   7.7  7.7  8.2  8.2   8.3  8.5   9.3   10.2  10.4  10.6  11.4
Evaluator 2   8.5  9.5  9.1  10.0  9.1  10.8  11.5  11.5  11.2  11.5  12.0

a. Conduct a hypothesis test to determine if the average cardiac outputs measured by the two evaluators differ. Use a significance level of 0.02.

b. Calculate the standard error of the difference between the two average outputs assuming that the sampling was done independently. Compare this with the standard error obtained in part a.

10-48 A prime factor in the economic troubles that started in 2008 was the end of the “housing bubble.” The file titled House contains data for a sample showing the average and median housing prices for selected areas in the country in November 2007 and November 2008. Assume the data can be viewed as samples of the relevant populations.

a. Discuss whether the two samples are independent or dependent.

b. Based on your answer to part a, calculate a 90% confidence interval for the difference between the means of the average and median selling prices for houses during November 2007.

c. Noting your answer to part b, would it be plausible to assert that the mean of the average selling prices for houses during November 2007 is more than the average of the median selling prices during this period? Support your assertions.

d. Using a p-value approach and a significance level of 0.05, conduct a hypothesis test to determine if the mean of the average selling prices for houses during November 2007 is more than $30,000 larger than the mean of the median selling prices during this period.

10-49 A treadmill manufacturer has developed a new machine with softer tread and better fans than its

sample of people who had been contacted about contributing last year was selected. Of those contacted, 70 worked for a private business and 50 worked for a government agency. For the 70 private-sector employees, the mean contribution was $230.25 with a standard deviation equal to $55.52. For the 50 government employees in the sample, the mean and standard deviation were $309.45 and $61.75, respectively.

a. Based on these sample data and $\alpha = 0.05$, what should be concluded? Be sure to show the decision rule.

b. Construct a 95% confidence interval for the difference between the mean contributions of private business and government agency employees who contribute to United Way. Do the hypothesis test and the confidence interval produce compatible results? Explain and give reasons for your answer.

10-46 An article on the PureEnergySystems.com website written by Louis LaPoint discusses a product called acetone. The article stated that “Acetone (CH3COCH3) is a product that can be purchased inexpensively in most locations around the world, such as in common hardware, auto parts, or drug stores. Added to the fuel tank in tiny amounts, acetone aids in the vaporization of the gasoline or diesel, increasing fuel efficiency, engine longevity, and performance, as well as reducing hydrocarbon emissions.” To test whether this product actually does increase fuel efficiency in passenger cars, a consumer group has randomly selected 10 people to participate in the study. The following procedure is used:

1. People are to bring their cars into a specified gasoline station and have the car filled with regular, unleaded gasoline at a particular pump. Nothing extra is added to the gasoline at this fill-up. The car’s odometer is recorded at the time of fill-up.

2. When the tank is nearly empty, the person is to bring the car to the same gasoline station and pump and have it refilled with gasoline. The odometer is read again and the miles per gallon are recorded. This time, a prescribed quantity of acetone is added to the fuel.

3. When the tank is nearly empty, the person is to bring the car back to the same station and pump to have it filled. The miles per gallon will be recorded.

Each person is provided with free tanks of gasoline and asked to drive his or her car normally.

The following miles per gallon (mpg) were recorded:

Driver No Additive MPG: Acetone Added MPG:


excluding food, gifts, and news items. A file titled Revenues contains sample data selected from airport retailers in 2005 and again in 2008.

a. Conduct a hypothesis test to determine if the average amount of retail spending by air travelers has increased at least as much as approximately $0.10 a year from 2005 to 2008. Use a significance level of 0.025.

b. Using the appropriate analysis (that of part a or other appropriate methodology), substantiate the statement that average retail purchases in airports increased over the time period between 2005 and 2008. Support your assertions.

c. Parts a and b give what seems to be a mixed message. Is there a way to determine what values are plausible for the difference between the average revenue in 2005 and 2008? If so, conduct the appropriate procedure.

current model. The manufacturer believes these new features will enable runners to run for longer times than they can on its current machines. To determine whether the desired result is achieved, the manufacturer randomly sampled 35 runners. Each runner was measured for one week on the current machine and for one week on the new machine. The weekly total number of minutes for each runner on the two types of machines was collected. The results are contained in the file Treadmill. At the 0.02 level of significance, can the treadmill manufacturer conclude that the new machine has the desired result?

10-50 As the number of air travelers with time on their hands increases, it would seem that spending on retail purchases in airports would increase as well. A study by Airport Revenue News addressed the per-person spending at selected airports for merchandise,

END EXERCISES 10-3

Estimation and Hypothesis Tests for Two Population Proportions

The previous sections illustrated the methods for estimating and testing hypotheses involving two population means. There are many business situations in which these methods can be applied. However, there are other instances involving two populations in which the measures of interest are not the population means. This section extends the methodology for testing hypotheses involving a single population proportion to tests involving hypotheses about the difference between two population proportions. First, we will look at a confidence interval estimation involving two population proportions.

Estimating the Difference between Two Population Proportions

BUSINESS APPLICATION ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION PROPORTIONS

BICYCLE DESIGN Recently, an outdoor magazine conducted an interesting study involving a prototype bicycle that was made by a Swiss manufacturer. The prototype had no identification on it to indicate the name of the manufacturer. Of interest was the difference in the proportion of men versus women who would rate the bicycle as high quality.

Obviously, there was no way to gauge the attitudes of the entire population of men and women who could eventually judge the quality of the bicycle. Instead, the reporter for the magazine asked a random sample of 425 men and 370 women to rate the bicycle’s quality. In the results that follow, the variable x indicates the number in the sample who said the bicycle was


The point estimate for the difference in population proportions is $\bar{p}_1 - \bar{p}_2$.

A rule of thumb for “sufficiently large” is that $np$ and $n(1 - p)$ are greater than or equal to 5 for each sample.

Confidence Interval Estimate for $p_1 - p_2$

$$(\bar{p}_1 - \bar{p}_2) \pm z\sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}} \tag{16}$$

where:

$\bar{p}_1$ = Sample proportion from population 1

$\bar{p}_2$ = Sample proportion from population 2

$z$ = Critical value from the standard normal table

The analysts can substitute the sample results into Equation 16 to establish a 95% confidence interval estimate, as follows:

$$(0.565 - 0.530) \pm 1.96\sqrt{\frac{0.565(1 - 0.565)}{425} + \frac{0.530(1 - 0.530)}{370}}$$

$$0.035 \pm 0.069$$

$$-0.034 \le (p_1 - p_2) \le 0.104$$

Thus, based on the sample data and using a 95% confidence interval, the analysts estimate that the true difference in proportion of males versus females who rate the prototype as high quality is between -0.034 and 0.104. At one extreme, 3.4% more females rate the bicycle as high in quality. At the other extreme, 10.4% more males rate the bicycle as high quality than females. Because zero is included in the interval, there may be no difference between the proportion of males and females who rate the prototype as high quality based on these data. Consequently, the reporter is not able to conclude that one group or the other would be more likely to rate the prototype bicycle high in quality.
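The bicycle interval can be verified with a small Python sketch of Equation 16. The function name `two_proportion_ci` is hypothetical; the sample proportions and the critical value 1.96 are the ones used in the text.

```python
import math

def two_proportion_ci(p1_hat, n1, p2_hat, n2, z_crit):
    """Confidence interval for p1 - p2 using unpooled sample proportions."""
    std_error = math.sqrt(p1_hat * (1 - p1_hat) / n1
                          + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_crit * std_error, diff + z_crit * std_error

# Men: 0.565 of 425; women: 0.530 of 370; 95% confidence
lower, upper = two_proportion_ci(0.565, 425, 0.530, 370, z_crit=1.96)
print(f"{lower:.3f} to {upper:.3f}")
```

The interval uses each sample's own proportion in the standard error. No pooling is done here because estimation does not assume the two population proportions are equal.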

Hypothesis Tests for the Difference between Two Population Proportions

BUSINESS APPLICATION TESTING FOR THE DIFFERENCE BETWEEN TWO POPULATION PROPORTIONS

POMONA FABRICATIONS Pomona Fabrications, Inc. produces handheld hair dryers that several major retailers sell as in-house brands. A critical component of a handheld hair dryer is the motor-heater unit, which accounts for most of the dryer’s cost and for most of the product’s reliability problems. Product reliability is important to Pomona because the company offers a one-year warranty. Of course, Pomona is also interested in reducing production costs.

Pomona’s R&D department has recently created a new motor-heater unit with fewer parts than the current unit, which would lead to a 15% cost savings per hair dryer. However, the company’s vice president of product development is unwilling to authorize the new component unless it is more reliable than the current motor-heater.

The R&D department has decided to test samples of both units to see which motor-heater is more reliable. Of each type, 250 will be tested under conditions that simulate one year’s





use, and the proportion of each type that fails within that time will be recorded. This leads to the formulation of the following null and alternative hypotheses:

$H_0: p_1 - p_2 \ge 0.0$ or $H_0: p_1 \ge p_2$

$H_A: p_1 - p_2 < 0.0$ or $H_A: p_1 < p_2$

where:

$p_1$ = Population proportion of new dryer type that fails in simulated one-year period

$p_2$ = Population proportion of existing dryer type that fails in simulated one-year period

The null hypothesis states that the new motor-heater is no better than the old, or current, motor-heater. The alternative states that the new unit has a smaller proportion of failures within one year than the current unit. In other words, the alternative states that the new unit is more reliable. The company wants clear evidence before changing units. If the null hypothesis is rejected, the company will conclude that the new motor-heater unit is more reliable than the old unit and should be used in producing the hair dryers. To test the null hypothesis, we can use the test statistic approach.

The test statistic is based on the sampling distribution of $\bar{p}_1 - \bar{p}_2$. We showed that when $np \ge 5$ and $n(1 - p) \ge 5$, the sampling distribution of the sample proportion is approximately normally distributed, with a mean equal to $p$ and a variance equal to $p(1 - p)/n$.

Likewise, in the two-sample case, the sampling distribution of $\bar{p}_1 - \bar{p}_2$ will also be approximately normal if the following assumptions hold:

$n_1p_1 \ge 5$, $n_1(1 - p_1) \ge 5$, and $n_2p_2 \ge 5$, $n_2(1 - p_2) \ge 5$

Because $p_1$ and $p_2$ are unknown, we substitute the sample proportions, $\bar{p}_1$ and $\bar{p}_2$, to determine whether the sample size requirements are satisfied.
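The sample size requirement can be checked mechanically. In the Python sketch below, the function name `large_enough` is hypothetical. It relies on the fact that $n\bar{p}$ is simply the count of items with the characteristic and $n(1 - \bar{p})$ is the count without it. The counts used are the Pomona failure counts reported later in this application (55 and 75 failures out of 250 units each).

```python
def large_enough(x, n, threshold=5):
    """Check n*p_hat >= threshold and n*(1 - p_hat) >= threshold.

    With p_hat = x / n, these reduce to x >= threshold and
    n - x >= threshold, so the check is done on the counts directly.
    """
    return x >= threshold and (n - x) >= threshold

# New unit: 55 failures of 250; current unit: 75 failures of 250
both_ok = large_enough(55, 250) and large_enough(75, 250)
print(both_ok)
```

Both samples must pass the check before the normal approximation is used for the difference in sample proportions.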

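The mean of the sampling distribution of $\bar{p}_1 - \bar{p}_2$ is the difference of the population proportions, $p_1 - p_2$. The variance is, however, the sum of the variances, $p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2$. Because the test is conducted using the assumption that the null hypothesis is true, we assume that $p_1 = p_2 = p$ and estimate their common value, $p$, using a pooled estimate, as shown in Equation 17.

$$\bar{p} = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} \tag{17}$$

where:

$x_1$ and $x_2$ = Number from samples 1 and 2 with the characteristic of interest

The z-test statistic for the difference between two proportions is given by the following formula.

z-Test Statistic for Difference between Population Proportions

$$z = \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where:

$(p_1 - p_2)$ = Hypothesized difference in proportions from populations 1 and 2, respectively

$\bar{p}_1$ and $\bar{p}_2$ = Sample proportions for samples selected from populations 1 and 2, respectively

$\bar{p}$ = Pooled estimator for the overall proportion for both populations combined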




FIGURE 9 | Hypothesis Test of Two Population Proportions for

in the two samples, and the denominator is the total sample size. Again, the pooled estimator, $\bar{p}$, is used when the null hypothesis is that there is no difference between the population proportions.

Assume that Pomona is willing to use a significance level of 0.05 and that 55 of the new motor-heaters and 75 of the originals failed the one-year test. Figure 9 illustrates the decision-rule development and the hypothesis test. As you can see, Pomona should reject the null hypothesis based on the sample data. Thus, the firm should conclude that the new motor-heater is more reliable than the old one. Because the new one is also less costly, the company should now use the new unit in the production of hair dryers.

The p-value approach to hypothesis testing could also have been used to test Pomona’s hypothesis. In this case, the calculated value of the test statistic, z = -2.04, results in a p-value of 0.0207 (0.5 - 0.4793) from the standard normal table. Because this p-value is smaller than the significance level of 0.05, we would reject the null hypothesis. Remember, whenever your p-value is smaller than the alpha value, your sample contains evidence to reject the null hypothesis.
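The Pomona calculation can be reproduced with the Python standard library. This is a sketch only: the function name `two_proportion_z` is hypothetical, and `statistics.NormalDist` (Python 3.8 or later) stands in for the standard normal table, so the p-value is computed from the unrounded z and may differ from the table value in the fourth decimal place.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, hypothesized_diff=0.0):
    """Pooled z statistic for the difference between two proportions."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)              # Equation 17
    std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return ((p1_hat - p2_hat) - hypothesized_diff) / std_error

# 55 of 250 new units failed; 75 of 250 current units failed
z = two_proportion_z(55, 250, 75, 250)
p_value = NormalDist().cdf(z)   # lower-tail test: H_A is p1 - p2 < 0
print(round(z, 2), round(p_value, 4), p_value < 0.05)
```

Because the alternative hypothesis is lower-tailed, the p-value is the area to the left of z; an upper-tailed or two-tailed alternative would need a different tail area.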

The PHStat add-in to Excel contains a procedure for performing hypothesis tests involving two population proportions. Figure 10 shows the PHStat output for the Pomona example. The output contains both the z-test statistic and the p-value. As we observed from our manual calculations, the difference in sample proportions is sufficient to reject the null hypothesis that there is no difference in population proportions.




EXAMPLE 8 HYPOTHESIS TEST FOR THE DIFFERENCE BETWEEN TWO POPULATION PROPORTIONS

Transportation Security Administration The Transportation Security Administration (TSA) is responsible for transportation security at all United States airports. The TSA is evaluating two suppliers of a scanning system it is considering purchasing. Both scanners are designed to detect forged IDs that might be used by passengers trying to board airlines. High-quality scanners and printers and home computers have made forging IDs an increasing security risk. The TSA is interested in determining whether there is a difference in the proportion of forged IDs detected by the two suppliers. To conduct this test, use the following steps:

In this case, the population parameter of interest is the population proportion of detected forged IDs. At issue is whether there is a difference between the two suppliers in terms of the proportion of forged IDs detected.

FIGURE 10 | Excel 2010 (PHStat) Output of the Two Proportions Test for

7. Enter Number of Items of Interest and Sample Size for both populations.
8. Indicate Lower-tail test.
9. Click OK.

Minitab Instructions (for similar results):

1. Choose Stat > Basic Statistics > 2 Proportions.
2. Choose Summarized data.
3. In First, enter Trials and Events for sample 1 (e.g., 250 and 55).
4. In Second, enter Trials and Events for sample 2 (e.g., 250 and 75).
5. Select Options, Insert 1 - α in Confidence level.
6. In Alternative, select less than.
7. Check Use pooled estimate of p for test.
8. Click OK. OK.




Step 2 Formulate the appropriate null and alternative hypotheses.

The null and alternative hypotheses are

$H_0: p_1 - p_2 = 0.0$

$H_A: p_1 - p_2 \ne 0.0$

The test will be conducted using $\alpha = 0.02$.

For a two-tailed test, the critical values for each side of the distribution are

$-z_{0.01} = -2.33$ and $z_{0.01} = 2.33$

The decision rule based on the z-test statistic is

If $z < -2.33$ or $z > 2.33$, reject the null hypothesis;

otherwise, do not reject the null hypothesis.

Two hundred known forged IDs will be randomly selected from a large population of previously confiscated IDs and scanned by systems from each supplier. For supplier 1, 186 forgeries are detected, and for supplier 2, 168 are detected. The sample proportions are

$$\bar{p}_1 = \frac{x_1}{n_1} = \frac{186}{200} = 0.93 \qquad \bar{p}_2 = \frac{x_2}{n_2} = \frac{168}{200} = 0.84$$

The test statistic, computed with the pooled estimate $\bar{p} = (186 + 168)/(200 + 200) = 0.885$, is

$$z = \frac{(0.93 - 0.84) - 0.0}{\sqrt{0.885(1 - 0.885)\left(\dfrac{1}{200} + \dfrac{1}{200}\right)}} = 2.8211$$

Because $z = 2.8211 > z_{0.01} = 2.33$, reject the null hypothesis.

The difference between the two sample proportions provides sufficient evidence to allow us to conclude a difference exists between the two suppliers. The TSA can infer that supplier 1 provides the better scanner for this purpose.
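The two-tailed decision in Example 8 can be sketched in Python as follows. The variable names are hypothetical, the critical value 2.33 is the rounded table value used in the example, and the two-tailed p-value shown is an extra check that the example itself does not report.

```python
import math
from statistics import NormalDist

# Detected forgeries out of 200 scanned IDs for each supplier
x1, n1 = 186, 200
x2, n2 = 168, 200

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled estimate under H0: p1 = p2
std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / std_error

z_crit = 2.33                    # z for alpha/2 = 0.01, from the example
reject = z < -z_crit or z > z_crit

# Two-tailed p-value: twice the area beyond |z|
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 4), reject, p_two_tailed < 0.02)
```

The critical-value rule and the p-value rule lead to the same decision here, as they must when both are based on the same significance level.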

> END EXAMPLE

TRY PROBLEM 54



Date posted: 04/02/2020, 09:27
