(BQ) Part 2 of the book Essentials of Modern Business Statistics covers interval estimation, hypothesis tests, simple linear regression, multiple regression, comparisons involving proportions and a test of independence, and other topics.
8.3 DETERMINING THE SAMPLE SIZE
8.4 POPULATION PROPORTION
Using Excel
Determining the Sample Size
Founded in 1957 as Food Town, Food Lion is one of the largest supermarket chains in the United States, with 1200 stores in 11 Southeastern and Mid-Atlantic states. The company sells more than 24,000 different products and offers nationally and regionally advertised brand-name merchandise, as well as a growing number of high-quality private label products manufactured especially for Food Lion. The company maintains its low price leadership and quality assurance through operating efficiencies such as standard store formats, innovative warehouse design, energy-efficient facilities, and data synchronization with suppliers. Food Lion looks to a future of continued innovation, growth, price leadership, and service to its customers.
Being in an inventory-intense business, Food Lion made the decision to adopt the LIFO (last-in, first-out) method of inventory valuation. This method matches current costs against current revenues, which minimizes the effect of radical price changes on profit and loss results. In addition, the LIFO method reduces net income, thereby reducing income taxes during periods of inflation.
Food Lion establishes a LIFO index for each of seven inventory pools: Grocery, Paper/Household, Pet Supplies, Health & Beauty Aids, Dairy, Cigarette/Tobacco, and Beer/Wine. For example, a LIFO index of 1.008 for the Grocery pool would indicate that the company’s grocery inventory value at current costs reflects a 0.8% increase due to inflation over the most recent one-year period.
A LIFO index for each inventory pool requires that the year-end inventory count for each product be valued at the current year-end cost and at the preceding year-end cost. To avoid excessive time and expense associated with counting the inventory in all 1200 store locations, Food Lion selects a random sample of 50 stores. Year-end physical inventories are taken in each of the sample stores. The current-year and preceding-year costs for each item are then used to construct the required LIFO indexes for each inventory pool.
For a recent year, the sample estimate of the LIFO index for the Health & Beauty Aids inventory pool was 1.015. Using a 95% confidence level, Food Lion computed a margin of error of .006 for the sample estimate. Thus, the interval from 1.009 to 1.021 provided a 95% confidence interval estimate of the population LIFO index. This level of precision was judged to be very good.
In this chapter you will learn how to compute the margin of error associated with sample estimates. You will also learn how to use this information to construct and interpret interval estimates of a population mean and a population proportion.
The Food Lion store in the Cambridge Shopping Center, Charlotte, North Carolina. © Courtesy of Food Lion.
FOOD LION*
SALISBURY, NORTH CAROLINA
*The authors are indebted to Keith Cunningham, Tax Director, and Bobby
Harkey, Staff Tax Accountant, at Food Lion for providing this Statistics in
Practice.
In Chapter 7, we stated that a point estimator is a sample statistic used to estimate a population parameter. For instance, the sample mean x̄ is a point estimator of the population mean µ and the sample proportion p̄ is a point estimator of the population proportion p. Because a point estimator cannot be expected to provide the exact value of the population parameter, an interval estimate is often computed by adding and subtracting a value, called the margin of error, to the point estimate. The general form of an interval estimate is as follows:
Point estimate ± Margin of error
The purpose of an interval estimate is to provide information about how close the point estimate, provided by the sample, is to the value of the population parameter.
In this chapter we show how to compute interval estimates of a population mean µ and a population proportion p. The general form of an interval estimate of a population mean is
x̄ ± Margin of error
Similarly, the general form of an interval estimate of a population proportion is
p̄ ± Margin of error
The sampling distributions of x̄ and p̄ play key roles in computing these interval estimates.
In order to develop an interval estimate of a population mean, either the population standard deviation σ or the sample standard deviation s must be used to compute the margin of error. In most applications σ is not known, and s is used to compute the margin of error. In some applications, however, large amounts of relevant historical data are available and can be used to estimate the population standard deviation prior to sampling. Also, in quality control applications where a process is assumed to be operating correctly, or “in control,” it is appropriate to treat the population standard deviation as known. We refer to such cases as the σ known case. In this section we introduce an example in which it is reasonable to treat σ as known and show how to construct an interval estimate for this case.
Each week Lloyd’s Department Store selects a simple random sample of 100 customers in order to learn about the amount spent per shopping trip. With x representing the amount spent per shopping trip, the sample mean x̄ provides a point estimate of µ, the mean amount spent per shopping trip for the population of all Lloyd’s customers. Lloyd’s has been using the weekly survey for several years. Based on the historical data, Lloyd’s now assumes a known value of σ = $20 for the population standard deviation. The historical data also indicate that the population follows a normal distribution.
During the most recent week, Lloyd’s surveyed 100 customers (n = 100) and obtained a sample mean of x̄ = $82. The sample mean amount spent provides a point estimate of the population mean amount spent per shopping trip, µ. In the discussion that follows, we show how to compute the margin of error for this estimate and develop an interval estimate of the population mean.
Margin of Error and the Interval Estimate
In Chapter 7 we showed that the sampling distribution of x̄ can be used to compute the probability that x̄ will be within a given distance of µ. In the Lloyd’s example, the historical data show that the population of amounts spent is normally distributed with a standard deviation of σ = 20. So, using what we learned in Chapter 7, we can conclude that the sampling distribution of x̄ follows a normal distribution with an unknown mean µ and a known standard error of σ_x̄ = σ/√n = 20/√100 = 2. This sampling distribution is shown in Figure 8.1.
FIGURE 8.1 SAMPLING DISTRIBUTION OF THE SAMPLE MEAN AMOUNT SPENT FROM SIMPLE RANDOM SAMPLES OF 100 CUSTOMERS
FIGURE 8.2 SAMPLING DISTRIBUTION OF x̄ SHOWING THE LOCATION OF SAMPLE MEANS THAT ARE WITHIN 3.92 OF µ
Because the sampling distribution shows how values of x̄ are distributed around the population mean µ, the sampling distribution of x̄ provides information about the possible differences between x̄ and µ.
Using the standard normal probability table, we find that 95% of the values of any normally distributed random variable are within ±1.96 standard deviations of the mean. Thus, when the sampling distribution of x̄ is normally distributed, 95% of the x̄ values must be within ±1.96σ_x̄ of the mean µ. In the Lloyd’s example we know that the sampling distribution of x̄ is normally distributed with a standard error of σ_x̄ = 2. Because ±1.96σ_x̄ = 1.96(2) = 3.92, we can conclude that 95% of all x̄ values obtained using a sample size of n = 100 will be within ±3.92 of the population mean µ. See Figure 8.2.
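The arithmetic behind the 3.92 figure can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and are not part of the text.

```python
from math import sqrt
from statistics import NormalDist

sigma = 20.0   # known population standard deviation (Lloyd's example)
n = 100        # sample size

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# z value that leaves .025 in the upper tail of the standard normal
z_025 = NormalDist().inv_cdf(0.975)

# 95% of sample means fall within this distance of the population mean
margin_of_error = z_025 * standard_error

print(round(standard_error, 2), round(z_025, 2), round(margin_of_error, 2))
```

The rounded values agree with the standard error of 2, the z value of 1.96, and the distance of 3.92 used in the discussion.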
margin of error equal to 3.92 and compute the interval estimate of µ using x̄ ± 3.92. To provide an interpretation for this interval estimate, let us consider the values of x̄ that could be obtained if we took three different simple random samples, each consisting of 100 Lloyd’s customers. The first sample mean might turn out to have the value shown as x̄_1 in Figure 8.3. In this case, Figure 8.3 shows that the interval formed by subtracting 3.92 from x̄_1 and adding 3.92 to x̄_1 includes the population mean µ. Now consider what happens if the second sample mean turns out to have the value shown as x̄_2 in Figure 8.3. Although this sample mean differs from the first sample mean, we see that the interval formed by subtracting 3.92 from x̄_2 and adding 3.92 to x̄_2 also includes the population mean µ. However, consider what happens if the third sample mean turns out to have the value shown as x̄_3 in Figure 8.3. In this case, the interval formed by subtracting 3.92 from x̄_3 and adding 3.92 to x̄_3 does not include the population mean µ. Because x̄_3 falls in the upper tail of the sampling distribution and is farther than 3.92 from µ, subtracting and adding 3.92 to x̄_3 forms an interval that does not include µ.
Any sample mean x̄ that is within the darkly shaded region of Figure 8.3 will provide an interval that contains the population mean µ. Because 95% of all possible sample means are in the darkly shaded region, 95% of all intervals formed by subtracting 3.92 from x̄ and adding 3.92 to x̄ will include the population mean µ.
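The repeated-sampling interpretation can be illustrated with a small simulation. The sketch below assumes a population mean of 80, which is a hypothetical value chosen only for illustration (in practice µ is unknown), and draws many samples of 100 customers from a normal population with σ = 20.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable

MU = 80.0               # hypothetical population mean (unknown in practice)
SIGMA = 20.0            # known population standard deviation
N = 100                 # customers per sample
MARGIN = 3.92           # margin of error from the discussion above
TRIALS = 2000           # number of simulated weekly samples


def sample_mean():
    """Mean of one simple random sample of N amounts spent."""
    return sum(random.gauss(MU, SIGMA) for _ in range(N)) / N


# Count the samples whose interval (x_bar - 3.92, x_bar + 3.92) contains MU
hits = sum(abs(sample_mean() - MU) <= MARGIN for _ in range(TRIALS))
coverage = hits / TRIALS

print(f"Share of intervals containing the population mean: {coverage:.3f}")
```

The share should come out close to .95, though it will vary slightly with the seed and the number of trials.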
Recall that during the most recent week, the quality assurance team at Lloyd’s surveyed 100 customers and obtained a sample mean amount spent of x̄ = 82. Using x̄ ± 3.92 to construct the interval estimate, we obtain 82 ± 3.92. Thus, the specific interval estimate of µ based on the data from the most recent week is 82 − 3.92 = 78.08 to 82 + 3.92 = 85.92. Because 95% of all the intervals constructed using x̄ ± 3.92 will contain the population mean, we say that we are 95% confident that the interval 78.08 to 85.92 includes the population mean µ. We say that this interval has been established at the 95% confidence level. The value .95 is referred to as the confidence coefficient, and the interval 78.08 to 85.92 is called the 95% confidence interval.
Another term sometimes associated with an interval estimate is the level of significance. The level of significance associated with an interval estimate is denoted by the Greek letter α. The level of significance and the confidence coefficient are related as follows:
α = Level of significance = 1 − Confidence coefficient
The level of significance is the probability that the interval estimation procedure will generate an interval that does not contain µ. For example, the level of significance corresponding to a .95 confidence coefficient is α = 1 − .95 = .05. In Lloyd’s case, the level of significance (α = .05) is the probability of drawing a sample, computing the sample mean, and finding that x̄ lies in one of the tails of the sampling distribution (see x̄_3 in Figure 8.3). When the sample mean happens to fall in the tail of the sampling distribution (and it will 5% of the time), the confidence interval generated will not contain µ.
This discussion provides insight as to why the interval is called a 95% confidence interval.
With the margin of error given by z_{α/2}(σ/√n), the general form of an interval estimate of a population mean for the σ known case follows.
INTERVAL ESTIMATE OF A POPULATION MEAN: σ KNOWN
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \qquad (8.1)$$
where (1 − α) is the confidence coefficient and z_{α/2} is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution.
Let us use expression (8.1) to construct a 95% confidence interval for the Lloyd’s example. For a 95% confidence interval, the confidence coefficient is (1 − α) = .95 and thus α = .05. Using the tables of areas for the standard normal distribution, an area of α/2 = .05/2 = .025 in the upper tail provides z_{.025} = 1.96. With the Lloyd’s sample mean x̄ = 82, σ = 20, and a sample size n = 100, we obtain
$$82 \pm 1.96\frac{20}{\sqrt{100}} = 82 \pm 3.92$$
Thus, using expression (8.1), the margin of error is 3.92 and the 95% confidence interval is 82 − 3.92 = 78.08 to 82 + 3.92 = 85.92.
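Expression (8.1) translates directly into a short function. The following is a minimal sketch; the function name `z_interval` and its parameter names are assumptions made for illustration, and the z value is computed with Python's standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Interval estimate of a population mean when sigma is known.

    Returns (lower limit, upper limit, margin of error).
    """
    alpha = 1 - confidence                    # level of significance
    z = NormalDist().inv_cdf(1 - alpha / 2)   # area alpha/2 in the upper tail
    margin = z * sigma / sqrt(n)              # z * (sigma / sqrt(n))
    return x_bar - margin, x_bar + margin, margin


# Lloyd's example: x_bar = 82, sigma = 20, n = 100
lower, upper, margin = z_interval(82, 20, 100, confidence=0.95)
print(round(margin, 2), round(lower, 2), round(upper, 2))
```

Applied to the Lloyd's values, the rounded results match the margin of error of 3.92 and the interval 78.08 to 85.92.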
The level of significance is also referred to as the significance level.
Although a 95% confidence level is frequently used, other confidence levels such as 90% and 99% may be considered. Values of z_{α/2} for the most commonly used confidence levels are shown in Table 8.1. Using these values and expression (8.1), the 90% confidence interval for the Lloyd’s example is
$$82 \pm 1.645\frac{20}{\sqrt{100}} = 82 \pm 3.29$$
Thus, at 90% confidence, the margin of error is 3.29 and the confidence interval is 82 − 3.29 = 78.71 to 82 + 3.29 = 85.29. Similarly, the 99% confidence interval is
$$82 \pm 2.576\frac{20}{\sqrt{100}} = 82 \pm 5.15$$
Thus, at 99% confidence, the margin of error is 5.15 and the confidence interval is 82 − 5.15 = 76.85 to 82 + 5.15 = 87.15.
Comparing the results for the 90%, 95%, and 99% confidence levels, we see that in order to have a higher degree of confidence, the margin of error and thus the width of the confidence interval must be larger.
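The trade-off between confidence and interval width can be tabulated in a few lines. This sketch recomputes the three Lloyd's intervals; the dictionary `results` is just an illustrative container.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 82.0, 20.0, 100   # Lloyd's example values

results = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    # Store (margin of error, lower limit, upper limit), rounded to cents
    results[confidence] = (
        round(margin, 2),
        round(x_bar - margin, 2),
        round(x_bar + margin, 2),
    )
    print(f"{confidence:.0%}: margin {margin:.2f}, "
          f"interval {x_bar - margin:.2f} to {x_bar + margin:.2f}")
```

Each step up in confidence produces a larger margin of error and therefore a wider interval.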
Enter Data: A label and the sales data are entered into cells A1:A101.
Enter Functions and Formulas: The sample size and sample mean are computed in cells D4:D5 using Excel’s COUNT and AVERAGE functions, respectively. The value worksheet shows that the sample size is 100 and the sample mean is 82. The value of the known population standard deviation (20) is entered into cell D7 and the desired confidence coefficient (.95) is entered into cell D8. The level of significance is computed in cell D9 by entering the formula =1-D8; the value worksheet shows that the level of significance associated with a confidence coefficient of .95 is .05. The margin of error is computed in cell D11 using Excel’s CONFIDENCE function. The CONFIDENCE function has three inputs: the level of significance (cell D9); the population standard deviation (cell D7); and the sample size (cell D4). Thus, to compute the margin of error associated with a 95% confidence interval, the following formula is entered into cell D11:
=CONFIDENCE(D9,D7,D4)
The resulting value of 3.92 is the margin of error associated with the interval estimate of the population mean amount spent per week.
Cells D13:D15 provide the point estimate and the lower and upper limits for the confidence interval. Because the point estimate is just the sample mean, the formula =D5 is entered into cell D13. To compute the lower limit of the 95% confidence interval, x̄ − (margin of error), we enter the formula =D13-D11 into cell D14. To compute the upper limit of the 95% confidence interval, x̄ + (margin of error), we enter the formula =D13+D11 into cell D15. The value worksheet shows a lower limit of 78.08 and an upper limit of 85.92. In other words, the 95% confidence interval for the population mean is from 78.08 to 85.92.
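For readers working outside Excel, the worksheet logic can be mirrored in Python. The sketch below is an assumption-laden analogue, not the worksheet itself: the five amounts in `column_a` are hypothetical stand-ins for the 100 Lloyd's observations, the variable names simply echo the cell addresses, and the `confidence` function is a hand-written counterpart of Excel's CONFIDENCE(alpha, standard_dev, size).

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical column A: a text label followed by a few amounts spent
column_a = ["Amount", 72.0, 91.5, 84.0, 80.5, 82.0]

# COUNT and AVERAGE use only the numeric cells, so the label is skipped
numeric = [v for v in column_a if isinstance(v, (int, float))]
d4 = len(numeric)      # sample size, like COUNT
d5 = mean(numeric)     # sample mean, like AVERAGE
d7 = 20.0              # known population standard deviation
d8 = 0.95              # confidence coefficient
d9 = 1 - d8            # level of significance


def confidence(alpha, standard_dev, size):
    """Margin of error for the sigma known case, like Excel's CONFIDENCE."""
    return NormalDist().inv_cdf(1 - alpha / 2) * standard_dev / sqrt(size)


d11 = confidence(d9, d7, d4)   # margin of error
d13 = d5                       # point estimate
d14 = d13 - d11                # lower limit
d15 = d13 + d11                # upper limit

print(d4, d5, round(d11, 2), round(d14, 2), round(d15, 2))
```

With only five hypothetical observations the margin of error is much larger than 3.92; calling `confidence(0.05, 20, 100)` gives the worksheet's value for the full sample of 100.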
A Template for Other Problems: To use this worksheet as a template for another problem of this type, we must first enter the new problem data in column A. Then, the cell formulas in cells D4 and D5 must be updated with the new data range and the known population standard deviation must be entered into cell D7. After doing so, the point estimate and a 95% confidence interval will be displayed in cells D13:D15. If a confidence interval with a different confidence coefficient is desired, we simply change the value in cell D8.
We can further simplify the use of Figure 8.4 as a template for other problems by eliminating the need to enter new data ranges in cells D4 and D5. To do so we rewrite the cell formulas as follows:
Cell D4: =COUNT(A:A)
Cell D5: =AVERAGE(A:A)
With the A:A method of specifying data ranges, Excel’s COUNT function will count the number of numeric values in column A and Excel’s AVERAGE function will compute the average of the numeric values in column A. Thus, to solve a new problem it is only necessary to enter the new data into column A and enter the value of the known population standard deviation in cell D7.
The Lloyd’s data set includes a worksheet titled Template that uses the A:A method for entering the data ranges.
This worksheet can also be used as a template for text exercises in which the sample size, sample mean, and the population standard deviation are given. In this type of situation we simply replace the values in cells D4, D5, and D7 with the given values of the sample size, sample mean, and the population standard deviation.
Practical Advice
If the population follows a normal distribution, the confidence interval provided by expression (8.1) is exact. In other words, if expression (8.1) were used repeatedly to generate 95% confidence intervals, exactly 95% of the intervals generated would contain the population mean. If the population does not follow a normal distribution, the confidence interval provided by expression (8.1) will be approximate. In this case, the quality of the approximation depends on both the distribution of the population and the sample size.
In most applications, a sample size of n ≥ 30 is adequate when using expression (8.1) to develop an interval estimate of a population mean. If the population is not normally distributed, but is roughly symmetric, sample sizes as small as 15 can be expected to provide good approximate confidence intervals. With smaller sample sizes, expression (8.1) should only be used if the analyst believes, or is willing to assume, that the population distribution is at least approximately normal.
NOTES AND COMMENTS
1. The interval estimation procedure discussed in this section is based on the assumption that the population standard deviation σ is known. By σ known we mean that historical data or other information are available that permit us to obtain a good estimate of the population standard deviation prior to taking the sample that will be used to develop an estimate of the population mean. So technically we don’t mean that σ is actually known with certainty. We just mean that we obtained a good estimate of the standard deviation prior to sampling and thus we won’t be using the same sample to estimate both the population mean and the population standard deviation.
2. The sample size n appears in the denominator of the interval estimation expression (8.1). Thus, if a particular sample size provides too wide an interval to be of any practical use, we may want to consider increasing the sample size. With n in the denominator, a larger sample size will provide a smaller margin of error, a narrower interval, and greater precision. The procedure for determining the size of a simple random sample necessary to obtain a desired precision is discussed in Section 8.3.
Methods
1. A simple random sample of 40 items resulted in a sample mean of 25. The population standard deviation is σ = 5.
a. What is the standard error of the mean, σ_x̄?
b. At 95% confidence, what is the margin of error?
2. A simple random sample of 50 items from a population with σ = 6 resulted in a sample mean of 32.
a. Provide a 90% confidence interval for the population mean.
b. Provide a 95% confidence interval for the population mean.
c. Provide a 99% confidence interval for the population mean.
3. A simple random sample of 60 items resulted in a sample mean of 80. The population standard deviation is σ = 15.
a. Compute the 95% confidence interval for the population mean.
b. Assume that the same sample mean was obtained from a sample of 120 items. Provide a 95% confidence interval for the population mean.
c. What is the effect of a larger sample size on the interval estimate?
4. A 95% confidence interval for a population mean was reported to be 152 to 160. If σ = 15, what sample size was used in this study?
a At 95% confidence, what is the margin of error?
b Develop a 95% confidence interval estimate of the mean amount spent for dinner
6. Nielsen Media Research conducted a study of household television viewing times during the 8 p.m. to 11 p.m. time period. The data contained in the CD file named Nielsen are consistent with the findings reported (The World Almanac, 2003). Based upon past studies the population standard deviation is assumed known with σ = 3.5 hours. Develop a 95% confidence interval estimate of the mean television viewing time per week during the 8 p.m. to 11 p.m. time period.
7. A survey of small businesses with Web sites found that the average amount spent on a site was $11,500 per year (Fortune, March 5, 2001). Given a sample of 60 small businesses and a population standard deviation of σ = $4000, what is the margin of error? Use 95% confidence. What would you recommend if the study required a margin of error of $500?
8. The National Quality Research Center at the University of Michigan provides a quarterly measure of consumer opinions about products and services (The Wall Street Journal, February 18, 2003). A survey of 10 restaurants in the Fast Food/Pizza group showed a sample mean customer satisfaction index of 71. Past data indicate that the population standard deviation of the index has been relatively stable with σ = 5.
a What assumption should the researcher be willing to make if a margin of error is desired?
b Using 95% confidence, what is the margin of error?
c What is the margin of error if 99% confidence is desired?
9. A study was conducted of students admitted to the top graduate business schools. The data contained in the CD file named GPA show the undergraduate grade point average for students and are consistent with the findings reported (“Best Graduate Schools,” U.S. News and World Report, 2001). Using past years’ data, the population standard deviation can be assumed known with σ = .28. What is the 95% confidence interval estimate of the mean undergraduate grade point average for students admitted to the top graduate business schools?
10. Playbill magazine reported that the mean annual household income of its readers is $119,155 (Playbill, December 2003). Assume this estimate of the mean annual household income is based on a sample of 80 households and, based on past studies, the population standard deviation is known to be σ = $30,000.
a Develop a 90% confidence interval estimate of the population mean
b Develop a 95% confidence interval estimate of the population mean
c Develop a 99% confidence interval estimate of the population mean
d Discuss what happens to the width of the confidence interval as the confidence level
is increased Does this result seem reasonable? Explain
8.2 Population Mean: σ Unknown
When developing an interval estimate of a population mean we usually do not have a good estimate of the population standard deviation either. In these cases, we must use the same sample to estimate µ and σ. This situation represents the σ unknown case. When s is used to estimate σ, the margin of error and the interval estimate for the population mean are based on a probability distribution known as the t distribution. Although the mathematical development of the t distribution is based on the assumption of a normal distribution for the population we are sampling from, research shows that the t distribution can be successfully applied in many situations where the population deviates significantly from normal. Later in this section we provide guidelines for using the t distribution if the population is not normally distributed.
The t distribution is a family of similar probability distributions, with a specific t distribution depending on a parameter known as the degrees of freedom. The t distribution with one degree of freedom is unique, as is the t distribution with two degrees of freedom, with three degrees of freedom, and so on. As the number of degrees of freedom increases, the difference between the t distribution and the standard normal distribution becomes smaller and smaller. Figure 8.5 shows t distributions with 10 and 20 degrees of freedom and their relationship to the standard normal probability distribution. Note that a t distribution with more degrees of freedom exhibits less variability and more closely resembles the standard normal distribution. Note also that the mean of the t distribution is zero.
We place a subscript on t to indicate an area in the upper tail of the t distribution. For example, just as we used z_{.025} to indicate the z value providing a .025 area in the upper tail of a standard normal distribution, we will use t_{.025} to indicate the t value providing a .025 area in the upper tail of a t distribution. In general, we will use the notation t_{α/2} to represent a t value with an area of α/2 in the upper tail of the t distribution. See Figure 8.6.
William Sealy Gosset, writing under the name “Student,” is the founder of the t distribution. Gosset, an Oxford graduate in mathematics, worked for the Guinness Brewery in Dublin, Ireland. He developed the t distribution while working on small-scale materials and temperature experiments.
FIGURE 8.5 COMPARISON OF THE STANDARD NORMAL DISTRIBUTION WITH t DISTRIBUTIONS HAVING 10 AND 20 DEGREES OF FREEDOM
(Curves shown: standard normal distribution; t distribution with 20 degrees of freedom; t distribution with 10 degrees of freedom.)
FIGURE 8.6 t DISTRIBUTION WITH α/2 AREA OR PROBABILITY IN THE UPPER TAIL
Table 8.2 provides the t value for upper tail areas of .20, .10, .05, .025, .01, and .005. Each row in the table corresponds to a separate t distribution with the degrees of freedom shown. For example, for a t distribution with 10 degrees of freedom, t_{.025} = 2.228. Similarly, for a t distribution with 20 degrees of freedom, t_{.025} = 2.086. As the degrees of freedom continue to increase, t_{.025} approaches z_{.025} = 1.96. In fact, the standard normal distribution z values can be found in the infinite degrees of freedom row (labeled ∞) of the t distribution table. If the degrees of freedom exceed 100, the infinite degrees of freedom row can be used to approximate the actual t value; in other words, for more than 100 degrees of freedom, the standard normal z value provides a good approximation to the t value. Table 2 in Appendix B provides a more extensive t distribution table, with all the degrees of freedom from 1 to 100 included.
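When a printed t table is not at hand, the tabled values can be approximated numerically. The sketch below is one possible approach using only Python's standard library: it integrates the t density with Simpson's rule and then searches for the t value by bisection. The function names are assumptions made for illustration, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def upper_tail_area(t, df, steps=1000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_value(area, df):
    """t value with the given upper tail area (0 < area < .5), by bisection."""
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, df) > area:   # widen until the tail is small enough
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for df in (10, 20, 84):
    print(df, round(t_value(0.025, df), 3))
print("z", round(NormalDist().inv_cdf(0.975), 3))
```

The printed values can be compared with the entries quoted from Table 8.2, and they move toward the z value as the degrees of freedom grow.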
Margin of Error and the Interval Estimate
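In Section 8.1 we showed that an interval estimate of a population mean for the σ known case is given by expression (8.1), x̄ ± z_{α/2}(σ/√n). For the σ unknown case, the sample standard deviation s is used to estimate σ, and z_{α/2} is replaced by the t distribution value t_{α/2}.
As the degrees of freedom increase, the t distribution approaches the standard normal distribution.
INTERVAL ESTIMATE OF A POPULATION MEAN: σ UNKNOWN
$$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} \qquad (8.2)$$
where s is the sample standard deviation, (1 − α) is the confidence coefficient, and t_{α/2} is the t value providing an area of α/2 in the upper tail of the t distribution with n − 1 degrees of freedom.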
Note (Table 8.2): A more extensive table is provided as Table 2 of Appendix B. For example, with 10 degrees of freedom, the t value with an area of .025 in the upper tail is t_{.025} = 2.228.
The reason the number of degrees of freedom associated with the t value in expression (8.2) is n − 1 concerns the use of s as an estimate of the population standard deviation σ. The expression for the sample standard deviation is
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
Degrees of freedom refer to the number of independent pieces of information that go into the computation of Σ(x_i − x̄)². The n pieces of information involved in computing Σ(x_i − x̄)² are as follows: x_1 − x̄, x_2 − x̄, . . . , x_n − x̄. In Section 3.2 we indicated that Σ(x_i − x̄) = 0 for any data set. Thus, only n − 1 of the x_i − x̄ values are independent; that is, if we know n − 1 of the values, the remaining value can be determined exactly by using the condition that the sum of the x_i − x̄ values must be 0. Thus, n − 1 is the number of degrees of freedom associated with Σ(x_i − x̄)² and hence the number of degrees of freedom for the t distribution in expression (8.2).
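The degrees-of-freedom argument can be seen on a tiny data set. The five values below are hypothetical and chosen only so the arithmetic is easy to follow; the sketch shows that the deviations sum to zero, that the last deviation is determined by the other four, and that dividing by n − 1 reproduces the sample standard deviation.

```python
from math import isclose, sqrt
from statistics import mean, stdev

data = [46, 54, 42, 46, 32]   # hypothetical sample, n = 5
n = len(data)
x_bar = mean(data)

# The n deviations about the sample mean
deviations = [x - x_bar for x in data]
total = sum(deviations)       # always 0 for any data set

# Knowing the first n - 1 deviations fixes the last one
implied_last = -sum(deviations[:-1])

# Sample standard deviation with n - 1 in the denominator
s = sqrt(sum(d * d for d in deviations) / (n - 1))

print(deviations, total, implied_last, s)
```

Only four of the five deviations are free to vary, which is why the computation of s carries n − 1 = 4 degrees of freedom here.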
To illustrate the interval estimation procedure for the σ unknown case, we will consider a study designed to estimate the mean credit card debt for the population of households in a certain city. A sample of n = 85 households provided the credit card balances shown in Table 8.3. For this situation, no previous estimate of the population standard deviation σ is available. Thus, the sample data must be used to estimate both the population mean and the population standard deviation. Using the data in Table 8.3, we compute the sample mean x̄ = $5900 and the sample standard deviation s = $3058. With 95% confidence and n − 1 = 84 degrees of freedom, Table 2 in Appendix B provides t_{.025} = 1.989. We can now use expression (8.2) to compute an interval estimate of the population mean.
$$5900 \pm 1.989\frac{3058}{\sqrt{85}} = 5900 \pm 660$$
The point estimate of the population mean is $5900, the margin of error is $660, and the 95% confidence interval is 5900 − 660 = $5240 to 5900 + 660 = $6560. Thus, we are 95% confident that the population mean credit card balance for all households is between $5240 and $6560.
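The credit card computation can be reproduced from the summary statistics alone. This is a minimal sketch of expression (8.2); the function name `t_interval` is an assumption, and the t value of 1.989 for 84 degrees of freedom is taken from the text rather than computed.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_value):
    """Interval estimate of a population mean when sigma is unknown.

    t_value is the table value with n - 1 degrees of freedom.
    Returns (lower limit, upper limit, margin of error).
    """
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Credit card example: x_bar = 5900, s = 3058, n = 85, t.025 = 1.989
lower, upper, margin = t_interval(5900, 3058, 85, 1.989)
print(round(margin), round(lower), round(upper))
```

Rounded to whole dollars, the results agree with the margin of error of $660 and the interval $5240 to $6560.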
Enter Data: A label and the credit card balances are entered into cells A1:A86.
Apply Analysis Tools: The following steps describe how to use Excel’s Descriptive Statistics tool for these data:
Step 1. Select the Tools menu
Step 2. Choose the Data Analysis option
Step 3. Choose Descriptive Statistics from the list of Analysis Tools
Step 4. When the Descriptive Statistics dialog box appears:
    Enter A1:A86 in the Input Range box
    Select Grouped By Columns
    Select Labels in First Row
    Select Output Range
    Enter C1 in the Output Range box
    Select Summary Statistics
    Select Confidence Level for Mean
    Enter 95 in the Confidence Level for Mean box
    Click OK
FIGURE 8.7 EXCEL WORKSHEET: 95% CONFIDENCE INTERVAL FOR CREDIT CARD BALANCES
The sample mean (x̄) is in cell D3. The margin of error, labeled “Confidence Level(95.0%),” appears in cell D16. The value worksheet shows x̄ = 5900 and a margin of error equal to 660.
Enter Functions and Formulas: Cells D18:D20 provide the point estimate and the lower and upper limits for the confidence interval. Because the point estimate is just the sample mean, the formula =D3 is entered into cell D18. To compute the lower limit of the 95% confidence interval, x̄ − (margin of error), we enter the formula =D18-D16 into cell D19. To compute the upper limit of the 95% confidence interval, x̄ + (margin of error), we enter the formula =D18+D16 into cell D20. The value worksheet shows a lower limit of 5240 and an upper limit of 6560. In other words, the 95% confidence interval for the population mean is from 5240 to 6560.
Practical Advice
If the population follows a normal distribution, the confidence interval provided by expression (8.2) is exact and can be used for any sample size. If the population does not follow a normal distribution, the confidence interval provided by expression (8.2) will be approximate. In this case, the quality of the approximation depends on both the distribution of the population and the sample size.
In most applications, a sample size of n ≥ 30 is adequate when using expression (8.2) to develop an interval estimate of a population mean. However, if the population distribution is highly skewed or contains outliers, most statisticians would recommend increasing the sample size to 50 or more. If the population is not normally distributed but is roughly symmetric, sample sizes as small as 15 can be expected to provide good approximate confidence intervals. With smaller sample sizes, expression (8.2) should only be used if the analyst believes, or is willing to assume, that the population distribution is at least approximately normal.
Larger sample sizes are needed if the distribution of the population is highly skewed or includes outliers.
Using a Small Sample
In the following example we develop an interval estimate for a population mean when the sample size is small. As we already noted, an understanding of the distribution of the population becomes a factor in deciding whether the interval estimation procedure provides acceptable results.
Scheer Industries is considering a new computer-assisted program to train maintenance employees to do machine repairs. In order to fully evaluate the program, the director of manufacturing requested an estimate of the population mean time required for maintenance employees to complete the computer-assisted training.
A sample of 20 employees is selected, with each employee in the sample completing the training program. Data on the training time in days for the 20 employees are shown in Table 8.4. A histogram of the sample data appears in Figure 8.8. What can we say about the distribution of the population based on this histogram? First, the sample data do not support the conclusion that the distribution of the population is normal, yet we do not see any evidence of skewness or outliers. Therefore, using the guidelines in the previous subsection, we conclude that an interval estimate based on the t distribution appears acceptable for the sample of 20 employees.
We continue by computing the sample mean and sample standard deviation from the data in Table 8.4. For a 95% confidence interval, we use Table 8.2 and n − 1 = 19 degrees of freedom to obtain $t_{.025} = 2.093$. Expression (8.2) provides the interval estimate of the population mean.

$$\bar{x} \pm t_{.025}\frac{s}{\sqrt{n}} = 51.5 \pm 3.2$$

The point estimate of the population mean is 51.5 days. The margin of error is 3.2 days and the 95% confidence interval is 51.5 − 3.2 = 48.3 days to 51.5 + 3.2 = 54.7 days.
Using a histogram of the sample data to learn about the distribution of a population is not always conclusive, but in many cases it provides the only information available. The histogram, along with judgment on the part of the analyst, can often be used to decide whether expression (8.2) can be used to develop the interval estimate.
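The arithmetic behind the Scheer Industries interval can be sketched in a few lines of Python. This is only an illustration: the function name `t_interval` is hypothetical, and the sample standard deviation `s = 6.84` is an assumed value, back-solved so that it is consistent with the 3.2-day margin of error stated in the text (the Table 8.4 data are not reproduced here). The t value 2.093 is the table value quoted above.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Return (margin_of_error, lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return margin, xbar - margin, xbar + margin

# xbar, n, and t_crit come from the Scheer Industries example.
# s = 6.84 is an ASSUMED value chosen to match the stated margin of error.
margin, lower, upper = t_interval(xbar=51.5, s=6.84, n=20, t_crit=2.093)
print(f"margin of error = {margin:.1f} days")
print(f"95% interval: {lower:.1f} to {upper:.1f} days")
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries; the value would normally be read from a t table such as Table 8.2.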
Summary of Interval Estimation Procedures
We provided two approaches to developing an interval estimate of a population mean. For the σ known case, σ and the standard normal distribution are used in expression (8.1) to compute the margin of error and to develop the interval estimate. For the σ unknown case, the sample standard deviation s and the t distribution are used in expression (8.2) to compute the margin of error and to develop the interval estimate.
A summary of the interval estimation procedures for the two cases is shown in Figure 8.9. In most applications, a sample size of n ≥ 30 is adequate. If the population has a normal or approximately normal distribution, however, smaller sample sizes may be used. For the σ unknown case a sample size of n ≥ 50 is recommended if the population distribution is believed to be highly skewed or has outliers.
NOTES AND COMMENTS
1. When σ is known, the margin of error, $z_{\alpha/2}(\sigma/\sqrt{n})$, is fixed and is the same for all samples of size n. When σ is unknown, the margin of error, $t_{\alpha/2}(s/\sqrt{n})$, varies from sample to sample. This variation occurs because s varies from sample to sample. A large value for s provides a larger margin of error, while a small value for s provides a smaller margin of error.
2. What happens to confidence interval estimates when the population is skewed? Consider a population that is skewed to the right with large data values stretching the distribution to the right. When such skewness exists, the sample mean $\bar{x}$ and the sample standard deviation s are positively correlated. Larger values of s tend to be associated with larger values of $\bar{x}$. Thus, when $\bar{x}$ is larger than the population mean, s tends to be larger than σ. This skewness causes the margin of error, $t_{\alpha/2}(s/\sqrt{n})$, to be larger than it would be with
FIGURE 8.9 SUMMARY OF INTERVAL ESTIMATION PROCEDURES
FOR A POPULATION MEAN
12. Find the t value(s) for each of the following cases.
a. Upper tail area of .025 with 12 degrees of freedom
b. Lower tail area of .05 with 50 degrees of freedom
c. Upper tail area of .01 with 30 degrees of freedom
d. Where 90% of the area falls between these two t values with 25 degrees of freedom
e. Where 95% of the area falls between these two t values with 45 degrees of freedom (See Table 2 of Appendix B for a more extensive t table.)
13. The following sample data are from a normal population: 10, 8, 12, 15, 13, 11, 6, 5.
a. What is the point estimate of the population mean?
b. What is the point estimate of the population standard deviation?
c. With 95% confidence, what is the margin of error for the estimation of the population mean?
d. What is the 95% confidence interval for the population mean?
14. A simple random sample with n = 54 provided a sample mean of 22.5 and a sample standard deviation of 4.4. (See Table 2 of Appendix B for a more extensive t table.)
a. Develop a 90% confidence interval for the population mean.
b. Develop a 95% confidence interval for the population mean.
c. Develop a 99% confidence interval for the population mean.
d. What happens to the margin of error and the confidence interval as the confidence level is increased?
Applications
15. Sales personnel for Skillings Distributors submit weekly reports listing the customer contacts made during the week. A sample of 65 weekly reports showed a sample mean of 19.5 customer contacts per week. The sample standard deviation was 5.2. Provide 90% and 95% confidence intervals for the population mean number of weekly customer contacts for the sales personnel.
16. The mean number of hours of flying time for pilots at Continental Airlines is 49 hours per month (The Wall Street Journal, February 25, 2003). Assume that this mean was based on actual flying times for a sample of 100 Continental pilots and that the sample standard deviation was 8.5 hours.
σ known. The confidence interval with the larger margin of error tends to include the population mean µ more often than it would if the true value of σ were used. But when $\bar{x}$ is smaller than the population mean, the correlation between $\bar{x}$ and s causes the margin of
a. At 95% confidence, what is the margin of error?
b. What is the 95% confidence interval estimate of the population mean flying time for the pilots?
c. The mean number of hours of flying time for pilots at United Airlines is 36 hours per month. Use your results from part (b) to discuss differences between the flying times for the pilots at the two airlines. The Wall Street Journal reported United Airlines as having the highest labor cost among all airlines. Does the information in this exercise provide insight as to why United Airlines might expect higher labor costs?
17. The International Air Transport Association surveys business travelers to develop quality ratings for transatlantic gateway airports. The maximum possible rating is 10. Suppose a simple random sample of 50 business travelers is selected and each traveler is asked to provide a rating for the Miami International Airport. The ratings obtained from the sample of 50 business travelers follow.
7 8 7 5 9 5 8 4 3 8 5 5 4
4 4 8 4 5 6 2 5 9 9 8 4 8
9 9 5 9 7 8 3 10 8 9 6
Develop a 95% confidence interval estimate of the population mean rating for Miami.
18. Thirty fast-food restaurants including Wendy’s, McDonald’s, and Burger King were visited during the summer of 2000 (The Cincinnati Enquirer, July 9, 2000). During each visit, the customer went to the drive-through and ordered a basic meal such as a “combo” meal or a sandwich, fries, and shake. The time between pulling up to the menu board and receiving the filled order was recorded. The times in minutes for the 30 visits are as follows:
0.9 1.0 1.2 2.2 1.9 3.6 2.8 5.2 1.8 2.1
6.8 1.3 3.0 4.5 2.8 2.3 2.7 5.7 4.8 3.5
2.6 3.3 5.0 4.0 7.2 9.1 2.8 3.6 7.3 9.0
a. Provide a point estimate of the population mean drive-through time at fast-food restaurants.
b. At 95% confidence, what is the margin of error?
c. What is the 95% confidence interval estimate of the population mean?
d. Discuss skewness that may be present in this population. What suggestion would you make for a repeat of this study?
19. A National Retail Foundation survey found households intended to spend an average of $649 during the December holiday season (The Wall Street Journal, December 2, 2002). Assume that the survey included 600 households and that the sample standard deviation was $175.
a. With 95% confidence, what is the margin of error?
b. What is the 95% confidence interval estimate of the population mean?
c. The prior year, the population mean expenditure per household was $632. Discuss the change in holiday season expenditures over the one-year period.
20. The American Association of Advertising Agencies records data on nonprogram minutes on half-hour, prime-time television shows. Representative data in minutes for a sample of 20 prime-time shows on major networks at 8:30 p.m. follow.
21. Complaints about rising prescription drug prices caused the U.S. Congress to consider laws that would force pharmaceutical companies to offer prescription discounts to senior citizens without drug benefits. The House Government Reform Committee provided data on the prescription cost for some of the most widely used drugs (Newsweek, May 8, 2000). Assume the following data show a sample of the prescription cost in dollars for Zocor, a drug used to lower cholesterol.
110 112 115 99 100 98 104 126
Given a normal population, what is the 95% confidence interval estimate of the population mean cost for a prescription of Zocor?
22 The first few weeks of 2004 were good for the stock market A sample of 25 large
open-end funds showed the following year-to-date returns through January 16, 2004 (Barron’s,
8.3 DETERMINING THE SAMPLE SIZE
In providing practical advice in the two preceding sections, we commented on the role of the sample size in providing approximate confidence intervals when the population is not normally distributed. In this section, we focus on another aspect of the sample size issue. We describe how to choose a sample size large enough to provide a desired margin of error. To see how this is done, we return to the σ known case presented in Section 8.1. Using expression (8.1), the interval estimate is

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

The quantity $z_{\alpha/2}(\sigma/\sqrt{n})$ is the margin of error. Thus, we see that $z_{\alpha/2}$, the population standard deviation σ, and the sample size n combine to determine the margin of error. Once we select a confidence coefficient 1 − α, $z_{\alpha/2}$ can be determined. Then, if we have a value for σ, we can determine the sample size n needed to provide any desired margin of error. Development of the formula used to compute the required sample size n follows.
Let E = the desired margin of error:

$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Solving for $\sqrt{n}$, we have

$$\sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E}$$

Squaring both sides of this equation, we obtain the following expression for the sample size.
SAMPLE SIZE FOR AN INTERVAL ESTIMATE OF A POPULATION MEAN

$$n = \frac{(z_{\alpha/2})^2\sigma^2}{E^2} \tag{8.3}$$

The procedures in this section can be used to determine the sample size necessary to satisfy the margin of error requirement.
Equation (8.3) can be used to provide a sample size recommendation. However, judgment on the part of the analyst should be used to determine whether the final sample size should be adjusted upward.
This sample size provides the desired margin of error at the chosen confidence level.
In equation (8.3) E is the margin of error that the user is willing to accept, and the value of $z_{\alpha/2}$ follows directly from the confidence level to be used in developing the interval estimate. Although user preference must be considered, 95% confidence is the most frequently chosen value ($z_{.025} = 1.96$).
Finally, use of equation (8.3) requires a value for the population standard deviation σ.
However, even if σ is unknown, we can use equation (8.3) provided we have a preliminary
or planning value for σ In practice, one of the following procedures can be chosen.
1. Use the estimate of the population standard deviation computed from data of previous studies as the planning value for σ.
2. Use a pilot study to select a preliminary sample. The sample standard deviation from the preliminary sample can be used as the planning value for σ.
3. Use judgment or a “best guess” for the value of σ. For example, we might begin by estimating the largest and smallest data values in the population. The difference between the largest and smallest values provides an estimate of the range for the data. Finally, the range divided by 4 is often suggested as a rough approximation of the standard deviation and thus an acceptable planning value for σ.
Let us demonstrate the use of equation (8.3) to determine the sample size by considering the following example. A previous study that investigated the cost of renting automobiles in the United States found a mean cost of approximately $55 per day for renting a midsize automobile. Suppose that the organization that conducted this study would like to conduct a new study in order to estimate the population mean daily rental cost for a midsize automobile in the United States. In designing the new study, the project director specifies that the population mean daily rental cost be estimated with a margin of error of $2 and a 95% level of confidence.
The project director specified a desired margin of error of E = 2, and the 95% level of confidence indicates $z_{.025} = 1.96$. Thus, we only need a planning value for the population standard deviation σ in order to compute the required sample size. At this point, an analyst reviewed the sample data from the previous study and found that the sample standard deviation for the daily rental cost was $9.65. Using 9.65 as the planning value for σ, we obtain

$$n = \frac{(z_{\alpha/2})^2\sigma^2}{E^2} = \frac{(1.96)^2(9.65)^2}{2^2} = 89.43$$

Thus, the sample size for the new study needs to be at least 89.43 midsize automobile rentals in order to satisfy the project director’s $2 margin-of-error requirement. In cases where the computed n is not an integer, we round up to the next integer value; hence, the recommended sample size is 90 midsize automobile rentals.
A planning value for σ must be specified before the sample size can be determined. Three methods of obtaining a planning value for σ are discussed here.
Equation (8.3) provides the minimum sample size needed to satisfy the desired margin of error requirement. If the computed sample size is not an integer, rounding up to the next integer value will provide a margin of error slightly smaller than required.
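The sample size calculation for the rental car example can be sketched as follows. The function names are hypothetical and chosen only for illustration; the first applies equation (8.3) and rounds up, and the second applies the range-divided-by-4 rule of thumb for a planning value. The range endpoints in the last lines are made-up numbers, not data from the text.

```python
import math

def sample_size_for_mean(z, sigma, E):
    """Equation (8.3): n = z^2 * sigma^2 / E^2, rounded up to the next integer."""
    return math.ceil((z * sigma / E) ** 2)

def planning_sigma_from_range(largest, smallest):
    """Rough planning value for sigma: the estimated range divided by 4."""
    return (largest - smallest) / 4

# Rental car example: z = 1.96, planning value sigma = 9.65, E = 2.
n = sample_size_for_mean(z=1.96, sigma=9.65, E=2)
print(n)  # 90

# Hypothetical range estimate (values are illustrative only).
sigma_plan = planning_sigma_from_range(largest=50, smallest=10)
print(sigma_plan)  # 10.0
```

Rounding up rather than to the nearest integer is what guarantees the achieved margin of error does not exceed E.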
Methods
23. How large a sample should be selected to provide a 95% confidence interval with a margin of error of 10? Assume that the population standard deviation is 40.
24. The range for a set of data is estimated to be 36.
a. What is the planning value for the population standard deviation?
b. At 95% confidence, how large a sample would provide a margin of error of 3?
c. At 95% confidence, how large a sample would provide a margin of error of 2?
nual survey to monitor the cost of a wedding. Use 95% confidence.
a. What is the recommended sample size if the desired margin of error is $1000?
b. What is the recommended sample size if the desired margin of error is $500?
c. What is the recommended sample size if the desired margin of error is $200?
27. Annual starting salaries for college graduates with degrees in business administration are generally expected to be between $30,000 and $45,000. Assume that a 95% confidence interval estimate of the population mean annual starting salary is desired. What is the planning value for the population standard deviation? How large a sample should be taken if the desired margin of error is
a. $500?
b. $200?
c. $100?
d. Would you recommend trying to obtain the $100 margin of error? Explain.
28. Smith Travel Research provides information on the one-night cost of hotel rooms throughout the United States (USA Today, July 8, 2002). Use $2 as the desired margin of error and $22.50 as the planning value for the population standard deviation to find the sample size recommended in (a), (b), and (c).
a. A 90% confidence interval estimate of the population mean cost of hotel rooms.
b. A 95% confidence interval estimate of the population mean cost of hotel rooms.
c. A 99% confidence interval estimate of the population mean cost of hotel rooms.
d. When the desired margin of error is fixed, what happens to the sample size as the confidence level is increased? Would you recommend a 99% confidence level be used by Smith Travel Research? Discuss.
29. The travel-to-work time for residents of the 15 largest cities in the United States is reported in the 2003 Information Please Almanac. Suppose that a preliminary simple random sample of residents of San Francisco is used to develop a planning value of 6.25 minutes for the population standard deviation.
a. If we want to estimate the population mean travel-to-work time for San Francisco residents with a margin of error of 2 minutes, what sample size should be used? Assume 95% confidence.
b. If we want to estimate the population mean travel-to-work time for San Francisco residents with a margin of error of 1 minute, what sample size should be used? Assume 95% confidence.
30. During the first quarter of 2003, the price/earnings (P/E) ratio for stocks listed on the New York Stock Exchange generally ranged from 5 to 60 (The Wall Street Journal, March 7, 2003). Assume that we want to estimate the population mean P/E ratio for all stocks listed on the exchange. How many stocks should be included in the sample if we want a margin of error of 3? Use 95% confidence.
8.4 POPULATION PROPORTION
In Chapter 7 we said that the sampling distribution of $\bar{p}$ can be approximated by a normal distribution whenever np ≥ 5 and n(1 − p) ≥ 5. Figure 8.10 shows the normal approximation of the sampling distribution of $\bar{p}$. The mean of the sampling distribution of $\bar{p}$ is the population proportion p, and the standard error of $\bar{p}$ is

$$\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} \tag{8.4}$$

Because the sampling distribution of $\bar{p}$ is normally distributed, if we choose $z_{\alpha/2}\sigma_{\bar{p}}$ as the margin of error in an interval estimate of a population proportion, we know that 100(1 − α)% of the intervals generated will contain the true population proportion. Unfortunately, $\sigma_{\bar{p}}$ cannot be used directly in the computation of the margin of error because p will not be known; p is what we are trying to estimate. So, $\bar{p}$ is substituted for p and the margin of error for an interval estimate of a population proportion is given by

$$\text{Margin of error} = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$

With this margin of error, the general expression for an interval estimate of a population proportion is as follows.

$$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \tag{8.6}$$

The following example illustrates the computation of the margin of error and interval estimate for a population proportion. A national survey of 900 women golfers was conducted to learn how women golfers view their treatment at golf courses in the United States. The survey found that 396 of the women golfers were satisfied with the availability of tee times. Thus, the point estimate of the proportion of the population of women golfers who are satisfied with the availability of tee times is 396/900 = .44. Using expression (8.6) and a 95% confidence level,

$$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = .44 \pm 1.96\sqrt{\frac{.44(1-.44)}{900}} = .44 \pm .0324$$

Thus, the margin of error is .0324 and the 95% confidence interval estimate of the population proportion is .4076 to .4724. Using percentages, the survey results enable us to state with 95% confidence that between 40.76% and 47.24% of all women golfers are satisfied with the availability of tee times.
Using Excel
Excel can be used to construct an interval estimate of the population proportion of women golfers who are satisfied with the availability of tee times. The responses in the survey were recorded as a Yes or No for each woman surveyed. Refer to Figure 8.11 as we describe the tasks involved in constructing a 95% confidence interval. The formula worksheet is in the background; the value worksheet is in the foreground.
Enter Data: The Yes-No data for the 900 women golfers are entered into cells A2:A901.
Enter Functions and Formulas: The descriptive statistics we need and the response of interest are provided in cells D3:D6. Because Excel’s COUNT function only works with numerical data, we used the COUNTA function in cell D3 to compute the sample size. The response for which we want to develop an interval estimate, Yes or No, is entered into cell D4. Figure 8.11 shows that Yes has been entered into cell D4, indicating that we want to develop an interval estimate of the population proportion of women golfers who are satisfied with the availability of tee times. If we had wanted to develop an interval estimate of the population proportion of women golfers who are not satisfied with the availability of tee times, we would have entered No in cell D4. With Yes entered in cell D4, the COUNTIF function in cell D5 counts the number of Yes responses in the sample. The sample proportion is then computed in cell D6 by dividing the number of Yes responses in cell D5 by the sample size in cell D3.
Cells D8:D10 are used to compute the appropriate z value. The confidence coefficient (0.95) is entered into cell D8 and the level of significance (α) is computed in cell D9 by entering the formula =1-D8. The z value corresponding to an upper tail area of α/2 is computed by entering the formula =NORMSINV(1-D9/2) into cell D10. The value worksheet
the standard error using the sample proportion and the sample size as inputs. The formula =D10*D12 is entered into cell D13 to compute the margin of error.
Cells D15:D17 provide the point estimate and the lower and upper limits for a confidence interval. The point estimate in cell D15 is the sample proportion. The lower and upper limits in cells D16 and D17 are obtained by subtracting and adding the margin of error to the point estimate. We note that the 95% confidence interval for the proportion of women golfers who are satisfied with the availability of tee times is .4076 to .4724.
A Template for Other Problems: The worksheet in Figure 8.11 can be used as a template for developing confidence intervals about a population proportion p. To use this worksheet for another problem of this type, we must first enter the new problem data in column A. The response of interest would then be typed in cell D4 and the ranges for the formulas in cells D3 and D5 would be revised to correspond to the new data. After doing so, the point estimate and a 95% confidence interval will be displayed in cells D15:D17. If a confidence interval with a different confidence coefficient is desired, we simply change the value in cell D8.
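For readers working outside Excel, the same worksheet steps can be mirrored in Python. This is a rough sketch, not the book's method: the function name `proportion_interval` is hypothetical, and because the 900 individual responses are not reproduced in the text, the list below is stand-in data built from the reported counts (396 Yes, 504 No). `statistics.NormalDist().inv_cdf` from the Python standard library plays the role of NORMSINV.

```python
import math
from statistics import NormalDist

def proportion_interval(responses, response_of_interest, confidence=0.95):
    """Mirror the worksheet: count, sample proportion, z value, margin, limits."""
    n = len(responses)                                          # like COUNTA
    count = sum(r == response_of_interest for r in responses)   # like COUNTIF
    p_bar = count / n                                           # sample proportion
    alpha = 1 - confidence                                      # level of significance
    z = NormalDist().inv_cdf(1 - alpha / 2)                     # like NORMSINV(1-alpha/2)
    std_error = math.sqrt(p_bar * (1 - p_bar) / n)              # estimated standard error
    margin = z * std_error                                      # margin of error
    return p_bar, margin, p_bar - margin, p_bar + margin

# Stand-in data: only the counts are given in the text, not the raw responses.
responses = ["Yes"] * 396 + ["No"] * 504
p_bar, margin, lower, upper = proportion_interval(responses, "Yes")
print(f"point estimate = {p_bar:.2f}, margin of error = {margin:.4f}")
print(f"95% interval: {lower:.4f} to {upper:.4f}")
```

Changing `response_of_interest` to "No" or `confidence` to another coefficient corresponds to editing cells D4 and D8 in the worksheet.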
Determining the Sample Size
Let us consider the question of how large the sample size should be to obtain an estimate of a population proportion at a specified level of precision. The rationale for the sample size determination in developing interval estimates of p is similar to the rationale used in Section 8.3 to determine the sample size for estimating a population mean.
Previously in this section we said that the margin of error associated with an interval estimate of a population proportion is $z_{\alpha/2}\sqrt{\bar{p}(1-\bar{p})/n}$. The margin of error is based on the value of $z_{\alpha/2}$, the sample proportion $\bar{p}$, and the sample size n. Larger sample sizes provide a smaller margin of error and better precision.
Let E denote the desired margin of error.

$$E = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$

Solving this equation for n provides a formula for the sample size that will provide a margin of error of size E.

$$n = \frac{(z_{\alpha/2})^2\,\bar{p}(1-\bar{p})}{E^2}$$

Note, however, that we cannot use this formula to compute the sample size that will provide the desired margin of error because $\bar{p}$ will not be known until after we select the sample. What we need, then, is a planning value for $\bar{p}$ that can be used to make the computation. Using p* to denote the planning value for $\bar{p}$, the following formula can be used to compute the sample size that will provide a margin of error of size E.

$$n = \frac{(z_{\alpha/2})^2\,p^*(1-p^*)}{E^2} \tag{8.7}$$
In practice, the planning value p* can be chosen by one of the following procedures.
1. Use the sample proportion from a previous sample of the same or similar units.
2. Use a pilot study to select a preliminary sample. The sample proportion from this sample can be used as the planning value, p*.
3. Use judgment or a “best guess” for the value of p*.
4. If none of the preceding alternatives apply, use a planning value of p* = .50.
Let us return to the survey of women golfers and assume that the company is interested in conducting a new survey to estimate the current proportion of the population of women golfers who are satisfied with the availability of tee times. How large should the sample be if the survey director wants to estimate the population proportion with a margin of error of .025 at 95% confidence? With E = .025 and $z_{\alpha/2} = 1.96$, we need a planning value p* to answer the sample size question. Using the previous survey result of $\bar{p}$ = .44 as the planning value p*, equation (8.7) shows that

$$n = \frac{(z_{\alpha/2})^2\,p^*(1-p^*)}{E^2} = \frac{(1.96)^2(.44)(1-.44)}{(.025)^2} = 1514.5$$

Thus, the sample size must be at least 1514.5 women golfers to satisfy the margin of error requirement. Rounding up to the next integer value indicates that a sample of 1515 women golfers is recommended to satisfy the margin of error requirement.
The fourth alternative suggested for selecting a planning value p* is to use p* = .50. This value of p* is frequently used when no other information is available. To understand why, note that the numerator of equation (8.7) shows that the sample size is proportional to the quantity p*(1 − p*). A larger value for the quantity p*(1 − p*) will result in a larger sample size. Table 8.5 gives some possible values of p*(1 − p*). Note that the largest value of p*(1 − p*) occurs when p* = .50. Thus, in case of any uncertainty about an appropriate planning value, we know that p* = .50 will provide the largest sample size recommendation. In effect, we play it safe by recommending the largest possible sample size. If the sample proportion turns out to be different from the .50 planning value, the margin of error will be smaller than anticipated. Thus, in using p* = .50, we guarantee that the sample size will be sufficient to obtain the desired margin of error.
In the survey of women golfers example, a planning value of p* = .50 would have provided the sample size

$$n = \frac{(z_{\alpha/2})^2\,p^*(1-p^*)}{E^2} = \frac{(1.96)^2(.50)(1-.50)}{(.025)^2} = 1536.6$$

Thus, a slightly larger sample size of 1537 women golfers would be recommended.
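The two sample size recommendations for the women golfers survey can be checked with a short Python sketch of equation (8.7). The function name is hypothetical, and the candidate planning values in the loop are illustrative choices (not necessarily the entries of Table 8.5); the loop simply shows that p*(1 − p*) peaks at p* = .50.

```python
import math

def sample_size_for_proportion(z, p_star, E):
    """Equation (8.7): n = z^2 * p*(1 - p*) / E^2, rounded up to the next integer."""
    return math.ceil(z ** 2 * p_star * (1 - p_star) / E ** 2)

# Planning value from the previous survey (p* = .44) versus the conservative p* = .50.
print(sample_size_for_proportion(1.96, 0.44, 0.025))  # 1515
print(sample_size_for_proportion(1.96, 0.50, 0.025))  # 1537

# p*(1 - p*) for a few illustrative planning values; the largest occurs at .50.
for p_star in (0.10, 0.30, 0.40, 0.50, 0.60, 0.70, 0.90):
    print(f"p* = {p_star:.2f}  p*(1 - p*) = {p_star * (1 - p_star):.2f}")
```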
Methods
31. A simple random sample of 400 individuals provides 100 Yes responses.
a. What is the point estimate of the proportion of the population that would provide Yes responses?
b. What is your estimate of the standard error of $\bar{p}$, $\sigma_{\bar{p}}$?
c. Compute the 95% confidence interval for the population proportion.
32. A simple random sample of 800 elements generates a sample proportion $\bar{p}$ = .70.
a. Provide a 90% confidence interval for the population proportion.
b. Provide a 95% confidence interval for the population proportion.
33. In a survey, the planning value for the population proportion is p* = .35. How large a sample should be taken to provide a 95% confidence interval with a margin of error of .05?
34. At 95% confidence, how large a sample should be taken to obtain a margin of error of .03 for the estimation of a population proportion? Assume that past data are not available for developing a planning value for p*.
Applications
35. A survey of 611 office workers investigated telephone answering practices, including how often each office worker was able to answer incoming telephone calls and how often incoming telephone calls went directly to voice mail (USA Today, April 21, 2002). A total of 281 office workers indicated that they never need voice mail and are able to take every telephone call.
a. What is the point estimate of the proportion of the population of office workers who are able to take every telephone call?
b. At 90% confidence, what is the margin of error?
c. What is the 90% confidence interval for the proportion of the population of office workers who are able to take every telephone call?
36. A survey by the Society for Human Resource Management asked 346 job seekers why employees change jobs so frequently (The Wall Street Journal, March 28, 2000). The answer selected most (152 times) was “higher compensation elsewhere.”
a. What is the point estimate of the proportion of job seekers who would select “higher compensation elsewhere” as the reason for changing jobs?
b. What is the 95% confidence interval estimate of the population proportion?
37. Towers Perrin, a New York human resources consulting firm, conducted a survey of 1100 employees at medium-sized and large companies to determine how dissatisfied employees were with their jobs (The Wall Street Journal, January 29, 2003). The data are shown in the file named Job Satisfaction. A response of Yes indicates that the employee strongly dislikes the current work experience.
a. What is the point estimate of the proportion of the population of employees who strongly dislike the current work experience?
b. At 95% confidence, what is the margin of error?
NOTES AND COMMENTS
The desired margin of error for estimating a population proportion is almost always .10 or less. In national public opinion polls conducted by organizations such as Gallup and Harris, a .03 or .04 margin of error is common. With such margins of error, equation (8.7) will almost always provide a sample size that is large enough to satisfy the requirements of np ≥ 5 and n(1 − p) ≥ 5 for using a normal distribution as an approximation for the sampling distribution of $\bar{p}$.
c. What is the 95% confidence interval for the proportion of the population of employees who strongly dislike the current work experience?
d. Towers Perrin estimates that it costs employers one-third of an hourly employee’s annual salary to find a successor and as much as 1.5 times the annual salary to find a successor for a highly compensated employee. What message did this survey send
c. How large a sample should be taken if the desired margin of error is .03?
39. An Employee Benefit Research Institute survey explored the reasons small business employers offer a retirement plan to their employees (USA Today, April 4, 2000). The reason “competitive advantage in recruitment/retention” was anticipated 33% of the time.
a. What sample size is recommended if a survey goal is to estimate the proportion of small business employers who offer a retirement plan primarily for “competitive advantage in recruitment/retention” with a margin of error of .03? Use 95% confidence.
b. Repeat part (a) using 99% confidence.
40. The professional baseball home run record of 61 home runs in a season was held for 37 years by Roger Maris of the New York Yankees. However, between 1998 and 2001, three players (Mark McGwire, Sammy Sosa, and Barry Bonds) broke the standard set by Maris, with Bonds holding the current record of 73 home runs in a single season. With the long-standing home run record being broken and with many other new offensive records being set, suspicion arose that baseball players might be using illegal muscle-building drugs called steroids. A USA Today/CNN/Gallup poll found that 86% of baseball fans think professional baseball players should be tested for steroids (USA Today, July 8, 2002). If 650 baseball fans were included in the sample, compute the margin of error and the 95% confidence interval for the population proportion of baseball fans who think professional baseball players should be tested for steroids.
41. An American Express retail survey found that 16% of U.S. consumers used the Internet to buy gifts during the holiday season (USA Today, January 18, 2000). If 1285 customers participated in the survey, what is the margin of error and what is the interval estimate of the population proportion of customers using the Internet to buy gifts? Use 95% confidence.
42. A poll for the presidential campaign sampled 491 potential voters in June. A primary purpose of the poll was to obtain an estimate of the proportion of potential voters who favor each candidate. Assume a planning value of p* = .50 and a 95% confidence level.
a. For p* = .50, what was the planned margin of error for the June poll?
b. Closer to the November election, better precision and smaller margins of error are desired. Assume the following margins of error are requested for surveys to be conducted during the presidential campaign. Compute the recommended sample size for each survey.
43. A Phoenix Wealth Management/Harris Interactive survey of 1500 individuals with net worth of $1 million or more provided a variety of statistics on wealthy people (BusinessWeek, September 22, 2003). The previous three-year period had been bad for the stock market, which motivated some of the questions asked.
a. The survey reported that 53% of the respondents lost 25% or more of their portfolio value over the past three years. Develop a 95% confidence interval for the proportion of wealthy people who lost 25% or more of their portfolio value over the past three years.
b. The survey reported that 31% of the respondents feel they have to save more for retirement to make up for what they lost. Develop a 95% confidence interval for the population proportion.
c. Five percent of the respondents gave $25,000 or more to charity over the previous year. Develop a 95% confidence interval for the proportion who gave $25,000 or more to charity.
d. Compare the margin of error for the interval estimates in parts (a), (b), and (c). How is the margin of error related to $\bar{p}$? When the same sample is being used to estimate a variety of proportions, which of the proportions should be used to choose the planning value p*? Why do you think p* = .50 is often used in these cases?
Summary
In this chapter we presented methods for developing interval estimates of a population mean and a population proportion. A point estimator may or may not provide a good estimate of a population parameter. The use of an interval estimate provides a measure of the precision of an estimate. Both the interval estimate of the population mean and the population proportion are of the form: point estimate ± margin of error.
We presented interval estimates for a population mean for two cases. In the σ known case, historical data or other information is used to develop an estimate of σ prior to taking a sample. Analysis of new sample data then proceeds based on the assumption that σ is known. In the σ unknown case, the sample data are used to estimate both the population mean and the population standard deviation. The final choice of which interval estimation procedure to use depends upon the analyst’s understanding of which method provides the best estimate of σ.
In the σ known case, the interval estimation procedure is based on the assumed value of σ and the use of the standard normal distribution. In the σ unknown case, the interval estimation procedure uses the sample standard deviation s and the t distribution. In both cases the quality of the interval estimates obtained depends on the distribution of the population and the sample size. If the population is normally distributed, the interval estimates will be exact in both cases, even for small sample sizes. If the population is not normally distributed, the interval estimates obtained will be approximate. Larger sample sizes will provide better approximations, but the more highly skewed the population is, the larger the sample size needs to be to obtain a good approximation. Practical advice about the sample size necessary to obtain good approximations was included in Sections 8.1 and 8.2. In most cases a sample of size 30 or more will provide good approximate confidence intervals.
The general form of the interval estimate for a population proportion is p̄ ± margin of error. In practice the sample sizes used for interval estimates of a population proportion are generally large. Thus, the interval estimation procedure is based on the standard normal distribution.
Often a desired margin of error is specified prior to developing a sampling plan. We showed how to choose a sample size large enough to provide the desired precision.
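For reference, the procedures summarized above can be written compactly as follows. This is a sketch in standard notation, assumed rather than copied from the chapter: z_{α/2} and t_{α/2} are the upper-tail critical values for confidence coefficient 1 − α, E is the desired margin of error, and p* is the planning value for the proportion.

```latex
% Interval estimate of a population mean, sigma known
\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}

% Interval estimate of a population mean, sigma unknown (n - 1 degrees of freedom)
\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}

% Interval estimate of a population proportion
\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}

% Sample size for a desired margin of error E (mean, then proportion)
n = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{E^{2}}
\qquad
n = \frac{(z_{\alpha/2})^{2}\,p^{*}(1-p^{*})}{E^{2}}
```

Each interval has the form point estimate ± margin of error, and each sample-size rule comes from setting the corresponding margin of error equal to E and solving for n.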
Glossary
Interval estimate: An estimate of a population parameter that provides an interval believed to contain the value of the parameter. For the interval estimates in this chapter, it has the form: point estimate ± margin of error.
Margin of error: The value added to and subtracted from a point estimate in order to develop an interval estimate of a population parameter.
σ known: The case when historical data or other information provides a good value for the population standard deviation prior to taking a sample. The interval estimation procedure uses this known value of σ in computing the margin of error.
Confidence level: The confidence associated with an interval estimate. For example, if an interval estimation procedure provides intervals such that 95% of the intervals formed using the procedure will include the population parameter, the interval estimate is said to be constructed at the 95% confidence level.
Confidence coefficient: The confidence level expressed as a decimal value. For example, .95 is the confidence coefficient for a 95% confidence level.
Confidence interval: Another name for an interval estimate.
Level of significance: The probability that the interval estimation procedure will generate an interval that does not contain µ.
σ unknown: The more common case when no good basis exists for estimating the population standard deviation prior to taking the sample. The interval estimation procedure uses the sample standard deviation s in computing the margin of error.
t distribution: A family of probability distributions that can be used to develop an interval estimate of a population mean whenever the population standard deviation σ is unknown and is estimated by the sample standard deviation s.
Degrees of freedom: A parameter of the t distribution. When the t distribution is used in the computation of an interval estimate of a population mean, the appropriate t distribution has n − 1 degrees of freedom, where n is the size of the simple random sample.
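The glossary terms can be tied together numerically. The sketch below goes from a confidence coefficient to a level of significance, then to a z value, a margin of error, and a σ known interval estimate. The variable names are illustrative, and the figures (sample mean $50,000, σ = $20,500, n = 400) are borrowed from the home buyer survey in exercise 44 purely as an example.

```python
from math import sqrt
from statistics import NormalDist

confidence_coefficient = 0.95            # 95% confidence level as a decimal
alpha = 1 - confidence_coefficient       # level of significance
z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point, about 1.96

# Illustrative sigma-known inputs (assumed values taken from exercise 44).
x_bar, sigma, n = 50_000, 20_500, 400

margin_of_error = z * sigma / sqrt(n)
interval = (x_bar - margin_of_error, x_bar + margin_of_error)

print(f"margin of error: {margin_of_error:.0f}")
print(f"interval estimate: {interval[0]:.0f} to {interval[1]:.0f}")
```

Raising the confidence coefficient lowers alpha, increases z, and therefore widens the interval for the same data.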
Supplementary Exercises
44. A survey of first-time home buyers found that the mean of annual household income was $50,000 (http://CNBC.com, July 11, 2000). Assume the survey used a sample of 400 first-time home buyers and assume that the population standard deviation is $20,500.
a. At 95% confidence, what is the margin of error for this study?
b. What is the 95% confidence interval for the population mean annual household income for first-time home buyers?
45. A survey conducted by the American Automobile Association showed that a family of four spends an average of $215.60 per day while on vacation. Suppose a sample of 64 families of four vacationing at Niagara Falls resulted in a sample mean of $252.45 per day and a sample standard deviation of $74.50.
a. Develop a 95% confidence interval estimate of the mean amount spent per day by a family of four visiting Niagara Falls.
b. Based on the confidence interval from part (a), does it appear that the population mean amount spent per day by families visiting Niagara Falls differs from the mean reported by the American Automobile Association? Explain.
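When σ is unknown, as in exercise 45, the interval uses the sample standard deviation and a t value with n − 1 degrees of freedom. The following is a minimal sketch of that calculation. The helper `t_interval` is hypothetical, and the t value 1.998 for 63 degrees of freedom is an assumed table lookup, since Python's standard library has no t distribution.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_value):
    """Return (lower, upper) for x_bar +/- t_value * s / sqrt(n)."""
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Assumed table value: t_.025 with 64 - 1 = 63 degrees of freedom.
T_025_DF63 = 1.998

lower, upper = t_interval(x_bar=252.45, s=74.50, n=64, t_value=T_025_DF63)
print(f"{lower:.2f} to {upper:.2f}")

# Part (b) style check: does the reported mean fall outside the interval?
reported_mean = 215.60
differs = not (lower <= reported_mean <= upper)
print(differs)
```

Passing the t value in as an argument keeps the sketch free of third-party packages; with a statistics library available, the same value could be computed instead of looked up.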
46. The motion picture Harry Potter and the Sorcerer’s Stone shattered the box office debut record previously held by The Lost World: Jurassic Park (The Wall Street Journal, November 19, 2001). A sample of 100 movie theaters showed that the mean three-day weekend gross was $25,467 per theater. The sample standard deviation was $4980.
a. What is the margin of error for this study? Use 95% confidence.
b. What is the 95% confidence interval estimate for the population mean weekend gross per theater?
c. The Lost World took in $72.1 million in its first three-day weekend. Harry Potter and the Sorcerer’s Stone was shown in 3672 theaters. What is an estimate of the total Harry Potter and the Sorcerer’s Stone took in during its first three-day weekend?
d. An Associated Press article claimed Harry Potter “shattered” the box office debut record held by The Lost World. Do your results agree with this claim?
47. Many stock market observers say that when the P/E ratio for stocks gets over 20 the market is overvalued. The P/E ratio is the stock price divided by the most recent 12 months of earnings. Suppose you are interested in seeing whether the current market is overvalued and would also like to know the proportion of companies that pay dividends. A random sample of 30 companies listed on the New York Stock Exchange (NYSE) is provided (Barron’s, January 19, 2004).
CD file: NYSEStocks
a. What is a point estimate of the P/E ratio for the population of stocks listed on the New York Stock Exchange? Develop a 95% confidence interval.
b. Based on your answer to part (a), do you believe that the market is overvalued?
c. What is a point estimate of the proportion of companies on the NYSE that pay dividends? Is the sample size large enough to justify using the normal distribution to construct a confidence interval for this proportion? Why or why not?
48. US Airways conducted a number of studies that indicated a substantial savings could be obtained by encouraging Dividend Miles frequent flyer customers to redeem miles and schedule award flights online (US Airways Attaché, February 2003). One study collected data on the amount of time required to redeem miles and schedule an award flight over the telephone. A sample showing the time in minutes required for each of 150 award flights scheduled by telephone is contained in the data set Flights. Use Excel to help answer the following questions.
a. What is the sample mean number of minutes required to schedule an award flight by telephone?
b. What is the 95% confidence interval for the population mean time to schedule an award flight by telephone?
c. Assume a telephone ticket agent works 7.5 hours per day. How many award flights can one ticket agent be expected to handle a day?
d. Discuss why this information supported US Airways’ plans to use an online system to reduce costs.
49. A survey by Accountemps asked a sample of 200 executives to provide data on the number of minutes per day office workers waste trying to locate mislabeled, misfiled, or misplaced items. Data consistent with this survey are contained in the data set ActTemps.
a. Use ActTemps to develop a point estimate of the number of minutes per day office workers waste trying to locate mislabeled, misfiled, or misplaced items.
b. What is the sample standard deviation?
c. What is the 95% confidence interval for the mean number of minutes wasted per day?
50. Mileage tests are conducted for a particular model of automobile. If a 98% confidence interval with a margin of error of 1 mile per gallon is desired, how many automobiles should be used in the test? Assume that preliminary mileage tests indicate the standard deviation is 2.6 miles per gallon.
51. In developing patient appointment schedules, a medical center wants to estimate the mean time that a staff member spends with each patient. How large a sample should be taken if the desired margin of error is two minutes at a 95% level of confidence? How large a sample should be taken for a 99% level of confidence? Use a planning value for the population standard deviation of eight minutes.
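Sample-size questions of this kind apply n = (z·σ/E)² and round up to the next whole unit. The sketch below shows the arithmetic for the planning values in exercise 51. The function name `sample_size_for_mean` is hypothetical, and z is computed from the standard normal distribution rather than read from a table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin, via n = (z*sigma/E)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would miss the desired margin of error.
    return ceil((z * sigma / margin) ** 2)


# Planning value sigma = 8 minutes, desired margin of error E = 2 minutes.
n_95 = sample_size_for_mean(sigma=8, margin=2, confidence=0.95)
n_99 = sample_size_for_mean(sigma=8, margin=2, confidence=0.99)
print(n_95, n_99)
```

Moving from 95% to 99% confidence raises z, so the required sample grows even though σ and E are unchanged.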
52. Annual salary plus bonus data for chief executive officers are presented in the BusinessWeek Annual Pay Survey. A preliminary sample showed that the standard deviation is $675 with data provided in thousands of dollars. How many chief executive officers should be in a sample if we want to estimate the population mean annual salary plus bonus with a margin of error of $100,000? (Note: The desired margin of error would be E = 100 if the data are in thousands of dollars.) Use 95% confidence.
53. The National Center for Education Statistics reported that 47% of college students work to pay for tuition and living expenses. Assume that a sample of 450 college students was used in the study.
a. Provide a 95% confidence interval for the population proportion of college students who work to pay for tuition and living expenses.
b. Provide a 99% confidence interval for the population proportion of college students who work to pay for tuition and living expenses.
c. What happens to the margin of error as the confidence is increased from 95% to 99%?
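The effect of the confidence level on a proportion interval can be seen by holding p̄ and n fixed and changing only z. This is a minimal sketch using the figures from exercise 53. The helper `proportion_interval` is a hypothetical name, and the z values come from Python's standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_bar, n, confidence):
    """Return (margin, lower, upper) for p_bar +/- z * sqrt(p_bar(1-p_bar)/n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return margin, p_bar - margin, p_bar + margin


# Sample proportion .47 from n = 450 students.
margin_95, low_95, high_95 = proportion_interval(0.47, 450, 0.95)
margin_99, low_99, high_99 = proportion_interval(0.47, 450, 0.99)

print(f"95%: {low_95:.4f} to {high_95:.4f} (margin {margin_95:.4f})")
print(f"99%: {low_99:.4f} to {high_99:.4f} (margin {margin_99:.4f})")
```

The standard error sqrt(p̄(1 − p̄)/n) is identical in both calls, so the wider 99% interval is due entirely to the larger z value.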
54. A USA Today/CNN/Gallup survey of 369 working parents found 200 who said they spend too little time with their children because of work commitments.
a. What is the point estimate of the proportion of the population of working parents who feel they spend too little time with their children because of work commitments?
b. At 95% confidence, what is the margin of error?
c. What is the 95% confidence interval estimate of the population proportion of working parents who feel they spend too little time with their children because of work commitments?
55. Which would be hardest for you to give up: Your computer or your television? In a recent survey of 1677 U.S. Internet users, 74% of the young tech elite (average age of 22) say their computer would be very hard to give up (PC Magazine, February 3, 2004). Only 48% say their television would be very hard to give up.
a. Develop a 95% confidence interval for the proportion of the young tech elite that would find it very hard to give up their computer.
b. Develop a 99% confidence interval for the proportion of the young tech elite that would find it very hard to give up their television.
c. In which case, part (a) or part (b), is the margin of error larger? Explain why.
56. A Roper Starch survey asked employees ages 18 to 29 whether they would prefer better health insurance or a raise in salary (USA Today, September 5, 2000). Answer the following questions assuming 340 of 500 employees said they would prefer better health insurance over a raise.
a. What is the point estimate of the proportion of employees ages 18 to 29 who would prefer better health insurance?
b. What is the 95% confidence interval estimate of the population proportion?
57. The 2003 Statistical Abstract of the United States reported the percentage of people 18 years of age and older who smoke. Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke of .30.
a. How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of .02? Use 95% confidence.
b. Assume that the study uses your sample size recommendation in part (a) and finds 520 smokers. What is the point estimate of the proportion of smokers in the population?
c. What is the 95% confidence interval for the proportion of smokers in the population?
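For a proportion, the sample-size rule is n = z²·p*(1 − p*)/E², rounded up. The sketch below applies it to the planning value in exercise 57 and, for comparison, to the conservative choice p* = .50. The function name `sample_size_for_proportion` is hypothetical, and z is computed rather than read from a table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_star, margin, confidence):
    """Smallest n meeting the margin, via n = z^2 * p*(1 - p*) / E^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p_star * (1 - p_star) / margin ** 2)


# Planning value p* = .30, margin of error E = .02, 95% confidence.
n_planned = sample_size_for_proportion(0.30, 0.02, 0.95)

# Conservative alternative when no planning value is available: p* = .50.
n_conservative = sample_size_for_proportion(0.50, 0.02, 0.95)

print(n_planned, n_conservative)
```

Because p*(1 − p*) peaks at p* = .50, the conservative choice never gives a smaller sample than any other planning value.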
58. A well-known bank credit card firm wishes to estimate the proportion of credit card holders who carry a nonzero balance at the end of the month and incur an interest charge. Assume that the desired margin of error is .03 at 98% confidence.
a. How large a sample should be selected if it is anticipated that roughly 70% of the firm’s card holders carry a nonzero balance at the end of the month?
b. How large a sample should be selected if no planning value for the proportion could
a. What is the point estimate of the proportion of the population of business travelers who believe a frequent flyer program is the most important factor when choosing an airline carrier?
b. Develop a 95% confidence interval estimate of the population proportion.
c. How large a sample would be required to report the margin of error of .01 at 95% confidence? Would you recommend that USA Today attempt to provide this degree of precision? Why or why not?
The goal of Bock Investment Services (BIS) is to be the leading money market advisory service in South Carolina. To provide better service for its present clients and to attract new clients, BIS developed a weekly newsletter. BIS is considering adding a new feature to the newsletter that will report the results of a weekly telephone survey of fund managers. To investigate the feasibility of offering this service, and to determine what type of information to include in the newsletter, BIS selected a simple random sample of 45 money market funds. A portion of the data obtained is shown in Table 8.6, which reports fund assets and yields for the past 7 and 30 days. Before calling the money market fund managers to obtain additional data, BIS decided to do some preliminary analysis of the data already collected.
Managerial Report
1. Use appropriate descriptive statistics to summarize the data on assets and yields for the money market funds.
2. Develop a 95% confidence interval estimate of the mean assets, mean 7-day yield, and mean 30-day yield for the population of money market funds. Provide a managerial interpretation of each interval estimate.
3. Discuss the implication of your findings in terms of how BIS could use this type of information in preparing its weekly newsletter.
4. What other information would you recommend that BIS gather to provide the most useful information to its clients?
Gulf Real Estate Properties, Inc., is a real estate firm located in southwestern Florida. The company, which advertises itself as “expert in the real estate market,” monitors condominium sales by collecting data on location, list price, sale price, and number of days it takes to sell each unit. Each condominium is classified as Gulf View if it is located directly on the Gulf of Mexico or No Gulf View if it is located on the bay or a golf course, near but not on the Gulf. Sample data from the Multiple Listing Service in Naples, Florida, provided recent sales data for 40 Gulf View condominiums and 18 No Gulf View condominiums.* Prices are in thousands of dollars. The data are shown in Table 8.7.
Managerial Report
1. Use appropriate descriptive statistics to summarize each of the three variables for the 40 Gulf View condominiums.
2. Use appropriate descriptive statistics to summarize each of the three variables for the 18 No Gulf View condominiums.
3. Compare your summary results. Discuss any specific statistical results that would help a real estate agent understand the condominium market.
*Data based on condominium sales reported in the Naples MLS (Coldwell Banker, June 2000).
TABLE 8.6 DATA FOR BOCK INVESTMENT SERVICES
[Table data not recovered. Column headings: Assets, 7-Day, 30-Day. CD file: Bock]
Source: Barron’s, October 3, 1994.
[Table 8.7 data not recovered. Panels: Gulf View Condominiums, No Gulf View Condominiums. Column headings in each panel: List Price, Sale Price, Days to Sell]
4. Develop a 95% confidence interval estimate of the population mean sales price and population mean number of days to sell for Gulf View condominiums. Interpret your results.
5. Develop a 95% confidence interval estimate of the population mean sales price and population mean number of days to sell for No Gulf View condominiums. Interpret your results.
6. Assume the branch manager requested estimates of the mean selling price of Gulf View condominiums with a margin of error of $40,000 and the mean selling price of No Gulf View condominiums with a margin of error of $15,000. Using 95% confidence, how large should the sample sizes be?
7. Gulf Real Estate Properties just signed contracts for two new listings: a Gulf View condominium with a list price of $589,000 and a No Gulf View condominium with a list price of $285,000. What is your estimate of the final selling price and number of days required to sell each of these units?
Metropolitan Research, Inc., a consumer research organization, conducts surveys designed to evaluate a wide variety of products and services available to consumers. In one particular study, Metropolitan looked at consumer satisfaction with the performance of automobiles produced by a major Detroit manufacturer. A questionnaire sent to owners of one of the manufacturer’s full-sized cars revealed several complaints about early transmission problems. To learn more about the transmission failures, Metropolitan used a sample of actual transmission repairs provided by a transmission repair firm in the Detroit area. The following data show the actual number of miles driven for 50 vehicles at the time of transmission failure.
39,323 89,641 94,219 116,803 92,857 63,436 65,605 85,861
64,342 61,978 67,998 59,817 101,769 95,774 121,352 69,568
74,425 67,202 118,444 53,500 79,294 64,544 86,813 116,269
37,831 89,341 73,341 85,288 138,114 53,402 85,586 82,256
77,539 88,798
Managerial Report
1. Use appropriate descriptive statistics to summarize the transmission failure data.
2. Develop a 95% confidence interval for the mean number of miles driven until transmission failure for the population of automobiles with transmission failure. Provide a managerial interpretation of the interval estimate.
3. Discuss the implication of your statistical finding in terms of the belief that some owners of the automobiles experienced early transmission failures.
4. How many repair records should be sampled if the research firm wants the population mean number of miles driven until transmission failure to be estimated with a margin of error of 5000 miles? Use 95% confidence.
5. What other information would you like to gather to evaluate the transmission failure problem more fully?
CD file: Auto
CHAPTER 9
Hypothesis Tests
CONTENTS
STATISTICS IN PRACTICE: JOHN MORRELL & COMPANY
9.1 DEVELOPING NULL AND ALTERNATIVE HYPOTHESES
Testing Research Hypotheses
Testing the Validity of a Claim
9.4 POPULATION MEAN: σ UNKNOWN
One-Tailed Test
Two-Tailed Test
Using Excel
Summary and Practical Advice
9.5 POPULATION PROPORTION
Using Excel
Summary