
Solution manual statistics for business and economics


DOCUMENT INFORMATION

Basic information

Title: Statistics For Business And Economics
Authors: James T. McClave, P. George Benson, Terry Sincich
Advisor: Nancy S. Boudreau
Institution: Bowling Green State University
Type: solutions manual
Year of publication: 2023
City: Bowling Green
Number of pages: 570
File size: 9.48 MB


INSTRUCTOR'S SOLUTIONS MANUAL

to Accompany

James T. McClave

P. George Benson

and Terry Sincich's

STATISTICS FOR BUSINESS AND ECONOMICS

Tenth Edition

Nancy S. Boudreau

Bowling Green State University


Preface

This solutions manual is designed to accompany the text, Statistics for Business and Economics, Tenth Edition, by James T. McClave, P. George Benson, and Terry Sincich. It provides answers to most even-numbered exercises for each chapter in the text. Other methods of solution may also be appropriate; however, the author has presented one that she believes to be most instructive to the beginning Statistics student.

This manual is provided to help instructors save time in preparing presentations of the solutions and to possibly provide another point of view regarding their meaning.

Some of the exercises are subjective in nature. Subjective decisions regarding these exercises have been made and are explained by the author. Solutions based on these decisions are presented; the solution to this type of exercise is often most instructive. When an alternative interpretation of an exercise may occur, the author has often addressed it and given justification for the approach taken.

I would like to thank Kelly Barber for creating the art work and for typing this work.

Bowling Green, Ohio


Statistics, Data,

1.2 Descriptive statistics utilizes numerical and graphical methods to look for patterns, to summarize, and to present the information in a set of data. Inferential statistics utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data.

1.4 The first element of inferential statistics is the population of interest. The population is a set of existing units. The second element is one or more variables that are to be investigated. A variable is a characteristic or property of an individual population unit. The third element is the sample. A sample is a subset of the units of a population. The fourth element is the inference about the population based on information contained in the sample. A statistical inference is an estimate, prediction, or generalization about a population based on information contained in a sample. The fifth and final element of inferential statistics is the measure of reliability for the inference. The reliability of an inference is how confident one is that the inference is correct.

1.6 Quantitative data are measurements that are recorded on a meaningful numerical scale. Qualitative data are measurements that are not numerical in nature; they can only be classified into one of a group of categories.

1.8 A population is a set of existing units such as people, objects, transactions, or events. A sample is a subset of the units of a population.

1.10 An inference without a measure of reliability is nothing more than a guess. A measure of reliability separates statistical inference from fortune telling or guessing. Reliability gives a measure of how confident one is that the inference is correct.

1.12 Statistical thinking involves applying rational thought processes to critically assess data and inferences made from the data. It involves not taking all data and inferences presented at face value, but rather making sure the inferences and data are valid.

1.14 a The two variables measured are ‘type of credit card used’ and ‘amount of purchase.’ ‘Type of credit card used’ is qualitative. It has no meaningful number associated with it, only the name of the card used. ‘Amount of purchase’ is quantitative. It has a meaningful number associated with it.

b In Study 1, it says that all purchases were tracked. Thus, the data represent a population.

1.16 a High school GPA is a number usually between 0.0 and 4.0. Therefore, it is quantitative.

b Honors/awards would have responses that name things. Therefore, it would be qualitative.


c The scores on the SAT's are numbers between 200 and 800. Therefore, it is quantitative.

d Gender is either male or female. Therefore, it is qualitative.

e Parent's income is a number: $25,000, $45,000, etc. Therefore, it is quantitative.

f Age is a number: 17, 18, etc. Therefore, it is quantitative.

1.18 a 1 The variable of interest is the status of a company’s e-commerce strategy.

1.20 a The population of interest is the collection of computer security personnel at all U.S. corporations and government agencies.

b Surveys were sent to computer security personnel at all U.S. corporations and government agencies. However, in 2006, only 616 organizations responded to the survey. There could be nonresponse bias. Often, only those subjects with strong opinions will respond to a survey. Thus, the responses may not reflect what the population as a whole thinks.

c The variable measured in the survey is whether or not there was unauthorized use of computer systems at the firms during the year. Since the responses will be either ‘Yes’ or ‘No’, the variable is qualitative.

d If we assume that the responses were a random sample from the population, we could infer that about 52% of all computer security personnel will admit to unauthorized use of computer systems at their firms during the year.

1.22 a The data collection method used is a designed experiment.

b The experimental units in the study are the 50,000 smokers.

c The variable of interest is the age at which the scanning method first detects a tumor. Since this is a meaningful number, this variable is quantitative.


d The population of interest is the set of all smokers in the U.S. The sample of interest is the set of 50,000 smokers surveyed.

e The researchers want to compare the age at first detection for the 2 methods to see if one is more sensitive than the other.

1.24 a The variable of interest to the researchers is the rating of highway bridges.

b Since the rating of a bridge can be categorized as one of three possible values, it is

employing two or more professionals that use sampling methods in auditing their clients

b The four responses that were unusable could have been returned blank or could have been filled out incorrectly.

c Any time a survey is mailed it is questionable whether the returned questionnaires represent a random sample. Often, only those with very strong opinions return the surveys. In such a case, the returned surveys would not be representative of the entire population.

1.28 a The experimental units in this study are the 24 projects.

b The population from which the sample was selected is the set of all new software development projects.

c The variable of interest in this project is the outcome of reusing previously developed software for the new software development projects.

d In the sample, 9 of the 24 projects were judged failures. This is (9 / 24)*100% = 37.5%. We could infer that approximately 37.5% of all projects would be judged failures.

1.30 a The process being studied is the process of filling beverage cans with soft drink at CCSB's Wakefield plant.

b The variable of interest is the amount of carbon dioxide added to each can of beverage.

c The sampling plan was to monitor five filled cans every 15 minutes. The sample is the total number of cans selected.


d The company's immediate interest is learning about the process of filling beverage cans with soft drink at CCSB's Wakefield plant. To do this, they are measuring the amount of carbon dioxide added to a can of beverage to make an inference about the process of filling beverage cans. In particular, they might use the mean amount of carbon dioxide added to the sampled cans of beverage to estimate the mean amount of carbon dioxide added to all the cans on the process line.

e The technician would then be dealing with a population. The cans of beverage have already been processed. He/she is now interested in the outputs.


Methods for

2.2 a To find the frequency for each class, count the number of times each letter occurs. The frequencies for the three classes are:

Class   Frequency
X       8
Y       9
Z       3
Total   20

b The relative frequency for each class is found by dividing the frequency by the total sample size. The relative frequency for the class X is 8/20 = .40. The relative frequency for the class Y is 9/20 = .45. The relative frequency for the class Z is 3/20 = .15.

Class   Frequency   Relative Frequency
X       8           .40
Y       9           .45
Z       3           .15
Total   20          1.00
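The counting and dividing steps in parts a and b can be sketched in a few lines of Python. This is only an illustration: the raw sequence of letters is not reproduced in this manual, so the list `data` below is a hypothetical stand-in built to have the stated class counts, and the function name `relative_frequencies` is likewise an assumption.

```python
from collections import Counter

def relative_frequencies(observations):
    """Map each class to (frequency, relative frequency)."""
    counts = Counter(observations)
    n = len(observations)
    return {cls: (freq, freq / n) for cls, freq in sorted(counts.items())}

# Hypothetical data with the class counts from part a (8 X's, 9 Y's, 3 Z's)
data = ["X"] * 8 + ["Y"] * 9 + ["Z"] * 3

table = relative_frequencies(data)
for cls, (freq, rel) in table.items():
    print(f"{cls}  {freq:2d}  {rel:.2f}")
```

The relative frequencies always sum to 1, which is a quick check on the table.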

c The frequency bar chart is:

d The pie chart for the frequency distribution is:


2.4 a The variable summarized in the table is ‘Reason for requesting the installation of the passenger-side on-off switch.’ The values this variable could assume are: Infant, Child, Medical, Infant & Medical, Child & Medical, Infant & Child, and Infant & Child & Medical. Since the responses name something, the variable is qualitative.

b The relative frequencies are found by dividing the number of requests for each category by the total number of requests. For the category ‘Infant’, the relative frequency is 1,852/30,337 = .061. The rest of the relative frequencies are found in the table below:

Reason                     Frequency   Relative Frequency
Infant                     1,852       .061
Child                      17,148      .565
Medical                    8,377       .276
Infant & Medical           44          .001
Child & Medical            903         .030
Infant & Child             1,878       .062
Infant & Child & Medical   135         .004
Total                      30,337

c Using MINITAB, a pie chart of the data is:

[Figure: Pie Chart of Reason. Category (frequency, percent): Child (17,148, 56.5%); Medical (8,377, 27.6%); Infant & Child (1,878, 6.2%); Infant (1,852, 6.1%); Child & Medical (903, 3.0%); Infant & Child & Medical (135, 0.4%); Infant & Medical (44, 0.1%)]

d There are 4 categories where Medical is mentioned as a reason: Medical, Infant & Medical, Child & Medical, and Infant & Child & Medical. The sum of the frequencies for these 4 categories is 8,377 + 44 + 903 + 135 = 9,459. The proportion listing Medical as one of the reasons is 9,459/30,337 = .312.
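As a check on part d, the sum over the four categories that mention Medical can be reproduced directly. The sketch below uses the frequencies shown on the pie chart; the dictionary name `requests` and the string-matching shortcut are illustrative choices, not part of the original solution.

```python
# Frequencies for each reason, as shown on the pie chart in part c
requests = {
    "Infant": 1852,
    "Child": 17148,
    "Medical": 8377,
    "Infant & Medical": 44,
    "Child & Medical": 903,
    "Infant & Child": 1878,
    "Infant & Child & Medical": 135,
}

total = sum(requests.values())

# Add up every category whose label mentions Medical
medical = sum(count for reason, count in requests.items() if "Medical" in reason)
proportion = medical / total

print(total, medical, round(proportion, 3))
```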


2.6 a To find relative frequencies, we divide the frequencies of each category by the total number of incidents. The relative frequencies of the number of incidents for each of the cause categories are:

[Figure: bar chart of relative frequency by Management System Cause Category]

c The category with the highest relative frequency of incidents is Engineering and Design. The category with the lowest relative frequency of incidents is Training and Communication.

b Since the data were numbers (percentage of US labor and materials), the variable is quantitative. Once the data were collected, they were grouped into 4 categories.


c Using MINITAB, a pie chart of the data is:

[Figure: Pie Chart of Made in USA. Category (frequency, percent): 100% (64, 60.4%); 75-99% (20, 18.9%); 50-74% (18, 17.0%); <50% (4, 3.8%)]

About 60% of those surveyed believe that “Made in USA” means 100% US labor and


2.12 Using MINITAB, the side-by-side bar charts are:

[Figure: Chart of 1999, 2006 vs Use. Side-by-side bar charts of relative frequency (0.0 to 0.7) for the responses Yes, No, and Don't know in 1999 and 2006]

The relative frequency of unauthorized use of computer systems has decreased from

[Figure: Chart of Exposure, Opportunity, Content, Faculty vs Stars. Bar charts of the number of programs (0 to 16) receiving 2 to 5 stars on each criterion]

From these graphs, one can see that very few of the top 30 MBA programs got 5-stars in any criteria. In addition, about the same number of programs got 4 stars in each of the 4 criteria. The biggest difference in ratings among the 4 criteria was in the number of programs receiving 3-stars. More programs received 3-stars in Course Content than in any of the other criteria. Consequently, fewer programs received 2-stars in Course Content than in any of the other criteria.

b Since this chart lists the rankings of only the top 30 MBA programs in the world, it is reasonable that none of these best programs would be rated as 1-star on any criteria.


2.16

2.18 a The original data set has 1 + 3 + 5 + 7 + 4 + 3 = 23 observations.

b For the bottom row of the stem-and-leaf display:

The stem is 0.

The leaves are 0, 1, 2.

The numbers in the original data set are 0, 1, and 2.

c The dot plot corresponding to all the data points is:

2.20 a The measurement class that contains the highest proportion of respondents is “none”. Sixty-one percent of the respondents said that their companies did not outsource any computer security functions.

b From the graph, 6% of the respondents indicated that they outsourced between 20% and 40% of their computer security functions.

c The proportion of the 609 respondents who outsourced at least 40% of computer security functions is .04 + .01 + .01 = .06.

d The number of the 609 respondents who outsourced less than 20% of computer security functions is (.27 + .61)*609 = .88(609) = 535.92 ≈ 536.


2.22 a Using MINITAB, the stem-and-leaf display of the data is:

Stem-and-Leaf Display: SCORE

c The sanitation score of 84 is in bold in the stem-and-leaf display in part a


b Using MINITAB, the frequency histogram is:

[Figure: frequency histograms; the last horizontal axis is labeled DDT]

2.26 Using MINITAB, the two dot plots are:

Dotplot for Arrive-Depart

Yes. Most of the numbers of items arriving at the work center per hour are in the 135 to 165 area. Most of the numbers of items departing the work center per hour are in the 110 to 140 area. Because the number of items arriving is larger than the number of items departing, there will probably be some sort of bottleneck.

2.28 a Using MINITAB, the three frequency histograms are as follows (the same starting point and class interval were used for each):

2.30 a A stem-and-leaf display is as follows, where the stems are the units place and the leaves are the decimal places:

b A little more than half (26/49 = .53) of all companies spent less than 2 months in bankruptcy. Only two of the 49 companies spent more than 6 months in bankruptcy. It appears then, in general, the length of time in bankruptcy for firms using "prepacks" is less than that of firms not using "prepacks."

c A dot diagram will be used to compare the time in bankruptcy for the three types of "prepack" firms:

d The circled times in part a correspond to companies that were reorganized through a leveraged buyout. There does not appear to be any pattern to these points. They appear to be scattered about evenly throughout the distribution of all times.

2.32 Using MINITAB, the stem-and-leaf display for the data is:

2.40 The median is the middle number once the data have been arranged in order. If n is even, there is not a single middle number. Thus, to compute the median, we take the average of the middle two numbers. If n is odd, there is a single middle number. The median is this middle number.

A data set with five measurements arranged in order is 1, 3, 5, 6, 8. The median is the middle number, which is 5.

A data set with six measurements arranged in order is 1, 3, 5, 5, 6, 8. The median is the average of the middle two numbers, or (5 + 5)/2 = 5.
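The odd and even cases described in 2.40 translate directly into code. The following is a minimal sketch (the function name `median` is our own; Python's standard library offers `statistics.median` with the same behavior) applied to the two example data sets.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: a single middle number exists
        return ordered[mid]
    # n even: average the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2

five = median([1, 3, 5, 6, 8])
six = median([1, 3, 5, 5, 6, 8])
print(five, six)
```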


2.42 a x = 7 4

x n

144.5

n i i

x x

Since the mean is larger than the median, the data are skewed to the right.

b The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 5.23$

The mode is the observation appearing the most. For this data set, the mode is 6, which appears 6 times.

Since the mean and median are about the same, the data are somewhat symmetric.

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{37.62}{20} = 1.881$

The sample average surface roughness of the 20 observations is 1.881.

b The median is found as the average of the 10th and 11th observations, once the data have been ordered. The ordered data are:

1.06 1.09 1.19 1.26 1.27 1.40 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.50 2.57 2.64

The 10th and 11th observations are 2.03 and 2.05. The median is:

$\frac{2.03 + 2.05}{2} = 2.04$
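The mean of 1.881 and the median of the ordered surface roughness values can be confirmed with Python's standard `statistics` module. This is a verification sketch only; the variable names are illustrative.

```python
import statistics

# Ordered surface roughness measurements from part b
roughness = [
    1.06, 1.09, 1.19, 1.26, 1.27, 1.40, 1.51, 1.72, 1.95, 2.03,
    2.05, 2.13, 2.13, 2.16, 2.24, 2.31, 2.41, 2.50, 2.57, 2.64,
]

n = len(roughness)
mean = statistics.mean(roughness)

# With n = 20 (even), the median averages the 10th and 11th ordered values
median = statistics.median(roughness)

print(n, round(mean, 3), round(median, 2))
```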

2.48 a Using MINITAB, the stem-and-leaf display is:

The middle number or the median is 24.

c The mean of the data is x = x

d The number occurring most frequently is 0. The mode is 0.

e The mode corresponds to the smallest number. It does not seem to locate the center of the distribution. Both the mean and the median are in the middle of the stem-and-leaf display. Thus, it appears that both of them locate the center of the data.

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 42.81$

The average length of the 144 fish is 42.81 cm.

The median is the average of the middle two observations once they have been ordered. The 72nd and 73rd observations are 45 and 45. The average of these two observations is 45. Half of the fish lengths are less than 45 cm and half are longer.

The mode is 46 cm. This observation occurred 12 times.

b The sample mean weight is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 1049.72$

The average weight of the 144 fish is 1049.72 grams.

The median is the average of the middle two observations once they have been ordered. The 72nd and 73rd observations are 989 and 1011. The average of these two observations is (989 + 1011)/2 = 1000. Half of the fish weights are less than 1000 grams and half are heavier.

There are 2 modes, 886 and 1186. Each of these observations occurred 3 times.

c The sample mean DDT level is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 24.35$

The median is the average of the middle two observations once they have been ordered. The 72nd and 73rd observations are 7.1 and 7.2. The average of these two observations is (7.1 + 7.2)/2 = 7.15. Half of the fish DDT levels are less than 7.15 parts per million and half are greater.

The mode is 12. This observation occurred 8 times.

d From the graph in Exercise 2.24a, the data are skewed to the left. This corresponds to the relationship between the mean and the median. For data skewed to the left, the mean is less than the median. For the fish lengths, the mean is 42.81 and the median is 45.

e From the graph in Exercise 2.24b, the data are slightly skewed to the right. This corresponds to the relationship between the mean and the median. For data skewed to the right, the mean is more than the median. For the fish weights, the mean is 1049.72 and the median is 1000.

f From the graph in Exercise 2.24c, the data are skewed to the right. This corresponds to the relationship between the mean and the median. For data skewed to the right, the mean is more than the median. For the fish DDT levels, the mean is 24.35 and the median is 7.15.

2.52 a Due to the "elite" superstars, the salary distribution is skewed to the right. Since this implies that the median is less than the mean, the players' association would want to use the median.

b The owners, by the logic of part a, would want to use the mean.

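The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{80}{20} = 4$

The sample median is found by finding the average of the 10th and 11th observations once the data are arranged in order. The data arranged in order are:

1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13

The 10th and 11th observations are 3 and 4. The median is (3 + 4)/2 = 3.5.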

b Eliminating the largest number, which is 13, results in the following:

The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{67}{19} = 3.53$

The sample median is found by finding the middle observation once the data are arranged in order. The data arranged in order are:

1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9

The middle (10th) observation is 3, so the median is 3.

The mode is the observation appearing the most. For this data set, the mode is 1, which appears 5 times.

By dropping the largest number, the mean is reduced from 4 to 3.53. The median is reduced from 3.5 to 3. There is no effect on the mode.
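The effect of dropping the largest observation on the three measures of central tendency can be checked numerically. The sketch below rebuilds the full data set by adding 13 back to the ordered values listed in part b; the helper name `summarize` is an assumption made for illustration.

```python
import statistics

# Ordered data from part b (largest value, 13, already removed)
reduced = [1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9]
# Full data set: the same values with the largest observation restored
full = reduced + [13]

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

mean_full, median_full, mode_full = summarize(full)
mean_red, median_red, mode_red = summarize(reduced)

print(mean_full, median_full, mode_full)
print(round(mean_red, 2), median_red, mode_red)
```

The mean moves the most because it uses the actual size of every value, while the median only depends on the ordering and the mode only on repetition.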

c The data arranged in order are:


2 1

30110

2010

x x

n s

2 2

n s

The dot diagrams for the two data sets are shown below.

2 2

2

7155

x x

n s

2

221025

x x

n s

2

( 13)39

5

x x

n s

d The range, variance, and standard deviation remain the same when any number is added to or subtracted from each measurement in the data set.

2.64 a The maximum age is 64. The minimum age is 39. The range is 64 − 39 = 25.

b The variance is:

2

2 1

2

2 1

2494125,764-

n i n

i i

x x

n s


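2.66 a The maximum weight is 1.1 carats. The minimum weight is .18 carats. The range is 1.1 − .18 = .92 carats.

b The variance is:

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{146.19 - \frac{194.32^2}{308}}{308 - 1} = .0768$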

7

x x n

7

x x n


smaller variability in the time it takes to complete subtask 1 (part b). He or she is more consistent in the time needed to complete the task.

I would choose workers similar to Worker A to perform subtask 2. Worker A has a smaller average time on subtask 2 (A: $\bar{x}$ = 3, B: $\bar{x}$ = 4.14). Worker A also has a smaller variability in the time needed to complete subtask 2 (part d).

2.70 Since no information is given about the data set, we can only use Chebyshev's Rule.

a Nothing can be said about the percentage of measurements which will fall between $\bar{x} - s$ and $\bar{x} + s$.

b At least 3/4 or 75% of the measurements will fall between $\bar{x} - 2s$ and $\bar{x} + 2s$.

c At least 8/9 or 89% of the measurements will fall between $\bar{x} - 3s$ and $\bar{x} + 3s$.
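The three answers in 2.70 all come from the same bound: for any data set, at least 1 − 1/k² of the measurements fall within k standard deviations of the mean, for k > 1. A small sketch of that bound follows; the function name `chebyshev_lower_bound` and the choice to return 0.0 for k ≤ 1 (meaning the rule gives no information) are our own conventions.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean.

    Chebyshev's Rule is only informative for k > 1; for k <= 1 the
    guaranteed fraction is 0, i.e. nothing can be said.
    """
    if k <= 1:
        return 0.0
    return 1 - 1 / k**2

for k in (1, 2, 3):
    print(k, round(chebyshev_lower_bound(k), 3))
```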

c The percentages in part b are in agreement with Chebyshev's Rule and agree fairly well with the percentages given by the Empirical Rule.

d Range = 12 − 5 = 7

s ≈ range/4 = 7/4 = 1.75

The range approximation provides a satisfactory estimate of s = 1.83 from part a.

2.74 From Chebyshev’s Theorem, we know that at least ¾ or 75% of all observations will fall within 2 standard deviations of the mean. From Exercise 2.47, $\bar{x}$ = .631. From Exercise 2.66, s = .2772. This interval is:

$\bar{x} \pm 2s \Rightarrow .631 \pm 2(.2772) \Rightarrow .631 \pm .5544 \Rightarrow (.0766, 1.1854)$

2.76 a From the information given, we have $\bar{x}$ = 375 and s = 25. From Chebyshev's Rule, we know that at least three-fourths of the measurements are within the interval:

$\bar{x} \pm 2s$, or (325, 425)

Thus, at most one-fourth of the measurements exceed 425. In other words, more than 425 vehicles used the intersection on at most 25% of the days.

b According to the Empirical Rule, approximately 95% of the measurements are within the interval:

$\bar{x} \pm 2s$, or (325, 425)

This leaves approximately 5% of the measurements to lie outside the interval. Because of the symmetry of a mound-shaped distribution, approximately 2.5% of these will lie below 325, and the remaining 2.5% will lie above 425. Thus, on approximately 2.5% of the days, more than 425 vehicles used the intersection.
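The contrast between parts a and b of 2.76 is that both rules use the same interval but make different claims about what lies beyond it. The sketch below computes the interval and the two upper-tail figures; the names `k_sd_interval`, `chebyshev_max_above`, and `empirical_approx_above` are illustrative.

```python
def k_sd_interval(mean, sd, k=2):
    """Return the interval mean ± k*sd as a (low, high) tuple."""
    return (mean - k * sd, mean + k * sd)

low, high = k_sd_interval(375, 25)

# Chebyshev's Rule: at most 1/k^2 of the measurements lie outside the
# interval, so at most that fraction can exceed the upper endpoint
chebyshev_max_above = 1 / 2**2

# Empirical Rule (mound-shaped data): about 5% lie outside, split evenly
# between the two tails by symmetry
empirical_approx_above = round((1 - 0.95) / 2, 3)

print((low, high), chebyshev_max_above, empirical_approx_above)
```

The Chebyshev figure is a guaranteed ceiling for any distribution, while the Empirical Rule figure is an approximation that depends on the mound-shape assumption.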

2.78 a Since the sample mean (18.2) is larger than the sample median (15), it indicates that the distribution of years is skewed to the right. In addition, the maximum number of years is 50 and the minimum is 2. If the distribution were symmetric, the mean and median should be about halfway between these two numbers. Halfway between the maximum and minimum values is 26, which is much larger than either the mean or the median.

b The standard deviation can be estimated by the range divided by either 4 or 6. For this distribution, the range is:

Range = Largest − smallest = 50 − 2 = 48

Dividing the range by 4, we get an estimate of the standard deviation to be 48/4 = 12. Dividing the range by 6, we get an estimate of the standard deviation to be 48/6 = 8. Thus, the standard deviation should be somewhere between 8 and 12. For this problem, the standard deviation is s = 10.64. This value falls in the estimated range of 8 to 12.
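The range approximation in part b is easy to wrap in a helper. A minimal sketch, assuming only the maximum, minimum, and reported standard deviation given above (the function name `range_estimates` is hypothetical):

```python
def range_estimates(largest, smallest):
    """Rough bounds on s from the range: (range/6, range/4)."""
    data_range = largest - smallest
    return (data_range / 6, data_range / 4)

low_estimate, high_estimate = range_estimates(50, 2)
s = 10.64  # standard deviation reported in the solution

print(low_estimate, high_estimate, low_estimate <= s <= high_estimate)
```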

c First, we calculate the number of standard deviations from the mean the value of 40 years is. To do this, we first subtract the mean and then divide by the value of the standard deviation:

$z = \frac{x - \bar{x}}{s} = \frac{40 - 18.2}{10.64} = 2.05$

Using Chebyshev's Rule, we know that at most $1/k^2$ or $1/2^2 = 1/4$ of the data will be more than 2 standard deviations from the mean. Thus, this would indicate that at most 25% of the Generation Xers responded with 40 years or more.

Next, we calculate the number of standard deviations from the mean the value of 8 years is:

$z = \frac{x - \bar{x}}{s} = \frac{8 - 18.2}{10.64} = -.96$

Using Chebyshev's Rule, we get no information about the data within 1 standard deviation of the mean. However, we know the median (15) is more than 8. By definition, 50% of the data are larger than the median. Thus, at least 50% of the Generation Xers responded with 8 years or more. No additional information can be obtained with the information given.

2.80 a Using MINITAB, the frequency histogram for the time in bankruptcy is:


b Using MINITAB, the descriptive measures are:

Descriptive Statistics: Time in Bankrupt

Variable   N    Mean    Median   TrMean   StDev   SE Mean
Time in    49   2.549   1.700    2.333    1.828   0.261

Variable   Minimum   Maximum   Q1      Q3
Time in    1.000     10.100    1.350   3.500

From Chebyshev’s Theorem, we know that at least 75% of the observations will fall within 2 standard deviations of the mean. This interval is:

$\bar{x} \pm 2s \Rightarrow 2.549 \pm 2(1.828) \Rightarrow 2.549 \pm 3.656 \Rightarrow (-1.107, 6.205)$

c There are 47 of the 49 observations within this interval. The percentage would be (47/49)*100% = 95.9%. This agrees with Chebyshev’s Theorem (at least 75%). It also agrees with the Empirical Rule (approximately 95%).

d From the above interval we know that about 95% of all firms filing for prepackaged bankruptcy will be in bankruptcy between 0 and 6.2 months. Thus, we would estimate that a firm considering filing for bankruptcy will be in bankruptcy up to 6.2 months.

2.82 a Since it is given that the distribution is mound-shaped, we can use the Empirical Rule. We know that 1.84% is 2 standard deviations below the mean. The Empirical Rule states that approximately 95% of the observations will fall within 2 standard deviations of the mean and, consequently, approximately 5% will lie outside that interval. Since a mound-shaped distribution is symmetric, then approximately 2.5% of the day's production of batches will fall below 1.84%.

b If the data are actually mound-shaped, it would be extremely unusual (less than 2.5%) to observe a batch with 1.80% zinc phosphide if the true mean is 2.0%. Thus, if we did observe 1.8%, we would conclude that the mean percent of zinc phosphide in today's production is probably less than 2.0%.

2.84 a Since we do not have any idea of the shape of the distribution of SAT-Math score changes, we must use Chebyshev’s Theorem. We know that at least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval would be:

$\bar{x} \pm 3s \Rightarrow 19 \pm 3(65) \Rightarrow 19 \pm 195 \Rightarrow (-176, 214)$

Thus, for a randomly selected student, we could be pretty sure that this student’s score would be anywhere from 176 points below his/her previous SAT-Math score to 214 points above his/her previous SAT-Math score.

b Since we do not have any idea of the shape of the distribution of SAT-Verbal score changes, we must use Chebyshev’s Theorem. We know that at least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval would be:

$\bar{x} \pm 3s \Rightarrow 7 \pm 3(49) \Rightarrow 7 \pm 147 \Rightarrow (-140, 154)$

Thus, for a randomly selected student, we could be pretty sure that this student’s score would be anywhere from 140 points below his/her previous SAT-Verbal score to 154 points above his/her previous SAT-Verbal score.

c A change of 140 points on the SAT-Math would be a little less than 2 standard deviations from the mean. A change of 140 points on the SAT-Verbal would be a little less than 3 standard deviations from the mean. Since the 140 point change for the SAT-Math is not as big a change as the 140 point change on the SAT-Verbal, it would be most likely that the score was a SAT-Math score.
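The comparison in part c rests on two z-scores that the solution describes only in words. A short sketch makes them explicit, using the means and standard deviations from parts a and b (the function name `z_score` is our own):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

change = 140
z_math = z_score(change, 19, 65)    # SAT-Math: mean change 19, s = 65
z_verbal = z_score(change, 7, 49)   # SAT-Verbal: mean change 7, s = 49

print(round(z_math, 2), round(z_verbal, 2))
```

The smaller z-score belongs to the SAT-Math, which is why a 140 point change is judged more plausible there.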

5

x x s

2.88 The 50th percentile of a data set is the observation that has half of the observations less than it. Another name for the 50th percentile is the median.

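2.90 Since the element 40 has a z-score of −2 and 90 has a z-score of 3,

$-2 = \frac{40 - \mu}{\sigma}$ and $3 = \frac{90 - \mu}{\sigma}$

Rewriting these as $40 - \mu = -2\sigma$ and $90 - \mu = 3\sigma$ and subtracting the first from the second gives $50 = 5\sigma$, so σ = 10.

By substitution, μ = 40 + 2(10) = 60.

Therefore, the population mean is 60 and the standard deviation is 10.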
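The two z-score equations in 2.90 form a pair of linear equations in μ and σ, so they can be solved in closed form. The sketch below is one way to do that; the function name `solve_mean_sd` is hypothetical.

```python
def solve_mean_sd(x1, z1, x2, z2):
    """Solve z = (x - mu) / sigma for mu and sigma given two (x, z) pairs."""
    # Subtracting the two equations eliminates mu:
    # x2 - x1 = (z2 - z1) * sigma
    sigma = (x2 - x1) / (z2 - z1)
    # Back-substitute into x1 = mu + z1 * sigma
    mu = x1 - z1 * sigma
    return mu, sigma

mu, sigma = solve_mean_sd(40, -2, 90, 3)
print(mu, sigma)
```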

2.92 The percentile ranking of the age of 25 years would be 100% − 73.5% = 26.5%.

2.94 a From Exercise 2.77, $\bar{x}$ = 94.91 and s = 4.83. The z-score for an observation of 78 is:

$z = \frac{x - \bar{x}}{s} = \frac{78 - 94.91}{4.83} = -3.50$

This z-score indicates that an observation of 78 is 3.5 standard deviations below the mean. Very few observations will be lower than this one.

b The z-score for an observation of 98 is:

$z = \frac{x - \bar{x}}{s} = \frac{98 - 94.91}{4.83} = 0.63$

This z-score indicates that an observation of 98 is .63 standard deviations above the mean. This score is not an unusual observation in the data set.

2.96 a From the problem, μ = 2.7 and σ = .5.

c If we assume the distribution of GPAs is approximately mound-shaped, we can use the Empirical Rule.

From the Empirical Rule, we know that ≈.025 or ≈2.5% of the students will have GPAs above 3.7 (with z = 2). Thus, the GPA corresponding to summa cum laude (top 2.5%) will be greater than 3.7 (z > 2).

We know that ≈.16 or 16% of the students will have GPAs above 3.2 (z = 1). Thus, the limit on GPAs for cum laude (top 16%) will be greater than 3.2 (z > 1).

We must assume the distribution is mound-shaped.

2.98 a Since the data are approximately mound-shaped, we can use the Empirical Rule.

On the blue exam, the mean is 53% and the standard deviation is 15%. We know that approximately 68% of all students will score within 1 standard deviation of the mean. This interval is:

$\bar{x} \pm s \Rightarrow 53 \pm 15 \Rightarrow (38, 68)$

b Since the data are approximately mound-shaped, we can use the Empirical Rule.

On the red exam, the mean is 39% and the standard deviation is 12%. We know that approximately 68% of all students will score within 1 standard deviation of the mean. This interval is:

$\bar{x} \pm s \Rightarrow 39 \pm 12 \Rightarrow (27, 51)$

c The student would have been more likely to have taken the red exam. For the blue exam, we know that approximately 95% of all scores will be from 23% to 83%. The observed 20% score does not fall in this range. For the red exam, we know that approximately 95% of all scores will be from 15% to 63%. The observed 20% score does fall in this range. Thus, it is more likely that the student would have taken the red exam.
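The reasoning in part c amounts to asking which exam's 2-standard-deviation interval contains the observed score of 20%. A brief sketch of that check, using the means and standard deviations from parts a and b (the dictionary layout and names are illustrative):

```python
def k_sd_interval(mean, sd, k):
    """Return the interval mean ± k*sd as a (low, high) tuple."""
    return (mean - k * sd, mean + k * sd)

# (mean %, standard deviation %) for each exam
exams = {"blue": (53, 15), "red": (39, 12)}
score = 20

results = {}
for name, (mean, sd) in exams.items():
    interval = k_sd_interval(mean, sd, 2)  # covers about 95% of scores
    inside = interval[0] <= score <= interval[1]
    results[name] = (interval, inside)
    print(name, interval, inside)
```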

2.100 The 25th percentile, or lower quartile, is the measurement that has 25% of the measurements below it and 75% of the measurements above it. The 50th percentile, or median, is the measurement that has 50% of the measurements below it and 50% of the measurements above it. The 75th percentile, or upper quartile, is the measurement that has 75% of the measurements below it and 25% of the measurements above it.

2.102 a Median is approximately 4.

QU is approximately 6 (Upper Quartile).

f There are two potential outliers, 12 and 13. There is one outlier, 16.

2.104 a From the problem, $\bar{x}$ = 52.33 and s = 9.22.

The highest salary is 75 (thousand).

$z = \frac{x - \bar{x}}{s} = \frac{75 - 52.33}{9.22} = 2.46$

Therefore, the highest salary is 2.46 standard deviations above the mean.

The lowest salary is 35.0 (thousand).

$z = \frac{x - \bar{x}}{s} = \frac{35.0 - 52.33}{9.22} = -1.88$

Therefore, the lowest salary is 1.88 standard deviations below the mean.

The mean salary offer is 52.33 (thousand).

$z = \frac{x - \bar{x}}{s} = \frac{52.33 - 52.33}{9.22} = 0$

No, the highest salary offer is not unusually high. For any distribution, at least 8/9 of the salaries should have z-scores between −3 and 3. A z-score of 2.46 would not be that unusual.

b Using MINITAB, the box plot is:

Since no salaries are outside the inner fences, none of them are potentially faulty observations.

2.106 Using MINITAB, the side-by-side box plots are:

[Figure: side-by-side box plots by GROUP]

From the boxplots, there appears to be one outlier in the third group.

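2.108 a First, we will compute the mean and standard deviation.

The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{393}{75} = 5.24$

The sample variance is:

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{5943 - \frac{393^2}{75}}{75 - 1} = 52.482$

The standard deviation is:

$s = \sqrt{52.482} = 7.244$

Two standard deviations on either side of the mean gives:

$\bar{x} \pm 2s \Rightarrow 5.24 \pm 2(7.244) \Rightarrow 5.24 \pm 14.488 \Rightarrow (-9.248, 19.728)$

Thus any observation that is greater than 19.728 or less than −9.248 would be considered an outlier. In this data set there would be 4 outliers: 21, 21, 25, 48.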

b Deleting these 4 outliers, we will recalculate the mean, median, variance, and standard deviation. The median for the original data set is the middle number once they have been arranged in order and is the 38th observation, which is 3.

The new mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{278}{71} = 3.92$

The new variance is:

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{2132 - \frac{278^2}{71}}{71 - 1} = 14.907$

The new standard deviation is $s = \sqrt{14.907} = 3.861$.

The new median is the 36th observation once the data have been arranged in order and is 3.

In the original data set, the mean is 5.24, the standard deviation is 7.244, and the median is 3. In the revised data set, the mean is 3.92, the standard deviation is 3.861, and the median is 3. The mean has been decreased, the standard deviation has been almost halved, but the median stays the same.
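The before-and-after figures in 2.108 can be reproduced from the summary sums alone, using the shortcut variance formula. In the sketch below, the totals 393 and 5943 for the original 75 observations are taken from the solution's arithmetic; the function name `shortcut_sd` is an assumption, and the revised totals are obtained by subtracting the four outliers rather than from the raw data, which this manual does not list.

```python
import math

def shortcut_sd(n, sum_x, sum_x2):
    """Sample standard deviation from n, the sum of x, and the sum of x squared."""
    variance = (sum_x2 - sum_x**2 / n) / (n - 1)
    return math.sqrt(variance)

# Original data set: n = 75, sum of x = 393, sum of x^2 = 5943
n, sum_x, sum_x2 = 75, 393, 5943
s_original = shortcut_sd(n, sum_x, sum_x2)

# Remove the four outliers by adjusting the sums
outliers = [21, 21, 25, 48]
n_new = n - len(outliers)
sum_x_new = sum_x - sum(outliers)
sum_x2_new = sum_x2 - sum(x**2 for x in outliers)
s_new = shortcut_sd(n_new, sum_x_new, sum_x2_new)

print(round(sum_x / n, 2), round(s_original, 3))
print(round(sum_x_new / n_new, 2), round(s_new, 3))
```

Working from the sums shows why the standard deviation drops so sharply: the single value 48 contributes 2304 of the original 5943 to the sum of squares.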


2.110 For Perturbed Intrinsics, but no Perturbed Projections:

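$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{8.1}{5} = 1.62$

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{15.63 - \frac{8.1^2}{5}}{5 - 1} = \frac{2.508}{4} = .627$

$s = \sqrt{.627} = .792$

For Perturbed Projections, but no Perturbed Intrinsics:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{125.8}{5} = 25.16$

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{3350.1 - \frac{125.8^2}{5}}{5 - 1} = \frac{184.972}{4} = 46.243$

$s = \sqrt{46.243} = 6.80$

$z = \frac{x - \bar{x}}{s} = \frac{4.5 - 25.16}{6.80} = -3.04$

Since this z-score is less than −3, we would consider this an outlier for perturbed projections, but no perturbed intrinsics.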

Since the z-score corresponding to 4.5 for the perturbed projections, but no perturbed intrinsics is smaller than that for perturbed intrinsics, but no perturbed projections, it is more likely that the type of camera perturbation is perturbed projections, but no

Date posted: 28/08/2021
