
Statistics for Business and Economics chapter 14 Simple Linear Regression


Document information

Pages: 50
File size: 1.62 MB

Content


2.610


3180

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 171/190 = 0.9

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 110/20 = 5.5

5 a.

b. There appears to be a positive relationship between price and rating. The sign that says "Quality: You Get What You Pay For" does fairly reflect the price-quality relationship for ellipticals.

6 a.

b. There appears to be a negative linear relationship between x = miles and y = sales price. If the car has higher miles, the sales price tends to be lower.

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = -135.66/5152.40 = -.02633

d. The slope of the estimated regression equation is -.02633. Thus, a one-unit increase in the value of x will result in a decrease in the estimated value of y equal to .02633. Because the data were recorded in thousands, every additional 1000 miles on the car's odometer will result in a $26.33 decrease in the estimated price.

e. ŷ = 8.9412 - .02633(100) = 6.3, or $6300
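The unit conversion in part (d) and the prediction in part (e) can be checked with a short sketch. The coefficients are the ones quoted in the solution; the function and variable names are illustrative assumptions, not part of the original.

```python
# Estimated regression equation quoted in the solution:
#   y-hat = 8.9412 - 0.02633 x
# x is the odometer reading in thousands of miles,
# y is the sales price in thousands of dollars.
B0 = 8.9412
B1 = -0.02633


def predicted_price_thousands(miles_thousands: float) -> float:
    """Estimated sales price (in $1000s) for an odometer reading given in 1000s of miles."""
    return B0 + B1 * miles_thousands


# Slope restated in dollars per additional 1000 miles (part d).
dollars_per_1000_miles = B1 * 1000

# Part (e): a car with 100,000 miles on the odometer, so x = 100.
price_100k = predicted_price_thousands(100)

print(f"Change per 1000 miles: ${dollars_per_1000_miles:.2f}")
print(f"Estimated price at x = 100: {price_100k:.4f} thousand dollars")
```

The unrounded estimate is slightly above 6.3; the solution rounds it to one decimal place before converting to dollars.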

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 568/142 = 4

8 a.

b. The scatter diagram and the slope of the estimated regression equation indicate a negative linear relationship between x = temperature rating and y = price. Thus, it appears that sleeping bags with a lower temperature rating cost more than sleeping bags with a higher temperature rating. In other words, it costs more to stay warmer.

c. x̄ = Σxi/n = 209/11 = 19    ȳ = Σyi/n = 2849/11 = 259

Σ(xi - x̄)(yi - ȳ) = -10,090    Σ(xi - x̄)² = 1912

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = -10,090/1912 = -5.2772

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 14,601.40/258,695.60 = .05644

b There appears to be a positive linear relationship between x = price and y = road-test score.


12 a.

b. The scatter diagram indicates a positive linear relationship between x = weight and y = price. Thus, it appears that PWCs that weigh more have a higher price.

c. x̄ = Σxi/n = 7730/10 = 773    ȳ = Σyi/n = 92,200/10 = 9220

Σ(xi - x̄)(yi - ȳ) = 332,400    Σ(xi - x̄)² = 14,810

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 332,400/14,810 = 22.4443

Thus, the estimate of the price of a Jet Ski with a weight of 750 pounds is approximately $8704.

e. No. The relationship between weight and price is not deterministic.

f. The weight of the Kawasaki SX-R 800 is so far below the lowest weight for the data used to develop the estimated regression equation that we would not recommend using the estimated regression equation to predict the price for this model.
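The least squares arithmetic for the weight and price data can be reproduced from the summary sums alone. The sketch below uses the sums quoted in part (c). The intercept step b0 = ȳ - b1·x̄ is the standard least squares formula and is filled in here as an assumption, since the excerpt does not show it. All names are illustrative.

```python
# Summary quantities quoted in part (c): x = weight (lb), y = price ($).
n = 10
sum_x, sum_y = 7730, 92_200
sxy = 332_400   # sum of (xi - x_bar)(yi - y_bar)
sxx = 14_810    # sum of (xi - x_bar)^2

x_bar = sum_x / n
y_bar = sum_y / n

# Least squares slope and intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar   # standard formula; not shown in the excerpt


def predict_price(weight_lb: float) -> float:
    """Point estimate of price from the fitted line."""
    return b0 + b1 * weight_lb


estimate_750 = predict_price(750)
print(f"b1 = {b1:.4f}, b0 = {b0:.2f}")
print(f"Estimated price at 750 lb: ${estimate_750:.0f}")
```

The result agrees with the approximately $8704 stated in the solution. As part (f) notes, the same function should not be trusted for weights far outside the range of the data used to fit it.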

13 a

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 1233.7/7648 = 0.1613

b. r² = SSR/SST = 249,864.86/335,000 = .746

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 712,500/93,750 = 7.6

c. Σ(xi - x̄)² = 180

s_b1 = s / √Σ(xi - x̄)² = 8.7560/√180 = 0.6526

Because p-value ≤ α, we reject H0: β1 = 0.

e. MSR = SSR/1 = 1620

F = MSR/MSE = 1620/76.6667 = 21.13

Using the F table (1 degree of freedom numerator and 3 denominator), the p-value is less than .025. Using Excel or Minitab, the p-value corresponding to F = 21.13 is .0193.

Because p-value > α, we cannot reject H0: β1 = 0; x and y do not appear to be related.

c. MSR = SSR/1 = 153.9/1 = 153.9

F = MSR/MSE = 153.9/42.4333 = 3.63

Using the F table (1 degree of freedom numerator and 3 denominator), the p-value is greater than .10. Using Excel or Minitab, the p-value corresponding to F = 3.63 is .1530.

Because p-value > α, we cannot reject H0: β1 = 0; x and y do not appear to be related.

26 a. In solving exercise 18, we found SSE = 85,135.14.

s² = MSE = SSE/(n - 2) = 85,135.14/4 = 21,283.79

s = √MSE = √21,283.79 = 145.89

Σ(xi - x̄)² = 0.74

s_b1 = s / √Σ(xi - x̄)² = 145.89/√0.74 = 169.59

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F       p-value
Regression            249,864.86       1                    249,864.86    11.74   .0266
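The chain from SSE to the standard error of the slope and the F statistic in exercise 26 can be traced in a few lines. The sketch below uses only figures quoted in the solution. The sample size n = 6 is an assumption inferred from the divisor n - 2 = 4, and the variable names are illustrative.

```python
import math

# Figures quoted in the solution to exercise 26.
sse = 85_135.14    # sum of squares due to error (from exercise 18)
ssr = 249_864.86   # sum of squares due to regression (ANOVA row)
sxx = 0.74         # sum of (xi - x_bar)^2
n = 6              # assumption: inferred from n - 2 = 4

mse = sse / (n - 2)            # estimate of the error variance, s^2
s = math.sqrt(mse)             # standard error of the estimate
s_b1 = s / math.sqrt(sxx)      # estimated standard deviation of b1
msr = ssr / 1                  # one independent variable, so 1 degree of freedom
f_stat = msr / mse

print(f"MSE = {mse:.2f}, s = {s:.2f}, s_b1 = {s_b1:.2f}, F = {f_stat:.2f}")
```

Obtaining the p-value for F requires an F distribution with 1 and 4 degrees of freedom, which the solution reads from Excel or Minitab; that step is left out here to keep the sketch dependency-free.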

Because p-value ≤ α, we reject H0: β1 = 0.

Upper support and price are related.

c. r² = SSR/SST = 9,836.74/12,324.4 = .80

The estimated regression equation provided a good fit; we should feel comfortable using the estimated regression equation to estimate the price given the upper support rating.

d. ŷ = 49.93 + 31.21(4) = 174.77

28 The sum of squares due to error and the total sum of squares are

SSE = Σ(yi - ŷi)² = 12,953.09    SST = Σ(yi - ȳ)² = 66,200

We will first illustrate the use of the t test.

Note: from the solution to exercise 10, Σ(xi - x̄)² = 1912

s_b1 = s / √Σ(xi - x̄)² = 37.9372/√1912 = .8676

t = b1/s_b1 = -5.2772/.8676 = -6.0825

Using the t table (9 degrees of freedom), the area in the tail is less than .005; the p-value is less than .01.

Using Excel or Minitab, the p-value corresponding to t = -6.0825 is .000.

Because p-value ≤ α, we reject H0: β1 = 0.

Because we can reject H0: β1 = 0, we conclude that temperature rating and price are related.

Next we illustrate the use of the F test.

MSR = SSR/1 = 53,246.91

F = MSR/MSE = 53,246.91/1439.2322 = 37.00

Using the F table (1 degree of freedom numerator and 9 denominator), the p-value is less than .01. Using Excel or Minitab, the p-value corresponding to F = 37.00 is .000.

Because p-value ≤ α, we reject H0: β1 = 0.

Because we can reject H0: β1 = 0, we conclude that temperature rating and price are related.

The ANOVA table is shown below.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F       p-value
Regression            53,246.91        1                    53,246.91     37.00   .000
Error                 12,953.09        9                    1,439.2322
Total                 66,200.00        10
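In simple linear regression the t test and the F test for β1 = 0 are two views of the same statistic, because F equals t squared. The sketch below illustrates this with the exercise 28 figures: SSE, SST, and Σ(xi - x̄)² as quoted, and the slope b1 = -5.2772 taken from the earlier temperature rating solution. The sample size n = 11 is an assumption inferred from the 9 degrees of freedom, and the variable names are illustrative.

```python
import math

# Quantities quoted in the solution.
sse = 12_953.09   # sum of squares due to error
sst = 66_200      # total sum of squares
sxx = 1912        # sum of (xi - x_bar)^2
b1 = -5.2772      # estimated slope (rounded, as quoted)
n = 11            # assumption: inferred from n - 2 = 9 degrees of freedom

mse = sse / (n - 2)
s = math.sqrt(mse)
s_b1 = s / math.sqrt(sxx)

# t test statistic for H0: beta1 = 0.
t_stat = b1 / s_b1

# F test statistic from the ANOVA decomposition SST = SSR + SSE.
ssr = sst - sse
f_stat = (ssr / 1) / mse

print(f"s = {s:.4f}, s_b1 = {s_b1:.4f}")
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
```

The small gap between t² and F comes only from using the rounded slope; with the unrounded b1 the two agree exactly.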

Error: Sum of Squares = 233,333.33, Degrees of Freedom = 4, Mean Square = 58,333.33

Using the F table (1 degree of freedom numerator and 4 denominator), the p-value is less than .01. Using Excel or Minitab, the p-value corresponding to F = 92.83 is .0006.

Because p-value ≤ α, we reject H0: β1 = 0. Production volume and total cost are related.

Σ(xi - x̄)² = 8,155,000

s_b1 = s / √Σ(xi - x̄)² = 5.3833/√8,155,000 = .001885

Because p-value ≤ α, we reject H0: β1 = 0.

Conclusion: price and overall score are related.

35 a Note: some of the values shown were computed in exercises 18 and 26.

s = 145.89 

x̄ = 3.2    Σ(xi - x̄)² = 0.74

b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² = 1471/838 = 1.7554

Using the F table (1 degree of freedom numerator and 7 denominator), the p-value is less than .01. Using Excel or Minitab, the p-value corresponding to F = 28.00 is .0011.

Predictor Coef SE Coef T P

Predicted Values for New Observations

New Obs Fit SE Fit 95.0% CI 95.0% PI

1 34.35 1.66 ( 30.93, 37.77) ( 16.56, 52.14)

c The Minitab output is shown below:

The regression equation is

Price = 2044 - 28.3 Weight

Predictor Coef SE Coef T P

Constant 2044.4 226.4 9.03 0.000

that the underlying relationship between x and y may be curvilinear.

e. The standardized residual plot has the same shape as the original residual plot. The curvature observed indicates that the assumptions regarding the error term may not be satisfied.

46 a. ŷ = 2.32 + .64x

b. [Plot; axis ticks from -4 to 4]

Because p-value ≤ α = .05, we conclude that the two variables are related.

c. [Plot; axis ticks from -15 to 10]

d. The residual plot leads us to question the assumption of a linear relationship between x and y. Even though the relationship is significant at the .05 level of significance, it would be extremely dangerous to extrapolate beyond the range of the data.

48 a. ŷ = 80 + 4x

b The assumptions concerning the error term appear reasonable

49 a The Minitab output follows:

The regression equation is

Price ($) = 22636 + 59.0 Square Footage

Predictor Coef SE Coef T P

c. The residual plot leads us to question the assumption of a linear relationship between square footage and price. Therefore, even though the relationship is very significant (p-value = .000), using the estimated regression equation to make predictions of the price for a house with square footage beyond the range of the data is not recommended.

50 a The Minitab output follows:

R denotes an observation with a large standardized residual. 

The standardized residuals are: 2.11, -1.08, .14, -.38, -.78, -.04, -.41

The first observation appears to be an outlier since it has a large standardized residual.

The scatter diagram also indicates that the observation x = 135, y = 145 may be an outlier; the implication is that for simple linear regression an outlier can be identified by looking at the scatter diagram.

R denotes an observation with a large standardized residual

X denotes an observation whose X value gives it large influence.

The standardized residuals are: -1.00, -.41, .01, -.48, .25, .65, -2.00, -2.16

The last two observations in the data set appear to be outliers since the standardized residuals for these observations are -2.00 and -2.16, respectively.

b Using Minitab, we obtained the following leverage values:

.28, .24, .16, .14, .13, .14, .14, .76

Minitab identifies an observation as having high leverage if hi > 6/n; for these data, 6/n = 6/8 = .75. Since the leverage for the observation x = 22, y = 19 is .76, Minitab would identify observation 8 as a high leverage point. Thus, we conclude that observation 8 is an influential observation.
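The leverage check described above can be sketched directly. For simple linear regression the leverage of observation i is hi = 1/n + (xi - x̄)² / Σ(xj - x̄)², and the solution flags an observation when hi > 6/n. The original data set is not reproduced in this excerpt, so the x values below are hypothetical; only n = 8, the extreme value x = 22, and the 6/n rule come from the text. The leverage values printed here therefore differ from the ones Minitab reports above.

```python
# Hypothetical x values (assumption): eight observations, the last one far from the rest.
xs = [1, 1, 2, 3, 4, 4, 5, 22]

n = len(xs)
x_bar = sum(xs) / n
sxx = sum((x - x_bar) ** 2 for x in xs)

# Leverage of each observation in simple linear regression.
leverages = [1 / n + (x - x_bar) ** 2 / sxx for x in xs]

# Rule quoted in the solution: high leverage if h_i > 6/n.
threshold = 6 / n
high_leverage = [i + 1 for i, h in enumerate(leverages) if h > threshold]

for i, h in enumerate(leverages, start=1):
    print(f"obs {i}: x = {xs[i - 1]:>2}, h = {h:.3f}")
print(f"threshold 6/n = {threshold:.2f}, flagged observations: {high_leverage}")
```

A useful sanity check is that the leverages in a simple linear regression always sum to 2, the number of estimated coefficients.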

Total 9 939.35

Unusual Observations

Obs Media$ Shipment Fit SE Fit Residual St Resid

1 120 36.30 27.55 3.30 8.75 2.30R


R denotes an observation with a large standardized residual

b Minitab identifies observation 1 as having a large standardized residual; thus, we would consider observation 1 to be an outlier

R denotes an observation with a large standardized residual

X denotes an observation whose X value gives it large influence.

b. The Minitab output identifies observation 22 as having a large standardized residual and as an observation whose x value gives it large influence. The following residual plot verifies these

54 a.

The scatter diagram does indicate potential outliers and/or influential observations. For example, the data for the Washington Redskins, New England Patriots, and the Dallas Cowboys not only have the three highest revenues, they also have the highest team values.

b A portion of the Minitab output follows:

The regression equation is


X denotes an observation whose X value gives it large leverage.

c. The Minitab output indicates that there are five unusual observations:

• Observation 9 (Dallas Cowboys) is an outlier because it has a large standardized residual.

• Observation 19 (New England Patriots) is an influential observation because it has high leverage.

• Observation 21 (New York Giants) is an outlier because it has a large standardized residual.

• Observation 22 (New York Jets) is an outlier because it has a large standardized residual.

• Observation 32 (Washington Redskins) is an influential observation because it has high leverage.

55 No. Regression or correlation analysis can never prove that two variables are causally related.

56 The estimate of a mean value is an estimate of the average of all y values associated with the same x. The estimate of an individual y value is an estimate of only one of the y values associated with a particular x.

57 The purpose of testing whether β1 = 0 is to determine whether or not there is a significant relationship between x and y. However, rejecting β1 = 0 does not necessarily imply a good fit. For example, if β1 = 0 is rejected and r² is low, there is a statistically significant relationship between x and y but the fit is not very good.

58 a The Minitab output is shown below:

The regression equation is

b. Since the p-value corresponding to F = 23.22 is .001 < α = .05, the relationship is significant.

c. r² = .744; a good fit. The least squares line explained 74.4% of the variability in Price.

d. ŷ = 9.26 + .711(6) = 13.53

59 a The Minitab output is shown below:

The regression equation is

Share Price ($) = - 2.99 + 0.911 Fair Value ($)

Predictor Coef SE Coef T P

Constant -2.987 5.791 -0.52 0.610

b. Significant relationship: p-value = .000 < α = .05

c. ŷ = -2.987 + .91128 Fair Value ($) = -2.987 + .91128(50) = 42.577, or approximately $42.58

d. The estimated regression equation should provide a good estimate because r² = 0.769

60 a

The scatter diagram indicates a positive linear relationship between the two variables. Online universities with higher retention rates tend to have higher graduation rates.

b The Minitab output follows:

The regression equation is


R denotes an observation with a large standardized residual.

X denotes an observation whose X value gives it large leverage.

c. Because the p-value = .000 < α = .05, the relationship is significant.

d. The estimated regression equation is able to explain 44.9% of the variability in the graduation rate based upon the linear relationship with the retention rate. It is not a great fit, but given the type of data, the fit is reasonably good.

e. In the Minitab output in part (b), South University is identified as an observation with a large standardized residual. With a retention rate of 51%, it does appear that the graduation rate of 25% is low as compared to the results for other online universities. The president of South University should be concerned after looking at the data. Using the estimated regression equation, we estimate that the graduation rate at South University should be 25.4 + .285(51) = 40%.

f. In the Minitab output in part (b), the University of Phoenix is identified as an observation whose x value gives it large influence. With a retention rate of only 4%, the president of the University of Phoenix should be concerned after looking at the data.
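The comparison made for South University in part (e) amounts to computing a fitted value and a residual. The sketch below uses the coefficients and rates quoted in the solution; the names are illustrative assumptions.

```python
# Estimated regression equation quoted in part (e):
#   graduation rate (%) = 25.4 + 0.285 * retention rate (%)
B0 = 25.4
B1 = 0.285

retention_rate = 51          # South University retention rate (%)
actual_graduation_rate = 25  # South University graduation rate (%)

# Fitted value and residual (actual minus fitted).
predicted_graduation_rate = B0 + B1 * retention_rate
residual = actual_graduation_rate - predicted_graduation_rate

print(f"Predicted graduation rate: {predicted_graduation_rate:.1f}%")
print(f"Residual (actual - predicted): {residual:.1f} percentage points")
```

The negative residual of roughly 15 percentage points is what Minitab, after standardizing, reports as a large standardized residual.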

c. The 95% prediction interval is 28.74 to 49.52, or $2874 to $4952.

d. Yes, since the expected expense is ŷ = 10.528 + .9534(30) = 39.13, or $3913.

62 a The Minitab output is shown below:

The regression equation is

Predicted Values for New Observations

New Obs Fit SE Fit 95.0% CI 95.0% PI

1 14.783 0.896 ( 12.294, 17.271) ( 9.957, 19.608)

b. Since the p-value corresponding to F = 11.33 is .028 < α = .05, the relationship is significant.

c. r² = .739; a good fit. The least squares line explained 73.9% of the variability in the number of defects.

d. Using the Minitab output in part (a), the 95% confidence interval is 12.294 to 17.271.

63 a

d. r² = .711. The estimated regression equation explained 71.1% of the variability in y; this is a reasonably good fit.

e. The 95% confidence interval is 5.195 to 7.559, or approximately 5.2 to 7.6 days.


Total 19 1.00200

Predicted Values for New Observations

New Obs Fit SE Fit 95.0% CI 95.0% PI

1 0.8828 0.0523 ( 0.7729, 0.9927) ( 0.4306, 1.3349)

b. Since the p-value = 0.038 is less than α = .05, the relationship is significant.

c. r² = .217. The least squares line does not provide a very good fit.

d The 95% confidence interval is .7729 to .9927.

68 a

b There appears to be a positive linear relationship between the two variables

c A portion of the Minitab output for this problem follows

The regression equation is

Rating (%) = 9.4 + 1.29 Top Five (%)

Predictor Coef SE Coef T P

ŷ = 9.37 + 1.2875 Top Five (%)

d. Since the p-value = .000 < α = .05, the relationship is significant.

e. r² = .741; a good fit. The least squares line explained 74.1% of the variability in the satisfaction rating.

f. r_xy = √r² = √.741 = .86
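Part (f) recovers the sample correlation coefficient from the coefficient of determination. In simple linear regression r_xy takes the sign of the slope b1, so the square root is positive here because the estimated slope (1.2875) is positive. A minimal sketch, with an illustrative helper name that is not part of the original:

```python
import math


def correlation_from_r_squared(r_squared: float, slope: float) -> float:
    """Sample correlation in simple linear regression: (sign of b1) * sqrt(r^2)."""
    return math.copysign(math.sqrt(r_squared), slope)


# Values quoted in the solution: r^2 = .741 and slope b1 = 1.2875.
r_xy = correlation_from_r_squared(0.741, 1.2875)
print(f"r_xy = {r_xy:.2f}")
```

With a negative slope the same helper returns the negative root, which is the case for the temperature rating and price example earlier in the chapter.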

Date posted: 09/10/2019, 23:11