
Statistics for Business and Economics chapter 16 Regression Analysis: Model Building


Regression Analysis: Model Building

Learning Objectives

1 Learn how the general linear model can be used to model problems involving curvilinear relationships.

2 Understand the concept of interaction and how it can be accounted for in the general linear model.

3 Understand how an F test can be used to determine when to add or delete one or more variables.

4 Develop an appreciation for the complexities involved in solving larger regression analysis problems.

5 Understand how variable selection procedures can be used to choose a set of independent variables for an estimated regression equation.

6 Learn how analysis of variance and experimental design problems can be analyzed using a regression model.

7 Know how the Durbin-Watson test can be used to test for autocorrelation.


1 a The Minitab output is shown below:

The regression equation is

[Figure: scatter diagram; horizontal axis: x]

The scatter diagram suggests that a curvilinear relationship may be appropriate.

d The Minitab output is shown below:

The regression equation is

2 a The Minitab output is shown below:

The regression equation is

Y = 9.32 + 0.424 X

Predictor Coef SE Coef T p

Constant 9.315 4.196 2.22 0.113


b The Minitab output is shown below:

The regression equation is

3 a The scatter diagram shows some evidence of a possible linear relationship

b The Minitab output is shown below:

The regression equation is

Y = 2.32 + 0.637 X

Predictor     Coef  SE Coef     T      p
Constant     2.322    1.887  1.23  0.258
X           0.6366   0.3044  2.09  0.075

transformation and the corresponding standardized residual plot are shown below

The regression equation is


4 a The Minitab output is shown below:

The regression equation is

Total 5 37395

b p-value = .005 < α = .01; reject H0

5 The Minitab output is shown below:

The regression equation is


b Since the linear relationship was significant (Exercise 4), this relationship must be significant

Note also that since the p-value of .003 < α = .05, we can reject H0.

c The fitted value is 1302.01, with a standard deviation of 9.93. The 95% confidence interval is 1270.41 to 1333.61; the 95% prediction interval is 1242.55 to 1361.47.
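The two intervals in part (c) are both centered on the fitted value and share one t multiplier, so the reported numbers can be cross-checked with simple arithmetic. The Python sketch below backs out the implied multiplier from the confidence interval and the implied standard error of an individual prediction from the prediction interval. The variable names are illustrative and are not part of the Minitab output.

```python
# Values reported in part (c).
fit = 1302.01
se_fit = 9.93                       # standard deviation of the fitted value
ci_low, ci_high = 1270.41, 1333.61  # 95% confidence interval
pi_low, pi_high = 1242.55, 1361.47  # 95% prediction interval

# Half-widths (margins) of the two intervals.
ci_margin = (ci_high - ci_low) / 2
pi_margin = (pi_high - pi_low) / 2

# CI margin = t * se_fit, so the implied t multiplier is:
t_mult = ci_margin / se_fit

# PI margin = t * s_ind, so the implied standard error of an
# individual prediction is:
s_ind = pi_margin / t_mult

print(round(t_mult, 3))
print(round(s_ind, 2))
```

The implied multiplier is about 3.182, which is the tabled t value for an upper-tail area of .025 with 3 degrees of freedom.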

6 a The scatter diagram is shown below:

[Figure: scatter diagram]

b No; the relationship appears to be curvilinear.

c Several possible models can be fitted to these data, as shown below:


7 a The scatter diagram is shown below:

[Figure: scatter diagram]

A simple linear regression model does not appear to be appropriate. There appears to be a curvilinear relationship between the two variables.

b The Minitab output is shown below:

The regression equation is

R denotes an observation with a large standardized residual


The corresponding standardized residual plot is shown below:

[Figure: standardized residual plot]

There is an unusual trend in the points. There is also some indication that the variance may not be constant.

c The Minitab output is shown below:

The regression equation is

R denotes an observation with a large standardized residual


The corresponding standardized residual plot is shown below:

[Figure: standardized residual plot]

d The Minitab output is shown below:

The regression equation is

R denotes an observation with a large standardized residual


The corresponding standardized residual plot is shown below:

[Figure: standardized residual plot]

8 a The scatter diagram is shown below:

[Figure: scatter diagram; axis label: Rating]

A simple linear regression model does not appear to be appropriate. There appears to be a curvilinear relationship between the two variables.

b The Minitab output is shown below:

The regression equation is

Price = 33829 - 4571 Rating + 154 RatingSq

Predictor Coef SE Coef T P

c The Minitab output is shown below:

The regression equation is


A simple linear regression model appears to be appropriate.

b

Note the line drawn through the data. This line indicates a possible curvilinear relationship between these two variables.

c In the Minitab output that follows, IndexSq denotes the square of the Cost-of-Living Index.

The regression equation is

Creative Class (%) = 49.2 - 0.673 Cost-of-Living Index + 0.00282 IndexSq + 0.404 Income

Predictor                 Coef   SE Coef      T      P
Constant                 49.24     17.25   2.85  0.006
Cost-of-Living Index   -0.6725    0.2888  -2.33  0.024
IndexSq               0.002821  0.001223   2.31  0.026
Income                 0.40418   0.06772   5.97  0.000


10 a SSR = SST - SSE = 1030

Using Excel or Minitab, the p-value corresponding to F = 49.52 is .000.

Because p-value ≤ α, x1 is significant.

100 / 23

Using Excel or Minitab, the p-value corresponding to F = 48.3 is .000.

Because p-value ≤ α, the addition of variables x2 and x3 is significant.

11 a SSE = SST - SSR = 1805 - 1760 = 45

F = 440/1.8 = 244.44

Using Excel or Minitab, the p-value corresponding to F = 244.44 is .000.

Because p-value ≤ α, the overall relationship is significant.
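The arithmetic behind the overall F test in part (a) can be sketched in Python as follows. The function name and arguments are illustrative. The count of four independent variables is taken from SSE(x1, x2, x3, x4) in part (b), and the 25 error degrees of freedom are an assumption inferred from the reported mean square error of 1.8 (45/25).

```python
def overall_f(sst, ssr, num_vars, error_df):
    """Return SSE, MSR, MSE and the overall F statistic."""
    sse = sst - ssr        # error sum of squares
    msr = ssr / num_vars   # mean square due to regression
    mse = sse / error_df   # mean square error
    return sse, msr, mse, msr / mse


# Exercise 11(a): SST = 1805, SSR = 1760.
# error_df = 25 is an assumption that reproduces MSE = 1.8.
sse, msr, mse, f_stat = overall_f(sst=1805, ssr=1760, num_vars=4, error_df=25)
print(sse, msr, round(mse, 2), round(f_stat, 2))
```

With these inputs the sketch reproduces SSE = 45, the numerator 440, the denominator 1.8, and F = 244.44.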

b SSE(x1, x2, x3, x4) = 45

c SSE(x2, x3) = 1805 - 1705 = 100


d F = [(100 - 45) / 2] / 1.8 = 15.28

Using Excel or Minitab, the p-value corresponding to F = 15.28 is .000.

Because p-value ≤ α, x1 and x4 contribute significantly to the model.
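The test in part (d) compares the error sum of squares of the reduced model (x2 and x3 only) with that of the full model. A minimal Python sketch of the calculation follows; the function and argument names are illustrative.

```python
def partial_f(sse_reduced, sse_full, num_tested, mse_full):
    """F statistic for testing a group of variables left out of the reduced model."""
    return ((sse_reduced - sse_full) / num_tested) / mse_full


# Exercise 11(d): SSE(x2, x3) = 100, SSE(x1, x2, x3, x4) = 45,
# two variables tested (x1 and x4), MSE of the full model = 1.8.
f_stat = partial_f(sse_reduced=100, sse_full=45, num_tested=2, mse_full=1.8)
print(round(f_stat, 2))
```

The same function applies to any comparison of nested models, provided the mean square error comes from the full model.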

12 a A portion of the Minitab output follows:

The regression equation is

Scoring Avg = 46.3 + 14.1 Putting Avg.

Predictor Coef SE Coef T P

b A portion of the Minitab output follows:

The regression equation is

Scoring Avg = 59.0 - 10.3 Greens in Reg + 11.4 Putting Avg - 1.81 Sand Saves

Predictor Coef SE Coef T P


F = [(SSE(reduced) - SSE(full)) / 2] / [SSE(full) / 26] = [(7.2998 - 4.3240) / 2] / (4.3240 / 26) = 8.95

The p-value associated with F = 8.95 (2 degrees of freedom numerator and 26 denominator) is .001. With a p-value < α = .05, the addition of the two independent variables is statistically significant.
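The F statistic and its p-value can be checked without statistical software in this particular case. When the numerator has exactly 2 degrees of freedom, the upper-tail area of the F distribution has the closed form (1 + 2F/d)^(-d/2), where d is the denominator degrees of freedom. The Python sketch below uses that identity; it does not apply to other numerator degrees of freedom, and the function name is illustrative.

```python
def f_pvalue_num_df_2(f_stat, den_df):
    """Upper-tail p-value for an F statistic with 2 numerator degrees of freedom."""
    return (1 + 2 * f_stat / den_df) ** (-den_df / 2)


# Exercise 12: SSE(reduced) = 7.2998, SSE(full) = 4.3240,
# 2 added variables, 26 error degrees of freedom in the full model.
f_stat = ((7.2998 - 4.3240) / 2) / (4.3240 / 26)
p_value = f_pvalue_num_df_2(f_stat, 26)

print(round(f_stat, 2))
print(round(p_value, 3))
```

For other numerator degrees of freedom, the p-value would come from Excel, Minitab, or a statistics library instead.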

13 a A portion of the Minitab output follows:

The regression equation is

Earnings ($1000) = 14528 - 7640 Putting Avg.

Predictor Coef SE Coef T P

b A portion of the Minitab output follows:

The regression equation is

Earnings ($1000) = 5214 + 6873 Greens in Reg - 5623 Putting Avg + 2217 Sand Saves

Predictor Coef SE Coef T P


The p-value associated with F = 16.25 (2 degrees of freedom numerator and 26 denominator) is .000. With a p-value < α = .05, the addition of the two independent variables is statistically significant.

d A portion of the Minitab output follows:

The regression equation is

Earnings ($1000) = 36697 - 501 Scoring Avg.

Predictor Coef SE Coef T P

14 a The Minitab output is shown below:

Risk = - 111 + 1.32 Age + 0.296 Pressure

Predictor Coef SE Coef T P

Total 19 4190.9

Source DF Seq SS

Age 1 1772.0

Pressure 1 1607.7


Unusual Observations

Obs   Age   Risk    Fit  SE Fit  Residual  St Resid
 17  66.0   8.00  25.05    1.67    -17.05    -2.54R

R denotes an observation with a large standardized residual

b The Minitab output is shown below:

Risk = - 123 + 1.51 Age + 0.448 Pressure + 8.87 Smoker -

The p-value associated with F = 4.23 (2 numerator and 15 denominator DF) is approximately .035.

Because p-value ≤ α = .05, the addition of the two terms is significant.

15 a A portion of the Minitab output follows:

The regression equation is

ERA = - 0.253 + 0.453 H/9

Predictor Coef SE Coef T P

Constant -0.2535 0.7351 -0.34 0.732


b A portion of the Minitab output follows:

The regression equation is

The p-value associated with F = 41.26 (2 degrees of freedom numerator and 46 denominator) is .000. With a p-value < α = .05, the addition of the two independent variables is statistically significant.

16 a The sample correlation coefficients are as follows:

       Weeks  Age  Educ  Married  Head  Tenure  Manager
Age    0.577


Cell Contents: Pearson correlation


The regression equation is

Weeks = - 0.07 + 1.73 Age - 28.7 Manager - 15.1 Head - 17.4 Sales

Predictor Coef SE Coef T P

d The results using Minitab’s Backward Elimination procedure are shown below:

Backward elimination Alpha-to-Remove: 0.05

Response is Weeks on 7 predictors, with N = 50

e The results using Minitab’s Best-Subset procedure are shown below:

The regression equation is

Weeks = 13.1 + 1.64 Age - 9.76 Married - 19.4 Head - 29.0 Manager - 19.0 Sales

Predictor Coef SE Coef T P


17 The output obtained using Minitab’s Best Subset Regression is shown below:

Response is Scoring Avg.

The regression equation is

Scoring Avg = - 88.1 + 0.591 Drive Average + 209 Greens in Reg.

+ 9.74 Putting Avg - 0.868 DriveGreens

Predictor Coef SE Coef T P

18 a Because the independent variable most highly correlated with RPG is OBP, it will provide the best one-variable estimated regression equation. The Minitab output using OBP to predict RPG is shown below:

The regression equation is


The regression equation is

RPG = - 0.909 + 32.2 OBP + 0.109 HR - 21.5 AVG + 0.244 3B - 0.0223 BB


BB appears to be a good choice

19 See the solution to Exercise 14 in this chapter. The Minitab output using the best subsets regression procedure is shown below:

Risk = - 91.8 + 1.08 Age + 0.252 Pressure + 8.74 Smoker

Predictor Coef SE Coef T P


x3 = 0 if block 1 and 1 if block 2

b The Minitab output is shown below:

The regression equation is

d The p-value of .004 is less than α = .05; therefore, we can reject H0 and conclude that the mean time to mix a batch of material is not the same for each manufacturer.

24 a The dummy variables are defined as follows:


The Minitab output is shown below:

The regression equation is

b Note: Estimating the mean drying time for paint 2 using the estimated regression equation developed in part (a) may not be the best approach because, at the 5% level of significance, we cannot reject H0. But if we want to use the output, we would proceed as follows:

D1 = 1 D2 = 0 D3 = 0

TIME = 133 + 6(1) + 3(0) +11(0) = 139
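The substitution above amounts to evaluating the estimated regression equation at one setting of the dummy variables. A minimal Python sketch, using the coefficients from the TIME equation above (the function name is illustrative):

```python
def predicted_time(d1, d2, d3):
    """Evaluate TIME = 133 + 6 D1 + 3 D2 + 11 D3 for one set of dummy values."""
    return 133 + 6 * d1 + 3 * d2 + 11 * d3


# Paint 2 is coded D1 = 1, D2 = 0, D3 = 0.
print(predicted_time(1, 0, 0))

# With every dummy variable at 0, only the constant term remains.
print(predicted_time(0, 0, 0))
```

Because at most one dummy variable equals 1 for any paint, each coefficient is the estimated difference in mean drying time between that paint and the one coded with all zeros.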

25 X1 = 0 if computerized analyzer, 1 if electronic analyzer

X2 and X3 are defined as follows:

To test for any significant difference between the two analyzers we must test H0: β1 = 0. Since the p-value corresponding to t = -4.54 is .045 < α = .05, we reject H0: β1 = 0; the time to do a tuneup is not the same for the two analyzers.

26 Size = 0 if a small advertisement and 1 if a large advertisement

DesignB and DesignC are defined as follows:

DesignB DesignC AdvertisementDesign

LargeDesignB denotes the interaction between Large and DesignB

LargeDesignC denotes the interaction between Large and DesignC
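The interaction variables are products of the size dummy and the design dummies. The Python sketch below builds one row of predictor values for each size and design combination. It assumes DesignB = 1 only for design B and DesignC = 1 only for design C, with design A coded as all zeros; that coding is an assumption, since the defining table is not reproduced here. The function name is illustrative.

```python
def design_row(large, design):
    """Return the predictor values for one advertisement.

    large:  0 for a small advertisement, 1 for a large one
    design: "A", "B" or "C" (design A as the all-zero baseline is an assumption)
    """
    design_b = 1 if design == "B" else 0
    design_c = 1 if design == "C" else 0
    return {
        "Size": large,
        "DesignB": design_b,
        "DesignC": design_c,
        "LargeDesignB": large * design_b,  # interaction term
        "LargeDesignC": large * design_c,  # interaction term
    }


for large in (0, 1):
    for design in "ABC":
        print(large, design, design_row(large, design))
```

An interaction column equals 1 only when both of its components equal 1, so LargeDesignB picks out large advertisements that use design B.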


The complete data set and the Minitab output are shown below:

The regression equation is

Number = 10.0 + 0.00 Size + 8.00 DesignB + 4.00 DesignC + 10.0 LargeDesignB

The Minitab output using only Design B follows:

The regression equation is


Residual Error 10 250.00 25.00

Total 11 544.00

Thus, DesignB is significant using α = .05. However, the model involving just the interaction between Large and DesignB also provides some interesting results:

The regression equation is

conclusions about the relationships among the variables

27 a The Minitab output is shown below:

The regression equation is

b The Durbin-Watson statistic is .798118. At the .05 level of significance, dL = 1.20 and dU = 1.41. Because d < dL, there is significant positive autocorrelation.

28 From Minitab, d = 1.60. At the .05 level of significance, dL = 1.04 and dU = 1.77. Since dL ≤ d ≤ dU, the test is inconclusive.
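The decision rule used in Exercises 27 and 28 for a test of positive autocorrelation can be written as a short Python function. The function name and the wording of the returned strings are illustrative.

```python
def dw_positive_autocorrelation(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic for a test of positive autocorrelation."""
    if d < d_lower:
        return "significant positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


print(dw_positive_autocorrelation(0.798118, 1.20, 1.41))  # Exercise 27
print(dw_positive_autocorrelation(1.60, 1.04, 1.77))      # Exercise 28
```

The bounds dL and dU come from a Durbin-Watson table and depend on the sample size, the number of independent variables, and the level of significance.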

29 a The scatter diagram is shown below:

[Figure: scatter diagram]

The curvature in the scatter diagram indicates that a simple linear regression model may not be appropriate.

b The Minitab output is shown below:

The regression equation is

Rating = 49.9 + 14.9 Speed - 1.83 SpeedSq

Predictor Coef SE Coef T P
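To see what the second-order model in part (b) implies, the estimated equation can be evaluated directly. The Python sketch below uses the rounded coefficients from the equation above. The speed values in the loop are hypothetical inputs chosen only for illustration, and the function name is not part of the original solution.

```python
# Rounded coefficients from Rating = 49.9 + 14.9 Speed - 1.83 SpeedSq.
b0, b1, b2 = 49.9, 14.9, -1.83


def predicted_rating(speed):
    """Evaluate the estimated second-order regression equation."""
    return b0 + b1 * speed + b2 * speed ** 2


# Because b2 is negative, the fitted curve peaks where the slope
# b1 + 2 * b2 * speed equals zero.
turning_point = -b1 / (2 * b2)
print(round(turning_point, 2))

# Hypothetical speed values, for illustration only.
for speed in (2, 4, 6):
    print(speed, round(predicted_rating(speed), 1))
```

The negative coefficient on SpeedSq means the predicted rating rises with speed up to the turning point and declines beyond it, which is the curvature seen in the scatter diagram.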

c The Minitab output for the transformed nonlinear model is shown below:

The regression equation is


The estimated regression equation developed in part (b) provides a much better fit.

30 a

There appears to be a curvilinear relationship between weight and price

b A portion of the Minitab output follows:

The regression equation is

Price = 11376 - 728 Weight + 12.0 WeightSq

Predictor Coef SE Coef T P

c A portion of the Minitab output follows:

The regression equation is

Price = 1284 - 572 Type_Fitness - 907 Type_Comfort

Predictor Coef SE Coef T P


The regression equation is

Price = 5924 - 215 Weight - 6343 Type_Fitness - 7232 Type_Comfort + 261 WxF

31 a The Minitab output is shown below:

The regression equation is

Delay = 80.4 + 11.9 Industry - 4.82 Public - 2.62 Quality - 4.07 Finished

Predictor Coef SE Coef T P

Regression 4 2587.7 646.9 5.42 0.002

Residual Error 35 4176.3 119.3

Total 39 6764.0

b The low value of the adjusted coefficient of determination (31.2%) indicates that the estimated regression equation does not provide a good fit.
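The adjusted coefficient of determination and the F statistic can both be recovered from the analysis of variance rows in part (a). A minimal Python sketch, with illustrative variable names:

```python
# Sums of squares and degrees of freedom from the ANOVA rows in part (a).
ssr, df_regression = 2587.7, 4
sse, df_error = 4176.3, 35
sst, df_total = 6764.0, 39

r_sq = ssr / sst                                     # coefficient of determination
adj_r_sq = 1 - (sse / df_error) / (sst / df_total)   # adjusted for degrees of freedom
f_stat = (ssr / df_regression) / (sse / df_error)    # MSR / MSE

print(round(100 * r_sq, 1))
print(round(100 * adj_r_sq, 1))
print(round(f_stat, 2))
```

The adjusted value works out to 31.2% and the F statistic to 5.42, matching the output.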

c The scatter diagram is shown below:

The scatter diagram suggests a curvilinear relationship between these two variables

d The output from Minitab’s best subsets procedure is shown below, where FinishedSq is the square

The estimated regression equation using Industry, Quality, Finished, and FinishedSq has an adjusted coefficient of determination of 54.4%.

32 The computer output is shown below:

The regression equation is

Total 39 6764.0

Durbin-Watson statistic = 1.55

At the .05 level of significance, dL = 1.44 and dU = 1.54. Since d = 1.55 > dU, there is no significant positive autocorrelation.

33 a The Minitab output is shown below:

The regression equation is

Delay = 70.6 + 12.7 Industry - 2.92 Quality

Predictor Coef SE Coef T p

Total 39 6764.0

Durbin-Watson statistic = 1.43


b The residual plot as a function of the order in which the data are presented is shown below:

[Figure: residual plot; horizontal axis: Order In Which Data Are Presented]

There is no obvious pattern in the data indicative of positive autocorrelation.

c At the .05 level of significance, dL = 1.39 and dU = 1.60. Since dL ≤ d ≤ dU, the test is inconclusive.
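For reference, the Durbin-Watson statistic that Minitab reports is the sum of squared differences between successive residuals divided by the sum of squared residuals. The Python sketch below implements that formula. The residual lists are hypothetical and chosen only to show the two extremes; they are not the residuals from this exercise.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that alternate in sign give a large statistic.
print(durbin_watson([1, -1, 1, -1]))  # 3.0

# Hypothetical residuals that never change give a statistic of zero.
print(durbin_watson([2, 2, 2, 2]))    # 0.0
```

Values near 2 suggest no autocorrelation, while values well below 2 point toward positive autocorrelation, which is why d is compared with the lower and upper bounds.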

34 The dummy variables are defined as follows:

The Minitab output is shown below:

The regression equation is

Since the p-value = .034 is less than α = .05, there are significant differences between comfort levels for the three types of browsers.

35 Let Mid-size = 1 if a mid-size car, 0 otherwise; Luxury = 1 if a luxury car, 0 otherwise; and Sports = 1 if a sports car, 0 otherwise.

The Minitab output is shown below:

The regression equation is

Resale = 32.9 - 1.70 Mid-size + 4.30 Luxury + 7.30 Sports

Predictor Coef SE Coef T P
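With this dummy coding, the constant is the estimated mean resale value for the car type coded with all zeros, and each coefficient is the estimated difference from that baseline. The Python sketch below evaluates the equation for each type using the rounded coefficients above. The dictionary, the function name, and the "baseline" label are illustrative, since the baseline category is not named in the text above.

```python
# Rounded coefficients from the estimated regression equation.
coefficients = {"Mid-size": -1.70, "Luxury": 4.30, "Sports": 7.30}
constant = 32.9


def predicted_resale(car_type=None):
    """Estimated mean resale value; None means every dummy variable is 0."""
    return constant + coefficients.get(car_type, 0.0)


print("baseline", round(predicted_resale(), 1))
for car_type in coefficients:
    print(car_type, round(predicted_resale(car_type), 1))
```

Rounding is applied only for display, since sums such as 32.9 - 1.70 are not exact in floating point.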

Date posted: 09/10/2019, 23:11