Regression Analysis: Model Building
Learning Objectives
1 Learn how the general linear model can be used to model problems involving curvilinear relationships.
2 Understand the concept of interaction and how it can be accounted for in the general linear model.
3 Understand how an F test can be used to determine when to add or delete one or more variables.
4 Develop an appreciation for the complexities involved in solving larger regression analysis problems.
5 Understand how variable selection procedures can be used to choose a set of independent variables for an estimated regression equation.
6 Learn how analysis of variance and experimental design problems can be analyzed using a regression model.
7 Know how the Durbin-Watson test can be used to test for autocorrelation.
1 a The Minitab output is shown below:
The regression equation is
[Scatter diagram; horizontal axis: x]
The scatter diagram suggests that a curvilinear relationship may be appropriate.
d The Minitab output is shown below:
The regression equation is
2 a The Minitab output is shown below:
The regression equation is
Y = 9.32 + 0.424 X
Predictor Coef SE Coef T p
Constant 9.315 4.196 2.22 0.113
b The Minitab output is shown below:
The regression equation is
3 a The scatter diagram shows some evidence of a possible linear relationship
b The Minitab output is shown below:
The regression equation is
Y = 2.32 + 0.637 X
Predictor Coef SE Coef T p
Constant 2.322 1.887 1.23 0.258
X 0.6366 0.3044 2.09 0.075
transformation and the corresponding standardized residual plot are shown below:
The regression equation is
[Standardized residual plot]
4 a The Minitab output is shown below:
The regression equation is
Total 5 37395
b p-value = .005 < α = .01; reject H0
5 The Minitab output is shown below:
The regression equation is
b Since the linear relationship was significant (Exercise 4), this relationship must be significant. Note also that since the p-value of .003 < α = .05, we can reject H0.
c The fitted value is 1302.01, with a standard deviation of 9.93. The 95% confidence interval is 1270.41 to 1333.61; the 95% prediction interval is 1242.55 to 1361.47.
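The confidence interval in part (c) can be checked by hand as fitted value ± t × (standard deviation of the fit). The sketch below is only a check under an assumption: the critical value 3.182 (the .025 upper-tail t value for 3 error degrees of freedom) is not stated in the solution and is supplied here for illustration; the function name is hypothetical.

```python
def interval(fit, se_fit, t_crit):
    """Return (lower, upper) for fit ± t_crit * se_fit."""
    half_width = t_crit * se_fit
    return fit - half_width, fit + half_width

# Fitted value and its standard deviation are taken from part (c).
# ASSUMPTION: t_crit = 3.182 (t.025 with 3 error degrees of freedom).
lower, upper = interval(fit=1302.01, se_fit=9.93, t_crit=3.182)
print(f"{lower:.2f} to {upper:.2f}")
```

Under that assumed critical value the limits agree with the reported 1270.41 to 1333.61. The prediction interval cannot be checked the same way from the figures given, because it also needs the standard error of the estimate.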
6 a The scatter diagram is shown below:
[Scatter diagram]
b No; the relationship appears to be curvilinear.
c Several possible models can be fitted to these data, as shown below:
7 a The scatter diagram is shown below:
[Scatter diagram]
A simple linear regression model does not appear to be appropriate. There appears to be a curvilinear relationship between the two variables.
b The Minitab output is shown below:
The regression equation is
R denotes an observation with a large standardized residual
The corresponding standardized residual plot is shown below:
[Standardized residual plot]
There is an unusual trend in the points. There is also some indication that the variance may not be constant.
c The Minitab output is shown below:
The regression equation is
R denotes an observation with a large standardized residual
The corresponding standardized residual plot is shown below:
[Standardized residual plot]
d The Minitab output is shown below:
The regression equation is
R denotes an observation with a large standardized residual
The corresponding standardized residual plot is shown below:
[Standardized residual plot]
8 a The scatter diagram is shown below:
[Scatter diagram; axis label: Rating]
A simple linear regression model does not appear to be appropriate. There appears to be a curvilinear relationship between the two variables.
b The Minitab output is shown below:
The regression equation is
Price = 33829 - 4571 Rating + 154 RatingSq
Predictor Coef SE Coef T P
c The Minitab output is shown below:
The regression equation is
A simple linear regression model appears to be appropriate.
b
Note the line drawn through the data. This line indicates a possible curvilinear relationship between these two variables.
c In the Minitab output that follows, IndexSq denotes the square of the Cost-of-Living Index.
The regression equation is
Creative Class (%) = 49.2 - 0.673 Cost-of-Living Index + 0.00282 IndexSq + 0.404 Income
Predictor              Coef       SE Coef    T      P
Constant               49.24      17.25      2.85   0.006
Cost-of-Living Index   -0.6725    0.2888     -2.33  0.024
IndexSq                0.002821   0.001223   2.31   0.026
Income                 0.40418    0.06772    5.97   0.000
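To make the role of the squared term concrete, the estimated equation can be evaluated for a single observation. This is a minimal sketch: the function name and the input values (an index of 100 and an income value of 50) are hypothetical and chosen only to show the arithmetic, and the coefficients are the rounded ones from the regression equation.

```python
def predict_creative_class(index, income):
    """Evaluate the estimated equation; IndexSq is derived from index."""
    index_sq = index ** 2  # IndexSq = square of the Cost-of-Living Index
    return 49.2 - 0.673 * index + 0.00282 * index_sq + 0.404 * income

# Hypothetical inputs, not taken from the data set.
estimate = predict_creative_class(index=100, income=50)
print(round(estimate, 1))
```

With these illustrative inputs the terms are 49.2 - 67.3 + 28.2 + 20.2, so the estimate is about 30.3. The point is only that IndexSq is not a separate input; it is computed from the index before the equation is applied.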
10 a SSR = SST - SSE = 1030
Using Excel or Minitab, the p-value corresponding to F = 49.52 is .000.
Because p-value ≤ α, x1 is significant.
F = [(SSE(reduced) - SSE(full)) / 2] / (100 / 23) = 48.3
Using Excel or Minitab, the p-value corresponding to F = 48.3 is .000.
Because p-value ≤ α, the addition of variables x2 and x3 is significant.
11 a SSE = SST - SSR = 1805 - 1760 = 45
F = 440/1.8 = 244.44
Using Excel or Minitab, the p-value corresponding to F = 244.44 is .000.
Because p-value ≤ α, the overall relationship is significant.
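The arithmetic in part (a) can be reproduced with a short sketch. The function name is hypothetical. Two inputs are inferred rather than stated in this part: four independent variables (from SSE(x1, x2, x3, x4) in part b) and 25 error degrees of freedom (the value implied by MSE = 45/25 = 1.8).

```python
def overall_f(sst, ssr, num_predictors, df_error):
    """Return (SSE, MSR, MSE, F) for the overall significance test."""
    sse = sst - ssr
    msr = ssr / num_predictors   # mean square due to regression
    mse = sse / df_error         # mean square error
    return sse, msr, mse, msr / mse

# ASSUMPTION: 4 predictors and 25 error degrees of freedom (inferred, see lead-in).
sse, msr, mse, f_stat = overall_f(sst=1805, ssr=1760, num_predictors=4, df_error=25)
print(sse, msr, mse, round(f_stat, 2))
```

With these inputs the sketch reproduces SSE = 45, MSR = 440, MSE = 1.8, and F = 244.44.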
b SSE(x1, x2, x3, x4) = 45
c SSE(x2, x3) = 1805 - 1705 = 100
d F = [(100 - 45) / 2] / 1.8 = 15.28
Using Excel or Minitab, the p-value corresponding to F = 15.28 is .000.
Because p-value ≤ α, x1 and x4 contribute significantly to the model.
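The test in part (d) compares the reduced model (x2, x3) with the full model (x1, x2, x3, x4). A minimal sketch of that calculation follows; the function and parameter names are hypothetical, and the 25 error degrees of freedom for the full model are inferred from MSE = 45/25 = 1.8 rather than stated.

```python
def partial_f(sse_reduced, sse_full, num_added, df_error_full):
    """F statistic for testing whether the added variables are significant."""
    numerator = (sse_reduced - sse_full) / num_added
    denominator = sse_full / df_error_full  # MSE of the full model
    return numerator / denominator

# SSE(x2, x3) = 100 and SSE(x1, x2, x3, x4) = 45 come from parts (b) and (c).
# ASSUMPTION: 25 error degrees of freedom for the full model.
f_stat = partial_f(sse_reduced=100, sse_full=45, num_added=2, df_error_full=25)
print(round(f_stat, 2))
```

The same function applies whenever one model is nested in another: only the two SSE values, the number of added variables, and the full model's error degrees of freedom change.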
12 a A portion of the Minitab output follows:
The regression equation is
Scoring Avg = 46.3 + 14.1 Putting Avg.
Predictor Coef SE Coef T P
b A portion of the Minitab output follows:
The regression equation is
Scoring Avg = 59.0 - 10.3 Greens in Reg + 11.4 Putting Avg - 1.81 Sand Saves
Predictor Coef SE Coef T P
F = [(SSE(reduced) - SSE(full)) / 2] / [SSE(full) / 26] = [(7.2998 - 4.3240) / 2] / (4.3240 / 26) = 8.95
The p-value associated with F = 8.95 (2 degrees of freedom numerator and 26 denominator) is .001. With a p-value < α = .05, the addition of the two independent variables is statistically significant.
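When exactly two variables are added, the upper-tail p-value of the F statistic has a simple closed form, (1 + 2F/df2)^(-df2/2), so the reported value can be checked without statistical software. The sketch below uses that identity; it is not how Minitab computes the value, the function name is hypothetical, and the formula is valid only for 2 numerator degrees of freedom.

```python
def f_upper_tail_2df(f_value, df_denominator):
    """Upper-tail p-value for an F statistic with exactly 2 numerator df."""
    return (1 + 2 * f_value / df_denominator) ** (-df_denominator / 2)

# SSE values and degrees of freedom are those quoted in the solution.
f_stat = ((7.2998 - 4.3240) / 2) / (4.3240 / 26)
p_value = f_upper_tail_2df(f_stat, 26)
print(round(f_stat, 2), round(p_value, 3))
```

For other numerator degrees of freedom a general F distribution routine is needed instead.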
13 a A portion of the Minitab output follows:
The regression equation is
Earnings ($1000) = 14528 - 7640 Putting Avg.
Predictor Coef SE Coef T P
b A portion of the Minitab output follows:
The regression equation is
Earnings ($1000) = 5214 + 6873 Greens in Reg - 5623 Putting Avg + 2217 Sand Saves
Predictor Coef SE Coef T P
The p-value associated with F = 16.25 (2 degrees of freedom numerator and 26 denominator) is .000. With a p-value < α = .05, the addition of the two independent variables is statistically significant.
d A portion of the Minitab output follows:
The regression equation is
Earnings ($1000) = 36697 - 501 Scoring Avg.
Predictor Coef SE Coef T P
14 a The Minitab output is shown below:
Risk = - 111 + 1.32 Age + 0.296 Pressure
Predictor Coef SE Coef T P
Total 19 4190.9
Source DF Seq SS
Age 1 1772.0
Pressure 1 1607.7
Unusual Observations
Obs Age Risk Fit SE Fit Residual St Resid
17 66.0 8.00 25.05 1.67 -17.05 -2.54R
R denotes an observation with a large standardized residual
b The Minitab output is shown below:
Risk = - 123 + 1.51 Age + 0.448 Pressure + 8.87 Smoker -
The p-value associated with F = 4.23 (2 numerator and 15 denominator DF) is .035.
Because p-value ≤ α = .05, the addition of the two terms is significant.
15 a A portion of the Minitab output follows:
The regression equation is
ERA = - 0.253 + 0.453 H/9
Predictor Coef SE Coef T P
Constant -0.2535 0.7351 -0.34 0.732
b A portion of the Minitab output follows:
The regression equation is
The p-value associated with F = 41.26 (2 degrees of freedom numerator and 46 denominator) is .000. With a p-value < α = .05, the addition of the two independent variables is statistically significant.
16 a The sample correlation coefficients are as follows:
Weeks Age Educ Married Head Tenure Manager
Age 0.577
Cell Contents: Pearson correlation
The regression equation is
Weeks = - 0.07 + 1.73 Age - 28.7 Manager - 15.1 Head - 17.4 Sales
Predictor Coef SE Coef T P
d The results using Minitab’s Backward Elimination procedure are shown below:
Backward elimination Alpha-to-Remove: 0.05
Response is Weeks on 7 predictors, with N = 50
e The results using Minitab’s Best-Subset procedure are shown below:
The regression equation is
Weeks = 13.1 + 1.64 Age - 9.76 Married - 19.4 Head - 29.0 Manager - 19.0 Sales
Predictor Coef SE Coef T P
17 The output obtained using Minitab’s Best Subset Regression is shown below:
Response is Scoring Avg.
The regression equation is
Scoring Avg = - 88.1 + 0.591 Drive Average + 209 Greens in Reg.
+ 9.74 Putting Avg - 0.868 DriveGreens
Predictor Coef SE Coef T P
18 a Because the independent variable most highly correlated with RPG is OBP, it will provide the best one-variable estimated regression equation. The Minitab output using OBP to predict RPG is shown below:
The regression equation is
The regression equation is
RPG = - 0.909 + 32.2 OBP + 0.109 HR - 21.5 AVG + 0.244 3B - 0.0223 BB
BB appears to be a good choice
19 See the solution to Exercise 14 in this chapter. The Minitab output using the best subsets regression procedure is shown below:
Risk = - 91.8 + 1.08 Age + 0.252 Pressure + 8.74 Smoker
Predictor Coef SE Coef T P
x3 = 0 if block 1 and 1 if block 2
b The Minitab output is shown below:
The regression equation is
d The p-value of .004 is less than α = .05; therefore, we can reject H0 and conclude that the mean time to mix a batch of material is not the same for each manufacturer.
24 a The dummy variables are defined as follows:
The Minitab output is shown below:
The regression equation is
b Note: Estimating the mean drying time for paint 2 using the estimated regression equation developed in part (a) may not be the best approach because, at the 5% level of significance, we cannot reject H0. But if we want to use the output, we would proceed as follows:
D1 = 1, D2 = 0, D3 = 0
TIME = 133 + 6(1) + 3(0) + 11(0) = 139
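The substitution above amounts to switching on one dummy variable at a time. As a small sketch, with the coefficients read off the calculation in part (b) and a hypothetical function name:

```python
def estimated_time(d1, d2, d3):
    """Evaluate TIME = 133 + 6 D1 + 3 D2 + 11 D3 for 0/1 dummy values."""
    return 133 + 6 * d1 + 3 * d2 + 11 * d3

# Paint 2 corresponds to D1 = 1, D2 = 0, D3 = 0 in the solution.
paint2 = estimated_time(1, 0, 0)
# With every dummy equal to 0 only the intercept remains.
baseline = estimated_time(0, 0, 0)
print(paint2, baseline)
```

Each coefficient is therefore the estimated difference in mean drying time between the corresponding paint and the baseline paint.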
25 X1 = 0 if computerized analyzer, 1 if electronic analyzer
X2 and X3 are defined as follows:
To test for any significant difference between the two analyzers we must test H0: β1 = 0. Since the p-value corresponding to t = -4.54 is .045 < α = .05, we reject H0: β1 = 0; the time to do a tuneup is not the same for the two analyzers.
26 Size = 0 if a small advertisement and 1 if a large advertisement
DesignB and DesignC are defined as follows:
DesignB DesignC AdvertisementDesign
LargeDesignB denotes the interaction between Large and DesignB
LargeDesignC denotes the interaction between Large and DesignC
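An interaction variable is simply the product of the two dummy variables it combines. The sketch below builds one row of predictors for an advertisement. It assumes the usual coding in which design A is the baseline (DesignB = 1 only for design B, DesignC = 1 only for design C); that coding table is not reproduced here, so treat it as an assumption. The function name is hypothetical.

```python
def design_row(large, design):
    """Return the predictor values for one advertisement."""
    size = 1 if large else 0               # Size: 0 = small, 1 = large
    design_b = 1 if design == "B" else 0   # ASSUMPTION: design A is the baseline
    design_c = 1 if design == "C" else 0
    return {
        "Size": size,
        "DesignB": design_b,
        "DesignC": design_c,
        "LargeDesignB": size * design_b,   # interaction: Large x DesignB
        "LargeDesignC": size * design_c,   # interaction: Large x DesignC
    }

print(design_row(large=True, design="B"))
print(design_row(large=False, design="B"))
```

An interaction column equals 1 only when both of its component dummies equal 1, which is what lets the effect of a design differ between small and large advertisements.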
The complete data set and the Minitab output are shown below:
The regression equation is
Number = 10.0 + 0.00 Size + 8.00 DesignB + 4.00 DesignC + 10.0 LargeDesignB
The Minitab output using only Design B follows:
The regression equation is
Residual Error 10 250.00 25.00
Total 11 544.00
Thus, DesignB is significant using α = .05. However, the model involving just the interaction between Large and DesignB also provides some interesting results:
The regression equation is
conclusions about the relationships among the variables
27 a The Minitab output is shown below:
The regression equation is
b The Durbin-Watson statistic is .798118. At the .05 level of significance, dL = 1.20 and dU = 1.41. Because d < dL, there is significant positive autocorrelation.
28 From Minitab, d = 1.60. At the .05 level of significance, dL = 1.04 and dU = 1.77. Since dL ≤ d ≤ dU, the test is inconclusive.
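The decision rule used in Exercises 27 and 28 can be written out as a short sketch. The statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals; the bounds dL and dU still have to be looked up in a Durbin-Watson table. Function names are hypothetical, and the residual list in the example is made up purely to exercise the formula.

```python
def durbin_watson(residuals):
    """d = sum of (e_t - e_{t-1})^2 divided by sum of e_t^2."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def positive_autocorrelation_test(d, d_lower, d_upper):
    """Apply the test for positive autocorrelation using table bounds."""
    if d < d_lower:
        return "significant positive autocorrelation"
    if d > d_upper:
        return "no significant positive autocorrelation"
    return "inconclusive"

print(positive_autocorrelation_test(0.798118, 1.20, 1.41))  # Exercise 27 values
print(positive_autocorrelation_test(1.60, 1.04, 1.77))      # Exercise 28 values
print(durbin_watson([1, -1, 1, -1]))                        # made-up residuals
```

The sketch covers only the one-sided test for positive autocorrelation, which is the test applied in these solutions.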
29 a The scatter diagram is shown below:
[Scatter diagram]
The curvature in the scatter diagram indicates that a simple linear regression model may not be appropriate.
b The Minitab output is shown below:
The regression equation is
Rating = 49.9 + 14.9 Speed - 1.83 SpeedSq
Predictor Coef SE Coef T P
c The Minitab output for the transformed nonlinear model is shown below:
The regression equation is
The estimated regression equation developed in part (b) provides a much better fit.
30 a
There appears to be a curvilinear relationship between weight and price
b A portion of the Minitab output follows:
The regression equation is
Price = 11376 - 728 Weight + 12.0 WeightSq
Predictor Coef SE Coef T P
c A portion of the Minitab output follows:
The regression equation is
Price = 1284 - 572 Type_Fitness - 907 Type_Comfort
Predictor Coef SE Coef T P
The regression equation is
Price = 5924 - 215 Weight - 6343 Type_Fitness - 7232 Type_Comfort + 261 WxF
31 a The Minitab output is shown below:
The regression equation is
Delay = 80.4 + 11.9 Industry - 4.82 Public - 2.62 Quality - 4.07 Finished
Predictor Coef SE Coef T P
Regression 4 2587.7 646.9 5.42 0.002
Residual Error 35 4176.3 119.3
Total 39 6764.0
b The low value of the adjusted coefficient of determination (31.2%) does not indicate a good fit.
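The 31.2% figure can be recovered from the analysis of variance table in part (a), since the adjusted coefficient of determination equals 1 - (SSE/df error)/(SST/df total). A minimal sketch, with hypothetical function and variable names:

```python
def adjusted_r_squared(sse, sst, df_error, df_total):
    """Adjusted R-sq = 1 - (SSE / df_error) / (SST / df_total)."""
    return 1 - (sse / df_error) / (sst / df_total)

# Values from the ANOVA table: Residual Error 35 4176.3, Total 39 6764.0.
adj_r_sq = adjusted_r_squared(sse=4176.3, sst=6764.0, df_error=35, df_total=39)

# The F statistic in the same table is MSR / MSE.
f_stat = (2587.7 / 4) / (4176.3 / 35)
print(round(100 * adj_r_sq, 1), round(f_stat, 2))
```

Both results agree with the Minitab output (31.2% and F = 5.42).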
c The scatter diagram is shown below:
The scatter diagram suggests a curvilinear relationship between these two variables
d The output from Minitab’s best subsets procedure is shown below, where FinishedSq is the square of Finished.
The estimated regression equation using Industry, Quality, Finished, and FinishedSq has an adjusted coefficient of determination of 54.4%.
32 The computer output is shown below:
The regression equation is
Total 39 6764.0
Durbin-Watson statistic = 1.55
At the .05 level of significance, dL = 1.44 and dU = 1.54. Since d = 1.55 > dU, there is no significant positive autocorrelation.
33 a The Minitab output is shown below:
The regression equation is
Delay = 70.6 + 12.7 Industry - 2.92 Quality
Predictor Coef SE Coef T p
Total 39 6764.0
Durbin-Watson statistic = 1.43
b The residual plot as a function of the order in which the data are presented is shown below:
[Residual plot; horizontal axis: Order In Which Data Are Presented]
There is no obvious pattern in the data indicative of positive autocorrelation.
c At the .05 level of significance, dL = 1.39 and dU = 1.60. Since dL ≤ d ≤ dU, the test is inconclusive.
34 The dummy variables are defined as follows:
The Minitab output is shown below:
The regression equation is
Since the p-value = .034 is less than α = .05, there are significant differences between comfort levels for the three types of browsers.
35 Let Mid-size = 1 if a mid-size car, 0 otherwise; Luxury = 1 if a luxury car, 0 otherwise; and Sports = 1 if a sports car, 0 otherwise.
The Minitab output is shown below:
The regression equation is
Resale = 32.9 - 1.70 Mid-size + 4.30 Luxury + 7.30 Sports
Predictor Coef SE Coef T P