Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Build
Business Statistics:
A Decision-Making Approach
6th Edition
Chapter 14
Multiple Regression Analysis
and Model Building
multiple regression model
in a multiple regression model
regression analysis and take the steps to correct the problems.
regression model by using dummy variables.
(continued)
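The Multiple Regression Model
Idea: Examine the linear relationship between one dependent variable (y) and two or more independent variables (\(x_i\))
Population model:
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon \]
Estimated multiple regression model:
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k \]
where \(b_0\) is the estimated intercept and \(b_1, \dots, b_k\) are the estimated slope coefficients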
Multiple Regression Model
Two variable model:
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \]
[Figure: fitted regression plane, labeled "Slope for variable \(x_1\)" and "Slope for variable \(x_2\)"]
Multiple Regression Model
Two variable model:
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \]
The best fit equation, \(\hat{y}\), is found by minimizing the sum of squared errors, \(\sum e^2\)
[Figure: a sample observation plotted against the fitted plane]
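The least-squares idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`solve_linear_system`, `fit_least_squares`, `sum_squared_errors`) and the five observations are hypothetical and are not the pie sales data; the coefficients come from solving the normal equations \((X'X)b = X'y\).

```python
# Minimal least-squares sketch for y-hat = b0 + b1*x1 + b2*x2.
# The data below are made up for illustration; they are NOT the pie sales data.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(xs, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared errors."""
    rows = [[1.0] + list(x) for x in xs]  # prepend the intercept column
    p = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def sum_squared_errors(coefs, xs, y):
    """Sum of e^2, where e = y - y-hat."""
    total = 0.0
    for x, yi in zip(xs, y):
        y_hat = coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))
        total += (yi - y_hat) ** 2
    return total


# Hypothetical observations (x1, x2) and responses y.
xs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [3.1, 4.9, 5.2, 7.1, 6.8]

coefs = fit_least_squares(xs, y)
sse = sum_squared_errors(coefs, xs, y)
print("b0, b1, b2 =", coefs)
print("sum of squared errors =", sse)
```

Any other choice of coefficients gives a sum of squared errors at least as large as `sse`, which is what "best fit" means here. In practice a statistics package would be used instead of hand-rolled elimination.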
Multiple Regression Assumptions
Errors (residuals) from the regression model:
\[ e = y - \hat{y} \]
The Correlation Matrix
selected independent variables can be found using Excel:
Tools / Data Analysis… / Correlation
with a t test
Example
evaluate factors thought to influence demand
Dependent variable: Pie sales (units per week)
Independent variables: Price (in $)
Advertising ($100’s)
Pie Sales Model
[Data table; recoverable column headers: Week, Pie Sales]
Example: if b 1 = -20, then sales (y) is expected to decrease
by an estimated 20 pies per week for each $1 increase in selling price (x 1 ), net of the effects of changes due to
advertising (x 2 )
y-intercept (b 0 )
The estimated average value of y when all x i = 0 (assuming all x i = 0 is within the range of observed values)
Pie Sales Correlation Matrix
Estimating a Multiple Linear Regression Equation
generate the coefficients and measures of goodness of fit for multiple regression
Multiple Regression Output
\[ \widehat{\text{Sales}} = 306.526 - 24.975(\text{Price}) + 74.131(\text{Advertising}) \]
The Multiple Regression Equation
\[ \widehat{\text{Sales}} = 306.526 - 24.975(\text{Price}) + 74.131(\text{Advertising}) \]
b 1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of
changes due to advertising
b 2 = 74.131: sales will increase, on average,
by 74.131 pies per week for each $100 increase in
advertising, net of the effects of changes due to price
where
Sales is in number of pies per week
Price is in $
Advertising is in $100’s.
Using The Model to Make Predictions
Predict sales for a week in which the selling price is $5.50 and advertising is $350 (Advertising is in $100's, so $350 means \(x_2 = 3.5\)):
\[ \widehat{\text{Sales}} = 306.526 - 24.975(\text{Price}) + 74.131(\text{Advertising}) \]
\[ = 306.526 - 24.975(5.50) + 74.131(3.5) = 428.62 \]
Predicted sales is 428.62 pies
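The prediction above can be checked with a short script. The function name `predict_pie_sales` is a hypothetical helper; the coefficients are the ones quoted in the slides, and the only added step is converting advertising dollars into the $100 units the model expects.

```python
def predict_pie_sales(price, advertising_dollars):
    """Predicted weekly pie sales from the estimated equation in the slides."""
    advertising_hundreds = advertising_dollars / 100.0  # model uses $100's
    return 306.526 - 24.975 * price + 74.131 * advertising_hundreds


predicted = predict_pie_sales(price=5.50, advertising_dollars=350)
print(round(predicted, 2))  # 428.62
```

Forgetting the unit conversion (passing 350 instead of 3.5) is the most likely mistake when applying this equation by hand.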
Predictions in PHStat
PHStat | regression | multiple regression …
Check the “confidence and prediction interval estimates” box
Prediction interval for an individual y value, given these x’s
Multiple Coefficient of Determination
Reports the proportion of total variation in y explained by all x variables taken together
\[ R^2 = \frac{SSR}{SST} = \frac{\text{Sum of squares regression}}{\text{Total sum of squares}} \]
Multiple Coefficient of Determination (continued)
\[ R^2 = \frac{SSR}{SST} = \frac{29460.0}{SST} \approx 0.521 \]
52.1% of the variation in pie sales is explained by the variation in price and advertising
added to the model
models
What is the net effect of adding a new variable?
variable is added
explanatory power to offset the loss of one degree of freedom?
all x variables adjusted for the number of x variables used
\[ R^2_{\text{adj}} = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right) \]
(where n = sample size, k = number of independent variables)
Penalize excessive use of unimportant independent variables
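As a rough numerical check, both \(R^2\) and adjusted \(R^2\) can be computed from figures quoted elsewhere in these slides (SSR = 29460.0, MSE = 2252.8, and 2 and 12 degrees of freedom). The sketch assumes n = 15, which follows from k = 2 and n - k - 1 = 12, and it reconstructs SSE and SST from those rounded values, so the last digits may differ slightly from software output.

```python
# Values quoted in the slides for the pie sales example.
ssr = 29460.0   # sum of squares regression
mse = 2252.8    # mean square error
k = 2           # number of independent variables (price, advertising)
n = 15          # assumption: inferred from 12 error degrees of freedom

error_df = n - k - 1          # 12
sse = mse * error_df          # reconstructed sum of squared errors
sst = ssr + sse               # total sum of squares

r_squared = ssr / sst
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(round(r_squared, 3))           # 0.521
print(round(adjusted_r_squared, 3))  # 0.442
```

The adjusted value is lower than \(R^2\) because the factor (n - 1)/(n - k - 1) is greater than 1 whenever k is at least 1, which is the penalty for adding variables.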
Multiple Coefficient of Determination (continued)
Is the Model Significant?
F-Test for Overall Significance of the Model
of the x variables considered together and y
H 0 : β 1 = β 2 = … = β k = 0 (no linear relationship)
H A : at least one β i ≠ 0 (at least one independent
variable affects y)
F-Test for Overall Significance
\[ F = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{MSR}{MSE} \]
\[ F = \frac{MSR}{MSE} = \frac{14730.0}{2252.8} = 6.5386 \]
With 2 and 12 degrees of freedom
The regression model does explain
a significant portion of the variation in pie sales
(There is evidence that at least one independent variable affects y)
\[ F = \frac{MSR}{MSE} = 6.5386 \]
Critical Value:
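The overall F-test can be reproduced from the quoted MSR and MSE. The sketch below assumes a significance level of 0.05 (the level used for the p-values later in the slides). It avoids a statistics library by using a closed form that holds only when the numerator degrees of freedom equal 2: the upper-tail probability is \((1 + 2F/d_2)^{-d_2/2}\). For other numerator degrees of freedom an F table or a statistics package would be needed.

```python
msr = 14730.0   # mean square regression (from the slides)
mse = 2252.8    # mean square error (from the slides)
df1, df2 = 2, 12
alpha = 0.05    # assumption: conventional significance level

f_stat = msr / mse

# Closed form for the F distribution's upper tail, valid only for df1 == 2.
p_value = (1 + df1 * f_stat / df2) ** (-df2 / 2)

# Inverting the same closed form gives the critical value for df1 == 2.
f_critical = (df2 / df1) * (alpha ** (-2 / df2) - 1)

reject_h0 = f_stat > f_critical
print(round(f_stat, 2), round(f_critical, 3), round(p_value, 3), reject_h0)
```

Because the F statistic (about 6.54) exceeds the critical value (about 3.885), the null hypothesis that all slopes are zero is rejected, which agrees with the conclusion stated above.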
Are Individual Variables Significant?
Use t-tests of individual variable slopes
variable x i and y
H 0 : β i = 0 (no linear relationship)
H A : β i ≠ 0 (linear relationship does exist between x i and y)
The test statistic for each variable falls
in the rejection region (p-values < .05)
There is evidence that both Price and Advertising affect pie sales
From Excel output:
• •Coefficients •Standard Error •t Stat •P-value
Confidence Interval Estimate for the Slope
\[ b_i \pm t_{\alpha/2}\, s_{b_i} \]
(the effect of changes in price on pie sales):
Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each increase of $1 in the selling price
Standard Deviation of the Regression Model
The estimate of the standard deviation of the regression model is:
\[ s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{MSE} \]
mean size of y for comparison
The standard deviation of the regression model is 47.46
week is
per week range, so this range is probably too large to be acceptable. The analyst may want to look for additional variables that can explain more of the variation in weekly sales.
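The 47.46 figure follows directly from the MSE quoted earlier (2252.8), since the standard deviation of the regression model is the square root of MSE. The two-standard-deviation half-width in the sketch is a common rough yardstick and is an assumption here, not a value taken from the slides.

```python
import math

mse = 2252.8                       # mean square error quoted in the slides
s_epsilon = math.sqrt(mse)         # standard deviation of the regression model
rough_half_width = 2 * s_epsilon   # assumption: rough +/- 2 standard deviations

print(round(s_epsilon, 2))  # 47.46
print(round(rough_half_width, 2))
```

Whether a spread of this size is acceptable depends on the typical level of weekly sales, which is why the value has to be compared with the mean of y.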
Multicollinearity
Multicollinearity: High correlation exists
between two independent variables
redundant information to the multiple regression model
Multicollinearity
variables can adversely affect the regression results
standard error and low t-values)
expectations
(continued)
Some Indications of Severe Multicollinearity
coefficient when a new variable is added to the model
insignificant when a new independent variable
is added
model increases when a variable is added to the model
Detect Collinearity (Variance Inflationary Factor)
\[ VIF_j = \frac{1}{1 - R_j^2} \]
\(R_j^2\) is the coefficient of determination when the j th independent variable is regressed against the remaining k – 1 independent variables
the other explanatory variables
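For the special case of exactly two independent variables, \(R_j^2\) is simply the squared correlation between them, so the variance inflationary factor \(1/(1 - R_j^2)\) can be sketched without any regression software. The function names and the data below are hypothetical, and the threshold of 5 in the comment is a commonly used rule of thumb, stated here as an assumption.

```python
import math


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R_j^2); with two predictors R_j^2 equals r^2."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values, not the pie sales data.
price = [5.0, 6.0, 7.0, 8.0]
advertising = [3.0, 3.0, 4.0, 2.0]

vif = vif_two_predictors(price, advertising)
print(round(vif, 3))  # values well below 5 suggest little collinearity
```

With more than two predictors each \(R_j^2\) requires its own auxiliary regression, which is why packages report one VIF per variable.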
Detect Collinearity in PHStat
Output for the pie sales example:
Since there are only two explanatory variables, only one VIF
PHStat / regression / multiple regression …
Check the “variance inflationary factor (VIF)” box
Qualitative (Dummy) Variables
variable) with two or more levels:
yes or no, on or off, male or female
coded as 0 or 1
is significant
(number of levels - 1)
Dummy-Variable Model Example (with 2 Levels)
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \]
Let:
y = pie sales
\(x_1\) = price
\(x_2\) = holiday (\(x_2 = 1\) if a holiday occurred during the week)
Dummy-Variable Model Example (with 2 Levels)
\[ \hat{y} = b_0 + b_1 x_1 + b_2 (1) = (b_0 + b_2) + b_1 x_1 \quad \text{(Holiday)} \]
\[ \hat{y} = b_0 + b_1 x_1 + b_2 (0) = b_0 + b_1 x_1 \quad \text{(No Holiday)} \]
Different intercept
Same slope
on pie sales
\[ \widehat{\text{Sales}} = 300 - 30(\text{Price}) + 15(\text{Holiday}) \]
Sales: number of pies sold per week
Price: pie price in $
On average, sales were 15 pies greater in weeks with a holiday than in weeks without a holiday, given the same price
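A small sketch makes the holiday effect concrete. It assumes the estimated equation reads Sales = 300 - 30(Price) + 15(Holiday); the signs are an assumption, since the operators are garbled in the source, and the price of $4.00 is a made-up input. The function name is hypothetical.

```python
def predicted_sales(price, holiday):
    """Predicted pies per week; holiday is 1 for a holiday week, else 0.

    Assumes Sales = 300 - 30(Price) + 15(Holiday); signs are an assumption.
    """
    return 300 - 30 * price + 15 * holiday


price = 4.0  # hypothetical price in dollars
with_holiday = predicted_sales(price, 1)
without_holiday = predicted_sales(price, 0)

print(with_holiday, without_holiday, with_holiday - without_holiday)
```

At any fixed price the two predictions differ by exactly the dummy coefficient, which is the "different intercept, same slope" picture.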
Dummy-Variable Models (more than 2 Levels)
the number of levels
Three levels, so two dummy
variables are needed
Dummy-Variable Models (more than 2 Levels) (continued)
Let the default category be “condo”
\[ x_2 = \begin{cases} 1 & \text{if ranch} \\ 0 & \text{if not} \end{cases} \qquad x_3 = \begin{cases} 1 & \text{if split level} \\ 0 & \text{if not} \end{cases} \]
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 \]
Interpreting the Dummy Variable Coefficients (with 3 Levels)
Suppose the estimated equation is
\[ \hat{y} = 20.43 + 0.045 x_1 + 23.53 x_2 + 18.84 x_3 \]
For a condo (\(x_2 = x_3 = 0\)):
\[ \hat{y} = 20.43 + 0.045 x_1 \]
For a ranch (\(x_2 = 1, x_3 = 0\)):
\[ \hat{y} = 20.43 + 0.045 x_1 + 23.53 \]
For a split level (\(x_2 = 0, x_3 = 1\)):
\[ \hat{y} = 20.43 + 0.045 x_1 + 18.84 \]
With the same square feet, a ranch will have an estimated average price of 23.53 thousand dollars more than a condo.
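The three-level coding can be sketched as follows, with "condo" as the default category so that only two dummy variables are needed. The function names and the 2,000 square feet input are hypothetical; the coefficients are the ones in the example, with price in thousands of dollars, and the assignment of 23.53 to ranch and 18.84 to split level follows the interpretation given for the ranch.

```python
def encode_style(style):
    """Return (x2, x3): x2 = 1 for a ranch, x3 = 1 for a split level.

    A condo is the default category and maps to (0, 0).
    """
    return (1 if style == "ranch" else 0, 1 if style == "split level" else 0)


def predicted_price(square_feet, style):
    """Estimated average price in thousands of dollars."""
    x2, x3 = encode_style(style)
    return 20.43 + 0.045 * square_feet + 23.53 * x2 + 18.84 * x3


size = 2000  # hypothetical square footage
for style in ("condo", "ranch", "split level"):
    print(style, round(predicted_price(size, style), 2))
```

Using a third dummy for condos as well would make the columns perfectly collinear with the intercept, which is why the number of dummies is one fewer than the number of levels.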
The relationship between the dependent variable and an independent variable may not be linear
\[ y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \varepsilon \]
Polynomial Regression Model
General form:
\[ y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \dots + \beta_p x_j^p + \varepsilon \]
where:
β0 = Population regression constant
βi = Population regression coefficient for variable xj : j = 1, 2, …k
p = Order of the polynomial
ε = Model error
If p = 2 the model is a quadratic model:
\[ y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \varepsilon \]
Linear vs Nonlinear Fit
Linear fit does not give random residuals
Nonlinear fit gives random residuals
Quadratic Regression Model
\[ y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \varepsilon \]
Quadratic models may be considered when scatter diagram takes on the following shapes:
[Figure: scatter diagram shapes, y plotted against \(x_1\)]
β1 = the coefficient of the linear term
β2 = the coefficient of the squared term
Testing for Significance: Quadratic Model
F test statistic:
\[ F = \frac{MSR}{MSE} \]
Compare the quadratic model
\[ y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \varepsilon \]
with the linear model
\[ y = \beta_0 + \beta_1 x_j + \varepsilon \]
H0: β2 = 0 (No 2nd order polynomial term)
HA: β2 ≠ 0 (2nd order polynomial term is needed)
Higher Order Models
If p = 3 the model is a cubic form:
\[ y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \beta_3 x_j^3 + \varepsilon \]
[Figure: cubic curve of y against x]
Interaction Effects
variables
levels of another x variable
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1^2 x_2 \]
Interaction Regression Model
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon \]
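A short worked step shows why a cross-product term captures interaction. It assumes the interaction model has the form \(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon\) and takes the rate of change of y with respect to \(x_1\) while holding \(x_2\) fixed.

```latex
\[
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon
\quad\Longrightarrow\quad
\frac{\partial y}{\partial x_1} = \beta_1 + \beta_3 x_2
\]
\[
\text{Without the interaction term } (\beta_3 = 0): \quad
\frac{\partial y}{\partial x_1} = \beta_1
\]
```

So the change in y per unit of \(x_1\) is not constant: it shifts by \(\beta_3\) for each additional unit of \(x_2\), which is the sense in which the response to one x variable varies at different levels of another x variable.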
Stepwise regression procedure
are added
Best-subset approach
Stepwise Regression
forward selection, backward elimination, or through standard stepwise regression
The coefficient of partial determination is the measure of the marginal contribution of each independent variable, given that other independent variables are in the model
Best Subsets Regression
variables
adjusted R 2 and lowest standard error s ε
Stepwise regression and best subsets regression can be performed using PHStat, Minitab, or other statistical software packages
Aptness of the Model
verifying the assumptions of multiple regression:
Each x i is linearly related to y
Errors (or Residuals) are given by
\[ e = y - \hat{y} \]
The Normality Assumption
computer
of the standardized residuals to check for normality
Chapter Summary
regression model
model
assumptions
(continued)