When you have completed this chapter, you will be able to: Understand the importance of an appropriate model specification and multiple regression analysis, comprehend the nature and technique of multiple regression models and the concept of partial regression coefficients, use the estimation techniques for multiple regression models,...
When you have completed this chapter, you will be able to:
Understand the importance of an appropriate model specification and multiple regression analysis
Comprehend the nature and technique of multiple regression models and the concept of partial regression coefficients
through a joint test of hypothesis (F test) on the coefficients of all variables
Draw inferences about the importance of the independent variables through tests of hypothesis (t tests)
5. Explain the goodness of fit of an estimated model.
Identify the problems raised, and the remedies
thereof, by the presence of outliers/influential observations in the data sets
9.
Comprehend the concept of partial correlations and
10. Identify the violation of model assumptions, including linearity, homoscedasticity, autocorrelation, and normality through simple diagnostic procedures.
14. Draw inferences about the importance of a subset of
the importance in multiple regression analysis
The general multiple regression with k independent variables is given by:
$\hat{y} = a + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
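The intercept a and the coefficients b1 ... bk are normally estimated by least squares. The sketch below is one way to do that in plain Python by solving the normal equations; it is an illustration, not what MINITAB or Excel does internally. The function name and the small data set are invented for the example.

```python
def fit_multiple_regression(rows, y):
    """Return least-squares estimates [a, b1, ..., bk] via the normal equations."""
    # Design matrix: a leading 1 for the intercept a, then the k predictors.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Invented data generated exactly from y = 2 + 3*x1 - 1*x2 (k = 2).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [3, 7, 7, 11, 12]
coef = fit_multiple_regression(rows, y)
print([round(c, 6) for c in coef])
```

Because the invented data follow the equation exactly, the fit should recover a = 2, b1 = 3 and b2 = -1 up to rounding.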
Multiple Standard Error of Estimate
… is a measure of the effectiveness of the regression equation
… is measured in the same units as the dependent variable
… it is difficult to determine what is a large value and what is a small value of the standard error!
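As a rough illustration, the multiple standard error of estimate can be computed as the square root of the sum of squared residuals divided by n - (k + 1). The sketch below assumes that standard formula; the function name and the numbers are invented.

```python
import math


def standard_error_of_estimate(actual, predicted, k):
    """sqrt(SSE / (n - (k + 1))) for a model with k independent variables."""
    n = len(actual)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sse / (n - (k + 1)))


# Invented observations and fitted values for a model with k = 2 predictors.
actual = [10, 12, 15, 19, 20, 25]
predicted = [11, 12, 14, 18, 21, 25]
s = standard_error_of_estimate(actual, predicted, k=2)
print(round(s, 4))  # SSE = 4, n - (k + 1) = 3, so s = sqrt(4/3)
```

The result is in the same units as the dependent variable, which is why it can only be judged relative to the scale of the data.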
Multiple Regression and Correlation: Assumptions
… the independent variables and the dependent variable have a linear relationship
… the dependent variable must be continuous and
uncorrelated
Correlation Matrix
A correlation matrix is used to show all possible simple correlation coefficients among the variables.
… it shows how strongly each independent variable is correlated with the dependent variable.
… the matrix is useful for locating correlated independent variables.
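A correlation matrix can be sketched in a few lines of plain Python using the Pearson correlation coefficient. The column names echo the food-expenditure example, but the values are invented for illustration and are not the textbook data.

```python
import math


def pearson(x, y):
    """Simple (Pearson) correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict holding the correlation of every pair of columns."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Invented sample values.
data = {
    "Food": [3900, 5300, 4300, 4900, 6400],
    "Income": [376, 515, 516, 468, 538],
    "Size": [4, 5, 4, 5, 6],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name.ljust(7), "  ".join(f"{v:6.3f}" for v in row.values()))
```

The first row (or column) shows how strongly each independent variable is related to Food; the remaining off-diagonal entries are the ones to scan for correlated independent variables.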
The global test is used to investigate whether any of the independent variables have significant coefficients.
The hypotheses are:
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$H_1$: Not all $\beta$s equal 0
Global Test … continued
The test statistic follows the F distribution with k (number of independent variables) and n - (k + 1) degrees of freedom, where n is the sample size.
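A minimal sketch of the global test computation, assuming the usual ANOVA form F = (SSR / k) / (SSE / (n - (k + 1))). The sums of squares and the sample size below are invented. The critical value 4.07 is the one quoted for the Super Dollar example, and it matches a 0.05 level with 3 and 8 degrees of freedom, so n = 12 is an assumption made here for illustration.

```python
def global_f_statistic(ssr, sse, n, k):
    """F = mean square regression / mean square error."""
    msr = ssr / k                # k numerator degrees of freedom
    mse = sse / (n - (k + 1))    # n - (k + 1) denominator degrees of freedom
    return msr / mse


def reject_h0(f_value, critical_value):
    """Decision rule: reject H0 when the computed F exceeds the critical value."""
    return f_value > critical_value


# Invented sums of squares; n = 12 and k = 3 are assumptions.
f_value = global_f_statistic(ssr=3000.0, sse=800.0, n=12, k=3)
print(f_value, reject_h0(f_value, critical_value=4.07))
```

With these made-up numbers the mean squares are 1000 and 100, so F is 10 and H0 would be rejected.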
A market researcher for Super Dollar Super Markets is studying the yearly amount families of four or more spend on food.
Three independent variables are thought to be related to yearly food expenditures (Food).
Those variables are:
… total family income (Income) in $00,
… size of family (Size), and
… whether the family has children in college (College)
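The third variable is qualitative, so it has to be turned into a number before it can enter a regression. The sketch below assumes the usual coding of 1 for "yes" and 0 for "no"; the family records and field names are invented.

```python
def encode_college(has_child_in_college):
    """Indicator coding: 1 if the family has a child in college, else 0."""
    return 1 if has_child_in_college else 0


# Invented records; income is stored in hundreds of dollars, as in the example.
families = [
    {"income_hundreds": 500, "size": 4, "college": False},
    {"income_hundreds": 620, "size": 5, "college": True},
    {"income_hundreds": 410, "size": 6, "college": False},
]

# One (Income, Size, College) row per family, ready for a regression routine.
rows = [
    (f["income_hundreds"], f["size"], encode_college(f["college"]))
    for f in families
]
print(rows)
```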
… the part is acceptable or unacceptable
… the voter will or will not vote for the incumbent
Note the following regarding the regression equation:
… the variable College is called a dummy or indicator variable.
(It can take only one of two possible outcomes, i.e. a child is a
Use a computer software package, such as MINITAB or Excel, to develop a correlation matrix.
From the analysis provided by MINITAB, write out the regression equation.
What food expenditure would you estimate for a family of 4, with no college students, and an income of $50,000 (which is input as 500)?
The regression equation is
Food = 954 + 1.09 Income + 748 Size + 565 Student
Predictor Coef SE Coef T P
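Plugging the values from the question (a family of 4, no college student, income entered as 500) into the fitted equation gives the estimate. The sketch below simply evaluates the equation; the function and parameter names are made up for the example.

```python
def predict_food(income_hundreds, size, student):
    """Evaluate Food = 954 + 1.09 Income + 748 Size + 565 Student."""
    return 954 + 1.09 * income_hundreds + 748 * size + 565 * student


# Family of 4, no college student, income of $50,000 entered as 500.
estimate = predict_food(income_hundreds=500, size=4, student=0)
print(round(estimate))  # 954 + 545 + 2992 + 0 = 4491
```

The estimated yearly food expenditure is therefore about $4,491.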
From the regression output we note:
The coefficient of determination is 80.4 percent.
This means that more than 80 percent of the variation in the amount spent on food is accounted for by the variables income, family size, and student.
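The coefficient of determination can be sketched as 1 - SSE / SS total, which for a least-squares fit with an intercept equals the share of variation explained. The numbers below are invented and are not the Super Dollar data; the predicted values are the two group means, as a fit on a single dummy variable would produce.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` accounted for by the fitted values."""
    mean_y = sum(actual) / len(actual)
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return 1 - sse / ss_total


# Invented values: SS total = 20 and SSE = 4.
actual = [2, 4, 6, 8]
predicted = [3, 3, 7, 7]
r2 = r_squared(actual, predicted)
print(f"{r2:.1%}")
```

Here 80 percent of the variation is accounted for, which reads the same way as the 80.4 percent reported in the output.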
An additional family member will increase the amount spent per year on food by $748.
A family with a college student will spend $565 more per year on food than those without a college student.
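These interpretations hold when the other variables are kept fixed. A small sketch, using the coefficients from the fitted equation Food = 954 + 1.09 Income + 748 Size + 565 Student, shows that changing one input at a time shifts the prediction by exactly that variable's coefficient. The baseline family and the helper names are invented.

```python
INTERCEPT = 954
COEF = {"Income": 1.09, "Size": 748, "Student": 565}


def predict(values):
    """Evaluate the fitted equation for a dict of predictor values."""
    return INTERCEPT + sum(COEF[name] * values[name] for name in COEF)


# Hypothetical baseline family: income entered as 500, 4 members, no student.
base = {"Income": 500, "Size": 4, "Student": 0}
one_more_member = dict(base, Size=5)   # only family size changes
with_student = dict(base, Student=1)   # only the dummy changes

size_effect = predict(one_more_member) - predict(base)
student_effect = predict(with_student) - predict(base)
print(round(size_effect), round(student_effect))
```

The two differences come out at about 748 and 565, whatever baseline income is chosen.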
None of the correlations among the independent variables should cause problems.
a family of 4 with a $500 (that is
if any of the regression coefficients are not zero
H0 is rejected if F > 4.07
… from the MINITAB output, the computed value
(This is the hypothesis for the independent variable
family size)
Food versus Size
The regression equation is
Food = 340 + 1031 Size
Predictor   Coef     SE Coef
Constant    339.7    940.7
Size        1031.0   179.4
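The t ratio that MINITAB reports for each predictor is the coefficient divided by its standard error, which tests the hypothesis that the coefficient is zero. The sketch below recomputes those ratios from the Coef and SE Coef values in the Food versus Size output; the dictionary layout is just a convenience for the example.

```python
# Coef and SE Coef taken from the Food versus Size output.
output = {
    "Constant": {"coef": 339.7, "se": 940.7},
    "Size": {"coef": 1031.0, "se": 179.4},
}

# t = Coef / SE Coef for each predictor.
t_ratios = {name: row["coef"] / row["se"] for name, row in output.items()}
for name, t in t_ratios.items():
    print(f"{name:<9} t = {t:.2f}")
```

The ratio for Size is roughly 5.75, while the one for the constant is well below 1. Each would then be compared with a critical t value for the appropriate degrees of freedom.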
Analysis of Residuals
Residuals should be approximately normally distributed
… histograms and stem-and-leaf charts are useful in
[Figures: residual analysis plots, not recovered from the slides; the surviving axis ticks suggest a histogram of residuals]
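A quick way to eyeball whether residuals look roughly normal is to bin them and print a text histogram. The sketch below uses invented actual and predicted values and a bin width of 200, chosen only for illustration.

```python
def residuals(actual, predicted):
    """Residual = actual value minus predicted value."""
    return [y - y_hat for y, y_hat in zip(actual, predicted)]


def bin_counts(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    counts = {}
    for v in values:
        lower = (v // bin_width) * bin_width  # floor division handles negatives
        counts[lower] = counts.get(lower, 0) + 1
    return counts


# Invented food expenditures and fitted values.
actual = [4100, 3900, 5200, 4800, 6100, 5600]
predicted = [4000, 4150, 5050, 4900, 5900, 5650]
res = residuals(actual, predicted)
counts = bin_counts(res, bin_width=200)
for lower in sorted(counts):
    print(f"{lower:>6} to {lower + 200:>6}: " + "*" * counts[lower])
```

A roughly symmetric pile centred near zero is what the normality assumption leads one to expect. With only a handful of observations the picture is rough at best.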
… and much more!
This completes Chapter 14