Statistics for Business and Economics, 7th Edition, by Paul Newbold: Chapter 11


Trang 1

Statistics for Business and Economics

7th Edition

Chapter 11

Simple Regression

Trang 2

Chapter Goals

After completing this chapter, you should be able to:

• Obtain and interpret the simple linear regression equation for a set of data

Trang 4

Overview of Linear Models

• An equation can be fit to show the best linear relationship between two variables:

\[ Y = \beta_0 + \beta_1 X \]

where Y is the dependent variable, X is the independent variable, \(\beta_0\) is the Y-intercept, and \(\beta_1\) is the slope

(Section 11.1)

Trang 5

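Least Squares Regression

• Estimates for coefficients \(\beta_0\) and \(\beta_1\) are found using a least squares regression technique

• The least squares regression line, based on sample data, is

\[ \hat{y} = b_0 + b_1 x \]

• Here \(b_1\) is the slope of the line and \(b_0\) is the y-intercept:

\[ b_1 = \frac{\mathrm{Cov}(x, y)}{s_x^2}, \qquad b_0 = \bar{y} - b_1 \bar{x} \]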
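The covariance form of the least squares estimates can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure: the five data points and the function name `least_squares` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]


def least_squares(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance of x and y, and sample variance of x (both divide by n - 1).
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    b1 = cov_xy / var_x          # slope: Cov(x, y) / s_x^2
    b0 = y_bar - b1 * x_bar      # intercept: y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(x, y)
print(f"y_hat = {b0:.3f} + {b1:.3f} x")
```

Because both the covariance and the variance divide by n - 1, the factor cancels, so the slope depends only on the sums of cross-products and squares.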

Trang 6

Introduction to Regression Analysis

• Regression analysis is used to:
  • Predict the value of a dependent variable based on the value of at least one independent variable
  • Explain the impact of changes in an independent variable on the dependent variable

Dependent variable: the variable we wish to explain (also called the endogenous variable)

Independent variable: the variable used to explain the dependent variable (also called the exogenous variable)

Trang 7

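Linear Regression Model

• The relationship between X and Y is described by a linear function

• Changes in Y are assumed to be caused by changes in X

• Linear regression population equation model:

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]

where \(\beta_0\) and \(\beta_1\) are the population model coefficients and \(\varepsilon\) is a random error term

(Section 11.2)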

Trang 8

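Simple Linear Regression Model

The population regression model:

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]

where \(Y_i\) is the dependent variable, \(\beta_0\) is the population Y-intercept, \(\beta_1\) is the population slope coefficient, \(X_i\) is the independent variable, and \(\varepsilon_i\) is the random error term. The term \(\beta_0 + \beta_1 X_i\) is the linear component and \(\varepsilon_i\) is the random error component.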

Trang 9

Simple Linear Regression Model (continued)

[Figure: population regression line \(Y = \beta_0 + \beta_1 X\), showing the random error \(\varepsilon_i\) for a given \(X_i\) value]

Trang 10

Simple Linear Regression Equation

The simple linear regression equation provides an estimate of the population regression line:

\[ \hat{y}_i = b_0 + b_1 x_i \]

where \(\hat{y}_i\) is the estimated (or predicted) y value for observation i, \(b_0\) is the estimate of the regression intercept, \(b_1\) is the estimate of the regression slope, and \(x_i\) is the value of x for observation i

The individual random error terms \(e_i\) have a mean of zero:

\[ e_i = (y_i - \hat{y}_i) = y_i - (b_0 + b_1 x_i) \]

Trang 11

Least Squares Estimators

• \(b_0\) and \(b_1\) are obtained by finding the values of \(b_0\) and \(b_1\) that minimize the sum of the squared differences between \(y\) and \(\hat{y}\):

\[ \min SSE = \min \sum e_i^2 = \min \sum (y_i - \hat{y}_i)^2 = \min \sum [y_i - (b_0 + b_1 x_i)]^2 \]
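To see the minimization at work, the following sketch evaluates SSE at the closed-form least squares estimates and at a grid of nearby candidate lines. The data, the step size of 0.1, and the helper name `sse` are hypothetical choices made for illustration only.

```python
# Hypothetical illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]


def sse(b0, b1):
    """Sum of squared errors for the candidate line y_hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Closed-form least squares estimates.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
best = sse(b0, b1)

# Perturb the intercept and slope; each perturbed line should fit worse.
steps = (-0.1, 0.0, 0.1)
neighbours = [
    sse(b0 + d0, b1 + d1)
    for d0 in steps
    for d1 in steps
    if (d0, d1) != (0.0, 0.0)
]

print(f"SSE at the least squares estimates: {best:.4f}")
print(f"smallest SSE among perturbed lines: {min(neighbours):.4f}")
```

SSE is a convex quadratic in \(b_0\) and \(b_1\), which is why any move away from the least squares estimates increases it.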

Trang 12

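Least Squares Estimators (continued)

• The slope coefficient estimator is

\[ b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\mathrm{Cov}(x, y)}{s_x^2} = r_{xy} \frac{s_y}{s_x} \]

• The constant, or y-intercept, is

\[ b_0 = \bar{y} - b_1 \bar{x} \]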

Trang 13

Finding the Least Squares Equation

• The coefficients \(b_0\) and \(b_1\), and other regression results in this chapter, will be found using a computer
  • Hand calculations are tedious
  • Statistical routines are built into Excel
  • Other statistical analysis software can be used

Trang 14

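Linear Regression Model Assumptions

• The true relationship form is linear (Y is a linear function of X, plus random error)

• The error terms are random variables with mean 0 and constant variance, \(\sigma^2\) (the constant variance property is called homoscedasticity):

\[ E[\varepsilon_i] = 0 \quad \text{and} \quad E[\varepsilon_i^2] = \sigma^2 \quad \text{for } (i = 1, \ldots, n) \]

• The random error terms, \(\varepsilon_i\), are not correlated with one another, so that

\[ E[\varepsilon_i \varepsilon_j] = 0 \quad \text{for all } i \neq j \]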

Trang 15

Interpretation of the Slope and the Intercept

• \(b_0\) is the estimated average value of y when the value of x is zero (if x = 0 is in the range of observed x values)

• \(b_1\) is the estimated change in the average value of y as a result of a one-unit change in x

Trang 16

Simple Linear Regression Example

• A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)

• A random sample of 10 houses is selected
  • Dependent variable (Y) = house price in $1000s
  • Independent variable (X) = square feet

Trang 17

Sample Data for House Price Model

Trang 18

Graphical Presentation

• House price model: scatter plot

[Figure: scatter plot of house price ($1000s) against square feet]

Trang 19

Regression Using Excel

• Excel will be used to generate the coefficients and measures of goodness of fit for regression
  • Data / Data Analysis / Regression

Trang 20

Regression Using Excel (continued)

• Data / Data Analysis / Regression

Provide desired input:

[Screenshot: Excel regression input dialog]

Trang 21

Excel Output

Trang 22

Excel Output (continued)

The regression equation is:

house price = 98.24833 + 0.10977 (square feet)

Trang 23

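[Figure: house price model, scatter plot of house price ($1000s) against square feet with the fitted regression line]

house price = 98.24833 + 0.10977 (square feet)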

Trang 24

house price = 98.24833 + 0.10977 (square feet)

Trang 25

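Interpretation of the Slope Coefficient, \(b_1\)

house price = 98.24833 + 0.10977 (square feet)

• \(b_1\) measures the estimated change in the average value of Y as a result of a one-unit change in X
  • Here, \(b_1 = 0.10977\) means that the average house price increases by 0.10977($1000) = $109.77 for each additional square foot of size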

Trang 26

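Measures of Variation

• Total variation is made up of two parts:

\[ SST = SSR + SSE \]

where SST is the total sum of squares, SSR is the regression sum of squares, and SSE is the error sum of squares:

\[ SST = \sum (y_i - \bar{y})^2, \qquad SSR = \sum (\hat{y}_i - \bar{y})^2, \qquad SSE = \sum (y_i - \hat{y}_i)^2 \]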

Trang 27

Measures of Variation (continued)

• SST = total sum of squares
  • Measures the variation of the \(y_i\) values around their mean, \(\bar{y}\)

• SSR = regression sum of squares
  • Explained variation attributable to the linear relationship between x and y

• SSE = error sum of squares
  • Variation attributable to factors other than the linear relationship between x and y
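The decomposition of total variation can be checked numerically. The sketch below uses a small hypothetical data set (not the house price sample) to compute the three sums of squares and confirm that SST equals SSR plus SSE for a least squares fit.

```python
# Hypothetical illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least squares fit.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three measures of variation.
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained by the regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # unexplained (error)

print(f"SST = {sst:.4f}")
print(f"SSR = {ssr:.4f}")
print(f"SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

The identity holds only for the least squares line with an intercept; for an arbitrary line the cross-product term does not vanish.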

Trang 29

Coefficient of Determination, \(R^2\)

• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable

• The coefficient of determination is also called R-squared and is denoted as \(R^2\)

\[ R^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \]

note: \(0 \le R^2 \le 1\)

Trang 30

Examples of Approximate \(r^2\) Values

[Figure: scatter plots of Y against X]

Trang 31

Examples of Approximate \(r^2\) Values

[Figure: scatter plots of Y against X]

Trang 32

Examples of Approximate \(r^2\) Values

\(r^2 = 0\)

No linear relationship between X and Y: the value of Y does not depend on X (none of the variation in Y is explained by variation in X)

[Figure: scatter plot of Y against X with \(r^2 = 0\)]

Trang 33

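\[ R^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082 \]

That is, 58.08% of the variation in house prices is explained by variation in square feet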
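The same ratio can be reproduced directly from the two sums of squares reported for the house price model. The variable names below are illustrative; the numbers are the SSR and SST values from the slide.

```python
# Sums of squares reported for the house price model.
sst = 32600.5000   # total sum of squares
ssr = 18934.9348   # regression sum of squares

sse = sst - ssr            # error sum of squares, from SST = SSR + SSE
r_squared = ssr / sst      # coefficient of determination

print(f"SSE = {sse:.4f}")
print(f"R^2 = {r_squared:.5f}")
```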

Trang 34

Correlation and \(R^2\)

• The coefficient of determination, \(R^2\), for a simple regression is equal to the simple correlation squared:

\[ R^2 = r_{xy}^2 \]

Trang 35

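An estimator for the variance of the population model error is

\[ \hat{\sigma}^2 = s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2} = \frac{SSE}{n - 2} \]

and the standard error of the estimate is

\[ s_e = \sqrt{s_e^2} \]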
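For the house price model, the standard error of the estimate can be recovered from the reported sums of squares and the sample size of 10. This is a sketch with illustrative variable names; SSE is obtained as SST minus SSR.

```python
import math

# Values reported for the house price model.
sst = 32600.5000   # total sum of squares
ssr = 18934.9348   # regression sum of squares
n = 10             # number of houses in the sample

sse = sst - ssr
s_e_squared = sse / (n - 2)      # two parameters (b0, b1) are estimated
s_e = math.sqrt(s_e_squared)     # standard error of the estimate

print(f"s_e^2 = {s_e_squared:.4f}")
print(f"s_e   = {s_e:.2f} (in $1000s)")
```

The result agrees with the value of about $41.33K quoted for this model.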

Trang 37

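Comparing Standard Errors

\(s_e\) is a measure of the variation of observed y values from the regression line

[Figure: two scatter plots of Y against X, comparing values of \(s_e\)]

The magnitude of \(s_e\) should always be judged relative to the size of the y values in the sample data; e.g., \(s_e\) = $41.33K is moderately small relative to the house prices in the sample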

Trang 38

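Inferences About the Regression Model

• The variance of the regression slope coefficient (\(b_1\)) is estimated by

\[ s_{b_1}^2 = \frac{s_e^2}{\sum (x_i - \bar{x})^2} = \frac{s_e^2}{(n - 1) s_x^2} \]

where \(s_{b_1}\) is the estimated standard error of the slope and \(s_e\) is the standard error of the estimate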
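The two forms of the slope variance are algebraically identical, which the sketch below confirms on a small hypothetical data set (the house price observations are not listed in this excerpt, so they are not used here).

```python
import math

# Hypothetical illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)   # sum of squared deviations of x
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e_squared = sse / (n - 2)                # model error variance estimate
var_x = sxx / (n - 1)                      # sample variance of x, s_x^2

# Two equivalent expressions for the estimated variance of b1.
var_b1_from_sum = s_e_squared / sxx
var_b1_from_var = s_e_squared / ((n - 1) * var_x)
s_b1 = math.sqrt(var_b1_from_sum)

print(f"s_b1^2 (sum form)      = {var_b1_from_sum:.6f}")
print(f"s_b1^2 (variance form) = {var_b1_from_var:.6f}")
print(f"s_b1 = {s_b1:.4f}")
```

A wider spread of x values increases the denominator and so tightens the estimate of the slope.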

Trang 40

Comparing Standard Errors of the Slope

\(s_{b_1}\) is a measure of the variation in the slope of regression lines from different possible samples

Trang 41

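Inference about the Slope: t Test

• t test for a population slope

• Null and alternative hypotheses

\[ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 \]

• Test statistic

\[ t = \frac{b_1 - \beta_1}{s_{b_1}}, \qquad \text{d.f.} = n - 2 \]

where \(b_1\) = regression slope coefficient, \(\beta_1\) = hypothesized slope, and \(s_{b_1}\) = standard error of the slope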

Trang 42

Inference about the Slope: t Test (continued)

Estimated Regression Equation:

house price = 98.25 + 0.1098 (sq. ft.)

The slope of this model is 0.1098

Does square footage of the house affect its sales price?

Trang 43

Inferences about the Slope: t Test Example

\[ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 \]

From Excel output:

              Coefficients  Standard Error  t Stat   P-value
Intercept     98.24833      58.03348        1.69296  0.12892
Square Feet   0.10977       0.03297         3.32938  0.01039

\[ t = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938 \]
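The test statistic can be reproduced from the two numbers in the Excel output. In the sketch below, the critical value 2.306 (two-sided, 5% level, 8 degrees of freedom) is an assumption taken from a standard t table; it does not appear in this excerpt.

```python
# Values from the Excel output for the house price model.
b1 = 0.10977        # estimated slope
s_b1 = 0.03297      # standard error of the slope
beta1_null = 0.0    # hypothesized slope under H0
n = 10

t_stat = (b1 - beta1_null) / s_b1
df = n - 2

# Assumed critical value from a standard t table (two-sided, alpha = .05, 8 d.f.).
t_crit = 2.306
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print(f"reject H0 at the .05 level: {reject_h0}")
```

The tiny difference from the printed 3.32938 comes from using the rounded coefficient and standard error.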

Trang 44

Inferences about the Slope:

Trang 45

Inferences about the Slope:

Trang 46

Confidence Interval Estimate for the Slope

Confidence Interval Estimate of the Slope:

\[ b_1 - t_{n-2,\alpha/2}\, s_{b_1} < \beta_1 < b_1 + t_{n-2,\alpha/2}\, s_{b_1}, \qquad \text{d.f.} = n - 2 \]

Excel Printout for House Prices:

              Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept     98.24833      58.03348        1.69296  0.12892  -35.57720  232.07386
Square Feet   0.10977       0.03297         3.32938  0.01039  0.03374    0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858)

Trang 47

Confidence Interval Estimate for the Slope (continued)

              Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept     98.24833      58.03348        1.69296  0.12892  -35.57720  232.07386
Square Feet   0.10977       0.03297         3.32938  0.01039  0.03374    0.18580

Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.70 and $185.80 per square foot of house size

This 95% confidence interval does not include 0

Conclusion: There is a significant relationship between house price and square feet at the .05 level of significance
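The interval limits in the Excel printout follow from the slope, its standard error, and a t critical value. The sketch below assumes the critical value 2.306 (8 degrees of freedom, 2.5% in each tail) from a standard t table, since the excerpt does not state it.

```python
# Values from the Excel output for the house price model.
b1 = 0.10977      # estimated slope (in $1000s per square foot)
s_b1 = 0.03297    # standard error of the slope

# Assumed critical value from a standard t table (8 d.f., alpha/2 = .025).
t_crit = 2.306

margin = t_crit * s_b1
lower, upper = b1 - margin, b1 + margin

print(f"95% CI for the slope: ({lower:.5f}, {upper:.5f})")
# Convert from $1000s to dollars per square foot.
print(f"in dollars per square foot: ${lower * 1000:.2f} to ${upper * 1000:.2f}")

# The interval excludes zero, which matches the two-sided t test at the .05 level.
excludes_zero = lower > 0 or upper < 0
print(f"interval excludes 0: {excludes_zero}")
```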

Trang 48

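F-Test for Significance

• F test statistic:

\[ F = \frac{MSR}{MSE} \]

where

\[ MSR = \frac{SSR}{k}, \qquad MSE = \frac{SSE}{n - k - 1} \]

and k is the number of independent variables in the regression model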

Trang 49

\[ F = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848 \]

with 1 and 8 degrees of freedom. The Excel output also reports the P-value for the F-test.

Trang 50

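F-Test for Significance (continued)

\(\alpha = .05\)

Test statistic:

\[ F = \frac{MSR}{MSE} = 11.08 \]

Critical value: \(F_{1,8,.05} = 5.32\)

Since 11.08 > 5.32, the null hypothesis of no linear relationship is rejected at the .05 level of significance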
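The F statistic for the house price model can be rebuilt from the reported sums of squares. The sketch also checks a known property of simple regression: with one independent variable, F equals the square of the slope's t statistic. Variable names are illustrative; the critical value 5.32 is the one quoted on the slide.

```python
# Values reported for the house price model.
sst = 32600.5000   # total sum of squares
ssr = 18934.9348   # regression sum of squares
n = 10             # sample size
k = 1              # number of independent variables

sse = sst - ssr
msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse

f_crit = 5.32       # critical value quoted for alpha = .05 with 1 and 8 d.f.
t_stat = 3.32938    # slope t statistic from the Excel output

print(f"MSR = {msr:.4f}, MSE = {mse:.4f}")
print(f"F = {f_stat:.4f}")
print(f"t^2 = {t_stat ** 2:.4f}")
print(f"reject H0 at the .05 level: {f_stat > f_crit}")
```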

Trang 51

• The regression equation can be used to predict a value for y, given a particular x

• For a specified value, \(x_{n+1}\), the predicted value is

\[ \hat{y}_{n+1} = b_0 + b_1 x_{n+1} \]

(Section 11.6)

Trang 52

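Predictions Using Regression Analysis

Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.1098 (sq. ft.)
            = 98.25 + 0.1098(2000)
            = 317.85

The predicted price for a house with 2000 square feet is 317.85 ($1000s) = $317,850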
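The point prediction is a direct evaluation of the estimated regression equation. A minimal sketch, with the hypothetical function name `predict_price` and the rounded coefficients used on the slide:

```python
def predict_price(square_feet, b0=98.25, b1=0.1098):
    """Predicted house price in $1000s from the estimated regression equation."""
    return b0 + b1 * square_feet


predicted = predict_price(2000)
print(f"predicted price: {predicted:.2f} ($1000s)")
print(f"predicted price: ${predicted * 1000:,.0f}")
```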

Trang 53

Relevant Data Range

• When using a regression model for prediction, only predict within the relevant range of data

Trang 54

Estimating Mean Values and Predicting Individual Values

Goal: Form intervals around \(\hat{y}\) to express uncertainty about the value of y for a given x

Trang 55

Confidence Interval for the Average Y, Given X

Confidence interval estimate for the expected value of y given a particular x value, \(x_{n+1}\)

Confidence interval for \(E(Y_{n+1} \mid X_{n+1})\):

\[ \hat{y}_{n+1} \pm t_{n-2,\alpha/2}\, s_e \sqrt{\frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \]

Notice that the formula involves the term \((x_{n+1} - \bar{x})^2\), so the size of the interval varies according to the distance of \(x_{n+1}\) from the mean, \(\bar{x}\)

Trang 56

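Prediction Interval for an Individual Y, Given X

Prediction interval for an individual value \(y_{n+1}\):

\[ \hat{y}_{n+1} \pm t_{n-2,\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \]

The extra 1 under the square root widens the interval to reflect the added uncertainty for an individual case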

Trang 57

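Estimation of Mean Values: Example

Confidence interval estimate for \(E(Y_{n+1} \mid X_{n+1})\):

\[ \hat{y}_{n+1} \pm t_{n-2,\alpha/2}\, s_e \sqrt{\frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \]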

Trang 58

Estimation of Individual Values: Example

Find the 95% prediction interval for an individual house with 2,000 square feet

Prediction interval estimate for \(y_{n+1}\):

\[ \hat{y}_{n+1} \pm t_{n-2,\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 317.85 \pm 102.28 \]
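Both intervals share the same ingredients and differ only by the extra 1 under the square root. The sketch below computes the two half-widths for a small hypothetical data set, because the individual house observations are not listed in this excerpt. The function name `interval_half_widths` is illustrative, and the critical value 3.182 (3 degrees of freedom, 2.5% in each tail) is an assumption taken from a standard t table.

```python
import math

# Hypothetical illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]


def interval_half_widths(x, y, x_new, t_crit):
    """Return (y_hat_new, mean_half_width, prediction_half_width) at x_new."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    # Term shared by both intervals: grows as x_new moves away from x_bar.
    distance = 1 / n + (x_new - x_bar) ** 2 / sxx
    y_hat_new = b0 + b1 * x_new
    mean_hw = t_crit * s_e * math.sqrt(distance)        # for E(Y | X)
    pred_hw = t_crit * s_e * math.sqrt(1 + distance)    # for an individual y
    return y_hat_new, mean_hw, pred_hw


T_CRIT_3DF = 3.182  # assumed t table value for n - 2 = 3 d.f., alpha/2 = .025
y_hat_new, mean_hw, pred_hw = interval_half_widths(x, y, 4.0, T_CRIT_3DF)
print(f"prediction at x = 4: {y_hat_new:.3f}")
print(f"confidence interval for the mean: +/- {mean_hw:.3f}")
print(f"prediction interval for an individual: +/- {pred_hw:.3f}")
```

The prediction interval is always the wider of the two, and both are narrowest when the new x value equals the sample mean of x.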

Trang 59

Correlation Analysis

• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
  • Correlation is only concerned with the strength of the relationship

(Section 11.7)

Trang 60

The sample correlation coefficient is

\[ r = \frac{s_{xy}}{s_x s_y} \]

where

\[ s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} \]

Trang 61

Hypothesis Test for Correlation

• To test the null hypothesis of no linear association,

\[ H_0: \rho = 0 \]

the test statistic follows the Student's t distribution with (n - 2) degrees of freedom:

\[ t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}} \]
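Applied to the house price model, this statistic can be computed from the reported coefficient of determination, since in simple regression the squared correlation equals R-squared. The sketch takes the positive square root because the estimated slope is positive; variable names are illustrative.

```python
import math

# Values reported for the house price model.
r_squared = 0.58082   # coefficient of determination
n = 10                # sample size

# In simple regression r^2 = R^2; the sign of r matches the sign of the slope.
r = math.sqrt(r_squared)

t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r = {r:.4f}")
print(f"t = {t_stat:.3f} with {n - 2} degrees of freedom")
```

The result matches the t statistic for the slope in the Excel output, as expected: in simple regression the test of zero correlation and the test of zero slope are equivalent.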

Trang 62


Trang 63

Graphical Analysis

• The linear regression model is based on minimizing the sum of squared errors

• If outliers exist, their potentially large squared errors may have a strong influence on the fitted regression line

• Be sure to examine your data graphically for outliers and extreme points

• Decide, based on your model and logic, whether the extreme points should remain or be removed

(Section 11.9)

Trang 64

Chapter Summary

• Introduced the linear regression model

• Reviewed correlation and the assumptions of linear regression

• Discussed estimating the simple linear regression coefficients

• Described measures of variation

• Described inference about the slope

• Addressed estimation of mean values and prediction of individual values

Date posted: 10/01/2018, 16:03
