
Chapter 4 (linear regression) student


Document information: 50 pages, 1.16 MB

Page 1

Chapter 4 Linear Regression and Correlation Analysis

Page 2

1 Introduction to regression analysis

Regression analysis

- Describe a relationship between two variables in mathematical terms

- Predict the value of a dependent variable based on the value of at least one independent variable

- Explain the impact of changes in an independent variable on the dependent variable

Page 3

Independent variable: the variable used to explain the dependent variable

Page 4

Names for ys and xs

Names for y           Names for x
Regressand            Regressors
Effect variable       Causal variables
Explained variable    Explanatory variables

Page 5

Simple Linear Regression Model

- Only one independent variable, x

- Relationship between x and y is described by a linear function

- Changes in y are assumed to be caused by changes in x

Page 6

Types of Regression Models

[Figure: four scatter plots showing a positive linear relationship, a negative linear relationship, a non-linear relationship, and no relationship]

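Page 7

Population regression model:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

- $y$: dependent variable

- $\beta_0$: population y-intercept

- $\beta_1$: population slope coefficient

- $x$: independent variable

- $\varepsilon$: random error term, or residual

$\beta_0 + \beta_1 x$ is the linear component; $\varepsilon$ is the random error component.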

Page 8

- The probability distribution of the errors has constant variance

- The underlying relationship between the x variable and the y variable is linear

Page 9

Population Linear Regression

[Figure: population regression line $y = \beta_0 + \beta_1 x + \varepsilon$, with the random error marked for one x value]

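Page 10

Estimated regression model:

$$\hat{y} = b_0 + b_1 x$$

- $\hat{y}$: estimated (or predicted) y value

- $b_0$: estimate of the regression intercept

- $b_1$: estimate of the regression slope

- $x$: independent variable

The individual random error terms $e_i$ have a mean of zero.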

Page 11

Least Squares Criterion

- $b_0$ and $b_1$ are the values that minimize the sum of the squared residuals:

$$\sum e^2 = \sum (y - \hat{y})^2 = \sum \bigl(y - (b_0 + b_1 x)\bigr)^2$$
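The minimization can be written out with elementary calculus. The following is a sketch of the standard derivation; the name $S(b_0, b_1)$ for the sum of squared residuals is introduced here for convenience and is not notation from the slides.

```latex
S(b_0, b_1) = \sum \bigl(y - b_0 - b_1 x\bigr)^2

% Set both partial derivatives to zero:
\frac{\partial S}{\partial b_0} = -2 \sum \bigl(y - b_0 - b_1 x\bigr) = 0
\qquad
\frac{\partial S}{\partial b_1} = -2 \sum x \bigl(y - b_0 - b_1 x\bigr) = 0

% Rearranging gives the two normal equations:
\sum y = n b_0 + b_1 \sum x
\qquad
\sum xy = b_0 \sum x + b_1 \sum x^2
```

Solving these two equations for $b_0$ and $b_1$ gives the least squares estimates.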

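Page 12

The Least Squares Equations

$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$

or, equivalently,

$$b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}$$

and

$$b_0 = \bar{y} - b_1 \bar{x}$$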
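The least squares estimates can be computed directly from column sums. The following is a minimal Python sketch, assuming the algebraic (sums-based) form of the slope formula; the function name `least_squares_fit` and the five data points are made up for illustration and are not taken from the chapter.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) for the fitted line y_hat = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    denominator = sum_x2 - sum_x ** 2 / n
    if denominator == 0:
        raise ValueError("x values must not all be identical")

    # Slope: (sum xy - sum x * sum y / n) / (sum x^2 - (sum x)^2 / n)
    b1 = (sum_xy - sum_x * sum_y / n) / denominator
    # Intercept: y_bar - b1 * x_bar
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1


# Made-up illustrative data (not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_fit(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # prints: b0 = 2.20, b1 = 0.60
```

For this toy data the fitted line is roughly y_hat = 2.2 + 0.6x, so each one-unit increase in x raises the estimated average y by about 0.6.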

Page 13

Interpretation of the Slope and the Intercept

- $b_0$ is the estimated average value of y when the value of x is zero

- $b_1$ is the estimated change in the average value of y as a result of a one-unit change in x

Page 14

- A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)

- A random sample of 10 houses is selected

Page 15

Sample Data for House Price Model

House Price in $1000s (y)

Page 16

Least Squares Regression Properties

- The sum of the residuals from the least squares regression line is 0: $\sum (y - \hat{y}) = 0$

- The sum of the squared residuals, $\sum (y - \hat{y})^2$, is a minimum

- The simple regression line always passes through the mean of the y variable and the mean of the x variable: $\bar{y} = b_0 + b_1 \bar{x}$

- The least squares coefficients are unbiased estimates of $\beta_0$ and $\beta_1$
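Three of these properties can be checked numerically on any sample. The sketch below does so in plain Python on made-up data (not from the chapter); the helper name `sum_squared_residuals` is hypothetical. Unbiasedness is a statement about repeated sampling and cannot be verified from a single sample, so it is not checked here.

```python
# Made-up illustrative data (not taken from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates (deviation form)
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar


def sum_squared_residuals(c0, c1):
    """Sum of squared residuals for the candidate line y_hat = c0 + c1 * x."""
    return sum((yi - (c0 + c1 * xi)) ** 2 for xi, yi in zip(x, y))


# Property 1: the residuals sum to zero (up to rounding error)
residual_sum = sum(yi - (b0 + b1 * xi) for xi, yi in zip(x, y))
print("residuals sum to zero:", abs(residual_sum) < 1e-9)

# Property 2: nudging either coefficient increases the sum of squared residuals
best = sum_squared_residuals(b0, b1)
print("shifted intercept is worse:", best < sum_squared_residuals(b0 + 0.1, b1))
print("shifted slope is worse:", best < sum_squared_residuals(b0, b1 - 0.1))

# Property 3: the fitted line passes through (x_bar, y_bar)
print("passes through the means:", abs((b0 + b1 * x_bar) - y_bar) < 1e-9)
```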

Page 17

- The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable

- The coefficient of determination is also called R-squared and is denoted as $R^2$

Page 18

$$R^2 = \frac{RSS}{TSS}$$

where RSS is the regression sum of squares and TSS is the total sum of squares.

Page 19

Examples of Approximate $R^2$ Values

$R^2 = +1$

[Figure: scatter plots of y against x]

Page 20

Examples of Approximate $R^2$ Values

[Figure: scatter plots of y against x in which only part of the variation in y is explained by variation in x]

Page 21

Examples of Approximate $R^2$ Values

$R^2 = 0$

No linear relationship between x and y (none of the variation in y is explained by variation in x)

Page 23

Coefficient of determination

$$R^2 = \frac{RSS}{TSS}$$
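As a rough illustration, the ratio can be computed from a fitted line in a few lines of Python. This sketch follows the chapter's convention that RSS is the regression (explained) sum of squares; the function name `r_squared`, the variable `sse` for the sum of squared residuals, and the data are assumptions made for the example.

```python
def r_squared(x, y):
    """Return (R^2, RSS, SSE, TSS) for a simple least squares fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    rss = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    return rss / tss, rss, sse, tss


# Made-up illustrative data (not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2, rss, sse, tss = r_squared(x, y)
print(f"R^2 = {r2:.2f}")  # prints: R^2 = 0.60
```

Here about 60% of the variation in y is explained by x. For a least squares fit with an intercept, the total variation splits into the explained part and the residual part, so `rss + sse` matches `tss` up to rounding.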

Page 24

2 Correlation analysis

- Correlation is a technique used to measure the strength of the relationship between two variables

- The stronger the correlation, the better the regression line fits the data, and vice versa

Page 25

Scatter Plot Examples

Page 26

Scatter Plot Examples (continued)

Page 27

The correlation coefficient (r)

- The correlation coefficient is used to measure the strength of the linear relationship between two variables

- The product moment correlation coefficient is calculated using the formula:

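Page 28

The correlation coefficient (r)

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$

or, equivalently,

$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$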
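Both forms of the formula should give the same value. The sketch below implements each one in plain Python and compares them on made-up data; the function names `pearson_r_deviation` and `pearson_r_sums` are hypothetical, chosen only for this example.

```python
import math


def pearson_r_deviation(x, y):
    """Correlation coefficient from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def pearson_r_sums(x, y):
    """Correlation coefficient from raw sums (the algebraic equivalent)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Made-up illustrative data (not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_dev = pearson_r_deviation(x, y)
r_sum = pearson_r_sums(x, y)
print(f"deviation form: {r_dev:.4f}, sums form: {r_sum:.4f}")
```

The sums form is convenient for hand calculation from a table of x, y, xy, x² and y² columns; the deviation form tends to be less sensitive to rounding when the values are large.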

Page 29

r: simple correlation coefficient

Page 30

Features of r

- Unit free

- Ranges between -1 and 1

- The closer to -1, the stronger the negative linear relationship

- The closer to 1, the stronger the positive linear relationship

- The closer to 0, the weaker the linear relationship
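The "unit free" property can be seen by changing the measurement units and recomputing r. The following sketch uses made-up house sizes and prices (not the chapter's sample data) and a hypothetical helper `pearson_r`; converting square feet to square metres and thousands of dollars to dollars should leave r unchanged, while negating one variable should only flip its sign.

```python
import math


def pearson_r(x, y):
    """Product moment correlation coefficient (deviation form)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up illustrative data (not from the chapter)
size_sqft = [1000, 1500, 2000, 2500, 3000]
price_k = [150, 210, 240, 330, 350]          # price in $1000s

# Same observations expressed in different units
size_sqm = [s * 0.092903 for s in size_sqft]  # square feet to square metres
price_usd = [p * 1000 for p in price_k]       # $1000s to dollars

r_original = pearson_r(size_sqft, price_k)
r_rescaled = pearson_r(size_sqm, price_usd)
r_flipped = pearson_r(size_sqft, [-p for p in price_k])

print(f"original units: {r_original:.4f}")
print(f"rescaled units: {r_rescaled:.4f}")
print(f"price negated:  {r_flipped:.4f}")
```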

Page 31

Examples of Approximate r Values

[Figure: scatter plots with r = +0.3 and r = +1]

Page 32

Example calculation

$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$

Page 35

Estimate $b_0$ and $b_1$

Page 36

Linear regression equation

and b1?

Page 37

Coefficient of determination and correlation coefficient
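In simple linear regression the two measures are linked: the coefficient of determination equals the square of the correlation coefficient. A short derivation is sketched below; the shorthand $S_{xx}$, $S_{yy}$ and $S_{xy}$ is introduced here for brevity and is not notation from the slides.

```latex
S_{xx} = \sum (x - \bar{x})^2, \qquad
S_{yy} = \sum (y - \bar{y})^2 = TSS, \qquad
S_{xy} = \sum (x - \bar{x})(y - \bar{y})

% Since \hat{y} = b_0 + b_1 x and \bar{y} = b_0 + b_1 \bar{x}:
\hat{y} - \bar{y} = b_1 (x - \bar{x})
\quad\Longrightarrow\quad
RSS = \sum (\hat{y} - \bar{y})^2 = b_1^2 \, S_{xx}

% Substituting b_1 = S_{xy} / S_{xx}:
R^2 = \frac{RSS}{TSS}
    = \frac{S_{xy}^2}{S_{xx} \, S_{yy}}
    = \left( \frac{S_{xy}}{\sqrt{S_{xx} \, S_{yy}}} \right)^2
    = r^2
```

The sign of r is the sign of the slope $b_1$, so r is the positive or negative square root of $R^2$ accordingly. This identity holds only for regression with a single independent variable.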

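Page 38

The Multiple Regression Model

Idea: examine the linear relationship between one dependent variable (y) and two or more independent variables ($x_i$)

Population model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$$

Estimated multiple regression model:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$$

where $b_0$ is the estimated intercept and $b_1, \dots, b_k$ are the estimated slope coefficients.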

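Page 39

Estimates $b_0, b_1, b_2, \dots, b_k$

The estimates are the solution of the following system of equations:

$$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2 + \dots + b_k \sum x_k$$

$$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2 + \dots + b_k \sum x_1 x_k$$

$$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2 + \dots + b_k \sum x_2 x_k$$

$$\vdots$$

$$\sum x_k y = b_0 \sum x_k + b_1 \sum x_1 x_k + b_2 \sum x_2 x_k + \dots + b_k \sum x_k^2$$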
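For small problems, a system of this kind can be assembled and solved directly. The sketch below is one possible plain-Python approach, assuming Gaussian elimination with partial pivoting as the solver (the slides do not say how the system is solved); the function names and the data are hypothetical. The y values are generated exactly from y = 1 + 2·x1 - 3·x2, so the recovered estimates should be close to 1, 2 and -3.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(predictors, y):
    """predictors: list of columns [x1, x2, ...]. Returns [b0, b1, ..., bk]."""
    n = len(y)
    # A leading column of ones produces the intercept equation (sum y = n*b0 + ...)
    columns = [[1.0] * n] + [[float(v) for v in col] for col in predictors]
    # Entry (i, j) of the coefficient matrix is the sum of products of columns i and j
    a = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve_linear_system(a, rhs)


# Made-up illustrative data (not from the chapter): y = 1 + 2*x1 - 3*x2 exactly
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * u - 3 * v for u, v in zip(x1, x2)]

coefficients = fit_multiple_regression([x1, x2], y)
print([round(b, 6) for b in coefficients])
```

In practice a numerical library would normally be used instead, since solving these equations directly can be numerically unstable when predictors are strongly correlated.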

Page 41

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

[Figure: fitted regression surface, showing the slope for variable $x_1$ and the slope for variable $x_2$]

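Page 42

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

The best-fit equation, $\hat{y}$, is found by minimizing the sum of squared errors, $\sum e^2$.

[Figure: a sample observation at $x_{1i}$ and its distance from the fitted surface]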

Page 43

Multiple Regression Assumptions

Errors (residuals) from the regression model:

$$e = y - \hat{y}$$

- The errors are normally distributed

- The mean of the errors is zero

- Errors have a constant variance

- The model errors are independent

Page 45

Week | Pie Sales | Price ($) | Advertising ($100s)

Page 46

Estimated (predicted) regression equation:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

Page 48

Multiple Coefficient of Determination

- Reports the proportion of total variation in y explained by all x variables taken together

$$R^2 = \frac{RSS}{TSS} = \frac{\text{Regression sum of squares}}{\text{Total sum of squares}}$$

Page 49

Multiple correlation (R)

- Multiple correlation provides a measure of the overall strength of the relationship between the dependent variable and the independent variables

- It is defined as the positive square root of the coefficient of determination: $R = \sqrt{R^2}$
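Given observed values and fitted values from a least squares model, both quantities follow in a few lines. The sketch below is a minimal illustration: the function name `multiple_r` is hypothetical, and the `y_hat` list is a made-up stand-in for the fitted values of a least squares model with an intercept (the ratio of regression to total sum of squares is only meaningful in that case).

```python
import math


def multiple_r(y, y_hat):
    """Return (R^2, R) from observed values y and least squares fitted values y_hat."""
    y_bar = sum(y) / len(y)
    rss = sum((fi - y_bar) ** 2 for fi in y_hat)  # regression sum of squares
    tss = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    r2 = rss / tss
    return r2, math.sqrt(r2)                      # R is the positive square root


# Made-up illustrative values (not from the chapter)
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2, r = multiple_r(y, y_hat)
print(f"R^2 = {r2:.2f}, R = {r:.4f}")
```

Unlike the simple correlation coefficient r, the multiple correlation R is never negative, so it indicates strength but not direction.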

Page 50

Correlation matrix

- Provides measures of the strength of the relationship between the dependent variable and each independent variable

        y         x1         x2
y       1
x1      r_x1y     1
x2      r_x2y     r_x1x2     1
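A matrix of this shape can be built by computing the pairwise correlation for every pair of variables. The following is a small plain-Python sketch on made-up data; the names `pearson_r` and `correlation_matrix` are assumptions for the example, not functions referenced by the chapter.

```python
import math


def pearson_r(a, b):
    """Product moment correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    s_aa = sum((u - a_bar) ** 2 for u in a)
    s_bb = sum((v - b_bar) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def correlation_matrix(variables):
    """variables: dict of name -> list of values. Returns nested dict of r values."""
    names = list(variables)
    return {row: {col: pearson_r(variables[row], variables[col]) for col in names}
            for row in names}


# Made-up illustrative data (not from the chapter)
data = {
    "y": [2, 4, 5, 4, 5],
    "x1": [1, 2, 3, 4, 5],
    "x2": [5, 3, 4, 2, 1],
}

matrix = correlation_matrix(data)

# Print the lower triangle, matching the layout of the slide
names = list(data)
for i, row in enumerate(names):
    cells = [f"{matrix[row][col]:6.3f}" for col in names[: i + 1]]
    print(f"{row:<3}", " ".join(cells))
```

The first column shows how strongly each independent variable is related to y. The remaining off-diagonal entries show how strongly the independent variables are related to each other.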

Posted: 27/05/2019, 16:37
