
Statistics for Business: Decision Making and Analysis, Robert Stine and Dean Foster, Chapter 21


Page 2

The Simple Regression Model

Chapter 21

Page 3

21.1 The Simple Regression Model

How can we test the CAPM (Capital Asset Pricing Model) for Berkshire Hathaway stock?

• Formulate a simple regression with the percentage change in Berkshire Hathaway stock as y and the percentage change in value of the whole stock market as x.

• Inference relies on standard errors, confidence intervals, and hypothesis tests.

Page 4

21.1 The Simple Regression Model

• The simple regression model (SRM) describes the association in the population between an explanatory variable x and response y.

• Consider the data to be a sample from a population.

Page 5

21.1 The Simple Regression Model

Linear on Average

• The SRM describes how the conditional mean of Y depends on X.

• These conditional means lie on a line with intercept β0 and slope β1:

$$\mu_{y|x} = E(Y \mid X = x) = \beta_0 + \beta_1 x$$

Page 6

21.1 The Simple Regression Model

Deviations from the Mean

• The deviations of responses around $\mu_{y|x}$ are called errors.

• The error is denoted by $\varepsilon = y - \mu_{y|x}$, and $E(\varepsilon) = 0$.

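Page 7

21.1 The Simple Regression Model

Deviations from the Mean

The SRM makes three assumptions about $\varepsilon$:

1. Independent. Errors are independent of each other.

2. Equal variance. All errors have the same variance, $\sigma_\varepsilon^2$.

3. Normal. The errors are normally distributed.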

Page 8

21.1 The Simple Regression Model

Data Generating Process

… X denote its spending on advertising (both in thousands of dollars).

Page 9

21.1 The Simple Regression Model

Data Generating Process

The SRM assumes a normal distribution at each x

Page 10

21.1 The Simple Regression Model

Data Generating Process

Eventually the data shown below are observed

Page 11

21.1 The Simple Regression Model

Data Generating Process

• The true regression line is a characteristic of the population, not the observed data.

• The SRM is a model and offers a simplified view of reality.

Page 12

21.1 The Simple Regression Model

Simple Regression Model (SRM)

Observed values of the response Y are linearly related to the values of the explanatory variable X by the equation:

$$y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma_\varepsilon^2).$$

The observations are independent of one another, have equal variance around the regression line, and are normally distributed around the regression line.
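The data generating process described by the SRM can be imitated in a few lines. This is only a sketch: the parameter values (`beta0=500.0`, `beta1=2.0`, `sigma_eps=45.0`), the x values, and the helper name `simulate_srm` are hypothetical choices for illustration and do not come from the slides.

```python
import random


def simulate_srm(x_values, beta0, beta1, sigma_eps, seed=0):
    """Draw y = beta0 + beta1 * x + eps, with independent eps ~ N(0, sigma_eps^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma_eps) for x in x_values]


# Hypothetical values, chosen only to illustrate the process.
x_values = [50 + 5 * i for i in range(20)]
y_values = simulate_srm(x_values, beta0=500.0, beta1=2.0, sigma_eps=45.0)

# The errors are the deviations of the responses from the true line.
errors = [y - (500.0 + 2.0 * x) for x, y in zip(x_values, y_values)]
print(sum(errors) / len(errors))  # near 0 on average, not exactly 0 in one sample
```

In practice only `x_values` and `y_values` are observed; the true line and the errors stay hidden, which is why the slope and intercept have to be estimated.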

Page 13

21.2 Conditions for the SRM

Conditions for the SRM – Checklist

• Is the association between y and x linear?

• Have lurking variables been ruled out?

• Are the errors evidently independent?

• Are the variances of the residuals similar?

• Are the residuals nearly normal?

Page 14

21.2 Conditions for the SRM

Conditions for the SRM – CAPM Example

Linearity condition is satisfied; there is no pattern in the residuals. Data are shifted to the right because of …

Page 15

21.2 Conditions for the SRM

Conditions for the SRM – CAPM Example

No obvious lurking variable (according to CAPM theory).

Similar variances condition is satisfied. Check the plot of residuals versus x for any fan-shaped pattern (none visible).

Page 16

21.2 Conditions for the SRM

Conditions for the SRM – CAPM Example

Evidently independent. No dependence is apparent in the timeplot of the residuals.

Page 17

21.2 Conditions for the SRM

Conditions for the SRM – CAPM Example

The residuals are not normally distributed. Check the sample size condition (satisfied) to use the CLT.

Page 18

21.2 Conditions for the SRM

Modeling Process

Before looking at plots, ask two questions:

1. Does a linear relationship make sense?

2. Is the relationship free of lurking variables?

Then begin working with data

Page 19

21.2 Conditions for the SRM

Modeling Process

• Plot y versus x and verify a linear association.

• Fit the least squares line and obtain residuals.

• Plot the residuals versus x.

• If time series data, construct a timeplot of residuals.

• Inspect the histogram and quantile plot of the residuals.
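The fitting step in this process can be sketched with the standard least squares formulas. The data and the helper names `fit_least_squares` and `residuals` are hypothetical; the plots themselves would be drawn with whatever plotting tool is at hand and are not shown here.

```python
from statistics import mean


def fit_least_squares(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Observed response minus fitted value, one residual per observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

b0, b1 = fit_least_squares(x, y)
e = residuals(x, y, b0, b1)

# Standard deviation of the residuals, with n - 2 degrees of freedom.
s_e = (sum(r ** 2 for r in e) / (len(x) - 2)) ** 0.5
print(b0, b1, s_e)
```

The list `e` is what would be plotted against `x`, against time, and in a histogram or quantile plot when checking the conditions.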

Page 20

21.3 Inference in Regression

Parameters and Estimates for SRM

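Page 21

Standard error of the slope:

$$\operatorname{se}(b_1) = \frac{s_e}{\sqrt{n-1}} \cdot \frac{1}{s_x} \approx \frac{s_e}{\sqrt{n}\, s_x}$$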

Page 22

21.3 Inference in Regression

Estimated Standard Error of b1

Influenced by:

• Standard deviation of the residuals. As it increases, the standard error increases.

• Sample size. As it increases, the standard error decreases.

• Standard deviation of x. As it increases, the standard error decreases.
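These three effects can be checked numerically with the usual formula for the standard error of the slope, $\operatorname{se}(b_1) = s_e / (\sqrt{n-1}\, s_x)$. The input values and the helper name `se_slope` below are hypothetical and serve only as a sketch.

```python
from math import sqrt


def se_slope(s_e, n, s_x):
    """Standard error of the slope: s_e / (sqrt(n - 1) * s_x)."""
    return s_e / (sqrt(n - 1) * s_x)


# Hypothetical baseline values.
base = se_slope(s_e=2.0, n=50, s_x=4.0)

print(se_slope(s_e=4.0, n=50, s_x=4.0) > base)   # more residual variation: larger se
print(se_slope(s_e=2.0, n=200, s_x=4.0) < base)  # larger sample: smaller se
print(se_slope(s_e=2.0, n=50, s_x=8.0) < base)   # more spread in x: smaller se
```

More variation in x gives the line more leverage, so the slope is estimated more precisely.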

Page 23

21.3 Inference in Regression

Software Results for CAPM Example

Page 24

21.3 Inference in Regression

Confidence Intervals

The 95% confidence interval for β1 is

$$b_1 \pm t_{0.025,\,n-2}\,\operatorname{se}(b_1)$$

The 95% confidence interval for β0 is

$$b_0 \pm t_{0.025,\,n-2}\,\operatorname{se}(b_0)$$

Page 25

21.3 Inference in Regression

Confidence Intervals – CAPM Example

The 95% confidence interval for β1 is

$$0.7223495 \pm 1.97\,(0.077763) = [0.569 \text{ to } 0.876]$$

The 95% confidence interval for β0 is

$$1.3962046 \pm 1.97\,(0.339682) = [0.727 \text{ to } 2.065]$$
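The arithmetic behind these two intervals can be reproduced directly from the reported estimates, standard errors, and the t quantile 1.97. The helper name `confidence_interval` is hypothetical; the numbers are the ones on this slide.

```python
def confidence_interval(estimate, std_error, t_quantile):
    """Return (low, high) = estimate -/+ t_quantile * std_error."""
    margin = t_quantile * std_error
    return estimate - margin, estimate + margin


T_QUANTILE = 1.97  # value of t(0.025, n - 2) used on the slide

slope_ci = confidence_interval(0.7223495, 0.077763, T_QUANTILE)
intercept_ci = confidence_interval(1.3962046, 0.339682, T_QUANTILE)

print([round(v, 3) for v in slope_ci])      # [0.569, 0.876]
print([round(v, 3) for v in intercept_ci])  # [0.727, 2.065]
```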

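Page 26

Test statistics for the null hypotheses that the slope and the intercept are zero:

$$t = \frac{b_1 - 0}{\operatorname{se}(b_1)}, \qquad t = \frac{b_0 - 0}{\operatorname{se}(b_0)}$$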

Page 27

21.3 Inference in Regression

Hypothesis Tests – CAPM Example

• The t-statistic of 9.29 with p-value of < 0.0001 indicates that the slope is significantly different from zero.

• The t-statistic of 4.11 with p-value of < 0.0001 indicates that the intercept is significantly different from zero.
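Both t-statistics follow from dividing each estimate by its standard error. The sketch below uses the estimates and standard errors reported in the CAPM confidence interval example; the helper name `t_statistic` is hypothetical, and p-values are not computed here.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """Number of standard errors separating the estimate from the null value."""
    return (estimate - null_value) / std_error


t_slope = t_statistic(0.7223495, 0.077763)
t_intercept = t_statistic(1.3962046, 0.339682)

print(round(t_slope, 2))      # 9.29
print(round(t_intercept, 2))  # 4.11
```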

Page 28

4M Example 21.1:

LOCATING A FRANCHISE OUTLET

Motivation

Does traffic volume affect gasoline sales?

How much more gasoline can be expected

to be sold at a franchise location with an

average of 40,000 drive-bys compared to

one with an average of 32,000 drive-bys?

Page 29

A confidence interval for 8,000 times the estimated slope will indicate how much more gas is expected to sell at the busier location.

Page 31

4M Example 21.1:

LOCATING A FRANCHISE OUTLET

Mechanics

Page 34

Hence, a difference of 8,000 cars in daily

traffic volume implies a difference in

average daily sales of approximately 1,507

to 2,281 more gallons per day.
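The scaling step can be sketched as follows: multiplying both endpoints of a confidence interval for the slope by 8,000 gives the interval for the difference in expected daily sales. The slope endpoints used here are back-calculated from the reported 1,507 to 2,281 gallons, so they are an assumption and not regression output from the slides; `scale_interval` is a hypothetical helper.

```python
def scale_interval(low, high, factor):
    """Scale a confidence interval by a constant, keeping the endpoints ordered."""
    a, b = factor * low, factor * high
    return (a, b) if a <= b else (b, a)


# Assumed slope interval (gallons per car), back-calculated from the reported result.
slope_ci = (1507 / 8000, 2281 / 8000)

extra_cars = 40_000 - 32_000  # difference in daily traffic volume
low, high = scale_interval(*slope_ci, extra_cars)
print(round(low), round(high))
```

The same rule applies to any constant multiple of the slope, because the estimate and its standard error are both scaled by that constant.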

Page 35

4M Example 21.1:

LOCATING A FRANCHISE OUTLET

Message

Based on a sample of 80 gas stations, we expect that a station located at a site with 40,000 drive-bys will sell on average from 1,507 to 2,281 more gallons of gas daily than a location with 32,000 drive-bys.

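Page 36

21.4 Prediction Intervals

Leveraging the SRM

• A prediction interval is an interval designed to hold a fraction (usually 95%) of the values of the response for a given value of x.

• A prediction interval differs from a confidence interval because it makes a statement about the location of a new observation rather than about a population parameter.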

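Page 37

The 95% prediction interval for a new observation at $x_{new}$ is

$$\hat{y}_{new} \pm t_{0.025,\,n-2}\,\operatorname{se}(\hat{y}_{new}),$$

where $\hat{y}_{new} = b_0 + b_1 x_{new}$ and

$$\operatorname{se}(\hat{y}_{new}) = s_e \sqrt{1 + \frac{1}{n} + \frac{(x_{new} - \bar{x})^2}{(n-1)\, s_x^2}}$$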
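A prediction interval of this form can be computed from raw data as sketched below. The data and the function name `prediction_interval` are hypothetical. The default `t_quantile=2.0` is a rough stand-in for $t_{0.025,\,n-2}$; the exact quantile would come from a t table or statistics software.

```python
from math import sqrt
from statistics import mean, stdev


def prediction_interval(x, y, x_new, t_quantile=2.0):
    """Return (low, y_hat, high) for a new response at x_new under the SRM."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x = stdev(x)  # sample standard deviation of x
    sxx = (n - 1) * s_x ** 2
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # Standard deviation of the residuals, with n - 2 degrees of freedom.
    s_e = sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    y_hat = b0 + b1 * x_new
    se_new = s_e * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    return y_hat - t_quantile * se_new, y_hat, y_hat + t_quantile * se_new


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

inside = prediction_interval(x, y, x_new=3.5)   # at the mean of x
outside = prediction_interval(x, y, x_new=10.0)  # extrapolating beyond the data
print(inside[2] - inside[0], outside[2] - outside[0])
```

The interval is narrowest at the mean of x and widens as `x_new` moves away from it, which is one reason to be cautious about extrapolation.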

Page 38

21.4 Prediction Intervals

Leveraging the SRM

• A simple approximation for a 95% prediction interval is $\hat{y} \pm 2 s_e$.

• Prediction intervals are reliable within the range of observed data. They are also sensitive to the assumptions of constant variance and normality.

Page 39

4M Example 21.2:

MANAGING NATURAL RESOURCES

Motivation

In managing commercial fishing fleets, the level of effort (number of boat-days) is assumed to influence the size of the catch. What is the predicted crab catch in a season with 7,500 days of effort?

Page 40

4M Example 21.2:

MANAGING NATURAL RESOURCES

Method

Use regression with Y equal to the catch

near Vancouver Island from 1980 – 2007

measured in thousands of pounds of

Dungeness crabs with X equal to the level

of effort (total number of days by boats

catching Dungeness crabs).

Page 42

4M Example 21.2:

MANAGING NATURAL RESOURCES

Mechanics

Page 43

4M Example 21.2:

MANAGING NATURAL RESOURCES

Mechanics

Evidently independent

Page 46

4M Example 21.2:

MANAGING NATURAL RESOURCES

Mechanics

The t-statistic (and p-value) indicate that the slope is significantly different from zero. The predicted catch in a year with x = 7,500 days of effort is 1,173.24 thousand pounds. The 95% prediction interval is from 908.44 to 1,438.11 thousand pounds.

Page 47

On average, each additional day of effort (per boat) increases the harvest by about 160 pounds. In a season with 7,500 days of effort, there is an expected total harvest of 1,173,240 pounds. There is a 95% probability that the catch will be between 908,440 and 1,438,110 pounds.

Page 48

Best Practices

• Verify that your model makes sense, both visually and substantively.

• Consider other possible explanatory variables.

• Check the conditions, in the listed order.

Page 49

Best Practices (Continued)

• Use confidence intervals to express what you know about the slope and intercept.

• Check the assumptions of the SRM carefully before using prediction intervals.

• Be careful when extrapolating.

Page 50

• Don’t overreact to residual plots.

• Do not mistake varying amounts of data for unequal variances.

Date posted: 10/01/2018, 16:01
