
Statistics for Business: Decision Making and Analysis (Robert Stine and Dean Foster), Chapter 25


Trang 2

Categorical Explanatory Variables

Chapter 25

Trang 3

25.1 Two-Sample Comparisons

Does Wal-Mart discriminate against female employees? Are they paid less than men?

• Use a categorical explanatory variable representing gender to analyze pay data.

• Adjust the comparison between men and women to account for other variables that may affect pay.

Trang 4

25.1 Two-Sample Comparison

Example: Mid-Level Managers’ Salaries

The average salary for women is $140,000 and the average salary for men is $144,700, a difference of $4,700.

Trang 5

25.1 Two-Sample Comparison

Example: Mid-Level Managers’ Salaries

The 95% confidence interval for the difference in mean salaries is $740 to $8,591. Since 0 is not in this interval, the difference is statistically significant.

Assume conditions for inference are satisfied.
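The interval reasoning above can be sketched in a few lines. This is a minimal illustration, not the textbook's calculation: the salary figures are hypothetical, the function name `diff_ci` is made up for this example, and a normal quantile is used as a large-sample stand-in for the t quantile.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_ci(group_a, group_b, level=0.95):
    """Approximate CI for mean(group_a) - mean(group_b), unpooled variances."""
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference between two independent sample means.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    # Assumption: normal quantile instead of the t quantile (large-sample shortcut).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


# Hypothetical salaries in $000 (NOT the data behind the slides).
men = [150, 143, 139, 152, 147, 141, 149, 145]
women = [138, 142, 136, 144, 140, 137, 143, 139]

low, high = diff_ci(men, women)
# The difference is statistically significant when 0 lies outside the interval.
significant = not (low <= 0 <= high)
print(f"95% CI: {low:.2f} to {high:.2f}; significant: {significant}")
```

With samples this small a t quantile would give a somewhat wider interval; the shortcut is used only to keep the sketch free of external libraries.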

Trang 6

25.1 Two-Sample Comparison

Confounding Variables

Be careful about lurking variables that would account for the significant difference between average salaries (e.g., experience).

Experience confounds the comparison if it is correlated with salary and the two groups (men and women) differ with regard to experience.

Trang 7

25.1 Two-Sample Comparison

Subsets and Confounding

Restrict the analysis to a subset of cases with matching levels of the confounding variable (e.g., compare men and women with 5 years of experience).

Trang 8

25.1 Two-Sample Comparison

Subsets and Confounding

• The 95% confidence interval for the difference in average salaries between men and women within the subset of managers with 5 years of experience includes 0 (the difference is not statistically significant).

• However, the standard error of the difference is much larger; the cases in the subset do not produce a precise estimate.

Trang 9

25.2 Analysis of Covariance

Regression on Subsets

• What about the difference between average salaries for managers with 2, 10, or 15 years of experience?

• Analysis of covariance: a regression that combines categorical and numerical explanatory variables; it adjusts the comparison of means for the effects of confounding variables.

Trang 12

25.2 Analysis of Covariance

Combining Regressions

Combining the regressions for men and women requires a dummy variable identifying whether a manager is male or female (Group = 1 for men; Group = 0 for women).

An interaction term is the product of two explanatory variables in a regression model (here, Group × Years).
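As a small illustration of the coding described above, the sketch below builds the three explanatory columns for a handful of managers. The helper name `design_row` and the sample records are hypothetical.

```python
def design_row(years, gender):
    """Return (Years, Group, Group*Years) for one manager."""
    group = 1 if gender == "male" else 0  # dummy variable: 1 for men, 0 for women
    interaction = group * years           # interaction = product of the two variables
    return (years, group, interaction)


# Hypothetical managers: (years of experience, gender).
managers = [(5, "male"), (5, "female"), (12, "male"), (8, "female")]
rows = [design_row(years, gender) for years, gender in managers]

for row in rows:
    print(row)
```

For women the interaction column is always 0, so it only changes the slope of the line fit to men.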

Trang 15

25.2 Analysis of Covariance

Interpreting Coefficients

The group coded 0 by the dummy variable forms a baseline for comparison.

The slope of the dummy variable is the difference between estimated intercepts in the simple regressions. The slope of the interaction is the difference between estimated slopes in the simple regressions.
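The relationship between the combined model and the two simple regressions can be checked numerically. The sketch below is an illustration with hypothetical data: it fits a line to each group, forms the combined coefficients as the differences described above, and confirms that the combined equation reproduces both lines. The names `simple_fit` and `combined` are made up for this example.

```python
from statistics import mean


def simple_fit(x, y):
    """Least-squares intercept and slope for a simple regression of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: years of experience and salary in $000.
years_w, salary_w = [2, 5, 8, 12, 15], [130, 134, 139, 143, 149]
years_m, salary_m = [3, 6, 9, 13, 16], [133, 136, 140, 146, 150]

b0_w, b1_w = simple_fit(years_w, salary_w)  # women: baseline group (Group = 0)
b0_m, b1_m = simple_fit(years_m, salary_m)  # men (Group = 1)

# Combined model: Salary = b0 + b1*Years + b2*Group + b3*Group*Years
b0, b1 = b0_w, b1_w   # intercept and slope of the baseline group
b2 = b0_m - b0_w      # coefficient of the dummy: difference between intercepts
b3 = b1_m - b1_w      # coefficient of the interaction: difference between slopes


def combined(years, group):
    """Fitted salary from the combined equation."""
    return b0 + b1 * years + b2 * group + b3 * group * years


print(f"dummy coefficient: {b2:.3f}, interaction coefficient: {b3:.3f}")
```

Setting Group = 0 leaves the women's line, and setting Group = 1 adds b2 to the intercept and b3 to the slope, which gives the men's line.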

Trang 16

25.3 Checking Conditions

• The scatterplot reveals a weak linear association between Salary and Years.

• Some caution is necessary regarding lurking variables (e.g., educational background or business aptitude).

Trang 17

25.3 Checking Conditions

Checking for Similar Variances

• Plot the residuals against the fitted values.

• Compare side-by-side boxplots of the residuals for each group. The similar variance condition is violated if the IQR in one boxplot is more than twice the IQR in the other.
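The boxplot rule of thumb above can be expressed as a short check. This is a sketch under assumptions: the residuals are hypothetical, the function names are made up, and the quartiles come from Python's `statistics.quantiles`, whose default method may differ slightly from the quartiles a particular statistics package draws in its boxplots.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1


def similar_variance(resid_a, resid_b):
    """True unless one group's IQR is more than twice the other's."""
    a, b = iqr(resid_a), iqr(resid_b)
    return max(a, b) <= 2 * min(a, b)


# Hypothetical residuals (in $000) from a fitted regression, split by group.
resid_men = [-6, -3, -1, 0, 1, 2, 4, 5]
resid_women = [-5, -2, -1, 0, 0, 1, 3, 4]
resid_wide = [-12, -8, -3, 0, 1, 4, 9, 13]  # a much more variable group

print(similar_variance(resid_men, resid_women))
print(similar_variance(resid_wide, resid_women))
```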

Trang 20

25.3 Checking Conditions

• The similar variance condition is satisfied.

• Examining the normal quantile plot confirms that the residuals are nearly normal.

Trang 21

25.4 Interactions and Inference

• Principle of marginality: if the interaction is statistically significant, retain it as well as both of its components, regardless of their level of significance.

• If the interaction is not statistically significant, remove it from the regression and re-estimate the equation. A model without an interaction term is simpler to interpret since the lines fit to the groups are parallel.
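The two rules above amount to a simple decision procedure, sketched below. The function name, the term labels, and the 0.05 threshold are assumptions chosen for illustration; the slides do not state a significance level.

```python
def terms_to_keep(p_interaction, alpha=0.05):
    """Choose model terms for Salary on Years, Group, and Group*Years."""
    if p_interaction < alpha:
        # Significant interaction: keep it and BOTH components,
        # whatever the p-values of Years and Group happen to be.
        return ["Years", "Group", "Group*Years"]
    # Not significant: drop the interaction and re-estimate (parallel lines).
    return ["Years", "Group"]


print(terms_to_keep(0.01))
print(terms_to_keep(0.40))
```

Note that the p-values of the components never enter the decision; only the interaction is tested at this step.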

Trang 22

25.4 Interactions and Inference

Interactions and Collinearity

An interaction in a multiple regression introduces collinearity (see the large VIF for Group × Years).
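To see why the interaction column is collinear with its components, the sketch below computes a variance inflation factor, VIF = 1 / (1 − R²), where R² comes from regressing Group × Years on Years and Group. The data are hypothetical and the helpers (`solve`, `r_squared`, `vif`) are written for this illustration; a statistics package would normally report the VIF directly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, columns):
    """R^2 from a least-squares regression of y on the columns plus an intercept."""
    X = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    ybar = sum(y) / len(y)
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - sse / sst


def vif(target, others):
    """Variance inflation factor of one explanatory variable."""
    return 1 / (1 - r_squared(target, others))


# Hypothetical managers: five women (Group = 0) and five men (Group = 1).
years = [2, 5, 8, 12, 15, 3, 6, 9, 13, 16]
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
interaction = [g * y for g, y in zip(group, years)]

vif_interaction = vif(interaction, [years, group])
print(f"VIF for Group x Years: {vif_interaction:.2f}")
```

The interaction equals 0 for one group and Years for the other, so most of its variation is predictable from Group and Years, which is what inflates the VIF.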

Trang 23

25.4 Interactions and Inference

Interactions and Collinearity

Since the interaction in this example is not statistically significant, remove it and re-estimate the multiple regression model (MRM).

Trang 24

25.4 Interactions and Inference

Parallel Fits

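With the interaction removed, the coefficient of the dummy variable is the difference between the intercepts for male and female managers.

Because the fits are parallel, this means that the line for men is shifted up from the line for women by $1,024 for all levels of experience.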

Trang 26

25.4 Interactions and Inference

Parallel Fits

The test of the slope of Group indicates that it is not statistically significant.

There is no statistically significant difference between the average salaries of male and female managers when comparing managers with equal years of experience.

Trang 27

4M Example 25.1:

PRIMING IN ADVERTISING

Motivation

FedEx introduced the Courier Pak using two waves of promotion: an ad to raise awareness (i.e., priming) and a visit to existing clients by a sales rep. Management has two questions: (1) How many shipments were generated by a typical one-hour contact by the sales rep? and (2) Was the promotion more effective for clients who were already aware of the Courier Pak?

Trang 28

4M Example 25.1:

PRIMING IN ADVERTISING

Method

Based on data from 125 customers, fit a multiple regression with a categorical variable. The response is the number of shipments using Courier Pak. The explanatory variables are the amount of time spent with the client by a sales rep and a dummy variable indicating whether or not the client was aware of the Courier Pak. The interaction between the explanatory variables is included.

Trang 29

4M Example 25.1:

PRIMING IN ADVERTISING

Method

Scatterplot with lines fit separately for each group

(clients aware of Courier Pak shown in green)

Trang 30

The interaction indicates whether prior awareness of Courier Paks affects how the sales rep visit influenced the client.

Trang 31

4M Example 25.1:

PRIMING IN ADVERTISING

Mechanics – Estimate Model

Trang 32

4M Example 25.1:

PRIMING IN ADVERTISING

Mechanics – Check Conditions

Nothing in the plots suggests dependence. The similar variance condition is satisfied.

Trang 33

4M Example 25.1:

PRIMING IN ADVERTISING

Mechanics – Check Conditions

Similar variances confirmed

Trang 34

4M Example 25.1:

PRIMING IN ADVERTISING

Mechanics – Check Conditions

Nearly normal condition is satisfied

Trang 35

4M Example 25.1:

PRIMING IN ADVERTISING

Mechanics

Based on the F-statistic, we can conclude that the model explains statistically significant variation in shipments.

The interaction between awareness and hours of contact is statistically significant. Following the principle of marginality, we retain Aware in the model.

The interaction implies that the gap between the lines gets wider as the number of contact hours increases.

Trang 36

4M Example 25.1:

PRIMING IN ADVERTISING

Message

Priming produces a statistically significant increase in the subsequent use of Courier Paks when followed by a visit from a sales rep. Each additional hour of contact with a sales rep produces about 4.3 more uses of the Courier Paks with priming than without priming.

Trang 37

25.5 Regression with Several Groups

Example: Estimating Store Sales

• Explanatory variables are median household income in the surrounding community, size of the local population, and market (urban, suburban, rural).

• The response is sales in dollars per square foot.

Trang 38

25.5 Regression with Several Groups

Scatterplot Matrix

Rural – red

Suburban – green

Urban – blue

Association within each

group appears linear.

Trang 39

25.5 Regression with Several Groups

Example: Estimating Store Sales

In general, distinguishing J groups requires J − 1 dummy variables.

• For this example, use two dummy variables:

Suburban Dummy = 1 if suburban, 0 otherwise

Urban Dummy = 1 if urban, 0 otherwise

Note that rural locations are coded (0, 0), which makes rural the baseline group.
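The J − 1 coding can be sketched as a small helper. The function name `market_dummies` and its signature are hypothetical; the dummy names and the rural baseline follow the slide.

```python
def market_dummies(market, baseline="rural", levels=("rural", "suburban", "urban")):
    """Code a J-level category with J - 1 dummy variables (baseline omitted)."""
    if market not in levels:
        raise ValueError(f"unknown market type: {market!r}")
    return {
        f"{level.capitalize()} Dummy": int(market == level)
        for level in levels
        if level != baseline  # the baseline group gets no dummy of its own
    }


for market in ("rural", "suburban", "urban"):
    print(market, market_dummies(market))
```

Adding a third dummy for rural stores would make the dummies sum to 1 for every store, duplicating the intercept, which is why only J − 1 are used.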

Trang 41

25.5 Regression with Several Groups

Example: Estimating Store Sales

• The interpretation of the estimates is similar to the interpretation of models with two groups.

• Coefficients associated with dummy variables reflect differences of stores in other locations compared to rural stores.

Trang 42

25.5 Regression with Several Groups

Estimating Sales for Rural Stores

The estimated equation for the baseline group (stores in a rural location) is

Estimated Sales ($/SqFt) = -388.6992 + 0.0097 Income + 0.2401 Population (000)

Trang 43

25.5 Regression with Several Groups

Estimating Sales for Urban Stores

Consider stores in an urban location. The estimated sales are given by

Estimated Sales ($/SqFt) = (-388.6992 + 468.8654) + (0.0097 - 0.0053) Income + 0.2401 Population (000)

Estimated Sales ($/SqFt) = 80.1662 + 0.0044 Income + 0.2401 Population (000)
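The arithmetic behind the urban equation can be reproduced with the coefficients quoted on these slides. In the sketch below the function name, the constant names, and the example income and population are hypothetical; the suburban coefficients are not quoted in the slides, so only the rural and urban fits are covered.

```python
# Coefficients quoted in the slides.
RURAL_INTERCEPT = -388.6992
INCOME_SLOPE = 0.0097             # rural slope for Income
POPULATION_SLOPE = 0.2401         # per thousand residents, shared by all locations
URBAN_INTERCEPT_SHIFT = 468.8654  # coefficient of Urban Dummy
URBAN_INCOME_SHIFT = -0.0053      # coefficient of Urban Dummy x Income


def estimated_sales(income, population_000, urban=0):
    """Estimated sales ($/SqFt) for a rural (urban=0) or urban (urban=1) store."""
    intercept = RURAL_INTERCEPT + urban * URBAN_INTERCEPT_SHIFT
    income_slope = INCOME_SLOPE + urban * URBAN_INCOME_SHIFT
    return intercept + income_slope * income + POPULATION_SLOPE * population_000


# Hypothetical community: median income $60,000 and population 500 (thousands).
income, population_000 = 60_000, 500
rural = estimated_sales(income, population_000)
urban = estimated_sales(income, population_000, urban=1)
print(f"rural: {rural:.2f}  urban: {urban:.2f}  gap: {urban - rural:.2f}")
```

Because the Income slope is smaller for urban stores, the urban advantage shrinks as income rises, while the Population term contributes the same amount to both fits.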

Trang 44

25.5 Regression with Several Groups

Interpretation of Results

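Urban stores have a higher intercept compared to rural stores, but their sales do not grow as fast with increases in income.

The slope for Population is the same for every location because the model does not include an interaction term between Population and the dummy variables for location.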

Trang 45

Best Practices

• Be thorough in your search for confounding variables.

• Consider interactions.

• Choose an appropriate baseline group.

• Write out the fits for separate groups.

Trang 46

Best Practices (Continued)

• Be careful interpreting the coefficient of the dummy variable.

• Check for comparable variances in the groups.

• Use color-coding or different plot symbols to identify subsets of observations in plots.

Trang 47

Pitfalls

• Don’t think that you have adjusted for all of the confounding factors.

• Don’t confuse the different types of slopes.

• Don’t forget to check the conditions of the MRM.

Posted: 10/01/2018, 16:01
