
Statistics for Business: Decision Making and Analysis (Robert Stine and Dean Foster), Chapter 5


Page 2

Association between Categorical Variables

Chapter 5

Page 3

5.1 Contingency Tables

Which hosts send more buyers to Amazon.com?

• The data contain two categorical variables: Host and Purchase

• Host identifies the originating site: MSN, RecipeSource, or Yahoo; Purchase indicates whether the visit resulted in a purchase

Page 4

5.1 Contingency Tables

Consider Two Categorical Variables Simultaneously

• A contingency table shows counts of cases for one categorical variable contingent on the value of another (for every combination of both variables)
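As a minimal sketch of how such a table is tallied, the Python below counts every (Host, Purchase) combination from raw records. The records are invented for illustration and are not the book's data; the function name `contingency_table` is hypothetical.

```python
from collections import Counter

# Hypothetical raw records, one (host, purchase) pair per visit.
# These values are invented for illustration; they are not the book's data.
visits = [
    ("MSN", "No"), ("MSN", "Yes"), ("MSN", "No"),
    ("RecipeSource", "No"), ("RecipeSource", "No"),
    ("Yahoo", "Yes"), ("Yahoo", "No"), ("Yahoo", "No"), ("Yahoo", "Yes"),
]


def contingency_table(pairs):
    """Count the cases for every combination of the two variables."""
    return Counter(pairs)


table = contingency_table(visits)
hosts = sorted({host for host, _ in visits})
outcomes = sorted({purchase for _, purchase in visits})

# One row per Purchase value, one column per Host.
print("Purchase".ljust(10) + "".join(h.rjust(14) for h in hosts))
for p in outcomes:
    print(p.ljust(10) + "".join(str(table[(h, p)]).rjust(14) for h in hosts))
```

A combination that never occurs (here, a purchase from RecipeSource) simply reads as a count of 0, since `Counter` returns 0 for missing keys.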

Page 5

5.1 Contingency Tables

Contingency Table for Web Shopping

Page 6

5.1 Contingency Tables

Marginal and Conditional Distributions

• Marginal distributions appear in the “margins” of a contingency table and represent the totals (frequencies) for each categorical variable separately

• Conditional distributions refer to counts within a single row or column of a contingency table
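The two kinds of distribution can be sketched in Python as follows. The counts are hypothetical (not the book's table), and the helper names `row_totals`, `column_totals`, and `conditional_by_column` are assumptions made for this illustration.

```python
# Hypothetical counts keyed as counts[purchase][host]; not the book's data.
counts = {
    "No":  {"MSN": 90, "RecipeSource": 98, "Yahoo": 180},
    "Yes": {"MSN": 10, "RecipeSource": 2,  "Yahoo": 20},
}


def row_totals(counts):
    """Marginal distribution of the row variable (Purchase)."""
    return {row: sum(cols.values()) for row, cols in counts.items()}


def column_totals(counts):
    """Marginal distribution of the column variable (Host)."""
    totals = {}
    for cols in counts.values():
        for col, n in cols.items():
            totals[col] = totals.get(col, 0) + n
    return totals


def conditional_by_column(counts):
    """Share of each row value within each column (column percentages)."""
    col_tot = column_totals(counts)
    return {
        row: {col: n / col_tot[col] for col, n in cols.items()}
        for row, cols in counts.items()
    }


purchase_marginal = row_totals(counts)
host_marginal = column_totals(counts)
cond = conditional_by_column(counts)

for host in host_marginal:
    print(f"{host}: {cond['Yes'][host]:.1%} of visits end in a purchase")
```

Comparing the conditional percentages across columns, rather than the raw counts, is what makes hosts with different traffic volumes comparable.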

Page 7

5.1 Contingency Tables

Conditional Distribution of Purchase for each Host (Column Counts and Percentages)

Page 8

5.1 Contingency Tables

Conditional Distribution

• Reveals that the percentage of purchases among visitors from RecipeSource is much lower than for MSN and Yahoo

• Host and Purchase are associated

Page 9

5.1 Contingency Tables

Segmented Bar Charts

• Used to display conditional distributions

• Divides the bars in a bar chart into segments that are proportional to the percentage in each category of a second variable
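A text-only sketch of the idea: each bar has the same length, and its segments are sized by the conditional percentages. The counts are hypothetical, and `segmented_bar` is an illustrative helper rather than a plotting-library function.

```python
def segmented_bar(category_counts, width=40, symbols="#."):
    """Return a text bar whose segments are proportional to each category's share.

    Assumes at most len(symbols) categories; segment lengths are rounded,
    so the bar can be off by a character for awkward proportions.
    """
    total = sum(category_counts.values())
    bar = ""
    for symbol, count in zip(symbols, category_counts.values()):
        bar += symbol * round(width * count / total)
    return bar


# Hypothetical counts of Purchase within each Host; not the book's data.
counts_by_host = {
    "MSN": {"No": 90, "Yes": 10},
    "RecipeSource": {"No": 98, "Yes": 2},
    "Yahoo": {"No": 180, "Yes": 20},
}

for host, category_counts in counts_by_host.items():
    print(host.ljust(14), segmented_bar(category_counts))
```

Because every bar is scaled to the same width, differing segment lengths between bars indicate association between the two variables.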

Page 10

5.1 Contingency Tables

Contingency Table of Purchase by Region

Page 11

5.1 Contingency Tables

Segmented Bar Chart Shows Association

Page 12

5.1 Contingency Tables

Mosaic Plots

• Alternative to segmented bar chart

• A plot in which the size of each “tile” is proportional to the count in a cell of a contingency table
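One common way to obtain tiles with this property, sketched here as an assumption rather than as the book's construction, is to give each column a width proportional to its total and split it vertically by the conditional shares. The tile area then equals the cell's share of all cases. The counts and the name `mosaic_tiles` are hypothetical.

```python
def mosaic_tiles(counts):
    """Return {(column, row): (width, height)} for a mosaic plot in a unit square.

    counts is keyed as counts[column][row]; every column is assumed non-empty.
    """
    n = sum(sum(rows.values()) for rows in counts.values())
    tiles = {}
    for col, rows in counts.items():
        col_total = sum(rows.values())
        width = col_total / n              # column share of all cases
        for row, count in rows.items():
            height = count / col_total     # conditional share within the column
            tiles[(col, row)] = (width, height)
    return tiles


# Hypothetical shirt counts keyed as counts[style][size]; not the book's data.
shirts = {
    "Button-Down": {"Small": 10, "Medium": 20, "Large": 10},
    "Polo": {"Small": 30, "Medium": 20, "Large": 10},
}

tiles = mosaic_tiles(shirts)
for (style, size), (w, h) in tiles.items():
    print(f"{style:12} {size:7} width={w:.2f} height={h:.2f} area={w * h:.2f}")
```

If the two variables were not associated, the heights would match across columns and the tiles would line up in straight horizontal bands.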

Page 13

5.1 Contingency Tables

Contingency Table of Shirt Size by Style

Page 14

5.1 Contingency Tables

Mosaic Plot Shows Association

Page 15

4M Example 5.1: CAR THEFT

Motivation

Should insurance companies vary the

premiums for different car models (are

some cars more likely to be stolen than

others)?

Page 16

4M Example 5.1: CAR THEFT

Method

Data obtained from the National Highway Traffic

Safety Administration (NHTSA) on car theft for

seven popular models (two categorical variables: type of car and whether the car was stolen).

Page 17

4M Example 5.1: CAR THEFT

Mechanics

Page 18

4M Example 5.1: CAR THEFT

Mechanics

Page 19

4M Example 5.1: CAR THEFT

Message

The Dodge Intrepid is more likely to be stolen than other popular models. The data suggest that higher premiums for theft insurance should be charged for models that are more likely to be stolen.

Page 20

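5.2 Lurking Variables and Simpson’s Paradox

Association Not Necessarily Causation

• Lurking variable: a concealed variable that affects the apparent relationship between two other variables

• Simpson’s paradox: a change in the association between two variables when data are separated into groups defined by a third variable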
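The reversal can be reproduced with a small Python sketch. All counts below are invented for illustration: "Airline A" and "Airline B" are hypothetical carriers and "Easy" and "Hard" are hypothetical destinations, not the data from the airline example in this chapter.

```python
# flights[airline][destination] = (on_time, total); invented for illustration.
flights = {
    "Airline A": {"Easy": (810, 900), "Hard": (50, 100)},
    "Airline B": {"Easy": (95, 100), "Hard": (540, 900)},
}


def overall_rate(by_destination):
    """On-time rate with all destinations pooled together."""
    on_time = sum(o for o, _ in by_destination.values())
    total = sum(t for _, t in by_destination.values())
    return on_time / total


def rate_by_destination(by_destination):
    """On-time rate computed separately for each destination."""
    return {dest: o / t for dest, (o, t) in by_destination.items()}


for airline, by_dest in flights.items():
    per_dest = {d: f"{r:.0%}" for d, r in rate_by_destination(by_dest).items()}
    print(f"{airline}: overall {overall_rate(by_dest):.1%}, by destination {per_dest}")
```

In this made-up table Airline A looks better overall, yet Airline B is better at each destination. The pooled comparison is distorted because Airline A flies mostly to the easy destination, so destination acts as the lurking variable.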

Page 21

4M Example 5.2: AIRLINE ARRIVALS

Motivation

Does it matter which of two airlines a

corporate CEO chooses when flying to

meetings if he wants to avoid delays?

Page 22

4M Example 5.2: AIRLINE ARRIVALS

Method

Data obtained from US Bureau of

Transportation Statistics on flight delays for two airlines (two categorical variables:

airline and whether the flight arrived on

time)

Page 23

4M Example 5.2: AIRLINE ARRIVALS

Mechanics

Page 24

4M Example 5.2: AIRLINE ARRIVALS

Mechanics –

Is destination a lurking variable?

Page 25

4M Example 5.2: AIRLINE ARRIVALS

Mechanics –

This is Simpson’s Paradox

Page 26

4M Example 5.2: AIRLINE ARRIVALS

Message

The CEO should book on US Airways, as it is more likely to arrive on time regardless of destination.

Page 27

5.3 Strength of Association

Chi-Squared Statistic

• A measure of association in a contingency table

• Calculated based on a comparison of the observed contingency table to an artificial table with the same marginal totals but no association

Page 28

5.3 Strength of Association

Contingency Table

Page 29

5.3 Strength of Association

Calculating the Chi-Squared Statistic
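A minimal sketch of the calculation, assuming the standard definition: the expected count in each cell of the no-association table is (row total × column total) / n, and chi-squared sums (observed − expected)² / expected over all cells. The observed counts are hypothetical, and `chi_squared` is an illustrative helper name.

```python
def chi_squared(observed):
    """Chi-squared statistic for a table given as a list of rows of counts.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the variables were not associated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2


# Hypothetical 2x2 table; every expected count is 25.
print(chi_squared([[30, 20], [20, 30]]))  # 4.0
```

A table whose rows are exact multiples of each other, such as `[[10, 20], [20, 40]]`, matches its no-association table cell for cell and gives a statistic of 0.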

Page 31

5.3 Strength of Association

Cramer’s V

• Derived from the Chi-Squared Statistic

• Ranges in value from 0 (variables are not associated) to 1 (variables are perfectly associated)

Page 32

5.3 Strength of Association

Calculating Cramer’s V

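$V = \sqrt{\dfrac{\chi^2}{n \, \min(r - 1,\, c - 1)}}$, where $n$ is the total count and $r$ and $c$ are the numbers of rows and columns

V = 0.20 for our example

There is a weak association between group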
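As a hedged sketch, Cramer’s V can be computed by dividing chi-squared by n × min(r − 1, c − 1) and taking the square root, where n is the total count and r and c are the numbers of rows and columns. The tables below are hypothetical and are not the example behind the V = 0.20 result; `cramers_v` is an illustrative helper name.

```python
import math


def cramers_v(observed):
    """Cramer's V for a table given as a list of rows of counts.

    Assumes at least two rows and two columns, all with positive totals.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    smaller_dim = min(len(row_totals) - 1, len(col_totals) - 1)
    return math.sqrt(chi2 / (n * smaller_dim))


print(cramers_v([[50, 0], [0, 50]]))    # perfect association: 1.0
print(cramers_v([[40, 10], [10, 40]]))  # strong but imperfect association
```

Unlike chi-squared, which grows with the number of cases, V is rescaled to lie between 0 and 1, so tables of different sizes can be compared.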

Page 33

5.3 Strength of Association

Checklist: Chi-Squared and Cramer’s V

• Verify that variables are categorical

• Verify that there are no obvious lurking variables

Page 34

4M Example 5.3: REAL ESTATE

Motivation

Do people who heat their homes with gas

prefer to cook with gas as well? What

heating systems and appliances should a

developer select for newly built homes?

Page 35

4M Example 5.3: REAL ESTATE

Method

The developer contacts homeowners to obtain the data. Two categorical variables: type of fuel used for home heating (gas or electric) and type of fuel used for cooking (gas or electric).

Page 36

4M Example 5.3: REAL ESTATE

Mechanics

Page 37

4M Example 5.3: REAL ESTATE

Message

Homeowners prefer gas to electric heat by about 2 to 1. The developer should build about two-thirds of new homes with gas heat. Put electric appliances in all homes with electric heat and in half of the homes with gas heat (assuming that buyers for new homes have the same preferences).

Page 38

Best Practices

• Use contingency tables to find and summarize association between two categorical variables.

Page 39

Pitfalls
