Statistics for Business: Decision Making and Analysis (Robert Stine and Dean Foster), Chapter 4

Page 2

Describing Numerical Data

Chapter 4

Page 3

4.1 Summaries of Numerical Variables

Can 500 different songs fit on the iPod Shuffle?

 To answer this question we must understand the typical length of a song and the variation of song sizes around the typical length

 We can do this using summary statistics

Copyright © 2011 Pearson Education, Inc.

Page 4

4.1 Summaries of Numerical Variables

A Subset of the Data

Page 5

4.1 Summaries of Numerical Variables

The Median

 Value in the middle of a sorted list of numerical values (a typical value)

 Half of the values fall below the median; half fall above

 It is the 50th Percentile

Page 6

4.1 Summaries of Numerical Variables

Common Percentiles

 Lower Quartile = 25th Percentile

 Upper Quartile = 75th Percentile

 One quarter of the values fall below the lower quartile and one quarter fall above the upper quartile

Page 7

4.1 Summaries of Numerical Variables

The Interquartile Range (IQR)

IQR = 75th Percentile – 25th Percentile

 A measure of variation based on quartiles

 Used to accompany the median

Page 8

4.1 Summaries of Numerical Variables

The Range

Range = Maximum - Minimum

 Maximum Value = 100th Percentile

 Minimum Value = 0th Percentile

 Another measure of variation; not preferred because it is based only on the two extreme values

Page 9

4.1 Summaries of Numerical Variables

The Five Number Summary
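
The five-number summary lists the minimum, lower quartile, median, upper quartile, and maximum. A minimal sketch of how these values, the IQR, and the range might be computed is given here. The song sizes are hypothetical (not the book's data), and the "inclusive" quartile convention is an assumption; other software may interpolate quartiles slightly differently.

```python
import statistics

# Hypothetical song sizes in MB (not the book's data), with one very long song.
sizes_mb = [2.1, 3.0, 3.2, 3.5, 3.9, 4.4, 5.0, 6.3, 21.5]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the data
    # as containing its own minimum and maximum (0th and 100th percentiles).
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


minimum, q1, median, q3, maximum = five_number_summary(sizes_mb)
iqr = q3 - q1                    # spread of the middle half of the data
value_range = maximum - minimum  # determined entirely by the two extremes

print("five-number summary:", (minimum, q1, median, q3, maximum))
print("IQR:", round(iqr, 2), "range:", round(value_range, 2))
```

With these made-up values the single long song stretches the range far more than the IQR, which is why the range is the less preferred measure of variation.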

Page 10

4.1 Summaries of Numerical Variables

The Five Number Summary for Song Sizes

Page 11

4.1 Summaries of Numerical Variables

Summary Statistics for Song Sizes

Page 12

4.1 Summaries of Numerical Variables

The Mean (Average)

 Arithmetic average; divide the sum of the values by the number of values (another typical value)

 The symbol $y$ represents the variable of interest

 The symbol $\bar{y}$ (read “y bar”) represents the mean

Page 13

4.1 Summaries of Numerical Variables

The Mean (Average)
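
Written out for $n$ values $y_1, \dots, y_n$, the verbal definition of the mean (sum of the values divided by the number of values) corresponds to the following formula. The subscript notation is an assumption, since no formula survives on this slide.

```latex
\bar{y} = \frac{y_1 + y_2 + \cdots + y_n}{n} = \frac{1}{n}\sum_{i=1}^{n} y_i
```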

Page 14

4.1 Summaries of Numerical Variables

The Variance ($s^2$)

 Is a measure of variation based on the mean

 How far a value is from the mean is known as its deviation; the variance is the average of the squared deviations

Page 15

4.1 Summaries of Numerical Variables

Page 16

4.1 Summaries of Numerical Variables

The Standard Deviation (SD)

 Is the square root of the variance

 Is a measure of variability in the original units of the data (the variance results in squared units)

$s = \sqrt{s^2}$
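
A minimal sketch of the chain from deviations to variance to standard deviation follows. The data are hypothetical, and the sketch assumes the usual sample variance, which divides the sum of squared deviations by n - 1 rather than n; the slides only describe it as an "average".

```python
import math
import statistics

# Hypothetical song sizes in MB (not the book's data).
sizes_mb = [2.0, 3.0, 4.0, 5.0, 6.0]

mean = sum(sizes_mb) / len(sizes_mb)
deviations = [y - mean for y in sizes_mb]  # distance of each value from the mean

# Assumption: s^2 is the sample variance, dividing by n - 1 rather than n.
variance = sum(d ** 2 for d in deviations) / (len(sizes_mb) - 1)
sd = math.sqrt(variance)  # back in MB; the variance is in squared MB

print(mean, variance, round(sd, 3))
# The standard library uses the same n - 1 convention.
print(math.isclose(variance, statistics.variance(sizes_mb)))
```

Taking the square root is what returns the measure of variation to the original units of the data.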

Page 17

4.1 Summaries of Numerical Variables

Summary Statistics for Song Sizes

Page 18

4M Example 4.1: MAKING M&M’s

Motivation

How many M&M’s are needed to fill a bag labeled to weigh 1.6 ounces?

Page 19

4M Example 4.1: MAKING M&M’s

Method

Data are weights of 72 plain chocolate M&M’s taken from several packages. To get a measure of the amount of variation relative to the typical size, we use the ratio of the standard deviation to the mean (known as the coefficient of variation).

$c_v = \dfrac{s}{\bar{y}}$

Page 20

4M Example 4.1: MAKING M&M’s

Mechanics

Mean Weight = 0.86 gm

SD = 0.04 gm

Page 21

4M Example 4.1: MAKING M&M’s

Message

Since the SD is quite small compared to the mean (with a $c_v$ of about 5%), the results suggest that 53 pieces are usually enough to fill a bag.

A bag labeled 1.6 ounces weighs about 45.36 grams.

Since there is little variability around the typical weight of an M&M, we can calculate the number of pieces needed to fill a 1.6 ounce bag as 45.36/0.86 ≈ 52.7, which rounds up to 53.
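
The arithmetic in this example can be checked with a short sketch. The mean, SD, and label weight are taken from the slides; the ounce-to-gram factor and the variable names are supplied here for illustration.

```python
import math

GRAMS_PER_OUNCE = 28.349523125  # avoirdupois ounce expressed in grams

mean_weight_g = 0.86   # mean weight of one M&M, from the slides
sd_weight_g = 0.04     # standard deviation, from the slides
label_weight_oz = 1.6  # weight printed on the bag

coefficient_of_variation = sd_weight_g / mean_weight_g
bag_weight_g = label_weight_oz * GRAMS_PER_OUNCE
# Round up: a fractional piece cannot be added to a bag.
pieces_needed = math.ceil(bag_weight_g / mean_weight_g)

print(f"c_v = {coefficient_of_variation:.1%}")
print(f"bag weight = {bag_weight_g:.2f} g")
print(f"pieces needed = {pieces_needed}")
```

Dividing by the mean alone is reasonable here only because the coefficient of variation is small; with more variable pieces, the count needed would vary noticeably from bag to bag.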

Page 22

4.2 Histograms and the

Distribution of Numerical Data

Histograms

 Plot the distribution of a numerical variable by showing counts of values occurring within adjacent intervals

 Similar to bar charts but designed for continuous quantitative data (bar charts are only appropriate for discrete categories)
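
The counting step behind a histogram can be sketched as follows. The song sizes and the 1 MB interval width are hypothetical choices, not the book's, and the text-based bars merely stand in for a real plot.

```python
from collections import Counter

# Hypothetical song sizes in MB (not the book's data).
sizes_mb = [2.1, 3.0, 3.2, 3.5, 3.9, 4.4, 5.0, 6.3, 21.5]
bin_width = 1.0  # every interval spans 1 MB: [2, 3), [3, 4), ...


def histogram_counts(values, width):
    """Count how many values fall in each interval [k*width, (k+1)*width)."""
    counts = Counter(int(v // width) for v in values)
    # Only intervals that contain at least one value are returned.
    return {(k * width, (k + 1) * width): counts[k] for k in sorted(counts)}


for (low, high), count in histogram_counts(sizes_mb, bin_width).items():
    print(f"[{low:4.0f}, {high:4.0f}) MB  {'#' * count}")
```

Because the intervals are adjacent and of equal width, the bar heights can be compared directly, which is what distinguishes a histogram from a bar chart of separate categories.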

Page 23

4.2 Histograms and the

Distribution of Numerical Data

Histogram of Song Sizes

Page 24

4.2 Histograms and the

Distribution of Numerical Data

Histogram of Song Sizes

 Indicates a few very long songs (outliers)

 The graph devotes more than half of its area to showing fewer than 1% of the songs (white space rule: graphs with mostly white space can be improved by changing the interval of the plot to focus on the data rather than the white space)

Page 25

4.3 Boxplot

Graph of the Five Number Summary

Page 26

4.3 Boxplot

Combining Boxplots with Histograms

 Boxplots locate the median and quartiles and highlight outliers

 The median splits the area of the histogram in half (unlike the mean, it is resistant or robust to the effects of outliers)
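
The resistance of the median can be illustrated with a small sketch that adds one outlier to a hypothetical set of song sizes (the values are made up for illustration).

```python
import statistics

# Hypothetical song sizes in MB (not the book's data).
typical_songs = [3.0, 3.2, 3.5, 3.9, 4.4]
with_outlier = typical_songs + [21.5]  # add one very long song

for label, data in [("without outlier", typical_songs),
                    ("with outlier", with_outlier)]:
    print(f"{label}: mean = {statistics.mean(data):.2f}, "
          f"median = {statistics.median(data):.2f}")
```

In this sketch the single long song pulls the mean up by almost 3 MB, while the median moves by only 0.2 MB.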

Page 27

4.3 Boxplot

Boxplot with Histogram of Song Sizes

Page 28

4.4 Shape of a Distribution

Modes

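 A histogram with one main peak is unimodal; two is bimodal; three or more is multimodal

 A histogram in which the bars all have roughly the same height is uniform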

Page 29

4.4 Shape of a Distribution

Symmetry and Skewness

 A distribution is symmetric if the two sides of its histogram are mirror images

 A distribution is skewed if one tail of the histogram stretches out farther than the other

Page 30

4.4 Shape of a Distribution

Distribution of Song Sizes

 The mode lies between 3 and 4 MB

 The distribution is right skewed (the right tail stretches out farther than the left tail)

Page 34

4M Example 4.2: EXECUTIVE COMPENSATION

Message

The distribution of annual salaries of CEOs in 2003 is unimodal, nearly symmetric around the median of $650,000, and right skewed. The average is $697,000. The largest salary is $4,000,000.

Page 35

4.4 Shape of a Distribution

Bell-Shaped Distributions and Empirical Rule

 A bell-shaped distribution is symmetric and unimodal

 The empirical rule uses the standard deviation to describe how data with a bell-shaped distribution cluster around the mean

Page 36

4.4 Shape of a Distribution

The Empirical Rule
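
The empirical rule is commonly stated as: about 68% of bell-shaped data lie within 1 SD of the mean, about 95% within 2 SDs, and about 99.7% within 3 SDs. The sketch below reproduces those percentages under the assumption of an exactly normal distribution; real data that are only roughly bell-shaped will match them approximately.

```python
from statistics import NormalDist

# Assumption: an idealized bell shape, the standard normal distribution.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

for k in (1, 2, 3):
    # Share of the distribution lying within k SDs of the mean.
    share = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {share:.1%}")
```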

Page 37

4.4 Shape of a Distribution

Standardizing

Converting data to z-scores

 z-scores measure the distance from the mean in standard deviations

$z = \dfrac{y - \bar{y}}{s}$
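
A minimal sketch of standardizing follows. It reuses the mean and SD of M&M weights reported in Example 4.1; the individual weights and the function name `z_score` are hypothetical, chosen for illustration.

```python
def z_score(value, mean, sd):
    """Distance of value from the mean, measured in standard deviations."""
    return (value - mean) / sd


# Mean and SD of M&M weights in grams, as reported in Example 4.1.
mean_weight_g = 0.86
sd_weight_g = 0.04

for weight_g in (0.78, 0.86, 0.94):  # hypothetical individual weights
    z = z_score(weight_g, mean_weight_g, sd_weight_g)
    print(f"{weight_g:.2f} g -> z = {z:+.1f}")
```

A negative z-score marks a value below the mean and a positive one a value above it, so a piece weighing 0.94 g sits about two standard deviations above the typical weight.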

Page 38

4.5 Epilog

Can 500 different songs fit on the iPod Shuffle?

Because of variation, not every collection of 500 songs will fit. The longest 500 songs won’t fit. However, based on the typical song size, the amount of variation in song sizes, and the shape of their distribution, we can say that most collections of 500 songs will fit!

Page 39

Best Practices

histograms and summaries such as the mean and standard deviation

with a graph

when preparing a histogram

Page 40

Best Practices (Continued)

 Scale your plots to show data, not empty space

 Anticipate what you will see in a histogram

 Label clearly

 Check for gaps

Page 41

 Do not ignore the presence of outliers.

Date posted: 10/01/2018, 16:00