Chapter 2: Summarizing Data
Part A
• Summarizing Qualitative Data
• Summarizing Quantitative Data
Why summarize data?
I. Summarizing Qualitative Data
Frequency Distribution
Relative Frequency Distribution
Percent Frequency Distribution
Bar Graph
Pie Chart
Example: Marada Inn
Guests staying at Marada Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are:
Excellent Above Average Average
Above Average Above Average Below Average Poor
Above Average Average
Why Use Frequency Distributions?
• A frequency distribution is a way to summarize data.
• The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data.
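Relative Frequency Distribution
• The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class: relative frequency = frequency of the class / n.
• A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.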
[Table: Ratings | Frequency | Relative frequency]
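As a minimal sketch of how the three columns relate, the snippet below counts each class and divides by the sample size. The ratings list is hypothetical and is not the Marada Inn sample; the function name `relative_frequencies` is also just an illustrative choice.

```python
from collections import Counter

# Hypothetical ratings for illustration only (NOT the Marada Inn data).
ratings = [
    "Excellent", "Above Average", "Average", "Above Average", "Poor",
    "Above Average", "Average", "Below Average", "Excellent", "Above Average",
]

def relative_frequencies(values):
    """Map each class to (frequency, relative frequency, percent frequency)."""
    counts = Counter(values)
    n = len(values)
    return {label: (count, count / n, 100 * count / n)
            for label, count in counts.items()}

table = relative_frequencies(ratings)
for label, (freq, rel, pct) in table.items():
    print(f"{label:<15}{freq:>4}{rel:>8.2f}{pct:>8.1f}%")
```

The relative frequencies always sum to 1 (and the percent frequencies to 100), which is a quick check on any hand-built table.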
Bar Graph
• A bar graph is a graphical device for depicting qualitative data.
• On one axis (usually the horizontal axis), we specify the label or name of each class.
• On the other axis (usually the vertical axis), we specify the frequency or relative frequency of each class.
• A bar of fixed width is drawn above each class label, with its height extended to the appropriate value.
• The bars are separated to emphasize the fact that each class is a separate category.
Bar Graph - Example
Pie Chart
First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
Pie Chart - Example
Insights Gained from the Preceding Pie Chart
Example: Marada Inn
II. Summarizing Quantitative Data
1. Simple frequency distribution
A simple frequency distribution consists of a list of data values, each showing the number of items having that value.
• Normally not suitable for continuous data. Why?
• Normally applicable to discrete raw data. Why?
The following data record the number of children in the families of the 47 workers in a company:
1 1 3 2 0 2 0 1 2 2 1 3
5 2 4 0 0 2 4 1 1 2 2 0
3 0 0 2 1 3 6 0 2 1 0 3
2 2 2 1 0 0 1 1 3 1 4
Constructing a simple frequency distribution using a tally chart
Data value   Tally marks        Total
0            ||||| ||||| |      11
1            ||||| ||||| ||     12
2            ||||| ||||| |||    13
3            ||||| |            6
4            |||                3
5            |                  1
6            |                  1
Frequency distribution table
Number of children in family   Number of workers
0                              11
1                              12
2                              13
3                              6
4                              3
5                              1
6                              1
Total                          47
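The tally procedure can be sketched in a few lines of code. The list below is the 47-worker data set given above; the function name `simple_frequency` is an illustrative choice, not part of the original slides.

```python
from collections import Counter

# Number of children per family for the 47 workers, as listed above.
children = [
    1, 1, 3, 2, 0, 2, 0, 1, 2, 2, 1, 3,
    5, 2, 4, 0, 0, 2, 4, 1, 1, 2, 2, 0,
    3, 0, 0, 2, 1, 3, 6, 0, 2, 1, 0, 3,
    2, 2, 2, 1, 0, 0, 1, 1, 3, 1, 4,
]

def simple_frequency(values):
    """Return {data value: number of items having that value}, sorted by value."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}

freq = simple_frequency(children)
for value, count in freq.items():
    # One '|' per observation plays the role of the tally marks.
    print(f"{value:>2}  {'|' * count:<14}{count:>3}")
print("Total", sum(freq.values()))
```

The total of the frequency column should equal the number of observations (47 here), which is a useful check after tallying by hand.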
Disadvantage of simple frequency distribution
2. Grouped frequency distributions
• Used when the data set contains a large number of data values.
• A grouped frequency distribution summarizes data into groups of values, each showing the number of items having values in the group.
• Each group of data values is called a class.
• Used for both continuous data and discrete data.
Definitions associated with frequency distribution classes
Class limits: the lower and upper values of the classes as physically described in the distribution.
Class widths (class lengths):
- continuous data: the numerical difference between the lower and upper class limits.
- discrete data: the numerical difference between the lower limit of one class and the lower limit of the immediately following class.
Class mid-points: the values situated at the centre of the classes.
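A small sketch of these definitions follows. The class limits used are hypothetical, and taking the mid-point as the average of the two class limits is an assumed convention; the slides only say the mid-point sits at the centre of the class.

```python
def class_width_continuous(lower, upper):
    """Continuous data: difference between the upper and lower class limits."""
    return upper - lower

def class_width_discrete(lower, next_lower):
    """Discrete data: lower limit of the next class minus this class's lower limit."""
    return next_lower - lower

def class_midpoint(lower, upper):
    """Assumed convention: the mid-point is the average of the two class limits."""
    return (lower + upper) / 2

# Hypothetical continuous class 10-15.
print(class_width_continuous(10, 15))   # width of the class 10-15
print(class_midpoint(10, 15))           # centre of the class 10-15

# Hypothetical discrete classes 0-4 and 5-9.
print(class_width_discrete(0, 5))       # width of the class 0-4
print(class_midpoint(0, 4))             # centre of the class 0-4
```

Note that for the discrete classes 0-4 and 5-9 the width is 5, not 4, because the class 0-4 contains five possible values.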
Open-ended class:
- A class without a lower or upper limit.
- Usually used for the first class, which has no defined lower limit, and/or the last class, which has no defined upper limit.
Example classes: < 10, 10-15, 15-20, >= 20
Guidelines for grouping values into classes
• Use between 5 and 20 classes
[Table: Number of Data Points | Number of Classes]
Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.
Sample of Parts Cost for 50 Tune-ups
Frequency Distribution
Guidelines for Selecting Width of Classes
• Use classes of equal width
• Approximate class width = (largest data value - smallest data value) / number of classes
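The width guideline and the grouping step can be sketched as follows. The cost values are hypothetical (the Hudson Auto sample is not reproduced in this text), and rounding the approximate width up to a convenient value of 10, starting the first class at 50, and treating each class as the half-open interval [lower, lower + width) are all assumptions made for the illustration.

```python
def approximate_class_width(values, num_classes):
    """(largest data value - smallest data value) / number of classes."""
    return (max(values) - min(values)) / num_classes

def grouped_frequency(values, start, width, num_classes):
    """Count values in equal-width classes [lower, lower + width)."""
    counts = [0] * num_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < num_classes:
            counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(num_classes)]

# Hypothetical parts costs in dollars (NOT the Hudson Auto sample).
costs = [52, 61, 68, 71, 74, 79, 82, 88, 93, 97, 101, 105]

width = approximate_class_width(costs, num_classes=6)
print(f"approximate width: {width:.2f}")

# Assumed choice: round the width up to 10 and start the first class at 50.
classes = grouped_frequency(costs, start=50, width=10, num_classes=6)
for lower, upper, count in classes:
    print(f"{lower}-{upper - 1}: {count}")
```

Because the costs are whole dollars, the half-open class [50, 60) is printed as 50-59, matching the way class limits are usually written for rounded data.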
Example
Relative Frequency Distribution
[Table: Part costs ($) | Frequency | Relative frequency | Relative frequency (%), with a Total row]
Insights Gained from the Percent Frequency Distribution
Relative Frequency Distributions
Dot Plot
• One of the simplest graphical summaries of data is a dot plot.
• A horizontal axis shows the range of data values.
• Each data value is then represented by a dot placed above the axis.
[Figure: Dot Plot - Tune-up Parts Cost]
Histogram
• Another common graphical presentation of quantitative data is a histogram.
• The variable of interest is placed on the horizontal axis.
• A rectangle is drawn above each class interval with its height corresponding to the interval's frequency or relative frequency.
• Unlike a bar graph, a histogram has no gaps between rectangles.
[Figure: Histogram - Tune-up Parts Cost]
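Cumulative Distributions
• Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.
• Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.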
Cumulative Frequency Distribution
Hudson Auto Repair
[Table: Part costs ($) | Frequency | Cumulative frequency | Cumulative relative frequency]
Ogive
• An ogive is a graph of a cumulative distribution.
• The data values are shown on the horizontal axis.
• Shown on the vertical axis are the:
  - cumulative frequencies, or
  - cumulative relative frequencies, or
  - cumulative percent frequencies
• The frequency (one of the above) of each class is plotted as a point.
• The plotted points are connected by straight lines.
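As a rough sketch, the points of an ogive are just running totals of the class frequencies. The frequencies below are the counts obtained from the 47-worker number-of-children data earlier in this chapter (values 0 through 6); using that data set here, rather than the Hudson Auto costs, is a choice made for illustration.

```python
from itertools import accumulate

# Data values and their frequencies, counted from the 47-worker data set.
values = [0, 1, 2, 3, 4, 5, 6]
frequencies = [11, 12, 13, 6, 3, 1, 1]

# Running totals give the cumulative frequency distribution.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Dividing by the total gives the cumulative relative frequencies.
cumulative_relative = [c / total for c in cumulative]

# Each (value, cumulative frequency) pair is one point of the ogive;
# joining consecutive points with straight lines draws the graph.
ogive_points = list(zip(values, cumulative))

for value, cum, rel in zip(values, cumulative, cumulative_relative):
    print(f"<= {value}: {cum:>2}  {rel:.3f}")
```

The last cumulative frequency always equals the number of observations, and the last cumulative relative frequency always equals 1.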
[Figure: Hudson Auto Repair - Ogive with Cumulative Percent Frequencies]
Chapter 2: Summarizing Data
Part B
• Exploratory Data Analysis
• Crosstabulation and Scatter Diagrams
Exploratory Data Analysis
• The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
• One such technique is the stem-and-leaf display.
Stem-and-Leaf Display
Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.
Sample of Parts Cost for 50 Tune-ups
Stretched Stem-and-Leaf Display
• If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).
• Whenever a stem value is stated twice, the first value corresponds to leaf values of 0-4, and the second value corresponds to leaf values of 5-9.
Stem-and-Leaf Display: Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.
Example: Leaf Unit = 0.1
If we have data with values such as
8.6 11.7 9.4 9.1 10.2 11.0 8.8
a stem-and-leaf display of these data will be
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
Example: Leaf Unit = 10
If we have data with values such as
1806 1717 1974 1791 1682 1910 1838
a stem-and-leaf display of these data will be
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
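The leaf-unit rule can be sketched in code: divide each value by the leaf unit, drop anything smaller than one leaf unit, and split the result into a stem and a single-digit leaf. The function name `stem_and_leaf` is an illustrative choice, the sketch assumes non-negative values, and it uses `Decimal` only to avoid floating-point surprises when dividing by 0.1.

```python
from collections import defaultdict
from decimal import Decimal

def stem_and_leaf(values, leaf_unit=1):
    """Return {stem: sorted leaves}; assumes non-negative data values."""
    unit = Decimal(str(leaf_unit))
    rows = defaultdict(list)
    for v in values:
        # int() truncates, so digits below the leaf unit are dropped.
        scaled = int(Decimal(str(v)) / unit)
        stem, leaf = divmod(scaled, 10)  # last digit is the leaf
        rows[stem].append(leaf)
    return {stem: sorted(rows[stem]) for stem in sorted(rows)}

def show(display, leaf_unit):
    print(f"Leaf Unit = {leaf_unit}")
    for stem, leaves in display.items():
        print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")

tenths = stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1)
tens = stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10)
show(tenths, 0.1)
show(tens, 10)
```

With a leaf unit of 10, the units digit of each value is discarded rather than rounded, so 1806 is recorded as stem 18, leaf 0.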
Crosstabulations and Scatter Diagrams
• So far we have focused on methods that are used to summarize the data for one variable at a time.
• Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
• Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.
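Crosstabulation
• A crosstabulation is a tabular summary of data for two variables.
• Crosstabulation can be used when:
  - one variable is qualitative and the other is quantitative,
  - both variables are qualitative, or
  - both variables are quantitative.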
Crosstabulation Example: Finger Lakes Homes
The number of Finger Lakes homes sold for each style and price for the past two years is shown below.
Insights Gained from Preceding Crosstabulation
Frequency distribution for the home style variable
Crosstabulation: Row or Column Percentages
Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.
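A minimal sketch of the conversion is shown here. The counts, price ranges, and style names are hypothetical and are not the Finger Lakes Homes figures; each row percentage divides a cell by its row total, and each column percentage divides a cell by its column total.

```python
# Hypothetical crosstabulation: rows are price ranges, columns are home styles.
table = {
    "Under $200,000": {"Colonial": 10, "Ranch": 30},
    "$200,000 or more": {"Colonial": 40, "Ranch": 20},
}

def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: 100 * n / row_total for col, n in cells.items()}
    return result

def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {row: {col: 100 * n / col_totals[col] for col, n in cells.items()}
            for row, cells in table.items()}

row_pct = row_percentages(table)
col_pct = column_percentages(table)
for row in table:
    print(row, row_pct[row], col_pct[row])
```

Row percentages answer "within this price range, what share is each style?", while column percentages answer "within this style, what share falls in each price range?".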
Crosstabulation: Row Percentages
Crosstabulation: Column Percentages
Scatter Diagram and Trendline
• A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
• One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
• The general pattern of the plotted points suggests the overall relationship between the variables.
• A trendline is an approximation of the relationship.
[Figure: Scatter Diagram - A Positive Relationship (axes x and y)]
[Figure: Scatter Diagram - A Negative Relationship (axes x and y)]
[Figure: Scatter Diagram - No Apparent Relationship (axes x and y)]
Example: Panthers Football Team
Scatter Diagram
The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.
x = Number of Interceptions   y = Number of Points Scored
1                             14
3                             24
2                             18
1                             17
3                             30
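One common way to draw a trendline is an ordinary least-squares line; the slides do not say which method they use, so the choice of least squares here is an assumption. The sketch uses the five data pairs from the example (x: 1 3 2 1 3, y: 14 24 18 17 30), and `least_squares_line` is an illustrative name.

```python
# The five (interceptions, points scored) pairs from the example.
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 30]

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares_line(interceptions, points)
print(f"points = {intercept:.2f} + {slope:.2f} * interceptions")
```

A positive slope corresponds to a positive relationship in the scatter diagram: more interceptions tend to go with more points scored. With only five observations, the line is a rough description rather than a reliable prediction.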
Scatter Diagram
Insights Gained from the Preceding Scatter Diagram
Example: Panthers Football Team
Tabular and Graphical Procedures
[Diagram: procedures grouped under Qualitative Data and Quantitative Data]