Copyright © 2011 Pearson Education, Inc.
Alternative Approaches to
Inference
Chapter 17
17.1 A Confidence Interval for the Median
An auto insurance company is thinking about compensating agents by comparing the
number of claims they produce to a standard.
Annual claims average near $3,200, with a median claim of $2,000.
sampling distribution
Distribution of Sample of Claims (n = 42)
For this sample, the average claim is $3,632 with
s = $4,254. The median claim is $2,456.
Is Sample Mean Compatible with µ=$3,200?
interval for µ
This interval is
$3,632 ± 2.02 × $4,254 / √42
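The arithmetic behind this interval can be checked with a short sketch. It uses only the summary statistics quoted on these slides (n = 42, mean $3,632, s = $4,254) and the t-multiplier 2.02. The function name `t_interval` is illustrative and does not come from the source.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Summary statistics of the claims sample; 2.02 is the multiplier shown on the slide.
lower, upper = t_interval(xbar=3632, s=4254, n=42, t_crit=2.02)
print(f"95% t-interval for the mean: ${lower:,.0f} to ${upper:,.0f}")
```

The resulting range, roughly $2,306 to $4,958, contains µ = $3,200.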
confidence t-interval for the mean.
condition necessary to use the t-interval.
the conditions are not met
Nonparametric Statistics
population
(theta)
Nonparametric Confidence Interval
The first step in finding a confidence interval for θ is to sort
the observed data in ascending order (the sorted values are known as order
statistics):
X(1) < X(2) < … < X(n)
we know
less than or equal to θ is ½,
between ordered observations using the binomial
distribution
segments to achieve desired coverage
whose coverage is exactly 0.95
[$1,217 to $3,168]
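As a minimal sketch of how binomial probabilities pick the order statistics for such an interval: each observation falls at or below the median θ with probability ½, so the number of observations at or below θ is Binomial(n, ½). The function name `median_ci_ranks`, the symmetric choice j = n + 1 − i, and the assumption of no ties are illustrative choices, not details taken from the slides.

```python
from math import comb

def median_ci_ranks(n, level=0.95):
    """Find ranks (i, j) so that [X(i), X(j)] covers the median with
    probability at least `level`, using Binomial(n, 1/2) probabilities.

    Returns (i, j, coverage) for the narrowest symmetric choice j = n + 1 - i.
    Assumes continuous data (no ties).
    """
    best = None
    for i in range(1, n // 2 + 1):
        j = n + 1 - i
        # The interval covers the median when B, the count of observations
        # at or below it, satisfies i <= B <= j - 1.
        hits = sum(comb(n, k) for k in range(i, j))
        coverage = hits / 2 ** n
        if coverage >= level:
            best = (i, j, coverage)  # keep moving inward while coverage holds
        else:
            break
    if best is None:
        raise ValueError("sample too small for the requested coverage")
    return best

i, j, coverage = median_ci_ranks(42)
print(f"Use X({i}) and X({j}); coverage = {coverage:.4f}")
```

Because the binomial distribution is discrete, the achieved coverage is at least the requested level rather than exactly equal to it.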
Parametric versus Nonparametric
of binomial probabilities (difficult to obtain exactly 95%
coverage)
distribution is skewed. This prohibits obtaining estimates
for the total (total = nµ).
17.2 Transformations
Transform Data into Symmetric Distributions
Taking base-10 logs of the claims data results in a more
symmetric distribution.
If $y = \log_{10} x$, then $\bar{y} = 3.312$ with $s_y = 0.493$.
The 95% confidence t-interval for $\mu_y$ is
[3.16 to 3.47].
If we convert back to the original scale of dollars, this
interval resembles that for the median rather than that for the mean.
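The back-transformation can be sketched as follows, using the log-scale summary statistics quoted above. The multiplier 2.02 is an assumption carried over from the t-interval for the mean (same sample, n = 42), and the variable names are illustrative.

```python
import math

# Summary statistics on the log10 scale, as quoted on the slide.
ybar, s_y, n = 3.312, 0.493, 42
t_crit = 2.02  # assumption: same t-multiplier as for the interval for the mean

margin = t_crit * s_y / math.sqrt(n)
log_lower, log_upper = ybar - margin, ybar + margin

# Undo the base-10 log to return to the dollar scale.
dollar_lower, dollar_upper = 10 ** log_lower, 10 ** log_upper

print(f"log10 scale: {log_lower:.2f} to {log_upper:.2f}")
print(f"dollars:     ${dollar_lower:,.0f} to ${dollar_upper:,.0f}")
```

The back-transformed endpoints land on the order of $1,400 to $2,900, which is why the interval looks like one for the median of claims rather than for the mean.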
17.3 Prediction Intervals
from the population with chosen probability
anticipates the size of the next claim, allowing for the
random variation associated with an individual
For a Normal Population
The 100(1 − α)% prediction interval for an independent draw
from a normal population is
$$\bar{x} \pm t_{\alpha/2,\,n-1}\, s \sqrt{1 + \frac{1}{n}}$$
where $\bar{x}$ and $s$ estimate µ and σ.
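A minimal sketch of the normal-population prediction interval, x̄ ± t·s·√(1 + 1/n), follows. The function name is hypothetical, and plugging in the claims summary statistics (with multiplier 2.02) is only an illustration, since the slides describe the claims distribution as skewed rather than normal.

```python
import math

def normal_prediction_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar ± t_crit * s * sqrt(1 + 1/n)."""
    margin = t_crit * s * math.sqrt(1 + 1 / n)
    return xbar - margin, xbar + margin

# Illustration only: claims summary statistics, even though claims are not normal.
pi_lower, pi_upper = normal_prediction_interval(xbar=3632, s=4254, n=42, t_crit=2.02)
print(f"Prediction interval: ${pi_lower:,.0f} to ${pi_upper:,.0f}")
```

The lower limit comes out negative, which is impossible for a claim amount and illustrates why this formula should be reserved for populations that are close to normal.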
Nonparametric Prediction Interval
P(X(i) ≤ X ≤ X(i+1)) = 1/(n + 1)
P(X ≤ X(1)) = 1/(n + 1)
P(X(n) ≤ X) = 1/(n + 1)
Combine segments to get desired coverage
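Combining segments can be sketched as below: the n order statistics cut the line into n + 1 segments, each holding the next observation with probability 1/(n + 1), so the range from X(i) to X(j) has coverage (j − i)/(n + 1). The function name `prediction_ranks` and the rule of trimming the two tails as evenly as possible are illustrative assumptions.

```python
import math

def prediction_ranks(n, coverage):
    """Pick ranks (i, j) so that [X(i), X(j)] has at least `coverage`.

    Returns (i, j, achieved_coverage), where achieved = (j - i) / (n + 1).
    """
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    # Number of segments needed (small tolerance guards against float noise).
    segments_needed = math.ceil(coverage * (n + 1) - 1e-9)
    excluded = (n + 1) - segments_needed
    if excluded < 2:
        raise ValueError("sample too small: both tails need at least one segment")
    i = excluded // 2          # segments dropped below X(i)
    j = i + segments_needed    # remaining excluded segments lie above X(j)
    return i, j, (j - i) / (n + 1)

# Hypothetical sample size: with 9 observations, min-to-max gives 80% coverage.
print(prediction_ranks(9, 0.8))
```

Given data sorted in ascending order in a list `xs`, the interval endpoints would be `xs[i - 1]` and `xs[j - 1]`.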
4M Example 17.1: EXECUTIVE SALARIES
Motivation
Fees earned by an executive placement
service are 5% of the starting annual total
compensation package. How much can
the firm expect to earn by placing a current client as a CEO in the telecom industry?
Method
Obtain data (n = 23 CEOs from telecom industry)
The distribution of total compensation for
CEOs in the telecom industry is not normal. Construct a nonparametric prediction
interval for the client’s anticipated total
compensation package.
Mechanics
Sort the data:
Message
The compensation package of three out of four placements in this industry is predicted to be in the range from about $750,000 to
$30,000,000. The implied fee ranges from
$37,500 to $1,500,000.
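The fee range and the "three out of four" coverage can be checked with a few lines. The ranks 3 and 21 are an assumption: the sorted data are not reproduced here, and this is simply one pair of order statistics that spans 18 of the 24 segments for n = 23.

```python
n = 23                    # CEOs in the sample
i, j = 3, 21              # assumed ranks of the interval endpoints (not shown in the source)
coverage = (j - i) / (n + 1)

fee_rate = 0.05           # placement fee: 5% of total compensation
pay_low, pay_high = 750_000, 30_000_000   # interval endpoints quoted in the Message
fee_low, fee_high = fee_rate * pay_low, fee_rate * pay_high

print(f"coverage = {coverage:.2f}")
print(f"fee range: ${fee_low:,.0f} to ${fee_high:,.0f}")
```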
17.4 Proportions Based on Small Samples
Wilson’s Interval for a Proportion
closer to ½ and away from the troublesome boundaries at
0 and 1
create an adjusted proportion $\tilde{p}$ (in place of the sample proportion $\hat{p}$)
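Add 2 successes and 2 failures to the data and define $\tilde{p} = (\#\text{ of successes} + 2)/(n + 4)$. The interval is then
$$\tilde{p} \pm z_{\alpha/2} \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$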
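A minimal sketch of this adjusted interval, assuming the form p̃ ± z·√(p̃(1 − p̃)/(n + 4)) with p̃ = (successes + 2)/(n + 4). The function name `wilson_interval` is illustrative, z = 1.96 corresponds to 95% confidence, and the counts come from the drug-testing data in Example 17.2 (9 relapses among 19 patients).

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Interval for a proportion after adding 2 successes and 2 failures
    (called Wilson's interval on these slides).

    Returns (p_adj, lower, upper).
    """
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj, p_adj - margin, p_adj + margin

# Counts from Example 17.2: 9 relapses among 19 patients.
p_adj, p_lower, p_upper = wilson_interval(successes=9, n=19)
print(f"adjusted proportion = {p_adj:.3f}")
print(f"95% interval: {p_lower:.2f} to {p_upper:.2f}")
```

For these counts the adjusted proportion is about 0.478 and the interval runs from about 0.27 to 0.68.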
4M Example 17.2: DRUG TESTING
Motivation
A company is developing a drug to prolong
time before a relapse of cancer. The drug must cut the rate of relapse in half. To test this drug, the company first needs to know the current time to relapse.
Method
Data are collected for 19 patients who were
observed for 24 months. Doctors found a
relapse in 9 of the 19 patients. While the SRS condition is satisfied, the sample size condition
is not. Use Wilson’s interval for a proportion.
Mechanics
$$\tilde{p} = \frac{9 + 2}{19 + 4} \approx 0.478$$
$$0.478 \pm 1.96 \sqrt{\frac{0.478(1 - 0.478)}{19 + 4}}$$
Message
We are 95% confident that the proportion of
patients with this cancer that relapse within 24
months is between 27% and 68%. In order to cut this proportion in half, the drug will have to reduce this rate to somewhere between 13% and 34%.
in order to use a t-interval for the mean.
because they are narrower than a nonparametric interval
quantile plot
Pitfalls (Continued)
prediction interval