Continuous Probability Distributions
Chapter 7
A continuous random variable has an uncountably infinite number of possible values.
P(a ≤ X ≤ b)
The probability that a continuous variable X will assume any particular value is zero.
The probability that X falls between a and b is the area under the graph of f(x) between a and b.
A random variable X is said to be uniformly distributed if its density function is
$$f(x) = \frac{1}{b-a}, \quad a \le x \le b,$$
with
$$E(X) = \frac{a+b}{2}, \qquad V(X) = \frac{(b-a)^2}{12}.$$
Example 7.1: The daily sale of gasoline is uniformly distributed between 2,000 and 5,000 gallons. Find the probability that sales are between 2,500 and 3,000 gallons.
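Example 7.1 can be checked numerically. The sketch below assumes the uniform density 1/(b − a) on [a, b]; the helper name `uniform_prob` is hypothetical and only serves the illustration.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: base times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)  # clip the interval to the support
    if hi <= lo:
        return 0.0
    return (hi - lo) / (b - a)

a, b = 2000, 5000                      # daily gasoline sales, in gallons
p = uniform_prob(2500, 3000, a, b)     # 500 / 3000
mean = (a + b) / 2                     # E(X) = (a + b) / 2
variance = (b - a) ** 2 / 12           # V(X) = (b - a)^2 / 12

print(f"P(2500 <= X <= 3000) = {p:.4f}")
print(f"E(X) = {mean}, V(X) = {variance}")
```

The probability is the rectangle area 500 × (1/3000), or about 0.1667.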
7.2 Normal Distribution
A random variable X with mean µ and variance σ² is normally distributed if its probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty,$$
where π = 3.14159… and e = 2.71828…
We denote a normal distribution by N(µ, σ²).
Normal distributions range from minus infinity to plus infinity.
The Shape of the Normal Distribution
The normal distribution is bell shaped and symmetrical around µ.
Increasing the mean shifts the curve to the right…
Increasing the standard deviation “flattens” the curve…
Two facts help calculate normal probabilities:
- The normal distribution is symmetrical.
- Any random variable (rv) X having N(µ, σ²) can be transformed into a rv Z having N(0, 1) by Z = (X − µ)/σ.
This shifts the mean of X to zero…
This changes the shape of the curve…
Example 1: The amount of time it takes to assemble a computer is normally distributed, with a mean of 50 minutes and a standard deviation of 10 minutes. What is the probability that a computer is assembled in between 45 and 60 minutes?
• Solution
Let X denote the assembly time of a computer.
We seek P(45 < X < 60).
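A short numerical sketch of this solution, using Python's standard-library `statistics.NormalDist`. It computes the probability twice, once directly on X and once after standardizing to Z = (X − µ)/σ, to show that the two routes agree.

```python
from statistics import NormalDist

mu, sigma = 50, 10                      # assembly time, in minutes
X = NormalDist(mu=mu, sigma=sigma)

# Direct route: area under the N(50, 10^2) curve between 45 and 60.
p = X.cdf(60) - X.cdf(45)

# Standardized route: P(-0.5 < Z < 1) under the standard normal curve.
z_lo, z_hi = (45 - mu) / sigma, (60 - mu) / sigma
Z = NormalDist()
p_z = Z.cdf(z_hi) - Z.cdf(z_lo)

print(f"z limits: {z_lo}, {z_hi}")
print(f"P(45 < X < 60) = {p:.4f}")
```

Both routes give approximately 0.5328.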
Example 2: The standard deviation of the rate of return (X) is now 10%. What is the probability of losing money?
[Figure: the normal curves for σ = 5% and σ = 10%]
Using Excel to Find Normal Probabilities
For P(X<k) enter in any empty cell:
What percentage of the standard normal population is located to the left of $z_{.30}$?
Example 4: Determine the z value not exceeded by 5% of the population; that is, the z value exceeded by 95% of the population.
Solution: Because of the symmetry of the normal distribution, it is the negative of $z_{.05}$: the area to its left is 0.05 and the area to its right is 0.95, so $-z_{.05} = -1.645$.
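The value can also be obtained without a table. This minimal sketch uses `NormalDist.inv_cdf` from the standard library, which returns the point with a given area to its left.

```python
from statistics import NormalDist

Z = NormalDist()                 # standard normal, N(0, 1)

z_left_05 = Z.inv_cdf(0.05)      # area 0.05 to the left
z_05 = Z.inv_cdf(0.95)           # z_.05: area 0.05 to the right

print(f"z with 5% of the population below it: {z_left_05:.3f}")
print(f"z_.05 = {z_05:.3f}")
```

By symmetry the two values differ only in sign, which is the argument used in the solution.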
The Student t density function
$$f(t) = \frac{[(\nu-1)/2]!}{\sqrt{\nu\pi}\,[(\nu-2)/2]!}\left[1 + \frac{t^2}{\nu}\right]^{-(\nu+1)/2}$$
ν (nu) is called the degrees of freedom.
E(t) = 0, V(t) = ν/(ν − 2) (for ν > 2)
Much like the standard normal distribution, the Student t distribution is symmetrical about its mean of zero.
As the number of degrees of freedom increases, the t distribution approaches the standard normal distribution.
The Student t distribution is used extensively in statistical inference. Table 4 in Appendix B lists values of $t_{A,\nu}$; that is, values of a Student t random variable with ν degrees of freedom such that
$$P(t > t_{A,\nu}) = A.$$
The values for A are pre-determined “critical” values, typically in the 10%, 5%, 2.5%, 1% and 0.5% range.
For example, suppose we want the value of t with 10 degrees of freedom such that the area under the Student t curve to its right is .05:
Area under the curve value ($t_A$): COLUMN
Degrees of Freedom: ROW
$t_{.05,10} = 1.812$
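The table lookup can be reproduced from the density itself. The sketch below is one possible approach, not the method behind the printed table: it writes the density with the gamma function (Γ((ν+1)/2) equals the factorial [(ν−1)/2]! used on the slide), integrates it with Simpson's rule, and finds the critical value by bisection. The function names are hypothetical.

```python
import math

def t_pdf(t, nu):
    """Student t density with nu degrees of freedom (gamma-function form)."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_right_tail(x, nu, steps=2000):
    """P(t > x) for x >= 0: 0.5 minus the Simpson integral of the density on [0, x]."""
    h = x / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3

def t_critical(A, nu):
    """Value t_A with right-tail area A (0 < A < 0.5), found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_right_tail(mid, nu) > A:  # tail still too large: move right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_05_10 = t_critical(0.05, 10)
print(f"t_.05,10 = {t_05_10:.3f}")
```

The result agrees with the table entry 1.812 to three decimals.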
Chi-squared values can be found from the chi-squared table or from Excel.
The χ² table entries are the $\chi^2_A$ values for a given right-hand tail probability A; that is, for a chi-squared variable with ν degrees of freedom, $P(\chi^2 > \chi^2_{A,\nu}) = A$.
Degrees of freedom   χ².995      χ².99       …   χ².05    χ².01     χ².005
1                    0.0000393   0.0001571   …   …        6.6349    7.87944
…
10                   2.15585     2.55821     …   18.307   23.2093   25.1882

To find the χ² value for which the right-hand tail area is A = .05, read the χ².05 column.
To find the point in a chi-squared distribution with 8 degrees of freedom such that the area to the right is .05, look up the intersection of the 8 d.f. row with the $\chi^2_{.05}$ column, yielding a value of 15.5.
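To find the point in a chi-squared distribution with 8 degrees of freedom such that the area to the left is .05, note that the area to its right is .95 and look up the intersection of the 8 d.f. row with the $\chi^2_{.95}$ column, yielding a value of 2.73.
$\chi^2_{.95,8} = 2.73$, $\chi^2_{.05,8} = 15.5$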
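Both table values for 8 degrees of freedom can be verified with a few lines of code. This sketch relies on an assumption that is not in the slides: for an even number of degrees of freedom, the right-tail area has the closed form exp(−x/2) · Σ (x/2)^j / j! for j from 0 to df/2 − 1. The function names are hypothetical, and the approach does not cover odd degrees of freedom.

```python
import math

def chi2_right_tail_even(x, df):
    """P(chi-squared_df > x) for even df, using the closed-form series."""
    if df % 2 != 0:
        raise ValueError("this closed form only applies to even df")
    half = x / 2
    series = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * series

def chi2_point(A, df):
    """Value with right-tail area A, found by bisection on the tail area."""
    lo, hi = 0.0, 200.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if chi2_right_tail_even(mid, df) > A:  # tail too large: move right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

right_05 = chi2_point(0.05, 8)   # area .05 to the right
left_05 = chi2_point(0.95, 8)    # area .05 to the left = area .95 to the right

print(f"chi2_.05,8 = {right_05:.2f}")
print(f"chi2_.95,8 = {left_05:.2f}")
```

The two results round to 15.5 and 2.73, matching the table lookups.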
F Distribution…
$$f(F) = \frac{\left(\frac{\nu_1+\nu_2-2}{2}\right)!}{\left(\frac{\nu_1-2}{2}\right)!\,\left(\frac{\nu_2-2}{2}\right)!}\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2}\frac{F^{(\nu_1-2)/2}}{\left(1+\frac{\nu_1 F}{\nu_2}\right)^{(\nu_1+\nu_2)/2}}, \quad F > 0$$
ν1 is the “numerator” degrees of freedom and ν2 is the “denominator” degrees of freedom.
This density function generates a rich family of distributions, depending on the values of ν1 and ν2.
[Figure: The F Distribution, comparing ν1 = 5, ν2 = 10 with ν1 = 50, ν2 = 10, and ν1 = 5, ν2 = 10 with ν1 = 5, ν2 = 1]
For example, what is the value of F for 5% of the area under the right-hand “tail” of the curve, with a numerator degree of freedom of 3 and a denominator degree of freedom of 7?
Numerator Degrees of Freedom: COLUMN
Denominator Degrees of Freedom: ROW
There are different tables for different values of A. Make sure you start with the correct table!
$F_{.05,3,7} = 4.35$
Sampling Distributions
Chapter 8
Samples are random, so the sample statistic is a random variable.
As such it has a sampling distribution.
8.1 Sampling Distribution of the Mean
Example 1: A die is thrown infinitely many times. Let X represent the number of spots showing on any throw.
Suppose we want to estimate µ from the mean $\bar{x}$ of a sample of size n = 2.
What is the distribution of $\bar{x}$?
Throwing a die twice – sample mean
The distribution of $\bar{x}$ when n = 2
Calculating the relative frequency of each value of $\bar{x}$, we have the following results.
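The distribution of the sample mean for n = 2 can be enumerated directly. The sketch below assumes a fair six-sided die, lists all 36 equally likely pairs of throws, and tallies the relative frequency of each value of the sample mean using exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two throws, reduced to their sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))
dist = {m: Fraction(c, 36) for m, c in sorted(counts.items())}

mean = sum(m * p for m, p in dist.items())               # mean of the sample mean
var = sum((m - mean) ** 2 * p for m, p in dist.items())  # variance of the sample mean

for m, p in dist.items():
    print(f"x-bar = {float(m):3.1f}   P = {p}")
print(f"mean = {mean}, variance = {var}")
```

The mean of the sample mean is 7/2, the same as the population mean of a single throw, while its variance is 35/24, half the single-throw variance of 35/12.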
n = 5: $\mu_{\bar{x}} = 3.5$, $\sigma^2_{\bar{x}} = \sigma^2_x / 5 = .5833$
n = 10: $\mu_{\bar{x}} = 3.5$, $\sigma^2_{\bar{x}} = \sigma^2_x / 10 = .2917$
n = 25: $\mu_{\bar{x}} = 3.5$, $\sigma^2_{\bar{x}} = \sigma^2_x / 25 = .1167$
As the sample size changes, the mean of the sample mean does not change!
As the sample size increases, the variance of the sample mean decreases!
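The figures for n = 5, 10 and 25 follow from dividing the single-throw variance by n. This sketch assumes a fair die and recomputes them with exact fractions.

```python
from fractions import Fraction

faces = range(1, 7)
mu = Fraction(sum(faces), 6)                       # population mean: 7/2
sigma2 = sum((x - mu) ** 2 for x in faces) / 6     # population variance: 35/12

# Mean and variance of the sample mean for each sample size.
table = {n: (mu, sigma2 / n) for n in (5, 10, 25)}

for n, (m, v) in table.items():
    print(f"n = {n:2d}: mean = {float(m)}, variance = {float(v):.4f}")
```

The mean stays at 3.5 for every n, while the variance shrinks in proportion to 1/n.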
Compare the range of the population to the range of the sample mean. Let us take samples of two observations.
The Central Limit Theorem
If a random sample is drawn from any population, the sampling distribution of the sample mean is:
– Normal if the parent population is normal,
– Approximately normal if the parent population is not normal, provided the sample size is sufficiently large.
The larger the sample size, the more closely the sampling distribution of $\bar{x}$ will resemble a normal distribution.
Example 2: The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of .3 ounces.
Find the probability that a bottle bought by a customer will contain more than 32 ounces.
$$P(x > 32) = P\left(\frac{x-\mu}{\sigma} > \frac{32-32.2}{.3}\right) = P(z > -.67) = 0.7486$$
Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.
$$P(\bar{x} > 32) = P\left(\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} > \frac{32-32.2}{.3/\sqrt{4}}\right) = P(z > -1.33) = 0.9082$$
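A numerical sketch of both parts of this example. It assumes a standard deviation of 0.3 ounces, since that is the value for which the stated probabilities 0.7486 and 0.9082 come out. The code keeps z unrounded, so its results differ slightly from table lookups at z = −.67 and z = −1.33.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3        # sigma = 0.3 is an assumption (see the note above)
Z = NormalDist()

# Single bottle: standardize with sigma.
z_one = (32 - mu) / sigma
p_one = 1 - Z.cdf(z_one)

# Carton of four: the sample mean has standard deviation sigma / sqrt(n).
n = 4
z_four = (32 - mu) / (sigma / sqrt(n))
p_four = 1 - Z.cdf(z_four)

print(f"one bottle:  z = {z_one:.2f}, P(x > 32)     = {p_one:.4f}")
print(f"four bottles: z = {z_four:.2f}, P(x-bar > 32) = {p_four:.4f}")
```

The mean of four bottles is less variable than a single bottle, so it is more likely to exceed 32 ounces.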
Example 3: The average weekly income of B.B.A. graduates one year after graduation is $600.
Suppose the distribution of weekly income has a standard deviation of $100.
What is the probability that 35 randomly selected graduates have an average weekly income of less than $550?
$$P(\bar{x} < 550) = P\left(\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < \frac{550-600}{100/\sqrt{35}}\right) = P(z < -2.97) = 0.0015$$
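The same calculation in code, as a rough check. With unrounded arithmetic the standardized value comes out near −2.96 rather than the −2.97 shown on the slide (a rounding difference), and the probability still rounds to 0.0015.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 35           # weekly income, in dollars
std_error = sigma / sqrt(n)           # standard deviation of the sample mean

z = (550 - mu) / std_error
p = NormalDist().cdf(z)               # P(x-bar < 550)

print(f"z = {z:.2f}, P(x-bar < 550) = {p:.4f}")
```

An average below $550 would be very unusual if the population mean really were $600.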
Chapter 9
Statistical inference is the process by which we acquire information and draw conclusions about populations from samples.
There are two procedures for making inferences:
The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.
We saw earlier that point probabilities in continuous distributions are zero. Likewise, the probability of a point estimator being exactly correct is zero.
An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.
That is, we say (with some ___% certainty) that the population parameter of interest is between some lower and upper bounds.
For example, suppose we want to estimate the mean summer income of a class of business students. For n = 25 students, $\bar{x}$ is calculated to be 400 $/week (point estimate).
An alternative statement is:
The mean income is between 380 and 420 $/week (interval estimate).
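Estimating µ when σ is known…
From Chapter 8, the sampling distribution of $\bar{X}$ is approximately normal with mean µ and standard deviation $\sigma/\sqrt{n}$. Thus
$$Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$$
is (approximately) standard normally distributed.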
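Thus, substituting Z produces
$$P\left(-z_{\alpha/2} < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1-\alpha$$
With a little bit of algebra,
$$P\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha$$
With a little bit of different algebra we have
$$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha$$
The confidence interval is $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.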
Lower confidence limit (LCL) = $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$
Upper confidence limit (UCL) = $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$
The probability 1 – α is the confidence level, which is a measure of how frequently the interval will actually include µ.
Four commonly used confidence levels
Example 2: Doll Computer Comp found that the demand over the lead time is normally distributed with a standard deviation of 75. Estimate the expected demand over the lead time at a 95% confidence level. Assume n = 25 and $\bar{x} = 370.16$.
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 370.16 \pm z_{.025}\frac{75}{\sqrt{25}} = 370.16 \pm 1.96\,\frac{75}{\sqrt{25}} = 370.16 \pm 29.40 = [340.76,\ 399.56]$$
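The interval can be recomputed with a small helper. The function name `z_interval` is hypothetical; it applies the estimator x̄ ± z(α/2)·σ/√n with the critical value taken from the standard normal distribution rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

lcl, ucl = z_interval(xbar=370.16, sigma=75, n=25, conf=0.95)
print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```

The limits round to 340.76 and 399.56, in line with the worked example.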
Comparing two confidence intervals with the same level of confidence, the narrower interval provides more information than the wider interval.
The width of the confidence interval is calculated by
$$\left(\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) - \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
and therefore is affected by
• the population standard deviation (σ)
• the confidence level (1 − α)
• the sample size (n).
If the standard deviation grows larger, a wider confidence interval is needed to maintain the confidence level.
Note what happens when σ increases to 1.5σ.
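Example 1: Estimate the mean value of the distribution resulting from the 100 repeated throws of the die. It is known that σ = 1.71.
Use 90% confidence level: $\bar{x} \pm z_{.05}\,\sigma/\sqrt{n} = \bar{x} \pm 1.645\,(1.71/\sqrt{100}) = \bar{x} \pm .28$
Use 95% confidence level: $\bar{x} \pm z_{.025}\,\sigma/\sqrt{n} = \bar{x} \pm 1.96\,(1.71/\sqrt{100}) = \bar{x} \pm .34$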
By increasing the sample size we can decrease the width of the confidence interval while the confidence level can remain unchanged.
$$\text{Interval width} = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
There is an inverse relationship between the width of the interval and the sample size: the width is proportional to $1/\sqrt{n}$.
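To make the relationship concrete, the sketch below evaluates the width 2·z(α/2)·σ/√n for several sample sizes. The value σ = 1.71 is borrowed from the die example, and the sample sizes 100, 400 and 1600 are illustrative choices, not figures from the slides.

```python
from math import sqrt
from statistics import NormalDist

def interval_width(sigma, n, conf):
    """Full width of the z confidence interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / sqrt(n)

# Hypothetical sample sizes, each four times the previous one.
widths = {n: interval_width(sigma=1.71, n=n, conf=0.95) for n in (100, 400, 1600)}

for n, w in widths.items():
    print(f"n = {n:4d}: width = {w:.4f}")
```

Each fourfold increase in n halves the width, because the width depends on the square root of the sample size.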
The phrase “estimate the mean to within W units” translates to an interval estimate of the form $\bar{x} \pm W$.
The required sample size to estimate the mean is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2$$
Example 4: To estimate the amount of lumber that can be harvested in a tract of land, the mean diameter of trees in the tract must be estimated to within one inch with 99% confidence. What sample size should be taken for a margin of error of ±1 inch? (Assume diameters are normally distributed with σ = 6 inches.)
The confidence level 99% leads to α = .01, thus $z_{\alpha/2} = z_{.005} = 2.575$.
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\frac{2.575(6)}{1}\right)^2 \approx 239$$
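A minimal sketch of the sample-size calculation. The helper name `required_n` is hypothetical; it uses the table value z = 2.575 from the example and rounds up, since a fractional observation is not possible and rounding down would miss the target margin.

```python
from math import ceil

def required_n(z, sigma, W):
    """Smallest n with z * sigma / sqrt(n) <= W, i.e. n >= (z * sigma / W)^2."""
    return ceil((z * sigma / W) ** 2)

raw = (2.575 * 6 / 1) ** 2          # unrounded value, about 238.7
n = required_n(z=2.575, sigma=6, W=1)

print(f"(z * sigma / W)^2 = {raw:.1f}, so n = {n}")
```

Halving the margin W would quadruple the required sample size.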
Introduction to Hypothesis Testing
Chapter 10
• Is there statistical evidence that supports the hypothesis that more than p% of all potential customers will purchase a new product?
• Is there statistical evidence that a certain drug is effective?
10.1 Concepts of Hypothesis Testing
The Reject-Region method
Step 1: Two hypotheses are defined.
H0: The null hypothesis specifies our current belief about the parameter we test (µ = 170, p = .4, etc.). It must be a specific value.
H1: The alternative hypothesis specifies a range of values for the parameter tested (µ > 170; p ≠ .4; etc.), as suggested by the belief being investigated.
[Figure: H0 is needed in the test; do not reject H0]
10.2 Testing the Population Mean when the Population Standard Deviation is Known
Example 1: The manager of a department determines that a new billing system will be cost-effective only if the mean monthly account is more than $170. A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. Assume a standard deviation of $65. Can the manager conclude from this that the new system will be cost-effective?
Step 1: H0: µ = 170
H1: µ > 170
β = probability of committing a Type II error
Send an innocent person
Step 3: Determine the sample size n and hence the sampling distribution.
Example 1: n = 400; N(0,1)
Step 4: Depending on the sampling distribution, H1, and the value of α, find the suitable critical value and the reject region.
H1: µ > 170 (one-sided test)
Sampling distribution: N(0,1)
α = .05
Critical value: 1.645
Reject region: Z > 1.645
Right-Tail Testing
Left-Tail Testing
Two-Tail Testing
Step 5: Collect data and calculate the standardized test statistic.
Example 1: with $\bar{x} = 178$ and µ = 170 under H0,
$$z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} = \frac{178-170}{65/\sqrt{400}} = 2.46$$
Step 6: If the test statistic is in the reject region, then reject H0; otherwise do not reject H0.
Reject H0: There is enough statistical evidence to conclude that the alternative hypothesis is true.
Do not reject H0: There is not enough statistical evidence to conclude that the alternative hypothesis is true.
Example 1: Reject H0 at α = .05.
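Steps 4 to 6 for the billing-system example can be traced in a few lines. This is a sketch of the reject-region method for a right-tail test, with the critical value computed from the standard normal distribution instead of read from a table.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 170, 65, 400     # H0: mu = 170; known sigma; sample size
xbar, alpha = 178, 0.05          # sample mean and significance level

# Step 4: right-tail critical value z_alpha (about 1.645 for alpha = .05).
critical = NormalDist().inv_cdf(1 - alpha)

# Step 5: standardized test statistic.
z_stat = (xbar - mu0) / (sigma / sqrt(n))

# Step 6: reject H0 when the statistic falls in the reject region Z > z_alpha.
reject = z_stat > critical

print(f"z = {z_stat:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Since 2.46 exceeds 1.645, the statistic lies in the reject region and H0 is rejected.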
A Left Hand Tail Test
Reject H0 if $\bar{x}$ falls to the left of the critical value.
The SSA envelope plan example: The chief financial officer at FedEx believes that including a stamped self-addressed (SSA) envelope in the monthly invoice sent to customers will decrease the amount of time it takes for customers to pay their monthly bills. Currently, customers return their payments in 22 days on average, with a standard deviation of 6 days.
A random sample of 220 customers was selected and SSA envelopes were included with their invoice packs. The mean time it took customers to pay their bill was 21.63 days.
Can the CFO conclude that the plan will be successful at the 10% significance level?
$$z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} = \frac{21.63-22}{6/\sqrt{220}} = -.91$$
Reject H0 if the test statistic falls to the left of the critical value.
mean of the long-distance bills for all AT&T
difference between AT&T’s bills and the
competitor’s bills (on the average)?
55
17
=
x
$$z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} = \frac{17.55-17.09}{3.87/\sqrt{100}} = 1.19$$
Critical values: −1.96 and 1.96 (α/2 = .025 in each tail)
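A sketch of the two-tail test for this example. The inputs µ0 = 17.09, σ = 3.87 and n = 100 are read off the garbled computation on the slide, and α = .05 is inferred from α/2 = .025; treat all four as assumptions about the original example.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 17.09, 3.87, 100   # values inferred from the slide's calculation
xbar, alpha = 17.55, 0.05

# Two-tail critical value z_{alpha/2} (about 1.96); reject if |z| exceeds it.
critical = NormalDist().inv_cdf(1 - alpha / 2)

z_stat = (xbar - mu0) / (sigma / sqrt(n))
reject = abs(z_stat) > critical

print(f"z = {z_stat:.2f}, critical values = ±{critical:.2f}, reject H0: {reject}")
```

With these inputs the statistic (about 1.19) lies between −1.96 and 1.96, so H0 is not rejected.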
Chapter 11
Inference About a Population
11.1 Inference About a Population Mean When the Population Standard Deviation is Unknown
Recall: By the central limit theorem, when σ² is known, $\bar{x}$ is normally distributed if:
• the sample is drawn from a normal population, or
• the population is not normal but the sample is sufficiently large.
When σ² is unknown, we use s² instead, and the test statistic $t = (\bar{x}-\mu)/(s/\sqrt{n})$ has the Student t distribution with n − 1 degrees of freedom.
Example 1: The productivity of newly hired trainees is studied. It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
Can we conclude that this belief is correct, if the mean productivity observation of 50 trainees is 460.38 and the standard deviation is 38.83?
Step 4: Reject region: $t \ge t_{\alpha,n-1} = t_{.05,49} \cong t_{.05,50} = 1.676$ (critical value; cf. 1.645 for the Z-distribution)
Step 5:
$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}} = \frac{460.38-450}{38.83/\sqrt{50}} = 1.89$$
Step 6: Reject the null hypothesis.
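The test statistic for the trainee example can be recomputed as follows. The critical value 1.676 is taken directly from the slide (the table entry for 50 degrees of freedom), since the Python standard library has no Student t quantile function.

```python
from math import sqrt

mu0 = 450                  # H0: mu = 450 packages per hour
xbar, s, n = 460.38, 38.83, 50
critical = 1.676           # t_.05,50 as quoted on the slide (table value)

# Same form as the z statistic, with the sample standard deviation s
# in place of sigma; n - 1 degrees of freedom.
t_stat = (xbar - mu0) / (s / sqrt(n))
reject = t_stat >= critical

print(f"t = {t_stat:.2f} with {n - 1} d.f., critical value = {critical}, reject H0: {reject}")
```

The statistic of about 1.89 exceeds 1.676, which matches the conclusion in Step 6.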
Confidence interval estimator of µ when σ² is unknown
$$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n-1$$
Example 2: An investor is trying to estimate the return on investment in companies that won quality awards last year. A random sample of 83 such companies is selected, and yields $s^2 = 68.98$, so $s = \sqrt{68.98} = 8.31$.
Construct a 95% confidence interval for the mean return.