Reading 10 Common Probability Distributions
–––––––––––––––––––––––––––––––––––––– Copyright © FinQuiz.com All rights reserved ––––––––––––––––––––––––––––––––––––––
1 INTRODUCTION TO COMMON PROBABILITY DISTRIBUTIONS
Probability distribution: A probability distribution
describes the values of a random variable and the
probability associated with these values.
Types of distribution:
1 Uniform
2 Binomial
3 Normal
4 Lognormal
Random variable: A variable that has uncertain future
outcomes is called a random variable. The two basic types
of random variables are:
1) Discrete random variables: Discrete random variables
have a countable number of outcomes, i.e. all
possible outcomes can be listed without missing any
of them. Examples include counts, dice rolls, the number of
students, and the quoted price of a stock. A discrete
random variable can take on:
• A limited (finite) number of outcomes, i.e. x1, x2,
…, xn
• An unlimited (infinite) number of outcomes, i.e. y1,
y2, …
2) Continuous random variables: Continuous random
variables have an infinite and uncountable range of
possible outcomes; thus, we cannot list all possible
outcomes. Examples include time, weight, distance, and rate
of return. The range of possible outcomes of a
continuous random variable is the real line, i.e.
between -∞ and +∞, or some subset of the real line.
Probability function: The probability function describes
the probability of a specific value that the random
variable can take.
• For a discrete random variable, it is denoted as
P(X = x), read as the “probability that a random
variable X takes on the value x”,
where,
X represents the name of the random variable
x represents the value of the random variable
Example:
Suppose X = number of heads in 15 flips of a coin.
P(X = 5) = p(5) = probability of 5 heads (x) in 15 flips of a
coin.
• For a continuous random variable, the probability
function is called the probability density function
(pdf) and is denoted as f(x).
Properties of a probability function:
1) 0 ≤ p(x) ≤ 1, for all x
2) The sum of the probabilities p(x) over all values of X =
1, i.e. ∑ p(x) = 1
Cumulative distribution function or distribution function:
The cumulative distribution function describes the
probability that a random variable X is less than or equal to a particular value x,
i.e. P(X ≤ x). For both discrete and continuous random variables, it is denoted as F(x) = P(X ≤ x).
For a discrete random variable, F(x) = Sum of all the values of the probability function for all outcomes ≤ x.
Properties of the cumulative distribution function (cdf):
1) The cdf lies between 0 and 1 for any x, i.e. 0 ≤ F(x) ≤ 1
2) With an increase in x, the cdf either increases or remains constant
For detailed understanding, please refer to the Example given after Table 1, Reading 10, Volume 1.
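The relationship between the probability function and the cdf can be illustrated with a minimal sketch. The fair six-sided die used here is a hypothetical example (it is not taken from the reading), and the names `pmf` and `cdf` are illustrative only.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die (assumption, not from the reading).
outcomes = [1, 2, 3, 4, 5, 6]
pmf = {x: Fraction(1, 6) for x in outcomes}


def cdf(x):
    """F(x) = P(X <= x): sum of p(k) over all outcomes k <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))


# Properties of a probability function: each p(x) is in [0, 1] and they sum to 1.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

print(cdf(3))  # P(X <= 3)
print(cdf(6))  # P(X <= 6)
```

Exact fractions are used so that the "sums to 1" property can be checked without rounding error.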
2.1 The Discrete Uniform Distribution
It is the simplest form of probability distribution.
• The discrete uniform distribution has a finite number
of specified outcomes.
• Each outcome in a discrete uniform distribution is equally likely.
2.2 The Binomial Distribution
A distribution that involves binary outcomes is referred to
as a binomial distribution. It has the following properties:
1. A binomial distribution has a fixed number of trials, i.e. n.
2. Each trial in a binomial distribution has two possible
outcomes, i.e. a “success” and a “failure”.
3. The probability of success is denoted as P(success) = p
and the probability of failure is denoted as P(failure)
= 1 – p, for all trials.
4. The trials are independent, which means that the
outcome of one trial does not affect the outcomes
of any other trials.
Practice: Example 1,
Volume 1, Reading 10
Assumptions of the binomial distribution:
a) The probability of success (i.e. p) is constant for all
trials.
b) The trials are independent.
Bernoulli trial: A trial that generates one of two
outcomes is called a Bernoulli trial.
• In n Bernoulli trials, we can
have 0 to n successes.
• If the outcome of an individual trial is random, then
the total number of successes in n trials is also
random.
Binomial random variable X: It represents the number of
successes in n Bernoulli trials, i.e.
X = sum of Bernoulli random variables
X = Y1 + Y2 + … + Yn
where,
Yi = Outcome on the ith trial
• A binomial random variable is completely described
by two parameters, i.e. n and p. It is stated as X ~ B(n,
p), read as “X has a binomial distribution with
parameters n and p”.
• Thus, a Bernoulli random variable is a binomial
random variable with n = 1, i.e. Y ~ B(1, p).
Probability function of the Bernoulli random variable Y:
• When the outcome is success, Y = 1
• When the outcome is failure, Y = 0
p(1) = P(Y = 1) = p = probability of success
p(0) = P(Y = 0) = 1 – p = probability of failure
For example, a one-period stock price move is a Bernoulli random variable
with probability of success (an up move) = p and
probability of failure (a down move) = 1 – p.
Suppose stock price today = S
• When the stock price increases, ending price = uS =
(1 + rate of return if the stock moves up) × S
• When the stock price decreases, ending price = dS =
(1 + rate of return if the stock moves down) × S
One-Period Stock Price as a Bernoulli Random Variable
Source: Example 2, Volume 1, Reading 10
The number of sequences in n trials that result in x up moves (or successes) and n – x down moves (or failures) is calculated as follows:
$\dfrac{n!}{(n - x)!\,x!}$
where n! = n factorial = n(n – 1)(n – 2) … 1 (and 0! = 1 by convention)
Probability function for a binomial random variable:
$p(x) = P(X = x) = \dfrac{n!}{(n - x)!\,x!}\,p^{x}(1 - p)^{n - x}$
for x = 0, 1, 2, …, n
where,
x = # successes out of n trials
n – x = # failures out of n trials
p = probability of success
1 – p = probability of failure
n = number of trials
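Probability of success (with n = 1):
$P(X = 1) = \dfrac{1!}{(1 - 1)!\,1!}\,p^{1}(1 - p)^{1 - 1} = p$
Probability of failure (with n = 1):
$P(X = 0) = \dfrac{1!}{(1 - 0)!\,0!}\,p^{0}(1 - p)^{1 - 0} = 1 - p$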
NOTE:
When the probability of success on a trial is 0.50, the
binomial distribution is symmetric; otherwise, it is
asymmetric or skewed.
Example:
If a coin is tossed 20 times, what is the probability of
getting exactly 10 heads?
p = 0.50
1 – p = 0.50
n = 20
x = 10
$p(10) = \dfrac{20!}{(20 - 10)!\,10!}\,(0.5)^{10}(0.5)^{10} = 0.176$
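The coin-toss calculation can be checked with a short sketch. The function name `binomial_pmf` is an illustrative choice, not something defined in the reading; it simply evaluates the binomial probability function stated above.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): n!/((n - x)! x!) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Exactly 10 heads in 20 tosses of a fair coin.
print(round(binomial_pmf(10, 20, 0.5), 3))  # 0.176

# With n = 1 the binomial reduces to the Bernoulli case: p(1) = p, p(0) = 1 - p.
print(binomial_pmf(1, 1, 0.3), binomial_pmf(0, 1, 0.3))
```

`math.comb(n, x)` computes n!/((n – x)! x!) directly, which avoids forming large factorials by hand.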
Stock price movement on three consecutive days:
• Each day is an independent trial.
• When the stock moves up, u = 1 + rate of return for
an up move.
• When the stock moves down, d = 1 + rate of return
for a down move.
A binomial tree is shown below. Each boxed value that
represents successive moves (a branch in the tree) is
called a node.
• In the figure below, a node reflects the potential value
for the stock price at a specified time.
• At each node, the transition probability for an up
move is p and for a down move is (1 – p).
• Each of the sequences uud, udu, and duu has
probability = p²(1 – p).
• Probability that the stock price after three moves equals uudS: P(S3 = uudS) = 3p²(1 – p)
e.g. Number of ways to get 2 up moves in three periods
= 3! / [(3 – 2)! 2!] = 3
3.1 Continuous Uniform Distribution
The continuous uniform distribution is the simplest continuous probability distribution. The uniform distribution has two main uses:
• It plays an important role in Monte Carlo simulation.
• It is an appropriate probability model to represent uncertainty in beliefs with equally likely outcomes.
Probability density function (pdf): It is used to assign probabilities to a continuous random variable and is
denoted as f(x). According to the pdf,
• The probability that the value of x lies between a and b
is the area under the graph of f(x) that lies between
a and b, i.e. the integral of f(x) over the range a to b.
For a continuous uniform random variable with lower limit a and upper limit b:
$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{elsewhere} \end{cases}$
• Over the range of values from a to b, density of the
distribution of a random variable x = 1 / (b – a)
• Elsewhere, density of the distribution of a random
variable x = 0
Finding probability: The probabilities can be estimated
as follows:
$F(x) = \begin{cases} 0 & \text{for } x \le a \\ \dfrac{x - a}{b - a} & \text{for } a < x < b \\ 1 & \text{for } x \ge b \end{cases}$
• F(x) = area under the curve graphing the pdf
• Under a continuous uniform distribution, probabilities
for values of a continuous random variable x are assigned across an interval of values of x; thus, the probability that x takes on a specific value = 0
• Since the probabilities at the endpoints a and b = 0 for any continuous random variable X, P(a ≤ X ≤ b)
= P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b)
For a continuous uniform random variable:
Mean = µ = (a + b) / 2
Variance = σ² = (b – a)² / 12
S.D. = σ = (b – a) / √12
• Note that S.D. is not a useful risk measure for a uniform distribution; rather, the S.D. is a good risk measure for a normal distribution.
Practice: Example 4, 5 & 6,
Volume 1, Reading 10
Example:
Suppose,
At the lower bound a = 100,000 km, total cost
= $40,000
At the upper bound b = 150,000 km, total cost
= $60,000
Outside the lower and upper bounds, total cost = $0
x = total anticipated annual travel costs in thousands of
dollars
• Over the range from 40 to 60, the distribution has density f(x) = 1 / (60 – 40) = 1/20
• Elsewhere, the distribution has density f(x) = 0
The probability that travel costs are between 40 and 60 =
Total area under the density function f(x) between 40
and 60 = height × length (or base) = (1/20) × (60 – 40) = 1
The probability that travel costs are between 40 and 50 =
Area under the curve between 40 and 50 = (1/20) × (50 – 40)
= 0.50
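The travel-cost example can be reproduced with a small sketch of the continuous uniform cdf. The function names `uniform_cdf` and `uniform_prob` are hypothetical helpers introduced for illustration; the bounds 40 and 60 come from the example above.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """F(x) for a continuous uniform random variable on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi) = F(hi) - F(lo); endpoints carry zero probability."""
    return uniform_cdf(hi, a, b) - uniform_cdf(lo, a, b)


a, b = 40.0, 60.0  # travel costs in thousands of dollars
print(uniform_prob(40, 60, a, b))  # whole range
print(uniform_prob(40, 50, a, b))  # lower half of the range

mean = (a + b) / 2
variance = (b - a) ** 2 / 12
print(mean, variance, variance ** 0.5)
```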
3.2 The Normal Distribution
• A normal distribution is a distribution that is symmetric
about the centre (mean) and is bell-shaped. Thus,
o Skewness = 0
o Kurtosis = 3 and Excess kurtosis = 0
• The range of possible outcomes of the normal
distribution is the entire real line, i.e. all real numbers
lying between -∞ and +∞.
• The tails of the normal distribution never touch the
horizontal axis and extend without limit to the left
and to the right; however, as we move away from
the center, the tails get closer and closer to the
horizontal axis. This characteristic is described by saying that the
distribution is asymptotic to the horizontal axis.
• The normal distribution is described by two
parameters, i.e. its mean (µ) and its variance (σ²) or
standard deviation (σ). It is stated as:
X ~ N(µ, σ²), read “X follows a normal distribution
with mean µ and variance σ²”.
• When the mean increases (decreases), the curve
shifts to the right (left).
• The smaller the S.D., the more the observations are concentrated around the mean.
• Since the normal distribution is symmetrical, it tends
to underestimate the probability of extreme returns.
Thus, it is not appropriate to use for options.
• The normal distribution can be used to model
returns; however, it is not appropriate for modeling asset prices.
• According to the central limit theorem, the sum and mean of a large number of independent random variables are approximately normally distributed.
• It is important to note that a linear combination of two or more normal random variables is also normally distributed.
A univariate normal distribution describes the probability
of a single random variable.
A multivariate normal distribution describes the
probabilities for a group of related random variables. For the returns on n securities, it is completely defined by three lists of parameters:
1. The list of the mean returns on the individual securities, i.e. total means = n
2. The list of the securities’ variances of return, i.e. total variances = n
3. The list of all the distinct pair-wise return correlations, i.e. total distinct correlations = n(n – 1) / 2
For example, a bivariate normal distribution (i.e. a distribution with 2 stocks) has:
• Means = 2
• Variances = 2
• Correlations = 2(2 – 1) / 2 = 1
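For a sample of size n from a normal distribution, the standard deviation of:
• Sample skewness ≈ √(6/n)
• Sample kurtosis ≈ √(24/n)
Normal density function: It is expressed as follows:
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\exp\!\left(\dfrac{-(x - \mu)^{2}}{2\sigma^{2}}\right)$ for −∞ < x < +∞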
• The probability that a normally distributed variable x takes on values in the range from a to b = Area under the curve between a and b
• The total area under the curve = 1
• The area under the curve to the left of centre = 0.5
and the area to the right of centre = 0.5
o Approximately 50% of all observations fall in the
interval µ ± (2/3)σ
o Approximately 68% of all observations fall in the
interval µ ± σ
o Approximately 95% of all observations fall in the
interval µ ± 2σ
o Approximately 99% of all observations fall in the
interval µ ± 3σ
• More-precise intervals are µ ± 1.96σ for 95% of the
observations and µ ± 2.58σ for 99% of the
observations.
Practice: Example 7,
Volume 1, Reading 10
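The interval coverage figures listed above can be checked numerically. This is a minimal sketch that uses Python's standard-library `statistics.NormalDist`; the helper name `coverage` is an illustrative assumption.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def coverage(k: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (2 / 3, 1.0, 2.0, 3.0, 1.96, 2.58):
    print(f"mu ± {k:.2f} sigma covers {coverage(k):.4f} of observations")
```

Because standardizing removes µ and σ, the same coverage applies to every normal distribution, which is why only the multiplier k is needed.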
Standard normal distribution or unit normal distribution: It
is a normal distribution with:
• The mean (µ) = 0
• Standard deviation (σ) = 1
When X is normally distributed, it can be standardized
using the following formula:
$Z = \dfrac{X - \mu}{\sigma}$
Z represents the number of standard deviations
away from the mean the point x lies.
Example:
Suppose a normal random variable X takes the value 9.5, with µ = 5
and σ = 1.5.
Z = (9.5 – 5) / 1.5 = 3
Example:
Finding the probability P(Z < 2.67): It is found by first
finding 2.6 in the left-hand column of the standard normal table, and then moving
across the row to the column under 0.07. Thus,
The area to the left of z = 2.67 is 0.9962.
• In order to find the area to the right of z, we use the
standard normal table to find the area
that corresponds to the z-value and then subtract that
area from 1.
• Probability to the right of x = 1.0 – N(x)
• Since the normal distribution is symmetric around its
mean, the area and the probability to the right of x =
area and the probability to the left of –x, i.e. N(–x)
• The probability to the right of –x, i.e. P(Z ≥ –x) = N(x)
Example:
• Finding P (Z > 1.23):
• Finding P (-0.75 < Z < 1.23):
• Finding P (Z< -2.33):
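The three probabilities listed above can be evaluated without a printed table. The sketch below is one way to do it, assuming Python's standard-library `statistics.NormalDist` in place of the standard normal table; `N` is just a local alias for the standard normal cdf.

```python
from statistics import NormalDist

N = NormalDist().cdf  # N(x) = P(Z <= x) for the standard normal

# P(Z > 1.23): area to the right = 1 - N(1.23)
p_right = 1.0 - N(1.23)

# P(-0.75 < Z < 1.23): difference between two left-tail areas
p_between = N(1.23) - N(-0.75)

# P(Z < -2.33): left-tail area, equal to 1 - N(2.33) by symmetry
p_left = N(-2.33)

print(round(p_right, 4), round(p_between, 4), round(p_left, 4))
```

Values read from a four-decimal table may differ from these in the last digit because table entries are rounded before they are subtracted.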
Example:
The average (µ) on a corporate finance test was 78 with
a standard deviation (σ) of 8. If the test scores are normally distributed, find the probability that a student receives a test score greater than 85.
Z = (85 – 78) / 8 = 0.875 ≈ 0.88
P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106
= 0.1894
NOTE:
• P (Z ≤ 1.282) = 0.90 = 90% → It implies that 90th percentile point = 1.282 and % of values in the right tail = 10%
• P (Z ≤ 1.65) = 0.95 = 95% → It implies that the 95th percentile point = 1.65 and % of values in the right tail = 5%
• P (Z ≤ 2.327) = 0.99 = 99% → It implies that the 99th percentile point = 2.327 and % of values in the right tail = 1%
3.3 Applications of the Normal Distribution
• Mean-variance analysis is based on the assumption that returns are normally distributed.
• Safety-first rule: The safety-first rule focuses on shortfall risk, i.e. the risk that portfolio value will fall below
some minimum acceptable level over some specified time horizon. For example, the risk that the assets in a defined benefit plan will fall below plan liabilities.
According to Roy's safety-first criterion, the optimal portfolio is the one that minimizes the probability that
portfolio return (Rp) falls below the threshold level (RL). When returns are normally distributed, the safety-first
optimal portfolio is the portfolio that maximizes the
safety-first ratio (SFRatio):
$\text{SFRatio} = \dfrac{E(R_p) - R_L}{\sigma_p}$
• Investors prefer the portfolio with the highest SFRatio.
• Probability that the portfolio return < threshold level =
P(Rp < RL) = N(–SFRatio)
• The optimal portfolio has the lowest P(Rp < RL).
Example:
• Portfolio 1 expected return = 12% and S.D. = 15%
• Portfolio 2 expected return = 14% and S.D. = 16%
• Threshold level = 2%
• Assume that returns are normally distributed.
SFRatio of portfolio 1 = (12 – 2) / 15 = 0.667
SFRatio of portfolio 2 = (14 – 2) / 16 = 0.75
• Since SFRatio of portfolio 2 > SFRatio of portfolio 1, the superior portfolio is Portfolio 2.
Probability that the return on Portfolio 2 < 2% = N(–0.75) = 1 – N(0.75)
= 1 – 0.7734*
≈ 23%
*Refer to the standard normal table
Practice: Example 8, Volume 1, Reading 10
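The safety-first comparison above can be sketched in a few lines. The function names `sf_ratio` and `shortfall_probability` and the dictionary layout are illustrative assumptions; the inputs are the two portfolios and the 2% threshold from the example, and returns are assumed normal as stated there.

```python
from statistics import NormalDist


def sf_ratio(expected_return: float, threshold: float, sd: float) -> float:
    """SFRatio = [E(Rp) - RL] / sigma_p."""
    return (expected_return - threshold) / sd


def shortfall_probability(expected_return: float, threshold: float, sd: float) -> float:
    """P(Rp < RL) = N(-SFRatio), assuming normally distributed returns."""
    return NormalDist().cdf(-sf_ratio(expected_return, threshold, sd))


threshold = 2.0  # percent
portfolios = {"Portfolio 1": (12.0, 15.0), "Portfolio 2": (14.0, 16.0)}

for name, (mu, sd) in portfolios.items():
    print(name, round(sf_ratio(mu, threshold, sd), 3),
          round(shortfall_probability(mu, threshold, sd), 4))

# The safety-first optimal portfolio has the highest SFRatio.
best = max(portfolios, key=lambda n: sf_ratio(portfolios[n][0], threshold, portfolios[n][1]))
print("Preferred:", best)
```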
Sharpe Ratio:
Sharpe ratio = [E (Rp) – Rf] / σp
• The portfolio with the highest Sharpe ratio is the one
that minimizes the probability that portfolio return will
be less than the risk-free rate (assuming returns are
normally distributed)
Managing financial risk: Two important measures used
to manage financial risk include:
• Value at Risk (VAR): It is a money measure of the minimum
losses expected over a specified
time period (e.g. a day, quarter, year etc.) at a
specified level of probability (e.g. 5%, 1%). VAR
estimated using the variance-covariance or analytical
method assumes that returns are normally
distributed.
Example:
A one-week VAR of $10 million for a portfolio with 5%
probability implies that there is a 5% probability that the portfolio will lose
$10 million or more in a single week.
• Stress testing/scenario analysis: It involves the use of a
set of techniques to estimate losses under extremely
unfavorable combinations of events or scenarios.
3.4 The Lognormal Distribution
A random variable (i.e. Y) whose natural logarithm (i.e. ln
Y) has a normal distribution is said to have a lognormal
distribution.
Reason:
Since negative values do not have logarithms, Y is
always > 0; the distribution is bounded below by 0 and is positively skewed
(unlike the normal distribution, which is symmetric and bell-shaped).
• Like the normal distribution, it is completely described by
two parameters, i.e. the mean (µ) and variance (σ²) of ln Y,
given that Y is lognormal.
Mean (µL) of a lognormal random variable = exp(µ + 0.50σ²)
Variance (σL²) of a lognormal random variable
= exp(2µ + σ²) × [exp(σ²) – 1]
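The lognormal mean and variance formulas above can be sanity-checked against simulated draws. In this sketch the parameter values (µ = 0.05, σ = 0.20), the seed, and the function names are hypothetical choices made only for illustration.

```python
import math
import random


def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of Y when ln Y ~ N(mu, sigma^2): exp(mu + 0.5 * sigma^2)."""
    return math.exp(mu + 0.5 * sigma**2)


def lognormal_variance(mu: float, sigma: float) -> float:
    """Variance of Y: exp(2*mu + sigma^2) * (exp(sigma^2) - 1)."""
    return math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1.0)


mu, sigma = 0.05, 0.20          # hypothetical parameters of ln Y
rng = random.Random(42)         # fixed seed so the run is repeatable
draws = [math.exp(rng.gauss(mu, sigma)) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)

print(lognormal_mean(mu, sigma), sample_mean)
print(min(draws) > 0)           # a lognormal variable is always positive
```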
Strengths of lognormal distribution:
• It is more appropriate
(relative to the normal distribution) for modeling asset prices because asset prices cannot be negative.
• It is used in the Black-Scholes-Merton model, which assumes that the price of the asset underlying the option
is lognormally distributed.
It is important to note that when a stock's continuously
compounded return is normally distributed, the future stock price is necessarily lognormally distributed:
ST = S0 exp(r0,T)
where,
exp = e
r0,T = Continuously compounded return from 0 to T
• Since ST is proportional to the exponential of a normal random variable (equivalently, ln ST is normally distributed) → ST is lognormal
Price relative = Ending price / Beginning price =
St+1 / St = 1 + Rt,t+1
where,
Rt,t+1 = holding period return on the stock from t to t + 1
Continuously compounded return associated with a holding period from t to t + 1:
rt,t+1 = ln(1 + holding period return)
Or
rt,t+1 = ln(price relative) = ln(St+1 / St) = ln(1 + Rt,t+1)
NOTE:
The continuously compounded return is smaller than the associated holding period return (for any nonzero return).
Continuously compounded return associated with a holding period from 0 to T:
r0,T = ln(ST / S0)
Or
$r_{0,T} = r_{T-1,T} + r_{T-2,T-1} + \dots + r_{0,1}$
where,
rT–1,T = One-period continuously compounded return from T – 1 to T
Practice: Example 9,
Volume 1, Reading 10
Example:
Suppose one-week holding period return = 0.04
Equivalent continuously compounded return =
one-week continuously compounded return = ln(1.04)
= 0.039221
• The bounds within which a given percentage of
the observations of a normally distributed random
variable are expected to lie are symmetric around
the mean.
• The bounds within which a given percentage of
the observations of a lognormally distributed
random variable are expected to lie are not
symmetric around the mean.
In many investment applications, it is assumed that
returns are independently and identically distributed
(IID).
• Returns being independently distributed implies that
investors cannot forecast future returns using past
returns (i.e. weak-form market efficiency).
• Returns being identically distributed implies that the
mean and variance of return do not change from
period to period (i.e. stationarity).
When one-period continuously compounded returns (i.e.
r0,1) are IID random variables with mean µ and variance
σ², then
$E(r_{0,T}) = E(r_{T-1,T}) + E(r_{T-2,T-1}) + \dots + E(r_{0,1}) = \mu T$
and
$\sigma^{2}(r_{0,T}) = \sigma^{2} T$
S.D. = σ(r0,T) = σ√T
• If the one-period continuously
compounded returns are normally distributed, then
the T holding period continuously compounded
return (i.e. r0,T) is also normally distributed with mean
µT and variance σ²T.
• According to the central limit theorem, the sum of
one-period continuously compounded returns is
approximately normal even if they are not normally
distributed.
Volatility:
Volatility reflects the deviation of the continuously compounded returns on the underlying asset around their mean. It is estimated using a historical series of
continuously compounded daily returns.
Annualized volatility = sample S.D. of one-period
continuously compounded returns
× √T
where,
T = Number of trading days in a year = 250
Example:
Michelin Daily Closing Prices, 2003 (Closing Price in €, in date order): 25.20, 25.21, 25.52, 26.10, 26.14
Since rt,t+1 = ln(St+1 / St) = ln(1 + Rt,t+1):
• ln(25.21 / 25.20) = 0.000397
• ln(25.52 / 25.21) = 0.012222
• ln(26.10 / 25.52) = 0.022473
• ln(26.14 / 26.10) = 0.001531
Sum = 0.036623
Mean = 0.009156
Variance = 0.000107
S.D. = 0.010354
Annualized volatility = 0.010354 × √250 = 0.163711
Expected continuously compounded annual return
= Sample mean × T
= 0.009156 (250)
= 2.289
Source: Example 10, Volume 1, Reading 10.
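The volatility calculation in this example can be reproduced with a short sketch. It assumes the five closing prices implied by the logarithms above and uses the sample standard deviation (n – 1 divisor) from Python's standard library; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Closing prices (EUR) implied by the price relatives in the example.
prices = [25.20, 25.21, 25.52, 26.10, 26.14]

# One-period continuously compounded returns: ln(S_{t+1} / S_t).
returns = [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]

daily_mean = mean(returns)
daily_sd = stdev(returns)            # sample S.D. (divides by n - 1)

T = 250                              # trading days in a year
annualized_vol = daily_sd * math.sqrt(T)
expected_annual_return = daily_mean * T

print(round(daily_mean, 6), round(daily_sd, 6))
print(round(annualized_vol, 4), round(expected_annual_return, 3))
```

Because continuously compounded returns are additive, the sum of the four daily returns equals ln(26.14 / 25.20), which is a quick consistency check on the list.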
probability distribution It can be used in conjunction
Uses:
•It can be used in valuing complex securities e.g
• It can be used to estimate VAR: using Monte Carlo simulation, the portfolio's profit and loss performance over a specified time horizon is simulated to generate a frequency distribution of changes in portfolio value; the point that marks the end of the least favorable 5% of simulated changes is the 95% VAR.
• It can be used to examine a model's sensitivity to changes in the assumptions.
Advantages: Monte Carlo simulation can be used to
value complex securities, e.g. European-style
options.
Drawbacks: Unlike analytical methods (e.g. the
Black-Scholes-Merton option pricing model),
Monte Carlo simulation provides only
statistical estimates, not exact results. In
addition, unlike the Black-Scholes-Merton model,
a Monte Carlo simulation model cannot be
used to quickly measure the sensitivity of
call option value to changes in current
stock price and other variables.
Steps of Monte Carlo simulation technique to examine a
model's sensitivity to changes in assumptions:
1) Specify the underlying variable or variables, e.g. stock
price for an equity call option.
2) Specify the beginning values of the underlying
variables, e.g. stock price.
• CiT = Value of the option at maturity T. The subscript i
reflects a value resulting from the ith simulation trial.
3) Specify a time period.
Time increment = ∆t
= Calendar time / Number of sub-periods (K)
4) Specify the regression model for changes in stock
price
where,
Zk = Risk factor in the simulation. It is a standard normal
random variable.
5) K random variables are drawn for each risk factor
using a computer program or spreadsheet function.
6) The underlying variables are then estimated by
substituting the values of the random observations into the
model specified in Step 4.
7) The value of the call option at maturity, i.e. CiT, is
calculated, and this value is then discounted back to
time period 0 to get Ci0.
8) This process is repeated until a specified number of trials, i, is completed (e.g. tens of thousands of trials).
NOTE:
For obtaining each extra digit of accuracy in results, the appropriate increase in the number of trials depends on the problem. For example, in option valuation, tens of thousands of trials may be appropriate. Generally, the number of trials should be increased by a factor of 100 for each extra digit of accuracy.
9) Finally, the mean value and S.D. for the simulation are calculated.
Mean value = Average value of the option over all trials
in the simulation
• The mean value will be the Monte Carlo estimate of the value of the call option.
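The steps above can be sketched for a European call as follows. This is only an illustration under stated assumptions: the price-change model used here (∆S = r × S × ∆t + σ × S × √∆t × Zk, with drift equal to the risk-free rate r) is an assumed specification, since the reading's own Step 4 model is not reproduced in these notes, and the function name, parameters, and input values are hypothetical.

```python
import math
import random


def mc_european_call(s0, strike, r, sigma, T, K=20, trials=5_000, seed=0):
    """Return (mean, S.D.) of simulated discounted call values C_i0.

    ASSUMED model (not from the reading): dS = r*S*dt + sigma*S*sqrt(dt)*Z_k,
    where Z_k is a standard normal risk factor.
    """
    rng = random.Random(seed)
    dt = T / K                                   # Step 3: time increment
    values = []
    for _ in range(trials):                      # Step 8: repeat for each trial i
        s = s0                                   # Step 2: beginning value
        for _ in range(K):                       # Steps 5-6: draw Z_k, update price
            z = rng.gauss(0.0, 1.0)
            s += r * s * dt + sigma * s * math.sqrt(dt) * z
        payoff = max(s - strike, 0.0)            # Step 7: C_iT
        values.append(math.exp(-r * T) * payoff)  # Step 7: discount to get C_i0
    mean = sum(values) / trials                  # Step 9: mean value
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (trials - 1))
    return mean, sd


estimate, sd = mc_european_call(s0=100.0, strike=100.0, r=0.05, sigma=0.20, T=1.0)
print(estimate, sd)
```

The result is a statistical estimate, so it changes with the seed and the number of trials; raising `trials` narrows the sampling error at the cost of run time.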
Random number generator: An algorithm that generates uniformly distributed random numbers between 0 and 1
is referred to as a random number generator. It is important to note that random observations from any distribution can be generated using a uniform random variable.
Steps to generate random observations on variable X:
1) Generate a uniform random number (r) between
0 and 1 using the random number generator.
2) Evaluate the inverse of the cdf of X at r to obtain a random
observation on variable X.
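Generating observations from a uniform random number can be sketched with the inverse-cdf approach. The helper names below are hypothetical, and the two target distributions (a continuous uniform on [40, 60] and a normal with µ = 5, σ = 1.5) are borrowed from earlier examples purely for illustration.

```python
import random
from statistics import NormalDist


def uniform_01(rng: random.Random) -> float:
    """Draw r strictly between 0 and 1 (redraw the rare exact 0.0)."""
    r = rng.random()
    while r == 0.0:
        r = rng.random()
    return r


def draw_uniform(rng: random.Random, a: float, b: float) -> float:
    """Inverse cdf of the continuous uniform on [a, b]: x = a + (b - a) * r."""
    return a + (b - a) * uniform_01(rng)


def draw_normal(rng: random.Random, mu: float, sigma: float) -> float:
    """Inverse cdf of N(mu, sigma^2) evaluated at a uniform draw r."""
    return NormalDist(mu, sigma).inv_cdf(uniform_01(rng))


rng = random.Random(7)  # fixed seed so the run is repeatable
print(draw_uniform(rng, 40.0, 60.0))
print(draw_normal(rng, 5.0, 1.5))
```

The guard against r = 0 is there because `NormalDist.inv_cdf` is only defined for probabilities strictly between 0 and 1.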
Historical simulation or back simulation: Under a historical simulation, samples are generated using a historical record of underlying variables to simulate a process. It is based on the assumption that historical data can be used to predict the future.
Drawback of historical simulation: Unlike Monte Carlo
simulation, historical simulation cannot be used to perform “what if” analyses.
Practice: Example 11 & 12, Volume 1, Reading 10 & End of Chapter Practice Problems for Reading 10