Statistics in Geophysics: Probability Theory II
Steffen Unkel
Department of Statistics, Ludwig-Maximilians-University Munich, Germany
Random variables
A random variable is a function from a sample space Ω into the real numbers. In many cases it is easier to deal with such a summary variable than with the original probability structure.
Example: In an opinion poll, we might decide to ask 50 people whether they agree or disagree with a certain issue. The sample space for this experiment has $2^{50}$ elements. Define a variable X = number of 1s recorded out of 50. The sample space for X is the set of integers {0, 1, 2, ..., 50}.
Random variables
We define an induced probability function P_X on the range of X as follows:
$$P_X(X = x_i) = P(\{\omega_j \in \Omega : X(\omega_j) = x_i\}), \quad i = 1, \ldots, m,$$
where the sample space is Ω = {ω_1, ..., ω_n} and X takes the values x_1, ..., x_m.
We will simply write P(X = x_i) rather than P_X(X = x_i).
Example: Tossing a fair coin three times
X : number of heads obtained in the three tosses
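As an illustration (not part of the original slides), a minimal Python sketch that builds the induced pmf of X from the eight equally likely outcomes:

```python
from itertools import product

# Sample space for three tosses of a fair coin: 8 equally likely outcomes.
omega = list(product("HT", repeat=3))

# Induced pmf of X = number of heads: P(X = x) = #{outcomes with x heads} / 8.
pmf = {x: 0.0 for x in range(4)}
for outcome in omega:
    pmf[outcome.count("H")] += 1 / len(omega)

print(pmf)   # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```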
Properties of a cdf
The function F_X(x) is a cdf if and only if the following three conditions hold:
1 lim_{x→−∞} F_X(x) = 0 and lim_{x→∞} F_X(x) = 1
2 F_X(x) is a monotone, non-decreasing function of x
3 F_X(x) is continuous from the right; that is, lim_{x↓x_0} F_X(x) = F_X(x_0) for every x_0
Density and mass functions
Definition:
A random variable X is continuous if F_X(x) is a continuous function of x. A random variable X is discrete if F_X(x) is a step function of x.
Associated with a random variable X and its cdf F_X(x) is another function, called either the probability density function (pdf) or the probability mass function (pmf).
Probability mass function
Definition:
The probability mass function (pmf) of a discrete random variable X is given by
$$f_X(x) = P(X = x) \quad \text{for all } x.$$
Hence, for positive integers a and b with a ≤ b, we have
$$P(a \le X \le b) = \sum_{k=a}^{b} f_X(k).$$
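A small sketch, using a made-up pmf (a fair die), of obtaining P(a ≤ X ≤ b) by summing pmf values:

```python
# Hypothetical pmf of a discrete random variable X on {1, ..., 6} (a fair die).
f = {x: 1 / 6 for x in range(1, 7)}

a, b = 2, 4
prob = sum(f[k] for k in range(a, b + 1))   # P(a <= X <= b)
print(prob)                                 # 3/6 = 0.5
```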
Probability density function
A pmf gives us “point probabilities”, and we can sum over the values of the pmf to get the cdf.
The analogous procedure in the continuous case is to substitute integrals for sums: X has pdf f_X(x) if
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\, \mathrm{d}t \quad \text{for all } x.$$
Probability density function
Since P(X = x) = 0 for every single value x when X is continuous, the endpoints of an interval do not matter:
$$P(a < X < b) = P(a < X \le b) = P(a \le X < b) = P(a \le X \le b).$$
Figure: Hypothetical pdf for a non-negative random variable X.
Mean of a random variable
Definition:
The expected value (or mean) of a random variable X, denoted by E(X) (or µ_X), is (provided that the sum or integral exists)
$$E(X) = \sum_{x} x\, f_X(x) \;\text{ if } X \text{ is discrete}, \qquad E(X) = \int_{-\infty}^{\infty} x\, f_X(x)\, \mathrm{d}x \;\text{ if } X \text{ is continuous}.$$
Expected value of a function of a random variable
Let g(x) be a function of a random variable X. Then (provided that the sum or integral exists)
$$E(g(X)) = \sum_{x} g(x)\, f_X(x) \;\text{ if } X \text{ is discrete}, \qquad E(g(X)) = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, \mathrm{d}x \;\text{ if } X \text{ is continuous}.$$
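A short sketch that evaluates E(X) and E(g(X)) with g(x) = x² by direct summation, reusing the coin-tossing pmf from the earlier example:

```python
# pmf of X = number of heads in three fair coin tosses (from the earlier example).
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

mean = sum(x * p for x, p in pmf.items())        # E(X)
mean_sq = sum(x**2 * p for x, p in pmf.items())  # E(g(X)) with g(x) = x^2

print(mean)      # 1.5
print(mean_sq)   # 3.0
```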
Properties of expected values
If X is any random variable and c, c_1, c_2 are constants, then (as long as the expectations exist):
1 E(c) = c
2 E(c g(X)) = c E(g(X))
3 E(c_1 g_1(X) + c_2 g_2(X)) = c_1 E(g_1(X)) + c_2 E(g_2(X))
4 E(g_1(X)) ≤ E(g_2(X)) if g_1(x) ≤ g_2(x) for all x
Variance of a random variable
The variance of a random variable X, denoted by Var(X) (or σ_X²), is (provided the expectation exists)
$$\mathrm{Var}(X) = E\big[(X - \mu_X)^2\big] = E(X^2) - \mu_X^2.$$
The standard deviation of X is σ_X, the square root of Var(X).
Linear transformations of random variables
Assume X is a random variable with mean µ_X and variance σ_X². If Y = aX + b, where a and b are any constants, then
$$\mu_Y = a\mu_X + b, \qquad \sigma_Y^2 = a^2 \sigma_X^2, \qquad \sigma_Y = |a|\,\sigma_X.$$
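A quick simulation sketch (the normal distribution and the constants a, b are arbitrary choices) of how a linear transformation acts on mean and standard deviation:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=10.0, scale=2.0, size=100_000)   # X with mu_X = 10, sigma_X = 2

a, b = -3.0, 5.0
y = a * x + b

print(y.mean(), a * x.mean() + b)   # both close to a*mu_X + b = -25
print(y.std(), abs(a) * x.std())    # both close to |a|*sigma_X = 6
```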
Statistical distributions are used to model populations.
We usually deal with a family of distributions, which is indexed by one or more parameters.
Here, we catalog some of the frequently occurring probability distributions and point to their usage.
This presentation is by no means comprehensive in its coverage of statistical distributions!
Binomial distribution
Let X count the number of successes observed in a sequence of n identical and independent Bernoulli trials, each with success probability p; that is, X ∼ B(n, p) with pmf
$$f_X(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n.$$
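A hedged sketch using scipy.stats.binom; the parameter values n = 10 and p = 0.3 are arbitrary:

```python
from scipy.stats import binom

n, p = 10, 0.3                    # illustrative parameter values
X = binom(n, p)

print(X.pmf(3))                   # P(X = 3)
print(X.cdf(3))                   # P(X <= 3): the sum of the pmf over x = 0, ..., 3
print(X.mean(), n * p)            # E(X) = np
print(X.var(), n * p * (1 - p))   # Var(X) = np(1 - p)
```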
If X follows a Poisson distribution with mean λ, i.e. X ∼ P(λ), then E(X) = λ and Var(X) = λ.
Example: Annual Hurricane Landfalls on the U.S. coastline
Figure: Histogram of annual numbers of U.S. landfalling hurricanes for 1899-1998 (dashed), and fitted Poisson distribution with λ = 1.7 (solid).
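A small sketch with scipy.stats.poisson, reusing the fitted rate λ = 1.7 from the figure (the count range 0-6 is only for display, not the original data):

```python
import numpy as np
from scipy.stats import poisson

lam = 1.7                       # fitted Poisson rate from the figure
k = np.arange(0, 7)             # annual landfall counts 0, ..., 6

print(poisson.pmf(k, lam))      # fitted probability of each annual count
print(poisson.mean(lam), poisson.var(lam))   # both equal lambda
```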
Approximation of the binomial distribution
For large n and small p, the B(n, p) distribution can be approximated by a Poisson distribution with λ = np; for large n with np(1 − p) sufficiently large, it can be approximated by a normal distribution with mean np and variance np(1 − p).
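A quick numerical illustration of the Poisson approximation; n = 1000 and p = 0.002 are chosen for illustration, with λ = np:

```python
import numpy as np
from scipy.stats import binom, poisson

n, p = 1000, 0.002                   # large n, small p (illustrative values)
lam = n * p                          # matching Poisson rate: lambda = np = 2

k = np.arange(0, 15)
diff = np.abs(binom.pmf(k, n, p) - poisson.pmf(k, lam))
print(diff.max())                    # small: the two pmfs nearly coincide
```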
Exponential distribution
The exponential distribution can be used to model lifetimes.
If X is a continuous random variable with non-negative range, which has pdf
$$f_X(x) = \lambda \exp(-\lambda x) \quad \text{for } x \ge 0,$$
where λ > 0, then X is defined to have an exponential distribution, denoted by X ∼ E(λ).
If X ∼ E(λ), then E(X) = 1/λ and Var(X) = 1/λ².
If the number of events in a unit time interval follows a Poisson distribution with mean λ, then the waiting time until the next event follows an exponential distribution with parameter λ.
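A simulation sketch of this connection (the rate λ = 1.7 is reused from the hurricane example purely for illustration): exponential waiting times generate Poisson-distributed counts per unit interval.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, horizon = 1.7, 100_000          # rate and length of the simulated period

# Exponential(lambda) inter-arrival times; their cumulative sums are the event times.
waits = rng.exponential(scale=1 / lam, size=int(2 * lam * horizon))
event_times = np.cumsum(waits)
event_times = event_times[event_times < horizon]

# Counts per unit time interval should behave like Poisson(lambda) draws.
counts = np.histogram(event_times, bins=np.arange(0, horizon + 1))[0]
print(counts.mean(), counts.var())   # both close to lambda = 1.7
```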
If X is a normal random variable, i.e. X ∼ N(µ, σ²), then E(X) = µ and Var(X) = σ².
Standard normal distribution
Normal distribution having µ = 0 and σ = 1:
$\Phi(z) = \int_{-\infty}^{z} \varphi(u)\, \mathrm{d}u = P(Z \le z)$ is the conventional notation for its cdf, where $\varphi$ denotes its pdf.
Any Gaussian random variable can be standardized by subtracting its mean and dividing by its standard deviation:
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1).$$
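A brief sketch (µ = 5, σ = 2 and x = 8 are arbitrary values) showing that P(X ≤ x) = Φ((x − µ)/σ):

```python
from scipy.stats import norm

mu, sigma, x = 5.0, 2.0, 8.0    # illustrative values
z = (x - mu) / sigma            # standardized value

print(norm.cdf(x, loc=mu, scale=sigma))  # P(X <= x)
print(norm.cdf(z))                       # Phi(z): the same probability
```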
Distributions of functions of a random variable
Let X be a continuous random variable with density f_X(x) and let Y = g(X) be a strictly monotone and differentiable function of X.
The density f_Y(y) of Y is given by
$$f_Y(y) = f_X\big(g^{-1}(y)\big) \cdot \left| \frac{\mathrm{d} g^{-1}(y)}{\mathrm{d} y} \right|.$$
Example: If X ∼ N(µ, σ²) and Y = exp(X), then Y has a log-normal distribution with density
$$f_Y(y) = \frac{1}{y \sigma \sqrt{2\pi}} \exp\!\left( -\frac{(\ln y - \mu)^2}{2\sigma^2} \right)$$
for y > 0 and zero elsewhere.
If Y is a log-normal random variable, then E(Y) = exp(µ + σ²/2) and Var(Y) = (exp(σ²) − 1) exp(2µ + σ²).
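A sketch that applies the change-of-variables formula with g(x) = exp(x) and checks it against scipy's log-normal density; the parameters µ = 0.5 and σ = 0.8 are arbitrary:

```python
import numpy as np
from scipy.stats import norm, lognorm

mu, sigma = 0.5, 0.8                 # illustrative parameters of X ~ N(mu, sigma^2)
y = np.linspace(0.1, 5.0, 50)

# Change-of-variables formula with g(x) = exp(x): g^{-1}(y) = log(y), |d g^{-1}/dy| = 1/y.
f_y = norm.pdf(np.log(y), loc=mu, scale=sigma) / y

# Reference: scipy's log-normal density with the same parameters.
f_ref = lognorm.pdf(y, s=sigma, scale=np.exp(mu))
print(np.allclose(f_y, f_ref))       # True
```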
Vector of random variables
We need to know how to describe and use probability models that deal with more than one random variable at a time (called multivariate models).
The simplest case is a vector of two random variables.
A bivariate random vector (X, Y) associates an ordered pair of real numbers, that is, a point (x, y), with each outcome ω in the sample space Ω.
Joint and marginal distributions
The two cases we will discuss are those in which (X, Y) is discrete or in which (X, Y) is continuous.
When (X, Y) is discrete, the joint pmf is
$$f_{X,Y}(x, y) = P(X = x, Y = y),$$
where f_{X,Y}(x, y) ≥ 0 for all (x, y) and the values must sum to 1 if we add over all possible observed vectors.
The marginal pmfs of X and Y, f_X(x) = P(X = x) and f_Y(y) = P(Y = y), are given by
$$f_X(x) = \sum_{y} f_{X,Y}(x, y) \quad \text{and} \quad f_Y(y) = \sum_{x} f_{X,Y}(x, y).$$
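A small sketch with a made-up joint pmf stored as a table; the marginals are the row and column sums:

```python
import numpy as np

# Hypothetical joint pmf of (X, Y): rows index x in {0, 1}, columns index y in {0, 1, 2}.
f_xy = np.array([[0.10, 0.20, 0.10],
                 [0.20, 0.25, 0.15]])

f_x = f_xy.sum(axis=1)   # marginal pmf of X (sum over y): [0.4, 0.6]
f_y = f_xy.sum(axis=0)   # marginal pmf of Y (sum over x): [0.3, 0.45, 0.25]
print(f_x, f_y, f_xy.sum())   # the joint pmf sums to 1
```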
Joint and marginal distributions
The joint cdf of two random variables, F_{X,Y}(x, y), is
$$F_{X,Y}(x, y) = P(X \le x, Y \le y).$$
Example of a joint density
Figure: Density of the bivariate standard normal distribution.
The conditional pmf or pdf of X given Y = y is
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$
The conditional pmf and pdf are defined for any y such that f_Y(y) > 0.
Definition:
Let (X, Y) be a bivariate random vector with joint pdf or pmf f_{X,Y}(x, y) and marginal pdfs or pmfs f_X(x) and f_Y(y). Then X and Y are called independent if, for every x, y ∈ ℝ,
$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y).$$
If X and Y are independent, then
$$f_{X|Y}(x \mid y) = f_X(x) \quad \text{and} \quad f_{Y|X}(y \mid x) = f_Y(y).$$
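A sketch of the factorization criterion on two made-up joint pmfs: one built as the outer product of its marginals (independent), one with the same marginals that does not factorize:

```python
import numpy as np

f_x = np.array([0.4, 0.6])
f_y = np.array([0.3, 0.45, 0.25])

# Independent case: the joint pmf is the product of its marginals.
f_indep = np.outer(f_x, f_y)
print(np.allclose(f_indep, np.outer(f_indep.sum(axis=1), f_indep.sum(axis=0))))  # True

# Dependent case: same marginals, but the joint pmf does not factorize.
f_dep = np.array([[0.10, 0.20, 0.10],
                  [0.20, 0.25, 0.15]])
print(np.allclose(f_dep, np.outer(f_dep.sum(axis=1), f_dep.sum(axis=0))))        # False
```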
Covariance and correlation
The covariance of X and Y is the number defined by
$$\mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E(XY) - \mu_X \mu_Y.$$
The correlation of X and Y is the number defined by
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$
The value ρ_XY is also called the correlation coefficient. X and Y are called uncorrelated if ρ_XY = 0; they are positively (negatively) correlated if ρ_XY > 0 (ρ_XY < 0).
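A short sketch estimating Cov(X, Y) and ρ_XY from simulated data; the linear relationship Y = 2X + noise is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)
y = 2.0 * x + rng.normal(size=50_000)   # Y increases with X on average

cov_xy = np.cov(x, y)[0, 1]             # sample covariance
rho_xy = np.corrcoef(x, y)[0, 1]        # sample correlation coefficient
print(cov_xy)   # close to 2
print(rho_xy)   # close to 2 / sqrt(5), about 0.894
```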
Properties of covariance and correlation
The following statements hold:
1 If X and Y are independent random variables, then Cov(X, Y) = 0 and ρ_XY = 0.
2 If X and Y are any two random variables and a and b are any two constants, then
Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab Cov(X, Y).
3 If X and Y are independent random variables and a and b are any two constants, then
Var(aX + bY) = a²Var(X) + b²Var(Y).
A quick simulation check of statement 2 is sketched below.
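A simulation check of statement 2 (the distributions, the dependence between X and Y, and the constants a, b are all arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)   # X and Y are deliberately correlated
a, b = 2.0, -1.0

lhs = np.var(a * x + b * y)
# Matching ddof=0 so the identity holds exactly for the sample moments.
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * np.cov(x, y, ddof=0)[0, 1]
print(lhs, rhs)   # the two values agree (up to floating-point error)
```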
Example: The bivariate standard normal distribution
The bivariate standard normal distribution with parameter ρ (|ρ| < 1) has the joint density
$$f_{X,Y}(x, y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\!\left( -\frac{x^2 - 2\rho x y + y^2}{2(1-\rho^2)} \right).$$
The correlation of X and Y is ρ.
In this case, uncorrelatedness implies independence: with ρ = 0 the joint density factors into φ(x)φ(y).
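A sketch comparing the closed-form density above with scipy's multivariate normal; ρ = 0.6 and the evaluation point are arbitrary:

```python
import numpy as np
from scipy.stats import multivariate_normal

rho = 0.6                      # illustrative value of the parameter
x, y = 0.8, -0.3               # an arbitrary evaluation point

closed_form = (1.0 / (2 * np.pi * np.sqrt(1 - rho**2))
               * np.exp(-(x**2 - 2 * rho * x * y + y**2) / (2 * (1 - rho**2))))
reference = multivariate_normal.pdf([x, y], mean=[0, 0], cov=[[1, rho], [rho, 1]])
print(np.isclose(closed_form, reference))  # True
```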
Example: The bivariate standard normal distribution
Figure: Contour plot of the density of the bivariate standard normal distribution.
Sums of random variables
If X and Y are independent random variables with pmfs or pdfs f_X(x) and f_Y(y), then the pmf or pdf of Z = X + Y is
$$f_Z(z) = \sum_{x} f_X(x)\, f_Y(z - x)$$
if X and Y are discrete, and
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, \mathrm{d}x$$
if X and Y are continuous.
The function f_Z(z) is called the convolution of f_X(x) and f_Y(y).
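A discrete illustration (two independent fair dice, an arbitrary choice) computing the pmf of Z = X + Y as a convolution with numpy:

```python
import numpy as np

# pmfs of two independent fair dice on {1, ..., 6}.
f_x = np.full(6, 1 / 6)
f_y = np.full(6, 1 / 6)

# pmf of Z = X + Y on {2, ..., 12}: the convolution of the two pmfs.
f_z = np.convolve(f_x, f_y)
print(f_z)                         # largest value 6/36 at z = 7
print(np.isclose(f_z.sum(), 1.0))  # True
```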
Law of large numbers
Consider independently and identically distributed (i.i.d.) random variables X_1, X_2, ..., X_n with E(X_i) = µ and Var(X_i) = σ² < ∞ (i = 1, ..., n).
For the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, it can be shown that E(X̄_n) = µ and Var(X̄_n) = σ²/n.
The law of large numbers states that
$$P\left( \lim_{n \to \infty} \bar{X}_n = \mu \right) = 1.$$
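A simulation sketch of the sample mean settling at µ as n grows; the exponential distribution with rate 1.7 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(7)
mu = 1 / 1.7                     # mean of an Exponential(1.7) random variable
samples = rng.exponential(scale=mu, size=1_000_000)

# Running sample mean over the first n observations.
running_mean = np.cumsum(samples) / np.arange(1, samples.size + 1)
for n in (10, 1_000, 1_000_000):
    print(n, running_mean[n - 1])   # settles near mu, about 0.588
```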