
Page 1

Statistics in Geophysics: Inferential Statistics

Steffen Unkel

Department of Statistics, Ludwig-Maximilians-University Munich, Germany

Page 2

Parameter estimation

We will be studying problems of statistical inference

Many problems of inference have been dichotomized into two areas: estimation of parameters and tests of hypotheses.

Parameter estimation: Let X be a random variable whose density is f_X(x; θ), where the form of the density is assumed known except that it contains an unknown parameter θ.

The problem is then to use the observed values x1, …, xn of a random sample X1, …, Xn to estimate the value of θ or the value of some function of θ, say τ(θ).

Page 3

Estimator and estimate

Any statistic T = g(X1, …, Xn) whose values are used to estimate θ is defined to be an estimator of θ.

That is, T is a known function of observable random variables that is itself a random variable.

An estimate is the realized value t = g(x1, …, xn) of an estimator, which is a function of the realized values x1, …, xn.

Example: $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is an estimator of a mean µ and $\bar{x}_n$ is an estimate of µ. Here, T is $\bar{X}_n$, t is $\bar{x}_n$, and g(·) is the function defined by summing the arguments and then dividing by n.
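A minimal sketch of the estimator/estimate distinction in Python, using made-up data (the values and the variable names are illustrative only):

```python
import numpy as np

# g(.) is the averaging function; applied to random variables it defines
# the estimator X-bar_n, applied to realized values it gives an estimate.
def g(sample):
    return np.mean(sample)

# Hypothetical realized values x_1, ..., x_n of the random sample.
x = np.array([2.1, 1.7, 2.4, 1.9, 2.0])

# The estimate t = g(x_1, ..., x_n) of the mean mu.
t = g(x)
print(t)  # 2.02
```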

Page 4

In 1921, R. A. Fisher pointed out an attractive rationale, called maximum likelihood (ML), for estimating parameters.

This procedure says one should examine the likelihood function of the sample values and take as the estimates of the unknown parameters the values that maximize the likelihood function.

ML is a unifying concept that covers a broad range of problems.

It is generally accepted as the best rationale to apply in estimating parameters when one is willing to assume that the form of the population probability law is known.

Page 5

Likelihood function

If X1, …, Xn are an i.i.d. sample from a population with pdf or pmf f(x|θ), the likelihood function is defined by

$L(\theta \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i \mid \theta).$

The value $\hat{\theta}$ that maximizes the likelihood, i.e. $L(\hat{\theta}) = \max_{\theta} L(\theta)$, is called the maximum likelihood estimate (MLE) for θ.

Page 6

Log-likelihood and score function

It is often more convenient to work with the logarithm of the likelihood function, called the log-likelihood, $\ell(\theta) = \ln L(\theta) = \sum_{i=1}^{n} \ln f(x_i \mid \theta)$; its derivative with respect to θ, $s(\theta) = \frac{\partial}{\partial\theta}\,\ell(\theta)$, is called the score function.
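A small sketch (my own illustration, assuming a normal model with known scale and simulated data) of why the log-likelihood is more convenient in practice: the product of many small densities underflows in floating point, while the sum of their logarithms stays finite.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=1000)  # hypothetical sample

theta = 5.0  # candidate value for the mean (scale treated as known here)
densities = norm.pdf(x, loc=theta, scale=2.0)

likelihood = np.prod(densities)             # product underflows to 0.0 for large n
log_likelihood = np.sum(np.log(densities))  # sum of logs is finite and usable

print(likelihood, log_likelihood)
```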

Page 11

When the likelihood equation s(θ) = 0 cannot be solved analytically, the MLE can be found with the Newton-Raphson iteration

$\theta^{(k+1)} = \theta^{(k)} - \frac{s(\theta^{(k)})}{s'(\theta^{(k)})}.$

This iterative scheme continues until a prespecified convergence criterion is met.
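A sketch of this iteration (my own example, not from the slides) for the rate λ of an exponential sample, where the score is s(λ) = n/λ − Σ x_i and s′(λ) = −n/λ²; the closed-form MLE 1/x̄ serves as a check.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1 / 0.7, size=200)  # hypothetical data, true rate 0.7
n = len(x)

def score(lam):
    return n / lam - x.sum()      # s(lambda), derivative of the log-likelihood

def score_deriv(lam):
    return -n / lam**2            # s'(lambda)

lam = 1.0                         # starting value theta^(0); any value in (0, 2/x-bar) converges here
for _ in range(50):
    step = score(lam) / score_deriv(lam)
    lam = lam - step              # theta^(k+1) = theta^(k) - s/s'
    if abs(step) < 1e-10:         # convergence criterion
        break

print(lam, 1 / x.mean())          # Newton result vs. closed-form MLE 1/x-bar
```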

Page 12

Other estimation methods

The method of moments uses sample moments to estimate the parameters of an assumed probability law (see the sketch below).

Least squares estimation minimizes the sum of the squared deviations of the observed values from the fitted values.

Bayesian estimation is based on combining the evidence contained in the data with prior knowledge, based on subjective probabilities, of the values of unknown parameters.
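A sketch of the method of moments (my own illustration, assuming a Gamma model with shape a and scale b): matching E(X) = ab and Var(X) = ab² to the sample mean and variance gives â = x̄²/s² and b̂ = s²/x̄.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.gamma(shape=3.0, scale=2.0, size=500)  # hypothetical sample

xbar = x.mean()
s2 = x.var(ddof=1)          # sample variance

# Solve E(X) = a*b and Var(X) = a*b^2 for a and b using the sample moments:
a_hat = xbar**2 / s2
b_hat = s2 / xbar

print(a_hat, b_hat)         # should be near the true values 3 and 2
```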

Page 13

Different estimation methods can yield different estimators for the same parameter, so we need criteria for comparing them. We will now define certain properties, which an estimator may or may not possess, that will help us in deciding whether one estimator is better than another.

Page 14

Definition:

An estimator T = g(X1, …, Xn) is defined to be an unbiased estimator of an unknown parameter θ if and only if E(T) = θ for all values of θ.

The difference E(T) − θ is called the bias of T and can be either positive, negative, or zero.

An estimator T of θ is said to be asymptotically unbiased if $\lim_{n \to \infty} E(T) = \theta$.
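A simulation sketch (my own example) contrasting the variance estimator with divisor n and the unbiased one with divisor n − 1; the former has bias −σ²/n and is therefore asymptotically unbiased.

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 4.0                                  # true variance

for n in (5, 50, 500):
    samples = rng.normal(0.0, np.sqrt(sigma2), size=(20000, n))
    biased = samples.var(axis=1, ddof=0)      # divisor n
    unbiased = samples.var(axis=1, ddof=1)    # divisor n - 1
    # E(T) - theta, approximated by averaging over many replications:
    print(n, biased.mean() - sigma2, unbiased.mean() - sigma2)
```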

Page 15

Precision of estimation

For observations x1, …, xn an estimator T yields an estimate t = g(x1, …, xn).

In general, the estimate will not be equal to θ

For unbiased estimators the precision of the estimation method is captured by the variance of the estimator, Var(T).

The square root of Var(T) (the standard deviation of T) is called the standard error, which in general has to be estimated itself.
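A small sketch (hypothetical data) of estimating the standard error of the sample mean, using the standard fact that Var(X̄) = σ²/n, so the estimated standard error is s/√n:

```python
import numpy as np

x = np.array([12.3, 11.8, 13.1, 12.6, 12.0, 12.9, 11.5, 12.4])  # hypothetical data
n = len(x)

xbar = x.mean()
s = x.std(ddof=1)            # sample standard deviation
se = s / np.sqrt(n)          # estimated standard error of the mean

print(xbar, se)
```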

Page 16

Lower bound for variance

Let X be a random variable with density f(x; θ). Under certain regularity conditions:

$\mathrm{Var}(T) \;\geq\; \frac{1}{n\,E\!\left[\left(\frac{\partial}{\partial \theta} \ln f(x;\theta)\right)^{2}\right]},$

where T is an unbiased estimator of θ.

The inequality above is called the Cramér-Rao inequality, and the right-hand side is called the Cramér-Rao lower bound for the variance of unbiased estimators of θ.
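As a standard worked example (not from the slides): for X ∼ N(µ, σ²) with σ² known, $\frac{\partial}{\partial\mu}\ln f(x;\mu) = \frac{x-\mu}{\sigma^{2}}$, so $E\!\left[\left(\frac{\partial}{\partial\mu}\ln f(X;\mu)\right)^{2}\right] = \frac{1}{\sigma^{2}}$ and the lower bound is σ²/n. The sample mean attains it, since Var(X̄) = σ²/n.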

Page 18

Definition:

Let T = g(X1, …, Xn) be an estimator for θ. Then, T is a consistent estimator for θ if, for every ε > 0, $\lim_{n \to \infty} P(|T - \theta| > \varepsilon) = 0$.
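A simulation sketch (my own example) of consistency for the sample mean: as n grows, the probability that |X̄ − µ| exceeds a fixed ε shrinks towards zero.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, eps = 10.0, 0.5

for n in (10, 100, 1000):
    xbar = rng.normal(mu, 5.0, size=(20000, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) > eps))   # Monte Carlo estimate of P(|X-bar - mu| > eps)
```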

Page 20

Interval estimation

So far, we have dealt with the point estimation of a parameter.

It seems desirable that a point estimate should be accompanied by some measure of the possible error of the estimate.

We might make the inference that the true value of the parameter is contained in some interval.

Interval estimation: Define two statistics T1 = g1(X1, …, Xn) and T2 = g2(X1, …, Xn), where T1 ≤ T2, so that [T1, T2] constitutes an interval for which the probability that it contains the unknown θ can be determined.

Page 21

Confidence interval

Definition:

Given a random sample X1, …, Xn, let T1 = g1(X1, …, Xn) and T2 = g2(X1, …, Xn) be two statistics satisfying T1 ≤ T2 for which P(T1 ≤ θ ≤ T2) = 1 − α. Then the random interval [T1, T2] is called a (1 − α)-confidence interval for θ.

1 − α is called the confidence coefficient, and T1 and T2 are called the lower and upper confidence limits, respectively.

A value [t1, t2], where tj = gj(x1, …, xn) (j = 1, 2), is an observed (1 − α)-confidence interval for θ.

Page 22

One-sided confidence interval

Definition:

Let T1 = −∞ and T2 = g2(X1, …, Xn) be a statistic for which P(θ ≤ T2) = 1 − α. Then T2 is called a one-sided upper confidence limit for θ.

Similarly, let T2 = ∞ and T1 = g1(X1, …, Xn) be a statistic for which P(T1 ≤ θ) = 1 − α. Then T1 is called a one-sided lower confidence limit for θ.

Page 23

Confidence intervals for the mean (with known variance)

100(1 − α)%-confidence interval for µ (scenario σ² known)

For a normally distributed random variable X:

$\left[\,\bar{X} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},\;\; \bar{X} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\,\right].$

For an arbitrarily distributed random variable X and n > 30, the same interval is an approximate confidence interval for µ.

For 0 < p < 1, z_p is the p-quantile of the standard normal distribution, that is, it is the value for which F(z_p) = Φ(z_p) = p. Hence, z_p = Φ⁻¹(p).
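A sketch of this interval (hypothetical data, with σ treated as a known assumed value) using the standard normal quantile from SciPy:

```python
import numpy as np
from scipy.stats import norm

x = np.array([4.8, 5.3, 5.1, 4.9, 5.6, 5.0, 5.2, 4.7])  # hypothetical data
sigma = 0.4                                              # assumed known standard deviation
alpha = 0.05

z = norm.ppf(1 - alpha / 2)             # z_{1-alpha/2} = Phi^{-1}(1 - alpha/2)
half_width = z * sigma / np.sqrt(len(x))
print(x.mean() - half_width, x.mean() + half_width)
```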

Page 24

Confidence intervals for the mean (with unknown variance)

100(1 − α)%-confidence interval for µ (scenario σ² unknown)

For a normally distributed random variable X:

$\left[\,\bar{X} - t_{1-\alpha/2}(n-1)\,\frac{S}{\sqrt{n}},\;\; \bar{X} + t_{1-\alpha/2}(n-1)\,\frac{S}{\sqrt{n}}\,\right],$

where $S = \sqrt{\tfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}$ and $t_{1-\alpha/2}(n-1)$ is the (1 − α/2)-quantile of the t-distribution with n − 1 degrees of freedom.

For an arbitrarily distributed random variable X and n > 30, $\left[\,\bar{X} - z_{1-\alpha/2}\,\frac{S}{\sqrt{n}},\;\; \bar{X} + z_{1-\alpha/2}\,\frac{S}{\sqrt{n}}\,\right]$ is an approximate confidence interval for µ.
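A sketch of the t-based interval (hypothetical data), using scipy.stats.t for the quantile:

```python
import numpy as np
from scipy.stats import t

x = np.array([4.8, 5.3, 5.1, 4.9, 5.6, 5.0, 5.2, 4.7])  # hypothetical data
n = len(x)
alpha = 0.05

s = x.std(ddof=1)                          # S with divisor n - 1
tq = t.ppf(1 - alpha / 2, df=n - 1)        # t_{1-alpha/2}(n-1)
half_width = tq * s / np.sqrt(n)
print(x.mean() - half_width, x.mean() + half_width)
```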

Page 25

Confidence intervals for the variance

100(1 − α)%-confidence interval for σ²

For a normally distributed random variable X:

$\left[\,\frac{(n-1)S^2}{\chi^2_{1-\alpha/2}(n-1)},\;\; \frac{(n-1)S^2}{\chi^2_{\alpha/2}(n-1)}\,\right],$

where $\chi^2_{1-\alpha/2}(n-1)$ and $\chi^2_{\alpha/2}(n-1)$ denote the (1 − α/2)-quantile and (α/2)-quantile, respectively, of the chi-square distribution with n − 1 degrees of freedom.
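A sketch of this interval for σ² (hypothetical data), using chi-square quantiles from SciPy:

```python
import numpy as np
from scipy.stats import chi2

x = np.array([4.8, 5.3, 5.1, 4.9, 5.6, 5.0, 5.2, 4.7])  # hypothetical data
n = len(x)
alpha = 0.05

s2 = x.var(ddof=1)                                  # sample variance S^2
lower = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / chi2.ppf(alpha / 2, df=n - 1)
print(lower, upper)
```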

Page 26

Confidence interval for a proportion

100(1 − α)%-confidence interval for π

In dichotomous populations and for n > 30, an approximate confidence interval for π = P(X = 1) is given by

$\left[\,\hat{\pi} - z_{1-\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}},\;\; \hat{\pi} + z_{1-\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}\,\right],$

where $\hat{\pi} = \bar{X}$ denotes the relative frequency.
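A sketch of this approximate interval for a proportion (made-up counts):

```python
import numpy as np
from scipy.stats import norm

n, successes = 200, 37                  # hypothetical dichotomous sample
alpha = 0.05

pi_hat = successes / n                  # relative frequency, pi-hat = X-bar
z = norm.ppf(1 - alpha / 2)
half_width = z * np.sqrt(pi_hat * (1 - pi_hat) / n)
print(pi_hat - half_width, pi_hat + half_width)
```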
