Class Notes in Statistics and Econometrics, Part 3

CHAPTER 5

Specific Random Variables

5.1 Binomial

We will begin with mean and variance of the binomial variable, i.e., the number of successes in n independent repetitions of a Bernoulli trial (3.7.1). The binomial variable has the two parameters n and p. Let us look first at the case n = 1, in which the binomial variable is also called an indicator variable: if the event A has probability p, then its complement A′ has the probability q = 1 − p. The indicator variable of A, which assumes the value 1 if A occurs, and 0 if it doesn't, has expected value p and variance pq. For the binomial variable with n observations, which is the sum of n independent indicator variables, the expected value (mean) is np and the variance is npq.


Problem 79. The random variable x assumes the value a with probability p and the value b with probability q = 1 − p. Show that var[x] = pq(a − b)².

Answer. E[x] = pa + qb; var[x] = E[x²] − (E[x])² = pa² + qb² − (pa + qb)² = (p − p²)a² − 2pqab + (q − q²)b² = pq(a − b)². For this last equality we need p − p² = p(1 − p) = pq. □
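A quick numerical sanity check of this identity (the values a = 3, b = −1, p = 0.3 below are arbitrary illustrative choices):

```python
import random
import statistics

# Two-point variable: value a with probability p, value b with probability q = 1 - p.
a, b, p = 3.0, -1.0, 0.3   # arbitrary illustrative values
q = 1 - p

draws = [a if random.random() < p else b for _ in range(200_000)]

print(statistics.pvariance(draws))   # sample variance, approximately 3.36
print(p * q * (a - b) ** 2)          # closed form pq(a - b)^2 = 3.36
```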

The negative binomial variable is, like the binomial variable, derived from the Bernoulli experiment; but one reverses the question. Instead of asking how many successes one gets in a given number of trials, one asks how many trials one must make to get a given number of successes, say, r successes.

First look at r = 1. Let t denote the number of the trial at which the first success occurs. Then

(5.1.1) Pr[t = n] = pq^{n−1}   (n = 1, 2, ...).

This is called the geometric probability.

Is the probability derived in this way σ-additive? The sum of a geometrically declining sequence is easily computed:

(5.1.2) s = 1 + q + q² + q³ + ···
(5.1.3) qs = q + q² + q³ + ···
(5.1.4) (1 − q)s = 1.

Equation (5.1.4) means 1 = p + pq + pq² + ···, i.e., the sum of all probabilities is indeed 1.

Now what is the expected value of a geometric variable? Use the definition of the expected value of a discrete variable: E[t] = p Σ_{k=1}^∞ k q^{k−1}. To evaluate the infinite sum, solve (5.1.4) for s:

s = Σ_{k=0}^∞ q^k = 1/(1 − q)

and differentiate both sides with respect to q:

Σ_{k=1}^∞ k q^{k−1} = 1/(1 − q)² = 1/p².

The expected value of the geometric variable is therefore E[t] = p/p² = 1/p.
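E[t] = 1/p is easy to confirm by simulation; here is a minimal sketch that repeats Bernoulli trials until the first success and averages the trial numbers:

```python
import random

def first_success_trial(p: float) -> int:
    """Number of the trial at which the first success occurs (geometric)."""
    n = 1
    while random.random() >= p:   # failure: keep trying
        n += 1
    return n

p = 0.25
samples = [first_success_trial(p) for _ in range(100_000)]
print(sum(samples) / len(samples))   # close to 1/p = 4.0
```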

Problem 80. Assume t is a geometric random variable with parameter p, i.e., it has the values k = 1, 2, ... with probabilities

(5.1.7) p_t(k) = pq^{k−1},   where q = 1 − p.

The geometric variable denotes the number of times one has to perform a Bernoulli experiment with success probability p to get the first success.


• a. 1 point. Given a positive integer n, what is Pr[t > n]? (Easy with a simple trick!)

Answer. t > n means the first n trials must result in failures, i.e., Pr[t > n] = q^n. Since {t > n} = {t = n + 1} ∪ {t = n + 2} ∪ ···, one can also get the same result in a more tedious way: it is pq^n + pq^{n+1} + pq^{n+2} + ··· = s, say. Therefore qs = pq^{n+1} + pq^{n+2} + ···, and (1 − q)s = pq^n, hence s = q^n. □



Problem 81. t is a geometric random variable as in the preceding problem. In order to compute var[t] it is most convenient to make a detour via E[t(t − 1)]. Here are the steps:


• a Express E[t(t− 1)] as an infinite sum.

Answer Just write it down according to the definition of expected values: P∞k=0k(k −


• d. Use c and the fact that E[t] = 1/p to derive

var[t] = q/p².

Now let t denote the number of the trial at which the rth success occurs. The rth success occurs at trial n exactly if the nth trial is a success and the preceding n − 1 trials contain exactly r − 1 successes; therefore

Pr[t = n] = \binom{n−1}{r−1} p^r q^{n−r}.

This is the negative binomial, also called the Pascal probability distribution with parameters r and p.


One easily gets the mean and variance, because due to the memory-less property it is the sum of r independent geometric variables:

E[t] = r/p,   var[t] = rq/p².
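The formulas E[t] = r/p and var[t] = rq/p² can be checked with the same simulation idea, summing r independent geometric waiting times (a sketch; r = 10 and p = 1/2 are illustrative values):

```python
import random
import statistics

def first_success_trial(p: float) -> int:
    n = 1
    while random.random() >= p:
        n += 1
    return n

r, p = 10, 0.5
q = 1 - p
# Trial number of the r-th success = sum of r independent geometric waiting times.
samples = [sum(first_success_trial(p) for _ in range(r)) for _ in range(50_000)]

print(sum(samples) / len(samples), r / p)           # mean:     ~20 vs 20
print(statistics.pvariance(samples), r * q / p**2)  # variance: ~20 vs 20
```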

Problem 82. 3 points. A fair coin is flipped until heads appear 10 times, and x is the number of times tails appear before the 10th appearance of heads. Show that the expected value E[x] = 10.

Answer. Let t be the number of the throw which gives the 10th head. t is a negative binomial with r = 10 and p = 1/2, therefore E[t] = 20. Since x = t − 10, it follows E[x] = 10. □

Problem 83. (Banach's match-box problem) (Not eligible for in-class exams) There are two restaurants in town serving hamburgers. In the morning each of them obtains a shipment of n raw hamburgers. Every time someone in that town wants to eat a hamburger, he or she selects one of the two restaurants at random. What is the probability that the (n + k)th customer will have to be turned away because the restaurant selected has run out of hamburgers?

Answer. For each restaurant it is the negative binomial probability distribution in disguise: if a restaurant runs out of hamburgers this is like having n successes in n + k tries.


But one can also reason it out: assume one of the restaurants must turn customers away after the (n + k)th customer. Write down all the n + k decisions made: write a 1 if the customer goes to the first restaurant, and a 2 if he goes to the second, i.e., write down n + k ones and twos. Under what conditions will such a sequence result in the (n + k)th move eating the last hamburger at the first restaurant? Exactly if it has n ones and k twos, and the (n + k)th move is a one. As in the reasoning for the negative binomial probability distribution, there are \binom{n+k−1}{n−1} possibilities, each of which has probability 2^{−(n+k)}. Emptying the second restaurant has the same probability. Together the probability is therefore \binom{n+k−1}{n−1} 2^{−(n+k)+1}.
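The counting argument can be checked by simulation; the sketch below estimates the probability that the (n + k)th customer takes the last hamburger at the restaurant he or she selects, and compares it with the closed form (n = 5, k = 3 are arbitrary illustrative values):

```python
import math
import random

def last_burger_prob(n: int, k: int, trials: int = 200_000) -> float:
    """Estimate P(the (n+k)-th customer takes the last hamburger
    at the restaurant he or she selects)."""
    hits = 0
    for _ in range(trials):
        picks = [0, 0]
        for _ in range(n + k):
            last = random.randrange(2)
            picks[last] += 1
        if picks[last] == n:   # the final pick was that restaurant's n-th customer
            hits += 1
    return hits / trials

n, k = 5, 3   # requires k <= n so the other restaurant cannot run out first
exact = math.comb(n + k - 1, n - 1) * 2.0 ** (-(n + k) + 1)
print(last_burger_prob(n, k), exact)   # both about 0.273
```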

5.2 The Hypergeometric Probability Distribution

Until now we had independent events, such as repeated throwing of coins or dice, sampling with replacement from finite populations, or sampling from infinite populations. If we sample without replacement from a finite population, the probability of the second element of the sample depends on what the first element was. Here the hypergeometric probability distribution applies.

Assume we have an urn with w white and n − w black balls in it, and we take a sample of m balls. What is the probability that y of them are white?

We are not interested in the order in which these balls are taken out; we may therefore assume that they are taken out simultaneously, therefore the set U of outcomes is the set of subsets containing m of the n balls. The total number of such subsets is \binom{n}{m}. How many of them have y white balls in them? Imagine you first pick y white balls from the set of all white balls (there are \binom{w}{y} possibilities to do that), and then you pick m − y black balls from the set of all black balls, which can be done in \binom{n−w}{m−y} different ways. Every union of such a set of white balls with a set of black balls gives a set of m elements with exactly y white balls, as desired. There are therefore \binom{w}{y}\binom{n−w}{m−y} different such sets, and the probability of picking such a set is

(5.2.1) Pr[sample of m elements has exactly y white balls] = \binom{w}{y}\binom{n−w}{m−y} / \binom{n}{m}.

Problem 84. You have an urn with w white and n − w black balls in it, and you take a sample of m balls with replacement, i.e., after pulling each ball out you put it back in before you pull out the next ball. What is the probability that y of these balls are white? I.e., we are asking here for the counterpart of formula (5.2.1) if sampling is done with replacement.

Answer.

(5.2.2) Pr[sample of m elements has exactly y white balls] = \binom{m}{y} (w/n)^y ((n−w)/n)^{m−y}. □


Without proof we will state here that the expected value of y, the number of white balls in the sample, is E[y] = mw/n, which is the same as if one would select the balls with replacement.

Also without proof, the variance of y is

var[y] = m (w/n) ((n−w)/n) ((n−m)/(n−1)).

This is smaller than the variance if one would choose with replacement, which is represented by the above formula without the last term (n−m)/(n−1). This last term is called the finite population correction. More about all this is in [Lar82, pp. 176–183].
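Both statements are easy to verify numerically from (5.2.1); the following sketch computes mean and variance of y directly from the pmf (n = 50, w = 20, m = 10 are illustrative values):

```python
from math import comb

n, w, m = 50, 20, 10   # population size, white balls, sample size

support = range(max(0, m - (n - w)), min(w, m) + 1)
pmf = {y: comb(w, y) * comb(n - w, m - y) / comb(n, m) for y in support}

mean = sum(y * p for y, p in pmf.items())
var = sum((y - mean) ** 2 * p for y, p in pmf.items())

print(mean, m * w / n)                                   # 4.0 = 4.0
print(var, m * (w/n) * ((n - w)/n) * ((n - m)/(n - 1)))  # ~1.959, both
print(m * (w/n) * ((n - w)/n))                           # 2.4: with replacement, larger
```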

5.3 The Poisson Distribution

The Poisson distribution counts the number of events in a given time interval. This number has the Poisson distribution if each event is the cumulative result of a large number of independent possibilities, each of which has only a small chance of occurring (law of rare events). The expected number of occurrences is proportional to time with a proportionality factor λ, and in a short time span only zero or one event can occur, i.e., for infinitesimal time intervals it becomes a Bernoulli trial.

Approximate it by dividing the time from 0 to t into n intervals of length t/n; then the occurrences are approximately n independent Bernoulli trials with probability of success λt/n. (This is an approximation since some of these intervals may have more than one occurrence; but if the intervals become very short the probability of having two occurrences in the same interval becomes negligible.)

In this discrete approximation, the probability to have k successes in time t is

(5.3.1) Pr[x = k] = \binom{n}{k} (λt/n)^k (1 − λt/n)^{n−k}

(5.3.2) = (1/k!) (λt)^k · [n(n−1)···(n−k+1)/n^k] · (1 − λt/n)^n (1 − λt/n)^{−k}

(5.3.3) → (λt)^k/k! · e^{−λt}   as n → ∞.

(5.3.3) is the limit because the second and the last term in (5.3.2) → 1. The sum of all probabilities is 1 since Σ_{k=0}^∞ (λt)^k/k! = e^{λt}. The expected value is (note that we can have the sum start at k = 1):

E[x] = e^{−λt} Σ_{k=1}^∞ k (λt)^k/k! = λt e^{−λt} Σ_{k=1}^∞ (λt)^{k−1}/(k−1)! = λt.
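One can watch (5.3.1) converge to the limit (5.3.3) numerically; the sketch below fixes λt = 3 and k = 2 and lets n grow:

```python
from math import comb, exp, factorial

lam_t, k = 3.0, 2   # fix λt = 3 and k = 2, let n grow

for n in (10, 100, 1_000, 10_000):
    p = lam_t / n
    print(n, comb(n, k) * p**k * (1 - p)**(n - k))   # (5.3.1)

print("limit", lam_t**k / factorial(k) * exp(-lam_t))  # (λt)^k/k! e^{-λt} ≈ 0.2240
```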


Problem 85. x follows a Poisson distribution, i.e., Pr[x = k] = (λt)^k/k! · e^{−λt} for k = 0, 1, ....

• b. 4 points. Compute E[x(x − 1)] and show that var[x] = λt.

Answer. For E[x(x − 1)] we can have the sum start at k = 2:

E[x(x − 1)] = e^{−λt} Σ_{k=2}^∞ k(k−1)(λt)^k/k! = (λt)² e^{−λt} Σ_{k=2}^∞ (λt)^{k−2}/(k−2)! = (λt)².

From this follows

(5.3.7) var[x] = E[x²] − (E[x])² = E[x(x − 1)] + E[x] − (E[x])² = (λt)² + λt − (λt)² = λt. □


Problem 86. Which value of λ should one choose if a Poisson distribution is used as an approximation to the binomial distribution with parameters n and p?

Answer. That which gives the right expected value, i.e., λ = np. □

Problem 87. Two researchers counted cars coming down a road, which obey a Poisson distribution with unknown parameter λ. In other words, in an interval of length t one will have k cars with probability

(λt)^k/k! · e^{−λt}.

Their assignment was to count how many cars came in the first half hour, and how many cars came in the second half hour. However they forgot to keep track of the time when the first half hour was over, and therefore wound up only with one count, namely, they knew that 213 cars had come down the road during this hour. They were afraid they would get fired if they came back with one number only, so they applied the following remedy: they threw a coin 213 times and counted the number of heads. This number, they pretended, was the number of cars in the first half hour.

• a. 6 points. Did the probability distribution of the number gained in this way differ from the distribution of actually counting the number of cars in the first half hour?

Answer. First a few definitions: x is the total number of occurrences in the interval [0, 1]. y is the number of occurrences in the interval [0, t] (for a fixed t; in the problem it was t = 1/2, but we will do it for general t, which will make the notation clearer and more compact). Then we want to compute Pr[y = m | x = n]. By definition of conditional probability:

(5.3.9) Pr[y = m | x = n] = Pr[y = m and x = n] / Pr[x = n].

How can we compute the probability of the intersection Pr[y = m and x = n]? Use a trick: express this intersection as the intersection of independent events. For this define z as the number of events in the interval (t, 1]. Then {y = m and x = n} = {y = m and z = n − m}; therefore Pr[y = m and x = n] = Pr[y = m] Pr[z = n − m]; use this to get

(5.3.10) Pr[y = m | x = n] = Pr[y = m] Pr[z = n − m] / Pr[x = n] = [ (λ^m t^m/m!) e^{−λt} · (λ^{n−m}(1 − t)^{n−m}/(n − m)!) e^{−λ(1−t)} ] / [ (λ^n/n!) e^{−λ} ] = \binom{n}{m} t^m (1 − t)^{n−m}.

This is the binomial distribution with parameters n and t; for t = 1/2 it is exactly the distribution of the number of heads in n tosses of a fair coin, so the faked count has the same distribution as a true count would have. □
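This can be checked by simulation: generate a Poisson process on [0, 1] from exponential inter-arrival times, condition on x = n, and compare the distribution of the true first-half count with that of n coin tosses (a sketch; λ = 8 and n = 8 are arbitrary, and by the result above λ drops out):

```python
import random
from collections import Counter

def poisson_events(lam: float) -> list:
    """Event times of a Poisson process with rate lam on [0, 1],
    built from exponential inter-arrival times."""
    times, t = [], random.expovariate(lam)
    while t <= 1.0:
        times.append(t)
        t += random.expovariate(lam)
    return times

lam, n = 8.0, 8
counted, faked = Counter(), Counter()
runs = 0
while runs < 20_000:
    events = poisson_events(lam)
    if len(events) != n:   # condition on the total count x = n
        continue
    runs += 1
    counted[sum(t <= 0.5 for t in events)] += 1               # true first-half count
    faked[sum(random.random() < 0.5 for _ in range(n))] += 1  # n coin tosses

for m in range(n + 1):   # the two empirical pmfs agree (both Binomial(n, 1/2))
    print(m, counted[m] / runs, faked[m] / runs)
```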

• b. 2 points. Explain what it means that the probability distribution of the number for the first half hour gained by throwing the coins does not differ from the one gained by actually counting the cars. Which condition is absolutely necessary for this to hold?

Answer. The supervisor would never be able to find out through statistical analysis of the data they delivered, even if they did it repeatedly. All estimation results based on the faked statistic would be as accurate regarding λ as the true statistics. All this is only true under the assumption that the cars really obey a Poisson distribution and that the coin is fair. □

The fact that the Poisson as well as the binomial distributions are memoryless has nothing to do with them having a sufficient statistic.



Problem 88. 8 points. x is the number of customers arriving at a service counter in one hour. x follows a Poisson distribution with parameter λ = 2, i.e., Pr[x = k] = 2^k/k! · e^{−2}.

• b. Despite the small number of customers, two employees are assigned to the service counter. They are hiding in the back, and whenever a customer steps up to the counter and rings the bell, they toss a coin. If the coin shows heads, Herbert serves the customer, and if it shows tails, Karl does. Compute the probability that Herbert has to serve exactly one customer during the hour. Hint: e = 1 + 1 + 1/2! + 1/3! + 1/4! + ···.

• c. For any integer k ≥ 0, compute the probability that Herbert has to serve exactly k customers during the hour.

Problem 89. 3 points. Compute the moment generating function of a Poisson variable observed over a unit time interval, i.e., x satisfies Pr[x = k] = λ^k/k! · e^{−λ} and you want E[e^{tx}] for all t.

Answer. E[e^{tx}] = Σ_{k=0}^∞ e^{tk} λ^k/k! · e^{−λ} = Σ_{k=0}^∞ (λe^t)^k/k! · e^{−λ} = e^{λe^t} e^{−λ} = e^{λ(e^t − 1)}. □
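A numerical check of this mgf, truncating the infinite series at k = 100 (λ = 2 and t = 0.7 are arbitrary illustrative values):

```python
from math import exp, factorial

lam, t = 2.0, 0.7
series = sum(exp(t * k) * lam**k / factorial(k) * exp(-lam) for k in range(100))
print(series)                   # truncated sum for E[e^{tx}]
print(exp(lam * (exp(t) - 1)))  # closed form; both ≈ 7.6
```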

5.4 The Exponential Distribution

Now we will discuss random variables which are related to the Poisson distribution. At time t = 0 you start observing a Poisson process, and the random variable t denotes the time you have to wait until the first occurrence. t can have any nonnegative real number as value. One can derive its cumulative distribution as follows: t > t if and only if there are no occurrences in the interval [0, t]. Therefore Pr[t > t] = (λt)^0 e^{−λt} = e^{−λt}, and hence the cumulative distribution function is F_t(t) = Pr[t ≤ t] = 1 − e^{−λt} when t ≥ 0, and F_t(t) = 0 for t < 0. The density function is therefore f_t(t) = λe^{−λt} for t ≥ 0, and 0 otherwise. This is called the exponential density function (its discrete analog is the geometric random variable). It can also be called a Gamma variable with parameters r = 1 and λ.

Problem 90. 2 points. An exponential random variable t with parameter λ > 0 has the density f_t(t) = λe^{−λt} for t ≥ 0, and 0 for t < 0. Use this density to compute the expected value of t.

Answer. E[t] = ∫_0^∞ t λe^{−λt} dt. Integrate by parts, ∫_0^∞ u dv = uv|_0^∞ − ∫_0^∞ v du, with u = t and dv = λe^{−λt} dt, so that du = dt and v = −e^{−λt}. One obtains E[t] = −te^{−λt}|_0^∞ + ∫_0^∞ e^{−λt} dt = 0 + 1/λ = 1/λ. □

Problem 91. Compute E[t²] for this exponential variable.

Answer. One can use that Γ(r) = ∫_0^∞ λ^r t^{r−1} e^{−λt} dt for r = 3 to get: E[t²] = (1/λ²)Γ(3) = 2/λ². Or all from scratch: E[t²] = ∫_0^∞ t² λe^{−λt} dt; integrating by parts gives −t²e^{−λt}|_0^∞ + 2∫_0^∞ t e^{−λt} dt, and for the second integral do it again: 2∫_0^∞ t e^{−λt} dt = 0 + 2∫_0^∞ (1/λ)e^{−λt} dt = 2/λ². □

Problem 92. 2 points. Does the exponential random variable with parameter λ > 0, whose cumulative distribution function is F_t(t) = 1 − e^{−λt} for t ≥ 0, and 0 otherwise, have a memory-less property? Compare Problem 80. Formulate this memory-less property and then verify whether it holds or not.

Answer. Here is the formulation: for s < t follows Pr[t > t | t > s] = Pr[t > t − s]. This does indeed hold. Proof: lhs = Pr[t > t and t > s]/Pr[t > s] = Pr[t > t]/Pr[t > s] = e^{−λt}/e^{−λs} = e^{−λ(t−s)} = rhs. □

Problem 93. The random variable t denotes the duration of an unemployment spell. It has the exponential distribution, which can be defined by Pr[t > t] = e^{−λt} for t ≥ 0 (t cannot assume negative values).

• a. 1 point. Use this formula to compute the cumulative distribution function F_t(t) and the density function f_t(t).

Answer. F_t(t) = Pr[t ≤ t] = 1 − Pr[t > t] = 1 − e^{−λt} for t ≥ 0, zero otherwise. Taking the derivative gives f_t(t) = λe^{−λt} for t ≥ 0, zero otherwise. □

• b. 2 points. What is the probability that an unemployment spell ends after time t + h, given that it has not yet ended at time t? Show that this is the same as the unconditional probability that an unemployment spell ends after time h (memory-less property).

• c. 3 points. Let h be a small number. What is the probability that an unemployment spell ends at or before t + h, given that it has not yet ended at time t? Hint: for small h, one can write approximately 1 − e^{−λh} ≈ λh.


5.5 The Gamma Distribution

The time until the second occurrence of a Poisson event is a random variable which we will call t^{(2)}. Its cumulative distribution function is F_{t^{(2)}}(t) = Pr[t^{(2)} ≤ t] = 1 − Pr[t^{(2)} > t]. But t^{(2)} > t means: there are either zero or one occurrences in the time between 0 and t; therefore Pr[t^{(2)} > t] = Pr[x = 0] + Pr[x = 1] = e^{−λt} + λte^{−λt}. Putting it all together gives F_{t^{(2)}}(t) = 1 − e^{−λt} − λte^{−λt}. In order to differentiate the cumulative distribution function we need the product rule of differentiation: (uv)′ = u′v + uv′. This gives

f_{t^{(2)}}(t) = λe^{−λt} − λe^{−λt} + λ²te^{−λt} = λ²te^{−λt}.
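The density f_{t^{(2)}}(t) = λ²te^{−λt} can be cross-checked by simulation, since the waiting time to the second occurrence is the sum of two independent exponential inter-arrival times (a sketch with the arbitrary rate λ = 1.5):

```python
import random
from math import exp

lam = 1.5
# Waiting time to the 2nd occurrence = sum of two exponential inter-arrival times.
waits = [random.expovariate(lam) + random.expovariate(lam) for _ in range(200_000)]

for t in (0.5, 1.0, 2.0):
    empirical = sum(w <= t for w in waits) / len(waits)
    theoretical = 1 - exp(-lam * t) - lam * t * exp(-lam * t)
    print(t, empirical, theoretical)   # the two columns agree
```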


The following definite integral, which is defined for all r > 0 and all λ > 0, is called the Gamma function:

(5.5.6) Γ(r) = ∫_0^∞ λ^r t^{r−1} e^{−λt} dt.

Although this integral cannot be expressed in a closed form, it is an important function in mathematics. It is a well-behaved function interpolating the factorials in the sense that Γ(r) = (r − 1)!.

Problem 95. Show that Γ(r) as defined in (5.5.6) is independent of λ, i.e., instead of (5.5.6) one can also use the simpler equation

Γ(r) = ∫_0^∞ t^{r−1} e^{−t} dt.


Problem 96. 3 points. Show by partial integration that the Gamma function satisfies Γ(r + 1) = rΓ(r).

Answer. Start with

Γ(r + 1) = ∫_0^∞ λ^{r+1} t^r e^{−λt} dt

and integrate by parts: ∫ u′v dt = uv − ∫ uv′ dt with u′ = λe^{−λt} and v = λ^r t^r, therefore u = −e^{−λt} and v′ = rλ^r t^{r−1}:

(5.5.9) Γ(r + 1) = −λ^r t^r e^{−λt}|_0^∞ + ∫_0^∞ rλ^r t^{r−1} e^{−λt} dt = 0 + rΓ(r). □

Problem 97. Show that Γ(r) = (r − 1)! for all natural numbers r = 1, 2, ....

Answer. Proof by induction. First verify that it holds for r = 1, i.e., that Γ(1) = 1:

Γ(1) = ∫_0^∞ λe^{−λt} dt = −e^{−λt}|_0^∞ = 1,

and then, assuming that Γ(r) = (r − 1)!, Problem 96 says that Γ(r + 1) = rΓ(r) = r(r − 1)! = r!. □
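Python's standard library exposes Γ directly as math.gamma, so both the factorial property and the recurrence of Problem 96 can be spot-checked:

```python
from math import factorial, gamma

for r in range(1, 8):
    assert abs(gamma(r) - factorial(r - 1)) < 1e-9   # Γ(r) = (r-1)!

r = 3.7
print(gamma(r + 1), r * gamma(r))   # recurrence Γ(r+1) = rΓ(r), also for non-integer r
```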

Without proof: Γ(1/2) = √π. This will be shown in Problem 161.


Therefore the following defines a density function, called the Gamma density with parameters r and λ, for all r > 0 and λ > 0:

f_x(x) = (λ^r/Γ(r)) x^{r−1} e^{−λx} for x ≥ 0, and 0 otherwise.

Problem 98. 4 points. Compute the moment generating function of the Gamma distribution.

Answer.

(5.5.12) m_x(t) = E[e^{tx}] = ∫_0^∞ e^{tx} (λ^r/Γ(r)) x^{r−1} e^{−λx} dx

(5.5.13) = λ^r/(λ − t)^r ∫_0^∞ ((λ − t)^r x^{r−1}/Γ(r)) e^{−(λ−t)x} dx

(5.5.14) = (λ/(λ − t))^r   for t < λ,

since the integrand in (5.5.13) is the Gamma density with parameters r and λ − t and therefore integrates to 1. □
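A Monte-Carlo check of this mgf (a sketch; random.gammavariate parameterizes the Gamma by shape α = r and scale β = 1/λ, and the arbitrary values r = 2.5, λ = 3, t = 1 satisfy t < λ):

```python
import random
from math import exp

r, lam, t = 2.5, 3.0, 1.0   # the mgf exists only for t < λ
# random.gammavariate takes shape α = r and *scale* β = 1/λ.
xs = [random.gammavariate(r, 1 / lam) for _ in range(200_000)]

print(sum(exp(t * x) for x in xs) / len(xs))   # Monte-Carlo E[e^{tx}]
print((lam / (lam - t)) ** r)                  # (λ/(λ-t))^r ≈ 2.756
```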


Problem 99. 2 points. The density and moment generating functions of a Gamma variable x with parameters r > 0 and λ > 0 are

f_x(x) = (λ^r/Γ(r)) x^{r−1} e^{−λx} for x ≥ 0,   m_x(t) = (λ/(λ − t))^r for t < λ.

Show that, if x has a Gamma distribution with parameters r and 1, then v = x/λ has a Gamma distribution with parameters r and λ. You can prove this either using the transformation theorem for densities, or the moment-generating function.

Answer. Solution using the density function: the random variable whose density we know is x; its density is (1/Γ(r)) x^{r−1} e^{−x}. If x = λv, then dx/dv = λ, and the absolute value is also λ. Therefore the density of v is (λ^r/Γ(r)) v^{r−1} e^{−λv}. Solution using the mgf: E[e^{tv}] = E[e^{(t/λ)x}] = (1/(1 − t/λ))^r = (λ/(λ − t))^r, which is the mgf of a Gamma variable with parameters r and λ. □
