CHAPTER 9
Random Matrices
The step from random vectors to random matrices (and higher-order random arrays) is not as big as the step from individual random variables to random vectors. We will first give a few quite trivial verifications that the expected value operator is indeed a linear operator, and then make some not quite as trivial observations about the expected values and higher moments of quadratic forms.
9.1 Linearity of Expected Values

Definition 9.1.1. Let Z be a random matrix with elements z_ij. Then E[Z] is the matrix with elements E[z_ij].
Theorem 9.1.2. If A, B, and C are constant matrices, then E[AZB + C] = A E[Z] B + C.

Proof by multiplying out.
Theorem 9.1.3. E[Z^⊤] = (E[Z])^⊤; E[tr Z] = tr E[Z].
Theorem 9.1.4. For partitioned matrices, E[ [X; Y] ] = [ E[X]; E[Y] ] (stacking X on top of Y).

Special cases: If C is a constant, then E[C] = C; E[AX + BY] = A E[X] + B E[Y]; and E[a·X + b·Y] = a·E[X] + b·E[Y].
If X and Y are random matrices, then the covariance of these two matrices is a four-way array containing the covariances of all elements of X with all elements of Y. Certain conventions are necessary to arrange this four-way array in a two-dimensional scheme that can be written on a sheet of paper. Before we develop those, we will first define the covariance matrix for two random vectors.
Definition 9.1.5. The covariance matrix of two random vectors is defined as

(9.1.1)  C[x, y] = E[(x − E[x])(y − E[y])^⊤].
Theorem 9.1.6. C[x, y] = E[xy^⊤] − (E[x])(E[y])^⊤.

Theorem 9.1.7. C[Ax + b, Cy + d] = A C[x, y] C^⊤.
Problem 152. Prove Theorem 9.1.7.
Theorem 9.1.8.

C[ [x; y], [u; v] ] = [ C[x,u]  C[x,v] ]
                      [ C[y,u]  C[y,v] ]

Special case: C[Ax + By, Cu + Dv] = A C[x,u]C^⊤ + A C[x,v]D^⊤ + B C[y,u]C^⊤ + B C[y,v]D^⊤. To show this, express each of the arguments as a partitioned matrix, then use Theorem 9.1.7.
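These covariance rules are easy to check numerically. The following minimal NumPy sketch (with arbitrary simulated data and arbitrary matrices A, B, C, D; the constant offsets of Theorem 9.1.7 are omitted) compares the two sides of the special case above; since the sample covariance is itself bilinear, the two sides agree up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# Draw correlated samples of four 2-vectors x, y, u, v (rows = observations).
S = rng.standard_normal((8, 8))
data = rng.standard_normal((N, 8)) @ S.T
x, y, u, v = data[:, 0:2], data[:, 2:4], data[:, 4:6], data[:, 6:8]

A, B, C, D = (rng.standard_normal((2, 2)) for _ in range(4))

def cov(p, q):
    """Sample version of C[p, q] = E[(p - E[p])(q - E[q])^T]."""
    pc, qc = p - p.mean(0), q - q.mean(0)
    return pc.T @ qc / (len(p) - 1)

lhs = cov(x @ A.T + y @ B.T, u @ C.T + v @ D.T)
rhs = (A @ cov(x, u) @ C.T + A @ cov(x, v) @ D.T
       + B @ cov(y, u) @ C.T + B @ cov(y, v) @ D.T)
print(np.allclose(lhs, rhs))    # True: sample covariances obey the same bilinear rule
```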
Definition 9.1.9. V[x] = C[x, x] is called the dispersion matrix.
It follows from Theorem 9.1.8 that

V[ [x; y] ] = [ V[x]    C[x,y] ]
              [ C[y,x]  V[y]   ]

Theorem 9.1.10. Assume var[y_i] exists for every component y_i of the vector y. Then the whole dispersion matrix V[y] exists.
Theorem 9.1.11. V[x] is singular if and only if a vector a ≠ o exists so that a^⊤x is almost surely a constant.

Proof: Call V[x] = Σ. Then Σ is singular iff a vector a ≠ o exists with Σa = o, iff such an a exists with a^⊤Σa = var[a^⊤x] = 0, iff an a exists so that a^⊤x is almost surely a constant.
This means that singular random variables have a restricted range: their values are contained in a linear subspace. This has relevance for estimators involving singular random variables: two such estimators (i.e., functions of a singular random variable) should still be considered the same if their values coincide in that subspace in which the values of the random variable are concentrated, even if their values differ elsewhere.
Problem 154. [Seb77, exercise 1a–3 on p. 13] Let x = [x_1, ..., x_n]^⊤ be a vector of random variables, and let y_1 = x_1 and y_i = x_i − x_{i−1} for i = 2, 3, ..., n. What must the dispersion matrix V[x] be so that the y_i are uncorrelated with each other and each have unit variance?

Answer. cov[x_i, x_j] = min(i, j).
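A quick numerical confirmation of this answer (a minimal sketch with an arbitrary choice of n): build the matrix with entries min(i, j), apply the differencing transformation, and check that the resulting dispersion matrix is the identity, as Theorem 9.1.7 predicts via V[y] = D V[x] D^⊤.

```python
import numpy as np

n = 6
i = np.arange(1, n + 1)
V = np.minimum.outer(i, i)            # V[x] with entries cov[x_i, x_j] = min(i, j)

D = np.eye(n) - np.eye(n, k=-1)       # differencing: y_1 = x_1, y_i = x_i - x_{i-1}
Vy = D @ V @ D.T                      # dispersion of y by theorem 9.1.7

print(np.allclose(Vy, np.eye(n)))     # True: the y_i are uncorrelated with unit variance
```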
9.2 Means and Variances of Quadratic Forms

Theorem 9.2.1. Assume y is a random vector with E[y] = η and V[y] = σ²Ψ, and let A be a matrix of constants. Then

(9.2.1)  E[y^⊤Ay] = σ² tr(AΨ) + η^⊤Aη.

Proof: Write y as the sum of η and ε = y − η; then

(9.2.2)  y^⊤Ay = (ε + η)^⊤A(ε + η)
(9.2.3)        = ε^⊤Aε + ε^⊤Aη + η^⊤Aε + η^⊤Aη.

η^⊤Aη is nonstochastic, and since E[ε] = o it follows E[y^⊤Ay] = E[ε^⊤Aε] + η^⊤Aη = E[tr(Aεε^⊤)] + η^⊤Aη = tr(A E[εε^⊤]) + η^⊤Aη = σ² tr(AΨ) + η^⊤Aη.
Since we are now writing σ²Ψ = Σ, it follows E[yy^⊤] = ηη^⊤ + Σ.
Problem 155. Assume y_1, y_2, ..., y_n are independently distributed with common mean E[y_i] = η but possibly different variances var[y_i] = σ_i². Show that (1/(n(n−1))) Σ_i (y_i − ȳ)² is an unbiased estimator of var[ȳ].

Answer. Write y = [y_1, y_2, ..., y_n]^⊤ and Σ = diag(σ_1², σ_2², ..., σ_n²). Then the vector [y_1 − ȳ, y_2 − ȳ, ..., y_n − ȳ]^⊤ can be written as (I − (1/n)ιι^⊤)y. (1/n)ιι^⊤ is idempotent, therefore D = I − (1/n)ιι^⊤ is idempotent too. Our estimator is (1/(n(n−1))) y^⊤Dy, and since the mean vector η = ιη satisfies Dη = o, Theorem 9.2.1 gives

E[y^⊤Dy] = tr[DΣ] = tr[Σ] − (1/n) tr[ιι^⊤Σ] = (σ_1² + ··· + σ_n²) − (1/n)(σ_1² + ··· + σ_n²).

Divide this by n(n − 1) to get (σ_1² + ··· + σ_n²)/n², which is var[ȳ], as claimed.
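Here is a small numerical cross-check of this computation (a sketch with arbitrary variances σ_i²): tr(DΣ)/(n(n−1)) should equal var[ȳ] = (σ_1² + ··· + σ_n²)/n².

```python
import numpy as np

rng = np.random.default_rng(1)
n = 7
sigma2 = rng.uniform(0.5, 3.0, size=n)        # heterogeneous variances sigma_i^2
Sigma = np.diag(sigma2)

iota = np.ones((n, 1))
D = np.eye(n) - iota @ iota.T / n             # idempotent centering matrix

# E[y'Dy] = tr(D Sigma) because D eta = o; divide by n(n-1) for the estimator's mean.
estimator_mean = np.trace(D @ Sigma) / (n * (n - 1))
var_ybar = sigma2.sum() / n**2                # variance of the sample mean

print(np.isclose(estimator_mean, var_ybar))   # True
```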
For the variances of quadratic forms we need the third and fourth moments of the underlying random variables.
Problem 156. Let µ_i = E[(y − E[y])^i] be the ith centered moment of y, and let σ = √µ_2 be its standard deviation. Then the skewness is defined as γ_1 = µ_3/σ³, and the kurtosis as γ_2 = (µ_4/σ⁴) − 3. Show that the skewness and kurtosis of ay + b are equal to those of y if a > 0; for a < 0 the skewness changes its sign. Show that skewness γ_1 and kurtosis γ_2 always satisfy

(9.2.13)  γ_1² ≤ γ_2 + 2.
Answer. Define ε = y − µ, where µ = E[y], and apply the Cauchy-Schwarz inequality to the variables ε and ε²:

(9.2.14)  (σ³γ_1)² = (E[ε³])² = (cov[ε, ε²])² ≤ var[ε] var[ε²] = σ⁶(γ_2 + 2).
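As an illustration (a small sketch with a few arbitrarily chosen distributions), one can tabulate skewness and excess kurtosis and confirm the bound γ_1² ≤ γ_2 + 2; scipy reports exactly these two quantities.

```python
from scipy import stats

examples = [
    ("normal",      stats.norm()),
    ("exponential", stats.expon()),
    ("uniform",     stats.uniform()),
    ("chi2(3)",     stats.chi2(3)),
]

for name, dist in examples:
    g1, g2 = dist.stats(moments="sk")   # skewness gamma_1, excess kurtosis gamma_2
    print(f"{name:12s}  gamma1^2 = {float(g1)**2:6.3f}  <=  gamma2 + 2 = {float(g2) + 2:6.3f}")
```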
Show also that every pair (γ_1, γ_2) satisfying this inequality can occur as the skewness and kurtosis of a random variable.

Answer. To show that all combinations satisfying this inequality are possible, define
Theorem 9.2.2. Given a random vector ε of independent variables ε_i with zero expected value E[ε_i] = 0, and whose second and third moments are identical. Call var[ε_i] = σ² and E[ε_i³] = σ³γ_1 (where σ is the positive square root of σ²); here γ_1 is called the skewness of these variables. Then the following holds for the third mixed moments:

(9.2.16)  E[ε_iε_jε_k] = σ³γ_1 if i = j = k, and 0 otherwise.

From (9.2.16) it follows that for any n × 1 vector a and any symmetric n × n matrix C whose vector of diagonal elements is c,

(9.2.17)  E[(a^⊤ε)(ε^⊤Cε)] = σ³γ_1 a^⊤c.
One would like to have a matrix notation for (9.2.16) from which (9.2.17) follows by a trivial operation. This is not easily possible in the usual notation, but it is possible in tile notation. Since the diagonal array ∆ applied to C gives the vector of diagonal elements of C, called c, the last term in equation (9.2.21) is the scalar product a^⊤c.
Given a random vector ε of independent variables ε_i with zero expected value E[ε_i] = 0 and identical second and fourth moments, call var[ε_i] = σ² and E[ε_i⁴] = σ⁴(γ_2 + 3), where γ_2 is the kurtosis. Then the following holds for the fourth moments (equation (9.2.23) expresses this in tile notation):

E[ε_iε_jε_kε_l] = σ⁴(γ_2 + 3) if i = j = k = l; σ⁴ if the indices are equal in two distinct pairs (i = j ≠ k = l, i = k ≠ j = l, or i = l ≠ j = k); and 0 otherwise.

Problem. Show that for any symmetric n × n matrices A and B, whose vectors of diagonal elements are a and b,

(9.2.24)  E[(ε^⊤Aε)(ε^⊤Bε)] = σ⁴(tr A tr B + 2 tr(AB) + γ_2 a^⊤b).
Answer. (9.2.24) is an immediate consequence of (9.2.23); this step is now trivial due to the linearity of the expected value:

[tile-notation expansion of E[(ε^⊤Aε)(ε^⊤Bε)] into four terms, the last involving the diagonal array ∆]

The first term is tr AB. The second is tr AB^⊤, but since A and B are symmetric, this is equal to tr AB. The third term is tr A tr B. What is the fourth term? Diagonal arrays exist with any number of arms, and any connected concatenation of diagonal arrays is again a diagonal array; see (B.2.1). For instance,
[tile diagram: a concatenation of diagonal arrays ∆, which is again a diagonal array]

From this together with (B.1.4) one can see that the fourth term is the scalar product of the diagonal vectors a and b, i.e., γ_2σ⁴ a^⊤b.
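Equation (9.2.24) can be verified exactly on a small example (a sketch with arbitrary symmetric matrices): for i.i.d. ε_i taking the values ±1 with probability 1/2 each, σ² = 1 and the kurtosis is γ_2 = −2, and enumerating all sign vectors for n = 4 gives the left-hand expectation exactly.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)); A = (A + A.T) / 2     # arbitrary symmetric matrices
B = rng.standard_normal((n, n)); B = (B + B.T) / 2
a, b = np.diag(A), np.diag(B)

# eps_i = +/-1 with prob 1/2: sigma^2 = 1, E[eps^4] = 1, hence gamma_2 = -2.
lhs = np.mean([(e @ A @ e) * (e @ B @ e)
               for e in (np.array(s) for s in itertools.product([-1.0, 1.0], repeat=n))])

gamma2 = -2.0
rhs = np.trace(A) * np.trace(B) + 2 * np.trace(A @ B) + gamma2 * a @ b

print(np.isclose(lhs, rhs))                            # True (exact up to rounding)
```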
Problem. Let A be a symmetric n × n matrix and denote its vector of diagonal elements by a. Let x = θ + ε, where ε satisfies the conditions of Theorem 9.2.2 and equation (9.2.23). Then

(9.2.27)  var[x^⊤Ax] = 4σ²θ^⊤A²θ + 4σ³γ_1 θ^⊤Aa + σ⁴(γ_2 a^⊤a + 2 tr(A²)).
Answer. Proof: var[x^⊤Ax] = E[(x^⊤Ax)²] − (E[x^⊤Ax])². Since by assumption V[x] = σ²I, the second term is, by Theorem 9.2.1, (σ² tr A + θ^⊤Aθ)². Now look at the first term. Again using the notation ε = x − θ, it follows from (9.2.3) that

(9.2.28)  (x^⊤Ax)² = (ε^⊤Aε)² + 4(θ^⊤Aε)² + (θ^⊤Aθ)²
(9.2.29)            + 4 ε^⊤Aε θ^⊤Aε + 2 ε^⊤Aε θ^⊤Aθ + 4 θ^⊤Aε θ^⊤Aθ.
We will take expectations of these terms one by one. Use (9.2.24) with B = A for the first term:

(9.2.30)  E[(ε^⊤Aε)²] = σ⁴((tr A)² + 2 tr(A²) + γ_2 a^⊤a).
To deal with the second term in (9.2.29) define b = Aθ; then

(9.2.31)  (θ^⊤Aε)² = (b^⊤ε)² = b^⊤ε ε^⊤b = tr(b^⊤εε^⊤b) = tr(εε^⊤bb^⊤)
(9.2.32)  E[(θ^⊤Aε)²] = σ² tr(bb^⊤) = σ² b^⊤b = σ² θ^⊤A²θ.
The third term is a constant which remains as it is; for the fourth term use (9.2.17):

(9.2.33)  ε^⊤Aε θ^⊤Aε = ε^⊤Aε b^⊤ε
(9.2.34)  E[ε^⊤Aε θ^⊤Aε] = σ³γ_1 a^⊤b = σ³γ_1 a^⊤Aθ.
If one takes expected values, the fifth term becomes 2σ² tr(A) θ^⊤Aθ, and the last term falls away. Putting the pieces together, the statement follows.
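The variance formula (9.2.27) can likewise be checked exactly on a toy example (a sketch with an arbitrarily chosen two-point distribution, symmetric matrix, and mean vector): compute σ, γ_1, γ_2 from the two-point distribution, enumerate the finitely many outcomes of x = θ + ε for n = 3, and compare the exact variance of x^⊤Ax with the right-hand side of (9.2.27).

```python
import itertools
import numpy as np

# Two-point distribution for each eps_i: value 2 w.p. 1/3, value -1 w.p. 2/3 (mean zero).
vals = np.array([2.0, -1.0]); probs = np.array([1/3, 2/3])
sigma2 = probs @ vals**2
sigma = np.sqrt(sigma2)
gamma1 = (probs @ vals**3) / sigma**3
gamma2 = (probs @ vals**4) / sigma**4 - 3

rng = np.random.default_rng(3)
n = 3
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
a = np.diag(A)
theta = rng.standard_normal(n)

# Exact first and second moments of x'Ax by enumerating the 2^n outcomes of eps.
m1 = m2 = 0.0
for idx in itertools.product(range(2), repeat=n):
    p = np.prod(probs[list(idx)])
    x = theta + vals[list(idx)]
    q = x @ A @ x
    m1 += p * q
    m2 += p * q**2
var_exact = m2 - m1**2

var_formula = (4 * sigma2 * theta @ A @ A @ theta
               + 4 * sigma**3 * gamma1 * theta @ A @ a
               + sigma**4 * (gamma2 * a @ a + 2 * np.trace(A @ A)))

print(np.isclose(var_exact, var_formula))     # True
```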
CHAPTER 10
The Multivariate Normal Probability Distribution
10.1 More About the Univariate Case
By definition, z is a standard normal variable, in symbols z ∼ N(0, 1), if it has the density function

f_z(z) = (1/√(2π)) e^{−z²/2}.

One can verify that this integrates to 1 by the following trick: take two independent standard normal variables x and y; their joint density is (1/(2π)) e^{−(x²+y²)/2}. In order to see that this joint density integrates to 1, go over to polar coordinates x = r cos φ, y = r sin φ, i.e., compute the joint distribution of r and φ from that of x and y: the absolute value of the Jacobian determinant is r, i.e., dx dy = r dr dφ, therefore

∫∫ (1/(2π)) e^{−(x²+y²)/2} dx dy = ∫_{φ=0}^{2π} ∫_{r=0}^{∞} (1/(2π)) e^{−r²/2} r dr dφ = ∫_{r=0}^{∞} e^{−r²/2} r dr = [−e^{−r²/2}]_{r=0}^{∞} = 1.
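This computation can also be cross-checked numerically; the following minimal SciPy sketch (with finite integration limits chosen large enough to be effectively infinite) evaluates the double integral directly.

```python
import numpy as np
from scipy.integrate import dblquad

# Integrate (1/2pi) exp(-(x^2+y^2)/2) over a large box covering essentially all the mass.
val, err = dblquad(lambda y, x: np.exp(-(x**2 + y**2) / 2) / (2 * np.pi),
                   -10, 10, lambda x: -10, lambda x: 10)
print(round(val, 6))   # 1.0
```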
A univariate normal variable with mean µ and variance σ² is a variable x whose standardized version z = (x − µ)/σ ∼ N(0, 1). In this transformation from x to z, the Jacobian determinant is dz/dx = 1/σ; therefore the density function of x ∼ N(µ, σ²) is

f_x(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.
Problem. Given n independent observations y_1, ..., y_n of a normally distributed variable y ∼ N(µ, 1). Show that the sample mean ȳ is a sufficient statistic for µ. Here is a formulation of the factorization theorem for sufficient statistics, which you will need for this question: Given a family of probability densities f_y(y_1, ..., y_n; θ) defined on R^n, which depend on a parameter θ ∈ Θ. The statistic T : R^n → R, (y_1, ..., y_n) ↦ T(y_1, ..., y_n), is sufficient for parameter θ if and only if there exists a function of two variables g : R × Θ → R, (t, θ) ↦ g(t; θ), and a function of n variables h : R^n → R, (y_1, ..., y_n) ↦ h(y_1, ..., y_n), so that

f_y(y_1, ..., y_n; θ) = g(T(y_1, ..., y_n); θ) · h(y_1, ..., y_n).
10.2 Definition of Multivariate Normal

The multivariate normal distribution is an important family of distributions with very nice properties. But one must be a little careful how to define it. One might naively think a multivariate normal is a vector random variable each component of which is univariate normal. But this is not the right definition. Normality of the components is a necessary but not sufficient condition for a multivariate normal vector. If u = [x; y] with both x and y multivariate normal, u is not necessarily multivariate normal.
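A standard counterexample, sketched below in NumPy with an arbitrary sample size, makes this concrete: if x ∼ N(0, 1) and s is an independent random sign, then y = sx is again N(0, 1), but x + y has a point mass at 0, so (x, y) cannot be bivariate normal.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)   # random sign, independent of x
y = s * x                             # marginally N(0,1), but (x, y) is not bivariate normal

# The marginal of y looks standard normal ...
print(y.mean(), y.var())              # ~0, ~1
# ... yet x + y has a point mass at 0, which no normal variable has.
print(np.mean(x + y == 0))            # ~0.5
```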
Here is a recursive definition from which one gets all multivariate normal distributions:

(1) The univariate standard normal z, considered as a vector with one component, is multivariate normal.

(2) If x and y are multivariate normal and they are independent, then u = [x; y] is multivariate normal.

(3) If y is multivariate normal, and A a matrix of constants (which need not be square and is allowed to be singular), and b a vector of constants, then Ay + b is multivariate normal. In words: a vector consisting of linear combinations of the same set of multivariate normal variables is again multivariate normal.
For simplicity we will now go over to the bivariate normal distribution.
10.3 Special Case: Bivariate Normal

The following two simple rules allow one to obtain all bivariate normal random variables:

(1) If x and y are independent and each of them has a (univariate) normal distribution with mean 0 and the same variance σ², then they are bivariate normal. (They would be bivariate normal even if their variances were different and their means not zero, but for the calculations below we will use only this special case, which together with principle (2) is sufficient to get all bivariate normal distributions.)
(2) If x = [x; y] is bivariate normal and P is a 2 × 2 nonrandom matrix and µ a nonrandom column vector with two elements, then Px + µ is bivariate normal as well.
All other properties of bivariate normal variables can be derived from these two rules. First let us derive the density function of a bivariate normal distribution. Write x = [x; y], where x and y are independent and each N(0, σ²); by rule (1) this vector x is bivariate normal. Take any nonsingular 2 × 2 matrix P and a 2-vector µ = [µ; ν], and define u = [u; v] = Px + µ; by rule (2), u is bivariate normal as well. We know the density of x:

(10.3.1)  f_{x,y}(x, y) = (1/(2πσ²)) exp(−(x² + y²)/(2σ²)).
For the next step, remember that we have to express the old variable in terms of the new one: x = P^{−1}(u − µ). The Jacobian determinant is therefore J = det(P^{−1}). Also notice that, after the substitution, the exponent in the joint density function of x and y is

−(1/(2σ²))(x² + y²) = −(1/(2σ²)) x^⊤x = −(1/(2σ²)) (u − µ)^⊤(P^{−1})^⊤P^{−1}(u − µ).

Therefore the transformation theorem for density functions gives
(10.3.2)  f_{u,v}(u, v) = (1/(2πσ²)) |det(P^{−1})| exp(−(1/(2σ²)) (u − µ)^⊤(P^{−1})^⊤P^{−1}(u − µ)).

The covariance matrix of u is V[[u; v]] = σ²PP^⊤ = σ²Ψ, say. Since (P^{−1})^⊤P^{−1}PP^⊤ = I, it follows (P^{−1})^⊤P^{−1} = Ψ^{−1} and |det(P^{−1})| = 1/√(det Ψ), therefore
(10.3.3)  f_{u,v}(u, v) = (1/(2πσ²)) (1/√(det Ψ)) exp(−(1/(2σ²)) (u − µ)^⊤Ψ^{−1}(u − µ)).

The same formula holds in n dimensions: if x is an n-dimensional normal vector with E[x] = µ and V[x] = σ²Ψ, Ψ nonsingular, then

(10.3.4)  f_x(x) = (2πσ²)^{−n/2} (det Ψ)^{−1/2} exp(−(1/(2σ²)) (x − µ)^⊤Ψ^{−1}(x − µ)).
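A small simulation sketch of this construction (with an arbitrary nonsingular P, mean vector, σ, and evaluation point): draw x with independent N(0, σ²) components, form u = Px + µ, compare the empirical covariance of u with σ²Ψ = σ²PP^⊤, and compare the density (10.3.3) with scipy's bivariate normal pdf.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(5)
sigma = 1.5
P = np.array([[2.0, 0.3], [-1.0, 0.8]])       # arbitrary nonsingular 2x2 matrix
mu = np.array([1.0, -2.0])
Psi = P @ P.T

# Simulate u = P x + mu with x having independent N(0, sigma^2) components.
x = sigma * rng.standard_normal((200_000, 2))
u = x @ P.T + mu
print(np.cov(u, rowvar=False))                 # approx sigma^2 * Psi
print(sigma**2 * Psi)

# Density (10.3.3) at an arbitrary point versus scipy's bivariate normal pdf.
pt = np.array([0.5, -1.0])
d = pt - mu
f = (1 / (2 * np.pi * sigma**2 * np.sqrt(np.linalg.det(Psi)))
     * np.exp(-d @ np.linalg.inv(Psi) @ d / (2 * sigma**2)))
print(np.isclose(f, multivariate_normal(mean=mu, cov=sigma**2 * Psi).pdf(pt)))   # True
```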
Problem 163. 1 point. Show that the matrix product of (P^{−1})^⊤P^{−1} and PP^⊤ is the identity matrix.
Problem 164. 3 points. All vectors in this question are n × 1 column vectors. Let y = α + ε, where α is a vector of constants and ε is jointly normal with E[ε] = o. Often, the covariance matrix V[ε] is not given directly, but an n × n nonsingular matrix T is known which has the property that the covariance matrix of Tε is σ² times the n × n unit matrix, i.e.,

(10.3.5)  V[Tε] = σ²I_n.

Show that in this case the density function of y is

(10.3.6)  f_y(y) = (2πσ²)^{−n/2} |det(T)| exp(−(1/(2σ²)) (T(y − α))^⊤ T(y − α)).

Hint: define z = Tε, write down the density function of z, and make a transformation between z and y.
Answer. Since E[z] = o and V[z] = σ²I_n, its density function is (2πσ²)^{−n/2} exp(−z^⊤z/(2σ²)). Now express z, whose density we know, as a function of y, whose density function we want to know: z = T(y − α), or

(10.3.7)  z_1 = t_11(y_1 − α_1) + t_12(y_2 − α_2) + ··· + t_1n(y_n − α_n)
(10.3.8)       ⋮
(10.3.9)  z_n = t_n1(y_1 − α_1) + t_n2(y_2 − α_2) + ··· + t_nn(y_n − α_n);

therefore the Jacobian determinant is det(T). This gives the result.
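A numerical spot-check of (10.3.6) (a sketch with an arbitrary nonsingular T, α, σ, and evaluation point): since V[Tε] = σ²I implies V[y] = σ²(T^⊤T)^{−1}, the formula should agree with scipy's multivariate normal density with mean α and that covariance matrix.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(6)
n, sigma = 3, 0.7
T = rng.standard_normal((n, n)) + 2 * np.eye(n)    # some nonsingular matrix
alpha = rng.standard_normal(n)
y = rng.standard_normal(n)                          # arbitrary evaluation point

z = T @ (y - alpha)
f_formula = (2 * np.pi * sigma**2) ** (-n / 2) * abs(np.linalg.det(T)) \
            * np.exp(-z @ z / (2 * sigma**2))

cov_y = sigma**2 * np.linalg.inv(T.T @ T)           # V[y] implied by V[T eps] = sigma^2 I
f_scipy = multivariate_normal(mean=alpha, cov=cov_y).pdf(y)

print(np.isclose(f_formula, f_scipy))               # True
```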
10.3.1 Most Natural Form of Bivariate Normal Density

Problem 165. In this problem the bivariate normal density is brought into its most natural form. For this we set the multiplicative “nuisance parameter” σ² = 1, i.e., we write the covariance matrix as Ψ instead of σ²Ψ.
• a. 1 point. Write the covariance matrix Ψ = V[[u; v]] in terms of the standard deviations σ_u and σ_v and the correlation coefficient ρ.
• b. 1 point. Show that the inverse of a 2 × 2 matrix has the following form:

[a  b; c  d]^{−1} = (1/(ad − bc)) [d  −b; −c  a].
• c. 2 points. Show that

(10.3.11)  q² = [u − µ  v − ν] Ψ^{−1} [u − µ; v − ν]
              = (1/(1 − ρ²)) ( (u − µ)²/σ_u² − 2ρ(u − µ)(v − ν)/(σ_u σ_v) + (v − ν)²/σ_v² ).
• d. 2 points. Show the following quadratic decomposition:

q² = (u − µ)²/σ_u² + (1/((1 − ρ²)σ_v²)) ( v − ν − ρ(σ_v/σ_u)(u − µ) )².

• f. 1 point. Show that d = √(det Ψ) can be split up, not additively but multiplicatively, as follows: d = σ_u · σ_v √(1 − ρ²).
Putting the pieces together, the joint density factors as

(10.3.15)  f_{u,v}(u, v) = (1/(σ_u√(2π))) exp(−u²/(2σ_u²)) · (1/(σ_v√(2π(1 − ρ²)))) exp(−(v − ρ(σ_v/σ_u)u)²/(2(1 − ρ²)σ_v²))

(written here for the zero-mean case µ = ν = 0). The second factor in (10.3.15) is the density of a N(ρ(σ_v/σ_u)u, (1 − ρ²)σ_v²) evaluated at v, and the first factor does not depend on v. Therefore if I integrate v out to get the marginal density of u, this simply gives me the first factor. The conditional density of v given u = u is the joint divided by the marginal, i.e., it is the second factor. In other words, by completing the square we wrote the joint density function in its natural form as the product of a marginal and a conditional density function:

f_{u,v}(u, v) = f_u(u) · f_{v|u}(v; u).
From this decomposition one can draw the following conclusions:

• u ∼ N(0, σ_u²) is normal and, by symmetry, v is normal as well. Note that u (or v) can be chosen to be any nonzero linear combination of x and y. Any nonzero linear transformation of independent standard normal variables is therefore univariate normal.

• If ρ = 0 then the joint density function is the product of two independent univariate normal density functions. In other words, if the variables are normal, then they are independent whenever they are uncorrelated. For general distributions only the reverse is true.

• The conditional density of v conditionally on u = u is the second factor on the rhs of (10.3.15), i.e., it is normal too.
• The conditional mean is

(10.3.16)  E[v | u = u] = ρ(σ_v/σ_u) u,

i.e., it is a linear function of u. If the (unconditional) means are not zero, then the conditional mean is

(10.3.17)  E[v | u = u] = µ_v + ρ(σ_v/σ_u)(u − µ_u).

• The conditional variance is var[v | u = u] = (1 − ρ²)σ_v²,
which can also be written as

(10.3.20)  var[v | u = u] = var[v] − (cov[u, v])²/var[u].
We did this in such detail because any bivariate normal with zero mean has this form. A multivariate normal distribution is determined by its means and variances and covariances (or correlation coefficients). If the means are not zero, then the densities merely differ from the above by an additive constant in the arguments, i.e., if one needs formulas for nonzero means, one has to replace u and v in the above equations by u − µ_u and v − µ_v. du and dv remain the same, because the Jacobian of the translation u ↦ u − µ_u, v ↦ v − µ_v is 1. While the univariate normal was determined by mean and standard deviation, the bivariate normal is determined by the two means µ_u and µ_v, the two standard deviations σ_u and σ_v, and the correlation coefficient ρ.
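These conditional-moment formulas can be checked by simulation (a sketch with arbitrary parameter values): regressing v on u in a large bivariate normal sample recovers the slope ρσ_v/σ_u of (10.3.16)/(10.3.17), and the residual variance approximates var[v | u] = (1 − ρ²)σ_v² from (10.3.20).

```python
import numpy as np

rng = np.random.default_rng(7)
mu_u, mu_v = 1.0, -0.5
sigma_u, sigma_v, rho = 2.0, 0.8, 0.6
cov = [[sigma_u**2, rho * sigma_u * sigma_v],
       [rho * sigma_u * sigma_v, sigma_v**2]]

u, v = rng.multivariate_normal([mu_u, mu_v], cov, size=500_000).T

slope, intercept = np.polyfit(u, v, 1)          # least-squares line of v on u
residual_var = np.var(v - (slope * u + intercept))

print(slope, rho * sigma_v / sigma_u)           # both ~0.24
print(residual_var, (1 - rho**2) * sigma_v**2)  # both ~0.41
```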
10.3.2 Level Lines of the Normal Density

Written in terms of the angle δ defined by ρ = cos δ, the covariance matrix (??) has the form

[ σ_u²           σ_uσ_v cos δ ]
[ σ_uσ_v cos δ   σ_v²         ]

The parametrized vector given in (10.3.22) satisfies x^⊤Ψ^{−1}x = r². The opposite holds too: all vectors x satisfying x^⊤Ψ^{−1}x = r² can be written in the form (10.3.22) for some φ, but I am not asking to prove this. This formula can be used to draw level lines of the bivariate normal density and confidence ellipses; more details are in (??).
Problem 167. The ellipse in Figure 1 contains all the points x, y for which

(10.3.23)  [x − 1  y − 1] [ 0.5  −0.25; −0.25  ⋯ ] [x − 1; y − 1] ≤ ⋯