A Course in Mathematical Statistics (Part 9)

Chapter 17 Analysis of Variance

The Analysis of Variance techniques discussed in this chapter can be used to study a great variety of problems of practical interest. Below we mention a few such problems.

Crop yields corresponding to different soil treatments.

Crop yields corresponding to different soils and fertilizers.

Comparison of a certain brand of gasoline with and without an additive by using it in several cars.

Comparison of different brands of gasoline by using them in several cars.

Comparison of the wearing of different materials.

Comparison of the effect of different types of oil on the wear of several piston rings, etc.

Comparison of the yields of a chemical substance by using different catalytic methods.

Comparison of the strengths of certain objects made of different batches.

17.1 One-way Layout (or One-way Classification) with the Same Number of Observations Per Cell

The models to be discussed in the present chapter are special cases of the general model which was studied in the previous chapter. In this section, we consider what is known as a one-way layout, or one-way classification, which we introduce by means of a couple of examples.


EXAMPLE 1 Consider I machines, each one of which is manufactured by I different companies but all intended for the same purpose. A purchaser who is interested in acquiring a number of these machines is then faced with the question as to which brand he should choose. Of course his decision is to be based on the productivity of each one of the I different machines. To this end, let a worker run each one of the I machines for J days each, always under the same conditions, and denote by Y_ij his output the jth day he is running the ith machine. Let μ_i be the average output of the worker when running the ith machine and let e_ij be his "error" (variation) the jth day when he is running the ith machine. Then it is reasonable to assume that the r.v.'s e_ij are normally distributed with mean 0 and variance σ². It is further assumed that they are independent. Therefore the Y_ij's are r.v.'s themselves and one has the following model:

Y_ij = μ_i + e_ij, where the e_ij's are independent N(0, σ²), i = 1, ..., I, j = 1, ..., J. (1)

EXAMPLE 2 For an agricultural example, consider I · J identical plots arranged in an I × J orthogonal array. Suppose that the same agricultural commodity (some sort of a grain, tomatoes, etc.) is planted in all I · J plots and that the plants in the ith row are treated by the ith kind of I available fertilizers. All other conditions being assumed the same, the problem is that of comparing the I different kinds of fertilizers with a view to using the most appropriate one on a large scale. Once again, we denote by μ_i the average yield of each one of the J plots in the ith row, and let e_ij stand for the variation of the yield from plot to plot in the ith row, i = 1, ..., I. Then it is again reasonable to assume that the r.v.'s e_ij, i = 1, ..., I, j = 1, ..., J are independent N(0, σ²), so that model (1) applies here as well.

In these examples the I rows and the J columns of the layout are represented by (straight) lines. In such a case there are formed IJ rectangles in the resulting rectangular array, which are also referred to as cells (see also Fig. 17.1). The same interpretation and terminology is used in similar situations throughout this chapter.

In connection with model (1), there are three basic problems we are interested in: estimation of μ_i, i = 1, ..., I; testing the hypothesis H: μ_1 = ··· = μ_I (= μ, unspecified) (that is, there is no difference between the I machines, or the I kinds of fertilizers); and estimation of σ².

Set

Y = (Y_11, ..., Y_1J, ..., Y_I1, ..., Y_IJ)′, β = (μ_1, ..., μ_I)′, e = (e_11, ..., e_1J, ..., e_I1, ..., e_IJ)′,

and let X′ be the IJ × I matrix whose ith block of J successive rows has 1 in the ith column and 0 elsewhere. Then it is clear that Y = X′β + e. Thus we have the model described in (6) of Chapter 16 with n = IJ and p = I. Next, the I vectors (1, 0, ..., 0)′, (0, 1, 0, ..., 0)′, ..., (0, 0, ..., 0, 1)′ are clearly independent, and any other row vector in X′ is a linear combination of them. Thus rank X′ = I (= p); that is, X′ is of full rank.

Figure 17.1 (the I × J rectangular array of cells)

Then, by Theorem 2 of Chapter 16, the μ_i, i = 1, ..., I have uniquely determined LSE's, which have all the properties mentioned in Theorem 5 of the same chapter. In order to determine their explicit expressions, we observe that

S(Y, β) = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − μ_i)², so that (∂/∂μ_i)S(Y, β) = 0 gives

μ̂_i = Y_i., where Y_i. = (1/J) Σ_{j=1}^J Y_ij, i = 1, ..., I, (2)

so that, under the hypothesis H: μ_1 = ··· = μ_I (= μ, unspecified), η ∈ V_1. That is, r − q = 1 and hence q = r − 1 = p − 1 = I − 1. Therefore, according to (31) in Chapter 16, the F statistic for testing H is given by

F = [(n − r)/q] · [S_C(Y, β̂_C) − S(Y, β̂)] / S(Y, β̂) = [I(J − 1)/(I − 1)] · [S_C(Y, β̂_C) − S(Y, β̂)] / S(Y, β̂),

where S(Y, β̂) is the minimum of S(Y, β) under the full model and S_C(Y, β̂_C) is its minimum under H. To compute S_C(Y, β̂_C), observe that, under H,

S_C(Y, β) = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − μ)². One has then the (unique) solution

μ̂ = Y.., where Y.. = (1/IJ) Σ_{i=1}^I Σ_{j=1}^J Y_ij. (4)

Therefore relations (28) and (29) in Chapter 16 give

S(Y, β̂) = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y_i.)² = SS_e (5)

and

S_C(Y, β̂_C) = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y..)² = SS_T. (6)

Likewise,

S_C(Y, β̂_C) − S(Y, β̂) = J Σ_{i=1}^I (Y_i. − Y..)² = SS_H, (7)

so that the F statistic becomes

F = [I(J − 1)/(I − 1)] · SS_H/SS_e = MS_H/MS_e, where MS_H = SS_H/(I − 1) and MS_e = SS_e/[I(J − 1)].

Table 1 Analysis of Variance for One-Way Layout

source of variance   sums of squares                               degrees of freedom   mean squares
between groups       SS_H = J Σ_{i=1}^I (Y_i. − Y..)²              I − 1                MS_H = SS_H/(I − 1)
within groups        SS_e = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y_i.)²     I(J − 1)             MS_e = SS_e/[I(J − 1)]
total                SS_T = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y..)²      IJ − 1

REMARK 1 From (5), (6) and (7) it follows that SS_T = SS_H + SS_e. Also from (6) it follows that SS_T stands for the sum of squares of the deviations of the Y_ij's from the grand (sample) mean Y... Next, from (5) we have that, for each i, Σ_{j=1}^J (Y_ij − Y_i.)² is the sum of squares of the deviations of Y_ij, j = 1, ..., J within the ith group. For this reason, SS_e is called the sum of squares within groups. On the other hand, from (7) we have that SS_H represents the sum of squares of the deviations of the group means Y_i. from the grand mean Y.. (up to the factor J). For this reason, SS_H is called the sum of squares between groups. Finally, SS_T is called the total sum of squares for obvious reasons, and, as mentioned above, it splits into SS_H and SS_e. Actually, the analysis of variance itself derives its name from this splitting of SS_T.

Now, as follows from the discussion in Section 5 of Chapter 16, the quantities SS_H and SS_e are independently distributed, under H, as σ²χ²_{I−1} and σ²χ²_{I(J−1)}, respectively, so that SS_T is σ²χ²_{IJ−1} distributed, under H. We may summarize all relevant information in a table (Table 1), which is known as an Analysis of Variance Table.
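As an illustration of how the quantities in Table 1 are computed, here is a minimal Python sketch; NumPy is assumed available, and the function name and the convention that the data come as an I × J array are ours, for illustration only.

```python
import numpy as np

def one_way_anova(Y):
    """One-way layout: Y is an I x J array (I groups, J observations per group).
    Returns the quantities of Table 1 and the F statistic MS_H / MS_e."""
    I, J = Y.shape
    grand_mean = Y.mean()             # Y..
    group_means = Y.mean(axis=1)      # Y_i.
    SS_H = J * np.sum((group_means - grand_mean) ** 2)  # between groups, (7)
    SS_e = np.sum((Y - group_means[:, None]) ** 2)      # within groups, (5)
    MS_H = SS_H / (I - 1)
    MS_e = SS_e / (I * (J - 1))
    return SS_H, SS_e, MS_H, MS_e, MS_H / MS_e
```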

EXAMPLE 3 For a numerical example, take I = 3 and J = 5; for the data of this example one finds MS_H = 315.5392 and MS_e = 7.4, so that F = 42.6404. Thus, for α = 0.05, F_{2,12;0.05} = 3.8853 and the hypothesis H: μ_1 = μ_2 = μ_3 is rejected. Of course, σ̃² = MS_e = 7.4.
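The critical value quoted above may be verified numerically; a one-line check, assuming SciPy is available:

```python
from scipy.stats import f

# Upper 5% point of the F distribution with 2 and 12 degrees of freedom.
print(round(f.ppf(0.95, 2, 12), 4))   # 3.8853
```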


17.2 Two-way Layout (Classification) with One Observation Per Cell

The model to be employed in this section will be introduced by an appropriate modification of Examples 1 and 2.

EXAMPLE 4 Referring to Example 1, consider the I machines mentioned there and also J workers from a pool of available workers. Each one of the J workers is assigned to each one of the I machines, which he runs for one day. Let μ_ij be the daily output of the jth worker when running the ith machine and let e_ij be his "error." His actual daily output is then an r.v. Y_ij such that Y_ij = μ_ij + e_ij. At this point it is assumed that each μ_ij is equal to a certain quantity μ, the grand mean, plus a contribution α_i due to the ith row (ith machine), called the ith row effect, plus a contribution β_j due to the jth worker, called the jth column effect. It is further assumed that the I row effects and also the J column effects cancel out each other in the sense that

Σ_{i=1}^I α_i = 0 and Σ_{j=1}^J β_j = 0.

Thus the model is

Y_ij = μ + α_i + β_j + e_ij, i = 1, ..., I, j = 1, ..., J, (10)

where the e_ij's are independent N(0, σ²) r.v.'s.

EXAMPLE 5 Consider the identical I · J plots described in Example 2, and suppose that J different varieties of a certain agricultural commodity are planted in each one of the I rows, one variety in each plot. Then all J plots in the ith row are treated by the ith of I different kinds of fertilizers. Then the yield of the jth variety of the commodity in question treated by the ith fertilizer is an r.v. Y_ij which is assumed again to have the structure described in (10). Here the ith row effect is the contribution of the ith fertilizer and the jth column effect is the contribution of the jth variety of the commodity in question.

From the preceding two examples it follows that the outcome Y_ij is affected by two factors, machines and workers in Example 4 and fertilizers and varieties of agricultural commodity in Example 5. The I objects (machines or fertilizers) and the J objects (workers or varieties of an agricultural commodity) associated with these factors are also referred to as levels of the factors. The same interpretation and terminology is used in similar situations throughout this chapter.

In connection with model (10), there are the following three problems to be solved: estimation of μ; α_i, i = 1, ..., I; β_j, j = 1, ..., J; testing the hypotheses H_A: α_1 = ··· = α_I = 0 (that is, there is no row effect) and H_B: β_1 = ··· = β_J = 0 (that is, there is no column effect); and estimation of σ².

By setting

Y = (Y_11, ..., Y_1J, ..., Y_I1, ..., Y_IJ)′, β = (μ; α_1, ..., α_I; β_1, ..., β_J)′,

and choosing e and X′ accordingly, we then have

Y = X′β + e with n = IJ and p = I + J + 1.

It can be shown (see also Exercise 17.2.1) that X′ is not of full rank, but rank X′ = r = I + J − 1. However, because of the two independent restrictions

Σ_{i=1}^I α_i = 0 and Σ_{j=1}^J β_j = 0,

the LSE's are uniquely determined: (∂/∂μ)S(Y, β) = 0 implies μ̂ = Y.., where Y.. is again given by (4); (∂/∂α_i)S(Y, β) = 0 implies α̂_i = Y_i. − Y.., where Y_i. is given by (2); and (∂/∂β_j)S(Y, β) = 0 implies β̂_j = Y.j − Y.., where

Y.j = (1/I) Σ_{i=1}^I Y_ij, j = 1, ..., J. (11)

Thus

μ̂ = Y.., α̂_i = Y_i. − Y.., β̂_j = Y.j − Y.., i = 1, ..., I, j = 1, ..., J. (12)

Also, EY = η = X′(μ; α_1, ..., α_I; β_1, ..., β_J)′ ∈ V_r, where r = I + J − 1. Consider the hypothesis

H_A: α_1 = ··· = α_I = 0.

Then, under H_A, η ∈ V_{r−q_A}, where r − q_A = J, so that q_A = I − 1. Next, under H_A again, S(Y, β) becomes

Σ_{i=1}^I Σ_{j=1}^J (Y_ij − μ − β_j)²,

from where, by differentiation, we determine the LSE's of μ and β_j, to be denoted by μ̂_A and β̂_{j,A}, respectively. That is, one has

μ̂_A = Y.. = μ̂, β̂_{j,A} = Y.j − Y.. = β̂_j, j = 1, ..., J. (13)

Therefore relations (28) and (29) in Chapter 16 give, by means of (11) and (12),

S(Y, β̂) = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y_i. − Y.j + Y..)² = SS_e (14)

and

S_A(Y, β̂_A) − S(Y, β̂) = J Σ_{i=1}^I (Y_i. − Y..)² = SS_A, (15)

because, as is easily seen, the cross terms in the relevant expansion are all equal to zero. Therefore the F statistic for testing H_A is

F_A = [(I − 1)(J − 1)/(I − 1)] · SS_A/SS_e = MS_A/MS_e, (16)

where MS_A = SS_A/(I − 1), MS_e = SS_e/[(I − 1)(J − 1)], and SS_A, SS_e are given by (15) and (14), respectively. (However, for an expression of SS_e to be used in actual calculations, see (20) below.)

Next, for testing the hypothesis

H_B: β_1 = ··· = β_J = 0,

we have, by symmetry, q_B = J − 1,

S_B(Y, β̂_B) − S(Y, β̂) = I Σ_{j=1}^J (Y.j − Y..)² = SS_B, (17)

and the F statistic

F_B = [(I − 1)(J − 1)/(J − 1)] · SS_B/SS_e = MS_B/MS_e, where MS_B = SS_B/(J − 1). (18)

Finally, setting

SS_T = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y..)², (19)

one verifies by direct expansion that SS_T = SS_e + SS_A + SS_B, so that for actual calculations one may use

SS_e = SS_T − SS_A − SS_B. (20)

The main results of this section are summarized in Table 2.

Table 2 Analysis of Variance for Two-way Layout with One Observation Per Cell

source of variance   sums of squares                                          degrees of freedom   mean squares
rows                 SS_A = J Σ_{i=1}^I α̂_i² = J Σ_{i=1}^I (Y_i. − Y..)²       I − 1                MS_A = SS_A/(I − 1)
columns              SS_B = I Σ_{j=1}^J β̂_j² = I Σ_{j=1}^J (Y.j − Y..)²        J − 1                MS_B = SS_B/(J − 1)
residual             SS_e = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y_i. − Y.j + Y..)²     (I − 1)(J − 1)       MS_e = SS_e/[(I − 1)(J − 1)]
total                SS_T = Σ_{i=1}^I Σ_{j=1}^J (Y_ij − Y..)²                  IJ − 1
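The computations of Table 2 in code form, as a minimal sketch under the same illustrative conventions as before (NumPy assumed; all names are ours):

```python
import numpy as np

def two_way_anova(Y):
    """Two-way layout, one observation per cell; Y has shape (I, J).
    Returns F_A and F_B of (16) and (18)."""
    I, J = Y.shape
    gm = Y.mean()            # Y..
    row = Y.mean(axis=1)     # Y_i.
    col = Y.mean(axis=0)     # Y.j
    SS_A = J * np.sum((row - gm) ** 2)
    SS_B = I * np.sum((col - gm) ** 2)
    SS_e = np.sum((Y - row[:, None] - col[None, :] + gm) ** 2)
    MS_e = SS_e / ((I - 1) * (J - 1))
    return (SS_A / (I - 1)) / MS_e, (SS_B / (J - 1)) / MS_e
```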

17.3 Two-way Layout (Classification) with K (≥ 2) Observations Per Cell

In order to introduce the model of this section, consider Examples 4 and 5 and suppose that K (≥ 2) observations are taken in each one of the IJ cells. This amounts to saying that we observe the yields Y_ijk, k = 1, ..., K of K identical plots in the (i, j)th cell, that is, the plot where the jth agricultural commodity was planted and treated by the ith fertilizer (in connection with Example 5); or we allow the jth worker to run the ith machine for K days instead of one day (Example 4). In the present case, the relevant model will have the form Y_ijk = μ_ij + e_ijk. However, the means μ_ij, i = 1, ..., I; j = 1, ..., J need not be additive any longer. In other words, except for the grand mean μ and the row and column effects α_i and β_j, respectively, which in the previous section added up to make μ_ij, we may now allow interactions γ_ij among the various factors involved, such as fertilizers and varieties of agricultural commodities, or workers and machines. It is not unreasonable to assume that, on the average, these interactions cancel out each other, and we shall do so. Thus our present model is as follows:

Y_ijk = μ + α_i + β_j + γ_ij + e_ijk, (22)

where

Σ_{i=1}^I α_i = 0, Σ_{j=1}^J β_j = 0, Σ_{i=1}^I γ_ij = 0 (j = 1, ..., J), Σ_{j=1}^J γ_ij = 0 (i = 1, ..., I),

and the e_ijk's are independent N(0, σ²) r.v.'s, i = 1, ..., I, j = 1, ..., J, k = 1, ..., K.

Once again the problems of main interest are estimation of μ, α_i, β_j and γ_ij, i = 1, ..., I; j = 1, ..., J; testing the hypotheses H_A: α_1 = ··· = α_I = 0, H_B: β_1 = ··· = β_J = 0 and H_AB: γ_ij = 0, i = 1, ..., I; j = 1, ..., J (that is, there are no interactions present); and estimation of σ².

By setting

Y = (Y_111, ..., Y_11K, ..., Y_IJ1, ..., Y_IJK)′, β = (μ_11, ..., μ_1J, ..., μ_I1, ..., μ_IJ)′,

and choosing e accordingly, it is readily seen that

Y = X′β + e with n = IJK and p = IJ, (22′)

so that model (22′) is a special case of model (6) in Chapter 16. From the form of X′ it is also clear that rank X′ = r = p = IJ; that is, X′ is of full rank (see also Exercise 17.3.1). Therefore the unique LSE's of the parameters involved are obtained by differentiating with respect to μ_ij the expression

S(Y, β) = Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K (Y_ijk − μ_ij)².

We have then

μ̂_ij = Y_ij., where Y_ij. = (1/K) Σ_{k=1}^K Y_ijk, i = 1, ..., I, j = 1, ..., J. (23)

Next, from the fact that μ_ij = μ + α_i + β_j + γ_ij and on the basis of the assumptions made in (22), we have

μ = μ.., α_i = μ_i. − μ.., β_j = μ.j − μ.., γ_ij = μ_ij − μ_i. − μ.j + μ.., (24)

by employing the "dot" notation already used in the previous two sections. From (24) we have that μ, α_i, β_j and γ_ij are linear combinations of the parameters μ_ij. Therefore, by the corollary to Theorem 3 in Chapter 16, they are estimable, and their LSE's μ̂, α̂_i, β̂_j, γ̂_ij are given by the above-mentioned linear combinations upon replacing the μ_ij's by their LSE's. It is then readily seen that

μ̂ = Y..., α̂_i = Y_i.. − Y..., β̂_j = Y.j. − Y..., γ̂_ij = Y_ij. − Y_i.. − Y.j. + Y..., (25)

where Y_i.. = (1/JK) Σ_{j=1}^J Σ_{k=1}^K Y_ijk, Y.j. = (1/IK) Σ_{i=1}^I Σ_{k=1}^K Y_ijk and Y... = (1/IJK) Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K Y_ijk. Furthermore,

S(Y, β) = Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K (Y_ijk − Y_ij.)² + IJK(Y... − μ)²
        + JK Σ_{i=1}^I (Y_i.. − Y... − α_i)² + IK Σ_{j=1}^J (Y.j. − Y... − β_j)²
        + K Σ_{i=1}^I Σ_{j=1}^J (Y_ij. − Y_i.. − Y.j. + Y... − γ_ij)², (26)

because, as is easily seen, all other terms are equal to zero. (See also Exercise 17.3.2.)

From identity (26) it follows that, under the hypothesis H_A: α_1 = ··· = α_I = 0, the LSE's of the remaining parameters remain the same as those given in (25). It follows then that

S(Y, β̂) = Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K (Y_ijk − Y_ij.)² = SS_e (27)

and

S_A(Y, β̂_A) − S(Y, β̂) = JK Σ_{i=1}^I (Y_i.. − Y...)² = SS_A. (28)

Therefore the F statistic in the present case is

F_A = [IJ(K − 1)/(I − 1)] · SS_A/SS_e = MS_A/MS_e, where MS_A = SS_A/(I − 1), MS_e = SS_e/[IJ(K − 1)],

and SS_A, SS_e are given by (28) and (27), respectively.

For testing the hypothesis

H_B: β_1 = ··· = β_J = 0,

one has, in an entirely analogous way,

S_B(Y, β̂_B) − S(Y, β̂) = IK Σ_{j=1}^J (Y.j. − Y...)² = SS_B (31)

and

F_B = [IJ(K − 1)/(J − 1)] · SS_B/SS_e = MS_B/MS_e, where MS_B = SS_B/(J − 1).

Likewise, for testing the hypothesis

H_AB: γ_ij = 0, i = 1, ..., I, j = 1, ..., J,

one has

SS_AB = K Σ_{i=1}^I Σ_{j=1}^J (Y_ij. − Y_i.. − Y.j. + Y...)² (33)

and

F_AB = [IJ(K − 1)/((I − 1)(J − 1))] · SS_AB/SS_e = MS_AB/MS_e, where MS_AB = SS_AB/[(I − 1)(J − 1)].

Finally, the total sum of squares is

SS_T = Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K (Y_ijk − Y...)². (34)

Once again the main results of this section are summarized in a table, Table 3. The number of degrees of freedom of SS_T is obtained by adding up those of SS_A, SS_B, SS_AB and SS_e, which can be shown to be independently distributed as σ²χ² r.v.'s with the indicated degrees of freedom.

EXAMPLE 6 For a numerical application, consider two drugs (I = 2) administered in three dosages (J = 3) to three groups, each of which consists of four (K = 4) subjects. Certain measurements taken on the subjects lead to

F_A = 0.8471, F_B = 12.1038, F_AB = 0.1641.

Thus for α = 0.05, we have F_{1,18;0.05} = 4.4139 and F_{2,18;0.05} = 3.5546; we accept H_A, reject H_B and accept H_AB. Finally, we have σ̃² = 183.0230.

The models analyzed in the previous three sections describe three experimental designs often used in practice. There are many others as well. Some of them are obtained from the ones just described by allowing different numbers of observations per cell, by increasing the number of factors, by allowing the row effects, column effects and interactions to be r.v.'s themselves, by randomizing the levels of some of the factors, etc. However, even a brief study of these designs would be well beyond the scope of this book.

Table 3 Analysis of Variance for Two-way Layout with K (≥ 2) Observations Per Cell

source of variance   sums of squares                                                         degrees of freedom   mean squares
A main effects       SS_A = JK Σ_{i=1}^I α̂_i² = JK Σ_{i=1}^I (Y_i.. − Y...)²                  I − 1                MS_A = SS_A/(I − 1)
B main effects       SS_B = IK Σ_{j=1}^J β̂_j² = IK Σ_{j=1}^J (Y.j. − Y...)²                   J − 1                MS_B = SS_B/(J − 1)
AB interactions      SS_AB = K Σ_{i=1}^I Σ_{j=1}^J γ̂_ij² = K ΣΣ (Y_ij. − Y_i.. − Y.j. + Y...)²   (I − 1)(J − 1)    MS_AB = SS_AB/[(I − 1)(J − 1)]
error                SS_e = Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K (Y_ijk − Y_ij.)²                    IJ(K − 1)            MS_e = SS_e/[IJ(K − 1)]
total                SS_T = Σ_{i=1}^I Σ_{j=1}^J Σ_{k=1}^K (Y_ijk − Y...)²                     IJK − 1
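For completeness, a sketch of the computations of Table 3, assuming the observations are arranged in an I × J × K array (the layout convention and all names are illustrative):

```python
import numpy as np

def two_way_anova_k(Y):
    """Two-way layout with K >= 2 observations per cell; Y has shape (I, J, K).
    Returns F_A, F_B and F_AB."""
    I, J, K = Y.shape
    gm = Y.mean()                 # Y...
    cell = Y.mean(axis=2)         # Y_ij.
    row = Y.mean(axis=(1, 2))     # Y_i..
    col = Y.mean(axis=(0, 2))     # Y.j.
    SS_A = J * K * np.sum((row - gm) ** 2)
    SS_B = I * K * np.sum((col - gm) ** 2)
    SS_AB = K * np.sum((cell - row[:, None] - col[None, :] + gm) ** 2)
    SS_e = np.sum((Y - cell[:, :, None]) ** 2)
    MS_e = SS_e / (I * J * (K - 1))
    F_A = (SS_A / (I - 1)) / MS_e
    F_B = (SS_B / (J - 1)) / MS_e
    F_AB = (SS_AB / ((I - 1) * (J - 1))) / MS_e
    return F_A, F_B, F_AB
```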

Exercises

17.3.3 Show that SS_T = SS_e + SS_A + SS_B + SS_AB, where SS_e, SS_A, SS_B, SS_AB and SS_T are given by (27), (28), (31), (33) and (34), respectively.

17.3.4 Apply the two-way layout with two observations per cell analysis of variance to the data given in the table below (take α = 0.05).

17.4 A Multicomparison Method

Consider again the one-way layout with J (≥ 2) observations per cell described in Section 17.1 and suppose that in testing the hypothesis H: μ_1 = ··· = μ_I (= μ, unspecified) we decided to reject it on the basis of the available data. In rejecting H, we simply conclude that the μ's are not all equal. No conclusions are reached as to which specific μ's may be unequal. The multicomparison method described in this section sheds some light on this problem.

For the sake of simplicity, let us suppose that I = 6. After rejecting H, the natural quantities to look into are of the following sort:

μ_1 − μ_2, (1/3)(μ_1 + μ_2 + μ_3) − (1/3)(μ_4 + μ_5 + μ_6),

and so on; in each case the coefficients of the six μ's sum to 0. This observation gives rise to the following definition.

DEFINITION 1 Any linear combination ψ = Σ_{i=1}^I c_i μ_i of the μ's, where c_i, i = 1, ..., I are known constants such that Σ_{i=1}^I c_i = 0, is said to be a contrast among the parameters μ_i, i = 1, ..., I.

Let ψ = Σ_{i=1}^I c_i μ_i be a contrast among the μ's and let

ψ̂ = Σ_{i=1}^I c_i Y_i., σ̂² = SS_e/(n − I) = MS_e, σ̂²(ψ̂) = (σ̂²/J) Σ_{i=1}^I c_i², S² = (I − 1)F_{I−1,n−I;α},

where n = IJ. We will show in the sequel that the interval [ψ̂ − Sσ̂(ψ̂), ψ̂ + Sσ̂(ψ̂)] is a confidence interval with confidence coefficient 1 − α for all contrasts ψ. Next, consider the following definition.

DEFINITION 2 Let ψ and ψ̂ be as above. We say that ψ̂ is significantly different from zero, according to the S (for Scheffé) criterion, if the interval [ψ̂ − Sσ̂(ψ̂), ψ̂ + Sσ̂(ψ̂)] does not contain zero; equivalently, if |ψ̂| > Sσ̂(ψ̂).

Now it can be shown that the F test rejects the hypothesis H if and only if there is at least one contrast ψ such that ψ̂ is significantly different from zero. Thus, following the rejection of H, one would construct a confidence interval for each contrast ψ and then would proceed to find out which contrasts are responsible for the rejection of H, starting with the simplest contrasts first.

The confidence intervals in question are provided by the following theorem.

THEOREM 1 Refer to the one-way layout described in Section 17.1 and let

ψ = Σ_{i=1}^I c_i μ_i (Σ_{i=1}^I c_i = 0), ψ̂ = Σ_{i=1}^I c_i Y_i., σ̂²(ψ̂) = (MS_e/J) Σ_{i=1}^I c_i²,

where MS_e is given in Table 1. Then the interval [ψ̂ − Sσ̂(ψ̂), ψ̂ + Sσ̂(ψ̂)] is a confidence interval simultaneously for all contrasts ψ with confidence coefficient 1 − α, where S² = (I − 1)F_{I−1,n−I;α} and n = IJ.

PROOF Consider the problem of maximizing (minimizing), with respect to c_1, ..., c_I, the quantity

f(c_1, ..., c_I) = [Σ_{i=1}^I c_i(Y_i. − μ_i)] / [(1/J) Σ_{i=1}^I c_i²]^{1/2},

subject to the restraint

Σ_{i=1}^I c_i = 0.

Now, clearly, f(c_1, ..., c_I) = f(γc_1, ..., γc_I) for any γ > 0. Therefore the maximum (minimum) of f(c_1, ..., c_I), subject to the restraint above, is the same as the maximum (minimum) of f(γc_1, ..., γc_I) = f(c′_1, ..., c′_I), c′_i = γc_i, i = 1, ..., I, subject to the restraints

Σ_{i=1}^I c′_i = 0 and (1/J) Σ_{i=1}^I c′_i² = 1.

Carrying out the maximization (minimization) by means of two Lagrange multipliers λ_1, λ_2, one finds that the extrema are attained when c′_k is proportional to Y_k. − μ_k − (Ȳ − μ̄), k = 1, ..., I, where Ȳ = (1/I) Σ_{i=1}^I Y_i. and μ̄ = (1/I) Σ_{i=1}^I μ_i, and that they are equal to

±[J Σ_{i=1}^I (Y_i. − μ_i − Ȳ + μ̄)²]^{1/2}. (39)

Next, the r.v.'s Y_i. − μ_i, i = 1, ..., I are independent N(0, σ²/J), so that J Σ_{i=1}^I (Y_i. − μ_i − Ȳ + μ̄)²/σ² is distributed as χ²_{I−1}, and it is independent of SS_e/σ², which is distributed as χ²_{n−I}. Therefore

J Σ_{i=1}^I (Y_i. − μ_i − Ȳ + μ̄)² / [(I − 1)MS_e] is distributed as F_{I−1,n−I},

and hence

P{J Σ_{i=1}^I (Y_i. − μ_i − Ȳ + μ̄)² ≤ S²MS_e} = 1 − α, where S² = (I − 1)F_{I−1,n−I;α}. (40)

From (40) and (39) it follows then that

|ψ̂ − ψ| ≤ Sσ̂(ψ̂), with probability 1 − α, simultaneously for all c_i, i = 1, ..., I such that Σ_{i=1}^I c_i = 0; or equivalently,

P[ψ̂ − Sσ̂(ψ̂) ≤ ψ ≤ ψ̂ + Sσ̂(ψ̂)] = 1 − α

for all contrasts ψ, as was to be seen. (This proof has been adapted from the paper "A simple proof of Scheffé's multiple comparison theorem for contrasts in the one-way layout" by Jerome Klotz, The American Statistician, 1969, Vol. 23, Number 5.) ▲

In closing, we would like to point out that a theorem similar to the one just proved can be shown for the two-way layout with K (≥ 2) observations per cell, and as a consequence of it we can construct confidence intervals for all contrasts among the α's, or the β's, or the γ's.
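A sketch of how the Scheffé intervals of Theorem 1 might be computed (NumPy and SciPy assumed; the function name and data conventions are ours):

```python
import numpy as np
from scipy.stats import f

def scheffe_interval(Y, c, alpha=0.05):
    """Scheffé simultaneous confidence interval for the contrast sum(c_i * mu_i)
    in a one-way layout; Y has shape (I, J) and the c_i must sum to zero."""
    I, J = Y.shape
    n = I * J
    group_means = Y.mean(axis=1)                          # Y_i.
    MS_e = np.sum((Y - group_means[:, None]) ** 2) / (n - I)
    psi_hat = np.dot(c, group_means)                      # estimate of the contrast
    sd_hat = np.sqrt(MS_e / J * np.sum(np.asarray(c) ** 2))
    S = np.sqrt((I - 1) * f.ppf(1 - alpha, I - 1, n - I))
    return psi_hat - S * sd_hat, psi_hat + S * sd_hat
```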

Exercises

17.4.1 Show that the quantity J Σ_{i=1}^I (Y_i. − Y..)²/σ² (see also Section 17.1) is distributed as χ²_{I−1} under the null hypothesis.

17.4.2 Refer to Exercise 17.1.1 and construct confidence intervals for all contrasts of the μ's (take 1 − α = 0.95).

Chapter 18 The Multivariate Normal Distribution

18.1 Introduction

In this chapter, we introduce the Multivariate Normal distribution and establish some of its fundamental properties. Also, certain estimation and independence testing problems closely connected with it are discussed.

Let Y_j, j = 1, ..., m be i.i.d. r.v.'s with common distribution N(0, 1). Then we know that for any constants c_j, j = 1, ..., m and μ, the r.v. Σ_{j=1}^m c_j Y_j + μ is distributed as N(μ, Σ_{j=1}^m c_j²). Now instead of considering one (non-homogeneous) linear combination of the Y's, consider k such combinations:

X_i = Σ_{j=1}^m c_ij Y_j + μ_i, i = 1, ..., k; (1)

in matrix notation,

X = CY + μ, (2)

where X = (X_1, ..., X_k)′, Y = (Y_1, ..., Y_m)′, μ = (μ_1, ..., μ_k)′ and C = (c_ij) is a k × m matrix of constants.

DEFINITION 1 Let Y_j, j = 1, ..., m be i.i.d. r.v.'s distributed as N(0, 1) and let the r.v.'s X_i, i = 1, ..., k, or the r. vector X, be defined by (1) or (2), respectively. Then the joint distribution of the r.v.'s X_i, i = 1, ..., k, or the distribution of the r. vector X, is called Multivariate (or, more specifically, k-Variate) Normal.

REMARK 1 From Definition 1, it follows that if X_i, i = 1, ..., k are jointly normally distributed, then any subset of them is also a set of jointly normally distributed r.v.'s.

From (2) and relation (10) of Chapter 16, it follows that EX = μ and that the covariance matrix of X is Σ_X = CC′; we set Σ = CC′. Next, for t = (t_1, ..., t_k)′ ∈ ℝ^k, one has, by the independence of the Y_j's and the fact that they are distributed as N(0, 1),

φ_X(t) = E exp[it′(CY + μ)] = exp(it′μ) E exp[i(C′t)′Y] = exp(it′μ) exp[−(1/2)(C′t)′(C′t)] = exp[it′μ − (1/2)t′Σt].

That is, we have the following result.

THEOREM 1 The ch.f. of the r. vector X = (X_1, ..., X_k)′, which has the k-Variate Normal distribution with mean μ and covariance matrix Σ, is given by

φ_X(t) = exp[it′μ − (1/2)t′Σt], t ∈ ℝ^k. (6)

From (6) it follows that φ_X, and therefore the distribution of X, is completely determined by means of its mean μ and covariance matrix Σ, a fact analogous to that of a Univariate Normal distribution. This fact justifies the following notation:

X ~ N(μ, Σ), where μ and Σ are the parameters of the distribution.
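Definition 1 also suggests a direct way of simulating from N(μ, Σ): generate i.i.d. N(0, 1) components and apply (2) with any C such that CC′ = Σ, for instance the Cholesky factor. A small sketch (the particular μ and Σ below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
C = np.linalg.cholesky(Sigma)          # C C' = Sigma

Y = rng.standard_normal((3, 100_000))  # i.i.d. N(0, 1) components
X = C @ Y + mu[:, None]                # X = CY + mu, so X ~ N(mu, Sigma)

print(X.mean(axis=1))                  # close to mu
print(np.cov(X))                       # close to Sigma
```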

Now we shall establish the following interesting result.

THEOREM 2 Let Y_j, j = 1, ..., k be i.i.d. r.v.'s with distribution N(0, 1) and set X = CY + μ, where C is a k × k non-singular matrix. Then the p.d.f. f_X of X exists and is given by

f_X(x) = (2π)^{−k/2} |Σ|^{−1/2} exp[−(1/2)(x − μ)′Σ^{−1}(x − μ)], x ∈ ℝ^k, (7)

where Σ = CC′ and |Σ| denotes the determinant of Σ.

PROOF From X = CY + μ we get CY = X − μ, which, since C is non-singular, gives Y = C^{−1}(X − μ). The transformation theorem for r. vectors then gives

f_X(x) = f_Y[C^{−1}(x − μ)] · |det C^{−1}| = (2π)^{−k/2} exp[−(1/2)(x − μ)′(C^{−1})′C^{−1}(x − μ)] · |Σ|^{−1/2},

and (7) follows, since (C^{−1})′C^{−1} = (CC′)^{−1} = Σ^{−1} and |det C^{−1}| = |Σ|^{−1/2}. ▲

REMARK 2 A k-Variate Normal distribution with p.d.f. given by (7) is called a non-singular k-Variate Normal. The use of the term non-singular corresponds to the fact that |Σ| ≠ 0; that is, the fact that Σ is of full rank.

COROLLARY 1 In the theorem, let k = 2. Then X = (X_1, X_2)′ and the joint p.d.f. of X_1, X_2 is the Bivariate Normal p.d.f.

PROOF By Remark 1, both X_1 and X_2 are normally distributed; let X_1 ~ N(μ_1, σ_1²), X_2 ~ N(μ_2, σ_2²) and let ρ = ρ(X_1, X_2). Then

Σ = (σ_1², ρσ_1σ_2; ρσ_1σ_2, σ_2²), |Σ| = σ_1²σ_2²(1 − ρ²),

and

Σ^{−1} = [1/(1 − ρ²)] (1/σ_1², −ρ/(σ_1σ_2); −ρ/(σ_1σ_2), 1/σ_2²).

Carrying out the matrix multiplication in (7), one obtains

f(x_1, x_2) = [2πσ_1σ_2(1 − ρ²)^{1/2}]^{−1} exp(−q/2), (8)

where

q = [1/(1 − ρ²)] {[(x_1 − μ_1)/σ_1]² − 2ρ[(x_1 − μ_1)/σ_1][(x_2 − μ_2)/σ_2] + [(x_2 − μ_2)/σ_2]²},

which is the Bivariate Normal p.d.f. ▲

COROLLARY 2 Let X = (X_1, ..., X_k)′ be non-singular N(μ, Σ) and suppose that the X_j's are uncorrelated, so that Σ is diagonal with diagonal elements σ_1², ..., σ_k². Then X_1, ..., X_k are independent.

PROOF In this case |Σ| = ∏_{j=1}^k σ_j². On the other hand, |Σ|Σ^{−1} is also a diagonal matrix with the jth diagonal element given by ∏_{i≠j} σ_i², so that Σ^{−1} itself is a diagonal matrix with the jth diagonal element given by 1/σ_j². It follows that

f_X(x) = ∏_{j=1}^k (2πσ_j²)^{−1/2} exp[−(x_j − μ_j)²/(2σ_j²)],

and this establishes the independence of the X's. ▲

REMARK 3 The really important part of the corollary is that noncorrelation plus normality implies independence, since independence implies noncorrelation in any case. It is also to be noted that noncorrelation without normality need not imply independence, as has been seen elsewhere.

Exercises

18.1.1 Use Definition 1 herein in order to conclude that the LSE β̂ of β in (9) of Chapter 16 has the n-Variate Normal distribution with mean β and covariance matrix σ²S^{−1}. In particular, (β̂_1, β̂_2)′, given by (19″) and (19′) of Chapter 16, has the Bivariate Normal distribution with means β_1, β_2, variances given by the diagonal elements of σ²S^{−1}, and correlation coefficient equal to

−Σ_{j=1}^n x_j / (n Σ_{j=1}^n x_j²)^{1/2}.

18.1.2 Verify relation (8).

18.1.3 Let the random vector X = (X_1, ..., X_k)′ be distributed as N(μ, Σ) and suppose that Σ is non-singular. Then show that the conditional joint distribution of X_{i_1}, ..., X_{i_m}, given X_{j_1}, ..., X_{j_n} (1 ≤ m < k, m + n = k, all i_1, ..., i_m different from all j_1, ..., j_n), is Multivariate Normal and specify its parameters.

18.2 Some Properties of Multivariate Normal Distributions

In this section we establish some of the basic properties of a Multivariate Normal distribution.

THEOREM 3 Let X = (X_1, ..., X_k)′ be N(μ, Σ) (not necessarily non-singular). Then for any m × k constant matrix A = (α_ij), the r. vector Y defined by Y = AX has the m-Variate Normal distribution with mean Aμ and covariance matrix AΣA′. In particular, if m = 1, the r.v. Y is a linear combination of the X's, Y = α′X, say, and Y has the Univariate Normal distribution with mean α′μ and variance α′Σα.

PROOF For t ∈ ℝ^m, φ_Y(t) = E exp(it′Y) = E exp[i(A′t)′X] = φ_X(A′t), so that by means of (6) we have

φ_Y(t) = exp[it′Aμ − (1/2)t′(AΣA′)t],

and this last expression is the ch.f. of the m-Variate Normal with mean Aμ and covariance matrix AΣA′, as was to be seen. The particular case follows from the general one just established. ▲

THEOREM 4 For j = 1, ..., n, let X_j be independent N(μ_j, Σ_j) k-dimensional r. vectors and let c_j be constants. Then the r. vector X = Σ_{j=1}^n c_j X_j is distributed as N(Σ_{j=1}^n c_j μ_j, Σ_{j=1}^n c_j² Σ_j) (a result parallel to a known one for r.v.'s).

PROOF For t ∈ ℝ^k, by independence, φ_X(t) = ∏_{j=1}^n φ_{X_j}(c_j t). But φ_{X_j}(c_j t) = exp[ic_j t′μ_j − (1/2)c_j² t′Σ_j t], so that

φ_X(t) = exp[it′(Σ_{j=1}^n c_j μ_j) − (1/2)t′(Σ_{j=1}^n c_j² Σ_j)t],

which is the ch.f. of the asserted k-Variate Normal distribution. ▲

COROLLARY Let X_1, ..., X_n be independent N(μ, Σ) r. vectors. Then the sample mean X̄ = (1/n) Σ_{j=1}^n X_j is distributed as N(μ, (1/n)Σ).

PROOF In the theorem, take μ_j = μ, Σ_j = Σ and c_j = 1/n, j = 1, ..., n. ▲

THEOREM 5 Let X = (X_1, ..., X_k)′ be non-singular N(μ, Σ) and set Q = (X − μ)′Σ^{−1}(X − μ). Then Q is distributed as χ²_k.

PROOF We have

φ_Q(t) = E exp(itQ) = ∫_{ℝ^k} (2π)^{−k/2}|Σ|^{−1/2} exp[−(1/2)(1 − 2it)(x − μ)′Σ^{−1}(x − μ)] dx = (1 − 2it)^{−k/2},

because, apart from the factor (1 − 2it)^{−k/2}, the integrand is the p.d.f. of a k-Variate Normal with mean μ and covariance matrix Σ/(1 − 2it). Hence Q is distributed as χ²_k. ▲
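Theorem 5 is easy to check by simulation; in the following sketch (parameter values are illustrative) the empirical quantiles of Q are compared with those of χ²_k:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
k, n = 3, 100_000
mu = np.zeros(k)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
X = rng.multivariate_normal(mu, Sigma, size=n)
Sinv = np.linalg.inv(Sigma)
Q = np.einsum('ni,ij,nj->n', X - mu, Sinv, X - mu)  # (X-mu)' Sigma^{-1} (X-mu)

# Empirical quantiles of Q vs. chi-square quantiles with k degrees of freedom.
for p in (0.5, 0.9, 0.99):
    print(np.quantile(Q, p), chi2.ppf(p, k))
```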

Exercises

18.2.1 Consider the k-dimensional random vectors X_n = (X_{1n}, ..., X_{kn})′, n = 1, 2, ..., and X = (X_1, ..., X_k)′ with d.f.'s F_n, F and ch.f.'s φ_n, φ, respectively. Then we say that {X_n} converges in distribution to X as n → ∞, and we write X_n →_d X, if F_n(x) → F(x) as n → ∞ for every x ∈ ℝ^k for which F is continuous (see also Definition 1(iii) in Chapter 8). It can be shown that a multidimensional version of Theorem 2 in Chapter 8 holds true. Use this result (and also Theorem 3′ in Chapter 6) in order to prove that X_n →_d X, where X is distributed as N(μ, Σ), if and only if {λ′X_n} converges in distribution, as n → ∞, to an r.v. Y which is distributed as Normal with mean λ′μ and variance λ′Σλ, for every λ ∈ ℝ^k.

18.3 Estimation of μ and Σ and a Test of Independence

First we formulate a theorem without proof, providing estimators for μ and Σ, and then we proceed with a certain testing hypothesis problem.

THEOREM 6 For j = 1, ..., n, let X_j = (X_{j1}, ..., X_{jk})′ be independent, non-singular N(μ, Σ) r. vectors and set

X̄ = (1/n) Σ_{j=1}^n X_j, S = Σ_{j=1}^n (X_j − X̄)(X_j − X̄)′.

Then

i) X̄ and S are sufficient for (μ, Σ);
ii) X̄ and S/(n − 1) are unbiased estimators of μ and Σ, respectively;
iii) X̄ and S/n are MLE's of μ and Σ, respectively.
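In computational form, the estimators of Theorem 6 read as follows (a minimal sketch; the convention that the sample is an n × k array is ours):

```python
import numpy as np

def mvn_estimators(X):
    """X has shape (n, k): n independent k-dimensional observations.
    Returns the sample mean and the unbiased estimator S/(n-1) of Sigma."""
    n = X.shape[0]
    xbar = X.mean(axis=0)
    D = X - xbar
    S = D.T @ D                  # S = sum_j (X_j - xbar)(X_j - xbar)'
    return xbar, S / (n - 1)     # S/n would be the MLE of Sigma
```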

Now suppose that the joint distribution of the r.v.'s X and Y is the Bivariate Normal distribution; that is, their joint p.d.f. is given by (8) with parameters μ_1, μ_2, σ_1², σ_2² and ρ. Then, by Corollary 2 to Theorem 2, the r.v.'s X and Y are independent if and only if they are uncorrelated. Thus the problem of testing independence for X and Y becomes that of testing the hypothesis H: ρ = 0. For this purpose, consider an r. sample of size n, (X_j, Y_j), j = 1, ..., n, from the Bivariate Normal under consideration. Then their joint p.d.f., f, is given by

f = [2πσ_1σ_2(1 − ρ²)^{1/2}]^{−n} exp[−(1/2) Σ_{j=1}^n q_j], (9)

where

q_j = [1/(1 − ρ²)] {[(x_j − μ_1)/σ_1]² − 2ρ[(x_j − μ_1)/σ_1][(y_j − μ_2)/σ_2] + [(y_j − μ_2)/σ_2]²}, j = 1, ..., n.

For testing H, we are going to employ the LR test. And although the MLE's of the parameters involved are readily given by Theorem 6, we choose to derive them directly. For this purpose, we set g(θ) for log f(θ), considered as a function of the parameter θ = (μ_1, μ_2, σ_1², σ_2², ρ)′ ∈ Ω, where the parameter space Ω is given by

Ω = {θ ∈ ℝ⁵; μ_1, μ_2 ∈ ℝ, σ_1² > 0, σ_2² > 0, −1 < ρ < 1}.

Then

g(θ) = −n log(2πσ_1σ_2) − (n/2) log(1 − ρ²) − (1/2) Σ_{j=1}^n q_j, (10)

where q_j, j = 1, ..., n are given by (9). Differentiating (10) with respect to μ_1 and μ_2 and equating the partial derivatives to zero, we get, after some simplifications (see also Exercise 18.3.1),

(1/σ_1²) Σ_{j=1}^n (x_j − μ_1) − [ρ/(σ_1σ_2)] Σ_{j=1}^n (y_j − μ_2) = 0,
(1/σ_2²) Σ_{j=1}^n (y_j − μ_2) − [ρ/(σ_1σ_2)] Σ_{j=1}^n (x_j − μ_1) = 0. (11)

Solving system (11) for μ_1 and μ_2, we get

μ̂_1 = x̄ = (1/n) Σ_{j=1}^n x_j, μ̂_2 = ȳ = (1/n) Σ_{j=1}^n y_j. (12)

Next, differentiating g with respect to σ_1², σ_2² and ρ and equating the partial derivatives to zero, we obtain, after some simplifications (see also Exercise 18.3.3), relations whose unique solution, in conjunction with (12), is

σ̂_1² = S_x², σ̂_2² = S_y², ρ̂ = S_xy/(S_xS_y), (16)

where S_x² = (1/n) Σ_{j=1}^n (x_j − x̄)², S_y² = (1/n) Σ_{j=1}^n (y_j − ȳ)² and S_xy = (1/n) Σ_{j=1}^n (x_j − x̄)(y_j − ȳ).

It can further be shown (see also Exercise 18.3.5) that the values of the parameters given by (12) and (16) actually maximize f (equivalently, g). It follows that the MLE's of μ_1, μ_2, σ_1², σ_2² and ρ, under Ω, are given by (12) and (16), which we may now denote by μ̂_{1,Ω}, μ̂_{2,Ω}, σ̂²_{1,Ω}, σ̂²_{2,Ω} and ρ̂_Ω. That is,

μ̂_{1,Ω} = x̄, μ̂_{2,Ω} = ȳ, σ̂²_{1,Ω} = S_x², σ̂²_{2,Ω} = S_y², ρ̂_Ω = S_xy/(S_xS_y), (17)

and the corresponding maximum of f is

max_Ω f = [2πS_xS_y(1 − ρ̂_Ω²)^{1/2}]^{−n} e^{−n}. (18)

Under ω (that is, for ρ = 0), it is seen (see also Exercise 18.3.6) that the MLE's of the parameters involved are given by

μ̂_{1,ω} = x̄, μ̂_{2,ω} = ȳ, σ̂²_{1,ω} = S_x², σ̂²_{2,ω} = S_y², (19)

and

max_ω f = (2πS_xS_y)^{−n} e^{−n}. (20)

Replacing the x's and y's by X's and Y's, respectively, in (17)-(20), we have that the LR statistic λ is given by

λ = (1 − R²)^{n/2}, (21)

where

R = S_{XY}/(S_X S_Y). (22)

From (22), it follows that R² ≤ 1. (See also Exercise 18.3.7.) Therefore, by the fact that the LR test rejects H whenever λ < λ_0, where λ_0 is determined so that P_H(λ < λ_0) = α, we get, by means of (21), that this test is equivalent to rejecting H whenever

R² > c_0; equivalently, R < −√c_0 or R > √c_0, where c_0 = 1 − λ_0^{2/n}. (23)

In (23), in order to be able to determine the cut-off point c_0, we have to know the distribution of R under H. Now although the p.d.f. of the r.v. R can be derived, this p.d.f. is none of the usual ones. However, if we consider the function

W = (n − 2)^{1/2} R / (1 − R²)^{1/2}, (24)

it is easily seen, by differentiation, that W is an increasing function of R. Therefore, the test in (23) is equivalent to the following test:

Reject H whenever W < −c or W > c, (25)

where c is determined so that P_H(W < −c or W > c) = α. It is shown in the sequel that the distribution of W under H is t_{n−2}, and hence c is readily determined.
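A sketch of the resulting test in code (the function name is ours; SciPy assumed), computing R, W and the cut-off point c = t_{n−2;α/2}:

```python
import numpy as np
from scipy.stats import t

def test_rho_zero(x, y, alpha=0.05):
    """Test H: rho = 0 for a bivariate normal sample, using
    W = sqrt(n-2) R / sqrt(1 - R^2), which is t_{n-2} under H."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    R = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
        np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2))
    W = np.sqrt(n - 2) * R / np.sqrt(1 - R ** 2)
    c = t.ppf(1 - alpha / 2, n - 2)
    return R, W, abs(W) > c       # reject H when |W| > c
```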
