Crossed Random Effects Models


Once we move beyond one-factor models, each factor needs to be designated as either crossed (with some other factor) or nested (within another factor). This section examines the former case, such that all factors are crossed. In addition, with more than one factor, some could be fixed and some could be random, giving rise to a mixed model. In this section, we will restrict ourselves to all factors being random, and only mention that the two-factor crossed/mixed model is discussed in, e.g., Searle et al. (1992, p. 122). We will briefly look at an example of a mixed model later in Section 3.3.1.3, within the context of a nested model.

Figure 3.4  Top: Actual coverage probability as a function of tuning parameter $q$ for the confidence interval of the intraclass correlation coefficient for the one-way unbalanced REM with $A=20$, $n=10$, 20 missing values, $\sigma_a^2=0.4$, and $\sigma_e^2=0.8$. Bottom: Same, but having used $n=5$.

Recall Section 2.5 on the two-way ANOVA model with fixed effects. The setup in the two-factor crossed REM is the same, but the classes are now random instead of fixed. As in the fixed effects case, the model can be additive in the two effects or include an interaction term. Continuing our high school student writing evaluation example, imagine now, in addition to the $A=20$ schools chosen randomly from a suitable population, $B$ test evaluators are chosen randomly from a large population of candidates (such as undergraduate university admissions staff), and for each of the $AB$ combinations, $n=10$ high school pupils are randomly chosen. Observe how each class of factor A is crossed with each class of factor B. Interest centers on the variance components arising from the different schools (variance factor A) and the different evaluators (variance factor B), along with the error variance from the different pupils. This two-factor model is addressed in Section 3.2.1, while Section 3.2.2 considers the crossed model with three factors.

3.2.1 Two Factors

For the two-way crossed REM, we observe the set $\{Y_{ijk}\}$, where $Y_{ijk}$ is the $k$th observation corresponding to the cross of the $i$th class from the first effect and the $j$th class of the second effect, $i=1,\ldots,A$, $j=1,\ldots,B$, $k=1,\ldots,n$, thus yielding a total of $ABn$ observations. With two factors (whether fixed, random, or mixed), it is common to speak of the $i$th row and $j$th column.

3.2.1.1 With Interaction Term

This model with interaction is such that we assume $Y_{ijk}$ can be represented as

$Y_{ijk} = \mu + a_i + b_j + c_{ij} + e_{ijk}$,   (3.37)

where the $a_i$, $b_j$, $c_{ij}$, and $e_{ijk}$ are independent unobserved random variables with

$a_i \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0,\sigma_a^2)$, $b_j \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0,\sigma_b^2)$, $c_{ij} \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0,\sigma_c^2)$, $e_{ijk} \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0,\sigma_e^2)$.   (3.38)

In other words, the particular observed $A$ rows and $B$ columns are independently drawn from a large population of row and column effects, respectively. Some authors write factor $c_{ij}$ as $(ab)_{ij}$ to emphasize that it represents the interaction of $a_i$ and $b_j$.

It follows from (3.37) and (3.38) that $\mathbb{E}[Y_{ijk}] = \mu$, $\mathrm{Var}(Y_{ijk}) = \sigma_Y^2 = \sigma_a^2+\sigma_b^2+\sigma_c^2+\sigma_e^2$, $\mathrm{Cov}(Y_{ijk},Y_{ijk'}) = \sigma_a^2+\sigma_b^2+\sigma_c^2$, $\mathrm{Cov}(Y_{ijk},Y_{ij'k'}) = \sigma_a^2$, and $\mathrm{Cov}(Y_{ijk},Y_{i'jk'}) = \sigma_b^2$, where, as in (3.3), $i'$ is an element in $\{1,2,\ldots,A\}\setminus i$, etc. If $\sigma_c^2 = 0$, then the model is additive, as discussed below in Section 3.2.1.2; otherwise, there are interaction effects between the two classes. Unlike in the two-way fixed effects ANOVA, whereby inclusion of the interaction terms implies $AB$ additional parameters (albeit subject to constraints), for the two-way crossed REM only a single additional parameter, $\sigma_c^2$, is required. Nevertheless, precise estimation of variance components is not possible with typical sample sizes (as seen from the often depressingly large width of confidence intervals), so that removal of $\sigma_c^2$, if justified, is beneficial for estimation of the remaining variance components.

As with the two-way ANOVA with fixed effects, we stack the $Y_{ijk}$ in the $ABn\times 1$ vector $Y$ in lexicographic order, such that index $k$ changes fastest, followed by index $j$, and then index $i$, and similarly for the error vector $e$. With $a=(a_1,\ldots,a_A)'$, $b=(b_1,\ldots,b_B)'$, and $c=(c_{11},c_{12},\ldots,c_{AB})'$, and recalling the matrices in the fixed effects case (2.48) and (2.62),

$Y = (1_A\otimes 1_B\otimes 1_n)\mu + (I_A\otimes 1_B\otimes 1_n)a + (1_A\otimes I_B\otimes 1_n)b + (I_A\otimes I_B\otimes 1_n)c + (I_A\otimes I_B\otimes I_n)e$,   (3.39)

from which the elegant structure reveals itself, and can be straightforwardly used to express $Y$ for higher-order crossed models. Of course, (3.39) simplifies somewhat for computational purposes as

$Y = (1_{ABn})\mu + (I_A\otimes 1_{Bn})a + (1_A\otimes I_B\otimes 1_n)b + (I_{AB}\otimes 1_n)c + e = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$,   (3.40)

where $X = 1_{ABn}$, $\boldsymbol{\beta} = \mu$, and $\boldsymbol{\epsilon}$ is the rest of (3.40). Thus, $Y \sim \mathrm{N}_{ABn}(\boldsymbol{\mu},\boldsymbol{\Sigma})$, where $\boldsymbol{\mu} = X\boldsymbol{\beta}$ and, from (3.40),

$\boldsymbol{\Sigma} = \mathbb{V}(Y) = \mathbb{V}(\boldsymbol{\epsilon}) = (I_A\otimes 1_{Bn})\,\mathrm{Var}(a)\,(I_A\otimes 1_{Bn})' + \cdots + \mathrm{Var}(e) = (I_A\otimes J_B\otimes J_n)\sigma_a^2 + (J_A\otimes I_B\otimes J_n)\sigma_b^2 + (I_A\otimes I_B\otimes J_n)\sigma_c^2 + (I_A\otimes I_B\otimes I_n)\sigma_e^2$,   (3.41)

after some simplification similar to that used to obtain (3.6). Observe also the predictable pattern in (3.41), allowing for easy extension to higher-order (balanced, crossed, random effects) models.
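To see the pattern in (3.41) concretely, the covariance matrix can be assembled directly with Kronecker products. The following is a minimal Matlab sketch, not taken from the book's listings; the dimensions and variance-component values are purely illustrative.

% Build Sigma = V(Y) for the balanced two-way crossed REM, as in (3.41).
A = 4; B = 3; n = 2;                           % illustrative sizes
s2a = 0.4; s2b = 0.3; s2c = 0.2; s2e = 0.8;    % illustrative variance components
IA = eye(A); IB = eye(B); In = eye(n);
JA = ones(A); JB = ones(B); Jn = ones(n);
Sigma = kron(IA, kron(JB, Jn)) * s2a ...
      + kron(JA, kron(IB, Jn)) * s2b ...
      + kron(IA, kron(IB, Jn)) * s2c ...
      + kron(IA, kron(IB, In)) * s2e;
% Sigma is ABn x ABn, with blocks mirroring the lexicographic stacking of Y
% (index k fastest, then j, then i); it can be passed to a multivariate normal
% log-likelihood or used to simulate Y directly.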

Thus, the likelihood is easily expressed and, similar to the Matlab exercise in Section 3.1.1, the reader is invited to develop the code to compute the m.l.e. and approximate parameter standard errors. Note that (3.41) can be used for simulation, as was done in Listing 3.2, but it is perhaps easier to simulate $a$, $b$, and $c$, and use (3.37) directly, with a triple for loop, outputting also the classes. The generated data can be output to a text file and read in and analyzed by SAS, as in Section 3.1.5.
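As one possible realization of that suggestion, the following Matlab sketch simulates a balanced data set from (3.37) and (3.38) with a triple for loop, keeping the class labels so that the result can be written to a text file for SAS. The sizes and variance components are illustrative (only $A=20$ and $n=10$ come from the running example), and the file name is hypothetical.

% Simulate the two-way crossed REM (3.37) with interaction.
A = 20; B = 6; n = 10; mu = 50;
sa = sqrt(0.4); sb = sqrt(0.3); sc = sqrt(0.2); se = sqrt(0.8);
a = sa*randn(A,1); b = sb*randn(B,1); c = sc*randn(A,B);
dat = zeros(A*B*n, 4); r = 0;                  % columns: i, j, k, Y
for i = 1:A
  for j = 1:B
    for k = 1:n
      r = r + 1;
      dat(r,:) = [i, j, k, mu + a(i) + b(j) + c(i,j) + se*randn];
    end
  end
end
dlmwrite('crossed2way.txt', dat, 'delimiter', '\t');   % hypothetical file for SAS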

We now turn to the basic distribution theory associated with the model. As always, we start with the identity

$Y_{ijk} = \bar Y_{\bullet\bullet\bullet} + (\bar Y_{i\bullet\bullet}-\bar Y_{\bullet\bullet\bullet}) + (\bar Y_{\bullet j\bullet}-\bar Y_{\bullet\bullet\bullet}) + (\bar Y_{ij\bullet}-\bar Y_{i\bullet\bullet}-\bar Y_{\bullet j\bullet}+\bar Y_{\bullet\bullet\bullet}) + (Y_{ijk}-\bar Y_{ij\bullet})$.   (3.42)

Theorem 3.4 Independence and Distribution. Squaring each term in (3.42) and summing over all subscripts results in all cross terms being zero, so that

$\mathrm{SS}_T = \mathrm{SS}_\mu + \mathrm{SS}_a + \mathrm{SS}_b + \mathrm{SS}_c + \mathrm{SS}_e$,   (3.43)

where each SS term corresponds to its counterpart in (3.42) from left to right.

The r.h.s. SS values in (3.43) are mutually independent, and

$\dfrac{\mathrm{SS}_\mu}{\gamma_\mu} \sim \chi^2_1\!\left(\dfrac{ABn\mu^2}{\gamma_\mu}\right)$, $\dfrac{\mathrm{SS}_a}{\gamma_a} \sim \chi^2_{A-1}$, $\dfrac{\mathrm{SS}_b}{\gamma_b} \sim \chi^2_{B-1}$, $\dfrac{\mathrm{SS}_c}{\gamma_c} \sim \chi^2_{(A-1)(B-1)}$,   (3.44)

and $\mathrm{SS}_e/\sigma_e^2 \sim \chi^2_{AB(n-1)}$, where

$\gamma_\mu = Bn\sigma_a^2 + An\sigma_b^2 + n\sigma_c^2 + \sigma_e^2$, $\gamma_a = Bn\sigma_a^2 + n\sigma_c^2 + \sigma_e^2$, $\gamma_b = An\sigma_b^2 + n\sigma_c^2 + \sigma_e^2$, $\gamma_c = n\sigma_c^2 + \sigma_e^2$.

The corresponding ANOVA table is given in Table 3.2, along with the EMS values.

Proof: See Problem 3.1. ◾

Inspecting the EMS values in Table 3.2 immediately gives the ANOVA method point estimators

$\hat\sigma_e^2 = \mathrm{MS}_e$, $\hat\sigma_c^2 = \dfrac{\mathrm{MS}_c - \mathrm{MS}_e}{n}$, $\hat\sigma_b^2 = \dfrac{\mathrm{MS}_b - \mathrm{MS}_c}{An}$, $\hat\sigma_a^2 = \dfrac{\mathrm{MS}_a - \mathrm{MS}_c}{Bn}$.   (3.45)
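As a sketch of how the sums of squares in Table 3.2 and the estimators in (3.45) might be computed from a balanced data set, suppose the observations are held in an $A\times B\times n$ Matlab array Y (a hypothetical input, e.g., obtained by reshaping simulated data); the variable names are ours, and implicit expansion (R2016b or later) is assumed.

% ANOVA method point estimators (3.45) for the two-way crossed REM.
[A, B, n] = size(Y);
Ybar_ij = mean(Y, 3);                         % cell means, A x B
Ybar_i  = mean(Ybar_ij, 2);                   % row means,  A x 1
Ybar_j  = mean(Ybar_ij, 1);                   % column means, 1 x B
Ybar    = mean(Ybar_ij(:));                   % grand mean
SSa = B*n * sum((Ybar_i - Ybar).^2);
SSb = A*n * sum((Ybar_j - Ybar).^2);
SSc = n * sum(sum((Ybar_ij - Ybar_i - Ybar_j + Ybar).^2));
res = Y - repmat(Ybar_ij, [1 1 n]);
SSe = sum(res(:).^2);
MSa = SSa/(A-1);  MSb = SSb/(B-1);
MSc = SSc/((A-1)*(B-1));  MSe = SSe/(A*B*(n-1));
s2e_hat = MSe;                s2c_hat = (MSc - MSe)/n;
s2b_hat = (MSb - MSc)/(A*n);  s2a_hat = (MSa - MSc)/(B*n);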

Table 3.2 ANOVA table for the balanced two-factor crossed REM.

Source   df            SS                                                                                                 EMS
Mean     1             $ABn\bar Y_{\bullet\bullet\bullet}^2$                                                              $\sigma_e^2 + n\sigma_c^2 + An\sigma_b^2 + Bn\sigma_a^2 + ABn\mu^2$
A        $A-1$         $Bn\sum_{i=1}^A (\bar Y_{i\bullet\bullet}-\bar Y_{\bullet\bullet\bullet})^2$                       $\sigma_e^2 + n\sigma_c^2 + Bn\sigma_a^2$
B        $B-1$         $An\sum_{j=1}^B (\bar Y_{\bullet j\bullet}-\bar Y_{\bullet\bullet\bullet})^2$                      $\sigma_e^2 + n\sigma_c^2 + An\sigma_b^2$
AB       $(A-1)(B-1)$  $n\sum_{i=1}^A\sum_{j=1}^B (\bar Y_{ij\bullet}-\bar Y_{i\bullet\bullet}-\bar Y_{\bullet j\bullet}+\bar Y_{\bullet\bullet\bullet})^2$   $\sigma_e^2 + n\sigma_c^2$
Error    $AB(n-1)$     $\sum_{i=1}^A\sum_{j=1}^B\sum_{k=1}^n (Y_{ijk}-\bar Y_{ij\bullet})^2$                              $\sigma_e^2$
Total    $ABn$         $\sum_{i=1}^A\sum_{j=1}^B\sum_{k=1}^n Y_{ijk}^2$

Calculations similar to those in (3.19) and (3.20) lead to expressions for the variances of these estimators,

$\mathrm{Var}(\hat\sigma_e^2) = \dfrac{2\sigma_e^4}{AB(n-1)}$, $\mathrm{Var}(\hat\sigma_c^2) = \dfrac{2}{n^2}\left[\dfrac{(\sigma_e^2+n\sigma_c^2)^2}{(A-1)(B-1)} + \dfrac{\sigma_e^4}{AB(n-1)}\right]$,

$\mathrm{Var}(\hat\sigma_a^2) = \dfrac{2}{B^2 n^2}\left[\dfrac{(\sigma_e^2+n\sigma_c^2+Bn\sigma_a^2)^2}{A-1} + \dfrac{(\sigma_e^2+n\sigma_c^2)^2}{(A-1)(B-1)}\right]$,   (3.46)

$\mathrm{Var}(\hat\sigma_b^2) = \dfrac{2}{A^2 n^2}\left[\dfrac{(\sigma_e^2+n\sigma_c^2+An\sigma_b^2)^2}{B-1} + \dfrac{(\sigma_e^2+n\sigma_c^2)^2}{(A-1)(B-1)}\right]$,

which can be used to form Wald confidence intervals for the variance components.
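A minimal sketch of the resulting Wald intervals follows, assuming the point estimates and design sizes from the earlier sketch are in the workspace and plugging the estimates into (3.46); norminv requires Matlab's Statistics Toolbox.

% Approximate 95% Wald intervals for the variance components, from (3.46).
z  = norminv(0.975);
Ve = 2*s2e_hat^2 / (A*B*(n-1));
Vc = (2/n^2) * ((s2e_hat + n*s2c_hat)^2/((A-1)*(B-1)) + s2e_hat^2/(A*B*(n-1)));
Va = (2/(B^2*n^2)) * ((s2e_hat + n*s2c_hat + B*n*s2a_hat)^2/(A-1) ...
                    + (s2e_hat + n*s2c_hat)^2/((A-1)*(B-1)));
Vb = (2/(A^2*n^2)) * ((s2e_hat + n*s2c_hat + A*n*s2b_hat)^2/(B-1) ...
                    + (s2e_hat + n*s2c_hat)^2/((A-1)*(B-1)));
CIe = s2e_hat + [-1 1]*z*sqrt(Ve);
CIc = s2c_hat + [-1 1]*z*sqrt(Vc);
CIb = s2b_hat + [-1 1]*z*sqrt(Vb);
CIa = s2a_hat + [-1 1]*z*sqrt(Va);
% Lower endpoints can be negative and are often truncated at zero in practice.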

Unlike in the one-way case, this model is such that the m.l.e. does not have a closed-form solution (see, e.g., Sahai and Ojeda, 2004, Sec. 4.4.2), except for $\hat\sigma^2_{e,\mathrm{ML}}$, which is the same as in (3.45). No doubt (3.45) will be close to the m.l.e., and thus, if all estimates are positive, will serve as excellent starting values for numeric computation of the m.l.e. One can correctly speculate that, in all higher-order crossed models, the m.l.e. (or, more correctly, a solution to the set of log-likelihood derivative equations) is not expressible in closed form.

Similar to the motivation for the F test in (3.17) pertaining to the one-way REM case, the EMS values in Table 3.2, and the independence of the sums of squares, suggest the following F tests for $\sigma_a^2$, $\sigma_b^2$, and $\sigma_c^2$, respectively, with $P := (A-1)(B-1)$:

$F_a = \dfrac{\mathrm{MS}_a}{\mathrm{MS}_c} \sim \dfrac{\gamma_a}{\gamma_c}\,F_{A-1,P}$, $F_b = \dfrac{\mathrm{MS}_b}{\mathrm{MS}_c} \sim \dfrac{\gamma_b}{\gamma_c}\,F_{B-1,P}$, $F_c = \dfrac{\mathrm{MS}_c}{\mathrm{MS}_e} \sim \dfrac{\gamma_c}{\sigma_e^2}\,F_{P,AB(n-1)}$.   (3.47)
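In code, the statistics in (3.47) and their p-values might be obtained as follows, again using the mean squares computed earlier (fcdf is from the Statistics Toolbox).

% F tests (3.47) for sigma_a^2, sigma_b^2, and sigma_c^2.
P  = (A-1)*(B-1);
Fa = MSa/MSc;   pa = 1 - fcdf(Fa, A-1, P);
Fb = MSb/MSc;   pb = 1 - fcdf(Fb, B-1, P);
Fc = MSc/MSe;   pc = 1 - fcdf(Fc, P, A*B*(n-1));
% Under the null that the corresponding variance component is zero, the scale
% factor (gamma_a/gamma_c, etc.) equals one, so each ratio is F distributed.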

As an example of an exact confidence interval, from $F_c$ in (3.47),

$1-\alpha = \Pr\!\left(\dfrac{L}{F_c} < \dfrac{\sigma_e^2}{n\sigma_c^2+\sigma_e^2} < \dfrac{U}{F_c}\right) = \Pr\!\left(\dfrac{F_c U^{-1}-1}{n} < \dfrac{\sigma_c^2}{\sigma_e^2} < \dfrac{F_c L^{-1}-1}{n}\right) = \Pr\!\left(\dfrac{F_c-U}{nU+F_c-U} < \dfrac{\sigma_c^2}{\sigma_c^2+\sigma_e^2} < \dfrac{F_c-L}{nL+F_c-L}\right)$,   (3.48)

where $L$ and $U$ are such that $\Pr(L \leqslant F_{P,AB(n-1)} \leqslant U) = 1-\alpha$.
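A sketch of (3.48) in Matlab, assuming equal tail areas are used to choose $L$ and $U$ (other splits are possible) and the quantities from the earlier sketches are available:

% Exact 1-alpha confidence interval (3.48) for sigma_c^2/(sigma_c^2+sigma_e^2).
alpha = 0.05;
L  = finv(alpha/2,   P, A*B*(n-1));
U  = finv(1-alpha/2, P, A*B*(n-1));
lo = (Fc - U) / (n*U + Fc - U);
hi = (Fc - L) / (n*L + Fc - L);
% The lower endpoint can be negative, in which case it is commonly truncated at zero.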

Not having pivots, exact confidence intervals for the individual variance components do not exist, though one can use asymptotic pivots via the Wald intervals formed from (3.46), as well as the easily derived ones using the Satterthwaite method from Section 3.1.4. In particular, the latter are

$1-\alpha \approx \Pr\!\left(\dfrac{\hat d\,(\mathrm{MS}_c-\mathrm{MS}_e)}{n\,u} \leqslant \sigma_c^2 \leqslant \dfrac{\hat d\,(\mathrm{MS}_c-\mathrm{MS}_e)}{n\,l}\right)$, $\hat d = \dfrac{(\mathrm{MS}_c-\mathrm{MS}_e)^2}{\dfrac{(\mathrm{MS}_c)^2}{(A-1)(B-1)} + \dfrac{(\mathrm{MS}_e)^2}{AB(n-1)}}$,

$1-\alpha \approx \Pr\!\left(\dfrac{\hat d\,(\mathrm{MS}_b-\mathrm{MS}_c)}{An\,u} \leqslant \sigma_b^2 \leqslant \dfrac{\hat d\,(\mathrm{MS}_b-\mathrm{MS}_c)}{An\,l}\right)$, $\hat d = \dfrac{(\mathrm{MS}_b-\mathrm{MS}_c)^2}{\dfrac{(\mathrm{MS}_b)^2}{B-1} + \dfrac{(\mathrm{MS}_c)^2}{(A-1)(B-1)}}$, and

$1-\alpha \approx \Pr\!\left(\dfrac{\hat d\,(\mathrm{MS}_a-\mathrm{MS}_c)}{Bn\,u} \leqslant \sigma_a^2 \leqslant \dfrac{\hat d\,(\mathrm{MS}_a-\mathrm{MS}_c)}{Bn\,l}\right)$, $\hat d = \dfrac{(\mathrm{MS}_a-\mathrm{MS}_c)^2}{\dfrac{(\mathrm{MS}_a)^2}{A-1} + \dfrac{(\mathrm{MS}_c)^2}{(A-1)(B-1)}}$,

for $u$ and $l$ such that $1-\alpha = \Pr(l \leqslant \chi^2_{\hat d} \leqslant u)$.
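For example, the first of these (for $\sigma_c^2$) could be computed as in the following sketch, with $\hat d$ as given above; chi2inv is from the Statistics Toolbox and accepts non-integer degrees of freedom.

% Satterthwaite interval for sigma_c^2, equal-tailed.
alpha = 0.05;
dhat  = (MSc - MSe)^2 / (MSc^2/((A-1)*(B-1)) + MSe^2/(A*B*(n-1)));
l  = chi2inv(alpha/2,   dhat);
u  = chi2inv(1-alpha/2, dhat);
lo = dhat*(MSc - MSe)/(n*u);
hi = dhat*(MSc - MSe)/(n*l);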

3.2.1.2 Without Interaction Term

If the analyst decides that the magnitude of $\sigma_c^2$ is negligible compared to the other variance components (typically as a result of failure to reject the null hypothesis that $\sigma_c^2 = 0$, based on the $F_c$ test in (3.47), using a conventional test significance level, though possibly also coupled with theoretical knowledge of the process and/or results of previous, similar studies), then the model may be assumed additive. In this case, (3.45) becomes

$\hat\sigma_e^2 = \mathrm{MS}_e$, $\hat\sigma_b^2 = \dfrac{\mathrm{MS}_b - \mathrm{MS}_e}{An}$, $\hat\sigma_a^2 = \dfrac{\mathrm{MS}_a - \mathrm{MS}_e}{Bn}$.   (3.49)

By squaring and summing identity (3.42) without the interaction term, i.e.,

$Y_{ijk} = \bar Y_{\bullet\bullet\bullet} + (\bar Y_{i\bullet\bullet}-\bar Y_{\bullet\bullet\bullet}) + (\bar Y_{\bullet j\bullet}-\bar Y_{\bullet\bullet\bullet}) + (Y_{ijk}-\bar Y_{i\bullet\bullet}-\bar Y_{\bullet j\bullet}+\bar Y_{\bullet\bullet\bullet})$,   (3.50)

it is straightforward to verify that all cross terms are zero, so that

$\mathrm{SS}_T = \mathrm{SS}_\mu + \mathrm{SS}_a + \mathrm{SS}_b + \mathrm{SS}_e$,

where, as before, the r.h.s. SS values are mutually independent (see Problem 3.2). This results in ANOVA Table 3.3.

The distributions of $\mathrm{SS}_\mu/\gamma_\mu$, $\mathrm{SS}_a/\gamma_a$, $\mathrm{SS}_b/\gamma_b$, and $\mathrm{SS}_e/\sigma_e^2$ are the same as those in (3.44), but with $\gamma_i$ values such that $\sigma_c^2 = 0$, i.e.,

$\gamma_\mu = Bn\sigma_a^2 + An\sigma_b^2 + \sigma_e^2$, $\gamma_a = Bn\sigma_a^2 + \sigma_e^2$, and $\gamma_b = An\sigma_b^2 + \sigma_e^2$.

The first two F tests in (3.47) now become

$F_a = \dfrac{\mathrm{MS}_a}{\mathrm{MS}_e} \sim \dfrac{\gamma_a}{\sigma_e^2}\,F_{A-1,d}$, $F_b = \dfrac{\mathrm{MS}_b}{\mathrm{MS}_e} \sim \dfrac{\gamma_b}{\sigma_e^2}\,F_{B-1,d}$,   (3.51)

where the denominator degrees of freedom in (3.51) is $d = ABn - A - B + 1$.

3.2.2 Three Factors

Now with three crossed factors, we observe $Y_{ijkl}$, the $l$th observation in the $i$th row, $j$th column, and $k$th "pipe", $i=1,\ldots,A$, $j=1,\ldots,B$, $k=1,\ldots,C$, $l=1,\ldots,n$, and assume that $Y_{ijkl}$ can be represented as

$Y_{ijkl} = \mu + a_i + b_j + c_k + d_{ij} + f_{ik} + g_{jk} + h_{ijk} + e_{ijkl}$,   (3.52)

where the $a_i$, $b_j$, $c_k$, $d_{ij}$, $f_{ik}$, $g_{jk}$, $h_{ijk}$, and $e_{ijkl}$ are independent unobserved random variables with $a_i \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0,\sigma_a^2)$, ..., $e_{ijkl} \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0,\sigma_e^2)$. As with the two-factor case, the particular observed $A$ rows, $B$ columns, and $C$ pipes are independently drawn from a large population of row, column, and pipe effects, respectively. The model is additive if all interaction terms are zero.

Table 3.3 ANOVA table for the balanced two-factor crossed additive (no interaction effect) random effects model, where the error degrees of freedom is $d = ABn - A - B + 1$.

Source   df       SS                                                                                                                         EMS
Mean     1        $ABn\bar Y_{\bullet\bullet\bullet}^2$                                                                                      $\sigma_e^2 + An\sigma_b^2 + Bn\sigma_a^2 + ABn\mu^2$
A        $A-1$    $Bn\sum_{i=1}^A (\bar Y_{i\bullet\bullet}-\bar Y_{\bullet\bullet\bullet})^2$                                               $\sigma_e^2 + Bn\sigma_a^2$
B        $B-1$    $An\sum_{j=1}^B (\bar Y_{\bullet j\bullet}-\bar Y_{\bullet\bullet\bullet})^2$                                              $\sigma_e^2 + An\sigma_b^2$
Error    $d$      $\sum_{i=1}^A\sum_{j=1}^B\sum_{k=1}^n (Y_{ijk}-\bar Y_{i\bullet\bullet}-\bar Y_{\bullet j\bullet}+\bar Y_{\bullet\bullet\bullet})^2$   $\sigma_e^2$
Total    $ABn$    $\sum_{i=1}^A\sum_{j=1}^B\sum_{k=1}^n Y_{ijk}^2$

As a logical extension of (3.42), the identity

$Y_{ijkl} = \bar Y_{\bullet\bullet\bullet\bullet} + (\bar Y_{i\bullet\bullet\bullet}-\bar Y_{\bullet\bullet\bullet\bullet}) + (\bar Y_{\bullet j\bullet\bullet}-\bar Y_{\bullet\bullet\bullet\bullet}) + (\bar Y_{\bullet\bullet k\bullet}-\bar Y_{\bullet\bullet\bullet\bullet}) + (\bar Y_{ij\bullet\bullet}-\bar Y_{i\bullet\bullet\bullet}-\bar Y_{\bullet j\bullet\bullet}+\bar Y_{\bullet\bullet\bullet\bullet}) + (\bar Y_{i\bullet k\bullet}-\bar Y_{i\bullet\bullet\bullet}-\bar Y_{\bullet\bullet k\bullet}+\bar Y_{\bullet\bullet\bullet\bullet}) + (\bar Y_{\bullet jk\bullet}-\bar Y_{\bullet j\bullet\bullet}-\bar Y_{\bullet\bullet k\bullet}+\bar Y_{\bullet\bullet\bullet\bullet}) + (\bar Y_{ijk\bullet}-\bar Y_{ij\bullet\bullet}-\bar Y_{i\bullet k\bullet}-\bar Y_{\bullet jk\bullet}+\bar Y_{i\bullet\bullet\bullet}+\bar Y_{\bullet j\bullet\bullet}+\bar Y_{\bullet\bullet k\bullet}-\bar Y_{\bullet\bullet\bullet\bullet}) + (Y_{ijkl}-\bar Y_{ijk\bullet})$

suggests itself as the correct one to use. Note how, omitting the $\bar Y_{\bullet\bullet\bullet\bullet}$ term throughout, the bracketed terms (except the last) are generated analogously to the structure of the inclusion-exclusion principle (Poincaré's theorem) in representing the union of events in basic probability. We will presume that, upon squaring and summing, calculations similar to (and correspondingly more tedious than) those of Problem 3.1 for the two-factor case give rise to the sums of squares decomposition

$\mathrm{SS}_T = \mathrm{SS}_\mu + \mathrm{SS}_a + \mathrm{SS}_b + \mathrm{SS}_c + \mathrm{SS}_d + \mathrm{SS}_f + \mathrm{SS}_g + \mathrm{SS}_h + \mathrm{SS}_e$,

such that the r.h.s. SS values are mutually independent (see, e.g., Graybill, 1976, p. 641 for details). These are shown in Table 3.4 along with their corresponding EMS. Furthermore, letting $A' = A-1$, $B' = B-1$, and $C' = C-1$, we state their distributions without proof as

$\dfrac{\mathrm{SS}_\mu}{\gamma_\mu} \sim \chi^2_1\!\left(\dfrac{ABCn\mu^2}{\gamma_\mu}\right)$, $\dfrac{\mathrm{SS}_a}{\gamma_a} \sim \chi^2_{A'}$, $\dfrac{\mathrm{SS}_b}{\gamma_b} \sim \chi^2_{B'}$, $\dfrac{\mathrm{SS}_c}{\gamma_c} \sim \chi^2_{C'}$, $\dfrac{\mathrm{SS}_d}{\gamma_d} \sim \chi^2_{A'B'}$, $\dfrac{\mathrm{SS}_f}{\gamma_f} \sim \chi^2_{A'C'}$, $\dfrac{\mathrm{SS}_g}{\gamma_g} \sim \chi^2_{B'C'}$, $\dfrac{\mathrm{SS}_h}{\gamma_h} \sim \chi^2_{A'B'C'}$,

and $\mathrm{SS}_e/\sigma_e^2 \sim \chi^2_{ABC(n-1)}$, where $\gamma_\mu$, $\gamma_a$, etc., are given in Table 3.4.

Table 3.4 ANOVA table for the balanced three-factor crossed REM.

Effect      df                   SS                                                                                                                                                                              EMS
$\mu$       1                    $ABCn\bar Y_{\bullet\bullet\bullet\bullet}^2$                                                                                                                                    $\gamma_\mu = \sigma_e^2 + n\sigma_h^2 + An\sigma_g^2 + Bn\sigma_f^2 + Cn\sigma_d^2 + ABn\sigma_c^2 + ACn\sigma_b^2 + BCn\sigma_a^2 + ABCn\mu^2$
$a_i$       $A-1$                $BCn\sum (\bar Y_{i\bullet\bullet\bullet}-\bar Y_{\bullet\bullet\bullet\bullet})^2$                                                                                              $\gamma_a = \sigma_e^2 + n\sigma_h^2 + Bn\sigma_f^2 + Cn\sigma_d^2 + BCn\sigma_a^2$
$b_j$       $B-1$                $ACn\sum (\bar Y_{\bullet j\bullet\bullet}-\bar Y_{\bullet\bullet\bullet\bullet})^2$                                                                                             $\gamma_b = \sigma_e^2 + n\sigma_h^2 + An\sigma_g^2 + Cn\sigma_d^2 + ACn\sigma_b^2$
$c_k$       $C-1$                $ABn\sum (\bar Y_{\bullet\bullet k\bullet}-\bar Y_{\bullet\bullet\bullet\bullet})^2$                                                                                             $\gamma_c = \sigma_e^2 + n\sigma_h^2 + An\sigma_g^2 + Bn\sigma_f^2 + ABn\sigma_c^2$
$d_{ij}$    $(A-1)(B-1)$         $Cn\sum\sum (\bar Y_{ij\bullet\bullet}-\bar Y_{i\bullet\bullet\bullet}-\bar Y_{\bullet j\bullet\bullet}+\bar Y_{\bullet\bullet\bullet\bullet})^2$                                $\gamma_d = \sigma_e^2 + n\sigma_h^2 + Cn\sigma_d^2$
$f_{ik}$    $(A-1)(C-1)$         $Bn\sum\sum (\bar Y_{i\bullet k\bullet}-\bar Y_{i\bullet\bullet\bullet}-\bar Y_{\bullet\bullet k\bullet}+\bar Y_{\bullet\bullet\bullet\bullet})^2$                               $\gamma_f = \sigma_e^2 + n\sigma_h^2 + Bn\sigma_f^2$
$g_{jk}$    $(B-1)(C-1)$         $An\sum\sum (\bar Y_{\bullet jk\bullet}-\bar Y_{\bullet j\bullet\bullet}-\bar Y_{\bullet\bullet k\bullet}+\bar Y_{\bullet\bullet\bullet\bullet})^2$                              $\gamma_g = \sigma_e^2 + n\sigma_h^2 + An\sigma_g^2$
$h_{ijk}$   $(A-1)(B-1)(C-1)$    $n\sum\sum\sum (\bar Y_{ijk\bullet}-\bar Y_{ij\bullet\bullet}-\bar Y_{i\bullet k\bullet}-\bar Y_{\bullet jk\bullet}+\bar Y_{i\bullet\bullet\bullet}+\bar Y_{\bullet j\bullet\bullet}+\bar Y_{\bullet\bullet k\bullet}-\bar Y_{\bullet\bullet\bullet\bullet})^2$   $\gamma_h = \sigma_e^2 + n\sigma_h^2$
$e_{ijkl}$  $ABC(n-1)$           $\sum\sum\sum\sum (Y_{ijkl}-\bar Y_{ijk\bullet})^2$                                                                                                                              $\sigma_e^2$
Total       $ABCn$               $\sum\sum\sum\sum Y_{ijkl}^2$

Some staring at the EMS values in Table 3.4 yields the ANOVA method point estimators

$\hat\sigma_e^2 = \mathrm{MS}_e$, $\hat\sigma_h^2 = \dfrac{\mathrm{MS}_h - \mathrm{MS}_e}{n}$,   (3.53)

and

$\hat\sigma_g^2 = \dfrac{\mathrm{MS}_g - \mathrm{MS}_h}{An}$, $\hat\sigma_f^2 = \dfrac{\mathrm{MS}_f - \mathrm{MS}_h}{Bn}$, $\hat\sigma_d^2 = \dfrac{\mathrm{MS}_d - \mathrm{MS}_h}{Cn}$.   (3.54)

For $\sigma_c^2$, it appears that use of the set $\{\gamma_c, \gamma_f, \gamma_g, \gamma_h\}$ will be fruitful, with

$\gamma_c = \sigma_e^2 + n\sigma_h^2 + An\sigma_g^2 + Bn\sigma_f^2 + ABn\sigma_c^2$, $\gamma_f = \sigma_e^2 + n\sigma_h^2 + Bn\sigma_f^2$, $\gamma_g = \sigma_e^2 + n\sigma_h^2 + An\sigma_g^2$, $\gamma_h = \sigma_e^2 + n\sigma_h^2$,

yielding

$\hat\sigma_c^2 = \dfrac{\mathrm{MS}_c - \mathrm{MS}_f - \mathrm{MS}_g + \mathrm{MS}_h}{ABn}$,   (3.55)

and, similarly,

$\hat\sigma_b^2 = \dfrac{\mathrm{MS}_b - \mathrm{MS}_d - \mathrm{MS}_g + \mathrm{MS}_h}{ACn}$, $\hat\sigma_a^2 = \dfrac{\mathrm{MS}_a - \mathrm{MS}_d - \mathrm{MS}_f + \mathrm{MS}_h}{BCn}$.   (3.56)
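Given the mean squares from Table 3.4 (assumed already computed, with the obvious names; they are not produced by the earlier two-factor sketches), the estimators (3.53) to (3.56) are simple arithmetic:

% ANOVA method estimators for the three-factor crossed REM, (3.53)-(3.56).
s2e_hat = MSe;
s2h_hat = (MSh - MSe)/n;
s2g_hat = (MSg - MSh)/(A*n);
s2f_hat = (MSf - MSh)/(B*n);
s2d_hat = (MSd - MSh)/(C*n);
s2c_hat = (MSc - MSf - MSg + MSh)/(A*B*n);
s2b_hat = (MSb - MSd - MSg + MSh)/(A*C*n);
s2a_hat = (MSa - MSd - MSf + MSh)/(B*C*n);
% As in the two-factor case, negative estimates can occur and are often set to zero.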

Exact F tests for the second- and third-order interactions can be seen directly from the ANOVA table to be, with $P = A'B'C'$,

$F_d = \dfrac{\mathrm{MS}_d}{\mathrm{MS}_h} \sim \dfrac{\gamma_d}{\gamma_h}\,F_{A'B',P}$, $F_f = \dfrac{\mathrm{MS}_f}{\mathrm{MS}_h} \sim \dfrac{\gamma_f}{\gamma_h}\,F_{A'C',P}$, $F_g = \dfrac{\mathrm{MS}_g}{\mathrm{MS}_h} \sim \dfrac{\gamma_g}{\gamma_h}\,F_{B'C',P}$, $F_h = \dfrac{\mathrm{MS}_h}{\mathrm{MS}_e} \sim \dfrac{\gamma_h}{\sigma_e^2}\,F_{P,ABC(n-1)}$.

As (3.55) and (3.56) suggest, from the EMS in Table 3.4, there do not exist exact F-ratios for testing $\sigma_a^2$, $\sigma_b^2$, and $\sigma_c^2$. For, say, $\sigma_a^2$, this would require there being a single EMS that is exactly equal to $\mathbb{E}[\mathrm{MS}_a] - BCn\sigma_a^2$. Notice though, from $\hat\sigma_a^2$ in (3.56), that

$\mathbb{E}[\mathrm{MS}_d] + \mathbb{E}[\mathrm{MS}_f] - \mathbb{E}[\mathrm{MS}_h] = \sigma_e^2 + n\sigma_h^2 + Bn\sigma_f^2 + Cn\sigma_d^2 = \mathbb{E}[\mathrm{MS}_a] - BCn\sigma_a^2$,   (3.57)

so that the ratio

$F_a' = \dfrac{\mathrm{MS}_a}{\mathrm{MS}_d + \mathrm{MS}_f - \mathrm{MS}_h}$   (3.58)

is a test statistic such that large values would reject the null of $\sigma_a^2 = 0$, but it is not F distributed. Its distribution could be approximated by applying the Satterthwaite method to the denominator. A potentially problematic issue with $F_a'$ is that it can be negative, which is why the next option is favored.

Expressing (3.57) as

$\mathbb{E}[\mathrm{MS}_a] + \mathbb{E}[\mathrm{MS}_h] = BCn\sigma_a^2 + \mathbb{E}[\mathrm{MS}_d] + \mathbb{E}[\mathrm{MS}_f]$   (3.59)

yields the test statistic

$F_a = \dfrac{\mathrm{MS}_a + \mathrm{MS}_h}{\mathrm{MS}_d + \mathrm{MS}_f}$.   (3.60)

As $\mathrm{SS}_a/\mathbb{E}[\mathrm{MS}_a] = \mathrm{SS}_a/\gamma_a \sim \chi^2_{d_a}$ independent of $\mathrm{SS}_h/\mathbb{E}[\mathrm{MS}_h] = \mathrm{SS}_h/\gamma_h \sim \chi^2_{d_h}$ (for degrees of freedom $d_a = A'$ and $d_h = A'B'C'$), (3.28) from the Satterthwaite method suggests for the numerator of $F_a$ that, for some constants $h_1$ and $h_2$,

$W = \dfrac{d\,\hat\gamma}{\gamma} = d\,\dfrac{h_1(\mathrm{SS}_a/d_a) + h_2(\mathrm{SS}_h/d_h)}{h_1\mathbb{E}[\mathrm{MS}_a] + h_2\mathbb{E}[\mathrm{MS}_h]} \overset{\text{app}}{\sim} \chi^2_d$,   (3.61)

where $d$ is obtained from (3.34). But $h_1(\mathrm{SS}_a/d_a) = h_1\mathrm{MS}_a$ and $h_2(\mathrm{SS}_h/d_h) = h_2\mathrm{MS}_h$, so that (3.61) implies

$h_1\mathrm{MS}_a + h_2\mathrm{MS}_h \overset{\text{app}}{\sim} (h_1\mathbb{E}[\mathrm{MS}_a] + h_2\mathbb{E}[\mathrm{MS}_h])\,\chi^2_d/d$,

and likewise for the denominator of (3.60),

$h_3\mathrm{MS}_d + h_4\mathrm{MS}_f \overset{\text{app}}{\sim} (h_3\mathbb{E}[\mathrm{MS}_d] + h_4\mathbb{E}[\mathrm{MS}_f])\,\chi^2_{d'}/d'$,

for estimated degrees of freedom value $d'$. Dividing each of the above two expressions by the scale term (and recalling that an F random variable is the ratio of two independent chi-squares divided by their respective degrees of freedom), it follows that

$F_a = \dfrac{h_1\mathrm{MS}_a + h_2\mathrm{MS}_h}{h_3\mathrm{MS}_d + h_4\mathrm{MS}_f} \overset{\text{app}}{\sim} \dfrac{h_1\mathbb{E}[\mathrm{MS}_a] + h_2\mathbb{E}[\mathrm{MS}_h]}{h_3\mathbb{E}[\mathrm{MS}_d] + h_4\mathbb{E}[\mathrm{MS}_f]}\,F_{d,d'}$,   (3.62)

a scaled F distribution with degrees of freedom $d$ and $d'$. Furthermore, if

$h_1\mathbb{E}[\mathrm{MS}_a] + h_2\mathbb{E}[\mathrm{MS}_h] = h_3\mathbb{E}[\mathrm{MS}_d] + h_4\mathbb{E}[\mathrm{MS}_f]$,

then $F_a \overset{\text{app}}{\sim} F_{d,d'}$. But, recalling (3.59), this is the case for $h_1 = h_2 = h_3 = h_4 = 1$ and $\sigma_a^2 = 0$, so that, under the null hypothesis of $\sigma_a^2 = 0$, $F_a \overset{\text{app}}{\sim} F_{d,d'}$, where, from (3.34),

$d = \dfrac{(\mathrm{MS}_a + \mathrm{MS}_h)^2}{(\mathrm{MS}_a)^2/d_a + (\mathrm{MS}_h)^2/d_h}$ and $d' = \dfrac{(\mathrm{MS}_d + \mathrm{MS}_f)^2}{(\mathrm{MS}_d)^2/d_d + (\mathrm{MS}_f)^2/d_f}$,

for $d_d = A'B'$ and $d_f = A'C'$. Under the alternative of $\sigma_a^2 > 0$, the scale parameter in (3.62) is greater than one, implying that an approximate $\alpha$-level test rejects the null of $\sigma_a^2 = 0$ for large $F_a$, i.e., $F_a > F^\alpha_{d,d'}$, where, as always throughout, $F^\alpha_{d,d'}$ is the $100(1-\alpha)$th percent quantile of the $F_{d,d'}$ distribution.
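A sketch of this approximate test, given the relevant mean squares and with $d_a = A'$, $d_h = A'B'C'$, $d_d = A'B'$, $d_f = A'C'$ (variable names are ours; fcdf accepts non-integer degrees of freedom):

% Approximate F test of H0: sigma_a^2 = 0 via (3.60), with Satterthwaite
% degrees of freedom for numerator and denominator.
da = A-1; dh = (A-1)*(B-1)*(C-1); dd = (A-1)*(B-1); dfk = (A-1)*(C-1);
Fa = (MSa + MSh) / (MSd + MSf);
d  = (MSa + MSh)^2 / (MSa^2/da + MSh^2/dh);
dp = (MSd + MSf)^2 / (MSd^2/dd + MSf^2/dfk);
pval = 1 - fcdf(Fa, d, dp);    % reject for large Fa, i.e., small pval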

The same analysis applies to $F_a'$, i.e., $\mathrm{MS}_a \overset{\text{exact}}{\sim} \mathbb{E}[\mathrm{MS}_a]\,\chi^2_{d_a}/d_a$ and, with the Satterthwaite approximation applied to the denominator,

$F_a' = \dfrac{\mathrm{MS}_a}{\mathrm{MS}_d + \mathrm{MS}_f - \mathrm{MS}_h} \overset{\text{app}}{\sim} \dfrac{\mathbb{E}[\mathrm{MS}_a]}{\mathbb{E}[\mathrm{MS}_d] + \mathbb{E}[\mathrm{MS}_f] - \mathbb{E}[\mathrm{MS}_h]}\,F_{d_a,d''}$,

with

$d'' = \dfrac{(\mathrm{MS}_d + \mathrm{MS}_f - \mathrm{MS}_h)^2}{(\mathrm{MS}_d)^2/d_d + (\mathrm{MS}_f)^2/d_f + (\mathrm{MS}_h)^2/d_h}$.

The reader is encouraged to repeat this analysis to obtain approximate F tests for $\sigma_b^2$ and $\sigma_c^2$.

Various confidence intervals of interest can be derived from the Satterthwaite method. For example, for $\sigma_a^2/\sigma_e^2$, Burdick and Graybill (1992, p. 136) show that (at the time of their writing) no procedure other than Satterthwaite is available. This case was investigated using the general procedures for the Satterthwaite class of ratios proposed in Butler and Paolella (2002b). For the three-way crossed model and confidence intervals for $\sigma_a^2/\sigma_e^2$, the bootstrap/saddlepoint-based method resulted in highly accurate actual coverage, substantially more than use of the Satterthwaite method, as $A$ and/or $\sigma_a^2$ decrease. For large $A$ and $\sigma_a^2$, the Satterthwaite method also performs well.
