The one-way fixed effects ANOVA model detailed in Section 2.4 is straightforwardly extended to support more than one factor. Here we consider the distribution theory of the balanced model with two factors. As a simple example to help visualize matters, consider again the agricultural example at the beginning of Section 2.4.1, and imagine an experiment in a greenhouse in which interest centers on $a \geqslant 2$ levels of a fertilizer and $b \geqslant 2$ levels of water. All $ab$ combinations are set up, with $n$ replications (plants) for each.
Once the ideas for the two-way ANOVA are laid out, the basic pattern for higher-order fixed effects ANOVA models with a balanced panel will be clear, and the reader should feel comfortable with conducting a data analysis in, say, SAS or other software, and understand the output and how conclusions are (or should be) drawn.
After introducing the model in Section 2.5.1, Sections 2.5.2 and 2.5.3 present the basic theory of the cases without and with interaction, respectively, and the relevant ANOVA tables. Section 2.5.4 uses a simulated data set as an example to show the relevant coding in both Matlab and SAS.
2.5.1 The Model and Use of the Interaction Terms
For the two-way model, denote the first factor as A, with $a \geqslant 2$ treatments, and the second factor as B, with $b \geqslant 2$ treatments. The ordering of the two factors (i.e., which one is A and which one is B) is irrelevant, though, as mentioned in the Remark in Section 2.4.4, often A will refer to the factor associated with the scientific inquiry, while B is a block, accounting for differences in some attribute such as (for human studies) gender, age group, political affiliation, educational level, geographic region, time of day (see, e.g., Pope, 2016), etc., or, in industrial experiments, the factory line, etc.
The two-way fixed effects ANOVA model extends the forms in (2.19) and (2.21), and is expressed as
$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}, \qquad \epsilon_{ijk} \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0, \sigma^2), \tag{2.44}$$
for $i = 1, 2, \ldots, a$, $j = 1, 2, \ldots, b$, $k = 1, \ldots, n$, subject to the constraints
$$\sum_{i=1}^{a} \alpha_i = 0, \quad \sum_{j=1}^{b} \beta_j = 0, \quad \sum_{i=1}^{a} (\alpha\beta)_{ij} = 0 \ \ \forall j, \quad \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0 \ \ \forall i. \tag{2.45}$$
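To make the model concrete, the following Matlab sketch simulates one balanced data set from (2.44) under the constraints (2.45). The parameter values are illustrative: they are chosen so that $\sum_i \alpha_i^2 = 2/3$ and $\sum_j \beta_j^2 = 9/8$, matching the sums used in the power calculations near the end of Section 2.5.3, though $\mu$ and the seed are arbitrary and these are not necessarily the values behind the simulated data set used below. The stacking order of the observations matches that described in Section 2.5.2.

a=3; b=2; n=12; mu=10; sigma=2;   % dimensions and scale as used later in the text
alp=[-1/3 -1/3 2/3];              % sums to zero; sum of squares is 2/3
bet=[-3/4 3/4];                   % sums to zero; sum of squares is 9/8
AB=zeros(a,b);                    % interaction terms, here all zero
rng(1)                            % illustrative seed, for reproducibility
y=zeros(a*b*n,1); t=0;
for i=1:a
  for j=1:b
    for k=1:n                     % index k moves quickest
      t=t+1; y(t)=mu+alp(i)+bet(j)+AB(i,j)+sigma*randn;
    end
  end
end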
Terms $(\alpha\beta)_{ij}$ are referred to as the interaction factors (or effects, or terms). In general, the $ij$th group has $n_{ij}$ observations, $i = 1, \ldots, a$, $j = 1, \ldots, b$, and if any of the $n_{ij}$ are not equal, the model is unbalanced.
The usual ANOVA table will be shown below. It has in its output three $F$ tests and their associated $p$-values, corresponding to the null hypotheses $\alpha_1 = \cdots = \alpha_a = 0$ (no factor A effect), $\beta_1 = \cdots = \beta_b = 0$ (no factor B effect), and $(\alpha\beta)_{ij} = 0$ for all $i$ and $j$ (no interaction effect). One first inspects the latter; if the interaction effect can be deemed nonsignificant, then one proceeds to look at the former two. Violating our agreement in the Remark in Section 2.4.2 to subsequently suppress discussion of the dangers of use of $p$-values for model selection, we mention that an inspection of some published research studies, and even teaching notes on ANOVA, unfortunately reveals wording such as "As the $p$-value corresponding to the interaction effect is greater than 0.05, there is no interaction effect." A better choice of wording might be: "Based on the reported $p$-value, we will assume there is no significant interaction effect; the subsequent analysis is conducted conditional on such, with the caveat that further experimental trials would be required to draw stronger conclusions on the presence of, and notably relevance of, interaction."
Observe that, if only the interaction factor is used (along with, of course, the grand mean), i.e., $Y_{ijk} = \mu + (\alpha\beta)_{ij} + \epsilon_{ijk}$, then this is equivalent to a one-way ANOVA with $ab$ treatments. If the interaction effect is deemed significant, then the value of including the $\alpha_i$ and $\beta_j$ effects is lowered and, possibly, rendered useless, depending on the nature of the interaction. In colloquial terms, one might describe the interaction effect as the presence of synergy, or the idea that a system is more than the sum of its parts. More specifically, assuming that the $\alpha_i$ and $\beta_j$ are non-negative, the term synergy would be used if, due to the nonzero interaction effect $(\alpha\beta)_{ij}$, $\mathbb{E}[Y_{ijk}] > \mu + \alpha_i + \beta_j$, and the term antagonism would be used if $\mathbb{E}[Y_{ijk}] < \mu + \alpha_i + \beta_j$.
If there is no interaction effect (as one often hopes, as then nature is easier to describe), then the model reduces to $Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, and is such that the effect of the $i$th treatment from factor A does not depend on which treatment from factor B is used, and vice versa. In this case, the model is said to be additive (in the main effects). This means, for example, that if one graphically plots, for a fixed $j$, $\hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij}$ as a function of $i$, and overlays all $j$ such plots, then the resulting lines will be approximately parallel (and vice versa). Such graphics are often produced by the ANOVA procedures in statistical software (see Figure 2.12 and, particularly, Figure 2.13 below) and typically accompany an empirical analysis. It should be obvious that, if the interaction terms are taken to be zero, then plots of $\hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j$ will be, by construction, perfectly parallel.
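As a quick illustration of the parallel-lines idea (a sketch, not the text's code for Figures 2.12 and 2.13), the following computes the cell means $\bar{Y}_{ij\bullet}$, which estimate the $\mu_{ij}$, and overlays one line per level of factor B; under additivity the lines are approximately parallel. It assumes a data vector y stacked as described above, e.g., from the simulation sketch following (2.45).

% Cell means muhat(i,j); rows of y for cell (i,j) are (i-1)*b*n+(j-1)*n+(1:n)
muhat=zeros(a,b);
for i=1:a
  for j=1:b
    r=(i-1)*b*n+(j-1)*n+(1:n); muhat(i,j)=mean(y(r));
  end
end
plot(1:a, muhat, '-o')            % one line per level of factor B
xlabel('Level of factor A'), ylabel('Cell mean')
legend('B = 1','B = 2','Location','best')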
2.5.2 Sums of Squares Decomposition Without Interaction
If one can assume there is no interaction effect, then the use of $n = 1$ is formally valid in (2.44), and otherwise not, though naturally the larger the cell sample size $n$, the more accurate the inference. As a concrete and simplified example to visualize things, imagine treatment A has three levels, referring to the percentage reduction in daily consumed calories (say, 75%, 50%, and 25%) for a dietary study measuring percentage weight loss. If factor B is gender (male or female), then one would not expect a significant interaction effect. Similarly, if factor B entails three levels of exercise, one might also expect that factors A and B influence $Y_{ijk}$ linearly, without an interaction, or synergy, effect.
Model (2.44) without interaction is given by $Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, and when expressed as a linear model in matrix terms, it is $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where
$$\boldsymbol{\beta} = (\mu, \alpha_1, \ldots, \alpha_a, \beta_1, \ldots, \beta_b)'. \tag{2.46}$$
With $T = abn$, let $\mathbf{Y}$ be the $T \times 1$ vector formed by stacking the $Y_{ijk}$ such that the last index, $k$, "moves quickest", in the sense that it changes on every row, followed by index $j$, which changes whenever $k$ changes from $n$ to 1, and finally index $i$, which changes slowest, whenever $j$ changes from $b$ to 1. The design matrix is then expressed as
$$\mathbf{X} = [\mathbf{X}_1 \mid \mathbf{X}_A \mid \mathbf{X}_B], \tag{2.47}$$
1  n=12;  % n replications per cell
2  a=3;   % a treatment groups in the first factor
3  b=2;   % b treatment groups in the second factor
4  T=a*b*n; oa=ones(a,1); ob=ones(b,1); on=ones(n,1); obn=ones(b*n,1);
5  X1=ones(T,1); XA=kron(eye(a), obn); XB=kron( kron( oa, eye(b) ), on );
6  X=[X1, XA, XB];
7
8  % The three projection matrices
9  P1=X1*inv(X1'*X1)*X1'; PA=XA*inv(XA'*XA)*XA'; PB=XB*inv(XB'*XB)*XB'; %#ok<MINV>
10
11 % Claim: P1=PA*PB
12 diff = P1 - PA*PB; max(max(abs(diff)))
13 % Claim: PA-P1 is orthogonal to PB-P1
14 prod = (PA-P1)*(PB-P1); max(max(abs(prod)))
Program Listing 2.5: Generates the $\mathbf{X}$ matrix in (2.47) and (2.48).
where, denoting an $n$-length column of ones as $1_n$ instead of $\mathbf{1}_n$ to help distinguish it from the identity matrix $\mathbf{I}_n$,
$$\mathbf{X}_1 = 1_a \otimes 1_b \otimes 1_n = 1_T, \quad \mathbf{X}_A = \mathbf{I}_a \otimes 1_b \otimes 1_n = \mathbf{I}_a \otimes 1_{bn}, \quad \mathbf{X}_B = 1_a \otimes \mathbf{I}_b \otimes 1_n. \tag{2.48}$$
This is equivalent to first forming the $\mathbf{X}$ matrix corresponding to $n = 1$ and then post-Kronecker multiplying by $1_n$, i.e.,
$$\mathbf{X}^{(1)} = [1_a \otimes 1_b \mid \mathbf{I}_a \otimes 1_b \mid 1_a \otimes \mathbf{I}_b], \qquad \mathbf{X} = \mathbf{X}^{(1)} \otimes 1_n. \tag{2.49}$$
It should be apparent that $\mathbf{X}$ is not full rank. The constraints $\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = 0$ need to be respected in order to produce the usual least squares estimator of $\boldsymbol{\beta}$ in (2.46).
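Under those constraints and the balanced design, the least squares estimates reduce to the familiar mean contrasts $\hat{\mu} = \bar{Y}_{\bullet\bullet\bullet}$, $\hat{\alpha}_i = \bar{Y}_{i\bullet\bullet} - \bar{Y}_{\bullet\bullet\bullet}$, and $\hat{\beta}_j = \bar{Y}_{\bullet j\bullet} - \bar{Y}_{\bullet\bullet\bullet}$, which satisfy the zero-sum constraints by construction. A minimal sketch, assuming the data vector y (introduced in Section 2.5.4) and the dimensions from Listing 2.5 are in memory:

Ym=reshape(y,n,b,a);                          % Ym(k,j,i): k quickest, then j, then i
muhat=mean(Ym(:));                            % grand mean
alphahat=squeeze(mean(mean(Ym,1),2))-muhat;   % a x 1; sums to zero (up to rounding)
betahat =squeeze(mean(mean(Ym,1),3))-muhat;   % b x 1; sums to zero (up to rounding)
disp([sum(alphahat), sum(betahat)])           % both essentially zero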
Instead of using a whole page to write out an example of (2.47), the reader is encouraged to use the (top half of the) code in Listing 2.5 to understand the kron function in Matlab, and confirm that (2.47), (2.48), and (2.49) are correct.
Let $\mathbf{P}_1$, $\mathbf{P}_A$, and $\mathbf{P}_B$ be the respective projection matrices of $\mathbf{X}_1$, $\mathbf{X}_A$, and $\mathbf{X}_B$. In particular, letting $\mathbf{J}_m$ be the $m \times m$ matrix of ones,
$$\mathbf{P}_1 = (1_T)(1_T' 1_T)^{-1}(1_T') = T^{-1}\mathbf{J}_T. \tag{2.50}$$
Likewise, using the Kronecker product facts from (2.23),
$$\mathbf{P}_A = (\mathbf{I}_a \otimes 1_{bn})\big((\mathbf{I}_a \otimes 1_{bn}')(\mathbf{I}_a \otimes 1_{bn})\big)^{-1}(\mathbf{I}_a \otimes 1_{bn}') = (nb)^{-1}(\mathbf{I}_a \otimes 1_{bn})(\mathbf{I}_a \otimes 1_{bn}') = (nb)^{-1}(\mathbf{I}_a \otimes \mathbf{J}_{bn}). \tag{2.51}$$
Observe that $\mathbf{P}_A$ is symmetric because of (2.23) and the symmetry of $\mathbf{I}_a$ and $\mathbf{J}_{bn}$, and is idempotent because
$$\mathbf{P}_A\mathbf{P}_A = (nb)^{-2}(\mathbf{I}_a \otimes \mathbf{J}_{bn})(\mathbf{I}_a \otimes \mathbf{J}_{bn}) = (nb)^{-2}(\mathbf{I}_a \otimes bn\,\mathbf{J}_{bn}) = \mathbf{P}_A.$$
Finally, for calculating $\mathbf{P}_B$, we need to extend the results in (2.23) to
$$(\mathbf{A} \otimes \mathbf{B} \otimes \mathbf{C})' = ((\mathbf{A} \otimes \mathbf{B}) \otimes \mathbf{C})' = (\mathbf{A} \otimes \mathbf{B})' \otimes \mathbf{C}' = \mathbf{A}' \otimes \mathbf{B}' \otimes \mathbf{C}'$$
and
$$(\mathbf{A} \otimes \mathbf{B} \otimes \mathbf{C})(\mathbf{E} \otimes \mathbf{F} \otimes \mathbf{G}) = ((\mathbf{A} \otimes \mathbf{B}) \otimes \mathbf{C})((\mathbf{E} \otimes \mathbf{F}) \otimes \mathbf{G}) = (\mathbf{A} \otimes \mathbf{B})(\mathbf{E} \otimes \mathbf{F}) \otimes \mathbf{C}\mathbf{G} = \mathbf{A}\mathbf{E} \otimes \mathbf{B}\mathbf{F} \otimes \mathbf{C}\mathbf{G}.$$
Then
$$\mathbf{P}_B = (1_a \otimes \mathbf{I}_b \otimes 1_n)\big((1_a' \otimes \mathbf{I}_b \otimes 1_n')(1_a \otimes \mathbf{I}_b \otimes 1_n)\big)^{-1}(1_a' \otimes \mathbf{I}_b \otimes 1_n') = (1_a \otimes \mathbf{I}_b \otimes 1_n)\big(1_a'1_a \otimes \mathbf{I}_b \otimes 1_n'1_n\big)^{-1}(1_a' \otimes \mathbf{I}_b \otimes 1_n') = (an)^{-1}(1_a \otimes \mathbf{I}_b \otimes 1_n)(1_a' \otimes \mathbf{I}_b \otimes 1_n') = (an)^{-1}(\mathbf{J}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_n), \tag{2.52}$$
which is also readily seen to be symmetric and idempotent.
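As a quick numerical sanity check (a sketch, assuming lines 1–9 of Listing 2.5 have been run), the closed form (2.52) can be compared with the $\mathbf{P}_B$ computed directly from $\mathbf{X}_B$:

PB2=kron(kron(ones(a),eye(b)),ones(n))/(a*n); % (an)^{-1} (Ja kron Ib kron Jn), per (2.52)
disp(max(max(abs(PB-PB2))))                   % zero to machine precision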
Note that $1_T$ lies in the column space of $\mathbf{X}_A$ and in that of $\mathbf{X}_B$, and that the projection from $\mathbf{P}_1$ is "coarser" than that of $\mathbf{P}_A$ and $\mathbf{P}_B$, so that (and recalling that projection matrices are symmetric)
$$\mathbf{P}_A\mathbf{P}_1 = \mathbf{P}_1\mathbf{P}_A = \mathbf{P}_1, \quad \text{and} \quad \mathbf{P}_B\mathbf{P}_1 = \mathbf{P}_1\mathbf{P}_B = \mathbf{P}_1. \tag{2.53}$$
In light of this, and also by way of thinking how to extend (2.35) from the one-way case, we are motivated to consider the matrices $\mathbf{P}_A - \mathbf{P}_1$ and $\mathbf{P}_B - \mathbf{P}_1$. From (2.53), it is trivial to confirm that $\mathbf{P}_A - \mathbf{P}_1$ and $\mathbf{P}_B - \mathbf{P}_1$ are (obviously symmetric and) idempotent, so that they are projection matrices. Thus,
$$\mathbf{P}_1(\mathbf{P}_A - \mathbf{P}_1) = \mathbf{0} = \mathbf{P}_1(\mathbf{P}_B - \mathbf{P}_1). \tag{2.54}$$
Also, $(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{P}_B - \mathbf{P}_1) = \mathbf{P}_A\mathbf{P}_B - \mathbf{P}_A\mathbf{P}_1 - \mathbf{P}_1\mathbf{P}_B + \mathbf{P}_1\mathbf{P}_1 = \mathbf{P}_A\mathbf{P}_B - \mathbf{P}_1$. The second half of Listing 2.5 numerically confirms that $\mathbf{P}_1 = \mathbf{P}_A\mathbf{P}_B$, from which it follows that
$$\mathbf{0} = (\mathbf{P}_A - \mathbf{P}_1)(\mathbf{P}_B - \mathbf{P}_1), \tag{2.55}$$
as also confirmed numerically. The idea here is to illustrate the use of "proof by Matlab", which can be useful in more complicated settings when the algebra looks daunting. Of course, in this case, algebraically proving that
$$\mathbf{P}_1 = \mathbf{P}_A\mathbf{P}_B = \mathbf{P}_B\mathbf{P}_A \tag{2.56}$$
is very straightforward: Using (2.49) for simplicity, $\mathbf{P}_A\mathbf{P}_B$ is
$$(\mathbf{I}_a \otimes 1_b)[(\mathbf{I}_a \otimes 1_b)'(\mathbf{I}_a \otimes 1_b)]^{-1}(\mathbf{I}_a \otimes 1_b)' \times (1_a \otimes \mathbf{I}_b)[(1_a \otimes \mathbf{I}_b)'(1_a \otimes \mathbf{I}_b)]^{-1}(1_a \otimes \mathbf{I}_b)'$$
$$= (\mathbf{I}_a \otimes 1_b)(\mathbf{I}_a \otimes b)^{-1}(\mathbf{I}_a \otimes 1_b') \times (1_a \otimes \mathbf{I}_b)(a \otimes \mathbf{I}_b)^{-1}(1_a' \otimes \mathbf{I}_b)$$
$$= b^{-1}(\mathbf{I}_a \otimes 1_b)(\mathbf{I}_a \otimes 1_b') \times a^{-1}(1_a \otimes \mathbf{I}_b)(1_a' \otimes \mathbf{I}_b) = b^{-1}(\mathbf{I}_a \otimes \mathbf{J}_b) \times a^{-1}(\mathbf{J}_a \otimes \mathbf{I}_b) = (ab)^{-1}(\mathbf{J}_a \otimes \mathbf{J}_b) = (ab)^{-1}\mathbf{J}_{ab},$$
which is $\mathbf{P}_1$ of size $ab \times ab$. That $\mathbf{P}_A\mathbf{P}_B = \mathbf{P}_B\mathbf{P}_A$ follows from taking transposes and recalling that $\mathbf{P}_A$ and $\mathbf{P}_B$ are projection matrices and thus symmetric.
With (2.34) from the one-way case, and the previous projection matrices $\mathbf{P}_1$, $\mathbf{P}_A - \mathbf{P}_1$, and $\mathbf{P}_B - \mathbf{P}_1$ in mind, it suggests itself to inspect the algebraic identity
$$\mathbf{I} = \mathbf{P}_1 + (\mathbf{P}_A - \mathbf{P}_1) + (\mathbf{P}_B - \mathbf{P}_1) + (\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)), \tag{2.57}$$
where $\mathbf{I} = \mathbf{I}_T$ and $T = abn$. The orthogonality results (2.54), (2.55), and, as is easily confirmed using (2.56),
$$\mathbf{P}_1(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{P}_1 - \mathbf{P}_1\mathbf{P}_A - \mathbf{P}_1\mathbf{P}_B + \mathbf{P}_1\mathbf{P}_1 = \mathbf{0},$$
$$(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{P}_A(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{0},$$
$$(\mathbf{P}_B - \mathbf{P}_1)(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{P}_B(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{0},$$
imply that the terms on the right-hand side of (2.57) are orthogonal. Thus, similar to the decomposition in (2.32) and (2.35) for the one-way ANOVA, the corrected total sum of squares for the two-way ANOVA without interaction can be decomposed by subtracting $\mathbf{P}_1$ from both sides of (2.57) and writing
$$\mathbf{Y}'(\mathbf{I} - \mathbf{P}_1)\mathbf{Y} = \mathbf{Y}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{P}_B - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1))\mathbf{Y}. \tag{2.58}$$
That is, $SS_T = SS_A + SS_B + SS_E$, where $SS_T$ refers to the corrected total sum of squares.
Recall Theorem 1.2, which states that, if $\mathbf{P}$ is symmetric and idempotent, then $\mathrm{rank}(\mathbf{P}) = k \iff \mathrm{tr}(\mathbf{P}) = k$. This can be used precisely as in (2.36) above to determine the degrees of freedom associated with the various sums of squares, and to construct the ANOVA Table 2.3. One could easily guess, and then confirm, that the degrees of freedom associated with $SS_A$ and $SS_B$ are $a - 1$ and $b - 1$, respectively, and that for $SS_E$ is given by the (corrected) total, $abn - 1$, minus those of $SS_A$ and $SS_B$.
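For instance (a sketch, again assuming lines 1–9 of Listing 2.5 are in memory), the traces, and thus the degrees of freedom, can be confirmed numerically:

disp([trace(PA-P1), a-1])                                % df for SSA
disp([trace(PB-P1), b-1])                                % df for SSB
disp([trace(eye(T)-(PA+PB-P1)), (a*b*n-1)-(a-1)-(b-1)])  % error df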
Next, recall:
1) Model (2.44) can be expressed as $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where $\boldsymbol{\beta}$ is given in (2.46) and $\boldsymbol{\epsilon} \sim \mathrm{N}(\mathbf{0}, \sigma^2\mathbf{I}_T)$, $T = abn$, so that $\mathbf{Y} \sim \mathrm{N}(\mathbf{X}\boldsymbol{\beta}, \sigma^2\mathbf{I}_T)$.
2) Theorem A.2, which states that, for $\mathbf{Y} \sim \mathrm{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, $\boldsymbol{\Sigma} > 0$, the two quadratic forms $\mathbf{Y}'\mathbf{A}_1\mathbf{Y}$ and $\mathbf{Y}'\mathbf{A}_2\mathbf{Y}$ are independent if $\mathbf{A}_1\boldsymbol{\Sigma}\mathbf{A}_2 = \mathbf{A}_2\boldsymbol{\Sigma}\mathbf{A}_1 = \mathbf{0}$.
Table 2.3 The ANOVA table for the balanced two-way ANOVA model without interaction effect, where "error df" is $(abn - 1) - (a - 1) - (b - 1)$. Mean squares denote the sums of squares divided by their associated degrees of freedom. Table 2.4 is for the case with interaction, and also gives the expected mean squares.

Source of variation   Degrees of freedom   Sum of squares   Mean square   F statistic   p-value
Overall mean          1                    abn Ȳ•••²
Factor A              a − 1                SSA              MSA           MSA/MSE       pA
Factor B              b − 1                SSB              MSB           MSB/MSE       pB
Error                 error df             SSE              MSE
Total (corrected)     abn − 1              SST
Total                 abn                  Y′Y
Thus, the orthogonality of the projection matrices in (2.58) and Theorem A.2 (with $\boldsymbol{\Sigma} = \sigma^2\mathbf{I}$) imply that $MS_A$, $MS_B$, and $MS_E$ are all pairwise independent. As such, conditional on $MS_E$, the ratios $MS_A/MS_E$ and $MS_B/MS_E$ are independent, and so too are functions of them. This implies that
$$\text{Conditional on } MS_E, \ p\text{-values } p_A \text{ and } p_B \text{ in Table 2.3 are independent.} \tag{2.59}$$
Unconditionally, the ratios $MS_A/MS_E$ and $MS_B/MS_E$, and thus their $p$-values, are not independent. This is also confirmed in Problem 1.16.
In our case here, we are working with projection matrices, so we can do a bit better. In particular, $SS_A = \mathbf{Y}'(\mathbf{P}_A - \mathbf{P}_1)'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y}$, and
$$\mathbf{L}_A := (\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y} \sim \mathrm{N}\big((\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta},\ \sigma^2(\mathbf{P}_A - \mathbf{P}_1)\big).$$
Likewise defining $\mathbf{L}_B$ and $\mathbf{L}_E$, and letting $\mathbf{L} = [\mathbf{L}_A', \mathbf{L}_B', \mathbf{L}_E']'$, basic normal distribution theory implies that $\mathbf{L}$ follows a normal distribution with a block diagonal covariance matrix because of the orthogonality of the three projection matrices. As zero covariance implies independence under normality, it follows that $\mathbf{L}_A$, $\mathbf{L}_B$, and $\mathbf{L}_E$ are completely independent, not just pairwise.
Thus, separate functions of $\mathbf{L}_A$, $\mathbf{L}_B$, and $\mathbf{L}_E$, such as their sums of squares, are also completely independent, from which it follows that $SS_A$, $SS_B$, and $SS_E$ (and thus $MS_A$, $MS_B$, and $MS_E$) are completely independent. This result is well known, referred to as Cochran's theorem, dating back to Cochran (1934), and usually proven via use of characteristic or moment generating functions; see, e.g., Khuri (2010, Sec. 5.5). Surveys of, and extensions to, Cochran's theorem can be found in Anderson and Styan (1982) and Semrl (1996). An admirable presentation in the context of elliptic distributions is given in Gupta and Varga (1993, Sec. 5.1).
Throughout the rest of this section on two-way ANOVA we will use a particular simulated data set for illustration, as detailed below, stored as variable y in Matlab. The point right now is just to show the sums of squares in (2.58) computed in different ways. In particular, they are computed (i) via SAS, (ii) via Matlab's canned function, and (iii) "by hand". The reason for the latter is to ensure a full understanding of what is being computed, as, realistically, one will not do these calculations manually, but just use canned routines in statistical software packages.
Based on our particular simulated data set introduced below, the SAS code for producing the two-way ANOVA table is given (a few pages below) in SAS Program Listing 2.2. There, it is shown for the case when one wishes to include the interaction term. To omit the interaction term, as required now, simply change the model line to model Happiness = Treatment Sport;. The resulting ANOVA table is shown in SAS Output 2.10.
Matlab's anovan function can also compute this, and will be discussed below. The code to do so is given in Listing 2.10, using the first 25 lines, and changing line 25 to:
1 p=anovan(y,{fac1 fac2},'model','linear','varnames',{'Treatment A','Phy Act'})
The output is shown in Figure 2.8, and is the same as that from SAS.
Finally, to use Matlab for manually computing and confirming the output from the SAS proc anova and Matlab anovan functions, apply lines 1–9 from Listing 2.5, and then those in Listing 2.6, in conjunction with our simulated data set, to carry out the sums of squares calculation in (2.58).
filename ein 'anova2prozac.txt';
ods pdf file='ANOVA Prozac Output.pdf';
ods rtf file='ANOVA Prozac Output.rtf';
data a;
infile ein stopover;
input Treatment $ Sport $ Happiness;
run;
proc anova;
classes Treatment Sport;
model Happiness = Treatment | Sport;
means Treatment | Sport / SCHEFFE lines cldiff;
run;
ods _all_ close;
ods html;
SAS Program Listing 2.2: Runs the ANOVA procedure in SAS for the same data set used throughout this section. The notation Treatment | Sport is short for Treatment Sport Treatment*Sport.
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3   79.7993269       26.5997756    8.26      <.0001
Error             68   219.1019446      3.2220874
Corrected Total   71   298.9012715

Source      DF   Anova SS      Mean Square   F Value   Pr > F
Treatment    2   53.33396806   26.66698403   8.28      0.0006
Sport        1   26.46535881   26.46535881   8.21      0.0055
SAS Output 2.10: Analysis of the simulated data set that we will use throughout, and such that the model is $Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, i.e., does not use the interaction term. The same output for the two treatment effects sums of squares, and the error sums of squares, is given via Matlab in Figure 2.8.
2.5.3 Sums of Squares Decomposition With Interaction
We now develop the ANOVA table for the full model (2.44), with interaction. As mentioned above, in practice one starts with the full model in order to inspect the strength of the interaction term, usually hoping it is insignificant, as judged inevitably by comparing the $p$-value of the associated $F$ test to the usual values of 0.10, 0.05, and 0.01. If the researcher decides it is insignificant and wishes to proceed without an interaction term, then, formally, all subsequent analysis, point estimates, and hypothesis test results are conditional on this decision, and one is in a pre-test estimation and pre-test testing framework.
If the interaction terms are strong enough, such that the model cannot be represented accurately without them, then the full two-way ANOVA model (2.44) can be expressed as $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, with
$$\boldsymbol{\beta} = (\mu, \alpha_1, \ldots, \alpha_a, \beta_1, \ldots, \beta_b, (\alpha\beta)_{11}, (\alpha\beta)_{12}, \ldots, (\alpha\beta)_{ab})', \tag{2.60}$$
Source        Sum Sq.   d.f.   Mean Sq.   F      Prob>F
--------------------------------------------------------
Treatment A    53.334     2    26.6669    8.28   0.0006
Phy Act        26.465     1    26.4652    8.21   0.0055
Error         219.102    68     3.2221
Total         298.901    71

Figure 2.8 Same as SAS Output 2.10, but having used Matlab's function anovan. Note that in the fourth place after the decimal, the mean square for treatment B ("Phy Act" in Matlab; "Sport" in SAS) differs between the two outputs (by one digit), presumably indicating that different numeric algorithms are used for the respective computations. This, in turn, is almost surely irrelevant given the overstated precision of the Y measurements (they are not accurate to all 14 digits maintained in the computer), and the fact that the F statistics and corresponding p-values are the same to all digits shown in the two tables.
1 % Decomposition using corrected total SS, for 2-way ANOVA, no interaction
2 SScT=y'*(eye(T)-P1)*y; SSA=y'*(PA-P1)*y;
3 SSB=y'*(PB-P1)*y; SSE=y'*(eye(T)-(PA+PB-P1))*y;
4 SSvec=[SScT, SSA, SSB, SSE]; disp(SSvec')
5 check=SScT-SSA-SSB-SSE; disp(check)

Program Listing 2.6: Computes the various sums of squares in (2.58), for the two-way ANOVA model without interaction, assuming that the simulated data set we use throughout (denoted y) is in memory (see below), and having executed lines 1–9 from Listing 2.5.
and
$$\mathbf{X} = [\mathbf{X}_1 \mid \mathbf{X}_A \mid \mathbf{X}_B \mid \mathbf{X}_{AB}], \tag{2.61}$$
where the first three terms are as in (2.48), and
$$\mathbf{X}_{AB} = \begin{pmatrix} 1_n & \mathbf{0}_n & \cdots & \mathbf{0}_n \\ \mathbf{0}_n & 1_n & \cdots & \mathbf{0}_n \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0}_n & \mathbf{0}_n & \cdots & 1_n \end{pmatrix} = \mathbf{I}_a \otimes \mathbf{I}_b \otimes 1_n = \mathbf{I}_{ab} \otimes 1_n. \tag{2.62}$$
Note that (2.62) is the same as (2.22) for the one-way ANOVA model, but with $ab$ different treatments instead of $a$.
The sum of squares decomposition (corrected for the grand mean) with interaction term is
$$\mathbf{Y}'(\mathbf{I} - \mathbf{P}_1)\mathbf{Y} = \mathbf{Y}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{P}_B - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{I} - \mathbf{P}_{AB})\mathbf{Y}, \tag{2.63}$$
or $SS_T = SS_A + SS_B + SS_{AB} + SS_E$. As with (2.58), all terms in the center of the quadratic forms are orthogonal, e.g., recalling (2.56) and that otherwise the "coarser" projection dominates,
$$(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1) = \mathbf{P}_A(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1) - \mathbf{P}_1(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1)$$
$$= \mathbf{P}_A - \mathbf{P}_A - \mathbf{P}_1 + \mathbf{P}_1 - (\mathbf{P}_1 - \mathbf{P}_1 - \mathbf{P}_1 + \mathbf{P}_1) = \mathbf{0}.$$
The reader is invited to quickly confirm the other cases.
It is of value to show (once) the sums of squares in (2.63) without matrix notation and contrast them with their analogous matrix expressions. As the reader should confirm,
$$SS_T = \sum_{k=1}^{n}\sum_{i=1}^{a}\sum_{j=1}^{b} Y_{ijk}^2 - abn\bar{Y}_{\bullet\bullet\bullet}^2,$$
$$SS_A = bn\sum_{i=1}^{a}(\bar{Y}_{i\bullet\bullet} - \bar{Y}_{\bullet\bullet\bullet})^2, \qquad SS_B = an\sum_{j=1}^{b}(\bar{Y}_{\bullet j\bullet} - \bar{Y}_{\bullet\bullet\bullet})^2,$$
$$SS_{AB} = n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{Y}_{ij\bullet} - \bar{Y}_{i\bullet\bullet} - \bar{Y}_{\bullet j\bullet} + \bar{Y}_{\bullet\bullet\bullet})^2, \qquad SS_E = \sum_{k=1}^{n}\sum_{i=1}^{a}\sum_{j=1}^{b}(Y_{ijk} - \bar{Y}_{ij\bullet})^2.$$
Observe that $SS_{AB} + SS_E$ in (2.63) is precisely the $SS_E$ term in (2.58). The reader is encouraged to construct code similar to that in Listings 2.5 and 2.6 to confirm the ANOVA sum of squares output shown in Figure 2.11 below for the two-way ANOVA with interaction; a sketch is given below. The relevant ANOVA table is given in Table 2.4.
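Such code might look as follows (a sketch, not the text's own listing): it assumes y and lines 1–9 of Listing 2.5 are in memory, builds $\mathbf{P}_{AB}$ from (2.62), and computes the terms of (2.63).

XAB=kron(eye(a*b),on);            % X_AB = I_ab kron 1_n, as in (2.62)
PAB=XAB*inv(XAB'*XAB)*XAB';       % projection onto the column space of X_AB
SScT=y'*(eye(T)-P1)*y; SSA=y'*(PA-P1)*y; SSB=y'*(PB-P1)*y;
SSAB=y'*(PAB-PA-PB+P1)*y; SSE=y'*(eye(T)-PAB)*y;
disp([SScT, SSA, SSB, SSAB, SSE]) % the five terms of (2.63)
disp(SScT-SSA-SSB-SSAB-SSE)       % zero, confirming the decomposition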
From the facts that (i) $MS_A$ and $MS_E$ are independent and (ii) Theorem A.1 implies each is a $\chi^2$ random variable divided by its respective degrees of freedom, we know that the distribution of $F_A := MS_A/MS_E$ is noncentral $F$, with $a - 1$ numerator and $ab(n-1)$ denominator degrees of freedom, and numerator noncentrality
$$\theta_A = \frac{bn}{\sigma^2}\sum_{i=1}^{a}\alpha_i^2, \tag{2.64}$$
where (2.64) is a (correct) guess, based on the logical extension of (2.30), and subsequently derived.
Table 2.4 The ANOVA table for the balanced two-way ANOVA model with interaction effect. Mean squares denote the sums of squares divided by their associated degrees of freedom. The expected mean squares are given in (2.65), (2.72), and (2.74).

Source of variation   Degrees of freedom   Sum of squares   Mean square   Expected mean square   F statistic   p-value
Overall mean          1                    abn Ȳ•••²
Factor A              a − 1                SSA              MSA           𝔼[MSA]                 MSA/MSE       pA
Factor B              b − 1                SSB              MSB           𝔼[MSB]                 MSB/MSE       pB
Factor A*B            (a − 1)(b − 1)       SSAB             MSAB          𝔼[MSAB]                MSAB/MSE      pAB
Error                 ab(n − 1)            SSE              MSE
Total (corrected)     abn − 1              SST
Total                 abn                  Y′Y

We first use it to obtain the expected mean square associated with treatment factor A. Again recalling that, for $Z \sim \chi^2(n, \theta)$, $\mathbb{E}[Z] = n + \theta$, we have, similar to (2.41), and recalling how $\sigma^2$ gets factored out in front as in (2.37),
$$\mathbb{E}[MS_A] = \sigma^2\,\frac{(a-1) + \theta_A}{a-1} = \sigma^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2. \tag{2.65}$$
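As an aside, (2.65) is easily checked by simulation. The sketch below (with illustrative parameters; it assumes the projection matrices from Listing 2.5 and the mu, alp, bet values from the simulation sketch after (2.45)) averages $MS_A$ over simulated data sets and compares with $\sigma^2 + \frac{bn}{a-1}\sum_i \alpha_i^2$.

m=zeros(T,1); t=0;                        % mean vector X*beta, no interaction
for i=1:a, for j=1:b, for k=1:n, t=t+1; m(t)=mu+alp(i)+bet(j); end, end, end
nsim=2000; msa=zeros(nsim,1);
for s=1:nsim
  ys=m+sigma*randn(T,1); msa(s)=ys'*(PA-P1)*ys/(a-1);
end
disp([mean(msa), sigma^2+b*n/(a-1)*sum(alp.^2)])   % should agree closely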
Noncentrality term (2.64) can be formally derived by using (1.92), i.e.,
$$(\mathbf{Y}/\sigma)'(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{Y}/\sigma) \sim \chi^2\big(a - 1,\ \boldsymbol{\beta}'\mathbf{X}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta}/\sigma^2\big), \tag{2.66}$$
and confirming that
$$\boldsymbol{\beta}'\mathbf{X}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = \boldsymbol{\beta}'\mathbf{X}'(\mathbf{P}_A - \mathbf{P}_1)' \times (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = bn\sum_{i=1}^{a}\alpha_i^2. \tag{2.67}$$
This would be very easy if, with $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_a)'$, we can show
$$(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = \mathbf{P}_A\mathbf{X}_A\boldsymbol{\alpha}. \tag{2.68}$$
If (2.68) is true, then note that, by the nature of projection, $\mathbf{P}_A\mathbf{X}_A = \mathbf{X}_A$, and $\mathbf{X}_A\boldsymbol{\alpha} = \boldsymbol{\alpha} \otimes 1_{bn}$, and the sum of the squares of the latter term is clearly $bn\sum_{i=1}^{a}\alpha_i^2$. To confirm (2.68), observe from (2.61) that
$$(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X} = (\mathbf{P}_A - \mathbf{P}_1)[\mathbf{X}_1 \mid \mathbf{X}_A \mid \mathbf{X}_B \mid \mathbf{X}_{AB}] = [\mathbf{0}_{T\times 1} \mid (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_A \mid \mathbf{0}_{T\times b} \mid (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB}]. \tag{2.69}$$
The latter term $(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB} \neq \mathbf{0}$, but if we first assume the interaction terms $(\alpha\beta)_{ij}$ are all zero, then (2.69) implies
$$(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_A\boldsymbol{\alpha}.$$
Now observe that $\mathbf{P}_1\mathbf{X}_A = T^{-1}\mathbf{J}_T(\mathbf{I}_a \otimes 1_{bn}) = a^{-1}\mathbf{J}_{T,a}$, where $\mathbf{J}_{T,a}$ is a $T \times a$ matrix of ones. This, and the fact that $\sum_{i=1}^{a}\alpha_i = 0$, implies that $\mathbf{P}_1\mathbf{X}_A\boldsymbol{\alpha}$ is zero, and (2.68), and thus (2.64), are shown.
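A numerical confirmation of (2.67) is also immediate (a sketch, assuming the design matrix pieces from Listing 2.5 and the illustrative mu, alp, bet from the earlier simulation sketch, with the interaction terms set to zero):

XAB=kron(eye(a*b),on); Xf=[X1,XA,XB,XAB];  % full X of (2.61)
betaf=[mu, alp, bet, zeros(1,a*b)]';       % beta of (2.60), no interaction
q=betaf'*Xf'*(PA-P1)*Xf*betaf;
disp([q, b*n*sum(alp.^2)])                 % both equal bn * sum(alpha_i^2)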
In the case with nonzero interaction terms, with
$$\boldsymbol{\gamma} = ((\alpha\beta)_{11}, (\alpha\beta)_{12}, \ldots, (\alpha\beta)_{ab})', \tag{2.70}$$
we (cut corners and) confirm numerically that $(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB}\boldsymbol{\gamma} = \mathbf{0}$ (a $T$-length column of zeros), provided that the constraints on the interaction terms in (2.45) are met. It is not enough that all $ab$ terms sum to zero. The reader is encouraged to also numerically confirm this (a sketch follows), and, better, prove it algebraically.
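The following sketch (assuming, as before, lines 1–9 of Listing 2.5 are in memory) constructs interaction terms satisfying (2.45) by double-centering a random $a \times b$ matrix, confirms $(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB}\boldsymbol{\gamma} = \mathbf{0}$, and shows that a zero grand sum alone does not suffice:

XAB=kron(eye(a*b),on);             % X_AB per (2.62)
G=randn(a,b);                      % raw (alpha beta)_ij values
G=G-repmat(mean(G,1),a,1);         % column sums now zero
G=G-repmat(mean(G,2),1,b);         % row sums now zero (both constraints hold)
gam=reshape(G',a*b,1);             % stacked with j quickest, per (2.70)
disp(max(abs((PA-P1)*XAB*gam)))    % zero to machine precision
G2=randn(a,b); G2=G2-mean(G2(:));  % only the grand sum is zero
gam2=reshape(G2',a*b,1);
disp(max(abs((PA-P1)*XAB*gam2)))   % nonzero in general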
Thus, $F_A \sim F_{a-1,\,ab(n-1)}(\theta_A)$, and the power of the test is $\Pr(F_A > c_A)$, where $c_A$ is the cutoff value under the null (central) distribution for a given test significance level $\alpha$. Based on the values we use below in an empirical example, namely $n = 12$, $a = 3$, $b = 2$, $\sigma = 2$, and $\sum_{i=1}^{a}\alpha_i^2 = 2/3$, (2.64) yields $\theta_A = 4$, so that the power of the test with significance level $\alpha = 0.05$ is 0.399, as computed with the code in Listing 2.7.
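A sketch in the same spirit (not the text's Listing 2.7; it requires Matlab's Statistics Toolbox, and infers $\sum_j \beta_j^2 = 9/8$ from $\theta_B = 81/8$ via (2.71) below):

n=12; a=3; b=2; sigma=2; alpha=0.05;
df1=a-1; df2=a*b*(n-1);             % numerator and denominator df
thetaA=b*n*(2/3)/sigma^2;           % (2.64), giving 4
cA=finv(1-alpha,df1,df2);           % cutoff under the central F
powerA=1-ncfcdf(cA,df1,df2,thetaA)  % should reproduce the 0.399 quoted in the text
thetaB=a*n*(9/8)/sigma^2;           % (2.71), giving 81/8
cB=finv(1-alpha,b-1,df2);
powerB=1-ncfcdf(cB,b-1,df2,thetaB)  % should reproduce the 0.880 quoted below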
Analogous to (2.64), the test statistic associated with effect B is $F_B \sim F_{b-1,\,ab(n-1)}(\theta_B)$, where
$$\theta_B = \frac{an}{\sigma^2}\sum_{j=1}^{b}\beta_j^2, \tag{2.71}$$
which is $\theta_B = 81/8$ in our case, yielding a power of 0.880. Also analogously,
$$\mathbb{E}[MS_B] = \sigma^2\,\frac{(b-1) + \theta_B}{b-1} = \sigma^2 + \frac{an}{b-1}\sum_{j=1}^{b}\beta_j^2. \tag{2.72}$$