The one-way fixed effects ANOVA model detailed in Section 2.4 is straightforwardly extended to support more than one factor. Here we consider the distribution theory of the balanced model with two factors. As a simple example to help visualize matters, consider again the agricultural example at the beginning of Section 2.4.1, and imagine an experiment in a greenhouse in which interest centers on $a \geqslant 2$ levels of a fertilizer and $b \geqslant 2$ levels of water. All $ab$ combinations are set up, with $n$ replications (plants) for each.
Once the ideas for the two-way ANOVA are laid out, the basic pattern for higher-order fixed effects ANOVA models with a balanced panel will be clear, and the reader should feel comfortable with conducting a data analysis in, say, SAS or other software, and understand the output and how conclusions are (or should be) drawn.
After introducing the model in Section 2.5.1, Sections 2.5.2 and 2.5.3 present the basic theory of the cases without and with interaction, respectively, and the relevant ANOVA tables. Section 2.5.4 uses a simulated data set as an example to show the relevant coding in both Matlab and SAS.
2.5.1 The Model and Use of the Interaction Terms
For the two-way model, denote the first factor as A, with $a \geqslant 2$ treatments, and the second factor as B, with $b \geqslant 2$ treatments. The ordering of the two factors (i.e., which one is A and which one is B) is irrelevant, though, as mentioned in the Remark in Section 2.4.4, often A will refer to the factor associated with the scientific inquiry, while B is a block, accounting for differences in some attribute such as (for human studies) gender, age group, political affiliation, educational level, geographic region, time of day (see, e.g., Pope, 2016), etc., or, in industrial experiments, the factory line, etc.
The two-way fixed effects ANOVA model extends the forms in (2.19) and (2.21), and is expressed as
$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}, \qquad \epsilon_{ijk} \overset{\text{i.i.d.}}{\sim} \mathrm{N}(0, \sigma^2), \tag{2.44}$$
for $i = 1, 2, \ldots, a$, $j = 1, 2, \ldots, b$, $k = 1, \ldots, n$, subject to the constraints
$$\sum_{i=1}^{a} \alpha_i = 0, \quad \sum_{j=1}^{b} \beta_j = 0, \quad \sum_{i=1}^{a} (\alpha\beta)_{ij} = 0 \ \ \forall j, \quad \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0 \ \ \forall i. \tag{2.45}$$
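To make the model concrete, the following Matlab sketch simulates one balanced data set from (2.44) under the constraints (2.45). The parameter values are illustrative: they are chosen so that $\sum_i \alpha_i^2 = 2/3$ and $\sum_j \beta_j^2 = 9/8$, matching the sums used in the power calculations near the end of Section 2.5.3, though $\mu$ and the seed are arbitrary and these are not necessarily the values behind the simulated data set used below. The stacking order of the observations matches that described in Section 2.5.2.

a=3; b=2; n=12; mu=10; sigma=2;   % dimensions and scale as used later in the text
alp=[-1/3 -1/3 2/3];              % sums to zero; sum of squares is 2/3
bet=[-3/4 3/4];                   % sums to zero; sum of squares is 9/8
AB=zeros(a,b);                    % interaction terms, here all zero
rng(1)                            % illustrative seed, for reproducibility
y=zeros(a*b*n,1); t=0;
for i=1:a
  for j=1:b
    for k=1:n                     % index k moves quickest
      t=t+1; y(t)=mu+alp(i)+bet(j)+AB(i,j)+sigma*randn;
    end
  end
end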
Terms $(\alpha\beta)_{ij}$ are referred to as the interaction factors (or effects, or terms). In general, the $ij$th group has $n_{ij}$ observations, $i = 1, \ldots, a$, $j = 1, \ldots, b$, and if any of the $n_{ij}$ are not equal, the model is unbalanced.
The usual ANOVA table will be shown below. It has in its output three $F$ tests and their associated $p$-values, corresponding to the null hypotheses $\alpha_1 = \cdots = \alpha_a = 0$ (no factor A effect), $\beta_1 = \cdots = \beta_b = 0$ (no factor B effect), and $(\alpha\beta)_{ij} = 0$ for all $i$ and $j$ (no interaction effect). One first inspects the latter; if the interaction effect can be deemed nonsignificant, then one proceeds to look at the former two. Violating our agreement in the Remark in Section 2.4.2 to subsequently suppress discussion of the dangers of use of $p$-values for model selection, we mention that an inspection of some published research studies, and even teaching notes on ANOVA, unfortunately reveals wording such as "As the $p$-value corresponding to the interaction effect is greater than 0.05, there is no interaction effect." A better choice of wording might be: "Based on the reported $p$-value, we will assume there is no significant interaction effect; the subsequent analysis is conducted conditional on such, with the caveat that further experimental trials would be required to draw stronger conclusions on the presence of, and notably relevance of, interaction."
Observe that, if only the interaction factor is used (along with, of course, the grand mean), i.e., $Y_{ijk} = \mu + (\alpha\beta)_{ij} + \epsilon_{ijk}$, then this is equivalent to a one-way ANOVA with $ab$ treatments. If the interaction effect is deemed significant, then the value of including the $\alpha_i$ and $\beta_j$ effects is lowered and, possibly, rendered useless, depending on the nature of the interaction. In colloquial terms, one might describe the interaction effect as the presence of synergy, or the idea that a system is more than the sum of its parts. More specifically, assuming that the $\alpha_i$ and $\beta_j$ are non-negative, the term synergy would be used if, due to the nonzero interaction effect $(\alpha\beta)_{ij}$, $\mathbb{E}[Y_{ijk}] > \mu + \alpha_i + \beta_j$, and the term antagonism would be used if $\mathbb{E}[Y_{ijk}] < \mu + \alpha_i + \beta_j$.
If there is no interaction effect (as one often hopes, as then nature is easier to describe), then the model reduces to $Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, and is such that the effect of the $i$th treatment from factor A does not depend on which treatment from factor B is used, and vice versa. In this case, the model is said to be additive (in the main effects). This means, for example, that if one graphically plots, for a fixed $j$, $\hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij}$ as a function of $i$, and overlays all $j$ such plots, then the resulting lines will be approximately parallel (and vice versa). Such graphics are often produced by the ANOVA procedures in statistical software (see Figure 2.12 and, particularly, Figure 2.13 below) and typically accompany an empirical analysis. It should be obvious that, if the interaction terms are taken to be zero, then plots of $\hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j$ will be, by construction, perfectly parallel.
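As a quick illustration of the parallel-lines idea (a sketch, not the text's code for Figures 2.12 and 2.13), the following computes the cell means $\bar{Y}_{ij\bullet}$, which estimate the $\mu_{ij}$, and overlays one line per level of factor B; under additivity the lines are approximately parallel. It assumes a data vector y stacked as described above, e.g., from the simulation sketch following (2.45).

% Cell means muhat(i,j); rows of y for cell (i,j) are (i-1)*b*n+(j-1)*n+(1:n)
muhat=zeros(a,b);
for i=1:a
  for j=1:b
    r=(i-1)*b*n+(j-1)*n+(1:n); muhat(i,j)=mean(y(r));
  end
end
plot(1:a, muhat, '-o')            % one line per level of factor B
xlabel('Level of factor A'), ylabel('Cell mean')
legend('B = 1','B = 2','Location','best')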
2.5.2 Sums of Squares Decomposition Without Interaction
If one can assume there is no interaction effect, then the use of $n = 1$ is formally valid in (2.44), and otherwise not, though naturally the larger the cell sample size $n$, the more accurate the inference. As a concrete and simplified example to visualize things, imagine treatment A has three levels, referring to the percentage reduction in daily consumed calories (say, 75%, 50%, and 25%) for a dietary study measuring percentage weight loss. If factor B is gender (male or female), then one would not expect a significant interaction effect. Similarly, if factor B entails three levels of exercise, one might also expect that factors A and B influence $Y_{ijk}$ linearly, without an interaction, or synergy, effect.
Model (2.44) without interaction is given by $Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, and when expressed as a linear model in matrix terms, it is $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where
$$\boldsymbol{\beta} = (\mu, \alpha_1, \ldots, \alpha_a, \beta_1, \ldots, \beta_b)'. \tag{2.46}$$
With $T = abn$, let $\mathbf{Y}$ be the $T \times 1$ vector formed by stacking the $Y_{ijk}$ such that the last index, $k$, "moves quickest", in the sense that it changes on every row, followed by index $j$, which changes whenever $k$ changes from $n$ to 1, and finally index $i$, which changes slowest, whenever $j$ changes from $b$ to 1. The design matrix is then expressed as
$$\mathbf{X} = [\mathbf{X}_1 \mid \mathbf{X}_A \mid \mathbf{X}_B], \tag{2.47}$$
1  n=12;  % n replications per cell
2  a=3;   % a treatment groups in the first factor
3  b=2;   % b treatment groups in the second factor
4  T=a*b*n; oa=ones(a,1); ob=ones(b,1); on=ones(n,1); obn=ones(b*n,1);
5  X1=ones(T,1); XA=kron(eye(a), obn); XB=kron( kron( oa, eye(b) ), on );
6  X=[X1, XA, XB];
7
8  % The three projection matrices
9  P1=X1*inv(X1'*X1)*X1'; PA=XA*inv(XA'*XA)*XA'; PB=XB*inv(XB'*XB)*XB'; %#ok<MINV>
10
11 % Claim: P1=PA*PB
12 diff = P1 - PA*PB; max(max(abs(diff)))
13 % Claim: PA-P1 is orthogonal to PB-P1
14 prod = (PA-P1)*(PB-P1); max(max(abs(prod)))
Program Listing 2.5: Generates the $\mathbf{X}$ matrix in (2.47) and (2.48).
where, denoting an $n$-length column of ones as $1_n$ instead of $\mathbf{1}_n$ to help distinguish it from the identity matrix $\mathbf{I}_n$,
$$\mathbf{X}_1 = 1_a \otimes 1_b \otimes 1_n = 1_T, \quad \mathbf{X}_A = \mathbf{I}_a \otimes 1_b \otimes 1_n = \mathbf{I}_a \otimes 1_{bn}, \quad \mathbf{X}_B = 1_a \otimes \mathbf{I}_b \otimes 1_n. \tag{2.48}$$
This is equivalent to first forming the $\mathbf{X}$ matrix corresponding to $n = 1$ and then post-Kronecker multiplying by $1_n$, i.e.,
$$\mathbf{X}^{(1)} = [1_a \otimes 1_b \mid \mathbf{I}_a \otimes 1_b \mid 1_a \otimes \mathbf{I}_b], \qquad \mathbf{X} = \mathbf{X}^{(1)} \otimes 1_n. \tag{2.49}$$
It should be apparent that $\mathbf{X}$ is not full rank. The constraints $\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = 0$ need to be respected in order to produce the usual least squares estimator of $\boldsymbol{\beta}$ in (2.46).
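Under those constraints and the balanced design, the least squares estimates reduce to the familiar mean contrasts $\hat{\mu} = \bar{Y}_{\bullet\bullet\bullet}$, $\hat{\alpha}_i = \bar{Y}_{i\bullet\bullet} - \bar{Y}_{\bullet\bullet\bullet}$, and $\hat{\beta}_j = \bar{Y}_{\bullet j\bullet} - \bar{Y}_{\bullet\bullet\bullet}$, which satisfy the zero-sum constraints by construction. A minimal sketch, assuming the data vector y (introduced in Section 2.5.4) and the dimensions from Listing 2.5 are in memory:

Ym=reshape(y,n,b,a);                          % Ym(k,j,i): k quickest, then j, then i
muhat=mean(Ym(:));                            % grand mean
alphahat=squeeze(mean(mean(Ym,1),2))-muhat;   % a x 1; sums to zero (up to rounding)
betahat =squeeze(mean(mean(Ym,1),3))-muhat;   % b x 1; sums to zero (up to rounding)
disp([sum(alphahat), sum(betahat)])           % both essentially zero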
Instead of using a whole page to write out an example of (2.47), the reader is encouraged to use the (top half of the) code in Listing 2.5 to understand the kron function in Matlab, and confirm that (2.47), (2.48), and (2.49) are correct.
Let $\mathbf{P}_1$, $\mathbf{P}_A$, and $\mathbf{P}_B$ be the respective projection matrices of $\mathbf{X}_1$, $\mathbf{X}_A$, and $\mathbf{X}_B$. In particular, letting $\mathbf{J}_m$ be the $m \times m$ matrix of ones,
$$\mathbf{P}_1 = (1_T)(1_T' 1_T)^{-1}(1_T') = T^{-1}\mathbf{J}_T. \tag{2.50}$$
Likewise, using the Kronecker product facts from (2.23),
$$\mathbf{P}_A = (\mathbf{I}_a \otimes 1_{bn})\big((\mathbf{I}_a \otimes 1_{bn}')(\mathbf{I}_a \otimes 1_{bn})\big)^{-1}(\mathbf{I}_a \otimes 1_{bn}') = (nb)^{-1}(\mathbf{I}_a \otimes 1_{bn})(\mathbf{I}_a \otimes 1_{bn}') = (nb)^{-1}(\mathbf{I}_a \otimes \mathbf{J}_{bn}). \tag{2.51}$$
Observe that $\mathbf{P}_A$ is symmetric because of (2.23) and the symmetry of $\mathbf{I}_a$ and $\mathbf{J}_{bn}$, and is idempotent because
$$\mathbf{P}_A\mathbf{P}_A = (nb)^{-2}(\mathbf{I}_a \otimes \mathbf{J}_{bn})(\mathbf{I}_a \otimes \mathbf{J}_{bn}) = (nb)^{-2}(\mathbf{I}_a \otimes bn\,\mathbf{J}_{bn}) = \mathbf{P}_A.$$
Finally, for calculating $\mathbf{P}_B$, we need to extend the results in (2.23) to
$$(\mathbf{A} \otimes \mathbf{B} \otimes \mathbf{C})' = ((\mathbf{A} \otimes \mathbf{B}) \otimes \mathbf{C})' = (\mathbf{A} \otimes \mathbf{B})' \otimes \mathbf{C}' = \mathbf{A}' \otimes \mathbf{B}' \otimes \mathbf{C}'$$
and
$$(\mathbf{A} \otimes \mathbf{B} \otimes \mathbf{C})(\mathbf{E} \otimes \mathbf{F} \otimes \mathbf{G}) = ((\mathbf{A} \otimes \mathbf{B}) \otimes \mathbf{C})((\mathbf{E} \otimes \mathbf{F}) \otimes \mathbf{G}) = (\mathbf{A} \otimes \mathbf{B})(\mathbf{E} \otimes \mathbf{F}) \otimes \mathbf{C}\mathbf{G} = \mathbf{A}\mathbf{E} \otimes \mathbf{B}\mathbf{F} \otimes \mathbf{C}\mathbf{G}.$$
Then
$$\mathbf{P}_B = (1_a \otimes \mathbf{I}_b \otimes 1_n)\big((1_a' \otimes \mathbf{I}_b \otimes 1_n')(1_a \otimes \mathbf{I}_b \otimes 1_n)\big)^{-1}(1_a' \otimes \mathbf{I}_b \otimes 1_n') = (1_a \otimes \mathbf{I}_b \otimes 1_n)\big(1_a'1_a \otimes \mathbf{I}_b \otimes 1_n'1_n\big)^{-1}(1_a' \otimes \mathbf{I}_b \otimes 1_n') = (an)^{-1}(1_a \otimes \mathbf{I}_b \otimes 1_n)(1_a' \otimes \mathbf{I}_b \otimes 1_n') = (an)^{-1}(\mathbf{J}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_n), \tag{2.52}$$
which is also readily seen to be symmetric and idempotent.
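As a quick numerical sanity check (a sketch, assuming lines 1–9 of Listing 2.5 have been run), the closed form (2.52) can be compared with the $\mathbf{P}_B$ computed directly from $\mathbf{X}_B$:

PB2=kron(kron(ones(a),eye(b)),ones(n))/(a*n); % (an)^{-1} (Ja kron Ib kron Jn), per (2.52)
disp(max(max(abs(PB-PB2))))                   % zero to machine precision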
Note that $1_T$ lies in the column space of $\mathbf{X}_A$ and in that of $\mathbf{X}_B$, and that the projection from $\mathbf{P}_1$ is "coarser" than that of $\mathbf{P}_A$ and $\mathbf{P}_B$, so that (and recalling that projection matrices are symmetric)
$$\mathbf{P}_A\mathbf{P}_1 = \mathbf{P}_1\mathbf{P}_A = \mathbf{P}_1, \quad \text{and} \quad \mathbf{P}_B\mathbf{P}_1 = \mathbf{P}_1\mathbf{P}_B = \mathbf{P}_1. \tag{2.53}$$
In light of this, and also by way of thinking how to extend (2.35) from the one-way case, we are motivated to consider the matrices $\mathbf{P}_A - \mathbf{P}_1$ and $\mathbf{P}_B - \mathbf{P}_1$. From (2.53), it is trivial to confirm that $\mathbf{P}_A - \mathbf{P}_1$ and $\mathbf{P}_B - \mathbf{P}_1$ are (obviously symmetric and) idempotent, so that they are projection matrices. Thus,
$$\mathbf{P}_1(\mathbf{P}_A - \mathbf{P}_1) = \mathbf{0} = \mathbf{P}_1(\mathbf{P}_B - \mathbf{P}_1). \tag{2.54}$$
Also, $(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{P}_B - \mathbf{P}_1) = \mathbf{P}_A\mathbf{P}_B - \mathbf{P}_A\mathbf{P}_1 - \mathbf{P}_1\mathbf{P}_B + \mathbf{P}_1\mathbf{P}_1 = \mathbf{P}_A\mathbf{P}_B - \mathbf{P}_1$. The second half of Listing 2.5 numerically confirms that $\mathbf{P}_1 = \mathbf{P}_A\mathbf{P}_B$, from which it follows that
$$\mathbf{0} = (\mathbf{P}_A - \mathbf{P}_1)(\mathbf{P}_B - \mathbf{P}_1), \tag{2.55}$$
as also confirmed numerically. The idea here is to illustrate the use of "proof by Matlab", which can be useful in more complicated settings when the algebra looks daunting. Of course, in this case, algebraically proving that
$$\mathbf{P}_1 = \mathbf{P}_A\mathbf{P}_B = \mathbf{P}_B\mathbf{P}_A \tag{2.56}$$
is very straightforward: Using (2.49) for simplicity, $\mathbf{P}_A\mathbf{P}_B$ is
$$(\mathbf{I}_a \otimes 1_b)[(\mathbf{I}_a \otimes 1_b)'(\mathbf{I}_a \otimes 1_b)]^{-1}(\mathbf{I}_a \otimes 1_b)' \times (1_a \otimes \mathbf{I}_b)[(1_a \otimes \mathbf{I}_b)'(1_a \otimes \mathbf{I}_b)]^{-1}(1_a \otimes \mathbf{I}_b)'$$
$$= (\mathbf{I}_a \otimes 1_b)(\mathbf{I}_a \otimes b)^{-1}(\mathbf{I}_a \otimes 1_b') \times (1_a \otimes \mathbf{I}_b)(a \otimes \mathbf{I}_b)^{-1}(1_a' \otimes \mathbf{I}_b)$$
$$= b^{-1}(\mathbf{I}_a \otimes 1_b)(\mathbf{I}_a \otimes 1_b') \times a^{-1}(1_a \otimes \mathbf{I}_b)(1_a' \otimes \mathbf{I}_b) = b^{-1}(\mathbf{I}_a \otimes \mathbf{J}_b) \times a^{-1}(\mathbf{J}_a \otimes \mathbf{I}_b) = (ab)^{-1}(\mathbf{J}_a \otimes \mathbf{J}_b) = (ab)^{-1}\mathbf{J}_{ab},$$
which is $\mathbf{P}_1$ of size $ab \times ab$. That $\mathbf{P}_A\mathbf{P}_B = \mathbf{P}_B\mathbf{P}_A$ follows from taking transposes and recalling that $\mathbf{P}_A$ and $\mathbf{P}_B$ are projection matrices and thus symmetric.
With (2.34) from the one-way case, and the previous projection matrices $\mathbf{P}_1$, $\mathbf{P}_A - \mathbf{P}_1$, and $\mathbf{P}_B - \mathbf{P}_1$ in mind, it suggests itself to inspect the algebraic identity
$$\mathbf{I} = \mathbf{P}_1 + (\mathbf{P}_A - \mathbf{P}_1) + (\mathbf{P}_B - \mathbf{P}_1) + (\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)), \tag{2.57}$$
where $\mathbf{I} = \mathbf{I}_T$ and $T = abn$. The orthogonality results (2.54), (2.55), and, as is easily confirmed using (2.56),
$$\mathbf{P}_1(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{P}_1 - \mathbf{P}_1\mathbf{P}_A - \mathbf{P}_1\mathbf{P}_B + \mathbf{P}_1\mathbf{P}_1 = \mathbf{0},$$
$$(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{P}_A(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{0},$$
$$(\mathbf{P}_B - \mathbf{P}_1)(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{P}_B(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1)) = \mathbf{0},$$
imply that the terms on the right-hand side of (2.57) are orthogonal. Thus, similar to the decomposition in (2.32) and (2.35) for the one-way ANOVA, the corrected total sum of squares for the two-way ANOVA without interaction can be decomposed by subtracting $\mathbf{P}_1$ from both sides of (2.57) and writing
$$\mathbf{Y}'(\mathbf{I} - \mathbf{P}_1)\mathbf{Y} = \mathbf{Y}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{P}_B - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{I} - (\mathbf{P}_A + \mathbf{P}_B - \mathbf{P}_1))\mathbf{Y}. \tag{2.58}$$
That is, $SS_T = SS_A + SS_B + SS_E$, where $SS_T$ refers to the corrected total sum of squares.
Recall Theorem 1.2, which states that, if $\mathbf{P}$ is symmetric and idempotent, then $\mathrm{rank}(\mathbf{P}) = k \iff \mathrm{tr}(\mathbf{P}) = k$. This can be used precisely as in (2.36) above to determine the degrees of freedom associated with the various sums of squares, and to construct the ANOVA Table 2.3. One could easily guess, and then confirm, that the degrees of freedom associated with $SS_A$ and $SS_B$ are $a - 1$ and $b - 1$, respectively, and that for $SS_E$ is given by the (corrected) total, $abn - 1$, minus those of $SS_A$ and $SS_B$.
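For instance (a sketch, again assuming lines 1–9 of Listing 2.5 are in memory), the traces, and thus the degrees of freedom, can be confirmed numerically:

disp([trace(PA-P1), a-1])                                % df for SSA
disp([trace(PB-P1), b-1])                                % df for SSB
disp([trace(eye(T)-(PA+PB-P1)), (a*b*n-1)-(a-1)-(b-1)])  % error df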
Next, recall:
1) Model (2.44) can be expressed as $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where $\boldsymbol{\beta}$ is given in (2.46) and $\boldsymbol{\epsilon} \sim \mathrm{N}(\mathbf{0}, \sigma^2\mathbf{I}_T)$, $T = abn$, so that $\mathbf{Y} \sim \mathrm{N}(\mathbf{X}\boldsymbol{\beta}, \sigma^2\mathbf{I}_T)$.
2) Theorem A.2, which states that, for $\mathbf{Y} \sim \mathrm{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, $\boldsymbol{\Sigma} > 0$, the two quadratic forms $\mathbf{Y}'\mathbf{A}_1\mathbf{Y}$ and $\mathbf{Y}'\mathbf{A}_2\mathbf{Y}$ are independent if $\mathbf{A}_1\boldsymbol{\Sigma}\mathbf{A}_2 = \mathbf{A}_2\boldsymbol{\Sigma}\mathbf{A}_1 = \mathbf{0}$.
Table 2.3 The ANOVA table for the balanced two-way ANOVA model without interaction effect, where "error df" is $(abn - 1) - (a - 1) - (b - 1)$. Mean squares denote the sums of squares divided by their associated degrees of freedom. Table 2.4 is for the case with interaction, and also gives the expected mean squares.

Source of variation   Degrees of freedom   Sum of squares   Mean square   F statistic   p-value
Overall mean          1                    abn Ȳ•••²
Factor A              a − 1                SSA              MSA           MSA/MSE       pA
Factor B              b − 1                SSB              MSB           MSB/MSE       pB
Error                 error df             SSE              MSE
Total (corrected)     abn − 1              SST
Total                 abn                  Y′Y
Thus, the orthogonality of the projection matrices in (2.58) and Theorem A.2 (with $\boldsymbol{\Sigma} = \sigma^2\mathbf{I}$) imply that $MS_A$, $MS_B$, and $MS_E$ are all pairwise independent. As such, conditional on $MS_E$, the ratios $MS_A/MS_E$ and $MS_B/MS_E$ are independent, and so too are functions of them. This implies that
$$\text{Conditional on } MS_E, \ p\text{-values } p_A \text{ and } p_B \text{ in Table 2.3 are independent.} \tag{2.59}$$
Unconditionally, the ratios $MS_A/MS_E$ and $MS_B/MS_E$, and thus their $p$-values, are not independent. This is also confirmed in Problem 1.16.
In our case here, we are working with projection matrices, so we can do a bit better. In particular, $SS_A = \mathbf{Y}'(\mathbf{P}_A - \mathbf{P}_1)'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y}$, and
$$\mathbf{L}_A := (\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y} \sim \mathrm{N}\big((\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta},\ \sigma^2(\mathbf{P}_A - \mathbf{P}_1)\big).$$
Likewise defining $\mathbf{L}_B$ and $\mathbf{L}_E$, and letting $\mathbf{L} = [\mathbf{L}_A', \mathbf{L}_B', \mathbf{L}_E']'$, basic normal distribution theory implies that $\mathbf{L}$ follows a normal distribution with a block diagonal covariance matrix because of the orthogonality of the three projection matrices. As zero covariance implies independence under normality, it follows that $\mathbf{L}_A$, $\mathbf{L}_B$, and $\mathbf{L}_E$ are completely independent, not just pairwise.
Thus, separate functions of $\mathbf{L}_A$, $\mathbf{L}_B$, and $\mathbf{L}_E$, such as their sums of squares, are also completely independent, from which it follows that $SS_A$, $SS_B$, and $SS_E$ (and thus $MS_A$, $MS_B$, and $MS_E$) are completely independent. This result is well known, referred to as Cochran's theorem, dating back to Cochran (1934), and usually proven via use of characteristic or moment generating functions; see, e.g., Khuri (2010, Sec. 5.5). Surveys of, and extensions to, Cochran's theorem can be found in Anderson and Styan (1982) and Semrl (1996). An admirable presentation in the context of elliptic distributions is given in Gupta and Varga (1993, Sec. 5.1).
Throughout the rest of this section on two-way ANOVA we will use a particular simulated data set for illustration, as detailed below, stored as variable y in Matlab. The point right now is just to show the sums of squares in (2.58) computed in different ways. In particular, they are computed (i) via SAS, (ii) via Matlab's canned function, and (iii) "by hand". The reason for the latter is to ensure a full understanding of what is being computed, as, realistically, one will not do these calculations manually, but just use canned routines in statistical software packages.
Based on our particular simulated data set introduced below, the SAS code for producing the two-way ANOVA table is given (a few pages below) in SAS Program Listing 2.2. There, it is shown for the case when one wishes to include the interaction term. To omit the interaction term, as required now, simply change the model line to model Happiness = Treatment Sport;. The resulting ANOVA table is shown in SAS Output 2.10.
Matlab's anovan function can also compute this, and will be discussed below. The code to do so is given in Listing 2.10, using the first 25 lines, and changing line 25 to:
1 p=anovan(y,{fac1 fac2},'model','linear','varnames',{'Treatment A','Phy Act'})
The output is shown in Figure 2.8, and is the same as that from SAS.
Finally, to use Matlab for manually computing and confirming the output from the SAS proc anova and Matlab anovan functions, apply lines 1–9 from Listing 2.5, and then those in Listing 2.6, in conjunction with our simulated data set, to carry out the sums of squares calculation in (2.58).
filename ein 'anova2prozac.txt';
ods pdf file='ANOVA Prozac Output.pdf';
ods rtf file='ANOVA Prozac Output.rtf';
data a;
infile ein stopover;
input Treatment $ Sport $ Happiness;
run;
proc anova;
classes Treatment Sport;
model Happiness = Treatment | Sport;
means Treatment | Sport / SCHEFFE lines cldiff;
run;
ods _all_ close;
ods html;
SAS Program Listing 2.2: Runs the ANOVA procedure in SAS for the same data set used throughout this section. The notation Treatment | Sport is short for Treatment Sport Treatment*Sport.
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3   79.7993269       26.5997756    8.26      <.0001
Error             68   219.1019446      3.2220874
Corrected Total   71   298.9012715

Source      DF   Anova SS      Mean Square   F Value   Pr > F
Treatment    2   53.33396806   26.66698403   8.28      0.0006
Sport        1   26.46535881   26.46535881   8.21      0.0055
SAS Output 2.10: Analysis of the simulated data set that we will use throughout, and such that the model is $Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, i.e., does not use the interaction term. The same output for the two treatment effects sums of squares, and the error sums of squares, is given via Matlab in Figure 2.8.
2.5.3 Sums of Squares Decomposition With Interaction
We now develop the ANOVA table for the full model (2.44), with interaction. As mentioned above, in practice one starts with the full model in order to inspect the strength of the interaction term, usually hoping it is insignificant, as judged inevitably by comparing the $p$-value of the associated $F$ test to the usual values of 0.10, 0.05, and 0.01. If the researcher decides it is insignificant and wishes to proceed without an interaction term, then, formally, all subsequent analysis, point estimates, and hypothesis test results are conditional on this decision, and one is in a pre-test estimation and pre-test testing framework.
If the interaction terms are strong enough, such that the model cannot be represented accurately without them, then the full two-way ANOVA model (2.44) can be expressed as $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, with
$$\boldsymbol{\beta} = (\mu, \alpha_1, \ldots, \alpha_a, \beta_1, \ldots, \beta_b, (\alpha\beta)_{11}, (\alpha\beta)_{12}, \ldots, (\alpha\beta)_{ab})', \tag{2.60}$$
Source        Sum Sq.   d.f.   Mean Sq.   F      Prob>F
--------------------------------------------------------
Treatment A    53.334     2    26.6669    8.28   0.0006
Phy Act        26.465     1    26.4652    8.21   0.0055
Error         219.102    68     3.2221
Total         298.901    71

Figure 2.8 Same as SAS Output 2.10, but having used Matlab's function anovan. Note that in the fourth place after the decimal, the mean square for treatment B ("Phy Act" in Matlab; "Sport" in SAS) differs between the two outputs (by one digit), presumably indicating that different numeric algorithms are used for the respective computations. This, in turn, is almost surely irrelevant given the overstated precision of the Y measurements (they are not accurate to all 14 digits maintained in the computer), and the fact that the F statistics and corresponding p-values are the same to all digits shown in the two tables.
1 % Decomposition using corrected total SS, for 2-way ANOVA, no interaction
2 SScT=y'*(eye(T)-P1)*y; SSA=y'*(PA-P1)*y;
3 SSB=y'*(PB-P1)*y; SSE=y'*(eye(T)-(PA+PB-P1))*y;
4 SSvec=[SScT, SSA, SSB, SSE]; disp(SSvec')
5 check=SScT-SSA-SSB-SSE; disp(check)

Program Listing 2.6: Computes the various sums of squares in (2.58), for the two-way ANOVA model without interaction, assuming that the simulated data set we use throughout (denoted y) is in memory (see below), and having executed lines 1–9 from Listing 2.5.
and
$$\mathbf{X} = [\mathbf{X}_1 \mid \mathbf{X}_A \mid \mathbf{X}_B \mid \mathbf{X}_{AB}], \tag{2.61}$$
where the first three terms are as in (2.48), and
$$\mathbf{X}_{AB} = \begin{pmatrix} 1_n & \mathbf{0}_n & \cdots & \mathbf{0}_n \\ \mathbf{0}_n & 1_n & \cdots & \mathbf{0}_n \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0}_n & \mathbf{0}_n & \cdots & 1_n \end{pmatrix} = \mathbf{I}_a \otimes \mathbf{I}_b \otimes 1_n = \mathbf{I}_{ab} \otimes 1_n. \tag{2.62}$$
Note that (2.62) is the same as (2.22) for the one-way ANOVA model, but with $ab$ different treatments instead of $a$.
The sum of squares decomposition (corrected for the grand mean) with interaction term is
$$\mathbf{Y}'(\mathbf{I} - \mathbf{P}_1)\mathbf{Y} = \mathbf{Y}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{P}_B - \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1)\mathbf{Y} + \mathbf{Y}'(\mathbf{I} - \mathbf{P}_{AB})\mathbf{Y}, \tag{2.63}$$
or $SS_T = SS_A + SS_B + SS_{AB} + SS_E$. As with (2.58), all terms in the center of the quadratic forms are orthogonal, e.g., recalling (2.56) and that otherwise the "coarser" projection dominates,
$$(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1) = \mathbf{P}_A(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1) - \mathbf{P}_1(\mathbf{P}_{AB} - \mathbf{P}_A - \mathbf{P}_B + \mathbf{P}_1)$$
$$= \mathbf{P}_A - \mathbf{P}_A - \mathbf{P}_1 + \mathbf{P}_1 - (\mathbf{P}_1 - \mathbf{P}_1 - \mathbf{P}_1 + \mathbf{P}_1) = \mathbf{0}.$$
The reader is invited to quickly confirm the other cases.
It is of value to show (once) the sums of squares in (2.63) without matrix notation and contrast them with their analogous matrix expressions. As the reader should confirm,
$$SS_T = \sum_{k=1}^{n}\sum_{i=1}^{a}\sum_{j=1}^{b} Y_{ijk}^2 - abn\bar{Y}_{\bullet\bullet\bullet}^2,$$
$$SS_A = bn\sum_{i=1}^{a}(\bar{Y}_{i\bullet\bullet} - \bar{Y}_{\bullet\bullet\bullet})^2, \qquad SS_B = an\sum_{j=1}^{b}(\bar{Y}_{\bullet j\bullet} - \bar{Y}_{\bullet\bullet\bullet})^2,$$
$$SS_{AB} = n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{Y}_{ij\bullet} - \bar{Y}_{i\bullet\bullet} - \bar{Y}_{\bullet j\bullet} + \bar{Y}_{\bullet\bullet\bullet})^2, \qquad SS_E = \sum_{k=1}^{n}\sum_{i=1}^{a}\sum_{j=1}^{b}(Y_{ijk} - \bar{Y}_{ij\bullet})^2.$$
Observe that $SS_{AB} + SS_E$ in (2.63) is precisely the $SS_E$ term in (2.58). The reader is encouraged to construct code similar to that in Listings 2.5 and 2.6 to confirm the ANOVA sum of squares output shown in Figure 2.11 below for the two-way ANOVA with interaction; a sketch is given below. The relevant ANOVA table is given in Table 2.4.
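Such code might look as follows (a sketch, not the text's own listing): it assumes y and lines 1–9 of Listing 2.5 are in memory, builds $\mathbf{P}_{AB}$ from (2.62), and computes the terms of (2.63).

XAB=kron(eye(a*b),on);            % X_AB = I_ab kron 1_n, as in (2.62)
PAB=XAB*inv(XAB'*XAB)*XAB';       % projection onto the column space of X_AB
SScT=y'*(eye(T)-P1)*y; SSA=y'*(PA-P1)*y; SSB=y'*(PB-P1)*y;
SSAB=y'*(PAB-PA-PB+P1)*y; SSE=y'*(eye(T)-PAB)*y;
disp([SScT, SSA, SSB, SSAB, SSE]) % the five terms of (2.63)
disp(SScT-SSA-SSB-SSAB-SSE)       % zero, confirming the decomposition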
From the facts that (i) $MS_A$ and $MS_E$ are independent and (ii) Theorem A.1 implies each is a $\chi^2$ random variable divided by its respective degrees of freedom, we know that the distribution of $F_A := MS_A/MS_E$ is noncentral $F$, with $a - 1$ numerator and $ab(n-1)$ denominator degrees of freedom, and numerator noncentrality
$$\theta_A = \frac{bn}{\sigma^2}\sum_{i=1}^{a}\alpha_i^2, \tag{2.64}$$
where (2.64) is a (correct) guess, based on the logical extension of (2.30), and subsequently derived.
Table 2.4 The ANOVA table for the balanced two-way ANOVA model with interaction effect. Mean squares denote the sums of squares divided by their associated degrees of freedom. The expected mean squares are given in (2.65), (2.72), and (2.74).

Source of variation   Degrees of freedom   Sum of squares   Mean square   Expected mean square   F statistic   p-value
Overall mean          1                    abn Ȳ•••²
Factor A              a − 1                SSA              MSA           𝔼[MSA]                 MSA/MSE       pA
Factor B              b − 1                SSB              MSB           𝔼[MSB]                 MSB/MSE       pB
Factor A*B            (a − 1)(b − 1)       SSAB             MSAB          𝔼[MSAB]                MSAB/MSE      pAB
Error                 ab(n − 1)            SSE              MSE
Total (corrected)     abn − 1              SST
Total                 abn                  Y′Y

We first use it to obtain the expected mean square associated with treatment factor A. Again recalling that, for $Z \sim \chi^2(n, \theta)$, $\mathbb{E}[Z] = n + \theta$, we have, similar to (2.41), and recalling how $\sigma^2$ gets factored out in front as in (2.37),
$$\mathbb{E}[MS_A] = \sigma^2\,\frac{(a-1) + \theta_A}{a-1} = \sigma^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2. \tag{2.65}$$
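As an aside, (2.65) is easily checked by simulation. The sketch below (with illustrative parameters; it assumes the projection matrices from Listing 2.5 and the mu, alp, bet values from the simulation sketch after (2.45)) averages $MS_A$ over simulated data sets and compares with $\sigma^2 + \frac{bn}{a-1}\sum_i \alpha_i^2$.

m=zeros(T,1); t=0;                        % mean vector X*beta, no interaction
for i=1:a, for j=1:b, for k=1:n, t=t+1; m(t)=mu+alp(i)+bet(j); end, end, end
nsim=2000; msa=zeros(nsim,1);
for s=1:nsim
  ys=m+sigma*randn(T,1); msa(s)=ys'*(PA-P1)*ys/(a-1);
end
disp([mean(msa), sigma^2+b*n/(a-1)*sum(alp.^2)])   % should agree closely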
Noncentrality term (2.64) can be formally derived by using (1.92), i.e.,
$$(\mathbf{Y}/\sigma)'(\mathbf{P}_A - \mathbf{P}_1)(\mathbf{Y}/\sigma) \sim \chi^2\big(a - 1,\ \boldsymbol{\beta}'\mathbf{X}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta}/\sigma^2\big), \tag{2.66}$$
and confirming that
$$\boldsymbol{\beta}'\mathbf{X}'(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = \boldsymbol{\beta}'\mathbf{X}'(\mathbf{P}_A - \mathbf{P}_1)' \times (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = bn\sum_{i=1}^{a}\alpha_i^2. \tag{2.67}$$
This would be very easy if, with $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_a)'$, we can show
$$(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = \mathbf{P}_A\mathbf{X}_A\boldsymbol{\alpha}. \tag{2.68}$$
If (2.68) is true, then note that, by the nature of projection, $\mathbf{P}_A\mathbf{X}_A = \mathbf{X}_A$, and $\mathbf{X}_A\boldsymbol{\alpha} = \boldsymbol{\alpha} \otimes 1_{bn}$, and the sum of the squares of the latter term is clearly $bn\sum_{i=1}^{a}\alpha_i^2$. To confirm (2.68), observe from (2.61) that
$$(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X} = (\mathbf{P}_A - \mathbf{P}_1)[\mathbf{X}_1 \mid \mathbf{X}_A \mid \mathbf{X}_B \mid \mathbf{X}_{AB}] = [\mathbf{0}_{T\times 1} \mid (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_A \mid \mathbf{0}_{T\times b} \mid (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB}]. \tag{2.69}$$
The latter term $(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB} \neq \mathbf{0}$, but if we first assume the interaction terms $(\alpha\beta)_{ij}$ are all zero, then (2.69) implies
$$(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}\boldsymbol{\beta} = (\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_A\boldsymbol{\alpha}.$$
Now observe that $\mathbf{P}_1\mathbf{X}_A = T^{-1}\mathbf{J}_T(\mathbf{I}_a \otimes 1_{bn}) = a^{-1}\mathbf{J}_{T,a}$, where $\mathbf{J}_{T,a}$ is a $T \times a$ matrix of ones. This, and the fact that $\sum_{i=1}^{a}\alpha_i = 0$, implies that $\mathbf{P}_1\mathbf{X}_A\boldsymbol{\alpha}$ is zero, and (2.68), and thus (2.64), are shown.
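A numerical confirmation of (2.67) is also immediate (a sketch, assuming the design matrix pieces from Listing 2.5 and the illustrative mu, alp, bet from the earlier simulation sketch, with the interaction terms set to zero):

XAB=kron(eye(a*b),on); Xf=[X1,XA,XB,XAB];  % full X of (2.61)
betaf=[mu, alp, bet, zeros(1,a*b)]';       % beta of (2.60), no interaction
q=betaf'*Xf'*(PA-P1)*Xf*betaf;
disp([q, b*n*sum(alp.^2)])                 % both equal bn * sum(alpha_i^2)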
In the case with nonzero interaction terms, with
$$\boldsymbol{\gamma} = ((\alpha\beta)_{11}, (\alpha\beta)_{12}, \ldots, (\alpha\beta)_{ab})', \tag{2.70}$$
we (cut corners and) confirm numerically that $(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB}\boldsymbol{\gamma} = \mathbf{0}$ (a $T$-length column of zeros), provided that the constraints on the interaction terms in (2.45) are met. It is not enough that all $ab$ terms sum to zero. The reader is encouraged to also numerically confirm this (a sketch follows), and, better, prove it algebraically.
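The following sketch (assuming, as before, lines 1–9 of Listing 2.5 are in memory) constructs interaction terms satisfying (2.45) by double-centering a random $a \times b$ matrix, confirms $(\mathbf{P}_A - \mathbf{P}_1)\mathbf{X}_{AB}\boldsymbol{\gamma} = \mathbf{0}$, and shows that a zero grand sum alone does not suffice:

XAB=kron(eye(a*b),on);             % X_AB per (2.62)
G=randn(a,b);                      % raw (alpha beta)_ij values
G=G-repmat(mean(G,1),a,1);         % column sums now zero
G=G-repmat(mean(G,2),1,b);         % row sums now zero (both constraints hold)
gam=reshape(G',a*b,1);             % stacked with j quickest, per (2.70)
disp(max(abs((PA-P1)*XAB*gam)))    % zero to machine precision
G2=randn(a,b); G2=G2-mean(G2(:));  % only the grand sum is zero
gam2=reshape(G2',a*b,1);
disp(max(abs((PA-P1)*XAB*gam2)))   % nonzero in general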
Thus, $F_A \sim F_{a-1,\,ab(n-1)}(\theta_A)$, and the power of the test is $\Pr(F_A > c_A)$, where $c_A$ is the cutoff value under the null (central) distribution for a given test significance level $\alpha$. Based on the values we use below in an empirical example, namely $n = 12$, $a = 3$, $b = 2$, $\sigma = 2$, and $\sum_{i=1}^{a}\alpha_i^2 = 2/3$, (2.64) yields $\theta_A = 4$, so that the power of the test with significance level $\alpha = 0.05$ is 0.399, as computed with the code in Listing 2.7.
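A sketch in the same spirit (not the text's Listing 2.7; it requires Matlab's Statistics Toolbox, and infers $\sum_j \beta_j^2 = 9/8$ from $\theta_B = 81/8$ via (2.71) below):

n=12; a=3; b=2; sigma=2; alpha=0.05;
df1=a-1; df2=a*b*(n-1);             % numerator and denominator df
thetaA=b*n*(2/3)/sigma^2;           % (2.64), giving 4
cA=finv(1-alpha,df1,df2);           % cutoff under the central F
powerA=1-ncfcdf(cA,df1,df2,thetaA)  % should reproduce the 0.399 quoted in the text
thetaB=a*n*(9/8)/sigma^2;           % (2.71), giving 81/8
cB=finv(1-alpha,b-1,df2);
powerB=1-ncfcdf(cB,b-1,df2,thetaB)  % should reproduce the 0.880 quoted below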
Analogous to (2.64), the test statistic associated with effect B is $F_B \sim F_{b-1,\,ab(n-1)}(\theta_B)$, where
$$\theta_B = \frac{an}{\sigma^2}\sum_{j=1}^{b}\beta_j^2, \tag{2.71}$$
which is $\theta_B = 81/8$ in our case, yielding a power of 0.880. Also analogously,
$$\mathbb{E}[MS_B] = \sigma^2\,\frac{(b-1) + \theta_B}{b-1} = \sigma^2 + \frac{an}{b-1}\sum_{j=1}^{b}\beta_j^2. \tag{2.72}$$