Two Sample t-Tests for Differences in Means

Part of the book *Linear Models and Time Series Analysis: Regression, ANOVA, ARMA and GARCH* (pages 92–98)

Every basic statistics course discusses the classic t-test for the null hypothesis of equality of the means of two normal populations. This is done under the assumption of equal population variances, and usually also without that assumption. In both cases, the test decision is the same as that delivered by the binary result of zero being in or out of the corresponding confidence interval, the latter having been detailed in Section III.8.3.

Arguments in favor of the use of confidence intervals and the study of effect sizes, as opposed to the blind application of hypothesis tests, were discussed in Section III.2.8. There, it was also discussed how hypothesis testing can have a useful role in inference, notably in randomized studies that are repeatable. We now derive the distribution of the associated test statistic, under the equal-variance assumption, using the linear model framework. This is an easy task, given the general results from Chapter 1.

Let $Y_{1j} \stackrel{\text{i.i.d.}}{\sim} \mathrm{N}(\mu_1, \sigma^2)$, $j = 1, \dots, m$, independent of $Y_{2j} \stackrel{\text{i.i.d.}}{\sim} \mathrm{N}(\mu_2, \sigma^2)$, $j = 1, \dots, n$, with $\sigma^2 > 0$. This can be expressed as the linear model $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where, in standard notation, $\boldsymbol{\epsilon} \sim \mathrm{N}(\mathbf{0}, \sigma^2 \mathbf{I}_N)$, $N = m + n$,

$$
\mathbf{X} = \begin{bmatrix} \mathbf{1}_m & \mathbf{0}_m \\ \mathbf{0}_n & \mathbf{1}_n \end{bmatrix}, \qquad
\boldsymbol{\beta} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad
\mathbf{Y} = \begin{bmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \end{bmatrix}, \tag{2.1}
$$

and

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
= \begin{bmatrix} m & 0 \\ 0 & n \end{bmatrix}^{-1} \begin{bmatrix} Y_{1\bullet} \\ Y_{2\bullet} \end{bmatrix}
= \begin{bmatrix} \bar{Y}_{1\bullet} \\ \bar{Y}_{2\bullet} \end{bmatrix}, \tag{2.2}
$$

where we define the notation

$$
Y_{1\bullet} = \sum_{j=1}^{m} Y_{1j}, \quad \bar{Y}_{1\bullet} = \frac{Y_{1\bullet}}{m}, \quad \text{and likewise,} \quad
Y_{2\bullet} = \sum_{j=1}^{n} Y_{2j}, \quad \bar{Y}_{2\bullet} = \frac{Y_{2\bullet}}{n}. \tag{2.3}
$$
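As a quick numerical check (not from the text), one can form the normal equations for a small two-group dataset and confirm that, since $\mathbf{X}'\mathbf{X}$ is diagonal here, the OLS estimator in (2.2) reduces to the two group means. A minimal Python sketch with made-up data:

```python
# Two small made-up samples (hypothetical data, for illustration only).
y1 = [4.1, 5.0, 3.8, 4.6]   # group 1, m = 4
y2 = [6.2, 5.9, 6.5]        # group 2, n = 3
m, n = len(y1), len(y2)

# Design matrix of (2.1): first m rows select mu_1, last n rows select mu_2.
X = [[1, 0]] * m + [[0, 1]] * n
Y = y1 + y2

# Normal equations (X'X) beta = X'Y; here X'X = diag(m, n), so inversion is trivial.
XtX = [[sum(r[i] * r[j] for r in X) for j in range(2)] for i in range(2)]
XtY = [sum(r[i] * y for r, y in zip(X, Y)) for i in range(2)]
beta_hat = [XtY[0] / XtX[0][0], XtY[1] / XtX[1][1]]

print(beta_hat)  # the two group means, as in (2.2)
```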

The residual sum of squares, $\mathrm{RSS} = S(\hat{\boldsymbol{\beta}}) = \hat{\boldsymbol{\epsilon}}'\hat{\boldsymbol{\epsilon}}$, is immediately seen to be

$$
S(\hat{\boldsymbol{\beta}}) = \sum_{j=1}^{m} (Y_{1j} - \bar{Y}_{1\bullet})^2 + \sum_{j=1}^{n} (Y_{2j} - \bar{Y}_{2\bullet})^2 = (m-1)S_1^2 + (n-1)S_2^2, \tag{2.4}
$$

where $S_i^2$ is the sample variance based on the data from group $i$, $i = 1, 2$. Thus, from (2.4) and (1.58), an unbiased estimator of $\sigma^2$ is

$$
\hat{\sigma}^2 = S(\hat{\boldsymbol{\beta}})/(m + n - 2). \tag{2.5}
$$
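The decomposition in (2.4) is easy to verify numerically; in the following sketch (made-up data), `statistics.variance` computes the sample variance $S_i^2$ with the $n-1$ denominator:

```python
from statistics import mean, variance

y1 = [4.1, 5.0, 3.8, 4.6]   # hypothetical group 1 data, m = 4
y2 = [6.2, 5.9, 6.5]        # hypothetical group 2 data, n = 3
m, n = len(y1), len(y2)

# Left side of (2.4): within-group sums of squared deviations.
rss = (sum((y - mean(y1)) ** 2 for y in y1)
       + sum((y - mean(y2)) ** 2 for y in y2))

# Right side of (2.4): weighted sample variances.
pooled_form = (m - 1) * variance(y1) + (n - 1) * variance(y2)

sigma2_hat = rss / (m + n - 2)   # the unbiased estimator (2.5)
print(rss, pooled_form, sigma2_hat)
```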

In the case that $m = n$ (as we will consider below, with $a \geq 2$ groups instead of just two, for the balanced one-way fixed effects ANOVA model), (2.5) can be expressed as

$$
(m = n), \; (a = 2), \qquad \hat{\sigma}^2 = \frac{1}{a(n-1)} \sum_{i=1}^{a} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\bullet})^2. \tag{2.6}
$$

Remark (1.57) states that $\mathrm{RSS} = \mathbf{Y}'(\mathbf{I} - \mathbf{P})\mathbf{Y} = \|\mathbf{Y}\|^2 - \|\mathbf{X}\hat{\boldsymbol{\beta}}\|^2$, where $\mathbf{P}$ is the usual projection matrix $\mathbf{P} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$. It is a useful exercise to confirm, in this simple setting, that this RSS formula also leads to (2.4). For clarity, let $\bar{Y}_{1\bullet}^2 = (\bar{Y}_{1\bullet})^2$. We have, from the definition of $\mathbf{X}$ and $\hat{\boldsymbol{\beta}}$ in (2.2),

$$
\|\mathbf{Y}\|^2 - \|\mathbf{X}\hat{\boldsymbol{\beta}}\|^2
= \sum_{j=1}^{m} Y_{1j}^2 + \sum_{j=1}^{n} Y_{2j}^2 - m\bar{Y}_{1\bullet}^2 - n\bar{Y}_{2\bullet}^2
= \sum_{j=1}^{m} (Y_{1j}^2 - \bar{Y}_{1\bullet}^2) + \sum_{j=1}^{n} (Y_{2j}^2 - \bar{Y}_{2\bullet}^2). \tag{2.7}
$$

But, as

$$
\sum_{j=1}^{m} (Y_{1j} - \bar{Y}_{1\bullet})^2
= \sum_{j=1}^{m} Y_{1j}^2 - 2\sum_{j=1}^{m} Y_{1j}\bar{Y}_{1\bullet} + \sum_{j=1}^{m} \bar{Y}_{1\bullet}^2
= \sum_{j=1}^{m} Y_{1j}^2 - 2m\bar{Y}_{1\bullet}^2 + m\bar{Y}_{1\bullet}^2
= \sum_{j=1}^{m} Y_{1j}^2 - m\bar{Y}_{1\bullet}^2
= \sum_{j=1}^{m} (Y_{1j}^2 - \bar{Y}_{1\bullet}^2), \tag{2.8}
$$

and likewise for the second group, (2.7) is equivalent to (2.4). ◾

The null hypothesis is that $\mu_1 = \mu_2$, and in the notation of Section 1.4, $\mathbf{H}\boldsymbol{\beta} = \mathbf{b}$, with $J = 1$, $\mathbf{H} = [1, -1]$ and scalar $b = 0$. From (1.90) with $\mathbf{A} = (\mathbf{X}'\mathbf{X})^{-1}$, it follows that $\mathbf{H}\mathbf{A}\mathbf{H}' = m^{-1} + n^{-1}$. Thus, (1.87) is painlessly seen to be

Y′(PP)Y=S(̂𝜸) −S(𝜷) = (Ĥ ̂𝜷)′(HAH′)−1H𝜷̂= (1•−2•)2

m−1+n−1 . (2.9)

Remark As we did above for (2.4), it is instructive to derive (2.9) by brute force, directly evaluating $S(\hat{\boldsymbol{\gamma}}) - S(\hat{\boldsymbol{\beta}})$. Here, it will be convenient to let $n_1 = m$ and $n_2 = n$, which would anyway be necessary in the general unbalanced case with $a \geq 2$ groups. Under the reduced model, $\mathbf{P}_0\mathbf{Y} = \mathbf{X}_0\hat{\boldsymbol{\gamma}}$ with

$$
\hat{\gamma} = \bar{Y}_{\bullet\bullet} = N^{-1} \sum_{i=1}^{2} \sum_{j=1}^{n_i} Y_{ij} = N^{-1} Y_{\bullet\bullet},
$$

this being the mean of all the $Y_{ij}$, where $N = n_1 + n_2$. Then

$$
S(\hat{\boldsymbol{\gamma}}) = \sum_{j=1}^{n_1} (Y_{1j} - \bar{Y}_{\bullet\bullet})^2 + \sum_{j=1}^{n_2} (Y_{2j} - \bar{Y}_{\bullet\bullet})^2
= (Y^2)_{1\bullet} - 2\bar{Y}_{\bullet\bullet}Y_{1\bullet} + n_1\bar{Y}_{\bullet\bullet}^2 + (Y^2)_{2\bullet} - 2\bar{Y}_{\bullet\bullet}Y_{2\bullet} + n_2\bar{Y}_{\bullet\bullet}^2
= (Y^2)_{1\bullet} + (Y^2)_{2\bullet} - N\bar{Y}_{\bullet\bullet}^2,
$$

which could have been more easily determined by realizing that, in this case, $S(\hat{\boldsymbol{\gamma}}) = (Y^2)_{\bullet\bullet} - N\bar{Y}_{\bullet\bullet}^2$, and $(Y^2)_{\bullet\bullet} = (Y^2)_{1\bullet} + (Y^2)_{2\bullet}$. Observe that

$$
N\bar{Y}_{\bullet\bullet}^2 = N^{-1}(Y_{1\bullet} + Y_{2\bullet})^2
= N^{-1}(Y_{1\bullet})^2 + N^{-1}(Y_{2\bullet})^2 + 2N^{-1}Y_{1\bullet}Y_{2\bullet}
= N^{-1}n_1^2\bar{Y}_{1\bullet}^2 + N^{-1}n_2^2\bar{Y}_{2\bullet}^2 + 2N^{-1}n_1 n_2\bar{Y}_{1\bullet}\bar{Y}_{2\bullet}.
$$

Next, from (2.4), and the latter expression in (2.8),

$$
S(\hat{\boldsymbol{\beta}}) = \sum_{j=1}^{n_1} Y_{1j}^2 - n_1\bar{Y}_{1\bullet}^2 + \sum_{j=1}^{n_2} Y_{2j}^2 - n_2\bar{Y}_{2\bullet}^2
= (Y^2)_{1\bullet} + (Y^2)_{2\bullet} - n_1\bar{Y}_{1\bullet}^2 - n_2\bar{Y}_{2\bullet}^2,
$$

so that

$$
S(\hat{\boldsymbol{\gamma}}) - S(\hat{\boldsymbol{\beta}})
= n_1\bar{Y}_{1\bullet}^2\left(1 - \frac{n_1}{N}\right) + n_2\bar{Y}_{2\bullet}^2\left(1 - \frac{n_2}{N}\right) - \frac{2n_1 n_2}{N}\bar{Y}_{1\bullet}\bar{Y}_{2\bullet}
= \frac{n_1 n_2}{n_1 + n_2}\left(\bar{Y}_{1\bullet}^2 + \bar{Y}_{2\bullet}^2 - 2\bar{Y}_{1\bullet}\bar{Y}_{2\bullet}\right)
= \frac{(\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet})^2}{n_1^{-1} + n_2^{-1}},
$$

which is the same as (2.9). ◾
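Both routes to (2.9) can also be confirmed on a small example. The sketch below (made-up data) computes $S(\hat{\gamma})$ and $S(\hat{\beta})$ directly and compares their difference with the closed form:

```python
from statistics import mean

y1 = [4.1, 5.0, 3.8, 4.6]   # hypothetical group 1, n1 = 4
y2 = [6.2, 5.9, 6.5]        # hypothetical group 2, n2 = 3
n1, n2 = len(y1), len(y2)
y_all = y1 + y2

# Reduced model: one common mean for all observations.
s_gamma = sum((y - mean(y_all)) ** 2 for y in y_all)
# Full model: separate group means.
s_beta = (sum((y - mean(y1)) ** 2 for y in y1)
          + sum((y - mean(y2)) ** 2 for y in y2))

diff = s_gamma - s_beta
closed_form = (mean(y1) - mean(y2)) ** 2 / (1 / n1 + 1 / n2)   # right side of (2.9)
print(diff, closed_form)
```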

Based on (2.9), the $F$ statistic (1.88) is

$$
F = \frac{(\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet})^2 / (m^{-1} + n^{-1})}{\left((m-1)S_1^2 + (n-1)S_2^2\right)/(m + n - 2)}
= \frac{(\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet})^2}{S_p^2\,(m^{-1} + n^{-1})} \sim \mathrm{F}_{1,\,m+n-2}, \tag{2.10}
$$

a central $F$ distribution with 1 and $m + n - 2$ degrees of freedom, where

$$
S_p^2 = \frac{(m-1)S_1^2 + (n-1)S_2^2}{m + n - 2} \tag{2.11}
$$

from (2.5) is referred to as the pooled variance estimator of $\sigma^2$. Observe that $F = T^2$, where

$$
T = \frac{\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet}}{S_p\sqrt{m^{-1} + n^{-1}}} \sim \mathrm{t}_{m+n-2}
$$

is the usual "$t$ statistic" associated with the test. Thus, a two-sided $t$-test of size $\alpha$, $0 < \alpha < 1$, would reject the null if $|T| > c_t$, where $c_t$ is the quantile such that $\Pr(T > c_t) = \alpha/2$, or, equivalently, if $F > c$, where $\Pr(F > c) = \alpha$. Note that $c = c_t^2$.
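The relation $F = T^2$ is immediate to check numerically (made-up data again):

```python
from math import sqrt
from statistics import mean, variance

y1 = [4.1, 5.0, 3.8, 4.6]   # hypothetical data
y2 = [6.2, 5.9, 6.5]
m, n = len(y1), len(y2)

# Pooled variance estimator (2.11).
sp2 = ((m - 1) * variance(y1) + (n - 1) * variance(y2)) / (m + n - 2)
# t statistic and F statistic (2.10).
t_stat = (mean(y1) - mean(y2)) / (sqrt(sp2) * sqrt(1 / m + 1 / n))
f_stat = (mean(y1) - mean(y2)) ** 2 / (sp2 * (1 / m + 1 / n))
print(t_stat, f_stat)
```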

Under the alternative, $F \sim \mathrm{F}_{1,\,m+n-2}(\theta)$, where, from (1.82) with $\mathbf{A} = (\mathbf{X}'\mathbf{X})^{-1}$,

$$
\theta = \frac{1}{\sigma^2}\,\boldsymbol{\beta}'\mathbf{H}'(\mathbf{H}\mathbf{A}\mathbf{H}')^{-1}\mathbf{H}\boldsymbol{\beta}
= \frac{1}{\sigma^2}\,\frac{\delta^2}{m^{-1} + n^{-1}}, \qquad \delta = \mu_2 - \mu_1. \tag{2.12}
$$

For a given value of $\theta$, the power of the test is $\Pr(F > c)$. To demonstrate, let $m = n$ so that $\theta = n\delta^2/(2\sigma^2)$. In Matlab, we could use

```matlab
n = 10; delta = 0.3; sig2 = 6; theta = n * delta^2 / 2 / sig2;
c = finv(0.95, 1, 2*n-2); pow = 1 - spncf(c, 1, 2*n-2, theta);
```

where spncf refers to the saddlepoint c.d.f. approximation of the singly noncentral $F$ distribution; see Section II.10.2. As an illustration, Figure 2.1 plots the power curve of the two-sided $t$-test as a function of $\delta$, using $\sigma^2 = 1$, $\alpha = 0.05$, and three values of $n$. As expected, for a given $\delta$, the power increases with $n$, and for a given $n$, the power increases with $\delta$.
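Since spncf is a routine supplied with this book rather than a standard library, the same power computation can be approximated by straightforward Monte Carlo. The following stdlib-only Python sketch estimates the critical value from simulated null statistics and then the rejection rate under the alternative ($\delta$ and $\sigma$ chosen here only for illustration):

```python
import random
from statistics import mean, variance

def f_stat(y1, y2):
    # Two-sample F statistic (2.10) for equal group sizes m = n.
    n = len(y1)
    sp2 = (variance(y1) + variance(y2)) / 2          # pooled variance (2.11), m = n
    return (mean(y1) - mean(y2)) ** 2 / (sp2 * (2 / n))

random.seed(1)
reps, alpha, delta, sigma = 5000, 0.05, 1.0, 1.0

def mc_power(n):
    # Critical value: empirical (1 - alpha) quantile of F under the null.
    null = sorted(
        f_stat([random.gauss(0, sigma) for _ in range(n)],
               [random.gauss(0, sigma) for _ in range(n)])
        for _ in range(reps))
    c = null[int((1 - alpha) * reps)]
    # Power: rejection rate when the true means differ by delta.
    hits = sum(
        f_stat([random.gauss(0, sigma) for _ in range(n)],
               [random.gauss(delta, sigma) for _ in range(n)]) > c
        for _ in range(reps))
    return hits / reps

pow5, pow20 = mc_power(5), mc_power(20)
print(pow5, pow20)   # power increases with n, as in Figure 2.1
```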

It is more useful, though not always possible, to first decide upon a size $\alpha$ and a power $\rho$, for given values of $\sigma^2$ and $\delta$, and then calculate $n$. That requires solving for the smallest integer $n$ such that

$$
\Pr(\mathrm{F}_{1,\,2n-2}(0) > c) \leq \alpha \quad \text{and} \quad \Pr\!\left(\mathrm{F}_{1,\,2n-2}\!\left(n\delta^2/(2\sigma^2)\right) > c\right) \geq \rho.
$$

Equivalently, and numerically easier, we find the smallest $n \in \mathbb{R}_{>0}$ such that

$$
\Pr(\mathrm{F}_{1,\,2n-2}(0) > c) = \alpha \quad \text{and} \quad \Pr\!\left(\mathrm{F}_{1,\,2n-2}\!\left(n\delta^2/(2\sigma^2)\right) > c\right) = \rho, \tag{2.13}
$$

and then round up to the nearest integer. A program to accomplish this is given in Listing 2.1. (It uses the saddlepoint approximation to the noncentral $F$ distribution to save computing time.) This can then be used to find the required sample size $n^*$ as a function of, say, $\sigma^2$. To illustrate, the top panel of Figure 2.2 plots $n^*$ versus $\sigma^2$ for $\alpha = 0.05$, $\rho = 0.90$, and three values of $\delta$. It appears that $n^*$ is linear in $\sigma^2$, and this is now explained.
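The search strategy of Listing 2.1 (double $n$ until the power target is bracketed, then bisect on the continuous $n$) can be sketched in Python. Since spncf is not available here, the power below is evaluated with the known-variance normal approximation for two groups of size $n$, so exact agreement with the listing is not expected; all parameter values are illustrative:

```python
from math import ceil, sqrt
from statistics import NormalDist

nd = NormalDist()

def power_approx(n, delta, sigma2, alpha):
    # Two-sided power under the normal approximation, cf. (2.15):
    # Ybar1 - Ybar2 ~ N(delta, 2*sigma2/n) for two groups of size n.
    z = nd.inv_cdf(1 - alpha / 2)
    k = delta * sqrt(n / (2 * sigma2))
    return nd.cdf(-z - k) + nd.cdf(-z + k)

def sample_size(delta, sigma2, alpha=0.05, rho=0.90):
    # Phase 1: double n until the target power rho is reached (this bounds n).
    n = 2
    while power_approx(n, delta, sigma2, alpha) < rho:
        n *= 2
    lo, hi = n / 2, n
    # Phase 2: bisection on the continuous n, then round up.
    while hi - lo > 1e-8:
        mid = (lo + hi) / 2
        if power_approx(mid, delta, sigma2, alpha) < rho:
            lo = mid
        else:
            hi = mid
    return ceil(hi)

n_star = sample_size(delta=0.5, sigma2=1.0)
print(n_star)
```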

Figure 2.1 Power of the $F$ test, given in (2.10) and (2.12), as a function of $\delta$, using $\alpha = 0.05$ and $\sigma^2 = 1$, for $n = 10$, $15$, and $20$.

Let $X_1, \dots, X_n$ be an i.i.d. sample from a $\mathrm{N}(\mu, \sigma^2)$ population with $\sigma^2$ known. We wish to know the required sample size $n$ for a one-sided hypothesis test of $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu = \mu_a$, for $\mu_a > \mu_0$, with size $\alpha \in (0,1)$ and power $\rho \in (\alpha, 1)$. As $\bar{X}_n \sim \mathrm{N}(\mu_0, \sigma^2/n)$ under the null, let $Z = \sqrt{n}(\bar{X}_n - \mu_0)/\sigma \sim \mathrm{N}(0,1)$, so that the required test cutoff value, $c_\alpha$, is given by $\Pr(Z > c_\alpha \mid H_0) = \alpha$, or $c_\alpha = \Phi^{-1}(1 - \alpha)$. The power is

$$
\rho = \Pr(Z > c_\alpha \mid H_a)
= \Pr\!\left(\bar{X}_n > \mu_0 + c_\alpha\sqrt{\sigma^2/n} \,\middle|\, H_a\right)
= \Pr\!\left(\frac{\bar{X}_n - \mu_a}{\sqrt{\sigma^2/n}} > \frac{\mu_0 - \mu_a + c_\alpha\sqrt{\sigma^2/n}}{\sqrt{\sigma^2/n}} \,\middle|\, H_a\right),
$$

or, simplifying, with $\delta = \mu_a - \mu_0$, the minimal sample size is $\lceil n \rceil$, where $\lceil \cdot \rceil$ denotes the ceiling function, i.e., $\lceil 2.3 \rceil = \lceil 2.8 \rceil = 3$, and

$$
n = \frac{\sigma^2}{\delta^2}\left(\Phi^{-1}(1-\alpha) - \Phi^{-1}(1-\rho)\right)^2
= \frac{\sigma^2}{\delta^2}\left(\Phi^{-1}(1-\alpha) + \Phi^{-1}(\rho)\right)^2, \qquad \rho \in (\alpha, 1). \tag{2.14}
$$

Observe that (2.14) does not make sense for $\rho \in (0, \alpha)$. This formula is derived in most introductory statistics texts (see, e.g., Rosenkrantz, 1997, p. 299), and is easy because of the simplifying assumption that $\sigma^2$ is known, so that the $t$ distribution (or $F$) is not required.
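Equation (2.14) is a one-liner; a quick Python check using the standard library's normal quantile function (parameter values chosen for illustration):

```python
from math import ceil
from statistics import NormalDist

def n_one_sided(delta, sigma2, alpha=0.05, rho=0.90):
    # Minimal sample size (2.14): sigma^2 known, one-sided test.
    q = NormalDist().inv_cdf
    return ceil(sigma2 / delta ** 2 * (q(1 - alpha) + q(rho)) ** 2)

n_req = n_one_sided(delta=0.5, sigma2=1.0)
print(n_req)  # (1.645 + 1.282)^2 / 0.25 = 34.26, rounded up to 35
```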

For the two-sided test, again assuming $\sigma^2$ known, it is straightforward to show that $n$ is given by the solution to

$$
\Phi(-z - k) + \Phi(-z + k) = \rho, \quad \text{where} \quad z = \Phi^{-1}(1 - \alpha/2) \quad \text{and} \quad k = \delta\sqrt{n}/\sigma, \tag{2.15}
$$

(see, e.g., Tamhane and Dunlop, 2000, pp. 248–249), which needs to be solved numerically. However, for $\delta > 0$, the term $\Phi(-z - k)$ will be relatively small, so that

$$
n \approx \frac{\sigma^2}{\delta^2}\left(\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) + \Phi^{-1}(\rho)\right)^2 \tag{2.16}
$$

should be highly accurate. These formulae all refer to testing with a single i.i.d. sample (and $\sigma^2$ known).
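To see how accurate (2.16) is, one can solve (2.15) numerically (bisection below, since the left-hand side is increasing in $n$) and compare; a stdlib Python sketch with illustrative parameter values:

```python
from math import ceil, sqrt
from statistics import NormalDist

nd = NormalDist()
alpha, rho, delta, sigma2 = 0.05, 0.90, 0.5, 1.0
z = nd.inv_cdf(1 - alpha / 2)

def lhs(n):
    # Left-hand side of (2.15) as a function of (continuous) n.
    k = delta * sqrt(n) / sqrt(sigma2)
    return nd.cdf(-z - k) + nd.cdf(-z + k)

# Solve (2.15) for n by bisection.
lo, hi = 1.0, 1e6
while hi - lo > 1e-9:
    mid = (lo + hi) / 2
    if lhs(mid) < rho:
        lo = mid
    else:
        hi = mid
n_exact = hi

# Approximation (2.16): drop the small Phi(-z - k) term.
n_approx = sigma2 / delta ** 2 * (z + nd.inv_cdf(rho)) ** 2
print(n_exact, n_approx)
```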

These could, however, be applied to $D_i \stackrel{\text{i.i.d.}}{\sim} \mathrm{N}(\mu_D, \sigma_D^2)$, where $D_i = X_i - Y_i$ are computed from paired

```matlab
function [n,c] = design1(delta,sigma2,alpha,power)
if nargin<4, power=0.90; end, if nargin<3, alpha=0.05; end
d2=delta^2; perc0=1-alpha; M=2; n=2;
c=ncf2cdfx(perc0,1,2*n-2,0,0); theta1=n*d2/(2*sigma2);
F=spncf(c,1,2*n-2,theta1,0);
while ( 1-F < power )
  n=n*M; c=ncf2cdfx(perc0,1,2*n-2,0,0); theta1=n*d2/(2*sigma2);
  F=spncf(c,1,2*n-2,theta1,0);
end
hib=n; lob=n/M; % this should bound n
% now use bisection:
versuch=(lob+hib)/2; valid=0; TOL=1e-8;
while (valid==0)
  z=betainv(alpha,(2*versuch-2)/2,1/2);
  c=((2*versuch-2)/z - (2*versuch-2))/1;
  theta1=versuch*d2/(2*sigma2); F=spncf(c,1,2*versuch-2,theta1,0);
  check=F-(1-power); valid=(abs(check) < TOL);
  if (valid==0)
    if check<0, hib=versuch; else lob=versuch; end
    versuch=(lob+hib)/2;
  else n=versuch;
  end
end
n=ceil(n); z=betainv(alpha,(2*n-2)/2,1/2);
c=((2*n-2)/z - (2*n-2))/1;
% check the result
theta1=n*d2/(2*sigma2);
size_SPA=1-spncf(c,1,2*n-2,0,0) %#ok<NASGU,NOPRT>
size_exact=1-fcdf(c,1,2*n-2) %#ok<NASGU,NOPRT>
power_SPA=1-spncf(c,1,2*n-2,theta1,0) %#ok<NASGU,NOPRT>
power_exact=1-ncf(c,1,2*n-2,theta1,0) %#ok<NASGU,NOPRT>
end % function
```

Program Listing 2.1: Computes $n$ (and cutoff value $c$) for the given values $\delta$, $\sigma^2$, $\alpha$, and $\rho$. The last part of the program takes into account that $n$ is fractional: round up $n$ to get an integer and then recompute the cutoff value such that the size is exactly $\alpha$. Functions ncf2cdfx and spncf use the saddlepoint approximation and are available in the set of programs associated with this book; the former is given in Listing 2.2. The word "Versuch" is a noun in German meaning "attempt" or "try", the latter being a reserved word in Matlab.

observations from a bivariate normal population. If the $X_i$ and $Y_i$ have the same variance $\sigma^2$ and the correlation between them is zero, then (2.14) and (2.16) can be applied with $\sigma_D^2 = \mathrm{Var}(D_i) = 2\sigma^2$. In particular, for the two-sided test,

$$
n^* \approx \frac{2\sigma^2}{\delta^2}\left(\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) + \Phi^{-1}(\rho)\right)^2. \tag{2.17}
$$

Observe that (2.17) embodies two approximations: one is neglecting the nonzero term $\Phi(-z - k)$ in (2.15); the other is that $\sigma^2$ is taken as known. It explains the linearity of $n^*$ in Figure 2.2. To illustrate the accuracy, the bottom panel of Figure 2.2 is the same as the top panel, but using (2.17). We see that the approximation is excellent for the constellation of parameters under consideration.

```matlab
function ncf2cdfx=ncf2cdfx(alpha,n1,n2,theta1,theta2)
% cutoff value of the (possibly doubly noncentral) F distribution using the SPA.
% Compare to Matlab's built-in ncfinv and finv.

if (theta1>0) && (theta2>0), xval=1.5*theta1/theta2; else xval=1; end
multip=1; cdf=2;
while (cdf>alpha)
  versuch=xval/multip;
  cdf=spncf(versuch,n1,n2,theta1,theta2); multip=multip*2;
end
lob=versuch;

multip=1; cdf=-1;
while (cdf<alpha)
  versuch=xval*multip;
  cdf=spncf(versuch,n1,n2,theta1,theta2); multip=multip*2;
end
hib=versuch;

if 1==1 % Matlab's routine for minimization when bounds are known
  opt=optimset('TolX',1e-5,'Display','off');
  ncf2cdfx=fminbnd(@(x) spncf_(x,n1,n2,theta1,theta2,alpha),lob,hib,opt);
else % use bisection
  versuch=(lob+hib)/2; valid=0; TOL=1e-8;
  while (valid~=1)
    cdf=spncf(versuch,n1,n2,theta1,theta2);
    valid=(abs(cdf-alpha)<TOL);
    if (valid==1), ncf2cdfx=versuch;
    else
      if (cdf<alpha), lob=versuch; else hib=versuch; end
      versuch=(lob+hib)/2;
    end
  end
end
end % function

function disc=spncf_(x,n1,n2,theta1,theta2,alpha)
disc=abs(spncf(x,n1,n2,theta1,theta2,2) - alpha);
end % function
```

Program Listing 2.2: Computes the cutoff value of the (possibly doubly noncentral) $F$ distribution using its saddlepoint c.d.f. approximation. Continued from Listing 2.1.
