Two Sample t-Tests for Differences in Means

Part of the book *Linear Models and Time Series Analysis: Regression, ANOVA, ARMA and GARCH* (pages 92–98)

Every basic statistics course discusses the classic t-test for the null hypothesis of equality of the means of two normal populations. This is done under the assumption of equal population variances, and usually also without that assumption. In both cases, the test decision is the same as that delivered by the binary result of zero being in or out of the corresponding confidence interval, the latter having been detailed in Section III.8.3.

Arguments in favor of the use of confidence intervals and the study of effect sizes, as opposed to the blind application of hypothesis tests, were discussed in Section III.2.8. There, it was also discussed how hypothesis testing can have a useful role in inference, notably in randomized studies that are repeatable. We now derive the distribution of the associated test statistic, under the equal-variance assumption, using the linear model framework. This is an easy task, given the general results from Chapter 1.

Let $Y_{1j} \stackrel{\text{i.i.d.}}{\sim} \mathrm{N}(\mu_1, \sigma^2)$, $j = 1, \dots, m$, independent of $Y_{2j} \stackrel{\text{i.i.d.}}{\sim} \mathrm{N}(\mu_2, \sigma^2)$, $j = 1, \dots, n$, with $\sigma^2 > 0$. This can be expressed as the linear model $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where, in standard notation, $\boldsymbol{\epsilon} \sim \mathrm{N}(\mathbf{0}, \sigma^2 \mathbf{I}_N)$, $N = m + n$,

$$
\mathbf{X} = \begin{bmatrix} \mathbf{1}_m & \mathbf{0}_m \\ \mathbf{0}_n & \mathbf{1}_n \end{bmatrix}, \qquad
\boldsymbol{\beta} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad
\mathbf{Y} = \begin{bmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \end{bmatrix}, \tag{2.1}
$$

and

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
= \begin{bmatrix} m & 0 \\ 0 & n \end{bmatrix}^{-1} \begin{bmatrix} Y_{1\bullet} \\ Y_{2\bullet} \end{bmatrix}
= \begin{bmatrix} \bar{Y}_{1\bullet} \\ \bar{Y}_{2\bullet} \end{bmatrix}, \tag{2.2}
$$

where we define the notation

$$
Y_{1\bullet} = \sum_{j=1}^{m} Y_{1j}, \quad \bar{Y}_{1\bullet} = \frac{Y_{1\bullet}}{m}, \quad \text{and likewise,} \quad
Y_{2\bullet} = \sum_{j=1}^{n} Y_{2j}, \quad \bar{Y}_{2\bullet} = \frac{Y_{2\bullet}}{n}. \tag{2.3}
$$
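As a quick numerical check (not from the text), one can form the normal equations for a small two-group dataset and confirm that, since $\mathbf{X}'\mathbf{X}$ is diagonal here, the OLS estimator in (2.2) reduces to the two group means. A minimal Python sketch with made-up data:

```python
# Two small made-up samples (hypothetical data, for illustration only).
y1 = [4.1, 5.0, 3.8, 4.6]   # group 1, m = 4
y2 = [6.2, 5.9, 6.5]        # group 2, n = 3
m, n = len(y1), len(y2)

# Design matrix of (2.1): first m rows select mu_1, last n rows select mu_2.
X = [[1, 0]] * m + [[0, 1]] * n
Y = y1 + y2

# Normal equations (X'X) beta = X'Y; here X'X = diag(m, n), so inversion is trivial.
XtX = [[sum(r[i] * r[j] for r in X) for j in range(2)] for i in range(2)]
XtY = [sum(r[i] * y for r, y in zip(X, Y)) for i in range(2)]
beta_hat = [XtY[0] / XtX[0][0], XtY[1] / XtX[1][1]]

print(beta_hat)  # the two group means, as in (2.2)
```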

The residual sum of squares, $\mathrm{RSS} = S(\hat{\boldsymbol{\beta}}) = \hat{\boldsymbol{\epsilon}}'\hat{\boldsymbol{\epsilon}}$, is immediately seen to be

$$
S(\hat{\boldsymbol{\beta}}) = \sum_{j=1}^{m} (Y_{1j} - \bar{Y}_{1\bullet})^2 + \sum_{j=1}^{n} (Y_{2j} - \bar{Y}_{2\bullet})^2 = (m-1)S_1^2 + (n-1)S_2^2, \tag{2.4}
$$

where $S_i^2$ is the sample variance based on the data from group $i$, $i = 1, 2$. Thus, from (2.4) and (1.58), an unbiased estimator of $\sigma^2$ is

$$
\hat{\sigma}^2 = S(\hat{\boldsymbol{\beta}})/(m + n - 2). \tag{2.5}
$$
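The decomposition in (2.4) is easy to verify numerically; in the following sketch (made-up data), `statistics.variance` computes the sample variance $S_i^2$ with the $n-1$ denominator:

```python
from statistics import mean, variance

y1 = [4.1, 5.0, 3.8, 4.6]   # hypothetical group 1 data, m = 4
y2 = [6.2, 5.9, 6.5]        # hypothetical group 2 data, n = 3
m, n = len(y1), len(y2)

# Left side of (2.4): within-group sums of squared deviations.
rss = (sum((y - mean(y1)) ** 2 for y in y1)
       + sum((y - mean(y2)) ** 2 for y in y2))

# Right side of (2.4): weighted sample variances.
pooled_form = (m - 1) * variance(y1) + (n - 1) * variance(y2)

sigma2_hat = rss / (m + n - 2)   # the unbiased estimator (2.5)
print(rss, pooled_form, sigma2_hat)
```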

In the case that $m = n$ (as we will consider below, with $a \geq 2$ groups instead of just two, for the balanced one-way fixed effects ANOVA model), (2.5) can be expressed as

$$
(m = n), \; (a = 2), \qquad \hat{\sigma}^2 = \frac{1}{a(n-1)} \sum_{i=1}^{a} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\bullet})^2. \tag{2.6}
$$

Remark (1.57) states that $\mathrm{RSS} = \mathbf{Y}'(\mathbf{I} - \mathbf{P})\mathbf{Y} = \|\mathbf{Y}\|^2 - \|\mathbf{X}\hat{\boldsymbol{\beta}}\|^2$, where $\mathbf{P}$ is the usual projection matrix $\mathbf{P} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$. It is a useful exercise to confirm, in this simple setting, that this RSS formula also leads to (2.4). For clarity, let $\bar{Y}_{1\bullet}^2 = (\bar{Y}_{1\bullet})^2$. We have, from the definition of $\mathbf{X}$ and $\hat{\boldsymbol{\beta}}$ in (2.2),

$$
\|\mathbf{Y}\|^2 - \|\mathbf{X}\hat{\boldsymbol{\beta}}\|^2
= \sum_{j=1}^{m} Y_{1j}^2 + \sum_{j=1}^{n} Y_{2j}^2 - m\bar{Y}_{1\bullet}^2 - n\bar{Y}_{2\bullet}^2
= \sum_{j=1}^{m} (Y_{1j}^2 - \bar{Y}_{1\bullet}^2) + \sum_{j=1}^{n} (Y_{2j}^2 - \bar{Y}_{2\bullet}^2). \tag{2.7}
$$

But, as

$$
\sum_{j=1}^{m} (Y_{1j} - \bar{Y}_{1\bullet})^2
= \sum_{j=1}^{m} Y_{1j}^2 - 2\sum_{j=1}^{m} Y_{1j}\bar{Y}_{1\bullet} + \sum_{j=1}^{m} \bar{Y}_{1\bullet}^2
= \sum_{j=1}^{m} Y_{1j}^2 - 2m\bar{Y}_{1\bullet}^2 + m\bar{Y}_{1\bullet}^2
= \sum_{j=1}^{m} Y_{1j}^2 - m\bar{Y}_{1\bullet}^2
= \sum_{j=1}^{m} (Y_{1j}^2 - \bar{Y}_{1\bullet}^2), \tag{2.8}
$$

and likewise for the second group, (2.7) is equivalent to (2.4). ◾

The null hypothesis is that $\mu_1 = \mu_2$, and in the notation of Section 1.4, $\mathbf{H}\boldsymbol{\beta} = \mathbf{b}$, with $J = 1$, $\mathbf{H} = [1, -1]$ and scalar $b = 0$. From (1.90) with $\mathbf{A} = (\mathbf{X}'\mathbf{X})^{-1}$, it follows that $\mathbf{H}\mathbf{A}\mathbf{H}' = m^{-1} + n^{-1}$. Thus, (1.87) is painlessly seen to be

Y′(PP)Y=S(̂𝜸) −S(𝜷) = (Ĥ ̂𝜷)′(HAH′)−1H𝜷̂= (1•−2•)2

m−1+n−1 . (2.9)

Remark As we did above for (2.4), it is instructive to derive (2.9) by brute force, directly evaluating $S(\hat{\boldsymbol{\gamma}}) - S(\hat{\boldsymbol{\beta}})$. Here, it will be convenient to let $n_1 = m$ and $n_2 = n$, which would anyway be necessary in the general unbalanced case with $a \geq 2$ groups. Under the reduced model, $\mathbf{P}_0\mathbf{Y} = \mathbf{X}_0\hat{\boldsymbol{\gamma}}$ with

$$
\hat{\gamma} = \bar{Y}_{\bullet\bullet} = N^{-1} \sum_{i=1}^{2} \sum_{j=1}^{n_i} Y_{ij} = N^{-1} Y_{\bullet\bullet},
$$

this being the mean of all the $Y_{ij}$, where $N = n_1 + n_2$. Then

$$
S(\hat{\boldsymbol{\gamma}}) = \sum_{j=1}^{n_1} (Y_{1j} - \bar{Y}_{\bullet\bullet})^2 + \sum_{j=1}^{n_2} (Y_{2j} - \bar{Y}_{\bullet\bullet})^2
= (Y^2)_{1\bullet} - 2\bar{Y}_{\bullet\bullet}Y_{1\bullet} + n_1\bar{Y}_{\bullet\bullet}^2 + (Y^2)_{2\bullet} - 2\bar{Y}_{\bullet\bullet}Y_{2\bullet} + n_2\bar{Y}_{\bullet\bullet}^2
= (Y^2)_{1\bullet} + (Y^2)_{2\bullet} - N\bar{Y}_{\bullet\bullet}^2,
$$

which could have been more easily determined by realizing that, in this case, $S(\hat{\boldsymbol{\gamma}}) = (Y^2)_{\bullet\bullet} - N\bar{Y}_{\bullet\bullet}^2$, and $(Y^2)_{\bullet\bullet} = (Y^2)_{1\bullet} + (Y^2)_{2\bullet}$. Observe that

$$
N\bar{Y}_{\bullet\bullet}^2 = N^{-1}(Y_{1\bullet} + Y_{2\bullet})^2
= N^{-1}(Y_{1\bullet})^2 + N^{-1}(Y_{2\bullet})^2 + 2N^{-1}Y_{1\bullet}Y_{2\bullet}
= N^{-1}n_1^2\bar{Y}_{1\bullet}^2 + N^{-1}n_2^2\bar{Y}_{2\bullet}^2 + 2N^{-1}n_1 n_2\bar{Y}_{1\bullet}\bar{Y}_{2\bullet}.
$$

Next, from (2.4), and the latter expression in (2.8),

$$
S(\hat{\boldsymbol{\beta}}) = \sum_{j=1}^{n_1} Y_{1j}^2 - n_1\bar{Y}_{1\bullet}^2 + \sum_{j=1}^{n_2} Y_{2j}^2 - n_2\bar{Y}_{2\bullet}^2
= (Y^2)_{1\bullet} + (Y^2)_{2\bullet} - n_1\bar{Y}_{1\bullet}^2 - n_2\bar{Y}_{2\bullet}^2,
$$

so that

$$
S(\hat{\boldsymbol{\gamma}}) - S(\hat{\boldsymbol{\beta}})
= n_1\bar{Y}_{1\bullet}^2\left(1 - \frac{n_1}{N}\right) + n_2\bar{Y}_{2\bullet}^2\left(1 - \frac{n_2}{N}\right) - \frac{2n_1 n_2}{N}\bar{Y}_{1\bullet}\bar{Y}_{2\bullet}
= \frac{n_1 n_2}{n_1 + n_2}\left(\bar{Y}_{1\bullet}^2 + \bar{Y}_{2\bullet}^2 - 2\bar{Y}_{1\bullet}\bar{Y}_{2\bullet}\right)
= \frac{(\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet})^2}{n_1^{-1} + n_2^{-1}},
$$

which is the same as (2.9). ◾
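Both routes to (2.9) can also be confirmed on a small example. The sketch below (made-up data) computes $S(\hat{\gamma})$ and $S(\hat{\beta})$ directly and compares their difference with the closed form:

```python
from statistics import mean

y1 = [4.1, 5.0, 3.8, 4.6]   # hypothetical group 1, n1 = 4
y2 = [6.2, 5.9, 6.5]        # hypothetical group 2, n2 = 3
n1, n2 = len(y1), len(y2)
y_all = y1 + y2

# Reduced model: one common mean for all observations.
s_gamma = sum((y - mean(y_all)) ** 2 for y in y_all)
# Full model: separate group means.
s_beta = (sum((y - mean(y1)) ** 2 for y in y1)
          + sum((y - mean(y2)) ** 2 for y in y2))

diff = s_gamma - s_beta
closed_form = (mean(y1) - mean(y2)) ** 2 / (1 / n1 + 1 / n2)   # right side of (2.9)
print(diff, closed_form)
```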

Based on (2.9), the $F$ statistic (1.88) is

$$
F = \frac{(\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet})^2 / (m^{-1} + n^{-1})}{\left((m-1)S_1^2 + (n-1)S_2^2\right)/(m + n - 2)}
= \frac{(\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet})^2}{S_p^2\,(m^{-1} + n^{-1})} \sim \mathrm{F}_{1,\,m+n-2}, \tag{2.10}
$$

a central $F$ distribution with 1 and $m + n - 2$ degrees of freedom, where

$$
S_p^2 = \frac{(m-1)S_1^2 + (n-1)S_2^2}{m + n - 2} \tag{2.11}
$$

from (2.5) is referred to as the pooled variance estimator of $\sigma^2$. Observe that $F = T^2$, where

$$
T = \frac{\bar{Y}_{1\bullet} - \bar{Y}_{2\bullet}}{S_p\sqrt{m^{-1} + n^{-1}}} \sim \mathrm{t}_{m+n-2}
$$

is the usual "$t$ statistic" associated with the test. Thus, a two-sided $t$-test of size $\alpha$, $0 < \alpha < 1$, would reject the null if $|T| > c_t$, where $c_t$ is the quantile such that $\Pr(T > c_t) = \alpha/2$, or, equivalently, if $F > c$, where $\Pr(F > c) = \alpha$. Note that $c = c_t^2$.
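The relation $F = T^2$ is immediate to check numerically (made-up data again):

```python
from math import sqrt
from statistics import mean, variance

y1 = [4.1, 5.0, 3.8, 4.6]   # hypothetical data
y2 = [6.2, 5.9, 6.5]
m, n = len(y1), len(y2)

# Pooled variance estimator (2.11).
sp2 = ((m - 1) * variance(y1) + (n - 1) * variance(y2)) / (m + n - 2)
# t statistic and F statistic (2.10).
t_stat = (mean(y1) - mean(y2)) / (sqrt(sp2) * sqrt(1 / m + 1 / n))
f_stat = (mean(y1) - mean(y2)) ** 2 / (sp2 * (1 / m + 1 / n))
print(t_stat, f_stat)
```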

Under the alternative, $F \sim \mathrm{F}_{1,\,m+n-2}(\theta)$, where, from (1.82) with $\mathbf{A} = (\mathbf{X}'\mathbf{X})^{-1}$,

$$
\theta = \frac{1}{\sigma^2}\,\boldsymbol{\beta}'\mathbf{H}'(\mathbf{H}\mathbf{A}\mathbf{H}')^{-1}\mathbf{H}\boldsymbol{\beta}
= \frac{1}{\sigma^2}\,\frac{\delta^2}{m^{-1} + n^{-1}}, \qquad \delta = \mu_2 - \mu_1. \tag{2.12}
$$

For a given value of $\theta$, the power of the test is $\Pr(F > c)$. To demonstrate, let $m = n$ so that $\theta = n\delta^2/(2\sigma^2)$. In Matlab, we could use

```matlab
n = 10; delta = 0.3; sig2 = 6; theta = n * delta^2 / 2 / sig2;
c = finv(0.95, 1, 2*n-2); pow = 1 - spncf(c, 1, 2*n-2, theta);
```

where spncf refers to the saddlepoint c.d.f. approximation of the singly noncentral $F$ distribution; see Section II.10.2. As an illustration, Figure 2.1 plots the power curve of the two-sided $t$-test as a function of $\delta$, using $\sigma^2 = 1$, $\alpha = 0.05$, and three values of $n$. As expected, for a given $\delta$, the power increases with $n$, and for a given $n$, the power increases with $\delta$.
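Since spncf is a routine supplied with this book rather than a standard library, the same power computation can be approximated by straightforward Monte Carlo. The following stdlib-only Python sketch estimates the critical value from simulated null statistics and then the rejection rate under the alternative ($\delta$ and $\sigma$ chosen here only for illustration):

```python
import random
from statistics import mean, variance

def f_stat(y1, y2):
    # Two-sample F statistic (2.10) for equal group sizes m = n.
    n = len(y1)
    sp2 = (variance(y1) + variance(y2)) / 2          # pooled variance (2.11), m = n
    return (mean(y1) - mean(y2)) ** 2 / (sp2 * (2 / n))

random.seed(1)
reps, alpha, delta, sigma = 5000, 0.05, 1.0, 1.0

def mc_power(n):
    # Critical value: empirical (1 - alpha) quantile of F under the null.
    null = sorted(
        f_stat([random.gauss(0, sigma) for _ in range(n)],
               [random.gauss(0, sigma) for _ in range(n)])
        for _ in range(reps))
    c = null[int((1 - alpha) * reps)]
    # Power: rejection rate when the true means differ by delta.
    hits = sum(
        f_stat([random.gauss(0, sigma) for _ in range(n)],
               [random.gauss(delta, sigma) for _ in range(n)]) > c
        for _ in range(reps))
    return hits / reps

pow5, pow20 = mc_power(5), mc_power(20)
print(pow5, pow20)   # power increases with n, as in Figure 2.1
```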

It is more useful, though not always possible, to first decide upon a size $\alpha$ and a power $\rho$, for given values of $\sigma^2$ and $\delta$, and then calculate $n$. That requires solving for the smallest integer $n$ such that

$$
\Pr(\mathrm{F}_{1,\,2n-2}(0) > c) \leq \alpha \quad \text{and} \quad \Pr\!\left(\mathrm{F}_{1,\,2n-2}\!\left(n\delta^2/(2\sigma^2)\right) > c\right) \geq \rho.
$$

Equivalently, and numerically easier, we find the smallest $n \in \mathbb{R}_{>0}$ such that

$$
\Pr(\mathrm{F}_{1,\,2n-2}(0) > c) = \alpha \quad \text{and} \quad \Pr\!\left(\mathrm{F}_{1,\,2n-2}\!\left(n\delta^2/(2\sigma^2)\right) > c\right) = \rho, \tag{2.13}
$$

and then round up to the nearest integer. A program to accomplish this is given in Listing 2.1. (It uses the saddlepoint approximation to the noncentral $F$ distribution to save computing time.) This can then be used to find the required sample size $n^*$ as a function of, say, $\sigma^2$. To illustrate, the top panel of Figure 2.2 plots $n^*$ versus $\sigma^2$ for $\alpha = 0.05$, $\rho = 0.90$, and three values of $\delta$. It appears that $n^*$ is linear in $\sigma^2$, and this is now explained.
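The search strategy of Listing 2.1 (double $n$ until the power target is bracketed, then bisect on the continuous $n$) can be sketched in Python. Since spncf is not available here, the power below is evaluated with the known-variance normal approximation for two groups of size $n$, so exact agreement with the listing is not expected; all parameter values are illustrative:

```python
from math import ceil, sqrt
from statistics import NormalDist

nd = NormalDist()

def power_approx(n, delta, sigma2, alpha):
    # Two-sided power under the normal approximation, cf. (2.15):
    # Ybar1 - Ybar2 ~ N(delta, 2*sigma2/n) for two groups of size n.
    z = nd.inv_cdf(1 - alpha / 2)
    k = delta * sqrt(n / (2 * sigma2))
    return nd.cdf(-z - k) + nd.cdf(-z + k)

def sample_size(delta, sigma2, alpha=0.05, rho=0.90):
    # Phase 1: double n until the target power rho is reached (this bounds n).
    n = 2
    while power_approx(n, delta, sigma2, alpha) < rho:
        n *= 2
    lo, hi = n / 2, n
    # Phase 2: bisection on the continuous n, then round up.
    while hi - lo > 1e-8:
        mid = (lo + hi) / 2
        if power_approx(mid, delta, sigma2, alpha) < rho:
            lo = mid
        else:
            hi = mid
    return ceil(hi)

n_star = sample_size(delta=0.5, sigma2=1.0)
print(n_star)
```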

Figure 2.1 Power of the $F$ test, given in (2.10) and (2.12), as a function of $\delta$, using $\alpha = 0.05$ and $\sigma^2 = 1$, for $n = 10$, $15$, and $20$.

Let $X_1, \dots, X_n$ be an i.i.d. sample from a $\mathrm{N}(\mu, \sigma^2)$ population with $\sigma^2$ known. We wish to know the required sample size $n$ for a one-sided hypothesis test of $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu = \mu_a$, for $\mu_a > \mu_0$, with size $\alpha \in (0,1)$ and power $\rho \in (\alpha, 1)$. As $\bar{X}_n \sim \mathrm{N}(\mu_0, \sigma^2/n)$ under the null, let $Z = \sqrt{n}(\bar{X}_n - \mu_0)/\sigma \sim \mathrm{N}(0,1)$, so that the required test cutoff value, $c_\alpha$, is given by $\Pr(Z > c_\alpha \mid H_0) = \alpha$, or $c_\alpha = \Phi^{-1}(1 - \alpha)$. The power is

$$
\rho = \Pr(Z > c_\alpha \mid H_a)
= \Pr\!\left(\bar{X}_n > \mu_0 + c_\alpha\sqrt{\sigma^2/n} \,\middle|\, H_a\right)
= \Pr\!\left(\frac{\bar{X}_n - \mu_a}{\sqrt{\sigma^2/n}} > \frac{\mu_0 - \mu_a + c_\alpha\sqrt{\sigma^2/n}}{\sqrt{\sigma^2/n}} \,\middle|\, H_a\right),
$$

or, simplifying, with $\delta = \mu_a - \mu_0$, the minimal sample size is $\lceil n \rceil$, where $\lceil \cdot \rceil$ denotes the ceiling function, i.e., $\lceil 2.3 \rceil = \lceil 2.8 \rceil = 3$, and

$$
n = \frac{\sigma^2}{\delta^2}\left(\Phi^{-1}(1-\alpha) - \Phi^{-1}(1-\rho)\right)^2
= \frac{\sigma^2}{\delta^2}\left(\Phi^{-1}(1-\alpha) + \Phi^{-1}(\rho)\right)^2, \qquad \rho \in (\alpha, 1). \tag{2.14}
$$

Observe that (2.14) does not make sense for $\rho \in (0, \alpha)$. This formula is derived in most introductory statistics texts (see, e.g., Rosenkrantz, 1997, p. 299), and is easy because of the simplifying assumption that $\sigma^2$ is known, so that the $t$ distribution (or $F$) is not required.
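Equation (2.14) is a one-liner; a quick Python check using the standard library's normal quantile function (parameter values chosen for illustration):

```python
from math import ceil
from statistics import NormalDist

def n_one_sided(delta, sigma2, alpha=0.05, rho=0.90):
    # Minimal sample size (2.14): sigma^2 known, one-sided test.
    q = NormalDist().inv_cdf
    return ceil(sigma2 / delta ** 2 * (q(1 - alpha) + q(rho)) ** 2)

n_req = n_one_sided(delta=0.5, sigma2=1.0)
print(n_req)  # (1.645 + 1.282)^2 / 0.25 = 34.26, rounded up to 35
```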

For the two-sided test, again assuming $\sigma^2$ known, it is straightforward to show that $n$ is given by the solution to

$$
\Phi(-z - k) + \Phi(-z + k) = \rho, \quad \text{where} \quad z = \Phi^{-1}(1 - \alpha/2) \quad \text{and} \quad k = \delta\sqrt{n}/\sigma, \tag{2.15}
$$

(see, e.g., Tamhane and Dunlop, 2000, pp. 248–249), which needs to be solved numerically. However, for $\delta > 0$, the term $\Phi(-z - k)$ will be relatively small, so that

$$
n \approx \frac{\sigma^2}{\delta^2}\left(\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) + \Phi^{-1}(\rho)\right)^2 \tag{2.16}
$$

should be highly accurate. These formulae all refer to testing with a single i.i.d. sample (and $\sigma^2$ known).
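To see how accurate (2.16) is, one can solve (2.15) numerically (bisection below, since the left-hand side is increasing in $n$) and compare; a stdlib Python sketch with illustrative parameter values:

```python
from math import ceil, sqrt
from statistics import NormalDist

nd = NormalDist()
alpha, rho, delta, sigma2 = 0.05, 0.90, 0.5, 1.0
z = nd.inv_cdf(1 - alpha / 2)

def lhs(n):
    # Left-hand side of (2.15) as a function of (continuous) n.
    k = delta * sqrt(n) / sqrt(sigma2)
    return nd.cdf(-z - k) + nd.cdf(-z + k)

# Solve (2.15) for n by bisection.
lo, hi = 1.0, 1e6
while hi - lo > 1e-9:
    mid = (lo + hi) / 2
    if lhs(mid) < rho:
        lo = mid
    else:
        hi = mid
n_exact = hi

# Approximation (2.16): drop the small Phi(-z - k) term.
n_approx = sigma2 / delta ** 2 * (z + nd.inv_cdf(rho)) ** 2
print(n_exact, n_approx)
```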

These could, however, be applied to $D_i \stackrel{\text{i.i.d.}}{\sim} \mathrm{N}(\mu_D, \sigma_D^2)$, where $D_i = X_i - Y_i$ are computed from paired

```matlab
function [n,c] = design1(delta,sigma2,alpha,power)
if nargin<4, power=0.90; end, if nargin<3, alpha=0.05; end
d2=delta^2; perc0=1-alpha; M=2; n=2;
c=ncf2cdfx(perc0,1,2*n-2,0,0); theta1=n*d2/(2*sigma2);
F=spncf(c,1,2*n-2,theta1,0);
while ( 1-F < power )
  n=n*M; c=ncf2cdfx(perc0,1,2*n-2,0,0); theta1=n*d2/(2*sigma2);
  F=spncf(c,1,2*n-2,theta1,0);
end
hib=n; lob=n/M; % this should bound n
% now use bisection:
versuch=(lob+hib)/2; valid=0; TOL=1e-8;
while (valid==0)
  z=betainv(alpha,(2*versuch-2)/2,1/2);
  c=((2*versuch-2)/z - (2*versuch-2))/1;
  theta1=versuch*d2/(2*sigma2); F=spncf(c,1,2*versuch-2,theta1,0);
  check=F-(1-power); valid=(abs(check) < TOL);
  if (valid==0)
    if check<0, hib=versuch; else lob=versuch; end
    versuch=(lob+hib)/2;
  else n=versuch;
  end
end
n=ceil(n); z=betainv(alpha,(2*n-2)/2,1/2);
c=((2*n-2)/z - (2*n-2))/1;
% check the result
theta1=n*d2/(2*sigma2);
size_SPA=1-spncf(c,1,2*n-2,0,0) %#ok<NASGU,NOPRT>
size_exact=1-fcdf(c,1,2*n-2) %#ok<NASGU,NOPRT>
power_SPA=1-spncf(c,1,2*n-2,theta1,0) %#ok<NASGU,NOPRT>
power_exact=1-ncf(c,1,2*n-2,theta1,0) %#ok<NASGU,NOPRT>
end % function
```

Program Listing 2.1: Computes $n$ (and cutoff value $c$) for the given values $\delta$, $\sigma^2$, $\alpha$, and $\rho$. The last part of the program takes into account that $n$ is fractional: round up $n$ to get an integer and then recompute the cutoff value such that the size is exactly $\alpha$. Functions ncf2cdfx and spncf use the saddlepoint approximation and are available in the set of programs associated with this book; the former is given in Listing 2.2. The word "Versuch" is a noun in German meaning "attempt" or "try", the latter being a reserved word in Matlab.

observations from a bivariate normal population. If the $X_i$ and $Y_i$ have the same variance $\sigma^2$ and the correlation between them is zero, then (2.14) and (2.16) can be applied with $\sigma_D^2 = \mathrm{Var}(D_i) = 2\sigma^2$. In particular, for the two-sided test,

$$
n^* \approx \frac{2\sigma^2}{\delta^2}\left(\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) + \Phi^{-1}(\rho)\right)^2. \tag{2.17}
$$

Observe that (2.17) embodies two approximations: one is neglecting the nonzero term $\Phi(-z - k)$ in (2.15); the other is that $\sigma^2$ is taken as known. It explains the linearity of $n^*$ in Figure 2.2. To illustrate the accuracy, the bottom panel of Figure 2.2 is the same as the top panel, but using (2.17). We see that the approximation is excellent for the constellation of parameters under consideration.

```matlab
function ncf2cdfx=ncf2cdfx(alpha,n1,n2,theta1,theta2)
% cutoff value of the (possibly doubly noncentral) F distribution using the SPA.
% Compare to Matlab's built-in ncfinv and finv.

if (theta1>0) && (theta2>0), xval=1.5*theta1/theta2; else xval=1; end
multip=1; cdf=2;
while (cdf>alpha)
  versuch=xval/multip;
  cdf=spncf(versuch,n1,n2,theta1,theta2); multip=multip*2;
end
lob=versuch;

multip=1; cdf=-1;
while (cdf<alpha)
  versuch=xval*multip;
  cdf=spncf(versuch,n1,n2,theta1,theta2); multip=multip*2;
end
hib=versuch;

if 1==1 % Matlab's routine for minimization when bounds are known
  opt=optimset('TolX',1e-5,'Display','off');
  ncf2cdfx=fminbnd(@(x) spncf_(x,n1,n2,theta1,theta2,alpha),lob,hib,opt);
else % use bisection
  versuch=(lob+hib)/2; valid=0; TOL=1e-8;
  while (valid~=1)
    cdf=spncf(versuch,n1,n2,theta1,theta2);
    valid=(abs(cdf-alpha)<TOL);
    if (valid==1), ncf2cdfx=versuch;
    else
      if (cdf<alpha), lob=versuch; else hib=versuch; end
      versuch=(lob+hib)/2;
    end
  end
end
end % function

function disc=spncf_(x,n1,n2,theta1,theta2,alpha)
disc=abs(spncf(x,n1,n2,theta1,theta2,2) - alpha);
end % function
```

Program Listing 2.2: Computes the cutoff value of the (possibly doubly noncentral) $F$ distribution using its saddlepoint c.d.f. approximation. Continued from Listing 2.1.
