One‐way Repeated Measures


Consider the following fictional data on learning as a function of trial. For these data, six rats were observed in a Skinner box, and the time (in minutes) it took each rat to press a lever in the box was recorded. If the rat is learning the “press lever” response, then the time it takes the rat to press the lever should decrease across trials.

Learning as a Function of Trial (Hypothetical Data)

Rat           Trial 1     Trial 2     Trial 3     Rat Means
1              10.0         8.2         5.3         7.83
2              12.1        11.2         9.1        10.80
3               9.2         8.1         4.6         7.30
4              11.6        10.5         8.1        10.07
5               8.3         7.6         5.5         7.13
6              10.5         9.5         8.1         9.37
Trial means   M = 10.28   M = 9.18    M = 6.78

Notice that overall, the mean response time decreases across trials, from a mean of 10.28 on trial 1 to a mean of 6.78 on trial 3. For these data, each rat is essentially serving as its own “control,” since each rat is observed repeatedly across the trials. Again, this is what makes these data “repeated measures.” Notice that only 6 rats are used in the study. In a classic between-subjects design, each data point would represent an observation on a different rat, of which for these data there would be 18 such observations.

For our data, the dependent variable is response time measured in minutes, while the independent variable is trial. The data call for a one‐way repeated measures ANOVA. We wish to evaluate the null hypothesis that the means across trials are the same:

Null Hypothesis: Trial 1 Mean = Trial 2 Mean = Trial 3 Mean

Evidence to reject the null would suggest that somewhere among the above means, there is a difference between trials. Repeated measures ANOVA violates the assumption of independence between conditions, and so an additional assumption is required of such designs, the so-called sphericity assumption, which we will evaluate in SPSS.
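Stated a bit more formally (this is a standard textbook formulation rather than the original text), if μ1, μ2, μ3 denote the population mean response times at the three trials and Yi, Yj denote a rat's scores at trials i and j, then

\[
H_0:\ \mu_1 = \mu_2 = \mu_3, \qquad \text{sphericity:}\ \operatorname{Var}(Y_i - Y_j)\ \text{is the same for every pair of trials}\ i \neq j .
\]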

Entering data into SPSS is a bit different for a repeated measures design than it is for a classic between-subjects design. We enter the data as follows:
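(The book enters these values directly in the Data View, one column per trial; the screenshot is not reproduced here. As a sketch, the same layout can be created in syntax, using the variable names trial_1, trial_2, and trial_3 that appear in the GLM command later in this section.)

* One row per rat, one column per trial (values from the table above).
DATA LIST FREE / trial_1 trial_2 trial_3.
BEGIN DATA
10.0  8.2  5.3
12.1 11.2  9.1
 9.2  8.1  4.6
11.6 10.5  8.1
 8.3  7.6  5.5
10.5  9.5  8.1
END DATA.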

Notice that each column corresponds to the data for one trial. To analyze these data, we proceed as follows:

ANALYZE → GENERAL LINEAR MODEL→ REPEATED MEASURES

SPSS shows factor 1 as the default in the Within-Subject Factor Name box. We rename this to trial and enter 3 under Number of Levels, since there are three trials. Click Add, and the trial variable now appears in the box as trial(3).

Next, click on Define.

We will also obtain a plot of the means. Select Plots:

Finally, we will obtain a measure of effect size before going ahead with the analysis. Select Options.

Below we move trial over to the Display Means for window, and check off the box Compare main effects, with a Confidence interval adjustment equal to LSD (none). Then, to get the measure of effect size, check off Estimates of effect size.

Move trial_1, trial_2, and trial_3 over to the respective slots in the Within‐Subjects Variables (trial) window.

In the Repeated Measures: Profile Plots window, we move trial over to the Horizontal Axis, then click on Add so that trial appears in the Plots window at the bottom of the box. Click on Continue.

Click on Continue, then OK to run the analysis:

GLM trial_1 trial_2 trial_3
  /WSFACTOR=trial 3 Polynomial
  /METHOD=SSTYPE(3)
  /PLOT=PROFILE(trial)
  /EMMEANS=TABLES(trial) COMPARE ADJ(LSD)
  /PRINT=ETASQ
  /CRITERIA=ALPHA(.05)
  /WSDESIGN=trial.

SPSS first confirms for us that our within‐subjects factor has three levels to it.

Within-Subjects Factors
Measure: MEASURE_1

trial   Dependent Variable
1       trial_1
2       trial_2
3       trial_3

Next, SPSS gives us the multivariate tests for the effect:

Multivariate Tests(a)
Effect: trial

                       Value     F         Hypothesis df   Error df   Sig.   Partial Eta Squared
Pillai’s Trace          .942     32.251b   2.000           4.000      .003   .942
Wilks’ Lambda           .058     32.251b   2.000           4.000      .003   .942
Hotelling’s Trace     16.126     32.251b   2.000           4.000      .003   .942
Roy’s Largest Root    16.126     32.251b   2.000           4.000      .003   .942

a. Design: Intercept  Within Subjects Design: trial
b. Exact statistic

Multivariate tests are a bit more complicated to interpret than the univariate F-ratio and are discussed more extensively in this book’s chapter on MANOVA and discriminant analysis (Chapter 11). Multivariate models are defined by having more than a single response variable. Long story short, for our data, instead of conceiving of response time in minutes as a single response variable, we may instead conceive of the analysis as having three response variables, that is, the responses on trials 1, 2, and 3. What this means is that our analysis could conceivably be considered a multivariate ANOVA rather than a univariate repeated-measures ANOVA, and so SPSS reports the multivariate tests along with the ordinary univariate ones (to be discussed shortly). For now, we do not detail the meaning of these multivariate tests, nor give their formulas or how to interpret them in depth. We simply note that all four tests (Pillai’s trace, Wilks’ lambda, Hotelling’s trace, and Roy’s largest root) suggest the presence of a multivariate effect, since the p-value for each test is equal to 0.003 (under Sig.). Hence, coupled with the effect size estimate of partial eta-squared equal to 0.942, we have evidence that across trials, the mean response times are different in the population from which these data were drawn. Again, we will have more to say on what these multivariate statistics mean when we survey MANOVA later in this book. For now, the rule of thumb is that if p < 0.05 for these tests (or whatever significance level you choose to use), it indicates the presence of an effect.
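As a quick arithmetic check on this output (a general treatment is deferred to Chapter 11), note that when there is only a single within-subjects effect of this kind, only one eigenvalue is nonzero and the four statistics become simple functions of one another. Writing T for Hotelling’s trace,

\[
\Lambda = \frac{1}{1+T} = \frac{1}{17.126} = .058, \qquad
\text{Pillai} = \frac{T}{1+T} = .942, \qquad
F = T \times \frac{\text{error } df}{\text{hypothesis } df} = 16.126 \times \frac{4}{2} = 32.251 ,
\]

and the reported partial eta-squared of .942 coincides with Pillai’s trace in this case.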

SPSS next provides us with Mauchly’s test of sphericity:

Mauchly’s Test of Sphericity(a)
Measure: MEASURE_1

Within Subjects   Mauchly’s W   Approx.       df   Sig.   Epsilon(b)
Effect                          Chi-Square                Greenhouse-Geisser   Huynh-Feldt   Lower-bound
trial             .276          5.146         2    .076   .580                 .646          .500

a. Design: Intercept  Within Subjects Design: trial
   Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
b. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.

This test is given as a consequence of the analysis being a repeated-measures ANOVA rather than a usual between-subjects ANOVA. Sphericity is a rather complex subject and we do not detail it here; for details, see Kirk (1995). What you need to know is that if the test is not statistically significant, you have no reason to doubt the assumption of sphericity, which means, pragmatically, that you can interpret the univariate effects without worrying about a violation of that assumption. Had Mauchly’s test been statistically significant (e.g. p < 0.05), it would suggest that interpreting the univariate effects is problematic, and interpreting the multivariate effects (or the adjusted F-tests, see below) would usually be recommended instead. For our data, the test is not statistically significant, which means we can, at least in theory, go ahead and interpret the ensuing univariate effects with the unadjusted, traditional F-ratio. The right-hand side of the above output contains the information used to adjust the degrees of freedom if sphericity is violated, which we will now discuss.

A brief write-up of the full analysis might read as follows:

A repeated-measures ANOVA was conducted on trial, having three levels. All multivariate tests suggested rejecting the null hypothesis that mean learning times are equal across trials in the population from which the sample data were drawn: Pillai’s trace, Wilks’ lambda, Hotelling’s trace, and Roy’s largest root were all statistically significant (p = 0.003). Mauchly’s test was performed to evaluate the null hypothesis of sphericity across trials; there was insufficient evidence to suggest a violation of sphericity (p = 0.076). Univariate tests of significance on the trial factor rejected the null hypothesis of no mean trial differences (p < 0.001). Approximately 94% of the variance (partial η² = 0.936) in mean learning times can be accounted for by trial. The Greenhouse–Geisser test, a more conservative test that guards against a potential violation of sphericity, also rejected the null (p < 0.001). Tests of within-subjects contrasts to evaluate trend revealed that both a linear and a quadratic trend account for the trajectory across trials better than chance; however, a linear trend appears slightly preferable (p < 0.001) over a quadratic one (p = 0.004). A plot of trial means generally supports the conclusion of a linear trend. Pairwise comparisons revealed evidence for pairwise mean differences between all trials regardless of whether a Bonferroni correction was implemented.

SPSS next gives us the univariate tests:

Tests of Within-Subjects Effects
Measure: MEASURE_1

Source                            Type III Sum of Squares   df       Mean Square   F        Sig.   Partial Eta Squared
trial          Sphericity Assumed        38.440              2         19.220      72.620   .000   .936
               Greenhouse-Geisser        38.440              1.160     33.131      72.620   .000   .936
               Huynh-Feldt               38.440              1.292     29.750      72.620   .000   .936
               Lower-bound               38.440              1.000     38.440      72.620   .000   .936
Error(trial)   Sphericity Assumed         2.647             10           .265
               Greenhouse-Geisser         2.647              5.801       .456
               Huynh-Feldt                2.647              6.461       .410
               Lower-bound                2.647              5.000       .529

We can see that for trial, we have evidence to reject the null hypothesis, since p < 0.05 (Sig. = 0.000).

Partial eta-squared is equal to 0.936, meaning that approximately 94% of the variance in response time can be explained by trial. Notice that SPSS reports four different tests: (i) sphericity assumed, (ii) Greenhouse–Geisser, (iii) Huynh–Feldt, and (iv) lower bound. Since we did not find evidence to reject the assumption of sphericity, we would be safe, theoretically at least, in interpreting the “sphericity assumed” line. However, since Mauchly’s test is fairly unstable and largely influenced by distributional assumptions, many specialists in repeated measures often recommend simply reporting the Greenhouse–Geisser result, regardless of the outcome of Mauchly’s. For details on how the Greenhouse–Geisser test works, see Denis (2016). For our applied purposes, notice that the degrees of freedom for G–G are equal to 1.160 in the numerator and 5.801 in the denominator. These degrees of freedom are smaller than they are for sphericity assumed. Greenhouse–Geisser effectuates a bit of a “punishment” on the degrees of freedom if sphericity cannot be assumed, making it a bit more difficult to reject the null hypothesis. Even though the F-ratios are identical for sphericity assumed and Greenhouse–Geisser (both are equal to 72.620), the p-values are not equal. We cannot see this from the output because it appears both are equal to 0.000, but if you double-click on the p-values, you will get the following for sphericity assumed versus Greenhouse–Geisser:

Sphericity Assumed:    Sig. = 0.000001
Greenhouse–Geisser:    Sig. = 0.000143

Notice that the p-value for the Greenhouse–Geisser is larger than the p-value for sphericity assumed. This is because, as a result of the “punishment,” it is more difficult to reject the null under the G–G. For our data, it makes no difference in terms of our decision on the null hypothesis, since both p-values are very small, much less than the customary 0.05, and so regardless of which we interpret, we reject the null hypothesis.
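Two quantities in this output are easy to reproduce by hand. Partial eta-squared is the trial sum of squares divided by the sum of the trial and error sums of squares, and the Greenhouse–Geisser degrees of freedom are simply the sphericity-assumed degrees of freedom multiplied by the epsilon estimate reported in the Mauchly table (ε̂ = .580; the “lower-bound” epsilon is 1/(k − 1) = .500 for k = 3 trials):

\[
\eta_p^2 = \frac{SS_{\text{trial}}}{SS_{\text{trial}} + SS_{\text{error}}} = \frac{38.440}{38.440 + 2.647} = .936, \qquad
df_1 = \hat{\varepsilon}(k-1) = .580 \times 2 = 1.160, \qquad
df_2 = \hat{\varepsilon}(k-1)(n-1) = .580 \times 10 \approx 5.80 .
\]

If you want to verify the exact p-values that appear on double-clicking, the F cumulative distribution function available in COMPUTE can be used; this is only a check, not part of the book’s procedure, and it assumes an active dataset with at least one case.

* Tail probabilities for F = 72.620 under the two sets of degrees of freedom.
COMPUTE p_sph = 1 - CDF.F(72.620, 2, 10).
COMPUTE p_gg  = 1 - CDF.F(72.620, 1.160, 5.801).
EXECUTE.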

Next, SPSS presents us with tests of within‐subjects contrasts:

Tests of Within-Subjects Contrasts
Measure: MEASURE_1

Source         trial       Type III Sum of Squares   df   Mean Square   F        Sig.   Partial Eta Squared
trial          Linear          36.750                 1     36.750      79.891   .000   .941
               Quadratic        1.690                 1      1.690      24.375   .004   .830
Error(trial)   Linear           2.300                 5       .460
               Quadratic         .347                 5       .069

Interpreting these tests is optional. They merely evaluate whether the trial means tend to increase or decrease in a linear or other trend. According to the output, evidence for a linear trend is slightly more convincing than that for a quadratic trend, since the p‐value for the linear trend is equal to 0.000, while the p‐value for the quadratic trend is equal to 0.004. When we couple this with the plot that we requested, we see why:

[Profile plot: Estimated Marginal Means of MEASURE_1 plotted against trial (1, 2, 3); the vertical axis runs from roughly 7.00 to 10.00.]

We see from the plot that from trials 1 to 3, the mean response time decreases in a somewhat linear fashion (i.e. the plot almost resembles a line).
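For readers who want to see where the contrast sums of squares in the Tests of Within-Subjects Contrasts table come from, they can be reproduced from the trial means (10.28, 9.18, 6.78) using the standard orthogonal polynomial coefficients for three levels, (−1, 0, 1) for the linear trend and (1, −2, 1) for the quadratic, with n = 6 rats:

\[
SS_{\text{linear}} = \frac{n\left(\sum_j c_j \bar{Y}_j\right)^2}{\sum_j c_j^2}
= \frac{6(-10.28 + 6.78)^2}{2} = 36.75, \qquad
SS_{\text{quadratic}} = \frac{6\left(10.28 - 2(9.18) + 6.78\right)^2}{6} = 1.69 ,
\]

matching the 36.750 and 1.690 reported in the table.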

Next, SPSS provides us with the between‐subjects effects:

Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source      Type III Sum of Squares   df   Mean Square   F         Sig.   Partial Eta Squared
Intercept   1378.125                  1    1378.125      193.457   .000   .975
Error         35.618                  5       7.124

The above is where we would see any between-subjects variables that we included in the analysis. For our data, we have no such variables, since “trial” is the only variable under study. However, the error term sum of squares of 35.618 on 5 degrees of freedom is, in this case, actually the effect of the subjects variable. To see this, and merely for demonstration (you would not actually do this in a formal analysis; we will not even get p-values), let us redo the analysis such that we devote a column to the subjects variable:

Let us now try running the analysis as before, but this time also designating subject as a between-subjects variable:
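The pasted syntax is not shown in the text; a minimal sketch of what this GUI specification would produce (assuming the new column is simply named subject) is:

GLM trial_1 trial_2 trial_3 BY subject
  /WSFACTOR=trial 3 Polynomial
  /METHOD=SSTYPE(3)
  /WSDESIGN=trial
  /DESIGN=subject.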

When we run the above analysis, we get the following output for the between-subjects effects:

Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source      Type III Sum of Squares   df   Mean Square   F   Sig.   Partial Eta Squared
Intercept   1378.125                  1    1378.125      .   .      1.000
subject       35.618                  5       7.124      .   .      1.000
Error           .000                  0        .

Notice that the sum of squares of 35.618 for subject in this new table, along with its degrees of freedom and mean square, mirrors the output we obtained above for the error term. Hence, what SPSS is designating as error in this simple case is, in fact, the effect due to subjects for this one-way repeated measures ANOVA. Had we included a true between-subjects factor, SPSS would have partitioned this subject variability accordingly by whatever factor we included in our design. The important point to note from all this is that SPSS partitions effects in repeated measures into “within subjects” and “between subjects,” and any between-subjects factors we include in our design will be found in the Tests of Between-Subjects Effects output. We will demonstrate this with an example shortly in which we include a true between-subjects factor.
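As a further check on this equivalence, the subject sum of squares can be computed directly from the rat means in the data table (grand mean = 8.75, with k = 3 scores per rat; unrounded rat means are used here):

\[
SS_{\text{subjects}} = k \sum_{i=1}^{6} \left(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}\right)^2
= 3\left[(-0.917)^2 + (2.050)^2 + (-1.450)^2 + (1.317)^2 + (-1.617)^2 + (0.617)^2\right] \approx 35.618 ,
\]

which matches the error sum of squares in the first between-subjects table and the subject sum of squares in the second.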

To conclude our analysis, we move on to interpreting the requested pairwise comparisons:

Pairwise Comparisons
Measure: MEASURE_1

(I) trial   (J) trial   Mean Difference (I-J)   Std. Error   Sig.(b)   95% Confidence Interval for Difference(b)
                                                                       Lower Bound     Upper Bound
1           2             1.100*                 .153         .001        .707            1.493
            3             3.500*                 .392         .000       2.493            4.507
2           1            –1.100*                 .153         .001      –1.493            –.707
            3             2.400*                 .297         .000       1.637            3.163
3           1            –3.500*                 .392         .000      –4.507           –2.493
            2            –2.400*                 .297         .000      –3.163           –1.637

Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).

As we can see from the above, we have evidence to suggest that the means of all trials are different from one another. The table compares trial 1 with trial 2, trial 1 with trial 3, and so on, all having p-values of less than 0.05 (a Bonferroni correction would have yielded the same decisions on the null hypotheses, which we will demonstrate in a moment). SPSS also provides us with confidence intervals for the pairwise differences. For example, the first confidence interval has a lower limit of 0.707 and an upper limit of 1.493, which means that in 95% of samples drawn from this population, the true mean difference is expected to lie between these extremes.
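As a check, those limits follow from the mean difference and its standard error using the critical t on 5 degrees of freedom (t ≈ 2.571):

\[
1.100 \pm 2.571 \times 0.153 \;\Rightarrow\; (0.707,\ 1.493).
\]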

Had we wanted to perform a Bonferroni adjustment on the post hoc, we could have selected the Bonferroni correction from the GUI window or simply entered the syntax below.
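The pasted syntax itself is not reproduced here; a minimal sketch (identical to the earlier GLM command except that the adjustment keyword changes from LSD to BONFERRONI) would be:

GLM trial_1 trial_2 trial_3
  /WSFACTOR=trial 3 Polynomial
  /METHOD=SSTYPE(3)
  /EMMEANS=TABLES(trial) COMPARE ADJ(BONFERRONI)
  /PRINT=ETASQ
  /CRITERIA=ALPHA(.05)
  /WSDESIGN=trial.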
