$\bar{X}_{\cdot\cdot\cdot}$: overall mean of all the scores.
With this notation, if we only test for the main effect of factor B (similarly for
the main effect of factor A), the null and alternative hypotheses can be written as
• H0(B): µj = µ for all j
• H1(B): µj ≠ µ for at least one j
The Mean Square Between, $MS_b^B$, for the $J$ cells of factor B is the average of the estimated variance of the estimated column means (thus ignoring factor A). That is,

$$MS_b^B = \frac{SS_b^B}{J - 1}$$

The ratio of the two mean squares, $F = MS_b^B / MS_w$, has an F distribution with (J − 1) degrees of freedom in the numerator and (N − IJ) degrees of freedom in the denominator, the latter being the total number of observations N minus the total number of cells, IJ.
Interaction Between Factors. Unfortunately, the main effects are not sufficient to answer questions such as, “Does the effect of factor A remain the same at different levels of factor B?” For example, in some observations of Experiment One the main effect of the visibility factor is that it significantly affects the subjects’ path length: The invisible task results in longer path lengths compared to the visible task. However, we notice that the subjects’ scores on the visibility factor are also affected by the interface factor: Namely, in the physical test (in the booth) the visibility factor has no significant effect on the path length, whereas in the virtual test the visibility factor has a significant effect on the path length. This suggests that the tests on main effects may be missing such interaction effects. The latter can be tested by the following formulas of interaction (for details refer ...):
$$MS_{AB} = \frac{SS_{AB}}{(I - 1)(J - 1)} \qquad (7.14)$$

$$F = \frac{MS_{AB}}{MS_w}$$

where $MS_{AB}$ represents the Interaction Mean Square between factors A and B. The ratio has an F distribution with (I − 1)(J − 1) degrees of freedom in the numerator and (N − IJ) degrees of freedom in the denominator.
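The F ratios for the two main effects and for the interaction can be sketched in a few lines of code. The sketch below assumes a balanced design (the same number of scores in every cell); the function name `two_way_anova_f` and the scores are made up for illustration and are not the Experiment One data.

```python
from statistics import mean

def two_way_anova_f(cells):
    """Return F ratios for a balanced two-way design.

    cells[i][j] holds the scores for level i of factor A and level j of
    factor B; every cell is assumed to contain the same number of scores.
    """
    I, J, n = len(cells), len(cells[0]), len(cells[0][0])
    N = I * J * n
    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in cells[i] for x in cell) for i in range(I)]
    b_means = [mean(x for i in range(I) for x in cells[i][j]) for j in range(J)]
    cell_means = [[mean(cells[i][j]) for j in range(J)] for i in range(I)]

    # Sums of squares for factor A, factor B, their interaction, and error.
    ss_a = J * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = I * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(I) for j in range(J)
    )
    ss_w = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(I) for j in range(J) for x in cells[i][j]
    )

    ms_w = ss_w / (N - I * J)  # Mean Square Within, N - IJ degrees of freedom
    return {
        "F_A": ss_a / (I - 1) / ms_w,
        "F_B": ss_b / (J - 1) / ms_w,
        "F_AB": ss_ab / ((I - 1) * (J - 1)) / ms_w,
    }

# Made-up scores: rows are levels of factor A, columns are levels of factor B.
example = [[[1, 3], [5, 7]],
           [[2, 4], [10, 12]]]
f_ratios = two_way_anova_f(example)
```

Each F ratio would then be compared with the critical value of the F distribution for the corresponding degrees of freedom; that lookup is left out of the sketch.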
In the case considered now, the results of the F test may show that the effect of the visibility factor on the path length depends on the type of interface utilized in a given task. In other words, there is an interaction between the visibility factor and the interface factor. One way to express interactions is by saying that one effect is modified (qualified) by another effect.

When the data indicate an interaction between factors, the notion of a main effect has no meaning. In such cases, tests of simple effects can be more useful than tests of main effects. Simple effect tests are done via one-way analysis of variance across levels of one factor, performed separately at each level of the other factor. For example, even if we suspect an interaction between the visibility factor and the interface factor, we might undertake simple effect tests for the visibility factor separately at the virtual and physical level, respectively, and see what kind of conclusions can be made based on the results.
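As a rough illustration of simple effect tests, the sketch below runs a one-way analysis of variance for the visibility factor separately at each level of the interface factor. The helper `one_way_f` and the path-length scores are hypothetical; only the F ratios are computed, not the p-levels.

```python
from statistics import mean

def one_way_f(groups):
    """F ratio of a one-way ANOVA; groups is a list of lists of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_b = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_w = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_b / (k - 1)) / (ss_w / (n_total - k))

# Hypothetical path-length scores keyed by (interface, visibility).
scores = {
    ("virtual", "visible"): [10, 12],
    ("virtual", "invisible"): [20, 22],
    ("physical", "visible"): [11, 13],
    ("physical", "invisible"): [12, 14],
}

# Simple effect of visibility, tested separately at each interface level.
simple_effects = {
    interface: one_way_f(
        [scores[(interface, vis)] for vis in ("visible", "invisible")]
    )
    for interface in ("virtual", "physical")
}
```

With these made-up numbers the F ratio is large at the virtual level and small at the physical level, which is the kind of pattern an interaction would produce.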
7.4.5 Implementation: Two-Way Analysis for Path Length
We are now ready to perform the analysis of variance on the Experiment One data. From other tests above, we already know that the direction factor has a significant effect on the path length. We know, further, that the left-to-right task is significantly easier for the subjects (it results in shorter paths) than the right-to-left task. We now want to analyze the combined effect of visibility and interface factors on the subjects’ performance. Even though the underlying data are not known to obey the normal distribution, we justify using the ANOVA by the F test being known to be robust.

The data set has been first separated into the LtoR and RtoL data sets. The ANOVA variables are:
• Dependent variable: Path length
• Independent variables:
  1. Visibility factor, with two levels: visible and invisible
  2. Interface factor, with two levels: virtual and physical
In the tables of results that appear here, the following terms are used:
df effect: degrees of freedom for a given effect, including main and interaction effects.
MS effect: Mean Square for an effect, including main and interaction effects.
df error: degrees of freedom for the error variance, or Mean Square Within.
MS error: Mean Square for the error variance, or Mean Square Within.
Rows with the effect names “1” and “2” correspond to main effects.
Rows with more than one digit in the name, such as “12” or “123,” relate to the corresponding interaction effects.
Results. For the left-to-right task, the summary of ANOVA results appears in Table 7.9. The p-levels for the visibility and interface factors are about 0.01, and the p-level for the interaction is much greater than 0.01. This means the main effects of both factors are slightly significant, and there is no interaction. We therefore conclude that for both visible and invisible environments, the path length is affected only slightly by the interface factor. This reconciles with our knowing that for the physical task the path length is slightly shorter than for the virtual task. And, for both physical and virtual tasks the path length is only slightly affected by the visibility factor. Again, this reconciles with our knowing that in the visible environment the path length is slightly shorter than in the invisible environment.
TABLE 7.9 ANOVA Results for Path Length: Interface and Visibility Factors; LtoR Task
ANOVA Effects Studied: 1 = interface, 2 = visibility

The summary of ANOVA results for the right-to-left task appears in Table 7.10. Here the p-levels for the visibility factor, the interface factor, and the interaction are all greater than 0.01. Therefore, the main effects make no significant difference for the dependent variable, and there is no interaction. The conclusion is that for either the visible or invisible environments, the path length for the physical task is not significantly different from the path length in the virtual task. Also, in either of the physical or virtual tasks, the path length in the visible environment does not significantly differ from the path length in the invisible environment.
7.4.6 Implementation: Two-Way Analysis for Completion Time
In the previous section we have analyzed the effects of test factors on the length of paths generated by the human subjects in Experiment One. We will now analyze how these same factors affect another performance indicator, the task completion time.

Each completion time score is random and independent (for the 48 subjects tested here); this meets the “sampling assumption” of nonparametric statistics and analysis of variance. Even though a closer look at the completion time data shows that they do not obey a normal distribution (as the ANOVA assumption requires), we still use ANOVA, counting on the F test known to be robust.

To analyze the effect of all factors on the completion time data, a three-way analysis of variance has been done. The ANOVA variables are as follows:
• Dependent variable: Completion time
• Independent variables:
  1. Direction factor, with two levels: LtoR and RtoL
  2. Visibility factor, with two levels: visible and invisible
  3. Interface factor, with two levels: virtual (simulation) and physical (booth).

Second, since we are more interested in the visibility factor and interface factor, and since the performance in the LtoR task significantly differs from that in the RtoL task, a two-way ANOVA was implemented. The ANOVA variables are:
• Dependent variable: Completion time
• Independent variables:
  1. Visibility factor, with two levels: visible and invisible
  2. Interface factor, with two levels: virtual and physical
Results. The summary of ANOVA results for all three factors used in Experiment One appears in Table 7.11. The p-levels for the interface factor, the direction factor, and the interaction between them are less than 0.01. This means that these two main effects likely significantly affect the dependent variable (completion time), and there is interaction between them. The p-levels for the remaining main effect, visibility, and for interactions with this factor are greater than 0.01. This means that there is no significant difference for these effects and interactions. However, given that an interaction has been detected, we should not be forming any conclusions from the results in Table 7.11 until we separate the factor levels.
TABLE 7.11 ANOVA Results for Completion Time: Direction, Interface, and Visibility Factors
ANOVA Effects Studied: 1 = interface, 2 = visibility, 3 = direction

... the visibility factor is about 0.01, and the p-level for the interaction is greater than 0.01. Therefore, the main effect of interface is statistically significant, the main effect of visibility is slightly significant, and there is no interaction between them. This reconciles with our knowledge that for both visible and invisible tasks, the completion time for the physical task is significantly shorter than for the virtual task. Similarly, for both physical and virtual tasks the completion time is slightly shorter in the visible environment than in the invisible environment.

The summary of ANOVA results for the right-to-left task appears in Table 7.13. The p-level for the interface factor is smaller than 0.01; the p-levels for the visibility factor and interaction are greater than 0.01. This means the main effect of the interface factor is statistically significant, and there is no interaction.

TABLE 7.13 ANOVA Results for Completion Time: Interface and Visibility Factors, RtoL Task
ANOVA Effects Studied: 1 = interface, 2 = visibility

7.5 RESULTS: EXPERIMENT TWO
Recall that Experiment Two was designed to analyze the effect of subjects’ training and the related effect of the visibility factor on human performance. A total of 12 subjects appeared in this study. In the first group, which included six subjects, on day 1 each subject was subjected to six different training tasks, plus one test task at the end, all in the visible environment. About one week later, on day 2, the same subjects performed the same six training tasks, plus the same test task, this time in the invisible environment. In the second group, the remaining six subjects did the same tasks in the opposite order; that is, tests in the invisible environment on day 1 and tests in the visible environment on day 2. The specific task was right-to-left movement of the arm, the same as in Experiment One (recall that this is a more difficult task compared to the left-to-right task).
We therefore have a training factor Day, with two levels, day 1 and day 2.
Subjects were expected to learn the motion planning skill through a repeated exercise.

Similar to Experiment One, human performance was measured by the path length and completion time for each of the tasks, denoted Path and Time. Path length is the measure of motion generated by the arm manipulator during the task. Completion time is the time it takes the subject to complete the task. Both measure the subjects’ proficiency in carrying out motion planning. We suppose that both the path length and the completion time may be affected by such factors as training and visibility of the scene, and we would like to quantify those effects.

In statistical terms, the training and visibility factors are independent variables, whereas the path length and completion time are dependent variables. The objective of data analysis is to test whether the training and/or visibility factor improves the overall human performance in motion planning. If in terms of both dependent variables the improvement in subjects’ performance turns out to be significant, follow-up tests on the separate effects on human performance should be conducted, to explain which specific aspects of human performance are responsible for such effects. Multivariate analysis of variance (MANOVA) is a good technique for data analysis of overall effects [128].
Multivariate analysis of variance is conceptually a straightforward extension of the univariate ANOVA technique described above. Their major distinction is that if in ANOVA one evaluates mean differences on a single dependent variable, in MANOVA one evaluates mean vector differences simultaneously on two or more dependent variables. In addition, the MANOVA design accounts for the fact that dependent variables may be correlated. For instance, two dependent variables in Experiment Two, the path length and completion time, are indeed relatively highly correlated, with the correlation coefficient 0.79. In this case, MANOVA should provide a distinct advantage over separate ANOVAs. In fact, performing separate ANOVA tests carries an implicit assumption that either the dependent variables are uncorrelated or such correlations are of no importance.
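For reference, a correlation coefficient between two dependent variables could be computed as in the sketch below. It assumes the Pearson product-moment coefficient is meant (the text does not say which coefficient was used), and the path and time values are invented.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between paired score lists x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented scores: one path length and one completion time per subject.
path = [1.0, 2.0, 3.0, 4.0]
time = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(path, time)
```

A value of r near 1 or −1 means the two measures carry largely overlapping information, which is the situation where a joint (multivariate) test is preferable to separate univariate tests.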
7.5.1 The Technique
Assumptions. The first and partly second of the three following assumptions are required by MANOVA (and are the same for the statistical tests considered above):
1. Observation scores are randomly sampled from the population of interest. Observations are statistically independent of one another.
2. Dependent variables have a multivariate normal distribution within each group of interest. This means that (a) each dependent variable is distributed normally; (b) any linear combination of the dependent variables is distributed normally as well; (c) all subsets of the variables have a multivariate normal distribution. In practice, it is unlikely that this and the next assumption are met precisely. Fortunately, similar to ANOVA, MANOVA is relatively robust to violations of these assumptions. In practice, MANOVA tends to perform well regardless of whether or not the data violate these assumptions.
3. Homogeneity of covariance matrices. That is, all groups of data are assumed to have a common within-group population covariance matrix. This can be likened to the assumption in ANOVA of homogeneity of variance for each dependent variable, or the assumption that correlation between any two dependent variables must be the same in all groups. If the number of subjects is approximately the same in the experimental groups, a violation of the assumption of covariance matrix homogeneity leads to a slight reduction in statistical power [128–130].
Multivariate Null Hypothesis. Hypotheses in MANOVA are very similar to those in univariate ANOVA, except that vectors of means are considered instead of single values (scalars) of means. For a simple example, imagine we carry out a one-way MANOVA for a visible task group and an invisible task group. We would like to know if the scores of path length and completion time came from the same population that includes visible and invisible task data. That is, we want to compare the population mean vector for the dependent variables for one group with the population mean vector for the dependent variables for another group.

Suppose $\mu_{ij}$ represents the mean of the dependent variable $i$ for group $j$, $i = 1, 2$, $j = 1, 2$. The mean vector for group $j$ can be written as

$$\boldsymbol{\mu}_j = \begin{pmatrix} \mu_{1j} \\ \mu_{2j} \end{pmatrix}$$

The null hypothesis $H_0$ says that the population mean vectors of the groups are equal:

$$H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2$$

The alternative hypothesis $H_1$ in this case says that for at least one variable there is at least one group with a population mean different from that in the other group(s):

$$H_1: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2$$
Calculating MANOVA Test Statistics. Derivation of the MANOVA test statistics is similar to that in ANOVA but involves relatively cumbersome matrix operations and equations. Hence we will limit the discussion to a conceptual level (see Ref. 130 for more detail).

Recall that the ANOVA attempts to test if the amount of variance explained by the independent variable (namely, $SS_b$, see Section 7.4.3) exceeds significantly the variance that has not been explained (namely, $SS_w$). The variance here is a function of the sum of squares of deviations from the mean for an entire group (the latter being called the sum of squares, SS). The ANOVA’s F statistic is a ratio of the mean square between, $MS_b$, to the mean square within, $MS_w$.

Instead of scalars of dependent variables, MANOVA employs a vector of dependent variables. A single sum of squares is replaced with a complete (total) matrix of sums of squares and cross-products, $SP_t$. Along its diagonal the matrix has the sums of squares that represent variances for all dependent variables, and in its off-diagonal elements it has cross-products that represent covariances of variables. Just as in univariate ANOVA, MANOVA divides matrix $SP_t$ into the within-group matrix, $SP_w$, and the between-group matrix, $SP_b$. From algebra, the matrix determinant expresses the amount of generalized variance, or the total variability that is present in the underlying data and is expressed through the dependent variables. One can hence compare the generalized variance of one matrix with another.
Wilks’ lambda test is perhaps the most widely used statistical test of multivariate mean differences [130]. It derives from the following idea. Since matrix $SP_b$ represents the amount of explained variance and covariance, and matrix $SP_w$ represents the remaining variance and covariance, in the case of a significant effect one would expect matrix $SP_b$ to have a larger generalized variance compared to matrix $SP_w$. Wilks’ lambda index, $\Lambda$, is defined as a ratio of determinants of the within-group matrix and the total matrix:

$$\Lambda = \frac{|SP_w|}{|SP_t|} = \frac{|SP_w|}{|SP_b + SP_w|}$$

We associate the value of $\Lambda$ with the effect’s significance. The value $\Lambda$ can also be interpreted as the proportion of unexplained variance. The main effects and interaction effects in multiple-way MANOVA are conceptually the same as those in ANOVA. While computations are more complex in MANOVA, their underlying logic is the same as in ANOVA.
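To make the idea concrete, here is a small sketch that computes Wilks’ lambda for a one-way design with two dependent variables, using the standard definition as the determinant of the within-group matrix divided by the determinant of the total matrix. All function names and the (path, time) scores are illustrative assumptions; the sketch is limited to 2 × 2 matrices so that it needs no matrix library.

```python
def mean_vec(rows):
    """Mean vector of a list of (path, time) pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def sscp(rows, center):
    """2x2 matrix of sums of squares and cross-products about center."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - center[0], r[1] - center[1])
        for a in range(2):
            for b in range(2):
                m[a][b] += d[a] * d[b]
    return m

def det2(m):
    """Determinant of a 2x2 matrix (the generalized variance)."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def wilks_lambda(groups):
    """Ratio det(SP_w) / det(SP_t) for a list of groups of (path, time) pairs."""
    all_rows = [r for g in groups for r in g]
    sp_t = sscp(all_rows, mean_vec(all_rows))   # total matrix
    sp_w = [[0.0, 0.0], [0.0, 0.0]]             # within-group matrix
    for g in groups:
        s = sscp(g, mean_vec(g))
        for a in range(2):
            for b in range(2):
                sp_w[a][b] += s[a][b]
    return det2(sp_w) / det2(sp_t)

# Invented scores: the second group is shifted along the first variable.
visible = [(0, 0), (1, 0), (0, 1), (1, 1)]
invisible = [(2, 0), (3, 0), (2, 1), (3, 1)]
lam = wilks_lambda([visible, invisible])
```

Values of the index close to 1 indicate that the grouping explains little of the variability; values close to 0 indicate a strong effect. Converting the index into a p-level requires an additional approximation that is not shown here.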
If an overall significant multivariate effect is found, the next natural step is to submit the data to further testing, to see whether all dependent variables or some specific dependent variables are affected by the independent variables. Performing multiple univariate ANOVAs for each of the dependent variables is a common method for interpreting the respective effects. One attempts to identify specific dependent variables that contributed to the overall significant effect.
Repeated Measures MANOVA. In our statistical tests so far, all independent variables involved in ANOVA and MANOVA were also between-subjects variables (or factors); we were interested in differences between means or mean vectors of several distinct groups of subjects. The observed scores were independent of each other at different levels of the between-subjects variables.

However, in Experiment Two we also want to study the difference in responses of the same subjects before and after treatment; in our case, treatment is training. This variable is called repeated measures, and its analysis is called repeated measures MANOVA. In a repeated measures design the several response variables are results of the same test carried out by the same subjects, applied a number of times or under more than one experimental condition. For example, in Experiment Two each subject was assessed as to their path length and completion time on day 1 and again on day 2. The variable “day” is a repeated measures variable, as well as a within-subjects variable.

In other words, a between-subjects variable is a grouping variable, similar to the visibility or interface in our study, whereas a within-subjects variable refers to measurements taken on the same subjects at every level of that variable. For example, a within-subjects variable may be “time,” or “day,” or “training factor.” A study can involve both within- and between-subjects independent variables. Our Experiment Two analysis constitutes a 2 (days) by 2 (visibility levels) repeated measures MANOVA, or repeated measures ANOVA. The first independent variable, day, is a within-subjects (repeated measures) variable, and the second independent variable, visibility, is a between-subjects variable.
Repeated measures MANOVA is an extension of the standard MANOVA. The underlying principles of both are almost the same. In the standard MANOVA, vectors of means are compared across the levels of independent variables. In the repeated measures MANOVA, vectors of mean differences are compared across the levels of independent variables.

Mean differences are the differences in values of dependent measures between levels of the within-subjects variable. These can be seen as new dependent variables. If, for example, the dependent variables were measured for each subject at four different time moments, say at times T1 through T4, these original four variables would be transformed to three alternative derived difference variables, denoted (T1–T2), (T2–T3), and (T3–T4). These three new variables directly address the questions of interest. The repeated measures MANOVA, therefore, compares the vectors of means across the new transformed variables, not the original scores.
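The transformation from original scores to difference variables can be sketched as follows. The function names and the completion times are hypothetical; each row holds one subject's scores at times T1 through T4.

```python
def successive_differences(scores):
    """Map each subject's k repeated scores to k-1 differences (T1-T2, T2-T3, ...)."""
    return [
        [earlier - later for earlier, later in zip(row, row[1:])]
        for row in scores
    ]

def column_means(rows):
    """Mean of each derived difference variable across subjects."""
    return [sum(col) / len(col) for col in zip(*rows)]

# Hypothetical completion times for two subjects at T1..T4.
times = [[10, 8, 7, 7],
         [12, 11, 9, 8]]
diffs = successive_differences(times)   # [[2, 1, 0], [1, 2, 1]]
mean_diffs = column_means(diffs)        # [1.5, 1.5, 0.5]
```

The vector `mean_diffs` is the kind of mean-difference vector that would then be compared across the levels of the between-subjects variables.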
When conducting a repeated measures MANOVA, a sphericity assumption must be met. It requires that the covariance matrix for the transformed variables be a diagonal matrix. That is, the values (variances) along the diagonal of the transformed covariance matrix should be equal, and all the off-diagonal elements (covariances) should be zeros. The purpose of the sphericity assumption is to ensure the homogeneity of covariance matrices for the new transformed variables [131, 132].
7.5.2 Implementation Scheme
Experiment One. Recall that in Experiment One the observation scores in each task were measured on two dependent variables, path length and completion time. Subjects have been randomly selected, and sets of scores were mutually independent. Further, the two dependent variables were correlated, with the correlation coefficient 0.74. We take this correlation into account when performing the significance test, since the overall set of dependent variables may contain more information than each of the individual variables. This suggests that the Experiment One data can be a candidate for a multivariate analysis of variance, MANOVA.

Since, as discussed in the previous section, the effect of the direction factor in Experiment One is statistically significant, we separately perform two sets of MANOVAs, one for the left-to-right task and the other for the right-to-left task. When performing MANOVA for the left-to-right task, the data set forms a two-way array, 2 (visibility) × 2 (interface). For the right-to-left task, the data set also forms a two-way array, 2 (visibility) × 2 (interface). The results of analysis should answer questions such as: (1) Does human performance improve in the visible environment compared to the invisible environment? (2) Does human performance improve in a test with the physical arm manipulator as compared to the virtual arm manipulator? (3) Does the effect of the visibility factor work across the levels of the interface factor?
TABLE 7.14 Descriptive Statistics for the Data in Experiment Two
Columns: Variable/Task, Valid N, Mean, Minimum, Maximum, Std. Dev.
Experiment Two. Similar to Experiment One, the Experiment Two data for each task were recorded for two dependent variables, path length and completion time. Subjects were randomly selected, and the sets of scores were independent of each other. The correlation coefficient of the two dependent variables is 0.79. This correlation suggests that each dependent variable contains some new information as well as some information overlapping with the other dependent variables. Accounting for this correlation allows us to test the significance of dependent variables in human performance. Since the data in Experiment Two correspond to the same subjects on day 1 and day 2, the day factor is a repeated measures variable with two levels, day 1 and day 2. These data call for a repeated measures MANOVA. The Experiment Two data form a two-way array, 2 (day) × 2 (visibility). If any main effects or interaction effects are identified, multiple univariate ANOVAs would be performed, in order to observe the effects on each dependent variable.

In our data analysis we are interested in these questions: (1) Is there an improvement in human performance across day 1 and day 2? (2) Is there a statistically significant difference in human performance in the visible as opposed to invisible environment? and (3) Does the effect of one independent variable change over the levels of another independent variable?

Combined Experiment One and Two. There is another data set that we can use to test the effects of training and visibility. The first half of the data (12 subjects) in this new combined data set was extracted from Experiment One. Six of these were randomly picked among the virt–vis–RtoL data, and another six were randomly picked among the virt–invis–RtoL data. The second half (12 subjects) of the data in the new combined set are the day 2 data from Experiment Two.
The purpose of MANOVA or ANOVA analysis on this combined data set is to test whether human performance would show improvement from day 0, when the subjects executed tasks without any training (in Experiment One), to day 2, when subjects had the benefit of several training trials (in Experiment Two). The effect of visibility would also be tested here. Note, however, that the day factor in this analysis is not a repeated measures variable any longer but instead a between-subjects variable. This is because there are no pairs of data for day 0 and day 2 coming from the same subjects (which would be required by the definition of a repeated measures variable). The data for day 0 are independent with respect to the data for day 2. Therefore, the new combined data form a two-way array, 2 (day) × 2 (visibility).
7.5.3 Results and Interpretation
1. The MANOVA scheme was applied to the left-to-right data in Experiment One. The variables involved are as follows:
• Dependent variables:
  1. Path length
  2. Completion time
• Independent variables:
  1. Interface, with 2 levels: virtual and physical
  2. Visibility, with 2 levels: visible and invisible
The results are shown in Table 7.15. Here df is the degrees of freedom (see Section 7.4.5). Note that the p-level for interaction between the two independent variables is high. We thus conclude that there is no interaction effect. This means that the effect of one independent variable does not change across the levels of the other independent variable. The p-level for the interface factor is almost zero. We therefore reject the null hypothesis of the MANOVA, and we conclude that the interface factor has a statistically significant effect on the overall human performance. The p-level for the visibility factor shows that the overall human performance is only slightly improved in the visible environment compared to the invisible environment.

Given a significant effect indicated by the MANOVA results, multiple univariate ANOVAs have followed.

TABLE 7.15 Results of MANOVA for LtoR Task, Experiment One
MANOVA Effects Studied: 1 = interface, 2 = visibility

2. The MANOVA was applied to the right-to-left data in Experiment One. The variables are as follows:
• Dependent variables:
  1. Path length
  2. Completion time
• Independent variables:
  1. Interface, with 2 levels: virtual and physical
  2. Visibility, with 2 levels: visible and invisible
The results are shown in Table 7.16. Note that the p-level for the interaction effect between the two independent variables is large; we can conclude that there is no interaction effect. This means that the effect of one independent variable is not influenced by the other independent variable. The p-level for the interface factor is almost zero, hence we reject the null hypothesis of MANOVA. That is, the interface factor has a statistically significant effect on the overall subjects’ performance. The p-level for the visibility factor is large; we thus conclude that the overall subjects’ performance was not affected by the visibility factor.

Since a significant effect was demonstrated by this MANOVA, multiple univariate ANOVAs have followed.

3. MANOVA was applied to the Experiment Two data. The variables involved are as follows:
• Dependent variables:
  1. Path length
  2. Completion time
TABLE 7.16 Results of MANOVA for RtoL Task, Experiment One
MANOVA Effects Studied: 1 — interface, 2 — visibility
TABLE 7.17 Results of MANOVA, Experiment Two
MANOVA Effects Studied: 1 — visibility, 2 — day
• Independent variables:
  1. Visibility, with 2 levels: visible and invisible
  2. Day (repeated measures), with 2 levels: day 1 and day 2
The results are shown in Table 7.17. The p-level for the interaction effect is bigger than the significance level (0.05), pointing to no significant interaction effect. This means the result for one independent variable is not modified by the other independent variable. The p-level for the visibility factor is also larger than the significance level, so the null hypothesis cannot be rejected. In other words, the overall human performance is not significantly affected by the visibility factor. The p-level for the day factor suggests a slight effect of the day (training) factor on human performance. This reconciles with our knowing that the overall subjects’ performance was better on day 2 compared to day 1.

In order to see which indicator of human performance might be affected by the day (training) factor, multiple univariate ANOVAs have been performed.
4. ANOVA was applied to the dependent variable path length in Experiment Two. The variables involved are as follows:
• Dependent variables:
  1. Path length
• Independent variables:
  1. Visibility, with two levels: visible and invisible
  2. Day (repeated measures), with two levels: day 1 and day 2

The results are shown in Table 7.18. The p-levels for the main effects and interaction effect are larger than the significance level 0.05. Hence none of the null hypotheses for the main effects and interaction effect can be rejected. We conclude that there are no significant effects of the visibility factor and day (training) factor, and that these results do not change across the levels of the independent variables. In other words, surprisingly, the visibility factor has no significant effect on the subjects’ path length, and the day (training) factor has no significant effect on the path length either.
TABLE 7.18 Results of ANOVA for Path Length, Experiment Two
ANOVA Effects studied: 1 — visibility, 2 — day
TABLE 7.19 Results of ANOVA on Completion Time, Experiment Two
ANOVA Effects studied: 1 — visibility, 2 — day
5. An ANOVA was applied to the dependent variable of completion time in Experiment Two. The variables involved are as follows:
• Dependent variables:
  1. Completion time
• Independent variables:
  1. Visibility, with 2 levels: visible and invisible
  2. Day (repeated measures), with 2 levels: day 1 and day 2

The results are shown in Table 7.19. The p-levels for the main effects and interaction effect are all larger than the accepted threshold significance level 0.05. Hence none of the null hypotheses for the main effects and interaction effect can be rejected. We thus conclude that there are no significant effects for the visibility factor and day (training) factor, and this does not change across all levels of the independent variables. In other words, surprisingly, neither the visibility factor nor the day (training) factor has a significant effect on the completion time.
6. The MANOVA was applied to the combined data set. The variables involved are as follows:
• Dependent variables:
  1. Path length
  2. Completion time