the rank scores for the other condition. The mean ranks for each of these three levels are given, as well as the sums of the ranks for each and the number of cases that fall under each level.
The main results are underneath this table, where the Z value and the p value are given. The usual standard for levels of significance applies (significant if p is less than 0.05). How many cases are there where HWRATIO is greater than HWRATIO2?
Is there a significant difference between ranked height/weight ratios before and after the exercise/diet program?
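As an aside (not part of the SPSS practical): the same Wilcoxon matched-pairs logic can be checked outside SPSS. Below is a minimal Python sketch using scipy; the variable names mirror HWRATIO and HWRATIO2, but the data values are invented for illustration.

```python
# Hypothetical stand-in data for the HWRATIO (before) and
# HWRATIO2 (after) columns of family.sav -- values are invented.
from scipy.stats import wilcoxon

hwratio  = [3.1, 2.8, 3.4, 3.0, 2.9, 3.3, 3.2, 2.7]
hwratio2 = [2.9, 2.6, 3.3, 2.8, 2.8, 3.0, 3.1, 2.5]

# Wilcoxon matched-pairs test: ranks the absolute differences
# between the paired scores, then compares the sums of the
# positive and negative ranks (reported by SPSS as a Z value).
stat, p = wilcoxon(hwratio, hwratio2)
print(f"W = {stat:.1f}, p = {p:.3f}")  # significant if p < 0.05
```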
WEEK 4: October 24th
ANOVAS
This practical will familiarise students with the analysis of variance (ANOVA). The ANOVAs used in this practical apply when you want to determine whether there is a significant difference between three or more groups on a single variable.
One-way ANOVA for Independent Samples
In this case, we want to determine if there is a significant difference in the height to weight ratio between the three age groups in the sample in family.sav - children, adults and elderly. We also want to carry out a Tukey's post-hoc test to identify where those differences lie, if any. The procedure is remarkably similar to carrying out an unrelated samples t-test. Go: ANALYZE, COMPARE MEANS, ONE-WAY ANOVA.
As you can see, the layout of the dialogue box is basically the same as the one for unrelated t-tests from last week. First select your dependent variable(s) - in this case, move the variable HWRATIO into the Dependent List section. Your factor (independent variable) is the variable AGEGRP.
Before running the analysis, press the Post-hoc button and turn on Tukey's test. Now press the Continue and OK buttons and the analysis will be carried out.
OUTPUT
There are two sections to the results for the one-way ANOVA.
1 The first section indicates whether any significant differences exist between the different levels of the independent variable. The between groups and within groups sums of squares are listed, along with the degrees of freedom, the F-ratio and the F-probability score (significance level). It is this last part that indicates significance: if the F-prob is less than 0.05 then a significant difference exists. In this case, the F-prob is 0.000, so we can say that there is a statistically significant difference in height to weight ratios between the three age groups.
2 The post-hoc test identifies where exactly those differences lie. The final part of the second section is a small table with the levels of the independent variable listed down the side. Looking at the comparisons between these levels, we see that children have a significantly higher mean height to weight ratio than adults and the elderly (this is also indicated by the asterisks).
For the time being, ignore the third table of the output.
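If you would like to see the same analysis outside SPSS, here is a minimal Python sketch using scipy for the omnibus F-test and statsmodels for the Tukey post-hoc. The group scores are invented stand-ins for HWRATIO split by the three levels of AGEGRP.

```python
# Invented stand-in data for HWRATIO by AGEGRP
# (children, adults, elderly).
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

children = [3.6, 3.8, 3.5, 3.9, 3.7]
adults   = [3.0, 2.9, 3.1, 2.8, 3.0]
elderly  = [2.9, 3.0, 2.8, 2.7, 2.9]

# Omnibus one-way ANOVA: returns the F-ratio and its p value
# (the "F-prob" in the SPSS output).
f_stat, p = f_oneway(children, adults, elderly)
print(f"F = {f_stat:.2f}, p = {p:.4f}")  # significant if p < 0.05

# Tukey's post-hoc test identifies where the differences lie.
scores = children + adults + elderly
labels = ["child"] * 5 + ["adult"] * 5 + ["elderly"] * 5
print(pairwise_tukeyhsd(scores, labels))
```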
One-way ANOVA for Related Samples
The procedure for running this is very different from anything you've done before. The first step is easy enough - you need to add a third height to weight ratio variable, representing the ratios for the subjects some time after they stopped doing the exercise/diet plan. The data is below:
Variable Name: HWRATIO3
Variable Label: Height/Weight Ratio post-plan
Data: see table below
Subject Number HWRATIO3 score
The first step is to run a single-factor ANOVA by going: ANALYZE, GENERAL LINEAR MODEL, REPEATED MEASURES.
The dialogue box is different from the usual format. The first step is to give a name to the factor being analysed - basically, the thing the three variables have in common. All three variables cover height to weight ratios, so:
in the "Within-Subject Factor Name:" box, type RATIO
in the "Number of Levels:" box, type 3 (representing the three variables)
press the Add button, then the Define button
The next dialogue box is a bit more familiar. In the right-hand column, there are three question marks with a number beside each. Select each of the three variables to be included in the analysis, and move them across with the arrow button. Notice how each of the variables replaces one of the question marks, indicating to SPSS which three variables represent the three levels of the factor RATIO. Then proceed by clicking on OK.
OUTPUT
Firstly, you can ignore the sections of the output titled "Multivariate Tests" and "Mauchly's Test of Sphericity".
You need to examine the section titled "Tests of Within-Subjects Effects". This section indicates whether any significant differences exist between the different levels of the within-subjects variable. The degrees of freedom and sums of squares are listed, as well as the F-score and its significance level. If the significance level is less than 0.05 then a significant difference exists. In this case, it is 0.001 (look at the measure for sphericity assumed), so we can say that there is a statistically significant difference in height to weight ratios between the three times when measurements were taken.
You can ignore the section titled "Tests of Between-Subjects Effects". It is irrelevant here.
To do a post-hoc test to identify where the differences lie, the SPSS for Windows Made Easy manual recommends doing paired-samples t-tests. In this case:
HWRATIO & HWRATIO2
HWRATIO & HWRATIO3
HWRATIO2 & HWRATIO3
From these three t-tests, you can determine which of the height to weight ratios are significantly different from each other.
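For reference outside SPSS, the same repeated-measures analysis and the three post-hoc paired t-tests might look like the following Python sketch (statsmodels and scipy; the data values are invented stand-ins for the three HWRATIO variables).

```python
# Invented stand-in data: one score per subject per measurement time.
import pandas as pd
from scipy.stats import ttest_rel
from statsmodels.stats.anova import AnovaRM

hwratio  = [3.1, 2.8, 3.4, 3.0, 2.9, 3.3]   # before the plan
hwratio2 = [2.8, 2.6, 3.1, 2.7, 2.7, 3.0]   # after the plan
hwratio3 = [3.0, 2.7, 3.3, 2.9, 2.8, 3.2]   # follow-up after stopping

# Reshape to long format: one row per subject per level of RATIO.
long = pd.DataFrame({
    "subject": list(range(1, 7)) * 3,
    "ratio": hwratio + hwratio2 + hwratio3,
    "time": ["before"] * 6 + ["after"] * 6 + ["followup"] * 6,
})

# One-way repeated-measures ANOVA on the within-subjects factor.
print(AnovaRM(long, depvar="ratio", subject="subject",
              within=["time"]).fit())

# Post-hoc: the three paired t-tests recommended above.
for a, b in [(hwratio, hwratio2), (hwratio, hwratio3),
             (hwratio2, hwratio3)]:
    print(ttest_rel(a, b))
```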
Kruskal-Wallis ANOVA (KWANOVA – Unrelated)
This is the nonparametric equivalent of the independent samples ANOVA: ranks are used instead of the actual scores. We will run the analysis on the same variables, so go
ANALYZE, NONPARAMETRIC TESTS, and K INDEPENDENT SAMPLES.
As with the parametric test, move HWRATIO over to the Test (dependent) variable list and AGEGRP over to the Grouping (independent) variable list, and define the group with a minimum of 1 and a maximum of 3. Click the OK button. Notice that the nonparametric ANOVA doesn't have a post-hoc test. If you run this ANOVA, you'll have to consult a statistics book as to how to do a post-hoc on the results. One way would be to run a series of pairwise tests (e.g. Mann-Whitney U tests) on all the combinations of the conditions.
OUTPUT
The first section gives you the mean ranks and the number of cases for each level of the independent variable. The second section lists the Chi-Square value, degrees of freedom and significance of the test.
Is there a significant difference between the three groups (remember you can't say exactly what that difference is without a post-hoc test)?
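For those curious, an equivalent analysis outside SPSS, plus one possible post-hoc (pairwise Mann-Whitney tests), is sketched below in Python; the data values are invented.

```python
# Invented stand-in data for HWRATIO by AGEGRP, as before.
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

groups = {
    "children": [3.6, 3.8, 3.5, 3.9, 3.7],
    "adults":   [3.0, 2.9, 3.1, 2.8, 3.0],
    "elderly":  [2.9, 3.0, 2.8, 2.7, 2.9],
}

# Kruskal-Wallis: ranks all the scores together, then compares the
# mean rank of each group (SPSS reports this as a Chi-Square value).
h, p = kruskal(*groups.values())
print(f"H = {h:.2f}, p = {p:.4f}")

# One possible post-hoc: Mann-Whitney U tests on each pair of groups.
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    u, p = mannwhitneyu(a, b)
    print(f"{name_a} vs {name_b}: U = {u:.1f}, p = {p:.4f}")
```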
Friedman's - Related ANOVAs
This is the nonparametric equivalent of the related samples ANOVA: ranks are used instead of the actual scores. We will run the analysis on the same variables, so go
ANALYZE, NONPARAMETRIC TESTS, and K RELATED SAMPLES.
This is much easier to run - just move the three variables (HWRATIO, HWRATIO2 and HWRATIO3) over to the right column and click OK.
OUTPUT
There is the Chi-square score, the d.f. and whether it's significant (as usual, it has to be less than 0.05). Again, for post-hoc tests, you'll probably have to consult a statistics book, or possibly run the nonparametric equivalent of three related-samples t-tests (Wilcoxon matched-pairs tests on each pair of variables).
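Again as an aside, the Friedman test and the pairwise Wilcoxon follow-ups can be run outside SPSS as in the following Python sketch (invented data).

```python
# Invented stand-in data for the three related HWRATIO measurements.
from scipy.stats import friedmanchisquare, wilcoxon

hwratio  = [3.1, 2.8, 3.4, 3.0, 2.9, 3.3]
hwratio2 = [2.8, 2.6, 3.1, 2.7, 2.7, 3.0]
hwratio3 = [3.0, 2.7, 3.3, 2.9, 2.8, 3.2]

# Friedman test: ranks the three scores within each subject, then
# compares the rank sums across conditions (Chi-square with df = 2).
chi2, p = friedmanchisquare(hwratio, hwratio2, hwratio3)
print(f"Chi-square = {chi2:.2f}, p = {p:.4f}")

# One possible post-hoc: Wilcoxon matched-pairs tests on each pair.
print(wilcoxon(hwratio, hwratio2))
print(wilcoxon(hwratio, hwratio3))
print(wilcoxon(hwratio2, hwratio3))
```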
WEEK 5: October 30th
Study Week
No Practical
WEEK 7: November 14th QUALITATIVE RESEARCH: STUDENT SEMINAR
PRESENTATION PREPARATION
Students should use this time to prepare work for their presentations. Dr Alison will be available in his office for guidance if necessary.
WEEK 8: November 21st QUALITATIVE RESEARCH: STUDENT SEMINAR
INTERVIEWING AND DISCOURSE ANALYSIS
Conducting interviews etc.
This period should be used to conduct interviews in preparation for the session on content analysis. Students are expected to conduct interviews or sessions that result in naturally occurring language. It is important that this material is transcribed in preparation for week 11's session. Dr Alison will be available for consultation.
WORKING WITH NATURALLY OCCURRING LANGUAGE
PREPARATION
Students will use this period to work with their material gathered in the previous sessions. They should use this time to prepare for presentations in the final practical session (12th December).
WORKING WITH NATURALLY OCCURRING LANGUAGE: STUDENT SEMINAR
In this session, students are expected to organise their own seminar presentations on the results and methods employed in the content analysis of their material.
SECTION III EXTRA MATERIAL
For the benefit of students who wish to follow up other procedures in their own time, we have included the following section, which gives you some opportunity to play with graphics packages and explore some issues associated with regression in preparation for next term. Try not to worry if this all sounds unfamiliar at first. This section is simply to give you a running start when it comes to your work after Christmas.
REGRESSION
Simple Regression
In simple regression, the values of one variable (the dependent variable (y in this case)) are estimated from those of another (the independent variable (x in this case))
by a linear (straight line) equation of the general form:
y' = b0 + b1(x)
where y' is the estimated value of y, b1 is the slope (known as the regression coefficient) and b0 is the intercept (known as the regression constant).
Multiple Regression
In multiple regression, the values of one variable (the dependent variable (y)) are estimated from those of two or more variables (the independent variables (x1, x2, …, xn)). This is achieved by the construction of a linear equation of the general form:
y' = b0 + b1(x1) + b2(x2) + … + bn(xn)
where the parameters b1, b2, …, bn are the partial regression coefficients and the intercept b0 is the regression constant.
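As an illustration (not part of the SPSS exercises), the regression constant and the partial regression coefficients can be obtained by ordinary least squares. Here is a minimal numpy sketch with two invented predictors.

```python
# Estimate y from two predictors x1 and x2 by ordinary least squares.
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([3.1, 3.9, 6.2, 6.8, 9.1, 9.7])

# Design matrix with a leading column of ones for the constant b0.
X = np.column_stack([np.ones_like(x1), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = b
print(f"y' = {b0:.2f} + {b1:.2f}(x1) + {b2:.2f}(x2)")

# Residuals: the discrepancies between y and the estimates y'.
residuals = y - X @ b
print("residuals:", np.round(residuals, 3))
```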
Residuals
When a regression equation is used to estimate the values of a variable (y) from those of one or more independent variables (x), the estimates (y') will not be totally accurate (i.e. the data points will not fall precisely on the straight line). The discrepancies between y (the actual values) and y' (the estimated values) are known as residuals, and are used as a measure of the accuracy of the estimates and of the extent to which the regression model gives a good account of the data in question.
The multiple correlation coefficient
One measure of the efficacy of regression for the prediction of y is the Pearson correlation between the true values of the target variable y and the estimates y' obtained by substituting the corresponding values of x into the regression equation. The correlation between y and y' is known as the multiple correlation coefficient, R (as opposed to r, Pearson's correlation between the target variable and any one independent variable). In simple regression, R takes the absolute value of r between the target variable and the independent variable (so if r = -0.87 then R = 0.87).
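To make the relationship between R and r concrete, the following Python sketch (invented data, chosen with a negative slope) fits a simple regression, correlates y with the estimates y', and confirms that R equals the absolute value of r.

```python
# Demonstrate that R = |r| in simple regression.
import numpy as np
from scipy.stats import linregress, pearsonr

x = np.array([20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
y = np.array([3.8, 3.5, 3.4, 3.0, 2.9, 2.6])

fit = linregress(x, y)                 # simple regression of y on x
y_hat = fit.intercept + fit.slope * x  # the estimated values y'

r, _ = pearsonr(x, y)      # Pearson's r between x and y (negative here)
R, _ = pearsonr(y, y_hat)  # the multiple correlation coefficient
print(f"r = {r:.3f}, R = {R:.3f}")     # R comes out as |r|
```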
Running Simple Regression
Using the family.sav file, we want to look at how accurately we can estimate height to weight ratios (HWRATIO) using the subject's age (AGE). To run a simple regression, choose ANALYSE, REGRESSION and LINEAR.
As usual, the left column lists all the variables in your data file. There are two sections for variables on the right. The "Dependent" box is where you move the dependent variable - move HWRATIO there. The "Independent(s)" box is where you move AGE.
Next click the STATISTICS button, and turn on the "Descriptive" option.
As already stated, a residual is the difference between the actual value of the dependent variable and its predicted value using the regression equation. Analysis of the residuals gives a measure of how good the prediction is, and whether there are any cases that should be considered outliers and therefore dropped from the analysis. Click on "Casewise diagnostics" to obtain a listing of any exceptionally large residuals.
Now click on CONTINUE.
Now click on the PLOTS button. Since systematic patterns between the predicted values and the residuals can indicate possible violations of the assumption of linearity, you should plot the standardised residuals against the standardised predicted values. To do this, transfer *ZRESID into the Y: box and *ZPRED into the X: box, and then click CONTINUE.
Now click OK.
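If you would like to reproduce this diagnostic plot outside SPSS, a minimal matplotlib sketch follows (invented data; the z-scored residuals and predicted values below are a stand-in for SPSS's *ZRESID and *ZPRED).

```python
# Plot standardised residuals against standardised predicted values.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import linregress

age     = np.array([8.0, 12.0, 25.0, 33.0, 47.0, 60.0, 72.0, 80.0])
hwratio = np.array([3.9, 3.7, 3.1, 3.0, 2.9, 2.8, 2.8, 2.7])

fit = linregress(age, hwratio)
predicted = fit.intercept + fit.slope * age
residuals = hwratio - predicted

# Standardise both as z-scores (a stand-in for *ZPRED and *ZRESID).
zpred  = (predicted - predicted.mean()) / predicted.std(ddof=1)
zresid = (residuals - residuals.mean()) / residuals.std(ddof=1)

plt.scatter(zpred, zresid)
plt.axhline(0, linestyle="--")
plt.xlabel("*ZPRED (standardised predicted values)")
plt.ylabel("*ZRESID (standardised residuals)")
plt.show()  # a systematic pattern here suggests non-linearity
```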
Output
The first thing to consider is whether your data contains any outliers. There are no outliers in this data. If there were, this would be indicated in a table labelled "Casewise Diagnostics", and the cases that corresponded to these outliers would have to be removed from your data file using the filter option you learned previously.
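The check behind that table can be sketched in a few lines of Python (the cut-off of 3 standard deviations is SPSS's default for casewise diagnostics; the residuals below are invented, with one obvious outlier).

```python
# Flag cases with exceptionally large standardised residuals.
import numpy as np

def flag_outliers(residuals, cutoff=3.0):
    """Return indices of cases whose standardised residual
    exceeds the cut-off in absolute value."""
    z = (residuals - residuals.mean()) / residuals.std(ddof=1)
    return np.flatnonzero(np.abs(z) > cutoff)

residuals = np.array([0.05, -0.10, 0.08, -0.04, 0.02, -0.01, 0.06,
                      -0.07, 0.03, -0.02, 0.04, -0.05, 0.01, -0.03,
                      1.50])
print(flag_outliers(residuals))  # flags the last case (index 14)
```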
With that out of the way, the first table (Descriptive Statistics) to look at is right at the top. The first part gives the means and standard deviations for the two variables (e.g