Double-click on the chart to move the histogram from the Chart Carousel Window to a Chart Window. The menu bar and tool bar change to show editing facilities.
First, click on CHART, then OPTIONS and NORMAL CURVE, then hit OK. The normal curve superimposed over the histogram is the one for the above mean and standard deviation. Admittedly, it's difficult to make a decision with such a small sample, but does the curve appear to be a good fit to the histogram?
Now, click on the 'swap axes' icon. Does the histogram look better with vertical bars or horizontal bars?
Now try some of the other icons and tools to change the chart. These changes require the appropriate part of the chart to have been selected. Click on any bar. The bars will become highlighted with small black squares at their corners. Then click on the Fill Pattern tool button (the rectangle with diagonal shading). To apply a pattern, click on it and then click on Apply. Once you have finished with the patterns, click on Close. Also, try the Colour Palette tool button (the one with the pen) and the Bar Labels tool button (the one with the fingernails).
You can also change the style of the line showing the Normal curve, and the fill pattern and colour of the background of the histogram. Once you have finished with your work, select FILE and then SAVE CHART. Save your histogram under the name artwork.chz.
To copy a chart into Word, click on EDIT and then select COPY to copy the chart. To move to Word, minimise SPSS and open Word. If Word is already open, press ALT & TAB to move between programs. Once in Word, go to EDIT, PASTE. Finally, exit from SPSS for Windows by selecting FILE, EXIT.
Section II: Manipulating the Data in the Matrix
(Computing, Recoding, Filtering and Deleting Data)
Computing Values
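Start SPSS and open the file family.sav (you should find this file on your M: drive in the folder that you named survey). We shall use the COMPUTE command to build up a new variable that will be labelled BMI, which stands for body mass index. This is calculated as:
Body mass index = weight (kilograms) / height (metres)^2
Because the data file records weight in pounds and height in inches, the expression used here first converts pounds to kilograms (x 0.4536) and inches to metres (x 0.0254).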
Select TRANSFORM and then COMPUTE and set the Target Variable to bmi.
Click on Type & Label and enter the label body mass index in the label box. Click Continue to return to the Compute Variable dialog box. Using the source list on the left and the calculator pad in the centre, build up
weight * 0.4536 / (height * 0.0254) ** 2
in the numeric expression box. Run the completed command. The new variable is added to the end of the data. We shall check the new variable by estimating a few descriptive statistics using FREQUENCIES (via Analyze – Descriptive Statistics). (Analyze – Descriptive Statistics – Explore would be a better command, but Frequencies will do here.)
Select ANALYZE, DESCRIPTIVE STATISTICS and then FREQUENCIES.
Move body mass index (bmi) to the Variable(s) box. Since bmi is a metric variable with a potentially different value for every case in the data, suppress frequency tables by clearing the DISPLAY FREQUENCY TABLES check box. You will now get a message saying 'You have turned off all output. Unless you request Display Frequency Tables, Statistics or Charts, Frequencies will generate no output.'
No worries, we will estimate descriptive statistics by clicking on STATISTICS and clicking on the check boxes for the following: MEAN, MEDIAN, MINIMUM and MAXIMUM. Run the command and look at the output.
What are the sample values of the mean, median, minimum and maximum?
(The mean should be around 25.0. Any values outside the range 15.0 to 35.0 should be queried.)
Do the sample statistics satisfy these rough checks? If not, something is wrong!
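For readers who want to check the arithmetic outside SPSS, the following is a minimal Python sketch of the same computation and of the rough range check. The four weight/height pairs are invented for illustration and are not taken from family.sav; the function name bmi is likewise just a label chosen here.

```python
from statistics import mean, median

LB_TO_KG = 0.4536   # pounds to kilograms, the factor used in the SPSS expression
IN_TO_M = 0.0254    # inches to metres, the factor used in the SPSS expression


def bmi(weight_lb, height_in):
    """Mirror of the expression weight * 0.4536 / (height * 0.0254) ** 2."""
    return weight_lb * LB_TO_KG / (height_in * IN_TO_M) ** 2


# Hypothetical (weight in pounds, height in inches) pairs, NOT from family.sav.
people = [(150, 65), (180, 70), (120, 62), (210, 72)]
values = [bmi(w, h) for w, h in people]

# The same four statistics requested from FREQUENCIES.
summary = {
    "mean": mean(values),
    "median": median(values),
    "minimum": min(values),
    "maximum": max(values),
}

# Rough check from the handout: values outside 15.0 to 35.0 should be queried.
suspect = [v for v in values if not 15.0 <= v <= 35.0]

print(summary)
print("values to query:", suspect)
```

If the conversion factors were left out, the values would come out far below 15, which is exactly the kind of mistake the rough check is meant to catch.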
Conditionally Computing Values
Now we shall use the IF sub-command (via Transform-Compute) to set up a new variable. The sub-command allows you to set up a new variable under the condition that the original variable, which it is based on, fulfils certain criteria. We want to set up a new variable AGEHOH for the age of the head of the household. In other words, if a person in the sample is head of the household, AGEHOH shall indicate that person's age.
Select TRANSFORM and then COMPUTE and clear the previous settings by clicking on RESET. Set the Target Variable to AGEHOH and click on TYPE & LABEL to assign the label age head of household. Click on Continue, and then set the Numeric Expression to AGE. We want this (i.e., the current age in years) to be applied when the case is head of household, which occurs when RELTOHOH is zero. (For the variable RELTOHOH – relationship to head of household – the value 0 denotes that a person is head of household.) Select IF… and INCLUDE IF CASE SATISFIES CONDITION. Set up the condition RELTOHOH = 0 in the large box and run the command. The variable AGEHOH should now be added to the end of the data. Have a look at the new variable. You should see ages set for some cases only. Let's check AGEHOH by moving it in the data matrix to the column after RELTOHOH so that we can see what happened more clearly.
First, find RELTOHOH by selecting UTILITIES and VARIABLES…, selecting RELTOHOH from the source list and then clicking on GO TO and CLOSE. Now click on any cell of the variable that is immediately to the right of RELTOHOH (this variable should be sex). Then select DATA and then INSERT VARIABLE. Alternatively, you can click on the INSERT VARIABLE tool (which is the sixth button from the right).
Now, a blank column headed var00001 containing system-missing values (dots) is inserted before the selected variable. Move AGEHOH to this column by single-clicking on AGEHOH to highlight the column and then selecting EDIT and CUT.
To paste it in the desired location, single-click on the head of the blank column (var00001) and select EDIT and then PASTE.
Look at the values in the DATA EDITOR window.
Do all heads of household have AGEHOH set? If not, what might be the reason? (Hint: Look at the variable that agehoh is derived from!)
What value is set for cases who are not heads of household?
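The logic of a conditional compute can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cases are invented, None stands in for SPSS's system-missing value (the dot), and compute_agehoh is a name made up for this sketch.

```python
def compute_agehoh(case):
    """Return the age only when the case is head of household (reltohoh == 0).

    None plays the role of SPSS's system-missing value in this sketch.
    """
    if case["reltohoh"] == 0:
        return case["age"]      # may itself be None if age was never recorded
    return None                 # condition not met: the target stays missing


# Hypothetical cases, NOT taken from family.sav.
cases = [
    {"age": 45, "reltohoh": 0},    # head of household
    {"age": 43, "reltohoh": 1},    # not head of household
    {"age": None, "reltohoh": 0},  # head of household with no age recorded
    {"age": 12, "reltohoh": 2},    # not head of household
]

agehoh = [compute_agehoh(c) for c in cases]
print(agehoh)
```

The third case shows one way a head of household can still end up without a value: the result can only be as complete as the variable it is derived from.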
Re-coding Values
The RECODE command in SPSS is very powerful and efficient, but it can be a little tricky to set up due to the number of clicks required. We shall recode BMI into a new variable BMIGRP, which takes the values 1 (BMI up to 25.0), 2 (BMI from 25.0 to 30.0) and 3 (BMI of 30.0 or above).
Select TRANSFORM and then RECODE and INTO DIFFERENT VARIABLES. Select BMI from the source list into the central INPUT VARIABLE – OUTPUT VARIABLE box. Enter BMIGRP into the Name box and click on Change to complete the INPUT VARIABLE – OUTPUT VARIABLE box. Also enter a suitable variable label for BMIGRP in the LABEL box (e.g., categorical body mass index).
To set up the recoding, click on OLD and NEW VALUES…. We build up the recode specification for the third category of BMIGRP first. In the OLD VALUE box, select RANGE and THROUGH HIGHEST and enter 30.0 in the box before THROUGH HIGHEST. In the NEW VALUE section, enter 3 into the VALUE box. Then click on ADD to copy the specification 30.0 THROUGH HIGHEST = 3 to the OLD – NEW box. Build up the other two specifications, in the order 25.0 through 30.0 = 2 and LOWEST THROUGH 25.0 = 1. Now run the completed command.
To finish, double-click on BMIGRP in the Data Editor window, and define suitable value labels (i.e., 1 = okay, 2 = overweight, 3 = obese).
Are the values of BMIGRP correct for the first ten cases?
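The three recode specifications share their boundary values (25.0 and 30.0 each appear in two ranges), so the order in which they are entered matters. The Python sketch below assumes that a value is recoded by the first specification that matches it, which is why the handout builds the highest category first; recode_bmi is a hypothetical name and None stands in for a missing value.

```python
def recode_bmi(bmi):
    """Apply the three specifications in the order they were entered."""
    if bmi is None:
        return None          # missing input stays missing in this sketch
    if bmi >= 30.0:          # 30.0 THROUGH HIGHEST = 3
        return 3
    if 25.0 <= bmi <= 30.0:  # 25.0 THROUGH 30.0 = 2
        return 2
    if bmi <= 25.0:          # LOWEST THROUGH 25.0 = 1
        return 1
    return None


# Hypothetical BMI values chosen to sit on and around the boundaries.
for value in (19.4, 25.0, 27.3, 30.0, 34.1):
    print(value, "->", recode_bmi(value))
```

With this ordering a BMI of exactly 30.0 lands in group 3 and a BMI of exactly 25.0 lands in group 2, which is worth keeping in mind when checking the first ten cases by hand.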
Filtering Cases
In this example, we shall filter cases. The filtering option allows you to exclude certain cases from further analysis temporarily.
Before filtering, generate a two-way frequency table for ownrent by typaccm by selecting ANALYZE, then DESCRIPTIVE STATISTICS and then CROSSTABS and selecting ownrent for Row(s) and typaccm for Column(s). Run the command and look at the table in the output.
1. What exactly does the frequency count in the first cell of the second table refer to? 6 what?
We shall filter using the variable PERSNO, which is the number of persons in the household.
2. What will be the effect of selecting cases satisfying the condition persno = 1? What is the impact on households?
Now, select DATA and SELECT CASES and then IF CONDITION IS SATISFIED and make sure that UNSELECTED CASES are FILTERED. (This is very important as the alternative is DELETED, which we want to avoid now!)
Select IF… and build up the condition persno = 1 in the large box. Run the completed command. Find persno in the Data Editor window.
3. What appears in the status bar when filtering is in effect? (The status bar is at the bottom of the window.)
4. What has happened to case numbers with persno ≠ 1?
Rerun the CROSSTABS command (via Analyze – Descriptive Statistics) and look at the new table in the output.
5. What exactly does the frequency count in the first cell refer to now? 3 what?
Go to the Data Editor Window and save the filtered data as familyf.sav. Then select DATA, SELECT CASES and then ALL CASES. Run the command.
6. What happens to the status bar and the case numbers?
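The difference between the two crosstabs can be sketched in Python as follows. The six cases and their codes are invented for illustration (they are not the contents of family.sav), and crosstab is a helper name chosen for this sketch. The point is that filtering changes which cases are counted while the data themselves stay intact.

```python
from collections import Counter

# Hypothetical cases, NOT taken from family.sav.
cases = [
    {"ownrent": 1, "typaccm": 1, "persno": 1},
    {"ownrent": 1, "typaccm": 1, "persno": 2},
    {"ownrent": 1, "typaccm": 2, "persno": 1},
    {"ownrent": 2, "typaccm": 1, "persno": 3},
    {"ownrent": 2, "typaccm": 2, "persno": 1},
    {"ownrent": 1, "typaccm": 1, "persno": 1},
]


def crosstab(rows):
    """Count cases in each (ownrent, typaccm) cell."""
    return Counter((r["ownrent"], r["typaccm"]) for r in rows)


# Unfiltered table: every case is counted.
all_table = crosstab(cases)

# Filtered table: only cases with persno == 1 take part in the analysis.
selected = [c for c in cases if c["persno"] == 1]
filtered_table = crosstab(selected)

print("first cell, all cases:     ", all_table[(1, 1)])
print("first cell, filter applied:", filtered_table[(1, 1)])
print("cases still in the data:   ", len(cases))
```

Switching back to all cases corresponds to simply calling crosstab(cases) again, because nothing was removed from the list.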
Deleting Cases
Instead of filtering cases, we shall delete unselected cases without doing any harm to data stored in disk system files. Select DATA, SELECT CASES, IF CONDITION IS SATISFIED, which picks up the previous condition persno = 1. Then select UNSELECTED CASES are DELETED. Run the command and have a look at the Data Editor Window.
1. How many cases are left?
2. What are the values of PERSNO?
3. What are the values of HSEMO? What does that successfully show?
Now, rerun the CROSSTABS command from the previous section and look at the output.
4. Do the results agree with those obtained when cases are filtered?
Return to the Data Editor Window and save the selected cases to a NEW system file named familyd.sav (after deleting cases you should do this as soon as possible to avoid overwriting your complete data file by accident).
Finally, re-open familyf.sav, the filtered file you saved in the previous section.
5. Is filtering still on?
Exit from SPSS, saving the contents of the output window into output3.spo.
Open up family.sav that you saved to your survey folder.
WEEK 3: October 17th
T-Tests
Section I: Parametric T-tests (related & unrelated)
This practical will show you how to run a t-test so that you can look at the difference between means of two scores.
Experimental designs can be of two basic types – within subject (dependent or related) and between subject (independent or unrelated). The former is when all subjects are subjected to all conditions (e.g., testing reaction times before and after receiving a drug). Between subject designs are when you divide subjects into independent groups, such as on the basis of gender, or into one group that receives a drug, and a second that receives a placebo.
DEPENDENT OR RELATED SAMPLES T-TEST
First, a quick review of the test layouts
1. Related Samples - two variables, one for each condition of the experiment. As a result, each subject has two scores:
Variable 1 (first set of scores for the subjects, e.g. reaction time before taking the drug)
Variable 2 (second set of scores for the subjects, e.g. reaction time after taking the drug)
(Layout: one row per subject number (Sub No), with Variable 1 and Variable 2 as the two columns.)
2. Independent or Unrelated Samples - two variables: the first tells SPSS what condition EACH subject belongs to, the second is the actual score for that subject:
Variable 1 (what condition each subject belongs to, e.g. group 1 are the controls, group 2 receive the drug)
Variable 2 (actual score, e.g. each subject's reaction time)
Sub No         Variable 1                 Variable 2
1 (control)    subject's condition (1)    subject 1 score
T-Test for Related Samples
This is the parametric comparison of two related groups, for example, when you want to compare mean scores for subjects at some task before and after taking a drug. Each set of subject scores for the related t-test must be entered as an individual variable in SPSS. So, in the above example, all the individuals' scores for the task before taking the drug would be in one column and all the scores after taking the drug in another.
First, open family.sav. The next step is to add a variable to the data file, so that we can run the related t-test. In this case, the comparison will be between the subjects' height/weight ratio before they were put on a 4-week diet/exercise plan and after. The variable already in the data set, HWRATIO, is the measure before. At the end of the data file, add the variable HWRATIO2 to represent their measurements after the plan. Using what you learned in the first lesson about entering data, create the new variable using the information below:
Variable Name: HWRATIO2
Variable Label: Height/Weight Ratio after plan
Data: see table 1 below
To run the procedure, go to ANALYZE, COMPARE MEANS and then PAIRED-SAMPLES T-TEST.
The usual dialogue box appears. The dialogue box has the two-column format. The only difference is that you must select pairs of variables and move them across, rather than just one variable at a time. To do this, you have to click on one variable, then locate the other variable and click on it. The two variables that you have requested should appear in the current selection box. After clicking on both, you then press the arrow button to move the pair across. SPSS will analyse each pair to determine if their means are significantly different statistically. In this case, select the variables HWRATIO and HWRATIO2 and move them across, then press the OK button.
Table 1: Data for Height/Weight Ratio after a 4-week diet/exercise plan
14 50
OUTPUT
The results appear in three sections.
The first section gives you a table called Paired Samples Statistics with the mean scores, standard deviations and standard error mean for the two variables.
The second section is a table called Paired Samples Correlations showing the correlation between the two variables and the level of significance.
The third section is more important. The table called Paired Samples Test indicates the significance of the results. This includes the t-value, degrees of freedom (d.f.) and the two-tailed significance level.
What is the t-value for the comparison between the height to weight ratio scores?
Is there a significant difference between the scores before and after the diet/exercise plan? If so, which is the greater height/weight ratio?
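To see where the t-value and degrees of freedom in the Paired Samples Test table come from, here is a minimal Python sketch of the paired t statistic: the mean of the within-subject differences divided by its standard error. The five before/after scores are invented for illustration and are not the values of HWRATIO or HWRATIO2; the significance level is not computed because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for two related sets of scores."""
    if len(before) != len(after):
        raise ValueError("each subject needs one score in each condition")
    diffs = [b - a for b, a in zip(before, after)]  # one difference per subject
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores for five subjects, NOT taken from family.sav.
before = [50, 52, 48, 55, 51]
after = [48, 50, 47, 52, 50]

t_value, df = paired_t(before, after)
print(f"t = {t_value:.2f}, d.f. = {df}")
```

Because the test works on the differences, swapping the two columns only changes the sign of t, which is how you can tell from the output which condition had the greater mean.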
T-Test for Independent Samples
This is the parametric t-test for two independent samples - a between-subjects design where, for example, subjects are randomly assigned to two separate test conditions (e.g. drug and control) and the mean scores (e.g. reaction time) are compared to determine if they are significantly different from each other.
In this case, you want to test whether there is a statistical difference in height to weight ratios between the male and female subjects. The format for variables to be used in the independent t-test is different from that used in the related t-test. Instead of the scores being placed in two separate columns (variables), all of the scores are placed in a single column (variable). A second variable identifies for SPSS which of the two groups each score belongs to. So, in this case, there is the variable HWRATIO2 as the dependent variable and NSEX as the independent variable.
To run the analysis, go to ANALYZE, COMPARE MEANS and then INDEPENDENT-SAMPLES T-TEST. As usual, the left column lists all the variables in your data file. On the right, there are two boxes:
- The test variable(s) box is where you move the dependent variable(s)
- The grouping variable box is where you move the variable that distinguishes between the two independent groups (e.g. the variable NSEX)
First, select the dependent variable HWRATIO2 and move it over to the test variable(s) section. Next, move NSEX over into the grouping variable section and press the DEFINE GROUPS button. Values from the grouping variable must be entered into the two boxes. In the case of the variable NSEX, where only two levels are recorded, you would just enter "1" in the top box for male subjects, and "2" in the lower one for female subjects. Hit the CONTINUE button, then hit the OK button.
[Note: There may be times where you have a larger range of values, such as five different education levels, but only want to look at the difference between two of them. You would enter the two values you wish to compare.]
OUTPUT
There are two sections:
The first section of the output gives you a table called Group Statistics, which indicates the number of cases and the mean scores etc. for each condition.
The second section provides a table called Independent Samples Test and starts with Levene's Test for Equality of Variances. If Levene's test is significant, indicating that the variances are unequal, then when you look at the results of the t-test in the final table, you use the line starting with Equal variances not assumed. If it isn't significant, you look at the line starting with Equal variances assumed. The final table gives you t-values, degrees of freedom and the two-tailed significance levels.
In this case, Levene's is not significant (0.137), so we look at the equal variance line. The t-test is not significant either (two-tailed significance of 0.478), so we cannot reject the null hypothesis of no difference: there is no evidence that males and females differ in their height to weight ratios.
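The two lines of the t-test table correspond to two slightly different formulas. The Python sketch below computes both for a pair of invented groups: the pooled-variance t (the "Equal variances assumed" line) and the Welch t with its fractional degrees of freedom (the usual formula behind an "Equal variances not assumed" line). The helper line_to_read and the 0.05 cut-off are assumptions made for this illustration; Levene's test itself is not implemented here, and its p-value is simply passed in.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """t and d.f. when the two group variances are treated as equal."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2


def welch_t(a, b):
    """t and (fractional) d.f. when the variances are not assumed equal."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def line_to_read(levene_p, alpha=0.05):
    """Pick the output line from the significance of Levene's test."""
    if levene_p < alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"


# Hypothetical scores for two independent groups, NOT taken from family.sav.
group_1 = [50, 52, 48, 55, 51]
group_2 = [49, 47, 53, 50]

t_pooled, df_pooled = pooled_t(group_1, group_2)
t_welch, df_welch = welch_t(group_1, group_2)
print(f"pooled: t = {t_pooled:.3f}, d.f. = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, d.f. = {df_welch:.2f}")
print(line_to_read(0.137))
```

When the group variances are similar, as in this made-up example, the two t-values are close and the choice of line makes little practical difference.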
Section II: Non-Parametric T-tests (Wilcoxon - related & Mann-Whitney - unrelated)
All of the tests today can be found under ANALYZE, NONPARAMETRIC TESTS.
Mann-Whitney - Unrelated
This is the non-parametric t-test for two independent samples - a between-subjects design. To run the analysis, choose: ANALYZE, NONPARAMETRIC TESTS, and 2 INDEPENDENT SAMPLES.
As usual, the left column lists all the variables in your data file. On the right, there are two boxes:
- the "test variable(s)" box is where you move the dependent variable(s)
- the "grouping variable" box is where you move the variable that distinguishes between the two independent groups (e.g. the variable sex)
So, move HWRATIO2 into the test variable box, and move NSEX into the grouping variable box. Now, click the Define Groups button. Values from the grouping variable must be entered into the two boxes. In the case of the variable NSEX, you enter "1" in the top box for male subjects, and "2" in the lower one for female subjects. Hit the Continue button, then hit the OK button.
OUTPUT
SPSS divides the entire set of subjects into three groups:
- those with a score of 1 (male)
- those with a score of 2 (female)
- cases with missing data (which are excluded from the analysis)
The first section gives the mean ranks for the two conditions that are included, as well as the sums of the ranks and the numbers of cases.
The second section gives the Z score and p-value for the test.
Is there a difference between males and females? How do the results from this week compare to last week's?
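The mean ranks and rank sums in the first section can be reproduced by hand. The Python sketch below ranks the pooled scores (tied scores share the average of their ranks), sums the ranks within each group and derives the Mann-Whitney U statistic from those sums. The two groups are invented for illustration, the function names are hypothetical, and the Z score and p-value are deliberately left out of this sketch.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney(a, b):
    """Rank sums, mean ranks and U for two independent groups."""
    ranks = average_ranks(a + b)
    sum_a, sum_b = sum(ranks[:len(a)]), sum(ranks[len(a):])
    u_a = sum_a - len(a) * (len(a) + 1) / 2
    u_b = sum_b - len(b) * (len(b) + 1) / 2
    return {
        "sum_a": sum_a, "sum_b": sum_b,
        "mean_rank_a": sum_a / len(a), "mean_rank_b": sum_b / len(b),
        "U": min(u_a, u_b),
    }


# Hypothetical scores for two groups, NOT taken from family.sav.
group_1 = [50, 52, 48, 55, 51]
group_2 = [49, 47, 53, 50]

result = mann_whitney(group_1, group_2)
print(result)
```

Because the test uses ranks rather than the raw scores, one extreme value in either group changes the result far less than it would change a t-test.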
Wilcoxon - Related
This is the non-parametric repeated measures T-test, in a within subjects design. Like the parametric equivalent, we'll be running a comparison of height to weight ratios for the sample population before and after a four-week exercise/diet program. To run the analysis, choose: ANALYZE, NONPARAMETRIC TESTS, and 2 RELATED SAMPLES.
The dialogue box has the two-column format. The only difference is that you must select pairs of variables and move them across. SPSS will analyse each pair to determine if their mean ranks are significantly different statistically. For this analysis, select the two variables HWRATIO and HWRATIO2, then click the OK button.
OUTPUT
The output for this procedure is quite different from the parametric test. The first section gives you information about how many rank scores for one condition are