Introduction to SPSS: Research Methods & Statistics Handbook, Part 3


Double-click on the chart to move the histogram from the Chart Carousel Window to a Chart Window. The menu bar and tool bar change to show editing facilities.

First, click on CHART, then OPTIONS and NORMAL CURVE, then hit OK. The normal curve superimposed over the histogram is the one for the above mean and standard deviation. Admittedly, it's difficult to make a decision with such a small sample, but does the curve appear to be a good fit to the histogram?
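One way to judge the fit without relying on the eye alone is to compare the observed count in each histogram bar with the count a normal curve would predict. The following is a minimal Python sketch of that idea, not what SPSS does internally; the scores and bin edges are hypothetical, and it assumes the curve uses the sample mean and the sample standard deviation.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical scores; the handbook's own data are not reproduced here.
scores = [12, 15, 15, 17, 18, 18, 19, 20, 22, 24]
edges = [10, 14, 18, 22, 26]  # hypothetical bin edges


def observed_counts(values, bin_edges):
    """Observed number of cases per bin; each bin includes its lower edge."""
    return [sum(lo <= v < hi for v in values)
            for lo, hi in zip(bin_edges, bin_edges[1:])]


def expected_normal_counts(values, bin_edges):
    """Expected cases per bin under a normal curve with the sample mean and SD."""
    curve = NormalDist(mean(values), stdev(values))
    n = len(values)
    return [n * (curve.cdf(hi) - curve.cdf(lo))
            for lo, hi in zip(bin_edges, bin_edges[1:])]


observed = observed_counts(scores, edges)
expected = expected_normal_counts(scores, edges)
for lo, obs, exp in zip(edges, observed, expected):
    print(f"{lo} to {lo + 4}: observed {obs}, expected {exp:.1f}")
```

With only ten cases, sizeable gaps between observed and expected counts are to be expected, which is the point the text makes about small samples.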

Now, click on the 'swap axes' icon. Does the histogram look better with vertical bars or horizontal bars?

Now try some of the other icons and tools to change the chart. These changes require the appropriate part of the chart to have been selected. Click on any bar. The bars will become highlighted with small black squares at their corners. Then click on the Fill Pattern tool button (the rectangle with diagonal shading). To apply a pattern, click on it and then click on Apply. Once you have finished with the patterns, click on Close. Also, try the Colour Palette tool button (the one with the pen) and the Bar Labels tool button (the one with the fingernails).

You can also change the style of the line showing the Normal curve, and the fill pattern and colour of the background of the histogram. Once you have finished with your work, select FILE and then SAVE CHART. Save your histogram as artwork.chz

To copy or move a chart into Word, click on EDIT and then select COPY to copy the chart. To move to Word, minimise SPSS and open Word. If Word is already open, press ALT + TAB to move between programs. Once in Word, go to EDIT, PASTE. Finally, exit from SPSS for Windows by selecting FILE, EXIT.

Section II: Manipulating the Data in the Matrix

(Computing, Recoding, Filtering and Deleting Data)

Computing Values

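Start off SPSS and open the file family.sav (you should find this file on your M: drive in the folder that you named survey). We shall use the COMPUTE command to build up a new variable that will be labelled BMI, which stands for body mass index. This is calculated as:

Body mass index = weight (kilograms) / height (metres)^2

In family.sav, weight is recorded in pounds and height in inches, so both are converted to metric units in the expression used below (1 pound = 0.4536 kg, 1 inch = 0.0254 m).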

Select TRANSFORM and then COMPUTE and set the Target Variable to bmi. Click on Type & Label and enter the label body mass index in the Label box. Click Continue to return to the Compute Variable dialog box. Using the source list on the left and the calculator pad in the centre, build up

weight * 0.4536 / (height * 0.0254) ** 2

in the Numeric Expression box. Run the completed command. The new variable is added to the end of the data. We shall check the new variable by estimating a few descriptive statistics using FREQUENCIES (via Analyze – Descriptive Statistics).

(Analyze – Descriptive Statistics – Explore would be a better command, but Frequencies will do here.)
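The same arithmetic can be checked by hand outside SPSS. Below is a minimal Python sketch of the expression above; the function name, the constant names and the example case are illustrative assumptions, and missing inputs are represented by None.

```python
LB_TO_KG = 0.4536  # kilograms per pound, as in the handbook's expression
IN_TO_M = 0.0254   # metres per inch, as in the handbook's expression


def bmi(weight_lb, height_in):
    """Body mass index in kg/m^2 from weight in pounds and height in inches."""
    if weight_lb is None or height_in is None:
        return None  # a missing input gives a missing result
    # ** binds more tightly than * and /, so only the height term is squared
    return weight_lb * LB_TO_KG / (height_in * IN_TO_M) ** 2


# Hypothetical case: 150 pounds, 65 inches
print(round(bmi(150, 65), 1))
```

Working one or two cases through like this is a quick way to confirm that the brackets in the Numeric Expression box were placed as intended.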

Select ANALYZE, DESCRIPTIVE STATISTICS and then FREQUENCIES.

Move body mass index (bmi) to the Variable(s) box. Since bmi is a metric variable with a potentially different value for every case in the data, suppress frequency tables by clearing the check box: click on DISPLAY FREQUENCY TABLES. Now you will get a message saying 'You have turned off all output. Unless you request Display Frequency Tables, Statistics or Charts, Frequencies will generate no output.'

No worries, we will estimate descriptive statistics by clicking on STATISTICS and clicking on the check boxes for the following: MEAN, MEDIAN, MINIMUM and MAXIMUM. Run the command and look at the output.

• What are the sample values of the mean, median, minimum and maximum? (The mean should be around 25.0. Any values outside the range 15.0 to 35.0 should be queried.)

• Do the sample statistics satisfy these rough checks? If not, something is wrong!
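The rough checks above amount to four summary statistics plus a range test. A minimal Python sketch follows, assuming hypothetical BMI values and using None for a missing value; the function name and the dictionary keys are illustrative, not SPSS output labels.

```python
from statistics import mean, median


def summarise(values, low=15.0, high=35.0):
    """Mean, median, minimum, maximum and out-of-range values, skipping None."""
    valid = [v for v in values if v is not None]
    return {
        "mean": mean(valid),
        "median": median(valid),
        "minimum": min(valid),
        "maximum": max(valid),
        "to_query": [v for v in valid if not low <= v <= high],
    }


bmi_values = [22.4, 27.1, None, 24.8, 41.3, 19.6]  # hypothetical
report = summarise(bmi_values)
print(report)
```

Any value listed under to_query would be one to trace back to the original weight and height entries.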

Conditionally Computing Values

Now we shall use the IF sub-command (via Transform – Compute) to set up a new variable. The sub-command allows you to set up a new variable under the condition that the original variable, which it is based on, fulfils certain criteria. We want to set up a new variable AGEHOH for the age of the head of the household. In other words, if a person in the sample is head of the household, AGEHOH shall indicate that person's age.

Select TRANSFORM and then COMPUTE and clear the previous settings by clicking on RESET. Set the Target Variable to AGEHOH and click on TYPE & LABEL to assign the label age head of household. Click on Continue, and then set the Numeric Expression to AGE. We want this (i.e., the current age in years) to be applied when the case is head of household, which occurs when RELTOHOH is zero. (For the variable RELTOHOH, relationship to head of household, the value 0 denotes that a person is head of household.) Select IF… and INCLUDE IF CASE SATISFIES CONDITION. Set up the condition RELTOHOH = 0 in the large box and run the command. The variable AGEHOH should now be added to the end of the data. Have a look at the new variable. You should see ages set for some cases only.

Let's check AGEHOH by moving it in the data matrix to the column after RELTOHOH so that we can see what happened more clearly. First locate RELTOHOH by selecting UTILITIES and VARIABLES…, selecting RELTOHOH from the source list and then clicking on GO TO and CLOSE. Now click on any cell of the variable that is immediately to the right of RELTOHOH (this variable should be sex). Then select DATA and then INSERT VARIABLE. Alternatively, you can click on the INSERT VARIABLE tool (which is the sixth button from the right).

Now, a blank column headed var00001 containing system-missing values (dots) is inserted before the selected variable. Move AGEHOH to this column by single-clicking on AGEHOH to highlight the column and then selecting EDIT and CUT.

To paste it in the desired location, single-click on the head of the blank column (var00001) and select EDIT and then PASTE.

Look at the values in the DATA EDITOR window.

• Do all heads of household have AGEHOH set? If not, what might be the reason? (Hint: Look at the variable that AGEHOH is derived from!)

• What value is set for cases who are not heads of household?
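The logic of the conditional compute can be written out as a small Python sketch. The cases below are hypothetical, the lower-case keys simply mirror the SPSS variable names, and None stands in for a missing value; this is an illustration of the rule, not of how SPSS stores data.

```python
def age_of_head(case):
    """Return the case's age when reltohoh is 0 (head of household), else None."""
    if case["reltohoh"] == 0:
        return case["age"]
    return None  # the condition is not met, so no value is assigned


cases = [
    {"age": 45, "reltohoh": 0},
    {"age": 43, "reltohoh": 1},
    {"age": None, "reltohoh": 0},  # head of household whose age was not recorded
]
for case in cases:
    case["agehoh"] = age_of_head(case)

print([case["agehoh"] for case in cases])
```

Comparing the printed list with the input cases shows two different routes to an empty AGEHOH, which is worth keeping in mind when answering the questions above.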

Re-coding Values

The RECODE command in SPSS is very powerful and efficient, but it can be a little tricky to set up due to the number of clicks required. We shall recode BMI into a new variable BMIGRP, which takes the values 1, 2 and 3 for the BMI bands set up below (below 25.0, 25.0 to 30.0, and 30.0 or above).

Select TRANSFORM and then RECODE and INTO DIFFERENT VARIABLES. Select BMI from the source list into the central INPUT VARIABLE – OUTPUT VARIABLE box. Enter BMIGRP into the Name box and click on Change to complete the INPUT VARIABLE – OUTPUT VARIABLE box. Also enter a suitable variable label for BMIGRP in the LABEL box (e.g., categorical body mass index).

To set up the recoding, click on OLD and NEW VALUES…. We build up the recode specification for the third category of BMIGRP first. In the OLD VALUE box, select RANGE and THROUGH HIGHEST and enter 30.0 in the box before THROUGH HIGHEST. In the NEW VALUE section, enter 3 into the VALUE box. Then click on ADD to copy the specification 30.0 THROUGH HIGHEST = 3 to the OLD – NEW box. Build up the other two specifications, in the order 25.0 THROUGH 30.0 = 2 and then LOWEST THROUGH 25.0 = 1. Now run the completed command.

To finish, double-click on BMIGRP in the Data Editor window, and define suitable value labels (i.e., 1 = okay, 2 = overweight, 3 = obese).

• Are the values of BMIGRP correct for the first ten cases?
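The three specifications share their boundary values (25.0 and 30.0 each appear in two ranges), so the order in which they are tested decides where a boundary case lands. The Python sketch below assumes that the first matching specification wins, which is why the highest band is tested first; the function name and test values are illustrative.

```python
def bmi_group(bmi):
    """Recode BMI into 1, 2 or 3, testing the highest band first."""
    if bmi is None:
        return None      # a missing BMI stays missing
    if bmi >= 30.0:      # 30.0 THROUGH HIGHEST = 3
        return 3
    if bmi >= 25.0:      # 25.0 THROUGH 30.0 = 2 (30.0 itself was caught above)
        return 2
    return 1             # LOWEST THROUGH 25.0 = 1 (25.0 itself was caught above)


LABELS = {1: "okay", 2: "overweight", 3: "obese"}
for value in (24.9, 25.0, 30.0):
    print(value, bmi_group(value), LABELS[bmi_group(value)])
```

Checking the first ten cases by hand against a rule like this is a reasonable way to answer the question above, paying particular attention to any case sitting exactly on 25.0 or 30.0.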

Filtering Cases

In this example, we shall filter cases. The filtering option allows you to exclude certain cases from further analysis temporarily.

Before filtering, generate a two-way frequency table for ownrent by typaccm by selecting ANALYZE, then DESCRIPTIVE STATISTICS and then CROSSTABS, and selecting ownrent for Row(s) and typaccm for Column(s). Run the command and look at the table in the output.

1. What exactly does the frequency count in the first cell of the second table refer to? 6 what?

We shall filter using the variable PERSNO, which is the number of persons in the household.

2. What will be the effect of selecting cases satisfying the condition persno = 1? What is the impact on households?

Now, select DATA and SELECT CASES and then IF CONDITION IS SATISFIED, and make sure that UNSELECTED CASES are FILTERED. (This is very important as the alternative is DELETED, which we want to avoid now!)

Select IF… and build up the condition persno = 1 in the large box. Run the completed command. Find persno in the Data Editor window.

3. What appears in the status bar when filtering is in effect? (The status bar is at the bottom of the window.)

4. What has happened to case numbers with persno ≠ 1?

Rerun the CROSSTABS command (via Analyze – Descriptive Statistics) and look at the new table in the output.

5. What exactly does the frequency count in the first cell refer to now? 3 what?

Go to the Data Editor window and save the filtered data as familyf.sav. Then select DATA, SELECT CASES and then ALL CASES. Run the command.

6. What happens to the status bar and the case numbers?
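Filtering can be pictured as flagging cases rather than removing them: every case stays in the data, but only flagged cases are counted. The Python sketch below illustrates this with hypothetical cases and invented codes for ownrent and typaccm; the "selected" key is an illustrative name for the flag, not a claim about how SPSS stores it.

```python
from collections import Counter

# Hypothetical cases; the codes for ownrent and typaccm are invented.
cases = [
    {"persno": 1, "ownrent": 1, "typaccm": 1},
    {"persno": 2, "ownrent": 1, "typaccm": 1},
    {"persno": 1, "ownrent": 2, "typaccm": 3},
    {"persno": 3, "ownrent": 1, "typaccm": 2},
]


def crosstab(rows):
    """Count cases for each (ownrent, typaccm) combination."""
    return Counter((r["ownrent"], r["typaccm"]) for r in rows)


def apply_filter(rows, condition):
    """Flag each case as selected or not, without removing any case."""
    for r in rows:
        r["selected"] = condition(r)


apply_filter(cases, lambda r: r["persno"] == 1)
selected = [r for r in cases if r["selected"]]

# First cell before filtering, first cell after filtering, cases still held
print(crosstab(cases)[(1, 1)], crosstab(selected)[(1, 1)], len(cases))
```

The cell count changes once the filter is on, while the number of cases held in the data does not, which is the behaviour the questions above ask you to observe.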


Deleting Cases

Instead of filtering cases, we shall delete unselected cases without doing any harm to data stored in disk system files. Select DATA, SELECT CASES, IF CONDITION IS SATISFIED, which picks up the previous condition persno = 1. Then select UNSELECTED CASES are DELETED. Run the command and have a look at the Data Editor window.

1. How many cases are left?

2. What are the values of PERSNO?

3. What are the values of HSEMO? What does that successfully show?

Now, rerun the CROSSTABS command in the previous section and look at the output.

4. Do the results agree with those obtained when cases are filtered?

Return to the Data Editor window and save the selected cases to a NEW system file named familyd.sav (after deleting cases you should do this as soon as possible to avoid overwriting your complete data file by accident).
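Deleting differs from filtering in that the unselected cases are gone from the working data, while the stored copy is untouched until something is saved over it. A minimal Python sketch of that distinction, with hypothetical cases and an illustrative function name:

```python
import copy

# Stands in for the complete data file held on disk (hypothetical cases).
original = [{"persno": 1}, {"persno": 2}, {"persno": 1}, {"persno": 4}]


def delete_unselected(rows, condition):
    """Return a new dataset holding only the cases that satisfy the condition."""
    return [copy.deepcopy(r) for r in rows if condition(r)]


working = delete_unselected(original, lambda r: r["persno"] == 1)
print(len(working), len(original))
```

Saving the reduced data under a new name, as the text advises, plays the same role as keeping `working` and `original` as separate objects here.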

Finally, re-open familyf.sav, the filtered file you saved from the previous section.

5. Is filtering still on?

Exit from SPSS, saving the contents of the output window into output3.spo.

Open up family.sav that you saved to your survey folder.


WEEK 3: October 17th

T-Tests

Section I: Parametric T-tests (related & unrelated)

This practical will show you how to run a t-test so that you can look at the difference between the means of two sets of scores.

Experimental designs can be of two basic types: within subject (dependent or related) and between subject (independent or unrelated). The former is when all subjects are subjected to all conditions (e.g., testing reaction times before and after receiving a drug). Between subject designs are when you divide subjects into independent groups, such as on the basis of gender, or into one group that receives a drug and a second that receives a placebo.

DEPENDENT OR RELATED SAMPLES T-TEST

First, a quick review of the test layouts.

1. Related Samples: two variables, one for each condition of the experiment. Each subject has two scores as a result:

Sub No: subject number
Variable 1: first set of scores for the subjects, e.g. reaction time before taking the drug
Variable 2: second set of scores for the subjects, e.g. reaction time after taking the drug

2. Independent or Unrelated Samples: two variables. The first tells SPSS what condition EACH subject belongs to; the second is the actual score for that subject:

Sub No: subject number
Variable 1: what condition each subject belongs to, e.g. group 1 are the controls, group 2 receive the drug
Variable 2: actual score, e.g. each subject's reaction time

Example row: Sub No 1 (control), Variable 1 = subject's condition (1), Variable 2 = subject 1 score
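The two layouts can be written out as small data structures to make the difference concrete. The Python sketch below uses hypothetical reaction times and illustrative key names; the group codes follow the example in the text (1 = control, 2 = drug).

```python
# Related samples: one row per subject, one column per condition.
related = [
    {"subject": 1, "before": 310, "after": 295},
    {"subject": 2, "before": 280, "after": 284},
]

# Independent samples: one row per subject, a group code plus one score column.
independent = [
    {"subject": 1, "group": 1, "score": 310},  # 1 = control
    {"subject": 2, "group": 2, "score": 284},  # 2 = drug
]


def scores_by_group(rows):
    """Collect the scores of an independent-samples layout, one list per group code."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(row["score"])
    return groups


print(scores_by_group(independent))
```

In the related layout each subject contributes two scores on one row; in the independent layout each subject contributes one score, and the group code is what tells the analysis which mean it belongs to.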


T-Test for Related Samples

This is the parametric comparison of two related groups, for example, when you want to compare mean scores for subjects at some task before and after taking a drug. Each set of subject scores for the related t-test must be entered as an individual variable in SPSS. So, in the above example, all the individuals' scores for the task before taking the drug would be in one column and all the scores after taking the drug in another.

First, open family.sav. The next step is to add a variable to the data file, so that we can run the related t-test. In this case, the comparison will be between the subjects' height/weight ratio before they were put on a 4-week diet/exercise plan and after. The variable already in the data set, HWRATIO, is the measure before. At the end of the data file, add the variable HWRATIO2 to represent their measurements after the plan. Using what you learned in the first lesson about entering data, create the new variable using the information below:

Variable Name: HWRATIO2

Variable Label: Height/Weight Ratio after plan

Data: see table 1 below

To run the procedure, go to ANALYZE, COMPARE MEANS and then PAIRED-SAMPLES T-TEST.

The usual dialogue box appears, with the two-column format. The only difference is that you must select pairs of variables and move them across, rather than just one variable at a time. To do this, click on one variable, then locate the other variable and click on it. The two variables that you have requested should appear in the current selection box. After clicking on both, press the arrow button to move the pair across. SPSS will analyse each pair to determine if their means are significantly different statistically. In this case, select the variables HWRATIO and HWRATIO2 and move them across, then press the OK button.

Table 1: Data for Height/Weight Ratio after a 4-week diet/exercise plan


14 50

OUTPUT

The results appear in three sections.

• The first section gives you a table called Paired Samples Statistics with the mean scores, standard deviations and standard error of the mean for the two variables.

• The second section is a table called Paired Samples Correlations showing the correlation between the two variables and its level of significance.

• The third section is the most important. The table called Paired Samples Test indicates the significance of the results. This includes the t-value, degrees of freedom (d.f.) and the two-tailed significance level.

What is the t-value for the comparison between the height to weight ratio scores?

Is there a significant difference between the scores before and after the diet/exercise plan? If so, which is the greater height/weight ratio?
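To see where the t-value and degrees of freedom in the Paired Samples Test table come from, the calculation can be sketched in Python. The scores below are hypothetical (the handbook's Table 1 data are not reproduced), and the sketch stops at t and d.f. because the significance level needs the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired-samples t-test."""
    if len(before) != len(after):
        raise ValueError("each subject needs a score in both conditions")
    diffs = [b - a for b, a in zip(before, after)]  # one difference per subject
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))      # mean difference over its standard error
    return t, n - 1


before = [52, 48, 55, 50, 47]  # hypothetical ratios before the plan
after = [50, 47, 51, 49, 47]   # hypothetical ratios after the plan
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

Because the test works on the within-subject differences, the sign of t simply reflects which column was subtracted from which.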

T-Test for Independent Samples

This is the parametric t-test for two independent samples: a between-subjects design where, for example, subjects are randomly assigned to two separate test conditions (e.g. drug and control) and the mean scores (e.g. reaction time) are compared to determine if they are significantly different from each other.

In this case, you want to test whether there is a statistical difference in weight to height ratios between the male and female subjects. The format for variables to be used in the independent t-test is different from that used in the related t-test. Instead of the scores being placed in two separate columns (variables), all of the scores are placed in a single column (variable). A second variable identifies for SPSS which of the two groups each score belongs to. So, in this case, there is the variable HWRATIO2 as the dependent variable and NSEX as the independent variable.

To run the analysis, go to ANALYZE, COMPARE MEANS and then INDEPENDENT-SAMPLES T-TEST. As usual, the left column lists all the variables in your data file. On the right, there are two boxes:

• The test variable(s) box is where you move the dependent variable(s).

• The grouping variable box is where you move the variable that distinguishes between the two independent groups (e.g. the variable NSEX).

First, select the dependent variable HWRATIO2 and move it over to the test variable(s) section. Next, move NSEX over into the grouping variable section and press the DEFINE GROUPS button. Values from the grouping variable must be entered into the two boxes. In the case of the variable sex, where only two levels are recorded, you would just enter "1" in the top box for male subjects, and "2" in the lower one for female subjects. Hit the CONTINUE button, then hit the OK button.

[Note: There may be times where you have a larger range of values, such as five different education levels, but only want to look at the difference between two of them. You would enter the two values you wish to compare.]

OUTPUT

There are two sections:

• The first section of the output gives you a table called Group Statistics, which indicates the number of cases, the mean scores, etc. for each condition.

• The second section provides a table called Independent Samples T-test and starts with Levene's Test for Equality of Variance. If Levene's test is significant, the variances of the two groups are unequal, and when you look at the results of the t-test in the final table you use the line starting with Equal variances not assumed. If it isn't significant, you look at the line starting with Equal variances assumed. The final table gives you t-values, degrees of freedom and the two-tailed significance levels.

In this case, Levene's test is not significant (0.137), so we look at the Equal variances assumed line. The t-test is not significant (two-tailed significance of .478), so we cannot reject the null hypothesis of no difference: there is no evidence that males and females differ in their height to weight ratios.
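The two lines of the t-test table correspond to two ways of computing the standard error. The Python sketch below shows both, assuming the pooled-variance formula for the Equal variances assumed line and the Welch-Satterthwaite approximation for the Equal variances not assumed line; the scores are hypothetical and Levene's test itself is not computed.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2, equal_variances=True):
    """Return (t, degrees of freedom) for two independent groups."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)
    diff = mean(group1) - mean(group2)
    if equal_variances:
        # Pooled variance: both groups share one variance estimate
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        return diff / sqrt(pooled * (1 / n1 + 1 / n2)), n1 + n2 - 2
    # Separate variances with Welch-Satterthwaite degrees of freedom
    a, b = v1 / n1, v2 / n2
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return diff / sqrt(a + b), df


males = [48, 52, 50, 47, 53]  # hypothetical scores, group coded 1
females = [49, 51, 46, 50]    # hypothetical scores, group coded 2
t_pooled, df_pooled = independent_t(males, females)
t_welch, df_welch = independent_t(males, females, equal_variances=False)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"separate: t = {t_welch:.3f}, df = {df_welch:.2f}")
```

When the two group variances are similar, the two lines give nearly the same t, which is why the choice between them only matters when Levene's test is significant.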

Section II: Non-Parametric T-tests (Wilcoxon - related & Mann-Whitney - unrelated)

All of the tests today can be found under ANALYZE, NONPARAMETRIC TESTS.

Mann-Whitney - Unrelated

This is the non-parametric test for two independent samples, in a between-subjects design. To run the analysis, choose: ANALYZE, NONPARAMETRIC TESTS, and 2 INDEPENDENT SAMPLES.

As usual, the left column lists all the variables in your data file. On the right, there are two boxes:

• The "test variable(s)" box is where you move the dependent variable(s).

• The "grouping variable" box is where you move the variable that distinguishes between the two independent groups (e.g. the variable sex).

So, move HWRATIO2 into the test variable box, and move NSEX into the grouping variable box. Now, click the Define Groups button. Values from the grouping variable must be entered into the two boxes. In the case of the variable NSEX, you enter "1" in the top box for male subjects, and "2" in the lower one for female subjects. Hit the Continue button, then hit the OK button.

OUTPUT

SPSS divides the entire set of subjects into three groups:

• those with a score of 1 (male)

• those with a score of 2 (female)

• cases with missing data (which are excluded from the analysis)

The first section gives the mean ranks for the two conditions that are included, as well as the sums of the ranks and the numbers of cases.

The second section gives the Z score and p-values for the test.

Is there a difference between males and females? How do the results from this week compare to last week's?
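The mean ranks and Z score in the output come from ranking all scores together and comparing the rank sums of the two groups. The Python sketch below illustrates the idea with hypothetical scores; it reports U for the first group and a Z score without any correction for ties, so its figures may differ from SPSS output when ties are present or when a different but equivalent form of U is reported.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney(group1, group2):
    """Return mean ranks, U for group1 and a z score (no tie correction)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = r1 - n1 * (n1 + 1) / 2
    z = (u1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return {"mean_rank_1": r1 / n1, "mean_rank_2": r2 / n2, "U": u1, "z": z}


males = [48, 52, 50, 47, 53]  # hypothetical scores, group coded 1
females = [49, 51, 46, 45]    # hypothetical scores, group coded 2
result = mann_whitney(males, females)
print(result)
```

Because only the ranks enter the calculation, an extreme score affects the result no more than any other score that happens to be the largest in the data.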

Wilcoxon - Related

This is the non-parametric repeated measures test, in a within-subjects design. Like the parametric equivalent, we'll be running a comparison of height to weight ratios for the sample population before and after a four-week exercise/diet program. To run the analysis, choose: ANALYZE, NONPARAMETRIC TESTS, and 2 RELATED SAMPLES.
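The comparison behind 2 RELATED SAMPLES works on the ranked sizes of each subject's before/after difference. The following Python sketch shows the ranking step only, with hypothetical scores; it assumes differences are taken as after minus before and that subjects with no change are set aside, and it does not compute a Z score or significance level.

```python
def signed_rank_sums(before, after):
    """Counts and rank sums of negative and positive differences (after - before)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # unchanged subjects set aside
    by_size = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(by_size):
        j = i
        while (j + 1 < len(by_size)
               and abs(diffs[by_size[j + 1]]) == abs(diffs[by_size[i]])):
            j += 1
        for k in range(i, j + 1):
            ranks[by_size[k]] = (i + j) / 2 + 1  # average rank for equal-sized differences
        i = j + 1
    negative = [r for d, r in zip(diffs, ranks) if d < 0]
    positive = [r for d, r in zip(diffs, ranks) if d > 0]
    return {"n_negative": len(negative), "sum_negative": sum(negative),
            "n_positive": len(positive), "sum_positive": sum(positive),
            "n_ties": len(before) - len(diffs)}


before = [52, 48, 55, 50, 47, 49]  # hypothetical ratios before the plan
after = [50, 47, 51, 53, 47, 44]   # hypothetical ratios after the plan
summary = signed_rank_sums(before, after)
print(summary)
```

If the plan had no effect, the negative and positive rank sums would be expected to be roughly equal; a large imbalance is what the test looks for.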

The dialogue box has the two-column format. The only difference is that you must select pairs of variables and move them across. SPSS will analyse each pair to determine if their mean ranks are significantly different statistically. For this analysis, select the two variables HWRATIO and HWRATIO2, then click the OK button.

OUTPUT

The output for this procedure is quite different from the parametric test The first section gives you information about how many rank scores for one condition are
