Cohen’s Kappa for Reliability With Nominal Data


When you have two nominal variables with the same values (usually two raters’ observations or scores using the same codes), you can compute Cohen’s kappa to check the reliability or agreement between the measures. Kappa makes a few assumptions about the underlying nature of the data: 1) participants are independent of each other, 2) the raters, reporters, or observers providing the data do so independently of one another, and 3) the rating categories are mutually exclusive and exhaustive. The data typically are nominal, but they can be ordinal; if the data are normally distributed, other reliability measures are preferable. Imagine that the ethnicity variable was based on school records. Then, a second variable was obtained by asking students to self-report their ethnicity. The question is, how reliable is the interobserver classification of ethnicity?
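Although the formula is not shown in this text, kappa is conventionally defined in terms of the observed proportion of agreement (p_o) and the proportion of agreement expected by chance (p_e), which is computed from the row and column totals of the crosstabulation:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

A kappa of 1.0 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.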

7.5. What is the reliability coefficient for the ethnicity codes (based on school records) and ethnicity reported by the student?

To compute the kappa:

• Click on Analyze → Descriptive Statistics → Crosstabs...

• Click on Reset to clear the previous entries.


• Move ethnicity (ethnic) to the Rows box and ethnicity reported by student (ethnic2) to the Columns box.

• Click on Statistics and check Kappa in the Statistics dialog box.

• Click on Continue to go back to the Crosstabs dialog window.

• Then click on Cells and check Observed under Counts and Total under Percentages.

• Click on Continue and then OK. Compare your syntax and output to Output 7.5.

Output 7.5: Cohen’s Kappa With Nominal Data

CROSSTABS
  /TABLES=ethnic BY ethnic2
  /FORMAT=AVALUE TABLES
  /STATISTICS=KAPPA
  /CELLS=COUNT TOTAL
  /COUNT ROUND CELL.

Crosstabs

Case Processing Summary

                                                      Cases
                                        Valid            Missing            Total
                                     N    Percent      N    Percent      N    Percent
ethnicity * ethnicity
reported by student                 71     94.7%       4     5.3%       75    100.0%

ethnicity * ethnicity reported by student Crosstabulation

                                             ethnicity reported by student
ethnicity                       Euro-Amer  African-Amer  Latino-Amer  Asian-Amer    Total
Euro-Amer       Count               40           1             0           0          41
                % of Total        56.3%        1.4%          .0%         .0%        57.7%
African-Amer    Count                2          11             1           0          14
                % of Total         2.8%       15.5%          1.4%        .0%        19.7%
Latino-Amer     Count                0           1             8           0           9
                % of Total          .0%        1.4%         11.3%        .0%        12.7%
Asian-Amer      Count                0           1             0           6           7
                % of Total          .0%        1.4%          .0%         8.5%        9.9%
Total           Count               42          14             9           6          71
                % of Total        59.2%       19.7%         12.7%        8.5%      100.0%

Agreements between school records and the student’s report are the counts on the diagonal (40, 11, 8, and 6). The six disagreements are the nonzero counts off the diagonal.

Symmetric Measures

                                      Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Measure of Agreement   Kappa           .858           .054               11.163          .000
N of Valid Cases                         71

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

As a measure of reliability, kappa should be high (usually > .70), not just statistically significant.

Interpretation of Output 7.5

The Case Processing Summary table shows that 71 students have data on both variables; 4 students are missing data on one or both. The Crosstabulation table of ethnicity and ethnicity reported by student shows the cases where the school records and the student self-reports agree; these are the counts on the diagonal, and there are 65 (40 + 11 + 8 + 6) students with such agreement or consistency. The remaining 6 students disagree with the school records, as indicated by the counts off the diagonal.

The Symmetric Measures table shows that kappa is .86. This indicates good reliability because such measures should be high (usually > .70) and positive. Statistical significance alone is not sufficient evidence of adequate reliability.
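As a check on the output, kappa can be computed by hand from the counts and marginal totals in the Crosstabulation table (this worked calculation is added here; it is not part of the original output):

$$p_o = \frac{40 + 11 + 8 + 6}{71} = \frac{65}{71} = .915$$

$$p_e = \frac{(41)(42) + (14)(14) + (9)(9) + (7)(6)}{71^2} = \frac{2041}{5041} = .405$$

$$\kappa = \frac{.915 - .405}{1 - .405} = \frac{.510}{.595} \approx .858$$

which matches the Value reported in the Symmetric Measures table.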

Example of How to Write About Problem 7.5 Results

Cohen’s kappa was computed to assess the agreement between the ethnicity codes based on school records and the ethnicity reported by the students, kappa = .86. This indicates high reliability between the students’ reports and the school records.

Interpretation Questions

7.1. In Output 7.1: (a) What do the terms “count” and “expected count” mean? (b) What does the difference between them tell you?

7.2. In Output 7.1: (a) Is the (Pearson) chi-square statistically significant? Explain what it means. (b) Are the expected values in at least 80% of the cells > 5? How do you know? Why is this important?

7.3. In Output 7.2: (a) How is the risk ratio calculated? What does it tell you? (b) How is the odds ratio calculated and what does it tell you? (c) How could information about the odds ratio be useful to people wanting to know the practical importance of research results? (d) What are some limitations of the odds ratio as an effect size measure?

7.4. Because father’s and mother’s education revised are at least ordinal data, which of the statistics used in Problem 7.3 is the most appropriate to measure the strength of the relationship: phi, Cramer’s V, or Kendall’s tau-b? Interpret the results. Why are tau-b and Cramer’s V different?


7.5. In Output 7.4: (a) How do you know which is the appropriate value of eta? (b) Do you think it is high or low? Why? (c) How would you describe the results?

7.6. Write an additional sentence or two describing disagreements in Output 7.5 that you might include in a detailed research report.

Extra Problems

Using the College Student data file, do the following problems. Print your outputs after typing your interpretations on them. Please circle the key parts of the output that you discuss.

7.1. Run crosstabs and interpret the results of chi-square and phi (or Cramer’s V), as discussed in Chapter 6 and in the interpretation of Output 7.1, for: (a) gender and marital status and (b) age group and marital status.

7.2. Select two other appropriate variables; run and interpret the output as we did in Output 7.1.

7.3. Is there an association between having children or not and watching TV sitcoms?

7.4. Is there a difference between students who have children and those who do not in regard to their age group?

7.5. Compute an appropriate statistic and effect size measure for the relationship between gender and evaluation of social life.

Correlation and Regression

In this chapter, you will learn how to compute several associational statistics, after you learn how to make scatterplots and how to interpret them. An assumption of the Pearson product moment correlation is that the variables are related in a linear (straight line) way so we will examine the scatterplots to see if that assumption is reasonable. Second, the Pearson correlation and the Spearman rho will be computed. The Pearson correlation is used when you have two variables that are normal/scale, and the Spearman is used when one or both of the variables are ordinal.

Third, you will compute a correlation matrix indicating the associations among all the pairs of three or more variables. Fourth, we will show you how to compute Cronbach’s alpha, the most common measure of reliability, which is based on a correlation matrix. Fifth, you will compute simple or bivariate regression, which is used when one wants to predict scores on a normal/scale dependent (outcome) variable from one normal or scale independent (predictor) variable. Last, we will provide an introduction to a complex associational statistic, multiple regression, which is used to predict a scale/normal dependent variable from two or more independent variables.

The correlations in this chapter can vary from –1.0 (a perfect negative relationship or association) through 0.0 (no correlation) to +1.0 (a perfect positive correlation). Note that +1 and –1 are equally high or strong, but they lead to different interpretations. A high positive correlation between anxiety and grades would mean that students with higher anxiety tended to have higher grades, those with lower anxiety had lower grades, and those in between had grades that were neither especially high nor especially low. A high negative correlation would mean that students with high anxiety tended to have low grades; also, high grades would be associated with low anxiety. With a zero correlation there are no consistent associations. A student with high anxiety might have low, medium, or high grades.

Assumptions and Conditions for the Pearson Correlation (r) and Bivariate Regression

1. The two variables have a linear relationship. We will show how to check this assumption with a scatterplot in Problem 8.1; a minimal syntax sketch also follows this list. (Pearson r will not detect a curvilinear relationship unless you transform the variables, which is beyond the scope of this book.)

2. Scores on one variable are normally distributed for each value of the other variable and vice versa. If degrees of freedom are greater than 25, failure to meet this assumption has little consequence. Statistics designed for normally distributed data are called parametric statistics.

3. Outliers (i.e., extreme scores) can have a big effect on the correlation.
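For reference, here is a minimal syntax sketch (not from the original text) for producing the kind of bivariate scatterplot used to check assumption 1; varX and varY are placeholder names, not variables from the hsbdataB.sav file:

* Scatterplot to check that varX and varY are linearly related.
GRAPH
  /SCATTERPLOT(BIVAR)=varX WITH varY
  /MISSING=LISTWISE.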

Assumptions and Conditions for Spearman Rho (rs)

1. Data on both variables are at least ordinal. Statistics designed for ordinal data and which do not assume normal distribution of data are called nonparametric statistics.

2. Scores on one variable are monotonically related to the other variable. This means that as the values of one variable increase, the other should also increase but not necessarily in a linear (straight line) fashion. The curve can flatten but cannot go both up and down as in a U or J.

Rho is computed by ranking the data for each variable and then computing a Pearson product moment correlation. The program will do this for you automatically when you request a Spearman correlation.
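To make the ranking step concrete, here is a minimal syntax sketch (added here, not from the original text; varX and varY are placeholder variable names). The first command requests Spearman’s rho directly; the remaining commands show that ranking each variable and then computing Pearson r on the ranks gives the same coefficient.

* Spearman rho requested directly.
NONPAR CORR
  /VARIABLES=varX varY
  /PRINT=SPEARMAN TWOTAIL NOSIG.

* The same coefficient obtained by ranking the data, then computing Pearson r on the ranks.
RANK VARIABLES=varX varY
  /RANK INTO rankX rankY
  /TIES=MEAN.
CORRELATIONS
  /VARIABLES=rankX rankY
  /PRINT=TWOTAIL NOSIG.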



• Retrieve hsbdataB.sav.
