Chi‐square Goodness‐of‐fit Test

Part of the document Spss® Data Analysis For Univariate, Bivariate And Multivariate Statistics (2019).Pdf (pages 59 - 62)

Binomial Test (continued)

We move coin_flips over under Test Variable List. We set the Test Proportion at 0.50 since that is the hypothesized value under the null hypothesis. Next, click on Options and select Exact:

NPAR TESTS
  /BINOMIAL (0.50)=coin_flips
  /MISSING ANALYSIS
  /METHOD=EXACT TIMER(5).

NPar Tests

Binomial Test

                      Category   N   Observed Prop.   Test Prop.   Exact Sig. (2-tailed)   Point Probability
coin_flips  Group 1   1.00       2   .40              .50          1.000                   .312
            Group 2   .00        3   .60
            Total                5   1.00

A binomial test was conducted to evaluate the tenability that a coin is fair on which we obtained two heads out of five flips. The probability of getting such a result under the null hypothesis of fairness (p = 0.5) was equal to 0.312, suggesting that such a result (two heads out of five flips) is not that uncommon on a fair coin.

Hence, we have no reason to reject the null hypothesis that the coin is fair.

The chi‐square goodness‐of‐fit test is useful for data that are in the form of counts (as was true for Cohen’s kappa) and for which we would like to evaluate whether there is an association between two variables. An example will best demonstrate the kinds of data for which it is suitable. Consider the following 2 × 2 contingency table, in which each cell holds the count under each category. The hypothetical data come from Denis (2016, p. 92), where the column variable is “condition” and has two levels (present vs. absent).

The row variable is “exposure” and likewise has two levels (exposed yes vs. not exposed). Let us imagine the condition variable to be post‐traumatic stress disorder and the exposure variable to be war experience. The question we are interested in asking is:

Is exposure to war associated with the condition of PTSD?

We can see in the cells that 20 individuals in our sample who have been exposed to war have the condition present, while 10 who have been exposed to war have the condition absent. We also see that of those not exposed, 5 have the condition present, while 15 have the condition absent. The totals for each row and column are given in the margins (e.g. 20 + 10 = 30 in row 1).

                    Condition present (1)   Condition absent (0)   Total
Exposure yes (1)    20                      10                     30
Exposure no (2)     5                       15                     20
Total               25                      25                     50

We would like to test the null hypothesis that exposure and condition are unrelated, so that the cell frequencies differ from those expected under the null only by chance. To get the expected cell frequencies, we compute the product of each cell's row and column marginal totals divided by the total frequency for the table:

                    Condition present (1)     Condition absent (0)      Total
Exposure yes (1)    E = [(30)(25)]/50 = 15    E = [(30)(25)]/50 = 15    30
Exposure no (2)     E = [(20)(25)]/50 = 10    E = [(20)(25)]/50 = 10    20
Total               25                        25                        50

Under the null hypothesis, we would expect the frequencies to be distributed according to the above (i.e. randomly, in line with marginal totals). The chi‐square goodness‐of‐fit test will evaluate whether our observed frequencies deviate enough from expectation that we can reject the null hypothesis of no association between exposure and condition.
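The expected-count rule and the resulting test statistic can be checked by hand. The sketch below is a minimal Python illustration, not part of the SPSS workflow in this section. The variable names are our own, and it computes the uncorrected Pearson statistic, that is, the sum over cells of (observed - expected) squared divided by expected.

```python
# Observed counts from the 2 x 2 table: rows = exposure (yes, no),
# columns = condition (present, absent).
observed = [[20, 10],
            [5, 15]]

row_totals = [sum(row) for row in observed]        # [30, 20]
col_totals = [sum(col) for col in zip(*observed)]  # [25, 25]
n = sum(row_totals)                                # 50

# Expected count for each cell = (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected)              # [[15.0, 15.0], [10.0, 10.0]]
print(round(chi_square, 3))  # 8.333
```

Every cell deviates from its expected count by 5, so the statistic reduces to 25/15 + 25/15 + 25/10 + 25/10.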

We enter our data into SPSS as below. To run the analysis, we compute in the syntax editor:

The output follows. We can see that SPSS arranged the table slightly differently from ours, but the information in the table is nonetheless consistent with our data:

Condition * Exposure Crosstabulation

Count
                    Exposure
                    1.00   2.00   Total
Condition   .00     10     15     25
            1.00    20     5      25
Total               30     20     50

Chi-Square Tests

                               Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             8.333(a)   1    .004
Continuity Correction (b)      6.750      1    .009
Likelihood Ratio               8.630      1    .003
Fisher’s Exact Test                                          .009         .004
Linear-by-Linear Association   8.167      1    .004
N of Valid Cases               50

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 10.00.
b. Computed only for a 2×2 table

We see above that our obtained Pearson Chi‐Square value is equal to 8.333 on a single degree of freedom (p = 0.004), indicating that data departing this far from expectation would be very unlikely under the null hypothesis of no association between variables. Since this probability is less than 0.05, we reject the null hypothesis and conclude that exposure and condition are associated.
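The degrees of freedom and the asymptotic p-value for the Pearson statistic can be reproduced with the Python standard library alone. This is a sketch under one stated assumption: for a single degree of freedom, the upper-tail chi-square probability equals erfc(sqrt(x / 2)), so the shortcut below does not carry over to larger tables. The variable names are illustrative.

```python
import math

chi_square = 8.333   # Pearson Chi-Square value reported in the output
rows, cols = 2, 2
df = (rows - 1) * (cols - 1)

# For df = 1 only: P(chi-square >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

alpha = 0.05
print(df)                 # 1
print(round(p_value, 3))  # 0.004
print(p_value < alpha)    # True
```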

We could have also obtained our results via GUI had the frequencies been a priori “unpacked” – meaning the frequencies were given by each case in the data file (we show only the first 24 cases in the Data View above):

CROSSTABS
  /TABLES=Exposure BY Condition
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ
  /CELLS=COUNT EXPECTED
  /COUNT ROUND CELL
  /METHOD=EXACT TIMER(5).

Exposure * Condition Crosstabulation

                                    Condition
                                    .00     1.00    Total
Exposure   1.00   Count             10      20      30
                  Expected Count    15.0    15.0    30.0
           2.00   Count             15      5       20
                  Expected Count    10.0    10.0    20.0
Total             Count             25      25      50
                  Expected Count    25.0    25.0    50.0

Chi-Square Tests

                               Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             8.333(a)   1    .004
Continuity Correction (b)      6.750      1    .009
Likelihood Ratio               8.630      1    .003
Fisher’s Exact Test                                          .009         .004
Linear-by-Linear Association   8.167      1    .004
N of Valid Cases               50

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 10.00.
b. Computed only for a 2×2 table

Notice that the expected counts in the above table match up with the expected counts per cell that we computed earlier. Fisher’s exact test, with two‐sided p‐value of 0.009 (and one‐tailed exact p‐value of 0.004), is useful when expected counts per cell are relatively small (e.g. less than 5 in some cells is a useful guideline).
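Fisher's exact p-values can be reproduced from the hypergeometric distribution, which gives the probability of each possible table once the margins are fixed. The sketch below is illustrative Python with our own variable names. For the two-sided value it assumes the common convention of summing every table whose probability does not exceed that of the observed table; the text does not state which convention SPSS applies, so treat that choice as an assumption.

```python
from math import comb

# Fixed margins from the table: 30 exposed, 20 not exposed,
# 25 with the condition present, 50 cases in total.
exposed, not_exposed, present = 30, 20, 25
observed_k = 20  # exposed AND condition present

# Unnormalised hypergeometric weight for each feasible count k of
# exposed cases with the condition present (integers avoid rounding).
k_min = max(0, present - not_exposed)
k_max = min(exposed, present)
weights = {k: comb(exposed, k) * comb(not_exposed, present - k)
           for k in range(k_min, k_max + 1)}
total = comb(exposed + not_exposed, present)

# One-sided: tables with at least as many exposed-and-present cases.
p_one_sided = sum(w for k, w in weights.items() if k >= observed_k) / total

# Two-sided: tables no more probable than the observed one.
p_two_sided = sum(w for w in weights.values()
                  if w <= weights[observed_k]) / total

print(round(p_one_sided, 3))  # 0.004
print(round(p_two_sided, 3))  # 0.009
```

Because the two column totals are equal (25 and 25), the distribution is symmetric here, so the two-sided value is exactly twice the one-sided value.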

A chi‐square goodness‐of‐fit test of independence was performed on frequencies to evaluate the null hypothesis that exposure to war is not associated with PTSD. The obtained value of chi‐square was equal to 8.333 and was found to be statistically significant (p = 0.004) for a two‐sided test.

Hence, there is evidence to suggest that exposure to war is associated with PTSD in the population from which these data were drawn.
