INTERNAL AND EXTERNAL VALIDITY

Excerpt from Research Design and Methods: A Process Approach, 9th edition (pages 134-140)

little or no effect of subliminal messages on behavior.

QUESTIONS TO PONDER

1. What are the characteristics of experimental research?

2. What is the relationship between an independent and a dependent variable in an experiment?

3. How do extraneous variables affect your research?

4. What can be done to control extraneous variables?

5. How does a demonstration differ from a true experiment?

6. What is the value of doing a demonstration?

INTERNAL AND EXTERNAL VALIDITY

Whether the general design of your study is experimental or correlational, you need to consider carefully two important but often conflicting attributes of any design: internal and external validity. In this section, we define these concepts and briefly discuss the factors that you should consider relating to internal and external validity when choosing a research design.

Internal Validity

Much of your research will be aimed at testing the hypotheses you developed long before you collected any data. The ability of your research design to adequately test your hypotheses is known as its internal validity (Campbell & Stanley, 1963). Essentially, internal validity is the ability of your design to test the hypothesis that it was designed to test.

In an experiment, this means showing that variation in the independent variable, and only the independent variable, caused the observed variation in the dependent variable. In a correlational study, it means showing that changes in the value of your criterion variable relate solely to changes in the value of your predictor variable and not to changes in other, extraneous variables that may have varied along with your predictor variable.

Internal validity is threatened to the extent that extraneous variables can provide alternative explanations for the findings of a study, or as Huck and Sandler (1979) call them, rival hypotheses. As an example, imagine that an instructor wants to know whether a new teaching method works better than the traditional method used with students in an introductory psychology course. The instructor decides to answer this question by using the new method to teach her morning section of introductory psychology and using the traditional method to teach her afternoon section. Both sections will use the same text, cover the same material, and receive the same tests. The effectiveness of the two methods will be assessed by comparing the average scores achieved on the test by the two sections. Now, imagine that the instructor conducts the study and finds that the section receiving the new method receives a substantially higher average grade than the section receiving the traditional method. She concludes that the new method is definitely better for teaching introductory psychology. Is she justified in drawing this conclusion?

The answer, as you probably suspected, is no. Several rival hypotheses cannot be eliminated by the study, explanations at least as credible as the instructor’s view that the new method was responsible for the observed improvement in average grade.

Consider the following rival hypotheses:

1. The morning students did better because they were “fresher” than the afternoon students.

2. The morning students did better because their instructor was “fresher” in the morning than in the afternoon.

3. The instructor expected the new method to work better and thus was more enthusiastic when using the new method than when using the old one.

4. Students who registered for the morning class were more motivated to do well in the course than those who registered for the afternoon class.

These rival hypotheses do not exhaust the possibilities; perhaps you can think of others. Because the study was not designed to rule out these alternatives, there is no way to know whether the observed difference between the two sections in student performance was due to the difference in teaching methods, instructor enthusiasm, alertness of the students, or other factors whose levels differed across the sections. Whenever two or more variables combine in such a way that their effects cannot be separated, a confounding of those variables has occurred. In the teaching study, teaching method is confounded by all those variables just listed and more. Such a study lacks internal validity.
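A small simulation can make the idea of confounding concrete. The sketch below is hypothetical: the baseline score of 70, the 5-point effects, the noise level, and the section sizes are all assumptions chosen for illustration, not values from the teaching study. It compares two imagined worlds, one in which only the teaching method matters and one in which only the meeting time matters, and shows that the design described above produces the same observed difference in both.

```python
import random
import statistics


def simulate_section(n_students, method_effect, time_effect, rng):
    """Return simulated test scores for one class section.

    Each score is an assumed baseline of 70, plus the contribution of the
    teaching method, plus the contribution of the meeting time, plus noise.
    """
    return [
        70 + method_effect + time_effect + rng.gauss(0, 5)
        for _ in range(n_students)
    ]


def observed_difference(method_effect, time_effect, n_students=2000, seed=1):
    """Mean of the morning/new-method section minus the mean of the
    afternoon/traditional-method section."""
    rng = random.Random(seed)
    # Morning section: new method AND morning meeting time, always together.
    morning = simulate_section(n_students, method_effect, time_effect, rng)
    # Afternoon section: traditional method AND afternoon meeting time.
    afternoon = simulate_section(n_students, 0.0, 0.0, rng)
    return statistics.mean(morning) - statistics.mean(afternoon)


# World A (hypothetical): the new method adds 5 points; time of day adds nothing.
diff_method_only = observed_difference(method_effect=5.0, time_effect=0.0)
# World B (hypothetical): the method adds nothing; morning "freshness" adds 5 points.
diff_time_only = observed_difference(method_effect=0.0, time_effect=5.0)

print(round(diff_method_only, 2), round(diff_time_only, 2))
```

Because method and meeting time always vary together in this design, the two worlds yield indistinguishable data. Nothing in the observed difference tells you which variable produced it.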

Confounding, although always a matter of concern, does not necessarily present a serious threat to internal validity. Confounding is less problematic when the confounding variable is known to have little or no effect on the dependent or criterion variable or when its known effect can be taken into account in the analysis. For example, in the teaching study, it may be possible to eliminate concern about the difference in class meeting times by comparing classes that meet at different times but use the same teaching method. Such data may show that meeting time has only a small effect that can be ignored. If meeting time had a larger effect, you could arrange your study of teaching method so that the effect of meeting time would tend to make the new teaching method appear worse than the standard one, thus biasing the results against your hypothesis. If your results still favored the new teaching method, that outcome would have occurred despite the confounding rather than because of it. Thus, a study may include confounding and still maintain a fair degree of internal validity if the effects of the confounding variable in the situation under scrutiny are known.

This is fortunate because it is often impossible to eliminate all sources of confounding in a study. For example, the instructor in our example might have attempted to eliminate confounding by having students randomly assigned to two sections meeting simultaneously. This would certainly eliminate those sources of confounding related to any difference in the time at which the sections met, but now it would be impossible for the instructor to teach both classes. If a second instructor is recruited to teach one of the sections using the standard method, this introduces a new source of confounding in that the two instructors may not be equivalent in a number of ways that could affect class performance. Often the best that can be done is to substitute what you believe to be less serious threats to internal validity for the more serious ones.

Threats to Internal Validity

Confounding variables occur in both experimental and correlational designs, but they are far more likely to be a problem in the latter, in which tight control over extraneous variables is usually lacking. Campbell and Stanley (1963) identify seven general sources of confounding that may affect internal validity: history, maturation, testing, instrumentation, statistical regression, biased selection of subjects, and experimental mortality (Table 4-1).

History may confound studies in which multiple observations are taken over time. Specific events may occur between observations that affect the results. For example, a study of the effectiveness of an advertising campaign against drunk driving might measure the number of arrests for drunk driving immediately before and after the campaign. If the police institute a crackdown on drunk driving at the same time that the advertisements air, this event will destroy the internal validity of your study.

Maturation refers to the effect of age or fatigue. Performance changes observed over time due to these factors may confound those due to the variables being studied. You might, for example, assess performance on a proofreading task before and after some experimental manipulation. Decreased performance on the second proofreading assessment may be due to fatigue rather than to any effect of your manipulation.

TABLE 4-1 Factors Affecting Internal Validity

FACTOR: DESCRIPTION

History: Specific events other than the treatment occur between observations

Maturation: Performance changes due to age or fatigue confound the effect of treatment

Testing: Testing prior to the treatment changes how subjects respond in posttreatment testing

Instrumentation: Unobserved changes in observer criteria or instrument calibration confound the effect of the treatment

Statistical regression: Subjects selected for treatment on the basis of their extreme scores tend to move closer to the mean on retesting

Biased selection of subjects: Groups of subjects exposed to different treatments are not equivalent prior to treatment

Experimental mortality: Differential loss of subjects from the groups of a study results in nonequivalent groups

Testing effects occur when a pretest sensitizes participants to what you are investigating in your study. As a consequence, they may respond differently on a posttreatment measure than if no pretest were given. For example, if you measure participants’ racial attitudes and then manipulate race in an experiment on person perception, participants may respond to the treatment differently than if no such pretest of racial attitudes was given.

In instrumentation, confounding may be introduced by unobserved changes in criteria used by observers or in instrument calibration. If observers change what counts as “verbal aggression” when scoring behavior under two experimental conditions, any apparent difference between those conditions in verbal aggression could be due as much to the changed criterion as to any effect of the independent variable. Similarly, if an instrument used to record activity of rats in a cage becomes more (or less) sensitive over time, it becomes impossible to tell whether activity is really changing or just the ability of the instrument to detect activity.

Statistical regression threatens internal validity when participants have been selected based on extreme scores on some measure. When measured again, scores will tend to be closer to the average in the population. Thus, if students are targeted for a special reading program based on their unusually low reading test scores, they will tend to do better, on average, on retesting even if the reading program has no effect.
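Statistical regression is easy to demonstrate with a short simulation. The sketch below is a hypothetical model, not data from any reading program: it assumes each student has a stable ability (mean 100, SD 10) and that each test adds independent measurement noise (SD 10). The lowest-scoring 10% on the first test are selected and retested with no treatment at all.

```python
import random
import statistics


def regression_demo(n_students=10000, cutoff_fraction=0.10, seed=42):
    """Select the lowest scorers on test 1 and compare their retest mean."""
    rng = random.Random(seed)
    # Assumed model: stable ability plus independent noise on each test.
    ability = [rng.gauss(100, 10) for _ in range(n_students)]
    test1 = [a + rng.gauss(0, 10) for a in ability]
    test2 = [a + rng.gauss(0, 10) for a in ability]  # no treatment applied

    # Pick the students with the lowest first-test scores.
    n_selected = int(n_students * cutoff_fraction)
    lowest = sorted(range(n_students), key=lambda i: test1[i])[:n_selected]

    mean_first = statistics.mean(test1[i] for i in lowest)
    mean_retest = statistics.mean(test2[i] for i in lowest)
    population_mean = statistics.mean(test1)
    return mean_first, mean_retest, population_mean


mean_first, mean_retest, population_mean = regression_demo()
print(round(mean_first, 1), round(mean_retest, 1), round(population_mean, 1))
```

Under these assumptions the selected group's retest mean lands between its original mean and the population mean, even though nothing was done to the students. A real program evaluated this way would be credited with an improvement it did not cause.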

Biased selection of subjects threatens internal validity because subjects may differ initially in ways that affect their scores on the dependent measure. Any influence of the independent variable on scores cannot be separated from the effect of the preexisting bias. This problem typically arises when researchers use preexisting groups in their studies rather than assigning subjects to groups at random. For example, the effect of a program designed to improve worker job satisfaction might be evaluated by administering the program to workers at one factory (experimental group) and then comparing the level of job satisfaction of those workers to that of workers at another factory where the program was not given (control group). If workers given the job satisfaction program indicate more satisfaction with their jobs, is it due to the program or to preexisting differences between the two groups? There is no way to tell.
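The alternative to using preexisting groups, assigning subjects to groups at random, can be sketched in a few lines. The function name `randomly_assign` and the worker labels are hypothetical, introduced only for this illustration.

```python
import random


def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle participants and deal them into n_groups groups.

    Group sizes differ by at most one. The seed parameter is optional and
    is used here only to make the illustration reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    # Deal the shuffled participants out like cards, one group at a time.
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical pool of 20 workers drawn from both factories.
workers = [f"worker_{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(workers, seed=7)

print(len(experimental), len(control))
```

Because chance alone decides who lands in which group, preexisting differences among workers are not systematically tied to the treatment.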

Finally, experimental mortality refers to the differential loss of participants from groups in a study. For example, imagine that some people drop out of a study because of frustration with the task. A group exposed to difficult conditions is more likely to lose its frustration-intolerant participants than one exposed to less difficult conditions. Any differences between the groups in performance may be due as much to the resulting difference in participants as to any difference in conditions.
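The effect of differential dropout can also be simulated. The sketch below rests on assumptions that are not in the text: frustration tolerance is standard normal, performance is positively related to tolerance, and the dropout thresholds (0.0 for the difficult condition, -2.0 for the easy one) are arbitrary. The condition itself has no effect on performance in this model.

```python
import random
import statistics


def run_group(n, dropout_threshold, rng):
    """Simulate one group and return (mean performance of finishers, count).

    Participants whose frustration tolerance falls below dropout_threshold
    leave the study. Performance depends on tolerance and noise only, never
    on the condition (an assumption of this sketch).
    """
    finishers = []
    for _ in range(n):
        tolerance = rng.gauss(0, 1)
        performance = 50 + 5 * tolerance + rng.gauss(0, 5)
        if tolerance >= dropout_threshold:
            finishers.append(performance)
    return statistics.mean(finishers), len(finishers)


rng = random.Random(3)
# Difficult condition: anyone with below-average tolerance drops out.
hard_mean, hard_n = run_group(5000, 0.0, rng)
# Easy condition: only the least tolerant participants drop out.
easy_mean, easy_n = run_group(5000, -2.0, rng)

print(hard_n, round(hard_mean, 1), easy_n, round(easy_mean, 1))
```

In this model the difficult group ends up with fewer but more tolerant participants, so its mean performance is higher even though difficulty did nothing to performance. The group difference reflects who remained, not what the conditions did.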

Enhancing Internal Validity

The time to be concerned with internal validity is during the design phase of your study. During this phase, you should carefully plan which variables will be manipulated or observed and recorded, identify any plausible rival hypotheses not eliminated in your initial design, and redesign so as to eliminate those that seriously threaten internal validity. Discovering problems with internal validity after you have run your study is too late. A poorly designed study cannot be fixed later on.

External Validity

A study has external validity to the degree that its results can be extended (generalized) beyond the limited research setting and sample in which they were obtained. A common complaint about research using white rats or college students and conducted under the artificial conditions of the laboratory is that it may tell us little about how white rats and college sophomores (let alone animals or people in general) behave under the conditions imposed on them in the much richer arena of the real world.

The idea seems to be that all studies should be conducted in such a way that the findings can be generalized immediately to real-world situations and to larger populations. However, as Mook (1983) notes, it is a fallacy to assume “that the purpose of collecting data in the laboratory is to predict real-life behavior in the real world” (p. 381). Mook points out that much of the research conducted in the laboratory is designed to determine one of the following:

1. Whether something can happen, rather than whether it typically does happen.

2. Whether something we specify ought to happen (according to some hypothesis) under specific conditions in the lab does happen there under those conditions.

3. What happens under conditions not encountered in the real world.

In each of these cases, the objective is to gain insight into the underlying mechanisms of behavior rather than to discover relationships that apply under normal conditions in the real world. It is this understanding that generalizes to everyday life, not the specific findings themselves.

Threats to External Validity

In Chapter 1, we distinguished between basic research, which is aimed at developing a better understanding of the underlying mechanisms of behavior, and applied research, which is aimed at developing information that can be directly applied to solve real-world problems. The question of external validity may be less relevant in basic research settings that seek theoretical reasons to determine what will happen under conditions not usually found in natural settings or that examine fundamental processes expected to operate under a wide variety of conditions. The degree of external validity of a study becomes more relevant when the findings are expected to be applied directly to real-world settings. In such studies, external validity is affected by several factors. Using highly controlled laboratory settings (as opposed to naturalistic settings) is one such factor. Data obtained from a tightly controlled laboratory may not generalize to more naturalistic situations in which behavior occurs.

Other factors that affect external validity, as discussed by Campbell and Stanley (1963), are listed and briefly described in Table 4-2. Many of these threats to external validity are discussed in later chapters, along with the appropriate research design.

How does research in psychology fare with respect to external validity? A few studies suggest that it fares pretty well. For example, Craig Anderson, James Lindsay, and Brad Bushman (1999) compared the results from laboratory studies and field studies (a study done in a participant’s natural environment) that were published in a number of social psychological journals. Anderson et al. found a strong correlation (r = .73) between the outcomes of research conducted in the laboratory and in the field. More recently, Gregory Mitchell (2012) also found a great deal of correlation between laboratory and field studies conducted in a number of subfields in psychology (correlations ranged from .53 to .89). The strongest correlations were found for industrial/organizational psychology (r = .89) and personality psychology (r = .83). The field with the lowest level of external validity was developmental psychology (r = −.82). Finally, laboratory studies that produced strong effects were more likely to match results from the field than were those that produced weaker effects.
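The r values reported above are Pearson correlations between paired laboratory and field results. The sketch below shows how such a coefficient is computed. The effect sizes in it are invented purely for illustration and are not taken from Anderson et al. (1999) or Mitchell (2012); the function name `pearson_r` is likewise a hypothetical helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up effect sizes for five research topics, each studied both ways.
lab_effects = [0.10, 0.25, 0.40, 0.55, 0.70]
field_effects = [0.15, 0.20, 0.45, 0.40, 0.65]

r = pearson_r(lab_effects, field_effects)
print(round(r, 2))
```

A value near +1 means that topics showing large effects in the laboratory also tend to show large effects in the field. A negative value, such as the one reported for developmental psychology, means that laboratory and field results tend to point in opposite directions.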

What about animal research? How well do results from research with animals generalize to human populations? As we point out in Chapter 10, many behavioral findings generalize quite well from animal to human research (e.g., conditioning). A good degree of generality also appears in other research areas such as genetics and medical research (Baker, 2011; Regenberg et al., 2009). Monya Baker noted that the relationships between genetic factors and disease found in animals often match well with findings in humans. Similarly, Regenberg et al. suggest that animal models are useful in understanding human outcomes in the field of regenerative medicine. However, Regenberg et al. point to several obstacles that may call into question how well animal research generalizes to human regenerative medicine. These include anatomical differences between species, neurological differences, and differences in how clinical trial experiments are done with animal and human samples. Baker suggests that animal models may have about an 80% concordance with human outcomes, which may be sufficient to warrant clinical trials in humans.

TABLE 4-2 Factors Affecting External Validity

FACTOR: DESCRIPTION

Reactive testing: Occurs when a pretest affects participants’ reaction to an experimental variable, making those participants’ responses unrepresentative of the general population

Interactions between participant selection biases and the independent variable: Effects observed may apply only to the participants included in the study, especially if they are unique to a group (such as college sophomores rather than a cross section of adults)

Reactive effects of experimental arrangements: Refers to the effects of highly artificial experimental situations used in some research and the participant’s knowledge that he or she is a research participant

Multiple treatment interference: Occurs when participants are exposed to multiple experimental treatments in which exposure to early treatments affects responses to later treatments
