THE NATURE OF RESEARCH DESIGN AND INFERENTIAL STATISTICS

Part of the document Understanding Educational Statistics Using Microsoft Excel and SPSS (pages 171–199).

Does class size have an effect on learning? Education researchers are just beginning to take a closer look at this research question because it is such a widely held opinion and it has resulted in funding opportunities for schools that have low student achievement. It has been fashionable in recent years not only to reduce class size, but also to create smaller schools or ‘‘schools-within-schools’’ as a way to limit size. The idea underlying all of these notions is that fewer students means that teachers can more effectively communicate with students and help them learn.

I have evaluated many educational reform efforts that took this tack and have written in several places about similar strategies. One of the main issues in reform efforts, it seems to me, is not whether the class is smaller, but the approach to learning taken by teachers, students, and educational administrations.

The question of class size and achievement is persistent. This being the case, we can use the example to discuss the nature of research and inferential statistics. To these issues we now turn.

As I mentioned in Chapter 8, my students anxiously await ‘‘headache day’’ because it means we must take a different approach to our subject. This is indeed the case, although it does not have to be overly complex. The main requirement for understanding inferential statistics is to learn to think abstractly. We have dealt with descriptive statistics, which, in a sense, are procedures to measure what you see. Inferential statistics looks at data on a different level of abstraction. We must learn to understand the connection between what data we see before us and the statistical world that lies outside and beyond what we see.

Understanding Educational Statistics Using Microsoft Excel® and SPSS®. By Martin Lee Abbott.

© 2011 John Wiley & Sons, Inc. Published 2011 by John Wiley & Sons, Inc.


RESEARCH DESIGN

Before we look in depth at inferential statistics, however, we need to cover some essential matters of research design. Statistics and research design are companion topics that need to be understood together. Often, the two subjects are taught together in college curricula, or they are prerequisites for one another. There is no best way to sequence the ideas, so I simply introduce research design elements just before discussing inferential statistics, since the two rely on each other.

This is a book on statistics, so we cannot hope to cover all the complexities of research design. We can only attempt to provide a research ‘‘primer.’’ In what follows, I will outline some of the basics, but for a comprehensive understanding of research design, you need to consult standard authorities in the field. You might start with Earl Babbie’s excellent work (Babbie, 2010), which has for many years provided a thorough examination of social research practice.

Social research is a field of inquiry in which we devise standardized methods for examining available data to answer research questions. Typically, this involves collecting data from subjects or existing sources and subjecting the data to the methods of statistics to provide illumination. We may start with a research question and then figure out the best way to proceed to answer it. This procedure is research design. How can we structure our inquiry so that we can find and use data in the most defensible way to answer a research question? Figure 9.1 shows a process we might envision that will help us negotiate a research question.

FIGURE 9.1 The process of social research.

Theory

As you can see, the top element in Figure 9.1 is ‘‘Theory,’’ an abstract idea in which we state the conceptual nature of the relationship among our ideas of inquiry. For instance, we might state our earlier question as a theoretical question: What is the relationship between size of learning environment and student learning?

Hypothesis

Because a theory cannot be directly assessed (it exists on an abstract and conceptual level), we must find a way to empirically restate the theory so that it can be assessed. That is the role of the hypothesis, a statement that captures the nature of the theoretical question in such a way that it can be directly verified. As you can see in Figure 9.1, I note that the hypothesis is written in ‘‘testable language,’’ or written in a way that will provide empirical evidence in support of, or contrary to, a theoretical question.

Let me add here that I am introducing a scientific process that underlies almost all scientific attempts to understand the world. The physical sciences and social sciences alike use the methods I am describing to generate and verify knowledge. The theory testing process shown in Figure 9.1 is the heart of this process. By following the process, we can support or refute a theoretical position, but we can never ‘‘prove’’ it directly. If a hypothesis statement, constructed as it is in empirical and therefore limited language, is borne out, we simply add to our confidence in the theory. There are many hypotheses that can be generated to test any one theory, since the empirical world cannot completely capture the essence of the abstract conceptual world.

Here is an example of what I mean. I might generate the following hypothesis regarding the theoretical statement above about size and learning: ‘‘Students in classroom A (smaller student–teacher ratio) will evidence higher standardized test scores in mathematics than those in classroom B (larger student–teacher ratio).’’

Do you see the difference in language between theory and hypothesis? Theory is abstract while hypothesis is concrete. Theory is more ‘‘general,’’ while hypothesis is more ‘‘specific.’’

Theory cannot be captured by a single hypothesis statement; there are simply too many empirical possibilities that can be created to ‘‘test’’ the theory.

For example, I might suggest another hypothesis: ‘‘Schools’ student–teacher ratios will be negatively correlated with their aggregated test scores in math.’’ As you can see, this is another restatement of the theory in language that is testable. I can easily (well, perhaps more easily!) collect data on a sample of schools and statistically assess the relationship between their student–teacher ratios and their aggregated test scores. If there is a relationship, as predicted by the hypothesis, this would lend support to the theoretical tie between size of learning environment and learning, but it would not ‘‘prove’’ it.


TYPES OF RESEARCH DESIGNS

Research designs are simply the ‘‘housing’’ within which we carry out our analyses to test a theory. You can see in Figure 9.1 that the research design enables the researcher to situate the hypothesis. How can we create the analysis so that we can use statistical tools to their best advantage to provide evidence for the theory?

While there are many different possibilities, we can note three:

1. Experiment

2. Post Facto—Correlational

3. Post Facto—Comparative

There are two ‘‘classes’’ of designs: experimental and nonexperimental (post facto). Experiments are designs in which the researcher consciously changes the values of a study variable under controlled conditions and observes the effects on an outcome variable. Post facto designs are those that involve measuring the relationships among variables using data that have already been collected.

Experiment

Of the two hypothesis examples I listed above, the first is closer to an experiment, depending on how well I can control the conditions. Thus, if a principal allows me to randomly select students and randomly assign them to two different classrooms (A and B) with different student–teacher ratios, and then after a period of time assess the differences in student test scores, I would be performing an experiment. I consciously change the values of a study variable (student–teacher ratio) and assess the effects on an outcome variable (student achievement).

Control Groups. Of course, the particular way in which I control all the influences other than the two research variables will have a bearing on the strength and validity of my results. The key to a powerful experimental design is limiting these influences. One way to do so is to create a ‘‘control group,’’ which is typically a group similar in every way to a ‘‘treatment group’’ except for the research variable of interest. In our example, classroom A may have a substantially lower student–teacher ratio (e.g., 12:1) than the ‘‘normal’’ classroom B (e.g., 16:1). In this case, the treatment variable is ratio size, and students in classroom B are the control group. Table 9.1 shows these groups. The only difference they have from students in classroom A is that there are more students per teacher in their classroom. Therefore, if students in classroom A get superior test scores, the experimenter will attribute the test score increase (the outcome) to the lower student–teacher ratio. Theoretically, there are no other differences present in the design that could account for the difference in test scores.

TABLE 9.1 The Experimental Groups

Research Treatment Variable:
Student–Teacher Ratio          Group                  Outcome Variable
Classroom A (low ratio)        Experimental group     Test scores
Classroom B (typical ratio)    Control group          Test scores

As you might imagine, there are a host of potentially ‘‘confounding’’ conditions, or ways that the two groups cannot be called comparable. Perhaps the experimenter cannot truly choose students randomly and assign them randomly to different classrooms. If so, then there are differences being ‘‘built in’’ to the experiment: Were similar students chosen for the different classes? Were there students of ‘‘equal’’ aptitudes, genders, and personality types represented in both groups, for example?

Randomization. Experimenters use randomization methods to ensure comparability of experimental groups. By randomization, I mean (1) selecting students randomly and (2) randomly assigning them to different conditions. The power of randomness is that it results in individual differences between students being equated across groups. If every student has an equal chance of being chosen for an experiment, along with an equal chance of being assigned to classroom A or B, then the resulting groups should be as equal as possible; there should be very little bias that would normally influence some students to be chosen for one class and other students to be chosen for the other.
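The two randomization steps described above can be sketched in code. Here is a minimal illustration in Python (not a tool this book uses; its analyses are done in Excel and SPSS). The roster of 40 student IDs and the two class sizes are invented purely for the example:

```python
import random

random.seed(42)  # fixed seed so this illustrative run is reproducible

# Hypothetical roster of 40 student IDs (invented for this sketch)
student_body = [f"student_{i:02d}" for i in range(1, 41)]

# Step (1): random selection -- draw 24 participants from the student body
participants = random.sample(student_body, k=24)

# Step (2): random assignment -- shuffle, then split into classrooms A and B
random.shuffle(participants)
classroom_a = participants[:12]   # low student-teacher ratio (treatment)
classroom_b = participants[12:]   # typical ratio (control)

print(len(classroom_a), len(classroom_b))  # two groups of 12
```

Because every student has the same chance of being selected and of landing in either classroom, individual differences should be roughly equated across the two groups, which is exactly the logic of the paragraph above.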

Quasi-Experimental Design. Experiments can be either ‘‘strongly’’ or ‘‘weakly’’ constructed according to how well the experimenter can control the differences between the groups. Often, an experimenter cannot control all the conditions that lead to inequality of groups but still implements the study. The quasi-experimental design is just such a design. Here, the experimenter may be forced, because of the practicalities of the situation, to use a design that does not include all of the controls that would make it an ideal experiment. Perhaps they do not have the ability to create a control group and must rely on a similar ‘‘comparison group,’’ or they may be confronted with using existing classes of students rather than being able to create the classes themselves.

In the experimental design we discussed above, the experimenter may not be able to randomly select students from the student body and then randomly assign them to different conditions. Perhaps the experimenter can only assign the students randomly to different classrooms. In this case, we cannot be assured that the students in the two classrooms are equal, since we could not assure complete randomness. However, we might proceed with the experiment and analyze how this potential inequality might affect our conclusions. This is shown in Table 9.2, in which the difference from the experimental design shown in Table 9.1 is the absence of randomization and the lack of a true control group.

There are a great many variations of experimental and quasi-experimental designs. The key differences usually focus on the lack of randomization and/or true comparison groups in the latter. For a comprehensive understanding of experimental design and the attendant challenges of each variation, consult Campbell and Stanley (1963) for the definitive statement. In this authoritative discussion, the authors discuss different types of designs and how each can address problems of internal validity (whether the conditions of the experiment were present to control extraneous forces) and external validity (including generalizability).

It is probably best to think of experimental designs as being stronger or weaker rather than as specific ‘‘types’’ that can be employed in certain situations. Many research design books list and describe several specific (experimental and quasi-experimental) designs, noting the features that limit problems of internal and external validity. Figure 9.2 shows how research designs exist on a continuum: at one end are ‘‘true’’ experimental designs that limit all such problems and can support causal attributions; at the other end are designs beset with problems that limit their ability to produce meaningful experimental conclusions.

Variables. By now, you will recognize that I have used the language of ‘‘variables’’ in my explanation of experimental design. Before we proceed to discuss other designs, we need to note different kinds of variables. Variables, by definition, are the quantification of concepts (like the student–teacher ratios or test scores) used in research that can take different values (i.e., vary). Thus, math achievement is a quantified set of test scores that vary by individual student.

FIGURE 9.2 The nature of experimental designs.

TABLE 9.2 A Quasi-Experimental Design

Research Treatment Variable:
Student–Teacher Ratio          Group                                           Outcome Variable
Classroom A (low ratio)        Experimental group (not randomly                Test scores
                               selected and assigned)
Classroom B (typical ratio)    Comparison group (not randomly selected         Test scores
                               and assigned, but chosen to be comparable
                               to the experimental group)

Independent Variables. In research design, we often refer to certain types of variables. The ‘‘independent variable’’ is understood to be a variable whose measure does not relate to or depend upon other variables. Thus, in our experimental design example, student–teacher ratio is such a variable because in our research problem we assume that this is the influence that will lead to an impact on other variables. It is assumed to be a ‘‘cause’’ of some research action.

There are a host of problems with the independent variable designation. Typically, we refer to a variable as independent only in the context of an experiment because we are framing it as leading to certain effects. In nonexperimental contexts, I prefer to use the designation ‘‘predictor variable,’’ which does not evoke the language of causality. A variable can be a predictor of an outcome without being an independent variable.

In research designs, independent study variables can be ‘‘manipulated’’ or ‘‘nonmanipulated,’’ depending on their nature. Manipulated independent variables are those the experimenter consciously changes, or manipulates, in order to create the conditions for observing differential effects of treatment groups on the outcome variable. In our example, student–teacher ratio is the manipulated independent variable because the researcher could assign students to two different levels or conditions of this variable: low or high ratios. Another example could be group size, if a researcher wanted to compare different reading groups according to the numbers of students in the group; perhaps one group would consist of ‘‘few’’ students and another ‘‘many’’ students. In this example, group size is manipulated (i.e., consciously changed by the researcher) by creating two reading groups of different sizes.

Nonmanipulated independent variables are those that cannot change or cannot be manipulated by the researcher. Typically, they are characteristics, traits, or attributes of individuals. For example, gender or age can be independent variables in a study, but they cannot be changed, only measured. When these types of variables are used in a research study, the researcher cannot make causal conclusions.

The essence of a true experiment is to change the conditions of a variable differentially for different groups and then observe the effects on the outcome. If nonmanipulated variables are used, the research design by definition cannot be experimental. For example, if the researcher were interested in the effects of gender on achievement, the research design can only group the subjects by their already designated gender; no causal conclusions can be made.

Dependent Variables. Dependent variables are those thought to be the ‘‘receivers of action’’ in a research study; their value depends upon (is tied to) a previously occurring variable. Where independent variables are causes, dependent variables are ‘‘effects’’ or results. In nonexperimental contexts, I like to think of these as ‘‘outcome variables’’ that are linked to predictors.

Post Facto Research Designs

The second hypothesis example I presented above (‘‘schools’ student–teacher ratios will be negatively correlated with their aggregated test scores in math’’) is a post facto correlational design. Here, I am simply using data that already exist (on schools’ student–teacher ratios and their aggregate test scores); hence, post facto, which means ‘‘after the fact.’’ I do not consciously change anything; rather, I use what data I can gather from what is already generated to see if the two sets of scores are correlated. This design uses the statistical process of correlation to measure the association between two sets of existing scores. (We will discuss this process at length in the correlation chapter, Chapter 14.)
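To make the correlational logic concrete, here is a brief sketch computed from the deviation-score definition of Pearson’s r, written in Python rather than the Excel or SPSS procedures this book uses; the six schools’ figures are invented solely for illustration:

```python
from math import sqrt

# Hypothetical post facto data: one row per school (values invented)
ratios = [12.0, 14.5, 16.0, 18.0, 21.0, 24.5]   # student-teacher ratio
scores = [82.0, 80.0, 74.0, 71.0, 66.0, 60.0]   # aggregate math score

n = len(ratios)
mean_x = sum(ratios) / n
mean_y = sum(scores) / n

# Pearson's r = sum of deviation cross-products / sqrt(SSx * SSy)
cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(ratios, scores))
ss_x = sum((x - mean_x) ** 2 for x in ratios)
ss_y = sum((y - mean_y) ** 2 for y in scores)
r = cross / sqrt(ss_x * ss_y)

print(round(r, 3))  # strongly negative, as the hypothesis predicts
```

A strongly negative r here would lend support to, but never prove, the theoretical tie between size of learning environment and learning.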

A post facto design can also compare conditions rather than correlate conditions. The post facto comparative design seeks to understand difference. Thus, for example, I might compare two already existing classes of students to see if they have different test scores. (Perhaps we are interested in whether one class, composed of girls, has different test scores in math than another class composed of boys.) Statistically, I will assess whether there is a difference between the means of the test scores, for example. This type of approach uses methods of difference like the t test, ANOVA, and others. It is post facto, since we are using data that already exist (i.e., I did not randomly select and assign students to different classes; I used classes already operative), but it is not correlational, since we seek to assess difference rather than association.
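The comparative case can be sketched the same way. Below is a pooled-variance independent-samples t statistic in Python (this book performs such tests in Excel and SPSS; the two classes’ scores are invented, and roughly equal group variances are assumed):

```python
from math import sqrt

# Hypothetical math scores for two already-existing classes (invented data)
class_a = [78.0, 85.0, 90.0, 73.0, 88.0, 81.0]
class_b = [72.0, 79.0, 84.0, 70.0, 77.0, 74.0]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(class_a), len(class_b)

# Pooled variance weights each group's variance by its degrees of freedom
pooled = ((n1 - 1) * sample_var(class_a)
          + (n2 - 1) * sample_var(class_b)) / (n1 + n2 - 2)

# t = difference between means / standard error of the difference
t = (mean(class_a) - mean(class_b)) / sqrt(pooled * (1 / n1 + 1 / n2))

print(round(t, 3))  # compare to a t distribution with n1 + n2 - 2 = 10 df
```

Whether the resulting t is large enough to matter is exactly the kind of inferential question taken up in the chapters ahead.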

THE NATURE OF RESEARCH DESIGN

I cannot hope to discuss the nuances of each type of design. However, I will introduce the different designs in the context of discussing different statistical procedures in the chapters ahead. For now, it is enough to know that there are different ways to assess theories. We devise hypotheses according to the nature of our interests and inquiry, and we thereby validate or question theories by empirical (statistical) processes.

I should mention here some important aspects of research designs that I will develop in later chapters. In brief, each design has strengths and limitations. The experiment can be a powerful way of making ‘‘causal statements’’ because, if only one thing changes (the main treatment) while everything else is similar between the groups being tested, we can attribute any effects or changes in outcomes to the thing that was changed. Using the first example again, if we chose and assigned students appropriately and if the only difference between the two groups was the student–teacher ratio, then we could attribute any resultant difference in test scores primarily to the ratios. (Of course, as we will learn, we have to take great care to control all other influences besides the ratios in order to make a causal conclusion.)

Post facto designs cannot lead to causal attributions. Because the data are already collected, a number of different influences are already ‘‘contained in the data.’’ In this event, any two groups we compare have differences other than the research interest (student–teacher ratio) that will intrude upon differences in test outcomes. Controlling these influences is what separates an experiment from a post facto design.

Research Design Varieties

There is another dimension to research design, namely, the variety of ways in which it is carried out to collect data. Experiments can take place in the laboratory or in
