Michael Mendicino
Educational Psychology, West Virginia University

Neil Heffernan
Computer Science Department, Worcester Polytechnic Institute

Journal of Interactive Learning Research

Title: Comparing the learning from intelligent tutoring systems, non-intelligent computer-based versions, and traditional classroom instruction
Abstract
There have been some studies that show that computer-assisted instructional systems (CAI) can be superior to traditional classroom instruction (Kulik, 1983, 1994, 2003; Bangert-Drowns, Kulik & Kulik, 1985). Other studies have compared new "intelligent tutoring systems" (ITS) to classroom instruction (Koedinger, Anderson, Hadley, & Mark, 1997; Anderson, Corbett, Koedinger, & Pelletier, 1995), while many studies have compared intelligent tutoring systems to CAI-like controls (Carroll & Kay, 1988; Corbett & Anderson, 2001; Mathan, 2003; Schooler & Anderson, 1990). We are aware of no studies that have taken a single ITS and compared it to both: 1) classroom instruction and 2) CAI. In this study we compare these three (classroom instruction, CAI, and ITS) using a newly developed ITS (Heffernan & Koedinger, 2003). We seek to quantify the value added of CAI over classroom instruction, versus the value added of ITS on top of CAI. We found evidence that the ITS was better than classroom instruction, with an effect size of 0.6. Our results in trying to calculate the value added of the CAI over the classroom were mixed, with two studies showing effects but the third not showing statistically reliable differences. The extra value added of the ITS over CAI did seem to be robust across the three studies, with an average 0.4 effect size.
There have been some studies that show that traditional computer-assisted instructional systems (CAI) can be superior to traditional classroom instruction (Kulik, 1983, 1994, 2003; Bangert-Drowns, Kulik & Kulik, 1985). Other studies have compared new so-called "intelligent" tutoring systems (ITS) to classroom instruction (Koedinger & Anderson, 1993; Koedinger, Anderson, Hadley, & Mark, 1997; Anderson, Corbett, Koedinger, & Pelletier, 1995), while many studies have compared intelligent tutoring systems to CAI-like controls (Carroll & Kay, 1988; Corbett & Anderson, 2001; Mathan, 2003; Schooler & Anderson, 1990). We are aware of no studies that have taken a single ITS and compared it to both: 1) classroom instruction and 2) CAI. In this study we compare all three with respect to student learning and "motivation" in the algebra domain.
Kulik's (1985 & 1994) studies suggest CAI systems lead to about 0.3 to 0.5 standard-deviation effect sizes over classroom instruction. The Koedinger et al. (1997) study, which compared a commercially available ITS (Cognitive Tutor) to a classroom control, suggests a 1 standard-deviation effect size for experimenter-designed metrics, while for external metrics (the Iowa Algebra Aptitude Test and a subset of the Math SAT) it found an effect size of 0.3; this study may also suffer from a confound of the effect of the ITS with a new textbook prepared to go along with the curriculum. We are uncertain how to compare these effect sizes with the Kulik and Kulik effect size of about 0.4, as we do not know whether the metrics in the Kulik and Kulik studies are more like externally designed measures or experimenter-defined measures. In another study, VanLehn et al. (2005) compared an ITS not to classroom instruction, but to doing homework in a traditional paper-and-pencil manner. They found results similar to the Cognitive Tutor results mentioned above, with effect sizes of about 1 SD for their own measures, and about 0.4 for what they consider analogous to "externally designed measures".
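Since effect sizes are compared repeatedly below, it may help to state the convention we assume these studies share: a standardized mean difference (Cohen's d with a pooled standard deviation):

$$ d = \frac{\bar{X}_{\mathrm{treatment}} - \bar{X}_{\mathrm{control}}}{s_{\mathrm{pooled}}}, \qquad s_{\mathrm{pooled}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

Under this convention, an effect size of 0.4 means the average treated student scored 0.4 pooled standard deviations above the average control student.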
In this study we compare these three (classroom instruction, CAI, and ITS) using a newly developed ITS (Heffernan & Koedinger, 2003). We seek to quantify the value added of CAI over classroom instruction, versus the value added of ITS on top of CAI. How much more learning does adding an "intelligent tutor" get you over CAI? This question is important because ITSs are complex and costly to build, and we need to understand whether they are worth the investment, or whether CAI is good enough. We do this in the context of a mathematics classroom while teaching the skill of writing algebra expressions for word problems, a skill we call symbolization. In this paper we report three experiments, with three teachers and a total of 160 students. All studies involved analyzing the amount of learning by students within one classroom period, measured by experimenter-constructed pre- and posttests administered the day before and the day after the experiment. Seven of the items were experimenter-designed questions and two were standardized test questions.
Area of Math we focused on
Students in the United States lag behind many other countries in math skills, particularly at the eighth and twelfth grade levels (TIMSS, 1997). While US eighth grade students showed improvement in math, scoring above the international average (TIMSS, 2003), better math instruction that integrates technology is still needed to ensure continued improvement. One skill students have difficulty with is writing algebra expressions for word problems, a skill we call symbolization. Heffernan & Koedinger (1997) stated that "symbolization is important because if students cannot translate problems into the language of algebra, they will not be able to apply algebra to solve real world problems." The need for this skill is more crucial now because students have access to graphing calculators and computers that can perform symbol manipulation, but translating word problems into the symbolic language of algebra remains a uniquely human endeavor.

Other studies have been conducted to determine what behaviors make human tutoring effective and how these behaviors can be incorporated into computer-based tutoring systems (McArthur, Stasz, & Zmuidzinas, 1990; Merrill, Reiser, Ranney, & Trafton, 1992; Graesser & Person, 1994; Chi, Siler, Jeong, Yamauchi, & Hausmann, 2001). For example, Merrill et al. (1992) concluded that a major reason human tutors are effective is that they let students do most of the work in overcoming impasses, while providing only as much assistance as necessary and keeping students from following "garden paths" of reasoning that are unlikely to lead to learning. VanLehn, Siler, & Murray (2003) also found that allowing students to reach impasses correlated with learning gains. Finally, numerous studies (Swanson, 1992; Graesser, Person, & Magliano, 1995; Chi, Siler, Jeong, Yamauchi, & Hausmann, 2001; Katz, Connelly, & Allbritton, 2003) hypothesized that it is the interactive nature of the tutorial dialog (i.e., the interaction hypothesis) that accelerates learning.
Computer-Assisted Instruction
Computer-based tutoring systems appear to hold promise for improving mathematics instruction. The first computer-based tutoring systems appeared over thirty years ago with the goal of approaching the effectiveness of human tutors. According to Corbett & Trask (2000), these systems, called computer-assisted instruction (CAI), afforded one advantage of human tutors: individualized interactive learning support. While these systems were interactive and provided explicit instruction in the form of long web pages or lectures, they offered no dialog. Studies demonstrated the effectiveness of CAI in mathematics at the elementary level (Burns & Bozeman, 1981), secondary level (Kulik, Bangert, & Williams, 1983), and college level (Kulik, Kulik, & Cohen, 1980). In a meta-analysis of 28 studies involving CAI, Kulik et al. (1985) found that CAI improved student achievement by an average effect size of 0.47 over students receiving conventional instruction. In another meta-analysis, Kulik (1994) summarized 97 studies from the 1980s that compared classroom instruction to computer-based instruction and found an average effect size of 0.32 in favor of computer-based instruction. Kulik claimed that students learned more and learned faster in courses that involved computer-based instruction. Finally, Kulik (2003) summarized the findings of eight meta-analyses covering 61 studies published after 1990. The median effect size for studies using computer tutorials was 0.59, meaning that students who received computer tutorials performed at the 72nd percentile while students receiving conventional instruction performed at the 50th percentile. While these studies suggest that CAI can be an effective instructional aid in both elementary and secondary schools, CAI does not address the main concern of McArthur et al. (1990), who claim that teaching tactics and strategies are the least well developed components of most intelligent tutors.
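The percentile figures quoted here follow from reading the effect size as a z-score under a normal model: a student at the control-group mean who moves up d standard deviations lands at the 100·Φ(d) percentile, where Φ is the standard normal cumulative distribution function. For the median effect size above:

$$ 100 \cdot \Phi(0.59) \approx 72, \qquad 100 \cdot \Phi(0) = 50. $$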
Cognitive Tutors

Cognitive tutors, in contrast, provide individualized assistance that is just-in-time and sensitive to the student's particular approach to a problem (Anderson, Corbett, Koedinger, & Pelletier, 1995). They also provide canned explanations and hint messages that get more explicit as students continue asking for help, until the tutor is telling the student exactly what to do. The feedback is immediate and step-wise and is structured so as to lead students toward expert-like performance. The tutor intervenes as soon as students deviate from the solution path, but the cognitive tutor does not engage students in dialog by asking new questions. Cognitive tutors also use knowledge tracing technology that traces students' knowledge growth across problem-solving activities and uses this information to select problems and adjust the pacing to adapt to individual student needs.
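The knowledge tracing used in Cognitive Tutors is standardly formalized as Bayesian Knowledge Tracing (in the Corbett & Anderson tradition); the sketch below shows the core update step. The parameter values are illustrative assumptions of ours, not values from any tutor discussed in this paper.

    def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
        """One Bayesian Knowledge Tracing step: revise P(skill is known)
        from one observed response, then apply the learning transition."""
        if correct:
            evidence = p_know * (1 - p_slip)
            posterior = evidence / (evidence + (1 - p_know) * p_guess)
        else:
            evidence = p_know * p_slip
            posterior = evidence / (evidence + (1 - p_know) * (1 - p_guess))
        # The student may also have learned the skill at this opportunity.
        return posterior + (1 - posterior) * p_learn

    # A correct answer raises the tutor's estimate that the skill is known,
    # which in turn drives problem selection and pacing.
    p = 0.3
    p = bkt_update(p, correct=True)   # roughly 0.71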
Even though these new cognitive tutors do not engage students in dialog, they have nonetheless had a significant impact on student learning in a variety of domains. For example, Koedinger, Anderson, Hadley, & Mark (1997) compared a cognitive tutor, PAT (Pump Algebra Tutor), to traditional algebra instruction. The PAT intelligent tutor was built to support the Pittsburgh Urban Mathematics Project (PUMP) algebra curriculum, which is centrally focused on mathematical analysis of real world situations and the use of computational tools. The study evaluated the effect of the PUMP curriculum and PAT tutor use and found that students in the experimental classes outperformed control classes by 100% on assessments of the targeted problem solving and multiple representations. These results also translated into a one standard-deviation effect size. Recent studies comparing PAT and traditional algebra instruction have found improvements in the 50-100% range, thus replicating the above results (Koedinger, Corbett, Ritter, & Shapiro, 2000). This cognitive tutor is currently used by approximately 375,000 students in over 1000 schools.
Morgan & Ritter (2002) conducted a study comparing the Cognitive Tutor Algebra I course and a traditional Algebra I course, which used a different text, with students in their junior high school system. Dependent measures included the Educational Testing Service (ETS) Algebra I end-of-course exam, course grades, and a survey of attitudes towards mathematics. These measures have the benefit of not having been defined by the experimenters themselves. When restricting the analysis to only those teachers who taught both curricula, the researchers found statistically significant differences on all dependent measures in favor of the cognitive tutor. Morgan and Ritter state that the strongest components of teacher effects have to do with teacher education and professional development and only indirectly with practices. In their study, the curriculum effect they were examining had to do with teacher practices, which would be expected to be relatively small. Therefore, they conclude that the effect size of 0.29 is impressive taken in this context.
Finally, as part of the Andes project, VanLehn et al. (2004) evaluated Andes, an ITS developed to replace paper-and-pencil homework and to increase student learning in introductory college physics courses. Andes provides immediate feedback to student responses and also provides three kinds of help: 1) pop-up error messages when the error is probably due to lack of attention rather than lack of knowledge, 2) What's Wrong Help when the student is essentially asking what is wrong with an entry, and 3) Next Step Help when students are not sure what to do next. The What's Wrong and Next Step Help selections generate a hint sequence that includes a pointing hint, a teaching hint, and a bottom-out hint that tells students exactly what to do.

Andes was evaluated from 1999 to 2003, and in all years Andes students scored higher than control students, with effect sizes ranging from 0.21 to 0.92. VanLehn et al. compared their results to the results of the Koedinger et al. (1997) study, which they suggest is the benchmark study with respect to tutoring systems. The Koedinger et al. study evaluated the PAT intelligent tutoring system and a novel curriculum (PUMP), which Carnegie Learning distributes as the Algebra I Cognitive Tutor. Koedinger et al. used both experimenter-designed questions and standardized tests. Analyzing the experimenter-designed tests, they found effect sizes of 1.2 and 0.7, and they found effect sizes of 0.3 when analyzing multiple-choice standardized tests. VanLehn et al. found very similar effect sizes (1.21 & 0.69) for their conceptual, experimenter-written tests and a similar effect size, 0.29, for their multiple-choice standardized tests. Thus, both evaluations have similar tests and effect sizes: both have impressive 1.2 and 0.7 effect sizes for conceptual, experimenter-designed tests, and lower effect sizes on standardized, answer-only tests. Given the large difference between experimenter-designed tests and externally designed tests, it makes one wonder how to interpret the Kulik studies that argue that CAI, when compared to classroom instruction, gives between 0.3 and 0.7 effect sizes.
The authors of the Andes study stated that their evaluation differed from the Koedinger et al. evaluation in a crucial way. The Andes evaluations manipulated only the way that students did their homework: on Andes vs. on paper. The evaluation of the Pittsburgh Algebra Tutor (PAT) was also an evaluation of the Pittsburgh Urban Mathematics Project (PUMP) curriculum, which focused on analysis of real world situations and the use of computational tools such as spreadsheets and graphers. Therefore, how much gain was due to the tutoring system and how much was due to the new curriculum is not clear. Finally, VanLehn et al. stated that in their study the curriculum was not reformed; therefore, the gains in their evaluation may be a better measure of the power of intelligent tutoring systems per se.
Dialog-based Intelligent tutors
Both CAI and cognitive tutors have proved to be more effective than traditional classroom instruction, yet neither has approached the effectiveness of human tutors. Perhaps they have not captured the features of human tutoring that account for its effectiveness. Researchers have recently developed ITSs that incorporate dialog that is based on human tutors in specific domains. Preliminary results are promising. We mention two related projects before focusing on Heffernan's system used in this evaluation.

The Tutoring Research Group at the University of Memphis has developed AutoTutor (Graesser et al., 2001), an ITS that helps students construct answers to computer literacy questions and qualitative physics problems by holding a conversation in natural language, thus taking advantage of the interaction hypothesis. AutoTutor attempts to imitate a human tutor by reproducing the dialog patterns and strategies that were likely to be used by a human tutor. AutoTutor presents questions and problems from a curriculum script, attempts to comprehend learner contributions that are entered by keyboard, formulates dialog moves that are sensitive to the learner's contributions … and delivers the dialog moves with a talking head that simulates facial expressions and speech to give the impression of a discussion between the tutor and student (Graesser, Wiemer-Hastings, K., Wiemer-Hastings, P., & Kreuz, 1999). AutoTutor has produced gains of 0.4 to 1.5 sigma depending on the learning performance measure, the comparison condition, the subject matter, and the version of AutoTutor (Graesser et al., 2003).
Rosé, Jordan, Ringenberg, Siler, VanLehn, and Weinstein (2001) integrated Atlas and the Andes system to compare a model-tracing ITS with an ITS incorporating dialog. Atlas facilitates incorporating tutorial dialog, while Andes is a model-tracing ITS for quantitative physics that provides immediate feedback by highlighting each step attempted in green or red to indicate a right or wrong answer. Andes also provides a hint sequence for students asking for help. The researchers were able to compare student learning between the original Andes and the integrated Atlas-Andes with dialog. Atlas-Andes students scored significantly higher on post-test measures, with a difference of 0.9 standard deviations.
Heffernan & Koedinger (2002) and Heffernan (2001) developed an intelligent tutoring system (ITS) that uses dialog to help students write algebra expressions. It models both student behavior and tutorial behavior by combining a cognitive model of student behavior in the domain of symbolization with a tutorial model of strategies observed in human tutors. The cognitive student model has a set of production rules that models the problem-solving skills needed to write algebraic expressions. The tutorial model is based on the observation of an experienced human tutor during an hour-long tutoring session and thus captures tutorial strategies that were observed to be effective for human tutors in the domain of symbolization. It was the first intelligent tutor that had both a model of student thinking and a model of tutorial planning. The system is of the "coached practice" variety that is meant to scaffold "learning by doing" while students practice their problem-solving skills. It is different from typical Cognitive Tutors in that it takes its cues more from the dialogs that human tutors have with students, and it is also more flexible in the interactions it has with students. For example, it can acknowledge that part of an answer is correct and then engage a student in a "sub-dialog" to help him or her improve the incorrect path. It "breaks" problems down for students by asking questions and re-phrasing questions, but does not give students answers. Whereas Cognitive Tutors typically cover a full year of algebra, Heffernan's ITS deals only with symbolization problems. The following is a sample tutorial dialog:
Tutor: In the marching band there are 7 fewer flute players than drummers. If the number of drummers is "x", write an expression for the total number of flute players and drummers.
Student: 7-x+x {Correct answer is x+(x-7)}
Tutor: No. We will come back to the total number of flute players and drummers. Let us first write an expression for the number of flute players.
Student: x-7
Tutor: Correct. Now, let's try this question again: Please write an expression for the total number of flute players and drummers.
Student: x-7+x
Tutor: Correct. Let us go on to the next problem.
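The paper does not reproduce Heffernan's production rules, but a toy sketch may convey the flavor of a production-rule model of symbolization: small rules that map quantity relations in the problem text onto algebraic sub-expressions, which then compose into a full answer. Everything below is hypothetical and for illustration only, not the actual Ms. Lindquist model.

    # Hypothetical production-rule-flavored sketch; not Heffernan's model.
    def fewer_than(amount, base):
        # "A fewer X than Y" -> Y - A, e.g., "7 fewer flute players than drummers"
        return f"({base}-{amount})"

    def total(*parts):
        # "the total number of X and Y" -> X + Y
        return "+".join(parts)

    drummers = "x"
    flute_players = fewer_than("7", drummers)   # "(x-7)"
    answer = total(flute_players, drummers)     # "(x-7)+x", equivalent to x+(x-7)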
Heffernan (2001) investigated the impact on learning of two versions of his ITS. In the control version, if students answered incorrectly, the tutor told them the answer to type and then moved on to another problem. This approximates a common homework situation in which students can look up correct solutions in the back of the book. In the experimental version, the ITS engaged students in tutorial dialog specific to student errors in an attempt to help students construct the correct answer. Students in the experimental condition performed better on post-test measures, showing evidence of learning from dialogs. Heffernan only controlled for the number of problems in this experiment and not for time; therefore, he did not determine whether the extra time spent in the dialogs was worth the effort.
Heffernan (2002) reported on a web-based experiment in which he controlled for time in an attempt to see whether the learning gains students acquired were worth the extra time students spent in dialog. Heffernan found students in the experimental condition completed only half as many problems as students in the control condition, but still showed learning gains over the control condition with an effect size of 0.5. Heffernan also reported a possible motivational benefit of dialog. In summary, Ms. Lindquist seems to be one example supporting the hypothesis that incorporating dialog into an ITS can lead to increases in student learning. Heffernan & Croteau (2004) replicated some of these findings, showing that Ms. Lindquist seems to have some benefit over CAI for some lessons.
The purpose of these experiments is to replicate research comparing normal classroom instruction and CAI, and to extend that research by also comparing supposed "intelligent" tutoring instruction to the other conditions. We will test the hypothesis that "intelligent" dialog accounts for more learning than 1) computer-assisted instruction as well as 2) classroom instruction. This investigation will seek to determine how much added value "intelligence" accounts for above computer-assisted instruction when compared to classroom instruction. We will also investigate differences in learning and motivation when comparing classroom instruction, computer-assisted instruction, and intelligent tutoring.
Experiment 1: Compare One Teacher to CAI and ITS
In this experiment, students' learning of symbolization skills is measured with pretests and post-tests administered before and after classroom instruction (traditional or cooperative), computer-assisted instruction, or computer instruction with additional intelligent tutoring.

Research Question: The research question for these studies was: Are the effects of computer-delivered instruction significantly better than the effects of classroom instruction on students' ability to learn to symbolize? At a finer-grained level, are the effects of intelligent tutoring feedback different from the effects of the simple, non-intelligent tutoring approach of traditional CAI?
Setting and Participants. The study took place in the students' regular algebra classrooms and in a computer lab that consisted of 20 computers with internet access. The high school was located in a rural area and served approximately 1200 students. Forty-six percent of the students received free or reduced-price lunches. According to Department of Education data on NCLB, this school ranked in the bottom half and did not meet AYP due to low socio-economic subgroup scores.
The participants for Experiment 1 were students enrolled in equivalent Algebra 1 inclusion classes during the 2004-2005 school year. The classes were not Honors or Advanced Placement, but were typical classes with students mostly of average ability. One class had twenty-two students and the other had twenty-one students. However, a total of seven students, four from one class and three from the other, were not included in the study because they missed at least one day during the experiment. Therefore, a total of thirty-six students participated in the study, twenty-two females and fourteen males. Fourteen were students identified as learning disabled, and twenty-two were typical regular education students. There were thirty freshmen and six sophomores, ranging in age from fourteen to sixteen years. The classes were co-taught by a fully certified regular education math teacher and a highly qualified (math through Algebra 1) special education teacher. Both teachers shared responsibilities for teaching algebra content, lesson planning, and student accommodations. The lead author was the primary instructor for both classes during the experiment, but was not the students' regular teacher. Individual Education Programs were reviewed to ensure that the general classroom placement was the least restrictive and most appropriate Algebra I placement for students with learning disabilities.
Content. The computer curriculum is composed of five sections, starting with relatively easy one-operator problems (e.g., "7x") and progressing up to more difficult four- or five-operator problems (e.g., "3x+5*(20-x)"). The content of the 9-item pre- and post-tests was identical and contained four multiple-choice questions and five questions requiring students to write algebraic expressions. (See Appendix A for sample tests.) Seven of the items were experimenter-designed questions and two were standardized test questions. An answer key was constructed and used by the scorer to award one point for each correct answer.

The classroom lessons were designed with items of similar content, format, and difficulty level. In fact, problems used in the classroom lessons were isomorphic to the computer lessons so no group had an unfair advantage. (See Appendix B for sample classroom problems.)
Procedures. Both the control and experimental conditions took place during the students' regular fifty-minute class periods. The classroom lessons were delivered by the lead author, and the study was conducted over a one-week period, with the pretest, mid-test, and post-test administered on Monday, Wednesday, and Friday and the computer condition presented on Tuesday and Thursday. Prior to the experiment, students in both classes had minimal exposure to algebraic expressions and equations while working in their text: Algebra I, Glencoe Mathematics Series.
During the traditional instruction condition, the classroom activities were divided into two main parts: 1) introduction with in-class examples, and 2) guided practice. The introduction period began with the teacher giving each student a worksheet containing twenty-five word problems ranging in difficulty from simple one-operator problems to complex four-operator problems. After reviewing the objective of the lesson, problems were displayed on an overhead projector while the instructor read a problem and demonstrated how to translate it into an algebraic expression. The instructor used various instructional strategies, separately and in combination, while demonstrating problems. For example, on one problem the instructor exclusively used the "clue" word method, identifying clue words such as "more than", "less than", and "sum" that indicate mathematical operations and parentheses. On another problem, he used the "clue" word method along with dividing the problem into component parts and solving each part separately. On all problems demonstrated, however, the instructor continually checked for understanding by asking comprehension-gauging questions and eliciting questions and discussion from students. A total of five problems were presented, taking approximately twenty minutes. During guided practice, students were instructed to work on the remaining problems until the end of the class period, approximately thirty minutes. The instructor was available to all students and assisted in the order in which help was requested. The guidance was not interactive in nature, but consisted mainly of prompting students to look for clue words, defining words (e.g., "per" means "divide by", "twice" means "two times a number"), explaining procedures (e.g., "less than" is a backwards construction), and giving hints. All questions were answered regardless of their nature.
The cooperative instruction condition also consisted of two parts: introduction with in-class examples and cooperative learning groups. The introduction period followed the same instructional sequence used during the traditional instruction condition and also lasted twenty minutes. However, students were then placed in groups of four and encouraged to work together on the problems with no additional guidance from the instructor. The cooperative learning model had been used on a regular basis in these classes, so students were familiar with the structure and expectations. For example, students understood the concept of peer support inherent in the groupings and the many forms in which it can be manifested, such as clarifying, interpreting, modeling, explaining, and taking responsibility for their own learning as well as the group's learning. When students requested assistance from the instructor, they were reminded to attempt the problem as a group first and then were given indirect support, when needed. Students worked on the problems in their groups for thirty minutes.
During the computer-delivered lesson, students logged on to the computer as soon as the class began. This process took five minutes for the majority of students; a few, however, needed more time to fully log on. The computer system then randomly assigned each student either to the ITS or the CAI condition. Students continued working on computer-delivered problems until the end of class. The additional five to seven minutes spent logging on effectively resulted in less instructional time for the students in the computer lesson. Therefore, students in the classroom conditions received about eight percent more time on task.
Design. A counterbalanced design, in which all groups received all treatments but in a different order, was used in this study. Each student participated in the experimental condition and in either the cooperative or traditional instruction part of the control condition. For example, students in Group 1 participated in the control condition first, while students in Group 2 participated in the experimental condition first, ensuring that each group participated in a different sequence. The experiment lasted one week. On Monday, students were administered the pretest and were given instructions and a demonstration of how to log on to the computer-based system. Students were not allowed to practice items at this time; the goal was to become familiar with the computer system and its operations. On Tuesday, Group 1 participated in the cooperative instruction condition while Group 2 participated in the computer condition. On Thursday the order was reversed. On Wednesday and Friday all students were given a mid-test and post-test, respectively.

Every effort was taken to ensure a matched control group. While random assignment was not possible in the school setting, the control group was an equivalent Algebra 1 class taught by the same teachers. A pretest was administered to both groups to ensure initial balance on the dependent measures. The students were given a mid-test after the first condition and a post-test after completion of the experiment. The data for this study were analyzed using SPSS. Repeated measures analysis of variance, one-way analysis of variance, t-tests, and descriptive statistics were used. Table 1 displays the overall design of the study.
Table 1

Group 1 – Classroom First, Then Computer
Monday (~10-20 minutes): Pretest / introduction to the computer system
Tuesday (average 50 minutes): Experiment: Cooperative Learning Condition
Wednesday: Mid-test
Thursday (average 45 minutes): Experiment: Computer Condition (randomly assigned by computer to CAI or ITS)
Friday: Post-test

Group 2 – Computer First, Then Classroom
Monday (~10-20 minutes): Pretest / introduction to the computer system
Tuesday (average 45 minutes): Experiment: Computer Condition (randomly assigned by computer to CAI or ITS)
Wednesday: Mid-test
Thursday (average 50 minutes): Experiment: Traditional Learning Condition
Friday: Post-test
Results from Exp 1
The means of the two groups were balanced at pretest (mean number correct for the computer-first group = 3.00, sd = 1.085; mean number correct for the computer-second group = 2.89, sd = 1.323; t = .275, p = .785). Students in all conditions learned a significant amount, as shown by a repeated measures analysis of variance, which revealed statistically significant differences between the mean number correct at pretest, mid-test, and post-test (F = 30.32, p < .001). Given that the groups were balanced at pretest and there were large learning gains, we want to determine whether there were disproportionate gains dependent upon condition. We first compared both computer versions (CAI & ITS) as a group with classroom instruction. Later, we break out the CAI versus ITS comparison.
First we discuss the main effect of computer versus classroom. Given that we had a pretest, mid-test, and post-test, we calculated gain scores for each student for both the classroom and computer conditions. (For example, if "Johnny" was in the condition that first went to the computer lab and later, after the mid-test, had classroom instruction, Johnny's computer gain would be his mid-test score minus his pretest score, and Johnny's classroom gain would be his post-test score minus his mid-test score.) There was a statistically significant difference (t = 2.469, p = .019) between the average computer gain (m = 1.7, sd = 1.22) and the average classroom gain (m = 1.1, sd = .926), suggesting that students learned about 0.67 more problems from the computer than from the classroom. The effect size of this difference was 0.60, with a 95% confidence interval of 0.08 to 1.02.
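As a minimal sketch of the gain-score analysis described above (the data here are synthetic stand-ins, not the study's scores, and the effect-size computation is one common convention, not necessarily the one used in the paper):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 36
    # Synthetic (pretest, mid-test, post-test) scores on the 9-item test.
    pre = rng.integers(1, 5, n).astype(float)
    mid = pre + rng.normal(1.5, 1.2, n)
    post = mid + rng.normal(1.0, 1.0, n)
    computer_first = rng.random(n) < 0.5  # counterbalanced order flag

    # Which gain belongs to which condition depends on a student's order.
    computer_gain = np.where(computer_first, mid - pre, post - mid)
    classroom_gain = np.where(computer_first, post - mid, mid - pre)

    t, p = stats.ttest_rel(computer_gain, classroom_gain)  # paired, within student
    diff = computer_gain - classroom_gain
    d = diff.mean() / diff.std(ddof=1)
    print(f"t = {t:.3f}, p = {p:.3f}, d = {d:.2f}")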
Next we consider whether computer learning gains were different between CAI and ITS. When comparing conditions, the ITS group showed about a 0.5-problem gain over the CAI condition (CAI: m = 1.42, sd = 1.45; ITS: m = 1.95, sd = 1.04), which yielded an effect size of 0.41 and a confidence interval of -0.06 to 0.88, but these differences were not statistically significant (t = 1.26, p = .215). This somewhat large p-value suggested that we should run more subjects to see if we can get a statistically significant result (see Experiment 2 below). Given that the ITS seemed to be better than classroom instruction, we compared the performance of the ITS versus the classroom, dropping those students who got the CAI, and found that the students who had the ITS did much better (t = 2.673, p = .014), with an effect size of 0.59 and a confidence interval of -0.02 to 1.19. Conversely, when we compared CAI to classroom instruction the results were not statistically significant (t = .905, p = .383), with an effect size of 0.4 and a confidence interval of -0.34 to 1.15.
Because we used a counterbalanced design, in which half of the students got the computer first and the other half got the classroom instruction first, it is worth looking to see whether there is an effect of order, and indeed we found that students' average learning gains were higher in the first session as a main effect (F = 29.38, p < .001). When comparing computer gain by group, the computer-first group (m = 1.89, sd = 1.49) out-gained the classroom-first group (m = 1.78, sd = 1.26). The mean mid-test score for the computer-first group was 5.06 (sd = 1.11) and for the computer-second group 3.67 (sd = 1.71). The mean post-test score was 5.44 (sd = .856) for the computer-first group and 5.44 (sd = 1.38) for the computer-second group.
Because the first author was particularly interested in students with learning disabilities, we did a more focused analysis looking at the fifteen students with learning disabilities (LD) and found, not surprisingly, that they started with lower pretest scores (m = 2.93, sd = 1.22). Repeating the above analysis using only the 15 students with LD, we found that students with LD showed statistically significant differences in learning (t = 2.101, p = .054) and similar but larger effects, showing that the students with LD learned more from the computer than from classroom instruction. The average classroom gain was m = .954 (sd = .91) while the average computer gain was m = 1.62 (sd = 1.22), with an effect size of 0.55 and a confidence interval of -0.18 to 1.28. The students with learning disabilities made gains comparable to students without learning disabilities, but started out with lower average pre-totals.
Discussion of Experiment 1
On average, all students had learning gains in all conditions. However, these comparisons indicate that students' use of computer-delivered "intelligent" feedback (ITS) enhanced learning of symbolization skills more than teacher-centered classroom instruction and CAI. These results also hold for students with learning disabilities.
It is important to note that while the computer condition overall was significantly better than the classroom condition, the tutorial dialog was not significantly better than the simple computer-assisted condition. Therefore, we cannot be certain that tutorial dialog alone is more effective or more efficient than simple computer feedback.
There are other factors that may have accounted for the present results, including differences in classroom conditions. For example, students were either in the computer-first or computer-second conditions, and some students were in direct instruction while others were in cooperative learning groups. Also, we cannot rule out multiple-treatment interference, because each group received more than one treatment, nor can we rule out the effects of practice, as each student participated in pre-, mid-, and post-tests. Experimenter bias may also be a factor, because the first author taught both lessons in the classroom conditions. It is also important to mention that this experiment was conducted in October, only one and one-half months into the school year, with students having had only minimal exposure to writing algebra expressions; therefore, these students might be considered naïve learners with respect to symbolizing. Given that research (Rosé & VanLehn, 2003) suggests that naïve learners may benefit more from tutorial dialog, we cannot rule this out as a factor in the benefit of computer-delivered intelligent feedback in this experiment. Finally, the control condition consisted of two different teaching methodologies, traditional instruction and cooperative instruction. Perhaps one of these methods is not very effective for teaching algebra word-problem symbolization in a single class period.
Experiment 2: Comparing CAI versus ITS done as homework
In Experiment 1, students who received intelligent tutoring feedback showed about a two-thirds-problem gain over students who received simple computer-based feedback. The results were not statistically significant. Also, students in the traditional instruction condition out-gained students in the cooperative instruction condition by about one-third of a problem. Again, these results were not significant. Thus, the purpose of Experiment 2 was to determine if we could obtain statistically significant results between intelligent feedback and simple computer-based feedback with additional students, while also eliminating the confound between types of classroom teaching.
As we explain below, we intended to replicate Experiment 1 with larger numbers and to eliminate the confound of type of classroom instruction, but instead we wound up comparing CAI and ITS when done at home. This is similar to the VanLehn et al. (2004) study, in which they compared Andes, an ITS that provides immediate step-wise feedback but no dialog, to paper-and-pencil homework and found that Andes helped students learn while replacing only their paper-and-pencil homework. However, their study did not compare the ITS to CAI, nor did it look at motivational benefits of the ITS.
Our two research questions for Experiment 2 are:

Research question #2a: For students who have internet connections at home, does 40 minutes of classroom problem solving do better or worse, in terms of student learning gains, than ~40 minutes of home computer use?

Research question #2b: For students who have internet connections at home, does ~40 minutes of CAI do better or worse, and by how much, than supposed "intelligent" tutoring? Measures of interest included student learning gains measured in school, gains within the computer system itself, time on task, and student-reported satisfaction with the computer system. We also asked students to self-report times and condition.
Method
Setting and Participants. The setting for this study was two regular algebra classrooms in the same high school as in Experiment 1, and students' home computers. That is, instead of the experimental condition occurring in a computer lab in the school, students in both classes were offered extra credit to do the experimental condition as a homework assignment. Obviously, only students with internet access could participate in the study. This was not a planned circumstance, as we had intended to replicate Experiment 1, but because the computer labs had recently installed new security software that prevented the web site from functioning correctly, the first author instead sought volunteers to engage in the computer condition at home for extra class credit.

Fourteen students from each class agreed to participate in the experimental condition, which meant they agreed to work at home, for at least thirty minutes, on the computer-delivered system. Thus, the participants for this study were twenty-eight students (20 female, 8 male, ages 14-16 years) out of a possible forty-five students from both classes. All students were classified as typically achieving students; that is, none were identified as learning disabled. The lead author, while not the students' regular teacher, taught both classes during the experiment.
Procedures. As in Experiment 1, the study was conducted over a one-week period and included a pretest, mid-test, and post-test administered on Monday, Wednesday, and Friday. The control and experimental conditions occurred on Tuesday and Thursday in a counterbalanced manner. Group one received the traditional instruction control on Tuesday in their classroom and the experimental condition on Thursday, while group two received the experimental condition on Tuesday and the control condition on Thursday in their classroom. For the experimental condition, both groups were taught how to log on to the system and were instructed to spend at least thirty minutes on the computer system from the time they logged on, without stopping.
To be clear, students who did not volunteer to be part of the experiment were not in the classroom; students who did volunteer to do the extra homework were "pulled out" of their normal classroom for the "classroom instruction" part of this experiment. Obviously, this was a more motivated group of students.
During the traditional instruction condition we used the same format and materials (worksheets, pre-, mid-, and post-tests) used in Experiment 1. The classroom activities were again divided into two main parts: 1) introduction with in-class examples, and 2) guided practice. Students were given a worksheet with the same twenty-five problems in the same order used in Experiment 1. The instructor demonstrated how to translate word problems into algebraic expressions by first displaying problems on an overhead projector and reading the problems to the class. The instructor then discussed several traditional textbook methods used to translate word problems into algebraic expressions, including matching "clue" words with mathematical operations and procedures, and using problem-solving plans such as: explore the problem, plan the solution, solve the problem, and examine the solution. The instructor demonstrated five problems, which took approximately twenty minutes. During the remaining thirty minutes, students completed their worksheets, and the instructor was available to all students and assisted in the order in which students requested help.
Design. A counterbalanced design was used in which all groups received all conditions but in a different order. Specifically, one group participated in the experimental condition first while the other group participated in the control condition first, thus ensuring a different sequence of instruction for each group. In the experimental condition, students continued to receive computer-delivered instruction, but unlike Experiment 1, the control condition was limited to traditional classroom instruction.