The Compatibility of L2 Learners’ Assessment of Self- and Peer Revisions of Writing With Teachers’ Assessment
MANAMI SUZUKI
Dokkyo University
Saitama, Japan
The current study examines second language (L2) learners’ assessment of text changes that they made on their written drafts in self-revisions and peer revisions, focusing on linguistic features of their repairs as well as the validity of self-assessment. In the study, the term self-revision refers to writers’ revisions of their own written texts, whereas the term peer revision means that writers revise their drafts while talking with their peers.

Learners’ self-assessment of their language performance or proficiency is often referred to as a kind of alternative assessment (Alderson & Banerjee, 2001) or as an alternative in assessment (Brown & Hudson, 1998). Advantages of alternative assessment, including self- and peer assessment, are (a) quick administration; (b) students’ involvement in the assessment process; (c) enhancement of students’ autonomy in language learning by means of that involvement; and (d) increased student motivation toward language learning (Blanche & Merino, 1989; Brown & Hudson, 1998). Disadvantages of alternative assessment concern its reliability and validity (Blanche, 1988; Blanche & Merino, 1989; Blue, 1988; Jafarpur, 1991). The number of studies on alternative assessment is still limited (Oscarson, 1997), and most studies of self- or peer assessment in second language acquisition research do not focus on assessment of specific linguistic features such as morphemes, lexis, or discourse (Cheng & Warren, 2005; Patri, 2002; Ross, 1998; Rothschild & Klingenberg, 1990). A study of the validity of self-assessment with regard to linguistic features can therefore offer pedagogical suggestions for training or form-focused instruction in self-assessment, particularly in a process-oriented writing class where students need to assess and revise their previous self-revision or peer revision. Theoretically, further research is needed to examine differences in L2 learners’ assessment under different conditions of revision (i.e., self-revision and peer revision); moreover, the reliability of self-assessment should be verified. Pedagogically, such research has consequences for organizing self-assessment, self-revision, and peer revision in the writing classroom. Two research questions guided this study:
1. How differently do L2 writers assess text changes that they have made on their written drafts in self-revisions and peer revisions?
2. Is L2 writers’ self-assessment of written text changes compatible with writing teachers’ assessment of linguistic features?
METHOD
Participants
Twenty-four Japanese second-year university students who were enrolled in a two-semester English course volunteered to participate in this study. They were all registered in the English department at a private university in Japan and came from middle-class socioeconomic backgrounds. The current study was conducted in the middle of the fall semester of 2003. The students had been placed in the class on the basis of their results on the Test of English as a Foreign Language Institutional Testing Program (TOEFL ITP), developed from past TOEFL tests by the Educational Testing Service (ETS, 2009a). The participants took the TOEFL ITP in January 2003, before the course started in April. The mean of the participants’ TOEFL ITP scores was 515.3, and the standard deviation was 22.7.

The participants’ proficiency level was intermediate. This level of learners was specifically selected because it had not been investigated by the previous studies most directly related to the present one.
Grouping of Students
The current study required two equivalent writing tasks as contexts in which to compare students’ assessment of text changes made during self-revisions and peer revisions. The participants were divided into two groups that were intended to be as similar as possible in L2 (English) proficiency, English writing proficiency, gender, age, and length and context of L2 learning.

Two groups, A and B, were formed randomly with reference to the following criteria: individual participants’ means of the adjusted standard deviation scores (see Footnote 1) on the TOEFL ITP test in January 2003; two raters’ holistic assessments; and the total number of words that the students had written as course work by the end of the previous semester.
1 The adjusted standard deviation score is widely used in Japan. It is a technique for standardizing scores from different tests so that the mean score of any test becomes 50 (e.g., in order to compare scores on different subject tests). So that about 90% of students’ scores fall within the range 20 to 80, the gap between an individual student’s score and the mean is divided by the standard deviation and then multiplied by 10. The formula is adjusted standard deviation score = ((Xi − Xave) ÷ SD) × 10 + 50, where Xi = the individual student’s score, Xave = the mean (M), and SD = the standard deviation.
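To illustrate the computation in the footnote, the following is a minimal sketch only; it is not part of the original study’s materials, and the function name and the sample scores are hypothetical.

```python
from statistics import mean, pstdev

def adjusted_sd_score(score, scores):
    """Adjusted standard deviation score: ((x_i - mean) / SD) * 10 + 50,
    so that the group mean maps to 50 and most scores fall near 20-80."""
    m = mean(scores)
    sd = pstdev(scores)  # the footnote does not specify population vs. sample SD
    return (score - m) / sd * 10 + 50

# Hypothetical example with made-up TOEFL ITP scores (not the study's data)
scores = [515, 538, 492, 530, 500]
print(round(adjusted_sd_score(538, scores), 1))  # a score above the mean maps above 50
```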
The means (M) and standard deviations (SD) of ages in the two groups (Group A and Group B) were almost the same (M = 20.1 vs. M = 19.8; SD = 0.5 vs. SD = 0.8). Each group’s mean and standard deviation of length of English learning were also similar (M = 9.0 years vs. M = 8.8 years; SD = 2.3 vs. SD = 1.7). Group A consisted of ten female and two male students, whereas Group B comprised eight female and four male students. Each dyad for the peer revision task was randomly selected by a computer program.
Procedures
The current study was part of my dissertation study of L2 learners’ self-revisions and peer revisions (Suzuki, 2006). The data were collected over 2 weeks in the middle of the fall semester of 2003 (Weeks 6 and 7) in the language laboratory at the participants’ school, where their class was always held (see Table 1). In both weeks, participants wrote an essay for the TOEFL Test of Written English (paper based; ETS, 2009b) for 30 minutes. The participants’ teacher and I (the author) selected two essay prompts, one for Writing Task 1 and one for Writing Task 2, that were as equivalent as possible in structure, difficulty, and participants’ interest (Appendix A). In the first week, Group A students revised their written drafts by themselves immediately after they completed the writing task, whereas Group B students engaged in peer revision. Students had 15 minutes per paper for revision, so peer revision took 30 minutes to complete. I asked students to revise a draft within a single 15-minute session; thus, further revision after each revision session was not allowed. All participants used black pens that I distributed for the first drafts. In the self-revision group, participants used blue pens that were much thicker than the black pens so that the text changes they made could be distinguished. In the peer revision group, students used blue pens on their own drafts and green pens on their partners’ drafts.
TABLE 1
Data Collection Timetable

Group       | Writing Task 1 (30 mins.)                          | Writing Task 2 (30 mins.)
Group A     | Self-revision (15 mins.)                           | Peer revision (15 mins. per paper; total 30 mins.)
Group B     | Peer revision (15 mins. per paper; total 30 mins.) | Self-revision (15 mins.)
Both groups | Self-assessment (about 5 mins.; within 2 days after revision) | Self-assessment (about 5 mins.; within 2 days after revision)
Within 2 days after participants had revised their drafts, I interviewed each student individually in their teacher’s office. The participant and I (the author) identified all text changes and gave a number to each change on the written drafts. During these sessions, students assessed their text changes using a scale ranging from 5 (greatly improved) to 1 (not improved) (Appendix C; see Footnote 2). I asked students to judge to what extent each text change they had made improved the written text. In the week after Writing Task 1 (famous person in history) and its revision, students engaged in Writing Task 2 (famous entertainer or athlete). The procedure from first draft to revised writing was repeated with the conditions reversed: Group A revised with their peers, while Group B performed self-revision.
Data Analysis
Coding of Text Changes
With the assistance of an experienced researcher, I classified all text changes (N = 453) into nine linguistic categories (Appendix B). My categorization of the linguistic types follows the taxonomies of revision changes in previous studies (Faigley & Witte, 1981, 1984; Yagelski, 1995). Our interrater reliability for the identification of linguistic types of text changes was 0.84 (kappa statistic). Following that, we discussed discrepancies and reached 100% agreement in our categorization.
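As an illustrative sketch of how such category agreement can be quantified (this is not the study’s actual analysis script, and the category labels below are hypothetical examples), Cohen’s kappa could be computed as follows:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical codings of the same five text changes by two raters,
# using a few of the linguistic categories as labels.
rater1 = ["word form", "vocabulary", "spelling", "organization", "word form"]
rater2 = ["word form", "vocabulary", "punctuation", "organization", "word form"]

# Cohen's kappa corrects raw agreement for agreement expected by chance.
kappa = cohen_kappa_score(rater1, rater2)
print(f"kappa = {kappa:.2f}")
```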
Teachers’ Assessment
To compare the students’ self-assessment with writing teachers’ assessment, another rater (a native speaker of English) and I, both of whom had been teaching English at several universities in Japan, separately assessed all the text changes using the same 5-point scale (Appendix C). The interrater reliability of our assessment was 0.97 (Spearman’s rho). We then discussed and resolved the differences in our assessments to reach 100% agreement. I used this agreed assessment as the teachers’ assessment against which students’ self-assessment was compared.
2 The Likert scale used for assessment in the present study (Appendix C) was developed originally for this study. The validity of the scale was not verified, which might be one of the limitations of the study.
Data Analysis With Respect to the Research Questions
Research Question 1: Self-Assessment of Text Changes in Self- and Peer Revision
To examine the tendency of students’ self-assessment, I calculated the percentage of each rating point (1–5) that students gave relative to the total number of text changes made in self-revisions and in peer revisions. The means and standard deviations of their self-assessment under both conditions of revision (self-revision and peer revision) were also calculated in order to compare participants’ self-assessment of text changes made during self-revisions with their self-assessment of text changes made during peer revisions. I used descriptive statistics because the number of text changes that students made differed between self-revisions and peer revisions (n = 287 and 166, respectively).
Research Question 2: Comparison Between Students’ Self-Assessment and Teachers’ Assessment
I calculated the reliability between students’ self-assessment and teachers’ assessment (using Spearman’s rho) for each of the nine linguistic categories (Appendix B). Furthermore, I compared the means and standard deviations of students’ self-assessment with those of teachers’ assessment by linguistic category and by the two conditions of revision (self-revision and peer revision).
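A minimal sketch of this kind of per-category analysis is shown below; it is not the study’s actual script, and the data frame, column names, and ratings are assumed for illustration only.

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical records: one row per text change, with its linguistic
# category, the student's rating, and the teachers' agreed rating (1-5).
changes = pd.DataFrame({
    "category": ["word form", "word form", "word form",
                 "spelling", "spelling", "spelling"],
    "student":  [5, 2, 3, 4, 3, 5],
    "teacher":  [3, 4, 2, 4, 3, 5],
})

# Spearman's rho between student and teacher ratings within each category.
for category, group in changes.groupby("category"):
    rho, _ = spearmanr(group["student"], group["teacher"])
    print(f"{category}: rho = {rho:.2f} (n = {len(group)})")
```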
RESULTS
Research Question 1
How differently do L2 writers assess text changes that they have made on their written drafts in self-revisions and peer revisions? The results show that students tended to assess text changes made during peer revisions slightly more highly than text changes made during self-revisions (M = 3.7 and 3.5, respectively). This result is summarized in Table 2.
TABLE 2
Means (M) and Standard Deviations (SD) of Students’ Self-Assessment of Text Changes

   | Self-revision (N = 287) | Peer revision (N = 166)
M  | 3.5                     | 3.7
SD | 0.98                    | 1.15
Figure 1 shows the percentages of each rating point (1–5) that students gave relative to the total number of text changes in self-revisions and peer revisions. When they assessed text changes made during self-revisions, students used the middle point (Point 3) most frequently (39%) on the 5-point scale and also tended to use the higher points, Points 4 and 5 (29% and 20%, respectively). On the other hand, when assessing text changes made during peer revisions, students used the highest point (Point 5) most frequently (34%) and also often used Point 3 (31%).
Moreover, students tended to rate within a slightly narrower range when they assessed text changes made during self-revisions than when they assessed text changes made during peer revisions. As Table 2 shows, the standard deviation of students’ assessment of text changes in self-revisions was 0.98, whereas the standard deviation of their assessment of text changes in peer revisions was 1.15.
Research Question 2
Is L2 writers’ self-assessment of written text changes compatible with writing teachers’ assessment of linguistic features? The means and standard deviations of students’ assessment and teachers’ assessment are displayed in Table 3.
FIGURE 1. Percentages of the Number of Each Point That Students Rated per the Total Number of Text Changes During Self-Revisions and Peer Revisions.
The study found that the average reliability between students’ assessment and teachers’ assessment of all text changes (N = 453) was 0.52 (Spearman’s rho). However, the agreement between students’ assessment and teachers’ assessment depended on the linguistic features of the text changes. Table 4 provides the reliability between students’ and teachers’ assessment of text changes for nine linguistic features. The reliabilities for punctuation and spelling were higher than 0.80. It is particularly noteworthy that students’ assessment of discourse-level text changes (organization and paragraphing) was positively correlated with teachers’ assessment (r_s = 0.73 and 0.75, respectively). On the contrary, the reliabilities for vocabulary and word choice, length of sentence, and especially word form corrections were very low (r_s = 0.38, 0.20, and −0.13, respectively).
Table 5 shows the means and standard deviations of students’ self-assessment and teachers’ assessment for each linguistic type of text change. In general, students tended to underestimate their text changes compared with the teachers’ assessment, except for text changes of vocabulary and word choice and of sentence type. Furthermore, the standard deviations of self-assessment were somewhat lower than those of teachers’ assessment, except for the assessment of text changes of capitalization and paragraphing.
TABLE 3
Means (M) and Standard Deviations (SD) of Students’ Self-Assessment and Teachers’ Assessment
(rows: Students, Teachers; columns: Self-revision, N = 287; Peer revision, N = 166)
TABLE 4
Reliability of Students’ and Teachers’ Assessment of Text Changes by Linguistic Type

Level of text changes | Linguistic type        | r_s   | n
Surface level         | Word form corrections  | −0.13 | 116
Surface level         | Vocabulary/word choice | 0.38  | 104
DISCUSSION
Students assessed the text changes that they made by themselves during self-revisions and the text changes that they made with their partners during peer revisions differently. Students tended to assess text changes made during peer revisions more highly than text changes made during self-revisions. They also rated within a slightly narrower range when they assessed the text changes they had made by themselves during self-revisions than when they assessed the text changes made during peer revisions. Peer revision might have facilitated students’ clearer decision making about text changes in their drafts and given students confidence in their revisions. The range of students’ self-assessment was narrower than that of teachers’ assessment. This finding of a narrow range of self-assessment confirms Cheng and Warren’s (2005) conclusion. As Cheng and Warren suggested, prior practice with and experience of assessment procedures could make the range of self-assessment wider and closer to that of teachers’ assessment.
In the current study, students tended to underestimate their text changes compared with the teachers. This result is in line with previous self-assessment studies (Blanche, 1988; Heilenman, 1990; Yamashita, 1996), which found that more advanced L2 learners tended to underestimate their language proficiency. Participants in the current study were intermediate-level learners. L2 writing teachers of intermediate-level students may need to provide instruction that gives students confidence in their L2 ability and in their self-assessment of their L2 proficiency.
The average correlation between students’ self-assessment and teachers’ assessment of all text changes (r_s = 0.52) is consistent with Ross’s (1998) meta-analysis of the validation of L2 self-assessment, in which the average correlation between self-assessment of writing and the criterion variables was 0.52 (d = 1.2). The reliability of self-assessment of writing was thus not high enough to establish its validity.
TABLE 5
Means (M) and Standard Deviations (SD) of Self-Assessment and Teachers’ Assessment by Linguistic Type

Level of text changes | Linguistic type        | Self-assessment M (SD) | Teachers’ assessment M (SD) | n
Surface level         | Punctuation            | 3.2 (0.70)             | 3.9 (1.33)                  | 20
Surface level         | Capitalization         | 3.8 (1.14)             | 4.8 (0.42)                  | 10
Surface level         | Word form corrections  | 3.7 (1.09)             | 4.1 (1.40)                  | 116
Surface level         | Vocabulary/word choice | 3.7 (1.05)             | 3.5 (1.50)                  | 104
Sentence level        | Sentence types         | 4.0 (0.96)             | 3.4 (1.47)                  | 60
Sentence level        | Length of sentence     | 3.3 (0.98)             | 3.4 (1.39)                  | 116
Discourse level       | Organization           | 3.7 (1.25)             | 3.9 (1.30)                  | 7
The current study specifically analyzed the correlation between self-assessment and teachers’ assessment for nine linguistic types of text change. The correlations for self-assessment of text changes of punctuation, spelling, organization, and paragraphing were very high in this study. It is particularly notable that students could assess discourse-level text changes accurately (organization, r_s = 0.73; paragraphing, r_s = 0.75). Cresswell (2000) showed empirically that prior revision instruction aimed at raising students’ awareness of global-level text changes influenced the process of students’ revision. Cresswell’s study was not product oriented, however, and thus did not examine the outcome of students’ global revision. The current study found that students’ global-level text changes were successful (see Table 5). Furthermore, the study showed empirically that students’ self-assessment of global-level text changes was accurate. Therefore, as Cresswell indicated, L2 writing teachers’ prior instruction to raise students’ awareness of global-level revision might be effective for the success of students’ self- and peer revision.
The reliabilities between self-assessment and teachers’ assessment for vocabulary and word choice, length of sentence, and word form corrections were not high. These linguistic text changes made up a large proportion (74%) of all the text changes that students made during revisions. In particular, word form corrections and vocabulary and word choice are typical ESL/EFL writers’ errors (Ferris, 2003; Ferris & Roberts, 2001). English writing teachers may need to give students form-focused instruction on English morphology (e.g., articles, pluralization, subject–verb agreement, and verb tense) before students engage in writing, revision, or self-assessment. Such form-focused instruction before students’ revisions or self-assessment might make their self-revision or peer revision more successful and their self-assessment more accurate, which could save English writing teachers time in giving feedback on individual students’ morphological errors. In contrast, lexical errors, which are sometimes considered untreatable (Ferris, 2003), seem to need teachers’ explicit feedback. Teachers’ assessment of lexical text changes was not very high (M = 3.5), and the self-assessment correlation for lexical text changes was low (r_s = 0.38). Teachers’ more explicit feedback (e.g., explicit error correction) on vocabulary and word choice may therefore be important for the success of L2 students’ self-revision, peer revision, and self-assessment. Further research on sentence-level text changes, whose errors are also considered untreatable (Ferris, 2003), is necessary because of the linguistic failure and the low self-assessment correlations observed for these changes.
CONCLUSION
The sample of the current study is small and homogeneous (N = 24). This study also did not consider other factors, such as L2 or L1 educational and sociocultural background, L2 proficiency level, gender, or age, which might be independent variables of self-revision, peer revision, and self-assessment. Larger scale and cross-sectional studies are necessary before general conclusions can be drawn about L2 learners’ self-assessment and its validity. In spite of these limitations, I believe that the current study has pedagogical consequences for organizing self-assessment in the process-oriented writing classroom. The findings can be helpful for more effective instruction in self-assessment of L2 writing.
ACKNOWLEDGMENTS
I thank Alister Cumming and Wataru Suzuki for their comments on an earlier version of this manuscript, and I thank those who kindly volunteered to participate in the study. I also gratefully acknowledge the TESOL Quarterly reviewers’ significant suggestions, which contributed to the final version of this article.
THE AUTHOR
Manami Suzuki completed the doctoral program in the Department of Curriculum, Teaching and Learning at the Ontario Institute for Studies in Education of the University of Toronto in 2006. She has been teaching English and language education at Dokkyo University, Saitama, Japan, and Tokyo Woman’s Christian University, Tokyo, Japan, as a lecturer since April 2004.
REFERENCES
Alderson, J. C., & Banerjee, J. (2001). Language testing and assessment (Part 1). Language Teaching, 34, 213–236.
Blanche, P. (1988). Self-assessment of foreign language skills: Implications for teachers and researchers. RELC Journal, 19, 75–96.
Blanche, P., & Merino, B. J. (1989). Self-assessment of foreign-language skills: Implications for teachers and researchers. Language Learning, 39, 313–340.
Blue, G. M. (1988). Self assessment: The limits of learner independence. ELT Documents, 131, 100–118.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32, 653–675.
Cheng, W., & Warren, M. (2005). Peer assessment of language proficiency. Language Testing, 22, 93–121.
Cresswell, A. (2000). Self-monitoring in student writing. ELT Journal, 54, 235–244.
ETS. (2009a). Test of English as a Foreign Language Institutional Testing Program. Princeton, NJ: Author. Available from http://www.ets.org
ETS. (2009b). Test of English as a Foreign Language Test of Written English. Princeton, NJ: Author. Available from http://www.ets.org
Faigley, L., & Witte, S. (1981). Analyzing revision. College Composition and Communication, 32, 400–414.
Faigley, L., & Witte, S. (1984). Measuring the effects of revisions on text structure. In R. Beach & L. Bridwell (Eds.), New directions in composition research (pp. 95–108). New York: Guilford Press.
Ferris, D. (2003). Response to student writing: Implications for second language students. Mahwah, NJ: Erlbaum.