
AN INVESTIGATION INTO WRITTEN ACHIEVEMENT TESTS FOR THE 12TH FORM STUDENTS


DOCUMENT INFORMATION

Basic information

Title: An Investigation into Written Achievement Tests for the 12th Form Students
School: Ngo Quyen High School (NQHS)
Major: English Language Teaching and Testing
Type: Thesis
Year of publication: 2023
City: Hai Phong

CHAPTER 1: INTRODUCTION

1.1 RATIONALE

A good test can be used as a valuable teaching device. Heaton (1991:5) states that “tests may be constructed primarily as devices to reinforce learning and to motivate the student or primarily as a means of assessing the students’ performance in the language.” According to this linguist, testing and teaching are “so closely interrelated that it is virtually impossible to work in either field without being constantly concerned with the other”. For proper evaluation and assessment of the English language learning and teaching process, testing, an important tool in educational research and for program evaluation (Lauwerys and Scanlon, 1969:2), is employed as an indispensable part of the training program at Ngo Quyen High School (NQHS) in Hai Phong city.

However, designing a good test is not simple. Having been a teacher of English for many years, I have been involved in designing, administering and marking many kinds of English tests, such as progress and end-of-term tests, and have often heard teachers and test-takers at NQHS complain that some of the final achievement tests for 12th form students do not faithfully reflect the real linguistic competence of the test-takers. What is tested is not really taught, and the tests measure neither the achievement of the course objectives nor the expected linguistic skills and knowledge of the students. Probably, this is because the test writers use tests which are designed elsewhere and are not suitable for the students. What test writers are concerned with seems to be the reliability of the test rather than its validity. The situation coincides with the comments made by test researchers such as Brown (1994:373) and Hughes (1989:1) on recent language testing: “a great deal of language testing is of very poor quality. Too often language testing has a harmful effect on teaching and learning and too often they fail to measure accurately whatever it is they are intended to measure”. Another reason is that language testing here has not been paid enough attention. I have not witnessed any comprehensive or systematic evaluation of the effectiveness and appropriateness of these tests.

For the above-mentioned reasons, the author is encouraged to undertake this minor thesis with the aim of investigating the design of final written achievement tests for the 12th form students at NQHS through the evaluation of a current final achievement test by both students and teachers, mainly in terms of its validity. I hope that the results of the study can then help to improve the quality of the final achievement tests for the 12th form students at NQHS.

1.2 SCOPE OF THE STUDY

Due to the limitations of time and ability, the scope of the study is limited to examining the current final achievement test for the 12th form students at NQHS, mainly in terms of its validity.

The study provides empirical evidence about the current final achievement test and proposes practical suggestions for the improvement of the final tests for the 12th form students at NQHS in general.

1.3 AIMS OF THE STUDY

The study aims to report the results of the examination of the current final achievement test for the 12th form students at NQHS in terms of its validity. It places strong emphasis on analyzing the teachers’ and students’ evaluation of the test and their suggestions for its improvement.

The specific aims of the research are:

- To investigate the NQHS English teachers’ and the 12th form students’ evaluation of the current final achievement test in terms of its validity.

- To find out the differences and similarities (if there are any) in teachers’ and test takers’ evaluation of the test and to suggest reasons why there are such similarities and differences.

- To provide some practical recommendations for the improvement of the final achievement tests so as to achieve more accurate measures of students’ English competence.

1.4 RESEARCH QUESTIONS

The research questions of the study are as follows:

- How is the current final achievement test for the 12th form students at NQHS evaluated by both students and teachers in terms of its validity?

- What improvements are recommended by the teachers and students with regard to the validity of the test?

1.5 METHODS OF THE STUDY

In order to achieve the above aims, the study has been carried out as follows.

First, the author drew on the theory and principles of language testing, the major characteristics of a good test (with special focus on test validity), achievement tests and practical tips for writing them. Through critical reading, many reference materials were gathered, analyzed and synthesized to establish a theoretical basis for evaluating the current final achievement test for the 12th form students at NQHS.

Second, qualitative methodologies involving data collected through survey questionnaires were employed. Two questionnaires were administered to the 12th form students and the teachers of English at NQHS in order to investigate their evaluative comments on the current final achievement test in terms of its validity and their suggestions for its improvement. In addition, other methods such as interviews, informal discussions with students and teachers, and classroom testing observation were also used to obtain further information.

1.6 STRUCTURE OF THE STUDY

The minor thesis is organized into five major chapters:

- Chapter one presents basic information such as the rationale, the aims, the research questions, the methods and the structure of the study.

- Chapter two reviews the related literature that provides the theoretical basis for evaluating and building a good language test. This review covers the background to language testing, the criteria of a good test, and theory on the written achievement test, including its two kinds and practical tips for writing achievement tests.

- Chapter three, the main part of the study, analyzes the results of the survey, including the questionnaires and direct interviews, to find out the existing problems in designing the current achievement test in particular and other final achievement tests at NQHS in general.

- Chapter four proposes some suggestions for improving the design of the final achievement tests, based on the theoretical and practical study mentioned above.

- Chapter five provides a summary and suggestions for further research on the topic, as well as the reference materials.


CHAPTER 2: LITERATURE REVIEW

2.1 DEFINITION OF TESTING

Carroll (1968:46) defines a test in general terms: “A psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual”. Simply put, a test is an instrument designed to elicit a specific sample of an individual’s behavior.

Similarly, Davies (1991:13) states that tests are operational in nature, i.e., they are intended to measure whether or not the candidates can do certain things in English. The “things” they are asked to do are specified at each level and represent authentic tasks of the sort which confront language users in real life.

Genesee and Upshur (1996) look at a test as a task that measures one’s ability to perform a particular task. They argue that a test is, first of all, about something. That is, it is about intelligence, or European history, or second language proficiency. In educational terms, tests have subject matter or content. Second, a test is a task or set of tasks that elicits observable behavior from the test taker. The test may consist of only one task, such as writing a composition, or a set of tasks, such as in a lengthy multiple-choice examination in which each question can be thought of as a separate task. Different test tasks represent different methods of eliciting performance. Third, tests yield scores that represent attributes or characteristics of individuals. In order to be meaningful, test scores must have a frame of reference. Test scores along with the frame of reference used to interpret them are referred to as measurement. Thus, tests are a form of measurement (p.141).

In other words, content, methods and measurement are three aspects of tests. The quality of the end-of-year tests depends on whether the content of the test is a good sample of the relevant subject matter. If the content of a test is a poor reflection of what has been taught or what is supposed to be learned, then performance on the test will not provide a good indication of achievement in that subject area. What a test is measuring is a reflection of not only its content but also the method it employs. Tests that employ different methods are measuring somewhat different skills, no matter how similar their content might be. Tests in education measure differences in degree. They describe, for example, how proficiently students can read a second language or how appropriately they speak in particular social situations.


In the foreign language teaching context, a test can be defined as an educational instrument which is designed to measure what someone can do with the foreign language to serve a particular purpose (McNamara:11). As an instrument, a test may be responded to differently by testees and test-users. Understanding testees’ and test-users’ responses to, and perceptions of, tests has been a critical issue in foreign language testing. Such understanding is even more important where learner-centredness is promoted as a philosophical orientation in foreign language teaching.

Testing, the act of administering a test, is closely related to teaching and learning. This relationship is discussed in the next section (section 2.2).

2.2 RELATIONSHIP BETWEEN TESTING, TEACHING AND LEARNING

With regard to the relationship between testing, teaching and learning, there have been two extreme views.

In the past, there was a common view that testing and teaching were separate both in theory and in practice. According to this view, a test is a necessary but unpleasant imposition from outside the classroom: it helps to set standards but uses up valuable class time.

Other researchers, however, acknowledge the close link between them. For example, Harrison (1991:7) believes that, far from being divorced from each other, testing and teaching are closely interrelated. A test is seen as a natural extension of classroom work that can serve both teacher and student as a basis for improvement.

Upshur (1971) adds that language testing both serves and is served by research in language acquisition and language teaching. Language tests can be valuable sources of information about the effectiveness of learning and teaching. Language teachers regularly use tests to help diagnose student strengths and weaknesses, to assess student progress, and to assist in evaluating student achievement. Language tests are also frequently used as sources of information in evaluating the effectiveness of different approaches to language teaching. As sources of feedback on learning and teaching, language tests can thus provide useful input into the process of language teaching.

That kind of feedback is termed “backwash” by Hughes (1989), who defines the term as “the effect of testing on teaching and learning”. He goes on to explain that testing can have either a beneficial or a harmful effect on teaching and learning: “If a test is regarded as important, then preparation for it can come to dominate all teaching and learning activities. And if the test content and testing techniques are at variance with the objectives of the course, then there is likely to be harmful backwash” (p.1). However, he notes that the relationship between teaching and testing is that of partnership. In other words, we cannot expect testing only to follow teaching, as in the view that a good test is an obedient servant since it follows and apes the teaching (Davies, 1968:5). What we should demand of it, however, is that it should be supportive of good teaching and, where necessary, exert a corrective influence on bad teaching. If testing always had a beneficial backwash on teaching, it would have a much better reputation amongst teachers (Hughes, 1989:2).

Cohen (1994) discusses the effects of backwash more broadly, in terms of “how assessment instruments affect educational practices and beliefs” (p.41). Wall and Alderson (1993) go a little further and argue convincingly, on the basis of extensive empirical research, that backwash has the potential to affect not only individuals but the educational system as well.

Read (1983:2) points out: “A test can help both teachers and learners to clarify what the learners really need to know assuming that it is unrealistic to expect them to master everything they are presented with during a particular course.” Test results show teachers part, though not all, of the learners’ ability, which helps teachers to improve their ways of teaching or to revise knowledge.

According to Heaton (1988:7), “a well-constructed classroom test will provide the students with an opportunity to demonstrate their ability to perform certain tasks in the language and the students should be able to learn from their weakness”. Obviously, under the influence of tests, the students are motivated to use what they have learnt and to avoid the mistakes and errors that they have made. The learners know how far they have achieved the objectives of the course, so that they can decide whether they are ready to move up a level or need to learn more. “A good test can sustain or enhance class morale and aid learning” (Madsen, 1983:3).

Because of the important role a test plays in either supporting or impeding teaching and learning, it is critical that a test be supportive of good teaching. This makes it necessary to investigate the opinions of the test users, specifically the learners and the teachers.

2.3 TYPES OF ACHIEVEMENT TESTS

An achievement test is one of the means available to teachers and students alike of assessing progress. According to Hughes (1990:10), “achievement tests are directly related to language courses, their purpose being to establish how successful individual students, groups of students, or the courses themselves have been in achieving objectives”. To make the concept clearer and to distinguish it from other kinds of test, Harrison (1991) stresses that “an achievement test looks back over a longer period of learning than the diagnostic test” (p.7). He draws a clear distinction between achievement and diagnostic tests: achievement tests cover a much wider range of material than diagnostic tests and relate to long-term rather than short-term objectives. Achievement tests are designed to assess a whole course or even a number of courses. Students who have finished an English course sit the test and are evaluated on whether or not they have learnt it well. Their standards and differences are judged from the test results in relation to other students at the same stage. Diagnostic tests, on the other hand, look back over the previous course for persistent errors that call for remedial work. It can be inferred that diagnostic tests can be used to predict and improve future teaching and learning.

Additionally, Heaton, when widening the concept of achievement tests, defined them as the ones “based on what the students are presumed to have learnt, not necessarily on what they have actually learnt nor on what has actually been taught” (Heaton, 1991:172).

According to the time of administration and the designed objectives, achievement tests can be subdivided into two kinds: progress achievement tests and final achievement tests.

2.3.1 Progress achievement tests

Progress achievement tests are administered during the course, after a chapter or a term, and are often written by the teacher. They are based on the teaching program. Hughes (1990:12) claims that “these tests are intended to measure the progress that students are making.” Since “progress” means progress towards achieving course objectives, these tests should be related to those objectives. They should make a clear progression towards the final achievement tests based on course objectives. If the syllabus and teaching methods are appropriate to these objectives, progress tests based on short-term objectives will fit well with what has been taught. If not, there will be pressure to create a better fit.

Progress achievement tests are supposed to help the teacher to judge the degree of success of his or her teaching and to find out how much students have gained from what has been taught. Accordingly, the teacher can identify the weaknesses of the learners or diagnose the areas not properly achieved during the course of study.

In short, progress achievement tests can be regarded as a useful device that provides the students with a good chance to perform in the target language in a positive and effective manner and to gain additional confidence in doing so. They can be a good preparatory and supportive step towards the final achievement test, because the students become familiar with the tests and with strategies for doing them.

2.3.2 Final achievement tests

A final achievement test, as the name suggests, is usually a formal examination given at the end of the school year or at the end of the course to measure how far students have achieved the teaching goals (Hughes, 1990:10). Such tests may be written and administered by ministries of education, official examining boards, or members of teaching institutions. The content of these tests must be related to the courses with which they are concerned. Hughes (1990:11) suggests two approaches to designing achievement tests: the syllabus-content approach and the objective-content approach.

The syllabus-content approach means that the content of a final achievement test should be based on a detailed course syllabus or on the books and other materials used. Tests designed on the basis of what the students have already learnt from the course books can be considered fair tests. However, a badly designed syllabus or badly chosen material that differs from the course objectives may bring about misleading results which are unlikely to show what students have achieved. When this occurs, test results will fail to meet test validity in terms of course objectives.

The objective-content approach bases the test content directly on the objectives of the course. This approach has some good points. Firstly, it forces course designers to be explicit about course objectives. Secondly, it can help to work against poor teaching practice, which syllabus-content-based tests fail to do. However, this approach has to cope with the problem of testing what the students have neither learned nor prepared for.

Of the two approaches mentioned, Hughes (1990:11) favors the latter, arguing that it will provide more accurate information about individual and group achievement, and that it is likely to promote a more beneficial backwash effect on teaching.

2.3.3 Roles of achievement tests

The roles of achievement tests are clearly shown by McNamara (2000:6): “Achievement tests accumulate evidence during or at the end of a course of study in order to see whether and where progress has been made in terms of the goals of learning. Achievement tests should support the teaching to which they relate.” That is, achievement tests play an important role in the teaching-learning process. Besides bearing all the characteristics of a normal test, achievement tests can supply more accurate and fuller information because they look back on the course students have been following. The backwash effect of these tests can show teachers how appropriate or effective their teaching has been. Furthermore, results obtained from achievement tests enable teachers to become familiar with the work of each student and with the progress of the class in general (Heaton, 1991:1). With the results of final and progress achievement tests, teachers can keep regular track of their classes.

This type of test works mainly as a motivation to learning. It should “encourage the students to perform well in the target language and to gain additional confidence” (Heaton, 1990:171). When a good test is conducted and the test results are high, students may feel encouraged and try harder. Even a bad performance can be an incentive to work more, because progress achievement tests are administered so frequently during the course of study.

In short, the achievement test works as an assessment of both teachers’ and students’ performance during the whole course and as an encouragement to both teachers’ and students’ progress. Thus, it cannot be neglected in any syllabus or teaching program.

2.3.4 Practical tips for writing achievement tests

Harrison (1991) states that designing and setting an achievement test is a bigger and more formal operation than the equivalent work for a diagnostic test. An achievement test involves more detailed preparation and covers a wider range of material, as it relates to long-term rather than short-term objectives (p.64).

According to Harrison (1983:7), test writers need to draw up a test specification before writing a test. A test specification results from the process of designing test content and test method (McNamara, 2000:31). The specifications include information on the length, the structure of each part of the test, the type of materials, the extent to which the candidates will have to engage, the source of materials, the extent to which authentic materials may be altered, the response format, the test rubrics and how responses are to be scored. They are usually written before the test, and the test is then written on the basis of the specifications. After the test is written, the specification should be consulted again to see whether the test matches the objectives set in the specification.

Writing specifications is therefore an important step, because it ensures that item writers can write test items that appropriately measure whatever the test developers intend to measure, and that the range of conditions suitable for the test objectives will not be exceeded. When writing specifications, teachers should use an index card, writing the test objectives at the top and the table of specifications below. They should try not to repeat the wording of the objective and should increase the level of detail in preparation for writing test items. The final step is writing the items themselves and entering them on the back of the index card.

Harrison (1983:16) indicates the following factors to be taken into consideration when one sets up the table of specifications for a test:

- Time: The first factor teachers should attend to is how much can be tested in the time available for the test. They should allow a reasonable amount of time for the majority of the test takers to be able to complete the test. If not, a counter-productive effect will occur, as the students become too panicked and fearful to work under time pressure. Students who are not given enough time will not be able to demonstrate their full achievement. On the other hand, students who are given too much time to do a test can treat it like a puzzle rather than an actual language test.

- Coverage: The next important factor to be taken into account is determining the test content in terms of grammatical and functional items and skills, so that it accurately reflects the syllabus and objectives. It also involves determining whether the test should ask for the main idea, specific details, inferences, etc.

- Test techniques (subjective and objective methods): There are many techniques for testing both language and skills. Most of them are familiar only to teachers; they should also be made familiar to students before being used in a test. Heaton (1988:27) takes a similar view on test techniques, arguing that a good classroom test will usually contain both subjective and objective test items. Each method has its own strong and weak points. The reason he gives for such a combination of test techniques is that it helps to guarantee a high quality of the test. The choice of test type will depend on what has been taught, to what extent and how; that is to say, it depends on the syllabus.

Objective items (for instance, multiple-choice, matching and true-false items) can be marked very quickly and completely reliably because each has only one correct answer or a limited number of correct answers. This kind of test can be marked by a machine or by an inexperienced person. Objective tests, therefore, can produce reliable results and focus on accuracy and discrete items, but they provide an assessment of only a limited range of the students’ abilities. An objective test will be a very poor test if its items are poorly written, if irrelevant areas and skills are emphasized simply because they are testable, and if it is confined to language-based usage and neglects the communicative skills involved.

Subjective items (such as compositions, reports, letters and information transfer tasks), on the other hand, offer better ways of testing language skills and certain areas of language than objective questions. Subjective tests can provide information about the students’ wider command of communication, but that information may be supplied somewhat haphazardly and is not always easy to assess in a reliable way, though marking guides or performance descriptions can go a considerable way towards reducing this unreliability.

- Format: Another factor that needs to be focused on is the test format, that is, the form the test items are going to take. Teachers themselves have to decide the length of the test items as well as of the whole test, the number of questions, the kind of test methods used (objective or subjective), and finally the time allowance. More importantly, some guidelines should be given to the testees when determining the test format. Lastly, the test writers have to decide at this point whether to use an objective or a subjective format for each part of the test. This choice has important implications for the marking of the test.

- Difficulty: Another area that calls for teachers’ attention when constructing a test is difficulty. It involves choosing an appropriate level for each item or part of the test. The level of difficulty of the items included in the test should parallel that of the practice activities done by the students during the course. The kind of variation in item difficulty that is appropriate to placement or proficiency tests is not necessary in an achievement test, as it is not the primary aim of such a test to discriminate between strong and weak students.

- Rubrics: The test instructions should be clear and unambiguous; otherwise they will invalidate the test by misleading the students and turning the instructions into an additional, though unintended, test item (Dangerfield, 1985:150). The students may complete the items wrongly because they have misinterpreted the instructions. It is also advisable to provide an example of an answered test item where the format permits (e.g. in the case of multiple-choice or sentence transformation items but not, of course, in the case of compositions).

- Marking: Marking is an important but complicated part of the testing structure. It is usually the last step of the whole test-designing process and enables the tester to evaluate the testees’ performance in the test accurately. The marking scheme contains the keys, marking instructions, marking scale, etc., needed for each item and for the whole test.

- The most important point to be noted here is that the weighting of the different parts of the test should reflect the balance of the syllabus. Second, the weighting of marks should take into consideration the difficulty of a test item and, to an extent, the proportion of the overall test time that students are likely to need to complete those items. A final point in relation to marks is that, if the test includes an element which has to be marked subjectively, the teachers should give careful consideration not only to the proportion of the total marks for the test allotted to that element, but also to the criteria to be used for assessing it. Even when only one person is marking a set of test papers, it is important for reliability and consistency that marking should be done according to guidelines of one form or another.
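The principle that mark weighting should reflect the balance of the syllabus can be illustrated with a small calculation. The following is only a minimal sketch: the function name `allocate_marks`, the skill areas and the numbers of teaching periods are hypothetical and are not taken from the NQHS syllabus. It splits a total mark across test parts in proportion to syllabus time, using the largest-remainder method so that whole-number marks always add up to the total.

```python
from fractions import Fraction


def allocate_marks(syllabus_hours, total_marks):
    """Split total_marks across test parts in proportion to syllabus time.

    Uses the largest-remainder method so the whole-number marks
    always add up to total_marks.
    """
    total_hours = sum(syllabus_hours.values())
    # Exact proportional share for each part (may be fractional).
    exact = {part: Fraction(hours * total_marks, total_hours)
             for part, hours in syllabus_hours.items()}
    # Start from the whole-number part of each share.
    marks = {part: int(share) for part, share in exact.items()}
    leftover = total_marks - sum(marks.values())
    # Hand the remaining marks to the parts with the largest fractional remainder.
    by_remainder = sorted(exact, key=lambda p: exact[p] - marks[p], reverse=True)
    for part in by_remainder[:leftover]:
        marks[part] += 1
    return marks


# Hypothetical syllabus balance (teaching periods per area), not from the thesis.
syllabus_hours = {"reading": 30, "writing": 20, "grammar": 35, "vocabulary": 15}
weights = allocate_marks(syllabus_hours, total_marks=10)
print(weights)
```

Item difficulty and expected completion time, which the text also mentions, are deliberately left out of this sketch; a teacher would adjust the proportional starting point by judgment.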

2.4 MAJOR CHARACTERISTICS OF A GOOD TEST

The most important consideration in designing a language test is its usefulness, and this can be defined in terms of some basic test characteristics. Harrison (1983:10) claims that, to write a good test, it is essential for test designers to consider reliability, validity, discrimination and practicality. These four test qualities all contribute to test usefulness, so they cannot be evaluated independently of each other.

is marked and the uniformity of the assessment it makes

Moore (1992:110) treats a test as a measurement device and considers its reliability to be the consistency, dependability and trustworthiness of that device.

Bachman (1990:24), a leading testing expert, describes reliability as “a quality of test scores”. He points out that if a student receives a low score on a test one day and a high score on the same test two days later, the test does not yield consistent results, and the scores cannot be considered reliable indicators of the individual’s ability.

To sum up, reliability is a necessary characteristic of any good test. For a test to be valid at all, it must first be reliable as a measuring instrument; that is, it should measure precisely whatever it is supposed to measure.
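Bachman’s example of sitting the same test on two different days corresponds to what is commonly called test-retest reliability, which is usually estimated as the correlation between the two sets of scores. The sketch below is an illustration only: the source does not prescribe a statistic, the Pearson coefficient is an assumption, and all scores are invented.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for five students on the same test, two days apart.
day_one = [4.0, 5.5, 6.0, 7.5, 9.0]
consistent_retest = [4.5, 5.0, 6.5, 7.0, 9.0]   # students keep their rank order
erratic_retest = [8.0, 4.0, 9.0, 5.0, 6.0]      # low scorers now score high

r_consistent = pearson_r(day_one, consistent_retest)
r_erratic = pearson_r(day_one, erratic_retest)
print(round(r_consistent, 2), round(r_erratic, 2))
```

A coefficient close to 1 suggests the test ranks students consistently across sittings; a low or negative value corresponds to the inconsistent results Bachman describes.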

2.4.2 Test validity

Heaton (1991:159) defines validity as “the extent to which the test measures what it is intended to measure”. A test is considered valid when it specifically measures what it is supposed to assess. In other words, the interpretation made from its results is appropriate to the purpose of testing.

Similarly, Henning (1987) states that a test is valid if it measures accurately what it is intended to measure. This seems simple enough. However, it is not simple to say whether or not a test is valid, because validity has several different sub-kinds, such as face, content and construct validity, each of which deserves our attention. In this part I will present each aspect in turn.

2.4.2.1 Face validity

According to McNamara (2000:105), “face validity is a type of validity referring to the degree to which a test appears to measure the knowledge or abilities it claims to measure, as judged by an untrained observer (such as the candidate taking the test, or the institution which plans to administer it)”. Face validity is concerned with what teachers and students think of the test. Does it appear to them a reasonable way of assessing the students, or does it seem trivial, or too difficult, or unrealistic? A test which pretended to measure pronunciation ability but which did not require the candidate to speak might be thought to lack face validity. That is, face validity concerns the appeal of the test to popular judgement, typically that of other testers, teachers, moderators and test takers.

Alderson, Clapham and Wall (1995:173) recognized face validity as an influential factor in testing. According to them, while students’ opinions about tests are not expert opinions, they can be important because they represent the kind of response that you can get from the people who are taking the test. If a test does not appear to be valid to the test takers, they may not do their best, so the perceptions of non-experts are useful.

validity only if it included a proper sample of the relevant structures. Just what are the relevant structures will depend, of course, upon the purpose of the test.

What is the importance of content validity? “First, the greater a test’s content validity, the more likely it is to be an accurate measure of what it is supposed to measure. A test in which major areas identified in the specification are under-represented, or not represented at all, is unlikely to be accurate. Secondly, such a test is likely to have a harmful backwash effect. Areas which are not tested are likely to become areas ignored in teaching and learning” (Hughes, 1989:22).

So, in content validation, the experts should look at whether the test is representative of the skills they are trying to test. That is to say, the experts look at the content of the test and compare it with a statement of what the content ought to be. This involves looking at the syllabus, in the case of an achievement test, and the test specifications, and deciding what the test was intended to test and whether it accomplishes what it is intended to. In other words, content validity depends on a careful analysis of the language being tested and of the particular course objectives.

2.4.2.3 Construct validity

Davies (1999:33) states that “the construct validity of a language test is an indication of how representative it is of an underlying theory of language learning. Construct validation involves an investigation of the quantities that a test measures, thus providing a basis for the rationale of a test”.

For Arthur Hughes (1989:26), “A test is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure”. The word “construct” refers to any underlying ability which is hypothesized in a theory of language ability. Take reading, for example. We construct items related to reading ability and administer them as a pilot test. Then we take samples of reading ability and draw reliable scores from them. Finally, the two sets of scores are compared. If the correlation coefficient is acceptable, the test is said to measure reading ability. In other words, if we attempted to measure a particular ability in a part of a test, then that part of the test would have construct validity only if we were able to demonstrate that we were indeed measuring just that ability.
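The comparison described above is usually expressed as a correlation coefficient. As a minimal sketch, and not the procedure used in this study, the Python fragment below computes a Pearson correlation between pilot-test scores and an independent measure of reading ability. The function name pearson_r and both score lists are hypothetical and serve for illustration only; what counts as an acceptable coefficient is left to the tester's judgement.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: six candidates' pilot reading-test scores and
# their scores on an independent measure of reading ability.
pilot_scores = [12, 15, 9, 18, 14, 7]
criterion_scores = [11, 16, 8, 17, 15, 6]

r = pearson_r(pilot_scores, criterion_scores)
print(f"correlation between the two sets of scores: {r:.2f}")
```

A coefficient close to 1 would indicate that candidates are ranked in much the same way by both measures; a low coefficient would cast doubt on the claim that the pilot items measure reading ability.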

2.4.3 Relationship between reliability and validity

Test researchers and developers agree that reliability and validity are essential measurement qualities. This is because these are the qualities that provide the major justification for using test scores (numbers) as a basis for making inferences or decisions (Bachman and Palmer, 1996:19).

We often think of reliability and validity as two distinct but related characteristics of test scores. Although validity is the most important characteristic, reliability is a necessary condition for validity. The two measurement qualities, reliability and construct validity, are thus essential to the usefulness of any language test. Reliability is a necessary condition for construct validity, and hence for usefulness: to be valid, a test must provide consistently accurate measurements, and it must therefore be reliable.

Heaton (1990:6) considers reliability and validity two basic principles for writing useful tests. A reliable test, however, may not be valid at all. In other words, reliability is not a sufficient condition for either construct validity or usefulness. Suppose, for example, that we needed a test for placing individuals into different levels in an academic writing course. A multiple-choice test of grammatical knowledge might yield very consistent or reliable scores, but this would not be sufficient to justify using the test as a placement test for a writing course. This is because grammatical knowledge is only one aspect of the ability to use language to perform academic writing tasks.

It should be noted that a test can be reliable without possessing validity. Reliability is clearly inadequate by itself if a test does not succeed in measuring what it is supposed to measure. Test writers may try in vain to increase the validity of a reliable test, because its validity is limited by the features of the items that make it up. From the outset of test construction, therefore, validity should be the most essential focus. For example, a multiple-choice test may be very reliable, yet its validity is poor if it fails to measure what it is intended to measure.

Furthermore, Hughes (1989:22) emphasizes test validity: “the greater a test’s content validity, the more likely it is to be an accurate measure of what it is supposed to measure”. In other words, if major areas identified in the test specification are under-represented or not represented at all, the test is unlikely to be accurate. Further, such a test is likely to have a harmful backwash effect, because areas which are not tested will probably be ignored in teaching and learning.

Due to the importance of validity, a trade-off in favor of validity at the expense of reliability is sometimes accepted. Taking this perspective, this study focuses more on the validity of the final achievement test used at Ngo Quyen High School.

2.4.4 Practicality

All tests cost time and money to prepare, administer, score and interpret. Time and money are in limited supply, and so there is often likely to be a conflict between what appears to be a perfect testing solution in a particular situation and considerations of practicality.

A test must be practicable. Practicality plays a crucial role in deciding whether a test is good or not. This characteristic of a test involves administration, scoring, stationery, interpretation of results, etc. According to Brown (1994:253), a test is impractical if it takes 10 hours to complete or if it is prohibitively expensive. The duration of a test, for example, may affect its successful operation: students will feel very tired at the end of a long test, and their scores will surely be affected. Likewise, if the scoring is complicated, it will cost much time and money because staff will be needed to mark the students’ papers. The longer a test takes to construct, administer and score, the higher its costs are. Testers should avoid this.

In brief, “tests should be as economical as possible in time (preparation, sitting and marking) and in cost (materials and hidden costs of time spent)” (Heaton, 1991:172).

2.4.5 Discrimination

Another important feature of a test is discrimination. Heaton (1988:165) identifies the discrimination of a test as its capacity to discriminate among the different candidates and to reflect the differences in the performances of the individuals in the group. If a test is either too easy or too difficult, it cannot realize its purpose of discriminating between candidates. Therefore, the test items must cover a wide difficulty scale, ranging from “extremely easy items” to “extremely difficult items”. The items in the test should be spread over a wide range of difficulty levels as follows:

- extremely easy items

- very easy items

- easy items

- fairly easy items

- items below average difficulty level

- items of average difficulty level

- items above average difficulty level

- fairly difficult items

- difficult items

- very difficult items

- extremely difficult items

Similarly, Harrison (1994:14) defines discrimination as “the extent to which a test separates the students from each other”. The extent of the need to discriminate will vary depending on the purpose of the test. For example, if a placement test is able to discriminate efficiently among students, it will be much easier to divide them into suitable groups; similarly, in an achievement or a diagnostic test, the level of each individual will be clearly shown.
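Discrimination can also be estimated item by item. The sketch below assumes a classical item analysis of the kind commonly described in the testing literature: the facility value is the proportion of candidates answering an item correctly, and the discrimination index compares the upper and lower halves of the candidates ranked by total score. The function names and the ten candidates' data are hypothetical; this is an illustration, not an analysis carried out in this study.

```python
def facility_value(item_responses):
    """Proportion of candidates answering the item correctly (1 = correct, 0 = wrong)."""
    if not item_responses:
        raise ValueError("no responses given")
    return sum(item_responses) / len(item_responses)


def discrimination_index(total_scores, item_responses):
    """(correct answers in upper half - correct answers in lower half) / size of one half."""
    if len(total_scores) != len(item_responses) or len(total_scores) < 2:
        raise ValueError("need matching lists for at least two candidates")
    # Rank candidates from highest to lowest total test score
    ranked = sorted(zip(total_scores, item_responses), key=lambda pair: pair[0], reverse=True)
    half = len(ranked) // 2  # with an odd number of candidates the middle one is left out
    correct_upper = sum(response for _, response in ranked[:half])
    correct_lower = sum(response for _, response in ranked[-half:])
    return (correct_upper - correct_lower) / half


# Hypothetical data: total scores of ten candidates and their responses to one item.
totals = [45, 42, 40, 38, 35, 30, 28, 25, 20, 15]
item = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

fv = facility_value(item)
d = discrimination_index(totals, item)
print(f"facility value: {fv:.2f}, discrimination index: {d:.2f}")
```

An index near +1 means the item separates stronger from weaker candidates well; an index near 0 means it hardly discriminates, which is typical of items that are extremely easy or extremely difficult for the group.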

Conclusion: In this chapter, I have reviewed the literature on important issues related to language testing. These include the relationship between testing and teaching, which is often referred to as the "backwash effect", types of achievement tests, and the characteristics of a good test, with an emphasis on four important constructs, i.e. reliability, validity, practicality and discrimination. Of these constructs, validity, particularly content validity, seems to be the most important determinant, which gives the test the power to test what is to be measured.

The next chapter will present the study, which includes the participants, the methods of data collection and the data analysis.

CHAPTER 3: THE STUDY

In this chapter, the writer provides some information about the current situation of teaching and learning English and of language testing at NQHS in Hai Phong as the basic setting for the study. The rationale for the method chosen for the study is also presented here. The primary focus of this chapter is the data analysis and the findings from the data.

3.1 THE SUBJECTS AND THE CURRENT ENGLISH TEACHING, LEARNING AND TESTING SITUATIONS AT NQHS

3.1.1 Students and their backgrounds

Pupils studying at NQHS were selected from many lower secondary schools in the city after taking the entrance exam. Most of them had studied English for four years at lower secondary school. Because English was not a core subject which they had to take to enter upper secondary school, it was not paid much attention by either teachers or students. As a result, their level of proficiency in English was varied on the one hand, and unsatisfactory on the other.

At upper secondary schools, English is one of the six core subjects which are compulsory in the national examination. Therefore, English has become an important subject, especially for 12th form students. After two years of learning English at high school, the 12th form students study English for 105 periods, covering the last 16 units of the textbook Tieng Anh 12. They have three class hours of English every week.

As far as training targets are concerned, the primary concern of the students is getting good marks in written achievement tests, the national examination and even the university entrance examinations. Different motivations and different objectives lead to different ways of learning and different ways of teaching. The question here is how to test the students appropriately to meet their needs and the requirements of NQHS.

3.1.2 The English teaching staff

The English group consists of 15 teachers. All the English teachers were trained in Vietnam and none of them has ever been trained abroad. They are well trained and fairly experienced professionally, with at least 3 years of teaching. About a quarter of the teachers have completed or are taking an M.A. course. Therefore, most teachers are qualified enough to conduct communicative activities in a foreign language lesson. They can use English in class quite well. They also continuously extend their knowledge of general English and specialized subjects through self-study and in-country training programs.

3.1.3 English teaching and learning at NQHS

As one of the six core subjects which are compulsory in the national examination at the end of upper secondary school, English is paid much attention at every school in general and at NQHS in particular.

With the renovation in education, the English program at upper secondary schools has been redesigned recently. The seven-year English program is used nationwide in place of the previous three-year one. The purpose of the new program appears to be to narrow the gap between classroom English and real English. The course book focuses on the four skills and provides appropriate grammar and vocabulary. According to the content of the course book for 12th form students, after studying 16 units students are expected to have the following abilities:

Listening: Students are able to understand passages or dialogues related to the 16 topics in the textbook.

Speaking: Students are able to carry out conversations about culture, future life and sports events.

Reading: Students are expected to understand the general and detailed contents of reading passages of 280-320 words on the topics of the 16 units.

Writing: Students are able to write a letter of request, describe the world in the future and give instructions.

However, due to the lack of materials and equipment, the shortage of time, large classes, mixed-level students, and mixed motivations and expectations of learning English, the teachers have many difficulties in teaching effectively. We do not have a suitable laboratory for students to practice listening. Whenever teachers want their students to practice listening, they have to bring cassettes to the class. Teachers sometimes skip the speaking section because of the limited time. Also, writing is a challenge for students, so they are not very interested in doing writing tasks.

Furthermore, the main aim of teaching English at upper secondary schools is to help the students to get the best marks in written tests. Owing to the characteristics of the tests, teachers pay much attention only to reading and supply students with grammar rules and vocabulary.

3.1.4 The current testing situation at NQHS

In every school year, the 12th form students at NQHS have to take two final written achievement tests, at the end of the 1st and 2nd terms. All teachers are asked to design resource tests, whether or not they teach 12th form students. The tests are then collected and the final paper is compiled from three or four resource tests by the leader of the English group. Objective items such as multiple-choice questions are used for the final written achievement tests in order to achieve high test reliability and discrimination among test takers. In addition, to make it easier for teachers to score the examination papers, separate sheets are provided for the test takers to write down their answers.

However, I have learned that most language tests do not follow test specifications because none are available to the teachers. They design most tests by a cut-and-paste method, by which I mean they use commercially available tests to write their own without following any rules of testing. Testing techniques have not been paid proper attention and the role of testing in teaching and learning has not been fully recognized. Tests, therefore, may lack some major criteria of a good test concerning validity, reliability, format and practicality.

The first and foremost characteristic is that, except for carefully designed tests, some are of poor quality, contain misspellings, or are too difficult. This is because some test makers only try to fulfill their duty without considering the test’s effectiveness on the one hand, and most of them may not be aware of testing theories on the other. Those tests often fail to measure accurately whatever they are intended to measure.

Moreover, test content is sometimes found to be unrelated to the objectives of the course. The tests are likely to fail to measure some language skills such as speaking and listening. There is no listening part in the final achievement tests. Also, teachers have no chance to test learners’ speaking ability; instead, students’ pronunciation is measured by the phonetics section of the test, which does not seem to be an accurate measure of speaking skill.

More importantly, students are often clear about the test formats. As a result, most teaching practice and class activities are accordingly test oriented, and what will not be tested might be left uncovered. Students apparently develop an attitude of learning for testing and for grading only.

Despite these problems, the final achievement test at NQHS has never been empirically evaluated by test users. This study is the first attempt to explore test users' opinions of the test. To provide background information, Table 1 shows the structure of the test.

Table 1: Structure of the English final achievement test

Time allowance: 60 minutes (50 multiple-choice questions in total)

Part         Section  Task type                                                       Number of items  Marking scale
A Phonetics  I        Multiple choice (3 spelling, 2 word stress)
                                                                                      25 choices       25
C Writing    III      Multiple choice (correcting mistakes)                           5 sentences      5
             IV       Multiple choice (sentence transformation or sentence building)  5 sentences      5
D Reading    V        Multiple choice (gap filling)                                   5 gaps           5
             VI       Multiple choice (choosing the correct answer)                   5 sentences      5

It is known that testing is one of various forms of evaluation which provide information on the strengths and weaknesses in students’ achievement. Thus, is what we are doing worthwhile? Does what we are testing depict a true picture of our students’ abilities?

The next section provides an account of the methods used in this study to answer the above-mentioned questions.

3.2 RESEARCH METHODS

This study used a survey questionnaire and interviews as information collection tools. The overall purpose of the survey is to investigate the perceptions of teachers (test makers in the English group) and the 12th form students (test takers) at NQHS of the existing final achievement test, based on the criteria of a good test mentioned in (2.3). However, the major focus is on validity.

It is expected that the results of the survey will help to:

(1) find out the differences and similarities between teachers’ and students’ evaluations of the test validity

(2) achieve more accurate measures of students’ achievement with reference to the training objectives

The key methods applied to obtain reliable information are survey questionnaires and interviews. These methods help to collect and confirm different kinds of data. However, each has its own advantages and disadvantages.

12th form students

In examining the actual testing situation of the current 12th form achievement test, a fourteen-item questionnaire was given to 15 teachers, and a fourteen-item questionnaire was administered to 200 12th form students at NQHS. The questionnaire consists of one part collecting students’ and teachers’ opinions on the whole test and on each particular section of the test (Grammar and Vocabulary, Reading comprehension and Writing), and one part with questions on improvements to the final test. So that the subjects would have a clear idea of the content and be able to decide how to respond in a relevant way to each question, clear instructions were given at the beginning of the questionnaire. For every question, informants are asked to tick (√) the most appropriate column among “Strongly disagree”, “Disagree”, “Don’t know”, “Agree” and “Strongly agree”. The questionnaire is anonymous, so the participants feel more comfortable answering the questions.
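Responses on such a five-point scale are typically summarized as the percentage of informants choosing each option. The fragment below is a minimal sketch of that tally, assuming each response is stored as one of the five labels; the function name and the sample responses are hypothetical and are not data from this study.

```python
from collections import Counter

# The five response options of the questionnaire, in scale order
SCALE = ["Strongly disagree", "Disagree", "Don't know", "Agree", "Strongly agree"]


def tally_percentages(responses):
    """Return the percentage of respondents choosing each option on the scale."""
    if not responses:
        raise ValueError("no responses to tally")
    unknown = set(responses) - set(SCALE)
    if unknown:
        raise ValueError(f"responses outside the scale: {sorted(unknown)}")
    counts = Counter(responses)
    total = len(responses)
    # Options nobody chose are reported as 0 rather than omitted
    return {option: 100 * counts[option] / total for option in SCALE}


# Hypothetical answers of ten informants to a single questionnaire item
sample = (
    ["Agree"] * 4
    + ["Strongly agree"] * 2
    + ["Disagree"] * 2
    + ["Don't know"]
    + ["Strongly disagree"]
)

percentages = tally_percentages(sample)
for option in SCALE:
    print(f"{option}: {percentages[option]:.0f}%")
```

Keeping the options in a fixed list means every item is reported in the same column order, which makes the resulting tables directly comparable across questions.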

This research method has a number of advantages. It can reach a large number of people in a short time and, as a result, it obtains from the informants much of the data that researchers need; the survey data also appear to be valid because the answers are collected from many people. Moreover, the data collected are relatively easy to summarize and report, as all the informants answer the same questions. Most importantly, the questionnaires give the student-informants an opportunity to express their opinions and needs without fear of being embarrassed or of speaking their mind. Because confidentiality is ensured by not recording students’ names, the student-informants are more likely to give unbiased answers. Finally, using questionnaires is quite inexpensive.

However, this method inherently carries some disadvantages, which should not be allowed to affect the quality of the collected data. The first drawback is that there is little provision for the expression of unanticipated responses, since all the surveys have to follow a fixed format. The other disadvantage is that questionnaires used for research purposes are of limited utility in getting at the causes of problems or possible solutions. Accordingly, the questionnaire needs to be supplemented by other methods.

3.2.3 Interviews

The interviews were conducted with the teachers of the English group and with 12th form students who had just finished the English final achievement test administered by the school. The interviews are primarily based on the initial analysis of the valid questionnaires, in order to clarify any vague information from the questionnaire. I took notes of the interviews, which were analyzed in triangulation with the questionnaire data. The strong point of this method is that experiences and opinions are exchanged much more openly and directly. Nevertheless, some informants are shy and afraid of expressing their own ideas, which may make it difficult to collect some sensitive information.

3.3 DATA ANALYSIS

This section deals with the data collected from a survey of both the teachers and the students concerning their evaluation of the current final achievement test for 12th form students given at the end of the school year.

3.3.1 Data analysis of students’ survey questionnaires and interviews

Two hundred questionnaires (see the questionnaire in Appendix 1) were administered to two hundred 12th form students of NQHS.

The author collected the data by strata, with the aim of identifying differences in perceptions of the test among the students themselves. Therefore, the student population was divided into two separate groups. Group 1 consists of 100 students whose main subjects are math, physics and chemistry. They learn basic English only and are referred to as the A stream (A). The other group, with 100 students who learn advanced English, math and literature, is named the D stream (D).

The data collected from the students’ survey questionnaires with 14 questions in total are divided into 5 parts. Part 1 consists of 5 questions asking for students’ comments on the whole test. Parts 2, 3 and 4 collect their opinions on each particular section of the test: grammar and vocabulary (2 questions), reading comprehension (4 questions) and writing (3 questions). Each part is displayed in a separate table as follows:

Table 2: Students’ opinions on the whole test

Figures are percentages of the students in each stream.

              Strongly disagree  Disagree  Don’t know  Agree  Strongly agree
1 The test measures what the students have been taught.
  A stream    0                  73        12          10     5
  D stream    0                  14        5           48     33
2 The task types given in the test are familiar to the students.
  A stream    0                  0         15          56     29
  D stream    0                  0         3           68     29
3 Time allowance for this test is enough.
  A stream    15                 53        9           10     13
  D stream    2                  11        4           57     26
4 The weighting demonstrated on the marking scale of the test is appropriate.
  A stream    24                 37        13          15     11
  D stream    44                 24        7           13     12
5 In terms of test item format, this English test mainly intends to measure the students’ grammar and vocabulary knowledge.
  A stream    0                  0         25          63     12
  D stream    0                  0         7           72     21

Table 2 shows the information collected with 6 questions in which students are asked to state their views on whether or not the test relates to what they have been taught (Q1), whether the task types given in the test are familiar to them (Q2), the time allowance (Q3), the marking scale of the test (Q4), the main knowledge measured by the test (Q5), and whether the result of the test can encourage them to learn better (Q6).

As shown in Table 2, 48% and 33% of the students from the D stream agree and strongly agree, respectively, that the test can measure what they have been taught, while 14% disagree and 5% have no opinion. In contrast, only 15% of the students from the A stream share this view, and 73% hold the opposite view. This suggests that the test is quite easy for students of the D stream and does not evaluate their real ability, whereas the students from the A stream find the test far more difficult than what they have been taught.

As can be seen from the table, most of the students from both the A stream (85%) and the D stream (87%) agree that the task types given in the test are familiar to them. No students find them strange. Those who choose “agree” explain that they can do the test well simply because their teachers often give them similar task types and help them practice a great deal before taking the test.

According to the data, the majority of the students from the D stream (83%) think that the time allowance for the test is enough, while a small number (13%) insist that the test time should be longer. Meanwhile, 68% of the students from the A stream expect the time allowance to be longer, since the test is quite long and difficult for them.

Asked about their opinion on the marking scale, more than half of the students from both streams (61% of the A stream and 68% of the D stream) do not think it is appropriate, whereas 26% of the A stream’s students and 25% of the D stream’s think that it is acceptable. For students of the A stream, more points should be given to easier parts such as Grammar so that they can get better results, while the test results sometimes dissatisfy the students of the D stream because the test seems unable to discriminate them from weaker students. In their view, there should be more points for the reading or writing parts.

Concerning the main knowledge measured in the test, the majority of the students (>80%) agree that the test mainly intends to measure students’ grammar and vocabulary knowledge.

Table 3: Students’ comments on the Grammar and Vocabulary section

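Figures are percentages of the students in each stream.

              Strongly disagree  Disagree  Don’t know  Agree  Strongly agree
1 The Grammar and Vocabulary part is long enough and related to what students have been taught
  A stream    4                  5         7           68     16
  D stream    23                 29        4           24     20
2 A student who is given a high score in the grammar and vocabulary
  A stream    5                  9         8           55     23
  D stream    13                 20        33          21     13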

Date posted: 29/01/2014, 14:35
