A Study on the Validity of the Current Final English Test for the 2nd Semester Non-English Majors at Hanoi University of Industry


NGUYEN VAN BAC

A STUDY ON THE VALIDITY OF THE CURRENT FINAL ENGLISH TEST FOR THE 2ND SEMESTER NON-ENGLISH MAJORS AT HANOI UNIVERSITY OF INDUSTRY

(Nghiên cứu về tính xác thực của bài thi tiếng Anh cuối học kỳ thứ hai hiện nay dành cho sinh viên không chuyên tiếng Anh tại trường Đại học Công Nghiệp Hà Nội)

M.A Minor Programme Thesis

Field: Methodology Code: 60 14 10

Hanoi, 2010


Supervisor: Pham Thi Hanh, M.A


TABLE OF CONTENTS

CANDIDATE’S STATEMENT
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
LIST OF TABLES AND CHARTS

CHAPTER 1: INTRODUCTION
1.1 RATIONALE
1.2 SCOPE OF STUDY
1.3 AIMS OF STUDY
1.4 METHODS OF STUDY
1.5 RESEARCH QUESTIONS
1.6 DESIGN OF STUDY

CHAPTER 2: LITERATURE REVIEW
2.1 RELATIONSHIP BETWEEN LANGUAGE TESTING AND LANGUAGE TEACHING AND LEARNING
2.2 LANGUAGE TESTING
2.2.1 Purpose of language testing
2.2.2 Types of language testing
2.2.3 The current trends in language testing
2.3 QUALITIES OF A GOOD TEST
2.3.1 Reliability
2.3.2 Validity
2.3.3 Practicality
2.4 VALIDITY
2.4.1 Content or face validity
2.4.2 Response validity
2.4.3 Concurrent validity and predictive validity
2.4.4 Construct validity

CHAPTER 3: THE STUDY
3.1 THE SUBJECT AND THE CONTEXT OF ENGLISH TEACHING AND LEARNING AT HAUI
3.1.1 English teaching and learning context at HaUI
3.1.2 English testing for non-English majors at HaUI
3.1.3 Subject of the study
3.1.3.1 Students
3.1.3.2 Teachers
3.2 RESEARCH METHODS
3.2.1 Survey questionnaire
3.2.2 Interview
3.2.3 Document analysis
3.3 DATA COLLECTION PROCEDURE

CHAPTER 4: FINDINGS AND DISCUSSIONS
4.1 DATA ANALYSIS
4.1.1 Data analysis of students’ and teachers’ survey questionnaires and interviews
4.1.2 Data analysis of students’ scores
4.2 DISCUSSIONS
4.3 SUGGESTIONS FOR IMPROVING THE QUALITY OF THE CURRENT FINAL TEST FOR THE SECOND SEMESTER NON-ENGLISH MAJOR STUDENTS AT HAUI
4.3.1 Take the students’ language ability and knowledge into consideration
4.3.2 Make clear instructions for the test composers
4.3.3 Determine objectives of the test
4.3.4 Determine the content of the test

CHAPTER 5: CONCLUSION

REFERENCES
Appendix 1A
Appendix 1B
Appendix 2A
Appendix 2B
Appendix 3


LIST OF ABBREVIATIONS

6. OM: Other Major Students Group

7. SLA: Second Language Acquisition

8. TESOL: Teaching English to Speakers of Other Languages

9. TOEIC: Test of English for International Communication


LIST OF TABLES AND CHARTS

Table 1: A framework for language assessments
Table 2: The syllabus for the second semester
Table 3: Factors to consider in writing items and tasks
Table 4: Key points presented in the course book New Headway Pre-Intermediate
Table 5: List of test scores selected from the second semester final test

Chart 1: Opinions of students on the time allowance of the final test
Chart 2: Teachers’ comments on the time allowance of the test
Chart 3: Appropriateness of the final test in students’ opinion
Chart 4: Teachers’ comments on the test appropriateness
Chart 5: Test items that best measure students’ true ability, in students’ perception
Chart 6: Teachers’ opinions on the test items that best measure their students’ true ability
Chart 7: Students’ comments on the level of the Grammar and Vocabulary test
Chart 8: Teachers’ comments on the Grammar and Vocabulary test
Chart 9: Students’ comments on the difficulty level of the Reading Comprehension test
Chart 10: Teachers’ comments on the difficulty level of the Reading Comprehension test
Chart 11: Students’ comments on the appropriateness of the Writing test
Chart 12: Teachers’ comments on the Writing test
Chart 13: Students’ comments on the Listening Comprehension test
Chart 14: Teachers’ comments on the construct of the Listening Comprehension test


CHAPTER 1: INTRODUCTION

1.1 RATIONALE

With the emergence of globalization, English has proved its importance in most areas, including science, technology, telecommunications, media, culture and international relations. In Vietnam, English teaching and learning have drawn considerable attention not only from teachers and students but from society as a whole.

English is a non-major subject for most students of Hanoi University of Industry (HaUI). However, it receives more attention and time than any other basic subject in the training program. On average, students study English for five semesters: four are devoted to general English, and in the last semester students study English for Specific Purposes (ESP). Compared with other universities in Vietnam, the English program at HaUI is one of the longest.

Testing is an essential tool for evaluating students’ English level and achievement at Hanoi University of Industry. Testing and assessment are considered to shed light on both the nature of language proficiency and language learning. In other words, tests can provide an assessment of students’ ability to use the language. Each semester, students take three progress tests and one final achievement test. In the final achievement test, listening, speaking, reading, grammar knowledge and writing each account for 20% of the total mark.

Although many tests are used in the teaching process, as mentioned above, it is recognized that these tests may still not accurately evaluate students’ ability to use English. Some students perform excellently in class but obtain unsatisfactory test results, and vice versa. Other teachers at Hanoi University of Industry hold the same view and often complain that the current final achievement test for the second semester does not reflect the true language competence of their students. Some students and teachers also agree that what is taught in the program is not included in the test; therefore, the test does not seem to measure students’ achievement in the course or their expected linguistic skills and knowledge. From my interaction with other teachers, I gather that test writers often take test items from other sources rather than basing them on the course book and the syllabus given at the beginning of the course.


Another reason for choosing this topic is that test evaluation and assessment at Hanoi University of Industry appear not to receive proper attention. As a teacher of English, I have been involved in designing many kinds of tests for non-English major students at HaUI, but there have been no formal discussions, no systematic and comprehensive assessments, and no research on the appropriateness of the tests.

For the above-mentioned reasons, I have decided to choose the research topic “A study on the validity of the current final English test for the 2nd semester non-English majors at Hanoi University of Industry.” It is hoped that this study will be helpful for teachers in the English Faculty of Hanoi University of Industry who often participate in designing the progress tests and final achievement exams.

1.2 SCOPE OF STUDY

The scope of this minor thesis is limited to examining the validity of the current final test for second-semester non-English major students at Hanoi University of Industry.

Due to time limitations, the author cannot send the questionnaires to all non-English major students of Hanoi University of Industry. However, to obtain a broad view from teachers and students about the validity of the final test, the author gives the questionnaires to students of five faculties: the Economic Faculty, Chemistry Technology Faculty, Electronic Technology Faculty, Mechanical Technology Faculty, and Electrical Technology Faculty. The students surveyed are all university students; college students are not covered. Nor can the author survey and interview all the teachers of the English Department; instead, he selects experienced teachers who are regularly involved in designing tests for non-English major students and those who are currently teaching first-year students in their 2nd semester.

1.3 AIMS OF STUDY

The study aims to investigate the validity of the current final achievement test for 2nd semester non-English major students at Hanoi University of Industry. The specific aims of the research are as follows:


- to investigate the appropriateness of the current final test for the 2nd semester non-English majors in terms of time allowance, difficulty level and test content;

- to find out the teachers’ and students’ comments on the test validity;

- to provide some suggestions for improving the test in terms of its validity.

1.4 METHODS OF STUDY

Questionnaires are sent to teachers and students involved in teaching and learning in the second semester to collect information on their views of the test’s validity.

In addition, informal interviews and discussions are carried out with the teachers of English and their students to gain more information on the appropriateness of the test.

1.5 RESEARCH QUESTIONS

The study is conducted to find the answers to the following research questions:

1. Does the final test for non-English major students in the 2nd semester give a true picture of the students’ English competence in the view of teachers and students?

2. Does the test measure what it is purported to measure (i.e., is it valid)?

3. How can the test be made valid? In what ways should the current final test be improved?

1.6 DESIGN OF STUDY

The study consists of five chapters, organized as follows:

Chapter 1 - Introduction - provides the background to the study, identifies the problems, and states the aims, purpose and significance of the study, its scope, methods, research questions and design.

Chapter 2 - Literature review - presents a review of related literature that provides the theoretical background to testing and evaluation in general and test validity in particular. This review also provides an overview of other studies related to testing and evaluation, especially the evaluation of tests in terms of their validity.

Chapter 3 - The Study - provides information about the subjects of the study. It then describes the data collection instruments and the data collection procedure. The rationale for choosing these instruments is also provided.

Chapter 4 - Findings and Discussions - analyses and discusses the data collected to reveal the actual results and the validity of the final 2nd semester exam for non-English major students of HaUI. The causes of any problems and some implications for effective final achievement tests are also discussed.

Chapter 5 - Conclusion - summarizes the major findings, which are hoped to point to appropriate ways of enhancing the validity of final achievement tests for non-English majors. Limitations of the study and suggestions for further research are also given in this chapter.


CHAPTER 2: LITERATURE REVIEW

2.1 RELATIONSHIP BETWEEN LANGUAGE TESTING AND LANGUAGE TEACHING AND LEARNING

Shohamy (2000) introduced three dimensions of potential contributions of language testing (LT) to second language acquisition (SLA): (1) defining the construct of language ability; (2) applying LT findings to test SLA hypotheses; and (3) providing SLA researchers with quality criteria for tests and tasks. Shohamy also identified three dimensions of potential contributions of SLA to LT: (1) identifying language components for elicitation and criteria for assessment; (2) proposing tasks for assessing language; and (3) informing language testers about differences and accommodating these differences.

Regarding the relationship between language testing and language teaching and learning, Hughes (1989) held that it has both negative and positive sides. He believed that “too often language tests have a harmful effect on teaching and they fail to measure accurately whatever it is they are intended to measure” (1989: 1). He argued that good teaching may not produce good language tests, and vice versa.

Bachman (1990) supported the notion that language testing has positive effects on language teaching and learning. In his words, “advances in language testing are stimulated by advances in our understanding of the processes of language acquisition and language teaching” (1990: 3). On the use of language tests in education programs, he said that “the fundamental use of testing in an educational program is to provide information for making decisions, that is, for evaluation” (Bachman, 1990: 54). He held that through language testing we can evaluate learners’ achievement in each learning period and evaluate our own teaching methods, and that language tests can provide input into the language teaching process. It was also believed that language teaching and learning give raters and test makers more information and more input resources for designing and improving achievement tests.

In short, language testing and language teaching and learning are closely interrelated. Teaching and learning provide a rich source of language material for tests and, in turn, testing reinforces, improves and encourages the teaching and learning process. As Heaton (1988: 5) puts it, “teaching and testing are so closely related that it is virtually impossible to work in either field without being constantly concerned with the other.”


2.2 LANGUAGE TESTING

2.2.1 Purpose of language testing

Shohamy (1985: 6) made a distinction between classroom tests and external tests. Classroom tests are written and administered by teachers, while external tests are designed and administered by an external agency. The purposes of classroom tests are to find out whether what was taught in the program was successfully acquired; to evaluate and improve instruction; to obtain information on students’ progress and language knowledge; to help organize learning and teaching materials; to provide information for grades; to help diagnose students’ strengths and weaknesses in the language; and to motivate students to learn.

External tests, however, evaluate proficiency; decide whether to accept students into a certain program; provide information for administrative decisions (for example, special treatment for certain groups); assist in selection and grouping; help evaluate the curriculum; serve research purposes; and obtain information for grading.

Henning shares some ideas with Shohamy but explains the purposes of language testing in a different way. According to him, language tests serve diagnosis and feedback, screening and selection, placement, program evaluation, the provision of research criteria, and the assessment of attitudes and sociopsychological differences (Henning, 1987, pp. 1-4).

He states that the most common aim of language tests is to identify strengths and weaknesses in students’ learning ability. In this sense, diagnostic tests provide critical information to the student, the teacher and the administrator that should make the learning process more efficient.

Language tests can also be used to decide whether students should be allowed to participate in a particular program of instruction. To allow fair selection and decisions, the tests must be accurate in the sense that they provide information that is both reliable and valid.

Another use of tests is to classify students’ ability to learn languages. In this sense, tests are used to identify a particular performance level of the student and to place him or her at a suitable level of instruction.

In addition, tests are often used to provide information about the effectiveness of programs of instruction. In this sense, group mean or average scores are of greater interest than the isolated scores of individual students.


Language test scores can be used to provide research criteria, such as comparisons of methods and techniques of instruction, textbooks or audio-visual aids. Tests can also be used to assess students’ attitudes toward the target language, its people and their culture, which are essential elements of successful language learning.

2.2.2 Types of language testing

Harrison (1991) introduces four types of language tests: placement, diagnostic, achievement and proficiency. Placement tests are designed to sort new students into groups so that they can start the course at the same level as the other students in the class. The placement test is concerned with the student’s present standing; thus it relates to general ability rather than specific points of learning.

Diagnostic tests are used to check students’ progress in learning particular elements of the course. The test may be given at the end of a unit in the course book or after a lesson designed to teach one particular point. The diagnostic test tries to find out how well the students have learnt particular material, and it is closely related to the elements of the course which have just been taught.

Achievement tests look back over a longer period of learning than diagnostic tests. They aim to show the standard which the students have now reached in relation to other students at the same stage. Achievement tests cover a much wider range of material than diagnostic tests and relate to long-term rather than short-term objectives.

A proficiency test aims to assess the student’s ability to apply in actual situations what he or she has learnt. The test is not usually related to any particular course because it is concerned with the student’s current standing in relation to his or her future needs.

The following is a summary of the types of language test established by Harrison:


Category | Content | Purpose | Considerations
Placement | General reference forward to future | Grouping | Speed of results; variety of tests
Diagnostic | Detailed reference back to class work | Motivation; remedial work | Short term objectives; new examples of the materials taught
Achievement | General reference back to the course | Certification; comparison with others at the same stage | Decision about sampling; similar material to that taught in new context
Proficiency | Specific purposes; reference forward to particular applications of language acquired | Evidence of ability to use language in practical situations | Definition of operation needs; authenticity; context; strategies for coping

Table 1: A framework for language assessments (Source: Harrison, 1991: 5)

Henning (1987), however, develops different categories for types of language tests. He introduces seven categories: objective vs. subjective tests; direct vs. indirect tests; discrete-point vs. integrative tests; aptitude, achievement and proficiency tests; criterion- or domain-referenced vs. norm-referenced or standardized tests; speed tests vs. power tests; and other test categories (Henning, 1987, pp. 4-9).

Objective and subjective tests are distinguished on the basis of the manner in which they are scored. An objective test may be scored by comparing examinee responses with an established set of acceptable responses, or scoring key. A multiple-choice test is an example of this kind. A subjective test, on the other hand, is scored by opinionated judgment based on the insight and expertise of the scorer. Examples of this type are free compositions or cloze tests which permit all grammatically acceptable responses to systematic deletions from a context.

Direct tests are said to test language performance directly, whereas indirect tests tap true language performance indirectly. Direct tests usually take the form of spoken tests that rate language use in real communication situations. Indirect tests usually take the form of written tests, such as multiple-choice or cloze tests.

Discrete-point tests, as a variety of diagnostic test, are designed to measure knowledge or performance in very restricted areas of the target language. Integrative tests, on the other hand, are used to assess a greater variety of language abilities.

Aptitude tests are usually used to measure suitability for a specific program of instruction or a particular kind of employment. Achievement tests are used to measure the extent of what students have already learnt. Proficiency tests are most often global measures of ability in a language or other content area.

For criterion- or domain-referenced tests, the instructions are designed after the tests are created. The tests must match the teaching objectives perfectly, and they are useful when objectives are under constant revision. Such tests are also useful with small and/or unique groups for whom norms are not available. Norm-referenced or standardized tests, on the other hand, must have been administered to a large number of examinees from the target population. Acceptable standards of achievement can only be determined by reference to the mean or average score.

A speed test is one in which the items are easy but the time allowed appears insufficient. In contrast, a power test includes difficult items but allows sufficient time.

Henning (1987) also mentions some other categories of tests, including examinations vs. quizzes, questionnaires, single-stage and multi-stage tests, and language skill tests and language feature tests.

2.2.3 The current trends in language testing

Shohamy (1985: 5) identifies three trends in language testing. The first is the transition from discrete-point tests to integrative tasks. In the past, language tests were often based on independent items, such as supplying the correct verb form or selecting lexical elements. Nowadays, however, tests mostly aim at testing communicative competence, and the tasks are much broader, such as writing letters or comprehending a whole text along with specific elements in the text.

The second trend in language testing is the transition from indirect to direct or authentic tests. Until recently, test methods were mainly indirect, meaning that they bore little relation to real-life situations, whereas direct tests are closer to what test takers will encounter in real language use.

The last trend in language testing mentioned by Shohamy (1985) is the transition from knowledge-type to performance-type tests. In performance-type tests, test takers have to apply their knowledge of the language to perform certain functions, such as actually speaking or actually writing.

2.3 QUALITIES OF A GOOD TEST

According to Harrison (1991), the three most important characteristics of a good test are reliability, validity and practicality. According to Bachman and Palmer (1996), however, a test's usefulness can be determined by considering the following measurement qualities: reliability, construct validity, authenticity, interactivity, impact, and practicality. In this minor MA thesis, I will discuss three qualities of a good test, namely reliability, validity and practicality, with a special focus on validity in the next part of the thesis.

2.3.1 Reliability

A test is reliable if it consistently provides accurate measures of ability at all times, with different students and/or different testers. According to Harrison (1991: 10), “the reliability of the test is its consistency.” He stressed that a student’s score should be the same, or nearly the same, whether the student takes one test or another and whether the test is marked by one person or another, and that a test should measure the same thing every time. “There are three aspects to reliability: the circumstances in which the test is taken, the way in which it is marked and the uniformity of the assessment it makes” (Harrison, 1991: 11).

Henning (1987: 74) states that “Reliability is thus a measure of accuracy, consistency, dependability, or fairness of scores resulting from administration of a particular examination.” He adds that, since reliability is concerned with accuracy of measurement, reliability increases as the error of measurement is minimized. Therefore, we should attend to the amount of error present in our measurement so that reliability can be quantified.

The term reliability, according to Bachman and Palmer (1996), refers to consistency of measurement. More specifically, they say that a reliable test score is consistent across different characteristics of the testing situation. Moreover, if test scores are inconsistent, they provide no information about the ability being measured. Because it is impossible to eliminate inconsistencies entirely, we try to reduce variations in the test's task features.
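Henning's remark that reliability rises as measurement error falls can be made explicit with the standard decomposition from classical test theory. The notation below (observed score X, true score T, error E, reliability coefficient r_xx) is an assumption introduced here for illustration and is not taken from the sources cited above; the sketch also assumes that error is uncorrelated with the true score.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under these assumptions, the smaller the error variance is relative to the observed-score variance, the closer the reliability coefficient comes to 1.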

2.3.3 Practicality

A test must be well organized in advance with respect to time, space, classroom management, equipment and cost. “Practicality is the relationship between the resources that will be required in design, development, and use of the test and the resources that will be available for these activities” (Bachman and Palmer, 1996: 36). They point out that this quality differs from the others because it focuses on how the test is conducted. Moreover, Bachman and Palmer (1996) classify the resources in question into three types: human resources, material resources, and time. Based on this definition, practicality can be measured by the availability of the resources required to develop and conduct the test. Our judgment of a language test is therefore whether it is practical or impractical.


2.4 VALIDITY

According to Henning (1987), validity can be divided into empirical and non-empirical kinds. Non-empirical validity (e.g. content or face validity, response validity) does not require the collection of data or the use of formulae, while the empirical kinds of validity usually involve recourse to mathematical formulae for the computation of a validity coefficient. The common kinds of empirical validity are concurrent and predictive validity. Another kind of validity mentioned by Henning is construct validity. Sharing some ideas with Henning, Bachman (1990) introduced five main types of validity: content validity, criterion validity, concurrent validity, predictive validity and construct validity. In this minor MA thesis, I will classify the major types of validity based on the accounts of Henning (1987) and Bachman (1990).

2.4.1 Content or face validity

Testing specialists commonly treat content validity and face validity as synonyms (Magnusson, 1967). Others, however, distinguish between them and hold that face validity, unlike content validity, is often determined impressionistically.

Content or face validity is intuitive and logical but usually lacks an empirical basis. As its name suggests, this kind of validity is concerned with whether the content of the test is sufficiently representative and comprehensive for the test to be a valid measure of what it is supposed to measure.

The test content must be selective. For example, the content of an achievement test should be bound to the content of instruction, which in turn is constrained by the instructional objectives.

According to Bachman (1990), there are two aspects of content validity: content relevance and content coverage. Content relevance requires the specification of the behavioral domain in question and the attendant specification of the task or test domain. Content coverage is the extent to which the tasks required in the test adequately represent the behavioral domain in question. Demonstrating that a test is relevant to and covers a given area of content or ability is therefore a necessary part of validation.


2.4.2 Response Validity

Response validity refers to the extent to which examinees respond in the manner expected by the test developer. It concerns both the test takers’ manner of response and the instructions of the test. For example, if test takers respond in a haphazard and unreflective manner, their scores may not represent their actual ability. Moreover, if the test instructions are unclear and the test format is unfamiliar to the examinees, their responses may not reflect their true ability. In both cases, the test may be said to lack response validity.

2.4.3 Concurrent validity and predictive validity

“Concurrent validity is a kind of empirical criterion related validity” (Henning, 1987). This kind of validity is established from collected data, to which formulae are applied to generate an actual numerical validity coefficient. The coefficient obtained represents the strength of the relationship with some external criterion measure.

To validate a test of a particular ability in this way, one administers a recognized, reputable test of the same ability to the same persons concurrently with, or within a few days of, the administration of the test to be validated.

Bachman (1990) held that concurrent validity can be investigated by examining differences in test performance among groups of individuals at different levels of language ability, or by examining correlations among various measures of a given ability.

Predictive validity is closely related to concurrent validity. It is usually reported in the form of a correlation coefficient with some measure of success in the field or subject of interest. Predictive validity tells us how well test scores predict some future behavior (Bachman 1990: 250).
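A validity coefficient of the kind described in this section is commonly computed as a Pearson product-moment correlation between scores on the test being validated and scores on the criterion measure. The following is a minimal sketch in Python under that assumption; the function name pearson_r and both score lists are hypothetical illustrations and are not data from this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six students (illustration only):
# the test under validation and an established criterion test
new_test = [55, 62, 70, 48, 81, 66]
criterion = [58, 60, 75, 50, 85, 64]

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

For concurrent validity the criterion scores would come from a reputable test taken at about the same time; for predictive validity they would come from a later measure of success. A coefficient close to 1 indicates that the two sets of scores rank students in much the same way.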

2.4.4 Construct validity

Construct validity is empirical in nature because it involves the gathering of data and the testing of hypotheses. However, unlike concurrent and predictive validity, it does not have any one particular validity coefficient associated with it.

According to Henning (1987), the purpose of construct validation is to provide evidence that the underlying theoretical constructs being measured are themselves valid. Construct validation usually begins with a psychological construct that is part of a formal theory which enables certain predictions about how the construct variable will behave or be influenced under specified conditions. The construct is then tested under the conditions specified, and it is said to be valid if the hypothesized results occur and the hypotheses are supported.

Construct validity concerns the extent to which performance on tests is consistent with predictions that we make on the basis of a theory of abilities, or constructs (Bachman 1990: 255). In construct validation, it is necessary to examine patterns of correlations among item scores and test scores, and between characteristics of items and tests and scores on items and tests; to analyze and model the processes underlying test performance; to study group differences; to study changes over time; or to investigate the effects of experimental treatment (Messick 1989).


CHAPTER 3: THE STUDY

3.1 THE SUBJECT AND THE CONTEXT OF ENGLISH TEACHING AND LEARNING AT HAUI

3.1.1 English teaching and learning context at HaUI.

The English Faculty is one of the biggest faculties of Hanoi University of Industry. It has more than 150 teachers of English, who are divided into three divisions. One division is in charge of teaching English to English majors, another teaches English to secondary and vocational students, and the biggest one teaches English to all college and university non-English major students. All students of Hanoi University of Industry study English as their foreign language.

According to the objectives given in the syllabus, the teaching aims of the English course for non-English major students in the second semester are stated as follows:

In general, it helps enhance the knowledge and skills students have studied at the elementary level (the 1st term), as well as improve the General English level of students up to pre-intermediate level.

In detail, it aims to provide students with knowledge of vocabulary, grammar and pronunciation and to develop their listening, speaking, reading and writing skills based on natural and social science topics; give students orientation about the importance of English in their life and in their future jobs; and build and practise language learning skills as well as develop their own thinking and ideas when communicating in English.

Grammar: Grammatical points are improved and enhanced through each unit. All principles related to the grammatical points in each lesson are practiced effectively through group work and pair work.

Vocabulary: In this section, students have the chance to improve their own vocabulary considerably. The vocabulary provided is mostly related to the topic of each unit.

Skill work: In this part, students improve and develop their listening, speaking, reading and writing skills. These skills are integrated, and this integration will help students to uncover their creativity, which brings about the best learning result.

Everyday English (communication focus): Students are equipped with some cultural knowledge of English-speaking countries and with communication samples. In addition, students can enjoy songs and interesting conversation practice.


Writing: Students are able to write some short paragraphs about the topics related to each unit, such as writing about their last holiday, future plans, hometowns, etc.

The syllabus of the General English course for second-semester non-English major students of HaUI is described in the following table:

No.  Content  Theory  Practice  Test
1  Unit 1: Getting to know you!  4  4
2  Unit 2: The way we live  4  4
3  Unit 3: It all went wrong  4  4
4  Unit 4: Let’s go shopping!  4  4
5  Stop and check 1 + Progress test 1  1  1
6  Unit 5: What do you want to do?  4  4
7  Unit 6: Tell me! What’s it like?  4  4
8  Unit 7: Famous couples  4  4
9  Unit 8: Do’s and don’ts  4  4
10  Stop and check 2 + Mid-term test  1  1
11  Unit 9: Going places  4  4
12  Unit 10: Scared to death  4  4
13  Unit 11: Things that changed the world  4  4
14  Unit 12: Dreams and reality  4  4
15  Stop and check 3 + Progress test 2  1  1
16  Unit 13: Earning a living  4  4
17  Unit 14: Love you and leave you  4  4
18  Stop and check 4 + Revision  2

Table 2: The syllabus for the second semester

The course books employed by teachers and students of Hanoi University of Industry are the New Headway series by John and Liz Soars (2000) (elementary, pre-intermediate and intermediate), Talktime by Susan (2004), and TOEIC Analysts by Taylor (2006). For non-English majors in the second semester, the main course book is New Headway Pre-intermediate (2000) by Liz and John Soars. In addition, students are recommended to use another reference book, English Grammar in Use by Murphy, R. In the teaching process, teachers also use other materials to present and recycle the basic structures of English and to develop students’ proficiency in using these structures in certain contexts. The focus is also placed on reinforcing and improving students’ knowledge of vocabulary and their ability to communicate.

The major teaching points of the course book for the second semester are presented in Appendix 5.

3.1.2 English Testing for non English majors at HaUI

For each semester, students are required to take at least three progress tests and one final achievement test. During my teaching at HaUI, I have observed that testing is not the main concern of teachers. Testing has not been paid proper attention or carefully studied in terms of its validity, reliability, format and practicality.

Within the scope of this thesis, the study focuses on investigating the validity of the final achievement English test (for the second semester) for non-English major students who have been learning English for 120 class hours covering all 14 units of New Headway Pre-intermediate. Below is the format of the test administered to second-semester non-English majors, named Test 2 or the final achievement test.

Test 2, with a time allowance of 60 minutes, has a total score of 100 points and consists of the following parts:

Section A (20 points): Grammar and Vocabulary. This section includes 20 multiple choice questions and is marked out of 20 points.

Section B (20 points): Reading comprehension. This section contains 2 short reading passages with 10 multiple choice questions.

Section C (20 points): Listening. In the listening section, students are required to listen to several short conversations or short talks and then answer the questions. There are 10 multiple choice questions and 10 true/false questions, with 1 point for each correct answer.

Section D (20 points): Writing. Students have to do 5 sentence building questions by selecting the correct answers, then write a short paragraph about one of the topics given beforehand.

Section E (20 points): Speaking. In the speaking section, students usually introduce themselves and then talk about one of the topics they have been assigned (see Appendix 4).


3.1.3 Subject of the study

The subjects of this study are students and teachers from HaUI. This is the university which allots the most time to EFL, with 6 periods each week. In the university, English is learned not as a major but as an instrument. Students are oriented to take the TOEIC test in their graduation examination, which helps them a lot in their future work.

3.1.3.1 Students

150 first-year college students are selected from different classes of the Department of Computer Science, Department of Garment and Fashion Design, Department of Chemistry, Department of Economics, Department of Mechanical Engineering, Department of Electronic Engineering, and Department of Electrical Engineering. Of these, students of economics and computer science are considered to have better ability in using English, and they are required to take the TOEIC test in their graduation exam. Students of other departments, by contrast, are required to take a B level test when they graduate. Their English, as classified by the placement test, is at elementary and pre-intermediate level. Most students studied English for between 3 and 7 years at lower and upper secondary school. Among them, some learnt foreign languages other than English. As a result, their English proficiency varies considerably.

Based on the fact that students of Economics and Computer Science are judged to be better users of English and will take the TOEIC course and the TOEIC test before graduating, while students of other majors will take the other course, named Talktime, and a B level test as the requirement for their graduation, the author divided the student population into two separate groups. The first group includes the students of Economics and Computer Science (75 students, hereinafter referred to as EC) and the second one includes the other students (75 students, hereinafter referred to as OM) who participated in this survey.

3.1.3.2 Teachers

The English Department is one of the biggest departments of HaUI in terms of staff numbers. There are more than 140 teachers of English, who are in charge of teaching English to almost all students of HaUI, including vocational, college and university students. In this study, 15 teachers of English at HaUI are selected. They all have been teaching English to first-year students and have experience in test preparation. They are well trained and have at least three years of teaching experience. Ten of them have master’s degrees issued by the College of Foreign Languages, Vietnam National University, and by Hanoi University, and the rest are pursuing a master’s course in TESOL.

Both students and teachers are willing and enthusiastic to take part in this study.

3.2 RESEARCH METHODS

This empirical study is carried out with three data collection instruments, namely survey questionnaire, interview and document analysis. The overall purpose of these data collection instruments is to investigate the validity of the final achievement test administered to second-semester non-English major students. The author hopes that these instruments will help him to find out the judgments of teachers and students about the test validity. Moreover, the document analysis is expected to give more data about the test validity.

3.2.1 Survey questionnaire

Regarding validation methods, Huong (2000: 67) introduces three methods of test validation: internal, external and judgmental. She explains that internal methods refer to ways of investigating the internal structures of the test scores to ascertain the adequacy of the test scores as indicators of the latent traits, or samples of behaviors, or both. External methods investigate the relationship between test scores and other measures external to the test. Judgmental methods refer to methods which are based on the judgments of either experts or ordinary people about the test content and tasks.

Thus questionnaires are the key method employed to collect the data. They aim at collecting information indirectly from students and teachers. Regarding the benefits of survey questionnaires, Richards and Lockhart (1994: 10) emphasize that “survey questionnaires are useful ways of gathering information about affective dimensions of teaching and learning, such as beliefs, attitudes, motivation and preferences and enable a teacher to collect a large amount of information relatively quickly.”

The student survey questionnaire is chosen to carry out this survey because it has many benefits. In the first place, it can reach a large number of people in a short time. According to Nunan (1993: 143), “the time required to collect data is less compared to interviews.” It will take us only a week to hand out and collect data from the survey questionnaire, and one week to analyze the data. Secondly, it is quite easy and simple to summarize and report the collected data since all the informants are asked the same questions. The third advantage of the student survey questionnaire is that students are given a chance to express their thoughts without embarrassment, as their names are kept confidential. Last but not least, using survey questionnaires is inexpensive.

Description of survey questionnaire.

Before constructing the questionnaire items, the author carefully studied the objectives of the study so that each item is directly referenced against one or more of the research objectives. He also piloted the survey questionnaire to see its strengths and weaknesses in collecting data, so that he could make appropriate changes based on the comments of his colleagues and students.

* Survey questionnaire for students

The author designs 14 questions in the survey questionnaire for students, in which questions 1 to 4 collect students’ comments on the whole test. The next question collects students’ comments on the grammar and vocabulary section of the test. Questions 6 and 7 aim at gathering students’ opinions of the reading section, while the next three questions ask about the appropriateness of the writing section and the last question is about the speaking section. In this study, the data collected from the students’ questionnaires will be presented in separate tables corresponding to each section. In short, these questions were designed to find out how students evaluate the validity of the current final English test for second-semester non-English major students in terms of its construct, content and time allowance.

* Survey questionnaire for teachers

There are 14 questions in the survey questionnaire for teachers. It is basically similar to the one for students; however, some questions are different. Teachers are also asked to give their assessment of the validity of the test regarding the construct, the time allowance, the content of the test and the methods to improve the quality of the test, especially in terms of its validity.

To help students and teachers understand and be able to decide what and how to respond in a relevant way to a certain question, clear instructions are given at the beginning of the survey session. Additionally, the survey questionnaires are translated into Vietnamese, the learners’ and teachers’ L1, and the Vietnamese version is handed out to avoid misunderstandings.

3.2.2 Interview

An inevitable downside of the survey questionnaire, which slightly affects the quality of the collected data, is the lack of space for unpredicted responses. In order to lessen this imperfection, the informal interviews used in this study leave more room for informants to express their own ideas. Referring to the benefits of the interview, Nunan (1993: 150) states that “it gives the interviewer a great deal of flexibility. The interview can follow up a respondent’s answers to obtain more information and clarify vague statements.”

Description of interview

The informal interview is aimed at getting more information from students and teachers that they could not express when answering the questions in the survey questionnaire. The study uses the semi-structured type of interview (see Appendix 3 for interview questions).

* Interview with students

15 students from 5 classes will be selected randomly to attend the interview, which focuses on their judgments about the time allowance, the content and the construct of the test.

* Interview with teachers

10 teachers will be selected at random to take part in the interview, which also focuses on the content, the construct and the time allowance. In addition, they are asked about how to enhance the quality of the test in terms of its validity.


The final achievement test marks of 290 second-semester non-English major students are collected for the document analysis.

The collection of students’ results is aimed at discovering explicitly the range of students’ marks. Through the mark statistics, the author can assess the difficulty level of the test and the distribution of students’ test scores.
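The kind of mark statistics referred to here can be sketched as follows. This is only an illustration under stated assumptions: the marks are invented, the 100-point scale follows the test format described above, and the function name and the 10-point band width are hypothetical choices rather than the procedure actually used in this study.

```python
from collections import Counter
from statistics import mean, median, pstdev


def summarize_marks(marks, band_width=10, max_mark=100):
    """Return descriptive statistics and a banded frequency table of marks."""
    top_band = max_mark // band_width - 1  # a full mark falls in the last band
    bands = Counter(min(m // band_width, top_band) * band_width for m in marks)
    return {
        "n": len(marks),
        "min": min(marks),
        "max": max(marks),
        "mean": mean(marks),
        "median": median(marks),
        "sd": pstdev(marks),
        "bands": dict(sorted(bands.items())),
    }


# Hypothetical integer marks on a 100-point scale (not thesis data).
marks = [35, 42, 48, 55, 55, 61, 67, 70, 78, 100]
summary = summarize_marks(marks)

print(f"range: {summary['min']}-{summary['max']}")
print(f"mean: {summary['mean']:.1f}, sd: {summary['sd']:.1f}")
for lower, count in summary["bands"].items():
    print(f"{lower:3d}+: {'#' * count}")
```

A mean close to the top of the scale with marks bunched in the highest bands would suggest that the test is too easy for the group, while the opposite pattern would suggest that it is too difficult.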

3.3 DATA COLLECTION PROCEDURE

To collect information about the validity of the final achievement test for non-English major students at HaUI, first of all, 150 survey questionnaires were handed out to first-year students of different majors, as mentioned before. At the same time, 15 teachers of English were asked to complete the 14-question survey questionnaire designed for teachers only.

After the survey questionnaires were collected, interviews were carried out with 15 students and 10 teachers chosen at random to obtain more detailed data, which were recorded on paper.

Next, the marks of 290 students were collected randomly for the document analysis.

When data collection is accomplished, the data analysis will be initiated. The results of the survey questionnaires and interviews will be analyzed to investigate the validity of the final achievement test for second-semester non-English major students of HaUI.


CHAPTER 4: FINDINGS AND DISCUSSIONS

4.1 DATA ANALYSIS

In this section, the data collected from the survey questionnaires and the interviews with teachers and students will be analyzed and discussed. Moreover, an analysis of the scores will also be carried out to reveal the validity of the final achievement test for second-semester non-English major students.

4.1.1 Data analysis of students and teachers survey questionnaires and interviews

One hundred and fifty questionnaires were handed out to students of Hanoi University of Industry who had already taken the final achievement test for the second semester. Students were chosen from different faculties, including the Faculty of Computer Science, Faculty of Garment and Fashion Design, Faculty of Chemistry, Faculty of Economics, Faculty of Mechanical Engineering, Faculty of Electronic Engineering, and Department of Electrical Engineering.

The questionnaires for teachers were completed by 15 teachers who were in charge of teaching English to students in their second semester. Along with the questionnaire for teachers, I also had some informal interviews with those who participated in teaching students in the second semester and those who are in charge of composing tests or selecting test items.

As mentioned before, the questionnaires were used to collect students’ and teachers’ evaluations of the test in terms of time allowance, content and construct, so that the author can partly discover the validity of the test.

Time allowance of the final test


Chart 1: Opinions of students on the time allowance of the final test


As shown in Chart 1, 77.3% of students of other majors (OM) consider that the time allowance for the final test is not enough. When asked about the time allowance, most of them said they would like to have more time to finish their answers. On the other hand, only 24 percent of students of economics and computer science (EC) say they do not have enough time to do the test.

In contrast, the number of students who find the time sufficient is 55 out of 75 among EC students, while that of OM students is just 18/75. Only a few students (2 EC and 1 OM) consider that the time for the final test is too much. It seems that the test is too easy for EC students but too difficult for OM students.
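The percentages quoted for Chart 1 follow from the raw counts and the group size of 75. The short sketch below shows the arithmetic for the EC group; the counts are those reported in the text, while the helper name `percent` is a hypothetical one introduced only for this illustration.

```python
GROUP_SIZE = 75  # students in each group (EC and OM), as stated in the study


def percent(count, total=GROUP_SIZE):
    """Share of a group as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


# EC answers on the time allowance, as reported in the text.
ec_counts = {"not enough": 18, "enough": 55, "too much": 2}

for option, count in ec_counts.items():
    print(f"EC, {option}: {count}/{GROUP_SIZE} = {percent(count)}%")

# 58 of 75 students corresponds to the 77.3% reported for the OM group.
print(f"OM, not enough: {percent(58)}%")
```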


Chart 2: Teachers’ comments on time allowance of the test

As shown in the chart above, 47% of teachers judged that the time allowance of the test is enough, while 40% of them consider that students do not have enough time to complete the test, or that the test needs more time so that the test construct can be varied and teachers can evaluate students’ achievements more thoroughly. Only 13% agree that students have too much time to do the test.


Chart 3: Appropriateness of the final test in student’s opinion

When asked whether or not the test can measure what students have been taught, 85.3 percent of students in the EC group agree that the test is appropriate or can measure what they have been taught, while only 14.7% of them disagree. In the OM group, more than 57 percent say “yes” and nearly 43 percent say “no”.

Chart 4: Teachers’ comment on the test appropriateness


Concerning the appropriateness of the test, 60% of teachers (9 teachers) participating in this survey agree that the test content can measure what they have taught during the semester. However, 40% of teachers (6 teachers) believe that the content of the test does not


Content and construct

Chart 5: Test items best measure students’ true ability in students’ perception


The chart above reveals the opinions of students about the items which can best measure their true ability. 79 out of 150 students think that the “Grammar and vocabulary” section can reflect their true ability. In second place is the “Reading comprehension” section with 22%, whereas the percentage of students choosing the “Speaking” section is 13%. Only 10% select the “Writing” section and 8% assume that the “Listening” section can best measure students’ true ability. We can infer from the result that the “grammar and vocabulary” section and the “speaking” section are considered good items for measuring students’ ability and that these items should be employed in the final test.

Chart 6: Teachers’ opinions on test items best measuring your students’ true ability
