Chapter 10:
Criteria and Test Types
1 Validity
Validity: the extent to which a test
measures what it is supposed to measure & nothing else (content)
Face validity
Content validity
Construct validity
Empirical validity
Face validity
A test that looks right to teachers, moderators & testees can be described as
having face validity
Sometimes regarded as a public relations exercise
Sometimes regarded as the most important of all types of validity
Content validity
Depends on a careful analysis of the language being tested & of the
particular course objectives
When constructing tests, writers should
first draw up a table of test
specifications (language skills, areas
included…)
Construct validity
A test having construct validity is capable
of measuring specific characteristics in
accordance with a theory of language
behavior and learning
For example, a test consisting of multiple choice items will lack construct validity if the communicative approach is adopted during the language course
Empirical/statistical validity
This kind of validity is obtained by comparing the
results of the test with the results of some criterion
measure such as:
An existing test, known to be valid and given at the same time
The teacher’s ratings or any other such form of
independent assessment given at the same time
The subsequent (later) performance of the testees on a certain task measured
by some valid test
The teacher’s ratings or any other such form of independent assessment given later
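The comparison with a criterion measure is usually summarised as a correlation coefficient. The sketch below is one possible way to do this, assuming a Spearman rank-difference coefficient, which the slides do not prescribe. The function names and all scores are hypothetical, and the simple ranking helper assumes no tied values.

```python
def ranks(values):
    """Rank values from 1 (highest) downwards; this sketch assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this simple sketch does not handle tied values")
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def rank_correlation(test_scores, criterion_scores):
    """Spearman rank-difference coefficient: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 testees")
    pairs = zip(ranks(test_scores), ranks(criterion_scores))
    d_squared = sum((a - b) ** 2 for a, b in pairs)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: scores on the new test and the teacher's ratings
# of the same six testees, collected at the same time.
new_test = [78, 65, 90, 52, 71, 60]
teacher_rating = [7, 6, 9, 4, 8, 5]
validity_coefficient = rank_correlation(new_test, teacher_rating)
print(round(validity_coefficient, 2))
```

A coefficient close to 1 suggests that the test ranks the testees in much the same order as the criterion measure does. The same calculation applies when the criterion is collected later.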
Summary (Validity)
The test situation
The technique used
Both are important factors in determining the overall validity of any test
2 Reliability (definitions)
If a test administered to the same candidates
on different occasions produces the same results, it is reliable
Reliability also denotes the extent to which the
same marks/grades are awarded if the same test papers are marked by
(i) 2 or more different examiners
(ii) the same examiner on different occasions
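Marker reliability in this sense can be checked by having two examiners mark the same papers and comparing the marks. The sketch below is a minimal illustration under assumed conditions: the function name, the `tolerance` parameter, and the marks are all hypothetical, and the slides do not specify any particular agreement measure.

```python
def marker_agreement(marks_a, marks_b, tolerance=0):
    """Compare two sets of marks awarded to the same test papers.

    Returns a pair: the share of papers whose two marks differ by at most
    `tolerance`, and the mean absolute difference between the marks.
    """
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [abs(a - b) for a, b in zip(marks_a, marks_b)]
    agreed = sum(1 for d in diffs if d <= tolerance)
    return agreed / len(diffs), sum(diffs) / len(diffs)


# Hypothetical marks out of 20 given by two examiners to the same 5 papers.
examiner_1 = [14, 11, 17, 9, 15]
examiner_2 = [14, 12, 17, 7, 15]
share_agreed, mean_difference = marker_agreement(examiner_1, examiner_2)
print(share_agreed, mean_difference)
```

The same function covers case (ii) if the two lists hold one examiner's marks from two different occasions.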
2 Reliability (affecting factors)
& the administration of the test
(1) test instructions (rubrics)
(2) personal factors like motivation & illness
(3) scoring of the test (the most important factor; objective tests overcome this problem of
marker reliability)
2 Reliability (measuring methods)
(1) Re-administering the same test (to the
same group of candidates) after a lapse of time
(2) Administering parallel forms of the test
to the same group (tests must be identical
in the nature of sampling, difficulty, length
& rubrics). If the correlation between the 2
tests is high, the test can be termed
reliable
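Both methods end with a correlation between two sets of scores from the same candidates. A minimal sketch of that calculation follows, assuming the Pearson product-moment coefficient is the correlation intended. The function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two score lists for the same candidates."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    covariation = sum((x - mean_1) * (y - mean_2) for x, y in zip(first, second))
    variation_1 = sum((x - mean_1) ** 2 for x in first)
    variation_2 = sum((y - mean_2) ** 2 for y in second)
    if variation_1 == 0 or variation_2 == 0:
        raise ValueError("scores in each list must not all be identical")
    return covariation / sqrt(variation_1 * variation_2)


# Hypothetical scores of five candidates on two sittings of the same test
# (or on two parallel forms).
first_sitting = [55, 62, 70, 48, 81]
second_sitting = [58, 60, 73, 50, 79]
reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

How high the coefficient must be before a test is called reliable is a judgment the slides leave open, so no threshold is built into the sketch.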
3 Reliability versus Validity
2 chief criteria for evaluating any test (an ideal test should be valid &
reliable)
The greater the reliability of a test, the less validity it usually has.
4 Discrimination
(1) To discriminate among different candidates
(2) To reflect the differences in the
performances of individuals in a group
The extent of discrimination needed will vary depending on the purpose of the test
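One common way to check whether a single item discriminates is to compare how the strongest and weakest candidates answered it. The sketch below assumes the upper-group/lower-group discrimination index, which the slides do not mention. The function name, the `fraction` parameter, and the data are hypothetical.

```python
def discrimination_index(results, fraction=0.5):
    """Item discrimination index D = (correct_upper - correct_lower) / group_size.

    `results` holds one (total_score, item_correct) pair per candidate.
    Candidates are ranked by total score; the top and bottom `fraction`
    of the group form the upper and lower groups.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in the range (0, 0.5]")
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    group_size = int(len(ranked) * fraction)
    if group_size == 0:
        raise ValueError("not enough candidates to form the two groups")
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    correct_upper = sum(1 for _, correct in upper if correct)
    correct_lower = sum(1 for _, correct in lower if correct)
    return (correct_upper - correct_lower) / group_size


# Hypothetical results: total test score and whether item 1 was answered correctly.
item_1 = [(38, True), (35, True), (33, True), (30, False),
          (27, True), (24, False), (20, False), (15, False)]
d_item_1 = discrimination_index(item_1)
print(d_item_1)
```

The index ranges from -1 to 1. A positive value means stronger candidates answered the item correctly more often than weaker ones, and a value near 0 means the item does little to separate them.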
5 Administration/Practicality
A test must be practicable, i.e. fairly straightforward to
administer (the length of time for
administering, collecting answer sheets,
reading instructions).
Another practical consideration concerns
the answer sheets and the stationery used
6 Test instructions to the candidates
All instructions are clearly written.
Samples are given.
Grammatical terminology should be
avoided.
7 Backwash effects
Def.: the influences of testing on teaching & learning
Positive backwash effect (reading tests leading to the
development of reading skills)
Negative backwash effect (objective tests reducing learners’ motivation)
Implications: influences of tests on the compilation of syllabus & language teaching programmes
1 Achievement/attainment tests
Class progress tests, the most widely used types of tests
Achievement tests, formal tests
Class progress tests
Measure the extent to which Ss have mastered the material taught in the
classroom, allowing Ss to show what they
have mastered
on teaching & motivation
& gain confidence
Achievement tests
scale, to show mastery of a particular syllabus
analysed & revised where necessary
particular approach to learning & teaching
adopted
2 Proficiency tests
Defining a student’s language
proficiency with reference to a
particular task which he/she will be required to perform (TOEFL, TOEIC)
In no way related to any syllabus or teaching programme
3 Aptitude tests
Designed to measure a student’s probable
performance in a foreign language which he/she has not started to learn
Generally, seeking to predict Ss’ probable strengths & weaknesses in learning a
foreign language by measuring
performance in an artificial language
4 Diagnostic tests
Achievement & proficiency tests:
frequently used for diagnostic purposes
such as diagnosing areas of difficulty Ss may have so that appropriate remedial
action can be taken later.
for groups of Ss rather than for individuals