COMPUTERIZED ADAPTIVE TESTING IN LANGUAGE EDUCATION: OPPORTUNITIES AND CHALLENGES IN ERA 4.0
Bui Thi Kim Phuong
(Hanoi University of Science and Technology)
Nguyen Quy Thanh, Le Thai Hung
(VNU University of Education)
Abstract: Computer technology is being widely applied in education in general and in assessment in particular, especially in the 4.0 era of digital transformation. In the field of language education, computerized adaptive testing (CAT) is not a new concept and has attracted increasing attention from scholars since the 1980s. CAT offers considerable benefits in language assessment as a reliable testing tool with high precision, time efficiency, motivating individualized testing experiences, and wide coverage of assessment options for various purposes. However, computerized adaptive language testing (CALT) raises certain concerns about test fairness and requires a number of considerations in the development and validation process. Based on a review of previous discussions and empirical studies on CALT, this article offers insight into the pros and cons of CALT and provides constructive suggestions for language education practices in Vietnam. It also hopes to further contribute to the literature on technological applications in educational assessment and measurement.
Keywords: computer technology in education, computerized adaptive testing, computerized adaptive
language testing, language assessment.
1 INTRODUCTION
The 4.0 era of digital transformation has impacted all aspects of education and promoted innovative assessment practices. In the field of language assessment, the application of computer technology has become more popular in homes and schools, thereby offering favourable conditions for a more efficient testing initiative - a computerized adaptive language testing system.
The adoption of CALT dates back to the 1980s. Larson and Madsen (1985) developed the first computer-assisted tests, in which students were allowed to access reference sources on computers to complete their writing tests. Then came a shift from paper-and-pencil tests to computerized tests, thanks to computer experts (Meunier, 1994). In 1986, a project at Brigham Young University introduced the first computerized adaptive language test, a testing system that used computers to tailor tests to the test-takers' level of language ability (Larson, 1989).
Since then, many computerized adaptive language tests have been developed in empirical studies for different languages, including French (Kaya-Carton et al., 1991; Burston, 1995; Laurier, 1999); Spanish (Larson, 1987); Japanese (Brown & Iwashita, 1996); German (Starr-Egger, 2001); and Chinese (Wang et al., 2012). It is noteworthy that DIALANG, a European project, offers web-based adaptive tests for the official EU languages along with Irish, Icelandic, and Norwegian (Chalhoub-Deville & Deville, 1999).
CALT has been widely applied to assess various aspects of English language proficiency (Tseng, 2016), more frequently with vocabulary (Vispoel, 1993, 1998; Vispoel, Rocklin & Wang, 1994; Laufer & Goldstein, 2004; Tseng, 2016; Aviad-Levitzky, Laufer & Goldstein, 2019; Mizumoto, Sasao & Webb, 2019) and receptive skills (Madsen, 1991; Kaya-Carton, Carton & Dandonoli, 1991; Chalhoub-Deville, 1999; Dunkel, 1999; Nogami & Hayashi, 2010; He & Min, 2017; Gawliczek et al., 2021), and less frequently with productive skills (Stevenson & Gross, 1991; Malabonga & Kenyon, 1999; Malabonga, 2000).
Now, an increasing number of computerized adaptive language tests have been developed; large testing companies, despite paying little attention to CALT in the past, have now paved the way for more adaptive language tests (Pathan, 2012). One of them, Educational Testing Service (ETS), has transferred high-stakes exams like the TOEFL, GRE, and GMAT to computerized adaptive formats (Rudner, 2010). Moreover, the application of CAT in language assessment has been the focus of discussion in various publications over the past decades (Larson & Madsen, 1985; Canale, 1986; Tung, 1986; Henning, 1987; Lange, 1990; Meunier, 1994; Brown, 1997; Chalhoub-Deville & Deville, 1999; Laurier, 2000; Chapelle & Douglas, 2006; Alderson, 2007; Ockey, 2009; Pathan, 2012; Khoshsima & Toroujeni, 2017; Okhotnikova et al., 2019).
In the context of Vietnamese education, CAT is not a well-established research field despite ever-growing interest in technological applications in education over the past decades, especially in the digital era. Few existing studies have been conducted on the development and validation of CAT (Giang & Hung, 2018; Hung, Hoa, et al., 2019; Hung, Thuy, et al., 2019). However, positive results have been obtained with UEd-CAT 1.0, an adaptive testing system developed by the University of Education (Hung & Ha, 2021). The system now provides teachers and students with free access to measures of 10th graders' mathematical ability and Vietnamese reading comprehension competence. Since adaptive testing is recognized as a growing trend in assessment and testing, the university will continue to invest in research and develop the system, increasing its security and efficiency, so that it can assess the capacity of learners at different levels and for different purposes as well as integrate CAT into adaptive learning and blended learning environments in Vietnam.
By reviewing both previous discussions and empirical studies on CALT, this paper provides some fundamentals of CALT, discusses the opportunities and challenges of CALT, and puts forward suggestions for future CAT and CALT practices in Vietnam.
2 COMPUTERIZED ADAPTIVE LANGUAGE TESTING
2.1 Overview
CALT refers to a testing system in which computers are used to generate a test that adjusts to a test-taker's language level. Figure 1 illustrates the CAT process (Thompson & Weiss, 2011).
The first component of CAT is a calibrated item bank, which serves as the test content. In the case of language assessment, the item bank consists of language items for language tests. All items in the bank are first calibrated using item response theory (IRT) and latent trait theory (Meunier, 1994). Three IRT models can be applied to build an item bank: the one-parameter model, which examines test items in terms of only one parameter, item difficulty; the two-parameter model, which analyzes both item difficulty and item discrimination; and the three-parameter model, which covers item difficulty, item discrimination, and pseudo-guessing. Once the item bank has been calibrated, it stores the items together with their statistical features, which are ready for the later algorithms in the system (Choi & McClenen, 2020).
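For illustration, the three models can be written as item response functions giving the probability that a test-taker of ability theta answers item i correctly. The notation below is the standard IRT formulation rather than anything taken from the sources cited above; b_i, a_i, and c_i denote item difficulty, discrimination, and pseudo-guessing respectively:

\[ P_i(\theta) = \frac{1}{1 + e^{-(\theta - b_i)}} \qquad \text{(one-parameter model)} \]
\[ P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}} \qquad \text{(two-parameter model)} \]
\[ P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}} \qquad \text{(three-parameter model)} \]

Calibration estimates these item parameters from response data, so that every item in the bank carries the statistical features the adaptive algorithms later rely on.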
The other components of CAT are the CAT algorithms, which decide the first item (starting point), choose succeeding items (item selection algorithm), score items to estimate the test-taker's ability (scoring algorithm), and check predefined criteria to end the test (termination criterion) (Thompson & Weiss, 2011). In a complete testing process, the test administration starts with an item selected from the calibrated item bank. This starting item can be chosen randomly or from a group of medium-difficulty items in the item bank (Oppl et al., 2017; Choi & McClenen, 2020). If the test-taker provides a correct answer, a higher-difficulty question is given; otherwise, a lower-difficulty question is given. In this repeated process, the test-taker's ability is estimated and recalculated based on the test-taker's performance until the system collects enough evidence to determine the candidate's language level, which means the stopping criterion has been reached.
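As a minimal sketch of how these components fit together, the following Python fragment (written for this article, not drawn from any of the systems cited; all function and variable names are illustrative) administers a simulated adaptive test under the two-parameter logistic model: it starts from a medium-difficulty item, updates a simple grid-based maximum-likelihood ability estimate after each response, selects the most informative remaining item, and stops when the standard error of the estimate falls below a threshold or the maximum test length is reached.

import math
import random


def p_correct(theta, a, b):
    """Two-parameter logistic (2PL) model: probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability level theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_ability(responses):
    """Grid-based maximum-likelihood ability estimate.

    responses: list of (a, b, score) tuples, score in {0, 1}.
    """
    grid = [x / 10.0 for x in range(-40, 41)]  # theta from -4.0 to +4.0

    def log_likelihood(theta):
        total = 0.0
        for a, b, score in responses:
            p = p_correct(theta, a, b)
            total += math.log(p) if score == 1 else math.log(1.0 - p)
        return total

    return max(grid, key=log_likelihood)


def standard_error(theta, responses):
    """Approximate standard error from the test information function."""
    info = sum(item_information(theta, a, b) for a, b, _ in responses)
    return 1.0 / math.sqrt(info) if info > 0 else float("inf")


def run_cat(item_bank, answer_fn, se_target=0.4, max_items=30):
    """Administer an adaptive test and return the final ability estimate."""
    theta = 0.0  # provisional ability estimate
    administered, responses = set(), []
    # Starting point: a medium-difficulty item (difficulty closest to 0)
    next_idx = min(range(len(item_bank)), key=lambda i: abs(item_bank[i]["b"]))
    while len(responses) < max_items:
        item = item_bank[next_idx]
        administered.add(next_idx)
        score = answer_fn(item)
        responses.append((item["a"], item["b"], score))
        theta = estimate_ability(responses)            # scoring algorithm
        if standard_error(theta, responses) <= se_target:
            break                                      # termination criterion
        remaining = [i for i in range(len(item_bank)) if i not in administered]
        if not remaining:
            break
        # Item selection: most informative remaining item at current theta
        next_idx = max(remaining, key=lambda i: item_information(
            theta, item_bank[i]["a"], item_bank[i]["b"]))
    return theta, responses


if __name__ == "__main__":
    random.seed(0)
    bank = [{"a": random.uniform(0.8, 2.0), "b": random.uniform(-3.0, 3.0)}
            for _ in range(200)]
    true_theta = 1.2  # ability of a simulated test-taker

    def simulate(item):
        """Respond correctly with the probability predicted by the 2PL model."""
        return int(random.random() < p_correct(true_theta, item["a"], item["b"]))

    est, resp = run_cat(bank, simulate)
    print(f"Estimated ability: {est:.2f} after {len(resp)} items")

Operational systems replace these simplifications with more robust estimators (for example Bayesian EAP scoring), exposure control, and content balancing, but the loop structure mirrors the process described above.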
2.2 Opportunities
CALT has enormous advantages over conventional fixed-format testing, including paper-based language tests and computer-based tests. In this paper, the opportunities of CALT development are discussed in terms of the high precision, time-saving potential, individualized testing experience, and wide range of assessment options offered by CALT.
2.2.1 CALT – a precise testing tool
Given that the item bank, one of the architectural components of CALT, is calibrated on the basis of IRT, CALT promises a higher level of test standardization than conventional tests like paper-based and computer-based tests. Many empirical studies have provided evidence for the precision of CALT in comparison with other test modes. Mizumoto et al (2019) conducted a study to develop and evaluate a computerized adaptive test, CAT-WPLT, the CAT version of the Word Part Levels Test (WPLT) designed two years earlier by Sasao and Webb (2017). The study concludes that the CALT version produced "similar or greater precision than the fixed-item counterpart" (p 120). Gawliczek et al (2021) recently conducted a study to develop an adaptive test of reading and listening, in which they undertook a comparative analysis between CALT and paper-based tests. One of the validity and reliability indexes presented in the study findings is the high rate of test-takers confirming their reading and listening ability based on the CALT results.
CALT is also regarded as an educational measurement tool with high precision by other researchers (Larson & Madsen, 1985; Olsen et al., 1989; Meunier, 1994; Choi, Kim & Boo, 2003; Giouroglou & Economides, 2003; Pathan, 2012; He & Min, 2017; Khoshsima & Toroujeni, 2017).
2.2.2 CALT – a time-efficient testing mode
Since CALT assesses language proficiency with test items fitted to each individual test-taker, questions that are too difficult or too easy for that test-taker are not selected and administered. As a result, both the testing time and the number of test items required for each test are considerably reduced while precise test results are still achieved. Olsen et al (1989), in their comparative study of paper-based, computer-based and adaptive tests, analyzed the testing time and reported that CAT took only one-fourth of the testing time of the paper-based tests, and about one-third to one-half of that of the computer-based tests. Madsen (1991) also revealed that the majority of CALT test-takers (over 80%) needed fewer than half the test items required by paper-based tests. Mizumoto et al (2019) reported that the CAT version determined the test-takers' language proficiency in approximately 10 minutes instead of the 20 to 30 minutes required by the paper-based version of the test.
The reduced demands for test items and testing time in CALT are also confirmed by other researchers (Vispoel, 1993; Meunier, 1994; Giouroglou & Economides, 2003; Pathan, 2012; Tseng, 2016; Khoshsima & Toroujeni, 2017; Okhotnikova et al., 2019). This distinct advantage of CALT and computer-based language testing therefore helps reduce the burdens of time and money involved in the test administration process (Pathan, 2012) and brings more accessible opportunities to test-takers in different places, even those with disabilities (Stone & Davey, 2011).
2.2.3 CALT – enhanced testing experiences
In an adaptive language test, the first test item is selected from a large item bank, while the subsequent test items are selected based on the test-taker's performance on the earlier items; as a result, each test-taker receives a specific test tailored to his or her level of language proficiency. When test-takers are provided with appropriate items with tailored features like difficulty and discrimination, they can focus better on the test, have better testing perceptions and feelings (Meunier, 1994; Wise, 2014), and relieve themselves of stress, boredom and fatigue (Giouroglou & Economides, 2003; Rasskazova et al., 2017). Gawliczek et al (2021) reported a higher level of motivation among test-takers of both high and low language proficiency. Moreover, as there are no time limits for each test item or the whole test, test-takers can control their answering speed to finish the test. In other words, the time anxiety and pressure that may negatively influence test results are removed (Meunier, 1994).
As soon as the test is finished, CALT takers can access real-time feedback with their defined proficiency level and suggestions for improvement (Dandonoli, 1989; Meunier, 1994; Burston & Neophytou, 2014; Gawliczek et al., 2021). The results are evaluated individually in reference to a certain level of proficiency and announced to the test-taker, not in relation to the performance of other test-takers. For test-takers who are eager to know their results, CALT brings a very positive experience compared to other conventional testing modes. It is also noteworthy that, because each test-taker receives a different set of items, little useful information can be passed on to create cheating opportunities, even among test-takers who are sitting next to each other or taking the test later. This approach, therefore, can ensure better security in CALT (Meunier, 1994; Pathan, 2012; Rasskazova et al., 2017; Okhotnikova et al., 2019). This advantage could account for the high rate of test-takers' readiness for CALT that Gawliczek et al (2021) reported.
2.2.4 CALT – a wide coverage of assessment options
Okhotnikova et al (2019) mentioned the wide range of language ability levels that can be evaluated in CALT as one of its biggest advantages, in addition to the positive features of computer-assisted tests. Rasskazova et al (2017) emphasized this strength of adaptive language tests by comparing them with linear tests, in which the highest and lowest levels of language proficiency are not correctly assessed if the tests target average-level test-takers. Olsen et al (1989) asserted that CALT can be developed for both norm-referenced and criterion-referenced assessment of test-takers' ability levels to meet specific educational goals. Adaptive tests can also be exploited for various assessment purposes, such as high-stakes testing (Alderson, 2007; Chapelle & Voss, 2008; He & Min, 2017), large-scale assessment (Chalhoub-Deville, 1999; Wen & Qinghua, 2002; Pathan, 2012; Khoshsima & Toroujeni, 2017), diagnostic assessment (Larson & Madsen, 1985; Alderson, 2007; Chapelle & Voss, 2008), and formative assessment (Giouroglou & Economides, 2005; Choi & McClenen, 2020).
Moreover, the use of mobile devices can facilitate the integration of multimedia features in the tests, the flexible administration of tests, and the supply of tests on demand (Triantafillou, Georgiadou, & Economides, 2008). The superiority of CAT to other testing modes leads to nearly unlimited applications in different disciplines, from education and training to professional teacher development in schools, workplaces, and even the military. In Education 4.0, when computer use is witnessed in all homes and schools, there is no doubt that CALT can satisfy increasingly greater assessment demands with diverse options and modern upgrades.
2.3 Challenges
Despite all the positive points that CALT brings to test-takers, scholars have presented a number of challenges related to test threats and design requirements. In the 4.0 era of digital transformation, some of the challenges are no longer the focus of attention, while others call for serious consideration and action to ensure more positive testing experiences with CALT.
2.3.1 CALT – test threats
In the digital era, when computers have become familiar devices in education, computer expertise (Madsen, 1991), digital presentation modes of test items, and test bias due to computer anxiety (Henning, 1991) are no longer the main sources of challenges for CALT. Other concerns about test fairness, however, should be brought into the discussion so that constructive suggestions can be offered.
Henning (1991) was concerned that the validity of CALT may be influenced by differences in testing time and speed, which relates to test fairness. Another concern about the CALT experience is that test-takers have no chance to return to earlier items to check and revise their answers (Giouroglou & Economides, 2003; Khoshsima & Toroujeni, 2017). Once the test-taker answers a question, the ability estimate is recalculated in order to administer the succeeding item, so this property of CALT may conflict with the testing habits of a number of test-takers. However, these concerns can be overcome through easier access to and greater familiarity with CALT.
Wainer & Eignor (2000) also raised concerns about the security of CALT when a number of critical items could be recalled and discussed among test-takers, which then affects the reliability of CALT. To overcome this challenge, the suggestion by Mizumoto et al (2019) of a strict test development process and the development of more test items for the item bank needs to be emphasized.
2.3.2 CALT – design considerations
Papers on frameworks, guidelines, and experience sharing all indicate that the development process of an adaptive assessment system requires an enormous investment of resources such as time, money, human labor, and cross-disciplinary expertise in language assessment and computer technology (Brown, 1997; Larson, 1998; Chapelle, 1999; Chapelle & Voss, 2008; Nydick & Weiss, 2009; Chen & Wang, 2010; Pathan, 2012; Rasskazova et al., 2017; Spoden, Frey & Bernhardt, 2018).
With regard to the unidimensionality assumption of the IRT models employed in the design of adaptive tests, many concerns have been raised by scholars (Jamieson, 2005; Chalhoub-Deville, 2010; Liu, 2019). Canale (1986) was concerned about a threat of CALT to language ability assessment, because language ability covers multiple constructs like cognition, knowledge, and contextual use of language. Young et al (1996), Norris (2001), Giouroglou & Economides (2003), He & Min (2017), and Okhotnikova et al (2019) expressed concern about the limitation of unidimensional CALT in handling open-ended questions and questioned the feasibility of CALT in assessing productive skills. Even though some automated scoring systems have been developed, there remains significant controversy over whether writing and speaking can be accurately evaluated only by human raters. It is suggested that more effort should be put into the development of CALT, especially automated scoring systems, to handle challenges from both speaking and writing assessment and to make CALT more communicative and authentic, as expected in modern language education philosophy as well as in technology and assessment practice (Canale, 1986; Young et al., 1996; Pathan, 2012; Okhotnikova et al., 2019).
As mentioned in the overview of CALT, the architectural design consists of two integral parts: a calibrated language item bank and the CAT algorithms. A large calibrated item bank is expected to ensure the content quality and the successful operation of a CALT system. Okhotnikova et al (2019) suggest that the CALT item bank should include more than a thousand items to guarantee coverage of all ability levels. Meunier (1994) reported that most adaptive tests used 100 or 200 items validated with the one-parameter IRT model and recommended the use of 2000 or more items when the two- and three-parameter IRT models are employed for validation procedures. Nydick & Weiss (2009) and Suvorov & Hegelheimer (2013) also emphasize the need for a substantially sized item bank to minimize the problem of cheating resulting from test-takers' memorization of items. Even though many empirical studies have been implemented, no specific minimum size for a CALT item bank has been established; therefore, more research is needed to arrive at standard indexes of the item bank for future CALT design projects.
In terms of design challenges related to the CAT algorithms, decisions on the starting point, item selection, scoring mechanism, and termination criteria still have to be made for each system. Researchers are still conducting further work to determine the optimal choices that maximize the potential of CALT in providing reliable and effective language assessment practices (Pathan, 2012; Khoshsima & Toroujeni, 2017).
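To give one concrete example of the choices involved (a convention that is common in the CAT literature rather than a rule prescribed by the sources above), many systems select the next item to maximize Fisher information at the current ability estimate and terminate once the standard error of that estimate drops below a chosen threshold \( \delta \):

\[ I_i(\hat{\theta}) = a_i^{2}\, P_i(\hat{\theta})\,\bigl[1 - P_i(\hat{\theta})\bigr], \qquad SE(\hat{\theta}) = \frac{1}{\sqrt{\sum_{j \in A} I_j(\hat{\theta})}} \le \delta, \]

where \( P_i(\hat{\theta}) \) is the two-parameter item response function given earlier, \( a_i \) the discrimination of item \( i \), and \( A \) the set of items already administered. Whether such defaults are optimal for a particular language construct and test-taker population is precisely the open question the studies above point to.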
In addition, in the 4.0 era of digital transformation, when computer technology benefits from many technological advances, attracts more interest in all fields, and is applied widely and inclusively in the education system and in language assessment practices, it offers more opportunities to strengthen CALT but also brings additional challenges. These relate to handling network security risks and to the requirements of continually maintaining, managing, and upgrading CALT systems, as well as modifying algorithms to satisfy society's increasing needs for education in general and assessment in particular. It is expected that these challenges can be overcome through considerable resource investments and cooperation among various disciplines, including language education, testing, assessment, programming, and computer system management.
3 CONCLUSION
With a thorough review of previous studies and publications, this paper has attempted to synthesize both the potential of and the considerations in the implementation of CALT in language assessment and to provide some suggestions for subsequent work and investment to solve the persisting problems in the context of Vietnamese education.
It is concluded that CALT is a computer-based testing approach that tailors language tests to the test-takers' level of proficiency. Besides the many benefits of a testing system integrating computer use, CALT has obvious advantages over conventional testing modes, such as improved reliability, time-saving potential, a motivating testing experience, and diverse testing options. However, CALT faces a number of challenges relating to test design and the administration process.
In the 4.0 era of digital transformation, as the computer has become an easily accessible educational tool and its technology has developed rapidly in schools and universities nearly everywhere, the above-mentioned challenges can be overcome to create more reliable, flexible, and efficient tests for different language assessment purposes, including summative adaptive assessment and high school graduation exams, as well as to promote adaptive language learning opportunities at all educational levels. It can be strongly believed that CALT will achieve growing popularity as an innovative assessment option in both research and practice in Vietnam.
REFERENCES
1 Alderson, C (2007) Computer-adaptive language testing Optimizing the role of language in
Technology-Enhanced Learning, 1.
2 Aviad-Levitzky, T., Laufer, B., & Goldstein, Z (2019) The new computer adaptive test of size
and strength (CATS): Development and validation Language Assessment Quarterly, 16(3),
345-368
3 Brown, A., & Iwashita, N (1996) Language background and item difficulty: The development of a
computer-adaptive test of Japanese System, 24, 199–206
4 Brown, J D (1997) Computers in language testing: Present research and some future
directions Language Learning & Technology, 1(1), 44-59.
5 Burston, J (1995) Practical design and implementation considerations of a computer adaptive
foreign language test: The Monash/Melbourne French CAT Calico Journal, 26-46.
6 Burston, J., & Neophytou, M (2014) Lessons Learned in Designing and Implementing a
Computer-Adaptive Test for English The EuroCALL Review, 22(2), 19-25.
7 Canale, M (1986) The promise and threat of computerized adaptive assessment of reading
comprehension In C Stansfield (ed.), Technology and language testing (pp 30-45) Washington,
DC: TESOL Publications
8 Chalhoub-Deville, M & Deville, C (1999) Computer adaptive testing in second language
contexts Annual Review of Applied Linguistics, 19, 273-99.
9 Chalhoub-Deville, M (1999) Issues in computer-adaptive testing of reading proficiency Cambridge:
University of Cambridge Local Examinations Syndicate
10 Chalhoub-Deville, M (2010) Technology in standardized language assessments In R Kaplan
(Ed.), Handbook of applied linguistics (2nd ed., pp 511–26) Oxford, England: Oxford University Press
11 Chalhoub-Deville, M (Ed.) (1999) Issues in computer-adaptive testing of reading
proficiency (Vol 10) Cambridge university press.
12 Chalhoub–Deville, M., & Deville, C (1999) Computer adaptive testing in second language
contexts Annual Review of Applied Linguistics, 19, 273-299.
13 Chapelle, C A (1999) Validity in language assessment Annual review of applied linguistics, 19,
254-272
14 Chapelle, C A., & Douglas, D (2006) Assessing language through computer technology
Cambridge, England: Cambridge University Press
15 Chapelle, C A., & Voss, E (2008) Utilizing technology in language assessment Encyclopedia
of language and education, 7, 123-134.
16 Chen, J., & Wang, L (2010, October) Computerized adaptive testing: A new trend in language
testing In 2010 International Conference on Artificial Intelligence and Education (ICAIE) (pp
725-728) IEEE
17 Choi, I.-C., Kim, K S., & Boo, J (2003) Comparability of a paper-based language test and a
computer-based language test Language Testing, 20, 295–320
18 Choi, Y., & McClenen, C (2020) Development of adaptive formative assessment system using
computerized adaptive testing and dynamic bayesian networks Applied Sciences, 10(22), 8196.
19 Dandonoli, P (1989) “The ACTFL Computerized Adaptive Test of Foreign Language
Reading Proficiency.” Modern Technology in Foreign Language Education: Application
and Projects, edited by F Smith Lincolnwood, IL: National Textbook.
20 Dunkel, P (1999) Research and development of a computer-adaptive test of listening comprehension in
the less-commonly taught language Hausa In M Chalhoub-Deville (Ed.), Issues in computer-adaptive testing of reading proficiency Cambridge: University of Cambridge Local Examinations Syndicate
21 Gawliczek, P., Krykun, V., Tarasenko, N., Tyshchenko, M., & Shapran, O (2021) Computer
Adaptive Language Testing according to NATO STANAG 6001 requirements Advanced
Education, 19-26.
22 Giang, N Q & Hung, L T (2018) Mô phỏng một bài kiểm tra thích nghi trên máy tính thông
qua phần mềm R Tạp chí Khoa học Giáo dục Việt Nam, 11, 6-11.
23 Giouroglou, H., & Economides, A (2003, June) Cognitive CAT in foreign language assessment
In Eleventh International PEG Conference "Powerful ICT Tools for Learning and Teaching",
PEG’2003 (Vol 28).
24 Giouroglou, H., & Economides, A (2005) An implemented theoretical framework for a common
European foreign language adaptive assessment In 3rd International Conference on Open and
Distance Learning, Greek Open University, Patra, Greece.
25 He, L., & Min, S (2017) Development and validation of a computer adaptive EFL test Language
Assessment Quarterly, 14(2), 160-176.
26 Henning, G (1987) A guide to language testing: Development, evaluation, research Cambridge,
MA: Newbury House
27 Hung, L T & Ha, N T (2021) “Xu thế kiểm tra, đánh giá năng lực người học trên nền tảng
công nghệ” Tạp chí Khoa học Giáo dục Việt Nam, 42, 1-6.
28 Hung, L T., Hoa, T T., May, D T & Huong, H L (2019) “Phát triển ngân hàng trắc nghiệm
thích ứng để đánh giá năng lực đọc hiểu môn Ngữ văn của học sinh lớp 10 trung học phổ thông”
Tạp chí Khoa học Giáo dục Việt Nam, 24, 54-59.
29 Hung, L T., Thuy, T T., Anh, T L., Dung, N T., Anh, N P., & Giang, N T Q (2019) Developing
Computerized Adaptive Testing: An Experimental Research on Assessing the Mathematical
Ability of 10th Graders VNU Journal of Science: Education Research, 35(4), 49-63.
30 Jamieson, J (2005) Trends in computer-based second language assessment Annual Review of
Applied Linguistics, 25, 228-242
31 Kaya-Carton, E., Carton, A S., & Dandonoli, P (1991) Developing a computer-adaptive test of
French reading proficiency In P Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice New York: Newbury House
32 Khoshsima, H., & Toroujeni, S M H (2017) Computer Adaptive Testing (Cat) Design; Testing
Algorithm and Administration Mode Investigation European Journal of Education Studies.
33 Lange, D L (1990) ‘Priority Issues in the Assessment of Communicative Language Abilities.”
Foreign Language Annals 23 403-407.
34 Larson, J W & Madsen H S (1985) Computer-adaptive language testing: Moving beyond
computer-assisted testing CALICO Journal, 2(3), 32-6.
35 Larson, J W (1987) Computerized adaptive language testing: A Spanish placement exam
36 Larson, J W (1989) S-CAPE: A Spanish computerized adaptive placement exam Modern
technology in foreign language education: Applications and projects, 277-289.
37 Larson, J W (1998) An argument for computer adaptive language testing Multimedia-Assisted
Language Learning, 1(1), 9-24.
38 Laufer, B., & Goldstein, Z (2004) Testing vocabulary knowledge: Size, strength, and computer
adaptiveness Language learning, 54(3), 399-436.
39 Laurier, M (1999) The development of an adaptive test for placement in French In M
Chalhoub-Deville (Ed.), Issues in computer-adaptive testing of reading comprehension (pp 122–135) Cambridge, UK: Cambridge University Press
40 Laurier, M (2000) Can computerized testing be authentic? ReCALL, 12(1), 93-104.
41 Liu, X (2019) Optimizing design of incorporating off-grade items for constrained computerized
adaptive testing in K-12 assessment (Doctoral dissertation, The University of Iowa).
42 Madsen, H (1991) Computer-adaptive testing of listening and reading comprehension In P
Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp 237-257) New York, NY: Newbury House
43 Malabonga, V (2000) Trends in foreign language assessment: The computerized oral proficiency
instrument NCLRC Newsletter
44 Malabonga, V., & Kenyon, D (1999) Multimedia computer technology and performance-based language
testing: a demonstration of the computerized oral proficiency instrument In M B Olsen (Ed.), Computer mediated language assessment and evaluation in natural language processing New Brunswick, NJ: Association for Computational Linguistics
45 Meunier, L E (1994) Computer Adaptive Language Tests (CALT) Offer a Great Potential for
Functional Testing Yet, Why Don’t They? CALICO journal, 23-39
46 Mizumoto, A., Sasao, Y., & Webb, S A (2019) Developing and evaluating a computerized
adaptive testing version of the Word Part Levels Test Language Testing, 36(1), 101-123.
47 Nogami, Y., & Hayashi, N (2010) A Japanese adaptive test of English as a foreign language:
Development and operational aspects In W J van der Linden, & C A W Glas (Eds.), Elements
of adaptive testing (pp 191–211) New York, NY: Springer.
48 Norris, J (2001) Concerns with computerized adaptive oral proficiency assessment Language
Learning & Technology, 5(2), 99-105.
49 Nydick, S W., & Weiss, D J (2009) A hybrid simulation procedure for the development
of CATs In Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing
Retrieved from www.psych.umn.edu/psylabs/CATCentral.
50 Ockey, G J (2009) Developments and challenges in the use of computer-based testing for
assessing second language ability The Modern Language Journal, 93, 836–847
51 Okhotnikova, A., Daminova, J., Muzafarova, A., & Rasskazova, T (2019) Challenges of
designing and administering computer-adaptive tests In 13th International Technology,
Education and Development Conference (INTED) (pp 5633-5636) International Academy of
Technology, Education and Development
52 Olsen, J B., Maynes, D D., Slawson, D., & Ho, K (1989) Comparisons of paper-administered,
computer-administered and computerized adaptive achievement tests Journal of Educational
Computing Research, 5(3), 311-326.
53 Oppl, S., Reisinger, F., Eckmaier, A., & Helm, C (2017) A flexible online platform for
computerized adaptive testing International Journal of Educational Technology in Higher
Education, 14(1), 1-21
54 Pathan, M M (2012) Computer Assisted Language Testing [CALT]: advantages, implications
and limitations Research Vistas, 1(4), 30-45.
55 Rasskazova, T., Muzafarova, A., Daminova, J., & Okhotnikova, A (2017) Computerized
language assessment: Limitations and opportunities eLearning & Software for Education, 2
56 Rudner, L (2010) Implementing the Graduate Management Admission Test Computerized
Adaptive Test In W J van der Linden & C A W Glas (Eds.), Elements of adaptive testing (pp 151–165)
57 Sasao, Y., & Webb, S (2017) The Word Part Levels Test Language Teaching Research, 21,
12–30
58 Spoden, C., Frey, A., & Bernhardt, R (2018) Implementing three cats within eighteen months Journal
of Computerized Adaptive Testing, 6(3).