By analysing the corpus of 54,566 tokens extracted from the listening transcripts of the grade-10 English textbook series, the study reported two studies on lexical demands and lexical r
The Importance of English Textbooks in Classrooms
Textbooks are essential components of language education programs, serving as primary resources that deliver structured language lessons and guide teachers in lesson planning (Richards, 2001; Lau et al., 2018; To, 2018). They are widely recognized for their significant role in English language classrooms, offering a clear syllabus and instructional support that benefit both students and teachers in their language development journey (Nu, 2018).
English textbooks serve as essential resources for both students and teachers in the process of learning and teaching the language. They are designed with specific learning objectives, enabling teachers to plan their lessons efficiently and save valuable time. For students, the structured content of textbooks facilitates knowledge enhancement, ensuring a smooth learning progression. Additionally, these textbooks are often accompanied by supplementary materials such as workbooks, teacher guides, student books, and CDs, providing ample opportunities for practice and reinforcement.
English textbooks serve as essential tools for delivering content and knowledge, enabling students to enhance their communicative competence daily (Lau et al., 2018). These textbooks encompass crucial elements such as pronunciation, vocabulary, grammar, and the four main language skills, all of which are vital for achieving language comprehension (Hoang, 2018). In Vietnam, where the language environment is limited, textbooks play a critical role in facilitating English language learning.
The Features of the English Textbook Series
Textbooks play a crucial role in aligning with the English education curriculum, ensuring that students nationwide receive consistent content and evaluations (Hoang, 2018; Richards, 2001). As part of the MoET education project, which implements a ten-year curriculum, textbooks serve as the primary medium, designed according to the GEEC of MoET (2018). These essential materials include a series of student books, teacher books, and workbooks, providing comprehensive knowledge to learners (Nu, 2018). While student books and workbooks cater to students, teacher books support educators. Despite the variety of available textbooks, they consistently cover major themes related to human life, society, the environment, and the future (Hoang, 2018). Each textbook unit is structured into smaller parts that focus on communicative competence and linguistic knowledge, featuring sections for listening, reading, speaking, and writing practice, alongside new vocabulary, phonological items, and grammatical boxes to enhance language learning (Hoang, 2018).
In 2020, Circular No. 25/2020/TT-BGDĐT established the textbook selection process in accordance with the Education Law. General education institutions are required to convene meetings to propose a list of textbook series, which is then submitted to the Provincial Department of Education for upper secondary levels or the Department of Education and Training for primary and secondary levels. The Provincial Department of Education compiles these proposals and forwards them to the Council, which conducts discussions and votes to finalize a list of approved textbooks. The final results are then sent to the Provincial People’s Committee, which considers local socio-economic factors and budget constraints before approving a suitable textbook series as outlined in Decision No. 442/QĐ-BGDĐT.
Lexical Demands and Listening Comprehension
Lexical Demands and Lexical Coverage
To enhance vocabulary learning, learners should establish long-term goals to identify the necessary vocabulary range for understanding texts (Nation, 2022). According to Nation (2013), vocabulary demand refers to the specific words a learner must know to comprehend a text effectively. Research over the past decade has identified three lexical coverage thresholds for comprehension: 85%, 95%, and 98% (Laufer & Ravenhorst-Kalovski, 2010; Schmitt et al., 2011; Laufer, 2013; van Zeeland & Schmitt, 2013). For effective language instruction, it is recommended that students achieve at least 85% coverage (Nation, 2007; McLean, 2021). Laufer and Ravenhorst-Kalovski (2010) suggest that a 95% threshold is acceptable, while 98% is optimal for meaning-focused learning. Higher thresholds correspond to a lower density of unknown words: at 95% and 98% coverage, learners encounter only about one unknown word in every 20 and 50 running words, respectively.
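The link between coverage and unknown-word density can be made explicit with a brief worked example. The symbol c for coverage is introduced here purely for illustration. A text met at coverage c leaves a proportion 1 - c of running words unknown, so on average one unknown word appears in every 1/(1 - c) running words:

\[
\frac{1}{1 - 0.85} \approx 6.7, \qquad \frac{1}{1 - 0.95} = 20, \qquad \frac{1}{1 - 0.98} = 50.
\]

At the 85% threshold, therefore, roughly one running word in seven is unknown, compared with one in 20 and one in 50 at the two higher thresholds.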
Research across various contexts highlights the vocabulary demands necessary for understanding textbooks. In Saudi Arabia, an analysis of 22 EFL textbooks revealed that over 2,500 frequently used words were essential (Alsaif & Milton, 2012). In China, Sun and Dang (2020) found that high-school students need to master between 3,000 and 9,000 words from the BNC/COCA wordlists to achieve 95% to 98% coverage. Similarly, Nguyen (2020) studied English textbooks in Vietnam and determined that learners require knowledge of the first 5,000 word families from the same lists; however, the study did not account for the supplementary lists, which may have inflated the apparent difficulty of the textbooks. These findings indicate that a substantial vocabulary is crucial for students to comprehend educational materials effectively. This study aims to evaluate the vocabulary needed to reach the 95% and 98% coverage thresholds in eight English textbook series, focusing specifically on their listening components.
Vocabulary Knowledge and Listening Comprehension
Listening is a vital skill for English learners, as highlighted by Dunkel (1991) and Rost (2002), who describe listening comprehension as a process of receiving and decoding spoken discourse. Vocabulary knowledge significantly impacts listening skills, with studies by Matthews & Cheng (2015), Ha (2022), and Trang et al. (2023) emphasizing that learners' proficiency in listening comprehension is closely tied to their vocabulary. Key concepts such as lexical demand and lexical coverage, as noted by Nation (2016) and Ha (2021), are essential for assessing listening comprehension levels. Research by van Zeeland & Schmitt (2013) indicates that 95% lexical coverage is the minimum requirement for adequate comprehension, while 98% coverage is ideal for optimal understanding of listening texts. Furthermore, high-frequency words are crucial because learners can recognize them easily during listening tasks (Matthews & Cheng, 2015).
Word Families and Word-Frequency Lists
In 1993, Bauer and Nation introduced the concept of a word family, encompassing a base word along with its various derivational and inflectional forms, which facilitates systematic vocabulary learning and enhances learners' understanding of affixation. For instance, words like activated, activating, activator, inactivation, and reactivates all derive from the base word "activate," showcasing their connection within a word family. In educational contexts, word families serve as a benchmark for assessing vocabulary load and are integral to measuring lexical coverage (Nation, 2013; Bauer & Nation, 1993). According to Ha (2022), lexical coverage utilizes word families as a counting unit based on frequency lists.
The concept of word frequency is based on the occurrence density within specific corpora, particularly the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA). Introduced by Nation in 2012, the BNC/COCA wordlist is considered a more accurate measure of word frequency (Schmitt et al., 2017) and consists of 25 1,000-word-family levels along with four supplementary lists: Proper Nouns, Marginal Words, Transparent Compounds, and Acronyms (Trang et al., 2023). Vocabulary is categorized into high-frequency, mid-frequency, and low-frequency words, with high-frequency words identified as the first three 1,000-word levels, comprising essential function words and familiar content words (Schmitt & Schmitt, 2014). These high-frequency words represent the majority of running words in most texts (Nation, 2022; Yang & Coxhead, 2022) and help learners achieve up to 95% text coverage. Following high-frequency words, mid-frequency words span the fourth to ninth 1,000-word levels, while low-frequency words, found beyond the ninth level, represent the largest number of word families but contribute minimally to overall text coverage (Laufer, 2013).
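To illustrate how lexical demand is read off a frequency profile, the following minimal Python sketch accumulates coverage level by level and reports the first 1,000-word level at which a threshold is met. It is not the RANGE or VocabProfiler procedure. It assumes each running word has already been tagged with its BNC/COCA level, and the supplementary-list labels and function names are hypothetical. Following common practice in this literature, tokens from the four supplementary lists are counted as known.

```python
from collections import Counter

# Hypothetical labels for the four supplementary lists (assumed, not from any tool).
SUPPLEMENTARY = ("proper", "marginal", "compound", "acronym")


def cumulative_coverage(token_levels):
    """Return {level: cumulative coverage} for the 25 1,000-word levels.

    token_levels holds one label per running word: an int from 1 to 25,
    one of the SUPPLEMENTARY labels, or None for an off-list word.
    """
    if not token_levels:
        raise ValueError("token_levels must not be empty")
    counts = Counter(token_levels)
    total = len(token_levels)
    # Supplementary-list tokens are assumed known from the start.
    covered = sum(counts[label] for label in SUPPLEMENTARY)
    coverage = {}
    for level in range(1, 26):
        covered += counts[level]
        coverage[level] = covered / total
    return coverage


def level_needed(coverage, threshold):
    """First 1,000-word level whose cumulative coverage meets the threshold."""
    for level in sorted(coverage):
        if coverage[level] >= threshold:
            return level
    return None  # threshold not reachable within the 25 levels


# Toy profile of 100 running words (invented for illustration only).
sample = [1] * 80 + [2] * 8 + [3] * 5 + ["proper"] * 3 + [5] * 2 + [12] + [None]
profile = cumulative_coverage(sample)
demand_95 = level_needed(profile, 0.95)
demand_98 = level_needed(profile, 0.98)
```

In the toy profile, the first 3,000 word families plus the supplementary lists reach 95% coverage, while 98% requires the first 5,000.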
Text Length and Lexical Richness
The Impact of Text Length on Listening Comprehension
Miller (1956) suggests that the sequential processing of phoneme sequences can lead to listener fatigue, particularly with longer texts. This indicates that text length significantly influences comprehension, as it affects both memory retention and attentional capabilities (Forrin et al., 2018; Forrin et al., 2021; Andreassen).
Long texts can distract both readers and listeners, making it challenging for them to remember and understand the content. Unlike readers, listeners cannot revisit passages multiple times, which forces them to maximize their memory and decode information quickly (Wolf et al., 2019). Additionally, listeners face the challenge of processing excessive information under time constraints during listening tasks (Csikszentmihalyi, 1990). Consequently, the length of a text significantly impacts students' listening comprehension.
Lexical Richness and Lexical Diversity
Lexical richness refers to the variety of words within a specific text or sample, serving as a key metric for assessing learners' vocabulary (Daller et al., 2003; Kim et al., 2018). While it was traditionally equated with lexical diversity, recent studies have highlighted that lexical richness encompasses a broader spectrum, including both lexical diversity and sophistication (Jarvis, 2013; Read, 2000).
Lexical diversity, determined by the ratio of unique words and their repetition in a text, serves as a key indicator of linguistic complexity and language proficiency (Jarvis, 2013). Additionally, lexical sophistication predicts word difficulty and reflects vocabulary knowledge based on frequency levels, particularly focusing on advanced, low-frequency words (Laufer & Nation, 1995; Jarvis, 2013; Kim et al., 2018). In this study, lexical sophistication is defined traditionally and measured using a frequency-based approach, specifically targeting low-frequency words ranging from 4,000 to 25,000 word families in the BNC/COCA wordlist.
The Most Common Metrics of Lexical Diversity
The type-token ratio (TTR) is a well-known and straightforward index of lexical diversity, but it is significantly influenced by text length. To address this limitation, this study employs three reliable metrics: Moving Average Type Token Ratio (MATTR), Vocab-D (HD-D), and Measure of Textual Lexical Diversity (MTLD), which effectively reduce the impact of text length on lexical diversity assessment.
The Moving Average TTR (MATTR or MATTR50) is a refined version of the simple TTR designed to minimize the influence of text length on the diversity estimate (Covington & McFall, 2010). MATTR50 moves a 50-word window through the text one word at a time and computes the TTR of each window. These window TTRs are then averaged to produce the final MATTR result (Zenker & Kyle, 2021).
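The contrast between the simple TTR and the moving-average version can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the software used in the study: the text is assumed to be already tokenized and lower-cased, the function names are hypothetical, and texts no longer than the window simply fall back to the plain TTR.

```python
def ttr(tokens):
    """Simple type-token ratio: distinct words divided by running words."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def mattr(tokens, window=50):
    """Moving-average TTR: mean TTR over every window of `window` tokens,
    sliding forward one token at a time."""
    if len(tokens) <= window:
        return ttr(tokens)  # fallback for short texts (an assumption)
    ratios = [
        ttr(tokens[start:start + window])
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# A two-word pattern repeated: the vocabulary never changes, only the length.
short_text = ["a", "b"]
long_text = ["a", "b"] * 10
```

With these toy inputs, `ttr` drops from 1.0 for `short_text` to 0.1 for `long_text` although the vocabulary is identical, whereas `mattr(long_text, window=2)` remains 1.0. This is the length sensitivity that the moving window is meant to remove.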
The hypergeometric distribution diversity (HD-D) index is a statistical metric that assesses lexical diversity by calculating, for each word type in a text, the probability of encountering that type in a random sample of 42 words drawn from the text (McCarthy & Jarvis, 2010). The final value of lexical diversity is obtained by summing these probabilities across all types.
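The calculation described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function name is hypothetical, the text is assumed to be pre-tokenized, and the probability for each type is taken to be that of drawing it at least once in a sample drawn without replacement (the hypergeometric case).

```python
from collections import Counter
from math import comb


def hdd(tokens, sample_size=42):
    """Sum, over all types, of the probability that the type occurs at least
    once in a random sample of `sample_size` tokens drawn without replacement."""
    n_tokens = len(tokens)
    if n_tokens < sample_size:
        raise ValueError("text is shorter than the sample size")
    all_samples = comb(n_tokens, sample_size)
    total = 0.0
    for frequency in Counter(tokens).values():
        # Probability that the sample misses this type entirely;
        # comb() returns 0 when too few other tokens remain to fill the sample.
        p_absent = comb(n_tokens - frequency, sample_size) / all_samples
        total += 1.0 - p_absent
    return total
```

The sum can be read as the expected number of types in a 42-word sample. Dividing it by the sample size would place the score on the familiar 0 to 1 TTR scale; whether to do so is a reporting choice and is left out of this sketch.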
The Measure of Textual Lexical Diversity (MTLD) is the average length of the token strings that maintain a specified type-token ratio (TTR) value of 0.72, calculated in both forward and backward directions (Koizumi, 2012; Zenker & Kyle, 2021). MTLD segments the text into strings, each ending at the word where the TTR reaches the threshold, while excluding strings shorter than 10 words. If the final string does not reach the 0.72 TTR by the end of the text, the remaining portion is counted proportionately along the trajectory from 1.00 to 0.72 (Koizumi, 2012). The total number of tokens in the text is then divided by the number of times the TTR falls to 0.72 or below, plus the decimal of the remainder, i.e.

\[
\text{MTLD} = \frac{\text{total tokens}}{\text{number of times TTR} \le 0.72 \; + \; \text{decimal of the remainder}}
\]

The forward and backward values are finally averaged to produce the final lexical diversity score (McCarthy & Jarvis, 2010; Koizumi, 2012; Zenker & Kyle, 2021).
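The MTLD procedure can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the tool used in the study: function names are hypothetical, the text is assumed to be pre-tokenized, a string is closed when its TTR is 0.72 or below, and the 10-word minimum string length mentioned above is not implemented.

```python
def _mtld_one_direction(tokens, threshold=0.72):
    """Tokens divided by the number of completed strings plus the partial remainder."""
    factors = 0.0
    types = set()
    count = 0
    for token in tokens:
        count += 1
        types.add(token)
        if len(types) / count <= threshold:
            factors += 1.0           # a full string has been completed
            types, count = set(), 0  # start the next string
    if count > 0:
        # Partial string: how far its TTR has fallen from 1.00 towards the threshold.
        remainder_ttr = len(types) / count
        factors += (1.0 - remainder_ttr) / (1.0 - threshold)
    if factors == 0:
        raise ValueError("TTR never drops below 1.00; MTLD is undefined")
    return len(tokens) / factors


def mtld(tokens, threshold=0.72):
    """Average of the forward and backward one-direction values."""
    forward = _mtld_one_direction(tokens, threshold)
    backward = _mtld_one_direction(list(reversed(tokens)), threshold)
    return (forward + backward) / 2
```

For example, `mtld(["a", "a", "a", "a"])` completes a string every two tokens in both directions, giving 4 / 2 = 2.0. Raising an error when no repetition occurs is one possible convention; other implementations may handle that case differently.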
Lexical Sophistication
Lexical sophistication is closely linked to lexical richness and serves as a key indicator of learners' vocabulary (Kim et al., 2018). Research by Hashimoto & Egbert (2019) highlights that lexical sophistication can predict word difficulty and reflects vocabulary knowledge based on frequency levels (Jarvis, 2013). This concept also pertains to advanced words that occur at low frequency levels (Laufer & Nation, 1995; Jarvis, 2013; Kim et al.).
Kyle and Crossley (2015) proposed that lexical sophistication is linked not only to word frequency but also to the depth and breadth of vocabulary, highlighting its multidimensional nature (Kim et al., 2018).
Few studies have thoroughly examined lexical sophistication as a multi-dimensional concept. Therefore, this study defines lexical sophistication using traditional definitions and assesses it through a frequency-based approach, focusing on low-frequency words from 4,000 to 25,000 word families in the BNC/COCA wordlists.
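A frequency-based sophistication measure of this kind can be illustrated with a short Python sketch. It assumes each running word has already been tagged with its BNC/COCA 1,000-word level (an int, with any other label for supplementary or off-list words), and it reports the share of running words in the 4,000 to 25,000 range. Whether the study counts tokens or types is not stated here, so the token-based ratio and the function name are assumptions made for illustration.

```python
def sophistication_ratio(token_levels, low=4, high=25):
    """Share of running words whose 1,000-word level lies between low and high."""
    if not token_levels:
        raise ValueError("token_levels must not be empty")
    sophisticated = sum(
        1
        for level in token_levels
        if isinstance(level, int) and low <= level <= high
    )
    return sophisticated / len(token_levels)


# Toy profile (invented): two of the eight running words fall in levels 4 to 25.
toy_levels = [1, 1, 2, 4, 10, None, "proper", 3]
toy_ratio = sophistication_ratio(toy_levels)
```

Here `toy_ratio` is 0.25, since only the words at levels 4 and 10 fall within the targeted range.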
This chapter provides an overview of fundamental theories pertinent to the research, focusing on vocabulary demands, lexical richness, and the role of English textbooks. The subsequent section will delve deeper into these concepts and their implications for the study.
The Related Studies section summarizes previous studies that used the same method or recruited participants with the same features.
2.6 Studies on the Vocabulary of English Textbooks in EFL Contexts
2.6.1 Research on vocabulary in English textbooks in EFL contexts
Yang & Coxhead (2022) conducted a study on the vocabulary corpus from Books 3 and 4 of the New Concept English Textbook Series (NCE) in the context of Chinese secondary schools, cram schools, and university-preparation schools. Utilizing the RANGE program 1.0.0, they analyzed a total of 40,895 tokens alongside BNC/COCA lists and four supplementary lists. The findings indicated that learners require between 3,000 and 5,000 word families in Book 3 and between 4,000 and 6,000 in Book 4 to achieve 95% and 98% vocabulary coverage, respectively, with over 85% of this coverage attributed to high-frequency words. However, the study did not provide a comprehensive overview of the entire NCE series or evaluate its appropriateness for learners.
In a study involving 265 voluntary high school students using Yilin textbooks, Sun and Dang (2020) analyzed a corpus of 273,094 tokens, significantly larger than the corpus from Yang and Coxhead (2022). The study aimed to determine the vocabulary needed for students to achieve 95% and 98% coverage of the Yilin textbooks. Utilizing RANGE with BNC/COCA wordlists, the findings revealed that learners required knowledge of approximately 3,000 to 9,000 word families, along with four supplementary lists, to achieve the desired coverage. Additionally, it was found that vocabulary size requirements for achieving 98% coverage varied by grade level.
In Senior 1 to Senior 3, students encountered vocabulary sizes of 9,000, 11,000, and 8,000 word families, respectively. This indicates that the Yilin textbooks required an increase of 2,000 to 3,000 word families annually, placing significant pressure on learners.
In the context of English teaching and learning in Indonesia, Aziez and Aziez (2018) compared English textbooks with national examination (NE) texts for junior and senior high school students, focusing on four key aspects: lexical coverage, lexical variety (TTR index), academic word inclusion, and the presence of words beyond the 2,000 high-frequency threshold. Using the Web VocabProfiler program with the British National Corpus (BNC) and Academic Word List (AWL) wordlists, the study found that students required 4,000 word families at the junior level and 5,000 at the senior level for 95% lexical coverage, while NE texts demanded 3,000 to 4,000 word families. The TTR scores for textbooks averaged 0.23, compared to 0.27 and 0.38 for NE texts at the junior and senior levels, respectively. Additionally, the textbooks contained a lower percentage of academic words and words beyond the 2,000-word level, with junior textbooks at 1.75% for both categories, contrasted with NE texts at 3.26% and 11.2%. For senior textbooks, academic words and higher-level words constituted 3.56% and 11.87%, while NE texts accounted for 5.65% and 15.1%. These results indicate that the vocabulary level in both textbooks and NE texts is too advanced for students, suggesting a need for Indonesia to revise its textbooks to better align with students' linguistic capabilities.
2.6.2 Research on vocabulary in English textbooks in the Vietnamese context
Nguyen (2020) was a pioneer in analyzing the vocabulary of Vietnamese high school textbooks, specifically Tieng Anh 10, Tieng Anh 11, and Tieng Anh 12. The study aimed to determine the lexical coverage necessary for students to understand at least 95% of the texts. A total of 422 high school students from various grades across the country were tested using the Vocabulary Levels Test (VLT) to assess their vocabulary knowledge. The reading passages were analyzed using Vocabprofilers on the Lextutor.ca website with BNC/COCA wordlists, and results were compared with VLT scores. Findings revealed that students with knowledge of the first two high-frequency levels in BNC/COCA could only comprehend 87.1% of the textbooks. To achieve 95% and 98% coverage, students would need vocabulary knowledge from the third and fifth 1,000-word lists, respectively. Notably, Nguyen's study, while utilizing BNC/COCA, did not include the four supplementary lists (PNs, MWs, TCs, and acronyms), which could lead to misleading results regarding lexical demand and potentially increase the perceived difficulty of the textbooks.
Investigating the same series as Nguyen (2020), Le and Dinh (2022) analyzed the updated grade-10 textbook, focusing on the vocabulary necessary for students to achieve the 95% and 98% coverage thresholds. They examined a corpus of 41,137 words extracted from written texts and audio transcripts using Vocabprofilers and BNC/COCA lists. The findings revealed that Vietnamese students require between 3,000 and 5,000 word families to reach these levels, a result consistent with Nguyen (2020), although their analysis included additional vocabulary lists. The study highlighted that the textbook presents challenges for learners without adequate support. Furthermore, while the textbook addressed the first high-frequency word level, it fell short for the second level, covering just over half of the second 1,000-word list. Consequently, the authors recommended that teachers adapt the textbook to enhance its effectiveness in the classroom.
2.7 Research Gaps and the Present Study
The reviewed literature highlights critical concerns regarding English textbooks, particularly the lexical requirements needed to comprehend 95% to 98% of their content. These studies offer significant insights into textbook vocabulary research; however, they also reveal existing gaps that require further exploration.
The study by Aziez and Aziez (2018) utilized the British National Corpus (BNC) and the Academic Word List (AWL). However, the AWL represents only 3.7% of the high-frequency word families within the first 1,000 words of BNC/COCA, and the BNC is heavily biased towards British English and lacks sufficient American English data. For more comprehensive results, BNC/COCA should be utilized. Additionally, previous studies, including Sun and Dang (2020), varied in their evaluation methods for textbooks, with some considering only the first two 1,000-word levels as high-frequency words, while others included the first three. Moreover, these studies primarily focused on frequency to assess vocabulary breadth and depth, neglecting other important vocabulary features. Notably, Aziez and Aziez (2018) were the only ones to evaluate lexical diversity, but the Type-Token Ratio (TTR) index they used, although simple, is highly sensitive to text length, potentially compromising the accuracy of their findings on lexical diversity.
Recent studies have highlighted significant disparities between English textbooks, particularly in the Chinese context, where research by Sun & Dang (2020) on the Yilin series and Yang & Coxhead (2022) on the NCE series revealed differing lexical demands. In Vietnam, while Nguyen (2020) and Le & Dinh (2022) have examined new English textbook series, the Ministry of Education and Training (MoET) has introduced multiple series over the past decade, with nine series currently in use for grade 10. Despite the proliferation of these textbooks, there is a notable lack of research validating their effectiveness, indicating a gap in the investigation of their educational impact.
This chapter provides a concise overview of the significance of English textbooks, focusing on lexical demands, listening comprehension, and lexical richness. It highlights previous studies relevant to the current research and identifies notable research gaps that warrant further exploration. Specifically, methodological limitations are noted, such as Aziez and Aziez's (2018) use of the BNC and AWL with the TTR index, and Nguyen's (2020) reliance on 2,000 word families as high-frequency lists. Additionally, the chapter points out the need for investigations into new textbook series and the differences among them.
This research seeks to address existing gaps by comparing the lexical demand and lexical richness of various English textbooks, aiming to determine whether they present similar levels of difficulty. The study focuses on two research questions:
1. How much vocabulary is required for grade-10 students to achieve 85%, 95%, and 98% coverage of listening sections in the eight English textbook series?
2. To what extent do the listening sections in these eight grade-10 English textbook series differ in terms of text length, lexical diversity, and lexical sophistication?