
lexical and morphological characteristics of english documents on information technology with implications in teaching esp at utehy


ĐẶNG THỊ LOAN

LEXICAL AND MORPHOLOGICAL CHARACTERISTICS OF ENGLISH DOCUMENTS ON INFORMATION TECHNOLOGY WITH IMPLICATIONS IN TEACHING ESP AT UTEHY

(Những đặc điểm về mặt từ vựng và hình thái học của tài liệu Tiếng Anh chuyên ngành Công Nghệ Thông Tin với sự ứng dụng trong giảng dạy Tiếng Anh chuyên ngành tại Trường ĐHSP Kỹ thuật Hưng Yên.)

M.A MINOR PROGRAMME THESIS

Hanoi, 2010


Field: English Linguistics Code: 60 22 15


TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
LIST OF ABBREVIATIONS
LIST OF TABLES AND FIGURES

CHAPTER 1: INTRODUCTION
1.1 Rationale
1.2 Aims of the study
1.3 Research questions
1.4 Methods of the study
1.5 Scope of the study
1.6 Organization of the study

CHAPTER 2: LITERATURE REVIEW
2.1 An overview of lexicon
2.1.1 Words and lexemes
2.1.2 Word classification
2.1.3 Word meaning
2.1.3.1 Grammatical meaning
2.1.3.2 Lexical meaning
2.1.4 Word types, word tokens and word families
2.2 An overview of morphology
2.2.1 Basic terminology with definitions of morphology
2.2.2 Inflection and derivation
2.2.2.1 Inflection
2.2.2.2 Derivation
2.2.3 Compounding and blending
2.2.3.1 Compounding
2.2.3.2 Blending

CHAPTER 3: LEXICAL AND MORPHOLOGICAL CHARACTERISTICS OF EIT TAUGHT AT UTEHY
3.1 Lexical characteristics of EIT taught at UTEHY
3.1.1 Research methodology
3.1.1.1 Methods for lexical analysis
3.1.1.2 Tools for lexical analysis
3.1.1.3 Inter-rater reliability check
3.1.2 Classification of vocabulary of the corpus of ESP texts
3.1.2.1 First 2,000 most frequent words in GSL
3.1.2.2 Academic word list
3.1.2.3 Technical vocabulary and low frequency vocabulary
3.1.3 Size of technical vocabulary in the ESP texts
3.1.4 Importance of technical vocabulary in the ESP texts
3.1.5 Conclusion
3.2 Morphological characteristics of EIT taught at UTEHY
3.2.1 Methodology
3.2.2 Typical inflectional suffixes in the corpora
3.2.2.1 Suffix -ing
3.2.2.2 Suffix -ed
3.2.3 Typical derivational affixes in the corpora
3.2.3.1 Derivational prefixes
3.2.3.2 Derivational suffixes
3.2.4 Compounding, blending and abbreviation
3.2.5 Conclusion

CHAPTER 4: CONCLUSIONS
4.1 Summary of the findings
4.1.1 Lexical characteristics
4.1.2 Morphological characteristics
4.2 Pedagogical implications
4.2.1 Implications for teaching
4.2.2 Implications for learning
4.3 Limitations and suggestions for further study

REFERENCES
APPENDIX I

LIST OF ABBREVIATIONS

AWL : Academic Word List

EIT : English for Information Technology

ESP : English for Specific Purposes

GE : General English

GSL : General Service List

UTEHY : University of Technical Education, Hung Yen

LIST OF TABLES AND FIGURES

Table 1 A rating scale for finding technical words

Table 2 Inter-rater reliability accuracy score calculated by the number of words assigned to four levels by rater 1 and by the researcher

Table 3 Inter-rater reliability accuracy score calculated by the number of words assigned to four levels by rater 2 and by the researcher

Table 4 Coverage of texts by the various levels of vocabulary tokens and types by RANGE program

Table 5 The most frequent words in word list 1

Table 6 The most frequent words in word list 2

Table 7 The most frequent words in word list 3

Table 8 The most frequent words in word list 4

Table 9 Size and different levels of the vocabulary throughout the corpus of texts

Table 10 The frequency of four levels of vocabulary in the corpus

Table 11 The most frequent derivational suffixes in the corpus

CHAPTER 1: INTRODUCTION

1.1 Rationale

No one denies the importance of English as a global language today. English has become increasingly dominant around the world. It is a means of communication between people of different cultures, which has made it widespread. Moreover, English is the language of science and technology, and most universities and institutes in the world use it in education.

In learning English, a good mastery of vocabulary is essential. Without vocabulary, it is very difficult to convey anything. Pyles and Algeo (1970) noted that "when we first think about the language, we think about words. It is words that we arrange together to make sentences, conversations and discourse of all kinds". In fact, vocabulary size is important in linking the four skills of speaking, listening, reading and writing. Clearly, for learners with specific goals, knowledge of the technical terms associated with a particular field of study will be necessary, and this type of vocabulary is an obvious focal point in any examination of the lexis of scientific texts. Indeed, there may be a temptation to believe that a mastery of technical terms is all that is required for success in ESP reading (Fraser, 2005).

As such, vocabulary learning and teaching is a central activity in the second language classroom. One potential vocabulary learning strategy is the use of morphological knowledge. With morphological awareness, learners are better able to learn complex words by recognizing morphemes and morphemic boundaries.

At the University of Technical Education, Hung Yen (UTEHY), General English (GE) and English for Specific Purposes (ESP) are compulsory subjects. Students of IT start learning EIT at the beginning of the second year. Both learners and teachers face various difficulties in learning and teaching ESP, especially with technical vocabulary, including a lack of field knowledge, with its numerous terms, complicated structures and countless expressions, and insufficient teaching aids and reference materials. Meanwhile, teachers of ESP courses need a thorough understanding of the nature and role of different categories of words, such as technical words, semi-technical words and academic words, and of how vocabulary should be taught. So far, there has been no research on EIT at UTEHY.

For these reasons, the author decided to carry out this study on the lexical and morphological characteristics of English documents on Information Technology, with implications for teaching ESP at UTEHY. It is hoped that the thesis will bring concrete benefits to researchers, teachers and students of IT.

1.2 Aims of the study

The inter-related aims of this thesis are:

1. to find out the lexical and morphological features of IT English texts, and

2. to draw implications for teaching ESP at UTEHY.

1.4 Methods of the study

The study first presents a theoretical background drawn from a number of works on lexicology and morphology. Next, to achieve the aims mentioned above, quantitative and qualitative methods appropriate to corpus linguistics are used with the support of several tools: the RANGE program (Nation, 2005), the Simple Concordance Program (Reed, 1997-2008) and, especially, Chung and Nation's (2003) four-point rating scale. All of them are presented in detail in Chapter 3.

1.5 Scope of the study

Because this is a minor thesis, it is not feasible to carry out analysis at all linguistic levels. The study analyzes only ten EIT texts to find out their lexical and morphological characteristics, because the lexical and morphological features of ESP can be analyzed by the same method with the same group of analysis tools. However, the researcher focuses mainly on lexical features and treats only general morphological characteristics of the corpus, such as inflection and affixation.

1.6 Organization of the study

The thesis consists of four chapters, references and appendices.

Chapter 1: Introduction

This chapter presents the rationale, scope and objectives of the study. The research methods, research questions and organization of the thesis are also given in this chapter.

Chapter 2: Literature Review

This chapter provides the fundamental theoretical concepts related to the purpose of the study. It deals with theories of lexicon and morphology.

Chapter 3: Lexical and morphological characteristics of English documents on Information Technology at UTEHY

This chapter investigates the lexical items and presents the morphological features of the EIT documents used at UTEHY. The main lexical features and general morphological characteristics are indicated in this chapter.

Chapter 4: Conclusions

This final chapter gives overall answers to the research questions of the study, implications for the teaching and learning of ESP, especially EIT, and some suggestions for further study.

CHAPTER 2: LITERATURE REVIEW

2.1 An overview of lexicon

The terms vocabulary, lexis and lexicon are synonymous. They refer to the total stock of words in a language (Jackson & Amvela, 2002:11). In Richards et al. (1992:212), lexicon is defined as "a set of all the words and idiom of any language". The lexicon of a language is its vocabulary, including its words and expressions. More formally, it is a language's inventory of lexemes.

2.1.1 Words and lexemes

When linguists study the lexicon, one of the things they study is what words are. The term appears to denote a simple concept, but it is extremely difficult to find a definition of the word that satisfies all linguists. Different authors define words differently.

Cruse (1986:35) defines the word as "the smallest element of a sentence which has the positional mobility" and notes that words "are typically the largest units which resist interruption by the insertion of new material between their constituent parts." According to Jackson and Amvela (2002:49), words are listed in dictionaries and are separated in writing by spaces and in speech by pauses. They consider the word an uninterruptible unit of structure which consists of one or more morphemes and typically occurs in the structure of phrases.

Four main characteristics of words are also presented in Biber et al. (1999:51). Phonologically, words may be preceded and followed by a pause; orthographically, they are separated by spaces or punctuation marks; syntactically, they may be used alone as a single utterance; and semantically, they can be assigned one or more meanings in a dictionary.

The lexicon includes the lexemes used to actualize words. Lexemes are formed according to morpho-syntactic rules and express sememes. In this sense, a lexicon organizes the mental vocabulary in a speaker's mind: first, it organizes the vocabulary of a language according to certain principles (for instance, all verbs of motion may be linked in a lexical network) and second, it contains a generative device producing (new) simple and complex words according to certain lexical rules. For example, the suffix '-able' can be added to transitive verbs only, so that we get 'read-able' but not 'cry-able'.

Cruse (1986:76) characterized a lexeme as "a family of lexical units". The term "lexeme" was proposed by Lyons (1977:18-25) to avoid the complexities associated with the vague word "word". Let us consider these forms: go / going / went / gone. The four forms share one lexical meaning but differ in grammatical meaning. In other words, they all share a core meaning although they are spelled and pronounced differently. We say that these four forms constitute one lexeme, "go". Biber et al. (1999:54) defined a lexeme as "a group of word forms that share the same basic meaning and belong to the same word class". A lexeme is an abstract unit, but put simply, it is the unit to which different inflections attach to make word forms. For example, go is a lexeme, while goes and going are inflected forms of go. The dictionary entry for a lexeme generally includes its pronunciation, part of speech, inflected forms and various meanings, usually grouped into senses and sub-senses.
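The relation between word forms and a lexeme can be illustrated with a minimal Python sketch. The lookup table and the function names below are hypothetical and are not part of the study; they only show how the forms of "go" discussed above fall under a single lexeme.

```python
# Toy lookup table (an assumption for illustration, not data from the thesis):
# each word form points to the lexeme it realises.
FORM_TO_LEXEME = {
    "go": "go",
    "goes": "go",
    "going": "go",
    "went": "go",
    "gone": "go",
}


def lexeme_of(word_form):
    """Return the lexeme of a word form, or the lower-cased form itself if unknown."""
    key = word_form.lower()
    return FORM_TO_LEXEME.get(key, key)


def group_by_lexeme(word_forms):
    """Group word forms under the lexeme they share."""
    groups = {}
    for form in word_forms:
        groups.setdefault(lexeme_of(form), []).append(form)
    return groups


print(group_by_lexeme(["go", "going", "went", "gone", "book"]))
```

A real analysis would need a full lemma list or a lemmatizer; the table here covers only the example in the text.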

2.1.2 Word classification

Word classification has been dealt with in different ways by different linguists. The parts of speech, the categories used to classify words, include nouns, verbs, adjectives, adverbs, prepositions, pronouns, conjunctions and interjections (oh, shh, ouch). This classification belongs to traditional grammar, and it depends on the grammatical functions of words. For example, nouns can occur in certain places and have certain functions. A noun can be the head of a noun phrase functioning as a clause constituent (subject, object or complement), or it can combine with a preceding preposition to form a prepositional phrase, which can function as a subject complement or adverbial in a sentence, as a post-modifier in a noun phrase, or as the complementation of an adjective or a verb. Word-class membership is therefore an important lexical feature. However, if we just look at a word in isolation, it is sometimes difficult to know how to classify it. For example, the word "book" is a verb in "book seats for the theatre" but a noun in "I like this book". In some cases, stress helps us to determine whether a word is a noun ("'record") or a verb ("re'cord"), but this only works for a limited number of word pairs.

Lyons (1968:66) used the term "word-form" in word classification and divided words into full and empty word-forms. Full word-forms are forms of the major parts of speech such as nouns, adjectives, verbs and adverbs. Empty word-forms belong to a wide variety of classes such as prepositions, articles, conjunctions, and certain pronouns and adverbs. Other terms found in the literature that are more or less equivalent to "empty word-form" are "form word", "functional word", "grammatical word" and "structure word". Lyons also mentioned that in many modern schools of grammatical theory, the terms "open classes" and "closed classes" correlate with his terms "full" and "empty" word-forms, respectively.

Biber et al. (1999) and Celce-Murcia and Larsen-Freeman (1983) also divided words into open classes and closed systems.

In conclusion, two classes of words have been discussed: the open classes, or full word-forms, and the closed systems, or empty word-forms. The closed classes contain the so-called "grammatical" or "function" words, which generally serve the grammatical construction of sentences. They are articles, demonstratives, pronouns, prepositions, conjunctions and interjections. The open classes are "content" words, which consist of nouns, adjectives, verbs and adverbs and carry the main meaning of a sentence.

2.1.3 Word meaning

The meaning of language has been studied at different levels (from the morpheme to the discourse). Hence, many linguists have discussed the meaning of "meaning", the theories of meaning and its kinds. Word meaning is divided into two types, grammatical meaning and lexical meaning, which are dealt with as follows.

2.1.3.1 Grammatical meaning

Lyons (1996:52) pointed out that "different forms of the same lexeme will generally, though not necessarily, differ in meaning: they will share the same lexical meaning (or meanings) but differ in respect of their grammatical meaning". A meaningful sentence is composed of smaller meaningful parts, namely phrases or words, and different forms of these parts give the sentence different meanings. For example, the sentence "A dog barked" has a different meaning from "The dog barked" or "Some dogs barked".

Lobner (2002:12-13) also mentioned that the grammatical form of a word (e.g. singular, plural, positive, comparative, simple past tense, progressive past tense) has a meaning. Such meanings are called grammatical meaning.

In conclusion, the grammatical meaning of a word is its meaning in terms of grammar; it only functions in context, whereas lexical meaning can stand on its own. Thus, the word "dog" has meaning to an English speaker even out of context, whereas "the" does not. The lexical and grammatical meanings of words together form the meaning of the sentence.

2.1.3.2 Lexical meaning

Ferdinand de Saussure, cited in Jackson & Amvela (2002:55), considered the word as a linguistic sign: a mental unit consisting of two faces, a concept and an acoustic image. On this view, the discussion of word meaning focuses on the relationship between the two faces of the sign: the acoustic image or "signifiant", i.e. the signifier, on the one hand, and the concept or "signifié", i.e. the thing meant, on the other. The discussion is then narrowed down to an examination of some of the most common terms associated with word meaning: denotation, connotation, reference and sense.

According to Baker (1988:12), the lexical meaning of a word or lexical unit may be thought of as the specific value it has in a particular linguistic system. In other words, the lexical meaning is "the most outstanding individual property of the word". It can stand on its own.

In short, based on Lyons (1996) and Jackson & Amvela (2002), the lexical meaning of a word can be classified into denotation and connotation.

a. Denotation

We are likely to think that a language consists of a large number of words, each of which has a direct correlation with something outside language, which is its meaning. Thus, denotation is the linguistic term for the relation holding between a word (a lexeme) and a whole class of extra-linguistic objects. Lyons (1977:207) defined the denotation of a lexeme as the relationship that holds between that lexeme and persons, things, places, properties, processes and activities external to the language system. In other words, denotational (referential) meaning is what an expression (a word) refers to, denotes or stands for. For example, the word "doctor" refers to a person who works in a hospital and helps other people to recover from diseases.

Jackson & Amvela (2002:57) stated that denotation refers to the relationship between a linguistic sign and its denotatum or referent.

[...] a lexeme is its expressive meaning (emotive, attitudinal or affective meaning), which conveys the speaker's evaluation, attitudes and feelings. Another part of connotation is the evoked meaning of a lexeme (stylistic meaning), which is "a consequence of the existence of different dialects and registers within a language" (Cruse, 1986:282).

In conclusion, connotation helps us to make a subtle choice among words. Denotation and connotation are both important in determining word meaning in a given context.

2.1.4 Word types, word tokens and word families

Let us take a look at this example:

Mary goes to Paris next week, and she intends going to Edinburgh next month.

The sentence contains fourteen words separated by spaces, but two of them (to and next) occur twice. So there are only twelve different words in the sentence. In the terms of Carstairs-McCarthy (2002:5), there are 14 tokens and 12 types in the sentence. In the same way, one may say that two performances of the same tune, or two copies of the same book, are distinct tokens of one type.

The type-token distinction is relevant to the notion "word" in this way. Sentences (spoken or written) may be said to be composed of word-tokens, but it is clearly not word-tokens that are listed in dictionaries. The words listed in dictionary entries are, at one level, types, not tokens.

Another term relevant to the study is "word family". Words are grouped into families on the basis of their morphology, both their inflections and their derivations (Bauer and Nation, 1993). A family consists of a base form, its possible inflectional forms and the words derived from it by prefixation and suffixation. According to Bauer and Nation (1993), the idea of a word family is important for a systematic approach to vocabulary teaching and for deciding the vocabulary load of texts. A word family is a collection of formally and semantically related word types. So, the agree family could include agree, agrees, agreed, agreeing, agreement, disagree, disagreement.
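The three counts can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the tokenizer is a simple regular expression, and the family table is a hypothetical stand-in that contains just the agree family listed above.

```python
import re

SENTENCE = ("Mary goes to Paris next week, and she intends "
            "going to Edinburgh next month")

# Hypothetical family table: every member of the "agree" family from the text
# is mapped to its headword. A real study would load a full family list.
FAMILY_OF = {
    form: "agree"
    for form in ["agree", "agrees", "agreed", "agreeing",
                 "agreement", "disagree", "disagreement"]
}


def tokenize(text):
    """Split text into lower-cased alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_tokens_and_types(text):
    """Return (number of tokens, number of distinct word types)."""
    tokens = tokenize(text)
    return len(tokens), len(set(tokens))


def count_families(text):
    """Count word families; a word outside the table forms its own family."""
    return len({FAMILY_OF.get(token, token) for token in tokenize(text)})


print(count_tokens_and_types(SENTENCE))  # (14, 12)
print(count_families("agree agrees disagreement"))  # 1
```

The first result reproduces the 14 tokens and 12 types counted above; the second shows three word types collapsing into one family.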


2.2 An overview of morphology

2.2.1 Basic terminology with definitions of morphology

Morphology refers to the study of forms. Linguistic morphology refers to the study of words, their internal structure and the mental processes that are involved in word formation (Aronoff & Fudeman, 2005; O'Grady & de Guzman, 1997). It is "… the study of the hierarchical and relational aspects of words and the operation on lexical items according to word formation rules to produce other lexical items" (Leong and Parkinson, 1995, p. 237). Bauer (1983:13) stated that "morphology as a sub-branch of linguistics deals with the internal structure of word-forms".

The basic units in morphology are morphemes. A morpheme is the smallest meaningful unit of language (any part of a word that cannot be broken down further into smaller meaningful parts, including the whole word itself). For example, the word 'items' can be broken down into two meaningful parts: 'item' and the plural suffix '-s'; neither of these can be broken down into smaller parts that have a meaning. Therefore 'item' and '-s' are both morphemes. Katamba and Stonham (1993:24) stated that "The morpheme is the smallest difference in the shape of a word that correlates with the smallest difference in word or sentence meaning or in grammatical structure". A morpheme may be free or bound. A free morpheme can stand alone as an independent word (e.g. 'item'). A bound morpheme cannot stand alone as an independent word but must be attached to another morpheme or word (affixes, such as plural '-s', are always bound).

A root is a form which is not further analyzable, either in terms of derivational or inflectional morphology. A base is any form to which affixes of any kind can be added. This means that an analyzable form to which derivational affixes are added can only be referred to as a base; for example, touchable is an analyzable base to which the prefix un- can be added. A stem is involved only when dealing with inflectional morphology. In this way, untouchable becomes a stem when an inflectional affix is added to it (Bauer, 1983:20).

The analysis of words into morphemes begins with the isolation of morphs. A morph is a physical form representing some morpheme in a language (Katamba and Stonham, 1993:24). A morph can also be defined as a segment of a word-form which represents a particular morpheme (Bauer, 1983:15).

If different morphs represent the same morpheme, they are grouped together and called allomorphs of that morpheme (Katamba and Stonham, 1993:26). According to Bauer (1983:15), an allomorph is a phonetically, lexically or grammatically conditioned member of a set of morphs representing a particular morpheme. For example, the plural morpheme "-s", in its regular forms, has three different phonological realizations, /iz/, /s/ and /z/, depending on the phonetic environment in which the morpheme occurs, i.e. it is phonetically conditioned.
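Phonetic conditioning of the regular plural can be sketched as a small decision rule. The Python below is a simplified illustration, not an analysis tool from the study: it assumes the caller supplies the final sound of the noun in phonemic notation, and the two sound sets reflect the standard textbook description of English (sibilants take /iz/, other voiceless consonants take /s/, all remaining sounds take /z/).

```python
# Final sounds that condition each allomorph (standard description of
# regular English plurals; the notation is an assumption of this sketch).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(final_sound):
    """Pick the regular plural allomorph from the final sound of the noun."""
    if final_sound in SIBILANTS:
        return "/iz/"   # e.g. bus, which ends in /s/
    if final_sound in VOICELESS_NON_SIBILANTS:
        return "/s/"    # e.g. cat, which ends in /t/
    return "/z/"        # voiced consonants and vowels, e.g. item, dog


for noun, final in [("bus", "s"), ("cat", "t"), ("item", "m")]:
    print(noun, plural_allomorph(final))
```

Irregular plurals such as men or deer are outside this rule, since they are not phonetically conditioned.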

2.2.2 Inflection and derivation

As we know, morphemes can be divided into two major functional categories: derivational morphemes and inflectional morphemes. This reflects a recognition of two principal word-building (morphological) processes: inflection and derivation.

2.2.2.1 Inflection

According to Katamba and Stonham (1993:223), inflectional morphology is concerned with syntactically driven word-formation, that is, with syntactically determined affixation processes. They also state that inflectional morphemes do not change referential or cognitive meaning and do not alter the word-class of the base to which they are attached. Inflectional morphemes only modify the form of a word so that it can fit into a particular syntactic slot. For example, "book" and "books" are both nouns referring to the same kind of entity. The "-s" ending merely carries information about the number of those entities.

The following inflectional suffixes are frequently used in English:

Suffix   Stem   Function                               Example
-s       V      3rd person, singular, present tense    sleep-s
-ed      V      past tense                             walk-ed
-ing     V      progressive (incomplete action)        walk-ing

(adapted from Katamba and Stonham, 1993:53)

Other forms of inflection, such as the following, are less frequently used:

Internal change in plurals and tenses (internal vowel change): man/men; grow/grew

Suppletion in adjectives and tenses: good/better; go/went

Zero inflection in plurals or tenses: deer/deer; put/put

(adapted from Barnard, 2005:530)

English has no inflectional prefixes, but some other languages have both inflectional prefixes and suffixes.

2.2.2.2 Derivation

Derivational morphemes are used to create new lexical items in one of the following ways:

- they modify significantly the meaning of the base to which they are attached, without necessarily changing its grammatical category, for example, "kind" and "un-kind";

- they bring about a shift in the grammatical class of a base as well as a possible change in meaning, for example, "hard" (Adj) and "hardship" (N (abs));

- or they may cause a shift in the grammatical subclass of a word without moving it into a new word-class, as in the case of friend (N (conc)) and friendship (N (abs)).

(adapted from Katamba and Stonham, 1993:49-51)

Whereas English inflectional morphemes are all suffixes, derivational morphemes are either prefixes or suffixes.

E.g.: Prefixes: ex-president, reread, unknown

Suffixes: childhood, centralize, greenish, derivation

Words can be built up by attaching a number of prefixes and suffixes to the same base, and may become very complex.

E.g.: pre-industri-al, industri-al-ise, industri-al-is-ation

2.2.3 Compounding and blending

2.2.3.1 Compounding

Compounding is another word-building process. According to Biber et al. (1999:58), in compounding, independently existing bases are combined to form new lexemes. Carstairs-McCarthy (2002:59) similarly describes compounds as words formed by combining roots. The following are compound classes:

Compound verbs

Verb-verb (VV): stir-fry, freeze-dry

Noun-verb (NV): hand-wash, air-condition

Adjective-verb (AV): dry-clean, whitewash

Preposition-verb (PV): underestimate, outrun

Compound adjectives

Noun-adjective (NA): sky-high, oil-rich

Adjective-adjective (AA): grey-green, red-hot

Preposition-adjective (PA): underfull, overactive

Compound nouns

Verb-noun (VN): swearword, playtime

Noun-noun (NN): mosquito net, butterfly net

Adjective-noun (AN): blackboard, greenstone

Preposition-noun (PN): in-group, outpost

(adapted from Carstairs-McCarthy, 2002:60-62)

2.2.3.2 Blending

As we have seen, in compounding the whole of each component root is reproduced. However, we also encounter a kind of compound in which at least one component is reproduced only partially. These are known as blends (Carstairs-McCarthy, 2002:65). Blending is a process which collapses two words into one.

For example: compuserve (fusing computer and serve)

computron (fusing computer and electron)

CHAPTER 3: LEXICAL AND MORPHOLOGICAL CHARACTERISTICS OF EIT TAUGHT AT UTEHY

3.1 Lexical characteristics of EIT taught at UTEHY

3.1.1 Research methodology

Although no research has been carried out on the EIT coursebook at our school since it was first introduced, there have been some discussions and changes of materials. Ten texts, from unit 1 to unit 10 of the book Oxford English for Computing by Boeckner and Brown, are currently taught to IT students over 36 hours of instruction in the second semester of the second year.

To carry out the lexical and morphological analysis of the material, the 10 texts of the book had to be converted into a corpus of 10 text files, as required by the methods and tools used by the researcher. It is necessary to give a detailed description of these methods and tools before the analysis is performed.

3.1.1.1 Methods for lexical analysis

The lexical analysis of English for Information Technology followed the basic steps of ESP analysis. Firstly, a quantitative method appropriate to corpus linguistics was used with the support of the RANGE and FREQUENCY programs. The program classified the words on the basis of the three word lists available in it and reported the word types, word tokens and word families of the corpus of texts at each level. Starting from the classification made by the program, the researcher reclassified the vocabulary into four new levels based on Chung and Nation's (2003) four-point rating scale, which is described in more detail in the next section. This quantitative technique was complemented by an equally important qualitative method.
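The idea of measuring how much of a text is covered by each word list can be sketched in Python. This is not the RANGE program and does not reproduce its behaviour: the function and the tiny word lists below are hypothetical stand-ins, and the sketch matches word forms only, without grouping them into word families.

```python
from collections import Counter


def coverage_by_list(tokens, word_lists):
    """Return the percentage of tokens covered by each word list.

    word_lists maps a list name to a set of words. A token is credited to
    the first list that contains it; otherwise it counts as 'not in lists'.
    """
    counts = Counter()
    for token in tokens:
        for name, words in word_lists.items():
            if token in words:
                counts[name] += 1
                break
        else:
            counts["not in lists"] += 1
    total = len(tokens)
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Toy stand-ins for the real lists (assumptions for illustration only).
TOY_LISTS = {
    "GSL": {"the", "is", "between", "it"},
    "AWL": {"available", "detect"},
}

print(coverage_by_list(["the", "pixel", "is", "available"], TOY_LISTS))
```

With the toy input, half of the tokens fall in the first list, a quarter in the second, and a quarter in neither, which is the kind of coverage profile summarized per level in the analysis.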

3.1.1.2 Tools for lexical analysis

To carry out the lexical analysis, the RANGE program, which is available at http://www.vuw.ac.nz/lals/staff/Paul_Nation, was first used to find the coverage of the texts by certain word lists, to create word lists based on frequency and range, and to discover shared and unique vocabulary in several pieces of writing (Nation, 2005). It is a program for Windows-based PCs written by Nation that provides statistics on word types, word tokens, word families, frequency of occurrence and range of occurrence, giving an overall statistical description of the corpus of texts. RANGE produces lists of the 2,000 most frequent words from the GSL, the words of the Academic Word List, and the words that are not in these lists, among them the technical words. However, to obtain an in-depth description of the lexical characteristics of ESP, it is necessary to apply Chung and Nation's (2003) four-point rating scale to get four new levels of vocabulary.

Chung and Nation (2004) showed that the four-point rating scale technique has a higher degree of reliability than other techniques in identifying technical vocabulary. A rating scale approach to identifying terms involves deciding whether the individual meanings of words fall into the sphere of specialised meaning or not. Deciding on or interpreting the individual meanings of words depends on the ability of the researchers to draw on their own domain knowledge and to make inferences from domain information within the context in question (Asher and Lascarides, 1996; Becka, 1972; Stambuk, 1998, in Chung and Nation, 2004). At the decision stage, researchers ultimately have to rely on their intuition and knowledge of the field. This approach was used with the support of the RANGE program, some specialised dictionaries, and discussions with subject specialists.

With the application of the four-point rating scale, words were classified as technical or non-technical by rating them on a four-point scale designed to measure the strength of the relationship of a word to a particular specialised field (Chung and Nation, 2003). Items classified at steps 3 and 4 in Table 1 were considered to be technical words. Items at steps 1 and 2 were not.

Level 1 (the first 2,000 most frequent words)

Words such as function words that have a meaning that has no particular relationship with the field of information technology, that is, words independent of the subject matter. Examples are: the, is, between, it, your, which, by, common, commonly, directly, constantly, early, and especially.

Level 2 (Academic Word List)

Words that have a meaning that is minimally related to the field of information technology. Examples are: incidence, detect, available, distributed.

Level 3 (Technical words)

Words that have a meaning that is closely related to the field of information technology. They refer to parts, structures, operations or functions of computers. Such words are also used in general language. The words may have some restrictions of usage depending on the subject field. Examples are: access, scan, infected, technology, crystal, voltage. Words in this category may be technical terms in a specific field like information technology and yet may occur with the same meaning in other fields and not be technical terms in those fields.

Level 4 (Low frequency words)

Words that have a meaning specific to the field of information technology and are not likely to be known in general language. These words have clear restrictions of usage depending on the subject field. Examples are: microchips, mainframe, desktop, palmtop, drivers, Gridpad, pixel, antivirus.

Table 1: A rating scale for finding technical words (adapted from Chung and Nation (2003))

Words at Level 3 may have polysemes that occur in general use with little change in meaning, for example mouse and button. Level 4 includes words that, even though they are used outside information technology, could be thought of as information technology terms. Examples are software and database.

To ensure the rating scale is used reliably, an inter-rater reliability check is carried out. An inter-rater reliability check is used to estimate whether there is a reasonable degree of agreement among different raters as to where a lexical item falls on the scale. To make sure that the check works efficiently, the raters should be trained using the same kinds of materials that are used for the research. The way the check was applied in this study is presented in the following section.

3.1.1.3 Inter-rater reliability check

Following Chung and Nation (2003), an inter-rater reliability check was carried out to make sure that the scale could be applied consistently in the research. The researcher invited two raters, both native speakers of English working at our school, to participate in the check. The raters' task was to assess the degree of specificity of the meaning of the words in the text to the field of information technology.

The first step of the check was the training of the invited raters. The researcher explained the objectives of the study, the aim of the reliability check and how to consider the semantic relationship in order to place the words on the four-point scale. The raters were then provided with the text in which the words to be rated were already marked. Forty words (ten from each of the four steps as classified by the researcher) were randomly chosen for the training. The researcher and the raters went through the words one by one together. Each time, the raters' results were compared with those of the researcher. When discrepancies were found, they were discussed by the researcher and the raters until all were resolved.

Then sixty words, fifteen for each of the four steps, were randomly selected from two parts of two texts and provided for the raters to analyze independently. All these words are marked (1), (2), (3) or (4) by the researcher according to the four levels of vocabulary. However, some words that the RANGE program places in Level 1 or Level 2 may belong in Level 3. These words are marked (1→3) if they move from Level 1 to Level 3 and (2→3) if they move from Level 2 to Level 3.

Levels of vocabulary chosen by the researcher (from text 2)

Enter the clipboard(4) computer(2→3), a technology(2) that has been in development(1) for the last 20 years but took hold in the mass market only this year. Clipboard PCs - which, as their name(1) suggests(1), are not much bigger than an actual clipboard - replace(1) the keyboard with a liquid(1) crystal display (LCD) screen(1→3) and an electronic stylus(4). Users input data by printing(1→3) individual(2) letters(1) directly on the screen.

There are two technologies at work in a clipboard PC: one allows raw data(2→3) to get into the computer and the other allows the computer to figure out what that data means. The first technology relies(2) principally(2) on hardware(4) and varies(2) depending on the particular computer. In one system, marketed under the name GRIDPad(4), the computer's LCD(4) screen is covered by a sheet(1) of glass with a transparent(3) conductive coating. Voltage is sent across the glass in horizontal(3) and vertical(3) lines forming a fine grid(4); at any point on the grid, the voltage(3) is slightly different. When the stylus - which is essentially(1) a voltmeter(3) - touches the screen, it informs the computer of the voltage at that point. The computer uses this information to determine where the stylus is and causes a liquid crystal(3) pixel(4) to appear at those coordinates(2). The position of the stylus is monitored(2→3) several hundred times a second, so as the stylus moves across the glass, whole strings of pixels are activated.

(From text 7)

Don't worry(1) too much about viruses(4). You may never see one. There are just a few ways to become(1) infected(3) that you should be aware(2) of. The sources(2) seem to be service people, pirated games, putting floppies(4) in publicly available(2) PCs without write-protect tabs(4), commercial software(4) (rarely), and software distributed(2) over computer bulletin(4) board systems (also quite rarely(1), despite(2) media(2) misinformation). Many viruses have spread through pirated - illegally(2) copied or broken - games. This is easy(1) to avoid(1). Pay for your games, fair and square.

If you use a shared PC or a PC that has public access(2→3), such as one in a college PC lab or a library(1), be very careful about putting floppies into that PC's drives(1→3) without a write-protect tab. Carry a virus-checking program and scan(3) the PC before letting it write data onto floppies.

Despite the low incidence(2) of actual viruses, it can't hurt to run a virus checking program now and then. There are actually(1) two kinds of antivirus(4) programs: virus shields, which(1) detect(2) viruses as they are infecting your(1) PC, and virus scanners(4), which detect viruses once they've infected you.
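The annotation scheme used in the two marked passages lends itself to automatic counting. The following Python sketch, given only as an illustration, reads markers such as (4) and (2→3), takes the level after the arrow as the final level, and treats Levels 3 and 4 as technical, as on the four-point scale. The regular expression and the function names `final_levels` and `is_technical` are assumptions of this sketch, not part of the study's procedure.

```python
import re
from collections import Counter

# A word followed by "(n)" or "(n→m)"; "(LCD)" is ignored because it has no digit.
ANNOTATION = re.compile(r"([A-Za-z][A-Za-z'-]*)\s*\((\d)(?:→(\d))?\)")

def final_levels(annotated_text):
    """Return (word, final level) pairs; for 'word(1→3)' the final level is 3."""
    pairs = []
    for word, first, moved_to in ANNOTATION.findall(annotated_text):
        pairs.append((word.lower(), int(moved_to or first)))
    return pairs

def is_technical(level):
    # Items at Levels 3 and 4 count as technical words on the four-point scale.
    return level >= 3

# Opening of the passage from text 2, copied with its markers.
sample = ("Enter the clipboard(4) computer(2→3), a technology(2) that has "
          "been in development(1) for the last 20 years")
pairs = final_levels(sample)
counts = Counter(level for _, level in pairs)
technical = [word for word, level in pairs if is_technical(level)]
print(pairs)
print(dict(counts))
print(technical)
```

Counting the final levels over a whole marked passage gives the number of items per level that the raters' assignments are then compared against.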


Finally, the reliability accuracy score was used to estimate the degree of agreement between the researcher's results and the raters'. Rosenthal (1987: 67) states that a raw accuracy score of 0.7 is desirable for rating items in four groups. The following tables present the raw accuracy scores between each rater's results and the researcher's.

Table 2: Inter-rater reliability accuracy score calculated by the number of words assigned to four levels by rater 1 and by the researcher (adapted from Chung and Nation (2003))

Table 2 shows that rater 1 agreed with the researcher on the assignment of all 15 items at each of Levels 1, 2 and 3. This means that the rater agreed with the researcher on all the words that the researcher had moved from Level 1 or 2 to Level 3. However, 4 of the 15 words from Level 4 were moved to Level 3 by the rater. Thus, the total agreement score is 15 + 15 + 15 + 11 = 56 out of 60, or 0.933. Since a raw accuracy score of 0.7 is acceptable, this result indicates that the check is very reliable.


Table 3: Inter-rater reliability accuracy score calculated by the number of words assigned to four levels by rater 2 and by the researcher (adapted from Chung and Nation (2003))

Table 3 shows that rater 2 agreed with the researcher on the assignment of all 15 items at Levels 1 and 3. However, the rater agreed on only 13 of the 15 items at Level 2 and 12 of the 15 items at Level 4. Although this rater's results differ somewhat from the researcher's, the resulting accuracy of 0.917 (55 out of 60) is still much higher than the acceptable level of 0.7.

The results of the check show that the researcher can reliably apply the four-point rating scale throughout the research.
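The raw accuracy scores reported for the two raters can be reproduced from the agreement counts given in the text. The sketch below is only a worked check of that arithmetic; the function name `raw_accuracy` is an assumption of this illustration. Note that 55 out of 60 is 0.9166..., which is 0.917 when rounded and 0.916 when truncated.

```python
def raw_accuracy(agreed_per_level, items_per_level=15):
    """Proportion of items on which rater and researcher chose the same level."""
    total = items_per_level * len(agreed_per_level)
    return sum(agreed_per_level) / total

# Agreement counts for Levels 1 to 4, as stated in the text.
rater1 = raw_accuracy([15, 15, 15, 11])  # 56 of 60
rater2 = raw_accuracy([15, 13, 15, 12])  # 55 of 60

THRESHOLD = 0.7  # desirable raw accuracy, attributed in the text to Rosenthal (1987)
print(round(rater1, 3), round(rater2, 3))
print(rater1 >= THRESHOLD and rater2 >= THRESHOLD)
```

Both scores lie well above the 0.7 benchmark, which is the basis for treating the rating scale as reliable in this study.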

3.1.2 Classification of vocabulary of the corpus of ESP texts

The corpus of texts was run through the RANGE program with the three base word lists available in the program. The first 1,000 most frequent words, the second 1,000 most frequent words, and the Academic Word List are word lists 1, 2 and 3 respectively. The first and second 1,000 most frequent words together make up the level of the first 2,000 most frequent words, or the General Service List (GSL). Words that are not in any of these three lists are technical vocabulary and low frequency vocabulary (Chung and Nation, 2003:104).

Word types, word tokens, and word families are the units of counting for the four word lists. However, the word family serves only as reference data, not as a significant unit, because often only one or two members of a family are technical words rather than all of them (e.g., frequency and frequent). For this reason, the lack of a word family figure for word list 4 has no effect on this study.


3.1.2.1 First 2,000 most frequent words in GSL

The first 2,000 most frequent words in the GSL are those from word list 1 and word list 2 in Table 4. As can be seen, word list 1, the first 1,000 most frequent words in the GSL, contains 4,516 word tokens, which comprise 73.78% of the total number of tokens but only 50.43% of the total word types in the whole corpus. This means that the word types in word list 1 occur with extremely high frequency. Moreover, the fact that word list 1 has the largest number of both word tokens and word types shows that this class of words has the highest density in the whole corpus, with over half of all tokens and types.

The following table lists the most frequent types in word list 1. The first column gives the type, the second gives the number of texts in which the item occurs, the third gives the number of times the item occurs throughout the corpus, and the remaining columns indicate in which texts it occurs and its frequency in each of them.


Fifteen word types from word list 1 appear in all 10 texts. They are the, to, and, of, a, in, is, that, are, for, it, s, they, by and can, and their frequencies throughout the corpus are 351, 186, 182, 171, 158, 129, 87, 76, 66, 59, 56, 50, 32, 31 and 29 respectively. The word type "the" has the highest frequency of occurrence (351) and the highest range; the next is "to", with 186 occurrences. However, "to" functions not only as a preposition but also as an infinitive particle, as in "to carry". Similarly, the word type "it", with a frequency of 56, functions either as a pronoun or as the abbreviation of "information technology".

In contrast, the tokens from the second 1,000 most frequent words in the GSL comprise only 4.98% of the whole corpus, with 305 items, and this word list contains 162 word types (9.84%), the smallest category in the corpus. The highest range in this word list is 5, which means that no word occurs in more than 5 texts. The frequency of occurrence of these words is much lower than that of the words in word list 1. Table 6 shows that "program" has the highest frequency, 30, followed by "programs" with a frequency of 10. The remaining words appear no more than 10 times in the whole corpus. Relative to its number of tokens, word list 2 therefore shows greater lexical variation than word list 1.

Table 6: The most frequent words in word list 2

In conclusion, the first 2,000 most frequent words in the GSL make up 78.76% of the total number of tokens and 60.27% of the word types of the corpus. The token figure is comparable to the statistic of "around 80% of the running words of academic text" presented by Nation (2001), but much higher than the 20.3% for the anatomy text and the 41.8% for the applied linguistics text in Chung and Nation's research (2003:108). This suggests that students at UTEHY may not have many difficulties in dealing with these specialized texts.

3.1.2.2 Academic word list

The Academic Word List (word list 3) accounts for 9.08% of the word tokens and 14.7% of the total number of word types in the corpus. The token coverage is somewhat higher than the figure presented by Chung and Nation (2003:104): "the vocabulary from Academic Word List covers on average 8.5% of academic text".

Table 7: The most frequent words in word list 3

Table 7 indicates the range and frequency of some typical words found in the corpus of texts. The word "computer" has the highest frequency, 68, and appears in 9 out of 10 texts; the next item is "computers", with a frequency of 24 and a range of 8.

3.1.2.3 Technical vocabulary and low frequency vocabulary

Both technical vocabulary and low frequency vocabulary come from word list 4, the list of words that are not in the three lists above (Chung and Nation, 2003:104). Table 4 shows that the tokens from this word list comprise 12.15% of the total in the corpus, which is a little higher than the figure of 10% that Nation (2001) suggested. However, the number of word types is quite large, at 25.03% of the word types in the whole corpus. This means that word list 4 shows the largest lexical variation.
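The claims about lexical variation can be checked against the percentages reported for the four word lists. The sketch below uses a simple index, the share of word types divided by the share of word tokens, which is an assumption introduced here for illustration and not a measure used by RANGE or by Chung and Nation; a higher value means more distinct word forms per running word.

```python
# Token and type shares (%) per RANGE word list, as reported in the text.
shares = {
    "word list 1": {"tokens": 73.78, "types": 50.43},
    "word list 2": {"tokens": 4.98, "types": 9.84},
    "word list 3": {"tokens": 9.08, "types": 14.70},
    "word list 4": {"tokens": 12.15, "types": 25.03},
}

def type_to_token_share(entry):
    # Illustrative index (an assumption of this sketch): type share / token share.
    return entry["types"] / entry["tokens"]

ratios = {name: round(type_to_token_share(e), 2) for name, e in shares.items()}
total_tokens = round(sum(e["tokens"] for e in shares.values()), 2)
total_types = round(sum(e["types"] for e in shares.values()), 2)

print(ratios)
print(total_tokens, total_types)  # both should be close to 100
```

On this index word list 4 has the highest value and word list 1 the lowest, and the four lists together account for essentially all tokens and types, which agrees with the figures in the text.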


TYPE RANGE FREQ F1 F2 F3 F4 F5 F6 F7 F8 F9 F10

Table 8: The most frequent words in word list 4

Table 8 indicates that the word "software" has the highest frequency of occurrence in this word list (32) and appears in a large number of texts (8). Nevertheless, most of the words in this word list have low frequency and low range. The detailed classification of technical vocabulary and low frequency vocabulary is presented in the next step of the analysis.

3.1.3 Size of technical vocabulary in the ESP texts

Technical vocabulary size refers to the number of words of which some aspect of meaning is related to a specialized area. In theory, the RANGE program defines three classes of vocabulary: the first 2,000 most frequent words in the GSL; the Academic Word List; and the list of technical vocabulary and low frequency vocabulary. In practice, however, technical words at Level 3 can come from the high frequency words or from the Academic Word List (Chung and Nation, 2003). To obtain statistics on the size of the technical vocabulary in the corpus, it is necessary to reclassify words into four new levels of vocabulary: Level 1 - the first 2,000 most frequent words, Level 2 - Academic Word List, Level 3 - technical words, and Level 4 - low frequency words. Therefore, the four-point rating scale is applied consistently to the data analysis in this study with the assistance of the RANGE program, after the inter-rater reliability check has been performed.

In the process of assigning words to different levels, word types, rather than word families, are used as the unit of counting, because a word type is a single word form, whereas not all members of a family are technical words (Chung and Nation, 2003).

Level 1 (the first 2,000 most frequent words): 953 (57.9%)

Table 9: Size and different levels of the vocabulary throughout the corpus of texts

Table 9 shows the results of reclassifying the words in the whole corpus using the four-point rating scale. The number of types at each level has changed compared with the result from the RANGE program (Table 4): 39 types (2.37%) from Level 1 and 49 types (3.0%) from Level 2 are moved to Level 3 (technical words). The first 2,000 most frequent words in the GSL still comprise over half of the word types in the corpus, the largest proportion (57.9%). The second largest is Level 3 (technical words) with 23.3%, and the smallest is Level 4 with 7.1%. The total proportion of technical vocabulary and low frequency vocabulary is 30.4% of the word types in the corpus, which is higher than the figure of 10% that Nation (2001) suggested. However, it is still smaller than the 36.7% for the technical vocabulary of medico-pharmaceutical texts found by Van Hanh, N.T., and less than half of the 70.1% for the technical vocabulary of an anatomy text found by Chung and Nation (2003). This suggests that the 30.4% of technical vocabulary in the ESP texts at UTEHY in this study is appropriate and not too difficult for the IT students at UTEHY when reading these texts.
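The reclassified proportions follow from the RANGE type shares once the moved types are taken into account. The short Python sketch below is a worked check using only percentages reported in this chapter; the variable names are illustrative, and the Level 2 share it prints is derived by subtraction here rather than quoted from the text.

```python
# Type shares (%) before reclassification, from the RANGE results in the text.
gsl_before = 60.27      # first 2,000 most frequent words (word lists 1 and 2)
awl_before = 14.70      # Academic Word List (word list 3)
not_in_lists = 25.03    # word list 4

# Types moved to Level 3 by the four-point rating scale.
moved_from_gsl = 2.37   # 39 types moved from Level 1
moved_from_awl = 3.00   # 49 types moved from Level 2

level1 = round(gsl_before - moved_from_gsl, 1)
level2 = round(awl_before - moved_from_awl, 1)   # derived, not stated in the text
levels_3_and_4 = round(not_in_lists + moved_from_gsl + moved_from_awl, 1)

print(level1, level2, levels_3_and_4)
```

The results agree with the 57.9% reported for Level 1 and with the combined 30.4% reported for Levels 3 and 4 (23.3% plus 7.1%).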

Date posted: 25/12/2015, 17:24
