VIETNAM NATIONAL UNIVERSITY, HANOI
University of Languages and International Studies
Faculty of Post-Graduate Studies
ĐÀO THỊ NGỌC NGUYÊN
A CORPUS-BASED STUDY ON COLLOCATIONS OF KEYWORDS IN ENGLISH BUSINESS ARTICLES ABOUT THE EUROPEAN DEBT CRISIS
(Nghiên cứu tập hợp cụm từ của các từ khóa trong các bài báo tiếng
Anh kinh tế về cuộc khủng hoảng nợ châu Âu)
M.A COMBINED PROGRAMME THESIS
Field: English Linguistics Code: 60 22 15
Hanoi - 2012
TABLE OF CONTENTS
Declaration
Abstract
Acknowledgement
List of tables
List of figures
CHAPTER I: INTRODUCTION
I.1 Statement of the problem and rationale of the study
I.2 Aims of the study
I.3 Scope of the study
I.4 Organization
CHAPTER II: LITERATURE REVIEW
II.1 Corpus linguistics
II.2 Sense and sense relations
II.3 Transference of meaning
II.3.1 Metaphor
II.3.2 Metonymy
II.3.3 Other types of meaning transference
II.4 Collocations
II.4.1 Definitions of collocations
II.4.2 Properties of collocations
II.4.2.1 Collocations are arbitrary
II.4.2.2 Collocations are language-specific
II.4.2.3 Collocations are recurrent in context
II.4.3 Classifications of collocations
CHAPTER III: RESEARCH METHODOLOGY
III.1 Data collection instrument
III.1.1 Construction of corpus
III.1.1.1 Database
III.1.1.2 Extracted business articles
III.1.2 Concordance program
III.2 Data collection procedures
CHAPTER IV: RESULTS AND DISCUSSION
IV.1 Quantitative results
IV.2 Collocation analysis of content keywords
IV.2.1 DEBT and CRISIS
IV.2.2 ECONOMIC
IV.2.3 MARKETS
CHAPTER V: CONCLUSION
V.1 Major findings
V.2 Pedagogical implications and suggestions
V.2.1 Improving collocation competence among language learners
V.2.2 Corpus-based activities for learners' collocation development in ESP class
V.3 Suggestions for further studies
REFERENCES
APPENDIX
LIST OF TABLES
Table 1: List of the selected articles
Table 2: Top 100 high-frequency words from the constructed corpus
Table 3: First 25 keywords from the corpus
Table 4: CRISIS Concordance (Adjective collocations)
Table 5: Adjectives collocating with CRISIS
Table 6: DEBT Concordance (Adjective collocations)
Table 7: Adjectives collocating with DEBT
Table 8: CRISIS Concordance (Noun collocations)
Table 9: DEBT Concordance (Noun collocations)
Table 10: Nouns collocating with CRISIS
Table 11: Nouns collocating with DEBT
Table 12: CRISIS Concordance (Verb collocations)
Table 13: DEBT Concordance (Verb collocations)
Table 14: Verbs collocating with CRISIS
Table 15: Verbs collocating with DEBT
Table 16: Other patterns of CRISIS in the corpus
Table 17: Other patterns of DEBT in the corpus
Table 18: ECONOMIC Concordance (Noun collocations)
Table 19: Nouns collocating with ECONOMIC in the corpus
Table 20: Composite nominal containing ECONOMIC (with modification within the head)
Table 21: Composite nominal containing ECONOMIC (with coordination in the modifier)
Table 22: MARKETS Concordance (markets as 'the total amount of trade in a particular kind of goods')
Table 23: MARKETS Concordance (markets as 'people who buy and sell goods in competition with each other')
Table 24: MARKETS Concordance (markets as 'a particular country, area or section of population that might buy goods')
LIST OF FIGURES
Figure 1: Concordance Program's main screen
Figure 2: String matching of CRISIS from the corpus
Figure 3: String matching of DEBT from the corpus
Figure 4: String matching of ECONOMIC from the corpus
Figure 5: String matching of MARKETS from the corpus
CHAPTER I
INTRODUCTION
I.1 Statement of the problem and rationale of the study
The importance of vocabulary in language learning has long been recognized, although there were times when vocabulary was treated as separate from grammar and skills. In the light of recent studies, however, vocabulary has gained even more attention and has been highlighted as the basis of language and communication. Wilkins, an outstanding British linguist, once stated that "without vocabulary nothing can be conveyed". Obviously, a rich vocabulary not only makes one's ability to use the language recognized and appreciated but also makes one more successful in communication.
However, no matter how convinced learners of English are in principle of the importance of vocabulary, vocabulary acquisition actually poses enormous difficulties for them. One of the most complicated problems that arises in dealing with vocabulary is how to combine and use words appropriately in accordance with cultural or linguistic conventions, an ability often referred to as "collocation competence" (Hill, 1999).
Collocations are usually defined as words that typically occur in association with other words; in reality, they run through the whole of the English language and are as old as the language itself. No piece of natural spoken or written English is totally free of collocations. Because of their widespread use, the role that collocations play in the language is undeniable.
Collocation competence gives learners of English in general the ability to combine lexical (and grammatical) chunks in order to produce fluent, accurate, and semantically and stylistically appropriate utterances. For business English learners in particular, a good knowledge of collocation patterns in English is also of great importance. The most important characteristics of business English, as opposed to general English, are a sense of purpose, an intercultural dimension and a need for clear, straightforward and concise communication (Ellis & Johnson, 1994). To help business English learners achieve these broad objectives, teachers have to find the best ways to teach business performance skills such as socializing, telephoning, meeting, presenting, and report writing. In all these situations, collocation competence is essential.
With the rise of computing power and the acceptance of corpus linguistics since the 1990s, collocations have received serious treatment. The dramatic rise in the processing power of computers now makes it possible to compile frequency lists for the lexical items in a large corpus quickly. At the same time, a large number of software programs have been developed for extracting keywords and collocations from corpus data. Such software packages have made it easier to investigate the typical lexical items of any particular text genre and their collocations.
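The frequency lists mentioned above can be illustrated with a minimal sketch in Python. It is not the software used in this study; the function name `frequency_list`, the simple regular-expression tokenizer and the two-sentence sample text are all assumptions made purely for illustration.

```python
import re
from collections import Counter


def frequency_list(text, top_n=10):
    """Return the top_n most frequent word forms in text (case-folded)."""
    # Hypothetical tokenizer: runs of letters, with an optional internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens).most_common(top_n)


# Illustrative sample only; a real corpus would be read from the article files.
sample = (
    "The debt crisis deepened as markets reacted. "
    "Markets feared that the debt crisis would spread."
)

for word, count in frequency_list(sample, top_n=5):
    print(f"{word}\t{count}")
```

A raw list of this kind is usually dominated by function words such as the, which is one reason why keyword selection normally requires a further step beyond simple counting.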
Motivated by the writer's personal interest in collocations as a researcher and by the writer's observations, as a tutor of business learners, of students' difficulties in dealing with collocations in business discourse, this thesis investigates collocations of keywords in a variety of business articles written about a topic of current interest to business learners, the European debt crisis. The thesis is carried out in the hope that it may be of some help to business learners of English as well as to those interested in English semantics and collocation-related issues.
I.2 Aims of the study
The aim of this research is to conduct a close investigation into collocations of keywords from a corpus of business articles written about the European debt crisis. To be specific, it identifies words with a high frequency of occurrence within the chosen corpus and examines their collocations. The research is therefore carried out to answer the following research questions:
What are the top high-frequency words in the corpus of written articles about the European debt crisis?
What are the significant patterns and features of the collocations of such keywords?
I.3 Scope of the study
This study discusses keywords and their collocations in 15 written articles about the European debt crisis. The designed corpus of over 20,000 words is taken from online business articles on websites of high reputation such as The Washington Post, Money CNN, …. Keywords chosen for the analysis of significant collocation patterns within the study are those which can distinguish the business genre of the selected articles.
I.4 Structure of the thesis
The study is organized as follows:
Chapter I, Introduction, briefly states the rationale, aims, scope and organization of the study.
Chapter II, Theoretical Background, deals with the theories setting the background for the study.
Chapter III, Research Methodology, presents the methodology of the research, covering the research design, data collection procedures and data analysis procedures of the study.
Chapter IV, Results and Discussion, provides a detailed discussion of the collocations of keywords in the selected corpus, through which some interesting aspects are revealed.
Chapter V, Conclusion, presents the major findings of the study together with pedagogical implications and suggestions.
CHAPTER II
THEORETICAL BACKGROUND
This chapter deals with the theories setting the background for the research on collocations of keywords in business articles about the European debt crisis of 2011 in the light of corpus linguistics. In the first place, an overview of corpus linguistics is presented, followed by the theories of sense and sense relations. Then the literature on transference of meaning is reviewed. The chapter closes with a presentation on collocation in an effort to provide a partial answer to three questions: "What is collocation?", "What are the properties of collocation that surface repeatedly across the literature?", and "How is collocation classified by different researchers?"
II.1 Corpus linguistics
Nowadays, much research has been devoted to how computers can facilitate language learning. With the help of computer technology, the contextual factors that influence variability in language use can be discovered through examples taken from corpora. A corpus can be described as a large collection of authentic texts that have been gathered in electronic form according to a specific set of criteria (Bowker & Pearson, 2002).
Corpus linguistics (hereafter CL) deals with the principles and practice of using such corpora in language study. As a branch of linguistics, it differs from traditional linguistics in that it is concerned with the study of authentic examples of language (Sinclair, 1997). The main focus of CL is to discover patterns of authentic language in order to verify a hypothesis about language, for example, to determine how the usage of a particular sound, word, or syntactic construction varies. This, in turn, allows learners and researchers to ascertain the linguistic patterns and structures relevant to the goals of their research.
Corpus analysis is the fundamental technique used by CL. It is a means of accessing a corpus of text to show how any given word or phrase in the text is used in the immediate contexts in which it appears. By grouping the uses of a particular word or phrase on the computer screen or in printed form, the researcher shows the patterns in which the given word or phrase is typically used. A large collection of a word's patterns can then be created very quickly and effectively. Thus, CL has been widely employed in other areas of linguistics and in lexicography, where corpora can be used to help dictionary makers to spot new words and identify contexts for new meanings (Meyer, 2002).
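The grouping of a word's uses in their immediate contexts is commonly displayed as a keyword-in-context (KWIC) concordance. The following Python sketch shows one simple way such a display could be produced; it is only an illustration under stated assumptions, not the concordance program employed in this thesis, and the function name `kwic`, the `width` parameter and the sample sentence are hypothetical.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines: left context, node word, right context."""
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both contexts so that the node words line up in one column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


# Illustrative sample only.
sample = "The debt crisis grew. A new crisis followed in the markets."

for line in kwic(sample, "crisis", width=20):
    print(line)
```

Aligning the node word in a single column is what allows recurrent words to its left and right to be spotted by eye.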
In addition to its crucial function in language study in general, the role of CL in language pedagogy has become increasingly prominent. McEnery and Wilson (1996) argue that foreign language teachers usually produce simplified examples, which cause difficulties for students when they are confronted with real, more complex language that they are sometimes incapable of processing. CL can thus contribute to making foreign language learning more effective, since students are faced with real language. Authentic materials can motivate learners in the language classroom, whereas non-authentic materials may not, because they do not reflect real applications of language and students may therefore lose motivation in learning the target language.
The importance of exposure to authentic materials becomes even more evident in the case of ESP classes, where all lessons are highly purpose-driven. CL therefore plays an essential part in ESP, bringing a great deal of benefit to teaching and learning on ESP courses. However, among the many different types of corpora available, such as written and spoken corpora, general reference corpora, special purpose corpora, monolingual and multilingual corpora, synchronic and diachronic corpora, open and closed corpora, and learner corpora (Biber, 1998), specialized corpora are preferable in ESP classes since they offer access to specialized vocabulary in specialized contexts. In a specialized corpus, context has considerable influence on the choice of language, and the choice of language in turn plays an essential role in shaping the text genre.
II.2 Sense and sense relations
In Nguyen Hoa's words (Nguyen Hoa, 2004:56), "sense is a philosophical term for meaning". Meaning and sense are closely related; however, sense is sometimes distinguished from meaning. The meaning of a word is seen as part of the language system, whereas sense is the realization of this meaning in speech. According to John Lyons (1995:80), the sense of an expression may be defined as the set, or network, of sense relations that hold between it and other expressions of the same language.
A sense relation is the kind of relationship that holds between vocabulary items when they are arranged in texts, spoken or written: how they are related to one another in terms of their meaning, how they may or may not substitute for one another, how similar or how different they are to each other, and so on.
Nguyen Hoa (2004:121) points out that sense relations may be of two types, substitutional and combinatorial, which roughly correspond to the two Saussurean terms paradigmatic and syntagmatic. Substitutional, or paradigmatic, relations are those which hold between intersubstitutable members of the same grammatical category; combinatorial, or syntagmatic, relations normally hold between expressions of different grammatical categories which can be put together in grammatically well-formed combinations. For example, a substitutional relation holds between the nouns bachelor and spinster, whereas the relation that holds between the adjective unmarried and the nouns man and woman is combinatorial.
In discussing combinatorial relations, the question arises whether any adjective can combine with any noun, or any verb with any noun. In fact, in English, as in every other language, the combinations of words of different grammatical categories are restricted. Each word tends to co-occur with a certain range of words, which is referred to as a collocation relation. Indeed, lexemes are so highly restricted with respect to collocational acceptability that it is almost impossible to predict their combinatorial relations on the basis of an independent characterization of their sense.
II.3 Transference of meaning
In English, there are basically two types of meaning transference, namely metaphor and metonymy.
II.3.1 Metaphor
According to Nguyen Hoa (2004:105), "metaphor is the transference of meaning from one object to another based on the similarity between these two objects". Traditionally, metaphors have been viewed as implicit comparisons. Flood … poured in, oozes, and stem in the following sentences are all examples of metaphors.
A flood of protests poured in following the announcement. (a large quantity of … came in)
He oozes geniality. (displays all over)
The government still hopes to stem the tide of inflation. (resist the force of)
However, if the resemblance is explicitly signaled by a word such as like, as in protests came in like a flood, this is considered not metaphor but simile.
According to Nguyen Hoa (2004:109), metaphors may be of three types.
Living metaphors are those in which words are used in unusual meanings and which are still felt as metaphors (Beauty is a flower which wrinkles will devour.)
Faded metaphors have lost their freshness through long use and have become habitual (dying capitalism, to fall in love, golden youth)
Dead metaphors are words which have lost their direct meanings and are used only figuratively (to ponder, capital, sarcasm)
Additionally, metaphors may be divided into different subgroups. The following are some commonly used subgroups of metaphors in English.
One subgroup comprises names of parts of the human body transferred to other objects. Typical examples include the nose of a plane, the head of the school, or the leg of the table.
Another subgroup comprises names of animals transferred to human beings. For example, a cunning person is a fox, and a hard-working person is a bee.
A third subgroup comprises proper names transferred to common ones. For instance, a jealous person is called an Othello, and an eloquent speaker is a Cicero.
II.3.2 Metonymy
According to Nguyen Hoa (2004:112), metonymy can be defined as "the substitution of one word for another with which it is associated". Thus, metonymy works by contiguity rather than similarity, which means that instead of the name of one object or notion we use the name of another because these objects are associated or closely related. Examples of metonymy include eye, shirt, and breathe in the following sentences.
Keep your eye on the ball. (gaze)
He is always chasing shirts. (girls)
It will not happen while I still breathe. (live)
According to Lyons (1995:314), body parts are favourite sources of metonymy, and many such expressions have been incorporated into the language, with words like hand, heart, head as in have a hand in, bare one's heart, or keep your head.
Some common substitutions in metonymy include:
place-for-institution (The White House objected to the plan.)
thing-for-perception (There goes my knee.)
object-for-possessor (The crown was angry with the Prime Minister's proposal.)
part-for-whole (We do not like long hairs.)
place-for-event (Watergate strikes at the heart of the American political system.)
Traditionally, according to Nguyen Hoa (2004:113), the following cases of metonymy are often presented.
The name of the container is used instead of the thing contained (to drink a glass)
Names of parts of the human body may be used as symbols (to have a good eye, kind heart)
The concrete is used instead of the abstract (from the cradle to the grave)
The materials are used for the things made of the materials (canvas, glass)
The name of the author is used for his works (Watts, Picasso)
The part is used for the whole and vice versa (We all live under the same roof; She is wearing a fox)
A subtype of metonymy is synecdoche, in which a whole is represented by naming one of its parts, or vice versa. Roof, strings, and bite in the following sentences are examples of synecdoche.
They all live under the same roof. (in one house)
At this point the strings take over. (stringed instruments)
Let's go and have a bite. (have a meal)
II.3.3 Other types of meaning transference
Besides metaphor and metonymy, there are other types of meaning transference, including hyperbole, litotes, irony, and euphemism.
Hyperbole is an exaggerated statement not meant to be understood literally; its effect, however, is powerful. For example:
It is a nightmare.
A thousand thanks.
Litotes is an understatement. It is traditionally defined as expressing something in the affirmative by the negative of its contrary. For instance, not bad is often used to mean good, or rather unwise to mean very silly.
Irony is used to express meaning by words of the opposite sense. In irony, intonation plays an essential role. For example, nice in "You have got us into a nice mess." means bad.
Euphemisms involve the use of a milder expression for something unpleasant. For instance, restroom or bathroom is used instead of WC.
II.4 Collocation
II.4.1 Definition of collocation
It is not easy to define what collocation is. In the linguistic literature, it is often discussed in contrast with free word combination at one extreme and idiomatic expression at the other; collocation occurs somewhere in the middle of this spectrum. A free word combination can be described using general rules, that is, in terms of semantic constraints on the words which appear in a certain syntactic relation with a given headword. An idiom, on the other hand, is a rigid word combination to which no generalities apply: its meaning cannot be determined from the meaning of its parts, nor can it participate in the usual word-order variations. Collocation falls between these extremes, and it can be difficult to draw the line between categories. A word combination ceases to be free and is termed a collocation when the number of words which occur in a syntactic relation with a given headword decreases to the point where it is not possible to describe the set using semantic regularities.
Thus, examples of free word combinations include put + (object) or run + (object) (i.e. manage), where the words that can occur as object are virtually open-ended. In the case of put, the semantic constraint on the object is relatively open-ended (any physical object can be mentioned) and thus the range of words that can occur is relatively unrestricted. In the case of run (in the sense of manage or direct), the semantic restrictions on the object are tighter but still follow a semantic generality: any institution or organization can be managed, such as a business or an ice cream parlor. In contrast to these free word combinations, a phrase such as explode a myth is a collocation. In its figurative sense, explode illustrates a much more restricted collocational range. Possible objects are limited to words such as belief, idea, theory. At the other extreme, phrases such as fill the bill or fit the bill function as idioms, where no words can be interchanged and variation in usage is not generally allowed.
Different linguists define collocation differently. Moira Runcie, in the Oxford Collocation Dictionary, gives a general definition in which collocation is the way words combine in a language to produce natural-sounding speech and writing. To a native speaker, these combinations are highly predictable; to a learner they are anything but. More specifically, Chitra Fernando, Richards and others (1996:62) state that collocation refers to the restrictions on how words can be used together, for example which prepositions are used with particular verbs or which verbs and nouns are used together. The Oxford Advanced Learner's Dictionary defines collocation as "a combination of words in a language that happens very often and more frequently than would happen by chance". In Kjellmer (1994:xiv & xxxiii), collocation is "such recurring sequences of items as are grammatically well formed". Kathleen R. McKeown and Dragomir R. Radev, in their paper on Collocations, regard collocations as word pairs and phrases that are commonly used in language but to which no general syntactic or semantic rules apply.
Additionally, many linguists have tried to define collocation by presenting its functions. Halliday (1966) and Sinclair (1966) introduced the notion that patterns of collocation can form the basis for a lexical analysis of language that is alternative to, and independent of, the grammatical analysis. They regarded the two levels of analysis as complementary, with neither being subsumed by the other. Taking a related view, McIntosh (1961:328) and Mitchell (1971) presented the lexical and grammatical analyses as interdependent: "Collocations are to be studied within grammatical matrices which in turn depend for their recognition on the observation of collocation similarities" (Mitchell, 1971:65). Halliday (1966:151 & 157) further argued that the collocation patterns of lexical items can lead to generalization at the lexical level. Sinclair (1966:412 & 1974:16) proposed that a lexical item can be defined from its collocation pattern.
In conclusion, definitions of collocation vary across the work of different linguists. The fact that collocation is observable in large samples of language underlies the important role it plays. Collocation is used in various applications, and information about collocation is significant to many linguistic areas such as dictionary writing, natural language processing, and language teaching. "In all kinds of texts collocations are essential, indispensable elements…with which our utterances are very largely made" (Kjellmer, 1987:140); "Even very advanced learners often make inappropriate or unacceptable collocations" (McCarthy, 1990:13). These quotations make two points relevant to learners of English. Firstly, collocation relations are an important part of the language to be mastered. Secondly, collocation is an area which "resists" tuition and therefore requires special and systematic attention.
II.4.2 Properties of collocation
In discussing the nature of collocation, linguists have tried to generalize the characteristics that collocations have in common. Generally, collocation has three major features, as follows.
II.4.2.1 Collocation is arbitrary
In the first place, collocation is typically characterized as arbitrary, which means that words are often combined with each other without any particular reason.
According to Gairns and Redman (1986:37), a statement on collocation is never absolute. Items, as they note, may co-occur simply because the combination reflects a common real-world state of affairs. For instance, pass and salt collocate since people want other people to pass them the salt. The notion of arbitrariness captures the fact that substituting a synonym for one of the words in a collocational word pair may result in an infelicitous lexical combination. For example, a phrase such as make an effort is acceptable, but make an exertion is not. Similarly, a running commentary, commit treason, and warm greetings are all true collocations, but a running discussion, commit treachery, and hot greetings are not acceptable lexical combinations.
However, Gairns and Redman (1986:37) add that there may be an element of linguistic convention in collocation. Thus, English speakers have chosen to say, for example, that lions roar rather than bellow. It is because of the linguistic conventions that collocation carries that joining together semantically compatible parts does not always produce an acceptable collocation. For instance, quiet and noise appear perfectly able to co-occur; in reality, however, native speakers do not say quiet noise.
II.4.2.2 Collocation is language-specific
Secondly, collocation is language-specific, as its arbitrary nature persists across languages. As Larson (1984:141) points out, every language interprets the physical world in its own way and has its own conventions, which govern the different collocability of words. For instance, in French, the phrase régler la circulation is used to refer to a policeman who directs traffic, the English collocation. In Russian and German, the direct translation of regulate is used; only in English is direct used in place of regulate. Similarly, American and British English exhibit differences in similar phrases. Thus, in American English one says set the table and make a decision, whereas in British English the corresponding phrases are lay the table and take a decision.
It follows from the characteristics above that what is a perfectly acceptable collocation in one language may be unacceptable in another. Take the case of eat in English and ăn (eat) in Vietnamese as a typical example. Although these two words are equivalent to each other, they cannot go with the same range of nouns. While such collocations as ăn hối lộ, ăn bữa tối, không ăn lương, ăn Tết are acceptable in Vietnamese, the verb eat in English cannot co-occur with the corresponding nouns. Instead, the equivalent phrases must be take bribes, have dinner, without pay, enjoy Tet, in which different verbs are employed.
As collocation differs from language to language, students face a great deal of trouble in learning the collocations of a foreign language. Unconsciously, students fall into the habit of translating a word combination from their first language into the foreign language and end up with an unacceptable collocation. For example, instead of saying ride a bicycle, Vietnamese learners sometimes say go bicycle because đi xe đạp (go bicycle) is perfectly correct in Vietnamese.
II.4.2.3 Collocation is recurrent in context
While the two properties mentioned above indicate the difficulty of determining what is an acceptable collocation, on the positive side it is clear that collocation occurs frequently in similar contexts. It is possible to observe collocations in samples of language. Generally, collocations are those word pairs which occur together frequently in the same environment, but they do not include lexical items which have a high overall frequency in the language. This property has in fact been exploited by many researchers in natural language processing to identify collocations automatically.
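One widely used way of operationalizing this property is pointwise mutual information (PMI), which rewards pairs that co-occur more often than their individual frequencies would predict and therefore penalizes words that are simply frequent everywhere. The Python sketch below is offered only as an illustration of that general idea; PMI is not a measure named in the sources cited above, and the function name `pmi_scores`, the `window` and `min_count` parameters and the toy token list are all assumptions.

```python
import math
from collections import Counter


def pmi_scores(tokens, window=2, min_count=2):
    """Score word pairs that co-occur within `window` following tokens by PMI."""
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, word in enumerate(tokens):
        # Pair each token with the next `window` tokens; order is ignored.
        for other in tokens[i + 1:i + 1 + window]:
            pair_counts[tuple(sorted((word, other)))] += 1

    total_tokens = len(tokens)
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (w1, w2), count in pair_counts.items():
        if count < min_count:
            continue  # pairs seen only once give unreliable scores
        p_pair = count / total_pairs
        p1 = word_counts[w1] / total_tokens
        p2 = word_counts[w2] / total_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p1 * p2))
    return scores


# Toy data for illustration only.
tokens = "the debt crisis hit the markets and the debt crisis hit the banks".split()

for pair, score in sorted(pmi_scores(tokens, window=1).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy example the pair debt and crisis scores higher than the pair the and debt, even though both occur equally often, because the is frequent throughout the text. On a small corpus such scores are unstable, so a minimum frequency threshold such as `min_count` is normally applied.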
II.4.3 Classifications of collocation
In an effort to characterize collocation, linguists present a wide variety of individual collocations and attempt to categorize them as part of a general scheme. As a result, linguists arrive at different classifications of collocation corresponding to their views of collocation.
By examining a large number of collocates of the same syntactic category, Kathleen R. McKeown and Dragomir R. Radev, in their paper on Collocations, identify similarities and differences in their behavior. Distinctions are made between grammatical collocations and semantic collocations. In their view, grammatical collocations often contain prepositions and include paired syntactic categories such as verb + preposition, adjective + preposition, and noun + preposition. In these cases, the open-class word is called the base and determines the words it can collocate with, which are called the collocators. Semantic collocations are lexically restricted word pairs, where only a subset of the synonyms of the collocator can be used in the same lexical context.
In the Oxford Advanced Learner's Dictionary (Moira Runcie, 2002), collocation is classified both in terms of grammatical pattern and in terms of the strength of collocation. Firstly, according to grammatical pattern, there are thirteen types of collocation, as follows.
1. adjective + noun: heavy traffic
2. quantifier + noun: a hand/bunch of bananas
3. verb + noun: make/deliver/give a speech
4. noun + verb: proportion grows/increases/rises
5. noun + noun: project management
6. preposition + noun: along/across the road
7. noun + preposition: the light from the window
8. adverb + verb: strongly recommend
9. verb + verb: be willing to risk
10. verb + preposition: depend on
11. verb + adjective: make/keep/declare something safe
12. adverb + adjective: downright/completely/absolutely ridiculous
13. adjective + preposition: pleased with
Secondly, according to the strength of collocation, collocations are categorized into four types:
1. Unique collocations are the most restricted ones, in which the patterns have almost no expected variations. A unique collocation usually forms a particular meaning rather than a structure. For example, the phrase kick the bucket is considered a unique collocation, meaning "to die" and used of bad men like thieves or murderers. While other nouns and verbs can be substituted in the phrase to form other meaningful phrases such as kick the door and lift the bucket, the word combinations in these other phrases are no longer cohesive patterns in the way that kick the bucket is.
2 Strong collocations are those in which any knowledge of a pattern can be incomplete
without some idea of its strong collocate Trenchant criticism and rancid butter are two
examples of collocations of this type
3 Medium-strength collocations form the great part of what we say and write This is
considered the most common and typical type of collocations Instances of medium-strength
collocations include hold a conversation, highly complicated, or direct equivalent
4 Weak collocations are often common patterns that help structure a sentence but do not
carry much specific meaning by themselves For instance, a weak collocation might be let's + verb, which is used for suggestion This is a commonly used structural pattern into which a
variety of verbs can be inserted without any changes in meaning of the phrase as a whole
CHAPTER III
RESEARCH METHODOLOGY
The study is a corpus-based analysis of the data in business articles. It investigates high-frequency words together with their collocations across a number of different business articles and reports. This chapter, as its name indicates, outlines the methodology of the research. It starts with the fundamental data collecting instruments employed in the study. Procedures for data collection are addressed next, followed by procedures for data analysis.
III.1 Data collecting instruments
III.1.1 Construction of Corpus
Since the study is primarily a corpus-based analysis of collocations, its findings come from a linguistic analysis of a substantial number of written articles. The corpus of the study is constructed from 15 articles extracted from four databases.
III.1.1.1 Database
The database in this thesis refers to the set of publications from which articles for analysis have been extracted It consists of the following journals:
The New York Times
With continuous publication since its foundation in 1851, The New York Times is the third largest newspaper overall in America, behind The Wall Street Journal and USA Today. Its website is the most popular American online newspaper website, receiving more than 30 million visitors every month.
Washington Post
Along with The New York Times, The Washington Post is generally considered one of the leading daily American newspapers. While it has distinguished itself through its particular emphasis on the operation of the US government, economic issues have increasingly become a central topic of discussion in the newspaper.
business website worldwide with millions of unique visitors per month
Bloomberg.com
This is the official website of Bloomberg L.P., an American multinational mass media corporation situated in New York City, New York. Bloomberg has established a privileged position in the world of economics and finance, accounting for one third of the global financial data market, with estimated revenue of $6.25 billion in 2009.
The above-mentioned publications were chosen to serve as the database for the study because of their reliability, their reputation for well-known authors and prestigious presses, and their worldwide use in the field of economics.
III.1.1.2 Extracted business articles
As mentioned above, 15 articles were extracted from the sample publications with a view to identifying and then examining keywords with a high frequency of occurrence and their collocation patterns. The selected articles are all about the European debt crisis in 2011, providing readers with up-to-date features and critical, systematic analysis of crisis-related aspects. The following table summarizes the corpus used in the study, including the databases and the extracted texts. Detailed references for each selected text can be found in the Appendix.
Table 1: List of the selected articles
Database | Information of the Articles (Author, Date of publication, Title) | Average Text Length
 | Louis Cooper (3 Aug 2011) Debt Crisis in Europe: Worries Grow of Spread to Larger Economies of Italy, Spain | 527 words
 | Alex Witt (05 Feb 2011) Debt Crisis Unsettles European Economy | 1024 words
Money.cnn | Ben Rooney (26 Nov 2011) Europe’s Debt Crisis: Five Things You Need to Know |
Bloomberg | Simon Johnson (23 Jan 2011) Europe’s Debt Crisis is Still Likely to End Badly |
 | Larry Elliot, Heather Stewards and John Hooper (9 Nov 2011) European Debt Crisis Spiraling Out of Control |
 | Hannelore Foerster (26 Aug 2011) European Debt Crisis | 5748 words
Total Corpus Length | | 21,083 words
III.1.2 Concordance Program
Concordance Program is a computer program that is helpful to the corpus linguist It is used
to create word lists, count word frequency, compare different usages of a word, analyze keywords, and find phrases and idioms The Concordance Program is a general-purpose working tool for studying of text, whether the text is literary, linguistic, historical, philosophical, legal, commercial, and political or of other kinds In this study, the Concordance Program 3.3 was used to search for high-frequency words and their collocations
in the corpus of business articles
The following illustrates a sample page for the main screen of the Concordance Program 3.3
Figure 1: Concordance Program’s main screen
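The kind of keyword-in-context (KWIC) listing that a concordancer produces can be illustrated with a minimal Python sketch. This is not the Concordance Program 3.3 itself, and nothing here reflects its internals: the function name kwic, the width parameter and the tokenization rule are assumptions made for illustration, and the sample text is invented rather than taken from the corpus.

```python
import re

def kwic(text, node, width=5):
    """Return (left context, node, right context) for each occurrence of node.

    Hypothetical helper: tokenization is a simple regular expression and
    matching is case-insensitive; a real concordancer offers far more options.
    """
    tokens = re.findall(r"[A-Za-z0-9'-]+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

# Invented sample text, used only to show the output layout.
sample = ("To address the growing debt crisis, leaders met in Brussels. "
          "Investors feared that the crisis would spread to Italy.")

for left, node, right in kwic(sample, "crisis", width=4):
    print(f"{left:>30} | {node} | {right}")
```

Each printed row places the node word in a fixed column with a few words of context on either side, which is the layout used in the concordance tables of Chapter IV.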
III.2 Data collecting procedures
The research was conducted in the following steps:
Firstly, articles written about the European debt crisis in 2011 were copied from the websites of the selected newspapers and journals, and saved as Plain Text.
Next, the dates, titles, and authors' names in the articles were deleted from the text. Only the article bodies were left for analysis.
The corpus was then fully developed from the completed Plain Text file.
Finally, the Concordance Program 3.3 was used to investigate the constructed corpus. From the resulting full concordance, results and findings of the research were extracted for analysis.
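The preparation steps above can be sketched in Python as follows. In the study this work was done by hand, so the sketch is only an illustration under stated assumptions: the record layout (title, author, date, body), the function name build_corpus and the two sample records are all hypothetical.

```python
def build_corpus(articles):
    """Join article bodies into one plain-text corpus, dropping metadata.

    Hypothetical helper: each article is assumed to be a dict whose "body"
    key holds the running text; titles, authors and dates are ignored.
    """
    bodies = []
    for article in articles:
        body = article.get("body", "").strip()
        if body:
            bodies.append(body)
    return "\n\n".join(bodies)

# Invented records, used only to show the shape of the data.
articles = [
    {"title": "Sample title A", "author": "Author A", "date": "1 Jan 2011",
     "body": "Debt fears grew across the euro zone. "},
    {"title": "Sample title B", "author": "Author B", "date": "2 Jan 2011",
     "body": "Markets fell as the crisis deepened."},
]

corpus = build_corpus(articles)
print(len(corpus.split()), "words")
```

The resulting string could then be saved as a single plain-text file and loaded into a concordancer.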
III.3 Data analyzing procedures
The data of the study are interpreted in the following steps. To begin with, analyses of the corpus are conducted using the Concordance Program 3.3, available at the website www.concordancesoftware.co.uk. To obtain the quantitative results, the 100 words with the highest percentage of occurrence are listed in tables with reference to their rank and relative frequency. Of those 100 lexical items, the top 25 content words are selected, from which keywords are drawn for full analysis. A keyword is one which has unusually high, or low, frequency in comparison to a base reference corpus (Berber Sardinha, 1999) and thus may characterize a text or a genre (Scott, 2009). Within this study, keywords are recurrent and can differentiate the business genre of the chosen articles. However, as this is a corpus-based study on collocations, frequency alone may not be adequate; some measure of collocation strength is also required. Thanks to the relatively small size of the corpus, a close reading of the texts could be undertaken both manually and by computer. Therefore, in the next step, the concordance of the keywords is scanned to give an overview of their collocation patterns. On that basis, the target words finally chosen for analysis are those with a wide and remarkable range of collocations.
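The ranking step described above (rank, count and relative frequency, followed by the separation of content words from function words) can be sketched in Python. This is a simplified illustration, not the procedure built into the Concordance Program 3.3: the function names, the tokenization rule and the sample sentence are assumptions, and the small function-word set contains only the examples named in Chapter IV.

```python
import re
from collections import Counter

# Function words named as examples in the study; a full list would be longer.
FUNCTION_WORDS = {"the", "to", "of", "and", "a", "that", "for"}

def frequency_list(text, top_n=100):
    """Return (rank, word, count, relative frequency in %) for the top_n words."""
    tokens = re.findall(r"[a-z0-9'-]+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    return [
        (rank, word, count, 100.0 * count / total)
        for rank, (word, count) in enumerate(counts.most_common(top_n), start=1)
    ]

def content_words(rows, function_words=FUNCTION_WORDS):
    """Keep only the rows whose word is not in the function-word set."""
    return [row for row in rows if row[1] not in function_words]

# Invented sample text, used only to show the output.
sample = "the debt crisis and the debt of the euro zone"
rows = frequency_list(sample)
for rank, word, count, percent in content_words(rows):
    print(rank, word, count, f"{percent:.1f}%")
```

Relative frequency here is simply the count divided by the total number of tokens, so the percentages depend on how the text is tokenized; figures produced this way need not match those reported by a particular concordancer.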
Once the target keywords are identified, an in-depth investigation into the different collocates of the words is carried out. The investigation, in turn, is diversified, as collocations are examined with regard to every possible semantic and syntactic feature. For example, various senses of a word in different collocations can be interpreted through careful definition of the phrases it occurs in, through comparisons with words that convey the same meanings, or through the researcher's illustrative example sentences or contexts.
As for the illustration of data from the Concordance program in the analysis of the selected keywords, owing to the pre-set function of the Concordance 3.3, only a limited number of cases of a particular word can be displayed. As a result, for each keyword, a screen shot of the concordance is provided first to ensure the reliability of the research; the results are then provided in tables compiled for the author's particular aims.
CHAPTER IV
RESULTS AND DISCUSSION
This chapter presents the results of the research, followed by an in-depth discussion of the findings. The quantitative results of the analysis are presented to address the first research question. Keywords with a high frequency of occurrence in the constructed corpus are presented in tables. After that, on closer examination, a number of key collocations of the keywords are identified on the basis of their striking patterns.
IV.1 Quantitative Results
Research question 1: What are the top high-frequency words in the corpus of written
business articles about the European debt crisis 2011?
Table 2 below lists the frequencies of the 100 most frequent words in the corpus of well over 20,000 words, drawn from 15 selected articles about the debt crisis in Europe in 2011.
Table 2: Top 100 high-frequency words from the constructed corpus
It can be clearly seen from the table that, as in most written English texts, the most frequent items in the corpus are function (or grammatical) words such as the, to, of, and, a, that, for. From the 8th item in the word list, the key (or content) words that distinguish the business genre of the corpus start to appear. Among these are debt, European, Greece, crisis, Euro, countries, financial, bailout and so on. Table 3 below shows the first 25 keywords from the high-frequency word list of the corpus.
Table 3: First 25 keywords from the corpus
As mentioned previously, within this research, only high-frequency keywords that differentiate the genre of the corpus, that is, written texts about an economic and financial issue, and that possess striking patterns of collocates are brought into sharp focus for collocation analysis. The following section, therefore, provides collocation information for four content words: debt, crisis, economic and markets.
IV.2 Collocation analysis of content keywords
Research question 2: What are significant patterns of collocations of the content keywords
from the corpus?
IV.2.1 DEBT and CRISIS
DEBT and CRISIS are the most frequent content words of the business genre among all the words in the selected articles, with absolute frequencies of 179 and 96 and relative frequencies of 0.852% and 0.457% respectively. The following screen shots (Figure 2 and Figure 3) illustrate string matching of CRISIS and DEBT in the corpus respectively.
Figure 2: String matching of CRISIS from the corpus
Figure 3: String matching of DEBT from the corpus
Both words take a wide range of collocates within the corpus. They are analyzed together in the same section because they frequently occur together throughout the articles and their collocates share a good number of common features. The section below looks at the collocation patterns of DEBT and CRISIS, identifying the adjectives, verbs, nouns and phrases these words can go with.
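A first overview of the words that precede a node word can be obtained by counting its immediate left neighbours, as in the Python sketch below. This is an illustrative simplification and not the procedure followed in the study, where collocates were identified from the concordance and classified by close reading: the function name left_collocates, the span parameter and the sample text are assumptions, and the counts do not distinguish adjectives from nouns.

```python
import re
from collections import Counter

def left_collocates(text, node, span=1):
    """Count the words found within `span` positions to the left of node.

    Hypothetical helper: tokens are lower-cased and sentence boundaries are
    ignored, so a neighbour may come from the preceding sentence.
    """
    tokens = re.findall(r"[a-z0-9'-]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node.lower():
            counts.update(tokens[max(0, i - span):i])
    return counts

# Invented sample text, used only to show the output.
sample = ("the growing debt crisis and the mounting economic crisis "
          "followed the greek debt crisis")

print(left_collocates(sample, "crisis").most_common())
print(left_collocates(sample, "debt").most_common())
```

Deciding which of the counted neighbours are adjectival collocates, and which are nouns such as debt in debt crisis, still requires manual inspection or a part-of-speech tagger.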
At first glance, it is noteworthy that almost all of the adjectives shown in Figure 2 above are used attributively within the corpus, coming before the noun CRISIS they modify (only continuing, looming, imminent and unshakable are excluded). Semantically, these are general adjectives susceptible to objective measure, since they are used to describe the existence and development of the debt crisis. This appears to be a characteristic feature of the corpus genre: objectivity and concision must be guaranteed in the provision of factual information in business articles in order to report the issues accurately.
Table 4: CRISIS Concordance (Adjective collocations)
economic crisis However, panic due to the Greek debt | crisis | hit the country in the late 2009 and early | 47
if the Eurozone enters a full-on | crisis | For example, European debt makes up almost half of all | 98
with a slow-moving but unshakable | crisis | that has underscored the flaws behind the common | 340
greater penalties for bid deficits But what appeared to be an imminent | crisis | |
Another potential | crisis | bubbled up in September, as European officials angrily warned Greece that | 477
To address the growing debt | crisis | , Chancellor Angela Merkel of Germany and President Nicolas | 503
assistance as part of the continuing debt | crisis | The aid offered by countries that use the euro was | 700
already reluctant European leaders and the European Central Bank to present a full-blown | crisis | | 881
"unite or face irrelevance" in the face of the mounting economic | crisis | in Italy "We are witnessing | 1040
underlying economic | crisis | He pointed out that they would require referendums in at least four |
That helped ease fears of an immediate debt | crisis | |
riskier investments as collateral for loans to help them through the financial | crisis | | 1506
All the adjectives going with CRISIS in the corpus are listed in Table 5 below.
Table 5: Adjectives collocating with CRISIS
Adjectives in the combinations with DEBT as shown in Table 6, on the other hand, are remarkable for the predominance of words indicating the 'debt owner', such as European, Greek, Italian, Irish, or nation’s and country’s, which are nouns in the possessive case functioning as adjectives. This tells readers which countries suffered the most in the stories told.
Table 6: DEBT concordance (Adjective collocations)
and ever-increasing | debt | due to a lower cost of borrowing Greece hired Wall Street firms, most | 8
deficits were more than double previous estimates Greek | debt | was immediately downgraded The | 11
more austerity, and Moody's has put Spanish | debt | on warning for another downgrade | 67
growth, fears that Italy would develop an Italian | debt | in May In June, Moody's also threatened a downgrade, citing rising borrowing costs | 71
if the euro zone enters a full-on crisis For example, European | debt | makes up almost half of all | 98
Greece's rising | debt | troubled the markets from whom it borrowed Raising more money became | 138
create this frenzied fight to save the euro But while Italy, Portugal and Ireland all face similar | debt | | 164
as the US and its Federal Reserve, can buy back bad | debt | from banks if such a crisis approaches | 181
against potential losses on distressed sovereign | | |
2012, the troika engineered a default by Greece on most of its private | debt | |
of gross domestic product and total | debt | to 60 percent Violators would be hit with sanctions unless | 556
expected to begin releasing to Greece the aid it needs to prevent a default when its | | |
levels to 120% by 2020 Greece needs to have this level reduced to 60% for a true sustainable | debt | | 1205
Lisbon struggled on Friday to quell fears of a looming | debt | |
Senior officials at the major rating agencies on Friday played down the risk of an immediate | debt | | 1459
Table 7 below summarizes the adjectival collocates of DEBT in the corpus.
Table 7: Adjectives collocating with DEBT
Both groups of adjectival collocations are also distinctive for a significant number of -ing adjectives, which come from the same word families as verbs describing trends and indicate the status of the debt crisis at the time it was written about. Some examples are continuing, looming, rising, growing and ongoing.
While most of the adjectives in Table 5 and Table 7 are widely used, some deserve attention because, in collocation with CRISIS and DEBT, they may convey meanings that confuse learners. Immediate is very familiar in its most common meaning, 'happening or done without delay'. However, in immediate crisis or immediate debt, the adjective has the sense of 'existing now and needing urgent attention' (Oxford Advanced Learner’s Dictionary 2011). The word then becomes roughly synonymous with existing, pressing, critical or urgent and combines happily with such words as concern, problem and danger. With unshakable crisis, there is an indication of the metaphor ECONOMY AS A PERSON. Originally used to describe a person's feeling or attitude that cannot be changed or destroyed, unshakable in collocation with CRISIS is employed to denote the firmness of a business matter. In bad debt, the overall meaning of the combination has nothing to do with the expected senses of the word debt as offered in many dictionaries. Instead, bad debt becomes a fixed technical term in accounting, referring to a debt that will not be paid.
Table 8: CRISIS Concordance (Noun collocations)
Unlike Greece, Ireland had a balanced budget before the | crisis | hit However, it also had a huge real | 33
In November 2010, Ireland, wracked by a banking | crisis | that followed the collapse of a housing | 352
But the | crisis | response in the United States did not depend solely on government-backed entities | 662
The goal is to help present a future | crisis | by ensuring that governments do not spend beyond their |
creating a potential funding | crisis | It didn't help when the EBA (European Banking Authority) did | 1216
around E1.8 trillion As we learned from the 2008 US | crisis | , there are numerous structures that | 1246
But after being in | crisis | mode for nearly two years, some investors are sounding more optimistic | 1320
The nearly three-year-old crisis appears to be entering a new phase
as the respite in global financial