A Corpus-based Study on Collocations of Keywords in English Business Articles on the European Debt Crisis


Đào Thị Ngọc Nguyên

University of Foreign Languages. Master's thesis, major: English Language; Code: 60 22 15

Supervisor: Dr. Phạm Thị Thanh Thủy

Year of defense: 2012

Abstract: One of the most problematic areas in dealing with vocabulary is collocation. It is often seen as arbitrary and overwhelming, a seemingly insurmountable obstacle to the attainment of native-like fluency. This work presents a study of the collocations of keywords within a 20,000-word corpus of English business articles about the European debt crisis of 2011. The aim of the study is to identify the high-frequency words used in the corpus and, above all, to examine the collocation patterns of keywords that distinguish the business genre of the selected texts. Concordance Program 3.3 is the main tool employed throughout the study for data collection and analysis. The major findings of the research are a good number of striking collocation patterns possessed by some of the most recurrent keywords. These findings form the basis for pedagogical implications and suggestions for raising students' awareness of English collocation acquisition.

Keywords: Language; English; Vocabulary

Content

I INTRODUCTION

I.1 Statement of the problem and rationale of the study

However, no matter how convinced learners of English are, in principle, of the importance of vocabulary, vocabulary acquisition actually poses enormous difficulties for them. One of the most complicated problems arising when vocabulary is dealt with is how to combine and use words appropriately in accordance with cultural or linguistic conventions, which is often referred to as "collocation competence" (Hill, 1999).

Collocations are usually defined as words that typically occur in association with other words; in reality, they run through the whole of the English language and are as old as the language itself. No piece of natural spoken or written English is totally free of collocations. Because of their widespread use, the role that collocations play in the language is undeniable.

For learners of English in general, collocation competence means the ability to combine lexical (and grammatical) chunks in order to produce fluent, accurate, and semantically and stylistically appropriate utterances. For business English learners in particular, a good knowledge of collocation patterns in English is also of great importance. The most important characteristics of business English, as opposed to general English, are a sense of purpose, an intercultural dimension, and a need for clear, straightforward and concise communication (Ellis & Johnson, 1994). To help learners achieve these broad objectives, teachers have to find the best ways to teach business performance skills such as socializing, telephoning, meetings, presentations, and report writing. In all these situations, collocation competence is essential.

With the rise of computing power and the acceptance of corpus linguistics since the 1990s, collocations have received serious treatment. The dramatic rise in the processing power of computers now makes it possible to compile frequency lists for the lexical items in a large corpus quickly. At the same time, a large number of software programs have been developed for extracting keywords and collocations from corpus data. Such software packages have made it easier to investigate typical lexical items and their collocations in any particular text genre.

Motivated by the writer's personal interest in collocations as a researcher, and by observations, as a tutor of business learners, of students' difficulties in dealing with collocations in business discourse, this thesis investigates the collocations of keywords in a variety of business articles written about a topical issue for business learners, the European debt crisis. It is carried out in the hope that it may be of some help to business learners of English as well as to those interested in English semantics and collocation-related issues.

I.2 Aims of the study

The aim of this research is to conduct a close investigation into the collocations of keywords in a corpus of business articles written about the European debt crisis. Specifically, it identifies words with a high frequency of occurrence within the chosen corpus and examines their collocations. The research is therefore carried out to answer the following research questions:

• What are the top high-frequency words in the corpus of written articles about the European debt crisis?

• What are the significant patterns and features of the collocations of such keywords?

I.3 Scope of the study

This study discusses keywords and their collocations in 15 written articles about the European debt crisis. The corpus of over 20,000 words is taken from online business articles on reputable websites such as The Washington Post and CNNmoney.com, among others. The keywords chosen for the analysis of significant collocation patterns are those which can distinguish the business genre of the selected articles.

I.4 Structure of the thesis

The study is organized as follows. Chapter I, Introduction, briefly states the rationale, aims, scope and organization of the study. Chapter II, Literature Review, deals with the literature that sets the background for the study. Chapter III, Research Methodology, presents the methodology of the research, covering the research design, data collection procedures and analytical framework of the study. Chapter IV, Results and Discussion, gives a detailed discussion of the collocations of keywords in the selected corpus, through which some interesting aspects are revealed. Chapter V, Conclusion, presents the major findings of the study together with pedagogical implications and suggestions.

II LITERATURE REVIEW

II.1 Corpus linguistics

Corpus linguistics (hereafter CL) deals with the principles and practice of using corpora in language study. As a branch of linguistics, it differs from traditional linguistics in that it studies authentic examples of language (Sinclair, 1997). The main focus of CL is to discover patterns in authentic language in order to verify a hypothesis about language, for example, to determine how the usage of a particular sound, word, or syntactic construction varies. This, in turn, allows learners and researchers to ascertain the linguistic patterns and structures relevant to the goals of their research.

II.2 Sense and sense relations

In Nguyen Hoa's words (2000:56), "sense is a philosophical term for meaning". Meaning and sense are closely related; however, sense is sometimes distinguished from meaning. The meaning of a word is seen as part of the language system, whereas sense is the realization of this meaning in speech. According to John Lyons (1995:80), the sense of an expression may be defined as the set, or network, of sense relations that hold between it and other expressions of the same language.

A sense relation is the kind of relationship that holds between vocabulary items when they are arranged in texts, spoken or written: how they are related to one another in terms of meaning, how they may or may not substitute for one another, how similar or different they are, and so on.

II.3 Transference of meaning

In English, there are basically two types of meaning transference, namely metaphor and metonymy.

II.3.1 Metaphor

According to Nguyen Hoa (2004:105), "metaphor is the transference of meaning from one object to another based on the similarity between these two objects". Traditionally, metaphors have been viewed as implicit comparisons.

II.3.2 Metonymy

According to Nguyen Hoa (2004:112), metonymy can be defined as "the substitution of one word for another with which it is associated". Thus, metonymy works by contiguity rather than similarity, which means that instead of the name of one object or notion we use the name of another because the two are associated or closely related.

According to Lyons (1995:314), body parts are favourite sources of metonymy, and many such expressions have been incorporated into the language, with words like "hand", "heart" and "head", as in "have a hand in", "bare one's heart", or "keep your head".

II.3.3 Other types of meaning transference

Besides metaphor and metonymy, there are other types of meaning transference, including hyperbole, litotes, irony, and euphemism.

II.4 Collocation

II.4.1 Definition of collocation

Different linguists define collocation in different ways. Moira Runcie, in the Oxford Collocations Dictionary, gives a general definition in which collocation is the way words combine in a language to produce natural-sounding speech and writing. To a native speaker, these combinations are highly predictable; to a learner they are anything but. More specifically, Chitra Fernando, Richards and others (1996:62) state that collocation refers to the restrictions on how words can be used together, for example which prepositions are used with particular verbs or which verbs and nouns are used together. In Kjellmer (1994:xiv & xxxiii), collocation is "such recurring sequences of items as are grammatically well formed". Kathleen R. McKeown and Dragomir R. Radev, in their paper on collocations, regard collocations as word pairs and phrases that are commonly used in language and to which no general syntactic or semantic rules apply. Additionally, many linguists have tried to define collocation by presenting its functions. Halliday (1966) and Sinclair (1966) introduced the notion that patterns of collocation can form the basis for a lexical analysis of language that is alternative to, and independent of, grammatical analysis. They regarded the two levels of analysis as complementary, with neither being subsumed by the other. In a similar vein, McIntosh (1961:328) and Mitchell (1971) presented the lexical and grammatical analyses as interdependent: "Collocations are to be studied within grammatical matrices which in turn depend for their recognition on the observation of collocation similarities" (Mitchell, 1971:65). Halliday (1966:151 & 157) further argued that the collocation patterns of lexical items can lead to generalization at the lexical level, and Sinclair (1966:412 & 1974:16) proposed that a lexical item can be defined from its collocation pattern.

II.4.2 Properties of collocation

II.4.2.1 Collocation is arbitrary

In the first place, collocation is typically characterized as arbitrary, which means that words are often combined with each other for no particular reason.

II.4.2.2 Collocation is language-specific

Secondly, collocation is language-specific, as its arbitrary nature persists across languages. As Larson (1984:141) points out, every language interprets the physical world in its own way. For instance, in French, the phrase "régler la circulation" is used to refer to a policeman who directs traffic, "direct traffic" being the English collocation. In Russian and German, the direct translation of "regulate" is used; only in English is "direct" used in place of "regulate". Similarly, American and British English exhibit differences in similar phrases. Thus, in American English one says "set the table" and "make a decision", whereas in British English the corresponding phrases are "lay the table" and "take a decision".

II.4.2.3 Collocation is recurrent in context

While the two properties mentioned above indicate difficulties in determining what is an acceptable collocation, on the positive side it is clear that collocations occur frequently in similar contexts. It is therefore possible to observe collocations in samples of language. Generally, collocations are those word pairs which occur frequently together in the same environment, but they do not include lexical items which simply have a high overall frequency in the language. This property has, in fact, been exploited by many researchers in natural language processing to identify collocations automatically.
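The recurrence property lends itself to a simple computational illustration. The following Python sketch is not the procedure used in this study; it assumes a toy token list and scores adjacent word pairs by pointwise mutual information (PMI), one association measure commonly used for this purpose. The function name pmi_bigrams, the min_count threshold, and the sample sentence are all illustrative assumptions.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), with probabilities
    estimated from raw counts. Pairs seen fewer than `min_count` times
    are skipped, since PMI is unreliable for rare events.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scores = {}
    for (left, right), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_left = unigrams[left] / n_tokens
        p_right = unigrams[right] / n_tokens
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    # Highest-scoring pairs first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example (invented text, not taken from the study's corpus)
sample = (
    "the debt crisis hit the euro zone and the debt crisis spread "
    "as the sovereign debt burden grew in the euro zone"
).split()

for pair, score in pmi_bigrams(sample):
    print(pair, round(score, 2))
```

On this toy input, the pairs "euro zone" and "debt crisis" score above the pairs that begin with "the", which mirrors the point made above: a pair that merely contains a very frequent word is not thereby a strong collocation.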

II.4.3 Classifications of collocation

By examining a large number of collocates of the same syntactic category, Kathleen R. McKeown and Dragomir R. Radev, in their paper on collocations, identify similarities and differences in their behavior. A distinction is made between grammatical collocations and semantic collocations. In their view, grammatical collocations often contain prepositions, including paired syntactic categories such as verb + preposition, adjective + preposition, and noun + preposition. In these cases, the open-class word is called the base and determines the words it can collocate with, the collocation indicator. Semantic collocations are lexically restricted word pairs, where only a subset of the synonyms of the collocation indicator can be used in the same lexical context.

In the Oxford Advanced Learner's Dictionary (Moira Runcie, 2002), collocation is classified in terms of both grammatical pattern and strength. Firstly, according to grammatical pattern, there are thirteen types of collocation, including: adjective + noun, quantifier + noun, verb + noun, noun + noun, preposition + noun, noun + preposition, adverb + verb, verb + verb, verb + preposition, verb + adjective, adverb + adjective, and adjective + preposition.

Secondly, according to strength, collocations are categorized into four types: unique collocations, strong collocations, medium-strength collocations, and weak collocations.

III RESEARCH METHODOLOGY

III.1 Data collecting instruments

III.1.1 Construction of Corpus

Since the study is primarily a corpus-based analysis of collocations, its findings come from a linguistic analysis of a substantial number of written articles. The corpus of the study is constructed from 15 articles extracted from four databases.

III.1.1.1 Database

The database in this thesis refers to the set of publications from which articles for analysis were extracted. It consists of the following publications: The New York Times, The Washington Post, The Guardian, CNNmoney.com, and Bloomberg.com. These publications were chosen as the database for the study because of their reliability, their reputation for well-known authors and prestigious presses, and their worldwide use in the world of economics.

III.1.1.2 Extracted business articles

As mentioned above, 15 articles were extracted from the sample publications with a view to identifying and then examining high-frequency keywords and their collocation patterns. The selected articles were all written about the European debt crisis in 2011, providing readers with up-to-date features and critical, systematic analysis of crisis-related aspects. The following table summarizes the corpus used in the study, including the databases and the extracted texts. A detailed reference for each selected text can be found in the Appendix.

Table 1: List of the selected articles

Database | Information of the articles (author, date of publication, title) | Average text length
Washington Post | Ezra Klein (8 May 2011) Everything You Need to Know about the European Debt Crisis in One Post | 1392 words
 | Louis Cooper (3 Aug 2011) Debt Crisis in Europe: Worries Grow of Spread to Larger Economies of Italy, Spain | 527 words
 | Alex Witt (05 Feb 2011) Debt Crisis Unsettles European Economy | 1024 words
Money.cnn | Ben Rooney (26 Nov 2011) Europe's Debt Crisis: Five Things You Need to Know | 1420 words
 | Ben Rooney (1 Feb 2011) Europe's Debt Crisis: Where |
Bloomberg | Simon Johnson (23 Jan 2011) Europe's Debt Crisis is Still Likely to End Badly |
 | Larry Elliot, Heather Stewards and John Hooper (9 Nov 2011) European Debt Crisis Spiraling Out of Control | 1383 words
 | (9 Aug 2011) Debt Crisis: A Default in Europe Could Benefit Poor Countries |
 | Donna Rogers (02 Dec 2011) An Overview of the European Debt Crisis | 806 words
 | Hannelore Foerster (26 Aug 2011) European Debt Crisis | 5748 words
Total corpus length | | 21,083 words

III.1.2 Concordance Program

Concordance Program is a computer program that is helpful to the corpus linguist. It is used to create word lists, count word frequencies, compare different usages of a word, analyze keywords, and find phrases and idioms. It is a general-purpose working tool for the study of texts, whether literary, linguistic, historical, philosophical, legal, commercial, political or of other kinds. In this study, Concordance Program 3.3 was used to search for high-frequency words and their collocations in the corpus of business articles.
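What a concordancer does can be sketched in a few lines of Python. The code below is only an approximation for illustration and does not reproduce the internals of Concordance Program 3.3; the function names tokenize, word_list and kwic, the tokenization rule, and the sample text are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Internal apostrophes and hyphens are kept, so "country's" and
    "full-blown" each count as one token.
    """
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())


def word_list(tokens):
    """Return (word, count, percent of all tokens) rows, most frequent first."""
    total = len(tokens)
    return [(w, c, 100 * c / total) for w, c in Counter(tokens).most_common()]


def kwic(tokens, node, width=4):
    """Yield keyword-in-context lines: `width` tokens on either side of each hit."""
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            yield f"{left:>30} [{tok}] {right}"


# Invented sample text, not taken from the study's corpus
text = "The debt crisis deepened. The sovereign debt crisis spread across the euro zone."
tokens = tokenize(text)

for word, count, percent in word_list(tokens)[:3]:
    print(f"{word:<10}{count:>3}{percent:>8.2f}%")

for line in kwic(tokens, "crisis", width=3):
    print(line)
```

The word list corresponds to the frequency tables reported in the results, and the keyword-in-context lines correspond to the concordance that is scanned for collocation patterns.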

III.2 Data collecting procedures

The research was conducted in the following steps. Firstly, articles written about the European debt crisis in 2011 were copied from the websites of the selected newspapers and journals and saved as plain text. Next, the dates, titles, and authors' names were deleted, so that only the article bodies were left for analysis. The corpus was then built from the completed plain-text file. Finally, Concordance Program 3.3 was used to investigate the constructed corpus, and results and findings were drawn from the resulting full concordance for analysis.

IV RESULTS AND DISCUSSION

The data of the study are interpreted in the following steps. To begin with, the corpus is analyzed using Concordance Program 3.3, available at www.concordancesoftware.co.uk. To obtain the quantitative results, the 100 words with the highest percentage of occurrence are listed in tables with their rank and relative frequency. Out of those 100 lexical items, the top 25 content words are selected, from which keywords are chosen for full analysis. A keyword is one which has unusually high, or low, frequency in comparison with a reference corpus (Berber Sardinha, 1999) and thus may characterize a text or a genre (Scott, 2009). Within this study, keywords are words that recur and can differentiate the business genre of the chosen articles. However, as this is a corpus-based study of collocations, frequency alone may not be adequate; some measure of collocation strength is also required. Thanks to the relatively small size of the corpus, a close reading of the texts could be undertaken both manually and by computer. Therefore, in the next step, the concordances of the keywords are scanned to gain an overview of their collocation patterns. On that basis, the target words for analysis are those with a wide and remarkable range of collocations.
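The selection of content words from the frequency list can be sketched as follows. The study does not state how function words were separated from content words, so the small stoplist FUNCTION_WORDS below is purely an assumption, as are the function name top_content_words and the toy input.

```python
from collections import Counter

# A deliberately tiny, hand-made list of function words (an assumption;
# a real study would use a fuller list or manual inspection).
FUNCTION_WORDS = {
    "the", "to", "of", "and", "a", "that", "for", "in", "is", "it",
    "on", "as", "with", "by", "are", "was", "be", "has", "have", "its",
}


def top_content_words(tokens, n_top=100, n_keep=25, function_words=FUNCTION_WORDS):
    """Take the n_top most frequent words, drop function words, keep n_keep.

    Each row is (rank in the full list, word, count, percent of all tokens).
    """
    total = len(tokens)
    ranked = Counter(tokens).most_common(n_top)
    rows = [
        (rank, word, count, round(100 * count / total, 3))
        for rank, (word, count) in enumerate(ranked, start=1)
        if word not in function_words
    ]
    return rows[:n_keep]


# Toy input (invented, not the study's corpus)
toy = "the debt of the euro zone and the debt crisis".split()
for row in top_content_words(toy):
    print(row)
```

Keeping the rank from the full list makes it easy to see where the first content word appears among the function words.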

Once the target keywords are identified, an in-depth investigation into the different collocates of each word is carried out. The collocations are examined with regard to every possible semantic and syntactic feature. For example, the various senses of a word in different collocations can be interpreted through careful definition of the phrases it occurs in, through comparison with words that convey the same meaning, or through example sentences or contexts supplied by the researcher.

IV.1 Quantitative Results

Research question 1: What are the top high-frequency words in the corpus of written business articles about the European debt crisis of 2011?

Table 2 below illustrates the frequencies of the first 100 words in the corpus of well over 20,000 words from the 15 selected written articles about the debt crisis in Europe in 2011.

Table 2: Top 100 high-frequency words from the constructed corpus

N Word Freq % N Word Freq %

As the table shows, and as in most written English texts, the most frequent items in the corpus are function (or grammatical) words such as "the", "to", "of", "and", "a", "that" and "for". From the 8th item in the word list, the key (or content) words that distinguish the business genre of the corpus start to appear. Among these are "debt", "European", "Greece", "crisis", "Euro", "countries", "financial", "bailout" and so on. Table 3 below shows the first 25 keywords from the high-frequency word list of the corpus.

Table 3: First 25 keywords from the corpus

N Word Freq % N Word Freq %

The top 25 keywords from the corpus, as shown in Table 3, are notable for the large number of geographical names for the zone and countries in which the debt crisis occurred, reflecting the fact that Greece, Italy, and Spain were among the worst-hit victims of the crisis.

IV.2 Collocation analysis of content keywords

Research question 2: What are the significant patterns of collocations of the content keywords?

IV.2.1 DEBT and CRISIS

DEBT and CRISIS are the most frequent content words of the business genre among all the words in the selected articles, with frequencies of 179 (0.852%) and 96 (0.457%) respectively. Both words take a wide range of collocates within the corpus. They are analyzed together in the same section because they frequently occur together throughout the articles and their collocates share a good number of common features.
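One way to see which words accompany a node word such as DEBT or CRISIS is to count the tokens in a small window around each occurrence. The sketch below is illustrative only: the function collocates, its window parameters, and the sample tokens are assumptions, and no part-of-speech filtering is attempted, so the counts include words of any class.

```python
from collections import Counter


def collocates(tokens, node, left=1, right=1):
    """Count words within `left` tokens before and `right` tokens after each `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return counts


# Invented sample, not taken from the study's corpus
toy_tokens = (
    "the debt crisis and the looming debt crisis followed the financial crisis"
).split()

# Words immediately before CRISIS (the position where attributive modifiers appear)
print(collocates(toy_tokens, "crisis", left=1, right=0).most_common())
```

Setting right=0 restricts the count to the word immediately before the node, which is where premodifying adjectives and nouns are found.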

At first glance, it is noteworthy that almost all of the adjectives shown in Table 4 below are used attributively within the corpus, coming before the noun CRISIS that they modify (the only exceptions are "continuing", "looming", "imminent" and "unshakable"). Semantically, these are general adjectives susceptible to objective measure, since they are used to describe the existence and development of the debt crisis. This appears to be a signal feature of the genre: objectivity and concision must be maintained when factual information is provided in business articles so that the issues are reported accurately.

Table 4: Adjectives collocating with CRISIS

financial immediate looming possible

ongoing underlying economic sovereign

mounting full-blown continuing growing

potential imminent slow-moving unshakable

full-on European

Adjectives in combination with DEBT, as shown in Table 5, on the other hand, are remarkable for the predominance of words indicating the "debt owner", such as "European", "Greek", "Italian", "Irish", or "nation's" and "country's" (nouns in the possessive case functioning as adjectives). This tells readers which countries suffered the most in the stories told.

Table 5: Adjectives collocating with DEBT

existing sustainable nation's country's

Spanish Jamaica's French

Both groups of adjectival collocations are also distinctive for a significant number of -ing adjectives derived from verbs describing trends, which indicate the status of the debt crisis at the time of writing. Examples are "continuing", "looming", "rising", "growing", "ongoing" and so on.

While most of the adjectives in Table 4 and Table 5 are widely used, some of them deserve attention when in collocation with CRISIS and DEBT, such as "immediate", "unshakable", or "bad", since they may convey meanings that cause confusion among learners.

A look at the nominal collocations of CRISIS in Table 8 and DEBT in Table 9 reveals various compound nouns containing these words in the corpus, including "debt crisis", "future crisis", "crisis management", "crisis victims", "debt burden", "debt payment", "housing debt", "public debt" and so on.
