Master's thesis, VNU ULIS: Achieve, attain and accomplish from a corpus-based perspective


VIETNAM NATIONAL UNIVERSITY, HANOI

UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES

FACULTY OF POSTGRADUATE STUDIES

LÊ THỊ THU HỒNG

“ACHIEVE”, “ATTAIN” AND “ACCOMPLISH”

FROM A CORPUS-BASED PERSPECTIVE

“Achieve”, “attain” và “accomplish”

dưới góc nhìn của phương pháp khối liệu

M.A. MINOR PROGRAM THESIS

Field: English Linguistics

Code: 60220201

HANOI, 2017


Supervisor: Dr Trần Thị Thu Hiền


DECLARATION OF AUTHORSHIP

I hereby declare that the thesis entitled “ACHIEVE, ATTAIN AND ACCOMPLISH FROM A CORPUS-BASED PERSPECTIVE” is the result of my own study. It was conducted under the academic guidance of Dr Trần Thị Thu Hiền. The data and conclusions of the study presented in the thesis have never been published in any form.

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my supervisor, Dr Trần Thị Thu Hiền, for her immense support and invaluable guidance, without which my study would be far from finished. I am also grateful to all the lecturers and staff at the Faculty of Post-graduate Studies, University of Languages and International Studies, Vietnam National University, Hanoi. Their support and consideration have enabled me to pursue the course. Last but not least, my sincere thanks go to my beloved family for their love, encouragement and support while I was conducting this research.

ABSTRACT

This descriptive research uses corpus linguistic methods to differentiate the three synonymous verbs achieve, attain and accomplish by identifying their similarities and differences in meaning and usage. The data collection instruments comprise two large corpora, namely the Corpus of Contemporary American English and the Collins Wordbanks Online, and six dictionaries. Statistical analysis combined with intuition-based interpretation of the data reveals three main findings: (1) the three verbs have both overlapping and exclusive senses, whose frequencies differ across the verbs; (2) regarding register, all three verbs are most frequent in academic journals, even though accomplish is less formal than the other two; and (3) in terms of collocational properties, despite a few shared collocates, each verb tends to co-occur with a distinctive group of object nouns.

Keywords: near-synonym, corpus linguistics, word sense, collocation

TABLE OF CONTENTS

DECLARATION OF AUTHORSHIP

ACKNOWLEDGEMENTS

ABSTRACT

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1 INTRODUCTION

1.1 Rationale

1.2 Aim and objectives of the study

1.3 Research questions

1.4 Research methods

1.5 Scope of the study

1.6 Significance of the research

1.7 Organization of the study

CHAPTER 2 LITERATURE REVIEW

2.1 Synonymy

2.1.1 Synonymy as absolute synonymy

2.1.2 Synonymy as near-synonymy

2.1.3 Near-synonymic differences

2.2 Corpus linguistics

2.2.1 Corpus

2.2.2 Corpus linguistics

2.2.3 Corpus linguistics in synonymy study

2.3 Previous studies

CHAPTER 3 METHODOLOGY

3.1 Research approaches

3.2 Data sources

3.3 Data collection procedure

3.3.1 Phase 1 - Word senses and frequencies of senses

3.3.2 Phase 2 - Register

3.3.3 Phase 3 - Collocational properties

CHAPTER 4 FINDINGS AND DISCUSSION

4.1 Word senses and frequencies of senses

4.1.1 Word senses

4.1.2 Frequencies of senses

4.2 Register

4.3 Collocational properties

4.3.1 Preferred collocation

4.3.2 Less preferred and anti-collocation

CHAPTER 5 CONCLUSION

5.1 Concluding remarks

5.2 Implications

5.3 Limitations of the study and recommendations for further research

REFERENCES

APPENDIX

LIST OF TABLES

Table 2.1 Dimensions of denotation variations

Table 4.1 Dictionary senses of achieve, attain and accomplish

Table 4.2 Sense distribution of achieve, attain and accomplish

Table 4.3 Frequencies of achieve, attain and accomplish in different genres

Table 4.4 Top mutual collocates of achieve, attain and accomplish

Table 4.5 Top object collocates of achieve only, attain only and accomplish only

LIST OF FIGURES

Figure 2.1 Classification of synonymic difference by Edmonds (1999) and Edmonds and Hirst (2002)

Figure 2.2 Gove’s (1973) entry (abridged) for the near-synonyms of lie

Figure 3.1 Corpus command for frequencies on the COCA (screenshot)

Figure 3.2 Command for collocation in the CWO (screenshot)

Figure 4.1 The proportion of tokens in different genres for achieve, attain and accomplish

Figure 4.2 Sketch difference of objects between achieve and attain

Figure 4.4 Sketch difference of objects between attain and accomplish

Figure 4.5 Summary of preferred, less preferred and anti-collocates of achieve, attain and accomplish

CHAPTER 1 INTRODUCTION

1.1 Rationale

More than a linguistic instrument, English, as a world language, broadens its learners’ horizons and exposes them to various perspectives. Hence, the teaching of English as a second or foreign language has never ceased to be vital. Indeed, it is hardly surprising that English as a school subject accounts for more teaching hours in classrooms all over the world than any other subject. Vietnam, in the process of renovating its education system and particularly its English language teaching, has placed emphasis on developing teachers’ and learners’ proficiency in the language.

As a Vietnamese learner and teacher of English, the author has observed the difficulties non-native speakers face in understanding and using vocabulary correctly in different contexts. This challenge becomes even greater when it comes to choosing among confusable synonyms. Among the various groups of such synonyms, achieve, attain and accomplish appear to be among the most challenging for English learners, including the author. Are they completely the same in meaning? If not, how are they different? In which respects do they resemble or differ from each other? Motivated by the desire to better understand this issue, the author sets out to investigate the meanings and usages of these often-misused synonyms, achieve, attain and accomplish.

Since the arrival of information technology and the development of computers, corpora have evolved into enormous electronic collections of authentic texts which provide invaluable insights into the distribution of words in a language. They can help language researchers as well as language users, especially non-native ones, to differentiate between near-synonyms on the basis of distribution patterns retrieved automatically from corpora. This encourages the author to exploit this promising tool to examine the near-synonyms mentioned above.

1.2 Aim and objectives of the study

The study aims to distinguish the three synonyms achieve, attain and accomplish from a corpus-based perspective, as an illustrative example of one way to identify the nuances of meaning and usage between synonymous words.

It is the author’s assumption that the corpus linguistic approach applied in the study can show that achieve, attain and accomplish are near-synonyms which have overlapping senses as well as distinct shades of meaning. They may also be similar or different in terms of usage, more specifically in their genre preferences and collocational behavior.

Given this aim and assumption, the objectives of the study are to (1) identify the overlapping and exclusive senses of each synonym and the frequencies of these senses, (2) determine the genre preferences of each verb, and (3) describe and compare the collocational properties of the target verbs.

1.3 Research questions

In order to fulfil the above objectives, the research questions are formulated as follows:

(1) What are the similarities and differences in the word senses and frequencies of senses of achieve, attain and accomplish?

(2) What are the similarities and differences in the register of achieve, attain and accomplish?

(3) What are the similarities and differences in the collocational properties of achieve, attain and accomplish?

1.4 Research methods

The study relied on a corpus-based approach, with data collected from two large corpora (the Corpus of Contemporary American English and the Collins Wordbanks Online) and six dictionaries. Dictionary consultation revealed similarities and differences in the senses of achieve, attain and accomplish, while an analysis of concordances from the corpora showed the frequencies of these senses. The two corpora were then used to extract data on the verbs’ register and, finally, to compare their collocational patterns. The results were then analyzed and interpreted by the author.

1.5 Scope of the study

This study does not aim to be an exhaustive account of all aspects of synonymy. Rather, it covers only some of the most practical patterns of usage, including word senses and sense frequencies, style or register, and collocational behavior. Also, given the space available, the investigation is limited to three specific verbs, achieve, attain and accomplish, and not to any other member of their synonym cluster. Similarly, variation among different varieties of English is not considered, for that would complicate the comparison. In terms of collocational properties, owing to the same space restriction, only collocating objects are analyzed.

1.6 Significance of the research

This research is significant for both English linguistics research and English teaching and learning in Vietnam, for two reasons. Firstly, while the number of studies applying corpora in linguistics in general and lexical semantics in particular is soaring worldwide, it is still very limited in Vietnam. In fact, to the best of the author’s knowledge, very little research on similar topics has been done in Vietnam, which underlines the need for this study. Secondly, this research and its results would, in the author’s opinion, greatly facilitate English teaching and learning for EFL teachers and learners. Teachers of English are often asked questions about synonyms such as “How are these words different?”, “Are they the same in every situation?” or “Can they substitute for each other?”. More often than not, they may have no better answer than “It is just the way it is”, which hardly helps students’ language ability. This research, especially its methodology and results, illustrates a promising tool and method for synonym differentiation for EFL teachers and students.

1.7 Organization of the study

The study consists of five chapters. Chapter One gives a brief overview of the study, including the rationale, aim and objectives, research questions, research methods, scope of the study and significance of the research. Chapter Two presents the literature review on corpus linguistics and synonymy study, with emphasis on the framework for this research. Chapter Three outlines the research methodology, with a detailed description of the research approach, data collection instruments and data collection procedure. Chapter Four presents the findings from dictionary and corpus data and their analysis. Finally, Chapter Five sums up and interprets the findings described in the previous chapter, pointing out the relevance of this research to the teaching and learning of English in general and of the lexicon in particular; limitations of the study and recommendations for further research are also given in this concluding chapter.

CHAPTER 2 LITERATURE REVIEW

This chapter provides an overview of synonymy and synonymic differences, along with the literature on corpus-based approaches to the study of synonymy.

2.1 Synonymy

[…] to the many different approaches to defining synonymy, which will be reviewed in this section.

2.1.1 Synonymy as absolute synonymy

Some linguists, such as Lyons (1977), have treated synonyms as absolute synonyms, that is, words which are interchangeable in all possible contexts without any change in meaning. This, however, has been challenged by Quine (1951), on the grounds that it is impossible to determine whether the expressions before and after substitution have the same meaning. From a different angle, Goodman (1952) claims that no two words can have the same meaning, for there would always be some contexts in which two putative synonyms are not completely interchangeable.

Even if absolute synonyms are arguably possible, pragmatic and empirical evidence shows that they are very rare. Clark (1992), in her Principle of Contrast, argues that language constantly changes to eliminate absolute synonyms: an absolute synonym either takes on new nuances of meaning or falls into disuse.

2.1.2 Synonymy as near-synonymy

It is widely agreed that absolute synonyms are virtually non-existent. However, there are near-absolute synonyms which can substitute for each other in context with only minor differences in the overall expression. Lexicographers clearly acknowledge that synonymy is a matter of degree, given that every dictionary of synonyms in fact differentiates between near-synonyms. Synonymy is defined in terms of similarity in meaning, although how similar in meaning words must be is still a matter of debate. Traditionally, synonyms are defined as closely related words that differ in minor ways, but a broader definition includes words with merely one or more related characteristics of meaning (Egan, 1973). To be specific, Roget applied the principle of “the grouping of words according to ideas” (Chapman, 1992), while the lexicographers of Webster’s New Dictionary of Synonyms used the following more precise definition (as cited in Edmonds, 1999):

A synonym, in this dictionary, will always mean one of two or more words in the English language which have the same or very nearly the same essential meaning […] the two or more words which are synonyms can be defined in the same terms up to a certain point.

(Gove, 1973)

Ultimately, how open the definition of synonym is in each dictionary depends on its purpose: Roget’s Thesaurus is likely to be better for word searching, whereas Webster’s New Dictionary of Synonyms appears to be superior for word discrimination. Since this study aims to find the differences among near-synonyms, the latter serves as one effective tool for data collection.

Like lexicographers, semanticists appear to agree that synonymy is a matter of degree. Ullmann (1962) defined near-synonyms as having similar “objective” meaning but possibly different emotive, stylistic or dialectal meaning. Lyons (1995) argued that near-synonyms are “more or less similar, but not identical in meaning”. He also drew a distinction between near-synonyms and partial synonyms, though it is not clear why. Partial synonyms fail to qualify as absolute synonyms because either they are not “complete”, i.e. not identical “on all dimensions of meaning”, or they are not “total”, i.e. not “synonymous in all contexts” (Lyons, 1981). For example, big and large are partial synonyms because, despite being complete synonyms, they are not total synonyms: a big mistake is fine whereas a large mistake is unacceptable.

Giving a more precise definition of synonymy, Cruse (1986) differentiated two kinds of near-synonymy, which roughly correspond to Lyons’ classification. One is cognitive synonyms, which are words with the same truth conditions but different expressive meaning, style or register, such as fiddle and violin. The other is plesionyms, which are lexical items that do not have exactly the same truth conditions but still yield semantically similar expressions, for instance foggy and misty.

Unfortunately, this distinction seems impractical for determining synonym differences, for it covers only propositional meaning, one among the many types of synonym variation. Moreover, having two definitions of synonyms would only complicate the categorization.

In order to solve the problem, Edmonds (1999) introduced the concept of granularity into the definition of synonymy, that is, the level of detail used to describe or represent the meanings of words. Because it makes it possible to mark the difference between the essential and peripheral meanings of a word, this concept helps construct a more rigorous definition of synonymy. However, it is still difficult to set a benchmark for the appropriate level of granularity in the representation of word meaning so as to define near-synonyms precisely. In an attempt at a rigorous definition, Edmonds proposed that:

Near-synonyms preserve truth conditions, or propositional meaning, to a level of granularity of representation consistent with language independence in most contexts when interchanged.

In the same vein, DiMarco, Hirst and Stede (1993) (as cited in Edmonds and Hirst, 2002) claimed that near-synonyms are words that are close, but not identical, in meaning. They “vary in their shades of denotation, connotation, implicature, emphasis or register”. Similarly, Inkpen and Hirst (2006) emphasized that near-synonyms are not completely interchangeable but differ in denotational or connotational meaning; they may also vary in grammatical or collocational behavior.

Overall, these notions can hardly settle the debate on synonymy, but they provide theoretical implications for lexical semantics.

2.1.3 Near-synonymic differences

As presented in the previous section, it is generally agreed that absolute synonyms virtually do not exist. Synonymy is widely understood as near-synonymy, for which examples can easily be found. Thin, slim and skinny all describe body shape; however, while thin carries a neutral tone, slim and skinny convey a positive and a negative evaluation by the speaker respectively. Similarly, pissed, drunk and inebriated are respectively informal, neutral and formal expressions of the same denotation, namely being affected by alcohol to the extent of losing control of one’s faculties or behavior, according to the Cambridge English Dictionary.

In any discussion of near-synonyms, the most discussed concept is synonymic difference (Edmonds and Hirst, 2002), for there must be some distinctions between two putative synonyms that make them non-identical. As illustrated by the examples above, near-synonyms can differ not only in denotational meaning but also in other aspects of their meaning. Understanding synonym differences is crucial in language use, especially for EFL learners, who usually lack native-speaker intuition in word selection.

There are multiple ways in which synonyms can differ. Cruse (1986) lists four broad types of difference in synonymic meaning:

- denotational or propositional meaning
- stylistic meaning (dialect and register)
- expressive meaning (affect, emotion and attitude), and
- presupposed meaning (selectional and collocational variations)

DiMarco, Hirst and Stede (1993) investigated synonyms in terms of semantic and stylistic distinctions, i.e. denotational and connotational differences. However, this categorization does not seem precise enough. Denotation refers to the literal, explicit meaning of a word, while connotation covers any other aspect that is not denotation. This makes the term connotation too broad and ambiguous to serve as a criterion for distinguishing synonyms.

With a classification somewhat similar to Cruse’s, Gove (1973) argues that synonyms may have distinctions in:

- implications
- connotations, and
- applications

Gove’s criteria include both propositional and peripheral meaning; however, it is unclear why he did not include stylistic difference in the categorization despite his extensive discussion of the matter. All of the above classifications are combined by Edmonds (1999) and Edmonds and Hirst (2002) to develop a categorization of synonymic differences with more sub-classes. The categorization also includes four main types of variation, which are illustrated in Figure 2.1.

Denotational variation among near-synonyms has proved to be the most complicated to sort out. It involves differences not only in simple features but in “full-fledged concepts or ideas”, relating to roles and aspects of a context. According to Edmonds (1999), many of the concepts or ideas in which near-synonyms differ can be treated as dimensions of variation, such as continuous or binary dimensions, different phases of a process, reference to world knowledge, etc. See Table 2.1 for examples of synonyms with different dimensions of variation. Within the limited scope of the study, the author does not go into such a detailed classification but simply treats the different dimensions as denotational variations, or nuances in word senses.

Among variations in manner of expression, the aspect of denotational variation most relevant to this study is synonymic difference in frequency of sense. This is how often a synonym expresses a specific sense in real language use, which dictionaries usually indicate with frequency terms such as always, often, usually, etc. However, such frequency terms cannot adequately specify how similar or different the frequencies of expression are between two synonyms. Take Gove’s (1973) entry for the near-synonyms of lie (shown in Figure 2.2) as an example. The frequency terms in the entry, “usually”, “often” and “sometimes”, give only a very vague idea of the words’ frequencies of senses, i.e. dictionary users can hardly determine which sense of which word is more prominent than the others.
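The limitation of vague frequency terms can be made concrete with a small calculation. The sketch below, in Python, assumes that a sample of concordance lines for each verb has already been labelled by hand with one sense per line; the sense labels and counts are invented for illustration and are not the data of this study.

```python
from collections import Counter


def sense_distribution(labels):
    """Return each sense's share (in percent) of the labelled concordance lines."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sense: round(100 * n / total, 1) for sense, n in counts.items()}


# Hypothetical annotations: one sense label per sampled concordance line.
sample = {
    "achieve": ["reach a goal"] * 8 + ["be successful"] * 2,
    "attain": ["reach a goal"] * 6 + ["reach a level"] * 4,
}

for verb, labels in sample.items():
    print(verb, sense_distribution(labels))
```

Percentages of this kind allow a direct comparison between verbs, which terms such as "often" or "sometimes" do not.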

Figure 2.1 Classification of synonymic difference by Edmonds (1999) and Edmonds and Hirst (2002)

- DENOTATIONAL VARIATION
  - fine-grained technical variations
  - abstract denotational variations
  - manner of expression: indirectness, emphasis, frequency of sense
  - dimensions of denotational variation: basic dimension, complex dimension, specificity, fuzzy and overlapping words
- STYLISTIC VARIATION: dialect, register
- EXPRESSIVE VARIATION: emotive aspects, attitudinal aspects
- STRUCTURAL VARIATION: collocational aspects, syntactic aspects

Dimension | Examples
Continuous dimension | mistake: error: blunder (severity); absorb: digest: assimilate (slowness)
Binary dimension | escort: accompany (protection); abandon: forsake (renunciation); blunder: mistake (stupidity)
Multi-value dimension | order: command (authority); accompany: attend (status); lie: misrepresentation (contradiction); adore: worship: idolize (admiration)
Complex (process) | begin: start: initiate
Specificity | eat: consume: devour: dine: gobble; act: work: operate: function; forbid: ban: outlaw; accuse: charge: incriminate: indict: impeach
Extensional overlap | error: mistake: blunder; review: article; brainy: cunning; high: tall
Fuzzy overlap | mistake: error: blunder; forest: woods; marsh: swamps: fen: morass; amicable: neighborly: friendly

*The first term is the most general term

Table 2.1 Dimensions of denotation variations

Lie is usually felt to be a term of extreme opprobrium because it implies a flat and unquestioned contradiction of the truth and deliberate intent to deceive or mislead.

Falsehood may be both less censorious than lie and wider in its range of application… Like lie, the term implies known nonconformity to the truth, but unlike lie, it does not invariably suggest a desire to pass off as true something known to be untrue.

Untruth is often euphemistic for lie or falsehood and may carry similar derogatory implications… Sometimes, however, untruth may apply to an untrue statement made as a result of ignorance or a misconception of the truth.

Fib is an informal or childish term for a trivial falsehood; it is often applied to one told to save one’s own or another’s face.

Misrepresentation applies to a misleading and usually an intentionally or deliberately misleading statement which gives an impression that is contrary to the truth.

Figure 2.2 Gove’s (1973) entry (abridged) for the near-synonyms of lie

This is where corpus linguistic methodology proves promising, for it provides statistical data which is likely to ease the process of calculating frequencies of expression. This will be discussed in more detail in Chapter 3.

Another notable point in synonym differentiation is stylistic variation, which includes dialect and stylistic tone, or register. While dialectal differences relate closely to language users, register variation is more associated with the setting in which a text occurs, which makes it feasible to compare on the basis of corpus data. These dimensions of register are absolute and can be compared on the same finite scale of dimensions, with a range of possible values from low to high. For example, pow wow appears in informal contexts, meeting in neutral ones and assembly in more formal ones.
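Because the sections of a corpus differ in size, comparing register preference on corpus data usually relies on normalized rather than raw frequencies. The following is a minimal sketch in Python, assuming that raw token counts and section sizes are available; the genre names, counts and sizes are hypothetical placeholders, not figures from either corpus used in this study.

```python
def per_million(raw_count, section_size):
    """Normalize a raw frequency to occurrences per million words."""
    return raw_count * 1_000_000 / section_size


# Hypothetical (raw count of the verb, total words in the section) per genre.
genres = {
    "spoken": (1_200, 100_000_000),
    "academic": (4_500, 90_000_000),
}

for genre, (count, size) in genres.items():
    print(f"{genre}: {per_million(count, size):.1f} per million words")
```

Normalizing in this way keeps a large section from appearing to favor a word simply because it contains more text.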

The distinction between synonyms also involves expressive variation, which consists of two main categories of differences. One concerns the speaker’s emotions, and the other the speaker’s attitude or judgement toward the referent. However, this cannot be judged from the corpus data and is therefore not studied within the scope of this research.

Finally, near-synonyms may differ in their structural patterns, i.e. collocational and syntactic behavior. In terms of syntactic behavior, near-synonyms can differ in their grammatical patterns. For instance, John teaches tricks to the dog is acceptable while *John instructs tricks to the dog is not. Collocational variation, on the other hand, concerns the words which can combine with the word in question: for example, make a cake but not *do a cake.

This notion of collocational variation overlaps with the co-occurrence approach, which is based on the assumption that the semantic and functional traits of a lexical item can be shown through its distributional characteristics. This assumption can be traced back to Firth’s famous saying in 1957, “you shall know a word by the company it keeps”. Similarly, Bolinger (1968) claimed that a difference in syntactic form always indicates a difference in meaning. Harris (1970) supported this assumption when asserting overtly:

If we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference of meaning correlates with difference of distribution.

Cruse (1986) also stated that “the semantic properties of a lexical item are fully reflected in appropriate aspects of the relations it contracts with actual and potential contexts”.
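Harris’s claim that difference of meaning correlates with difference of distribution can be illustrated numerically. The sketch below, in Python, compares hypothetical collocate-frequency profiles using cosine similarity; this measure is chosen here only for illustration and is not a procedure described in the sources above, and all counts are invented.

```python
import math
from collections import Counter


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity of two collocate-frequency profiles (0 = no shared collocates)."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[w] * profile_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical object-collocate counts for three words A, B and C.
a = Counter({"goal": 50, "success": 30, "result": 20})
b = Counter({"task": 40, "mission": 35, "goal": 5})
c = Counter({"goal": 45, "success": 20, "level": 25})

print("A vs B:", round(cosine_similarity(a, b), 2))
print("A vs C:", round(cosine_similarity(a, c), 2))
```

In this toy data the profile of A is closer to that of C than to that of B, which on the distributional assumption would suggest that A and C are closer in meaning.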

This theory has been the underlying logic for a great many studies on synonymy in which collocational and/or syntactic distribution is exploited. Among these studies are Church et al. (1998) on strong and powerful, Partington (1998) on absolutely, completely and entirely, and Biber et al. (1998) on big, large and great. In these studies, differences in the collocational properties of the words under study indicate differences in their meanings. To be specific, these meaning differences were interpreted from the distribution of formal elements in the words’ contexts provided by the corpora.

Following Pearce (2001), this study looks at collocations as a set of three related subclasses:

- preferred collocation (words which are collocates of the target word),
- less-preferred collocation (words which tend not to be used with the target word although, if used, they do not lead to unnatural readings), and
- anti-collocation (words which must not be used with the target word since they would lead to unnatural readings).

This classification by Pearce, in the view of the author of this study, offers a clearer path into the investigation of near-synonyms’ collocations in the light of the co-occurrence approach.
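Pearce’s three classes are defined qualitatively, so any automatic sorting can only be a first pass. The sketch below, in Python, is one possible way to operationalize them and is not Pearce’s own procedure: it assumes pointwise mutual information as the association measure, and the cut-offs `min_mi` and `min_freq` as well as all counts are hypothetical. A zero count only flags a candidate anti-collocation, which still has to be confirmed by intuition or native-speaker judgement.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size):
    """Pointwise mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(pair_freq * corpus_size / (node_freq * collocate_freq))


def classify_collocate(pair_freq, node_freq, collocate_freq, corpus_size,
                       min_mi=3.0, min_freq=5):
    """Sort a candidate collocate into one of three working classes.

    The cut-offs are illustrative assumptions, not values taken from Pearce (2001).
    """
    if pair_freq == 0:
        return "anti-collocation candidate"
    mi = mutual_information(pair_freq, node_freq, collocate_freq, corpus_size)
    if pair_freq >= min_freq and mi >= min_mi:
        return "preferred"
    return "less preferred"


# Hypothetical counts in a one-million-word corpus.
print(classify_collocate(50, 100, 100, 1_000_000))
print(classify_collocate(2, 100, 100, 1_000_000))
print(classify_collocate(0, 100, 100, 1_000_000))
```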

Overall, the four main types of synonym variation are denotational, stylistic, expressive and structural. This categorization is applied in the analysis to identify meaning distinctions among the three verbs achieve, attain and accomplish.

In summary, in order to find the similarities and differences among the three verbs achieve, attain and accomplish, the study is based on Edmonds’ classification of synonymic differences, with a focus on three aspects, namely denotational (word senses and frequencies of senses), stylistic and collocational variation.

2.2 Corpus linguistics

2.2.1 Corpus

First and foremost, it is necessary to define corpora, which have been described in numerous ways during their decades of development. One definition is by Kennedy (1998), who identifies a corpus as “a collection of texts in an electronic database”. This definition seems to overlook one of the most important characteristics of corpora, namely that they are designed to be representative and balanced with respect to a language (Gries, 2009). This means that a corpus should reflect all the different linguistic varieties in the proportions in which they occur in the language. Even though such a theoretically ideal corpus design is a challenge not yet overcome by corpus compilers, corpora are anything but random collections of texts. Leech (1992) defines corpora more strictly as “generally assembled with particular purposes in mind, and are often assembled to be representative of some language or text type”.

However, this definition does not seem strict enough, for it misses one criterion for a text to qualify for inclusion in a corpus: all texts that form a corpus must have occurred in natural communicative settings, not have been produced for the sole purpose of being gathered into a corpus. Covering all these criteria, McEnery, Xiao and Tono (2006) propose a more satisfactory definition: a corpus is “a collection of machine-readable authentic texts which is sampled to be representative of a particular language or language variety”. It is important to note that, unlike their paper-based predecessors, modern electronic corpora in combination with corpus software have immense advantages in language study, such as easy manipulation of data at minimal cost, accurate data processing and reduced human bias. Crystal (1985) adds that this collection of data can be used “as a starting point of linguistic description or as a means of verifying hypotheses about a language”.

Another point to cover is the various types of corpora. First, there are general corpora, which depict a language as a whole, and specific corpora, which represent only a particular variety of language. Second, diachronic corpora and synchronic corpora differ in terms of their time span: the former cover changes over time while the latter provide language data at one specific point in time. Another distinction is between monolingual and parallel corpora, which provide texts in one language and in multiple languages respectively. Finally, corpora differ in whether they are fixed in size. A corpus which stays the same once created is static, while one which is constantly extended with updated data is dynamic.

2.2.2 Corpus linguistics

It is still a matter of debate whether corpus linguistics is a branch of linguistics or a methodology. On the one hand, it is said that corpus linguistics has become an independent ‘philosophical approach’ (Leech, 1992); on the other hand, it is considered a methodology that is not restricted to a particular aspect of language (McEnery et al., 2006). It treats ‘naturally occurring’ language as a credible source for the investigation and classification of linguistic structures. Similarly, Hanks (2008) states that corpus linguistics is primarily concerned with interpreting observed language in order to arrive at statements on patterns in word meaning or syntactic composition. Gries (2009) lists a number of areas in which corpus linguistics aids investigation:

- Phonology: how far the degree of phonological assimilation or reduction can be predicted from the frequency of co-occurrence of its components, as in Bybee and Scheibman (1999)

- Morphology: what regular and irregular verb forms suggest about the probabilistic nature of the linguistic system, as in Baayen and Martin (2005)

- Syntax: how to predict which syntactic choice speakers will make, as in Leech et al. (1994)

- Semantics and pragmatics: how near-synonyms differ from each other, as in Okada (1999), Oh (2000), Gast (2006) and Gries and David (2006)

2.2.3 Corpus linguistics in synonymy study

In order to determine the similarities and differences between synonyms, one can consult a number of sources (Edmonds, 1999). The first is one’s own intuition; however, this could be too biased to produce reliable distinctions between synonyms. Secondly, it can be helpful to consult dictionaries, the much less biased work of generations of lexicographers. Although dictionary definitions and usage notes serve as a decent source of data for synonym comparison, this source alone is not detailed enough. A more fruitful source for analysis is raw text corpora. As presented earlier, corpora, with their powerful computer databases and language analysis tools, enable researchers to judge word behavior in a myriad of authentic contexts. This opens the door to inferring the meanings of words from their repeated syntactic and collocational patterns. Therefore, it is reasonable to use corpora as a source of data for investigating synonyms. This is advocated by Church et al. (1994), who claim that collocational and constructional similarity collected from corpora can be used to investigate semantic relations such as synonymy and antonymy. A review of past studies on synonymy in light of the corpus linguistic approach can be found in Section 2.3.

2.3 Previous studies

In Vietnam, linguistic research using corpora is still limited in number, and much of it concerns languages other than English. For example, Dao (2011) emphasized the importance of corpus linguistics and corpus technology in teaching and learning Vietnamese as a foreign language, and Nguyen (2016) studied modes of speech sound variation in Vietnamese by using a Sino-Vietnamese corpus of yuanyun. Studies concerning English language phenomena seem scarce; one that can be found is by Luu (2016), who conducted a critical discourse analysis of power relations in the New York Times’ reconstruction of global climate change conferences. Despite considerable effort to find previous studies in Vietnam on synonymy in light of the corpus approach, the author has not been able to find one.

On the international scale, research on synonymy and corpus linguistics is much easier to find. In fact, the number of studies on near-synonymy has surged during the last few decades along with the arrival of electronic corpora and computerized language tools. One of the most well-known is Kennedy’s (1991) study of between and through in terms of their collocational and semantic behaviors. Other studies include the examination of collocation and prosody of sheer, pure and absolute by Partington (1998), Tognini-Bonelli’s (2001) studies of largely and broadly or tall and large, Yang’s (2016) discrimination of learn and acquire, Phoocharoensil’s (2010) study of ask, beg, plead, request and appeal, and Mildred’s (2016) study of get, fetch and receive. Johns (1991) emphasizes the pedagogical benefits of concordances for differentiating near-synonyms such as persuade and convince. However, it can be safely claimed that no study has focused on the three verbs achieve, attain and accomplish, which makes this piece of research relevant and hopefully able, to some extent, to fill the gap in corpus study as well as synonymy study.


However, intuition can be affected by one’s dialect or sociolect, so that an utterance that is natural to one speaker may sound unacceptable to another (McEnery et al., 2006). Moreover, even when one’s intuition is correct, it may not represent real language use. The corpus-based approach, in contrast, facilitates finding distinctions that intuition alone can hardly detect. More importantly, these results rest on reliable quantitative data thanks to sophisticated statistical measurements. Nonetheless, this is not to negate the role of intuition in language study. Indeed, intuition is extremely important when interpreting corpus evidence. McEnery et al. (2006) claim that “the key to corpus data is to find the balance between the use of corpus data and the use of one’s intuition”. Therefore, exploiting corpus data in combination with intuitive analysis is a rational choice for this study.

These are two of the largest corpora available (520 million and 460 million words, respectively), with an annual addition of approximately 20 million words. Unlike static corpora, i.e., corpora that stay unchanged once they are created, such as the Brown Corpus or the Frown Corpus, these constantly expanding corpora allowed the author to retrieve the most recent data. Another advantage of these corpora is their ease of access. The COCA is available free of charge to all kinds of researchers, allowing 50-200 queries a day depending on researcher status. Having been granted the highest researcher status in the COCA as a university lecturer, the author found this corpus highly accommodating as a data collection instrument for the research. The CWO, on the other hand, has more limited access (it is only freely available through a one-month trial subscription); however, this was still adequate for the author to collect data for the research.

Apart from the aforementioned advantages, the two corpora have strengths and weaknesses that make them complementary to each other. Firstly, according to Davies (2010), the COCA maintains its genre balance from year to year. In other words, the number of texts from each category in the COCA is roughly equal throughout its 25-year span from 1990 to 2015. In fact, each category contains more than 100 million words, with the largest being popular magazines (110 million words) and the smallest being academic journals (103 million words). Spoken, Newspaper and Fiction account for 109 million, 106 million and 105 million words, respectively. This could ensure that the corpus depicts “linguistic changes in the real world” (Davies, 2010), which cannot be achieved with the CWO due to its genre imbalance. However, the CWO is superior to the COCA in terms of its statistical measures. There are two common measures of collocational strength, Mutual Information (MI) and the t-score. The former has been found to be useful in measuring similarity, while the latter is more effective in measuring difference (Church et al., 1994). Whereas the COCA relies on MI as its sole measure of collocational strength, the CWO uses a combination of MI and t-score, which would produce more reliable results on the collocational properties of the investigated words.
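The two measures can be illustrated with a minimal Python sketch. It assumes the commonly cited textbook formulas, MI = log2(O / E) and t = (O - E) / sqrt(O), where O is the observed co-occurrence frequency of a node word and a collocate and E is the frequency expected by chance. The exact formulas implemented by the COCA and the CWO are not stated in this chapter, so the functions below are an approximation, and every count in the usage lines is invented for illustration only.

```python
import math


def expected_cooccurrence(f_node, f_collocate, corpus_size):
    """Co-occurrence frequency expected by chance (collocation span ignored)."""
    return f_node * f_collocate / corpus_size


def mutual_information(observed, f_node, f_collocate, corpus_size):
    """MI score: log2 of observed over expected co-occurrence."""
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size)
    return math.log2(observed / expected)


def t_score(observed, f_node, f_collocate, corpus_size):
    """t-score: difference between observed and expected, scaled by sqrt(observed)."""
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts, NOT taken from the COCA or the CWO.
F_NODE = 1_000         # occurrences of the node verb
F_COLLOCATE = 500      # occurrences of the candidate collocate
CORPUS_SIZE = 1_000_000
OBSERVED = 64          # times the two words co-occur

print(mutual_information(OBSERVED, F_NODE, F_COLLOCATE, CORPUS_SIZE))
print(t_score(OBSERVED, F_NODE, F_COLLOCATE, CORPUS_SIZE))
```

MI rewards pairs that co-occur far more often than chance predicts, even when both words are rare, whereas the t-score favors pairs with a large absolute number of co-occurrences, which is one reason the two are often read together.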


The other source for data collection was dictionaries, comprising general and synonym dictionaries. The former were consulted mostly for the possible senses of the verbs, while the latter were used to search for distinctions among the verbs in any aspect of their meanings. The general dictionaries consulted in the study, chosen on the basis of their prestige and accessibility, were all monolingual and covered both British and American English. There were five general dictionaries in total, namely The American Heritage Dictionary of the English Language (AHD, https://ahdictionary.com/), Cobuild Advanced English Dictionary (CAED, https://www.collinsdictionary.com/dictionary/english), Longman Dictionary of Contemporary English Online (LDOCE, http://www.ldoceonline.com/), the Merriam-Webster Dictionary online (MWD, https://www.merriam-webster.com/), and the Oxford Advanced Learner’s Dictionary (OALD, http://www.oxfordlearnersdictionaries.com/). Apart from the five general ones, a dictionary of synonyms, Webster’s New Dictionary of Synonyms (WNDS), was also consulted. From these dictionaries, the author retrieved information on the target verbs’ meanings, notes on their styles and synonyms (if provided). It was the author’s anticipation that the dictionaries would, to some extent, show some nuances of meaning among the verbs.

3.3 Data collection procedure

The data collection procedure consisted of three phases, each gathering the data needed to answer one of the three research questions stated in Chapter 1.

3.3.1 Phase 1 - Word senses and frequencies of senses

In order to extract the word senses of achieve, attain, and accomplish, the author first studied and synthesized the words’ definitions, usage notes and synonyms (if available) from general and synonym dictionaries. This process revealed the senses of each word and, more importantly, whether there was an overlap between the words’ senses. This was particularly crucial because two words need at least one overlapping sense in order to be considered synonymous, according to Chung and Ahrens (2008).


The next step was to calculate the frequencies of the word senses. This process involved a collection of concordances extracted from the two corpora. To be specific, 200 concordances of each word were randomly collected, 100 from the COCA (via the KWIC command) and 100 from the CWO (via the Concordance command). These 600 concordances were analyzed one by one to calculate the frequency of each sense of each verb. The results were then presented in both raw frequency and percentage terms for better comparison between the verbs.
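The tallying step can be sketched in a few lines of Python. The sketch assumes that each concordance line has already been read and hand-labelled with a sense tag; the tags and the counts in the usage lines are invented for illustration and are not results of this study.

```python
from collections import Counter


def sense_frequencies(labels):
    """Return {sense: (raw count, percentage)} for a list of sense labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {sense: (n, 100 * n / total) for sense, n in counts.items()}


# Hypothetical labels for 200 concordance lines of one verb (made-up data).
labels_for_one_verb = ["i"] * 120 + ["ii"] * 70 + ["iii"] * 10

for sense, (raw, percent) in sense_frequencies(labels_for_one_verb).items():
    print(f"sense ({sense}): {raw} lines, {percent:.1f}%")
```

Reporting the percentage next to the raw count matters less here, where every verb has the same 200 lines, than it would if the samples differed in size, but it keeps the figures directly comparable across verbs.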

3.3.2 Phase 2 - Register

This phase of the data collection procedure revealed the distribution of achieve, attain and accomplish across different registers (styles or genres) in their various word forms. The data were retrieved from the COCA website (http://corpus.byu.edu/coca/), which provided the frequencies of the target words by word form, by category (Spoken, Magazine, Newspaper, Fiction and Academic Journals) and by time period, all presented in the form of tables and charts. Figure 3.1 shows the command on the COCA for frequencies with the verb attain.

Figure 3.1 Corpus command for frequencies on the COCA (screenshot)
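Because the five COCA categories are close but not identical in size, raw frequencies are usually compared after normalization to occurrences per million words. The sketch below is one way to do this in Python; the category sizes are the approximate figures reported earlier in this chapter, while the raw counts are invented for illustration and are not data retrieved from the COCA.

```python
# Approximate category sizes in words, as reported for the COCA in this chapter.
CATEGORY_SIZES = {
    "Spoken": 109_000_000,
    "Magazine": 110_000_000,
    "Newspaper": 106_000_000,
    "Fiction": 105_000_000,
    "Academic": 103_000_000,
}


def per_million(raw_counts, category_sizes):
    """Normalize raw frequencies to occurrences per million words."""
    return {
        category: raw * 1_000_000 / category_sizes[category]
        for category, raw in raw_counts.items()
    }


# Hypothetical raw frequencies of one verb (made-up numbers).
raw_counts = {
    "Spoken": 2_180,
    "Magazine": 1_100,
    "Newspaper": 2_120,
    "Fiction": 1_050,
    "Academic": 10_300,
}

for category, rate in per_million(raw_counts, CATEGORY_SIZES).items():
    print(f"{category}: {rate:.1f} per million words")
```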

3.3.3 Phase 3 - Collocational properties

The final phase in the data collection procedure provided data to determine the similarities and differences in the collocational properties of achieve, attain and accomplish. To be more specific, what are the preferred collocations, less-preferred collocations and anti-collocations (Pearce, 2001) of the three verbs? Is there any overlap in the types of collocations of the three verbs? Within the scope of this study, the author decided to consider only collocates that are noun objects of the target verbs. For this task, the COCA was not chosen, for it only allows collocates to be selected by part of speech, not by syntactic role; in contrast, the CWO was consulted for its ability to provide collocates according to specific syntactic roles, i.e. subject, object, modifier, etc. The command for collocation in the CWO is quite simple, as demonstrated in Figure 3.2.

Figure 3.2 Command for collocation in the CWO (screenshot)

In order to contrast the collocates of the three verbs, another function in the CWO, Sketch-Diff, was used. This application allows users to check whether a preferred collocate of A is a less-preferred collocate or an anti-collocate of B, and vice versa. The results are presented in lists and tables, with real language instances readily accessible.
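The logic of such a contrast can be sketched in Python as follows. This is not how Sketch-Diff works internally; it is a simplified illustration that assumes each verb comes with a dictionary of association scores for its noun-object collocates, treats a score at or above an arbitrary threshold as "preferred", and treats a collocate that is unattested with the other verb as an anti-collocate. The function name, the threshold and all scores are hypothetical.

```python
def compare_collocates(scores_a, scores_b, threshold):
    """Label each preferred collocate of verb A by its status with verb B."""
    labels = {}
    for collocate, score_a in scores_a.items():
        if score_a < threshold:
            continue  # not a preferred collocate of A, so it is not compared
        score_b = scores_b.get(collocate)
        if score_b is None:
            labels[collocate] = "anti-collocate of B"   # unattested with B
        elif score_b >= threshold:
            labels[collocate] = "preferred by both"
        else:
            labels[collocate] = "less preferred by B"
    return labels


# Invented association scores, for illustration only.
scores_verb_a = {"goal": 9.0, "success": 8.5, "fame": 7.0, "thing": 2.0}
scores_verb_b = {"goal": 7.5, "success": 3.0}

for collocate, label in compare_collocates(scores_verb_a, scores_verb_b, 5.0).items():
    print(f"{collocate}: {label}")
```

Running the comparison in both directions, with the two verbs swapped, gives the "vice versa" view described above.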

This chapter has described the methodology of the research, which combined a corpus-based approach with intuition-based analysis. The data collection instruments comprised two large open corpora, five general dictionaries and one synonym dictionary. The data collection procedure consisted of three phases in order to gather data for the three corresponding research questions.


CHAPTER 4 FINDINGS AND DISCUSSION

4.1 Word senses and frequencies of senses

This section reports findings on the different senses of the three verbs achieve, attain and accomplish and the frequencies of each sense. First, dictionaries were analyzed for preliminary information on word definitions, register and synonyms. The goal was to determine whether there is any overlapping sense among the three verbs and to record any usage notes on formality and synonymy. Then the set of 600 concordances from the COCA and the CWO was studied to calculate the frequency of each word sense in percentage terms.

4.1.1 Word senses

All three verbs achieve, attain and accomplish are found in the five general dictionaries. It is the author’s observation that the senses and the number of senses in each target verb’s entry are quite similar across dictionaries.

Achieve

Achieve appears to be the most consistent entry across the five dictionaries, which all list three senses of the word, the first two being transitive and the third intransitive.

(i) The first sense of achieve involves the basic notions of “gaining or succeeding in something” and “after a lot of effort”. This remains consistent across the five dictionaries, although the wording varies to some extent. To illustrate, AHD defines achieve as “to gain with effort or despite difficulty”, MWD as “to get or attain as the result of exertion”, and OALD as “to succeed in reaching a particular goal, status or standard, especially by making an effort for a long time”. The examples given by the dictionaries include:

- Achieve fame as a singer (AHD)

- Achieve a record speed (AHD)

- Achieve a high degree of skill (MWD)

- They could not achieve their target of less than 3% inflation (OALD)


- Wilson has achieved considerable success as an artist (LDOCE)

- There are many who will work hard to achieve these goals (CAED)

Regarding synonyms of this word sense, AHD and MWD list reach, while OALD and WNDS list attain. On this ground, it can be ascertained that achieve and attain are true near-synonyms, at least from the lexicographers’ point of view.

(ii) The second sense of achieve revolves around the notion of “to succeed in accomplishing, bring about” (AHD), “to carry out successfully” (MWD) or “to succeed in doing something or causing something to happen” (OALD). Examples of this sense include:

- All you’ve achieved is to upset my parents (OALD)

- Achieve an improvement in foreign relations (AHD)

In terms of synonyms, both OALD and MWD name accomplish as a synonym for this sense of achieve.

The first two senses of achieve have, to some extent, similar shades of meaning; however, their focus is different: sense (i) emphasizes the “effort” one has to make in order to succeed, while sense (ii) focuses more on the “completion” of a task.

(iii) The third and final sense of achieve is listed by dictionaries as “be successful” (OALD) or “become successful or attain a desired end or aim” (MWD). This sense of achieve does not take an object, i.e. it is intransitive. Some of the examples are:

- Their background gives them little chance of achieving at school (OALD)

- We want our students to achieve within their chosen profession (LDOCE)

Attain

The definitions of attain, like those of achieve, are quite similar across different dictionaries. To be specific, all the dictionaries agree on two main senses of the word.

(i) The first sense of attain appears to overlap with that of achieve: “succeed in getting something after a lot of effort” (OALD), “succeeding in achieving something after trying for a long time” (LDOCE), or “gain as an objective” (AHD). This is illustrated by examples such as:

- Attain a diploma by hard work (AHD)

- Most of our students attained five ‘A’ grades in their exams (MWD)

- More women are attaining positions of power (LDOCE)

- Jim is halfway to attaining his pilot’s licence (CAED)

(ii) The second sense of attain denotes the action of reaching or coming to a particular age, size, level or condition (OALD and LDOCE). CAED uses different wording but might confuse users because it includes the defined word in its own definition: “to attain a state or condition as a result of natural development”. Some of the examples of this sense are:

- Redwoods can attain a height of 300 feet (AHD)

- They attained the top of the hill (MWD)

- She attained a ripe old age (MWD)

- The cheetah can attain speed of up to 97kph (OALD)

- After a year, she had attained her ideal weight (LDOCE)

- He attained preferment over his fellows

In terms of synonyms, the word is synonymous with achieve (AHD, MWD and CAED) in its first sense and with obtain (MWD), get, win and reach (CAED) in its second sense.

Once again, this is evidence of the synonymy between achieve and attain.

Accomplish

Quite similar to those of attain, the definitions of accomplish across dictionaries mostly resemble each other, with only one basic sense. However, MWD separates the word’s senses according to different nuances of meaning. AHD, OALD, LDOCE and CAED all agree that to accomplish means to succeed in doing or completing something, which coincides with sense (ii) of achieve. To illustrate this sense, a number of examples are presented:

- If we’d all work together, I think we could accomplish our goal (CAED)

- The first part of the plan has been safely accomplished (OALD)

- That’s it. Mission accomplished (= we have done what we aimed to do) (OALD)

While the other four dictionaries list accomplish as having only the one aforementioned sense, MWD mentions four. The first is “to bring about (a result) by effort”, e.g., have much to accomplish today. Secondly, as in the other dictionaries, it is “to bring something to completion or fulfill something”, e.g. accomplish a job. The third sense, which is “to succeed in reaching (a stage in a progression)”, does not appear in the other dictionaries. However, to the author, this sense almost overlaps with the previous one; therefore, the two can be grouped together without major loss of meaning. In fact, the example given by MWD for this sense (would starve before accomplishing half the distance, W.H. Hudson, 1922) proves the point. In this case, half the distance could be considered an achievement of some kind, making it unnecessary to list this shade of meaning separately. Finally, the fourth sense in MWD is an archaic one, “to equip thoroughly”, which is given without any example. Hence, the likelihood of encountering this sense in the set of concordances (a collection of language instances from 1990 onward) is expected to be quite low. In terms of the word’s synonyms, achieve is listed by OALD and LDOCE as the only candidate.

Overall, the dictionary consultation reveals the different senses of the target verbs achieve, attain and accomplish. Also, due to the overlap in their senses, it is reasonable to claim that they are near-synonyms. To be specific, all three verbs denote the action of ‘gaining something after a lot of effort’. While achieve and accomplish can both mean ‘to carry something out successfully or to complete something’, attain does not denote this sense; on the other hand, it refers to the action of ‘reaching a particular age, size, level or condition’. Similarly, achieve is found to denote the status of ‘becoming successful’, a sense shared by neither of the other verbs. Table 4.1 shows the different senses of the three verbs, all of which are retrieved from dictionaries. In the later part of this section, their frequencies are calculated after these senses have been identified in the analysis of 600 concordances.

While the relationship of near-synonymy appears to be weak between attain and accomplish, it seems strong in the word pairs with achieve, i.e. achieve vs. attain and achieve vs. accomplish. This makes these two synonymous pairs the main focus of investigation in the later phases of the study.

Title	Achieve, Attain And Accomplish From A Corpus-Based Perspective
Author	Lê Thị Thu Hồng
Supervisor	Dr. Trần Thị Thu Hiền
University	Vietnam National University, Hanoi University of Languages and International Studies
Major	English Linguistics
Type	thesis
Year of publication	2017
City	Hanoi
