Based on the author’s doctoral thesis (University of Basel, 2003) under title: Collocations in the English of advanced learners : a study based on a learner corpus.
Collocations in a Learner Corpus
Studies in Corpus Linguistics
SCL focuses on the application of corpus method throughout language study, the development of a computational approach to linguistics, and the development of new tools for processing language.
University of Lancaster
Anna Mauranen, University of Tampere
John Sinclair, University of Birmingham
Piet van Sterkenburg, Institute for Dutch Lexicology, Leiden
Michael Stubbs, University of Trier
Jan Svartvik, University of Lund
H-Z Yang, Jiao Tong University, Shanghai
Volume 14
Collocations in a Learner Corpus
by Nadja Nesselhauf
Collocations in a Learner Corpus
Nadja Nesselhauf
University of Heidelberg
John Benjamins Publishing Company
Amsterdam/Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Cover design: Françoise Berserik
Cover illustration from original painting Random Order
by Lorenzo Pezzatini, Florence, 1996
Library of Congress Cataloging-in-Publication Data
Nadja Nesselhauf
Collocations in a Learner Corpus / Nadja Nesselhauf.
p. cm. (Studies in Corpus Linguistics, ISSN 1388–0373 ; v. 14)
Based on the author’s thesis (doctoral), University of Basel, 2003, under title: Collocations in the English of advanced learners : a study based on a learner corpus.
Includes bibliographical references and indexes.
1. Language and languages--Study and teaching. 2. Study and teaching. I. Title. II. Series.
Table of contents
Chapter 1
1.1 The role of collocations in language and language teaching
1.2 Previous research on collocations in learner English
1.3 Aims and scope of the study
Chapter 2
2.1 The notion of ‘collocations’
2.1.1 Definitions of collocations
2.1.2 Related concepts
2.1.3 Classifications of collocations
2.1.4 The definition of collocations in this study
2.1.5 The classification of collocations in this study
2.2 The question of norm in ELT and the notion of error
2.3 Learner corpora and the analysis of learner language
2.4 Data and procedure
2.4.1 The learner corpus used
2.4.2 The syntactic patterns considered
2.4.3 Determining the degree of acceptability of the
2.4.4 Delimiting collocations from other types of word combinations
Chapter 3
3.1 Overall results
3.2 Deviations in the verb
3.2.1 Types and frequencies
3.2.2 Deviations only involving simple verbs
3.2.3 Deviations involving phrasal verbs
3.2.4 Deviations involving prepositional verbs
3.2.5 Other deviations concerning the verb
3.2.6 Regularities in verb deviations across categories
3.3 Deviations in the noun phrase or prepositional phrase
3.3.1 Deviations concerning the noun
3.3.2 Deviations concerning the determiner
3.3.3 Deviations concerning noun complementation
3.3.4 Deviations in the preposition of the prepositional phrase
3.4 More global deviations
3.4.1 Stretched verb construction instead of the corresponding verb
3.4.2 Whole collocation inappropriate
3.4.3 Deviations in the structure of the collocation
3.5 Deviations in collocations versus collocational deviations
3.6 Deviations involving collocations in non-collocations
3.7 Groups of deviations across categories
3.8 Further aspects of learner collocation use
3.8.1 Variation, repetition, and title recycling
3.8.2 The use of quotation marks
4.1.1 The use of L2 elements
4.1.2 The use of L2 chunks
4.1.3 The use of semantically or formally related elements
4.1.4 Blends of related L2 material
4.2 L1 building material
4.2.1 The influence of L1 elements and chunks
4.2.2 How and when L1 influence operates
4.3 Further building material
4.4 Relation and interaction of the different types of building material
Chapter 5
Factors correlating with learners’ difficulties with collocations
5.1 Intralinguistic factors
5.1.1 The degree of restriction of a collocation
5.1.2 The fact that a combination is a collocation
5.1.3 The fact that a collocation is a stretched verb construction
5.1.4 The syntactic pattern of a collocation
5.1.5 Congruence of a collocation in L1 and L2
5.2 Extralinguistic factors
5.2.1 The circumstances of production
5.2.2 The learners’ exposure to English
Chapter 6
6.1 Summary of the findings
6.2 Implications for second language storage and processing
6.3 Implications for teaching
6.3.1 Exposure, consciousness-raising and explicit teaching
6.3.2 Selecting collocations for teaching
6.3.3 Principles of collocation teaching
Corpora:
Dictionaries:
BBI The BBI Dictionary of English Word Combinations (= Benson et al. 1997)
CCED Collins COBUILD English Dictionary
OALD Oxford Advanced Learner’s Dictionary
ODEI Oxford Dictionary of English Idioms (= Cowie et al. 1993)
OED The Oxford English Dictionary
Acknowledgements
This book is based upon my PhD dissertation, Collocations in the English of Advanced Learners. A Study Based on a Learner Corpus, which was accepted by the Philosophisch-Historische Fakultät of the University of Basel in August 2003. I would like to thank my supervisors, David J. Allerton and Christian Mair, for their warm support throughout the writing process and beyond. David Allerton gave most generously of his time to discuss all aspects of my work and also alerted me to many subtleties in the areas of phraseology and syntax. Christian Mair introduced me to corpus linguistics and stimulated my interest in linguistics in the first place.
Numerous other people also provided helpful comments on earlier versions and stimulating ideas for my research. I am particularly grateful to Stefan Hanke, Nuria Hernandez, Andreas Langlotz, Iman Laversuch, Michelle Miles, Ute Römer, Tamsin Sanderson, John Sinclair, Pius ten Hacken, Cornelia Tschichold, Ursula Weinberger, and the anonymous reviewer. Any remaining flaws are of course my own.
I would like to thank Gunter Lorenz, Sylviane Granger and the staff of the Centre for English Corpus Linguistics at the University of Louvain, especially Sylvie De Cock, who kindly integrated me into the ICLE (the International Corpus of Learner English) project at a late stage. Thanks also go to the students at the University of Basel who contributed essays to ICLE and to Joyce Bachmann for help with the essay collection. I am grateful to the native speakers who provided native speaker judgements, in particular to Peter Burleigh, the students from the Institute of European Studies in Freiburg and the U.S. Naval Academy in Annapolis, Maryland, and to those who helped me recruit native speaker informants, in particular Siri Caltvedt, Chris Everard, William Fletcher, Viviane Klein, Ulrich Lohrmann and Andrew Shields.
For providing generous grants, which enabled me to attend conferences related to the topic, I am grateful to the University of Basel, in particular the Department of English, and to the “Improving Human Potential” scheme of the European Union.
I also wish to thank Elena Tognini-Bonelli for including the book in the series Studies in Corpus Linguistics, and both her and the team at Benjamins, in particular Kees Vaes, for a friendly and efficient publishing process.
On a more personal level, I would like to express my deepest thanks to my parents, friends, and especially my husband, Stefan, for the various kinds of support I have received from them while working on this book.
Chapter 1
Collocations in native and non-native speaker language
1.1 The role of collocations in language and language teaching
Collocations, i.e. arbitrarily restricted lexeme combinations such as make a decision or fully aware, are one type of a group of expressions whose importance in language has been increasingly recognized in recent years. This group of expressions has been variously called prefabricated units, prefabs, phraseological units, (lexical) chunks, multi-word units, or formulaic sequences.1 They are made up of more than one word and are lexically and/or syntactically fixed to a certain degree. Following a period in which, largely due to the wide influence of generative grammar, prefabricated units were considered peripheral in language, it is today widely assumed that their number is vast and that they play a major role in language processing and use. Bolinger was among the first linguists to point out that a generativist view, which relegates prefabricated units to the periphery of language, fails to account for a considerable part of observable language data.2 On the basis of numerous examples he claims that
our language does not expect us to build everything starting with lumber, nails, and blueprint. Instead it provides us with an incredibly large number
He also points out that most of these prefabs are not completely but only partially fixed. Many linguists have since made similar claims, most notably Pawley and Syder. They, also mainly on the basis of a sizeable collection of prefabricated units, come to the conclusion that “by far the largest part of the English speaker’s lexicon consists of complex lexical items” (1983: 215), and that most of these are semi-productive (216f.).3 Further empirical support for this view has come from corpus studies, which have regularly found that most of naturally occurring language, both spoken and written, consists of recurrent patterns, many of which are phraseological (e.g. Altenberg 1998; Altenberg & Eeg-Olofsson 1990; Kjellmer 1994; Renouf & Sinclair 1991; Sinclair 1991; Stubbs 2001).4 Corpus studies have also shown that collocations are a frequently occurring type of semi-prefabricated unit. In an analysis of over 5,000 verb-noun combinations in a written 240,000 word corpus, for example, over a third of the combinations were found to be collocations (Howarth 1996: 120).
Several important functions have been identified for prefabricated units (cf. Wray 2002: 93ff.). First, there is growing evidence that they play an essential role in language learning, as they seem to be the basis for the development of creative language in first language and childhood second language acquisition (e.g. Peters 1983; Wray 1999). Secondly, prefabricated units are essential for fluency in both spoken and written language. Psycholinguistic evidence indicates that the human brain is much better equipped for memorizing than for processing, and that the availability of large numbers of prefabricated units reduces the processing effort and thus makes fluent language possible (cf. Aitchison 1987; Fillmore 1979; Pawley & Syder 1983, 2000; Partington 1996: 20). Thirdly, the use of prefabricated units supports comprehension, as the recipient can understand the meaning of a passage of text without having to attend to every word (cf. Hunston & Francis 2000: 270). And fourthly, prefabricated units serve to indicate membership of a certain linguistic group; they fulfil “the desire to sound [and write] like others” (Wray 2002: 75; cf. also Pawley & Syder 1983).5
For the adult non-native speaker, the first of these functions probably does not play a major role, as it seems that prefabricated language is not regularly used as a basis for creative language in adult L2 acquisition (cf. Wray 1999). However, two of the other functions are at least as essential for non-native speakers as for native speakers. Enhancing fluency through reducing processing effort must be of particular interest for non-native speakers, as they naturally need more processing effort to convey their intended message. Indeed, it has been shown that whether or not L2 production is fluent crucially depends on the learner’s control over a large repertoire of prefabricated units (Dechert 1983; Towell & Hawkins 1996). The third function, making comprehension easier, is doubtless of importance for every user of a language. While the use of native-like prefabs aids comprehension, non-native-like prefabs can irritate the recipient and draw the attention away from the message (cf. Hüllen 1971: 172; Hecht & Green 1988; Korosadowicz-Strużyńska 1980: 115; Cowie & Howarth 1996: 90). Being perceived as a member of a certain linguistic group that speaks the L2 natively, finally, though clearly not an aim of all non-native speakers, is also important to certain learners of a language (cf. Section 2.2).
The knowledge of and the ability to use prefabricated units are thus essential for the language learner; unfortunately, however, they also pose considerable difficulties, even for the advanced learner. Statements such as the following abound in the literature:
Language learners often stumble across co-occurrence relations (Smadja 1989: 164)
Any analysis of students’ speech or writing shows a lack of [...] collocational
Knowing which subset of grammatically possible utterances is actually commonly used by native speakers is an immense problem for even the most
There is also wide agreement that prefabs have to be taught (Bahns 1997: 62ff.; Cowie 1988: 136; McCarthy 1990: 12ff.; Nation 2001: 317 and many others). In spite of this, many types of prefabs, including collocations, are still not treated adequately in English language teaching today. Although collocations have received increasing attention in language teaching in recent years (Granger 1998c: 159; Howarth 1998a: 30), we are still far from the development of a coherent methodology and even further from a widespread and systematic treatment of collocations in language teaching materials and syllabi (cf. e.g. Bahns 1997: 61; Howarth 1996: 168; Kaszubski 1998: 175; Nesselhauf & Tschichold 2002; Sinclair & Renouf 1988: 153; Wiebalck-Zahn 1990: 54). In recent years, however, a few approaches to language teaching have been developed, which, far from neglecting phraseological units, put them at the centre of teaching: Willis’s lexical syllabus, Lewis’s lexical approach, and the lexical phrases approach by Nattinger and DeCarrico (e.g. Willis 1990; Lewis 1994, 1997; Nattinger & DeCarrico 1992). As with most other suggestions for teaching prefabs, even these approaches are at best based on the analysis of native speaker prefabs; none of them is based on any systematic observation of prefabs in learner language.6 However, if efficient pedagogical measures are to be devised, they need to take into consideration the difficulties learners have with prefabricated units.
1.2 Previous research on collocations in learner English
Collocations, in the present sense of the term (cf. Section 1.1), have not been a frequent focus of attention in analyses of learner English so far.7 One of the problems with a number of studies of ‘collocations’ in learner language is that the use of the term is often hazy. In these studies, although the definition of collocations seems to be the same or at least close to the one adopted here, in practice collocations are not carefully delimited from other types of word combinations. In particular compounds, but also idioms and combinations that are not arbitrarily restricted, are often without further discussion included in the combinations investigated (e.g. safety belt, blind date, break even in Hussein 1990, or striped shirt in Farghal & Obiedat 1995). In the following survey, only those studies are considered in which most of the combinations investigated are collocations in the present sense or if collocations in the present sense constitute a fairly clearly delimited group in a more comprehensive study. Studies are then included disregarding what the combination in question is actually called and how the notion of collocation is theoretically defined. The survey is also restricted to studies of learners of English.8
About half of the published studies on collocations in learner language are based on elicitation tests and about half on production data. Most elicitation studies have focused more on the production than on the comprehension of collocations, the reason being that the former is much more problematic for learners than the latter. This is due to the nature of collocations (the restriction on one of its elements in a nevertheless largely transparent combination) and has been confirmed by two studies (Marton 1977; Biskup 1990). In both of these studies, collocation comprehension and production of advanced Polish learners of English is investigated by means of a translation test.9 The result of both studies is that the translation of collocations from L2 to L1 is almost always accurate, while the translation of the same collocations from L1 to L2 poses considerable difficulty for the learners. Two studies have examined whether learners are able to determine whether certain combinations are actually used in English (Channell 1981; Granger 1998c). Advanced learners had to mark which of a number of given words combine with certain words from another word class (adjectives and nouns in one case and adverbs and adjectives in the other). The principal result of both studies is that a large number of combinations that are acceptable were not marked by the learners.
The elicitation studies of collocations concentrating on the question of what learners can produce have used either cloze tests or translation tasks or both. Typically, they are based on small amounts of data and the results are (partly for that reason) not analysed in more detail. So whereas they have consistently produced the result that collocations are difficult for the learner, the analysis often only goes slightly beyond this. Shei (1999) finds on the basis of a cloze test that advanced learners with L1 Chinese have more difficulties than speakers of European languages. The result of a cloze test by Herbst (1996) is that (advanced) non-native speakers vary considerably more in their answers than native speakers. Bahns and Eldaw (1993), examining advanced German learners’ knowledge of verb-noun collocations, observe that the translation of verbs that are part of collocations poses many more problems than the translation of other lexical elements. They also find that the proportion of collocation errors10 does not differ significantly between the best and the worst translations. Marton (1977) observes that mere exposure to collocations does not usually lead to their acquisition, and Bahns and Sibilis (1992) similarly observe that reading only slightly improves learners’ knowledge of collocations. Farghal and Obiedat (1995) and Hasselgren (1994), finally, go a bit further in their analysis by investigating the collocations actually provided by the learners. Farghal and Obiedat find that in their sample of advanced Arabic-speaking learners, non-native-like collocations are based on transfer in about a tenth of the cases (1995: 320). Hasselgren, in a study of adverb-adjective and verb-noun collocations, observes that advanced Norwegian learners more frequently choose unrestricted intensifiers (such as very) and core verbs (such as get or give) than native speakers.
Two somewhat more detailed elicitation studies on collocations in learner language are Biskup (1992) and Al-Zahrani (1998). Al-Zahrani looks at the collocational knowledge of 81 advanced Arabic-speaking learners, testing 50 verb-noun collocations with a cloze test, in which the first phoneme of each collocate is provided. He finds a strong relationship between knowledge of collocations and overall proficiency as well as a strong L1 influence, which is not quantified, however. Biskup (1992) describes a study in which 34 Polish and 28 German advanced learners were asked to translate 23 collocations into English. Clear differences between the two groups emerged. The Polish students produced more collocations than the German students, but they also much more frequently gave no answer at all. The German students more frequently tried to paraphrase the intended meaning without using a collocation but made more mistakes. According to Biskup, this emphasis on accuracy on the part of the Polish students and creative strategies on the part of the German students can probably be put down to the different emphases in foreign language teaching in the two countries. She also observes that the L1 influence on non-native forms is greater with the Polish than with the German students, and that different types of transfer are preferred by the two groups.
Studies of collocations in learner language based on production data have almost exclusively investigated written learner language.11 Two types of study can be distinguished: those in which all collocations (often of a certain grammatical type) are extracted manually from a given corpus and those in which a predefined set of collocations is extracted (semi-)automatically. So far, studies using automatic analysis have only dealt with adverb-adjective combinations and with collocations of high-frequency verbs. They tend to concentrate on overuse and underuse (by comparing the quantity of certain collocations in native and non-native speaker writing) and not primarily on the analysis of non-native-like combinations. Chi Man-lai et al.’s study (1994) is an exception. They analyse the combinations with the verbs have, make, take, do and get in a one-million word corpus containing essays by (intermediate to advanced) learners of English with L1 Chinese. Their analysis is restricted to deviations in the use of these verbs, the main result being that they are often used as if they were interchangeable. Kaszubski (2000) investigates the same verbs (with the addition of be), and compares collocational uses of these verbs to their use in other environments. Collocations in native speaker corpora are compared to corpora of different groups of learners: Polish and Spanish intermediate learners and Polish and French advanced learners of English. Kaszubski finds that in general learners produce fewer collocations (i.e. tokens), but that they greatly overuse a small number of them (i.e. types), in particular those that are frequent in English and/or similar to an L1 combination. The production of adverb-adjective combinations is analysed in Granger (1998c) and Lorenz (1999). Granger restricts her analysis to combinations with -ly adverbs and compares advanced learner data (250,000 words, L1 French) with native speaker data. Corresponding to the results for verb-noun combinations obtained by Kaszubski, her analysis reveals a general underuse of collocations by learners. Also similar to Kaszubski, she finds that adverb-adjective collocations containing a more restricted adverb are mainly used when an equivalent form exists in L1 (e.g. following French sévèrement puni, severely punished is produced). Lorenz (1999), who also includes intensifiers that do not end in -ly, investigates the writing of intermediate and advanced learners with L1 German (200,000 words) and compares it to native pupil and college student writing. Like Granger, he finds that learners underuse more restricted collocations while overusing certain less restricted ones. He also concludes from his study that the major reason for deviations in intensifier-adjective collocations is the desire of many learners (often supported by teaching practices) to be original and expressive.
Two of the four existing production studies in which collocations were manually extracted are limited to non-native-like collocations. Lombard (1997) analyses written business English produced by native speakers of Mandarin on the basis of 571 non-native-like collocations extracted from a corpus of 78,000 words produced by 8 students. Her principal findings are that the major type of mistake is the use of a near-synonym, that blends are rare, and that lexical transfer occurs in about one tenth of the non-native-like combinations. What is unfortunate is that types of mistakes (i.e. how the collocations produced by the learner differ from the target form) and possible reasons for these are not clearly kept apart. The second study investigating non-native-like collocations on the basis of a manual analysis of a corpus is by Burgschmidt and Perkins (1985).12 This is probably the earliest large-scale study of learner collocation use. At the same time it is the study including the greatest number of collocations, as it investigates more than 550 essays by advanced German learners (450,000 words). In spite of its considerable scope, it has hardly received any attention, however. One of the reasons for this might be that, although different types of mistakes are minutely distinguished and possible reasons are given for every mistake, this information is not quantified and only discussed on a very general level, so that the study remains primarily a list of errors.13 One of the few more general findings is that both blending of L2 structures and L1 transfer are frequent sources of mistakes. Another finding is that learners are often insecure in the use of collocations, which can be seen in frequent ‘corrections’ by the learners, in which incorrect collocations are often replaced by other incorrect ones.
Howarth (1996) is one of the most thorough investigations of collocations in learner language to date, although his database is comparatively small. He manually investigates verb-noun combinations in a corpus of 10 essays (about 22,000 words) written by non-native speakers with different L1s and compares them to combinations in native speaker writing.14 His analysis produces two main general results. He finds that learners use slightly fewer collocations than native speakers and that there is no correlation between the general proficiency of a learner and the number and the acceptability of the collocations used. More specific results include that non-native-like collocations are often either blends of two acceptable collocations with a similar meaning or the result of what Howarth calls ‘overlaps’, i.e. sets of nouns that share certain but not all verbs.15 One final study to be mentioned (Zhang 1993) combines manual analysis of production data with elicitation tests, the main result being based on production data. The study focuses on the relation of language proficiency and collocation use (in an intermediate to advanced, mixed L1 group) and, contrary to the one by Howarth, indicates that the use of collocations, as regards both their number and their acceptability, is related to proficiency.
As this survey has shown, apart from the fact that most studies of collocations in learner language have focused on the advanced or intermediate learner, the studies differ widely, in particular with respect to their method of investigation. Clearly, cloze tests in which the first phoneme of the collocating word is given (Al-Zahrani 1998; Herbst 1996) or tests in which all the elements of the collocations are given (Channell 1981; Granger 1998c) investigate a type of knowledge substantially different from production studies or elicitation tests where such aids are absent. In addition to this, studies have also investigated different types of collocations and learner groups with different language backgrounds. But not only do the existing studies differ widely, their number is also small, and many of them are quite limited in size and/or scope. In elicitation studies, 15 to 20 items (the selection of which often seems somewhat arbitrary) are tested on average, and in some production studies the data is limited to a few verbs or a small number of essays.
In spite of this, some results have emerged. Most of these results, however, mainly confirm and elaborate on the observation that collocation production presents a problem for second language learners. A conclusion reached by a number of studies is that learners use overall fewer collocations than native speakers (e.g. Hasselgren 1994; Howarth 1996; Kaszubski 2000; Granger 1998c; Lorenz 1999) except for a small number of frequent ones which are overused (Kaszubski 2000). Other recurrent findings have been that learners are often not aware of restrictions (e.g. Herbst 1996; Howarth 1996), but that they are at the same time not aware of the full combinatory potential of words they know (Channell 1981; Granger 1998c). Individual studies have indicated that learners are insecure in the production of collocations (Burgschmidt & Perkins 1985) and that the collocation problems are more serious than general vocabulary problems (Bahns & Eldaw 1993). A number of apparently contradictory results have also emerged, which is unsurprising given the differences in study design. Some studies indicate that the use of collocations is related to proficiency (Zhang 1993; Al-Zahrani 1998), others indicate that it is not (Bahns & Eldaw 1993; Howarth 1996). L1 influence appears to be strong in some cases (e.g. Granger 1998c; Al-Zahrani 1998; Kaszubski 2000; Burgschmidt & Perkins 1985) and comparatively weak in others (e.g. Farghal & Obiedat 1995; Lombard 1997). In one study difficulty appears to be related to L1–L2 distance (Shei 1999), in another it is inversely related to that and seems to be related to teaching practice instead (Biskup 1992). Blends were partly shown to be fairly frequent (Howarth 1996), partly to be rare (Lombard 1997). However, the questions that seem most important for the design of pedagogical material, i.e. which collocations or types of collocations are most difficult for certain groups of learners, what kinds of mistakes occur and why, have received little attention so far.
1.3 Aims and scope of the study
The present study intends to investigate the use of collocations by advanced learners. More precisely, the study has four principal, interconnected aims. The first aim is to identify the typical difficulties and non-difficulties of a particular group of advanced learners in the production of collocations. The second aim is to identify the factors that contribute to the difficulty of (certain) collocations, so that predictions concerning difficulty that go beyond the specific data analysed in the study can be made. The third aim is to find out what material and strategies learners use to create collocations in L2. The fourth aim, finally, is to formulate suggestions for language teaching based on these results. A secondary aim of the study is to develop a definition of collocations and a classification of collocation mistakes that may help material designers and teachers deal with the phenomenon more adequately and more systematically than has been the case hitherto (cf. Howarth 1998b: 161f.; Burgschmidt & Perkins 100). The investigation is thus conceived as a study in the field of Applied Linguistics: not only are the analyses carried out with a practical application (i.e. language teaching) in mind, but it is also not assumed that the results from the linguistic investigation can be directly and without further discussion translated into suggestions for language teaching (unlike in what Widdowson has referred to as ‘Linguistics Applied’; 2000).
The analysis is restricted to German-speaking learners of English; it is restricted to verb-noun combinations and to a certain text type, namely argumentative essays. The investigation is based on a learner corpus (ICLE; cf. Section 2.4.1), of which 150,000 words have been analysed. The extraction of the collocations was manual and yielded more than 2,000 instances of verb-noun collocations, so that the study reported here is one of the largest to date which goes beyond a predefined set of elements. Restricting the analysis to one L1 group rather than analysing more data from many different L1 groups was deemed necessary since, as a number of studies have indicated, the first language clearly plays a role in L2 collocation production, but has nevertheless not been investigated in much detail. Verb-noun combinations, i.e. combinations such as make an attempt or take sth into account, have been chosen because they are not only frequent (cf. e.g. Bahns 1993b: 8; Howarth 1996: 120; Aisenstadt 1981: 55) and among the most difficult for the learner (cf. Lombard 1997; Biskup 1992; Howarth 1996), but also particularly important, since “they tend to form the communicative core of utterances where the most important information is placed” (Altenberg 1993: 227). In addition, unlike in adverb-adjective or adjective-noun collocations, it is not possible for the learner to
leave out the difficult element or replace it with a safe choice (such as very); and paraphrasing is also often impossible (cf. Bahns & Eldaw 1993). Argumentative essays (on fairly general topics; cf. Section 2.4.1) have been chosen because they are fairly neutral in register and style (i.e. they contain non-specialized vocabulary and tend to display a medium level of formality) and will therefore reveal difficulties with those collocations that most learners often need.
The book is divided into six chapters. In the second chapter, the theoretical and methodological foundations of the study are outlined. Collocations are defined and classified into different types, and the notions of norm and error are discussed. The potential of learner corpus analysis is compared to other types of analysis of learner language, and the data on which the analysis is based is described. Moreover, the methodology that has been developed for the analysis of collocations in a learner corpus is described in detail and its limitations pointed out. The third, fourth and fifth chapters contain a detailed analysis of the data. Chapter 3 reports on the difficulties and non-difficulties that the data reveal. Different types of deviations are distinguished and described, and the collocations and elements affected particularly often are identified. Groups of deviant collocations are also identified, hidden problems with collocations uncovered, and aspects of learner collocation use that do not necessarily lead to deviation investigated. Chapter 4 is devoted to the question of what building material learners use when they produce non-native-like collocations, and how different building materials interact. In Chapter 5, an attempt is made to isolate factors that are responsible for difficulties in collocation use. The focus is on intralinguistic factors such as the degree of restriction of a collocation and congruence to an L1 collocation; a few extralinguistic factors are, however, also examined. Chapter 6, finally, investigates the implications of the results for language teaching – both for the selection of collocations for teaching and the question of how collocations could be taught most efficiently. In addition, possible implications of the results for second language storage and production are discussed, and possible ways forward are briefly pointed out.
Chapter 2
Investigating collocations in a learner corpus
In this chapter, the theoretical and methodological foundations of the study are presented. In Section 2.1, the notion of ‘collocations’ is investigated, by providing a systematic overview of the widely varying definitions of the term as well as of a number of related terms such as ‘selectional restrictions’ and by providing an overview of the ways in which collocations have been classified. On this basis, a definition and classification of collocations that attempts to be both theoretically consistent and easily applicable to real language data is developed. Section 2.2 addresses the question of norm in language teaching and the notion of ‘error’, and in Section 2.3, the advantages and limitations of learner corpus analysis as compared to other types of learner language analysis are discussed. The final section of this chapter, 2.4, outlines what data and procedures have been used for the study. Information is provided on the learner corpus the analysis is based on and the precise syntactic patterns of the collocations that have been considered. The section also gives a detailed description of the procedures used to determine the degree of acceptability of the collocations produced by the learners and of those used to delimit collocations from other types of word combinations in the corpus.
2.1 The notion of ‘collocations’
2.1.1 Definitions of collocations
The term ‘collocation’ is used in widely different and often rather vague senses in linguistics and language teaching. The only common denominator is that the term is (at least mostly) used to refer to some kind of syntagmatic relation of words. Among the many diverse uses of the term, two main views can be identified (cf. also Klotz 2000: 63ff.; Nesselhauf 2004a).1 In one of these two views, a collocation is considered the co-occurrence of words at a certain distance, and a distinction is usually made between co-occurrences that are frequent (or more precisely, more frequent than could be expected if words
[...] J. R. Firth and has been developed further in particular by M. A. K. Halliday and J. Sinclair. It is often adopted by researchers who are involved in the computational analysis of syntagmatic relations. The phraseological approach has been strongly influenced by Russian phraseology. Typically, researchers adopting this approach work in the fields of lexicography and/or pedagogy; among the main representatives are A. P. Cowie, I. Mel’čuk and F. J. Hausmann. In what follows, I will describe the view of collocations propounded by one of the major representatives of each of the two approaches and briefly outline how other representatives of the two approaches differ from them.3 For the frequency-based approach, Sinclair’s view of collocations will be discussed, for the phraseological approach, that of Cowie.
Sinclair defines collocations as “the occurrence of two or more words within a short space of each other in a text” (1991: 170). A short space, or ‘span’, is usually defined as a distance of around four words to the right and left of the word under investigation, which is called the ‘node’ (e.g. 1991: 170; Jones & Sinclair 1974: 21f.). If, for example, in a given amount of text, the word house is analysed, and the word occurs in an environment such as He went back to the house. When he opened the door, the dog barked, the words went, back, to, the, when, he, opened, the are all considered to form collocations with the node house; these words are then called ‘collocates’. Sinclair distinguishes two types of collocations, namely ‘significant’ and ‘casual’ collocations, and sometimes reserves the term ‘collocation’ for the former type (e.g. 1991: 115). Significant collocations are co-occurrences of words “such that they co-occur more often than their respective frequencies and the length of text in which they appear would predict” (1974: 21). In the example above, the and house would probably not be significant collocations, as, although these two words can be assumed to co-occur frequently, the is itself a frequent word in virtually every kind of text. The words dog and barked would, however, very likely constitute a significant collocation, as barked is not usually very frequent and, if it occurs, is likely to be found near the word dog. Exact formulae for determining whether co-occurring words constitute a significant collocation have also been
developed by Sinclair and others (for a discussion of these see, for instance, Stubbs 1995).
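The span-based notion of node and collocates described above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions and does not reproduce the procedure of Sinclair or of the present study: the function names (tokenize, collocates, observed_vs_expected), the crude tokenizer and, in particular, the chance expectation are hypothetical simplifications, whereas the formulae actually proposed are those discussed in Stubbs (1995).

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, span=4):
    """Collect every word within `span` tokens to the left or right of each occurrence of `node`."""
    found = []
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            found.extend(left + right)
    return found


def observed_vs_expected(tokens, node, collocate, span=4):
    """Return the observed co-occurrence count and a rough chance expectation.

    Assumption of this sketch (not a formula from the text): each of the
    2 * span slots around a node occurrence is filled by `collocate` with
    probability freq(collocate) / len(tokens).
    """
    freq = Counter(tokens)
    observed = collocates(tokens, node, span).count(collocate)
    expected = freq[node] * 2 * span * freq[collocate] / len(tokens)
    return observed, expected


text = "He went back to the house. When he opened the door, the dog barked"
tokens = tokenize(text)
print(collocates(tokens, "house"))
# ['went', 'back', 'to', 'the', 'when', 'he', 'opened', 'the']
print(observed_vs_expected(tokens, "house", "the"))
```

Applied to the example sentence, the sketch returns the eight collocates listed above for the node house. On so short a text the comparison of observed and expected counts is not meaningful in itself; it merely shows the kind of comparison that the notion of a ‘significant’ collocation presupposes.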
Given that even Sinclair sometimes varies in how he defines collocations, it is not surprising that some researchers adopting a frequency-based approach to collocations consider co-occurrences of all frequencies to be collocations (e.g. Halliday 1966; Moon 1998), while others reserve the term for frequent co-occurrences (e.g. Stubbs 1995). Some use recurrence, i.e. co-occurrence more than once in a given corpus, as the defining criterion (e.g. Kjellmer 1987; Kennedy 1990). Other points of variation in the definition of collocations in the frequency-based approach are also mirrored by variation in Sinclair’s writings. Whereas he uses ‘word’ in the sense of ‘lexeme’ in the above definition (cf. 1991: 54 and 173, where this is made explicit), and thus sees collocation as a relationship between lexemes, he previously regarded it as a relationship between ‘lexical items’. This latter view is also shared by Halliday, who exemplifies ‘lexical item’ with the group of derivationally related lexemes STRENGTH, STRENGTHEN, STRONG (1966: 156).4 According to this view, a strong argument, he argued strongly, the strength of the argument, his argument was strengthened would all be considered instances of the same collocation. A third view on this question is that collocation is a relationship between word forms, i.e. that combinations such as hold tight and holds tight are two different collocations. A more fundamental aspect in which definitions vary is the question of the nature of the collocation as such. Sometimes ‘collocation’ seems to be used purely to describe a phenomenon in a given amount of text (as in the above definition by Sinclair); more commonly, it also seems to be considered a more abstract tendency in a language (cf. e.g. Sinclair 1966: 418). Further points that are viewed differently by authors adopting a frequency-based approach are the number of words involved in a collocation and whether or not these have to be consecutive. Occasionally, as in the above definition, “two or more words” are considered to constitute a collocation (also e.g. Firth 1951: 197f.); often only two words are allowed (e.g. Jones & Sinclair 1974). Kjellmer, for example, requires the words to be consecutive; Firth at times considers whole sequences such as [i]s all the world drowned in blood and sunk in cruelty (1957: 196) as collocations.5 A final aspect in which definitions vary is the syntactic relationship of the elements involved in a collocation. In the frequency-based approach, the syntactic relationship between the elements does not usually play a role in deciding whether they form a collocation or not. Among the few exceptions are Kjellmer and Greenbaum. Kjellmer excludes from his definition sequences that have no or only a very distant grammatical relationship: night he, for example, in a sentence such as
At night, he suddenly remembered what had happened, would not be considered a collocation according to his definition, even if the criterion of (relative) frequency is met (1994: xxiiff.). Greenbaum’s definition of collocations only includes words that stand in a close grammatical relationship (such as adverb + adjective; 1970). However, as he at the same time completely dismisses the criterion of co-occurrence in a certain span (although he retains the criterion of frequency of co-occurrence), Greenbaum is among the less typical representatives of the frequency-based approach, and his definition approaches the phraseological view of collocations.
A. P. Cowie is a typical representative of the phraseological approach: he considers collocations a type of word combination, i.e. an abstract combination with instantiations in actual texts, and defines them by delimiting them from other types of word combinations, most importantly from idioms on the one side and from what he sometimes calls ‘free combinations’ on the other (e.g. Cowie 1981, 1994; Cowie et al. 1993). At the same time he is one of the most important representatives of the phraseological approach, as his attempts to define collocations and to delimit different kinds of word combinations are among the most precise. Cowie divides word combinations into two main types, ‘composites’ and ‘formulae’. Formulae are combinations with a primarily pragmatic function such as How are you? or Good morning (e.g. 1994: 3169). Collocations belong to the group of ‘composites’, which are described as having a primarily syntactic function. The distinctions in the group of composites are made on the basis of two criteria, which Cowie assumes to interact closely: the criterion of transparency and the criterion of commutability (or substitutability).6 Transparency refers to whether the elements of the combination and the combination itself have a literal or a non-literal meaning, and commutability refers to whether and to what degree the substitution of the elements of the combination is restricted. On this basis, he distinguishes the following four types of combinations, stressing, however, that these types are not clearly delimitable, but should rather be seen as forming a continuum:
Free combinations (e.g. drink tea):
– the restriction on substitution can be specified on semantic grounds
– all elements of the word combination are used in a literal sense
Restricted collocations (e.g. perform a task):
– some substitution is possible, but there are arbitrary limitations on substitution
– at least one element has a non-literal meaning, and at least one element is used in its literal sense; the whole combination is transparent
Figurative idioms (e.g. do a U-turn, in the sense of ‘completely change one’s policy or behaviour’):
– substitution of the elements is seldom possible
– the combination has a figurative meaning, but preserves a current literal interpretation
Pure idioms (e.g. blow the gaff):
– substitution of the elements is impossible
– the combination has a figurative meaning and does not preserve a current literal interpretation
The most important variation in Cowie’s use of the term ‘collocation’ is that while he sometimes applies it (as above) only to combinations with an arbitrarily limited substitutability in which one element is used in a non-literal sense, he sometimes applies it to free combinations as well. In this case, however, he makes a distinction between ‘open collocations’ (i.e. free combinations) and ‘restricted collocations’. He also varies in categorising combinations of the type foot the bill, in which one word in a given specialized meaning (foot in this case) can co-occur only with one other word. While he usually subsumes such combinations under the category ‘restricted collocations’ (1998b: 221), at least in one paper he regards them as constituting an additional category between idioms and collocations (1981: 228). A third aspect in which his definition varies is that while he usually assumes that the elements of a collocation are lexemes, he assumes in at least one publication that these elements are ‘roots’ (1994: 3169), abstract units comprising all inflectional and derivational forms of a word, similar to Halliday’s and Sinclair’s ‘lexical items’.7
As in the case of Sinclair, Cowie’s variation in the use of the term reflects some of its different uses by different authors adopting a phraseological approach. A number of researchers apply the term ‘collocations’ to both free combinations and restricted collocations. Some of these do not differentiate further (e.g. Lyons 1977), while others, like Cowie, distinguish between ‘open collocations’ (or ‘free collocations’) and ‘restricted collocations’ (e.g. Aisenstadt 1981). More frequently, authors adopting a phraseological approach reserve the term ‘collocation’ for Cowie’s restricted collocations and use different terms, such as ‘free combinations’ or ‘co-creations’, for unrestricted combinations (e.g. Benson et al. 1997; Hausmann 1984; Bahns 1993a). The number of categories towards the more restricted and opaque end of the scale also varies between authors. Cowie’s distinction between two types of idioms (figurative idioms and pure idioms) is often not made, and Benson et al., for example, consistently postulate an additional category between collocations and idioms, which
[...] is freely chosen on the basis of its meaning, while the selection of the other depends on this freely chosen element. To illustrate this, he cites examples such as do a favour, where favour is chosen on the basis of its meaning and do (and not for example make or give) is selected by favour (1998: 31). The commutability (of do) therefore mainly seems to be responsible for the classification of do a favour as a collocation and not a free combination, and both commutability and transparency (of favour) seem to be responsible for the combination not being classified as an idiom.
Even when the same criteria are used by different authors, however, the delimitations between different types of word combinations are not necessarily identical. Among those authors basing the distinction between free combinations and collocations on commutability, for example, (restricted) collocations are sometimes a broader category than in Cowie’s classification, and sometimes a narrower one. A narrower category can be found, for example, in Fernando, who only considers those collocations restricted whose elements are very limited in their commutability (such as addled, which in the sense of ‘bad to eat’ is restricted to eggs) but not combinations such as strong coffee or hard work (1996: 36).9 A broader category can be found in Cruse, for example (cf. Section 2.1.2). Interestingly, a number of authors use the criterion of frequency
of co-occurrence, i.e. the main criterion of the frequency-based approach, in addition to phraseological criteria such as commutability and transparency (e.g. Nation 2001: 317; Herbst 1996: 389; Benson et al. 1986: 253). For these authors, for a combination to be considered a collocation, it has to be restricted, transparent and frequent; Benson et al. even seem to assume that the criteria of restriction and frequency coincide (1997: xxx).10
Unlike the frequency-based approach, the phraseological approach consistently requires that the elements of collocations should be syntactically related. Hausmann even goes so far as to call only those combinations collocations that appear in a pre-defined set of syntactic relations: adjective + noun, (subject-)noun + verb, noun + noun, adverb + adjective, verb + adverb, verb + (object-)noun. He thus also only allows the combination of two lexical elements in the category of collocations, while Benson et al., for example, also permit a lexical word plus a preposition (1997: ix). As in the frequency-based approach, the number of participating items varies between two (e.g. Hausmann 1989: 1010) and two or more (e.g. Aisenstadt 1981: 54; Cowie 1994: 3169), but often this question is not addressed at all. Shrug one’s shoulders, for example, can therefore be viewed as a collocation consisting of either three elements (shrug + one’s + shoulders) or of two (shrug + shoulders). As to the question of whether lexemes, word forms, or lexical items/roots are the elements of collocations, most authors assume that the participating elements are lexemes (with the exception of Cowie 1981, cf. above). Combinations such as strong argument and strong arguments are therefore generally assumed to be instantiations of the same collocation, but the strength of the argument is assumed to be an instantiation of a different one. A final important point on which representatives of the phraseological approach differ is how they view the relationship between the elements of a collocation. Often, the assumption seems to be that there is no difference in the nature of the elements, as for example in Cowie’s definition, which merely requires that one of the elements is restricted but does not specify which one (cf. above and also e.g. Cowie 1992: 5f.). A few theorists, in particular Hausmann and Mel’čuk, however, have stressed that there is a difference in the nature of the elements in a collocation. Mel’čuk’s distinction between the two lexical elements involved in a collocation is already present in his definition (cf. above). He calls the element of a collocation that has been selected on the basis of its meaning the ‘keyword’, and the element(s) it selects to express a certain meaning the ‘value’ (in do a favour, for example, favour is the keyword and do is the value, or more precisely, part of the value). Hausmann makes a very similar distinction, calling the element that is semantically autonomous and selected first in production the ‘base’ (or rather German ‘Basis’ and French ‘base’), and the element whose selection depends on the base the ‘collocator’ (‘Kollokator’ and ‘collocatif’).11 The difference between the two distinctions is that ‘value’ refers to all elements that collocate with the keyword to express a certain sense, whereas ‘collocator’ only refers to one element. In the collocations carry out / do / make / conduct a study, for example, carry out, do, make, and conduct are four collocators to the base study, but together (and perhaps together with one or two other verbs) they are the value of the lexical function ‘do’/‘perform’ of the keyword study (for more details on lexical functions, see Section 2.1.3).12
In addition to the two main approaches and their variations, collocations have been defined in numerous other, more idiosyncratic ways. Benson et al., for example, also use the term to refer to what are more commonly called valency patterns, such as suggest + -ing (1997: ix, cf. below). A few authors include compounds (e.g. Smadja 1993), and van der Wouden (1997: 6f.) even extends the term to cover combinations of morphemes that are to some degree fixed (e.g. cran-berry or ox-en). Examples of idiosyncratic usage can also frequently be found in the area of language teaching (cf. Bahns 1997: 9ff.). Taylor, for example, includes paradigmatic relations in her definition of collocations (1990: 2). The frequency-based and the phraseological approach are also sometimes mixed, with some authors who primarily adopt a phraseological approach additionally considering frequency as a defining criterion. Some authors primarily working in the framework of the frequency-based approach, in turn, have also introduced phraseological distinctions (e.g. Sinclair, who uses the term ‘idiom’; 1991: 172). What can also be found is a double use of the term ‘collocation’, i.e. its use in both the sense of the frequency-based approach and the phraseological approach in one and the same piece of work. F. R. Palmer, for example, on the one hand reserves the term for free and restricted combinations as opposed to idioms (1981: 77f.), and on the other refers to “the collocation of kick and the bucket” (1981: 79), where ‘collocation’ apparently means co-occurrence. A similar variation can be found in Quirk et al. (1985: 1197f., 1567, 772, 1172). Finally, a few other terms can be found for the syntagmatic phenomena described above, in particular for collocations in the phraseological sense, such as ‘non-idiom phraseological units’ (Nagy 1978: 296) or ‘idioms of encoding’ (Makkai 1972: 25).
2.1.2 Related concepts
Of the vast variety of existing concepts in the area of syntagmatic relations and word combinations (such as valency, metaphor, proverbs, ‘wesenhafte Bedeutungsbeziehungen’, semantic roles, semantic prosody, colligation), only three that are particularly relevant for the present study will be discussed here: the (related) notions of ‘selectional restrictions’ and ‘lexical solidarities’, and a type of combination that has been referred to as ‘stretched verb construction’.
The notion of ‘selectional restrictions’ (or ‘selection restrictions’, e.g. Leech 1974; Carter 1998) originated in generative grammar; it was introduced by Katz and Fodor (1963) in their attempt to add a semantic component to the generative model. Selectional restrictions are conditions for the combinability of elements which are a consequence of the meaning of a word and expressed by means of semantic features. For example, one selectional restriction of the verb kill is the requirement that the object has to contain the semantic feature [+ANIMATE]. The “presence” of selectional features is conceived to prevent the generation of combinations such as *kill a chair, which can be considered unacceptable or at least highly uncommon.13 Today, the term ‘selectional restrictions’ is found outside theories of generative grammar as well. A distinction is sometimes made between ‘selectional restrictions’, which originate in the meaning of elements, and restrictions which are arbitrary to a certain degree, which can be called ‘collocational restrictions’ (e.g. Herbst 1996: 385; Cruse 1986: 107).14 If this distinction is made, collocational restrictions can be said to result in restricted collocations and selectional restrictions in free combinations. Different views of when a restriction may be considered to originate in the meaning of a word, however, lead to different interpretations of the term ‘selectional restrictions’, which in turn lead to different distinctions between free combinations and collocations. Cruse’s broader view of collocations mentioned above, for example, is in part a consequence of his definition of selectional restrictions, as he only considers them to be based on the core meaning of the word. The fact that the verb pass away, for example, requires a human subject, is not a selectional but a collocational restriction according to Cruse, as it is not part of the core meaning of the verb (which is ‘to die’). A combination such as husband + pass away (as in Her husband passed away last night) would thus be considered a collocation in Cruse’s framework. Other theorists do not assume that selectional restrictions have to be a part of the core meaning of the word, and would therefore consider such restrictions ‘selectional’ (e.g. Bierwisch 1970; Herbst 1996).15
As with selectional restrictions, lexical solidarities are an attempt to explain the combinability of lexemes on the basis of their semantic characteristics. The concept of lexical solidarities, which was introduced by Coseriu (1967), differs from selectional restrictions in two respects, however. First, selectional restrictions are or at least originally were conceived as having primarily negative implications in that they are considered to be responsible for the blocking
of certain combinations. Lexical solidarities, on the other hand, are conceived as having primarily positive implications, in that they explain why certain elements tend to co-occur (cf. e.g. Kastovsky 1982: 249). Secondly, lexical solidarities are divided into three different types (examples from Kastovsky 1980: 87):
Affinity: A given lexeme can be combined with all lexemes containing a very general semantic feature such as [ANIMATE], [HUMAN], [MALE]. Example: The boy apologized, where apologize requires its subject to be [+HUMAN].
Selection: A given lexeme can be combined with others containing a certain archisememe,16 i.e. a feature that is less general but comprises the content of a whole lexical field, such as [PLANT] or [TIME]. Example: A week elapsed, where elapse requires a subject containing the feature [+TIME].
Implication: A given lexeme can only be combined with one other lexeme. Example: He shrugged his shoulders, with shrug usually allowing only shoulder as object.
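The three types of solidarity listed above can be made concrete in a small Python sketch. It is purely illustrative and rests on assumptions that go beyond the text: the feature sets assigned to the nouns (including the labels BODY_PART and FURNITURE), the table layout and the function name satisfies are hypothetical, and only the requirements of apologize, elapse and shrug are taken from the examples cited.

```python
# Hypothetical mini-lexicon: semantic features assumed for a few noun lexemes.
FEATURES = {
    "boy": {"ANIMATE", "HUMAN", "MALE"},
    "week": {"TIME"},
    "shoulder": {"BODY_PART"},
    "chair": {"FURNITURE"},
}

# verb: (type of solidarity, argument slot, required feature or required lexeme)
SOLIDARITIES = {
    "apologize": ("affinity", "subject", "HUMAN"),    # very general feature
    "elapse": ("selection", "subject", "TIME"),       # archisememe of a lexical field
    "shrug": ("implication", "object", "shoulder"),   # one specific lexeme
}


def satisfies(verb, slot, noun):
    """Check whether `noun` meets the requirement that `verb` places on `slot`."""
    kind, required_slot, requirement = SOLIDARITIES[verb]
    if slot != required_slot:
        return True  # this sketch records no requirement for other slots
    if kind == "implication":
        return noun == requirement  # only one lexeme is admitted
    return requirement in FEATURES.get(noun, set())  # feature must be present


print(satisfies("apologize", "subject", "boy"))    # True
print(satisfies("elapse", "subject", "chair"))     # False
print(satisfies("shrug", "object", "shoulder"))    # True
```

The sketch treats affinity and selection alike as feature checks, which differ only in how general the required feature is, and treats implication as a check against a single lexeme; it deliberately ignores the hedge in the text that shrug only usually allows shoulder as its object.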
If these concepts are related to the concepts of collocations and free combinations, the first two types of solidarity can be said to lead to free combinations and not to collocations in most of the definitions outlined above (Cruse being one of the few exceptions). What the third type (‘implication’) leads to is less clear and depends on the criteria employed; this type will be discussed in detail later on.
The third concept related to collocations to be outlined is not a type of syntagmatic relation but rather a particular type of word combination. This type of word combination has been referred to as ‘stretched verb construction’ (Allerton 2002), ‘support verb construction’ (Krenn 2000; Danlos 1992), ‘expanded predicate’ (Algeo 1995), ‘verbo-nominal phrase’ (Rensky 1964), ‘phrasal verb’ (Stein 1991; Live 1973), ‘complex verbal structure’ (Nickel 1968), or ‘delexical verb combination’ (Altenberg 2001), to name but a few. Examples of ‘stretched verb constructions’, which will be the term used in this study, are make an arrangement, give an answer or have a look at. What is special about these combinations is that the noun is derivationally related to a verb that is roughly synonymous with the whole combination: the meaning of make an arrangement, for example, largely corresponds to the meaning of arrange. The noun is eventive and carries the bulk of the meaning, while the verb contributes comparatively little to the lexical meaning of the combination and can therefore be called a ‘light verb’.17 Similar to collocations, stretched verb constructions are not easily distinguishable from other types of word combinations and the definitions vary widely and also independently of the term that is being used. The most restrictive definitions only include combinations of one of the verbs make, take, give, and have with an indefinite article and an eventive noun that is identical in form to the verb with which the whole construction
is roughly synonymous (e.g. Labuhn 2001). Less restrictive definitions also include combinations with other verbs (such as run a risk), a different or no article (such as take action), combinations in which the noun is a prepositional object (such as take sth into consideration), and combinations in which the noun is phonetically and/or derivationally related to the verb (e.g. make a decision – decide, take a breath – breathe, offer an apology – apologize). Broad definitions sometimes additionally allow combinations of verb and adjective and copular constructions in which the noun denotes an agent (be critical, be a helper; e.g. Allerton 2002). Many broader definitions also include combinations that have an equivalent verb in the passive or in a reflexive or causative construction (take offence – be offended, give sb a good feeling – make sb feel good; e.g. Allerton 2002) and combinations of a light verb and an eventive noun which do not have a roughly synonymous verb related to the noun (make an effort, e.g. Altenberg 2001). Interestingly, although the concept of stretched verb constructions is quite frequently discussed and used, the relation of these combinations to collocations is not. If the relationship is mentioned or even made explicit, it is usually assumed that stretched verb constructions – at least the verb-(object-)noun type – are a type of restricted collocation (e.g. Allerton 2002: 221). Sometimes a distinction is made between stretched verb constructions and collocations (e.g. Caroli 1995). Combinations with light verbs are then considered stretched verb constructions, while those with other types of verbs (in particular figurative verbs) are considered collocations.
2.1.3 Classifications of collocations
As the definition of collocations in this study is phraseological rather than frequency-based and narrow rather than broad (i.e. includes only restricted collocations and not open collocations), only classifications of restricted collocations will be considered here. There have not been many attempts to classify restricted collocations, but the classifications that have been made can be divided into three types. The first type is based on the syntactic characteristics of the collocation, the second on its semantic characteristics and the third on the commutability of its elements. In the first type, restricted collocations
[...] appointed), verb + adverb (severely criticize), and verb + (object-)noun (stand a chance). Aisenstadt (1981) proposes a similar classification, but divides the verb + noun group further into verb + noun (e.g. make a decision) and verb + preposition + noun (e.g. come to a decision). Benson et al. make the same distinctions as Hausmann, but owing to their broader definition of collocations add the combinations noun + preposition (e.g. interest in), preposition + noun (by accident), and adjective + preposition (angry at).18 They also make a more fundamental distinction, which is based on the word classes to which the elements of a collocation belong. Collocations in which two lexical elements co-occur are called ‘lexical collocations’, collocations in which a lexical and a more grammatical element (such as a preposition) co-occur, are called ‘grammatical collocations’.19
The second type of classification is based on the semantic characteristics of the combination, or more precisely, on the semantic characteristics of the collocator (in Hausmann’s terminology). Two different kinds of attempt to classify collocations in this way can be discerned. One is limited to verb-noun collocations and is based on the nature of the meaning of the verb. Cowie (e.g.
1991, 1992) is one of the few researchers who attempt such a classification.
He distinguishes between verbs with a ‘figurative’, a ‘delexical’ and a ‘technical’ (or ‘semi-technical’) meaning. Corresponding collocations (which he does not
label) would be deliver a speech, make recommendations, try a case (1992: 6).
Cowie’s classification, which was also adopted by Howarth, was probably inspired by Aisenstadt (1979), who makes a similar distinction between verbs with a “secondary, abstract meaning”, verbs with a “grammaticalized, wide and vague meaning”, and verbs with a “very narrow and specific meaning” (1981: 57). From the examples she gives, it may be assumed that the first two categories roughly correspond to Cowie’s figurative and delexical meanings;
the last category is exemplified by shrug one’s shoulders.20 The second type of attempt at classifying collocations on a semantic basis is much more detailed and also more comprehensive in that it applies to all grammatical types of collocations. The classification has been devised primarily by Igor Mel’čuk and is based on the notion of what he calls ‘lexical functions’. A lexical function is a (typically general) meaning that may be expressed by a variety of different lexemes, but in a given collocation, the lexeme(s) which express(es) this meaning
is chosen by the keyword (Hausmann’s ‘base’). An example of a lexical function
is the meaning ‘do’/‘perform’. If this meaning is to be expressed with respect to
the noun cry, one of the possible lexemes is let out; with respect to support,
lend is possible. Possible collocations thus are let out a cry and lend support
(but not *lend a cry or *let out support); the lexical function is called Oper.21 In theory, it should be possible to classify most collocations according to lexical functions, i.e. according to the meaning the collocator expresses. So far, however, at least for English, this has not been done,22 although a large number of lexical functions that occur in many languages and whose value is expressed
by a large number of lexemes has been identified. Additional examples of
lexical functions are Magn (‘intense(ly)’/‘very’; e.g. stark naked), Incep (‘begin’; e.g. catch fire), Func (‘function’; e.g. snow is falling), and Liquid (‘causation of
non-existence’; e.g. lift a blockade).23
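The idea that the keyword selects the lexeme expressing a lexical function can be pictured as a lookup table. The following Python sketch is purely illustrative and not part of the original study: the table name, the two helper functions and the choice of data structure are assumptions, and the entries are limited to the examples cited in the text.

```python
# Hypothetical toy table: (lexical function, keyword) -> collocator lexemes.
# Entries are restricted to the examples quoted in the surrounding text.
LEXICAL_FUNCTIONS = {
    ("Oper", "cry"): {"let out"},
    ("Oper", "support"): {"lend"},
    ("Magn", "naked"): {"stark"},
    ("Incep", "fire"): {"catch"},
    ("Func", "snow"): {"fall"},
    ("Liquid", "blockade"): {"lift"},
}


def collocators(function, keyword):
    """Return the lexemes that express `function` for `keyword` (empty set if unknown)."""
    return LEXICAL_FUNCTIONS.get((function, keyword), set())


def is_licensed(function, keyword, collocator):
    """True if `collocator` is a recorded value of `function` for `keyword`."""
    return collocator in collocators(function, keyword)
```

Under these assumptions, `is_licensed("Oper", "cry", "let out")` holds while `is_licensed("Oper", "cry", "lend")` does not, mirroring the contrast between let out a cry and *lend a cry.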
The third type of classification that has been established for restricted collocations is based on the commutability of the elements of a collocation. The distinction made by Benson et al. between ‘collocations’ and ‘transitional collocations’ (see above) may be considered an example of this type of classification,
as it is based on the variability of elements (foot a bill, for example, is assigned to this category, as foot in this meaning can only combine with bill; 1986: 254).24
An attempt to explicitly subclassify collocations on the basis of commutability
is made by Aisenstadt (1979, 1981). She divides collocations into two groups depending on whether both or only one of the participating lexical elements are restricted in their commutability.25 In the combination shrug one’s shoulders, for example, she assumes that both lexical elements are restricted, and
illustrates this with the following paradigms (1981: 55, 56; 1979: 73):
shrug one’s shoulders        shrug one’s shoulders
shrug sth off                square one’s shoulders
shrug sth away               hunch one’s shoulders
In combinations such as to make/take a decision or auburn hair, on the other
hand, only one element is considered restricted in its commutability. In the former combinations, the verbs are said to “have a rather wide and vague meaning and collocate with different nouns” (1981: 57), whereas the noun “is restricted
in its commutability, though not [...] to one verb only” (1981: 56); in the
latter combination, the commutability of auburn is considered to be restricted to
hair whereas hair “commute[s] freely with a great number of other adjectives”
(1981: 57).
The most comprehensive classification on the basis of commutability to date has been established by Howarth; it is, however, restricted to verb-noun collocations. Howarth distinguishes five ‘levels of restrictedness’ according to
two criteria, namely the number of elements that are restricted in their commutability and the degree of the restriction (1996: 105). These levels are described and exemplified as follows (1996: 102):
1. freedom of substitution in the noun; some restriction on the choice of verb
   an open set of nouns
   a small number of synonymous verbs
   adopt/accept/agree to a proposal/suggestion/recommendation/convention/plan
2. some substitution in both elements
   a small range of nouns can be used with the verb in that sense
   there are a small number of synonymous verbs
   introduce/table/bring forward a bill/an amendment
3. some substitution in the verb; complete restriction on the choice of the noun
   no other noun can be used with the verb in that sense
   there are a small number of synonymous verbs
   pay/take heed
4. complete restriction on the choice of the verb; some substitution of the noun
   a small range of nouns can be used with the verb in that sense
   there are no synonymous verbs
   give the appearance/impression
5. complete restriction on the choice of both elements
   no other noun can be used with the verb in the given sense
   there are no synonymous verbs
   curry favour
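The five levels above can be read as a mapping from the degree of substitution allowed in the noun and in the verb to a level number. The Python sketch below is only an illustration of that reading and is not taken from Howarth or from the present study; the names `Substitution`, `LEVELS` and `restrictedness_level` are hypothetical, and the three-way scale (free, some, none) is an assumed simplification of the descriptions in the list.

```python
from enum import Enum


class Substitution(Enum):
    """Assumed three-way scale for how freely an element can be replaced."""
    FREE = "free"   # open set of alternatives
    SOME = "some"   # small range of alternatives
    NONE = "none"   # no alternative in the given sense


# (noun substitution, verb substitution) -> level, following the list above.
LEVELS = {
    (Substitution.FREE, Substitution.SOME): 1,  # adopt/accept/agree to a proposal/...
    (Substitution.SOME, Substitution.SOME): 2,  # introduce/table/bring forward a bill
    (Substitution.NONE, Substitution.SOME): 3,  # pay/take heed
    (Substitution.SOME, Substitution.NONE): 4,  # give the appearance/impression
    (Substitution.NONE, Substitution.NONE): 5,  # curry favour
}


def restrictedness_level(noun, verb):
    """Return the level (1-5), or None if the pair is not one of the five levels."""
    return LEVELS.get((noun, verb))
```

Pairs not covered by the list, such as free substitution in both elements, return `None` in this sketch, since they fall outside the five levels described.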
Howarth also relates these five levels to the semantic categorization of combinations into those with figurative, delexical, and technical verbs, which he adopts from Cowie. He finds that there are no combinations with figurative and technical verbs on level five and no combinations with technical verbs on level one, but that all other combinations of type of verb and level of restrictedness occur (1996: 118), which means that there is no direct correlation between the type of verb in the combination and its level of restrictedness. This attempt
to relate aspects of meaning and commutability in some detail in a classification
of collocations is probably unique to date.
The definition of collocations in this study
The approach to collocations in the present study is, as has already been pointed out, phraseological rather than frequency-based, i.e. collocations are considered a type of word combination in a certain grammatical pattern, and the term ‘collocation’ will be used both to refer to an abstract unit of language and its instantiations in texts. It will, however, exclusively be applied to restricted collocations; for the ones that have been called ‘open collocations’, the term ‘free combinations’ will be used. A collocation will not be considered to be restricted to two lexical elements but taken to include the other elements closely
associated with them as well: put pressure on sb., for example, will be considered
a collocation (and not merely put + pressure). Similarly, take an interest in will
be referred to as a collocation, although strictly speaking two collocations, one
lexical (take an interest) and one grammatical (interest in), are present here. The
elements involved in collocations are assumed to be lexemes, i.e. it is assumed
that combinations such as pay attention, pays attention, paid attention and
attention was paid are instantiations of the same collocation. This does not mean,
however, that it is assumed that all theoretically conceivable instantiations of a certain collocation exist, let alone that they are equally common.
Three major types of non-formulaic word combinations will be distinguished: free combinations, collocations (or restricted collocations), and idioms (abbreviated as F, RC, and I, respectively).26 An attempt will be made to distinguish these three types as clearly as possible, so that the delimitation is not only theoretically consistent but also applicable both to corpus data and in the area of language teaching. Of the existing delimitations, none was found
to sufficiently meet these criteria, so that a new delimitation of word combinations and a new definition of collocations had to be developed. The definition has only been developed for verb-noun combinations, though it may be assumed that it is also applicable to other types of combinations without major modification.
Most of the phraseological definitions of collocations are based on several criteria, and often these criteria are assumed to coincide. The two criteria that are most commonly used, as for example in the definition by Cowie, are opacity and commutability. Collocations are then defined as combinations in which
at least one element has a non-literal meaning (and at least one a literal one)
and in which commutability is arbitrarily restricted, but some commutability
is possible. What is problematic about such a definition is that the two criteria, although correlating to some degree, do not regularly coincide.27 This lack of correlation between different criteria that have traditionally been employed to
classify phraseological combinations has been pointed out by a few authors recently (Barkema 1996; Hudson 1998). For the non-coincidence of opacity and commutability, three groups of examples can be adduced (the examples will
be of verb-noun combinations only). First, there are combinations in which one element is used in a non-literal sense, but in which the elements are not arbitrarily restricted in their commutability. One example is combinations with
face in a figurative sense meaning ‘to have to deal with a particular situation’,
such as face a financial crisis, face a task, face a period of unemployment, face
her anger. These would be classified as collocations on the basis of the fact that
they are non-literal, and as free combinations on the basis of the criterion of commutability, as the choice of object seems unlimited as long as it refers to
some kind of difficult or unpleasant situation. Further examples are take in the sense of ‘need or require a particular amount of time’ as in It took her three
hours to repair her bike or The journey to the airport takes about half an hour, or push in the sense of ‘make sb work hard’ as in The music teacher really pushes her pupils (OALD). The (direct) objects of these senses of the two verbs are not
arbitrarily restricted either; the only requirements are that for take the object denotes a period of time and for push, that the object is human. Secondly, there
are combinations in which both elements probably have to be considered literal, but which are nevertheless restricted in their commutability. A problem related to this is that it is often difficult to decide whether an element is used
in a literal or a figurative sense (cf. Howarth 1998: 98f.). A combination such
as commit a crime (or commit a sin, an error etc.), for example, is restricted in its commutability (?commit a lie, a deceit, a delinquency, cf. Klotz 2000: 94) but
can probably not be considered to contain an element in a figurative sense, as
both commit and crime are used in their primary senses (as evidenced by
dictionaries). Thirdly, there are combinations in which both elements are used in a figurative sense, but where a great degree of commutability is nevertheless possible. Take steps in a context such as steps were taken to prevent this, for example,
which would be classified as a ‘figurative idiom’ by Cowie, allows
commutability of both elements although both are used in their figurative senses. Take
measures or take action, for example, are possible as well as envisage steps, or consider steps.
A solution to this problem is to define collocations on the basis of one criterion only. Such a solution has already been hinted at by those authors whose definitions are based on both opacity and commutability. Often one of these is declared as the main criterion, and in particular the distinction between collocations and free combinations is often drawn exclusively on the basis of the criterion of commutability. Even Howarth, who explicitly adopts
Cowie’s definition of collocations (1996: 47), only applies the criterion of commutability when it comes to actually dividing a group of combinations into collocations and free combinations (101f.). Sometimes the criterion of commutability has even been called ‘collocability’ (Barkema 1997; Cowie 1994).
It seems, therefore, that commutability is generally seen as the more relevant
of the two criteria; in addition, it is also easier to measure than opacity (cf. Hudson 1998: 35). In my definition, I will therefore consider only commutability a defining criterion for collocations. I will also show that it is possible to use it both for the distinction of collocations and free combinations and for the distinction of collocations and idioms. No other criteria will be adopted (including the criterion of frequency which has been introduced into many primarily
phraseological definitions). A combination such as commit blasphemy
will, therefore, although it is rather infrequent, be considered a ‘collocation’.28
A second serious problem inherent in many definitions of collocations concerns the criterion of commutability itself: arbitrary restriction on commutability is interpreted in widely different ways, and it is also often not made clear what exactly is meant. Aisenstadt, for example, assumes the commutability
of shrug to be restricted to one’s shoulders, sth off, and sth away (cf. above), and in turn the commutability of shoulders to be restricted to shrug, square and hunch. In the first case, the commutability examined is between a pronoun
plus a specific noun and an unspecified noun plus a particle of a phrasal verb
(one’s shoulders commutes with sth off/away). In the second, it is constrained to
one specified word of one class (verbs), while both the noun and the pronoun
(one’s shoulders) remain constant. In addition, it seems that while the complementation of shrug is really fairly exhausted with the examples given, this
is by no means the case with shoulders: straighten one’s shoulders, wash one’s
shoulders, look at one’s shoulders, rub one’s shoulders, scratch one’s shoulders and
many more are conceivable, as the examples given both for shoulders and for
shrug do not indicate that the verbs are required to be synonymous. The same
applies to make/take a decision, where decision is said to be restricted to a few
verbs (cf. above). Again, the requirement does not seem to be that the verbs
or the combinations are synonymous, as the combination auburn hair is cited
as a parallel case, and the wide combinability of hair with other adjectives of
any kind is taken to mean that its collocability is not restricted. If one looks at
the combinability of decision with other (synonymous and non-synonymous) verbs, however, many additional verbs are possible: reach a decision, come to a
decision, postpone a decision, criticise a decision, explain a decision etc. Cowie, on
the other hand, limits the notion of restricted commutability to synonyms or near-synonyms. Among the examples he cites for restricted commutability are