Bricks or Mortar: Which Parts of the Input Does a Second Language Listener Rely on?
A separate channel of processing for functors would enable them to be detected faster. The question is of importance to our understanding of second language (L2) listening. Because what is extracted from the input by L2 listeners is generally less than complete, it is useful for the instructor to know which parts of the signal they are likely to recognize, and which parts are likely to be lost to them. On the one hand, L2 listeners might rely heavily on function words because high frequency renders them familiar. On the other, they might have difficulty identifying function words confidently within a piece of connected speech because functors in English are usually brief and of low perceptual prominence. The current study investigated intake by intermediate-level L2 listeners to establish whether function or content words are processed more accurately and reported more frequently. It found that the recognition of functors fell significantly behind that of lexical words. The finding was remarkably robust across first languages and across levels of proficiency, suggesting that it may reflect the way in which L2 listeners choose to distribute their attention.
Grammarians have long found it useful to identify two categories of lexical unit. The distinction is often expressed in terms of a closed class (of prepositions, determiners, auxiliary verbs, etc.) to which new items are very rarely added, and an open class (consisting mainly of nouns, verbs, adjectives, and adverbs of manner) which is constantly being expanded as new items are coined (Quirk, Greenbaum, Leech, & Svartvik, 1985, pp. 67–68). A more traditional way of defining the categories is by distinguishing items which fulfil a largely syntactic function (function words) from those which bear lexical meaning (content words). This distinction gives rise to grey areas: For example, the preposition “in” clearly does not have the same level of lexical meaning as the word “book,” but it still appears in dictionaries and can be demonstrated by a language teacher. Nevertheless, it is this semantic distinction which is adopted in the current study, for reasons that will become evident.
A linguistic concept does not necessarily correspond to a psycholinguistic one. Just because a category or structure is recognized in grammar theory, one cannot take it for granted that it has psychological reality, that is, that it plays any part in the way the mind constructs or understands utterances. However, within first language (L1) psycholinguistics, a great deal of evidence has accumulated which suggests that function words are processed differently from those that bear lexical meaning. Some of the earliest indications came from examples of slips of the tongue, where it was noted that content words are quite often misplaced (rules of word formation → words of rule formation) but that function words tend not to be. This evidence suggested to some commentators (e.g., Garrett, 1980) that assembling an utterance demands two distinct processes, with the speaker first constructing a frame in which certain positions are reserved for the mortar of function words and then inserting meaning-bearing bricks in the form of nouns, verbs, and adjectives.
Further early evidence (for listening, see Swinney, Zurif, & Cutler, 1980) came from patients who had suffered damage to Broca’s area in the brain as a result of a stroke, an accident, or surgery. Their vocabulary store seemed to remain relatively intact, but access to grammar (including inflections and function words) was often impaired. This led Bradley (1978) to conclude that the two categories are stored separately in the mind and/or accessed in different ways. The symptoms associated with Broca’s aphasia might be the result of damage to a part of the brain where function words are grouped or of damage to the route by which the individual retrieves these items when needed.
It is also suggestive that infants appear to recognize function words quite early in their language development (Shi, Werker, & Cutler, 2006) but do not produce many of them until quite late (Radford, 1990), despite their high frequency. The delay might be interpreted as an indication that, as speakers, they need to establish a separate retrieval process for this category.
The long-standing assumption that content and function words are stored and processed differently has not gone unchallenged. From tests using spliced sections of speech, Herron and Bates (1997) concluded that, though listeners access function words rapidly, they depend upon context in order to decode them unambiguously. Segalowitz and Lane (2000) argued that it is not necessary to assume two stores because the high frequency of function words will always lead to their being recognized more rapidly than most content words. However, the separate store view has been supported by recent neurological evidence obtained from brain imaging (e.g., Brown, Hagoort, & ter Keurs, 1999; Münte et al., 2001). The areas of the brain that are associated with function word processing appear to be rather different from those associated with accessing the wider lexicon. In addition, event-related potentials measuring electrical activity in the brain suggest neurological differences in the way the two categories are processed (Kutas & Van Petten, 1994, pp. 125–127).
Using two different routes has practical benefits for a listener (or indeed a reader). In order to identify a function word, a simple pattern match is all that is required; there is no need to gain access to a meaning. The process so far as a content word is concerned is considerably more complex. The word might possess not one but a range of potential senses, all of which have to be accessed (Swinney, 1979) before one is chosen that accords with the context in which the word appears. So a separate route for function words might enable them to be identified more rapidly and less ambivalently than content words.
If we assume that content and function words are processed separately in this way, then how does a listener, exposed to a group of sounds, manage to determine at the outset which parts of the input are likely to correspond to functors and which to content words? Basing their analysis on English,1 Grosjean and Gee (1987) suggest that listeners exploit the perceptual difference between stressed syllables, which occur almost exclusively in content words (and indeed may even serve as the principal means of identifying those words), and unstressed syllables, which often correspond to monosyllabic, weak quality functors. Stressed syllables initiate a lexical search, while unstressed ones lead in the first instance to a simple pattern-matching procedure.
Grosjean and Gee (1987) assume that there are two separate stores, with one list consisting purely of functors and another of both functors and content words. The first is thus available for a rapid identification process, while the second enables factors such as frequency, multiple meaning, and contextual constraints to be brought to bear on any decision. The consequence is a much faster identification of function words.2
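The routing that Grosjean and Gee describe can be pictured with a minimal sketch in Python. Everything in it is an assumption made for illustration: the names `Syllable`, `FUNCTOR_STORE`, `FULL_LEXICON`, and `first_pass`, the tiny word lists, and the ASCII stand-ins for sound shapes (`@` for schwa) are hypothetical and are not taken from the original proposal. The sketch shows only the general idea that an unstressed syllable is first tried against a functors-only list, while a stressed syllable triggers a search of the full list.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    form: str       # rough ASCII stand-in for the syllable's sound shape
    stressed: bool  # True if the syllable is perceptually prominent


# Hypothetical stores: one list of functors only, one of functors plus content words.
FUNCTOR_STORE = {"@": ["a"], "@v": ["have", "of"], "D@": ["the"], "t@": ["to"]}
FULL_LEXICON = {**FUNCTOR_STORE, "get": ["get"], "kat": ["cat"], "buk": ["book"]}


def first_pass(syllable: Syllable) -> tuple:
    """Return the route taken and the provisional candidates for one syllable."""
    if syllable.stressed:
        # Stressed syllables initiate a search of the full lexicon.
        return "lexical search", FULL_LEXICON.get(syllable.form, [])
    # Unstressed syllables are first tried against the functors-only list.
    return "pattern match", FUNCTOR_STORE.get(syllable.form, [])


if __name__ == "__main__":
    for syl in [Syllable("get", True), Syllable("@", False)]:
        print(syl.form, first_pass(syl))
```

In this sketch a match returned by the functor route is only a first guess; nothing prevents a later search of the full list from overturning it.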
1 The proposal related to English, but many other languages appear to downgrade the prominence of function words in terms of their duration, their loudness, or their vowel quality.
2 By including the entire vocabulary, the second list allows for gradations in the extent to which words convey “meaning.”
It should be stressed that any identification is provisional. Just because an apparent functor match is achieved, it does not mean that it will be sustained by the subsequent search. To give an example, the group [əvɜːtɪd]3 might initially be interpreted as HAVE + past participle of verb before being revised when the lexical search reveals that there is no such verb as “erted.” Similarly, a decision on a possible functor match would be sustained in a sequence like [getəˈlaf] but would be overruled in a sequence such as [ˈkætəlɒg], once the larger word became available.
Current models of listening represent lexical retrieval as involving a form of competition with rival candidates receiving activation according to how probable they are as a match, that is, according to how well they fit what has been heard and to their relative frequency. Competition can take place across word boundaries, so, halfway through the word “catalogue,” potential whole word matches such as “catalogue” and “catapult” compete with possibilities such as “cat + a” or “cat + of.” What gives function words a head start, in principle at least, is their high frequency (Segalowitz & Lane, 2000). However, a problem lies in the disproportionately high frequency of many of them,4 which could lead to their dominating content words in which they are embedded (one thinks of “the” in “weather”). Bard (1990, p. 206) suggests that there is a trade-off between the weak perceptibility of most function words, which depresses their activation, and their high frequency.
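The trade-off between perceptibility and frequency can be made concrete with a toy calculation. The sketch below is not a model proposed by any of the authors cited: the scoring rule (perceptual fit multiplied by log frequency), the function names `activation` and `rank`, and the fit values are all assumptions chosen for illustration. Only the two frequencies per million are taken from the spoken British National Corpus figures quoted in footnote 4.

```python
import math


def activation(fit: float, freq_per_million: float) -> float:
    """Toy activation score: perceptual fit (0 to 1) weighted by log frequency.

    The multiplicative form and the log scaling are assumptions of this sketch.
    """
    return fit * math.log10(freq_per_million + 1)


def rank(candidates: dict) -> list:
    """Sort candidate words from most to least activated.

    `candidates` maps a word to a (fit, frequency per million) pair.
    """
    scored = {word: activation(fit, freq) for word, (fit, freq) in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Frequencies per million are those quoted in footnote 4; fit values are invented.
clear_input = {"the": (0.9, 39605), "know": (0.9, 5550)}   # both heard equally well
weak_functor = {"the": (0.4, 39605), "know": (0.9, 5550)}  # functor poorly perceived

if __name__ == "__main__":
    print(rank(clear_input))
    print(rank(weak_functor))
```

With equal fit the more frequent functor comes out ahead, whereas a sufficiently degraded fit lets the content word overtake it, which is the kind of trade-off Bard describes.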
Like Grosjean and Gee, Cutler (1993) suggests that perceptual cues can be used by a listener to initiate separate searches, though she prefers to rely on a distinction between strong stressed syllables with full quality vowels and weak unstressed ones marked by the presence of schwa. She points out that the structure of the English lexicon means that any first-pass association of content words with the former and function words with the latter has a high chance of success. Of the strong syllables in a corpus examined by Cutler and Carter (1987), 86% occurred in open-class words and only 14% in closed-class words. The pattern was reversed for weak syllables, with 72% in closed-class words and 28% in open-class words.
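The chance of success of such a first-pass heuristic can be worked through with the two percentages just cited. The sketch below assumes a listener who assigns every strong syllable to an open-class word and every weak syllable to a closed-class word. The function name `first_pass_accuracy` and the parameter `share_strong` (the proportion of syllables that are strong) are hypothetical; the text does not report that proportion, so it is left as an input.

```python
P_OPEN_GIVEN_STRONG = 0.86  # strong syllables in open-class words (Cutler & Carter, 1987)
P_CLOSED_GIVEN_WEAK = 0.72  # weak syllables in closed-class words (Cutler & Carter, 1987)


def first_pass_accuracy(share_strong: float) -> float:
    """Expected proportion of syllables assigned to the right word class.

    `share_strong` is an assumed proportion of strong syllables in the input.
    """
    if not 0.0 <= share_strong <= 1.0:
        raise ValueError("share_strong must lie between 0 and 1")
    share_weak = 1.0 - share_strong
    return share_strong * P_OPEN_GIVEN_STRONG + share_weak * P_CLOSED_GIVEN_WEAK
```

Whatever mix of strong and weak syllables is assumed, the expected success rate of the heuristic falls between 72% and 86%.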
Stankler (reported in Cutler, 1993) supported these statistics with psychological evidence of a link between weak quality and membership of the functor class. He trained English speakers in an artificial language which observed the same prosodic distinctions as in English, with weak quality items serving grammatical functions and strong quality items carrying lexical content. The results were compared with those from participants who learned the same artificial language but with the content–functor distinction random, reversed (strong syllables marking function words), or null (all syllables strong). Stankler reported a significant advantage in favour of the version that followed the same distinction as English. The conclusion he drew is that English listeners know and exploit the connection between vowel quality and membership of the two classes.
3 The phonemic transcription in this article follows the conventions of the International Phonetic Alphabet. The examples given are based upon standard southern British English.
4 Of the 100 most frequent items in the spoken British National Corpus (Leech, Rayson, & Wilson, 2001, p. 144), only around 16 are clearly identifiable as open-class words. The word “the” leads the field, with a frequency of 39,605 per million words, with its nearest rival “I” at 29,448. By comparison, the first content word “know” has a frequency of 5,550 per million.
A rather different slant on how listeners handle input comes from studies of verbatim recall. We know that, after a lapse of time, content words are remembered more accurately than functors. The most obvious explanation is that functors decay from memory more quickly once the utterance has been turned into an abstract idea because they are not central to the final meaning. But it might also be that any sections of the input that potentially correspond to function words are awarded less attention by the listener at the time they are being heard. They are processed more shallowly because they are perceptually weaker and less easy to decode with confidence. This explanation would correspond to evidence from reading. L1 readers process function words more cursorily (Haberlandt & Graesser, 1989) and skip them more often (Carpenter & Just, 1983). The findings may be partly connected to the shortness of most functors, but readers have also shown themselves less accurate in crossing out a target letter when it occurs in a function word than when it occurs in a content one (Rosenberg, Zurif, Brownell, Garrett, & Bradley, 1985).
RELEVANCE TO SECOND LANGUAGE (L2) LISTENING
We must now relate these issues to L2 listeners. Many other languages besides English distinguish content and function words by means of their relative prominence, so learners will often be familiar with the principle from their L1. But they still need to recognize the specific cues that mark the difference in English, and to develop an association between cue and word category.
Lexically stressed syllables in English are marked in several ways: by greater duration, loudness, and pitch, and by possessing full quality vowels (Laver, 1994, pp. 512–514). Most learners succeed in distinguishing stressed from unstressed syllables quite reliably at an early stage. However, the ability to make a content/functor attribution may depend upon the overall rhythm of English, perhaps upon the brevity of unstressed syllables compared to stressed ones. Eastman (1993) produces evidence that learners face an important obstacle in distinguishing content words and functors when their L1 does not resemble English rhythmically. He suggests that speakers of what are traditionally called syllable-timed languages are at a disadvantage compared with those who speak stress-timed languages.
Clearly the ability to identify content words confidently is often compromised by the learner’s limited vocabulary. A great deal depends on how accurately the word in question is represented in the listener’s mind and how many possible variants of it the listener is able to recognize (Field, 2008b, see chapter 9). L2 listeners constantly need to make allowance for the fact that a given string of sounds may not correspond to any of the lexical items that they currently know.
But it is functors that pose the most intriguing questions. On the one hand, they are highly frequent. They are also limited in English to around 300 single or multiword items. We can assume that the intermediate-level listener has encountered most of them many times over and has been able to build up a repertoire of the possible variations to which they may be subject. On the other hand, the information in the input that signals the possible presence of a function word is usually (because of its unstressed status) very brief and of low perceptibility. It is thus potentially unreliable. Discussing L1 listeners, Shillcock and Bard (1993) point out that the high frequency of function words is counterbalanced by the fact that the uncertain evidence provided by the input may lead us to form matches with a large number of possible words:
As closed-class words are often pronounced as weak syllables, they are likely to have short, centralized vowels or no vowel at all as well as reduced, imperfectly articulated consonants. The set of lexical competitors activated by such poor input could be quite large and yet, because the acoustic evidence is so poor, have no clear front runners. (p. 182)
If this uncertainty obtains with an L1 listener, how much more pronounced it must be with an L2 listener, especially one whose phoneme values in English are shaky.
There is thus a conflict between high frequency/familiarity and low perceptual evidence. It is of interest to see which of the factors prevails for most L2 listeners. The relevance is that a great deal of L2 listening is highly strategic in nature. In general, listeners succeed in decoding far less of the input than is generally assumed (Field, 2008b, see chapter 15). They are thus quite heavily dependent on compensatory techniques to supply the words or the concepts that they have not succeeded in identifying. We know very little about the perceptual information that the inexperienced L2 listener draws upon to support these strategic guesses. We know even less about which items within that information are likely to be reliable and which items listeners should be cautious about trusting.
Hence the question addressed in this study: Do function words or content words feature more reliably in the bottom-up data that becomes available to the listener? Do listeners structure their interpretation of a partially understood piece of spoken input around familiar functors? They might do so, not as part of a “mortar-first” approach to syntactic processing but simply by virtue of the fact that they know and recognize function words and rate them as dependable. This would assist them, as Cutler (1993) suggests, to work out where content words begin and end. Just such a procedure has been hypothesized in the early stages of L1 acquisition (Christophe, Guasti, Nespor, Dupoux, & van Ooyen, 1997).
Alternatively, do listeners adopt a decoding strategy that is primarily semantic, and rely more heavily on perceptually reliable information in the form of lexically stressed syllables that provide cues to meaning-bearing words? Intuitively, this seems to be the more likely (because more informative) line of attack. But it is a line of attack fraught with dangers. Listeners may need to (a) allow for the fact that a targeted content word is not in their vocabulary at all, (b) access not one but a range of possible senses, and (c) relate those senses to the context in which the word appears. In relation to (c), it is important to bear in mind that the context in question might be incomplete and fragmented because not everything in the utterance has been successfully decoded.
DECODING BY L2 LISTENERS
Relatively few studies have investigated how much of a piece of natural speech an L2 listener succeeds in matching to words. The reasons have been partly practical: Multiple variables need to be considered, among them the sample of speech that is chosen, the familiarity of the speaker’s voice and accent, the listening experience of the participants, and so on. The reasons have also been historical. In the 1980s and early 1990s, there was a received idea among TESOL practitioners and researchers that the listener’s ability to map from sounds to words was not of primary importance because any lapses at this lower level could be compensated for by the use of contextual information. It is only relatively recently (Field, 2008a; Lynch, 2006; Vandergrift, 2004) that thinking has moved on and there has been renewed interest in perceptual processing. Even so, much of the focus of attention has been on how the phonology of L1 constrains the perception of L2 at phoneme level (Strange, 1995).
One way of investigating the decoding of natural speech is to play a recording to a population of L2 listeners and to ask them to transcribe what they understand. The resulting data can be subjected to error analysis and thus provides input to remedial classroom practice. It also indicates what linguistic information is available to listeners. A number of research studies have adopted a transcription method, often with a primary interest in vocabulary (i.e., content word) recognition. Fishman (1980) concluded from responses obtained in a dictation exercise that the errors of L2 listeners were not dissimilar to those of L1 listeners. Voss (1984) examined the effects of hesitation and accent on accuracy and on the types of error made. There were lexis versus syntax studies by Conrad (1983) and Kelly (1991), though the criteria they used for the subsequent classification of errors are open to challenge. Mack (1988) exposed learners to pieces of computer-generated speech and to anomalous utterances such as “A painted shoulder thawed the misty sill.” More recently, Bonk (quoted in Pemberton, 2004) traced the relationship between comprehension and the accuracy with which content words are transcribed, and Pemberton (2004) investigated the decoding of 27 Hong Kong listeners, using a carefully designed taxonomy of error types.
A brief look at the results of Voss’s first task indicates that, of 193 errors by his 22 high-level German learners, 109 involved function words. However, the figures are not as clear-cut as they might appear: Some words were wrongly reported by nearly all respondents, others by only one or two. The data does not indicate the relationship between erroneous items and those correctly matched or the relative prevalence of functors and content words in the text. Pemberton’s detailed analysis included consideration of the effects of word frequency and word category on recognition. He reported (pp. 41–42) an unexpectedly low recognition rate for high-frequency words, all of them known to the participants. He also reported very little difference between recognition rates for function and content words (respectively, 74% and 79%).
An important aspect of the transcription method adopted by both Voss and Pemberton is that it permitted participants to rewind as often as they wished in order to check their answers. The opportunity for recursion means that the levels of accuracy and the types of error must differ from those that would obtain in real life. One might postulate that errors in situations where only one hearing is possible would be higher than those recorded by Pemberton and that the ratio between accurately reported content and functor words might well be different.
RESEARCH DESIGN
Method: Paused Transcription
The method used in this study therefore required listeners to report back immediately after hearing the target sections of speech and without the possibility of rewinding. Short sections of only four or five words were targeted, with a view to limiting possible memory effects. Small-scale pieces of transcription run the risk of directing attention to decoding at word level, thus eliciting a set of processes which do not resemble those of a normal listening encounter. A paused transcription method was therefore adopted. In this paradigm, participants are asked to listen to an authentic piece of connected speech. Pauses are inserted into the recording at irregular intervals; and, whenever a pause occurs, participants are asked to transcribe the last few words. The rationale is that, for most of the recording, participants are listening as they would in real life, for larger-scale meaning. When the pause occurs, the most recently heard words remain available for report: There is psycholinguistic evidence (Jarvella, 1971) that we briefly retain a verbatim record of the words we hear until the onset of the following clause.
Material
The recording used was from a set of L2 listening comprehension materials (Underwood, 1975). It consisted of an informal interview with the manager of a cinema, in which he discusses changes in the way cinema is perceived. The text was judged to be culturally neutral and within the world experience of all the participants. Most or all of the words in it were judged to be within their vocabulary range. The sections of the text chosen for transcription combined content and function words and are shown in Table 1. A minimum of 10 seconds of recording separated each section to ensure that higher-level processing took place.
TABLE 1 Sections of Recording Targeted for Transcription
1 which changed each week
2 must be an occasion
3 comfortable seats, good sound
4 we’ve lost in the past
5 been brought up on television
6 a higher standard of entertainment
7 have been shown to adults
8 we’re providing a service
9 most of their early years
10 make any money on it
11 children having a good time
12 the age of forty-eight
13 they’re staying at home
14 middle-aged type of person
Participants
Nonnative listeners (NNL) were drawn from mixed-nationality classes at an English language school in Cambridge, England. The participants were mainly in their late teens or early twenties, and had spent only three weeks in Britain. They were in classes graded as intermediate. Three members were later dropped from the sample: One had recorded zero responses to all items; one had an entry test score substantially below that of the rest of the group; and for one, there was no record of an entry test score. This left 46 participants, who were divided into two groups of 23 on the basis of their scores in the entry test administered by the school. At the time of testing, participants were in the third week of their course; the entry test results were thus sufficiently recent to reflect their knowledge of English. The first group (NNL1) comprised those who had scores ranging from 30 to 60; the second (NNL2) comprised those who had scores ranging from 61 to 80.
The participants spoke a range of L1s. They included Spanish (n = 12), German (8), Portuguese (5), Korean (4), Italian (4), Japanese (3), Arabic (2), Czech (2), and Mandarin Chinese (2). The group also included one speaker each of French, Russian, Albanian, and Hebrew. The issue of possible variation due to native language is explored in due course.
Control groups of native listener (NL) participants were drawn from Year 10 in two state secondary schools in Cambridge, England. They were
• Group NL1: a set of language learners graded as poorly performing (n = 21)
• Group NL2: a top set of successful language learners (n = 23)
In both cases, the language being learnt was French. Accurate responses from three members of Group NL1 were found to be well below the mean; these participants were dropped from the sample, on the assumption that they had had writing difficulties with the transcription. This left a group of 18.
Procedure
Participants were tested in groups in their normal classrooms. They were told that they would hear a cassette recording of a man’s voice. Whenever there was a pause in the recording, they were to write the last four or five words they had heard. There was a check to ensure that the instruction had been understood. The general specification “four or five” was used so as not to introduce additional cognitive demands by encouraging participants to count the words to be transcribed. However, previous experience with the method indicated that, in these circumstances, participants choose to write entire phrases or clauses, which is what happened here.
The recording (with inserted pauses) was played on high quality audio equipment designed for language learning. At each pause, the experimenter called out a number and participants wrote their transcription in the appropriate place on an answer sheet. The length of the pauses was designed to ensure that subjects, writing at average speed, could only record a maximum of about six words. The aim was to prevent them from attempting to recall large sections of the text and, in the process, reducing the accuracy of their responses. This expedient also ensured that subjects did not have time to review their answers once they had written them down.
RESULTS
Participants’ handwritten responses were transferred to computer, and were classified word by word according to whether an accurate transcription had been achieved. The attitude to spelling was not prescriptive: Orthographic variants were recorded as accurate if they segmented a particular item correctly and approximated phonetically to the target item. Lexical words which were accurately identified but wrongly inflected were treated as correct answers.
Untranscribed words were classified in three ways. Where no earlier part of the extract had been transcribed or where the response as a whole consisted of only one word, a blank (–) was recorded. Where an earlier part of the text had been transcribed, an omission (0) was recorded. A problem was posed by zero responses where the respondent had not transcribed the target string at all or had written words which fell outside the section in question. If it could be shown that these responses resulted exclusively from inability to identify words, it would have been legitimate to include them in the figures for errors. However, they might equally well have resulted from larger scale failures of processing relating to the whole target section or to the wider context. Zero answers were therefore discounted, and accurate transcriptions were calculated as a percentage of those responses where a minimum of one word was written. The figures quoted in succeeding sections thus reflect the probability of a response being correct where a response was given.
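One possible reading of this scoring rule is sketched below in Python. It is an interpretation rather than the procedure actually used: the codes `"ok"` and `"err"` for correctly and incorrectly transcribed words, the function name `percent_accurate`, and the choice of denominator (all target words in sections where at least one word was written) are assumptions of the sketch. The symbols `"-"` and `"0"` follow the blank and omission marks described above.

```python
from typing import List, Optional

WRITTEN = {"ok", "err"}  # hypothetical codes: word transcribed correctly / incorrectly
UNWRITTEN = {"-", "0"}   # blank / omission, using the symbols given in the text


def percent_accurate(responses: List[List[str]]) -> Optional[float]:
    """Percentage of correct words, ignoring zero responses.

    Each inner list codes one target section, one code per target word.
    A section in which no word at all was written counts as a zero response
    and is discounted. Returns None if nothing was attempted.
    """
    for section in responses:
        for code in section:
            if code not in WRITTEN | UNWRITTEN:
                raise ValueError(f"unknown code: {code!r}")
    attempted = [s for s in responses if any(code in WRITTEN for code in s)]
    total_words = sum(len(s) for s in attempted)
    if total_words == 0:
        return None
    correct = sum(code == "ok" for s in attempted for code in s)
    return 100.0 * correct / total_words


if __name__ == "__main__":
    sample = [
        ["ok", "ok", "0", "err"],      # omission after an earlier transcribed word
        ["-", "-", "-", "-"],          # zero response: discounted
        ["-", "ok", "ok", "ok", "ok"]  # blank before any word was transcribed
    ]
    print(percent_accurate(sample))
```

Under this reading, blanks and omissions inside an attempted section lower the score, while a section left entirely untranscribed has no effect on it.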
Function words in the target items were identified by reference to the specification in Quirk et al. (1985, pp. 67–72), but the adverb particle “up” was included and pronoun + verb contractions were treated as single items. Though strictly a determiner in its context here, the word “most” was treated as a content word. The compound adjective “middle-aged” was counted as two items, and the numeral “forty-eight” was omitted. This procedure resulted in a set of 30 content and 29 function words across the 14 items. Of the content words, however, three possessed more than two syllables, giving them a distinct advantage in terms of length.5 These longer items (“comfortable,” “television,” “entertainment”) were excluded, leaving 27 content words.
5 Since lexical recognition was the goal, the -ing inflection was discounted: The stem in “providing” is treated as disyllabic and the stems in “having” and “staying” as monosyllabic.