Corpus linguistics


8 The convergence of corpus linguistics, psycholinguistics and functionalist linguistics

As we have seen in Chapter 7, functionalist linguistics in the broad sense (including cognitive linguistics) is increasingly making use of corpus-based methods, and in turn informing the analyses of corpus linguists. In this chapter, we will show that this phenomenon extends as well to experimental psycholinguistics. We will also discuss the implications of the rapprochement of functionalist linguistics and psycholinguistics with corpus linguistics with regard to the neo-Firthian school of thought which we surveyed in Chapter 6; we will argue that in the neo-Firthian school, this rapprochement with functional linguistics has taken a very different form.

As we saw in Chapter 6, one of the bases of the neo-Firthian or so-called ‘corpus-driven’ approach is a rejection of non-corpus-derived theoretical frameworks. To explicitly adopt a functionalist theory as the basis for a corpus-driven study would be distinctly peculiar from the neo-Firthian perspective. Indeed, some of the stronger forms of the neo-Firthian position – such as that espoused by Teubert, for instance – explicitly reject the notion of a convergence of neo-Firthian corpus linguistics and functional or cognitive linguistics, with Teubert (2005: 2) claiming that corpus linguistics ‘offers a perspective on language that sets it apart from received views or the views of cognitive linguistics, both relying heavily on categories gained from introspection rather than from the data itself’. Nevertheless, we wish to argue that such a convergence is in fact taking place, stemming on the neo-Firthian side from work by Sinclair and others from the 1990s onwards. Our basis for making this case is that, when we closely examine the findings of the most extensively developed neo-Firthian theories – in particular, Pattern Grammar and Lexical Priming – we will find that many of these conclusions have also been arrived at by one or more branches of functional linguistics or psycholinguistics. These congruent conclusions stem from wildly different sets of evidence and are, of course, expressed using very different descriptive apparatus. But certain fundamental insights – namely, the inseparability of lexis and grammar, and the nature of grammar as secondary to, and emergent from, lexis – have been arrived at by both functional linguists and neo-Firthian corpus linguists, largely independently of one another.


In this chapter, then, we have two main topics. Firstly, in section 8.2 we will consider the role of corpora in experimental psycholinguistics, as we considered their role in functionalism in Chapter 7. Psycholinguistics as a discipline is methodologically rather different to functionalist theoretical linguistics, but it shows signs of a similar trend with regard to corpus methods – that is, over recent years there has been more and more use of corpus data within psycholinguistic research, and a convergence or rapprochement between the findings of psycholinguistic experiments and of corpus investigations.

Secondly, section 8.3 discusses the convergence of findings, regarding in particular the ontological status of grammar, lexis and language itself, between neo-Firthian corpus linguistics, functional linguistics and psycholinguistics.

8.2 Corpus methods and psycholinguistics

Overlapping cognitive linguistics (which we discussed in the previous chapter), but in many ways distinct from it, is the field of psycholinguistics – and in particular that branch of psycholinguistics whose methodology is mainly experimental. In the latter approach, the primary source of data is various types of laboratory tests on human subjects (or, as we will see later, computer models). While experimental psycholinguistics is not usually considered a branch of functional-cognitive linguistics, its fundamental methodological assumption – that the nature of language in the brain or mind can be investigated in much the same way that experimental psychology in general looks at other aspects of the nature of thought – is in accordance with the general tenet of functionalism that there is no absolute divide between form and function, between language and non-linguistic cognition.

Psycholinguistics is a very broad field, and there is absolutely no room here for a full review of it – nor even to treat comprehensively all research which has linked psycholinguistics with corpus data and methods. We must therefore confine ourselves to an extremely brief and purely indicative survey. To characterise psycholinguistics in very broad terms, we might say that it is focused on two primary issues (which are closely interrelated, as Ellis 2002 illustrates): language learning and language processing. There are other topics of interest of course, such as the evolution of the language faculty. However, we will limit ourselves here to looking at how corpora have been used in some psycholinguistic investigations into first language acquisition, second language acquisition and language processing.

8.2.1 Corpus data in experiments on language processing

Language processing has been investigated experimentally in a number of ways. Two that are reasonably common are self-paced reading experiments and eye-tracking experiments. Both are means of investigating the speed with which particular segments of language are processed. In a self-paced reading experiment, participants work at a computer running a specially designed program. The computer shows one word of a sentence at a time to the participant, who presses a button to get the next word once they have read the word currently on screen. The program records the time for each button-press, so that the relative speed of reading for each word is known. Typically, after each sentence participants have to answer a (very easy) question about the content of the sentence – this prevents participants from just clicking through sentences without actually reading for meaning. The results of such an experiment can be used to infer what elements (morphological, syntactic or semantic) are processed easily, and which are more difficult and thus require more processing time. This in turn can give indications about what is actually happening in the brain.

Although useful, self-paced reading experiments may potentially be misleading in that fluent readers do not typically read one word at a time, in sequence, without ever going back in the text. In fact, it is known that a reader’s journey through a sentence of printed text can be quite complex, with multiple movements back and forth. This type of evidence is gathered in eye-tracking experiments (see Rayner 1998 for a review). Again, participants are given the task of reading sentences presented on a computer screen, but this time an entire sentence is presented at one time, and specialised video equipment records the movements of one of the participant’s eyes as it looks at different positions in the sentence immediately after the sentence appears on screen. The resulting data is much richer, but correspondingly rather more difficult to interpret, than self-paced reading data.
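The self-paced reading procedure described above (one word shown at a time, with the time of each button-press recorded) can be illustrated with a minimal sketch. This is not the software used in any of the studies cited; the function names `run_self_paced_trial` and `slowest_word`, the `show_and_wait` and `clock` parameters, and the scripted timestamps are all assumptions introduced for illustration. The comprehension question that usually follows each sentence is left out.

```python
import time


def run_self_paced_trial(sentence, show_and_wait=None, clock=time.perf_counter):
    """Show a sentence one word at a time and time each button press.

    show_and_wait(word) must display the word and return once the participant
    presses the button. Returns a list of (word, seconds) pairs.
    """
    if show_and_wait is None:
        # Console stand-in for the button: print the word, wait for Enter.
        def show_and_wait(word):
            input(f"{word}  [Enter] ")

    timings = []
    for word in sentence.split():
        shown_at = clock()
        show_and_wait(word)
        timings.append((word, clock() - shown_at))
    return timings


def slowest_word(timings):
    """Return the (word, seconds) pair with the longest reading time."""
    return max(timings, key=lambda pair: pair[1])


# Non-interactive demonstration: a scripted clock stands in for a participant.
# The timestamps are invented for illustration, not experimental data.
scripted_times = iter([0.0, 0.3, 0.3, 0.9, 0.9, 1.1])
demo_timings = run_self_paced_trial(
    "the long campaign",
    show_and_wait=lambda word: None,
    clock=lambda: next(scripted_times),
)
demo_slowest = slowest_word(demo_timings)
```

Passing the display routine and the clock in as parameters keeps the timing logic separate from the presentation hardware, which is what allows the sketch to run without a participant. A real experiment would supply a routine that draws the word on screen and blocks until the button is pressed.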

These kinds of experiments may seem remote from the concerns of corpus linguistics. However, there are at least two ways in which corpus data can play an important role in the design and interpretation of such experiments. Firstly, corpus data can be used as a check on the naturalness of the language task that the experiment sets its participants. For instance, Frisson and Pickering (2001) summarise the results of a series of eye-tracking experiments aimed at investigating the processing of words which are ambiguous between a literal and a metaphorical meaning, when the part of the sentence prior to the ambiguous word does not provide sufficient cues to indicate which meaning is intended. But Deignan (2005: 114–17), in a review of this study, points out that in fact, such cases almost never occur in corpora of real usage: in all the examples she looks at, some aspect of the preceding context – possibly in an earlier sentence – indicates which meaning is intended. So, for instance, the word campaign literally relates to warfare and metaphorically relates to politics. In any given real example of campaign from a corpus, the prior context is overwhelmingly likely to give some indication whether a military campaign or a political campaign is intended; so by the time the reader gets up to campaign, it is already effectively disambiguated.

On this basis, Deignan argues that if an experiment presents participants with a word such as campaign without any indication in the foregoing text as to whether it is literal or metaphorical, as Frisson and Pickering’s experiment did, then that experiment is actually ‘forcing participants to tackle problems that are not faced in normal discourse’ (Deignan 2005: 117). If this is the case, then it may be argued that while such an experiment may indeed tell us something interesting about the processing of ambiguously metaphorical words, it cannot tell us about the normal processing of language in use. We can see, then, that a corpus-derived awareness of how words (and other linguistic items) are actually used can serve as a useful anchoring-point for psycholinguistic experimentation. This is not to say that unnatural language should never be used in an experiment – there are cases where non-idiomatic language may itself be the object of study, for instance Millar’s (2011) study of how errors in collocation, of the type made by non-native speakers of English, can affect processing speed in self-paced reading. What is undesirable is a situation where experimental tasks include highly unnatural language without the experimenter being aware that this is the case.
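A corpus check of the kind described above amounts to inspecting the context that precedes each occurrence of a word. The following is a minimal sketch of a keyword-in-context (KWIC) listing, offered as an illustration rather than as Deignan's actual procedure. The function name `kwic`, its `left` and `right` parameters and the sample sentence are assumptions made for the example; the sentence is invented and is not corpus data.

```python
def kwic(tokens, keyword, left=5, right=3):
    """Return (left_context, hit, right_context) for every match of keyword."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            start = max(0, i - left)  # do not run off the start of the text
            hits.append((
                " ".join(tokens[start:i]),
                token,
                " ".join(tokens[i + 1:i + 1 + right]),
            ))
    return hits


# Invented sample text, used only to show the shape of the output.
sample_tokens = (
    "the general planned a long military campaign while the senator "
    "launched her election campaign in the spring"
).split()

sample_hits = kwic(sample_tokens, "campaign", left=3, right=3)

for left_context, hit, right_context in sample_hits:
    print(f"{left_context:>25} | {hit} | {right_context}")
```

In this toy listing the words to the left of each hit (military, election) already signal which sense is intended, which is the kind of cue that the experimental sentences discussed above withheld.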

Secondly, corpus data can be used as a source of frequency data in the construction of test sentences in self-paced reading or eye-tracking experiments. Often, the test sentences used will not be drawn directly from corpus data, because the analysis of the resulting data may require certain aspects of the sentences to be controlled across different examples. For instance, if we are primarily interested in the time taken to process (say) the verb in a sentence, then we might well wish to control the length and syntactic structure of the preverbal elements (as well as, potentially, that of the rest of the sentence). We are unlikely to find such controlled sentences in a corpus! But even when invented example sentences are used, it is entirely possible for the creation of the sentences to be informed by frequency data of various sorts extracted from a corpus. The study by Millar (2011) which we cited above uses this approach: Millar’s test sentences are all fabricated, but each is built around an observed non-idiomatic collocation extracted from a learner corpus.

A perhaps more straightforward use of frequency data drawn from corpora is exemplified by the eye-tracking experiments of McDonald and Shillcock (2003a, 2003b). They investigate whether the co-occurrence frequency of a pair of words (as established in a large corpus, in this case the BNC) can predict the ease of processing of the second word in that pair. The co-occurrence frequencies are expressed, in this case, as transition probabilities; that is, given that the first word in the pair is X, what is the probability that the second word is Y? In this case, the probability is equal to the number of times the sequence X-then-Y occurs in the BNC, divided by the total number of instances of word X – this is fundamentally very similar to a collocation calculation. McDonald and Shillcock (2003a) look at the processing of verb–object pairs, contrasting pairs where the object is probable, given the verb – e.g. avoid confusion – and pairs where it is less probable – e.g. avoid discovery. The frequencies of these bigrams in the BNC are 50 and 2 respectively, relative to 7,823 instances of the wordform avoid in total.
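The transition probability defined above is simple enough to work through directly. The sketch below applies the definition (occurrences of X-then-Y divided by occurrences of X) to the BNC figures quoted in the text, and then to a small invented token list. The function names `transition_probability` and `transition_table` and the toy sentence are assumptions made for illustration; this is not McDonald and Shillcock's own code.

```python
from collections import Counter


def transition_probability(bigram_count, first_word_count):
    """P(Y | X): occurrences of the sequence X-then-Y over occurrences of X."""
    if first_word_count <= 0:
        raise ValueError("the first word must occur at least once")
    return bigram_count / first_word_count


def transition_table(tokens):
    """Compute P(Y | X) for every adjacent word pair in a token list."""
    word_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    return {
        pair: transition_probability(count, word_counts[pair[0]])
        for pair, count in pair_counts.items()
    }


# Figures reported in the text for the BNC.
AVOID_TOTAL = 7823
p_confusion = transition_probability(50, AVOID_TOTAL)
p_discovery = transition_probability(2, AVOID_TOTAL)

# Toy token list, invented for illustration (not corpus data).
toy_tokens = (
    "we avoid confusion and we avoid confusion but they avoid discovery"
).split()
toy_table = transition_table(toy_tokens)

print(f"P(confusion | avoid) = {p_confusion:.5f}")
print(f"P(discovery | avoid) = {p_discovery:.5f}")
print(f"ratio = {p_confusion / p_discovery:.0f}")
```

On the quoted figures, the object confusion is 25 times more probable after avoid than the object discovery (50 occurrences against 2, over the same 7,823 instances of avoid).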

McDonald and Shillcock’s eye-tracking data showed that participants’ eyes fixed on the object noun for a shorter time when they were reading a high-probability transition than when reading a low-probability transition. This suggests that the
