
Connectionist Explorations of Multiple-Cue Integration in Syntax Acquisition

Morten H. Christiansen, Rick Dale, Florencia Reali


PART II Words, Language, and Music


Among the many feats of learning that children showcase in their development, syntactic abilities appear long before many other skills, such as riding bikes, tying shoes, or playing a musical instrument. This is achieved with little or no direct instruction, making it both impressive and even puzzling, because mastering natural language syntax is one of the most difficult learning tasks that humans face. One reason for this difficulty is a "chicken-and-egg" problem involved in acquiring syntax. Syntactic knowledge can be characterized by constraints governing the relationship between grammatical categories of words (such as noun and verb) in a sentence. At the same time, the syntactic constraints presuppose the grammatical categories in terms of which they are defined; and the validity of grammatical categories depends on how they support those same syntactic constraints. A similar "bootstrapping" problem faces a student learning an academic subject such as physics: understanding momentum or force presupposes some understanding of the physical laws in which they figure; yet these laws presuppose these very concepts. The bootstrapping problem solved by very young children seems much more daunting, both because the constraints governing natural language are so intricate, and because these children do not have the intellectual capacity or explicit instruction present in conventional academic settings. Determining how children accomplish the astonishing feat of language acquisition remains a key question in cognitive science.


By 12 months, infants are attuned to the phonological and prosodic regularities of their native language (Jusczyk, 1997; Kuhl, 1999). This perceptual attunement may provide an essential scaffolding for later learning by biasing children toward aspects of language input that are particularly informative for acquiring grammatical knowledge. In this chapter, we hypothesize that integrating multiple probabilistic cues (phonological, prosodic, and distributional) by perceptually attuned general-purpose learning mechanisms may hold promise for explaining how children solve the bootstrapping problem. Multiple cues can provide reliable evidence about linguistic structure that is unavailable from any single source of information.

In the remainder of this chapter, we first review empirical evidence suggesting that infants may use a combination of phonological, prosodic, and distributional cues to bootstrap into syntax. We then report a series of simulations demonstrating the computational efficacy of multiple-cue integration within a connectionist framework (for modeling of other aspects of cognitive development, see the chapter by Mareschal & Westermann, this volume). Simulation 1 shows how multiple-cue integration results in better, faster, and more uniform learning. Simulation 2 uses this initial model to mimic the effect of grammatical and prosodic manipulations in a sentence comprehension study with 2-year-olds (Shady & Gerken, 1999). Simulation 3 uses an idealized representation of prenatal exposure to gross-level phonological and prosodic cues, leading to facilitation of postnatal learning of syntax by the model. Simulation 4 demonstrates that adding additional distracting cues, irrelevant to the syntactic acquisition task, does not hinder learning. Finally, Simulation 5 scales up these initial simulations, showing that connectionist models can acquire aspects of syntactic structure from cues present in actual child-directed speech.


THE NEED FOR MULTIPLE LANGUAGE-INTERNAL CUES

In this section, we identify three kinds of constraints that may help the language learner solve the syntactic bootstrapping problem. First, innate constraints in the form of linguistic universals may be available to help discover to which grammatical category a word belongs, and how such categories function in syntactic rules. Second, language-external information, concerning observed semantic relationships between language and the world, could help map individual words onto their grammatical function. Finally, language-internal information, such as aspects of phonological, prosodic, and distributional patterns, may indicate the relation of various parts of language to each other, thus bootstrapping the child into the realm of syntactic relations. We discuss each of these potential constraints below, and conclude that some form of language-internal information is needed to break the circularity.

Although innate constraints likely play a role in language acquisition, they cannot solve the bootstrapping problem. Even with genetically prescribed abstract knowledge of grammatical categories and syntactic rules (e.g., Pinker, 1984), the problem remains: innate knowledge requires building in universal mappings across languages, but the relationships between words and grammatical categories clearly differ cross-linguistically (e.g., the sound /su/ is a noun in French, sou, but a verb in English, sue). Even with rich innate knowledge, children still must assign sound sequences to appropriate grammatical categories while determining the syntactic relations between these categories in their native language. Recently, a wealth of compelling experimental evidence has accumulated suggesting that children do not initially use abstract linguistic categories; instead, they seem to employ words at first as concrete individuals (rather than instances of abstract kinds), thereby challenging the usefulness of hypothesized innate grammatical categories (Tomasello, 2000). Whether we grant the presence of extensive innate knowledge or not, it seems clear that other sources of information are necessary to solve the bootstrapping problem.

Language-external information, such as correlations between the environment and semantic categories, may contribute to language acquisition by supplying a "semantic bootstrapping" solution (Pinker, 1984). However, because children learn linguistic distinctions that have no semantic basis (e.g., gender in French: Karmiloff-Smith, 1979), semantics cannot be the only source of information involved in solving the bootstrapping problem. Other sources of language-external constraints include cultural learning, indicated by a child's imitation of linguistic forms in socially conventional contexts (Tomasello, Kruger & Ratner, 1993). For example, a child may perceive that the idiom "John let the cat out of the bag," used in the appropriate context, means that John has revealed some sort of secret, and not that he released a feline from captivity. Despite the importance of both of these language-external sources, it appears that they must be coupled with language-internal information in order to break linguistic forms down into relevant units.

We do not challenge the important role that the two foregoing sources of information play in language acquisition. We would argue, however, that language-internal information is fundamental to bootstrapping the child into syntax. Because language-internal input is rich in potential cues to linguistic structure, we emphasize a requisite feature of this information for syntax acquisition: individual cues may be only partially reliable, so a learner must integrate an array of these cues to solve the bootstrapping problem. For example, a learner could use the tendency for English nouns to be longer than verbs to conjecture that bonobo is a noun, but the same strategy would fail for ingratiate. Likewise, although speakers tend to pause at syntactic phrase boundaries in a sentence, pauses also occur elsewhere during normal language production. And although it is a good distributional bet that the definite article the will precede a noun, it may also precede an adjective, such as silly. The child therefore needs to integrate a great diversity of probabilistic cues to language structure. Fortunately, as we review in the next section, there is now extensive evidence that multiple probabilistic cues are available in language-internal input, that children are sensitive to them, and that they facilitate learning through integration.

Bootstrapping through Multiple Language-Internal Cues

We explore three sources of language-internal cues: phonological, prosodic, and distributional. Phonological information includes stress, vowel quality, and duration, and may help distinguish grammatical function words (e.g., determiners, prepositions, and conjunctions) from content words (nouns, verbs, adjectives, and adverbs) in English (e.g., Cutler, 1993; Gleitman & Wanner, 1982; Monaghan, Chater & Christiansen, 2005; Monaghan, Christiansen & Chater, 2007; Morgan, Shi, & Allopenna, 1996; Shi, Morgan, & Allopenna, 1998). Phonological information may also help separate nouns and verbs (Monaghan, Chater, & Christiansen, 2005; Monaghan, Christiansen, & Chater, 2007; Onnis & Christiansen, 2008). For example, English disyllabic nouns tend to receive initial-syllable (trochaic) stress whereas disyllabic verbs tend to receive final-syllable (iambic) stress, and adults are sensitive to this distinction (Kelly, 1988). Acoustic analyses have also shown that disyllabic words that are noun–verb ambiguous and have the same stress placement can still be differentiated by differences in syllable duration and amplitude (Sereno & Jongman, 1995). Even 3-year-old children are sensitive to this stress cue, despite the fact that few multisyllabic verbs occur in child-directed speech (Cassidy & Kelly, 1991, 2001). Additional noun/verb cues in English likely include differences in word duration, consonant voicing, and vowel types, and many of these cues may be cross-linguistically relevant (see Kelly, 1992; Monaghan & Christiansen, 2008, for reviews).


Prosodic cues help word and phrasal/clausal segmentation and may reveal syntactic structure (e.g., Gerken, Jusczyk & Mandel, 1994; Gleitman & Wanner, 1982; Kemler-Nelson, Hirsh-Pasek, Jusczyk, & Wright Cassidy, 1989; Morgan, 1996). Acoustic analyses find that pause length, vowel duration, and pitch all mark phrasal boundaries in English and Japanese child-directed speech (Fisher & Tokura, 1996). Perhaps beginning in utero (Mehler et al., 1988) and continuing thereafter, infants seem highly sensitive to such language-specific prosodic patterns (Gerken et al., 1994; Kemler-Nelson et al., 1989; for reviews, see Gerken, 1996; Jusczyk & Kemler-Nelson, 1996; Morgan, 1996). Prosodic information also improves sentence comprehension in 2-year-olds (Shady & Gerken, 1999). In experiments using adult participants, artificial language learning is facilitated in the presence of prosodic marking of syntactic phrase boundaries (Morgan, Meier & Newport, 1987; Valian & Levitt, 1996). Neurophysiological evidence in the form of event-related brainwave potentials (ERP) in adults shows that prosodic information has an immediate effect on syntactic processing (Steinhauer, Alter, & Friederici, 1999), suggesting a rapid, on-line role for this important cue. While prosody is influenced to some extent by a number of nonsyntactic factors, such as breathing patterns, resulting in an imperfect mapping between prosody and syntax (Fernald & McRoberts, 1996), infants' sensitivity to prosody argues for its likely contribution to syntax acquisition (Fisher & Tokura, 1996; Gerken, 1996; Morgan, 1996).

Distributional characteristics of linguistic fragments at or below the word level may also provide cues to grammatical category. Morphological patterns across words may be informative: for example, English words that are observed to have both -ed and -s endings are likely to be verbs (Maratsos & Chalkley, 1980). In artificial language learning experiments, adults acquire grammatical categories more effectively when they are cued by such word-internal patterns (Brooks, Braine, Catalano & Brody, 1993; Frigo & McDonald, 1998). Corpus analyses reveal that word co-occurrence also gives useful cues to grammatical categories in child-directed speech (e.g., Mintz, 2003; Monaghan et al., 2005, 2007; Redington, Chater, & Finch, 1998). Given that function words primarily occur at phrase boundaries (e.g., initially in English and French and finally in Japanese), they can also help the learner by signaling syntactic structure. This idea has received support from corpus analyses (Mintz, Newport & Bever, 2002) and artificial language learning studies (Green, 1979; Morgan et al., 1987; Valian & Coulson, 1988). Finally, artificial language learning experiments indicate that duplication of morphological patterns across related items in a phrase (e.g., Spanish: Los Estados Unidos) facilitates learning (Meier & Bower, 1986; Morgan et al., 1987).

It is important to note that there is ample evidence that children are sensitive to these multiple sources of information. After just 1 year of language exposure, the perceptual attunement of children likely allows them to make use of language-internal probabilistic cues (for reviews, see Jusczyk, 1997, 1999; Kuhl, 1999; Pallier, Christophe & Mehler, 1997; Werker & Tees, 1999). Through early learning experiences, infants already appear sensitive to the acoustic differences between function and content words (Shi, Werker & Morgan, 1999) and the relationship between function words and prosody in speech (Shafer, Shucard, Shucard & Gerken, 1998). Young infants are able to detect differences in syllable number among isolated words (Bijeljac, Bertoncini & Mehler, 1993). In addition, infants exhibit rapid distributional learning (e.g., Gómez & Gerken, 1999; Saffran, Aslin, & Newport, 1996; see Gómez & Gerken, 2000; Saffran, 2003, for reviews), and, importantly, they are capable of multiple-cue integration (Mattys, Jusczyk, Luce, & Morgan, 1999; Morgan & Saffran, 1995). When facing the bootstrapping problem, children probably also benefit from characteristics of child-directed speech, such as the predominance of short sentences (Newport, Gleitman & Gleitman, 1977) and exaggerated prosody (Kuhl et al., 1997).

In summary, phonological information helps to distinguish function words from content words and nouns from verbs. Prosodic information helps word and phrasal/clausal segmentation, thus serving to uncover syntactic structure. Distributional characteristics aid in labeling and segmentation, and may provide further cueing of syntactic relations. Despite the value of each source, none of these cues in isolation suffices to solve the bootstrapping problem. The learner must integrate these multiple cues to overcome the limited reliability of each individually. This review has indicated that a range of language-internal cues is available for language acquisition, that these cues affect learning and processing, and that mechanisms exist for multiple-cue integration. What is not yet known is how far these cues can be combined to solve the bootstrapping problem (Fernald & McRoberts, 1996). Here we present connectionist simulations to demonstrate that efficient and robust computational mechanisms exist for multiple-cue integration (see also the chapters in this volume by Hannon, Kirkham, and Saffran, for evidence from human infant learning).

SIMULATION 1: MULTIPLE-CUE INTEGRATION

Although the multiple-cue approach is gaining support in developmental psycholinguistics, its computational efficacy still remains to be established. The simulations reported in this chapter are therefore intended as a first step toward a computational approach to multiple-cue integration, seeking to test its potential value in syntax acquisition. Based on our previous experience with modeling multiple-cue integration in speech segmentation (Christiansen, Allen, & Seidenberg, 1998), we used a simple recurrent network (SRN; Elman, 1990) to model the integration of multiple cues. The SRN is a feed-forward neural network equipped with an additional copy-back loop that permits the learning and processing of temporal regularities in the stimuli presented to it (see Figure 5.1). This makes it particularly suitable for exploring the acquisition of syntax, an inherently temporal phenomenon.
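To make the architecture concrete, below is a minimal sketch of an SRN in Python/NumPy. This is not the authors' implementation: the logistic activation, the simple gradient update applied through the copied-back context (rather than backpropagation through time), and the example sizes are illustrative assumptions; only the weight-initialization interval, learning rate, and layer sizes follow the figures reported in the Method section below.

```python
import numpy as np

class SimpleRecurrentNetwork:
    """Minimal Elman-style SRN: the hidden layer receives the current input
    plus a copy of its own previous state (the context units)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Connection weights initialized in a small interval, as in the simulations
        self.W_ih = rng.uniform(-0.1, 0.1, (n_hidden, n_in))
        self.W_ch = rng.uniform(-0.1, 0.1, (n_hidden, n_hidden))
        self.W_ho = rng.uniform(-0.1, 0.1, (n_out, n_hidden))
        self.context = np.zeros(n_hidden)          # the copy-back loop
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def step(self, x, target=None):
        """Process one word (plus its cue units); if a target (the next word
        and its cues) is given, adjust the weights by simple gradient descent."""
        h = self._sigmoid(self.W_ih @ x + self.W_ch @ self.context)
        y = self._sigmoid(self.W_ho @ h)
        if target is not None:
            err_out = (target - y) * y * (1.0 - y)
            err_hid = (self.W_ho.T @ err_out) * h * (1.0 - h)
            self.W_ho += self.lr * np.outer(err_out, h)
            self.W_ih += self.lr * np.outer(err_hid, x)
            self.W_ch += self.lr * np.outer(err_hid, self.context)
        self.context = h                           # copy hidden state back for the next step
        return y

    def reset(self):
        """Clear the context between utterances or corpora."""
        self.context = np.zeros_like(self.context)

# Example: a network sized for the all-cues condition (50 input/output units, 80 hidden)
net = SimpleRecurrentNetwork(n_in=50, n_hidden=80, n_out=50)
x_t, x_next = np.zeros(50), np.zeros(50)
x_t[3] = 1.0                                       # current word (localist unit)
x_next[7] = 1.0                                    # next word to be predicted
prediction = net.step(x_t, target=x_next)
```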

INSERT FIGURE 5.1 ABOUT HERE

The networks were trained on corpora of artificial child-directed speech generated by a grammar that includes three probabilistic cues to grammatical structure: word length, lexical stress, and pitch. The grammar (described further below) was motivated by considering frequent constructions in child-directed speech in the CHILDES database (MacWhinney, 2000). Simulation 1 demonstrates how the integration of these three cues benefits the acquisition of syntactic structure by comparing performance across the eight possible cue combinations, ranging from the absence of cues to the presence of all three.

Method

Networks

Ten networks were trained per condition, with network connections initially randomized in the interval [-0.1, 0.1]. The learning rate was set to 0.1, and momentum to 0. Each input to the networks contained a localist representation of a word (one unit = one word) and a set of cue units depending on the cue condition. Words were presented one by one, and the networks were required to predict the next word in a sentence along with the corresponding cues for that word. With a total of 44 words (see below) and a pause marking boundaries between utterances, the networks had 45 input units. Networks in the condition with all available cues had an additional five input units. The number of input and output units thus varied between 45 and 50 across conditions. Each network had 80 hidden units and 80 context units.


Materials

We constructed an idealized but relatively complex grammar based on independent analyses of child-directed speech corpora (Bernstein-Ratner, 1984; Korman, 1984) and a study of child-directed speech by mother-daughter pairs (Fisher & Tokura, 1996). As illustrated in Table 5.1, the grammar included three primary sentence types: declarative, imperative, and interrogative sentences. Each type consisted of a variety of common utterances reflecting the child's exposure. For example, declarative sentences most frequently appeared as transitive or intransitive verb constructions (the boy chases the cat, the boy swims), but also included predication using be (the horse is pretty) and second-person pronominal constructions commonly found in child-directed corpora (you are a boy). Interrogative sentences were composed of wh-questions (where are the boys?, where do the boys swim?) and questions formed using auxiliary verbs (do the boys walk?, are the cats pretty?). Imperatives were the simplest class of sentences, appearing as intransitive or transitive verb phrases (kiss the bunny, sleep). Subject-verb agreement was upheld in the grammar, along with appropriate determiners accompanying nouns (the cars vs. *a cars).
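As a concrete illustration of this kind of generator (not the actual grammar in Table 5.1), the sketch below samples sentences from a toy probabilistic grammar with declarative, interrogative, and imperative templates; the vocabulary, templates, and sentence-type probabilities are invented placeholders.

```python
import random

# Toy vocabulary; words and probabilities are illustrative only.
NOUNS_SG = ["boy", "cat", "horse", "bunny"]
NOUNS_PL = ["boys", "cats"]
TRANS_SG, TRANS_BASE = ["chases", "kisses"], ["chase", "kiss"]
INTRANS_SG, INTRANS_BASE = ["swims", "sleeps"], ["swim", "sleep"]
ADJ = ["pretty", "silly"]

def np_sg():
    return f"the {random.choice(NOUNS_SG)}"

def declarative():
    r = random.random()
    if r < 0.4:                                   # transitive: "the boy chases the cat"
        return f"{np_sg()} {random.choice(TRANS_SG)} {np_sg()}"
    if r < 0.7:                                   # intransitive: "the boy swims"
        return f"{np_sg()} {random.choice(INTRANS_SG)}"
    return f"{np_sg()} is {random.choice(ADJ)}"   # predication: "the horse is pretty"

def imperative():
    return f"{random.choice(TRANS_BASE)} {np_sg()}"     # "kiss the bunny"

def interrogative():
    if random.random() < 0.5:
        return f"where do the {random.choice(NOUNS_PL)} {random.choice(INTRANS_BASE)}?"
    return f"are the {random.choice(NOUNS_PL)} {random.choice(ADJ)}?"

def sample_sentence():
    kind = random.choices([declarative, interrogative, imperative],
                          weights=[0.5, 0.3, 0.2])[0]   # arbitrary mixture
    return kind()

if __name__ == "__main__":
    random.seed(1)
    for _ in range(5):
        print(sample_sentence())
```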

Each word was assigned a unit for input into the model, and we added a number of units to represent cues. Two basic cues were available to all networks. The fundamental distributional information inherent in the grammar could be exploited by all networks in this simulation. As a second basic cue, utterance-boundary pauses signaled grammatically distinct utterances with 92% reliability (Broen, 1972); this was encoded as a single unit that was activated at the end of all but 8% of the sentences. Other semireliable prosodic and phonological cues accompanied the phrase-structure grammar: word length, stress, and pitch. Network groups were constructed using different combinations of these three cues. Cassidy and Kelly (1991) demonstrated that syllable count is a cue available to English speakers for distinguishing nouns from verbs. They found that the probability that a monosyllabic word is a noun rather than a verb is 38%; this probability rises to 76% for two-syllable words and 92% for three-syllable words. We selected verb and noun tokens that exhibited this distinction, whereas the length of the remaining words was typical for their class (i.e., function words tended to be monosyllabic). Word length was represented in terms of three units using thermometer encoding: one unit would be on for monosyllabic words, two for bisyllabic words, and three for trisyllabic words. Pitch change is a cue associated with syllables that precede pauses. Fisher and Tokura (1996) found that these pauses signaled grammatically distinct utterances with 96% accuracy in child-directed speech, allowing pitch to serve as a cue to grammatical structure. In the networks, this cue was a single unit that would be activated at the final word in an utterance. Finally, we used a single unit to encode lexical stress as a possible cue distinguishing stressed content words from the reduced, unstressed forms of function words. This unit would be on for all content words.
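To illustrate the encoding, the sketch below assembles the input vector for one token: a localist word (or pause) unit followed by the five cue units described above (three thermometer word-length units, one pitch unit, one stress unit). The toy vocabulary, syllable counts, and index layout are placeholders rather than the model's actual lexicon.

```python
import numpy as np

VOCAB = ["the", "boy", "chases", "cat", "pretty", "bunny"]            # placeholder vocabulary
SYLLABLES = {"the": 1, "boy": 1, "chases": 2, "cat": 1, "pretty": 2, "bunny": 2}
CONTENT_WORDS = {"boy", "chases", "cat", "pretty", "bunny"}           # stressed, non-reduced forms
PAUSE = len(VOCAB)                                                    # extra unit marking utterance boundaries

def encode(token, utterance_final=False):
    """Localist word/pause unit followed by five cue units:
    three thermometer word-length units, one pitch unit, one stress unit."""
    vec = np.zeros(len(VOCAB) + 1 + 5)
    if token == "#":                                  # utterance-boundary pause
        vec[PAUSE] = 1.0
        return vec
    vec[VOCAB.index(token)] = 1.0
    n_syll = min(SYLLABLES[token], 3)
    vec[PAUSE + 1: PAUSE + 1 + n_syll] = 1.0          # thermometer encoding of syllable count
    if utterance_final:
        vec[PAUSE + 4] = 1.0                          # pitch change on the pre-pause word
    if token in CONTENT_WORDS:
        vec[PAUSE + 5] = 1.0                          # lexical stress on content words
    return vec

# "the boy chases the cat" followed by an utterance boundary
sentence = ["the", "boy", "chases", "the", "cat"]
inputs = [encode(w, utterance_final=(i == len(sentence) - 1)) for i, w in enumerate(sentence)]
inputs.append(encode("#"))
```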

INSERT TABLE 5.1 ABOUT HERE

Procedure

Eight groups of networks, one for each combination of cues (all cues, two cues, one cue, or none), were trained on corpora consisting of 10,000 sentences generated from the grammar. Each network within a group was trained on a different randomized training corpus. Training consisted of 200,000 input/output presentations (words), or approximately 5 passes through the training corpus. Each group of networks had cues added to its training corpus depending on the cue condition. Networks were expected to predict the next word in a sentence, along with the appropriate cue values. A corpus consisting of 1,000 novel sentences was generated for testing. Performance was measured by assessing the networks' ability to predict the next set of grammatical items given prior context. Importantly, this measure did not include predictions of cue information, and all network conditions were thus evaluated by exactly the same performance measure.

Results

After training, SRNs trained with localist output representations will produce a distributional pattern of activation closely corresponding to a probability distribution over possible next items. In order to assess the overall performance of the SRNs, we made comparisons between network output probabilities and the full conditional probabilities given the prior context. For example, the full conditional probabilities given the context "The boy chases ..." can be represented as a vector containing, for each of the 44 words in the vocabulary and the pause, the probability of its being the next item in this sentence. To ensure that our performance measure can deal with novel test sentences not seen during training, we estimate the prior conditional probabilities based on lexical categories rather than individual words (Christiansen & Chater, 1999). Suppose, in the example above, that every continuation of this sentence fragment in the training corpus always involved the indefinite determiner "a" (as in "The boy chases a cat"). If we did not base our full conditional probability estimates on lexical categories, we would not be able to assess SRN performance on novel sentences in which the definite determiner "the" followed the example fragment (as in "The boy chases the cat"). Formally, we thus have the following equation, with c_i denoting the category of the ith word in the sentence:

P(c_p | c_1, c_2, ..., c_{p-1})    (5.1)

where the probability of getting some member of a given lexical category as the pth item, c_p, in a sentence is conditional on the previous p-1 lexical categories. Note that for the purpose of performance assessment, singular and plural nouns are assigned to separate lexical categories throughout Simulations 1-4, as are singular and plural verbs. Given that the choice of lexical items for each category is independent, and that each word in a category is equally frequent, the probability of encountering a particular word w_n that is a member of a category c_p is simply inversely proportional to the number of items, C_p, in that category. So, overall, we have the following equation:

P(w_n | c_1, c_2, ..., c_{p-1}) = P(c_p | c_1, c_2, ..., c_{p-1}) / C_p    (5.2)

If the networks are performing optimally, then the vector of output unit activations should exactly match these probabilities. We evaluate the degree to which each network performs successfully by measuring the mean squared error between the vectors representing the network's output and the conditional probabilities (with 0 indicating optimal performance).
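A minimal sketch of this evaluation under the stated simplifying assumptions (independent choice of words within a category, equal word frequency within a category): estimate the category-conditional probabilities from the training corpus, spread each category's probability evenly over its member words as in equation (5.2), and score a network's output vector by mean squared error. The toy lexicon, corpus, and the uniform stand-in for network output are invented for illustration.

```python
import numpy as np
from collections import Counter

# Toy lexicon: word -> category (singular/plural kept distinct, as in Simulations 1-4)
LEXICON = {"the": "det", "a": "det", "boy": "N_sg", "cat": "N_sg",
           "boys": "N_pl", "chases": "V_sg", "chase": "V_pl", "#": "pause"}
WORDS = list(LEXICON)

def category_conditional(context_cats, training_cat_sequences):
    """P(c_p | c_1..c_{p-1}) estimated from the category sequences of the training corpus."""
    p = len(context_cats)
    continuations = Counter(seq[p] for seq in training_cat_sequences
                            if len(seq) > p and seq[:p] == context_cats)
    total = sum(continuations.values())
    return {c: n / total for c, n in continuations.items()} if total else {}

def target_vector(context_cats, training_cat_sequences):
    """Spread each category's probability evenly over its member words (equation 5.2)."""
    cat_probs = category_conditional(context_cats, training_cat_sequences)
    sizes = Counter(LEXICON.values())
    return np.array([cat_probs.get(LEXICON[w], 0.0) / sizes[LEXICON[w]] for w in WORDS])

# Illustrative training corpus, already converted to category sequences
train = [["det", "N_sg", "V_sg", "det", "N_sg", "pause"],
         ["det", "N_sg", "V_sg", "pause"],
         ["det", "N_pl", "V_pl", "pause"]]

target = target_vector(["det", "N_sg", "V_sg"], train)
network_output = np.full(len(WORDS), 1.0 / len(WORDS))     # stand-in for SRN output activations
mse = np.mean((network_output - target) ** 2)              # 0 would indicate optimal performance
print(target, mse)
```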

All networks achieved better performance than the standard bigram/trigram models (p-values < .0001), suggesting that the networks had acquired knowledge of syntactic structure beyond the information associated with simple pairs or triples of words. Figure 5.2A illustrates the best performance achieved by the trigram model as well as by SRNs provided with no cues (the baseline network), a single cue (length, stress, or prosody), and three cues. The nets provided with one or more phonological/prosodic cues achieved significantly better performance than the baseline networks (p-values < .02). Using trigram performance as a criterion, all multiple-cue networks surpassed this level of performance faster than the baseline networks, as shown in Figure 5.2B (p-values < .002). Moreover, the three-cue networks were significantly faster than the single-cue networks (p-values < .001). Finally, using Brown-Forsythe tests for variability in the final level of performance, we found that the three-cue networks also exhibited significantly more uniform learning than the baseline networks (F(1,18) = 5.14, p < .04), as depicted in Figure 5.2C.

INSERT FIGURE 5.2 ABOUT HERE

SIMULATION 2: SENTENCE COMPREHENSION IN 2-YEAR-OLDS

Simulation 1 provides evidence for the general feasibility of multiple-cue integration for supporting syntax learning. To further demonstrate the relevance of the model to language development, closer contact with human data is needed (Christiansen & Chater, 2001). In the current simulation, we demonstrate that the three-cue networks from Simulation 1 are able to accommodate experimental data showing that 2-year-olds can integrate grammatical markers (function words) and prosodic cues in sentence comprehension (Shady & Gerken, 1999: Experiment 1). In this study, children heard sentences such as (1) below in one of three prosodic conditions depending on pause location: early natural [e], late natural [l], and unnatural [u]. Each sentence moreover involved one of three grammatical markers: grammatical (the), ungrammatical (was), and nonsense (gub).

(1) Find [e] the/was/gub [u] dog [l] for me.

The child's task was to identify the correct picture corresponding to the target noun (dog). Children performed the task best when the pause location delimited a phrasal boundary (early/late) and with the grammatical marker the. Simulation 2 models these data by using comparable stimuli and assessing noun unit activations.

Method

Networks

Twelve three-cue networks of the same architecture and training used in Simulation 1 were used in each prosodic condition in the infant experiment. This number was chosen to match the number of infants in the Shady and Gerken (1999) experiment. An additional unit was added to the networks to encode the nonsense word (gub) from Shady and Gerken's experiment.

Materials

We constructed a sample set of sentences from our grammar that could be modified to match the stimuli in Shady and Gerken. Twelve sentences were constructed for each prosody condition (pause location). Pauses were simulated by activating the utterance-boundary unit. Because these pauses probabilistically signal grammatically distinct utterances, the utterance-boundary unit provides an approximation of what the children in the experiment would experience. Finally, the nonsense word was added to the stimuli for the within-group condition (grammatical vs. ungrammatical vs. nonsense). Adjusting for vocabulary differences, the networks were tested on comparable sentences, such as (2):

(2) Where does [e] the/is/gub [u] dog [l] eat?

Procedure

Each group of networks was exposed to the set of sentences corresponding to its assigned pause location (early vs. late vs. unnatural). No learning took place, since the fully trained networks were used. To approximate the picture selection task in the experiment, we measured the degree to which the networks would activate the groups of nouns following the/is/gub. The two conditions were expected to affect the activation of the nouns.
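One way such a measurement could be implemented is sketched below: present the encoded words of a probe sentence one at a time, stop at the marker word (the/is/gub), and sum the activation of the output units that stand for nouns. The interface (reset() and a step() method returning output activations) mirrors the SRN sketch given earlier, and a dummy network plus arbitrary unit indices stand in so the snippet runs on its own; none of this is the authors' code.

```python
import numpy as np

class DummyNet:
    """Stand-in with the same interface as the SRN sketched earlier."""
    def __init__(self, n_out, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_out = n_out
    def reset(self):
        pass
    def step(self, x):
        y = self.rng.random(self.n_out)
        return y / y.sum()

def noun_activation(net, encoded_sentence, marker_position, noun_unit_indices):
    """Present the sentence word by word; after the marker word (the/is/gub),
    return the summed activation of the output units standing for nouns."""
    net.reset()                                   # clear the copy-back context
    output = None
    for t, x in enumerate(encoded_sentence):
        output = net.step(x)                      # prediction for the next word
        if t == marker_position:
            break
    return float(np.sum(output[noun_unit_indices]))

# Example: a 5-word probe, marker at position 2, noun units at indices [10, 11, 12]
net = DummyNet(n_out=45)
probe = [np.zeros(50) for _ in range(5)]
print(noun_activation(net, probe, marker_position=2, noun_unit_indices=[10, 11, 12]))
```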

Results

The human results for the prosody condition in Shady and Gerken (1999) are depicted in Figure 5.3A. They reported a significant effect of prosody on the picture selection task. The same was true for our networks (F(2,33) = 1,253.07, p < .0001), and the pattern of noun activations closely resembles that of the toddlers' correct picture choices, as evidenced by Figure 5.3B. The late natural condition elicited the highest noun activation, followed by the early natural condition, with the unnatural condition yielding the least activation. The experiment also revealed an effect of grammaticality, as can be seen from the human data shown in Figure 5.3C. We similarly obtained a significant grammaticality effect for our networks (F(2,70) = 69.85, p < .0001), which, as illustrated by Figure 5.3D, produced the highest noun activation following the determiner, followed by the nonsense word, and lastly the ungrammatical word. Again, the network results match the pattern observed for the toddlers. One slight discrepancy is that the networks produce higher noun activation following the nonsense word than following the ungrammatical marker. This result is, however, consistent with results from a more sensitive picture selection task showing that children were more likely to end up with a semantic representation of the target following nonsense syllables than following incorrectly used morphemes (Carter & Gerken, 1996). Thus, the results suggest that the syntactic knowledge acquired by the networks mirrors the kind of sensitivity to syntactic relations and prosodic content observed in human children. Together with Simulation 1, the results also demonstrate that multiple-cue integration may both facilitate syntax acquisition and underlie some patterns of linguistic skill observed early on in human performance. In the next simulation, we show that the multiple-cue perspective can simulate a possible prosodic scaffolding that occurs much earlier in development: prenatal attunement to prosody.

INSERT FIGURE 5.3 ABOUT HERE

SIMULATION 3: THE ROLE OF PRENATAL EXPOSURE

Studies of 4-day-old infants suggest that the attunement to prosodic information may begin prior to birth (Mehler et al., 1988). We suggest that this prenatal exposure to language may provide a scaffolding for later syntactic acquisition by initially focusing learning on certain aspects of prosody and gross-level properties of phonology (such as word length) that will later play an important role in postnatal multiple-cue integration. In the current simulation, we test this hypothesis using the connectionist model from Simulations 1 and 2. If this scaffolding hypothesis is correct, we would expect that prenatal exposure corresponding to what infants receive in the womb would result in improved acquisition of syntactic structure.


Procedure

The networks in the prenatal group were first trained on 100,000 filtered input/output presentations drawn from a corpus of 10,000 new sentences. Following this prenatal exposure, the nets were then trained on the full input patterns exactly as in Simulation 1. The nonprenatal group received training only on the postnatal corpora. As previously, networks were required to predict the following word and corresponding cues. Performance was again measured by the prediction of following words, ignoring the cue units.
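The text does not spell out the filtering here, but since the prenatal input is described later in the Results as void of any information about individual words, one reading is that the word-identity units were zeroed while the utterance-boundary and gross-level phonological/prosodic cue units were kept. The sketch below implements that reading on the encoding layout assumed earlier; it is an interpretation, not the authors' preprocessing.

```python
import numpy as np

N_WORD_UNITS = 44          # localist word-identity units
# index 44 is the utterance-boundary pause; 45-47 word length, 48 pitch, 49 stress

def prenatal_filter(input_vector):
    """Remove lexical identity from a 50-unit input vector, keeping only the
    pause and the gross-level phonological/prosodic cue units (an assumed
    reading of the 'filtered' prenatal presentations)."""
    filtered = np.array(input_vector, dtype=float)
    filtered[:N_WORD_UNITS] = 0.0
    return filtered

# Example: a bisyllabic, stressed, non-final content word
x = np.zeros(50)
x[7] = 1.0            # some word's localist unit
x[45:47] = 1.0        # thermometer length = two syllables
x[49] = 1.0           # stress unit on for content words
print(prenatal_filter(x))
```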

Results

Both network groups exhibited significantly higher performance than the bigram/trigram models (F(1,18) = 25.32, p < .0001 for prenatal; F(1,18) = 12.03, p < .01 for nonprenatal), again indicating that the networks are acquiring complex grammatical regularities that go beyond simple adjacency relations. We compared the performance of the two network groups across different degrees of training using a two-way analysis of variance with training condition (prenatal vs. nonprenatal) as the between-network factor and amount of training as the within-network factor (ten levels of training measured in 20,000 input/output presentation intervals). There was a main effect of training condition (F(1,18) = 12.36, p < .01), suggesting that prenatal exposure significantly improved learning. A main effect of degree of training (F(9,162) = 15.96, p < .001) reveals that both network groups benefited significantly from training. An interaction between training condition and degree of training indicates that the prenatal networks learned significantly better than the postnatal networks (F(1,18) = 9.90, p < .01). Finally, as illustrated by Figure 5.4, prenatal input also resulted in faster learning, measured in terms of the amount of training needed to surpass the trigram model (F(1,18) = 9.90, p < .01). The exposure to prenatal input, void of any information about individual words, promotes better performance on the prediction task as well as faster learning overall. This provides computational support for the prenatal scaffolding hypothesis, derived as a prediction from the multiple-cue perspective on syntax acquisition.

INSERT FIGURE 5.4 ABOUT HERE

SIMULATION 4: MULTIPLE-CUE INTEGRATION WITH USEFUL AND DISTRACTING CUES

So far, the simulations have demonstrated the importance of cue integration in syntax acquisition, that integration can match data obtained in infant experiments, and that this perspective can provide novel predictions in language development. A possible objection to these simulations is that our networks succeed at multiple-cue integration because they are only provided with cues that are at least partially relevant for syntax acquisition. Consequently, performance might drop significantly if the networks themselves had to discover which cues were partially relevant and which were not. Simulation 4 therefore tests the robustness of our multiple-cue approach when faced with additional, uncorrelated distractor cues. Accordingly, we added three distractor cues to the previous three reliable cues. These new cues encoded the presence of word-initial vowels, word-final voicing, and relative (male/female) speaker pitch, all of which are acoustically salient in speech but do not appear to cue syntactic structure.
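For illustration only, such distractor cues could be appended to each word's cue vector as three extra units, as in the sketch below; the crude phonological judgments (vowel-initial, final voicing) are toy placeholders, and speaker pitch is chosen per utterance, independently of syntax.

```python
import random

VOWELS = set("aeiou")
VOICED_FINALS = set("bdgvzmnlrw") | VOWELS      # crude placeholder for word-final voicing

def distractor_cues(word, speaker_is_female):
    """Three units that are salient in the speech stream but carry no
    information about syntactic structure."""
    return [
        1.0 if word[0] in VOWELS else 0.0,          # word-initial vowel
        1.0 if word[-1] in VOICED_FINALS else 0.0,  # word-final voicing
        1.0 if speaker_is_female else 0.0,          # relative speaker pitch
    ]

# Example: distractor cues for one utterance, with the speaker chosen at random
speaker_is_female = random.random() < 0.5
for w in ["the", "boy", "chases", "the", "cat"]:
    print(w, distractor_cues(w, speaker_is_female))
```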

Method

Networks

Networks, groups, and training details were the same as in Simulation 3, except for three additional input units encoding the distractor cues.
