Processing of Recursive Sentence Structure:
Testing Predictions from a Connectionist Model
Program in Neural, Informational and Behavioral Sciences
University of Southern California, Los Angeles, CA 90089-2520
morten@gizmo.usc.edu mcm@gizmo.usc.edu
Abstract
We present results from three psycholinguistic experiments which tested predictions from a connectionist model of recursive sentence processing. The model was originally developed to capture generalization using non-local information (Christiansen, 1994; Christiansen & Chater, 1994). From this model it was possible to derive novel empirical predictions concerning the processing of different kinds of recursive structure. We present behavioral results confirming network predictions about the acceptability of sentences involving multiple right-branching PPs (Experiment 1), multiple left-branching prenominal genitives (Experiment 2), and doubly center-embedded object relative clauses (Experiment 3). Importantly, these predictions derive from the intrinsic architectural constraints of the model (Christiansen & Chater, in submission), rather than from arbitrary, externally specified memory limitations. We conclude that the SRN is well-suited for the modeling of human performance on recursive sentence structure.
1 Introduction
One way to evaluate computational models of psycholinguistic phenomena is to assess how well they match behavioral data and whether they make predictions beyond existing data. Many models only match data at a fairly gross level of performance, and few make predictions that inspire new experiments. This is true of both connectionist and symbolic computational models of language, especially within the area of sentence processing. We introduce a performance measure, the Grammatical Prediction Error (GPE), which allows for the modeling of grammaticality ratings. We use this measure to derive novel empirical predictions from an existing connectionist model of the processing of recursive sentence structure (Christiansen, 1994; Christiansen & Chater, 1994). The predictions suggest that increasing depths of recursion decrease the acceptability not only of center-embedded constructions, but also of the simpler left- and right-branching constructions. These predictions are at odds with many symbolic models of sentence processing. Results from three behavioral experiments are presented, confirming the model's predictions.
2 Connectionist Simulations
The predictions were derived from the Simple Recurrent Network (SRN; Elman, 1990) model of recursive sentence processing developed by Christiansen (1994; Christiansen & Chater, 1994).
Figure 1: The basic architecture of the SRN used in Christiansen (1994): the input layer (42 units) and the context layer (150 units) feed into the hidden layer (150 units). Arrows with solid lines denote trainable weights, whereas the arrow with the dashed line denotes the copy-back connections.
The SRN, as illustrated in Figure 1, is essentially a standard feedforward network equipped with an extra layer of so-called context units. The hidden unit activations from the previous time step are copied back to these context units and paired with the current input, providing the SRN with an ability to deal with integrated sequences of input presented successively. The SRNs were trained via a word-by-word prediction task on 50,000 sentences (mean length: 6 words; range: 3-15 words) generated by the context-free grammar in Figure 2 (using a 38-word vocabulary). This grammar involved left recursion in the form of prenominal genitives; right recursion in the form of subject relative clauses, sentential complements, prepositional modifications of NPs, and NP conjunctions; as well as complex recursion in the form of object relative clauses. The grammar also incorporated subject noun/verb agreement and three additional verb argument structures (transitive, optionally transitive, and intransitive). The generation of sentences was further restricted by probabilistic constraints on the complexity and depth of recursion.
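The copy-back dynamics just described are simple enough to state in a few lines of code. What follows is a minimal sketch of our own, not the original implementation: the layer sizes follow Figure 1, the output layer is assumed to mirror the 42-unit localist input coding, the weights are random stand-ins (in the original work they were trained on the prediction task via back-propagation; Rumelhart, Hinton & Williams, 1986), and names such as srn_step are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID, N_OUT = 42, 150, 42   # sizes from Figure 1; output assumed to mirror input

# Trainable weights (the solid arrows in Figure 1); random stand-ins here.
W_ih = rng.normal(scale=0.1, size=(N_HID, N_IN))    # input   -> hidden
W_ch = rng.normal(scale=0.1, size=(N_HID, N_HID))   # context -> hidden
W_ho = rng.normal(scale=0.1, size=(N_OUT, N_HID))   # hidden  -> output

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def srn_step(word, context):
    """One time step: pair the current input with the copied-back hidden
    state, then activate the units for the predicted next words."""
    hidden = sigmoid(W_ih @ word + W_ch @ context)
    output = sigmoid(W_ho @ hidden)
    return output, hidden   # hidden becomes the next context (dashed arrow)

# Process a sentence word by word, carrying the context along.
context = np.zeros(N_HID)
for word_index in [3, 17, 5]:        # hypothetical localist word codes
    word = np.zeros(N_IN)
    word[word_index] = 1.0
    prediction, context = srn_step(word, context)

The network's memory for the sequence lives entirely in the context vector: each prediction is conditioned on a compressed trace of the sentence so far, which is the locus of the graded, depth-sensitive behavior examined below.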
S   -> NP VP "."
NP  -> PropN | N | N rel | N PP | gen N | N and NP
VP  -> V(i) | V(t) NP | V(o) (NP) | V(c) that S
rel -> who NP V(t|o) | who VP
PP  -> prep locN (PP)
gen -> (gen) N "s"

Figure 2: The small context-free grammar used to generate the training corpus.
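To make the training regime concrete, the sketch below samples sentences from a toy fragment of the Figure 2 grammar. The production weights, the depth cap, and the choice of fragment are our own illustrative assumptions; the original generator additionally enforced noun/verb agreement, the three verb argument structures, and its own probabilistic constraints on recursion.

import random

# Toy fragment of the Figure 2 grammar; weights and depth cap are illustrative.
GRAMMAR = {
    "S":   [(["NP", "VP", "."], 1.0)],
    "NP":  [(["N"], 0.5), (["N", "rel"], 0.2), (["N", "PP"], 0.15),
            (["gen", "N"], 0.15)],
    "VP":  [(["V(i)"], 0.5), (["V(t)", "NP"], 0.5)],
    "rel": [(["who", "VP"], 1.0)],
    "PP":  [(["prep", "locN"], 0.7), (["prep", "locN", "PP"], 0.3)],
    "gen": [(["N", "'s"], 0.7), (["gen", "N", "'s"], 0.3)],
}

def expand(symbol, depth=0, max_depth=4):
    """Recursively rewrite a symbol; past max_depth, force each symbol's
    first production, which bottoms out in terminals."""
    if symbol not in GRAMMAR:                # terminal: emit as-is
        return [symbol]
    rules, weights = zip(*GRAMMAR[symbol])
    if depth >= max_depth:
        rules, weights = (rules[0],), (1.0,)
    rhs = random.choices(rules, weights)[0]
    return [w for s in rhs for w in expand(s, depth + 1, max_depth)]

print(" ".join(expand("S")))   # e.g. "N 's N V(t) N who V(i) ."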
2.1 Deriving Predictions from the SRN Model
When evaluating how well the SRN has learned the regularities of the grammar, it is important from a linguistic perspective not only to determine whether the words that were activated given the prior context are grammatical, but also which items were not activated despite being sanctioned by the grammar. The GPE provides an activation-based measure of how well a network is obeying the training grammar in making its predictions, taking hits, false positives, correct rejections, as well as false negatives into account. The GPE for predicting a particular word was calculated as:
$$\mathrm{GPE} = 1 - \frac{\text{hits}}{\text{hits} + \text{false positives} + \text{false negatives}}$$

Hits and false positives consisted of the accumulated activations of all units that were grammatical (G) and of all activated units that were ungrammatical (U), respectively:

$$\text{hits} = \sum_{i \in G} u_i \qquad \text{false positives} = \sum_{i \in U} u_i \qquad \text{false negatives} = \sum_{i \in G} fn_i$$

False negatives were calculated as a sum over the (positive) discrepancy $fn_i$ between the desired activation $t_i$ for a grammatical unit and the actual activation $u_i$ of that unit:

$$fn_i = \begin{cases} t_i - u_i & \text{if } t_i > u_i \\ 0 & \text{otherwise} \end{cases} \qquad t_i = (\text{hits} + \text{false positives}) \, \frac{f_i}{\sum_{j \in G} f_j}$$

The desired activation, $t_i$, was computed as a proportion of the total activation, determined by the lexical frequency $f_i$ of the word that $u_i$ designates and weighted by the sum of the lexical frequencies $f_j$ of all the grammatical units.
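These definitions translate directly into code. The following is a minimal sketch under hypothetical names of our own (gpe, grammatical, freqs are not the original implementation's); it follows the formulas above exactly.

import numpy as np

def gpe(output, grammatical, freqs):
    """Grammatical Prediction Error for a single word prediction.

    output      -- network output activations, one value per word unit
    grammatical -- indices of the units sanctioned by the grammar (the set G)
    freqs       -- lexical frequency of each word unit
    """
    output = np.asarray(output, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    G = np.asarray(grammatical)
    U = np.setdiff1d(np.arange(output.size), G)   # ungrammatical units

    hits = output[G].sum()
    false_pos = output[U].sum()

    # Desired activations t_i: the total activation shared out among the
    # grammatical units in proportion to their lexical frequencies.
    t = (hits + false_pos) * freqs[G] / freqs[G].sum()
    false_neg = np.maximum(t - output[G], 0.0).sum()  # positive discrepancies only

    return 1.0 - hits / (hits + false_pos + false_neg)

# The sentence-level score is simply the mean GPE over the sentence's words.
def average_gpe(per_word_scores):
    return float(np.mean(per_word_scores))

Averaging the per-word scores over a sentence gives the sentence-level score that is compared with grammaticality ratings below.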
The GPE score for a given word thus reflects the difficulty that the network experienced in predicting that word given the previous sentential context, and can be mapped qualitatively onto word reading times, with high GPE scores indicating long predicted reading times (MacDonald & Christiansen, in submission). The average GPE across a whole sentence expresses the difficulty that the SRN experienced across the sentence as a whole, and has been found to map onto sentence grammaticality ratings (Christiansen & Chater, in submission), with high average GPE scores indicating low acceptability. On this basis, we derived predictions concerning the acceptability of three types of sentences involving complex recursive constructions from the existing model by Christiansen (1994; Christiansen & Chater, 1994).
3 Behavioral Experiments

A number of predictions regarding the processing of recursive sentence structure were derived from the model. Here we focus on the processing of sentences involving multiple instances of three different kinds of recursion: right-branching, left-branching, and center-embedding. These predictions from the SRN model were tested in on-line grammaticality judgment experiments using a self-paced reading task with word-by-word center presentation. Following the presentation of each sentence, subjects rated the sentence on a 7-point scale (7 = bad).
3.1 Experiment 1: Multiple PP Modifications of Nouns

SRN Prediction: Increasing the number of recursions in right-branching constructions involving an NP modified by several PPs should make the sentences less acceptable.
Figure 3: The mean ratings for sentences incorporating 1, 2 or 3 PPs into an NP (Experiment 1).
The results in Figure 3 show that there was a significant effect of depth of recursion in the direction predicted by the SRN model (F1(2,70) = 10.87, p < .0001; F2(2,16) = 12.43, p < .001; N = 36).
3.2 Experiment 2: Multiple Prenominal Genitives
SRN Prediction: Having two levels of recursion in an NP involving left-branching prenominal genitives should be less acceptable in an object position than in a subject position.
In this experiment, subjects were presented with sentences containing multiple prenominal genitives either in the subject position (4) or in the object position (5):
(4) Jane's dad's colleague's parrot followed the baby all afternoon. (subject)
(5) The baby followed Jane's dad's colleague's parrot all afternoon. (object)
As predicted by the SRN model, the results in Figure 4 show that multiple prenominal genitives were less acceptable in object position than in subject position (F1(1,33) = 5.76, p < .03; F2(1,9) = 3.48, p < .1; N = 34).
Figure 4: The mean ratings for sentences incorporating multiple prenominal genitives in subject or object positions (Experiment 2).
Figure 5: The mean ratings for the ungrammatical 2 VP constructions and the grammatical 3 VP sentences (Experiment 3).
3.3 Experiment 3: Doubly Center-Embedded Constructions
Using an off-line task, Gibson & Thomas (1997) found that ungrammatical NP1 NP2 NP3 VP3 VP1 constructions, such as (7), were rated no better than their grammatical counterpart NP1 NP2 NP3 VP3 VP2 VP1, such as (6):
(6) The apartment that the maid who the service had sent over was cleaning every week was well decorated. (3 VPs)
(7) *The apartment that the maid who the service had sent over was well decorated. (2 VPs)
SRN Prediction: People will actually find the grammatical 3 VP sentence (6) worse than the ungrammatical 2 VP sentence (7) when tested on-line.
The results presented in Figure 5 confirmed the SRN prediction: the grammatical 3 VP sentences were rated significantly worse than their ungrammatical 2 VP counterparts (F1(1,35) = 15.55, p < .0001; F2(1,5) = 6.85, p < .05; N = 36).
3.4 Comparing Human and SRN Data
Figure 6 shows that the model's average GPE scores correctly predict the behavioral data both within and across experiments.

Figure 6: Grammaticality ratings (left y-axes) and GPE averages (right y-axes) from Experiments 1 (left panel), 2 (middle panel), and 3 (right panel).
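As a toy illustration of how the fit in Figure 6 can be quantified, one could correlate per-condition GPE averages with mean ratings. The numbers below are invented for illustration only; they are not the values plotted in Figure 6.

import numpy as np

# Invented per-condition numbers, for illustration only (cf. Figure 6).
avg_gpe = np.array([0.18, 0.24, 0.31])    # e.g. 1 PP, 2 PPs, 3 PPs
mean_rating = np.array([2.1, 2.9, 3.8])   # 7-point scale, 7 = bad

r = np.corrcoef(avg_gpe, mean_rating)[0, 1]   # Pearson correlation
print(f"r = {r:.2f}")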
4 Conclusion

We have presented results from three grammaticality judgment experiments confirming novel predictions derived from an existing connectionist model. Importantly, this model was not developed for the purpose of fitting these data, but was nevertheless able to predict the patterns of human grammaticality judgments across three different kinds of recursive structure. We have argued elsewhere that the SRN's ability to model human limitations on complex recursive constructions stems largely from intrinsic architectural constraints (Christiansen & Chater, in submission; MacDonald & Christiansen, in submission). In contrast, the present pattern of results provides a challenge for symbolic models of human performance relying on arbitrary, externally specified memory limitations. The close fit between SRN predictions and the human
data within and across the three experiments suggests that the SRN is well-suited for modeling the processing of recursive sentence structure, and that the GPE provides a useful way of mapping SRN performance onto behavioral data.
References
Christiansen, M.H. (1994). Infinite languages, finite minds: Connectionism, learning and linguistic structure. Unpublished PhD thesis, University of Edinburgh.

Christiansen, M.H. & Chater, N. (1994). Generalization and connectionist language learning. Mind and Language, 9, 273–287.

Christiansen, M.H. & Chater, N. (in submission). Toward a connectionist model of recursion in human linguistic performance.

Elman, J.L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.

Gibson, E. & Thomas, J. (1997). Memory limitations and structural forgetting: The perception of complex ungrammatical sentences as grammatical. Manuscript, MIT, Cambridge, MA.

MacDonald, M.C. & Christiansen, M.H. (in submission). Individual differences without working memory: A reply to Just & Carpenter and Waters & Caplan.

Rumelhart, D.E., Hinton, G.E. & Williams, R.J. (1986). Learning internal representations by error propagation. In McClelland, J.L. & Rumelhart, D.E. (Eds.), Parallel distributed processing, Vol. 1 (pp. 318–362). Cambridge, MA: MIT Press.