
Pseudo-Projectivity: A Polynomially Parsable Non-Projective Dependency Grammar

Sylvain Kahane* and Alexis Nasr† and Owen Rambow‡

* TALANA, Université Paris 7 (sk@ccr.jussieu.fr)

† LIA, Université d'Avignon (alexis.nasr@lia.univ-avignon.fr)

‡ CoGenTex, Inc. (owen@cogentex.com)

1 Introduction

Dependency grammar has a long tradition in syntactic theory, dating back to at least Tesnière's work from the thirties.¹ Recently, it has gained renewed attention as empirical methods in parsing are discovering the importance of relations between words (see, e.g., (Collins, 1997)), which is what dependency grammars model explicitly, but context-free phrase-structure grammars do not. One problem that has posed an impediment to more wide-spread acceptance of dependency grammars is the fact that there is no computationally tractable version of dependency grammar which is not restricted to projective analyses. However, it is well known that there are some syntactic phenomena (such as wh-movement in English or clitic climbing in Romance) that require non-projective analyses. In this paper, we present a form of projectivity which we call pseudo-projectivity, and we present a generative string-rewriting formalism that can generate pseudo-projective analyses and which is polynomially parsable.

The paper is structured as follows. In Section 2, we introduce our notion of pseudo-projectivity. We briefly review a previously proposed formalization of projective dependency grammars in Section 3. In Section 4, we extend this formalism to handle pseudo-projectivity. We informally present a parser in Section 5.

2 Linear and Syntactic Order of a Sentence

2.1 Some Notation and Terminology

We will use the following terminology and notation in this paper. The hierarchical order (dominance) between the nodes of a tree $T$ will be represented with the symbols $\prec_T$ and $\preceq_T$. Whenever they are unambiguous, the notations $\prec$ and $\preceq$ will be used. When $x \prec y$, we will say that $x$ is a descendent of $y$ and $y$ an ancestor of $x$. The projection of a node $x$, belonging to a tree $T$, is the set of the nodes $y$ of $T$ such that $y \preceq_T x$. An arc between two nodes $y$ and $x$ of a tree $T$, directed from $y$ to $x$, will be noted either $(y, x)$ or $\overleftarrow{x}$. The node $x$ will be referred to as the dependent and $y$ as the governor. The latter will be noted, when convenient, $x^{+T}$ ($x^{+}$ when unambiguous). The notations $\overleftarrow{x}$ and $x^{+}$ are unambiguous because a node $x$ has at most one governor in a tree. As usual, an ordered tree is a tree enriched with a linear order over the set of its nodes. Finally, if $l$ is an arc of an ordered tree $T$, then $Supp(l)$ represents the support of $l$, i.e. the set of the nodes of $T$ situated between the extremities of $l$, extremities included. We will say that the elements of $Supp(l)$ are covered by $l$.

¹ The work presented in this paper is collective and the order of authors is alphabetical.

2.2 Projectivity

The notion of projectivity was introduced by (Lecerf, 1960) and has received several different definitions since then. The definition given here is borrowed from (Marcus, 1965) and (Robinson, 1970):

Definition: An arc $\overleftarrow{x}$ is projective if and only if for every $y$ covered by $\overleftarrow{x}$, $y \preceq x^{+}$. A tree $T$ is projective if and only if every arc of $T$ is projective.

A projective tree has been represented in Figure 1.

Figure 1: A projective sub-categorization tree (example sentence: "The big cat sometimes eats white mice")

A projective dependency tree can be associated with a phrase structure tree whose constituents are the projections of the nodes of the dependency tree. Projectivity is therefore equivalent, in phrase structure markers, to continuity of constituent.
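The definition of projectivity can be checked mechanically. The following is a minimal sketch, not part of the original formalization: it assumes an ordered tree is encoded as a list `gov` in which `gov[x]` is the position of the governor of the word at position `x` (`None` for the root). The function names and the two governor lists are illustrative assumptions; in particular, the arcs chosen for the sentences of Figures 1 and 2 are plausible analyses supplied here, since the figures themselves are not reproduced in the text.

```python
from typing import List, Optional

# gov[x] is the position of the governor of the word at position x
# (None for the root); positions follow the linear order of the sentence.
Tree = List[Optional[int]]

def dominated_by(gov: Tree, node: Optional[int], anc: int) -> bool:
    """True if node is anc itself or one of its descendants (node ⪯ anc)."""
    while node is not None:
        if node == anc:
            return True
        node = gov[node]
    return False

def arc_is_projective(gov: Tree, x: int) -> bool:
    """The arc into x is projective iff every covered node y satisfies y ⪯ x+."""
    g = gov[x]
    if g is None:                      # the root has no incoming arc
        return True
    lo, hi = min(g, x), max(g, x)      # support of the arc, extremities included
    return all(dominated_by(gov, y, g) for y in range(lo, hi + 1))

def is_projective(gov: Tree) -> bool:
    return all(arc_is_projective(gov, x) for x in range(len(gov)))

# Assumed analysis of "The big cat sometimes eats white mice" (root: eats).
FIG1: Tree = [2, 2, 4, 4, None, 6, 4]
# Assumed analysis of "Who do you think she invited" (root: do; Who depends on invited).
FIG2: Tree = [5, None, 1, 1, 5, 3]

print(is_projective(FIG1))         # True
print(is_projective(FIG2))         # False
print(arc_is_projective(FIG2, 0))  # False: the arc into "Who" covers the root "do"
```

Under this encoding the only non-projective arc of the second tree is the one from "invited" to "Who", which matches the starred arc described for Figure 2.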

The strong constraints introduced by the projectivity property on the relationship between hierarchical order and linear order allow us to describe word order of a projective dependency tree at a local level: in order to describe the linear position of a node, it is sufficient to describe its position relative to its governor and sister nodes. The domain of locality of the linear order rules is therefore limited to a subtree of depth equal to one. It can be noted that this domain of locality is equal to the domain of locality of sub-categorization rules. Both rules can therefore be represented together as in (Gaifman, 1965) or separately as will be proposed in Section 3.

2.3 Pseudo-Projectivity

Although most linguistic structures can be represented as projective trees, it is well known that projectivity is too strong a constraint for dependency trees, as shown by the example of Figure 2, which includes a non-projective arc (marked with a star).

Figure 2: A non projective sub-categorization tree (example sentence: "Who do you think she invited?")

The non projective structures found in linguistics represent a small subset of the potential non projective structures. We will define a property (more exactly a family of properties), weaker than projectivity, called pseudo-projectivity, which describes a subset of the set of ordered dependency trees, containing the non-projective linguistic structures.

In order to define pseudo-projectivity, we introduce an operation on dependency trees called lifting. When applied to a tree, this operation leads to the creation of a second tree, a lift of the first one. An ordered tree $T'$ is a lift of the ordered tree $T$ if and only if $T$ and $T'$ have the same nodes in the same order and for every node $x$, $x^{+T} \preceq_T x^{+T'}$. We will say that the node $x$ has been lifted from $x^{+T}$ (its syntactic governor) to $x^{+T'}$ (its linear governor).

Recall that the linear position of a node in a projective tree can be defined relative to its governor and its sisters. In order to define the linear order in a non projective tree, we will use a projective lift of the tree. In this case, the position of a node can be defined only with regards to its governor and sisters in the lift, i.e., its linear governor and sisters.

Definition: An ordered tree $T$ is said to be pseudo-projective if there exists a lift $T'$ of tree $T$ which is projective.

If there is no restriction on the lifting, the previous definition is not very interesting since we can in fact take any non-projective tree and lift all nodes to the root node and obtain a projective tree.
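The lift relation and the remark about unrestricted lifting can be illustrated with a short sketch. It is only an illustration under assumptions: trees are encoded as governor lists (`gov[x]` is the position of the governor of word `x`, `None` for the root), the helper names are hypothetical, and the analysis of "Who do you think she invited" is an assumed one.

```python
from typing import List, Optional

Tree = List[Optional[int]]   # gov[x] = position of the governor of x, None for the root

def dominated_by(gov: Tree, node: Optional[int], anc: int) -> bool:
    """True if node is anc itself or one of its descendants (node ⪯ anc)."""
    while node is not None:
        if node == anc:
            return True
        node = gov[node]
    return False

def is_projective(gov: Tree) -> bool:
    for x, g in enumerate(gov):
        if g is None:
            continue
        lo, hi = min(g, x), max(g, x)
        if not all(dominated_by(gov, y, g) for y in range(lo, hi + 1)):
            return False
    return True

def is_lift(gov_t: Tree, gov_lift: Tree) -> bool:
    """T' is a lift of T iff, for every node x, x+T ⪯_T x+T' (same nodes, same order)."""
    if len(gov_t) != len(gov_lift):
        return False
    for g, g2 in zip(gov_t, gov_lift):
        if g is None or g2 is None:
            if g != g2:                # the root must stay the root
                return False
        elif not dominated_by(gov_t, g, g2):
            return False
    return True

def lift_all_to_root(gov: Tree) -> Tree:
    """The unrestricted lift mentioned in the text: every node hangs from the root."""
    root = gov.index(None)
    return [None if x == root else root for x in range(len(gov))]

# Assumed analysis of "Who do you think she invited": Who depends on invited.
T: Tree = [5, None, 1, 1, 5, 3]
# Lift "Who" from "invited" (its syntactic governor) to "do" (its linear governor).
T_LIFT: Tree = [1, None, 1, 1, 5, 3]
T_FLAT: Tree = lift_all_to_root(T)
```

With these definitions `T` is not projective, while both `T_LIFT` and `T_FLAT` are projective lifts of it, which is why the definition only becomes interesting once the admissible lifts are constrained.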

We will therefore constrain the lifting by a set of rules, called lifting rules. Consider a set of (syntactic) categories. The following definitions make sense only for trees whose nodes are labeled with categories.²

The lifting rules are of the following form ($LD$, $SG$ and $LG$ are categories and $w$ is a regular expression on the set of categories):

$LD \uparrow SG\ w\ LG$ (1)

This rule says that a node of category $LD$ can be lifted from its syntactic governor of category $SG$ to its linear governor of category $LG$ through a path consisting of nodes of category $C_1, \ldots, C_n$, where the string $C_1 \ldots C_n$ belongs to $L(w)$. Every set of lifting rules defines a particular property of pseudo-projectivity by imposing particular constraints on the lifting. A linguistic example of lifting rule is given in Section 4.

² It is possible to define pseudo-projectivity purely structurally (i.e. without referring to the labeling). For example, we can impose that each node $x$ is lifted to the highest ancestor of $x$ covered by $\overleftarrow{x}$ (Nasr, 1996). The resulting pseudo-projectivity is a fairly weak extension to projectivity, which nevertheless covers major non-projective linguistic structures. However, we do not pursue a purely structural definition of pseudo-projectivity in this paper.
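As an illustration of how a lifting rule constrains a lift, the sketch below checks whether one rule licenses lifting a given node to a given linear governor. Everything in it is an assumption made for the example: the `LiftingRule` class, the encoding of $w$ as a Python regular expression over space-terminated category labels, the convention that the path is read from the syntactic governor upwards, and the simplified category labels for the sentence used later in Figure 4.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class LiftingRule:
    ld: str   # category of the lifted dependent
    sg: str   # category of its syntactic governor
    lg: str   # category of its linear governor
    w: str    # regular expression over space-terminated category labels

def licenses(rule: LiftingRule, cats: List[str], gov: List[Optional[int]],
             x: int, linear_gov: int) -> bool:
    """True if the rule allows lifting node x to linear_gov in the tree gov."""
    sg = gov[x]
    if sg is None or cats[x] != rule.ld or cats[sg] != rule.sg:
        return False
    if cats[linear_gov] != rule.lg:
        return False
    # Collect the categories strictly between the syntactic and the linear governor.
    path = []
    node = gov[sg]
    while node is not None and node != linear_gov:
        path.append(cats[node])
        node = gov[node]
    if node is None:                   # linear_gov is not a proper ancestor of sg
        return False
    text = "".join(c + " " for c in path)
    return re.fullmatch(rule.w, text) is not None

# "beans Fernando thought yesterday Milagro claims Carlos eats slowly"
CATS = ["Ntop", "N", "Vbridge", "Adv", "N", "Vbridge", "N", "V", "Adv"]
GOV = [7, 2, None, 2, 5, 2, 7, 5, 7]   # syntactic governors (assumed analysis)
RULE = LiftingRule(ld="Ntop", sg="V", lg="Vbridge", w=r"(?:Vbridge )*")

print(licenses(RULE, CATS, GOV, x=0, linear_gov=2))
```

Here "beans" is lifted from "eats" to "thought" across the bridge verb "claims"; replacing the category of "claims" by a non-bridge label makes the same call return `False`.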

The idea of building a projective tree by means of lifting appears in (Kunze, 1968) and is used by (Hudson, 1990) and (Hudson, unpublished). This idea can also be compared to the notion of word order domain (Reape, 1990; Bröker and Neuhaus, 1997), to the Slash feature of GPSG and HPSG, to the functional uncertainty of LFG, and to the Move-α of GB theory.

3 Projective Dependency Grammars Revisited

We (informally) define a projective Dependency Grammar as a string-rewriting system³ by giving a set of categories such as $N$, $V$ and $Adv$,⁴ a set of distinguished start categories (the root categories of well-formed trees), a mapping from strings to categories, and two types of rules: dependency rules which state hierarchical order (dominance) and LP rules which state linear order. The dependency rules are further subdivided into subcategorization rules (or s-rules) and modification rules (or m-rules). Here are some sample s-rules:

$d_1 : V_{trans} \rightarrow N_{nom}, N_{obj}$ (2)

$d_2 : V_{clause} \rightarrow N_{nom}, V$ (3)

Here is a sample m-rule:

$d_3 : V \rightarrow Adv$ (4)

LP rules are represented as regular expressions (actually, only a limited form of regular expressions) associated with each category. We use the hash sign (#) to denote the position of the governor (head). For example:

$p_1 : V_t = (Adv)\ N_{nom}\ (Aux)\ Adv^{*}\ \#\ N_{obj}\ Adv^{*}\ V_t$ (5)
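To make the role of an LP rule concrete, the sketch below compiles a rule such as (5) into a Python regular expression and tests candidate orders of a governor and its dependents. It rests on assumptions that go beyond the text: the LP rule is read as an ordering template in which every non-head slot may be left empty when the dependency rules introduce no such dependent (this is one possible reading of "a limited form of regular expressions"), feature subscripts are flattened into plain labels, and the names `Slot`, `lp_to_regex` and `respects_lp` are hypothetical.

```python
import re
from typing import List, Sequence, Tuple

Slot = Tuple[str, bool]   # (category label, True if the slot carries a Kleene star)

# Rule (5), with the final V_t slot written as a plain "V" label.
P1: List[Slot] = [("Adv", False), ("Nnom", False), ("Aux", False), ("Adv", True),
                  ("#", False), ("Nobj", False), ("Adv", True), ("V", False)]

def lp_to_regex(slots: Sequence[Slot]):
    """Compile an LP template into a regex over space-terminated labels."""
    parts = []
    for label, starred in slots:
        token = re.escape(label) + " "
        if label == "#":
            parts.append(token)              # the head is mandatory
        elif starred:
            parts.append(f"(?:{token})*")    # zero or more dependents
        else:
            parts.append(f"(?:{token})?")    # assumption: slot may stay empty
    return re.compile("".join(parts))

def respects_lp(pattern, order: Sequence[str]) -> bool:
    """True if the left-to-right sequence of labels is consistent with the LP rule."""
    text = "".join(label + " " for label in order)
    return pattern.fullmatch(text) is not None

P1_RE = lp_to_regex(P1)
print(respects_lp(P1_RE, ["Adv", "Nnom", "#", "V"]))    # yesterday Fernando thought [V]
print(respects_lp(P1_RE, ["Nnom", "#", "Nobj", "Adv"])) # Carlos eats beans slowly
print(respects_lp(P1_RE, ["Nobj", "Nnom", "#"]))        # object before the subject
```

The first two orders are accepted and the third is rejected, because the only slot for `Nobj` in the template lies to the right of the head.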

³ We follow (Gaifman, 1965) throughout this paper by modeling a dependency grammar with a string-rewriting system. However, we will identify a derivation with its representation as a tree, and we will sometimes refer to symbols introduced in a rewrite step as "dependent nodes". For a model of a DG based on tree-rewriting (in the spirit of Tree Adjoining Grammar (Joshi et al., 1975)), see (Nasr, 1995).

⁴ In this paper, we will allow finite feature structures on categories, which we will notate using subscripts; e.g., $V_{trans}$. Since the feature structures are finite, this is simply a notational variant of a system defined only with simple category labels.

Figure 3: A sample GDG derivation (derived sentence: "yesterday Fernando thought Carlos eats beans slowly")

We will call this system generative dependency grammar or GDG for short.

Derivations in GDG are defined as follows. In a rewrite step, we choose a multiset of dependency rules (i.e., a set of instances of dependency rules) which contains exactly one s-rule and zero or more m-rules. The left-hand side nonterminal of each rule is the one we want to rewrite. Call this multiset the rewrite-multiset. In the rewriting operation, we introduce a multiset of new nonterminals and exactly one terminal symbol (the head). The rewriting operation then must meet the following three conditions:

• There is a bijection between the set of dependents of the instances of rules in the rewrite-multiset and the set of newly introduced dependents.

• The order of the newly introduced dependents is consistent with the LP rule associated with the governor.

• The introduced terminal string (head) is mapped to the rewritten category.

As an example, consider a grammar containing the three dependency rules $d_1$ (rule 2), $d_2$ (rule 3), and $d_3$ (rule 4), as well as the LP rule $p_1$ (rule 5). In addition, we have some lexical mappings (they are obvious from the example), and the start symbol is $V_{finite:+}$. A sample derivation is shown in Figure 3, with the sentential form representation on top and the corresponding tree representation below.

Using this kind of representation, we can derive a bottom-up parser in the following straightforward manner.⁵ Since syntactic and linear governors coincide, we can derive deterministic finite-state machines which capture both the dependency and the LP rules for a given governor category. We will refer to these FSMs as rule-FSMs, and if the governor is of category $C$, we will refer to a $C$-rule-FSM. In a rule-FSM, the transitions are labeled by categories, and the transition corresponding to the governor is labeled by its category and a special mark (such as #). This transition is called the "head transition".

The entries in the parse matrix $M$ are of the form $(m, q)$, where $m$ is a rule-FSM and $q$ a state of it, except for the entries in squares $M(i, i)$, $1 \le i \le n$, which also contain category labels. Let $w_1 \cdots w_n$ be the input word. We initialize the parse matrix as follows. Let $C$ be a category of word $w_i$. First, we add $C$ to $M(i, i)$. Then, we add to $M(i, i)$ every pair $(m, q)$ such that $m$ is a rule-FSM with a transition labeled $C$ from a start state and $q$ the state reached after that transition.⁶

Embedded in the usual three loops on $i$, $j$, $k$, we add an entry $(m_1, q)$ to $M(i, j)$ if $(m_1, q_1)$ is in $M(k, j)$, $(m_2, q_2)$ is in $M(i, k+1)$, $q_2$ is a final state of $m_2$, $m_2$ is a $C$-rule-FSM, and $m_1$ transitions from $q_1$ to $q$ on $C$ (a non-head transition). There is a special case for the head transitions in $m_1$: if $k = i - 1$, $C$ is in $M(i, i)$, $m_1$ is a $C$-rule-FSM, and there is a head transition from $q_1$ to $q$ in $m_1$, then we add $(m_1, q)$ to $M(i, j)$.

The time complexity of the algorithm is $O(n^3 |G| Q_{max})$, where $|G|$ is the number of rule-FSMs derived from the dependency and LP rules in the grammar and $Q_{max}$ is the maximum number of states in any of the rule-FSMs.
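The bottom-up recognition scheme described above can be sketched in a few lines of Python. This is a simplified variant written for illustration, not the matrix layout of the text: spans are half-open (`chart[i][j]` covers `words[i:j]`), every rule-FSM starts in its start state on every empty span instead of being seeded from the first word, each cell is closed under the two kinds of transition by a small fixpoint loop, and the hand-built rule-FSMs, the lexicon and the flattened categories (`N` for both nominative and objective nouns) are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

HEAD = "#"   # label of the head transition

@dataclass
class RuleFSM:
    cat: str
    start: int
    finals: FrozenSet[int]
    delta: Dict[Tuple[int, str], int]   # (state, label) -> next state

def recognize(words, lexicon, fsms, start_cats) -> bool:
    n = len(words)
    # chart[i][j]: set of (category, state) pairs for the span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for m in fsms.values():
            chart[i][i].add((m.cat, m.start))
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = chart[i][j]
            changed = True
            while changed:               # close the cell under both transitions
                changed = False
                for k in range(i, j):
                    for c, q1 in list(chart[i][k]):
                        m, new = fsms[c], set()
                        # head transition: the last word of the span is the governor
                        if (k == j - 1 and c in lexicon.get(words[k], ())
                                and (q1, HEAD) in m.delta):
                            new.add((c, m.delta[(q1, HEAD)]))
                        # non-head transition: attach a complete subtree over words[k:j]
                        for d, q2 in list(chart[k][j]):
                            if q2 in fsms[d].finals and (q1, d) in m.delta:
                                new.add((c, m.delta[(q1, d)]))
                        if not new <= cell:
                            cell |= new
                            changed = True
    return any(c in start_cats and q in fsms[c].finals for c, q in chart[0][n])

def leaf(cat: str) -> RuleFSM:
    return RuleFSM(cat, 0, frozenset({1}), {(0, HEAD): 1})

FSMS = {
    "N": leaf("N"),
    "Adv": leaf("Adv"),
    # (Adv) N # N Adv*
    "Vtrans": RuleFSM("Vtrans", 0, frozenset({4}),
                      {(0, "Adv"): 1, (0, "N"): 2, (1, "N"): 2,
                       (2, HEAD): 3, (3, "N"): 4, (4, "Adv"): 4}),
    # (Adv) N # Vtrans
    "Vclause": RuleFSM("Vclause", 0, frozenset({4}),
                       {(0, "Adv"): 1, (0, "N"): 2, (1, "N"): 2,
                        (2, HEAD): 3, (3, "Vtrans"): 4}),
}
LEXICON = {"yesterday": {"Adv"}, "slowly": {"Adv"}, "Fernando": {"N"},
           "Carlos": {"N"}, "beans": {"N"}, "thought": {"Vclause"}, "eats": {"Vtrans"}}

SENTENCE = "yesterday Fernando thought Carlos eats beans slowly".split()
print(recognize(SENTENCE, LEXICON, FSMS, {"Vclause"}))
```

With this toy grammar the sentence of Figure 3 is accepted, while "Fernando yesterday thought Carlos eats beans" is rejected because the `Vclause` machine has no adverb transition after the subject. The fixpoint loop trades some efficiency for brevity, so the sketch does not claim the stated complexity bound.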

⁵ This type of parser has been proposed previously. See for example (Lombardi, 1996; Eisner, 1996), who also discuss Earley-style parsers for projective dependency grammars.

⁶ We can use pre-computed top-down prediction to limit the number of pairs added.

4 A Formalization of PP-Dependency Grammars

Recall that in a pseudo-projective tree, we make a distinction between a syntactic governor and a linear governor. A node can be "lifted" along a lifting path from being a dependent of its syntactic governor to being a dependent of its linear governor, which must be an ancestor of the syntactic governor. In defining a formal rewriting system for pseudo-projective trees, we will not attempt to model the "lifting" as a transformational step in the derivation. Rather, we will directly derive the "lifted" version of the tree, where a node is dependent of its linear governor. Thus, the derived structure resembles more a unistratal dependency representation like those used by (Hudson, 1990) than the multistratal representations of, for example, (Mel'čuk, 1988). However, from a formal point of view, the distinction is not significant.

In order to capture pseudo-projectivity, we will interpret rules of the form (2) (for subcategorization of arguments by a head) and (4) (for selection of a head by an adjunct) as introducing syntactic dependents which may lift to a higher linear governor. An LP rule of the form (5) orders all linear dependents of the linear governor, no matter whose syntactic dependents they are.

In addition, we need a third type of rule, namely a lifting rule, or l-rule (see Section 2.3). The l-rule (1) can be rewritten in the following form:

$l_1 : LG \rightarrow LD\ \{LG \cdot w\ SG\ LD\}$ (6)

This rule resembles normal dependency rules, but instead of introducing syntactic dependents of a category, it introduces a lifted dependent.

Besides introducing a linear dependent $LD$, an l-rule should make sure that the syntactic governor of $LD$ will be introduced at a later stage of the derivation, and prevent it from introducing $LD$ as its syntactic dependent; otherwise non-projective nodes would be introduced twice, a first time by their linear governor and a second time by their syntactic governor. This condition is represented in the rule by means of a constraint on the categories found along the lifting path. This condition, which we call the lifting condition, is represented by the regular expression $LG\ w\ SG$. The regular expression representing the lifting condition is enriched with a dot separating, on its left, the part of the lifting path which has already been introduced during the rewriting and, on its right, the part which is still to be introduced for the rewriting to be valid. The dot is an imperfect way of representing the current state in a finite state automaton equivalent to the regular expression. We can further notice that the lifting condition ends with a repetition of $LD$ for reasons which will be made clear when discussing the rewriting process.

A sentential form contains terminal strings and categories paired with a multiset of lifting conditions, called the lift multiset. The lift multiset associated to a category $C$ contains 'transiting' lifting conditions: those introduced by ancestors of $C$ and passing across $C$.

Three cases must be distinguished when rewriting a category $C$ and its lift multiset $LM$:

• $LM$ contains a single lifting condition whose dot is situated at its right end: $LG\ w\ SG\ C\,\cdot$. In such a case, $C$ must be rewritten by the empty string. The position of the dot at the right of the lifting condition indicates that $C$ has been introduced by its syntactic governor although it has already been introduced by its linear governor earlier in the rewriting process. This is the reason why $C$ has been added at the end of the lifting condition.

• $LM$ contains several lifting conditions, one of which has its dot at the right end. In such a case, the rewriting fails since, in accordance with the preceding case, $C$ must be rewritten by the empty string. Therefore, the other lifting conditions of $LM$ will not be satisfied. Furthermore, a single instance of a category cannot anchor more than one lifting condition.

• $LM$ contains several lifting conditions, none of which has the dot at its right end. In this case, a rewrite multiset of dependency rules and lifting rules, both having $C$ as their left-hand side, is selected. The result of the rewriting then must meet the following conditions:

  1. The order of the newly introduced dependents is consistent with the LP rule associated with $C$.

  2. The union⁷ of the lift multisets associated with all the newly introduced (instances of) categories is equal to the union of the lift multiset of $C$ and the multiset composed of the lifting conditions of the l-rules used in the rewriting operation.

  3. The lifting conditions contained in the lift multiset of each newly introduced dependent $D$ should be compatible with $D$, with the dot advanced appropriately.

In addition, we require that, when we rewrite a category as a terminal, the lift multiset is empty.

⁷ When discussing set operations on multisets, we of course mean the corresponding multiset operations.
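The phrase "with the dot advanced appropriately" can be made concrete with a small simulation. The sketch below is an assumption-laden illustration rather than the authors' procedure: a lifting condition is stored as a tuple of elements, each marked as mandatory, starred or optional; the dot is an index into that tuple; a category matches an element when it has the same part of speech and at least the element's features; and, because of stars and optional elements, advancing the dot returns a set of possible positions. The names `Cat`, `Elem`, `advance` and `dot_at_end` are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple

class Cat(NamedTuple):
    pos: str
    feats: FrozenSet[str] = frozenset()

class Elem(NamedTuple):
    cat: Cat
    kind: str = "one"   # "one" (mandatory), "star" (Kleene star) or "opt" (optional)

def subsumes(pattern: Cat, cat: Cat) -> bool:
    """A pattern matches any category with the same pos and at least its features."""
    return pattern.pos == cat.pos and pattern.feats <= cat.feats

def advance(cond: Tuple[Elem, ...], dot: int, cat: Cat) -> Set[int]:
    """Dot positions reachable from `dot` after passing the condition to a node `cat`."""
    result = set()
    i = dot
    while i < len(cond):
        elem = cond[i]
        if subsumes(elem.cat, cat):
            # a starred element can be read again, so the dot may stay in front of it
            result.add(i if elem.kind == "star" else i + 1)
        if elem.kind == "one":
            break       # a mandatory element cannot be skipped
        i += 1          # starred and optional elements may be skipped
    return result

def dot_at_end(cond: Tuple[Elem, ...], dot: int) -> bool:
    """Dot at the right end: the category must be rewritten as the empty string."""
    return dot == len(cond)

# Lifting condition of the form  . V*[bridge] V N[obj, top]  (dot initially at 0).
COND = (Elem(Cat("V", frozenset({"bridge"})), "star"),
        Elem(Cat("V")),
        Elem(Cat("N", frozenset({"obj", "top"}))))

bridge_verb = Cat("V", frozenset({"bridge"}))
trans_verb = Cat("V", frozenset({"trans"}))
lifted_noun = Cat("N", frozenset({"obj", "top"}))

print(advance(COND, 0, bridge_verb))
print(advance(COND, 0, trans_verb))
print(advance(COND, 2, lifted_noun))
```

Passing the condition to a bridge verb leaves two options (the dot stays at 0, or the verb is taken to be the syntactic governor and the dot moves to 2); a non-bridge verb forces position 2; passing it on to the objective noun moves the dot to the right end, and an empty result signals an incompatible category.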

Let us consider an example. Suppose we have a grammar containing the dependency rules $d_1$ (rule 2), $d_2$ (rule 3), and $d_3$ (rule 4); the LP rule $p_1$ (rule 5) and $p_2$:

$p_2 : V_{clause} = (N_{top:+} \mid N_{wh:+})\ (Adv)\ N_{nom}\ (Aux)\ Adv^{*}\ \#\ Adv^{*}\ V_t$

Furthermore, we have the following l-rule:

$l_1 : V_{bridge:+} \rightarrow N_{case:obj\ top:+}\ \{\cdot\, V^{*}_{bridge:+}\ V\ N_{case:obj\ top:+}\}$

This rule says that an objective wh-noun with feature top:+ which depends on a verb with no further restrictions (the third V in the lifting path) can raise to any verb that dominates its immediate governor as long as the raising path contains only verbs with feature bridge:+, i.e., bridge verbs.

Figure 4: A sample PP-GDG derivation (derived sentence: "beans Fernando thought yesterday Milagro claims Carlos eats slowly")

A sample derivation is shown in Figure 4, with the sentential form representation on top and the corresponding tree representation below. We start our derivation with the start symbol $V_{clause}$ and rewrite it using dependency rules $d_2$ and $d_3$, and the lifting rule $l_1$ which introduces an objective NP argument. The lifting condition of $l_1$ is passed to the $V$ dependent but the dot remains at the left of $V^{*}_{bridge:+}$ because of the Kleene star. When we rewrite the embedded $V$, we choose to rewrite again with $V_{clause}$, and the lifting condition is passed on to the next verb. This verb is a $V_{trans}$ which requires an $N_{obj}$. The lifting condition is passed to $N_{obj}$ and the dot is moved to the right of the regular expression, therefore $N_{obj}$ is rewritten as the empty string.

5 A Polynomial Parser for PP-GDG

In this section, we show that pseudo-projective dependency grammars as defined in Section 2.3 are polynomially parsable.

We can extend the bottom-up parser for GDG to a parser for PP-GDG in the following manner. In PP-GDG, syntactic and linear governors do not necessarily coincide, and we must keep track separately of linear precedence and of lifting (i.e., "long distance" syntactic dependence).

The entries in the parse matrix $M$ are of the form $(m, q, LM)$, where $m$ is a rule-FSM, $q$ a state of $m$, and $LM$ is a multiset of lifting conditions as defined in Section 4. An entry $(m, q, LM)$ in a square $M(i, j)$ of the parse matrix means that the sub-word $w_i \ldots w_j$ of the input can be analyzed by $m$ up to state $q$ (i.e., it matches the beginning of an LP rule), but that nodes corresponding to the lifting rules in $LM$ are being lifted from the subtrees spanning $w_i \ldots w_j$. Put differently, in this bottom-up view $LM$ represents the set of nodes which have a syntactic governor in the subtree spanning $w_i \ldots w_j$ and a lifting rule, but are still looking for a linear governor.

Suppose we have an entry in the parse matrix $M$ of the form $(m, q, LM)$. As we traverse the $C$-rule-FSM $m$, we recognize one by one the linear dependents of a node of category $C$. Call this governor $\eta$. The action of adding a new entry to the parse matrix corresponds to adding a single new linear dependent to $\eta$. (While we are working on the $C$-rule-FSM $m$ and are not yet in a final state, we have not yet recognized $\eta$ itself.) Each new dependent $\eta'$ brings with it a multiset of nodes being lifted from the subtree it is the root of. Call this multiset $LM'$. The new entry will be $(m, q', LM \cup LM')$ (where $q'$ is the state that $m$ transitions to when $\eta'$ is recognized as the next linear dependent).

When we have reached a final state $q$ of the rule-FSM $m$, we have recognized a complete subtree rooted in the new governor, $\eta$. Some of the dependent nodes of $\eta$ will be both syntactic and linear dependents of $\eta$, and the others will be linear dependents of $\eta$, but lifted from a descendent of $\eta$. In addition, $\eta$ may have syntactic dependents which are not realized as its own linear dependents and are lifted away. (No other options are possible.) Therefore, when we have reached the final state of a rule-FSM, we must connect up all nodes and lifting conditions before we can proceed to put an entry $(m, q, LM)$ in the parse matrix. This involves these steps:

1. For every lifting condition in $LM$, we ensure that it is compatible with the category of $\eta$. This is done by moving the dot leftwards in accordance with the category of $\eta$. (The dot is moved leftwards since we are doing bottom-up recognition.)

   The obvious special provisions deal with the Kleene star and optional elements. If the category matches a category with Kleene star in the lifting condition, we do not move the dot. If the category matches a category which is to the left of an optional category, or to the left of a category with Kleene star, then we can move the dot to the left of that category.

   If the dot cannot be placed in accordance with the category of $\eta$, then no new entry is made in the parse matrix for $\eta$.

2. We then choose a multiset of s-, m-, and l-rules whose left-hand side is the category of $\eta$. For every dependent of $\eta$ introduced by an l-rule, the dependent must be compatible with an instance of a lifting condition in $LM$ (whose dot must be at its beginning, or separated from the beginning by optional or starred categories only); the lifting condition is then removed from $LM$.

3. If, after the above repositioning of the dot and the linking up of all linear dependents to lifting conditions, there are still lifting conditions in $LM$ such that the dot is at the beginning of the lifting condition, then no new entry is made in the parse matrix for $\eta$.

4. For every syntactic dependent of $\eta$, we determine if it is a linear dependent of $\eta$ which has not yet been identified as lifted. For each syntactic dependent which is not also a linear dependent, we check whether there is an applicable lifting rule. If not, no entry is made in the parse matrix for $\eta$. If yes, we add the lifting rule to $LM$.

This procedure determines a new multiset $LM$ so we can add entry $(m, q, LM)$ in the parse matrix. (In fact, it may determine several possible new multisets, resulting in multiple new entries.) The parse is complete if there is an entry $(m, q_m, \emptyset)$ in square $M(n, 1)$ of the parse matrix, where $m$ is a $C$-rule-FSM for a start category and $q_m$ is a final state of $m$. If we keep backpointers at each step in the algorithm, we have a compact representation of the parse forest.

The maximum number of entries in each square of the parse matrix is $O(G Q n^{L})$, where $G$ is the number of rule-FSMs corresponding to LP rules in the grammar, $Q$ is the maximum number of states in any of the rule-FSMs, and $L$ is the maximum number of states that the lifting rules can be in (i.e., the number of lifting conditions in the grammar multiplied by the maximum number of dot positions of any lifting condition). Note that the exponent is a grammar constant, but this number can be rather small since the lifting rules are not lexicalized: they are construction-specific, not lexeme-specific. The time complexity of the algorithm is therefore $O(G Q n^{3+2|L|})$.

References

Norbert Bröker and Peter Neuhaus. 1997. The complexity of recognition of linguistically adequate dependency grammars. In 35th Meeting of the Association for Computational Linguistics (ACL'97), Madrid, Spain. ACL.

M. Collins. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, Madrid, Spain, July.

Jason M. Eisner. 1996. Three new probabilistic models for dependency parsing: An exploration. In Proceedings of the 16th International Conference on Computational Linguistics (COLING'96), Copenhagen.

Haim Gaifman. 1965. Dependency systems and phrase-structure systems. Information and Control, 8:304-337.

Richard Hudson. 1990. English Word Grammar. Basil Blackwell, Oxford, UK.

Richard Hudson. Unpublished. Discontinuity. e-preprint (ftp.phon.ucl.ac.uk).

Aravind K. Joshi, Leon Levy, and M. Takahashi. 1975. Tree adjunct grammars. J. Comput. Syst. Sci., 10:136-163.

Jürgen Kunze. 1968. The treatment of non-projective structures in the syntactic analysis and synthesis of English and German. Computational Linguistics, 7:67-77.

Yves Lecerf. 1960. Programme des conflits, modèle des conflits. Bulletin bimestriel de l'ATALA, 4,5.

Vicenzo Lombardi. 1996. An Earley-style parser for dependency grammars. In Proceedings of the 16th International Conference on Computational Linguistics (COLING'96), Copenhagen.

Solomon Marcus. 1965. Sur la notion de projectivité. Zeitschr. f. math. Logik und Grundlagen d. Math., 11:181-192.

Igor A. Mel'čuk. 1988. Dependency Syntax: Theory and Practice. State University of New York Press, New York.

Alexis Nasr. 1995. A formalism and a parser for lexicalised dependency grammars. In 4th International Workshop on Parsing Technologies, pages 186-195, Prague.

Alexis Nasr. 1996. Un système de reformulation automatique de phrases fondé sur la Théorie Sens-Texte : application aux langues contrôlées. Ph.D. thesis, Université Paris 7.

Michael Reape. 1990. Getting things in order. In Proceedings of the Symposium on Discontinuous Constituents, Tilburg, Holland.

Jane J. Robinson. 1970. Dependency structures and transformational rules. Language, 46(2):259-285.
