
Rules for Pronominalization

Franz Guenthner, Hubert Lehmann
IBM Deutschland GmbH
Heidelberg Science Center
Tiergartenstr. 15, D-6900 Heidelberg, FRG

Abstract

Rigorous interpretation of pronouns is possible when syntax, semantics, and pragmatics of a discourse can be reasonably controlled. Interaction with a database provides such an environment. In the framework of the User Specialty Languages system and Discourse Representation Theory, we formulate strict and preferential rules for pronominalization and outline a procedure to find proper assignments of referents to pronouns.

1 Overview: Relation to previous work

One of the main obstacles to the automated processing of natural language sentences (and a fortiori texts) is the proper treatment of anaphoric relations. Even though there is a plethora of research attempting to specify (both on the theoretical level as well as in connection with implementations) "strategies" for "pronoun resolution", it is fair to say

a) that no uniform and comprehensive treatment of anaphora has yet been attained

b) that surprisingly little effort has been spent in applying the results of research in linguistics and formal semantics in actual implemented systems.

A quick glance at Hirst (1981) will confirm that there is a large gap between the kinds of theoretical issues and puzzling cases that have been considered on the one hand in the setting of computational linguistics and on the other in recent semantically oriented approaches to the formal analysis of natural languages.

One of the main aims of this paper is to bridge this gap by combining recent efforts forthcoming in formal semantics (based on Montague grammar and Discourse Representation Theory) with existing and relatively comprehensive grammars of German and English constructed in connection with the User Specialty Languages (USL) system, a natural language database query system briefly described below.

We have drawn extensively -- as far as insights, examples, puzzles and adequacy conditions are concerned -- on the various "variable binding" approaches to pronouns (e.g. work in the Montague tradition, the illuminating discussion by Evans (1980) and Webber (1978), as well as recent transformational accounts). Our approach has however been most deeply influenced by those who have (like Smaby (1979), (1981) and Kamp (1981)) advocated dispensing with pronoun indexing on the one hand and by those (like Chastain (1973), Evans (1980), and Kamp (1981)) who have emphasized the "referential" function of certain uses of indefinite noun phrases.

2 Background

Contrary to what is assumed in most theories of pronominalization (namely that the most propitious way of dealing with pronouns is to consider them as a kind of indexed variable), we agree with Kamp (1981) and Smaby (1979) in treating pronouns as bona fide lexical elements at the level of syntactic representation.

Treatments of anaphora have taken place within two quite distinct settings, so it seems. On the one hand, linguists have primarily been concerned with the specification of mainly syntactic criteria in determining the proper "binding" and "disjointness" criteria (cf. below), whereas computational linguists have in general paid more attention to anaphoric relations in texts, where semantic and pragmatic features play a much greater role. In trying to relate the two approaches one should be aware that, in the absence of any serious theory of text understanding, any attempt to deal with anaphora in unrestricted domains (even if they are as simple as, for instance, children's stories) will encounter many diverse problems which, even when they influence anaphoric relations, are completely beyond the scope of a systematic treatment at the present moment. We have thought it to be important therefore to impose some constraints right from the start on the type of discourse with respect to which our treatment of anaphora is to be validated (or falsified). Of course, what we are going to say should in principle be extendible to more complex types of discourse in the future.

The context of the present inquiry is the querying of relational databases (as opposed to, say, general discourse analysis). The type of discourse we are interested in is thus dialogue in the setting of a relational database (which may be said to represent both the context of queries and answers as well as the "world"). It should be clear that a wide variety of anaphoric expressions is available in this kind of interaction; on the other hand, the relevant knowledge we assume in resolving pronominal relations must come from the information specified in the database (in the relations, in the various dependencies and integrity constraints) and in the rules governing the language.

We are making the following assumptions for database querying. A query dialogue is a sequence of pairs <query, answer>. For the sake of simplicity we assume that the possible answers are of the form

yes/no answer

singleton answer
(e.g. Spain, to a query like "Who borders Portugal?")

set answer
([France, Portugal, ...] to a query like "Who borders Spain?")

multiple answer
([<France, Spain>, ...] to a query like "Who borders who?")

and

refusal
(when a pronoun cannot receive a proper interpretation)
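The five answer forms listed above can be written down as a small data model. The following is only an illustrative sketch in Python: the class names and the helper `classify` are hypothetical and do not come from the USL system, and treating an empty result as an empty set answer is an assumption made here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union

@dataclass(frozen=True)
class YesNo:
    value: bool

@dataclass(frozen=True)
class Singleton:
    value: str

@dataclass(frozen=True)
class SetAnswer:
    values: FrozenSet[str]

@dataclass(frozen=True)
class MultipleAnswer:
    rows: Tuple[Tuple[str, ...], ...]

@dataclass(frozen=True)
class Refusal:
    reason: str  # e.g. a pronoun that could not be interpreted

Answer = Union[YesNo, Singleton, SetAnswer, MultipleAnswer, Refusal]

def classify(rows):
    """Map result tuples of a wh-query to an answer form (hypothetical helper)."""
    rows = [tuple(r) for r in rows]
    if rows and len(rows[0]) > 1:
        # Several columns per row: "Who borders who?"
        return MultipleAnswer(tuple(rows))
    values = frozenset(r[0] for r in rows)
    if len(values) == 1:
        return Singleton(next(iter(values)))
    # Assumption: an empty result is reported as an empty set answer.
    return SetAnswer(values)

# A query dialogue is a sequence of <query, answer> pairs.
dialogue = [
    ("Who borders Portugal?", classify([("Spain",)])),
    ("Who borders Spain?", classify([("France",), ("Portugal",)])),
]
```

Keeping refusal as an answer form of its own lets a dialogue record that a query was rejected because of an uninterpretable pronoun, rather than silently returning an empty result.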

2.1 The User Specialty Languages system

The USL system (Lehmann (1978), Ott and Zoeppritz (1979), Lehmann (1980)) provides an interface to a relational data base management system for data entry, query, and manipulation via restricted natural language. The USL System translates input queries expressed in a natural language (currently German (Zoeppritz (1983)), English, and Spanish (Sopeña (1982))) into expressions in the SQL query language, and evaluates those expressions through the use of System R (Astrahan & al. (1976)). The prototype built has been validated with real applications and thus shown its usability. The system consists of (1) a language processing component (ULG), (2) grammars for German, English, and Spanish, (3) a set of 75 interpretation routines, (4) a code generator for SQL, and (5) the data base management system System R. USL runs under VM/CMS in a virtual machine of 7 MBytes; working set size is 1.8 MBytes. ULG, interpretation routines, and code generator comprise approximately 40,000 lines of PL/I code.

Syntactic analysis

The syntax component of USL uses the User Language Generator (ULG) which originates from the Paris Scientific Center of IBM France and has been described by Bertrand & al. (1976). ULG consists of a parser, a semantic executer, the grammar META, and META interpretation routines. META is used to process the grammar of a language. ULG accepts general phrase structure grammars written in a modified Backus-Naur-Form. With any rule it allows the specification of arbitrary routines to control its application or to perform arbitrary actions, and it allows sophisticated checking and setting of syntactic features. Grammars for German, English, and Spanish have been described in a form accepted by ULG. The grammars provide rules for those fragments of the languages relevant for communicating with a database. The USL grammars have been constructed in such a way that constituents correspond as closely as possible to semantic relationships in the sentence, and that parsing is made as efficient as possible. Where a true representation of the semantic relationships in the parse tree could not be achieved, the burden was put on the interpretation routines to remedy the situation.

Interpretation

The approach to interpretation in the USL system builds on the ideas of model theoretic semantics. This implies that the meaning of structure words and syntactic constructions is interpreted systematically and independent of the contents of a given database. Furthermore, since a relational database can be regarded as a (partial) model in the sense of model theory, the interpretation of natural language concepts in terms of relations is quite natural. (A more detailed discussion can be found in Lehmann (1978).)

In the USL system, extensions of concepts are represented as virtual relations of a relational database which are defined on physically stored relations (base relations). The set of virtual relations represents the conceptual knowledge about the data and is directly linked to natural language words and phrases. This approach has the advantage that extensions of concepts can relatively easily be related to objects of conventional databases.

For illustration of the connection between virtual relations and words, consider the following example. Suppose that for a geographical application someone has arranged the data in the form of the relation

CO (COUNTRY, CAPITAL, AREA, POPULATION)

Now virtual relations such as the following, which correspond to concepts, can be formed by simply projecting out the appropriate columns of CO:

CAPITAL (NOM_CAPITAL, OF_COUNTRY)

Standard role names (OF, NOM, ...) establish the connection between syntactic constructions and columns of virtual relations and enable answering questions such as

(1) What is Austria's capital?

in a straightforward and simple way. Standard role names are surface oriented because this makes it possible for a user not trained in linguistics to define his own words and relations. (For a complete list of standard role names see e.g. Zoeppritz (1983).)
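The projection from the base relation CO to the virtual relation CAPITAL can be mimicked with an SQL view. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the original system generated SQL for System R, and the SQL it actually produced is not shown in the text. The helper `capital_of` is hypothetical, the two sample rows are illustrative, and AREA and POPULATION are left empty so that no figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base relation CO, as given in the text.
conn.execute(
    "CREATE TABLE CO (COUNTRY TEXT, CAPITAL TEXT, AREA REAL, POPULATION INTEGER)"
)
conn.executemany(
    "INSERT INTO CO VALUES (?, ?, ?, ?)",
    [("Austria", "Vienna", None, None), ("Spain", "Madrid", None, None)],
)

# Virtual relation CAPITAL(NOM_CAPITAL, OF_COUNTRY): a projection of CO whose
# column names carry the standard role names NOM and OF.
conn.execute(
    "CREATE VIEW CAPITAL AS "
    "SELECT CAPITAL AS NOM_CAPITAL, COUNTRY AS OF_COUNTRY FROM CO"
)

def capital_of(country):
    """Answer 'What is <country>'s capital?' against the virtual relation."""
    row = conn.execute(
        "SELECT NOM_CAPITAL FROM CAPITAL WHERE OF_COUNTRY = ?", (country,)
    ).fetchone()
    return row[0] if row else None

print(capital_of("Austria"))
```

The genitive in question (1) maps to the OF column and the queried noun to the NOM column, which is the kind of surface-oriented link between syntax and columns that the role names are meant to provide.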

We are currently working on the integration of the concepts underlying the USL system with Discourse Representation Theory which is described in the next section. We have already implemented a procedure which generates Discourse Representation Structures from USL's semantic trees and which covers the entire fragment of language described in Kamp (1981).

2.2 Discourse Representation Theory (DRT)

In this section we give a brief description of Kamp's Discourse Representation Theory (DRT) in as much as it relates to our concerns with pronominalization. For a more detailed discussion of this theory and its general ramifications for natural language processing, cf. the papers by Kamp (1981) and Guenthner (1983a, 1983b).

According to DRT, each natural language sentence (or discourse) is associated with a so-called Discourse Representation Structure (DRS) on the basis of a set of DRS formation rules. These rules are sensitive both to the syntactic structure of the sentences in question and to the DRS context in which the sentence occurs. In the formulation of Kamp (1981) the latter is really of importance only in connection with the proper analysis of pronouns. We feel on the other hand that the DRS environment of a sentence to be processed should determine much more than just the anaphoric assignments. We shall discuss this issue - in particular as it relates to problems of ambiguity and vagueness - in more depth in a forthcoming paper.

A DRS K for a discourse has the general form

K = <U, Con>

where U is a set of "discourse referents" for K and Con a set of "conditions" on these individuals. Conditions can be either atomic or complex. An atomic condition has the form

P(t1, ..., tn)

or

t1 = c

where ti is a discourse referent, c a proper name, and P an n-place predicate.

The only complex condition we shall discuss here is the one representing universally quantified noun phrases or conditional sentences. Both are treated in much the same way. Let us call these "implicational" conditions:

K1 IMP K2

where K1 and K2 are also DRSs. With a discourse D is thus associated a Discourse Representation Structure which represents D in a quantifier-free "clausal" form, and which captures the propositional import of the discourse by, among other things, establishing the correct pronominal connections.
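The definition K = <U, Con>, with atomic conditions and implicational conditions K1 IMP K2, translates naturally into a recursive data type. The following Python sketch is one possible encoding, not the representation used in the authors' implementation; all class and function names (`Atom`, `Eq`, `Imp`, `DRS`, `render`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

@dataclass
class Atom:
    """Atomic condition P(t1, ..., tn)."""
    pred: str
    args: Tuple[str, ...]

@dataclass
class Eq:
    """Atomic condition t1 = c, with c a proper name."""
    ref: str
    name: str

@dataclass
class Imp:
    """Implicational condition K1 IMP K2."""
    antecedent: "DRS"
    consequent: "DRS"

Condition = Union[Atom, Eq, Imp]

@dataclass
class DRS:
    """K = <U, Con>: discourse referents plus conditions on them."""
    universe: List[str] = field(default_factory=list)
    conditions: List[Condition] = field(default_factory=list)

def render(cond):
    """Print a condition in the notation used in the text."""
    if isinstance(cond, Atom):
        return f"{cond.pred}({', '.join(cond.args)})"
    if isinstance(cond, Eq):
        return f"{cond.ref} = {cond.name}"
    left = "; ".join(render(c) for c in cond.antecedent.conditions)
    right = "; ".join(render(c) for c in cond.consequent.conditions)
    return f"[{left}] IMP [{right}]"

# A flat DRS with two proper names and one two-place predicate.
k = DRS(
    universe=["u1", "u2"],
    conditions=[Eq("u1", "John"), Eq("u2", "Bill"), Atom("tickled", ("u1", "u2"))],
)
for c in k.conditions:
    print(render(c))
```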

What is important for the treatment of anaphora in the present context is the following:

a) Given a discourse with a principal DRS K0 and a set of non-principal DRSs (or conditions) Ki among its conditions, all discourse referents of K0 are admissible referents for pronouns in sentences (or phrases) giving rise to the various embedded Ki's. In particular, all occurrences of proper names in a discourse will always be associated with discourse referents of the principal DRS K0. (This is on the (admittedly unrealistic) assumption that proper names refer uniquely.)

b) Given an implicational DRS of the form K1 IMP K2 occurring in a DRS K, a relation of relative accessibility between DRSs is defined as follows: K1 is accessible from K2, and all K' accessible from K1 are also accessible from K2. In particular, the principal DRS K0 is accessible from its subordinate DRSs (for a precise definition cf. Kamp (1981)). The import of this definition for anaphora is simply that if a pronoun is being resolved (i.e. interpreted) in the context of a DRS K' from which a set K of DRSs is accessible, then the union of all the sets of discourse referents associated with every Ki in K is the set of admissible candidates for the interpretation of the pronoun.

The following illustrations will make this clear:

K (Every country imports a product it needs)

country(u1)   IMP   import(u1, u2)
                    product(u2)
                    need(u1, u2)

This sentence (as well as its interrogative version) allows only one interpretation of the pronoun it according to DRT. It does not introduce any discourse referent available for pronominalization in later sentences (or queries). But in a DRS like the following, DRT does not - as it stands - account for pronoun resolution:

K (John tickled Bill. He squirmed.)

u1 u2
u1 = John
u2 = Bill
tickled(u1, u2)

At this point, the pronoun he has to be interpreted. There are two admissible candidates, u1 and u2, but DRT does not choose between them. So the DRS could be continued with either

squirm(u1)

or

squirm(u2)

Similarly, in the following DRS

K (If Spain is a member of every organization, it has a member)

[ [organization(u2)] IMP [...] ]   IMP   [u3   member(u3, it)]

the pronoun it could only refer to Spain (on configurational grounds), and would have to be assigned that object if no other criteria are assumed. Obviously, as far as this sentence and the intended database are concerned, we should want to rule out such an assignment. (This can be done via rule S1 discussed below.)
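The accessibility relation of (b) and the resulting candidate sets can be computed mechanically. The sketch below is a simplified Python model, not the authors' procedure: each DRS records only its universe, its enclosing DRS (`parent`) and, if it is the right-hand side of K1 IMP K2, its left-hand side (`antecedent`). Two assumptions are made here: a DRS counts as accessible from itself, and the universes assigned to the sub-DRSs (u1 in the antecedent, u2 and u3 in the consequents) as well as the nesting of the Spain example are reconstructions, since the printed diagrams are only partly legible.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)  # compare DRSs by identity
class DRS:
    name: str
    universe: List[str] = field(default_factory=list)
    parent: Optional["DRS"] = None       # enclosing DRS
    antecedent: Optional["DRS"] = None   # K1, if this DRS is K2 in "K1 IMP K2"

def accessible(k):
    """All DRSs accessible from k, transitively (k itself included)."""
    seen, stack = [], [k]
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.append(cur)
        for nxt in (cur.antecedent, cur.parent):
            if nxt is not None:
                stack.append(nxt)
    return seen

def candidates(k):
    """Union of the universes of all DRSs accessible from k."""
    return sorted({u for d in accessible(k) for u in d.universe})

# "Every country imports a product it needs"
k0 = DRS("K0")
k1 = DRS("K1", ["u1"], parent=k0)
k2 = DRS("K2", ["u2"], parent=k0, antecedent=k1)
print(candidates(k2))  # referents visible inside the consequent
print(candidates(k0))  # referents left over for later queries

# "If Spain is a member of every organization, it has a member"
s0 = DRS("K0", ["u1"])                              # u1 = Spain
s1 = DRS("K1", parent=s0)                           # if-part
s3 = DRS("K3", ["u2"], parent=s1)                   # organization(u2)
s4 = DRS("K4", parent=s1, antecedent=s3)            # member(u1, u2)
s2 = DRS("K2", ["u3"], parent=s0, antecedent=s1)    # member(u3, it)
print(candidates(s2))
```

In the second example the referent u2 of "every organization" is not reachable from the consequent, which matches the observation that "it" cannot refer to an organization on configurational grounds. The referent u3 still appears in the computed set because it is introduced by the pronoun's own clause; excluding it is the job of other criteria, not of accessibility.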

In general, then, given a sentence (or discourse) represented in a DRS there will be more candidates for admissible pronoun assignments than one should like to have available when a particular pronoun is to be interpreted. The rules described in Section 3 are meant to capture some of the regularities that arise in typical database querying interactions.

c) Finally, given a DRS for a discourse D we can say that a pronoun is properly referential iff it is represented by (i.e. eliminated in favor of) a discourse referent ui occurring in the domain of the principal DRS representing D. (In the context of the constructions illustrated so far, this will be true in particular of proper names as well as of indefinite noun phrases not in the scope of a universal noun phrase or a conditional.)

The main problem then for the treatment of anaphora is to determine which possible discourse referents should be chosen when we come to the interpretation of a particular pronoun occurrence pi in the formation of the extension of the DRS in which we are working.

We would like to suggest the following strategy as a starting point. Consider a query dialogue Q with an already established DRS K and the utterance of a query S, where S contains occurrences of personal pronouns. Suppose further that A(S) is the sole syntactic analysis available for S. Then we regard the construction of the extension of the DRS obtained on the basis of S and K as the value of a partial function f defined on K and A(S). More generally still, as Kamp himself suggests, we can regard the "meaning" (or information content) of a sentence to be that partial function from DRSs to DRSs.

In a given dialogue both the queries and the answers will have the side effect of introducing new individuals and "preference" or "salience" orderings on these individuals, and we want to allow for pronominal reference to these much in the same way that in a text preceding sentences may have determined a set of possible antecedents for pronouns in the currently processed sentence. The DRS built up in the process of a querying session will constitute the "mutual knowledge" available to the user in specifying his further queries as well as in his uses of pronouns. It is on the individuals introduced in the DRSs that the rules to be discussed below are intended to operate.

3 Interplay of syntax, semantics, and pragmatics in pronominalization

The process of pronominalization is governed by rules involving morphological, syntactic, semantic, and pragmatic criteria. These rules are discussed and illustrated with examples drawn from the context of querying a geographical database. Then a procedure is outlined which uses these rules and applies them in the following order:

First morphological criteria are checked; if they fail no further tests are required.

Then syntactic (or configurational) criteria are tested. Again, if they fail, no further tests are necessary.

Next semantic criteria are applied, and if they do not fail,

the pragmatic criteria have to be tested. If more than one candidate remains, the use of the pronoun was pragmatically inappropriate and must be noted as such.
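The ordering of the four tests can be sketched as a short filter cascade. This is a hedged illustration of the control flow only: the function `resolve`, its status strings, and the toy predicates in the usage example are hypothetical, and returning a refusal when no candidate survives is an assumption based on the "refusal" answer form of Section 2.

```python
def resolve(pronoun, candidates, morphological, configurational, semantic, pragmatic):
    """Apply the four kinds of criteria in order to each candidate.

    Each criterion is a function (pronoun, candidate) -> bool. all() stops at
    the first failing test, so later tests are skipped for that candidate.
    """
    tests = (morphological, configurational, semantic, pragmatic)
    remaining = [c for c in candidates if all(t(pronoun, c) for t in tests)]
    if not remaining:
        return ("refusal", [])
    if len(remaining) == 1:
        return ("resolved", remaining)
    # More than one survivor: the pronoun was used inappropriately.
    return ("pragmatically inappropriate", remaining)

# Toy usage with made-up predicates (not the paper's rules).
TYPES = {"Austria": "country", "Vienna": "capital"}

def always(pronoun, cand):
    return True

def only_countries(pronoun, cand):
    return TYPES[cand] == "country"

result = resolve("it", ["Austria", "Vienna"], always, always, only_countries, always)
print(result)
```

Putting the cheapest and strictest tests first means the later, more knowledge-intensive tests are evaluated only for candidates that are still viable.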

3.1 Strict factors determining the admissibility of anaphora

3.1.1 Morphological criteria

Morphological criteria concern the agreement of gender and number. Complications come in when coordinated noun phrases occur, e.g.

(2) John and Bill went to Pisa. They delivered a paper.
(3) *John and Bill went to Pisa. He delivered a paper.
(4) John and Sue went to Pisa. He delivered a paper.
(5) *John or Bill went to Pisa. They delivered a paper.
(6) *John or Bill went to Pisa. He delivered a paper.
(7) Neither John nor Bill went to Pisa. They went to Rome.
(8) *Either John or Bill did not go to Pisa. He went to Rome.

The starred examples contain inappropriate uses of pronouns. With and-coordination, reference to the complete NP is possible with a plural pronoun. When the members of the coordination are distinct in gender and/or number, reference to them is possible with the corresponding pronouns. Clearly, the same observations hold for interrogative sentences.
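The agreement behaviour of and-coordination in examples (2) to (4) can be approximated as follows. This Python sketch covers only and-coordination; the or- and neither-nor cases in (5) to (8) are not modelled. The feature encoding, the names `NP`, `PRONOUNS` and `and_coordination_referents`, and the reading that a singular pronoun matching several members is inappropriate (as in (3)) are assumptions of this sketch, not rules stated in that form in the text.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class NP:
    text: str
    gender: str  # "m", "f" or "n"
    number: str  # "sg" or "pl"

# Pronoun features; None means the pronoun does not constrain gender.
PRONOUNS = {
    "he": ("m", "sg"),
    "she": ("f", "sg"),
    "it": ("n", "sg"),
    "they": (None, "pl"),
}

def agrees(np, pronoun):
    gender, number = PRONOUNS[pronoun]
    return np.number == number and gender in (None, np.gender)

def and_coordination_referents(members: List[NP], pronoun: str) -> List[str]:
    """Morphologically possible referents inside 'A and B' for a pronoun."""
    result = []
    if PRONOUNS[pronoun][1] == "pl":
        # A plural pronoun can pick up the complete coordinated NP.
        result.append(" and ".join(m.text for m in members))
    # Individual members are available if they agree with the pronoun.
    result.extend(m.text for m in members if agrees(m, pronoun))
    return result

john = NP("John", "m", "sg")
bill = NP("Bill", "m", "sg")
sue = NP("Sue", "f", "sg")

print(and_coordination_referents([john, bill], "they"))  # example (2)
print(and_coordination_referents([john, bill], "he"))    # example (3)
print(and_coordination_referents([john, sue], "he"))     # example (4)
```

A use of the pronoun would then count as morphologically acceptable when exactly one referent is returned; two matches for "he" in (3) correspond to the starred judgment.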

3.1.2 Configurational criteria

Syntactic criteria operate only within the boundaries of a sentence; outside they are useless. The configurational criteria stemming from DRT however work independent of sentence boundaries.


Disjoint reference

The rule of "disjoint reference" according to Reinhart (1983) goes back to Chomsky and has been refined by Lasnik (1976) and Reinhart (1983). It is able to handle a variety of well-known cases, such as

(9) When did it join the UN?
(10) Which countries that import it, produce petrol?
(11) *Does it entertain diplomatic relations with Spain's neighbor?

(In the starred example, the use of "it" is inappropriate, if it is to be coreferential with "Spain".)

Rather than using c-command to formulate this criterion, which is elegant but too strict in some cases (as noted by Reinhart herself and Bolinger (1979)), we have chosen an admittedly less elegant, but hopefully reliable, approach to disjoint reference, in that we specify the concrete syntactic configurations where disjoint reference holds. We do not rely here on the syntactic framework of USL grammar, but use more or less traditionally known terminology for expressing our rules. We need the terms "clause", "phrase", "matrix", "embedding", and "level". These can be made explicit when a suitable syntactic framework is chosen.

Now we can formulate our disjoint reference rule and some of its less obvious consequences.

C1 The referent of a personal pronoun can never be within the same clause at the same phrase level. (Note that this rule does not hold for possessive pronouns.)

C1 has a number of consequences which we now list:

C1a The (implicit) subject of an infinitive clause can never be referent of a personal pronoun in that clause.
(12) Does the EC want to dissolve it?

C1b Nouns common to coordinate clauses cannot be referred to from within these coordinate clauses.
(13) Which country borders it and Spain?

C1c [...] clause can never be referred to.
(14) Does it border Spain's neighbors?

The following rules have to do with phrases and clauses modifying a noun. They too can be regarded as consequences of C1.

C2 Head noun of a phrase or clause can never be referent of a personal pronoun in that phrase or clause.

C2a Head noun of participial phrase
(15) a country exporting petrol to it

C2b Head noun of that-clause
(16) the truth is that it follows from A

C2c Head noun of relative clause
(17) the country it exports petrol to

The following two rules deal with kataphoric pronominalization (sometimes called backward pronominalization).

C3a Kataphora into a more deeply embedded clause is impossible.

(18) Did it export a product that Spain produces?

C3b Kataphora into a succeeding coordinate clause is impossible.

(19) Who did not belong to it but left the UN?

The accessibility relation on DRSs

C4 Only those discourse referents in the accessibility relation defined in sec. 2.2 are available as referents to a pronoun.

3.1.3 Semantic criteria

Widely used is the criterion of semantic compatibility. It is usually implemented via "semantic features". In the USL framework we can derive this information from relation schemata. We state the criterion as follows:

S1 Let s be a sentence containing a pronoun p, and c a full noun phrase in the context of p. If p is substituted by c in s to yield s', and s' is not semantically anomalous, i.e. does not imply a contradiction, then c is semantically compatible with s and is hence a semantically possible candidate for the reference of p.

(20) What is the capital of Austria? - Vienna
What does it export?

If it is assumed that only countries but not capitals export goods, then the only semantically possible referent for "it" is Austria.

S2 Non-referentially introduced nouns cannot be antecedents of pronouns.

(21) Which countries does Italy have trade with? How large is it?

Since "trade" is used non-referentially, it cannot be antecedent of "it". Unfortunately, in many cases where this criterion could apply, there is an ambiguity between referential and non-referential use.

Apart from the type of semantic compatibility covered by rule S1, more complex semantic properties are used to determine the referent of a pronoun. The "task structures" described by Grosz (1977) illustrate this fact. We hence formulate the rule

S3 The properties of and relationships between predicates determine pronominalizability.

For an illustration of its effect, consider the following query:

(22) What country is its neighbor?

The irreflexivity of the neighbor-relation entails that "its" cannot be bound by "what country" in this case, but has to refer to something mentioned in the previous context.

Given a subject domain, one can analyze the properties of the relations and the relationships between them and so build a basis for deciding pronoun reference on semantic grounds. In the framework of the USL system, information on the properties of relations is available in terms of "functional dependencies" given in the database schema or as integrity constraints.
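The semantic criteria S1 and S3 can be read as a filter over candidate referents. The following Python fragment is only a sketch under assumed data: the tables `SCHEMA`, `SORT`, and `IRREFLEXIVE` are hypothetical stand-ins for what would be derived from relation schemata and integrity constraints, and none of the names are taken from the USL system.

```python
# Hypothetical relation schema: relation name -> sorts of its two arguments.
SCHEMA = {
    "export": ("country", "product"),
    "neighbor": ("country", "country"),
}
# Hypothetical sort assignment for individuals in the database.
SORT = {"Austria": "country", "Vienna": "capital", "Spain": "country"}
# Relations assumed to be declared irreflexive by an integrity constraint.
IRREFLEXIVE = {"neighbor"}


def s1_compatible(candidate, relation, position):
    """S1: substituting the candidate for the pronoun must not be anomalous.

    Anomaly is approximated here by a sort mismatch with the schema."""
    return SORT.get(candidate) == SCHEMA[relation][position]


def s3_compatible(candidate, relation, other_argument):
    """S3: an irreflexive relation cannot relate an individual to itself."""
    return not (relation in IRREFLEXIVE and candidate == other_argument)


def semantic_filter(candidates, relation, position, other_argument=None):
    """Keep the candidates that pass both S1 and S3."""
    return [c for c in candidates
            if s1_compatible(c, relation, position)
            and s3_compatible(c, relation, other_argument)]


# Example (20): "What does it export?" with Austria and Vienna in context.
print(semantic_filter(["Austria", "Vienna"], "export", 0))
```

Under these assumptions only "Austria" survives for example (20), because "Vienna" has the sort "capital" and the first argument of "export" requires a country. Approximating anomaly by a sort check is a simplification; S1 itself is stated in terms of contradiction.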

3.2 Pragmatic criteria

The generation of discourse is controlled by two factors: communicative intentions and mutual knowledge. In the context of database interaction, we can assume that the communicative intentions of a user are simply to obtain factual answers to factual questions. His intentions are expressed either by single queries or by sequences of queries, depending on how complex these intentions are or how closely they correspond to the information in the database. As will be shown below, in many cases the system will not have a chance to determine whether a given query is a "one-shot query", or whether it is part of a sequence of queries with a common "theme". For the resolution of pronouns, this means that the system should rather ask the user back than make wild guesses on what might be the most "plausible" referent. This is of course not possible when running text is analyzed in a "batch mode", and no user is there to be asked for clarification.

Mutual knowledge (see e.g. Clark and Marshall (1981) for a discussion) determines the rules for introducing and referencing individuals in the discourse. In the context of database interaction we assume the mutual knowledge to consist initially of:

- the set of proper names in the database,
- the predicates whose extensions are in the database,
- the "common sense" relationships between and properties of these predicates.

It will be part of the design of a database to establish what these "common sense" relationships and properties are, e.g., whether it is generally known to the user community whether "capital" expresses a one-one relation. Each question-answer pair occurring in the discourse is added to the stock of mutual knowledge.

It is a pragmatic principle of pronominalization that only mutual knowledge may be used to determine the referent of a pronoun on semantic grounds, and hence it may be legal to use the same sentence containing a pronoun where earlier in the discourse it was illegal, because the mutual knowledge has increased in the meantime.
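One way to picture this principle is a store of shared facts that grows with every question-answer pair. The sketch below is an illustration only: the class `MutualKnowledge`, the function `can_rule_out`, and the encoding of facts as tuples such as `("sort", "Vienna", "capital")` are assumptions made for the example, not part of the system described here.

```python
class MutualKnowledge:
    """Facts shared by user and system; grows with each exchange."""

    def __init__(self, proper_names, common_sense):
        self.proper_names = set(proper_names)
        # Facts are 3-tuples, e.g. ("sort", "Austria", "country").
        self.facts = set(common_sense)

    def add_exchange(self, derived_facts):
        """Record the facts established by one question-answer pair."""
        self.facts |= set(derived_facts)


def can_rule_out(mk, candidate, required_sort):
    """A candidate may be excluded on semantic grounds only if it is
    mutually known to have a sort other than the required one."""
    return any(kind == "sort" and who == candidate and sort != required_sort
               for kind, who, sort in mk.facts)


mk = MutualKnowledge({"Austria", "Vienna"}, {("sort", "Austria", "country")})
before = can_rule_out(mk, "Vienna", "country")   # not yet shared knowledge
mk.add_exchange({("sort", "Vienna", "capital")})
after = can_rule_out(mk, "Vienna", "country")    # now shared knowledge
print(before, after)
```

In this reading, a pronoun whose candidates are "Austria" and "Vienna" stays ambiguous until it is mutually known that Vienna is a capital; afterwards the same sentence can be resolved.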

3.2.1 A first attempt using preference rules

What the topic of a discourse is, which of the entities mentioned in it are in focus, is reflected in the syntactic structure of sentences. This has been observed for a long time. It has also often been observed that discourse topic and focus have an effect on pronominalization where morphological, configurational, and semantic rules fail to determine a single candidate for reference. However, it has not been possible yet to formulate precise rules explaining this phenomenon. We have the impression that such rules cannot be absolutely strict rules, but are of a preferential nature. We have developed a set of such rules and tested them against a corpus of text containing some 600 pronoun occurrences, and have found them to work remarkably well. Similar tests (with a similar set of rules) have been conducted by Hofmann (1976).

In the sequel we formulate and discuss our list of rules. Their ordering corresponds to the order in which they have to be applied.

P1 (principle of proximity) Noun phrases within the sentence containing the pronoun are preferred over noun phrases in previous or succeeding sentences.

Consider the sequence

(23) What country joined the EC after 1980? - Greece
(24) What country consumes the wine it produces?

One could argue that "Greece" is just as probably the intended referent of "it" in this case as the bound interpretation and that hence the use of "it" should be rejected as inappropriate. However, there is no way to avoid the "it", if the bound variable interpretation is intended, and one can use this as a ground to rule out the interpretation where "it" refers to "Greece".

P1a Noun phrases in sentences before the sentence containing the pronoun are preferred over noun phrases in more distant sentences.

This criterion is very important to limit the search for possible discourse referents.

P2 Pronouns are preferred over full noun phrases.

This rule is found in many systems dealing with anaphora. One can motivate it by saying that pronominalization establishes an entity as a theme which is then maintained until the chain of pronouns is broken by a sentence not containing a suitable pronoun. For an example consider:

(25) What is the area of Austria?
(26) What is its capital?
(27) What is its population?

P3 Noun phrases in a matrix clause or phrase are preferred over noun phrases in embedded clauses or phrases.

P3a Noun phrases in a matrix clause are preferred over noun phrases in embedded clauses.

Example:

(28) What country imports a product that Spain produces? - Denmark
(29) What does it export?

Here "it" has to refer to the individual satisfying "what country", not to "Spain", which occurs in an embedded clause.

P3b Head nouns are preferred over noun complements.

Example:

(30) What is the capital of Austria? - Vienna
(31) What is its population?

"Vienna", not "Austria", becomes the referent of "its", and the argument is analogous to that for P3a.

P4 Subject noun phrases are preferred over non-subject noun phrases.

In declarative contexts, this rule works quite well. It corresponds essentially to the focus rule of Sidner (1981). In a question-answering situation it is hardly applicable, since especially in wh-questions subject position and word order, which both play a role, tend to interfere. We therefore tend to not use this rule, but rather to let the system ask back in cases where it would apply. For illustration consider the following examples:

(32) Does Spain border Portugal? What is its population?
(33) Is Spain bordered by Portugal? What is its population?
(34) Which country borders Portugal? What is its population?
(35) Which country does Portugal border? What is its population?

P5 Accusative object noun phrases are preferred over other non-subject noun phrases.

P6 Noun phrases preceding the pronoun are preferred over noun phrases succeeding the pronoun (or: anaphora is preferred over kataphora).
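The ordered application of the preference rules can be sketched as successive narrowing of a candidate list. The Python fragment below is a minimal illustration, not the authors' implementation: the `Candidate` fields, the helper `prefer`, and the choice to narrow step by step and never empty the list are all assumptions made for this example. P4 is left out, in line with the remark above that it is hardly applicable in question-answering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    distance: int = 1         # 0: same sentence as the pronoun, 1: adjacent, ...
    pronominal: bool = False  # last mentioned by a pronoun (P2)
    matrix: bool = True       # in a matrix clause or phrase, or a head noun (P3)
    accusative: bool = False  # accusative object (P5)
    precedes: bool = True     # occurs before the pronoun (P6)


def prefer(test):
    """Turn a yes/no test into a rule that narrows a candidate list."""
    def rule(candidates):
        preferred = [c for c in candidates if test(c)]
        return preferred or candidates  # a rule never empties the list
    return rule


def nearest(candidates):
    """P1 and P1a: keep the candidates at the smallest sentence distance."""
    best = min(c.distance for c in candidates)
    return [c for c in candidates if c.distance == best]


RULES = [
    nearest,                          # P1, P1a
    prefer(lambda c: c.pronominal),   # P2
    prefer(lambda c: c.matrix),       # P3
    prefer(lambda c: c.accusative),   # P5
    prefer(lambda c: c.precedes),     # P6
]


def apply_preferences(candidates):
    """Apply the rules in order until at most one candidate remains."""
    for rule in RULES:
        if len(candidates) <= 1:
            break
        candidates = rule(candidates)
    return candidates


# Examples (28) and (29): Denmark is in the matrix clause, Spain is embedded.
result = apply_preferences([Candidate("Denmark"), Candidate("Spain", matrix=False)])
print([c.name for c in result])
```

If more than one candidate is left after the last rule, the list is returned unchanged in size, which corresponds to the situation where the system would ask the user back.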

3.3 Outline of a pronoun resolution procedure

We now outline a procedure for "resolving" pronouns in the framework of the USL system and DRT.

Let M = <U, Con> be the DRS representing the mutual knowledge, in particular the past discourse. Let K(s) be the DRS representing the current sentence s and let p be a pronoun occurring in s for which an appropriate discourse referent has to be found. Let Ua(p) be the set of discourse referents accessible to p according to the accessibility relation given in sec. 2.2.

Let further c be a function that applies to Ua(p) all the morphological, syntactic, and semantic criteria given above and yields a set Uc(p) as result. Now three cases have to be distinguished:

1. Uc(p) is empty. In this case the use of p was inappropriate.

2. Card(Uc(p)) is 1. In this case a referent for p has been uniquely determined, p is replaced by it in the DRS, and the procedure is finished.

3. Card(Uc(p)) is greater than 1. In this case the preference rules are applied.

Let p be a function that applies to Uc(p), if the cardinality of Uc(p) is greater than 1, all the preference rules given above in the order indicated there, yielding the result Up. Card(Up) can never be 0, hence two cases are possible: either the cardinality is 1, then a referent has been uniquely determined and the pronoun p can be eliminated in K, or the cardinality is greater than 1, and then the use of p was inappropriate.

It can be inferred from the formulation of the pronominalization rules given above what morphological and syntactic information has to be stored with the discourse referents in the DRSs, and what semantic information has to be accessible from the schema of the database to enable the application of the functions c and p. Hence, we will not spell out these details here.
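The three cases of the procedure can be sketched in a few lines of Python. This is a hedged illustration under assumptions: the function name `resolve`, the representation of discourse referents as dictionaries, and the passing of criteria and rules as plain callables are all choices made for the example. Computing the accessible referents from the DRSs is taken as given.

```python
def resolve(pronoun, accessible, strict_criteria, preference_rules):
    """Return the referent chosen for `pronoun`, or None if its use
    was inappropriate.

    accessible       -- the referents accessible to the pronoun (sec. 2.2)
    strict_criteria  -- callables (referent, pronoun) -> bool; together
                        they play the role of the function c
    preference_rules -- ordered callables list -> list; together they
                        play the role of the preference function
    """
    # Apply all strict criteria to the accessible referents.
    u_c = [r for r in accessible
           if all(ok(r, pronoun) for ok in strict_criteria)]
    if not u_c:
        return None        # case 1: no admissible referent
    if len(u_c) == 1:
        return u_c[0]      # case 2: referent uniquely determined
    # Case 3: apply the preference rules in the given order.
    u_p = u_c
    for rule in preference_rules:
        narrowed = rule(u_p)
        if narrowed:       # the result set is never allowed to become empty
            u_p = narrowed
        if len(u_p) == 1:
            return u_p[0]
    return None            # still ambiguous: use of the pronoun was inappropriate


# Hypothetical data for examples (28) and (29).
referents = [{"name": "Denmark", "matrix": True},
             {"name": "Spain", "matrix": False}]
is_country = lambda r, p: True                        # both pass the strict criteria
prefer_matrix = lambda rs: [r for r in rs if r["matrix"]]
print(resolve("it", referents, [is_country], [prefer_matrix])["name"])
```

Returning `None` in both failure cases keeps the sketch short; an interactive system would instead ask the user for clarification, as suggested in the discussion of pragmatic criteria.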

4 Open questions and conclusions

Many well-known and puzzling cases have not been addressed here, among them plural anaphora, so-called pronouns of laziness, one-pronominalization, to name just a few.

We have not said anything about phenomena such as discourse topic, focus, or coherence and their influence on anaphora. Their effects are captured in our preference rules to some degree, but no one can precisely say how. In spite of claims to the contrary, we believe that much work is still re- [...] effectively in natural language processing.

By limiting ourselves to the relatively well-defined communicative situation of database interaction, we have been able to state precisely what rules are applicable in the fragment of language we are dealing with. We are currently working on the analysis of running texts, but again in a well-delineated domain, and we hope to be able to extend our theory on the basis of the experience gained.

We are convinced that serious progress in the understanding of anaphora and of discourse phenomena in general is only possible through a careful control of the environment, and on a solid syntactic and semantic foundation.

References

Astrahan, M. M., M. W. Blasgen, D. D. Chamberlin, K. P. Eswaran, J. N. Gray, P. P. Griffiths, W. F. King, R. A. Lorie, P. R. McJones, J. W. Mehl, G. R. Putzolu, I. L. Traiger, B. W. Wade, V. Watson (1976): "System R: Relational Approach to Database Management", ACM Transactions on Database Systems, vol. 1, no. 2, June 1976, p. 97.

Bertrand, O., J. J. Daudenarde, D. Starynkevitch, A. Stenbock-Fermor (1976): "User Applica- [...] Conference on Relational Data Base Systems, Bari, Italy, p. 83.

Bolinger, D. (1979): "Pronouns in Discourse", in: T. Givon (ed.): Syntax and Semantics, Vol. 12: Discourse and Syntax, Academic Press, New York, p. 289.

[...] Thesis, Princeton.

Clark, H. H. and C. R. Marshall (1981): "Definite Reference and Mutual Knowledge", in: B. L. Webber, A. K. Joshi, and I. A. Sag (eds.): Elements of Discourse Understanding, Cambridge University Press, Cambridge, p. 10.

Donnellan, K. S. (1978): "Speaker Reference, Descriptions and Anaphora", in: P. Cole (ed.): Syntax and Semantics, Vol. 9: Pragmatics, Academic Press, New York, p. 47.

[...] Inquiry, vol. 11.

Grosz, B. J. (1977): "The Representation and Use of Focus in Dialogue Understanding", Technical [...] California.

Guenthner, F. (1983a): "Discourse Representation Theory and Databases", forthcoming.

[...] Representation Theory in PROLOG", forthcoming.

[...] Understanding: A Survey, Springer, Heidelberg.

[...] nicht-referentielle Verweisformen in juristischen Normtexten, unpublished dissertation, Univ. Regensburg.

Kamp, H. (1981): "A Theory of Truth and Semantic Representation", in: J. Groenendijk et al.: Formal Methods in the Study of Language, Amsterdam.

[...] Linguistic Analysis, vol. 2, nr. 1.

Lehmann, H. (1978): "Interpretation of Natural Language in an Information System", IBM J. Res. Develop., vol. 22, p. 533.

Lehmann, H. (1980): "A System for Answering Questions in German", paper presented at the 6th International Symposium of the ALLC, Cambridge, England.

Ott, N. and M. Zoeppritz (1979): "USL - an Experimental Information System based on Natural Lan- [...] Computer Systems, Hanser, Munich.

[...]dundant Join Operations in Queries Involving Views", TR 82.03.003, IBM Heidelberg Scientific Center.

[...]mantic Rules", in: F. Guenthner and S. J. Schmidt (eds.): Formal Semantics and Pragmatics for Natural Languages, Reidel, Dordrecht.

[...] Anaphora: A Restatement of the Anaphora Questions", Linguistics and Philosophy, vol. 6, p. 47.

Sidner, C. L. (1981): "Focusing for Interpretation of Pronouns", AJCL, vol. 7, nr. 4, p. 217.

Smaby, R. (1979): "Ambiguous Coreference with Quantifiers", in: F. Guenthner and S. J. Schmidt [...]tural Languages, Reidel, Dordrecht.

Smaby, R. (1981): "Pronouns and Ambiguity", in: U. Mönnich (ed.): Aspects of Philosophical Logic, Reidel, Dordrecht.

de Sopeña Pastor, L. (1982): "Grammar of Spanish for User Specialty Languages", TR 82.05.004, IBM Heidelberg Scientific Center.

Webber, B. L. (1978): "A Formal Approach to Discourse Anaphora", TR 3761, Bolt, Beranek & Newman, Cambridge, MA.

[...] Tübingen.
