Rules for Pronominalization
Franz Guenthner, Hubert Lehmann
IBM Deutschland GmbH
Heidelberg Science Center
Tiergartenstr. 15, D-6900 Heidelberg, FRG
Abstract
Rigorous interpretation of pronouns is possible when syntax, semantics, and pragmatics of a discourse can be reasonably controlled. Interaction with a database provides such an environment. In the framework of the User Specialty Languages system and Discourse Representation Theory, we formulate strict and preferential rules for pronominalization and outline a procedure to find proper assignments of referents to pronouns.
1 Overview: Relation to previous work
One of the main obstacles to the automated processing of natural language sentences (and a fortiori texts) is the proper treatment of anaphoric relations. Even though there is a plethora of research attempting to specify (both on the theoretical level and in connection with implementations) "strategies" for "pronoun resolution", it is fair to say
a) that no uniform and comprehensive treatment of anaphora has yet been attained;
b) that surprisingly little effort has been spent in applying the results of research in linguistics and formal semantics in actual implemented systems.
A quick glance at Hirst (1981) will confirm that there is a large gap between the kinds of theoretical issues and puzzling cases that have been considered on the one hand in the setting of computational linguistics and on the other in recent semantically oriented approaches to the formal analysis of natural languages.
One of the main aims of this paper is to bridge this gap by combining recent efforts in formal semantics (based on Montague grammar and Discourse Representation Theory) with existing and relatively comprehensive grammars of German and English constructed in connection with the User Specialty Languages (USL) system, a natural language database query system briefly described below.
We have drawn extensively, as far as insights, examples, puzzles and adequacy conditions are concerned, on the various "variable binding" approaches to pronouns (e.g. work in the Montague tradition, the illuminating discussion by Evans (1980) and Webber (1978), as well as recent transformational accounts). Our approach has however been most deeply influenced by those who have (like Smaby (1979), (1981) and Kamp (1981)) advocated dispensing with pronoun indexing on the one hand and by those (like Chastain (1973), Evans (1980), and Kamp (1981)) who have emphasized the "referential" function of certain uses of indefinite noun phrases.
2 Background
Contrary to what is assumed in most theories of pronominalization (namely that the most propitious way of dealing with pronouns is to consider them as a kind of indexed variable), we agree with Kamp (1981) and Smaby (1979) in treating pronouns as bona fide lexical elements at the level of syntactic representation.
Treatments of anaphora have taken place within two quite distinct settings, so it seems. On the one hand, linguists have primarily been concerned with the specification of mainly syntactic criteria in determining the proper "binding" and "disjointness" criteria (cf. below), whereas computational linguists have in general paid more attention to anaphoric relations in texts, where semantic and pragmatic features play a much greater role. In trying to relate the two approaches one should be aware that in the absence of any serious theory of text understanding, any attempt to deal with anaphora in unrestricted domains (even if they are simple enough, as for instance children's stories) will encounter many diverse problems which, even when they influence anaphoric relations, are completely beyond the scope of a systematic treatment at the present moment. We have thought it to be important therefore to impose some constraints right from the start on the type of discourse with respect to which our treatment of anaphora is to be validated (or falsified). Of course, what we are going to say should in principle be extendible to more complex types of discourse in the future.
The context of the present inquiry is the querying of relational databases (as opposed to, say, general discourse analysis). The type of discourse we are interested in is thus dialogue in the setting of a relational database (which may be said to represent both the context of queries and answers as well as the "world"). It should be clear that a wide variety of anaphoric expressions is available in this kind of interaction; on the other hand, the relevant knowledge we assume in resolving pronominal relations must come from the information specified in the database (in the relations, in the various dependencies and integrity constraints) and in the rules governing the language.
We are making the following assumptions for database querying. A query dialogue is a sequence of pairs <query, answer>. For the sake of simplicity we assume that the possible answers are of the form

  yes/no answer
  singleton answer (e.g. Spain, to a query like "Who borders Portugal?")
  set answer ([France, Portugal, ...] to a query like "Who borders Spain?")
  multiple answer ([<France, Spain>, ...] to a query like "Who borders who?")
and
  refusal (when a pronoun cannot receive a proper interpretation)
2.1 The User Specialty Languages system
The USL system (Lehmann (1978), Ott and Zoeppritz (1979), Lehmann (1980)) provides an interface to a relational data base management system for data entry, query, and manipulation via restricted natural language. The USL system translates input queries expressed in a natural language (currently German (Zoeppritz (1983)), English, and Spanish (Sopeña (1982))) into expressions in the SQL query language, and evaluates those expressions through the use of System R (Astrahan & al. (1976)). The prototype built has been validated with real applications and has thus shown its usability. The system consists of (1) a language processing component (ULG), (2) grammars for German, English, and Spanish, (3) a set of 75 interpretation routines, (4) a code generator for SQL, and (5) the data base management system System R. USL runs under VM/CMS in a virtual machine of 7 MBytes; working set size is 1.8 MBytes. ULG, interpretation routines, and code generator comprise approximately 40,000 lines of PL/I code.
Syntactic analysis
The syntax component of USL uses the User Language Generator (ULG), which originates from the Paris Scientific Center of IBM France and has been described by Bertrand & al. (1976). ULG consists of a parser, a semantic executer, the grammar META, and META interpretation routines. META is used to process the grammar of a language. ULG accepts general phrase structure grammars written in a modified Backus-Naur Form. With any rule it allows the specification of arbitrary routines to control its application or to perform arbitrary actions, and it allows sophisticated checking and setting of syntactic features. Grammars for German, English, and Spanish have been described in a form accepted by ULG. The grammars provide rules for those fragments of the languages relevant for communicating with a database. The USL grammars have been constructed in such a way that constituents correspond as closely as possible to semantic relationships in the sentence, and that parsing is made as efficient as possible. Where a true representation of the semantic relationships in the parse tree could not be achieved, the burden was put on the interpretation routines to remedy the situation.
Interpretation
The approach to interpretation in the USL system builds on the ideas of model theoretic semantics. This implies that the meaning of structure words and syntactic constructions is interpreted systematically and independently of the contents of a given database. Furthermore, since a relational database can be regarded as a (partial) model in the sense of model theory, the interpretation of natural language concepts in terms of relations is quite natural. (A more detailed discussion can be found in Lehmann (1978).)
In the USL system, extensions of concepts are represented as virtual relations of a relational database which are defined on physically stored relations (base relations). The set of virtual relations represents the conceptual knowledge about the data and is directly linked to natural language words and phrases. This approach has the advantage that extensions of concepts can relatively easily be related to objects of conventional databases.
For illustration of the connection between virtual relations and words, consider the following example. Suppose that for a geographical application someone has arranged the data in the form of the relation

  CO (COUNTRY, CAPITAL, AREA, POPULATION)

Now virtual relations such as the following, which correspond to concepts, can be formed by simply projecting out the appropriate columns of CO:

  CAPITAL (NOM_CAPITAL, OF_COUNTRY)

Standard role names (OF, NOM, ...) establish the connection between syntactic constructions and columns of virtual relations and enable answering questions such as

(1) What is Austria's capital?

in a straightforward and simple way. Standard role names are surface oriented because this makes it possible for a user not trained in linguistics to define his own words and relations. (For a complete list of standard role names see e.g. Zoeppritz (1983).)
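The projection step can be illustrated with a minimal Python sketch. It is not the USL implementation (which generates SQL for System R); the function names `capital_relation` and `capital_of` are hypothetical, and AREA and POPULATION are left unset because the text gives no values for them.

```python
from typing import List, NamedTuple, Optional


class CO(NamedTuple):
    """One row of the base relation CO(COUNTRY, CAPITAL, AREA, POPULATION)."""
    country: str
    capital: str
    area: Optional[float]       # no values are given in the text
    population: Optional[int]   # no values are given in the text


# Illustrative base relation with a single row.
CO_ROWS: List[CO] = [CO("Austria", "Vienna", None, None)]


def capital_relation(rows: List[CO]) -> List[dict]:
    """Virtual relation CAPITAL(NOM_CAPITAL, OF_COUNTRY) as a projection of CO."""
    return [{"NOM_CAPITAL": r.capital, "OF_COUNTRY": r.country} for r in rows]


def capital_of(country: str, rows: List[CO] = CO_ROWS) -> List[str]:
    """Answer "What is <country>'s capital?" by selecting on the OF role."""
    return [t["NOM_CAPITAL"] for t in capital_relation(rows)
            if t["OF_COUNTRY"] == country]


print(capital_of("Austria"))  # ['Vienna']
```

The role names serve as column names, so the possessive construction in question (1) maps directly to a selection on OF_COUNTRY.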
We are currently working on the integration of the concepts underlying the USL system with Discourse Representation Theory, which is described in the next section. We have already implemented a procedure which generates Discourse Representation Structures from USL's semantic trees and which covers the entire fragment of language described in Kamp (1981).
2.2 Discourse Representation Theory (DRT)
In this section we give a brief description of Kamp's Discourse Representation Theory (DRT) inasmuch as it relates to our concerns with pronominalization. For a more detailed discussion of this theory and its general ramifications for natural language processing, cf. the papers by Kamp (1981) and Guenthner (1983a, 1983b).
According to DRT, each natural language sentence (or discourse) is associated with a so-called Discourse Representation Structure (DRS) on the basis of a set of DRS formation rules. These rules are sensitive both to the syntactic structure of the sentences in question and to the DRS context in which the sentence occurs. In the formulation of Kamp (1981) the latter is really of importance only in connection with the proper analysis of pronouns. We feel on the other hand that the DRS environment of a sentence to be processed should determine much more than just the anaphoric assignments. We shall discuss this issue, in particular as it relates to problems of ambiguity and vagueness, in more depth in a forthcoming paper.
A DRS K for a discourse has the general form

  K = <U, Con>

where U is a set of "discourse referents" for K and Con a set of "conditions" on these individuals. Conditions can be either atomic or complex. An atomic condition has the form

  P(t1, ..., tn)

or

  t1 = c

where ti is a discourse referent, c a proper name, and P an n-place predicate.
The only complex condition we shall discuss here is the one representing universally quantified noun phrases or conditional sentences. Both are treated in much the same way. Let us call these "implicational" conditions:

  K1 IMP K2

where K1 and K2 are also DRSs. With a discourse D is thus associated a Discourse Representation Structure which represents D in a quantifier-free "clausal" form, and which captures the propositional import of the discourse by, among other things, establishing the correct pronominal connections.
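The definitions above translate fairly directly into a data structure. The following Python sketch is one possible encoding, not the authors' implementation; the class names `Atom`, `Equals`, `Imp` and `DRS` are hypothetical. Placing u1 in the universe of K1 and u2 in the universe of K2 follows the usual DRT treatment of universal noun phrases and is an assumption here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Atom:
    """Atomic condition P(t1, ..., tn)."""
    pred: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Equals:
    """Atomic condition t = c, where c is a proper name."""
    ref: str
    name: str


@dataclass
class DRS:
    """K = <U, Con>: discourse referents plus conditions on them."""
    universe: List[str] = field(default_factory=list)
    conditions: List["Condition"] = field(default_factory=list)


@dataclass
class Imp:
    """Implicational condition K1 IMP K2."""
    antecedent: DRS
    consequent: DRS


Condition = Union[Atom, Equals, Imp]

# "Every country imports a product it needs"
k1 = DRS(["u1"], [Atom("country", ("u1",))])
k2 = DRS(["u2"], [Atom("import", ("u1", "u2")),
                  Atom("product", ("u2",)),
                  Atom("need", ("u1", "u2"))])
k0 = DRS([], [Imp(k1, k2)])  # principal DRS: no referents of its own
```

In this encoding the principal DRS has an empty universe, so a universally quantified sentence contributes no referent at the top level.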
What is important for the treatment of anaphora in the present context is the following:

a) Given a discourse with a principal DRS Ko and a set of non-principal DRSs (or conditions) Ki among its conditions, all discourse referents of Ko are admissible referents for pronouns in sentences (or phrases) giving rise to the various embedded Ki's. In particular, all occurrences of proper names in a discourse will always be associated with discourse referents of the principal DRS Ko. (This is on the (admittedly unrealistic) assumption that proper names refer uniquely.)
b) Given an implicational DRS of the form K1 IMP K2 occurring in a DRS K, a relation of relative accessibility between DRSs is defined as follows: K1 is accessible from K2, and all K' accessible from K1 are also accessible from K2.

In particular, the principal DRS Ko is accessible from its subordinate DRSs (for a precise definition cf. Kamp (1981)). The import of this definition for anaphora is simply that if a pronoun is being resolved (i.e. interpreted) in the context of a DRS K' from which a set K of DRSs is accessible, then the union of all the sets of discourse referents associated with every Ki in K is the set of admissible candidates for the interpretation of the pronoun.
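As a rough sketch of how the candidate set could be computed, the Python fragment below follows accessibility links and collects discourse referents. The `DRS` class and the `parent` and `antecedent` fields are hypothetical names for this illustration. Treating a DRS as accessible from itself is an assumption made here so that referents introduced in the DRS under construction also count as candidates; see Kamp (1981) for the precise definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DRS:
    name: str
    universe: List[str]
    parent: Optional["DRS"] = None      # DRS in which this one occurs as a condition
    antecedent: Optional["DRS"] = None  # K1, if this DRS is the K2 of "K1 IMP K2"


def accessible(k: DRS) -> List[DRS]:
    """DRSs accessible from k (k itself included), nearest first."""
    out: List[DRS] = []
    seen = set()
    queue = [k]
    while queue:
        cur = queue.pop(0)
        if id(cur) in seen:
            continue
        seen.add(id(cur))
        out.append(cur)
        for nxt in (cur.antecedent, cur.parent):
            if nxt is not None:
                queue.append(nxt)
    return out


def candidates(k: DRS) -> List[str]:
    """Union of the universes of all DRSs accessible from k."""
    refs: List[str] = []
    for drs in accessible(k):
        for u in drs.universe:
            if u not in refs:
                refs.append(u)
    return refs


# "Every country imports a product it needs": K1 IMP K2 inside the principal DRS K0
k0 = DRS("K0", [])
k1 = DRS("K1", ["u1"], parent=k0)
k2 = DRS("K2", ["u2"], parent=k0, antecedent=k1)

print(candidates(k2))  # ['u2', 'u1']
print(candidates(k0))  # []
```

A pronoun processed inside K2 can reach u1, whereas a pronoun in a later sentence, processed in K0, finds no candidates introduced by this sentence.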
The following illustrations will make this clear:

K (Every country imports a product it needs)

    country(u1)   IMP   import(u1, u2)
                        product(u2)
                        need(u1, u2)
This sentence (as well as its interrogative version) allows only one interpretation of the pronoun "it" according to DRT. It does not introduce any discourse referent available for pronominalization in later sentences (or queries). But in a DRS like the following, DRT does not, as it stands, account for pronoun resolution:
K (John tickled Bill. He squirmed.)

    u1  u2
    u1 = John
    u2 = Bill
    tickled(u1, u2)
At this point, the pronoun "he" has to be interpreted. There are two admissible candidates, u1 and u2, but DRT does not choose between them. So the DRS could be continued with either

    squirm(u1)

or

    squirm(u2)
Similarly, in the following DRS

K (If Spain is a member of every organization, it has a member)

    [ organization(u2)  IMP  [...] ]   IMP   [ u3, member(u3, it) ]
the pronoun "it" could only refer to Spain (on configurational grounds), and would have to be assigned that object if no other criteria are assumed. Obviously, as far as this sentence and the intended database are concerned, we should want to rule out such an assignment. (This can be done via rule S1 discussed below.)
In general, then, given a sentence (or discourse) represented in a DRS, there will be more candidates for admissible pronoun assignments than one should like to have available when a particular pronoun is to be interpreted. The rules described in Section 3 are meant to capture some of the regularities that arise in typical database querying interactions.
c) Finally, given a DRS for a discourse D we can say that a pronoun is properly referential iff it is represented by (i.e. eliminated in favor of) a discourse referent ui occurring in the domain of the principal DRS representing D. (In the context of the constructions illustrated so far, this will be true in particular of proper names as well as of indefinite noun phrases not in the scope of a universal noun phrase or a conditional.)
The main problem then for the treatment of anaphora is to determine which possible discourse referents should be chosen when we come to the interpretation of a particular pronoun occurrence pi in the formation of the extension of the DRS in which we are working.
We would like to suggest the following strategy as a starting point. Consider a query dialogue Q with an already established DRS K and the utterance of a query S, where S contains occurrences of personal pronouns. Suppose further that A(S) is the sole syntactic analysis available for S. Then we regard the construction of the extension of the DRS obtained on the basis of S and K as the value of a partial function f defined on K and A(S). More generally still, as Kamp himself suggests, we can regard the "meaning" (or information content) of a sentence to be that partial function from DRSs to DRSs.
In a given dialogue both the queries and the answers will have the side effect of introducing new individuals and "preference" or "salience" orderings on these individuals, and we want to allow for pronominal reference to these much in the same way that in a text preceding sentences may have determined a set of possible antecedents for pronouns in the currently processed sentence. The DRS built up in the process of a querying session will constitute the "mutual knowledge" available to the user in specifying his further queries as well as in his uses of pronouns. It is on the individuals introduced in the DRSs that the rules to be discussed below are intended to operate.
3 Interplay of syntax, semantics, and pragmatics in pronominalization
The process of pronominalization is governed by rules involving morphological, syntactic, semantic, and pragmatic criteria. These rules are discussed and illustrated with examples drawn from the context of querying a geographical database. Then a procedure is outlined which uses these rules and applies them in the following order:

- First morphological criteria are checked; if they fail, no further tests are required.
- Then syntactic (or configurational) criteria are tested. Again, if they fail, no further tests are necessary.
- Next semantic criteria are applied, and if they do not fail, the pragmatic criteria have to be tested. If more than one candidate remains, the use of the pronoun was pragmatically inappropriate and must be noted as such.
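The ordering above amounts to a cascade of filters over the candidate referents. The Python sketch below illustrates that control flow only; the function `resolve_pronoun`, the status strings, and the toy criteria are hypothetical, and the individual criteria are stand-ins for the rules discussed in the following sections. The toy data is taken from example (20) in Section 3.1.3.

```python
from typing import Callable, Dict, List

Candidate = Dict[str, str]
Criterion = Callable[[Candidate], bool]


def resolve_pronoun(candidates: List[Candidate],
                    criteria: List[Criterion]) -> Dict[str, object]:
    """Apply the criteria in order; stop testing once no candidate survives."""
    remaining = list(candidates)
    for criterion in criteria:
        if not remaining:
            break
        remaining = [c for c in remaining if criterion(c)]
    if not remaining:
        # No proper interpretation: corresponds to the "refusal" answer type.
        return {"status": "refusal", "referents": []}
    if len(remaining) > 1:
        return {"status": "pragmatically inappropriate", "referents": remaining}
    return {"status": "resolved", "referents": remaining}


# Toy criteria for "What does it export?" after
# "What is the capital of Austria? - Vienna."
def morphological(c: Candidate) -> bool:
    return c["number"] == "sg"


def syntactic(c: Candidate) -> bool:
    return True  # no configurational restriction in this toy case


def semantic(c: Candidate) -> bool:
    return c["concept"] == "country"  # assumption: only countries export


def pragmatic(c: Candidate) -> bool:
    return True


candidates = [
    {"name": "Austria", "concept": "country", "number": "sg"},
    {"name": "Vienna", "concept": "capital", "number": "sg"},
]
result = resolve_pronoun(candidates, [morphological, syntactic, semantic, pragmatic])
print(result["status"], [c["name"] for c in result["referents"]])
```

Because each stage only sees the survivors of the previous one, a candidate that fails an early criterion is never tested against the later ones.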
3.1 Strict factors determining the admissibility of anaphora
3.1.1 Morphological criteria
Morphological criteria concern the agreement of gender and number. Complications come in when coordinated noun phrases occur, e.g.

(2) John and Bill went to Pisa. They delivered a paper.
(3) *John and Bill went to Pisa. He delivered a paper.
(4) John and Sue went to Pisa. He delivered a paper.
(5) *John or Bill went to Pisa. They delivered a paper.
(6) *John or Bill went to Pisa. He delivered a paper.
(7) Neither John nor Bill went to Pisa. They went to Rome.
(8) *Either John or Bill did not go to Pisa. He went to Rome.
The starred examples contain inappropriate uses of pronouns. With and-coordination, reference to the complete NP is possible with a plural pronoun. When the members of the coordination are distinct in gender and/or number, reference to them is possible with the corresponding pronouns. Clearly, the same observations hold for interrogative sentences.
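For and-coordination, the observations behind examples (2) to (4) can be sketched as a small agreement check. This is an illustrative simplification, not the authors' rule set: the feature inventory, the `PRONOUNS` table, and the "exactly one agreeing member" test are assumptions, and the or-coordination and neither-nor cases of examples (5) to (8) are not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NP:
    name: str
    gender: str  # "m", "f" or "n"
    number: str  # "sg" or "pl"


# Pronoun -> (gender, number); None means gender is not marked.
PRONOUNS = {
    "he": ("m", "sg"),
    "she": ("f", "sg"),
    "it": ("n", "sg"),
    "they": (None, "pl"),
}


def agrees(np: NP, pronoun: str) -> bool:
    """Gender and number agreement between a noun phrase and a pronoun."""
    gender, number = PRONOUNS[pronoun]
    return np.number == number and (gender is None or np.gender == gender)


def and_coordination_referents(members: List[NP], pronoun: str) -> List[str]:
    """Possible referents of `pronoun` after 'A and B ...'."""
    _, number = PRONOUNS[pronoun]
    if number == "pl":
        # A plural pronoun can pick up the complete coordinated NP.
        return [" and ".join(m.name for m in members)]
    # A singular pronoun is usable only if it singles out one member.
    matching = [m.name for m in members if agrees(m, pronoun)]
    return matching if len(matching) == 1 else []


john, bill, sue = NP("John", "m", "sg"), NP("Bill", "m", "sg"), NP("Sue", "f", "sg")
print(and_coordination_referents([john, bill], "they"))  # example (2)
print(and_coordination_referents([john, bill], "he"))    # example (3), starred
print(and_coordination_referents([john, sue], "he"))     # example (4)
```

An empty result corresponds to a starred example: the pronoun has no admissible referent on morphological grounds.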
3.1.2 Configurational criteria
Syntactic criteria operate only within the boundaries of a sentence; outside they are useless. The configurational criteria stemming from DRT, however, work independently of sentence boundaries.
Disjoint reference
The rule of "disjoint reference" according to Reinhart (1983) goes back to Chomsky and has been refined by Lasnik (1976) and Reinhart (1983). It is able to handle a variety of well-known cases, such as

(9) When did it join the UN?
(10) Which countries that import it, produce petrol?
(11) *Does it entertain diplomatic relations with Spain's neighbor?

(In the starred example, the use of "it" is inappropriate if it is to be coreferential with "Spain".)
Rather than using c-command to formulate this criterion, which is elegant but too strict in some cases (as noted by Reinhart herself and Bolinger (1979)), we have chosen an admittedly less elegant, but hopefully reliable, approach to disjoint reference, in that we specify the concrete syntactic configurations where disjoint reference holds. We do not rely here on the syntactic framework of USL grammar, but use more or less traditionally known terminology for expressing our rules. We need the terms "clause", "phrase", "matrix", "embedding", and "level". These can be made explicit when a suitable syntactic framework is chosen.
Now we can formulate our disjoint reference rule and some of its less obvious consequences.

C1. The referent of a personal pronoun can never be within the same clause at the same phrase level. (Note that this rule does not hold for possessive pronouns.)
C1 has a number of consequences which we now list:

C1a. The (implicit) subject of an infinitive clause can never be referent of a personal pronoun in that clause.
(12) Does the EC want to dissolve it?

C1b. Nouns common to coordinate clauses cannot be referred to from within these coordinate clauses.
(13) Which country borders it and Spain?

C1c. [...] clause can never be referred to.
(14) Does it border Spain's neighbors?
The following rules have to do with phrases and clauses modifying a noun. They too can be regarded as consequences of C1.

C2. Head noun of a phrase or clause can never be referent of a personal pronoun in that phrase or clause.

C2a. Head noun of participial phrase
(15) a country exporting petrol to it

C2b. Head noun of that-clause
(16) the truth is that it follows from A

C2c. Head noun of relative clause
(17) the country it exports petrol to
The following two rules deal with kataphoric pronominalization (sometimes called backward pronominalization).

C3a. Kataphora into a more deeply embedded clause is impossible.
(18) Did it export a product that Spain produces?

C3b. Kataphora into a succeeding coordinate clause is impossible.
(19) Who did not belong to it but left the UN?
The accessibility relation on DRSs
C4. Only those discourse referents in the accessibility relation defined in sec. 2.2 are available as referents to a pronoun.
3.1.3 Semantic criteria
Widely used is the criterion of semantic compatibility. It is usually implemented via "semantic features". In the USL framework we can derive this information from relation schemata. We state the criterion as follows:
S1 Let s be a sentence containing a pronoun p, and c a full noun phrase in the context of p. If p is substituted by c in s to yield s', and s' is not semantically anomalous, i.e. does not imply a contradiction, then c is semantically compatible with s and is hence a semantically possible candidate for the reference of p.
(20) What is the capital of Austria? - Vienna. What does it export?
If it is assumed that only countries but not capitals export goods, then the only semantically possible referent for "it" is Austria.
S2 Non-referentially introduced nouns cannot be antecedents of pronouns.
(21) Which countries does Italy have trade with? How large is it?
Since "trade" is used non-referentially, it cannot be antecedent of "it". Unfortunately, in many cases where this criterion could apply, there is an ambiguity between referential and non-referential use.
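The substitution test of S1 can be sketched as a lookup against a relation schema. The two dictionaries below are hypothetical stand-ins for the schema information mentioned above, encoding the assumption from example (20) that only countries export goods; the function names are likewise illustrative.

```python
# Hypothetical schema fragment: predicate -> entity types allowed as subject.
SUBJECT_TYPES = {"export": {"country"}}

# Hypothetical typing of the individuals in the discourse of example (20).
ENTITY_TYPE = {"Austria": "country", "Vienna": "city"}


def semantically_compatible(candidate, predicate):
    """S1 check: substituting `candidate` must not contradict the schema."""
    allowed = SUBJECT_TYPES.get(predicate)
    if allowed is None:
        # No constraint is known, so no contradiction can be derived.
        return True
    return ENTITY_TYPE.get(candidate) in allowed


def s1_filter(candidates, predicate):
    """Keep only the semantically possible referents."""
    return [c for c in candidates if semantically_compatible(c, predicate)]


possible = s1_filter(["Vienna", "Austria"], "export")
```

With these assumed tables, "Austria" is the only candidate left for "it" in "What does it export?".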
Apart from the type of semantic compatibility covered by rule S1, more complex semantic properties are used to determine the referent of a pronoun. The "task structures" described by Grosz (1977) illustrate this fact. We hence formulate the rule
S3 The properties of and relationships between predicates determine pronominalizability.
For an illustration of its effect, consider the following query:
(22) What country is its neighbor?
The irreflexivity of the neighbor-relation entails that "its" cannot be bound by "what country" in this case, but has to refer to something mentioned in the previous context.
Given a subject domain, one can analyze the properties of the relations and the relationships between them and so build a basis for deciding pronoun reference on semantic grounds. In the framework of the USL system, information on the properties of relations is available in terms of "functional dependencies" given in the database schema or as integrity constraints.
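The effect of S3 on query (22) can be sketched as a check against declared relation properties. The set `IRREFLEXIVE` is a hypothetical stand-in for integrity constraints of the kind just mentioned, and variables are represented simply as strings.

```python
# Hypothetical integrity constraints: relations declared irreflexive.
IRREFLEXIVE = {"neighbor"}


def binding_allowed(relation, args_if_bound):
    """Check whether a bound reading of the pronoun is admissible.

    `args_if_bound` is the argument tuple the relation would receive if the
    pronoun were bound by the wh-phrase. The reading is rejected when an
    irreflexive relation would have to hold between an individual and itself.
    """
    repeated = len(set(args_if_bound)) < len(args_if_bound)
    return not (relation in IRREFLEXIVE and repeated)


# (22) "What country is its neighbor?" with "its" bound: neighbor(x, x).
query_22 = binding_allowed("neighbor", ("x", "x"))

# (24) "What country consumes the wine it produces?" with "it" bound:
# produces(x, w) relates two different variables.
query_24 = binding_allowed("produces", ("x", "w"))
```

The bound reading is rejected for (22), so "its" has to be resolved against the previous context, while it remains available for (24).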
3.2 Pragmatic criteria
The generation of discourse is controlled by two factors: communicative intentions and mutual knowledge. In the context of database interaction, we can assume that the communicative intentions of a user are simply to obtain factual answers to factual questions. His intentions are expressed either by single queries or by sequences of queries, depending on how complex these intentions are or how closely they correspond to the information in the database. As will be shown below, in many cases the system will not have a chance to determine whether a given query is a "one-shot query", or whether it is part of a sequence of queries with a common "theme". For the resolution of pronouns, this means that the system should rather ask the user back than make wild guesses on what might be the most "plausible" referent. This is of course not possible when running text is analyzed in a "batch mode", and no user is there to be asked for clarification.
Mutual knowledge (see e.g. Clark and Marshall (1981) for a discussion) determines the rules for introducing and referencing individuals in the discourse. In the context of database interaction we assume the mutual knowledge to consist initially of:
- the set of proper names in the database,
- the predicates whose extensions are in the database,
- the "common sense" relationships between and properties of these predicates.
It will be part of the design of a database to establish what these "common sense" relationships and properties are, e.g., whether it is generally known to the user community whether "capital" expresses a one-one relation. Each question-answer pair occurring in the discourse is added to the stock of mutual knowledge.
It is a pragmatic principle of pronominalization that only mutual knowledge may be used to determine the referent of a pronoun on semantic grounds, and hence it may be legal to use the same sentence containing a pronoun where earlier in the discourse it was illegal, because the mutual knowledge has increased in the meantime.
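The growth of mutual knowledge during a dialogue can be sketched as a small store that starts from the three initial components and records each question-answer pair. The class and method names are hypothetical, and facts are simplified to (predicate, argument, value) triples.

```python
class MutualKnowledge:
    """Toy store for the knowledge shared by user and system."""

    def __init__(self, proper_names, predicates, properties):
        self.proper_names = set(proper_names)  # proper names in the database
        self.predicates = set(predicates)      # predicates with extensions
        self.properties = set(properties)      # "common sense" properties
        self.facts = set()                     # question-answer pairs so far

    def add_answer(self, predicate, argument, value):
        """Record a question-answer pair from the discourse."""
        self.facts.add((predicate, argument, value))

    def knows(self, predicate, argument, value):
        """Only recorded facts may be used on semantic grounds."""
        return (predicate, argument, value) in self.facts


mk = MutualKnowledge(
    proper_names={"Austria", "Vienna"},
    predicates={"capital", "population"},
    properties={("capital", "one-one")},
)
before = mk.knows("capital", "Austria", "Vienna")
mk.add_answer("capital", "Austria", "Vienna")  # "What is the capital of Austria? - Vienna"
after = mk.knows("capital", "Austria", "Vienna")
```

A semantic criterion that depends on Vienna being the capital of Austria could be applied only after the answer has been recorded, which is why the same pronoun use can become legal later in the discourse.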
3.2.1 A first attempt using preference rules
What the topic of a discourse is, which of the entities mentioned in it are in focus, is reflected in the syntactic structure of sentences. This has been observed for a long time. It has also often been observed that discourse topic and focus have an effect on pronominalization where morphological, configurational, and semantic rules fail to determine a single candidate for reference. However, it has not been possible yet to formulate precise rules explaining this phenomenon. We have the impression that such rules cannot be absolutely strict rules, but are of a preferential nature. We have developed a set of such rules and tested them against a corpus of text containing some 600 pronoun occurrences, and have found them to work remarkably well. Similar tests (with a similar set of rules) have been conducted by Hofmann (1976).
In the sequel we formulate and discuss our list of rules. Their ordering corresponds to the order in which they have to be applied.
P1 (principle of proximity) Noun phrases within the sentence containing the pronoun are preferred over noun phrases in previous or succeeding sentences.
Consider the sequence
(23) What country joined the EC after 1980? - Greece
(24) What country consumes the wine it produces?
One could argue that "Greece" is just as probably the intended referent of "it" in this case as the bound interpretation and that hence the use of "it" should be rejected as inappropriate. However, there is no way to avoid the "it" if the bound variable interpretation is intended, and one can use this as a ground to rule out the interpretation where "it" refers to "Greece".
P1a Noun phrases in sentences before the sentence containing the pronoun are preferred over noun phrases in more distant sentences.
This criterion is very important to limit the search for possible discourse referents.
P2 Pronouns are preferred over full noun phrases.
This rule is found in many systems dealing with anaphora. One can motivate it by saying that pronominalization establishes an entity as a theme which is then maintained until the chain of pronouns is broken by a sentence not containing a suitable pronoun. For an example consider:
(25) What is the area of Austria?
(26) What is its capital?
(27) What is its population?
P3 Noun phrases in a matrix clause or phrase are preferred over noun phrases in embedded clauses or phrases.
P3a Noun phrases in a matrix clause are preferred over noun phrases in embedded clauses.
Example:
(28) What country imports a product that Spain produces? - Denmark
(29) What does it export?
Here "it" has to refer to the individual satisfying "what country", not to "Spain" which occurs in an embedded clause.
P3b Head nouns are preferred over noun complements.
Example:
(30) What is the capital of Austria? - Vienna
(31) What is its population?
"Vienna", not "Austria", becomes the referent of "its", and the argument is analogous to that for P3a.
P4 Subject noun phrases are preferred over non-subject noun phrases.
In declarative contexts, this rule works quite well. It corresponds essentially to the focus rule of Sidner (1981). In a question-answering situation it is hardly applicable, since especially in wh-questions subject position and word order, which both play a role, tend to interfere. We therefore tend to not use this rule, but rather to let the system ask back in cases where it would apply. For illustration consider the following examples:
(32) Does Spain border Portugal? What is its population?
(33) Is Spain bordered by Portugal? What is its population?
(34) Which country borders Portugal? What is its population?
(35) Which country does Portugal border? What is its population?
P5 Accusative object noun phrases are preferred over other non-subject noun phrases.
P6 Noun phrases preceding the pronoun are preferred over noun phrases succeeding the pronoun (or: anaphora is preferred over kataphora).
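One possible reading of "applied in order" is that each preference rule narrows the candidate set to its preferred members, unless that would leave nothing. The sketch below follows this reading, which is an assumption rather than the authors' stated implementation. The `Candidate` record and its fields are hypothetical; P4 is left out because the text prefers asking back, and P5 is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    distance: int = 0         # 0 = same sentence as the pronoun, 1 = adjacent, ...
    is_pronoun: bool = False
    embedded: bool = False    # in an embedded clause/phrase or a noun complement
    precedes: bool = True     # occurs before the pronoun


def prefer(test):
    """Turn a yes/no test into a narrowing step that never empties the set."""
    def step(cands):
        kept = [c for c in cands if test(c)]
        return kept or cands
    return step


def nearest(cands):
    """P1 and P1a: keep the candidates at the smallest sentence distance."""
    best = min(c.distance for c in cands)
    return [c for c in cands if c.distance == best]


RULES = [
    nearest,                           # P1, P1a
    prefer(lambda c: c.is_pronoun),    # P2
    prefer(lambda c: not c.embedded),  # P3, P3a, P3b
    prefer(lambda c: c.precedes),      # P6
]


def apply_preferences(cands):
    """Apply the rules in order, stopping once at most one candidate is left."""
    for rule in RULES:
        if len(cands) <= 1:
            break
        cands = rule(cands)
    return cands


# Examples (30)/(31): "Vienna" answers for the head noun, "Austria" is a complement.
vienna = Candidate("Vienna", distance=1)
austria = Candidate("Austria", distance=1, embedded=True)
chosen = apply_preferences([vienna, austria])
```

Under these assumptions the candidates for "its" in (31) are narrowed to "Vienna" by the P3 step, in line with the discussion of P3b.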
3.3 Outline of a pronoun resolution procedure
We now outline a procedure for "resolving" pronouns in the framework of the USL system and DRT.
Let M = <U, Con> be the DRS representing the mutual knowledge, in particular the past discourse. Let K(s) be the DRS representing the current sentence s and let p be a pronoun occurring in s for which an appropriate discourse referent has to be found. Let Ua(p) be the set of discourse referents accessible to p according to the accessibility relation given in sec. 2.2.
Let further c be a function that applies to Ua(p) all the morphological, syntactic, and semantic criteria given above and yields a set Uc(p) as result. Now three cases have to be distinguished:
1. Uc(p) is empty. In this case the use of p was inappropriate.
2. Card(Uc(p)) is 1. In this case a referent for p has been uniquely determined, p is replaced by it in the DRS, and the procedure is finished.
3. Card(Uc(p)) is greater than 1. In this case the preference rules are applied.
Let p be a function that applies to Uc(p), if the cardinality of Uc(p) is greater than 1, all the preference rules given above in the order indicated there, yielding the result Up. Card(Up) can never be 0, hence two cases are possible: either the cardinality is 1, then a referent has been uniquely determined and the pronoun p can be eliminated in K, or the cardinality is greater than 1, and then the use of p was inappropriate.
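The case analysis above can be sketched as a short function. The strict criteria and the preference rules are passed in as parameters, since their details are not spelled out here; the names `resolve`, `strict_filter`, and `preference`, the status strings, and the toy stand-in functions are all assumptions made for illustration.

```python
def resolve(accessible, strict_filter, preference):
    """Return (referent, status) for one pronoun.

    accessible    -- the accessible discourse referents, as a list
    strict_filter -- stand-in for the function c (strict criteria)
    preference    -- stand-in for the function p (ordered preference rules)
    """
    u_c = strict_filter(accessible)
    if not u_c:
        return None, "inappropriate"  # case 1: no admissible referent
    if len(u_c) == 1:
        return u_c[0], "resolved"     # case 2: referent uniquely determined
    u_p = preference(u_c)             # case 3: apply the preference rules
    if len(u_p) == 1:
        return u_p[0], "resolved"
    return None, "inappropriate"      # still more than one candidate


def toy_strict(referents):
    """Toy criterion: drop the non-referentially used noun "trade"."""
    return [r for r in referents if r != "trade"]


def toy_preference(referents):
    """Toy preference: keep "Vienna" if present, otherwise keep everything."""
    kept = [r for r in referents if r == "Vienna"]
    return kept or referents


outcome = resolve(["Vienna", "Austria", "trade"], toy_strict, toy_preference)
```

In an interactive setting, the "inappropriate" status would be the natural point for the system to ask the user back rather than guess.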
It can be inferred from the formulation of the pronominalization rules given above what morphological and syntactic information has to be stored with the discourse referents in the DRSs, and what semantic information has to be accessible from the schema of the database to enable the application of the functions c and p. Hence, we will not spell out these details here.
4 Open questions and conclusions
Many well-known and puzzling cases have not been addressed here, among them plural anaphora, so-called pronouns of laziness, one-pronominalization, to name just a few.
We have not said anything about phenomena such as discourse topic, focus, or coherence and their influence on anaphora. Their effects are captured in our preference rules to some degree, but no one can precisely say how. In spite of claims to the contrary, we believe that much work is still re-
effectively in natural language processing.
By limiting ourselves to the relatively well-defined communicative situation of database interaction, we have been able to state precisely what rules are applicable in the fragment of language we are dealing with. We are currently working on the analysis of running texts, but again in a well-delineated domain, and we hope to be able to extend our theory on the basis of the experience gained.
We are convinced that serious progress in the understanding of anaphora and of discourse phenomena in general is only possible through a careful control of the environment, and on a solid syntactic and semantic foundation.
References
Astrahan, M. M., M. W. Blasgen, D. D. Chamberlin, K. P. Eswaran, J. N. Gray, P. P. Griffiths, W. F. King, R. A. Lorie, P. R. McJones, J. W. Mehl, G. R. Putzolu, I. L. Traiger, B. W. Wade, V. Watson (1976): "System R: Relational Approach to Database Management", ACM Transactions on Database Systems, vol. 1, no. 2, June 1976, p. 97
Bertrand, O., J. J. Daudennarde, D. Starynkevich, A. Stenbock-Fermor (1976): "User Applica-
Conference on Relational Data Base Systems, Bari, Italy, p. 83
Bolinger, D. (1979): "Pronouns in Discourse", in: T. Givon (ed.): Syntax and Semantics, Vol. 12: Discourse and Syntax, Academic Press, New York, p. 289
Thesis, Princeton
Clark, H. H. and C. R. Marshall (1981): "Definite Reference and Mutual Knowledge", in: B. L. Webber, A. K. Joshi, and I. A. Sag (eds.): Elements of Discourse Understanding, Cambridge University Press, Cambridge, p. 10
Donnellan, K. S. (1978): "Speaker Reference, Descriptions and Anaphora", in: P. Cole (ed.): Syntax and Semantics, Vol. 9: Pragmatics, Academic Press, New York, p. 47
Inquiry, vol. 11
Grosz, B. J. (1977): "The Representation and Use of Focus in Dialogue Understanding", Technical
California
Guenthner, F. (1983a): "Discourse Representation Theory and Databases", forthcoming
Representation Theory in PROLOG", forthcoming
Understanding: A Survey, Springer, Heidelberg
nicht-referentielle Verweisformen in juristischen Normtexten, unpublished dissertation, Univ. Regensburg
Kamp, H. (1981): "A Theory of Truth and Semantic Representation", in: Groenendijk, J. et al.: Formal Methods in the Study of Language, Amsterdam
Linguistic Analysis, vol. 2, nr. 1
Lehmann, H. (1978): "Interpretation of Natural Language in an Information System", IBM J. Res. Develop., vol. 22, p. 533
Lehmann, H. (1980): "A System for Answering Questions in German", paper presented at the 6th International Symposium of the ALLC, Cambridge, England
Ott, N. and M. Zoeppritz (1979): "USL - an Experimental Information System based on Natural Lan-
Computer Systems, Hanser, Munich
dundant Join Operations in Queries Involving Views", TR 82.03.003, IBM Heidelberg Scientific Center
mantic Rules", in: F. Guenthner and S. J. Schmidt (eds.): Formal Semantics and Pragmatics for Natural Languages, Reidel, Dordrecht
Anaphora: A Restatement of the Anaphora Questions", Linguistics and Philosophy, vol. 6, p. 47
Sidner, C. L. (1981): "Focusing for Interpretation of Pronouns", AJCL, vol. 7, nr. 4, p. 217
Smaby, R. (1979): "Ambiguous Coreference with Quantifiers", in: F. Guenthner and S. J. Schmidt
tural Languages, Reidel, Dordrecht
Smaby, R. (1981): "Pronouns and Ambiguity", in: U. Mönnich (ed.): Aspects of Philosophical Logic, Reidel, Dordrecht
de Sopeña Pastor, L. (1982): "Grammar of Spanish for User Specialty Languages", TR 82.05.004, IBM Heidelberg Scientific Center
Webber, B. L. (1978): "A Formal Approach to Discourse Anaphora", TR 3761, Bolt, Beranek & Newman, Cambridge, MA
Tübingen