PROCESSING ENGLISH WITH A GENERALIZED PHRASE STRUCTURE GRAMMAR
Jean Mark Gawron, Jonathan King, John Lamping, Egon Loebner,
E. Anne Paulson, Geoffrey K. Pullum, Ivan A. Sag, and Thomas Wasow
Computer Research Center
Hewlett Packard Company
1501 Page Mill Road
Palo Alto, CA 94304
ABSTRACT
This paper describes a natural language processing system implemented at Hewlett-Packard's Computer Research Center. The system's main components are: a Generalized Phrase Structure Grammar (GPSG); a top-down parser; a logic transducer that outputs a first-order logical representation; and a "disambiguator" that uses sortal information to convert "normal-form" first-order logical expressions into the query language for HIRE, a relational database hosted in the SPHERE system. We argue that theoretical developments in GPSG syntax and in Montague semantics have specific advantages to bring to this domain of computational linguistics. The syntax and semantics of the system are totally domain-independent, and thus, in principle, highly portable. We discuss the prospects for extending domain-independence to the lexical semantics as well, and thus to the logical semantic representations.
1 INTRODUCTION
This paper is an interim progress report on linguistic research carried out at Hewlett-Packard Laboratories since the summer of 1981. The research had three goals: (1) demonstrating the computational tractability of Generalized Phrase Structure Grammar (GPSG), (2) implementing a GPSG system covering a large fragment of English, and (3) establishing the feasibility of using GPSG for interactions with an inferencing knowledge base.
Section 2 describes the general architecture of the system. Section 3 discusses the grammar and the lexicon. A brief discussion of the parsing technique used is found in Section 4. Section 5 discusses the semantics of the system, and Section 6 presents a detailed example of a parse-tree complete with semantics. Some typical examples that the system can handle are given in the Appendix.
The system is based on recent developments in syntax and semantics, reflecting a modular view in which grammatical structure and abstract logical structure have independent status. The understanding of a sentence occurs in a number of stages, distinct from each other and governed by different principles of organization. We are opposed to the idea that language understanding can be achieved without detailed syntactic analysis. There is, of course, a massive pragmatic component to human linguistic interaction. But we hold that pragmatic inference makes use of a logically prior grammatical and semantic analysis. This can be fruitfully modeled and exploited even in the complete absence of any modeling of pragmatic inferencing capability. However, this does not entail an incompatibility between our work and research on modeling interaction directly. Ultimately, a successful language understanding system will require both kinds of research, combining the advantages of precise, grammar-driven analysis of utterance structure and pragmatic inferencing based on discourse structures and knowledge of the world. We stress, however, that our concerns at this stage do not extend beyond the specification of a system that can efficiently extract literal meaning from isolated sentences of arbitrarily complex grammatical structure. Future systems will exploit the literal meaning thus extracted in more ambitious applications that involve pragmatic reasoning and discourse manipulation.
The system embodies two features that simultaneously promote extensibility, facilitate modification, and increase efficiency. The first is that its grammar is context-free in the informal sense sometimes (rather misleadingly) used in discussions of the autonomy of grammar and pragmatics: the syntactic rules and the semantic translation rules are independent of the specific application domain. Our rules are not devised ad hoc with a particular application or type of interaction in mind. Instead, they are motivated by recent theoretical developments in natural language syntax, and evaluated by the usual linguistic canons of simplicity and generality. No changes in the knowledge base or other exigencies deriving from a particular context of application can introduce a problem for the grammar (as distinct, of course, from the lexicon).
The second relevant feature is that the grammar in the system is context-free in the sense of formal language theory. This makes the extensive mathematical literature on context-free phrase structure grammars (CF-PSG's) directly relevant to the enterprise, and permits utilization of all the well-known techniques for the computational implementation of context-free grammars. It might seem anachronistic to base a language understanding system on context-free parsing. As Pratt (1975, 423) observes: "It is fashionable these days to want to avoid all reference to context-free grammars beyond warning students that they are unfit for computer consumption as far as computational linguistics is concerned." Moreover, widely accepted arguments have been given in the linguistics literature to the effect that some human languages are not even weakly context-free and thus cannot possibly be described by a CF-PSG. However, Gazdar and Pullum (1982) answer all of these arguments, showing that they are either formally invalid or empirically unsupported or both. It seems appropriate, therefore, to take a renewed interest in the possibility of CF-PSG description of human languages, both in computational linguistics and in linguistic research generally.
2 COMPONENTS OF THE SYSTEM
The linguistic basis of the GPSG linguistic system resides in the work reported in Gazdar (1981, 1982) and Gazdar, Pullum, and Sag (1981).¹ These papers argue on empirical and theoretical grounds that context-freeness is a desirable constraint on grammars. It clearly would not be so desirable, however, if (1) it led to lost generalizations or (2) it resulted in an unmanageable number of rules in the grammar. Gazdar (1982) proposes a way of simultaneously avoiding these two problems. Linguistic generalizations can be captured in a context-free grammar with a metagrammar, i.e. a higher-level grammar that generates the actual grammar as its language. The metagrammar has two kinds of statements:
(1) Rule schemata. These are basically like ordinary rules, except that they contain variables ranging over categories and features.
(2) Metarules. These are implicational statements, written in the form α ===> β, which capture relations between rules. A metarule α ===> β is interpreted as saying, "for every rule that is an instantiation of the schema α, there is a corresponding rule of form β." Here β will be θ(α), where θ is some mapping specified partly by the general theory of grammar and partly in the metarule formulation. For instance, it is taken to be part of the theory of grammar that θ preserves unchanged the subcategorization (rule name) features of rules (cf. below).
The GPSG system also assumes the Rule-to-Rule Hypothesis, first advanced by Richard Montague, which requires that each syntactic rule be associated with a single semantic translation rule. The syntax-semantics match is realized as follows: each rule is a triple consisting of a rule name, a syntactic statement (formally a local condition on node admissibility), and a semantic translation, specifying how the higher-order logic representations of the daughter nodes combine to yield the correct translation for the mother.²

¹ See also Gazdar, Pullum, Sag, and Wasow (1982) for some further discussion and comparison with other work in the linguistic literature.
The present GPSG system has five components:
1. Grammar
   a. Lexicon
   b. Rules and Metarules
2. Parser and Grammar Compiler
3. Semantics Handler
4. Disambiguator
5. HIRE database
3 GRAMMAR AND LEXICON
The grammar that has been implemented thus far is only a subset of a much larger GPSG grammar that we have defined on paper. It nevertheless describes a broad sampling of the basic constructions of English, including a variety of prepositional phrase constructions, noun-noun compounds, the auxiliary system, genitives, questions and relative clauses, passives, and existential sentences.
Each entry in the lexicon contains two kinds of information about a lexical item, syntactic and semantic. The syntactic part of an entry consists of a syntactic feature specification; this includes, inter alia, information about any irregular morphology the item may have, and what is known in the linguistic literature as strict subcategorization information. In our terms the latter is information linking lexical items of a particular category to specific environments in which that category is introduced by phrase structure rules. Presence in the lexical entry for an item I of the feature R (where R is the name of a rule) indicates that I may appear in structures admitted by R, and absence indicates that it may not.
The semantic information in a lexical entry is sometimes simple, directly linking a lexical item with some HIRE predicate or relation. With verbs or prepositions, there is also a specification of what case roles to associate with particular arguments (cf. below for discussion of case roles). Expressions that make a complex logical contribution to the sentence in which they appear will in general have complicated translations. Thus every has the translation:
(LAMBDA P (LAMBDA Q (FORALL X ((P X) --> (Q X)))))
This indicates that it denotes a function which takes as argument a set P, and returns the set of properties that are true of all members of that set (cf. below for slightly more detailed discussion).

² There is a theoretical issue here about whether semantic translation rules need to be stipulated for each syntactic rule or whether there is a general way of predicting their form. See Klein and Sag (1981) for an attempt to develop the latter view, which is not at present implemented in our system.
A typical rule looks like this:
<VP109: V! -> V N!! N!!2: ((V N!!2) N!!)>
The exclamation marks here are our notation for the bars in an X-bar category system. (See Jackendoff (1977) for a theory of this type, though one which differs on points of detail from ours.) The rule has the form <a: b: c>. Here a is the name 'VP109'; b is a condition that will admit a node labeled 'V!' if it has three daughter nodes labeled respectively 'V' (verb), 'N!!' (noun phrase at the second bar level), and 'N!!' (the numeral 2 being merely an index to permit reference to a specific symbol in the semantics, the metarules, and the rule compiler, and is not a part of the category label); and c is a semantic translation rule stating that the V constituent translates as a function expression taking as its argument the translation of the second N!!, the result being a function expression to be applied to the translation of the first N!!.
By a general convention in the theory of grammar, the rule name is one of the feature values marked on the lexical head of any rule that introduces a lexical category (as this one introduces V). Only verbs marked with that feature value satisfy this rule. For example, if we include in the lexicon the word give and assign to it the feature VP109, then this rule would generate the verb phrase gave Anne a job.
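The rule triple and the rule-name feature on lexical heads can be pictured with a minimal Python sketch. The class `Rule`, the dictionary `LEXICON`, and the function `head_satisfies` are hypothetical names introduced only for illustration; the semantic translation is kept as an uninterpreted string, and the head is assumed to be the first daughter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str          # rule name, e.g. "VP109"
    mother: str        # category of the admitted node, e.g. "V!"
    daughters: tuple   # daughter categories, indices such as the "2" omitted
    translation: str   # semantic translation, left uninterpreted in this sketch

# Hypothetical lexicon: word -> (category, set of rule-name features).
LEXICON = {
    "give": ("V", {"VP109"}),
    "hire": ("V", set()),   # assumed here not to carry the VP109 feature
}

VP109 = Rule("VP109", "V!", ("V", "N!!", "N!!"), "((V N!!2) N!!)")

def head_satisfies(word, rule):
    """True if `word` may appear as the lexical head of `rule`.

    Assumption of this sketch: the lexical head is the first daughter.
    """
    category, rule_features = LEXICON[word]
    return rule.daughters[0] == category and rule.name in rule_features
```

Under these assumptions, `head_satisfies("give", VP109)` holds while `head_satisfies("hire", VP109)` does not, which mirrors the way subcategorization is encoded as presence or absence of a rule name in the lexical entry.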
A typical metarule is the passive metarule, which looks like this (ignoring semantics):
<PAS: <V! -> V N!! W> => <V! -> V[PAS] W>>
W is a string variable ranging over zero or more category symbols. The metarule has the form <N: <A> => <B>>, where N is a name and <A> and <B> are schemata that have rules as their instantiations when appropriate substitutions are made for the free variables. This metarule says that for every rule that expands a verb phrase as verb followed by noun phrase followed by anything else (including nothing else), there is another rule that expands verb phrase as verb with passive morphology followed by whatever followed the noun phrase in the given rule. The metarule PAS would apply to grammar rule VP109 given above, yielding the rule:
<VP109: V! -> V[PAS] N!!>
As we noted above, the rule number feature is preserved here, so we get Anne was given a job, where the passive verb phrase is given a job, but not *Anne was hired a job.³
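As a rough illustration of how a metarule maps rules to rules, the following Python sketch applies the passive metarule to rules represented as (name, mother, daughters) tuples. The representation and the function names `apply_passive` and `expand` are assumptions made for this sketch; semantics and daughter indices are ignored, as in the statement of PAS above.

```python
def apply_passive(rule):
    """Apply <V! -> V N!! W> => <V! -> V[PAS] W> to one rule.

    Returns the derived rule, or None if the rule does not match the
    left-hand schema. The rule name is carried over unchanged.
    """
    name, mother, daughters = rule
    matches = (
        mother == "V!"
        and len(daughters) >= 2
        and daughters[0] == "V"
        and daughters[1] == "N!!"
    )
    if not matches:
        return None
    rest = daughters[2:]                      # the string variable W
    return (name, mother, ("V[PAS]",) + rest)

def expand(rules):
    """Return the basic rules plus every rule derived by the passive metarule."""
    derived = [apply_passive(r) for r in rules]
    return list(rules) + [r for r in derived if r is not None]

VP109 = ("VP109", "V!", ("V", "N!!", "N!!"))
INTRANSITIVE = ("VP_X", "V!", ("V",))         # hypothetical rule with no object
```

Here `apply_passive(VP109)` gives `("VP109", "V!", ("V[PAS]", "N!!"))`, and the hypothetical intransitive rule yields no passive counterpart because it fails to match the input schema.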
Passive sentences are thus analyzed directly, and not reduced to the form of active sentences in the course of being analyzed, in the way that is familiar from work on transformational grammars and on ATN's. However, this does not mean that no relation between passives and their active counterparts is expressed in the system, because the rules for analyzing passives are in a sense derivatively defined on the basis of rules for analyzing actives.
More difficult than treating passives and the like, and often cited as literally impossible within a context-free grammar,⁴ is treating constructions like questions and relative clauses. The apparent difficulty resides in the fact that in a question like Which employee has Personnel reported that Anne thinks has performed outstandingly?, the portion beginning with the third word must constitute a string analyzable as a sentence except that at some point it must lack a third person singular noun phrase in a position where such a noun phrase could otherwise have occurred. If it lacks no noun phrase, we get ungrammatical strings of the type *Which employee has Personnel reported that Anne thinks Montague has performed outstandingly? If it lacks a noun phrase at a position where the verb agreement indicates something other than a singular one is required, we get ungrammaticalities like *Which employee has Personnel reported that Anne thinks have performed outstandingly? The problem is thus one of guaranteeing a grammatical dependency across a context that may be arbitrarily wide, while keeping the grammar context-free.

³ We regard was given a job not as a passive verb phrase itself but as a verb phrase containing the verb be plus a passive verb phrase containing given and a job.
⁴ See Pullum and Gazdar (1982) for references.

The technique used was introduced into the linguistic literature by Gazdar (1981). It involves an augmentation of the nonterminal vocabulary of the grammar that permits constituents with "gaps" to be treated as not belonging to the same category as similar constituents without gaps. This would be an unwelcome and inelegant enlargement of the grammar if it had to be done by means of case-by-case stipulation, but again the use of a metagrammar avoids this. Gazdar (1981) proposes a new set of syntactic categories of the form α/β, where α and β are categories from the basic nonterminal vocabulary of the grammar. These are called slash categories. A slash category α/β may be thought of as representing a constituent of category α with a missing internal occurrence of β. We employ a method of introducing slash categories that was suggested by Sag (1982): a metarule stating that for every rule introducing some β under α there is a parallel rule introducing β/γ under α/γ. In other words, any constituent can have a gap of type γ if one of its daughter constituents does too. Wherever this would lead to a daughter constituent with the label γ/γ in some rule, another metarule allows a parallel rule without the γ/γ, and therefore defines rules that allow for actual gaps, i.e., missing constituents. In this way, complete sets of rules for describing the unbounded dependencies found in interrogative and relative clauses can readily be written. Even long-distance agreement facts can be (and are) captured, since the morphosyntactic features relevant to a specific case of agreement are present in the feature composition of any given γ.
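The two slash-category metarules described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: rules are (mother, daughters) pairs without names, features, or semantics; every daughter is allowed to carry the gap, including lexical categories, which a real grammar would presumably exclude; and the function names `slash_rules` and `gap_rules` are hypothetical.

```python
def slash_rules(rule, gap):
    """For a rule introducing daughters under `mother`, build the parallel
    rules that pass a gap of category `gap` down to one daughter at a time.
    """
    mother, daughters = rule
    derived = []
    for i, daughter in enumerate(daughters):
        slashed = daughters[:i] + (f"{daughter}/{gap}",) + daughters[i + 1:]
        derived.append((f"{mother}/{gap}", slashed))
    return derived

def gap_rules(rules, gap):
    """Wherever a daughter is labeled gap/gap, add a parallel rule that
    omits that daughter, so the constituent is actually missing.
    """
    trace = f"{gap}/{gap}"
    derived = []
    for mother, daughters in rules:
        for i, daughter in enumerate(daughters):
            if daughter == trace:
                derived.append((mother, daughters[:i] + daughters[i + 1:]))
    return derived

# Hypothetical basic rule: a verb phrase made of a verb and a noun phrase.
BASE = ("V!", ("V", "N!!"))
SLASHED = slash_rules(BASE, "N!!")
GAPPED = gap_rules(SLASHED, "N!!")
```

With the hypothetical rule `V! -> V N!!`, the first function derives `V!/N!! -> V N!!/N!!` (among others), and the second then derives `V!/N!! -> V`, a verb phrase whose object noun phrase is missing.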
4 PARSING
The system is initialized by expanding out the grammar. That is, the metarules are applied to the rules to produce the full rule set, which is then compiled and used by the parser. Metarules are not consulted during the process of parsing. One might well wonder about the possible benefits of the other alternative: a parser that made the metarule-derived rules to order each time they were needed, instead of consulting a precompiled list. This possibility has been explored by Kay (1982). Kay draws an analogy between metarules and phonological rules, modeling both by means of finite state transducers. We believe that this line is worth pursuing; however, the GPSG system currently operates off a precompiled set of rules.
Application of ten metarules to forty basic rules yielded 283 grammar rules in the 1/1/82 version of the GPSG system. Since then the grammar has been expanded somewhat, though the current version is still undergoing some debugging, and the number of rules is unstable. The size of the grammar-plus-metarules system grows by a factor of five or six through the rule compilation. The great practical advantage of using a metarule-induced grammar is, therefore, that the work of designing and revising the system of linguistic rules can proceed on a body of statements that is under twenty percent of the size it would be if it were formulated as a simple list of context-free rules.
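The figures quoted above can be checked with a few lines of Python. The only assumption is that the "grammar-plus-metarules system" is counted as the forty basic rules plus the ten metarules, i.e. fifty statements; the variable names are introduced for illustration.

```python
basic_rules = 40
metarules = 10
compiled_rules = 283

# Size of the hand-written system, counting rules and metarules alike.
statements = basic_rules + metarules

# Growth through rule compilation: 283 / 50, i.e. between five and six.
growth_factor = compiled_rules / statements

# Hand-written statements as a share of the compiled rule list: 50 / 283.
share = statements / compiled_rules

print(f"growth factor: {growth_factor:.2f}")
print(f"share of compiled size: {share:.1%}")
```

On this count the growth factor is a little under six and the hand-written system is a little under eighteen percent of the compiled rule list, consistent with "a factor of five or six" and "under twenty percent".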
The system uses a standard type of top-down parser with no lookahead, augmented slightly to prevent it from looking for a given constituent starting in a given spot more than once. It produces, in parallel, all legal parse trees for a sentence, with semantic translations associated with each node.
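The idea of a top-down parser that never searches for the same constituent at the same position twice can be sketched in Python with memoization. This is not the system's parser: it explores alternatives depth-first rather than in parallel, omits semantics and subcategorization features, and would loop on left-recursive rules. The toy grammar and lexicon follow the trivial fragment used in the example section of the paper; `parse_all` and `find` are hypothetical names.

```python
from functools import lru_cache

GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DET", "N")],
    "VP": [("V", "NP"), ("V", "A")],
}
LEXICON = {
    "every": {"DET"}, "applicant": {"N"}, "interviewed": {"V"},
    "Bill": {"NP"}, "is": {"V"}, "competent": {"A"},
}

def parse_all(words, goal="S"):
    """Return every parse tree of `words` as category `goal`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def find(cat, start):
        """All (tree, end) analyses of `cat` beginning at position `start`.

        The cache ensures each (category, position) pair is expanded once.
        """
        results = []
        # Lexical case: the next word itself belongs to the category.
        if start < len(words) and cat in LEXICON.get(words[start], ()):
            results.append(((cat, words[start]), start + 1))
        # Phrasal case: expand each rule for the category, left to right.
        for daughters in GRAMMAR.get(cat, ()):
            partial = [((), start)]
            for daughter in daughters:
                partial = [
                    (kids + (tree,), end)
                    for kids, pos in partial
                    for tree, end in find(daughter, pos)
                ]
            results.extend(((cat,) + kids, end) for kids, end in partial)
        return tuple(results)

    return [tree for tree, end in find(goal, 0) if end == len(words)]
```

For instance, `parse_all("every applicant interviewed Bill".split())` returns a single tree rooted in S, while a string such as "applicant every" receives no analysis.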
5 SEMANTICS
The semantics handler uses the translation rule associated with a node to construct its semantics from the semantics of its daughters. This construction makes crucial use of a procedure that we call Cooper storage (after Robin Cooper; see below). In the spirit of current research in formal semantics, each syntactic constituent is associated directly with a single logic expression (modulo Cooper storage), rather than any program or procedure for producing such an expression. Our semantic analysis thus embraces the principle of "surface compositionality". The semantic representations derived at each node are referred to as the Logical Representation (LR).
The disambiguator provides the crucial transition from LR to HIRE queries; the disambiguator uses information about the sort, or domain of definition, of various terms in the logical representation. One of the most important functions of the disambiguator is to eliminate parses that do not make sense in the conceptual scheme of HIRE.
HIRE is a relational database with a certain amount of inferencing capability. It is implemented in SPHERE, a database system which is a descendant of FOL (described in Weyhrauch (1980)). Many of the relation-names output by the disambiguator are derived relations defined by axioms in SPHERE. The SPHERE environment was important for this application, since it was essential to have something that could process first-order logical output, and SPHERE does just that. A noticeable recent trend in database theory has been a move toward an interdisciplinary comingling of mathematical logic and relational database technology (see especially Gallaire and Minker (1978) and Gallaire, Minker and Nicolas (1981)). We regard it as an important fact about the GPSG system that it links computational linguistics to first-order logical representation just as the work referred to above has linked first-order logic to relational database theory. We believe that SPHERE offers promising prospects for a knowledge representation system that is principled and general in the way that we have tried to exemplify in our syntactic and semantic rule system. Filman, Lamping and Montalvo (1982) present details of some capabilities of SPHERE that we have not as yet exploited in our work, involving the use of multiple contexts to represent viewpoints, beliefs, and modalities, which are generally regarded as insuperable stumbling-blocks to first-order logic approaches.
Thus far the linguistic work we have described has been in keeping with GPSG presented in the papers cited above. However, two semantic innovations have been introduced to facilitate the disambiguator's translation from LR to a HIRE query. As a result the linguistic system version of LR has two new properties:
(1) The intensional logic of the published work was set aside and LR was designed to be an extensional first-order language. Although constituent translations built up on the way to a root node may be second-order, the system maintains first-order reducibility. This reducibility is illustrated by the following analysis of noun phrases as second-order properties (essentially the analysis of Montague (1970)). For example, the proper name Egon and the quantified noun phrase every applicant are both translated as sets of properties:
Egon = LAMBDA P (P EGON)
Every applicant = LAMBDA P (FORALL X ((APPLICANT X) --> (P X)))
Egon is translated as the set of properties true of Egon, and every applicant, as the set of properties true of all applicants. Since basic predicates in the logic are first-order, neither of these translations can be the argument of any basic predicate; instead the argument is some unique entity-level variable which is later bound to the quantifier-expression by quantifying in. This technique is essentially the Cooper storage mentioned above. One advantage of this method of "deferring" the introduction into the interpretation process of phrases with quantifier meanings is that it allows for a natural, nonsyntactic treatment of scope ambiguities. Another is that with a logic limited to first-order predicates, there is still a natural treatment for coordinated noun phrases of apparently heterogeneous semantics, such as Egon and every applicant.
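The treatment of noun phrases as sets of properties can be made concrete over a small finite domain. In the Python sketch below, the domain, the set of applicants, and all function names are invented for illustration; properties are modeled as Boolean functions on individuals. The conjunction `np_and` is the standard intersective treatment of coordinated noun phrases and is an assumption of this sketch, since the text does not spell out the rule used.

```python
# Hypothetical model: three individuals, two of whom are applicants.
DOMAIN = {"anne", "bill", "egon"}
APPLICANTS = {"anne", "bill"}

def applicant(x):
    return x in APPLICANTS

def egon(P):
    """LAMBDA P (P EGON): the properties true of Egon."""
    return P("egon")

def every(P):
    """LAMBDA P (LAMBDA Q (FORALL X ((P X) --> (Q X))))."""
    return lambda Q: all(Q(x) for x in DOMAIN if P(x))

every_applicant = every(applicant)

def np_and(np1, np2):
    """Assumed coordination rule: a property holds of the conjoined noun
    phrase when it holds of both conjuncts.
    """
    return lambda P: np1(P) and np2(P)

egon_and_every_applicant = np_and(egon, every_applicant)
```

Because a proper name and a quantified noun phrase receive translations of the same type, the coordination Egon and every applicant needs no special case: `egon_and_every_applicant(P)` holds exactly when `P` is true of Egon and of each applicant in the model.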
(2) HIRE represents events as objects. All objects in the knowledge base, including events, belong to various sorts. For our purposes, a sort is a set. HIRE relations are declared as properties of entities within particular sorts. For example, there is an employment sort, consisting of various particular employment events, and an employment.employee relation as well as employment.organization and employment.manager relations. More conventional relations, like employee.manager, are defined as joins of the basic relations. Representing events as objects lets us capture some fairly obvious connections between verbs and events (between, say, the verb work and events of employment), and to represent different relations between a verb and its arguments as different first-order relations between an event and its participants. Although the lexical treatment sketched here is clearly domain dependent (the English verb work doesn't necessarily involve employment events), it was chosen primarily to simplify the ontology of a first implementation. As an alternative, one might consider associating work with events of a sort labor, one of whose subsorts was an employment event, defining employments as those labors associated with an organization.
Whichever choice one makes about the basic event-types of verbs, the mapping from verbs to HIRE relations cannot be direct. Consider a sentence like Anne works for Egon. In HIRE, this involves the employment.manager relation of a particular employment event and a particular manager, and the employment.employee relation of that same event and Anne. Yet where Egon in this example is picked out with the employment.manager relation, the sentence Anne works for HP will need to pick out HP with the employment.organization relation. In order to accommodate this many-to-many mapping between a verb and particular relations in a knowledge base, the lexicon stipulates special relations that link a verb to its eventual arguments. Following Fillmore (1968), these mediating relations are called case roles.
The disambiguator narrows the case roles down to specific knowledge base relations. To take a simple example, Anne works for HP has a logical representation reducible to:
(EXISTS SIGMA (AND (EMPLOYMENT SIGMA) (AG SIGMA ANNE) (LOC SIGMA HP)))
Here SIGMA is a variable over situations or event instantiations.⁵ The formula may be read, "There is an employment-situation whose Agent is Anne and whose Location is HP." The lexical entry for work supplies the information that its subject is an Agent and its complement a Location. The disambiguator now needs to further specify the case roles as HIRE relations. It does this by treating each atomic formula in the expression locally, using the fact that Anne is a person in order to interpret AG, and the fact that HP is an organization in order to interpret LOC. In this case, it interprets the AG role as employment.employee and the LOC role as employment.organization.
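The local, sort-driven interpretation of case roles described above might be sketched as a table lookup. Everything in the Python fragment below is an illustrative assumption rather than the actual disambiguator: the dictionaries `SORTS` and `ROLE_TABLE`, the tuple format for atomic formulas, and the function name `disambiguate`. Only the two role interpretations stated in the text are encoded.

```python
# Hypothetical sort assignments for constants of the logical representation.
SORTS = {"ANNE": "person", "EGON": "person", "HP": "organization"}

# (event sort, case role, sort of the argument) -> HIRE relation.
ROLE_TABLE = {
    ("EMPLOYMENT", "AG", "person"): "employment.employee",
    ("EMPLOYMENT", "LOC", "organization"): "employment.organization",
}

def disambiguate(event_sort, role_atoms):
    """Map each (role, event_variable, argument) atom to a HIRE relation.

    Each atom is treated locally, using only the sort of its argument.
    Returns None when some atom has no interpretation, which stands in
    for eliminating a reading that makes no sense in the database scheme.
    """
    query = []
    for role, event_var, argument in role_atoms:
        relation = ROLE_TABLE.get((event_sort, role, SORTS.get(argument)))
        if relation is None:
            return None
        query.append((relation, event_var, argument))
    return query

ATOMS = [("AG", "SIGMA", "ANNE"), ("LOC", "SIGMA", "HP")]
```

Applied to the atoms of Anne works for HP, `disambiguate("EMPLOYMENT", ATOMS)` yields the employment.employee and employment.organization relations, whereas an atom such as `("AG", "SIGMA", "HP")` has no entry in this toy table and causes the reading to be rejected.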
The advantages of using the roles in Logical Representation, rather than going directly to predicates in a knowledge base, include (1) the ability to interpret at least some prepositional phrases, those known as adjuncts, without subcategorizing verbs specially for them, since the case role may be supplied either by a verb or a preposition; and (2) the option of interpreting 'vague' verbs such as have and give using case roles without event types. These verbs, then, become "purely" relational. For example, the representation of Egon gave Montague a job would be:
(EXISTS SIGMA (AND ((SO EGON) SIGMA) ((POS MONTAGUE) SIGMA) (EMPLOYMENT SIGMA)))
Here SO 'source' will pick out the same employment.manager relation it did in the example above; and POS 'possession' is the same relation as that associated with have. Here the situation-type is supplied by the translation of the noun job. It is important to realize that this representation is derived without giving the noun phrase a job any special treatment. The lexical entry for give contains the information that the subject is the source of the direct object, and the direct object the possession of the indirect object. If there were lamps in our knowledge base, the derived representation of Egon gave Montague a lamp would simply be the above formula with the predicate lamp replacing employment. The possession relation would hold between Montague and some lamp, and the disambiguator would retrieve whatever knowledge-base relation kept track of such matters.

⁵ Our work in this domain has been influenced by the recent papers of Barwise and Perry on "situation semantics"; see, e.g., Barwise and Perry (1982).
Two active research goals of the current project are to give all lexical entries domain-independent representations, and to make all knowledge base-specific predicates and relations the exclusive province of the disambiguator. One important means to that end is case roles, which allow us a level of abstract, purely "linguistic" relations to mediate between logical representations and HIRE queries. Another is the use of general event types such as labor, to replace event-types specific to HIRE, such as employments. The case roles maintain a separation between the domain representation language and LR. Insofar as that separation is achieved, absolute portability of the system, up to and including the lexicon, is an attainable goal.
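The mediating role of case roles can be pictured as a table lookup performed by the disambiguator. The following is only a minimal sketch under stated assumptions, not the system's implementation: the pairing of SO with employment.manager in an EMPLOYMENT situation comes from the discussion above, while the table layout, the function names, every other relation name, and the choice of EGON as the source argument are hypothetical.

```python
# Hypothetical table: (case role, situation type) -> knowledge-base relation.
# Only SO/EMPLOYMENT -> employment.manager is taken from the text;
# all other relation names are invented for illustration.
ROLE_TABLE = {
    ("SO", "EMPLOYMENT"): "employment.manager",
    ("POS", "EMPLOYMENT"): "employment.employee",  # assumed name
    ("SO", "LAMP"): "lamp.donor",                  # assumed name
    ("POS", "LAMP"): "lamp.owner",                 # assumed name
}


def disambiguate(role, situation_type, table=ROLE_TABLE):
    """Return the domain relation for a case role in a situation type."""
    try:
        return table[(role, situation_type)]
    except KeyError:
        raise LookupError(
            f"no relation for {role} in {situation_type}"
        ) from None


def translate(role_atoms, situation_type, table=ROLE_TABLE):
    """Replace each (role, argument) pair by (domain relation, argument)."""
    return [
        (disambiguate(role, situation_type, table), arg)
        for role, arg in role_atoms
    ]


# Domain-independent part of a "give" sentence: just case roles.
give_roles = [("SO", "EGON"), ("POS", "MONTAGUE")]

print(translate(give_roles, "EMPLOYMENT"))
print(translate(give_roles, "LAMP"))
```

On this picture, moving to a new application means supplying a new table, while the role-based representation of give stays the same.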
Absolute portability obviously has immediate practical benefits for any system that expects to handle a large fragment of English, since the effort in moving from one application to another will be limited to "tuning" the disambiguator to a new ontology, and adding "specialized" vocabulary. The actual rules governing the production of first-order logical representations make no reference to the facts of HIRE. The question remains of just how portable the current lexicon is; the answer is that much of it is already domain-independent. Quantifiers like every (as we saw in the discussion of NP semantics) are expressed as logical constants; verbs like give are expressed entirely in terms of the case relations that hold among their arguments. Verbs like work can be abstracted away from the domain by a simple extension. The obvious goal is to try to give domain-independent representations to a core vocabulary of English that could be used in a variety of application domains.
6 AN EXAMPLE
We shall now give a slightly more detailed illustration of how the syntax and compositional semantics rules work. We are still simplifying considerably, since we have selected an example where role frames are not involved, and we are not employing features on nodes. Here we have the grammar of a trivial subset of English:
<S1: S -> NP VP: (NP VP)>
<NP1: NP -> DET N: (DET N)>
<VP1: VP -> V NP: (V NP)>
<VP2: VP -> V A: A>
Suppose that the lexicon associated with the above rules is:
<every: DET: (LAMBDA P (LAMBDA Q (FORALL X ((P X) IMPLIES (Q X)))))>
<applicant: N: APPLICANT>
<interviewed: V[(RULE VP1)]: INTERVIEW>
<Bill: NP: (LAMBDA P (P BILL))>
<is: V[(RULE VP2)]: (BE)>
<competent: A: (LAMBDA Y (EXPERTLEVEL HIGH Y))>
The syntax of a lexical entry is <L: C: T>, where L is the spelling of the item, C is its grammatical category and feature specification (if other than the default set), and T is its translation into LR.
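The rule and entry formats above map naturally onto simple records. The sketch below is one possible encoding, not the system's own data structures: the class names, the field names, the use of nested tuples for LR expressions, and the helper rules_for are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str          # rule name, e.g. "VP1"
    mother: str        # category left of the arrow
    daughters: tuple   # categories right of the arrow
    semantics: object  # how the daughters' translations combine


class Entry(NamedTuple):
    spelling: str        # L
    category: str        # C, the bare category
    rule: Optional[str]  # value of the (RULE ...) feature, if any
    translation: object  # T, an LR expression as nested tuples


RULES = [
    Rule("S1", "S", ("NP", "VP"), ("NP", "VP")),
    Rule("NP1", "NP", ("DET", "N"), ("DET", "N")),
    Rule("VP1", "VP", ("V", "NP"), ("V", "NP")),
    Rule("VP2", "VP", ("V", "A"), "A"),
]

EVERY = ("LAMBDA", "P", ("LAMBDA", "Q",
         ("FORALL", "X", (("P", "X"), "IMPLIES", ("Q", "X")))))

LEXICON = {e.spelling: e for e in [
    Entry("every", "DET", None, EVERY),
    Entry("applicant", "N", None, "APPLICANT"),
    Entry("interviewed", "V", "VP1", "INTERVIEW"),
    Entry("Bill", "NP", None, ("LAMBDA", "P", ("P", "BILL"))),
    Entry("is", "V", "VP2", ("BE",)),
    Entry("competent", "A", None,
          ("LAMBDA", "Y", ("EXPERTLEVEL", "HIGH", "Y"))),
]}


def rules_for(entry):
    """Names of the rules whose right-hand side can introduce this entry."""
    return [r.name for r in RULES
            if entry.category in r.daughters
            and (entry.rule is None or entry.rule == r.name)]


for word in ("interviewed", "is", "Bill"):
    print(word, rules_for(LEXICON[word]))
```

The RULE feature is what keeps interviewed out of VP2 and is out of VP1, even though both words are of category V.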
Consider how we assign an LR to a sentence like Every applicant is competent. The translation of every supplies most of the structure of the universal quantification needed in LR. It represents a function from properties to functions from properties to truth values, so when applied to applicant it yields a constituent, namely every applicant, which has one of the property slots filled, and represents a function from properties to truth-values; it is:
(LAMBDA P (FORALL X ((APPLICANT X) IMPLIES (P X))))
This function can now be applied to the function denoted by competent, i.e.
(LAMBDA Y (EXPERTLEVEL HIGH Y))
This yields:
(FORALL X ((APPLICANT X) IMPLIES ((LAMBDA Y (EXPERTLEVEL HIGH Y)) X)))
And after one more lambda-conversion, we have:
(FORALL X ((APPLICANT X) IMPLIES (EXPERTLEVEL HIGH X)))
Fig. 1 shows one parse tree that would be generated by the above rules, together with its logical translation. The sentence is Bill interviewed every applicant. The complicated translation of the VP is necessary because INTERVIEW is a one-place predicate that takes an entity-type argument, not the type of function that every applicant denotes. We thus defer combining the NP translation with the verb by using Cooper storage. A translation with a stored NP is represented in Fig. 1 in angle-brackets. Notice that at the S node the NP every applicant is still stored, but the subject is not stored. It has directly combined with the VP, by taking the VP as an argument. INTERVIEW is itself a two-place predicate, but one of its argument places has been filled by a place-holding variable, X1. There is thus only one slot left. The translation can now be completed via the operations of Storage Retrieval and lambda conversion. First, we simplify the part of the semantics that isn't in storage:
Fig. 1. A typical parse tree
S: <((LAMBDA P (P BILL)) (INTERVIEW X1)), <(LAMBDA P (FORALL X ((APPLICANT X) IMPLIES (P X))))>>
  NP: (LAMBDA P (P BILL))
    Bill
  VP: <(INTERVIEW X1) (LAMBDA P (FORALL X ((APPLICANT X) IMPLIES (P X))))>
    V: INTERVIEW
      interviewed
    NP: (LAMBDA P (FORALL X ((APPLICANT X) IMPLIES (P X))))
      DET: (LAMBDA Q (LAMBDA P (FORALL X ((Q X) IMPLIES (P X)))))
        every
      N: APPLICANT
        applicant
((LAMBDA P (P BILL)) (INTERVIEW X1)) ==>
((INTERVIEW X1) BILL)
The function (LAMBDA P (P BILL)) has been evaluated with P set to the value (INTERVIEW X1); this is a conventional lambda-conversion.
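Lambda-conversion of this kind is easy to mechanize. The following is a minimal sketch, assuming LR expressions are encoded as nested Python tuples with a two-element tuple standing for function application; the function names are illustrative, and the substitution step does no renaming of bound variables, which happens to be safe for the examples in this section but is not safe in general.

```python
BINDERS = {"LAMBDA", "FORALL"}


def substitute(term, var, value):
    """Replace free occurrences of var in term by value (no renaming)."""
    if term == var:
        return value
    if not isinstance(term, tuple):
        return term
    if len(term) == 3 and term[0] in BINDERS and term[1] == var:
        return term  # var is rebound here, so nothing below is free
    return tuple(substitute(t, var, value) for t in term)


def is_lambda(term):
    return isinstance(term, tuple) and len(term) == 3 and term[0] == "LAMBDA"


def reduce_term(term):
    """Apply lambda-conversion until no (LAMBDA ...) application remains."""
    if not isinstance(term, tuple):
        return term
    term = tuple(reduce_term(t) for t in term)
    if len(term) == 2 and is_lambda(term[0]):
        _, var, body = term[0]
        return reduce_term(substitute(body, var, term[1]))
    return term


def show(term):
    """Print a nested tuple in the parenthesized LR style."""
    if isinstance(term, str):
        return term
    return "(" + " ".join(show(t) for t in term) + ")"


BILL = ("LAMBDA", "P", ("P", "BILL"))
EVERY = ("LAMBDA", "P", ("LAMBDA", "Q",
         ("FORALL", "X", (("P", "X"), "IMPLIES", ("Q", "X")))))
COMPETENT = ("LAMBDA", "Y", ("EXPERTLEVEL", "HIGH", "Y"))

# Bill applied to the VP translation with its placeholder variable.
print(show(reduce_term((BILL, ("INTERVIEW", "X1")))))

# "Every applicant is competent": (every applicant) applied to competent.
print(show(reduce_term(((EVERY, "APPLICANT"), COMPETENT))))
```

The two calls reproduce the conversions worked through by hand in this section, the second one collapsing both lambda-conversions into a single pass.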
The rule for storage retrieval is to make a one-place predicate of the sentence translation by lambda-binding the placeholding variable, and then to apply the NP translation as a function to the result. The S-node translation above becomes:
((LAMBDA P
   (FORALL X
      ((APPLICANT X) IMPLIES (P X))))
 (LAMBDA X1 ((INTERVIEW X1) BILL)))
[lambda-conversion] ==>
(FORALL X ((APPLICANT X) IMPLIES
           ((LAMBDA X1
               ((INTERVIEW X1) BILL)) X)))
[lambda-conversion] ==>
(FORALL X ((APPLICANT X) IMPLIES
           ((INTERVIEW X) BILL)))
This is the desired final result.
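The retrieval step itself can be sketched in a few lines. This is an illustrative reading of the rule stated above, not the system's code: the store is assumed to be a list of (placeholder, NP translation) pairs, LR expressions are again encoded as nested tuples, and the names retrieve and show are hypothetical.

```python
def retrieve(core, store):
    """Discharge each stored NP in turn.

    For every (placeholder, NP translation) pair, lambda-bind the
    placeholder in the sentence translation and apply the NP
    translation to the resulting one-place predicate.
    """
    for placeholder, np_translation in store:
        core = (np_translation, ("LAMBDA", placeholder, core))
    return core


def show(term):
    """Print a nested tuple in the parenthesized LR style."""
    if isinstance(term, str):
        return term
    return "(" + " ".join(show(t) for t in term) + ")"


EVERY_APPLICANT = ("LAMBDA", "P",
                   ("FORALL", "X", (("APPLICANT", "X"), "IMPLIES", ("P", "X"))))

# Simplified S-node translation with "every applicant" still in storage.
core = (("INTERVIEW", "X1"), "BILL")
store = [("X1", EVERY_APPLICANT)]

print(show(retrieve(core, store)))
```

The printed term is the unreduced S-node translation; two lambda-conversions then yield the final formula. With more than one stored NP, the order in which the store is discharged would fix the relative scope of the quantifiers, with the last NP retrieved taking widest scope.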
7 CONCLUSION
What we have outlined is a natural language system that is a direct implementation of a linguistic theory. We have argued that in this case the linguistic theory has the special appeal of computational tractability (promoted by its context-freeness), and that the system as a whole offers the hope of a happy marriage of linguistic
computer applications. The system's theoretical underpinnings give it compatibility with current research in Generalized Phrase Structure Grammar, and its augmented first order logic gives it compatibility with a whole body of ongoing research in the field of model-theoretic semantics. The work done thus far is only the first step on the road to a robust and practical natural language processor, but the guiding principle throughout has been extensibility, both of the grammar, and of the applicability to various spheres of computation.
ACKNOWLEDGEMENT
Grateful acknowledgement is given to two brave souls, Steve Gadol and Bob Kanefsky, who helped give this system some of its credibility by implementing the actual hook-up with HIRE. Thanks are also due Robert Filman and Bert Raphael for helpful comments on an early version of this paper. And a special thanks is due Richard Weyhrauch, for encouragement, wise advice, and comfort in times of debugging.
APPENDIX
This appendix lists some sentences that are actually translated into HIRE and answered by the current system. Declarative sentences presented to the system are evaluated with respect to their truth value in the usual way, and thus also function as queries.
SIMPLE SENTENCES
1. HP employs Egon.
2. Egon works for HP.
3. HP offered Montague the position.
4. HP gave Montague a job.
5. Montague got a job from HP.
6. Montague's job is at HP.
7. HP's offer was to Capulet.
8. Montague had a meeting with Capulet.
9. Capulet has an offer from Xerox.
10. Capulet is competent.
IMPERATIVES AND QUESTIONS
11. Find the programmers in CRC who attended the meeting.
12. How many applicants for the position are there?
13. Which manager interviewed Capulet?
14. Whose job did Capulet accept?
15. Who is a department manager?
16. Is there a LISP programmer who Xerox hired?
17. Whose job does Montague have?
18. How many applicants did Capulet interview?
RELATIVE CLAUSES
19. The professor whose student Xerox hired visited HP.
20. The manager Montague met with hired the student who attended Berkeley.
NOUN-NOUN COMPOUNDS
21. Some Xerox programmers visited HP.
22. Montague interviewed a job applicant.
23. Who are the department managers?
24. How many applicants have a LISP programming background?
COORDINATION
25. Who did Montague interview and visit?
26. Which department's position did every programmer and a manager from Xerox apply for?
PASSIVE AND EXISTENTIAL SENTENCES
27. Egon was interviewed by Montague.
28. There is a programmer who knows LISP in CRC.
INFINITIVAL COMPLEMENTS
29. Montague managed to get a job at HP.
30. HP failed to hire a programmer with Lisp programming background.
REFERENCES
Barwise, Jon, and John Perry. 1982. "Situations and attitudes." Journal of Philosophy 78, 668-692.
Cooper, Robin. 1975. Montague's Semantic Theory and Transformational Syntax. Doctoral dissertation, University of Massachusetts, Amherst.
Fillmore, Charles. 1968. "The Case for Case." In Bach, Emmon and Robert Harms, eds. Universals in Linguistic Theory. New York: Holt, Rinehart and Winston.
Filman, Robert E., John Lamping, and Fanya Montalvo. 1982. "Metalanguage and
presentation at the AAAI National Conference on Artificial Intelligence, Carnegie-Mellon University, Pittsburgh, Pennsylvania.
Gallaire, Hervé, and Jack Minker, eds. 1978. Logic and Data Bases. New York: Plenum Press.
Gallaire, Hervé, Jack Minker, and Jean Marie Nicolas, eds. 1981. Advances in Data Base Theory. New York: Plenum Press.
Gazdar, Gerald. 1981. "Unbounded Dependencies and Coordinate Structure." Linguistic Inquiry 12, 155-184.
Gazdar, Gerald. 1982. "Phrase Structure Grammar." In Pauline Jacobson and Geoffrey K. Pullum, eds. The Nature of Syntactic Representation. Dordrecht: D. Reidel.
Gazdar, Gerald, Geoffrey K. Pullum, and Ivan A. Sag. In press. "Auxiliaries and Related Phenomena." Language.
Gazdar, Gerald, Geoffrey K. Pullum, Ivan A.
"Coordination and Transformational Grammar." Linguistic Inquiry 13.
Jackendoff, Ray. 1977. X̄ Syntax. Cambridge: MIT Press.
Kay, Martin. 1982. "When Metarules are not Metarules." Ms. Xerox Palo Alto Research Center.
Montague, Richard. 1970. "The Proper Treatment of Quantification in English." In Richmond Thomason, ed. 1974. Formal Philosophy. New Haven: Yale University Press.
Pratt, Vaughan R. 1975. "LINGOL - a progress report." Advance Papers of the Fourth International Joint Conference on Artificial Intelligence, Tbilisi, Georgia, USSR, 3-8 September 1975. Cambridge, MA: Artificial Intelligence Laboratory. 422-428.
Pullum, Geoffrey K. and Gerald Gazdar. 1982. "Natural languages and context-free languages." Linguistics and Philosophy 4.
Sag, Ivan A. 1982. "Coordination, Extraction,
Grammar." Linguistic Inquiry 13.
Weyhrauch, Richard W. 1980. "Prolegomena to a theory of mechanized formal reasoning." Artificial Intelligence, 1, pp 133-170.