COOKING UP REFERRING EXPRESSIONS

Robert Dale
Centre for Cognitive Science, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, Scotland
email: rda%uk.ac.ed.epistemi@nss.cs.ucl.ac.uk

ABSTRACT

This paper describes the referring expression generation mechanisms used in EPICURE, a computer program which produces natural language descriptions of cookery recipes. Major features of the system include: an underlying ontology which permits the representation of non-singular entities; a notion of discriminatory power, to determine what properties should be used in a description; and a PATR-like unification grammar to produce surface linguistic strings.

INTRODUCTION

EPICURE (Dale 1989a, 1989b) is a natural language generation system whose principal concern is the generation of referring expressions which pick out complex entities in connected discourse. In particular, the system generates natural language descriptions of cookery recipes. Given a top level goal, the program first decomposes that goal recursively to produce a plan consisting of operations at a level of detail commensurate with the assumed knowledge of the hearer. In order to describe the resulting plan, EPICURE then models its execution, so that the processes which produce referring expressions always have access to a representation of the ingredients in the state they are in at the time of description.

This paper describes that part of the system responsible for the generation of subsequent referring expressions, i.e., references to entities which have already been mentioned in the discourse. The most notable features of the approach taken here are as follows: (a) the use of a sophisticated underlying ontology, to permit the representation of non-singular entities; (b) the use of two levels of semantic representation, in conjunction with a model of the discourse, to produce appropriate anaphoric referring expressions; (c) the use of a notion of discriminatory power, to determine what properties should be used in describing an entity; and (d) the use of a PATR-like unification grammar (see, for example, Karttunen (1986); Shieber (1986)) to produce surface linguistic strings from input semantic structures.

THE REPRESENTATION OF INGREDIENTS

In most natural language systems, it is assumed that all the entities in the domain of discourse are singular individuals. In more complex domains, such as recipes, this simplification is of limited value, since a large proportion of the objects we find are masses or sets, such as those described by the noun phrases two ounces of salt and three pounds of carrots respectively.

In order to permit the representation of entities such as these, EPICURE makes use of a notion of a generalized physical object or physobj. This permits a consistent representation of entities irrespective of whether they are viewed as individuals, masses or sets, by representing each as a knowledge base entity (KBE) with an appropriate structure attribute. The knowledge base entity corresponding to three pounds of carrots, for example, is that shown in figure 1.

A knowledge base entity models a physobj in a particular state. An entity may change during the course of a recipe, as processes are applied to it: in particular, apart from gaining new properties such as being peeled, chopped, etc., an ingredient's structure may change, for example, from set to mass. Each such change of state results in the creation of a new knowledge base entity. Suppose, for example, a grating event is applied to our three pounds of carrots between states s0 and s1: the entity shown in figure 1 will then become a mass of grated carrot, represented in state s1 by the KBE shown in figure 2.
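The state-indexed representation described in this section can be sketched in code. The following Python fragment is only an illustration and not EPICURE's implementation (the paper states that the system is written in Prolog): the names KBE, Quantity and apply_event are hypothetical, the attribute names follow figures 1 and 2, and the packaging attribute is left out for brevity.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Quantity:
    unit: str
    number: int


@dataclass(frozen=True)
class KBE:
    """One physobj in one particular state (hypothetical class)."""
    index: str                      # identity of the physobj, e.g. "x0"
    state: str                      # state in which this description holds
    structure: str                  # "individual", "mass" or "set"
    substance: str
    quantity: Optional[Quantity] = None
    properties: frozenset = frozenset()   # e.g. {"grated"}


def apply_event(entity, new_state, new_property, new_structure=None):
    """Return a NEW KBE for the same physobj in a later state.

    The original KBE is left untouched, mirroring the idea that each
    change of state creates a new knowledge base entity.
    """
    return replace(
        entity,
        state=new_state,
        structure=new_structure or entity.structure,
        properties=entity.properties | {new_property},
    )


# Three pounds of carrots in state s0, then grated between s0 and s1.
carrots_s0 = KBE("x0", "s0", "set", "carrot", Quantity("pound", 3))
carrots_s1 = apply_event(carrots_s0, "s1", "grated", new_structure="mass")
```

Keeping the entities immutable is a design choice of this sketch: it makes it easy to hold a description of the ingredient as it was in every earlier state, which is what the generation processes need in order to describe an ingredient as it is at the time of description.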

    KBE = [ index = x0
            state = s0
            spec  = [ structure = set
                      quantity  = [ unit = pound
                                    number = 3 ]
                      substance = carrot
                      packaging = [ shape = carrot
                                    size = regular ] ] ]

Figure 1: The knowledge base entity corresponding to three pounds of carrots

    KBE = [ index = x0
            state = s1
            spec  = [ structure = mass
                      quantity  = [ unit = pound
                                    number = 3 ]
                      substance = carrot
                      grated = + ] ]

Figure 2: The knowledge base entity corresponding to three pounds of grated carrot

BUILDING A REFERRING EXPRESSION

To construct a referring expression corresponding to a knowledge base entity, we first build a deep semantic structure which specifies the semantic content of the noun phrase to be generated. We call this the recoverable semantic content, since it consists of just that information the hearer should be able to derive from the corresponding utterance, even if that information is not stated explicitly: in particular, elided elements and instances of one-anaphora are represented in the deep semantic structure by their more semantically complete counterparts, as we will see below.

From the deep semantic structure, a surface semantic structure is then constructed. Unlike the deep semantic structure, this closely matches the syntactic structure of the resulting noun phrase, and is suitable for passing directly to a PATR-like unification grammar. It is at the level of surface semantic structure that processes such as elision and one-anaphora take place.

PRONOMINALIZATION

When an entity is to be referred to, we first check to see if pronominalization is possible. Some previous approaches to the pronominalization decision have taken into account a large number of contextual factors (see, for example, McDonald (1980:218-220)). The approach taken here is relatively simple. EPICURE makes use of a discourse model which distinguishes two principal components, corresponding to Grosz's (1977) distinction between local focus and global focus. We call that part of the discourse model corresponding to the local focus cache memory: this contains the lexical, syntactic and semantic detail of the current utterance being generated, and the same detail for the previous utterance. Corresponding to global focus, the discourse model consists of a number of hierarchically-arranged focus spaces, mirroring the structure of the recipe being described. These focus spaces record the semantic content, but not the syntactic or lexical detail, of the remainder of the preceding discourse. In addition, we make use of a notion of discourse centre: this is intuitively similar to the notion of centering suggested by Grosz, Joshi and Weinstein (1983), and corresponds to the focus of attention in the discourse. In recipes, we take the centre to be the result of the previous operation described. Thus, after an utterance like Soak the butterbeans the centre is the entity described by the noun phrase the butterbeans. Subsequent references to the centre can be pronominalized, so that the next instruction in the recipe might then be Drain and rinse them.

Following Grosz, Joshi and Weinstein (1983), references to other entities present in cache memory may also be pronominalized, provided the centre is pronominalized.¹
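As a rough illustration of the discourse model just described, the following Python sketch keeps a cache for the previous utterance, a list of focus spaces for older material, and a centre. It is a simplification under stated assumptions: the class and method names are hypothetical, the utterance currently under construction is not modelled, and the focus spaces are kept as a flat list rather than the hierarchy the paper describes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Utterance:
    text: str                 # surface detail, kept only while in the cache
    mentioned: Set[str]       # indices of the entities mentioned
    result: Optional[str]     # index of the entity the operation produces


@dataclass
class DiscourseModel:
    previous: Optional[Utterance] = None                         # cache memory (local focus)
    focus_spaces: List[Set[str]] = field(default_factory=list)   # global focus, flattened
    centre: Optional[str] = None

    def finish_utterance(self, utterance):
        """Record an utterance once it has been generated."""
        if self.previous is not None:
            # Older material keeps its semantic content only.
            self.focus_spaces.append(set(self.previous.mentioned))
        self.previous = utterance
        # The centre is the result of the operation just described.
        self.centre = utterance.result

    def can_pronominalize(self, index, centre_pronominalized=False):
        """The centre may always be pronominalized; other entities in the
        cache only if the centre is pronominalized as well."""
        if index == self.centre:
            return True
        in_cache = self.previous is not None and index in self.previous.mentioned
        return in_cache and centre_pronominalized


model = DiscourseModel()
model.finish_utterance(Utterance("Soak the butterbeans", {"beans"}, "beans"))
print(model.can_pronominalize("beans"))   # the centre: "Drain and rinse them"
```

Because only the previous utterance is held in the cache, an entity last mentioned earlier than that can never be pronominalized in this sketch, which matches the restriction the paper imposes.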

If the intended referent is the current centre, then this is marked as part of the status information in the deep semantic structure being constructed, and a null value is specified for the structure's descriptive content. In addition, the verb case frame used to construct the utterance specifies whether or not the linguistic realization of the entity filling each case role is obligatory: as we will see below, this allows us to model a common linguistic phenomenon in recipes (recipe context empty objects, after Massam and Roberge (1989)). Where the realization of the entity is obligatory, the resulting deep semantic structure is then as follows:

    DS = [ index = x
           sem = [ status = [ given = +
                              centre = +
                              oblig = + ]
                   spec = [ type = φ ] ] ]

This will be realized as either a pronoun or an elided NP, generated from a surface semantic structure which is constructed in accordance with the following rules:

• If the status includes the features [centre, +] and [oblig, +], then there should be a corresponding element in the surface semantic structure, with a null value specified for the descriptive content of the noun phrase to be generated;

• If the status includes the features [centre, +] and [oblig, -], then this participant should be omitted from the surface semantic structure altogether.

¹ We do not permit pronominal reference to entities last mentioned before the previous utterance: support for this restriction comes from a study by Hobbs, who, in a sample of one hundred consecutive examples of pronouns from each of three very different texts, found that 98% of antecedents were either in the same or previous sentence (Hobbs 1978:322-323). However, see Dale (1988) for a suggestion as to how the few instances of long-distance pronominalization that do exist might be explained by means of a theory of discourse structure like that suggested by Grosz and Sidner (1986).

In the former case, this will result in a pronominal reference as in Remove them, where the surface semantic structure corresponding to the pronominal form is as follows:

    SS = [ index = x
           sem = [ status = [ given = +
                              centre = +
                              oblig = + ]
                   spec = [ agr = [ num = pl
                                    case = acc ]
                            desc = φ ] ] ]

However, if the participant is marked as non-obligatory, then reference to the entity is omitted, as in the following:

    Fry the onions.
    Add the garlic φ.

Here, the case frame for add specifies that the indirect object is non-obligatory; since the entity which fills this case role is also the centre, the complete prepositional phrase to the onions can be elided. Note, however, that the entity corresponding to the onions still figures in the deep semantic structure; thus, it is integrated into the discourse model, and is deemed to be part of the semantic content recoverable by the hearer.
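The two rules for references to the centre can be summarized in a short sketch. This Python fragment is an assumption-laden illustration rather than the system's code: CASE_FRAMES, realization and plan_participants are hypothetical names, the case frames shown are invented for the two example verbs, and the additional case where other cache entities are pronominalized is ignored.

```python
# Hypothetical case frames: for each verb, whether the linguistic
# realization of each case role is obligatory.
CASE_FRAMES = {
    "remove": {"object": True},
    "add": {"object": True, "indirect_object": False},
}


def realization(is_centre, obligatory):
    """Choose the surface form for one participant."""
    if not is_centre:
        return "full-np"          # build a full description instead
    # [centre, +] and [oblig, +] gives a pronoun;
    # [centre, +] and [oblig, -] omits the participant altogether.
    return "pronoun" if obligatory else "elided"


def plan_participants(verb, fillers, centre):
    """Map each case role of `verb` to the form its filler should take."""
    frame = CASE_FRAMES[verb]
    return {
        role: realization(index == centre, frame[role])
        for role, index in fillers.items()
    }


# "Fry the onions. Add the garlic φ."
print(plan_participants(
    "add", {"object": "garlic", "indirect_object": "onions"}, centre="onions"))
# "Remove them."
print(plan_participants("remove", {"object": "olives"}, centre="olives"))
```

Note that the elided participant is dropped only from the surface plan; in the paper's account it still figures in the deep semantic structure and so remains part of the discourse model.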

FULL DEFINITE NOUN PHRASE REFERENCE

If pronominalization is ruled out, we have to build an appropriate description of the intended referent. In EPICURE, the process of constructing a description is driven by two principles, very like Gricean conversational maxims (Grice 1975). The principle of adequacy requires that a referring expression should identify the intended referent unambiguously, and provide sufficient information to serve the purpose of the reference; and the principle of efficiency, pulling in the opposite direction, requires that the referring expression used must not contain more information than is necessary for the task at hand.²

These principles are implemented in EPICURE by means of a notion of discriminatory power. Suppose that we have a set of entities $U$ such that

$$U = \{x_1, x_2, \ldots, x_n\}$$

and that we wish to distinguish one of these entities, $x_1$, from all the others. Suppose, also, that the domain includes a number of attributes ($a_1$, $a_2$, and so on), and that each attribute has a number of permissible values ($v_{n,1}$, $v_{n,2}$, and so on); and that each entity is described by a set of attribute-value pairs. In order to distinguish $x_1$ from the other entities in $U$, we need to find some set of attribute-value pairs which are together true of $x_1$, but of no other entity in $U$. This set of attribute-value pairs constitutes a distinguishing description of $x_1$ with respect to the context $U$. A minimal distinguishing description is then a set of such attribute-value pairs, where the cardinality of that set is such that there are no other sets of attribute-value pairs of lesser cardinality which are sufficient to distinguish the intended referent.

² Similar considerations are discussed by Appelt (1985).

    DS = [ index = x
           sem = [ status = [ given = +
                              unique = + ]
                   spec = [ agr = [ countable = +
                                    number = pl ]
                            type = [ category = olive
                                     props = [ size = regular
                                               pitted = + ] ] ] ] ]

Figure 3: The deep semantic structure corresponding to the pitted olives

    SS = [ sem = [ status = [ given = +
                              unique = + ]
                   spec = [ agr = [ countable = +
                                    number = pl ]
                            desc = [ head = olive
                                     mod = [ head = pitted ] ] ] ] ]

Figure 4: The surface semantic structure corresponding to the pitted olives

    DS = [ sem = [ status = [ given = +
                              unique = + ]
                   spec = [ agr = [ number = sg
                                    countable = + ]
                            type = [ category = capsicum
                                     properties = [ colour = red
                                                    size = small ] ] ] ] ]

Figure 5: The deep semantic structure corresponding to the small red capsicum

    SS = [ index = x2
           sem = [ spec = [ agr = [ number = sg
                                    countable = + ]
                            ... ] ] ]

Figure 6: The surface semantic structure corresponding to the small red one

We find a minimal distinguishing description by observing that different attribute-value pairs differ in the effectiveness with which they distinguish an entity from a set of entities. Suppose $U$ has $N$ elements, where $N > 1$. Then, any attribute-value pair true of the intended referent $x_1$ will be true of $n$ entities in this set, where $n \ge 1$. For any attribute-value pair $\langle a, v \rangle$ that is true of the intended referent, we can compute the discriminatory power (notated here as $F$) of that attribute-value pair with respect to $U$ as follows:

$$F(\langle a, v \rangle, U) = \frac{N - n}{N - 1}, \qquad 1 \le n \le N$$

$F$ thus has as its range the interval [0,1], where a value of 1 for a given attribute-value pair indicates that the attribute-value pair singles out the intended referent from the context, and a value of 0 indicates that the attribute-value pair is of no assistance in singling out the intended referent.
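The discriminatory power measure can be computed directly from its definition: with N entities in the context and n of them satisfying the attribute-value pair, the value is (N - n) / (N - 1). The Python sketch below assumes, purely for illustration, that entities are dictionaries from attribute names to values; the function name and the example entities are hypothetical.

```python
def discriminatory_power(pair, context):
    """Discriminatory power of an attribute-value pair with respect to a context.

    `context` is the set U (including the intended referent), given as a list
    of dicts. The pair must be true of the intended referent, so at least one
    entity in the context satisfies it.
    """
    attribute, value = pair
    total = len(context)                                   # N
    if total < 2:
        raise ValueError("the context must contain more than one entity")
    matching = sum(1 for entity in context if entity.get(attribute) == value)  # n
    if matching < 1:
        raise ValueError("the pair is not true of any entity in the context")
    return (total - matching) / (total - 1)


# A small invented context: pitted olives, unpitted olives and a capsicum.
context = [
    {"category": "olive", "pitted": True, "size": "regular"},
    {"category": "olive", "pitted": False, "size": "regular"},
    {"category": "capsicum", "size": "regular"},
]

print(discriminatory_power(("pitted", True), context))       # 1.0: singles out the referent
print(discriminatory_power(("category", "olive"), context))  # 0.5: rules out one of two distractors
print(discriminatory_power(("size", "regular"), context))    # 0.0: true of every entity
```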

Given an intended referent and a set of entities from which the intended referent must be distinguished, this notion is used to determine which set of properties should be used in building a description which is both adequate and efficient.³ There remains the question of how the constituency of the set $U$ of entities is determined: in the present work, we take the context always to consist of the working set. This is the set of distinguishable entities in the domain at any given point in time: the constituency of this set changes as a recipe proceeds, since entities may be created or destroyed.⁴

Suppose, for example, we determine that we must identify a given object as being a set of olives which have been pitted (in a context, for example, where there are also olives which have not been pitted); the corresponding deep semantic structure is then as in figure 3.

Note that this deep semantic structure can be realized in at least two ways: as either the olives which have been pitted or the pitted olives. Both forms are possible, although they correspond to different surface semantic structures. Thus, the generation algorithm is non-deterministic in this respect (although one might imagine there are other factors which determine which of the two realizations is preferable in a given context). The surface semantic structure for the simpler of the two noun phrase structures is as shown in figure 4.

³ Strictly speaking, this mechanism is only applicable in the form described here to those properties of an entity which are realizable by what are known as absolute (or intersective or predicative) adjectives (see, for example, Kamp (1975), Keenan and Faltz (1978)). This is acceptable in the current domain, where many of the adjectives used are derived from the verbs used to describe processes applied to entities.

⁴ A slightly more sophisticated approach would be to restrict $U$ to exclude those entities which are, in Grosz and Sidner's (1986) terms, only present in closed focus spaces. However, the benefit gained from doing this (if indeed it is a valid thing to do) is minimal in the current context because of the small number of entities we are dealing with.

    DS = [ sem = [ status = [ ... ]
                   spec = [ agr = [ number = pl
                                    countable = + ]
                            type = [ category = pound ]
                            subst = [ agr = [ number = pl ]
                                      type = [ category = carrot ] ] ] ] ]

Figure 7: The deep semantic structure corresponding to three pounds of carrots
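The paper does not spell out in this section the exact procedure by which discriminatory power selects properties, so the following Python sketch should be read as one plausible strategy and not as EPICURE's algorithm. It is a greedy selection, which is a substitution made for the sake of a short example: at each step it adds the attribute-value pair with the highest discriminatory power relative to the entities not yet ruled out. A greedy choice of this kind yields a distinguishing description but is not guaranteed to yield one of minimal cardinality in every context. All names and the example working set are hypothetical.

```python
def discriminatory_power(pair, context):
    """(N - n) / (N - 1) for a pair true of n of the N entities in context."""
    attribute, value = pair
    matching = sum(1 for entity in context if entity.get(attribute) == value)
    return (len(context) - matching) / (len(context) - 1)


def describe(referent, working_set):
    """Greedily collect attribute-value pairs that single out `referent`.

    Entities are dicts from attribute names to values. Returns a list of
    pairs, or None if the referent cannot be distinguished from the rest of
    the working set by its attribute-value pairs.
    """
    if referent not in working_set:
        raise ValueError("the referent must be a member of the working set")
    context = list(working_set)
    candidates = sorted(referent.items())     # fixed order for reproducibility
    description = []
    while len(context) > 1:
        unused = [pair for pair in candidates if pair not in description]
        if not unused:
            return None
        best = max(unused, key=lambda pair: discriminatory_power(pair, context))
        if discriminatory_power(best, context) == 0:
            return None                       # nothing left rules anything out
        description.append(best)
        # Keep only the entities the description so far fails to rule out.
        context = [e for e in context if e.get(best[0]) == best[1]]
    return description


pitted = {"category": "olive", "pitted": True, "size": "regular"}
unpitted = {"category": "olive", "pitted": False, "size": "regular"}
capsicum = {"category": "capsicum", "size": "regular"}
working_set = [pitted, unpitted, capsicum]

print(describe(pitted, working_set))    # [('pitted', True)]
```

In this invented working set the single property pitted is enough to pick out the intended referent, which corresponds to the situation behind figure 3. The sketch returns only the distinguishing properties; the head noun's category would still be needed to realize a noun phrase such as the pitted olives.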

ONE ANAPHORA

The algorithms employed in EPICURE also permit the generation of one-anaphora, as in

    Slice the large green capsicum.
    Now remove the top of the small red one.

The deep semantic structure corresponding to the noun phrase the small red one is as shown in figure 5.

The mechanisms which construct the surface semantic structure determine whether one-anaphora is possible by comparing the deep semantic structure corresponding to the previous utterance with that corresponding to the current utterance, to identify any elements they have in common. The two distinct levels of semantic representation play an important role here: in the deep semantic structure, only the basic semantic category of the description has special status (this is similar to Webber's (1979) use of restricted quantification), whereas the embedding of the surface semantic structure's desc feature closely matches that of the noun phrase to be generated. For one-anaphora to be possible, the two deep semantic structures being compared must have the same value for the feature addressed by the path <sem spec type category>. Rules which specify the relative ordering of adjectives in the surface form are then used to build an appropriately nested surface semantic structure which, when unified with the grammar, will result in the required one-anaphoric noun phrase. In the present example, this results in the surface semantic structure in figure 6.
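The condition for one-anaphora can be illustrated with nested dictionaries standing in for deep semantic structures. This Python sketch is an assumption about representation only: get_path and one_anaphora_possible are hypothetical helpers, and the structures shown contain just the features needed for the comparison along the path <sem spec type category>.

```python
CATEGORY_PATH = ("sem", "spec", "type", "category")


def get_path(structure, path):
    """Follow a feature path through nested dicts; None if it is undefined."""
    for feature in path:
        if not isinstance(structure, dict) or feature not in structure:
            return None
        structure = structure[feature]
    return structure


def one_anaphora_possible(previous_ds, current_ds):
    """True if both deep semantic structures share the same basic category."""
    previous_category = get_path(previous_ds, CATEGORY_PATH)
    return (previous_category is not None
            and previous_category == get_path(current_ds, CATEGORY_PATH))


large_green = {"sem": {"spec": {"type": {
    "category": "capsicum",
    "properties": {"colour": "green", "size": "large"}}}}}
small_red = {"sem": {"spec": {"type": {
    "category": "capsicum",
    "properties": {"colour": "red", "size": "small"}}}}}
olives = {"sem": {"spec": {"type": {"category": "olive"}}}}

print(one_anaphora_possible(large_green, small_red))   # True: "the small red one"
print(one_anaphora_possible(large_green, olives))      # False: the category differs
```

Only the category is compared; the differing properties (colour and size) are exactly what the one-anaphoric noun phrase still has to express.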

PSEUDO-PARTITIVE NPS

Partitive and pseudo-partitive noun phrases, exemplified by half of the carrots and three pounds of carrots respectively, are very common in recipes; EPICURE is capable of generating both. So, for example, the pseudo-partitive noun phrase three pounds of carrots (as represented by the knowledge base entity shown in figure 1) is generated from the deep semantic structure shown in figure 7 via the surface semantic structure shown in figure 8.

The generation of partitive noun phrases requires slightly different semantic structures, described in greater detail in Dale (1989b).

[Figure 8: The surface semantic structure corresponding to three pounds of carrots. The structure carries status and spec features under sem, with agr values (countable = +, number = 3) and two embedded specifications, spec1 and spec2.]

THE UNIFICATION GRAMMAR

Once the required surface semantic structure has been constructed, this is passed to a unification grammar. In EPICURE, the grammar consists of phrase structure rules annotated with path equations which determine the relationships between semantic units and syntactic units: the path equations specify arbitrary constituents (either complex or atomic) of feature structures.

There is insufficient space here to show the entire NP grammar, but we provide some representative rules in figure 9 (although these rules are expressed here in a PATR-like formalism, within EPICURE they are encoded as PROLOG definite clause grammar (DCG) rules (Clocksin and Mellish 1981)). Applying these rules to the surface semantic structures described above results in the generation of the appropriate surface linguistic strings.
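To give a feel for what a path equation does, the following Python sketch unifies two feature structures represented as nested dictionaries. It is a deliberately minimal illustration and not the PATR formalism or EPICURE's DCG encoding: it has no variables and no shared (re-entrant) substructures, and the function name unify and the example features are assumptions.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None if the two structures assign
    conflicting atomic values to the same path.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None            # clash somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values unify only with themselves.
    return a if a == b else None


# An equation such as <N1 syn agr> = <NP syn agr> asks for the two
# agreement structures to be unifiable.
np_agr = {"number": "pl"}
n1_agr = {"countable": "+", "number": "pl"}

print(unify(np_agr, n1_agr))                 # {'number': 'pl', 'countable': '+'}
print(unify(np_agr, {"number": "sg"}))       # None: number clash
```

In a full implementation the two paths would end up sharing a single structure, so that information added through one path is visible through the other; the sketch only checks compatibility and builds the merged result.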

CONCLUSION

In this paper, we have described the processes used in EPICURE to produce noun phrase referring expressions. EPICURE is implemented in C-PROLOG running under UNIX. The algorithms used in the system permit the generation of a wide range of pronominal forms, one-anaphoric forms and full noun phrase structures, including partitives and pseudo-partitives.

ACKNOWLEDGEMENTS

The work described here has benefited greatly from discussions with Ewan Klein, Graeme Ritchie, Jon Oberlander, and Marc Moens, and from Bonnie Webber's encouragement.

REFERENCES

Appelt, Douglas E (1985) Planning English Referring Expressions. Artificial Intelligence, 26, 1-33.

Clocksin, William F and Mellish, Christopher S (1981) Programming in Prolog. Berlin: Springer-Verlag.

Dale, Robert (1988) The Generation of Subsequent Referring Expressions in Structured Discourses. Chapter 5 in Zock, M and Sabah, G (eds.) Advances in Natural Language Generation: An Interdisciplinary Perspective, Volume 2, pp58-75. London: Pinter Publishers Ltd.

Dale, Robert (1989a) Generating Recipes: An Overview of EPICURE. Extended Abstracts of the Second European Natural Language Generation Workshop, Edinburgh, April 1989.

Dale, Robert (1989b) Generating Referring Expressions in a Domain of Objects and Processes. PhD Thesis, Centre for Cognitive Science, University of Edinburgh.

Grice, H Paul (1975) Logic and Conversation. In Cole, P and Morgan, J L (eds.) Syntax and Semantics, Volume 3: Speech Acts, pp41-58. New York: Academic Press.

Grosz, Barbara J (1977) The Representation and Use of Focus in Dialogue. Technical Note No 151, SRI International, Menlo Park, Ca.

[Figure 9: A fragment of the noun phrase grammar. The fragment includes the rules NP → Det N1, N1 → N and N1 → AP N1, annotated with path equations over features such as <NP sem status>, <NP sem spec agr>, <NP syn agr>, <NP sem spec desc>, <N1 sem head> and <N1 sem mod>.]

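Grosz, Barbara J, Joshi, Aravind K and Weinstein, Scott (1983) Providing a Unified Account of Definite Noun Phrases in Discourse. In Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, Massachusetts Institute of Technology, Cambridge, Mass., 15-17 June, 1983, pp44-49.

Grosz, Barbara J and Sidner, Candace L (1986) Attention, Intentions, and the Structure of Discourse. Computational Linguistics, 12(3), 175-204.

Hobbs, Jerry R (1978) Resolving Pronoun References. Lingua, 44, 311-338.

Kamp, Hans (1975) Two Theories about Adjectives. In Keenan, E L (ed.) Formal Semantics of Natural Language: Papers from a colloquium sponsored by King's College Research Centre, Cambridge. Cambridge: Cambridge University Press.

Karttunen, Lauri (1986) D-PATR: A Development Environment for Unification-Based Grammars. In Proceedings of the 11th International Conference on Computational Linguistics, Bonn, 25-29 August, 1986, pp74-80.

Keenan, Edward L and Faltz, Leonard M (1978) Logical Types for Natural Language. UCLA Occasional Papers in Linguistics, No 3.

McDonald, David D (1980) Natural Language Generation as a Process of Decision-Making under Constraints. PhD Thesis, Department of Computer Science and Electrical Engineering, MIT.

Massam, Diane and Roberge, Yves (1989) Recipe Context Null Objects in English. Linguistic Inquiry, 20, 134-139.

Shieber, Stuart M (1986) An Introduction to Unification-Based Approaches to Grammar. The University of Chicago Press.

Webber, Bonnie Lynn (1979) A Formal Approach to Discourse Anaphora. Garland Publishing.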
