COPYING IN NATURAL LANGUAGES, CONTEXT-FREENESS, AND QUEUE GRAMMARS

Alexis Manaster-Ramer
University of Michigan
2236 Fuller Road #108
Ann Arbor, MI 48105

ABSTRACT

The documentation of (unbounded-length) copying and cross-serial constructions in a few languages in the recent literature is usually taken to mean that natural languages are slightly context-sensitive. However, this ignores those copying constructions which, while productive, cannot be easily shown to apply to infinite sublanguages. To allow such finite copying constructions to be taken into account in formal modeling, it is necessary to recognize that natural languages cannot be realistically represented by formal languages of the usual sort. Rather, they must be modeled as families of formal languages or as formal languages with indefinite vocabularies. Once this is done, we see copying as a truly pervasive and fundamental process in human language. Furthermore, the absence of mirror-image constructions in human languages means that it is not enough to extend Context-free Grammars in the direction of context-sensitivity. Instead, a class of grammars must be found which handles (context-sensitive) copying but not (context-free) mirror images. This suggests that human linguistic processes use queues rather than stacks, making imperative the development of a hierarchy of Queue Grammars as a counterweight to the Chomsky Grammars. A simple class of Context-free Queue Grammars is introduced and discussed.

Introduction

The claim that at least some human languages cannot be described by a Context-free Grammar no matter how large or complex has had an interesting career. In the late 1960's it might have seemed, given the arguments of Bar-Hillel and Shamir (1960) about respectively coordinations in English, Postal (1964) about reduplication-cum-incorporation of object noun stems in Mohawk, and Chomsky (1963) about English comparative deletion, that this claim was firmly established.

Potentially serious and at any rate embarrassing problems with both the formal and the linguistic aspects of these arguments kept popping up, however (Daly, 1974; Levelt, 1974), and the partial fixes provided by Brandt Corstius (as reported in Levelt, 1974) for the respectively arguments and by Langendoen (1977) for that as well as the Mohawk argument did not deter Pullum and Gazdar (1982) from claiming that "it seems reasonable to assume that the natural languages are a proper subset of the infinite-cardinality CFL's, until such time as they are validly shown not to be". Two new arguments, Higginbotham's (1984) one involving such that relativization and Postal and Langendoen's (1984) one about sluicing, were dismissed on grounds of descriptive inadequacy by Pullum (1984a), who, however, suggested that the Langendoen and Postal (1984) argument about the doubling relativization construction may be correct (all these arguments deal with English).

Pullum (1984b) likewise heaped scorn on my argument that English reshmuplicative constructions show non-CFness, but he accepted (1984a; 1984b) Culy's (1985) argument about noun reduplication in Bambara and Shieber's (1985) one about Swiss German cross-serial constructions of causative and perception verbs and their objects. Gazdar and Pullum (1985) also cite these two, as well as an argument by Carlson (1983) about verb phrase reduplication in Engenni. They also refer to my discovery of the X or no X construction in English and mention that "Alexis Manaster-Ramer in unpublished lectures finds reduplication constructions that appear to have no length bound in Polish, Turkish, and a number of other languages". While they do not refer to my 1983 reshmuplication argument, which they presumably still reject, the Turkish construction they allude to was cited in my 1983 paper and is similar to the English reshmuplication in form as well as function (see below).

In any case, the acceptance of even one case of non-CFness in one natural language by the only active advocates of the CF position would seem to suffice to remove the issue from the agenda. Any additional arguments, such as Kac (to appear), Kac, Manaster-Ramer, and Rounds (to appear), and Manaster-Ramer (to appear a; to appear b) may appear to be no more than flogging of dead horses. However, as I argued in Manaster-Ramer (1983) and as recent work (Manaster-Ramer, to appear a; Rounds, Manaster-Ramer, and Friedman, to appear) shows ever more clearly, this conception of the issue (viz., Is there one natural language that is weakly noncontext-free?) makes very little difference and not much sense.

First of all, if non-CFness is so hard to find, then it is presumably linguistically marginal. Second, weak generative arguments cannot be made to work for natural languages, because of their high degree of structural ambiguity and the great difficulty in excluding every conceivable interpretation on which an apparently ungrammatical string might turn out, on reflection, to be in the language. Third, weak generative capacity is in any case not a very interesting property of a formal grammar, especially from a linguistic point of view, since linguistic models are judged by other criteria (e.g., natural languages might well be regular without this making CFGs any the more attractive as models for them). Fourth, results about the place of natural languages in the Chomsky Hierarchy should be considered in light of the fact that there is no reason to take the Chomsky Hierarchy as the appropriate formal space in which to look for them. Fifth, models of natural languages that are actually in use in theoretical, computational, and descriptive linguistics are, and always have been, only remotely related to the Chomsky Grammars, which means that results about the latter may be of little relevance to linguistic models.

As I argued in 1983, we should go beyond piecemeal debunking of invalid arguments against CFGs and by the same token it seems to me that we must go beyond piecemeal restatements of such arguments. Rather, we should focus on general issues and ones that have implications for the modeling of human languages. One such issue is, it seems to me, the kind of context-sensitivity found in natural languages. It appears that the counterexamples to context-freeness are all rather similar. Specifically, they all seem to involve some kind of cross-serial dependency, i.e., a dependency between the nth elements of two or more substrings. This, unlike the statement that natural languages are noncontext-free, might mean something if we knew what kinds of models were appropriate for cross-serial dependencies. Given that not every kind of context-sensitive construction is found in human languages, it should be clear that there is nothing to be gained by invoking the dubious slogan of context-sensitivity.

Another relevant question is the centrality or peripherality of these constructions in natural languages. The relevant literature makes it appear that they are somewhat marginal at best. This would explain the tortured history of the attempts to show that they exist at all. However, this appears to be wrong, at least when we consider copying constructions. The requirement of full or near identity of two or more subparts of a sentence (or a discourse) is a very widespread phenomenon. In this paper, I will focus on the copying constructions precisely because they are so common in human languages.

In addition to such questions, which appear to focus on the linguistic side of things, there are also the more mathematical and conceptual problems involved in the whole enterprise of modeling human languages in formal terms. My own belief is that both kinds of issues must be solved in tandem, since we cannot know what kind of formal models we want until we know what we are going to model, and we cannot know what human languages are or are not like until we know how to represent them and what to compare them to. This paper is intended as a contribution to this kind of work.

Copying Dependencies

The examples of copying (and other) constructions which have figured in the great context-freeness debate have all involved attempts to show that a whole (natural) language is noncontext-free. Now, while it is often easy to find a noncontext-free subset of such a language, it is not always possible to isolate that subset formally from the rest of the language in such a way as to show that the language as a whole is noncontext-free. There is so much ambiguity in natural languages that it is strictly speaking impossible to isolate any construction at the level of strings, thus invalidating all arguments against CFGs or even Regular Grammars that refer to weak generative capacity. However, the arguments can be reconstructed by making use of the notion of classificatory capacity of formal grammars, introduced in Manaster-Ramer (to appear a) and Manaster-Ramer and Rounds (to appear). The classificatory capacity is the set of languages generated by the various subgrammars of a grammar, and if we are willing to assume that linguists can tell which sentences in a language exemplify the same or different syntactic patterns, then we can usually simply demonstrate that, e.g., no CFG can have a subgrammar generating all and only the sentences of some particular construction if that construction involves reduplication. This will show the inadequacy of CFGs, even if the string set as a [...] approach holds that it is impossible to determine with any confidence that a particular string qua string is ungrammatical, but that it may be possible to tell one construction from another, and that the latter, and not the former, is the real basis of all linguistic work, theoretical, computational, and descriptive.

Finite Copying

The counterexamples to context-freeness in the literature have all been claimed to crucially involve expressions of unbounded length. This seemed necessary in view of the fact that an upper bound on length would imply finiteness of the subset of strings involved, which would as a result be of no formal language theoretic interest. However, it is often difficult to make a case for unbounded length, and the main result has been that, even though every linguist knows about reduplication, it seemed nearly impossible to find an instance of reduplication that could be used to make a formal argument against CFGs, even though no one would ever use a CFG to describe reduplication.

For, in addition to reduplications that can apply to unboundedly long expressions, there is a much better known class of reduplications exemplified by Indonesian pluralization of nouns. Here it is difficult to show that the reduplicated forms are infinite in number, because compound nouns are not pluralized in the same way, and ignoring compounding, it would seem that the number of nouns is finite. However, this number is very large and moreover it is probably not well defined. The class of noun stems is open, and can be enriched by borrowing from foreign languages and neologisms, and all of these spontaneously pluralize by reduplication.

Rounds, Manaster-Ramer, and Friedman (to appear) argue that facts like this mean that a natural language should not be modeled as a formal language but rather as a family of languages, each of which may be taken as an approximation to an ideal language. In the case before us, we could argue that each of the approximations has only a finite number of nouns, for example, but a different number in different approximations. This idea, related to the work of Yuri Gurevich on finite dynamic models of computation, allows us to state the argument that the existence of an open class of reduplications is sufficient to show the inadequacy of CFGs for that family of approximations. The basis of the argument is the observation that while each of the approximate languages could in principle have a CFG, each such CFG would differ from the next not only in the addition of a new lexical item but also in the addition of a new reduplication rule (for that particular item).

To capture what is really going on, we require a grammar that is the same for each approximation modulo the lexicon. This grammar in a sense generates the infinite ideal, but each actual approximate grammar has only a finite lexicon and hence generates only a finite number of reduplications. In order to model the flexibility of the natural language vocabulary, we assume that each member of the family has the same grammar modulo the terminal vocabulary and the rules which insert terminals.

Another way of stating this is that the lexicon of Indonesian is finite but of an indefinite size (what Gurevich calls "uncountably finite"). A CFG would still have to contain a separate rule for the plural of every noun and hence would have to be of an indefinite size. Thus, with [...] new rule. However, this would mean that the grammar at any given time can only form the plurals of nouns that have already been learned. Since speakers of the language know in advance how to pluralize unfamiliar nouns, this cannot be true. Rather, the grammar at any given time must be able to form plurals of nouns that have not yet been learned. This in turn means that an indefinite number of plurals can be formed by a grammar of a determinate finite size. Hence, in effect, the number of rules for plural formation must be smaller than the number of plural forms that can be generated, and this in turn means that there is no CFG of Indonesian.
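The contrast drawn in this argument can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not part of the formal argument: the nouns, the hyphenated spelling of the plural, and the function names are all hypothetical choices made here. The first function mimics a grammar that lists one plural rule per known noun; the second mimics a single copying rule of fixed size.

```python
from typing import Dict, Iterable


def listed_plural_rules(lexicon: Iterable[str]) -> Dict[str, str]:
    """CFG-style treatment (assumed for illustration): one plural rule per known noun."""
    return {noun: f"{noun}-{noun}" for noun in lexicon}


def reduplicate(noun: str) -> str:
    """Single copying rule: applies to any noun stem, whether already learned or not."""
    return f"{noun}-{noun}"


known_nouns = ["buku", "orang"]   # hypothetical lexicon at some given time
rules = listed_plural_rules(known_nouns)
new_noun = "komputer"             # hypothetical newly borrowed noun

# The listed rules grow with the lexicon and say nothing about the new noun,
# while the copying rule has a fixed size and covers it immediately.
print(len(rules), new_noun in rules, reduplicate(new_noun))
```

The rule table has exactly as many entries as there are known nouns, so it cannot pluralize an unfamiliar noun until it is extended, whereas the copying function needs no change when the lexicon grows.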

This brings up a crucial issue, of which we are all presumably aware but which is usually lost sight of in practice, namely, that the way a mathematical model (in this case, formal language theory) is applied to a physical or mental domain (in this case, natural language) is a matter of utility and not itself subject to proof or disproof. Formal language theory deals with sets of strings over well-defined finite vocabularies (also often called alphabets) such as the hackneyed {a, b}. It has been all too easy to fall into the trap of equating the formal language theoretic notion of vocabulary (alphabet) with the linguistic notion of vocabulary and likewise to confuse the formal language theoretic notion of a string (word) over the vocabulary (alphabet) with the linguistic notion of sentence.

However, the fundamental fact about all known natural languages is the openness of at least some classes of words (e.g., nouns but perhaps not prepositions or, in some languages, verbs), which can acquire new members through borrowing or through various processes of new formation, many of them apparently not rule-governed, and which can also lose members, as words are forgotten. Thus, the well-defined finite vocabularies of formal language theory are not a very good model of the vocabularies of natural languages. Whether we decide to introduce the notion of families of languages or that of uncountably finite sets or whether we rather choose to say that the vocabulary of a natural language is really infinite (being the set of all strings over the sounds or letters of the language that could conceivably be or become lexical items in it), we end up having to conclude that any language which productively reduplicates some open word class to form some grammatical category cannot have a CFG.

Copying in English

It should now be noted that reduplications (and reiterations generally) are extremely common in natural languages. Just how common follows from an inspection of the bewildering variety of such constructions that are found in English. All the examples cited here are productive though they may be of bounded length.

Linguistics shminguistics
Linguistics or no linguistics, (I am going home)
A dog is a dog is a dog
Philosophize while the philosophizing is good!
Moral is as moral does
Is she beautiful or is she beautiful?

These are clause-level constructions, but we also find ones restricted to the phrase level.

(He) deliberates, deliberates, deliberates (all day long)
(He worked slowly) theorem by theorem
(They form) a church within a church
(He debunks) theory after theory

Also relevant are cases where a copying dependency extends across sentence boundaries, as in discourses like:

A: She is fat
B: She is fat, my foot

It is interesting that several of these types are productive even though they appear to be based on what originally must have been more restricted, idiomatic expressions. The pattern a X within a X, for example, is surely derived from the single example a state within a state, yet has become quite productive.

Many of these patterns have analogues in other languages. For example, the X after X construction appears to involve quantification and this may be related to the fact that, for example, Bambara uses reduplication to mean 'whatever' and Sanskrit to mean 'every' (Pāṇini 8.1.4). English reshmuplication has close analogues in many languages, including the whole Dravidian and Turkic language families. Tamil kiduplication (e.g., pustakam kistakam) and Turkish meduplication (e.g., kitap mitap) are instances of this, though the semantic range is somewhat different. In both of these, the sense is more like that of English books and things, books and such, i.e., a combination of deprecation and etceteraness rather than the purely derisive function of English books shmooks. The English X or no X pattern is very similar to a Polish construction consisting of the form X (nominative) X (instrumental) in its range of applications. The repetition of a verb or verbal phrase to deprecate excessive repetition or intensity of an action seems to be found in many languages as well.

I have not tried here to survey the uses to which copying constructions are put in different languages or even to document fully their wide incidence, though the examples cited should give some indication of both. It does appear that copying constructions are extremely common and pervasive, and this in turn suggests that they are central to man's linguistic faculties. When we consider such additional facts as the frequency of copying in child language, we may be tempted to take copying as one of the basic linguistic operations.

Copies vs. mirror images

The existence and the centrality of copying constructions pose interesting questions that go beyond the inadequacy of CFGs. For example, why should natural languages have reduplications when they lack mirror-image constructions, which are context-free? This asymmetry (first noted in Manaster-Ramer and Kac, 1985, and Rounds, Manaster-Ramer, and Friedman op. cit.) argues that it is not enough to make a small concession to context-sensitivity, as the saying goes. Rather than grudgingly clambering up the Chomsky Hierarchy towards Context-sensitive Grammars, we should consider going back down to Regular Grammars and striking out in a different direction. The simplest alternative proposal is a class of grammars which intuitively have the same relation to queues that CFGs have to stacks. The idea, which I owe to Michael Kac, would be that human linguistic processes make little if any use of stacks and employ queues instead.
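The intuition behind the queue/stack contrast can be made concrete with a small Python sketch. It is an informal illustration only, and the function names are hypothetical: the same symbols are stored as they are read and then replayed, once from a first-in-first-out queue and once from a last-in-first-out stack. The queue yields a copy, the stack a mirror image.

```python
from collections import deque


def replay_with_queue(x: str) -> str:
    """Emit x while storing it in a FIFO queue, then replay the queue: x followed by a copy of x."""
    store = deque()
    out = []
    for symbol in x:
        out.append(symbol)
        store.append(symbol)
    while store:
        out.append(store.popleft())  # first stored, first replayed
    return "".join(out)


def replay_with_stack(x: str) -> str:
    """Emit x while pushing it on a LIFO stack, then pop the stack: x followed by its mirror image."""
    store = []
    out = []
    for symbol in x:
        out.append(symbol)
        store.append(symbol)
    while store:
        out.append(store.pop())      # last stored, first replayed
    return "".join(out)


print(replay_with_queue("abb"))  # abbabb
print(replay_with_stack("abb"))  # abbbba
```

Nothing in the two functions differs except the end of the store from which symbols are retrieved, which is the sense in which copying and mirror images are queue-like and stack-like respectively.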

Queue Grammars

This suggests that CFGs are not just inadequate as models of natural languages but inadequate in a particularly damaging way. They are not even the right point of departure, since they not only undergenerate but also overgenerate. This leads to the idea of a hierarchy of grammars whose relation to queues is like that of the Chomsky Grammars to stacks. A queue-based analogue to CFG is being developed, under the name of Context-free Queue Grammar. The current version is allowed rules of the following form:

A --> a
A --> aB
A --> aB...b
A --> a...b
A --> ...B

Whatever appears to the right of the three dots is put at the end of the string being rewritten. Otherwise, all definitions are as in a corresponding restricted CFG. Thus, the grammar

S --> aS...a
S --> bS...b
S --> a...a
S --> b...b

will generate the copying language over {a,b} excluding the null string and define derivations like the following:

S --> aSa --> abSab --> abaaba
S --> bSb --> baSba --> baaSbaa --> baabSbaab
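The rewriting step described above can be sketched in Python. This is a minimal illustration under an assumed reading of the rule notation, namely that the four rules of the example grammar are S --> aS...a, S --> bS...b, S --> a...a, and S --> b...b, with the material after the three dots appended to the end of the whole string. The rule labels, the triple encoding, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# A rule is (lhs, head, tail): the nonterminal lhs is replaced in place by head,
# and tail (the material after the three dots) goes to the end of the string.
Rule = Tuple[str, str, str]

RULES: Dict[str, Rule] = {
    "S->aS...a": ("S", "aS", "a"),
    "S->bS...b": ("S", "bS", "b"),
    "S->a...a": ("S", "a", "a"),
    "S->b...b": ("S", "b", "b"),
}


def apply_rule(form: str, rule: Rule) -> str:
    """Rewrite the leftmost occurrence of the rule's nonterminal in a sentential form."""
    lhs, head, tail = rule
    i = form.find(lhs)
    if i < 0:
        raise ValueError(f"no {lhs!r} in {form!r}")
    return form[:i] + head + form[i + len(lhs):] + tail


def derive(rule_names: List[str]) -> List[str]:
    """Return every sentential form of a derivation that starts from S."""
    forms = ["S"]
    for name in rule_names:
        forms.append(apply_rule(forms[-1], RULES[name]))
    return forms


print(" --> ".join(derive(["S->aS...a", "S->bS...b", "S->a...a"])))
```

Under this reading the call reproduces the first derivation given in the text, S --> aSa --> abSab --> abaaba, and applying the rule S->bS...b, then S->aS...a twice, then S->bS...b again yields the sentential forms of the second.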

On the other hand, I conjecture that the corresponding xmi(x) language cannot be generated by such a grammar. Even at this early stage of inquiry into these formalisms, then, we have some tangible promise of being able to explain why natural languages should have reduplications but not mirror-image constructions. Various xh(x) constructions such as the respectively ones and the cross-serial verb constructions can be handled in the same way as reduplications.

While the idea of taking queues as opposed to stacks as the principal nonfinite-state resource available to human linguistic processes would explain the prevalence of copying and the absence of mirror images, it does not explain the coexistence of center-embedded constructions with cross-serial ones or the relative scarcity of cross-serial constructions other than copying ones.

For this reason, if for no other, the CFQGs could not be an adequate model of natural language. In fact, there are [...] they fail is that they apparently can only generate two copies, or two cross-serially dependent substrings, whereas natural languages seem to allow more (as in Grammar is grammar is grammar). This is similar to the limitation of Head Grammars and Tree Adjoining Grammars to generating no more than four copies (Manaster-Ramer to appear a). However, a more general class of Queue Grammars appears to be within reach which will generate an arbitrary number of copies.

Perhaps more serious is the fact that CFQGs apparently can only generate copying constructions at the cost of profligacy (as defined in Rounds, Manaster-Ramer, and Friedman, to appear). The repair of this defect is less obvious, but it appears that the fundamental idea of basing models of natural languages on queues rather than stacks is not undermined. Rather, what is at issue is the way in which information is entered into and retrieved from the queue. The CFQGs suggest a piecemeal process but the considerations cited here seem to argue for a global one. A number of formalisms with these properties are being explored.

On the other hand, it may be that something much like the simple CFQG is a natural way of capturing cross-serial dependencies in cases other than copying. To see exactly what is involved, consider the difference between copying and other cross-serial dependencies. This difference has little to do with the form of the strings. Rather, in the case of other cross-serial dependencies, there is a syntactic and semantic relation between the nth elements of two or more structures. For example, in a respectively construction involving a conjoined subject and a conjoined predicate, each conjunct of the former is semantically combined with the corresponding conjunct of the latter. In the case of copying constructions, there is nothing analogous. The corresponding parts of the two copies do not bear any relations to each other. Thus it makes some sense to build up the corresponding parts of a cross-serial construction in a piecemeal fashion, but this appears to be inapplicable in the case of copying constructions.

In view of all these limitations, the CFQGs might seem to be a non-starter. However, their importance lies in the fact that they are the first step in reorienting our notions of the formal space for models of natural language. Any real success in the theoretical models of human language depends on the development of appropriate mathematical concepts and on closing the gap between formal language and natural language theory. One of the first steps in this direction must involve breaking the spell of CFGs and the Chomsky Hierarchy. The CFQGs seem to be cut out for this task. Moreover, the idea that queues rather than stacks are involved in human language appears to be correct, and this more general result is independent of the limitations of CFQGs. However, given my stated goals for formal models, it is necessary to develop models such as CFQGs before proceeding to more complex ones precisely in order to develop an appropriate notion of formal space within which we will have to work.
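The intuition behind preferring queues to stacks can be sketched in a few lines of Python. This is an assumed illustration, not an implementation of CFQGs: the function names `is_copy` and `is_mirror` are hypothetical, and the midpoint of the string is taken from its length, whereas an automaton would have to guess it. A first-in, first-out queue checks a copy (a string of the form ww) directly, while a last-in, first-out stack checks a mirror image (w followed by its reversal).

```python
from collections import deque


def is_copy(s):
    """Check that s has the form ww, using a queue (first in, first out)."""
    if len(s) % 2:
        return False
    half = len(s) // 2
    queue = deque(s[:half])            # enqueue the first half
    for symbol in s[half:]:
        if queue.popleft() != symbol:  # the oldest symbol is compared first
            return False
    return True


def is_mirror(s):
    """Check that s is w followed by reversed w, using a stack (last in, first out)."""
    if len(s) % 2:
        return False
    half = len(s) // 2
    stack = list(s[:half])             # push the first half
    for symbol in s[half:]:
        if stack.pop() != symbol:      # the newest symbol is compared first
            return False
    return True
```

The two functions differ only in which end of the stored first half is consulted, which is the sense in which copying is natural for a queue and mirror-image matching is natural for a stack.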

The other main point addressed in this paper, the need to model human languages as families of formal languages or as formal languages with indefinite terminal vocabularies, is intended in the same spirit. The allure of identifying formal language theoretic concepts with linguistic ones in the simplest possible way is hard to overcome, but it must be if we are to get any meaningful results about natural languages through the formal route. It will, again, be necessary to do more work on these concepts, but it is beginning to look as though we have found the right direction.

REFERENCES

Constituents. Linguistic Categories (Frank Heny and Barry Richards, eds.), 1: Categories, 69-98. Dordrecht: Reidel.

Chomsky, Noam. 1963. Formal Properties of Grammars. Handbook of Mathematical Psychology (R. Duncan Luce et al., eds.), 2: 323-418. New York: Wiley.

Culy, Christopher. 1985. The Complexity of the Vocabulary of Bambara. Linguistics and Philosophy, 8: 345-351.

Daly, R. T. 1974. Applications of the Mathematical Theory of Linguistics. The Hague: Mouton.

Gazdar, Gerald, and Geoffrey K. Pullum. 1985. Computationally Relevant Properties of Natural Languages and Their Grammars. New Generation Computing, 3: 273-306.

Higginbotham, James. 1984. English is not a Context-free Language. Linguistic Inquiry, 15: 225-234.

Kac, Michael B. To appear. Surface Transitivity and Context-freeness.

Kac, Michael B., Alexis Manaster-Ramer, and William C. Rounds. To appear. Simultaneous-distributive Coordination and Context-freeness. Computational Linguistics.

Langendoen, D. Terence. 1977. On the Inadequacy of Type-3 and Type-2 Grammars for Human Languages. Studies in Descriptive and Historical Linguistics: Festschrift for Winfred P. Lehmann (Paul Hopper, ed.), 159-171. Amsterdam: Benjamins.

Langendoen, D. Terence, and Paul M. Postal. 1984. Comments on Pullum's Criticisms. CL, 8: 187-188.

Levelt, W. J. M. 1974. Formal Grammars in Linguistics and Psycholinguistics. The Hague: Mouton.

Manaster-Ramer, Alexis. 1983. The Soft Formal Underbelly of Theoretical Syntax. CLS, 19: 256-262.

Manaster-Ramer, Alexis. To appear a. Dutch as a Formal Language. Linguistics and Philosophy.

Manaster-Ramer, Alexis. To appear b. Subject-verb Agreement in Respective Coordinations in English.

Manaster-Ramer, Alexis, and Michael B. Kac. 1985. Formal Languages and Linguistic Universals. Paper read at the Milwaukee Symposium on Typology and Universals.

Postal, Paul M. 1964. Limitations of Phrase Structure Grammars. The Structure of Language: Readings in the Philosophy of Language (Jerry A. Fodor and Jerrold J. Katz, eds.), 137-151. Englewood Cliffs, NJ: Prentice-Hall.

Postal, Paul M., and D. Terence Langendoen. 1984. English and the Class of Context-free Languages. CL, 10: 177-181.

Pullum, Geoffrey K., and Gerald Gazdar. 1982. Natural Languages and Context-free Languages. Linguistics and Philosophy, 4: 471-504.

Pullum, Geoffrey K. 1984a. On Two Recent Attempts to Show that English is not a CFL. CL, 10: 182-186.

Pullum, Geoffrey K. 1984b. Syntactic and Semantic Parsability. Proceedings of COLING84, 112-122. Stanford, CA: ACL.

Rounds, William C., Alexis Manaster-Ramer, and Joyce Friedman. To appear. Finding Natural Languages a Home in Formal Language Theory. Mathematics of Language (Alexis Manaster-Ramer, ed.). Amsterdam: John Benjamins.

Shieber, Stuart M. 1985. Evidence against the Context-freeness of Natural Language. Linguistics and Philosophy, 8: 333-343.
