George A. Miller
Department of Psychology
Princeton University
Princeton, NJ 08544, USA
ABSTRACT
How lexical information should be formulated, and how it is organized in computer memory for rapid retrieval, are central questions for computational linguists who want to create systems for language understanding. How lexical knowledge is acquired, and how it is organized in human memory for rapid retrieval during language use, are also central questions for cognitive psychologists. Some examples of psycholinguistic research on the lexical component of language are reviewed with special attention to their implications for the computational problem.
INTRODUCTION
I would like to describe some recent psychological research on the nature and organization of lexical knowledge, yet to introduce it that way, as research on the nature and organization of lexical knowledge, usually leaves the impression that it is abstract and not very practical. But that impression is precisely wrong; the work is very practical and not at all abstract. So I shall take a different tack.
Computer scientists, those in artificial intelligence especially, sometimes introduce their work by emphasizing its potential contribution to an understanding of the human mind. I propose to adopt that strategy in reverse: to introduce work in psychology by emphasizing its potential contribution to the development of information processing and communication systems. We may both be wrong, of course, but at least this strategy indicates a spirit of cooperation.
Let me sketch a general picture of the future. You may not share my expectations, but once you see where I think events are leading, you will understand why I believe that research on the nature and organization of lexical knowledge is worth doing. You may disagree, but at least you will understand.
Some Technological Assumptions
I assume that computers are going to be directly linked by communication networks. Even now, in local area networks, a workstation can access information on any disk connected anywhere in the net. Soon such networks will not be locally restricted. The model that is emerging is of a very large computer whose parts are geographically distributed; large corporations, government agencies, university consortia, groups of scientists, and others who can afford it will be working together in shared information environments. For example, someday the Association for Computational Linguistics will maintain and update an exhaustive knowledge base immediately accessible to all computational linguists.
Our present conception of computers as distinct objects will not fade away (the local workstation seems destined to grow smaller and more powerful every year), but developments in networking will allow users to think of their own workstations not merely as computers, but as windows into a vast information space that they can use however they desire. Most of the parts needed for such a system already exist, and fiber optic technology will soon transmit broadband signals over long distances at affordable costs. Putting the parts together into large, non-local networks is no trivial task, but it will happen.
Computer scientists probably have their own versions of this story, but no special expertise is required to see that rapid progress lies ahead. Moreover, this development will have implications for cognitive psychology. However the technological implementation works out, at least one aspect raises questions of considerable psychological interest: in particular, how will people use it? What kind of man-machine interface will there be?
board," as one futurist has put it (Bolt, 1984), has been a subject for much creative speculation, since the possibilities are numerous and diverse. Although no single interface will be optimal for every use, many users will surely want to interact with the system in something reasonably close to a natural language. Indeed, if the development of information networks is to be financed by those who use them, the interface will have to be as natural as possible, which means that natural language processing will be a part of the interface.
Natural Language Interfaces
Natural language interfaces to large knowledge bases are going to become generally available. The only question is when. How long will it take? Systems already exist that converse and answer questions on restricted topics. How much remains to be done?
Before these systems will be generally useful, three difficult requirements will have to be met. An interface must: (1) have access to a large, general-purpose knowledge base; (2) be able to deal with an enormous vocabulary; (3) be able to reason in ways that human users find familiar. Other features would be highly desirable (e.g., automatic speech recognition, digital processing of images, spatially distributed displays of information), but the three listed above seem critical.
Requirement (1) will be met by the creation of the network. How a user's special interests will shape the organization of his knowledge base and his locally resident programs poses fascinating problems, but I do not understand them well enough to comment. I simply assume that eventually every user can have at his disposal, either locally or remotely, whatever data bases and expert systems he desires.
Requirement (3), the ability to draw inferences as people do, is probably the most difficult. It is not likely to be "solved" by any single insight, but a robust system for revising belief structures will be an essential component of any satisfactory interface. I believe that psychologists and other cognitive scientists have much to contribute to the solution of this problem, but the most promising work to date has been done by computer scientists. Since I have little to say about the problem other than how difficult it is, I will turn instead to requirement (2), which seems more tractable.
Giving a system a large vocabulary poses no difficulty in principle. And everyone who has tried to develop systems to process natural language recognizes the importance of a large vocabulary. Thus, the vocabulary problem looks like a good place to start. The dimensions of the problem are larger than might be expected, however, so there has been some disagreement about the best strategy.
If, in addition to understanding a user's queries, the system is expected to understand all the words in the vast knowledge base to which it will have access, then it should probably have on the order of 250,000 lexical entries: at 1,000 bytes/entry (a modest estimate), that is 250 megabytes. Since standard dictionaries do not contain many of the words that are printed in newspapers (Walker & Amsler, 1984), another 250 megabytes would probably be required for proper nouns. Since I am imagining the future, however, I will assume that such large memories will be available inexpensively at every user's workstation. It is not memory size per se that poses the problem.
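The storage estimate above is simple enough to check mechanically. The following is a minimal sketch, not part of the original argument; the function name and the choice of decimal megabytes (1,000,000 bytes) are assumptions made for illustration.

```python
# Back-of-the-envelope storage estimate for the lexicon described in the text.
ENTRIES = 250_000               # lexical entries (figure from the text)
BYTES_PER_ENTRY = 1_000         # "a modest estimate" (figure from the text)
BYTES_PER_MEGABYTE = 1_000_000  # assumption: decimal megabytes


def lexicon_megabytes(entries: int, bytes_per_entry: int) -> float:
    """Return the storage needed for a lexicon, in megabytes."""
    return entries * bytes_per_entry / BYTES_PER_MEGABYTE


print(lexicon_megabytes(ENTRIES, BYTES_PER_ENTRY))  # 250.0
```

Changing either constant shows how quickly the requirement grows with richer entries or wider coverage.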
The problem is how to get all that information into a computer. Even if you knew how the information should be represented, a good lexical entry would take a long time to write. Writing 250,000 of them is a daunting task.
No doubt there are many exciting projects that I don't happen to know about, but on the basis of my perusal of the easily accessible literature there seem to be two approaches to the vocabulary problem. One uses a machine-readable version of some traditional dictionary and tries to adapt it to the needs of a language processing system. Call this the "book" approach. The other writes lexical entries for some fragment of the English lexicon, but formulates those entries in a notation that is convenient for computational manipulation. Call this the "demo" approach.
The book approach has the advantage of including a large number of words, but the information with each word is difficult to use. The demo approach has the advantage that the information about each word is easy to use, but there are usually not many words. The real problem, therefore, is how to combine these two approaches: how to attain the coverage of a traditional dictionary in a computationally convenient form.
The Book Approach
If you adopt the book approach, what you want to do is translate traditional dictionary entries into a notation that makes evident to the machine the morphological, syntactic, semantic, and pragmatic properties that are needed in order to construct interpretations for sentences. Since there are many entries to be translated, the natural solution is to write a program that will do it automatically. But that is not an easy task.
One reason the translations are difficult is that synonyms are hard to find in a conventional dictionary. Alphabetical ordering is the only way that a lexicographer who works by hand can keep track of his data, but an alphabetical order puts together words with similar spellings and scatters haphazardly words with similar meanings. Consequently, similar senses of different words may be written very differently; they may be written at different times and even by different people. (For example, compare the entries for the modal verbs 'can,' 'must,' and 'will' in the Oxford English Dictionary.) Only a very smart program could appreciate which definitions should be paraphrases of one another.
Another reason that the translations are difficult is that lexicographers are fond of polysemy. It is a mark of careful scholarship that all the senses of a word should be distinguished; the more careful the scholarship, the greater the number of distinctions.
When dictionary entries are taken literally the results for sentence interpretation are ridiculous. Consider an example. Suppose the language processor is asked to provide an interpretation for some simple sentence, say:
"The boy loves his mother."
And imagine it has available the text of Merriam-Webster's Ninth New Collegiate Dictionary. Ignoring sub-senses:
"the" has 4 senses,
"boy" has 3,
"love" has 9 as a noun and 4 as a verb,
"his" has 2 entries, and
"mother" has 4 as a noun, 3 as an adjective, 2 as a verb.
Such numbers invite calculation. If we assume the system has a parser able to do no more than recognize that "love" is a verb and "mother" is a noun, then, on the basis of the literal information in this dictionary, there are 4 x 3 x 4 x 2 x 4 = 384 candidate interpretations. This calculation assumes minimal parsing and maximal reliance on the dictionary. Of course, no self-respecting parser would tolerate so many parallel interpretations of a sentence, but the illustration gives a feeling for how much work a good parser does. And all of it is done in order to "disambiguate" a sentence that nobody who knows English would consider to be the least ambiguous.
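The count of candidate interpretations is just the product of the sense counts, one factor per word. The sketch below reproduces it; the dictionary keys and the function name are illustrative assumptions, and the second figure (no part-of-speech filtering at all) is derived here from the quoted counts rather than stated in the text.

```python
from math import prod

# Sense counts quoted in the text, after a minimal parser has decided
# that "love" is a verb and "mother" is a noun.
with_minimal_parsing = {"the": 4, "boy": 3, "love": 4, "his": 2, "mother": 4}

# Same sentence with no part-of-speech filtering: every listed sense counts.
# love: 9 noun + 4 verb; mother: 4 noun + 3 adjective + 2 verb.
without_parsing = {"the": 4, "boy": 3, "love": 9 + 4, "his": 2, "mother": 4 + 3 + 2}


def candidate_interpretations(sense_counts: dict) -> int:
    """Multiply the number of senses of each word in the sentence."""
    return prod(sense_counts.values())


print(candidate_interpretations(with_minimal_parsing))  # 384
print(candidate_interpretations(without_parsing))       # 2808
```

Even the minimal part-of-speech decision removes most of the candidates, which is one way to see how much work the parser is doing.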
Synonymy and polysemy pose serious problems, even before we raise the question of how to translate conventional definitions into computationally useful notations. Any system will have to cope with synonymy and polysemy, of course, but the book approach to the vocabulary problem seems to raise them in acute forms, while providing little of the information required to resolve them. With sufficient patience this approach will surely lead to a satisfactory solution, but no one should think it will be easy.
The Vocabulary Matrix
As presented so far, synonymy and polysemy appear to be two distinct problems. From another point of view, they are merely two different ways of looking at the same problem.
In essence, a conventional dictionary is simply a mapping of senses onto words, and a mapping can be conveniently represented as a matrix: call it a vocabulary matrix. Imagine a huge matrix with all the words in a language across the top of the matrix, and all the different senses that those words can express down the side. If a particular sense can be expressed by a word, then the cell in that row and column contains an entry; otherwise it contains nothing. The entry itself can provide syntactic information, or examples of usage, or even a picture: whatever the lexicographer deems important enough to include. Table 1 shows a fragment of a vocabulary matrix.
Table 1. Fragment of a Vocabulary Matrix
Columns represent modal verbs; rows represent modal senses; 'E' in a cell means the word in that column can express the sense in that row.
WORDS
SENSES can may must will
be obliged to E
certain to be E
be necessary E
expected to be E E
Several comments should be made about the vocabulary matrix.
First, it should be apparent that any conventional dictionary can be represented as a vocabulary matrix: simply add a column to the matrix for every word, and add a row to the matrix for every sense of every word that is given in the printed dictionary. (A lexical matrix can be viewed as an impractical way of printing a dictionary on a single, very large sheet of paper.)
Second, entering such a matrix consists of searching down some column or across some row. So a vocabulary matrix can be entered either with a word or with a sense. Thus, one difference between conventional dictionaries, which can be entered only with a word, and the dictionary in our mind, which can be entered with either words or senses, disappears when dictionaries are represented in this more abstract form.
Third, if you enter the matrix with a sense and search along a row, you find all the words that express that sense. When different words express the same sense, we say they are synonymous. On the other hand, if you enter the matrix with a word and look down that column, you find all the different senses that that word can express. When one word can express two or more senses, we say that it is ambiguous, or polysemous. Thus, the two great complications of lexical knowledge, synonymy and polysemy, are seen as complementary aspects of a single abstract structure.
Finally, since the vocabulary matrix serves only to represent the mapping between the two domains, it is free to expand as new words, or new senses for familiar words, are added. Of course, the number of columns is relatively fixed by the size of the vocabulary, so the major degrees of freedom are in deciding what the senses are and how to represent them.
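One way to make the vocabulary matrix concrete is as a sparse mapping that is indexed both by word and by sense, so that it can be entered from either side. The sketch below is only an illustration of that idea: the class name, method names, and the toy entries are all assumptions introduced here, not data from any dictionary.

```python
from collections import defaultdict


class VocabularyMatrix:
    """Sparse sense-by-word matrix; only non-empty cells are stored."""

    def __init__(self):
        self._cells = {}                     # (sense, word) -> entry
        self._by_word = defaultdict(set)     # column index: word -> senses
        self._by_sense = defaultdict(set)    # row index: sense -> words

    def add(self, sense, word, entry="E"):
        """Fill the cell in the given row (sense) and column (word)."""
        self._cells[(sense, word)] = entry
        self._by_word[word].add(sense)
        self._by_sense[sense].add(word)

    def senses_of(self, word):
        """Enter with a word and look down its column."""
        return sorted(self._by_word.get(word, ()))

    def words_for(self, sense):
        """Enter with a sense and search along its row."""
        return sorted(self._by_sense.get(sense, ()))

    def is_polysemous(self, word):
        """A word is polysemous if its column has two or more entries."""
        return len(self._by_word.get(word, ())) >= 2

    def synonyms_of(self, word):
        """Other words sharing at least one row with this word."""
        found = set()
        for sense in self._by_word.get(word, ()):
            found |= self._by_sense[sense]
        found.discard(word)
        return sorted(found)


# Toy entries, invented for illustration only.
matrix = VocabularyMatrix()
matrix.add("upholstered seat for several people", "couch")
matrix.add("upholstered seat for several people", "sofa")
matrix.add("financial institution", "bank")
matrix.add("sloping land beside a river", "bank")

print(matrix.words_for("upholstered seat for several people"))  # ['couch', 'sofa']
print(matrix.senses_of("bank"))
print(matrix.is_polysemous("bank"))   # True
print(matrix.synonyms_of("couch"))    # ['sofa']
```

Synonymy and polysemy fall out of the same structure: one is a row with several entries, the other a column with several entries. Adding a new word or a new sense only adds cells, which matches the remark that the matrix is free to expand.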
The Demo Approach
When the question is raised of what a computationally useful lexical entry should look like, it is time to shift from the book approach to the demo approach, where serious attempts have been made to establish a conceptual notation in which semantic interpretations can be expressed for computational use.
By "the demo approach" I mean the strategy of building a system to process language that is confined to some well-defined content area. Since language processing is a large and difficult
trying out one's ideas in a small way to see whether they work. If the ideas don't work in a limited domain, they certainly won't work in the unlimited domain of general discourse. The result of this approach has been a series of progressively more ambitious demonstration programs.
Among those who take this approach, two extremes can be distinguished. On the one hand are those who feel that syntactic analysis is essential and should be carried, if not to completion, then as far as possible before resorting to semantic information. On the other hand are those who prefer semantics-based processing and consider syntactic criteria only when they get in trouble. The difference is largely one of emphasis, since neither extreme seems willing to rely totally on one or the other kind of information, and most workers would probably locate themselves somewhere in the middle. Since I am concerned here with the lexical aspects of language comprehension, however, I shall look primarily at semantics-based processing.
Vocabulary Size
Most of these demos have small vocabularies. It is surprising how much you can do with 1,500 well chosen words; a demo with more than 5,000 words would be evidence of manic energy on the part of its creator. A few thousand lexical entries have been all that was required in order to test the ideas that the designer was interested in.
The problem, of course, is that writing dictionary definitions is hard work, and writing them in LISP doesn't make it any easier. If you are satisfied with definitions that take five lines of code, then, obviously, you can build a much larger dictionary than if you try to cram into an entry all the different senses that are found in conventional dictionaries. But even with short definitions, a great many have to be written.
If you want the language processor to have as large a vocabulary as the average user, you will have to give it at least 100,000 words. One way to get a feeling for how many words that is, is to translate it into a rate of acquisition. Several years ago I looked at Mildred Templin's (1953) data that way. Templin measured the vocabulary size of children of average intelligence at 6, 7, and 8 years of age. In two years they acquired 28,300 - 13,000 = 15,300 words, which averages out to about 21 words per day (Miller, 1977).
Most people, when they hear that result, confess that they had no idea that children are learning new words at such a rapid rate. But the arithmetic holds just as well for computers as for children. If you want the language processor to have a vocabulary of 100,000 words, and if you are willing to spend ten years putting definitions into it, then you will have to put in more than 27 new definitions every day.
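Both rates come from the same division: words acquired over days elapsed. A minimal sketch follows, assuming 365-day years and work on every calendar day; the function and constant names are illustrative.

```python
DAYS_PER_YEAR = 365  # assumption: every calendar day counts


def words_per_day(words: int, years: float) -> float:
    """Average number of words acquired per day over the given span."""
    return words / (years * DAYS_PER_YEAR)


# Children: 28,300 - 13,000 = 15,300 words in two years.
child_rate = words_per_day(28_300 - 13_000, 2)

# Language processor: 100,000 definitions entered over ten years.
machine_rate = words_per_day(100_000, 10)

print(round(child_rate, 1))    # 21.0
print(round(machine_rate, 1))  # 27.4
```

Under a five-day working week the required rate for the language processor would be correspondingly higher.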
How far from this goal are today's demos? The answer should be simple, but it's not. It is hard to tell exactly how many words these systems can handle. Definitions are usually written in terms of a relatively small set of semantic primitives, and the inheritance of properties is assumed wherever possible. The goal, of course, is to create an unambiguous semantic representation that can be used as input to an inferencing system, so the form of these representations is much more important than their variety, at least in the initial experiments. In the hands of a clever programmer, a few hundred semantic primitives can really do an enormous amount of work.
Although it is often assumed that the fewer semantic primitives a system requires, the better it is, in fact there seems to be little advantage to keeping the number small. When the number of primitives is small, definitions become long permutations of that small number of different atoms (Miller, 1978). When the set of primitives gets too small, definitions become like machine code: the computer loves them, but people find them hard to read or write.
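The trade-off between the number of primitives and the length of definitions can be illustrated with a crude counting argument. The sketch below is an assumption introduced here, not the analysis in Miller (1978): it treats a definition as an ordered sequence of primitives and asks how long the sequences must be before there are enough distinct ones to give every concept its own definition.

```python
def min_definition_length(num_concepts: int, num_primitives: int) -> int:
    """Smallest length L such that num_primitives ** L >= num_concepts.

    Simplifying assumption: a definition is an ordered sequence of
    primitives, and any sequence of that length is a usable definition.
    """
    if num_primitives < 2:
        raise ValueError("need at least two primitives")
    length, capacity = 1, num_primitives
    while capacity < num_concepts:
        capacity *= num_primitives
        length += 1
    return length


for k in (2, 10, 100, 850, 2_000):
    print(k, min_definition_length(100_000, k))
# 2 17
# 10 5
# 100 3
# 850 2
# 2000 2
```

Real definitions are far more redundant than this lower bound suggests, but the direction of the effect is the same: shrinking the primitive set lengthens every definition, which is the machine-code effect described above.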
Combining Book and Demo
How large a set of semantic primitives do we need? It is claimed that Basic English can express any idea with only 850 words, but that really cuts the vocabulary to the bone. The Longman Dictionary of Contemporary English, which is very popular with people learning English as a second language, uses a constrained vocabulary of about 2,000 words (plus some specialized terms) to write its definitions.
Using the Longman dictionary as a guide, Richard Cullingford and I tried to estimate how
ing a computationally useful lexicon. Our initial thought was to write LISP programs for 2,000 basic terms, then use Cullingford's language processor (Cullingford, 1985) to translate all of the definitions into LISP. We quickly
are polysemous; different senses are used in different definitions. As a rough estimate, we thought 12,000 basic concepts might suffice.
An examination of the Longman definitions also indicated that a great deal of information might have to be added to the translated definitions. Many of the simpler conceptual dependencies (information required for disambiguation, as well as for drawing inferences; Schank, 1975) have to be included in the definitions. Each translated definition would have to be checked to see that all sense relations, predicate-argument structures, and selectional restrictions were explicit and correct, and a wide variety of pragmatic facts (e.g., that "anyhow" in initial position signals a change of topic) would probably have to be added.
We have not undertaken this task. Not only would writing 12,000 definitions (and checking out and supplementing 50,000 more) require a major commitment of time and energy, but we do not have Longman's permission to use their dictionary this way. I report it, not as a project currently under way, but simply as one way to think about the magnitude of the vocabulary problem.
So the situation is roughly this: In order to have natural language interfaces to the marvellous information sources that will soon be available, one thing we must do is beef up the vocabularies that natural language processors can handle. That will not be an easy thing to accomplish. Although there is no principled reason why natural language processors should not have vocabularies large enough to deal with any domain of topics, we are presently far from having such vocabularies on line.
THE SEARCH PROBLEM
As we look ahead to having large vocabularies, we must begin to think more carefully about the search problem.
In general, the larger a data base is, the longer it takes to locate something in it. How a large vocabulary can be organized in human memory to permit retrieval of word meanings at conversational rates is a fascinating question, especially since retrieval from the subjective lexicon does not seem to get slower as a person's vocabulary gets larger. The technical issues involved in achieving such performance with silicon
only well enough to recognize that there are many possibilities and no easy answers. Instead of speculating about the computer, therefore, I will take a moment to marvel at how well people manage their large vocabularies.
In the past fifteen years or so a number of cognitive psychologists have been sufficiently impressed by people's lexical skills to design experiments that they hoped would reveal how people do it. This is not the time to review all that research (see Simpson, 1984), but some of the questions that have been raised merit attention.
Psychologists have considered two kinds of theories of lexical access, known as search theories and threshold theories.
Search theories assume that a passive trace is stored in the mental lexicon and that lexical access consists of matching the stimulus to its memory representation. Preliminary analysis of the stimulus is said to generate a set of candidates, which is searched serially until a match is found.
Threshold theories claim that each sense of every word is an independent detector waiting for its features to occur. When the feature count for any sense gets above some threshold, that sense becomes conscious.
Both kinds of theories can account for most of the experimental data, but not all of it, which is unfortunate, since a clear decision in favor of one or the other might help to resolve the question of whether lexical access involves a serial processor with search and retrieval, or a parallel processor with simple activation. Since the brain apparently uses slow and noisy components, something searching in parallel seems plausible, but such devices are not yet well understood.
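The contrast between the two kinds of theory can be caricatured in a few lines. The sketch below is a toy illustration under strong assumptions, not a model from the psychological literature: "features" are simply the letters of a word, the candidate set is given in advance, and all names are hypothetical.

```python
def serial_search(stimulus, candidates):
    """Search-theory caricature: compare stored traces one at a time.

    Returns the matching trace (or None) and the number of comparisons made.
    """
    for comparisons, trace in enumerate(candidates, start=1):
        if trace == stimulus:
            return trace, comparisons
    return None, len(candidates)


def threshold_access(features, detectors, threshold):
    """Threshold-theory caricature: each detector counts its own features
    independently; every detector whose count exceeds the threshold fires."""
    return sorted(
        word
        for word, wanted in detectors.items()
        if len(wanted & features) > threshold
    )


# Toy lexicon: letters stand in for perceptual features (an assumption).
lexicon = ["nurse", "purse", "north"]
detectors = {word: set(word) for word in lexicon}

print(serial_search("purse", lexicon))                  # ('purse', 2)
print(threshold_access(set("purse"), detectors, 4))     # ['purse']
print(threshold_access(set("purse"), detectors, 3))     # ['nurse', 'purse']
```

In the serial version the cost depends on where the match sits in the candidate list; in the threshold version every detector is evaluated independently, so on parallel hardware the cost would not depend on list position, and lowering the threshold lets near neighbors fire as well.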
Accessing Ambiguous Words
Some of the most interesting psychological research on lexical access concerns how people get at the meanings of polysemous words. These studies exploit a phenomenon called priming: when a word in a given lexical domain occurs, other words in that domain become more accessible.
For example, a person is asked to say, as quickly as possible, whether a sequence of letters spells an English word. If the word DOCTOR has just been presented, then NURSE will be recognized more rapidly than if the preceding word had been unrelated, like BUTTER (Meyer & Schvaneveldt, 1971; Becker, 1980). The recognition of DOCTOR is said to prime the recognition of NURSE.
This lexical decision task can be used to study polysemy if the priming word is ambiguous, and if it is followed by probe words appropriate to its different senses.
For example, the ambiguous prime PALM might be followed on some occasions by HAND and on other occasions by TREE. The question is whether all senses of a polysemous word are activated simultaneously, or whether context can facilitate one meaning and inhibit all others. Three explanations of the results of these experiments are presently in competition.
Context-dependent access. Only the sense that is appropriate to the context is retrieved or activated.
Ordered access. Search starts with the most frequent sense and continues serially until a sense is found that satisfies the context.
Exhaustive access. Everything is activated in parallel at the same time, then context selects the most appropriate sense.
At present, exhaustive access seems to be the favorite. According to that theory, disambiguation is a post-access process; the access process itself is a cognitive "module," automatic and insulated from contextual influence. My own suspicion is that none of these theories is exactly right, and that Simpson (1984) is probably closer to the truth when he suggests that multiple meanings are accessed, but that dominant meanings appear first and subordinate meanings come in more slowly and then disappear.
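The three competing accounts differ mainly in which senses are touched and when context is consulted. The sketch below makes that difference explicit for a two-sense word; the sense labels, the relative frequencies, and the rule that a sense "fits" when its label equals the context label are all simplifying assumptions introduced for illustration.

```python
# Hypothetical sense inventory: (sense label, relative frequency).
# The frequencies are invented for illustration only.
SENSES = {"palm": [("hand", 0.7), ("tree", 0.3)]}


def context_dependent_access(word, context):
    """Only the sense that fits the context is ever activated."""
    return [sense for sense, _ in SENSES[word] if sense == context]


def ordered_access(word, context):
    """Senses are examined serially, most frequent first, until one fits."""
    examined = []
    by_frequency = sorted(SENSES[word], key=lambda pair: pair[1], reverse=True)
    for sense, _ in by_frequency:
        examined.append(sense)
        if sense == context:
            break
    return examined


def exhaustive_access(word, context):
    """All senses are activated at once; context selects afterwards."""
    activated = [sense for sense, _ in SENSES[word]]
    selected = [sense for sense in activated if sense == context]
    return activated, selected


print(context_dependent_access("palm", "tree"))  # ['tree']
print(ordered_access("palm", "tree"))            # ['hand', 'tree']
print(exhaustive_access("palm", "tree"))         # (['hand', 'tree'], ['tree'])
```

The priming experiments try to distinguish these cases by probing which senses are momentarily available. A graded account of the kind attributed to Simpson (1984) would need activation levels that change over time, which this sketch does not attempt.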
Psychological research on lexical access is continuing; the complete story is not yet ready to be told. One aspect of the work is so obvious, however, that its importance tends to be overlooked.
Semantic Fields
The priming phenomenon presupposes an organization of lexical knowledge into patterns of conceptually related words, patterns that some linguists have called semantic fields. Apparently a semantic field can fluctuate in accessibility as a whole.
of semantic fields as evidence in favor of theories of semantic decomposition (Miller & Johnson-Laird, 1976). The idea is that all the words in a semantic field share some primitive semantic concept, and it is the activation or suppression of that shared concept that affects the accessibility of the words sharing it.
Scribing some research we have been doing on vocabulary growth in school children. The results indicate that we need better ways to teach new words; with that need in mind I will return to the question of what we can reasonably expect from natural language interfaces.
Nominal semantic fields are frequently organized hierarchically and so are relatively simple to appreciate. Verbal semantic fields, however, tend to be more complex. For example, all the motion verbs ("move," "come," "go," "bring," "rise," "fall," "walk," "run," "turn," and so on) share a semantic primitive that might be glossed as "change location as a function of time." In a similar manner, verbs of possession ("possess," "have," "own," "borrow," "buy," "sell," "find," and so on) share a semantic primitive that has to do with rights of ownership.
Not all semantic primes nucleate semantic fields, however. There is a causative primitive that differentiates "rise" and "raise," "fall" and "fell," "die" and "kill," and so on, yet the causative verbs "raise," "fell," "kill" do not form a causative semantic field. Johnson-Laird and I distinguished two classes of semantic primitives: those (like motion) around which a semantic field can form, and those (like causation) used to differentiate concepts within a given field.
Although the nature of semantic primitives is a matter of considerable interest to anyone who proposes a semantic notation for writing the definitions that a language processing system will use, they have received relatively little attention from psychologists. Experimental psychologists have a strong tendency to concentrate on questions of function and process at the expense of questions of content. Perhaps their attempts to understand the processes of disambiguation will stimulate greater interest in these structural questions.
THE PROBLEM OF CONTEXT
The reason that lexical polysemy causes so little actual ambiguity is that, in actual use, context provides information that can be used to select the intended sense. Although contextual disambiguation is simple enough when people do it, it is not easy for a computer to do, even when the text is semantically well-formed. With semantically ill-formed input the problem is much worse.
Children's Use of Dictionaries
We have been looking at what happens when teachers send children to the dictionary to "look up a word and write a sentence using it." The results can be amusing: for example, Deese (1967) has reported on a 7th-grade teacher who told her class to look up "chaste" and use it in a sentence. Their sentences included: "The milk was chaste," "The plates were still chaste after much use," and "The amoeba is a chaste animal."
In order to understand what they were doing, you have to see the dictionary entry for "chaste":
CHASTE: 1. innocent of unlawful sexual intercourse 2. celibate 3. pure in thought and act, modest 4. severely simple in design or execution, austere
As Deese noted, each of the children's sentences is compatible with information provided by the dictionary that they had been told to consult.
You might think that Deese's observation was merely an amusing reflection of some quirk in the dictionary entry for "chaste," but that assumption would be quite wrong. Patti Gildea and I (Miller & Gildea, 1985) have confirmed Deese's observation many times over. We asked 5th and 6th grade children to look words up and to write sentences using them. As of this writing, our 10- and 11-year-old friends have written a few thousand sentences for us, and we are still collecting them.
Our goal is to discover which kinds of mistakes are most frequent. In order to do this, we evaluate each sentence as we enter it into a data management system and, if something is wrong, we describe the mistake. By collecting our descriptions, we have made a first, tentative classification.
This project is still going on, so I can give only a preliminary report based on about 20% of our data. So far we have analyzed 457 sentences incorporating 22 target words: 12 are relatively common words that most of the children knew, and 10 are relatively rare words with which they were unfamiliar. The common words
words introduced by authors of 4th-grade basal readers; the rare words were selected from those introduced in 12th-grade readers (Taylor, Frackenpohl, & White, 1979). It is convenient to refer to them as the 4th-grade words and the 12th-grade words, respectively.
Errors were relatively frequent. Of the sentences classified so far, only 21% of those using 4th-grade words were sufficiently odd or unacceptable to indicate that the author did not have a good grasp of the meaning and use of the word, but 63% of the sentences using 12th-grade words were judged to be odd. Thus, the majority of the errors occurred with the 12th-grade words.
Table 2 shows our current classification. Note that the categories are not mutually exclusive: some ingenious youngsters are able to make two or even three mistakes in a single sentence.
Table 2. Classification of Sentences
Type of Sentence    4th-grade    12th-grade
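
Because the categories are not mutually exclusive, counts per category can add up to more than the number of odd sentences. The following Python sketch illustrates that bookkeeping only. The records, the placeholder words, and the function name `tally` are hypothetical; they are not the study's data or its data management system. Only the category names are taken from the text.

```python
from collections import Counter

# Hypothetical records: (target word, grade level, set of error labels).
# Words and label assignments are invented for illustration only.
records = [
    ("word_a", 12, {"Selectional error"}),
    ("word_a", 12, {"Wrong part of speech"}),
    ("word_b", 12, {"Wrong preposition"}),
    ("word_b", 12, {"Inappropriate object", "Wrong preposition"}),
    ("word_c", 4, set()),  # acceptable sentence: no labels
]


def tally(records):
    """Return (count of labels per category, number of odd sentences)."""
    by_label = Counter()
    odd = 0
    for _word, _grade, labels in records:
        by_label.update(labels)  # one sentence may add several labels
        if labels:
            odd += 1
    return by_label, odd


by_label, odd = tally(records)
total_labels = sum(by_label.values())
print(odd, total_labels, dict(by_label))
```

In this toy input four sentences are odd but five labels are recorded, which is why column totals in a table of this kind need not match the number of sentences judged odd.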
Most of the descriptive phrases in Table 2 should be self-explanatory, but some examples may help. Skip the selectional errors; I shall say more about them in a moment.
Consider "Wrong part of speech": a student wrote "my hobby is listening to Duran Duran records, I have obtained an ACCRUE for it," thus using a verb as a noun. As an example of "Wrong preposition," consider the student who wrote: "Be very METICULOUS on your work." An example of "Inappropriate topic" is: "The train was TRANSITORY." An example of "Inappropriate object" is: "I was METICULOUS about falling off the cliff." Examples of "Used rhyming word" are "Did it ever ACCRUE to you that Maria T. always marks with a special pencil on my face?", "Did you evict that old TENET?", and "The man had a knee REPARATION."
Other categories were even less frequent, so return now to the most common type of mistake, the one labelled "Selectional error."
Violations of Selectional Preferences
The sentences that Deese reported illustrate selectional errors. Further examples can be taken from our data: "We had a branch ACCRUE on our plant," "I bought a battery that was TRANSITORY," "The rocket REPUDIATE off into the sky," "John is always so TENET to me."
It is unfair to call these sentences "errors" and to laugh at the children's mistakes. The students were doing their best to use the dictionary. If there was any mistake, it was made by adults who misunderstood the nature of the task that they had assigned.
Take the "accrue" sentence, for example. The definition that the students saw was:
ACCRUE: come as a growth or result. "Interest will accrue to you every year from money left in a savings bank. Ability to think will accrue to you from good habits of study."
We assume that the student read this definition looking for something she understood and found "come as a growth." She composed a sentence around this phrase: "We had a branch COME AS A GROWTH on our plant," then substituted "accrue" for it. This strategy seems to account for the other examples. A familiar word is found in the definition, a sentence is composed around it, then the unfamiliar word is substituted for the familiar word.
Some further evidence supports the claim that something like this strategy is being used. One intriguing clue is that sometimes the final substitution is not made: the written sentence contains the word selected from the definition but not the word that it defined. And, since substitution is not a simple mental operation for children, sometimes the selected word or phrase from the definition is actually written in the margin of the paper, alongside the requested sentence.
These are called selectional errors because they violate selectional preferences. For example, the girl who discovered that "stimulate" means "stir up" and so wrote, "Mrs. Jones stimulated the cake," violated the selectional preference that "stimulate" should take an animate object.
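
The substitution strategy described above can be written out as a small procedure. The Python sketch below is an illustration under stated assumptions, not a model proposed in the study: the function name `substitution_strategy`, the list of phrases the child is assumed to know, and the sentence template are all hypothetical. Only the "accrue" definition and the resulting sentence come from the text.

```python
def substitution_strategy(target, definition, known_phrases, compose):
    """Sketch of the three steps: find a familiar phrase in the definition,
    compose a sentence around it, then substitute the target word for it.

    Returns (draft, final); both are None if no known phrase is found.
    """
    lowered = definition.lower()
    for phrase in known_phrases:  # first known phrase that occurs in the text
        if phrase in lowered:
            draft = compose(phrase)                          # step 2
            final = draft.replace(phrase, target.upper())    # step 3
            return draft, final
    return None, None


# Hypothetical inputs: what the child knows and the sentence she has in mind.
draft, final = substitution_strategy(
    target="accrue",
    definition="come as a growth or result",
    known_phrases=["come as a growth"],
    compose=lambda phrase: f"We had a branch {phrase} on our plant.",
)
print(draft)
print(final)
```

The `draft` value corresponds to the case in which the final substitution is not made: the written sentence contains the phrase from the definition but not the word being defined.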
One reason these errors are so frequent is that dictionaries do not provide much information about selectional preferences. We think we know how to remedy that deficiency, but that is not what I want to discuss here. For the moment it suffices if you recognize that we have a plentiful supply of sentences containing violations of selectional preferences, and that the sentences are of some educational significance.
Intelligent Tutoring?
Now let me pose the following question. Could we use these sentences as a "bug catalog" in an intelligent tutoring system?
At the moment, intelligent tutoring systems (Sleeman & Brown, 1982) use many menus to obtain the student's answers to questions, and some people feel that this is actually an advantage. But I suspect that if we had a good language interface, one that understood natural language responses, it would soon replace the menus.
In any case, imagine an intelligent tutoring system that can handle natural language input. Imagine that the tutor asked children to write sentences containing words that they had just seen defined, recognized when a selectional error had occurred, then undertook to explain the mistake.
What would the intelligent tutor have to know in order to detect and correct a selectional error? Otherwise said, what more would it have to know than any language comprehender has to know?
The question is not rhetorical. I ask it because I would really like to know the answer. In my view, it poses something of a dilemma. The problem, as Yorick Wilks (1978) has pointed out, is that any simple rules of co-occurrence that we are likely to propose will, in real discourse, be violated as often as they are observed. (Not only do people often say one thing and mean another, but the prevalence of figurative and idiomatic language is consistently underestimated by theorists.) If we give the intelligent tutor strict rules in order to detect selectional errors like "Our car depletes gasoline," will it not also treat "Our car drinks gasoline" as an error? On the other hand, if the tutor accepted the latter, would it not also accept the former?
An even simpler dilemma, one often noted, is that a system that blocks such phrases as "colorless green ideas" will
also block such sentences as "There are no colorless green ideas." If the tutor teaches children to avoid "stimulate the cake," will it also teach them to avoid "you can't stimulate a cake"?
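
Both dilemmas can be made concrete with a toy checker that applies strict co-occurrence rules. The Python sketch below is an assumption-laden illustration, not a proposed tutor: the clause representation (already parsed into subject, verb, and object), the noun classes, and the rule table are all hypothetical. It shows only that a checker of this kind flags the child's error, the figurative sentence, and the perfectly sensible negation alike.

```python
from typing import NamedTuple


class Clause(NamedTuple):
    """A pre-parsed clause; parsing itself is outside this sketch."""
    subject: str
    verb: str
    obj: str
    negated: bool = False


# Hypothetical semantic classes for a handful of nouns.
NOUN_CLASS = {
    "Mrs. Jones": "animate", "you": "animate", "class": "animate",
    "cake": "inanimate", "car": "inanimate", "gasoline": "inanimate",
}

# Hypothetical strict rules: verb -> (required subject class, required
# object class). None means the position is unconstrained.
STRICT_RULES = {
    "stimulate": (None, "animate"),
    "drink": ("animate", None),
}


def violates(clause):
    """True if the clause breaks a strict rule. Negation and figurative
    use are deliberately ignored, which is the source of the dilemma."""
    subj_req, obj_req = STRICT_RULES.get(clause.verb, (None, None))
    if subj_req and NOUN_CLASS.get(clause.subject) != subj_req:
        return True
    if obj_req and NOUN_CLASS.get(clause.obj) != obj_req:
        return True
    return False


error = Clause("Mrs. Jones", "stimulate", "cake")
sensible_negation = Clause("you", "stimulate", "cake", negated=True)
metaphor = Clause("car", "drink", "gasoline")
acceptable = Clause("Mrs. Jones", "stimulate", "class")

for c in (error, sensible_negation, metaphor, acceptable):
    print(c, violates(c))
```

Loosening the rules so that the metaphor or the negation passes would, in this sketch, also let the original error through; nothing in the rule table distinguishes them.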
When subtle semantic distinctions are at issue, it is customary to remark that a satisfactory language understanding system will have to know a great deal more than the linguistic values of words. It will have to know a great deal about the world, and about things that people presuppose without reflection. Such remarks are probably true, but they offer little guidance in getting the job done.
Since I have no better answer, I will simply agree that the lexical information available to any satisfactory language understanding system will have to be closely coordinated with the system's general information about the world. To pursue that idea would, of course, go beyond the lexical limits I have imposed here, but it does suggest that we will have to write our dictionary not once, but many times until we get it right.
So, while there is no principled obstacle to having large vocabularies in our natural language interfaces, there are still many problems to be solved. There is work here for everyone (linguists, philosophers, and psychologists, as well as computer scientists), and it is not abstract or impractical work. The answers we provide will shape important aspects of the information systems of the future.
References
Amsler, R. A. (1984). Machine-readable dictionaries. Annual Review of Information Science and Technology, 19, 161-209.
Becker, C. A. (1980). Semantic context effects in visual word recognition: An analysis of semantic strategies. Memory & Cognition, 8, 493-512.
Bolt, R. A. (1984). The Human Interface: Where People and Computers Meet. Belmont, Calif.: Lifetime Learning.
Cullingford, R. E. (1985). Natural Language Processing: A Knowledge Engineering Approach. (Manuscript)
Deese, J. (1967). Meaning and change of meaning. American Psychologist, 22, 641-651.
Meyer, D. E., & Schvaneveldt, R. W. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227-234.
Miller, G. A. (1977). Spontaneous Apprentices: Children and Language. New York: Seabury Press.
Miller, G. A. (1978). Semantic relations among words. In M. Halle, J. Bresnan, & G. A. Miller (eds.), Linguistic Theory and Psychological Reality. Cambridge, Mass.: MIT Press.
Miller, G. A., & Gildea, P. M. (1985). How to misread a dictionary. AILA Bulletin (in press).
Miller, G. A., & Johnson-Laird, P. N. (1976). Language and Perception. Cambridge, Mass.: Harvard University Press.
Procter, P. (ed.) (1978). Longman Dictionary of Contemporary English. Harlow, Essex: Longman.
Schank, R. C. (1975). Conceptual Information Processing. Amsterdam: North-Holland.
Simpson, G. B. (1984). Lexical ambiguity and its role in models of word recognition. Psychological Bulletin, 96, 316-340.
Sleeman, D., & Brown, J. S. (eds.) (1982). Intelligent Tutoring Systems. New York: Academic Press.
Taylor, S. E., Frackenpohl, H., & White, C. E. (1979). A revised core vocabulary. In EDL Core Vocabularies in Reading, Mathematics, Science, and Social Studies. New York: McGraw-Hill.
Templin, M. C. (1957). Certain Language Skills in Children: Their Development and Interrelationships. Minneapolis: University of Minnesota Press.
Walker, D. E., & Amsler, R. A. (1984). The use of machine-readable dictionaries in sublanguage analysis. In R. I. Kittredge (ed.), Workshop on Sublanguage Analysis. (Available from the authors at Bell Communications Research, Morristown, NJ 07960.)
Wilks, Y. A. (1978). Making preferences more active. Artificial Intelligence, 11, 197-223.