Benny Brodda
Inst. of Linguistics
University of Stockholm
S-106 91 Stockholm, SWEDEN
ABSTRACT

Heuristic parsing is the art of doing parsing in a haphazard and seemingly careless manner but in such a way that the outcome is still "good", at least from a statistical point of view, or, hopefully, even from a more absolute point of view. The idea is to find strategic shortcuts derived from guesses about the structure of a sentence based on scanty observations of linguistic units in the sentence. If the guess comes out right much parsing time can be saved, and if it does not, many subobservations may still be valid for revised guesses. In the (very preliminary) experiment reported here the main idea is to make use of (combinations of) surface phenomena as much as possible as the base for the prediction of the structure as a whole. In the parser to be developed along the lines sketched in this report main stress is put on arriving at independently working, parallel recognition procedures.
The work reported here is aimed both at simulating certain aspects of human language perception and at arriving at effective algorithms for actual parsing of running text. There is, indeed, a great need for such fast algorithms, e.g. for the analysis of the literally millions of words of running text that already today comprise the databases in various large information retrieval systems, and which can be expected to expand several orders of magnitude both in importance and in size in the foreseeable future.
I BACKGROUND
The general idea behind the system for heuristic parsing now being developed at our group in Stockholm dates more than 15 years back, when I was making an investigation (together with Hans Karlgren, Stockholm) of the possibilities of using computers for information retrieval purposes for the Swedish Governmental Board for Rationalization (Statskontoret). In the course of this investigation we performed some psycholinguistic experiments aimed at finding out to what extent surface markers, such as endings, prepositions, conjunctions and other (bound) elements from typically closed categories of linguistic units, could serve as a base for a syntactic analysis of sentences. We sampled a couple of texts more or less at random and prepared them in such a way that stems of nouns, adjectives and (main) verbs - these categories being thought of as the main carriers of semantic information - were substituted for by a mere "-", whereas other formatives were left in their original shape and place. These transformed texts were presented to subjects who were asked to fill in the gaps in such a way that the texts thus obtained were to be both syntactically correct and reasonably coherent.
The result of the experiment was rather astonishing. It turned out that not only were the syntactic structures mainly restored; in some few cases the original content was also reestablished, almost word by word. (It was beyond any possibility that the subjects could have had access to the original text.) Even in those cases when the text itself was not restored to this remarkable extent, the stylistic value of the various texts was almost invariably reestablished; an originally lively, narrative story came out as a lively, narrative story, and a piece of rather dull, factual text (from a school textbook on sociology) invariably came out as dull, factual prose.

This experiment showed quite clearly that, at least for Swedish, the information contained in the combinations of surface markers to a remarkably high degree reflects the syntactic structure of the original text; in almost all cases also the stylistic value and in some few cases even the semantic content was kept. (The extent to which this is true is probably language dependent; Swedish is rather rich in morphology, and this property is certainly a contributing factor for an experiment of this type to come out successful to the extent it actually did.)
This type of experiment has since then been repeated many times by many scholars; in fact, it is one of the standard ways to demonstrate the concept of redundancy in texts. But there are several other important conclusions one could draw from this type of experiment. First of all, of course, the obvious conclusion that surface signals do carry a lot of information about the structure of sentences, probably much more than one has been inclined to think, and, consequently, it could be worthwhile to try to capture that information in some kind of automatic analysis system. This is the practical side of it. But there is more to it. One must ask the question why a language like Swedish is like this. What are the theoretical implications?
Much interest has been devoted in later years to theories (and speculations) about human perception [...] that one speculates too much if one assumes that surface markers of the type that appeared in the described experiment together constitute important clues concerning the gross syntactic structure of sentences (or utterances), clues that are probably much less consciously perceived than, e.g., the actual words in the sentences/utterances. To the extent that such clues are actually perceived they are obviously perceived simultaneously with, i.e. in parallel with, other units (words, for instance).
The above way of looking upon perception as a set of independently operating processes is, of course, more or less generally accepted nowadays (cf., e.g., Lindsay-Norman 1977), and it is also generally accepted in computational linguistics that any program that aims at simulating perception in one way or other must have features that simulate (or, even better, actually perform) parallel processing, and the analysis system to be described below puts much emphasis on exactly this feature.
Another common saying nowadays when discussing parsing techniques is that one should try to incorporate "heuristic devices" (cf., e.g., the many subreports related to the big ARPA project concerning Speech Recognition and Understanding 1970-76), although there does not seem to exist a very precise consensus of what exactly that would mean. (In mathematics the term has been traditionally used to refer to informal reasoning, especially when used in classroom situations. In a famous study the Hungarian mathematician Polya (1945) put forth the thesis that heuristics is one of the most important psychological driving mechanisms behind mathematical - or scientific - progress. In AI literature it is often used to refer to shortcut search methods in semantic networks/spaces; cf. Lenat, 1982.)
One reason for trying to adopt some kind of heuristic device in the analysis procedures is that one for mathematical reasons knows that ordinary, "careful", parsing algorithms inherently seem to refuse to work in real time (i.e. in linear time), whereas human beings, on the whole, seem to be able to do exactly that (i.e. perceive sentences or utterances simultaneously with their production). Parallel processing may partly be an answer to that dilemma, but still, any process that claims to actually simulate some part of human perception must in some way or other simulate the remarkable abilities human beings have in grasping complex patterns ("gestalts") seemingly in one single operation.
Ordinary, careful, parsing algorithms are often organized according to some general principle such as "top-down", "bottom-to-top", "breadth first", "depth first", etc., these headings referring to some specified type of "strategy". The heuristic model we are trying to work out has no such preconceived strategy built into it. Our philosophy is instead rather anarchistic (The Heuristic Principle): Whatever [...] stage of the analysis, according to whatever means there are, is identified, and the significance of the fact that the unit in question has been identified is made use of in all subsequent stages of the analysis. At any time one must be prepared to reconsider an already established analysis of a unit on the ground that evidence against the analysis may successively accumulate due to what analyses other units arrive at.
In the next section we give a brief description of the analysis system for Swedish that is now under development at our group in Stockholm. As has been said, much effort is spent on trying to make use of surface signals as much as possible. Not that we believe that surface signals play a more important role than any other type of linguistic signals, but rather that we think it is important to try to optimize each single subprocess (in a parallel system) as much as possible, and, as said, it might be worthwhile to look carefully into this level, because the importance of surface signals might have been underestimated in previous research. Our experiments so far seem to indicate that they constitute excellent units to base heuristic guesses on. Another reason for concentrating our efforts on this level is that it takes time and requires much hard computational work to get such an anarchistic system to really work, and this surface level is reasonably simple to handle.
II AN OUTLINE OF AN ANALYZER BASED ON THE HEURISTIC PRINCIPLE
Figure 1 below shows the general outline of the system. Each of the various boxes (or sub-boxes) represents one specific process, usually a complete computer program in itself, or, in some cases, independent processes within a program. The big "container", labelled "The Pool", contains both the linguistic material and the current analysis of it. Each program or process looks into the Pool for things "it" can recognize, and when the process finds anything it is trained to recognize, it adds its observation to the material in the Pool. This added material may (hopefully) help other processes in recognizing what they are trained to recognize, which in its turn may again help the first process to recognize more of "its" units. And so on.
The system is now under development and during this build-up phase each process is, as was said above, essentially a complete, stand-alone module, and the Pool exists simply as successively updated text files on disc storage. At the moment some programs presuppose that other programs have already been run, but this state of affairs will be valid just during this build-up phase. At the end of the build-up phase each program shall be able to run completely independent of any other program in the system and in arbitrary order relative to the others (but, of course, usually perform better if more information is available in the Pool).
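The interplay between the Pool and its processes can be sketched as a loop that keeps applying independent text-to-text functions until none of them has anything more to add. The sketch below is only an illustration under stated assumptions: the two toy "demons", the tag letter "d" for determiners, and the function names are hypothetical and are not taken from the actual system (which is written in Beta, not Python).

```python
import re
from typing import Callable, List

# A "demon" reads the Pool text and returns it with its own tags added.
Demon = Callable[[str], str]

def closed_cat(pool: str) -> str:
    """Toy closed-class demon: tags the bare word DEN as dDENd (tag letter assumed)."""
    return re.sub(r"(?<=\s)DEN(?=\s)", "dDENd", pool)

def toy_nomfras(pool: str) -> str:
    """Toy noun-phrase demon: needs a tagged determiner followed by a word in -AN."""
    return re.sub(r"(?<=\s)d([A-ZÅÄÖ]+)d ([A-ZÅÄÖ]+AN)(?=\s)", r"n\1+\2n", pool)

def run_pool(pool: str, demons: List[Demon]) -> str:
    """Apply all demons repeatedly until none of them changes the Pool."""
    changed = True
    while changed:
        changed = False
        for demon in demons:
            updated = demon(pool)
            if updated != pool:
                pool, changed = updated, True
    return pool

text = " DEN FLICKAN SOVER . "
print(run_pool(text, [closed_cat, toy_nomfras]))
print(run_pool(text, [toy_nomfras, closed_cat]))
```

Because the loop runs to a fixed point, the order in which the demons are listed does not affect the final Pool in this toy case; it only affects how many passes are needed.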
[...] programs are to be implemented. These programs will function as "traffic rules", and via these systems one shall be able to test various strategies, i.e. to test which relative order between the different subsystems yields the optimal result in some kind of "performance metric", some evaluation procedure that takes both speed and quality into account.
The programs/processes shown in Figure 1 all represent rather straightforward Finite State Pattern Matching (FS/PM) procedures. It is rather trivial to show mathematically that a set of interacting FS/PM procedures of the type used in our system together will yield a system that formally has the power of a CF-parser; in practice it will yield a system that in some sense is stronger, at least from the point of view of convenience. Congruence and similar phenomena will be reduced to simple local observations. Transformational variants of sentences will be recognized directly - there will be no need for performing some kind of backward transformational operations. (In this respect a system like this will resemble Gazdar's grammar concept; Gazdar 1980.)
The control structures later to be superimposed on the interacting FS/PM systems will also be of a Finite State type. A system of the type then obtained - a system of independent Finite State Automatons controlled by another Finite State Automaton - will in principle have rather complex mathematical properties. It is, e.g., rather easy to see that such a system has stronger capacity than a Type 2 device, but it will not have the power of a full Type 1 system.
Now a few comments on Figure 1.

The "balloons" in the figure represent independent programs (later to be developed into independent processes inside one "big" program). The figure displays those programs that so far (January 1983) have been implemented and tested (to some extent). Other programs will successively be entered into the system.
The big balloon labelled "The Closed Cat" represents a program that recognizes closed word classes such as prepositions, conjunctions, pronouns, auxiliaries, and so on. The Closed Cat recognizes full word forms directly. The SMURF balloon represents the morphological component (SMURF = "Swedish Morphology"). SMURF itself is organized internally as a complex system of independently operating "demons" - SMURFs - each knowing "its" little corner of Swedish word formation. (The name of the program is an allusion to the popular comic strip leprechauns "les Schtroumpfs", which in Swedish are called "smurfar".) Thus there is one little smurf recognizing derivational morphemes, one recognizing flectional endings, and so on. One special smurf, Phonotax, has an important controlling function - every other smurf must always consult Phonotax before identifying one of "its" (potential) formatives [...] pronounceable, otherwise it cannot be a formative. SMURF works entirely without a stem lexicon; it adheres completely to the "philosophy" of using surface signals as far as possible.
NOMFRAS, VERBAL, IFIGEN, CLAUS and PREPPS are other "demons" that recognize different phrases or word groups within sentences, viz. noun phrases, verbal complexes, infinitival constructions, clauses and prepositional phrases, respectively.

N-lex, V-lex and A-lex represent various (sub)lexicons; so far we have tried to do without them as far as possible. One should observe that stem lexicons are no prerequisites for the system to work; adding them only enhances its performance.

The format of the material inside the Pool is the original text, plus appropriate "labelled brackets" enclosing words, word groups, phrases and so on. In this way, the form of representation is consistent throughout, no matter how many different types of analyses have been applied to it. Thus, various people can join our group and write their own "demons" in whatever language they prefer, as long as they can take sentences in text format, be reasonably tolerant to what types of "brackets" they find in there, do their analysis, add their own brackets (in the specified format), and put the result back into the Pool.
[...] IFIGEN are extensively tested (and, of course, The Closed Cat, which is a simple lexical lookup system), and various examples of analyses of these programs will be demonstrated in the next section. We hope to arrive at a crucial station in this project during 1983, when CLAUS has been more thoroughly tested. If CLAUS performs the way we hope (and preliminary tests indicate that it will), we will have means to identify very quickly the clausal structures of the sentences in an arbitrary running text, thus having a firm base for entering higher hierarchies in the syntactic domains.
The programs are written in the Beta language developed by the present author; cf. Brodda-Karlsson, 1980, and Brodda, 1983, forthcoming. Of the actual programs in the system, SMURF was developed and extensively tested by B.B. during 1977-79 (Brodda, 1979), whereas the others are (being) developed by B.B. and/or Gunnel Källgren, Stockholm (mostly "and").
III EXPLODING SOME OF THE BALLOONS
When a "fresh" text is entered into The Pool it first passes through a preliminary one-pass program, INIT (not shown in Fig. 1), that "normalizes" the text. The original text may be of any type as long as it is regularly typed Swedish. INIT transforms the text so that each graphic sentence will make up exactly one physical record. (Except in poetry, physical records, i.e. lines, usually are of marginal linguistic interest.) Paragraph ends will be represented by empty records. Periods used to indicate abbreviations are just taken away and the abbreviation itself is contracted to one graphic word, if necessary; thus "t.ex." ("e.g.") is transformed into "tex", and so on. Otherwise, periods, commas, question marks and other typographic characters are provided with preceding blanks. Through this each word is guaranteed to be surrounded by blanks, and delimiters like commas, periods and so on are guaranteed to signal their "normal" textual functions. Each record is also ended by a sentence delimiter (preceded by a blank). Some manual post-editing is sometimes needed in order to get the text normalized according to the above. In the INIT phase no linguistic analysis whatsoever is introduced (other than into what appears to be orthographic sentences).
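A minimal sketch of this kind of normalization is given below, assuming a small hand-made abbreviation table and a fixed set of delimiter characters. The function name `init_normalize` and the table are hypothetical; the sketch leaves out paragraph handling (empty records) and the manual post-editing mentioned above.

```python
import re
from typing import List

# Hypothetical abbreviation table: periods removed, contracted to one graphic word.
ABBREVIATIONS = {"t.ex.": "tex"}

def init_normalize(text: str) -> List[str]:
    """Return one record per graphic sentence, with delimiters set off by blanks."""
    for abbreviation, contracted in ABBREVIATIONS.items():
        text = text.replace(abbreviation, contracted)
    # Provide each delimiter with exactly one preceding blank.
    text = re.sub(r"\s*([.,?!;:])", r" \1", text)
    # One record per sentence; each record ends with its sentence delimiter.
    records = re.findall(r"[^.?!]+[.?!]", text)
    # Collapse runs of whitespace inside each record.
    return [" ".join(record.split()) for record in records]

for record in init_normalize("Hon kom, t.ex. i går. Vem såg henne?"):
    print(record)
```

Contracting the abbreviations before touching the punctuation matters here: once "t.ex." has become "tex", every remaining period can safely be treated as a sentence delimiter.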
INIT also changes all letters in the original text to their corresponding upper case variants. (Originally capital letters are optionally provided with a prefixed "=".) All subsequent analysis programs add their analyses in the form of lower case letters or letter combinations. Thus upper case letters or words will belong to the object language, and lower case letters or letter combinations will signal meta-language information. In this way, strictly text (ASCII) format can be kept for the text as well as for the various stages of its analysis; the "philosophy" to use text input and text output for all programs involved represents the computational solution to [...] process to work independently of all other in the system.
The Closed Cat (CC) has the important role to mark words belonging to some well defined closed categories of words. This program makes no internal analysis of the words, and only takes full words into account. CC makes use of simple rewrite rules of the type "PÅ => ePÅe / (blank) _ (blank)", where the inserted e's represent the "analysis" ("e" stands for "preposition"; PÅ = "on"). A sample output from The Closed Cat is shown in Illustration 2, where the various meta-symbols also are explained.
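A rewrite rule of this type amounts to a whole-word lexical lookup that wraps the word in its category letter. The sketch below assumes a tiny illustrative lexicon: only "e" for prepositions comes from the text, while the letter "k" for conjunctions and the choice of entries are assumptions made for the example.

```python
import re

# Hypothetical mini-lexicon: full word form -> category letter.
# "e" (preposition) is taken from the text; "k" (conjunction) is an assumed letter.
CLOSED_WORDS = {"PÅ": "e", "I": "e", "OCH": "k"}

def closed_cat(record: str) -> str:
    """Wrap each listed full word in its category letter, e.g. PÅ -> ePÅe."""
    def tag(match):
        word = match.group(0)
        letter = CLOSED_WORDS.get(word)
        return f"{letter}{word}{letter}" if letter else word
    # Only whole upper case words between blanks (or record edges) are considered,
    # which mirrors the context "(blank) _ (blank)" of the rewrite rule.
    return re.sub(r"(?<!\S)[A-ZÅÄÖ]+(?!\S)", tag, record)

print(closed_cat("BOKEN LIGGER PÅ BORDET ."))
```

Since no internal analysis of words is made, a word such as PÅSEN, which merely begins with PÅ, is left untouched.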
The simple example above also shows the format of inserted meta-information. Each identified constituent is "tagged" with surrounding lower case letters, which then can be conceived of as labelled brackets. This format is used throughout the system, also for complex constituents. Thus the nominal phrase "DEN LILLA FLICKAN" ("the little girl") will be tagged as "nDEN+LILLA+FLICKANn" by NOMFRAS (cf. below; the pluses are inserted to make the constituent one continuous string). We have reserved the letters n, v and s for the major categories nouns or noun phrases, verbs or verbal groups, and sentences, respectively, whereas other more or less transparent letters are used for other categories. (A list of used category symbols is presented in the Appendix: Printout Illustrations.)
The program SWEMRF (or SMURF, as it is called here) has been extensively described elsewhere (Brodda, 1979). It makes a rather intricate morphological analysis word-by-word in running text (i.e. SMURF analyzes each word in itself, disregarding the context it appears in). SMURF can be run in two modes, "segmentation" mode and "analysis" mode. In its segmentation mode SMURF simply strips off the possible affixes from each word; it makes no use of any stem lexicon. (The affixes it recognizes are common prefixes, suffixes - i.e. derivational morphemes - and flexional endings.) In analysis mode it also tries to make an optimal guess of the word class of the word under inspection, based on what (combinations of) word formation elements it finds in the word. SMURF in itself is organized entirely according to the heuristic principles as they are conceived here, i.e. as a set of independently operating processes that interactively work on each other's output. The SMURF system has been the test bench for testing out the methods now being used throughout the entire Heuristic Parsing Project.
In its segmentation mode SMURF functions formally as a set of interactive transformations, where the structural changes happen to be extremely simple, viz. simple segmentation rules of the type "P => P-", "S => -S" and "E => -E" for an arbitrary Prefix, Suffix and Ending, respectively, but where the "job" essentially consists of establishing the corresponding structural descriptions. These are shown in Ill. 1, below, together with sample analyses. It should be noted that phonotactic constraints play a central role [...] objectives in designing the SMURF system was to find out how much information actually was carried by the phonotactic component in Swedish. (It turned out to be quite much; cf. Brodda 1979. This probably holds for other Germanic languages as well, which all have a rather elaborated phonotaxis.)
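The three segmentation rules, together with a phonotactic check on what remains of the word, can be illustrated roughly as follows. This is a sketch under strong assumptions: the affix lists are invented for the example, and the "contains a vowel" test is only a crude stand-in for Phonotax, whose actual constraints are not given in this text.

```python
VOWELS = set("AEIOUYÅÄÖ")

# Hypothetical affix lists; the real SMURF inventories are not given in the text.
PREFIXES = ["BE", "FÖR"]
SUFFIXES = ["NING", "HET"]
ENDINGS = ["ARNA", "EN", "AR"]

def pronounceable(residue: str) -> bool:
    """Crude stand-in for Phonotax: the residue must contain at least one vowel."""
    return any(ch in VOWELS for ch in residue)

def segment(word: str) -> str:
    """Apply E => -E, S => -S and P => P- once each, if the residue stays pronounceable."""
    prefix = suffix = ending = ""
    for e in ENDINGS:
        if word.endswith(e) and pronounceable(word[: -len(e)]):
            word, ending = word[: -len(e)], "-" + e
            break
    for s in SUFFIXES:
        if word.endswith(s) and pronounceable(word[: -len(s)]):
            word, suffix = word[: -len(s)], "-" + s
            break
    for p in PREFIXES:
        if word.startswith(p) and pronounceable(word[len(p):]):
            word, prefix = word[len(p):], p + "-"
            break
    return prefix + word + suffix + ending

print(segment("BERÄKNINGEN"))
print(segment("EN"))
```

No stem lexicon is consulted at any point; the only thing that stops EN from being stripped off the word EN is that nothing pronounceable would be left.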
NOMFRAS is the next program to be commented on. The present version recognizes structures of the type

    det/quant + (adj)^n + noun;

where the "det/quant" categories (i.e. determiners or quantifiers) are defined explicitly through enumeration - they are supposed to belong to the class of "surface markers" and are as such identified by The Closed Cat. Adjectives and nouns on the other hand are identified solely on the ground of their "cadences", i.e. what kind of (formally) ending-like strings they happen to end with. The number of adjectives that are accepted (n in the formula above) varies depending on what (probable) type of construction is under inspection. In indefinite noun phrases the substantial content of the expected endings is, to say the least, meager, as both nouns and adjectives in many situations only have Ø-endings. In definite noun phrases the noun mostly - but not always - has a more substantial and recognizable ending and all intervening adjectives have either the cadence -A or a cadence from a small but characteristic set. In a (supposed) definite noun phrase all words ending in any of the mentioned cadences are assumed to be adjectives, but in (supposed) indefinite noun phrases not more than one adjective is assumed unless other types of morphological support are present.
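For the definite case, the pattern "det/quant + (adj)^n + noun" can be approximated by a single regular expression over a record already tagged by The Closed Cat. The sketch below rests on several assumptions that are not from the text: the determiner tag letter "d", the three listed determiners, and the set of noun cadences are illustrative only, and indefinite noun phrases are not handled.

```python
import re

NP_DEFINITE = re.compile(
    r"(?<!\S)d(DEN|DET|DE)d"                # determiner as tagged by The Closed Cat (letter assumed)
    r"((?: [A-ZÅÄÖ]+A)*)"                   # any number of words with the adjective cadence -A
    r" ([A-ZÅÄÖ]+(?:EN|ET|AN|NA))(?!\S)"    # a word with a definite-looking noun cadence (assumed set)
)

def nomfras_definite(record: str) -> str:
    """Tag definite noun phrases as n...n, joining their words with pluses."""
    def tag(match):
        words = [match.group(1)] + match.group(2).split() + [match.group(3)]
        return "n" + "+".join(words) + "n"
    return NP_DEFINITE.sub(tag, record)

print(nomfras_definite("dDENd LILLA FLICKAN SOVER ."))
print(nomfras_definite("dDEd SNÄLLA FLICKORNA SOVER ."))
```

The second example shows why the matcher must be allowed to back off: FLICKORNA also ends in -A, so it is first tried as an adjective and only accepted as the noun when no noun cadence follows it.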
The Finite State Scheme behind NOMFRAS is presented in Ill. 2, together with sample outputs; in this case the text has been preprocessed by The Closed Cat, and it appears that these two programs in cooperation are able to recognize noun phrases of the discussed type correctly to well over 95% in running text (at a speed of about 5 sentences per second, CPU time); the errors were shared about 50% each between over- and undergenerations. Preliminary experiments aiming at including also SMURF and PREPPS (Prepositional Phrases) seem to indicate that about the same recall and precision rate could be kept for arbitrary types of (non-sentential) noun phrases (cf. Ill. 6). (The systems are not yet trimmed to the extent that they can be operatively run together.)
IFIGEN (Infinitive Generator) is another rather straightforward Finite State Pattern Matcher (developed by Gunnel Källgren). It recognizes (groups of) nonfinite verbs. Somewhat simplified it can be represented by the following diagram (remember the conventions for upper and lower case):
IFIGEN parsing diagram (simplified): [diagram layout lost; its surviving labels are Aux, nXn, (Adv), ATT, #, (C)CV and the cadences -A and -(A/I)T]
where "Aux" and "Adv" are categories recognized by
The Closed Cat (tagged "g" and "a", respectively),
and "nXn" are structures recognized by either
NOMFRAS or, in the case of personal pronouns, by
CC. (It is worth mentioning that the class
of auxiliaries in Swedish is more open than the
corresponding word class in English; besides the
"ordinary" VARA ("to be"), HA ("to have") and the
modals, there is a fuzzy class of semi-auxiliaries
like BÖRJA ("begin") and others; IFIGEN makes use
of about 20 of these in the present version.) The
supine cadence -(A/I)T is supposed to appear only
once in an infinitival group. A sample output of
IFIGEN is given in Ill. 3. Also for IFIGEN we have
reached a recognition level around 95%, which,
again, is rather astonishing, considering how
little information is actually made use of in the
system.
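As an illustration only, the pattern quoted for Ill. 3 (an auxiliary or ATT, an optional pronoun, optional adverbs, then one or more infinitives) can be sketched in Python. This is not the rule set actually used by IFIGEN: the (tag, word) token format, the function names, the cap of two adverbs and the treatment of HA forms (tag "h") as warners are assumptions made for the example.

```python
import re

VOWELS = "AEIOUYÅÄÖ"
# Short infinitives of the shape (C)CV, e.g. GÅ, SE, FÅ.
SHORT_INF = re.compile(rf"^[^{VOWELS}]{{1,2}}[{VOWELS}]$")
# Supine cadence -(A/I)T.
SUPINE = re.compile(r"(A|I)T$")


def is_warner(tag, word):
    """An 'infinitive warner': an auxiliary (assumed tags g, h) or the word ATT."""
    return tag in ("g", "h") or word == "ATT"


def looks_infinitive(tag, word):
    """Untagged word with the cadence -A or the short (C)CV shape.

    Words already tagged by another module (e.g. ALLA, DESSA) are skipped."""
    if tag is not None:
        return False
    return word.endswith("A") or bool(SHORT_INF.match(word))


def find_infinitive_groups(tokens):
    """Return the verb groups found after each warner in a list of (tag, word)."""
    groups = []
    i, n = 0, len(tokens)
    while i < n:
        tag, word = tokens[i]
        if not is_warner(tag, word):
            i += 1
            continue
        j = i + 1
        if j < n and tokens[j][0] == "r":        # optional pronoun
            j += 1
        adverbs = 0
        while j < n and tokens[j][0] == "a" and adverbs < 2:   # optional adverbs
            j += 1
            adverbs += 1
        verbs, supine_seen = [], False
        while j < n:
            t, w = tokens[j]
            is_supine = t == "u" or (t is None and bool(SUPINE.search(w)))
            if looks_infinitive(t, w):
                verbs.append(w)
            elif is_supine and not supine_seen:   # supine allowed only once
                supine_seen = True
                verbs.append(w)
            else:
                break
            j += 1
        if verbs:
            groups.append(verbs)
            i = j
        else:
            i += 1
    return groups


sample = [("r", "JAG"), ("g", "SKA"), ("a", "BARA"), (None, "HJALPA")]
print(find_infinitive_groups(sample))   # [['HJALPA']]
```

Without a preceding warner, a word such as FLICKA or HELA is never proposed as an infinitive in this sketch, which is the point made about the weak information value of the cadence taken in isolation.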
The IFIGEN case illustrates very clearly one
of the central points in our heuristic approach,
namely the following: The information that a word
has a specific cadence, in this case the cadence
-A, is usually of very little significance in
itself in Swedish. Certainly it is a typical
infinitival cadence (at least 90% of all infinitives
in Swedish have it), but on the other hand, it is
also a very typical cadence for other types
of words: FLICKA (noun), HELA (adjective),
DENNA/DETTA/DESSA (determiners or pronouns) and so
on, and these other types are by far the
dominant group having this specific cadence in
running text. But in connection with an "infinitive
warner" - an auxiliary, or the word ATT - the
situation changes dramatically. This can be
demonstrated by the following figures: In running text,
words having the cadence -A represent infinitives
in about 30% of the cases. ATT is an infinitive
marker (equivalent to "to") in almost exactly 50%
of its occurrences (in the other 50% it is a
subordinating conjunction). The conditional probability
that the configuration ATT -A represents an
infinitive is, however, greater than 99%,
provided that characteristic cadences like -ARNA/
-ORNA and quantifiers/determiners like ALLA and
DESSA are disregarded. (In our system they are
marked by SMURF and The Closed Cat, respectively,
and thereby "saved" from being classified as
infinitives.) Given this, there is almost no
overgeneration in IFIGEN, but Swedish allows for split
infinitives to some extent. Quite a lot of material
can be put in between the infinitive warner and
the infinitive, and this at present gives rise to some
undergeneration. (Similar observations
regarding conditional probabilities in
configurations of linguistic units have been made by Mats
Eeg-Olofsson, Lund, 1982.)
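The contrast between the unconditional and the conditional figures can be made concrete with a small Python sketch that counts relative frequencies in a hand-labelled token list. The ten-token list below is invented for the example and is not taken from the material analysed in the paper, so its rates do not reproduce the 30% and 99% figures; the function name and the (word, is_infinitive) format are likewise assumptions.

```python
def infinitive_rate(tokens, require_att=False):
    """Share of -A words that are infinitives.

    tokens: list of (word, is_infinitive) pairs in text order.
    With require_att=True only -A words directly preceded by ATT are counted,
    i.e. the configuration ATT -A."""
    hits = total = 0
    for k, (word, is_inf) in enumerate(tokens):
        if not word.endswith("A"):
            continue
        if require_att and (k == 0 or tokens[k - 1][0] != "ATT"):
            continue
        total += 1
        hits += int(is_inf)
    return hits / total if total else None


# Invented toy data (hypothetical labels, for illustration only).
toy = [
    ("HELA", False), ("FLICKA", False), ("ATT", False), ("FINNA", True),
    ("DENNA", False), ("SKA", False), ("LIGGA", True),
    ("ATT", False), ("KASTA", True), ("STORA", False),
]

print(infinitive_rate(toy))                    # 0.375
print(infinitive_rate(toy, require_att=True))  # 1.0
```

A fuller version would also skip words already tagged by SMURF or The Closed Cat (such as ALLA and DESSA) before counting, as described above.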
REFERENCES
Brodda, B. "Något om de svenska ordens fonotax och
morfotax", Papers from the Institute of
Linguistics (PILUS) No. 38, University of
Stockholm, 1979.
Brodda, B. "Yttre kriterier för igenkänning av
sammansättningar" in Saari, M. and Tandefelt, M.
(eds.) Förhandlingar rörande svenskans
beskrivning - Hanaholmen 1981, Meddelanden från
Institutionen för Nordiska Språk, Helsingfors
Universitet, 1981.
Brodda, B. "The BETA System, and some
Applications", Data Linguistics, Gothenburg, 1983
(forthcoming).
Brodda, B. and Karlsson, F. "An experiment with
Automatic Morphological Analysis of Finnish",
Publications No. 7, Dept. of Linguistics,
University of Helsinki, 1981.
Gazdar, G. "Phrase Structure" in Jacobson, P. and
Pullum, G. (eds.), Nature of Syntactic
Representation, Reidel, 1982.
Lenat, D.P. "The Nature of Heuristics",
Artificial Intelligence, Vol. 19(2), 1982.
Eeg-Olofsson, M. "En språkstatistisk modell för
ordklassmärkning i löpande text" in Källgren, G.
(ed.) TAGGNING, Föredrag från 3:e svenska
kollokviet i språklig databehandling i maj 1982,
PILUS 47, Stockholm 1982.
Polya, G. "How to Solve it", Princeton University
Press, 1945. Also Doubleday Anchor Press, New
York, N.Y. (several editions).
APPENDIX: Some computer illustrations
The following three pages illustrate some of the parsing diagrams used in the system: Ill. 1, SMURF, and Ill. 2, NOMFRAS, together with sample analyses.
IFIGEN is represented by sample analyses (Ill. 3; the diagram is given in the
text). The samples are all taken from running text analysis (from a novel by
Ivar Lo-Johansson), and "pruned" only in the way that trivial, recurrent examples
are omitted. Some typical erroneous analyses are also shown (prefixed by **).
In Ill. 1 SMURF is run in segmentation mode only, and the existing tags are inserted by The Closed Cat. 'A and 'E in word-final position indicate the
corresponding cadences (fulfilling the pattern "V^M'A/E", where M denotes a
set of admissible medial clusters).
The tags inserted by CC are: a=(sentence) adverbials, b=particles, d=determiners, e=prepositions, g=auxiliaries, h=(forms of) HA(VA), i=infinitives, j=adjectives,
n=nouns, k=conjunctions, q=quantifiers, r=pronouns, u=supine verb form, v=verbal
(group).
(For space reasons, Ill. 3 is given first, then Ill. 1 and Ill. 2.)
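The tag notation used in the sample analyses, where a lower-case tag letter encloses an upper-case word (as in rJAGr or gSKAg) and "+" joins the members of a recognized group, can be read mechanically. The following Python sketch is one possible reader for that notation; the function names and the regular expression are assumptions made for the example and are not part of the system described here.

```python
import re

# Tag letters as listed for The Closed Cat in this appendix.
TAG_NAMES = {
    "a": "adverbial", "b": "particle", "d": "determiner", "e": "preposition",
    "g": "auxiliary", "h": "form of HA(VA)", "i": "infinitive", "j": "adjective",
    "n": "noun", "k": "conjunction", "q": "quantifier", "r": "pronoun",
    "u": "supine", "v": "verbal group",
}

# One lower-case tag letter, a body without lower-case letters, the same letter again.
TAGGED = re.compile(r"^([a-z])([^a-z\s]+)\1$")


def parse_token(token):
    """Return a list of (tag, text) pairs for one whitespace-delimited token."""
    m = TAGGED.match(token)
    if m:
        return [(m.group(1), m.group(2))]
    if "+" in token:                      # e.g. kATTk+iHAi+uGÅTTu
        parts = []
        for piece in token.split("+"):
            parts.extend(parse_token(piece))
        return parts
    return [(None, token)]                # untagged word or punctuation


def parse_line(line):
    """Parse a whole output line into (tag, text) pairs."""
    pairs = []
    for token in line.split():
        pairs.extend(parse_token(token))
    return pairs


print(parse_line("rJAGr gSKAg aBARAa iHJALPAi"))
# [('r', 'JAG'), ('g', 'SKA'), ('a', 'BARA'), ('i', 'HJALPA')]
```

A group enclosed as a whole, such as nDEN+LÅNGAn, is kept as one unit with the tag n, whereas a chain of separately tagged words joined by "+" is split into its members.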
Ill. 3: PATTERN: aux/ATT^(pron)^(adv)^(adv)^inf^inf^ :
FLOCKNINGEN eEFTERe ikATTk+iHAi+uGÅTTu
rDETr vVARv ORIMLIGT ikATTk+iFINNAi
rJAGr gSKAg aBARAa iHJALPAi
- rDETr gKANg iLIGGAi
gSKAg rVIr iVÅGAi
- rVIr gKANg aINTEa iGÅi
ORNA vHOLLv SIG FARDIGA ikATTk+iKASTAi
rDEr gVÅGADEg aANTLIGENa iLYFTAi
gSKAg rNIr aNODVANDIGTVISa iGORAi
rVIr hHADEh aANNUa aINTEa uHUNNITu iFÅi
BECKMORKRET eMEDe ikATTk+iFORSOKAi+iFÅi
eMEDe VATGAS eFORe ikATTk+iKUNNAi+iHÅLLAi
SKOGEN, LANDEN gTYCKTESg iSTÅi
rDENr hHADEh MISSLYCKATS eIe ikATTk+iNAi
*** qENq ÅS gVÅGADEg iKVINNORNA+STANNAi
FRAMAT BOJD HELA DAGEN
qETTq KADSTRECK eIe
eTILLe ikATTk+iSEi
qENq KARL INUTI ?
VIPPEN ?
HEM eMEDe SKAMMEN
eOMe NAR SOM HELST
ePAe rDETr
NÅT eMEDe rDENr , kSÅk
eUPPe POTATISEN
BALLONGEN FYLLD SEJ OPPE
STILLA eUNDERe OSS
SITT MÅL
Ill. 1 (SMURF): PATTERNS ("Structural Descriptions"):
1) ENDINGS (E): X^V^M^E#; structural change: E => =E
2) PREFIXES (P): [pattern diagram garbled in the source; legible elements: #, P>, I, X^V^F^(s), V^X]; structural change: P => (-)P>
3) SUFFIXES (S): [pattern diagram garbled in the source; legible elements: (s), I^V^X, X^V^F^S, -, E#, #]; structural change: S => /S(-)
where I = (admissible) initial cluster, F = final cluster, M = morpheme-
internal cluster, V = vowel, (s) = the "gluon" S (cf. TIDNINGSMAN),
# = word boundary, (=, >, /, -) = earlier accepted affix segmentations, and ^, finally, denotes ordinary concatenation. (It is the enhanced element in each pattern that is tested for its segmentability.)
BAGG'E = vDROGv REP=ET SLINGR=ADE MELLAN STEN=AR , FOR>BI TALLSTAMM AR , MELLAN ROD'A LINGONTUV=OR eIe GRON IN>FATT/NING
qETTq STORT FORE>MÅL hHADEh uRORTu ePÅe SIG BORT'A eIe
SLANT=EN . FORE>MÅL=ET NARM=ADE SIG HOTFULL'T dDETd KNASTR=
=ADE eIe SKOG=EN - SPRING
BAGG'E SLAPP=TE kOCHk vSPRANGv rDEr LÅNG'A KJOL=ARNA
VIRVL=ADE eOVERe O<PLOCK=ADE LINGONTUV=OR , BAGG'E KVINNO=RNA
hHADEh STRUMPEBAND FOR>FARDIG=ADE eAVe SOCKERTOPPSSNOR=EN ,
KNUT=NA NEDAN>FOR KNAN'A
aFORSTa bUPPEb ePÅe qENq ÅS VÅG=ADE KVINNO=RNA STANN'A
rDEr vSTODv kOCHk STRACK=TE ePÅe HALS=ARNA qENq FRAN
UT>DUNST/NING eAVe SKRACK SIPPR=ADE bFRAMb rDEr vHOLLv
BE>SVARJ/ANDE HAND=ERNA FRAM>FOR SIN'A SKOT=EN
- dDETd vSERv STORT kOCHk eRUNTe bUTb , vSAv dDENd KORT'A
eOMe FORE>MÅL=ET dDETd vARv aVALa aINTEa qNÅGOTq IN>UT>I ?
- dDETd gKANg LIGG'A qENq KARL IN>UT>I ? dDETd vVETv rMANr
aVALa kVADk rHANr vGORv eMEDe OSS
- rJAGr TYCK=TE dDETd ROR=DE ePÅe SEJ gSKAg rVIr iVÅGAi
VIPP=EN ? - JA ? gSKAg rVIr iVÅGAi VIPP=EN ?
BAGGE vSMOGv SIG ePÅe GLAPP'A KNAN UT>FÖR BRANT=EN . kNARk
rDEr NARM=ADE SIG rDEr FLAT=ADE POTATISKORG=ARNA eMEDe LINGON
kSOMk vSTODv ePÅe LUT eVIDe VAR SIN TUVA , vVARv rDEr aREDANa
UT>OM SIG eAVe SKRACK oDERASo SANS vVARv BORT'A
- PASS ePÅe rVIr KANHAND'A aINTEa vTORSv NARM=ARE ? vSAv
dDENd MAGR'A HUSTRUN
- rVIr gKANg aINTEa GÅ HEM eMEDe SKAMM=EN aHELLERa . rVIr
gMÅSTEg aJUa iHAi BARKORG=ARNA eMEDe
- JA VISST , BARKORG=ARNA
kMENk kNARk rDEr uKOMMITu bNERb eTILLe STALL=ET I<GEN
uVARTu rDEr NYFIK=NA rDEr vDROGSv eTILLe FORE>MÅL=ET eIe
Ill. 2 (NOMFRAS): quant + det + "OWN" + adj + noun
[Diagram garbled in the source. Legible elements: DENNA/DETTA, ALLA, BÅDA, DEN, and the endings -ER, -N, -NA, -EN.]
- PYTT , vSAv nDEN+LÅNGAn
kVADk vVARv NU nDET+DARn kATTk VARA RADD eFORe ?
nDET+OMFÅNGSRIKA+,+SIDENLATTA+TYGETn
nDEn GJORDE nEN+STOR+PACKEn eAVe dDETd
eMEDe SIG SJALVA eOMe kATTk nDET+HELAn aINTEa uVARITu qETTq nDET+HELAn aINTEa uVARITu nETT+DUGGn FARLIGT
nDET+FORMENTA+KLADSTRECKETn vVARv kDÅk SNOTT FLE GRON eMEDe HANGBJORKAR kSOMk nALLAn FYLLDE FUNKTIONER MODERN , nDEN+LÅNGA+EGNAHEMSHUSTRUNn kSOMk uVARITu eIe SKO
STORA BOKSTAVER nETT+SVENSKT+FIRMANAMNn
ePÅe nDEN+ANDRA+,+FRÅNVANDAn , vSTODv ORDEN
nDETn vVARv nEN+LUFTENS+SPILLFRUKTn kSOMk hHADEh uRAMLAT
kOCHk nDEN+ANDRA+EGNAHEMSHUSTRUNS+OGONn VATTNADES eAVe OMSOM
nETT+STORT+MOSSIGT+BERGn HOJDE SIG eMOTe SKYN . SIG eMOTe SKYN eMEDe nEN+DISIG+MÅNEn kSOMk qENq RUND LYKTA
eVIDe nDET+STALLEn kDARk LANDNINGSLINAN
SAGA HONOM kATTk nALLA+DESSA+FOREMALn aANDÅa aINTEa FORMEDARNA kSOMk nEN+AVIGT+SKRUBBANDE+HANDn
kSOMk nEN+OFORMLIG+MASSAn VALTRADE SIG BALLONG
- nEN+RIKTIG+BALLONGn gSKAg VARA FYLLD eMEDe
** nDETn aINTEa vLÅGv nNÅGON+KROPP+GOMDn INUNDER
*** TVÅ kSOMk BARGADE nDEN+TILLSAMMANSn