Slide 1: Augmenting WordNet for Deep Understanding of Text
Peter Clark, Phil Harrison,
Bill Murray, John Thompson (Boeing)
Christiane Fellbaum (Princeton Univ)
Jerry Hobbs (ISI/USC)
Slide 2: “Deep Understanding”
• Not (just) parsing + word senses
• Construction of a coherent representation of the scene the text describes
• Challenge: much of that representation is not in the text
“A soldier was killed in a gun battle”
“The soldier died”
“The soldier was shot”
“There was a fight”
…
Slide 3: “Deep Understanding” (cont.)
The inferences above rest on background knowledge:
• Guns can kill
• If you are killed, you are dead
• …
How do we get this knowledge into the machine? How do we exploit it?
Slide 4: “Deep Understanding” (cont.)
Several partially useful resources exist. WordNet is already used a lot… can we extend it?
Slide 5: The Initial Vision
• Our vision:
– Rapidly expand WordNet to be more of a knowledge base
– Question-answering software to demonstrate its use
Slide 6: The Evolution of WordNet — from lexical resource toward knowledge
– Introduce the instance/class distinction
  • Paris isa Capital-City, Capital-City is-type-of City
– Add in some derivational links
  • explode related-to explosion
  • …
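A minimal sketch of what the instance/class distinction buys; the lookup tables below are hypothetical stand-ins for WordNet's instance and hypernym links:

```python
# Sketch of the instance/class distinction: "Paris" is an *instance*
# of Capital-City, which is a *subclass* of City. Instances sit at
# the bottom and are not themselves classes.

SUBCLASS = {"Capital-City": "City"}       # class -> superclass
INSTANCE_OF = {"Paris": "Capital-City"}   # individual -> its class

def is_a(entity, cls):
    """True if entity is an instance or subclass of cls."""
    c = INSTANCE_OF.get(entity, entity)   # lift an instance to its class
    while c is not None:
        if c == cls:
            return True
        c = SUBCLASS.get(c)               # climb the class taxonomy
    return False
```

With this split, `is_a("Paris", "City")` holds, but nothing can be an instance of "Paris", which a flat isa hierarchy cannot express.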
Slide 7: Augmenting WordNet (Outline)
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
  • Similar to LCC’s Extended WordNet attempt
– Axiomatize “core theories”
• WordNet links
– Morphosemantic links
– Purpose links
• Experiments
Slide 8: Converting the Glosses to Logic
Convert the gloss to the form “word is gloss”, then parse (Charniak).
“ambition#n2: A strong drive for success”
LFToolkit: lexical output rules produce logical form fragments:
a strong drive for success ⇒ strong(x1) & drive(x2) & for(x3,x4) & success(x5)
Slides 9–10: Converting the Glosses to Logic (cont.)
Next, identify equalities and add senses. Composition rules identify variables:
a strong drive for success ⇒ strong(x1) & drive(x2) & for(x3,x4) & success(x5)
with x1 = x2, x2 = x3, x4 = x5
Slide 11: Converting the Glosses to Logic (cont.)
The result, for “ambition#n2: A strong drive for success”:
ambition#n2(x1) → a(x1) & strong#a1(x1) & drive#n2(x1) & for(x1,x2) & success#a3(x2)
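The equality-identification step can be sketched as a tiny union-find over the fragment variables (a hypothetical illustration of the idea, not LFToolkit's actual machinery):

```python
# Hypothetical sketch of the composition step: logical-form fragments
# are predicates over variables, and composition rules contribute
# variable equalities, resolved here with a tiny union-find.

def find(parent, x):
    """Follow parent links to the canonical variable."""
    while parent[x] != x:
        x = parent[x]
    return x

def compose(fragments, equalities):
    """Rewrite each fragment using canonical variables."""
    variables = {v for _, args in fragments for v in args}
    parent = {v: v for v in variables}
    for a, b in equalities:                 # e.g. x2 = x3
        parent[find(parent, a)] = find(parent, b)
    return [(pred, tuple(find(parent, v) for v in args))
            for pred, args in fragments]

# "A strong drive for success" (fragments from the slide)
fragments = [("strong", ("x1",)), ("drive", ("x2",)),
             ("for", ("x3", "x4")), ("success", ("x5",))]
equalities = [("x1", "x2"), ("x2", "x3"), ("x4", "x5")]
body = compose(fragments, equalities)
# strong/drive now share one variable; "for" links it to success
```

After composition, sense-tagging replaces each predicate with its disambiguated form (e.g. `drive` → `drive#n2`), yielding the axiom above.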
Slide 12: Converting the Glosses to Logic (cont.)
• But the conversion often goes wrong. Primary problems:
1. Errors in the language processing
2. Glosses only capture definitional knowledge
3. “Flowery” language, many gaps, metonymy, ambiguity; if the logic closely follows the syntax, the result is “logico-babble”
e.g. “hammer#n2: tool used to deliver an impulsive force by striking”
Slide 13: Augmenting WordNet (Outline)
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
– Axiomatize “core theories”
• WordNet links
– Morphosemantic links
– Purpose links
• Experiments
Slide 14: Core Theories
• Many domain-specific facts are instantiations of more general, “core” knowledge
• By encoding this core knowledge once, we get leverage across many senses
– e.g. 517 “vehicle” noun senses, 185 “cover” verb senses
• Approach:
– Analysis and grouping of words in Core WordNet
– Identification and encoding of the underlying theories
Slide 16: Augmenting WordNet (Outline)
• World Knowledge
– Sense-disambiguate the glosses (by hand)
– Convert the glosses to logic
– Axiomatize “core theories”
• WordNet links
– Morphosemantic links
– Purpose links
• Experiments
Slide 17: Morphosemantic Links
• Often need to cross part-of-speech:
T: A council worker cleans up after Tuesday's violence in Budapest.
H: There were attacks in Budapest on Tuesday.
• Can solve with WordNet’s derivation links:
(“attack”) attack_v3 ↔ aggression_n4 (← “violence”), via the “aggress”/“aggression” derivation link
Slide 18: Morphosemantic Links (cont.)
• But this can go wrong!
T: Paying was slow
H1: The transaction was slow
H2: *The person was slow [NOT entailed]
via the “pay”/“payment” derivation link: payment_n1 (→ “transaction”)
Slide 19: Morphosemantic Links (cont.)
• Task: classify the 22,000 derivation links in WordNet
• Semi-automatic process
– Exploit taxonomy and morphology
• 15 semantic types used
– agent, undergoer, instrument, result, material, destination, location, by-means-of, event, uses, state, property, …

Verb Synset | Noun Synset | Relationship
hammer_v1 | hammer_n1 | instrument
execute_v1 | execution_n1 | event (equal)
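A hypothetical sketch of the morphology side of such a semi-automatic classifier; the suffix rules and the zero-derivation heuristic are illustrative assumptions, not the actual procedure (which also exploits the taxonomy):

```python
# Toy classifier assigning a semantic type to a verb/noun derivation
# link from morphological cues alone. Rules are illustrative only.

SUFFIX_RULES = [
    ("er", "agent"),        # paint / painter
    ("or", "agent"),        # act / actor
    ("ee", "undergoer"),    # employ / employee
    ("ion", "event"),       # execute / execution
    ("ment", "event"),      # pay / payment
]

def classify_link(verb, noun):
    """Guess the semantic type of a verb->noun derivation link."""
    if verb == noun:                 # zero-derivation, e.g. hammer/hammer
        return "instrument"          # heuristic: often names the tool
    for suffix, semtype in SUFFIX_RULES:
        if noun.endswith(suffix):
            return semtype
    return "unclassified"            # left for manual review
```

The "unclassified" bucket is where the semi-automatic process hands off to a human annotator.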
Slide 20: Experimentation
Slide 21: Task: Recognizing Entailment
• Experiment with WordNet, logical glosses, DIRT
• Text interpretation to logic using Boeing’s NLP system
• Entailment: T → H if:
– T is subsumed by H (“cat eats mouse” → “animal was eaten”)
– An elaboration of T using inference rules is subsumed by H (“cat eats mouse” → “cat swallows mouse”)
Logic for “A soldier was killed in a gun battle”:
isa(soldier01, soldier_n1), isa(……
object(kill01, soldier01), during(kill01, battle01), instrument(battle01, gun01)
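The subsumption test can be sketched over ground facts, with a toy `ISA` table standing in for the WordNet taxonomy (an assumption; Boeing's system is far richer, handling e.g. the active/passive alternation):

```python
# Toy subsumption check: T entails H if every fact in H is matched
# by a T fact with the same predicate and equal-or-more-specific
# arguments. ISA is a stand-in for the WordNet taxonomy.

ISA = {
    "cat": {"cat", "animal"},
    "mouse": {"mouse", "animal"},
}

def subsumes(general, specific):
    """True if `general` is `specific` or one of its hypernyms."""
    return general in ISA.get(specific, {specific})

def entails(t_facts, h_facts):
    """Every H fact must be subsumed by some T fact."""
    return all(
        any(ph == pt and len(ah) == len(at) and
            all(subsumes(g, s) for g, s in zip(ah, at))
            for pt, at in t_facts)
        for ph, ah in h_facts)

t = [("eat", ("cat", "mouse"))]
h = [("eat", ("animal", "mouse"))]   # "animal eats mouse"
```

Here `entails(t, h)` succeeds because "cat" is subsumed by "animal" in the taxonomy.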
Slide 22: Successful Examples with the Glosses
• Good example (14.H4):
T: Britain puts curbs on immigrant labor from Bulgaria and Romania.
H: Britain restricted workers from Bulgaria
Slides 23–25: Successful Examples with the Glosses (cont.)
• Another (somewhat) good example (56.H3):
T: The administration managed to track down the perpetrators.
H: The perpetrators were being chased by the administration
WN: hunt_v1 (“hunt”, “track down”): pursue for food or sport
T then elaborates to: The administration managed to pursue the perpetrators [for food or sport!]
Slides 26–27: Unsuccessful Examples with the Glosses
• More common: being “tantalizingly close” (16.H3):
T: Satomi Mitarai bled to death
H: His blood flowed out of his body
Slides 28–29: Unsuccessful Examples with the Glosses (cont.)
• Another “tantalizingly close” case (20.H2):
T: The National Philharmonic orchestra draws large crowds.
H: Large crowds were drawn to listen to the orchestra.
WordNet:
– orchestra = collection of musicians
– musician: plays musical instrument
– music = sound produced by musical instruments
– listen = hear = perceive sound
So close!
Slide 30: Success with Morphosemantic Links
• Good example:
T: The Zoopraxiscope was invented by Mulbridge
H*: Mulbridge was the invention of the Zoopraxiscope [NOT entailed]
Slide 31: Successful Examples with DIRT
• Good example (54.H1):
T: The president visited Iraq in September.
H: The president traveled to Iraq
DIRT: IF Y is visited by X THEN X flocks to Y
WordNet: "flock" is a type of "travel"
⇒ Entailed [correct]
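A sketch of the elaborate-then-subsume chain for this example; `DIRT_RULES` and `HYPERNYMS` are toy stand-ins for the real resources, and argument-position mapping in the DIRT rule is elided:

```python
# Toy DIRT-rule elaboration plus a WordNet hypernym step, mirroring
# the visit -> flock -> travel chain. Tables are illustrative only.

DIRT_RULES = {"visit": "flock_to"}    # IF Y is visited by X THEN X flocks to Y
HYPERNYMS = {"flock_to": {"travel"}}  # "flock" is a type of "travel"

def elaborate(fact):
    """Yield the fact, its DIRT rewrite, and hypernym generalizations."""
    pred, args = fact
    yield fact
    rewritten = DIRT_RULES.get(pred)
    if rewritten:
        yield (rewritten, args)
        for hyper in HYPERNYMS.get(rewritten, set()):
            yield (hyper, args)

facts = set(elaborate(("visit", ("president", "Iraq"))))
```

The hypothesis fact `("travel", ("president", "Iraq"))` appears among the elaborations, so the subsumption test succeeds.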
Slide 32:
T: The US troops stayed in Iraq although the war was over.
H*: The US troops left Iraq when the war was over [NOT entailed]
Slides 33–36: Overall Results
• Note: eschewing statistics!
• BPI test suite (61%):

                                     Correct | Incorrect
When H or ¬H is predicted by
WordNet taxonomy + morphosemantics:     14   |     1
When H or ¬H is not predicted:          97   |    72

Build annotations (one per slide): “Straight-Forward”, “Useful”, “Occasionally useful”, “Often useful but unreliable”.
• RTE3: 55%
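As a quick arithmetic check on the table's four cells (counts copied from the slide; the cells give roughly 60% overall accuracy, in line with the reported 61%):

```python
# Sanity check on the results table: overall accuracy and the
# precision of WordNet's predictions, from the four counts.
predicted = (14, 1)        # correct, incorrect when WN predicts H or ¬H
not_predicted = (97, 72)   # correct, incorrect otherwise

total = sum(predicted) + sum(not_predicted)               # 184 pairs
accuracy = (predicted[0] + not_predicted[0]) / total      # ~0.60
precision_when_predicted = predicted[0] / sum(predicted)  # 14/15 ~ 0.93
```

The asymmetry is the slide's point: WordNet + morphosemantics rarely fires, but when it does it is right 14 times out of 15.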
Slide 38: Thank you!