A WordNet “Detour” to FrameNet
Aljoscha Burchardt, Katrin Erk, Anette Frank*
Saarland University, DFKI*
Saarbrücken
{albu,erk,frank}@coli.uni-sb.de
Motivation
• Demand for semantic information access (IE, QA, …)
• Available resources
  • Large-scale (statistical) parsing systems
  • WordNet(s)
    • Modeling approximate lexical semantics
    • High coverage
  • FrameNet, PropBank
    • Modeling predicate-argument structure
    • Limited coverage
• Aim: combining methods to arrive at a high-coverage (lexical) semantic analysis of varying depth
Outline
• FrameNet
• Using Frames for NLP applications
• Current architecture
• Coverage problems
• A WordNet detour to FrameNet
• First Evaluation
• Conclusion and Outlook
FrameNet
• Frame Semantics (Fillmore 1976, ...)
  • Frame: a conceptual structure or prototypical situation
  • Frame elements (roles): participants of the situation
  • Frame-evoking elements (FEEs; verbs, nouns, …)
• Example instances of Statement:
  • “[He Speaker] speaks [highly Manner] [of you Topic],” she said.
  • “Did [Dominic Speaker] ever make any comments [regarding Toby Topic] [to you Addressee]?”
• Berkeley FrameNet Project
  • Database of frames for core lexicon of English
  • Current release: 615 frames, ~8000 lexical units (LUs)
Saarbrücken SALSA (II) Project
• Manual frame annotation of part of the TIGER corpus
• Develop automatic methods for frame/role assignment
• Study metaphors, multi-word expressions
• Study frames in context
• Work out logical representation for heuristic inferences
• Funded by DFG
Using Frames for NLP applications
• LFG-based parsing and syntax-semantics interface
  • ParGram grammars for German and English (Butt et al. 2002)
  • Interfaces to statistical frame and role assignment (Baldewein et al. 2004, Erk 2004)
  • Frame projection from f-structure (XLE transfer, Crouch 2005)
• Enriching Semantic Representation
  • Rule-based refinement of semantic representation
  • Automatic assignment of SUMO/MILO classes (using WordNet WSD)
• Logical Representation and Reasoning
  • FEF (frame exchange format)
  • Translation to logic programs (joint work with P. Baumgartner and F. Suchanek, MPI Saarbrücken)
  • First scenario: RTE Challenge (PASCAL Network)
FEF Example

F-Structure
string('Jessica Litman is a law professor.').
xcomp(f(0),f(13)). tense(f(0),pres). stmt_type(f(0),declarative). pred(f(0),be). mood(f(0),indicative). dsubj(f(0),f(1)).
proper(f(1),name). pred(f(1),'Litman'). num(f(1),sg). mod(f(1),f(4)).
proper(f(4),name). pred(f(4),'Jessica'). num(f(4),sg).
subj(f(13),f(1)). pred(f(13),professor). num(f(13),sg). mod(f(13),f(16)). det_type(f(13),indef).
pred(f(16),law). num(f(16),sg).

Semantics Projection
frame(s(93),'Education_teaching'). rel(s(93),professor). ont(s(93),s(154)).
wn_syn(s(154),'professor#n#1'). sumo_sub(s(154),'Position'). milo_syn(s(154),'Professor').
rel(s(157),law). ont(s(157),s(156)).
wn_syn(s(156),'law#n#3'). sumo_sub(s(156),'Proposition'). milo_sub(s(156),'Proposition').
rel(s(166),'Jessica'). ont(s(166),s(165)).
sumo_syn(s(165),'Human').
frame(s(168),'People'). person(s(168),s(168)). person(s(168),s(166)). rel(s(168),'Litman'). ont(s(168),s(167)).
sumo_syn(s(167),'Human').

sslink(f(1),s(168)).
sslink(f(4),s(166)).
sslink(f(13),s(93)).
sslink(f(16),s(157)).
Statistical Frame Assignment (Example)
“The Royal Navy servicemen being held captive by Iran are expected to be freed today.”
statistical   (79,83)   Calendric_unit
statistical   (58,65)   Expectation
Statistical Frame Assignment (Issues)
Learning statistical frame assignment from annotated FrameNet data
• Coverage (often too few examples to learn from)
• Too little ambiguity
  • Reason: frame-wise annotation
  • E.g. “have” is listed only as an LU of Birth
  • Only 0.7% of the current 8000 LUs are ambiguous at all
  • Baseline of assigning each word its most frequent frame reaches a 93% F-score
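The most-frequent-frame baseline mentioned above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function name and the toy training pairs are assumptions invented for the example, and the counts do not reflect real FrameNet data.

```python
from collections import Counter, defaultdict


def train_most_frequent_frame(annotated):
    """Map each lemma to the frame it was annotated with most often.

    `annotated` is an iterable of (lemma, frame) pairs (hypothetical format).
    """
    counts = defaultdict(Counter)
    for lemma, frame in annotated:
        counts[lemma][frame] += 1
    # most_common(1) returns a one-element list: [(frame, count)]
    return {lemma: c.most_common(1)[0][0] for lemma, c in counts.items()}


# Toy training data, invented for illustration only.
toy_annotations = [
    ("hold", "Containing"),
    ("hold", "Containing"),
    ("hold", "Prison"),
    ("expect", "Expectation"),
]
baseline = train_most_frequent_frame(toy_annotations)
```

A lemma that never occurs in the training data gets no frame at all (`baseline.get("serviceman")` is `None`), which is the coverage gap the detour is meant to address.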
Frame Assignment via WordNet “Detour”
• Assign frame(s) on the basis of WordNet-related words
• Addresses coverage problem
• Requires WSD to WordNet
  • SenseRelate system by Ted Pedersen et al. is available
  • Alternatively: always take the first (most frequent) synset
Frame Assignment via Detour (Example)
“The Royal Navy servicemen being held captive by Iran are expected to be freed today.”
statistical      (79,83)   Calendric_unit
statistical      (58,65)   Expectation
serviceman#n#1   (16,26)   People
hold#v#20        (33,37)   Containing
captive#a#1      (38,45)   Prison
expect#v#1       (58,66)   Expectation
free#v#6         (73,78)   Emitting, Use_firearm
FN-Detour Algorithm
Input: a target word (synset)
• Use WordNet
  Search words = target word, synonyms, antonyms, hypernyms
• Look up FrameNet
  Candidate frames = all frames that list any search word as an LU
• Select and return best frame(s) from candidate frames
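The three steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the two dictionaries are tiny hand-made stand-ins for WordNet and FrameNet (their contents are invented, not taken from either resource), and selecting the frame matched by the most search words is a placeholder for the actual selection step.

```python
from collections import defaultdict

# Hypothetical stand-in for WordNet: target -> related search words.
RELATED_WORDS = {
    "serviceman": ["serviceman", "military man", "man",
                   "military personnel", "skilled worker", "worker", "person"],
}

# Hypothetical stand-in for FrameNet: frame -> lexical units (toy contents).
FRAME_LEXICAL_UNITS = {
    "People": {"person", "man"},
    "Education_teaching": {"professor", "teacher"},
}


def candidate_frames(target):
    """Steps 1 and 2: collect frames that list any search word as an LU."""
    search_words = RELATED_WORDS.get(target, [target])
    votes = defaultdict(int)
    for word in search_words:
        for frame, lexical_units in FRAME_LEXICAL_UNITS.items():
            if word in lexical_units:
                votes[frame] += 1
    return dict(votes)


def best_frames(target):
    """Step 3 (simplified): return the frame(s) with the most matches."""
    votes = candidate_frames(target)
    if not votes:
        return []
    top = max(votes.values())
    return sorted(frame for frame, n in votes.items() if n == top)
```

With the toy data, `best_frames("serviceman")` returns `["People"]`, because the search words "man" and "person" are both listed under that toy frame.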
Detour Example, Step 1: WordNet
Target: serviceman#n#1
serviceman, military man, man, military personnel
  => skilled worker, trained worker
    => worker
      => person, individual, someone, somebody, mortal, human, soul
        => organism, being
          => living thing, animate thing
            => object, physical object
              => entity
        => causal agent, cause, causal agency
          => entity
Weighting
• Factors
  • WordNet distance of FEE from target word (similarity)
  • “Spreading factor”, i.e. the number of frames a word evokes
  • Matching vs. LU lookup (boost)
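One way the three factors could be combined is sketched below. The slides do not give the actual formula, so the function, the multiplicative form, and the default boost value are all assumptions chosen only to show the direction of each factor: closer words count more, words that evoke many frames count less, and evidence from a real LU lookup is boosted over a frame-name match.

```python
def evidence_weight(distance, spreading, via_lu_lookup, lu_boost=2.0):
    """Hypothetical weight of one search word's evidence for one frame.

    distance      -- WordNet distance of the search word from the target (0 = target itself)
    spreading     -- number of frames the search word evokes (>= 1)
    via_lu_lookup -- True if found by LU lookup, False if by frame-name matching
    lu_boost      -- assumed boost factor for LU lookup (illustrative value)
    """
    if distance < 0 or spreading < 1:
        raise ValueError("distance must be >= 0 and spreading >= 1")
    similarity = 1.0 / (1 + distance)          # decays with WordNet distance
    boost = lu_boost if via_lu_lookup else 1.0  # name matches get no boost
    return boost * similarity / spreading       # penalize ambiguous words
```

For instance, `evidence_weight(0, 1, True)` is larger than `evidence_weight(3, 1, True)`, and an ambiguous word (`spreading=2`) contributes half as much as an unambiguous one at the same distance.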
Special: Matching Frame Names
• E.g. the frame Research does not (yet) list the noun researcher as an LU
• If there is no LU for a given word, the Detour system looks for matching frame names
• Lower weighting for a match
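A crude version of such a fallback is sketched below. How the Detour system actually compares words with frame names is not stated on the slide; the prefix test and the `min_prefix` threshold are assumptions made for this illustration.

```python
def matching_frames(word, frame_names, min_prefix=5):
    """Return frame names that share a prefix relation with `word`.

    Hypothetical heuristic: a frame matches if its lowercased name is a
    prefix of the word or vice versa, and both are at least `min_prefix`
    characters long (to avoid spurious short matches).
    """
    word = word.lower()
    matches = []
    for name in frame_names:
        key = name.lower().replace("_", " ")
        if len(key) < min_prefix or len(word) < min_prefix:
            continue
        if word.startswith(key) or key.startswith(word):
            matches.append(name)
    return matches


# Frame names taken from the slides; the list is deliberately tiny.
frames = ["Research", "People", "Statement"]
```

Here `matching_frames("researcher", frames)` yields `["Research"]`, mirroring the example on the slide; in a full system such a match would receive a lower weight than a genuine LU lookup.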
Matching Example
• Target: researcher#n#1
research worker, researcher, investigator
  => scientist, man of science
    => person, individual, someone, somebody, …
      => organism, being
        => living thing, animate thing
          => object, physical object
            => entity
      => causal agent, cause, causal agency
        => entity
Matching Example (ctd) • Target: researcher#n#1
Evaluation
• Problem: no gold standard readily available
  • FrameNet data (100,000 annotated instances)
  • All annotated words are LUs of some frame
  • Detour not really necessary
• Solution: a detour-only version of our system, which must not look up the target word itself
First Evaluation Results (detour-only)
Table 1: Frame assignment of detour-only system (FrameNet corpus).
80,000 frame instances (60,000 verb, 20,000 noun, 20,000 adj./adv.)
Recent Evaluation Results (detour-only)
• “Return best frame(s)” condition may be too strict (ambiguity is there)
  • Take first and second best result frame(s)
  • Gold standard contained in result: +10%
  • Number of returned frames rises from 1.3 to 3
• Does the WSD system help?
  • “Always take first synset” is slightly better (+4%)
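The "gold standard contained in result" measure, for either the best frame(s) only or the first and second best, could be computed along these lines. This is a sketch under assumptions: the data layout (a frame-to-weight dictionary plus a gold frame per instance) and both function names are hypothetical, and the toy numbers are invented.

```python
def top_tiers(weights, tiers=1):
    """Frames whose weight is among the `tiers` highest distinct values."""
    best_values = sorted(set(weights.values()), reverse=True)[:tiers]
    return {frame for frame, w in weights.items() if w in best_values}


def gold_contained_rate(examples, tiers=1):
    """Fraction of instances whose gold frame is among the returned frames.

    `examples` is a list of (weights, gold_frame) pairs (hypothetical format).
    """
    if not examples:
        raise ValueError("no examples to evaluate")
    hits = sum(1 for weights, gold in examples if gold in top_tiers(weights, tiers))
    return hits / len(examples)


# Toy instances with made-up frame names and weights.
toy_examples = [
    ({"A": 3, "B": 2, "C": 1}, "A"),
    ({"A": 3, "B": 2, "C": 1}, "B"),
    ({"A": 3, "B": 3, "C": 1}, "C"),
    ({"A": 1}, "A"),
]
```

On the toy data, allowing the second-best tier raises the rate from 0.5 to 1.0, at the cost of returning more frames per instance, which is the same trade-off the slide reports.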
Evaluation (full system)
• Coverage: 96%
• Gold standard in (best) result: 83%
• WSD not always optimal
• Ambiguity leads to a higher weighting of another frame
Issues
• Just to mention: frames only (no roles)
• Weighting hand-crafted, improvement possible?
• Threshold needed (“Is there a frame that fits?”)
• What about German?
  • Access to GermaNet
  • Available Perl packages for WordNet 2.0
  • WSD system as well
  • Encoding problems (“Period of transition”)
  • German FrameNet data not (yet) in Berkeley format
  • Coverage?
Conclusion and Outlook
• Detour via WordNet allows assignment of FrameNet frames in many “unknown” cases
• Still: this is the beginning of a journey
• Web interface (link on my homepage): http://www.coli.uni-saarland.de/~albu/cgi-bin/FN-Detour.cgi
• Student project to
  • Prepare release
  • More evaluation
  • Learning of weighting?
  • Transfer to German?