WordNet Enhancements: Toward Version 2.0
George A. Miller, Christiane Fellbaum
Princeton University
Maintenance of WordNet
• WordNet version 1.7.1 is being downloaded at the rate of 100 copies per week.
• The Users Group was supported and queries were answered.
• Coverage increased by 2,600 to a total of 142,600 words or word strings. Many of the new words are related to terrorism.
Connectivity of WordNet
• Disambiguation of definitions.
• Morphosemantic links among nouns and verbs, to be extended to adjectives.
• Bidirectional links among synsets based on topic/semantic domain, geographical relation, or usage.
Disambiguation of Definitions
• Procedure: link the nouns, verbs, adjectives, and adverbs in definitions to the context-appropriate synsets.
• Steps:
  --Definition of a stoplist
  --Pre-processing
  --Automatic insertion of links where possible (as for monosemous words)
• Currently testing an application that allows manual insertion of links.
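The automatic step can be illustrated with a minimal Python sketch. It assumes a toy stoplist and a toy sense inventory (`STOPLIST`, `TOY_SENSES`, the sense identifiers, and the `pending` label are all hypothetical, not WordNet data), and it stands in for pre-processing with simple lowercasing and tokenizing. A word gets a link automatically only when exactly one candidate sense exists; everything else is left for manual tagging.

```python
import re

# Hypothetical stoplist: function words that receive no sense link.
STOPLIST = {"by", "of", "the", "often"}

# Hypothetical toy inventory: word form -> candidate sense identifiers.
TOY_SENSES = {
    "cultivate": ["cultivate.s1", "cultivate.s2"],
    "growing": ["grow.s1", "grow.s2", "grow.s3"],
    "agricultural": ["agricultural.s1"],
    "techniques": ["technique.s1"],
}


def tag_definition(definition):
    """Return (token, tag, sense) triples for one definition string."""
    tagged = []
    # Pre-processing stand-in: lowercase and keep alphabetic tokens only.
    for token in re.findall(r"[a-z]+", definition.lower()):
        candidates = TOY_SENSES.get(token, [])
        if token in STOPLIST:
            tagged.append((token, "ignore", None))
        elif len(candidates) == 1:
            # Monosemous: the only candidate can be linked automatically.
            tagged.append((token, "auto", candidates[0]))
        else:
            # Polysemous or unknown: leave for a human annotator.
            tagged.append((token, "pending", None))
    return tagged


result = tag_definition("cultivate by growing agricultural techniques")
for row in result:
    print(row)
```

A real pipeline would also need lemmatization and multi-word expression detection before lookup; those are omitted here to keep the sketch short.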
Making the Semantic Information in the Definitions Available as a Searchable Corpus
• XML markup notation
• General structure allowing for different kinds of WN synset annotation
  --Princeton-defined annotation is gloss markup
  --Users can define their own annotations, adhering to Princeton’s general structure
General Structure of Markup
<wordnet ver="1.7.1">
  <synset pos="n" ofs="…"> (offset)
    <gloss desc="wsd">
      <aux>…</aux>
      <def>…</def>
      <ex>…</ex>
    </gloss>
    optional user-defined synset markup
    ex. <gloss desc="parse">…</gloss>
    ex. <user-defined>…</user-defined>
  </synset>
</wordnet>
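As a rough illustration of how this structure could be consumed, the following Python sketch selects glosses by their `desc` attribute using the standard library's `xml.etree.ElementTree`. The sample document is hypothetical: the offset `00000000` and the placeholder texts are invented for the example, and only the element and attribute names come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the general structure; the offset and
# the text content are placeholders, not real WordNet data.
SAMPLE = """
<wordnet ver="1.7.1">
  <synset pos="n" ofs="00000000">
    <gloss desc="wsd">
      <def>placeholder definition</def>
      <ex>placeholder example</ex>
    </gloss>
    <gloss desc="parse">placeholder parse annotation</gloss>
  </synset>
</wordnet>
"""


def glosses_by_kind(xml_text, kind):
    """Return (pos, offset, gloss element) for glosses whose desc == kind."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for synset in root.iter("synset"):
        for gloss in synset.findall("gloss"):
            if gloss.get("desc") == kind:
                found.append((synset.get("pos"), synset.get("ofs"), gloss))
    return found


wsd = glosses_by_kind(SAMPLE, "wsd")
for pos, ofs, gloss in wsd:
    print(pos, ofs, gloss.findtext("def"))
```

Filtering on `desc` is what lets Princeton's gloss markup and a user-defined annotation sit side by side in the same synset without interfering with each other.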
Markup for a Tagged Gloss
grow, raise, farm, produce – (cultivate by growing, often involving improvements by means of agricultural techniques; “We raise hogs here”)
<def>
  <wf tag="auto" lemma="cultivate%2" sk="2:36:00::">cultivate</wf>
  <wf tag="ignore">by</wf>
  <wf tag="man" lemma="grow%2" sk="2:36:00::">growing</wf>
  <wf tag="ignore" type="punc">,</wf>
  <wf tag="man" lemma="often%4" sk="4:02:00::">often</wf>
  <wf tag="man" lemma="involve%2" sk="2:42:00::">involving</wf>
  <wf tag="man" lemma="improvement%1" sk="1:11:00::">improvements</wf>
  <mwf type="other" tag="ignore" lemma="by_means_of">
    <wf tag="ignore">by</wf><wf tag="cf">means</wf><wf tag="ignore">of</wf>
  </mwf>
  <wf tag="auto" lemma="agricultural%3" sk="3:01:00::">agricultural</wf>
  <wf tag="man" lemma="technique%1" sk="1:09:00::">techniques</wf>
</def>
<ex>“We <wf tag="man" lemma="raise%2" sk="2:36:03::">raise</wf> hogs here”</ex>
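To show how the tagged gloss could serve as a searchable corpus, here is a minimal Python sketch that pulls the sense-tagged word forms out of a `<def>` element. The fragment is an excerpt of the markup above. Joining the lemma with the `sk` value into a single `lemma%sk` string is an assumption about how the two attributes relate, not something the slide states.

```python
import xml.etree.ElementTree as ET

# Excerpt of the tagged definition shown on the slide.
DEF_XML = """
<def>
  <wf tag="auto" lemma="cultivate%2" sk="2:36:00::">cultivate</wf>
  <wf tag="ignore">by</wf>
  <wf tag="man" lemma="grow%2" sk="2:36:00::">growing</wf>
  <mwf type="other" tag="ignore" lemma="by_means_of">
    <wf tag="ignore">by</wf><wf tag="cf">means</wf><wf tag="ignore">of</wf>
  </mwf>
  <wf tag="auto" lemma="agricultural%3" sk="3:01:00::">agricultural</wf>
</def>
"""


def tagged_words(def_xml):
    """Return (surface form, tag, sense key) for every <wf> carrying an sk."""
    root = ET.fromstring(def_xml.strip())
    rows = []
    for wf in root.iter("wf"):  # document order, including <wf> inside <mwf>
        sk = wf.get("sk")
        if sk is None:
            continue  # ignored words and punctuation carry no sense link
        lemma = wf.get("lemma").split("%")[0]
        # Assumption: lemma + "%" + sk forms the full sense key.
        rows.append((wf.text, wf.get("tag"), f"{lemma}%{sk}"))
    return rows


rows = tagged_words(DEF_XML)
for surface, tag, key in rows:
    print(f"{surface:14s} {tag:5s} {key}")
```

Keeping the `tag` value in each row makes it possible to separate automatically inserted links (`auto`) from manually inserted ones (`man`) when auditing the corpus.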
Morphosemantic Links
• 42,000 morphosemantic links between nouns and verbs.
• Preparatory work to link
  --nouns, adjectives (consul/consular; frequent/frequency)
  --adjectives, verbs (bright/brighten)
  --nouns to nouns (king/kingdom)
New Ways of Accessing WordNet
• Interlinking of synsets
  --belonging to a domain (or topic)
  --relating to a geographic region
  --with particular usage or grammatical properties
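One way to picture this kind of interlinking is an index that stores every synset-to-domain link in both directions, so that lookups work from either end. The Python sketch below is only an illustration: the `DomainIndex` class and the synset and domain labels are hypothetical, and nothing here reflects how WordNet itself stores these links.

```python
from collections import defaultdict


class DomainIndex:
    """Toy bidirectional index between synsets and domains."""

    def __init__(self):
        self._members = defaultdict(set)  # domain -> member synsets
        self._domains = defaultdict(set)  # synset -> domains it belongs to

    def link(self, synset, domain):
        # Record the link in both directions at once.
        self._members[domain].add(synset)
        self._domains[synset].add(domain)

    def members_of(self, domain):
        return sorted(self._members.get(domain, ()))

    def domains_of(self, synset):
        return sorted(self._domains.get(synset, ()))


index = DomainIndex()
# Hypothetical labels, for illustration only.
index.link("synset:hijack", "domain:terrorism")
index.link("synset:bioterrorism", "domain:terrorism")
index.link("synset:hijack", "domain:aviation")

print(index.members_of("domain:terrorism"))
print(index.domains_of("synset:hijack"))
```

The same shape would work for geographic-region or usage links by treating the region or usage label as the second key.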