KYOTO (ICT-211423)
Knowledge Yielding Ontologies for Transition-Based Organization
Intelligent Content and Semantics
The First KYOTO Workshop, February 2-3 2009
Overall Kyoto Architecture and Kyoto Annotation Format
Carlo Aliprandi - SyNTHEMA
Kyoto Architecture - Baselines
• KYOTO: an information sharing system that enables the extraction of deep semantics (Web 3.0) from texts, for a selected domain, anchoring meaning across cultures and languages
• KYOTO: a social platform (Web 2.0) for knowledge sharing and transfer, supporting people and organizations in building, maintaining and improving knowledge
• Baselines for the KYOTO architecture:
  • Strong backbone for data exchange among components
  • Adopt and adapt existing standards
  • Open and public system
  • Synchronize across versions/languages/NLP tools/research groups
  • API to connect to sources and services
  • Services to plug and unplug different knowledge sources (lexicons, wordnets, ontologies)
  • Trade-off between generic and domain resources
The First KYOTO Workshop, Amsterdam, February 2-3 2009
System components
• Capture Server: system for selecting, converting and storing documents into the Kyoto document DB; linguistic processors producing KAF annotations
• Wikyoto system: wiki system for yielding wordnets and ontologies; main interface for concept and fact users
  • Document Manager
  • Term Editor
  • Kybot Editor
System components
• Tybot Server
  • Automatic term and relation extraction from KAF documents and population of the term database
  • Validation of terms, and population and mapping to D-WNs via Wikyoto
• Kybots Server: semi-automatic fact annotation on KAF documents, using patterns (Kybots)
• Kyoto Search system: main interface for end users
  • Fact search system
  • Fact alert system
(Simplified) architecture: domain expert point of view
Overall architecture
[Diagram; labels: Wordnets, Wordnet (Chinese), Wordnet (Japanese), Wordnet (Dutch), Wordnet (Spanish), Domain Wordnet, Japan Term DB, Basque Term DB, Extracted Terms, Tybot Server, Capture Server, Indexing Server, Kybot Server, Wikyoto, Term Editor, Doc. Manager, Kybot Editor, Document Base, Kybots DB, Ontologies, SUMO, DOLCE, FrameNet, Domain Ontology, Kyoto Ontology, Kyoto System, Search App., Browse, Linguistic Processor, L.P. (Italian), L.P. (Dutch), L.P. (Basque), L.P. (English), Concept User, Fact User, End User; numbered markers [1], [2], [3]]
Data formats: KAF
• Kyoto Annotation Format (Level 1): a multi-layered annotation format for:
  • Tokenization and word form segmentation
  • POS tagging
  • Lemmatization and term extraction
  • Constituency tagging
  • Dependency tagging
ENG-3.0-107695012-N
Semantic Annotation
• Semantic Annotation Format for:
  • Named Entity Recognition (time, events, quant. …)
  • Word Sense Disambiguation (D-WSD)
  • Semantic Role Labeling (SRL), no synsets
• KAF level 2 (SemKAF)
ENG-3.0-107630294-N
Data formats
Level of annotation          Standard format
Morpho-syntax annotation     KAF
Semantic annotation          KAF
Terms representation         TMF
Facts annotation             KAF
Wordnets                     LMF
Ontologies                   OWL
KAF annotation: words
<text>
  <wf wid="w1" sent="1" para="1">Tropical</wf>
  <wf wid="w2" sent="1" para="1">terrestrial</wf>
  <wf wid="w3" sent="1" para="1">species</wf>
  <wf wid="w4" sent="1" para="1">populations</wf>
  <wf wid="w5" sent="1" para="1">declined</wf>
  <wf wid="w6" sent="1" para="1">by</wf>
  <wf wid="w7" sent="1" para="1">55</wf>
  <wf wid="w8" sent="1" para="1">per</wf>
  <wf wid="w9" sent="1" para="1">cent</wf>
  <wf wid="w10" sent="1" para="1">on</wf>
  <wf wid="w11" sent="1" para="1">average</wf>
  <wf wid="w12" sent="1" para="1">from</wf>
  <wf wid="w13" sent="1" para="1">1970</wf>
  <wf wid="w14" sent="1" para="1">to</wf>
  <wf wid="w15" sent="1" para="1">2003</wf>
</text>
Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
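To make the word layer concrete, here is a minimal Python sketch that reads `<wf>` elements and regroups them into sentences using the `sent` attribute. It only assumes the attribute names shown on the slide; the function name `sentences_from_text_layer` is hypothetical and is not part of any KYOTO tool.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the slide (first five word forms only).
KAF_TEXT = """
<text>
  <wf wid="w1" sent="1" para="1">Tropical</wf>
  <wf wid="w2" sent="1" para="1">terrestrial</wf>
  <wf wid="w3" sent="1" para="1">species</wf>
  <wf wid="w4" sent="1" para="1">populations</wf>
  <wf wid="w5" sent="1" para="1">declined</wf>
</text>
"""


def sentences_from_text_layer(xml_fragment):
    """Group word forms by their `sent` attribute, keeping document order."""
    root = ET.fromstring(xml_fragment)
    sentences = {}
    for wf in root.iter("wf"):
        sentences.setdefault(wf.get("sent"), []).append(wf.text)
    # Join tokens with single spaces; real detokenization would need more care.
    return {sent: " ".join(words) for sent, words in sentences.items()}


sentences = sentences_from_text_layer(KAF_TEXT)
print(sentences)
```

Because every `<wf>` carries its own `wid`, the higher layers can point at tokens by identifier instead of duplicating the text.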
KAF annotation: terms
<term tid="t5" type="open" lemma="decline" pos="V">
  <spans>
    <target id="w5"/>
  </spans>
</term>
<term tid="t7" type="open" lemma="55 per cent" pos="N">
  <spans>
    <target id="w7"/>
    <target id="w8"/>
    <target id="w9"/>
  </spans>
</term>
Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
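The term layer refers back to word forms through `<target id="..."/>`. The sketch below resolves each term to its lemma and surface string. The enclosing `<kaf>` and `<terms>` elements are assumptions added only so that the fragments parse as one document, and `term_surface_forms` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# <kaf> and <terms> wrappers are assumed for illustration; the <wf> and
# <term> elements are taken from the slides.
KAF_DOC = """
<kaf>
  <text>
    <wf wid="w5" sent="1" para="1">declined</wf>
    <wf wid="w6" sent="1" para="1">by</wf>
    <wf wid="w7" sent="1" para="1">55</wf>
    <wf wid="w8" sent="1" para="1">per</wf>
    <wf wid="w9" sent="1" para="1">cent</wf>
  </text>
  <terms>
    <term tid="t5" type="open" lemma="decline" pos="V">
      <spans><target id="w5"/></spans>
    </term>
    <term tid="t7" type="open" lemma="55 per cent" pos="N">
      <spans><target id="w7"/><target id="w8"/><target id="w9"/></spans>
    </term>
  </terms>
</kaf>
"""


def term_surface_forms(xml_doc):
    """Map each term id to (lemma, surface text) by following span targets."""
    root = ET.fromstring(xml_doc)
    words = {wf.get("wid"): wf.text for wf in root.iter("wf")}
    result = {}
    for term in root.iter("term"):
        word_ids = [target.get("id") for target in term.iter("target")]
        surface = " ".join(words[word_id] for word_id in word_ids)
        result[term.get("tid")] = (term.get("lemma"), surface)
    return result


terms = term_surface_forms(KAF_DOC)
print(terms)
```

A multiword term such as `t7` is simply a term whose span lists several word identifiers.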
KAF annotation: constituents
<chunks>
  <!-- terrestrial species -->
  <chunk cid="2" head="t3" phrase="NP">
    <spans>
      <target id="t2"/>
      <target id="t3"/>
    </spans>
  </chunk>
  <!-- terrestrial species populations -->
  <chunk cid="3" head="t4" phrase="NP">
    <spans>
      <target id="t2"/>
      <target id="t3"/>
      <target id="t4"/>
    </spans>
  </chunk>
  <!-- Tropical terrestrial species -->
  <chunk cid="4" head="t3" phrase="NP">
    <spans>
      <target id="t1"/>
      <target id="t2"/>
      <target id="t3"/>
    </spans>
  </chunk>
</chunks>
Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
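Chunks point at terms, and terms point at word forms, so recovering the text of a chunk takes two lookups. The following sketch illustrates that chain. The `<kaf>` and `<terms>` wrappers and the one-to-one spans for `t1` to `t3` are assumptions made for the example (the slides only show the span of `t4`), and `chunk_texts` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Wrapper elements and the spans of t1..t3 are assumed for illustration.
KAF_DOC = """
<kaf>
  <text>
    <wf wid="w1">Tropical</wf>
    <wf wid="w2">terrestrial</wf>
    <wf wid="w3">species</wf>
    <wf wid="w4">populations</wf>
  </text>
  <terms>
    <term tid="t1" lemma="tropical" pos="G"><spans><target id="w1"/></spans></term>
    <term tid="t2" lemma="terrestrial" pos="G"><spans><target id="w2"/></spans></term>
    <term tid="t3" lemma="species" pos="N"><spans><target id="w3"/></spans></term>
    <term tid="t4" lemma="population" pos="N"><spans><target id="w4"/></spans></term>
  </terms>
  <chunks>
    <chunk cid="2" head="t3" phrase="NP">
      <spans><target id="t2"/><target id="t3"/></spans>
    </chunk>
    <chunk cid="3" head="t4" phrase="NP">
      <spans><target id="t2"/><target id="t3"/><target id="t4"/></spans>
    </chunk>
  </chunks>
</kaf>
"""


def chunk_texts(xml_doc):
    """Return (cid, phrase type, surface text, head lemma) for every chunk."""
    root = ET.fromstring(xml_doc)
    words = {wf.get("wid"): wf.text for wf in root.iter("wf")}
    term_words = {}
    term_lemma = {}
    for term in root.iter("term"):
        term_words[term.get("tid")] = [words[t.get("id")] for t in term.iter("target")]
        term_lemma[term.get("tid")] = term.get("lemma")
    chunks = []
    for chunk in root.iter("chunk"):
        # Chunk targets are term ids; expand each one to its word forms.
        tokens = [w for t in chunk.iter("target") for w in term_words[t.get("id")]]
        chunks.append((chunk.get("cid"), chunk.get("phrase"),
                       " ".join(tokens), term_lemma[chunk.get("head")]))
    return chunks


chunks = chunk_texts(KAF_DOC)
for row in chunks:
    print(row)
```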
KAF annotation: dependencies
<deps>
  <dep from="t4" to="t5" rfunc="subj"/>
  <dep from="t4" to="t1" rfunc="mod"/>
  <dep from="t4" to="t2" rfunc="mod"/>
  <dep from="t4" to="t3" rfunc="mod"/>
</deps>
<term tid="t1" type="open" lemma="tropical" pos="G">
..
<term tid="t2" type="open" lemma="terrestrial" pos="G">
..
<term tid="t3" type="open" lemma="species" pos="N">
..
<term tid="t4" type="open" lemma="population" pos="N">
..
<term tid="t5" type="open" lemma="decline" pos="V">
..
Tropical terrestrial species populations declined by 55 per cent on average from 1970 to 2003.
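The dependency layer links term identifiers with a labelled relation (`rfunc`). As a rough sketch, the relations can be collected per `from` term and printed with lemmas instead of identifiers. The lemma table is copied from the `<term>` elements on the slide; `outgoing_relations` is a hypothetical helper, and the code treats `from` and `to` purely as attribute names without assuming which end is the syntactic head.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DEPS = """
<deps>
  <dep from="t4" to="t5" rfunc="subj"/>
  <dep from="t4" to="t1" rfunc="mod"/>
  <dep from="t4" to="t2" rfunc="mod"/>
  <dep from="t4" to="t3" rfunc="mod"/>
</deps>
"""

# Lemmas taken from the term elements shown on the slide.
LEMMAS = {
    "t1": "tropical",
    "t2": "terrestrial",
    "t3": "species",
    "t4": "population",
    "t5": "decline",
}


def outgoing_relations(xml_fragment, lemmas):
    """Group (rfunc, target lemma) pairs under the lemma of the `from` term."""
    root = ET.fromstring(xml_fragment)
    relations = defaultdict(list)
    for dep in root.iter("dep"):
        source = lemmas[dep.get("from")]
        relations[source].append((dep.get("rfunc"), lemmas[dep.get("to")]))
    return dict(relations)


relations = outgoing_relations(DEPS, LEMMAS)
print(relations)
```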
KAF annotation: WSD
<term tid="t4" type="open" lemma="population" pos="N">
  <spans>
    <target id="w4"/>
  </spans>
  <senseAlt>
    <sense sensecode="EN-17-00861095-n"/>
    <sense sensecode="EN-17-00859568-n"/>
    .......
<term tid="t4" type="open" lemma="population" pos="N">
  <spans>
    <target id="w4"/>
  </spans>
  <senseAlt>
    <sense sensecode="EN-17-00859568-n" confidence="0.80"/>
    <sense sensecode="EN-17-00257849-n" confidence="0.13"/>
    <sense sensecode="EN-17-00962397-n" confidence="0.07"/>
  </senseAlt>
</term>
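Once word sense disambiguation has attached a `confidence` to each `<sense>`, a consumer can pick the preferred sense by taking the maximum. The sketch below does this for the `population` term, with the attribute quoting normalized so that the fragment parses. `best_sense` is a hypothetical helper, and choosing the top-scoring sense is only one possible policy.

```python
import xml.etree.ElementTree as ET

TERM = """
<term tid="t4" type="open" lemma="population" pos="N">
  <spans><target id="w4"/></spans>
  <senseAlt>
    <sense sensecode="EN-17-00859568-n" confidence="0.80"/>
    <sense sensecode="EN-17-00257849-n" confidence="0.13"/>
    <sense sensecode="EN-17-00962397-n" confidence="0.07"/>
  </senseAlt>
</term>
"""


def best_sense(term_xml):
    """Return (sensecode, confidence) of the highest-confidence sense."""
    term = ET.fromstring(term_xml)
    senses = [(float(sense.get("confidence")), sense.get("sensecode"))
              for sense in term.iter("sense")]
    confidence, sensecode = max(senses)
    return sensecode, confidence


print(best_sense(TERM))
```

Keeping the lower-ranked alternatives in the annotation, rather than discarding them, lets later components revisit the choice.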
Kyoto openness
• The kernel of the system
  • Core components available as open source
  • Integrating existing resources
  • Usable by anybody in the 7 Kyoto languages
  • Fast delivery: at M12, beta available for several components (Capture Server, LPs, Tybot server, Wikyoto …)
• Third-party resources as plug-ins
  • Third-party (open source) linguistic processors
  • New languages
  • Search Interface
  • Fact Alert System - News Monitoring System
Thanks