ISO/TC37/SC4/TDG6 Language Resource Ontologies 2008-05-25, Marrakech HASIDA Koiti hasida.k@aist.go.jp CfSR, AIST, Japan
TDG6 Issues • ontologization • DC, LAF, LMF, FS, MAF, SemAF, SynAF, TDG3, etc. • Cf. the Pisa group’s work on LMF • extension of RDF (and ontology framework) to more straightforwardly address linguistic information • extended RDF instead of XML • nodes embedding nodes … rdf:Container? • publish TRs • launch ISs
Ontologization • ontology-based reformulation • Most current standards are based on XML and lack a standard framework for semantic interpretation. • not XML but RDF as the base description and modeling tool • Semantic interpretation is standardized for RDF, not for XML. • ontology as schema • not DTD, XML Schema, RELAX NG, etc.
Motivations of Ontologization • Lack of a formal tool with which to write schemas that fully address the specifications in ISs. • The DCR model lacks descriptive power.
Weaknesses of DCR Metamodel • The DCR metamodel cannot express: • sorts of DCs, such as unary predicate, binary relation, symmetric binary relation, etc. • types of the domain (1st arg.) and the range (2nd arg.) of binary relations (properties)
Semantic Mess of XML • Semantic interpretation of XML is not standardized but rather arbitrary. • Many inconsistent ‘standards’ on overlapping issues. • Huge standards that mix many different manners of semantic interpretation. • e.g., MPEG-7: more than 2000 pages
RDF • Resource Description Framework • W3C recommendation http://www.w3.org/RDF/ • basis of ontology standards such as RDFS, OWL, and SKOS. • graph data model • textual representation • XML • N3
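RDF Graph
http://meetings.example.com/cal#m1 --m:homePage--> http://meetings.example.com/m1/hp
http://www.example.org/people#fred --m:attending--> http://meetings.example.com/cal#m1
http://www.example.org/people#fred --m:givenName--> Fred
http://www.example.org/people#fred --m:hasEmail--> mailto:fred@example.com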
Cf. RDF in Text
XML:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:m="http://www.example.org/meeting_organization#"
         xmlns="http://www.example.org/people#"
         xmlns:p="http://www.example.org/personal_details#">
  <rdf:Description rdf:about="http://meetings.example.com/cal#m1">
    <m:homePage rdf:resource="http://meetings.example.com/m1/hp"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.org/people#fred">
    <m:attending rdf:resource="http://meetings.example.com/cal#m1"/>
    <p:GivenName>Fred</p:GivenName>
    <p:hasEmail rdf:resource="mailto:fred@example.com"/>
  </rdf:Description>
</rdf:RDF>
N3:
@prefix p: <http://www.example.org/personal_details#> .
@prefix m: <http://www.example.org/meeting_organization#> .
<http://meetings.example.com/cal#m1> m:homePage <http://meetings.example.com/m1/hp> .
<http://www.example.org/people#fred> p:GivenName "Fred";
    p:hasEmail <mailto:fred@example.com>;
    m:attending <http://meetings.example.com/cal#m1> .
Let’s forget these texts and use graphs!
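The "use graphs" point can be made concrete with a minimal sketch in which the graph is nothing more than a set of (subject, predicate, object) triples. The helper names `objects` and `attended_home_pages` are hypothetical, and the sketch deliberately ignores the distinction between IRIs and literals by treating both as plain strings.

```python
# Namespace prefixes taken from the N3 example on the slide.
M = "http://www.example.org/meeting_organization#"
P = "http://www.example.org/personal_details#"

MEETING = "http://meetings.example.com/cal#m1"
FRED = "http://www.example.org/people#fred"

# An RDF graph modeled as a set of (subject, predicate, object) triples.
# Simplification: IRIs and literals are both plain Python strings here.
graph = {
    (MEETING, M + "homePage", "http://meetings.example.com/m1/hp"),
    (FRED, M + "attending", MEETING),
    (FRED, P + "GivenName", "Fred"),
    (FRED, P + "hasEmail", "mailto:fred@example.com"),
}


def objects(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def attended_home_pages(graph, person):
    """Follow person --attending--> meeting --homePage--> page."""
    pages = set()
    for meeting in objects(graph, person, M + "attending"):
        pages |= objects(graph, meeting, M + "homePage")
    return pages
```

Whichever textual syntax the triples came from, queries such as `attended_home_pages(graph, FRED)` operate on the graph alone.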
ISO 24610: Feature Structure • typed feature structure as in HPSG, etc. • ISO 24610-1: Feature Structure Representation • ISO 24610-2: Feature System Declaration • graph model • AVM (attribute-value matrix) • textual encoding by XML
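FS Graph
(root) --SPECIFIER--> (s)
(root) --HEAD--> (h)
(s) --POS--> determiner
(s) --ORTH--> la
(s) --AGR--> (a)
(h) --POS--> noun
(h) --ORTH--> pomme
(h) --AGR--> (a)
(a) --NUMBER--> singular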
FS in AVM
[ SPECIFIER [ POS  determiner
              ORTH ‘la’
              AGR  [1][ NUMBER singular ] ]
  HEAD      [ POS  noun
              ORTH ‘pomme’
              AGR  [1] ] ]
FS in XML
<fs>
  <f name="specifier">
    <fs>
      <f name="pos"><symbol value="determiner"/></f>
      <f name="orth"><string>la</string></f>
      <f name="agr">
        <var label="n1">
          <fs><f name="number"><symbol value="singular"/></f></fs>
        </var>
      </f>
    </fs>
  </f>
  <f name="head">
    <fs>
      <f name="pos"><symbol value="noun"/></f>
      <f name="orth"><string>pomme</string></f>
      <f name="agr"><var label="n1"/></f>
    </fs>
  </f>
</fs>
Let’s forget this, too!
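FS in RDF Graph (= FS Graph)
(root) --SPECIFIER--> (s)
(root) --HEAD--> (h)
(s) --POS--> determiner
(s) --ORTH--> la
(s) --AGR--> (a)
(h) --POS--> noun
(h) --ORTH--> pomme
(h) --AGR--> (a)
(a) --NUMBER--> singular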
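That a feature structure is already an RDF-style graph can be sketched by flattening a nested structure into triples. The function `fs_to_triples` and the blank-node names `_:n0`, `_:n1`, ... are assumptions made for illustration; the key point is that a shared value (the AGR of the specifier and of the head) becomes a single node reached by two edges.

```python
from itertools import count


def fs_to_triples(fs):
    """Flatten a nested-dict feature structure into (node, feature, value) triples.

    Every dict becomes one blank node. A dict object that is reachable along
    two paths (structure sharing) is still given exactly one node.
    """
    node_ids = {}      # id(dict) -> blank-node name
    triples = set()
    counter = count()

    def node_for(structure):
        key = id(structure)
        if key not in node_ids:
            node_ids[key] = f"_:n{next(counter)}"
            for feature, value in structure.items():
                target = node_for(value) if isinstance(value, dict) else value
                triples.add((node_ids[key], feature, target))
        return node_ids[key]

    root = node_for(fs)
    return root, triples


# The "la pomme" example: one AGR value shared by SPECIFIER and HEAD.
agr = {"NUMBER": "singular"}
la_pomme = {
    "SPECIFIER": {"POS": "determiner", "ORTH": "la", "AGR": agr},
    "HEAD": {"POS": "noun", "ORTH": "pomme", "AGR": agr},
}
root, triples = fs_to_triples(la_pomme)
```

Both AGR edges in `triples` point at the same blank node, which is what the index [1] expresses in the AVM notation.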
Ontologies Subsume Feature Systems • Features are partial functions, whereas RDF properties are relations in general (possibly partial functions). • Usual feature systems have no taxonomy of features, whereas usual ontologies have taxonomies of properties (e.g., due to rdfs:subPropertyOf).
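Both differences can be illustrated with a small sketch over plain triples. The function names and the example properties `form` and `relatedTo` are hypothetical; only `orth` comes from the slides. `is_functional` tests whether a property behaves like a feature (at most one value per subject), and `values` looks up a property through a property taxonomy in the spirit of rdfs:subPropertyOf.

```python
def is_functional(triples, prop):
    """True if no subject has two different values for prop (a partial function)."""
    seen = {}
    for s, p, o in triples:
        if p != prop:
            continue
        if s in seen and seen[s] != o:
            return False
        seen[s] = o
    return True


def super_properties(sub_property_of, prop):
    """Return prop together with all of its direct and indirect super-properties."""
    result = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for sub, sup in sub_property_of:
            if sub == current and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result


def values(triples, sub_property_of, subject, prop):
    """Objects related to subject by prop or by any sub-property of prop."""
    return {
        o
        for s, p, o in triples
        if s == subject and prop in super_properties(sub_property_of, p)
    }


# Hypothetical data: "orth" is feature-like, "relatedTo" is a general relation.
example_triples = {
    ("w1", "orth", "pomme"),
    ("w1", "relatedTo", "w2"),
    ("w1", "relatedTo", "w3"),
}
# Hypothetical taxonomy: orth is declared a sub-property of form.
example_taxonomy = {("orth", "form")}
```

Here `orth` passes the functional test while `relatedTo` does not, and asking for `form` values returns the `orth` value through the taxonomy.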
Feature Structure Declaration
<fsDecl type="word" baseTypes="sign">
  <fsDescr>The fundamental type for individual words</fsDescr>
  <fDecl name="orth">
    <fDescr>The orthographic representation for this word</fDescr>
    <vRange><string/></vRange>
  </fDecl>
</fsDecl>
word --rdfs:subClassOf--> sign
word --rdfs:comment--> The fundamental type for individual words
orth --rdf:type--> owl:FunctionalProperty
orth --rdfs:comment--> The orthographic representation for this word
orth --rdfs:domain--> word
orth --rdfs:range--> string
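The correspondence between the declaration and the ontology graph is mechanical enough to sketch. The function `fsdecl_to_triples` is a hypothetical helper that covers only the elements shown on this slide, and it writes the RDFS/OWL terms as prefixed strings rather than full IRIs.

```python
import xml.etree.ElementTree as ET

FSD = """<fsDecl type="word" baseTypes="sign">
  <fsDescr>The fundamental type for individual words</fsDescr>
  <fDecl name="orth">
    <fDescr>The orthographic representation for this word</fDescr>
    <vRange><string/></vRange>
  </fDecl>
</fsDecl>"""


def fsdecl_to_triples(xml_text):
    """Translate one fsDecl element into RDFS/OWL-style triples."""
    decl = ET.fromstring(xml_text)
    type_name = decl.get("type")
    triples = []

    # Each base type becomes a superclass of the declared type.
    for base in decl.get("baseTypes", "").split():
        triples.append((type_name, "rdfs:subClassOf", base))

    description = decl.findtext("fsDescr")
    if description:
        triples.append((type_name, "rdfs:comment", description.strip()))

    for feature_decl in decl.findall("fDecl"):
        feature = feature_decl.get("name")
        # Features are partial functions, hence functional properties.
        triples.append((feature, "rdf:type", "owl:FunctionalProperty"))
        triples.append((feature, "rdfs:domain", type_name))
        feature_description = feature_decl.findtext("fDescr")
        if feature_description:
            triples.append((feature, "rdfs:comment", feature_description.strip()))
        value_range = feature_decl.find("vRange")
        if value_range is not None:
            # Assumption: each child element of vRange names a range type.
            for child in value_range:
                triples.append((feature, "rdfs:range", child.tag))
    return triples
```

Calling `fsdecl_to_triples(FSD)` yields the six edges of the graph on this slide, under the assumption that the second description is attached to `orth`.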
Constraint (Conditional)
<cond>
  <fs>
    <f name="inv"><binary value="true"/></f>
  </fs>
  <then/>
  <fs>
    <f name="aux"><binary value="true"/></f>
    <f name="vform"><symbol value="fin"/></f>
  </fs>
</cond>
As a graph, using a named graph for the antecedent:
(named graph: X --inv--> true) --cond--> (X --aux--> true; X --vform--> fin)
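What such a conditional demands of a feature structure can be sketched as a simple check: if the antecedent pattern matches, the consequent pattern must match as well. The names `matches` and `satisfies` are hypothetical, and the sketch is a closed-world test on nested dicts, not unification; a missing feature counts as a mismatch, and structure sharing is ignored.

```python
def matches(pattern, fs):
    """True if fs contains every feature/value pair required by pattern."""
    for feature, expected in pattern.items():
        if feature not in fs:
            return False
        actual = fs[feature]
        if isinstance(expected, dict):
            # Nested pattern: the value must itself be a matching structure.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual != expected:
            return False
    return True


def satisfies(fs, antecedent, consequent):
    """Material implication: antecedent matches => consequent matches."""
    return (not matches(antecedent, fs)) or matches(consequent, fs)


# The constraint from the slide: inverted clauses are finite auxiliaries.
ANTECEDENT = {"inv": True}
CONSEQUENT = {"aux": True, "vform": "fin"}
```

A structure with `inv` true, `aux` true and `vform` fin satisfies the constraint; one with `inv` true but `aux` false violates it; one with `inv` false satisfies it vacuously.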
FS Ontologization (Summary) • RDF ⊃ FS • Use ontologies for feature-system declarations. • We need RDF-based notations to encode constraints. • Defaults are outside of ontology.
RDF Extended for Embedding • token “The”: rdf:type TOKEN, POS DET, BASE THE • token “clock”: rdf:type TOKEN, POS NN, BASE CLOCK • node embedding both tokens (● ●): rdf:type NP, NUMBER SING • a node embedding nodes • possibly stand-off annotation
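One possible reading of "a node embedding nodes" with stand-off annotation is sketched here. The `Node` class, its field names, and the character offsets are all assumptions made for illustration; the slides do not define a concrete data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node with properties, a stand-off text span, and embedded nodes."""
    props: dict
    start: int  # character offset where the span begins (inclusive)
    end: int    # character offset where the span ends (exclusive)
    embedded: list = field(default_factory=list)


TEXT = "The clock"

the_token = Node({"rdf:type": "TOKEN", "POS": "DET", "BASE": "THE"}, 0, 3)
clock_token = Node({"rdf:type": "TOKEN", "POS": "NN", "BASE": "CLOCK"}, 4, 9)
noun_phrase = Node({"rdf:type": "NP", "NUMBER": "SING"}, 0, 9,
                   [the_token, clock_token])


def surface(node, text):
    """Recover the string a node annotates from its stand-off offsets."""
    return text[node.start:node.end]


def well_nested(node):
    """True if every embedded node lies inside its parent's span, recursively."""
    return all(
        node.start <= child.start and child.end <= node.end and well_nested(child)
        for child in node.embedded
    )
```

The text itself is never modified; each node only points into it, and the NP node carries its own properties while containing the two token nodes.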
Prospects • RDF as basic data structure • Graph model is essential. • Forget about textual encoding such as XML • though W3C insists on plain-text encoding. • ontology to address FSD • straightforward to basically declare features and feature structures • need some inventions for constraints • extension of RDF • embeddings (of strings) • collections (sets, bags, lists) • lots more to do