Thematic Domain Group 7: Lexical Semantics
ISO TC 37/SC 4 N470
ISO-TC37 meeting in Marrakech, 25 May 2008
MONICA MONACHINI, ILC-CNR
Outline
• Situation & Objectives
• Links with other ISO work: LMF, TDGs
• Approaches
• Sources of info
• DC metamodel ISO 12620 & samples of XML instantiations
• TDG7 task force and related activities
• Next work
Situation in TDG7 and Objectives
• New-born Thematic Domain Group on Lexical Semantics (Hong Kong ISO resolutions)
• Convenor: MM
• Chairs: Gil Francopoulo & Nicoletta Calzolari
• Objectives:
  • focus on lexical semantics for NLP lexicons
  • establish a coherent family of low-level standards (linguistic constants) for the lexical semantic profile
  • augment the DCR with specific data categories to be used as elementary descriptors, in combination with the structural elements of LMF (ISO 24613:2008), when defining LMF-compliant lexicons
Links with other ISO activities
• interoperability between lexicon and syntactic and semantic annotation
• work shoulder to shoulder with
  • LMF
  • TDG5 (SynAF)
  • TDG3 (SemAF)
  • TDG6 ontologies
  • ISO 12620: meta-model for DCs and ISOCat
TDG7 & LMF semantic extension
[Diagram labels: DCR Semantic Profile; semantic relations; domain info; TDG6; ontological nodes; TDG5; TDG3; domain info; semantic relations; semantic roles]
TDG7 & LMF multilingual extension
[Diagram labels: semantic relations; ontological nodes; TDG6; DCR Semantic Profile]
ISO Approach
• Bottom-up procedure from best practices (à la EAGLES)
• Collecting candidate semantic DCs
• Grouping, mapping, structuring
• Drafting a first set of definitions
In TDG7, the initial focus is on a set of candidate harmonized descriptors for semantic relations.
Sources
• LIRICS: set of agreed-on candidate semantic DCs for NLP lexicons (SIMPLE harmonized plurilingual lexicons for 12 languages)
• KYOTO: harmonization of synset semantic relations (inter-WN; intra-WN)
… but also
• BootStrep: extension for the biology domain (e.g. OBO relations has_grain; has_component …)
• NEDO: definition of harmonized predicate argument structures and semantic roles
SIMPLE set of semantic relations
60 semantic relations
LIRICS XML format
[Figure: conversion to ISO 12620]
Conceptual domain
A DC can be associated with a specific subset of valid values, described in the Descriptive component as a list of data categories.
[Diagram: Instrumental: DataCategory, with three hasOneOfTheseValues links (#1 to #3) to UsedFor: DataCategory, UsedAs: DataCategory and UsedBy: DataCategory]
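The conceptual-domain idea on this slide can be sketched in a few lines of Python. This is only an illustrative model, not the ISO 12620 schema or any ISOCat API: the class name `DataCategory`, the field `conceptual_domain` and the method `accepts` are hypothetical names chosen for the example. Only the category names (Instrumental, UsedFor, UsedAs, UsedBy) come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataCategory:
    """Hypothetical, simplified stand-in for a data category (DC)."""
    name: str
    # Conceptual domain: names of the DCs that are valid values of this DC.
    conceptual_domain: frozenset = field(default_factory=frozenset)

    def accepts(self, value: str) -> bool:
        """Return True if `value` belongs to this DC's conceptual domain."""
        return value in self.conceptual_domain


# Values taken from the slide: Instrumental has one of these three values.
instrumental = DataCategory(
    "Instrumental",
    frozenset({"UsedFor", "UsedAs", "UsedBy"}),
)

print(instrumental.accepts("UsedFor"))    # a value listed on the slide
print(instrumental.accepts("CreatedBy"))  # not in the conceptual domain
```

Keeping the conceptual domain as a set of category names, rather than free strings attached to each lexical entry, mirrors the slide's point that valid values are themselves data categories.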
Taxonomies of DCs
A DC can be linked to a broader concept, that is, a more general data category described in the Descriptive component.
[Diagram labels: Instrumental: DataCategory, Activity: DataCategory and DirectTelic: DataCategory, each with a hasABroaderDataCategory link; Telic: DataCategory]
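A minimal sketch of how such broader links could be stored and traversed follows. It assumes, as one reading of the flattened diagram, that Instrumental, Activity and DirectTelic each point to Telic as their broader category; the slide text alone does not confirm this. The names `BROADER`, `broader_chain` and `is_subsumed_by` are hypothetical and not part of ISO 12620.

```python
# Assumed reading of the diagram: each category maps to its broader category.
BROADER = {
    "Instrumental": "Telic",
    "Activity": "Telic",
    "DirectTelic": "Telic",
}


def broader_chain(name, broader=BROADER):
    """Return the list of successively broader categories above `name`."""
    chain = []
    seen = {name}
    while name in broader:
        name = broader[name]
        if name in seen:
            # Guard against a malformed taxonomy containing a cycle.
            raise ValueError(f"cycle detected at {name!r}")
        seen.add(name)
        chain.append(name)
    return chain


def is_subsumed_by(name, ancestor, broader=BROADER):
    """Return True if `ancestor` is reachable from `name` via broader links."""
    return ancestor in broader_chain(name, broader)


print(broader_chain("Instrumental"))
print(is_subsumed_by("Activity", "Telic"))
```

Storing a single broader link per category keeps the taxonomy a simple tree; a category with several broader categories would need a mapping to sets instead.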
LIRICS XML format
[Figure: conversion to ISO 12620, with callouts "has broader" and "has one value"]
WordNet semantic relations: mapping
A list of 85 semantic relations resulting from a mapping of the KYOTO WordNet grid
[Table column labels: Intra-WN; Inter-WN]
TDG7 Community
• TDG7 mailing list set up to circulate the set of candidate DCs and allow discussion
• How to enlarge the TDG7 task force?
  • Link with the LMF mailing list
  • Invite people to join from international projects we are involved in
Next activities
• Write procedures to convert the LIRICS data categories to the ISO 12620 format
• Refine the WordNet set of relations by adding definitions, examples, …
• Provide XML instantiations of the agreed-on semantic DCs to upload to the forthcoming ISOCat tool
• Cooperate with the ISOCat team
• Continue with other semantic info: semantic relations (FrameNet), features, domain, link with ontologies
SIMPLE Ontology in OWL
[Figure labels: CauseChangeofState; roastbeef; roast; createdBy]