Working towards common naming conventions for use in CV and ontology engineering

Presentation Transcript


  1. Working towards common naming conventions for use in CV and ontology engineering
     Susanna-Assunta Sansone (EBI), on behalf of many other people acknowledged in the last slide
     http://msi-ontology.sf.net/recommendations

  2. PSI and MSI ontology WGs – Our scenario
     • Proteomics and Metabolomics Standards Initiatives (PSI, MSI)
     • Large collaborative, multi-domain efforts including:
     • Database/software developers, vendors, manufacturers
     • Experimentalists (biological/biomedical applications)
     • Minimal requirements, XML exchange formats, ontology WGs
       - Experimental workflow, data produced and analysis:
         -> Design, sample characteristics, treatments, instruments, protocols
         -> Protein modification and interactions (PSI)
     • We create CVs to augment the PSI and MSI XML formats
     • Lists of terms and definitions, organized as a taxonomy (OBO format)
       - PSI CVs are currently being used by EBI and other databases
     • We build an ontology as part of OBI to minimize duplication
     • Share common terminology, where applicable, with other domains
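To make "terms and definitions organized as a taxonomy (OBO format)" concrete, here is a minimal sketch in Python that reads a few `[Term]` stanzas and walks the `is_a` hierarchy. The tags `id`, `name`, `def` and `is_a` are standard OBO tags, but the `EX:` identifiers, term names and definitions are invented for illustration and are not taken from any PSI or MSI CV. The helper names `parse_terms` and `ancestors` are likewise hypothetical, and the parser handles only the tags shown.

```python
# Hypothetical example data: IDs, names and definitions are invented for
# illustration and do not come from any PSI or MSI controlled vocabulary.
OBO_TEXT = """\
[Term]
id: EX:0000001
name: instrument
def: "A device used to carry out a measurement." [EX:curator]

[Term]
id: EX:0000002
name: mass spectrometer
def: "An instrument that measures mass-to-charge ratios of ions." [EX:curator]
is_a: EX:0000001 ! instrument
"""


def parse_terms(text):
    """Parse [Term] stanzas into a dict keyed by term id (subset of OBO only)."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = {"is_a": []}
            continue
        if line.startswith("["):
            current = None  # skip other stanza types such as [Typedef]
            continue
        if not line or current is None or ": " not in line:
            continue
        tag, value = line.split(": ", 1)
        if tag == "id":
            terms[value.strip()] = current
        elif tag == "name":
            current["name"] = value.strip()
        elif tag == "def":
            # Keep only the quoted definition text, dropping the [xref] list.
            current["def"] = value.split('"')[1]
        elif tag == "is_a":
            # Drop the trailing "! parent name" comment.
            current["is_a"].append(value.split(" ! ", 1)[0].strip())
    return terms


def ancestors(terms, term_id):
    """Return the ids of all is_a ancestors of term_id, nearest first."""
    found = []
    queue = list(terms[term_id]["is_a"])
    while queue:
        parent = queue.pop(0)
        if parent not in found:
            found.append(parent)
            queue.extend(terms.get(parent, {}).get("is_a", []))
    return found


if __name__ == "__main__":
    cv = parse_terms(OBO_TEXT)
    for term_id, term in cv.items():
        print(term_id, term["name"], ancestors(cv, term_id))
```

A real CV would normally be read with a dedicated OBO library; the hand-written parser here only serves to show how little structure a term list with definitions and `is_a` links requires.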

  3. PSI and MSI ontology WGs – Our needs
     • We use a modular engineering approach
     • To create orthogonal but integrable CVs
       - Ontology WGs are divided into subWGs, according to expertise
         -> Sample processing/separation (e.g. chromatography, gel)
         -> Instrument-specific (e.g. NMR, MS)
         -> Data analysis
         -> (PSI-MOD) Protein modification
         -> (PSI-MI) Protein interactions
     • We need to use common naming conventions
     • To facilitate communication among PSI and MSI ontology subWGs
       - Heterogeneous backgrounds and no formal ontology training
     • To rely on such common conventions with the larger OBI group
     • We need to observe common design procedures
     • To harmonize the appearance and design of the CV modules

  4. What common conventions do we need?
     • To talk about the representation (reference terminology)
     • Name the representational artefact* types, clarify differences
       - E.g. thesaurus, CV, ontology
     • Name specific representational units* within artefacts across representation languages (OBO, OWL) and semantics
       - E.g. classes vs. concepts or properties vs. relations
     • To name and define what we represent (domain things)
     • In a common and consistent manner
     • E.g. ‘7_transmembrane_domain_receptor’ vs ‘G-Protein coupled receptor’ vs ‘GPC_receptors’ vs ‘GPCR_class’
     • E.g. ‘sample_temperature_in_autosampler’ vs ‘sample’, ‘temperature’ and ‘autosampler’ linked by relations
     *Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Smith, Kusnierczyk, Schober and Ceusters. KR-MED 2006.
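The naming examples on this slide can be turned into a small automated check. The sketch below is an illustration only: the three rules, the word lists `META_SUFFIXES` and `LINKING_WORDS`, and the function `check_name` are assumptions made for this example and are not quoted from the MSI/PSI recommendations. It flags underscores used as word separators, names that end in a word describing the representational unit rather than the domain thing, and compound names that might instead be expressed as separate terms linked by relations.

```python
import re

# Hypothetical rule data, chosen only to match the examples on the slide.
META_SUFFIXES = {"class", "term", "concept"}
LINKING_WORDS = {"in", "of", "from", "by"}


def tokens(name):
    """Split a term name into words on underscores and whitespace."""
    return [t for t in re.split(r"[_\s]+", name.strip()) if t]


def check_name(name):
    """Return a list of naming problems found in a single term name."""
    problems = []
    words = tokens(name)
    if "_" in name:
        problems.append("underscore used as word separator")
    if words and words[-1].lower() in META_SUFFIXES:
        problems.append("name ends with a representational-unit word")
    # A linking word in the middle suggests several things packed into one name.
    if any(w.lower() in LINKING_WORDS for w in words[1:-1]):
        problems.append("compound name; consider separate terms linked by relations")
    return problems


if __name__ == "__main__":
    for candidate in [
        "7_transmembrane_domain_receptor",
        "G-Protein coupled receptor",
        "GPCR_class",
        "sample_temperature_in_autosampler",
    ]:
        print(candidate, "->", check_name(candidate) or "no problems flagged")
```

A check like this cannot decide which of several synonyms should be the preferred name; it only reports surface-level inconsistencies that a curator would still need to review.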

  5. Why do these not exist? What is available?
     • Representational artefacts are built according to different:
     • Engineering methodologies
       - METHONTOLOGY, TOVE, ENTERPRISE
     • Representation languages and semantics
       - OBO, OWL and CLIPS-Frames
     • Engineering ‘schools’
       - GO, semantic web/DL, Protégé Frames, IFOMIS realism-based
     • The naming schemes are as diverse as these backgrounds!
     • A variety of ad hoc conventions exist, e.g.
     • BioPAX Manual, GO style guide, ISO guidelines
     • Various references and material are dispersed across web pages, e.g.
     • Law and order: Assessing and enforcing compliance with ontological modeling principles in the Foundational Model of Anatomy (FMA) - S Zhang, O Bodenreider, Computers in Biology and Medicine 36 (2006)

  6. Finding adequate documentation is hard
     • Implementation-specific, limited coverage or scope, e.g.
     • BioPAX Manual:
       - Naming conventions for classes, identifiers and instances are discussed at implementation level (Protégé/OWL)
         -> page 53, Technical Notes RDF:ID
       - Does not cover conventions for naming relations
     • GO style guide:
       - Has its own definition of namespace and its abbreviation, which differs from the one in OWL/semantic web
       - Refers to terms and not classes
       - Does not cover conventions for naming relations
     • Visibility is also a limiting factor, e.g.
     • Information is dispersed or embedded in many documents
       - GO namespace, term names, and identifiers are explained in different documents
       - GO editor style and OBO-Edit web pages
     • Acceptance is ‘limited’ to the target community

  7. We have created our own documentation (!)
     • Theory document: to name what we represent and the representation (work in progress)
     • “Working towards naming conventions for use in controlled vocabulary and ontology engineering”
       - Implementation- and format-independent document
       - Created for the MSI and PSI ontology WGs; also targets the larger OBI group
       - A straw man proposal…

  8. http://msi-ontology.sf.net/recommendations

  9. We have created our own documentation (!)
     • Theory document: to name what we represent and the representation (work in progress)
     • “Working towards naming conventions for use in controlled vocabulary and ontology engineering”
       - Implementation- and format-independent document
       - Created for the MSI and PSI ontology WGs; also targets the larger OBI group
       - A straw man proposal…
     • Practice document: design principles (in final review, soon to be used)
     • “Guidelines for the development of controlled vocabularies”
       - Implementation- and format-specific (OBO) document
       - Internal policy document for MSI and PSI ontology WGs
         -> Uses the key words “MUST,” “MUST NOT,” “REQUIRED,” “SHALL,” “SHALL NOT,” “SHOULD,” “SHOULD NOT,” “RECOMMENDED,” “MAY,” and “OPTIONAL,” to be interpreted as described in RFC 2119.
     S. Bradner, Key words for use in RFCs to Indicate Requirement Levels, Internet Engineering Task Force, RFC 2119, http://www.ietf.org/rfc/rfc2119.txt, March 1997

  10. …to elaborate this further…
     • More widely accepted common naming conventions could
     • Facilitate access to ontologies through meta-tools
       - Reduce the diversity with which meta-tools have to contend
         -> E.g. OLS, NCBO BioPortal, PROMPT (text mining tools?)
     • Assist in integration
     • Comparison, alignment and mapping
     • Certainly, serve as guidelines for new communities
     We have started seeing the benefits
     • The appearance of what we represent has been normalized
     • But it is not just a matter of aesthetics
     • Communication has improved
     • Between developers from different domains and backgrounds
     • In geographically distributed, collaborative efforts

  11. Acknowledgements and Resources
     • Authors and those contributing to the discussion
     • Daniel Schober*, Waclaw Kusnierczyk, Barry Smith, Chris Mungall, Philippe Rocca-Serra, Suzi Lewis, Robert Stevens, Dietrich Rebholz, Frank Gibson, Luisa Montecchi-Palazzi, Jane Lomax
     • Members of MSI, PSI and OBI ontology working groups
     • http://msi-ontology.sf.net
     • http://psidev.sf.net
     • http://obi.sf.net
     • Funding sources
     • *UK BBSRC e-Science, EU NuGO and CarcinoGENOMICS grants
     • *Semantic Mining NoE (visits to IFOMIS and Manchester)
     http://msi-ontology.sf.net/recommendations