Ontology, RDF, SW for Chemical Structures

Ontology, RDF, SW for Chemical Structures. T N Bhat & J. Barkley NIST. Bhat@nist.gov. Query tool. Use Case. Publications. Major Features, Goal – to Reduce User Frustration. We have established a use case at the HCLS Website - Chemical taxonomies


Presentation Transcript


  1. Ontology, RDF, SW for Chemical Structures
     T N Bhat & J. Barkley, NIST
     Bhat@nist.gov
     Links: Query tool, Use Case, Publications

  2. Major Features, Goal – to Reduce User Frustration
     • We have established a use case at the HCLS Website: Chemical taxonomies
     • Combining rule-based terms with vocabulary-based terms to define elements of RDF
     • Organizing the elements of RDF into a predictable ontology using concepts from use cases
     • Developing tools and techniques to present the information in familiar database environments
     • This allows easier portability and implementation of the information by the community
     • Illustrating the concept using high-profile data such as AIDS inhibitors and Protein Data Bank contents

  3. Combining Rule-based with Vocabulary-based Elements to Define RDF
     • Chemical structures are definable by atomic connectivity, so they are suitable for identification using graph theory (InChI)
     • Graph-based identifiers are suitable for machine reasoning
     • Graphs are hard for humans to digest, so the proposal is to combine InChI with familiar vocabularies such as Ala, Phenyl, Adenine
     • Synonyms are also included in the vocabulary for greater coverage among diverse users
     • Vocabularies make it easier for humans to recognize the information
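The combination described in slide 3 can be sketched as a handful of RDF-style triples in which a rule-based InChI string forms the subject URI and vocabulary names and synonyms are attached as labels. This is only an illustration under stated assumptions, not the authors' implementation: the `http://example.org/chem/` namespace, the helper names, and the choice of `rdfs:label` and `skos:altLabel` as predicates are all hypothetical, and the alanine InChI is used simply because "Ala" appears in the slides.

```python
from urllib.parse import quote

# Hypothetical namespace; the slides do not specify the real URI scheme.
BASE = "http://example.org/chem/"
# Assumed predicates: standard RDFS/SKOS label properties, chosen for illustration.
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
ALT_LABEL = "http://www.w3.org/2004/02/skos/core#altLabel"

# Standard InChI for alanine without the stereo layer (illustrative input).
ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"


def inchi_to_uri(inchi: str) -> str:
    """Turn the rule-based identifier into a URI by percent-encoding it."""
    return BASE + quote(inchi, safe="")


def to_triples(inchi: str, name: str, synonyms: list[str]) -> list[tuple[str, str, str]]:
    """Attach a preferred vocabulary name and its synonyms to the InChI-based subject."""
    subject = inchi_to_uri(inchi)
    triples = [(subject, LABEL, name)]
    triples += [(subject, ALT_LABEL, synonym) for synonym in synonyms]
    return triples


def find_by_name(triples: list[tuple[str, str, str]], term: str) -> list[str]:
    """Text-based lookup: return subjects whose name or synonym matches the term."""
    wanted = term.lower()
    return sorted({s for s, _, o in triples if o.lower() == wanted})


triples = to_triples(ALANINE, "Ala", ["alanine", "2-aminopropanoic acid"])
for triple in triples:
    print(triple)
print(find_by_name(triples, "Alanine"))
```

The lookup works on names only, while the subject stays machine-comparable, which is the division of labour between vocabulary and InChI that the slide proposes.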

  4. InChI – a Scalable URI
     • InChI is generated by software that breaks the chemical connectivity information down into layers such as chirality, ring structure, and atom type, and then re-encodes them as a text string
     • InChI is a naming standard for chemicals recommended by IUPAC

  5. InChI – a rule-based URI
     • InChI
     • _1_2FC10H11NO2_2Fc11-10_2812_2913-9-5-7-3-1-2-4-8_287_296-9_2Fh1-4_2C9H_2C5-6H2_2C_28H2_2C11_2C12_29
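The identifier on slide 5 looks like an InChI string in which reserved characters have been escaped as an underscore followed by two hex digits (for example `_2F` for `/` and `_28` for `(`). That reading is an assumption, since the slides do not document the encoding; under it, a minimal sketch for recovering the layered form is shown below. The function name is hypothetical, and the leading `_1` is left untouched because it does not fit the two-hex-digit pattern.

```python
import re

# Identifier copied verbatim from slide 5.
ENCODED = (
    "_1_2FC10H11NO2_2Fc11-10_2812_2913-9-5-7-3-1-2-4-8_287_296-9"
    "_2Fh1-4_2C9H_2C5-6H2_2C_28H2_2C11_2C12_29"
)


def decode_identifier(encoded: str) -> str:
    """Replace each '_XX' escape (XX = two uppercase hex digits) with its character.

    Assumption: the escapes are percent-encoding with '_' in place of '%'.
    """
    return re.sub(
        r"_([0-9A-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        encoded,
    )


decoded = decode_identifier(ENCODED)
layers = decoded.split("/")
print(decoded)
for layer in layers:
    print(layer)
```

After decoding, the slash-separated pieces correspond to the formula, connectivity, and hydrogen layers of the InChI, which makes the layered structure described on slide 4 visible.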

  6. Vocabulary-based Definitions
     • For decades scientists have been developing names to identify structures and their images
     • Simple names
         • His
         • Ala
         • DNA
         • ATP
     • Semi-rule-based IUPAC names
         • 2-amino-3-methylpentanamide
         • 4-amino-3-hydroxy-6-methylheptanoic_acid
         • 1-[(Benzenesulfonyl-methyl-amino)-phenyl-butyl]-piperidin-4-yl}-propyl-carbamic acid, naphthalen-1-ylmethyl ester
     • Names facilitate text-based queries of desired components
     • Names, when used together with InChI, provide a smoother integration of machine and human needs

  7. Use Case for SW: Treatment for AIDS Is a Work in Progress
     • Treatments for AIDS are of two types
         • Prevention, the most effective
         • Containment
     • Drugs contain and reduce the viral load
     • The majority of the drugs (~17) target either HIV protease or RT
     • Complete suppression of either of these viral enzymes could cure AIDS
     • But because of drug resistance, the enzymes are only partially suppressed
     • All the drug design efforts for AIDS are based on structures
     • Data needed for drug design is scattered over many Web resources, and users often wade through the data manually
     • Therefore AIDS drug design is an ideal target for the Semantic Web and new database-related technologies
     • SW connection between NIST and the NIAID AIDS database
     Choose the problem that matters
     Link: Website

  8. Annotation Technique / Developing Structural Ontology
     • Define compounds using chemical features of interest to use cases
     • Fragment, subgroup, class
     [Figure labels: 000503, 030798, 1A8K, 000505]

  9. Modeling with Protégé – Suitable for Text-based Ontology

  10. Web Tools
     • Structures are different from text-based information
     • Structures are not amenable to text-based query and rendering techniques
     • The majority of structural users have never heard of (nor want to hear about!) SPARQL, the query language for RDF
     • The commonly preferred and expected way to query is by 'click'
     • The Semantic Web for structures needs new Web tools that allow navigation by clicking on structural features

  11. Chem-BLAST for Structural Semantic Web
     http://bioinfo.nist.gov/SemanticWeb_pr3d/chemblast.do
     Prasanna et al. PROTEINS 60, 1-4 (2005).
     Prasanna et al. PROTEINS 63(4), 907-917 (2006).
     Link: Download publications

  12. Future Plans
     • Extend the work to chemical structures from the Protein Data Bank
     • If interest exists, hold a workshop at NIST; proposed dates are the last two weeks of March 2008
     • The workshop will be held in conjunction with the NIST-wide Ontology week
     • Possible collaboration with IUPAC (International Union of Pure and Applied Chemistry) and ChEBI
     • Contact: Colin Batchelor, BatchelorC@rsc.org
     • RSC Publishing, Royal Society of Chemistry
     • Community participation is essential for further development
     • Contact: bhat@nist.gov, 301 975 5448 (US)
