Dictionaries and Ontologies in Structural Biology

irma

Presentation Transcript


  1. Dictionaries and Ontologies in Structural Biology

  2. Scope of Ontology: PDB Exchange Dictionary
     Metadata
       • Experimental information
       • Molecular description
       • Structural description
     Coordinates
       • Macromolecule
       • Ligands
       • Solvent

  3. History of Project
     1990  mmCIF project begins
     1992  NDB serves as testbed
     1998  PDB adopts mmCIF as core data representation
     2001  PDB Exchange Dictionary incorporates X-ray, NMR and cryoEM
     2003  Direct translation of mmCIF data and dictionaries into XML (PDBML)

  4. Challenges in Creating an Ontology
     • Appropriate coverage and level of detail
     • Acquiring and organizing expert input
     • Getting consensus
     • Evolution with the science
     • Creating a rigorous syntax that can be translated (e.g. mmCIF -> XML)
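As a rough illustration of why a rigorous syntax makes translation mechanical, the sketch below converts rows of one category (table) into XML. The element layout (a `<category>Category` container, key items as attributes, other items as child elements), the `category_to_xml` helper, and the sample row are assumptions made for this example; they are not taken from the slides and do not reproduce the actual PDBML schema.

```python
import xml.etree.ElementTree as ET


def category_to_xml(category, key_items, rows):
    """Translate the rows of one category (table) into an XML element.

    Key items become attributes and all other items become child elements.
    This layout is an illustrative assumption, not the PDBML schema.
    """
    container = ET.Element(f"{category}Category")
    for row in rows:
        record = ET.SubElement(container, category)
        for name, value in row.items():
            if name in key_items:
                record.set(name, value)
            else:
                ET.SubElement(record, name).text = value
    return container


# Hypothetical data: a single row of the em_detector category.
rows = [{"id": "1", "type": "GATAN 673"}]
xml_text = ET.tostring(
    category_to_xml("em_detector", {"id"}, rows), encoding="unicode"
)
print(xml_text)
```

Because every data item belongs to exactly one category and the key items are declared in the dictionary, a translator of this kind needs no per-category special cases.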

  5. mmCIF (PDB Exchange) is an Ontology
     Relationships among data items are explicit

  6. Features of Dictionary
     • Data Items
     • Definitions
     • Examples
     • Data types
     • Ranges or enumerations
     • Simple organization
     • Tables and columns (categories)
     • Related data item sets (subcategories)
     • Chapters (category groups)
     • Associations
     • Parent-child relationships
     • Interdependencies/exclusivity
     • Methods

  7. Dictionary Definition Example

     save__em_detector.type
         _item_description.description
     ;              The detector type used for recording images.
                    Usually film or CCD camera.
     ;
         _item.name                  '_em_detector.type'
         _item.category_id             em_detector
         _item.mandatory_code          no
         _item_type.code               line
         loop_
         _item_enumeration.value
             'KODAK SO163 FILM'
             'GATAN 673'
             'GATAN 676'
             'TVIPS TEMCAM F224'
             'TVIPS FASTSCAN F114'
             PROSCAN
             AMT
     save_

     Slide annotations: Semantics, Schema, Data type, Controlled vocabulary
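The controlled vocabulary in the definition above can be used to validate data values. The following is a minimal sketch, not a general CIF parser: it assumes a single-column loop, omits the semicolon-delimited description, and treats the unquoted `PROSCAN` and `AMT` as two separate values. The helper names `enumeration_values` and `is_allowed` are hypothetical.

```python
import shlex

# The save frame from the slide, without the free-text description.
FRAME = """\
save__em_detector.type
    _item.name            '_em_detector.type'
    _item.category_id     em_detector
    _item.mandatory_code  no
    _item_type.code       line
    loop_
    _item_enumeration.value
        'KODAK SO163 FILM'
        'GATAN 673'
        'GATAN 676'
        'TVIPS TEMCAM F224'
        'TVIPS FASTSCAN F114'
        PROSCAN
        AMT
save_
"""


def enumeration_values(frame_text):
    """Collect the values listed under _item_enumeration.value.

    Simplification: assumes a single-column loop and no semicolon-delimited
    text blocks in the frame.
    """
    tokens = shlex.split(frame_text)  # splits on whitespace, honours quotes
    start = tokens.index("_item_enumeration.value") + 1
    values = []
    for token in tokens[start:]:
        # Stop at the end of the frame or at the next data name or loop.
        if token in ("save_", "loop_") or token.startswith("_"):
            break
        values.append(token)
    return values


def is_allowed(value, allowed):
    """Return True if the value appears in the controlled vocabulary."""
    return value in allowed


allowed = enumeration_values(FRAME)
print(is_allowed("GATAN 673", allowed))
print(is_allowed("POLAROID", allowed))
```

A validating parser would apply the same check to every item whose definition carries an enumeration, alongside the data type and mandatory-code checks.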

  8. Dictionary Definition Example

     save__struct_biol.id
         _item_description.description
     ;              The value of _struct_biol.id must uniquely identify a record
                    in the STRUCT_BIOL list. Note that this item need not be a
                    number; it can be any unique identifier.
     ;
         _item.name                  '_struct_biol.id'
         _item.category_id             struct_biol
         _item.mandatory_code          yes
         _item_type.code               line
         loop_
         _item_linked.child_name
         _item_linked.parent_name
             '_struct_biol_gen.biol_id'        '_struct_biol.id'
             '_struct_biol_keywords.biol_id'   '_struct_biol.id'
             '_struct_biol_view.biol_id'       '_struct_biol.id'
             '_struct_ref.biol_id'             '_struct_biol.id'
     save_

     Slide annotations: Semantics, Schema, Data type, Parent-child (foreign key) relationships
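The parent-child links declared in this definition behave like foreign keys, so they can be checked mechanically. The sketch below is one way to do that under stated assumptions: the `LINKS` list is copied from the slide, while the in-memory layout (category name mapped to a list of row dictionaries), the helper names, and the sample data are hypothetical.

```python
# (child item, parent item) pairs from the _struct_biol.id definition.
LINKS = [
    ("_struct_biol_gen.biol_id", "_struct_biol.id"),
    ("_struct_biol_keywords.biol_id", "_struct_biol.id"),
    ("_struct_biol_view.biol_id", "_struct_biol.id"),
    ("_struct_ref.biol_id", "_struct_biol.id"),
]


def split_item(item_name):
    """Split '_category.attribute' into (category, attribute)."""
    category, attribute = item_name[1:].split(".", 1)
    return category, attribute


def orphaned_values(data, links):
    """Return (child_item, value) pairs that have no matching parent value.

    `data` maps a category name to a list of row dicts; this layout is an
    assumption made for the sketch.
    """
    orphans = []
    for child, parent in links:
        child_cat, child_attr = split_item(child)
        parent_cat, parent_attr = split_item(parent)
        parent_values = {row[parent_attr] for row in data.get(parent_cat, [])}
        for row in data.get(child_cat, []):
            value = row.get(child_attr)
            if value is not None and value not in parent_values:
                orphans.append((child, value))
    return orphans


# Hypothetical data: biol_id "3" has no parent record in struct_biol.
data = {
    "struct_biol": [{"id": "1"}, {"id": "2"}],
    "struct_biol_gen": [{"biol_id": "1"}, {"biol_id": "3"}],
    "struct_biol_keywords": [{"biol_id": "2"}],
}
print(orphaned_values(data, LINKS))
```

Categories that are absent from the data are simply skipped, which mirrors the fact that child categories are optional in a data file.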

  9. Molecular Description
     • Macromolecular sequence
     • Macromolecular source
     • Detailed chemical descriptions of monomers
     • Detailed chemical descriptions of ligands and solvent

  10. Molecular Hierarchy
      [Diagram; labels in extraction order: Biological Source Macromolecular Polymer Sequence Molecular Component Dictionary Molecular Description Non-polymer Chemical Details]

  11. Structural Description
      • Coordinates of the experimental subunit
      • Symmetry operations required to build functional assemblies
      • Structural annotation
      • Secondary structure
      • Hydrogen bonding classification
      • Base pairs and base pair steps
      • Backbone torsions and base morphology
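To illustrate how a symmetry operation builds a functional assembly from the experimental subunit, the sketch below applies x' = R x + t to a set of coordinates. The two-fold rotation about the z axis, the zero translation, and the two sample atoms are hypothetical values chosen for the example, and `apply_operation` is not a function from any PDB tool.

```python
def apply_operation(coords, rotation, translation):
    """Apply x' = R x + t to each (x, y, z) coordinate."""
    transformed = []
    for point in coords:
        transformed.append(tuple(
            sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return transformed


# Hypothetical operation: two-fold rotation about z, no translation.
TWO_FOLD_Z = [
    [-1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0, 0.0, 1.0],
]
NO_SHIFT = [0.0, 0.0, 0.0]

# Hypothetical experimental subunit with two atoms.
subunit = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]

# The assembly is the original subunit plus its symmetry-generated copy.
assembly = subunit + apply_operation(subunit, TWO_FOLD_Z, NO_SHIFT)
print(assembly)
```

Storing only the subunit and the operations keeps the data file compact while still describing the full assembly.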

  12. Structural Hierarchy
      [Diagram; labels: Molecular Description, Functional Units, Experimental Subunits, Secondary Structure, Hydrogen Bonding, Atomic Coordinates, Base Pairs, Base Pair Steps, Backbone Torsions, Base Morphology]

  13. Connection between Molecular and Structural Descriptions
      • Macromolecular sequences are explicitly aligned to experimentally determined chemical sequences
      • Monomers, ligands and solvent are matched with chemical descriptions in the PDB molecular components dictionary
      [Diagram; labels: Molecular Description, Structural Description]

  14. Relationships with other Resources
      • Sequence database correspondences
      • Domain/family annotation
      • Functional annotation (GO/EC/OMIM)
      • Structural database correspondences
      • SCOP/CATH/RNAML structural classifications
      • Functional annotation
      • Citation and related literature

  15. Supporting Software Tools: Dictionaries, Data Files and Databases
      • Validating parsers for files and dictionaries (CIFPARSE)
      • Dictionary access and presentation tools (CIFOBJ)
      • File format translation tools (MAXIT, CIFTr)
      • PDB Validation Suite
      • Data acquisition and editor tool (ADIT)
      • Database builder and loader (mmCIFLOADER)
      • XML translation tool
      • Data extraction and merging tools (PDB_EXTRACT)

  16. Availability: http://sw-tools.pdb.org/
      • WWW and CD-ROM distribution
      • Source and binary distributions
      • Open source license
      • Supported on Linux, IRIX, ALPHA, SUNOS, and Mac OSX

  17. Structure Related Data Dictionaries
      • DDL2
      • mmCIF
      • RNAML
      • Ligand data
      • NMR
      • Cryo-EM
      • Modeling
      • Crystallization
      • Symmetry
      • Image data
      • BioSync
      • Protein Production

  18. Access
      • RCSB Protein Data Bank Site: http://www.pdb.org/
      • RCSB/PDB Beta Data Site: http://pdbbeta.rcsb.org/
      • RCSB/PDB Dictionary Resource Site: http://mmcif.pdb.org/
      • RCSB/PDB Deposition Site: http://deposit.pdb.org/
      • PDBML Site: http://pdbml.pdb.org/
      • RCSB/PDB Software Download Site: http://sw-tools.pdb.org/
