Update your computers! • To install a patch: Tools => Instant Patch => Download and Activate All Patches
Editing Pathway/Genome Databases Ron Caspi Part I: Compounds, Reactions and Pathways
Why Curation is Important! • Database curation greatly enhances the usefulness of the data • “In silico” information is less solid than experimental evidence
Pathway Tools Paradigms • Separate database from user interface • Navigator provides one interface to the DB • Editors provide an alternative interface to the DB • Reuse information whenever possible! • A PGDB should not describe the same biological or chemical entity more than once • Compounds are the building blocks of reactions • Reactions are the building blocks of pathways
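The reuse principle above can be illustrated with a minimal sketch. It is not the Pathway Tools schema: the class names, the `ToyDB` container and the frame IDs are all hypothetical, chosen only to show reactions referring to compounds by ID and pathways referring to reactions by ID, so that each entity is described once.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Compound:
    frame_id: str
    common_name: str


@dataclass
class Reaction:
    frame_id: str
    left: List[str]    # compound frame IDs, not copies of the compounds
    right: List[str]


@dataclass
class Pathway:
    frame_id: str
    reactions: List[str] = field(default_factory=list)  # reaction frame IDs


class ToyDB:
    """Hypothetical in-memory stand-in for a PGDB."""

    def __init__(self) -> None:
        self.compounds: Dict[str, Compound] = {}
        self.reactions: Dict[str, Reaction] = {}
        self.pathways: Dict[str, Pathway] = {}

    def add_compound(self, frame_id: str, common_name: str) -> Compound:
        # Reuse the existing entry instead of describing the entity twice.
        return self.compounds.setdefault(frame_id, Compound(frame_id, common_name))

    def add_reaction(self, frame_id: str, left: List[str], right: List[str]) -> Reaction:
        missing = [c for c in list(left) + list(right) if c not in self.compounds]
        if missing:
            raise KeyError(f"unknown compounds: {missing}")
        return self.reactions.setdefault(frame_id, Reaction(frame_id, list(left), list(right)))

    def add_pathway(self, frame_id: str, reaction_ids: List[str]) -> Pathway:
        missing = [r for r in reaction_ids if r not in self.reactions]
        if missing:
            raise KeyError(f"unknown reactions: {missing}")
        return self.pathways.setdefault(frame_id, Pathway(frame_id, list(reaction_ids)))
```

In this sketch a reaction cannot be created until its compounds exist, and a pathway cannot be created until its reactions exist, which mirrors the building-block order on the slide.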
List of Editors • Compound Editor • Compound Structure Editor • Reaction Editor • Pathway Editor • Synonym Editor • Protein Editor • Gene Editor • Intron Editor (Eukaryotes only) • Transcription Unit Editor • Publication Editor • Frame Editor • Relationships Editor • Ontology Editor
Invoking the Editors • Use the “New” command • Or: right-click on an object handle
Saving Changes • The user must save changes explicitly with Save DB • To discard changes made since the last save: File => Revert Current DB
The File Menu: DB commands • List Unsaved Changes in Current DB • Revert Current DB • Refresh All Current DBs • Checkpoint Current DB • Revert to Checkpoint in Current DB • Delete a DB • Save Current DB • Attempt to Reconnect to Oracle
Editing rules: Support Policy • Do not alter DB schema • e.g. do not add or remove classes or slots • Do not modify the EcoCyc or MetaCyc datasets
Compound Editor • Create or edit a compound • Invoke by: • New: Compound => New • Existing: right-click the compound name, select Compound Editor • Common name and synonyms • Links to other DBs
More Compound Editing • Compound Structure Editor • Mol files • Exporting to other DBs • Merging
Reaction Editor • Create or edit a reaction • Invoke by: • New: Reaction => New • Existing: right-click the reaction name, select Reaction Editor • Entering Reaction Equation • Compound Resolver
Pathway Editor • Graphically create and modify pathways • Two tools: • Connections Editor: to add reactions, remove reactions, alter connections • Segment Editor: to enter one or more linear pathway segments • Invoking the Pathway Editor: • New: Pathway => New • Existing: right-click the pathway name, select the Pathway Editor command
Connections Editor Operations • Two main display panes: • left: unconnected pathway reactions • right: draws connected reactions (looks like the regular Pathway Tools window) • Connecting reactions: • select initial reaction (in either pane) ===> red and green reactions • select a green reaction • Additional Commands: • Exit: keep changes, abort changes • Reaction: add reaction, add reaction(s) from history, create new reaction frame, clone a reaction frame, add connection, delete predecessor/successor link, disconnect reaction, delete reaction from pathway, choose main compounds for reaction, edit reaction frame • Pathways: enter a linear pathway segment, guess pathway predecessor list, disconnect all reactions, invoke relationships editor, add subpathway by name, add subpathway by substring, add subpathway by class, delete subpathway
Connections Editor Limitations • Ambiguity about ordering in some complicated situations: • link may be ignored • dialog box for disambiguating • pathway drawn in bizarre arrangement • Fix: • try removing the offending link and adding links in a different order • The Pathway Editor does not handle polymerization pathways • In circular pathways, the Pathway Editor does not let you specify which compound should be at the top
Pathway Segment Editor • Enters a linear sequence of reactions, (arguably) faster than with the Connections Editor • Reactions are specified by EC numbers or reaction substrates • One segment may contain up to 7 reactions
Creating Links with External Databases • Creating links from a pathway/genome DB to an external database • To define a new external database: • Tools => Ontology Browser • View => Browse from new root / type Databases • Highlight Databases • Frame => Create => Instance • Enter frame name, frame edit • Enter Common Name, Static-Search-URL, e.g. http://gene.pharma.com/dbquery? • Creating links to a pathway/genome DB: see http://biocyc.org/linking.shtml
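As a rough illustration of what a Static-Search-URL is for, the sketch below assumes that a link is formed by appending the external object's ID to the stored URL prefix. That assumption, the function name `build_link` and the sample ID are not from the slides; the host is the slide's own example.

```python
from urllib.parse import quote


def build_link(static_search_url: str, object_id: str) -> str:
    """Append a percent-encoded object ID to the URL prefix (assumed behavior)."""
    return static_search_url + quote(object_id, safe="")


# Hypothetical ID in the slide's example database.
print(build_link("http://gene.pharma.com/dbquery?", "GENE 42"))
```

Percent-encoding the ID keeps the link valid when an ID contains spaces or other reserved characters.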
Make sure that… You perform all exercises on the Hb. pylori database, not on your own!!!
Creating New Reactions Create the following five reactions: • ascorbate + H2O = 3-keto-L-gulonate • 3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP • 3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2 • L-xylulose-5-phosphate = L-ribulose-5-phosphate* • L-ribulose-5-phosphate = xylulose-5-phosphate
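When preparing equations like these outside the Reaction Editor, it can help to split each one into its two sides first. The following is a minimal sketch, not the editor's own parser; `parse_equation` is a hypothetical helper that assumes " = " separates the sides and " + " separates compounds, so hyphens and a trailing "+" inside a compound name are left alone.

```python
from typing import List, Tuple


def parse_equation(equation: str) -> Tuple[List[str], List[str]]:
    """Split 'a + b = c + d' into (left compounds, right compounds)."""
    left, sep, right = equation.partition(" = ")
    if not sep:
        raise ValueError(f"no ' = ' separator in: {equation!r}")

    def split_side(side: str) -> List[str]:
        return [name.strip() for name in side.split(" + ") if name.strip()]

    return split_side(left), split_side(right)


left, right = parse_equation(
    "3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP"
)
print(left)
print(right)
```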
Define a New Pathway • Define the pathway L-ascorbate degradation to xylulose-5-phosphate by connecting the reactions together • Assign class: (Pathways -> Degradation/Utilization/Assimilation -> Carboxylates, Other) • Add a link to non-oxidative branch of the pentose phosphate pathway (Generation of precursor metabolites and energy => Pentose phosphate pathways =>) • Add a reverse link from non-oxidative branch of the pentose phosphate pathway to the new pathway
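Connecting the five exercise reactions amounts to recording which reaction precedes which. The sketch below is a toy heuristic, not the algorithm Pathway Tools uses: it proposes a link wherever a product of one reaction is a substrate of another. The reaction IDs (`rxn1` to `rxn5`) and the function name are hypothetical.

```python
from typing import Dict, List, Tuple

# The five exercise reactions as (left side, right side) compound lists.
REACTIONS: Dict[str, Tuple[List[str], List[str]]] = {
    "rxn1": (["ascorbate", "H2O"], ["3-keto-L-gulonate"]),
    "rxn2": (["3-keto-L-gulonate", "ATP"], ["3-keto-L-gulonate-6-phosphate", "ADP"]),
    "rxn3": (["3-keto-L-gulonate-6-phosphate"], ["L-xylulose-5-phosphate", "CO2"]),
    "rxn4": (["L-xylulose-5-phosphate"], ["L-ribulose-5-phosphate"]),
    "rxn5": (["L-ribulose-5-phosphate"], ["xylulose-5-phosphate"]),
}


def guess_links(reactions: Dict[str, Tuple[List[str], List[str]]]) -> List[Tuple[str, str]]:
    """Return (predecessor, successor) pairs that share a compound."""
    links = []
    for pred_id, (_, pred_right) in reactions.items():
        for succ_id, (succ_left, _) in reactions.items():
            if pred_id != succ_id and set(pred_right) & set(succ_left):
                links.append((pred_id, succ_id))
    return links


for pred, succ in guess_links(REACTIONS):
    print(pred, "->", succ)
```

A heuristic like this breaks down once common side compounds (ATP, H2O and so on) appear in several reactions, which is one reason the links are drawn by hand in the Connections Editor.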
Pathway Curation • Class • Common Name • Synonyms • Evidence code • Citations • Comments • Links • Hypothetical reactions
Evidence Codes for Pathways • http://brg.ai.sri.com/ptools/evidence-ontology.html • EV-AS: Author statement • NAS - non-traceable • TAS - traceable • EV-COMP: Inferred from computation • AINF - artificial inference • HINF - human inference • EV-EXP: Inferred from experiment • IDA - inferred from direct assay • IEP - inferred from expression pattern • IGI - inferred from genetic interaction • IMP - inferred from mutant phenotype • IPI - inferred from physical interaction • EV-IC: Inferred by curator
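The two-level code list above can be held in a small lookup table. This sketch assumes that a full code is written as the parent code joined to the sub-code with a hyphen (for example EV-EXP-IDA); that convention and the `describe` helper are assumptions for illustration, so check the evidence ontology page for the authoritative forms.

```python
from typing import Dict, Tuple

# Parent code -> (label, {sub-code: label}), transcribed from the slide.
EVIDENCE: Dict[str, Tuple[str, Dict[str, str]]] = {
    "EV-AS": ("Author statement", {"NAS": "non-traceable", "TAS": "traceable"}),
    "EV-COMP": ("Inferred from computation",
                {"AINF": "artificial inference", "HINF": "human inference"}),
    "EV-EXP": ("Inferred from experiment", {
        "IDA": "inferred from direct assay",
        "IEP": "inferred from expression pattern",
        "IGI": "inferred from genetic interaction",
        "IMP": "inferred from mutant phenotype",
        "IPI": "inferred from physical interaction",
    }),
    "EV-IC": ("Inferred by curator", {}),
}


def describe(code: str) -> str:
    """Expand a code such as 'EV-EXP-IDA' (assumed format) into its labels."""
    for parent, (label, children) in EVIDENCE.items():
        if code == parent:
            return label
        if code.startswith(parent + "-"):
            child = code[len(parent) + 1:]
            if child in children:
                return f"{label}: {children[child]}"
    raise KeyError(f"unknown evidence code: {code}")


print(describe("EV-EXP-IDA"))
```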
Super Pathways • Create more complex metabolic networks using superpathways • Example: superpathway of alanine biosynthesis, composed of alanine biosynthesis I, alanine biosynthesis II and alanine biosynthesis III
Pathway Export • Export • Edit => Add Pathway to File Export List • File => Export => Selected Pathways to File
Constraint Checking • General rules that constrain the valid relationships among instances • Constraints are checked when new facts are asserted, to ensure that the DB remains logically consistent • Constraints on slots: • Domain violation: checks that the slots are in instances of the appropriate class • Range violation: • value type • value cardinality • Inverse • Cardinality • Lisp-predicate
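To make the domain, value-type and cardinality checks concrete, here is a minimal sketch of a slot checker. It is not how Pathway Tools implements constraints: `SlotSpec`, `check_slot`, the slot name and the class names are hypothetical, and inverse and Lisp-predicate constraints are left out.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SlotSpec:
    name: str
    domain: str                       # class whose instances may carry the slot
    value_type: type                  # required type of every value
    max_values: Optional[int] = None  # None means no cardinality limit


def check_slot(spec: SlotSpec, instance_class: str, values: List[Any]) -> List[str]:
    """Return the names of the constraints that the proposed values violate."""
    violations = []
    if instance_class != spec.domain:
        violations.append("domain")
    if any(not isinstance(v, spec.value_type) for v in values):
        violations.append("value type")
    if spec.max_values is not None and len(values) > spec.max_values:
        violations.append("cardinality")
    return violations


# Hypothetical slot: one string-valued name per compound.
spec = SlotSpec("COMMON-NAME", domain="Compounds", value_type=str, max_values=1)
print(check_slot(spec, "Compounds", ["ascorbate"]))
print(check_slot(spec, "Reactions", ["ascorbate", 7]))
```

Running the check before a new fact is stored, and rejecting any value with a non-empty violation list, matches the "checked when new facts are asserted" behavior described on the slide.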
Consistency Checking (correctify-kb) • Removes newlines from names • Converts “<” to “|” in string citations • Checks isozyme sequence similarity • Fixes references from polypeptides to genes • Changes compound names to IDs in a variety of slots • Matches physiological regulators to other regulators • Cross-references compounds to reactions • Checks pathway predecessors/reactions/subs • Checks reaction balancing • Checks compound structures • Calculates sub- and super-pathways • Finds missing sub-pathway links • Verifies chromosome components and positions
Run (correctify-kb) • Open the database Hb. pylori (HypCyc) • Run (correctify-kb) • Analyze output