460 likes | 635 Views
Reactions. Reactions. Represents information about a reaction that is independent of enzymes that catalyze the reaction Connected to enzyme(s) via enzymatic reaction frames Classified with EC system when possible Example: 2.7.7.7 – DNA-directed DNA polymerization
E N D
Reactions • Represents information about a reaction that is independent of enzymes that catalyze the reaction • Connected to enzyme(s) via enzymatic reaction frames • Classified with EC system when possible • Example: 2.7.7.7 – DNA-directed DNA polymerization • Carried out by five enzymes in E. coli
Enzymatic Reactions (DnaE and 2.7.7.7) • A necessary bridge between enzymes and “generic” versions of reactions • Carries information specific to an enzyme/reaction combination: • Cofactors and prosthetic groups • Alternative substrates • Links to regulatory interactions • Kinetic data (Km, Vmax, etc.) • Frame is generated when protein is associated with reaction (via protein or reaction editor)
Reaction Direction • Left/Right reflect direction of reaction as written by Enzyme Commission • Reflects systematic direction for different reaction classes • Left/Right do not necessarily correspond to physiological direction of a reaction • Get-rxn-direction(rxn) • Returns :L2R or :R2L or :BOTH or NIL • Integrates all available info about direction of this reaction • Direction explicitly curated for reaction • Direction(s) it occurs in all pathways in the PGDB • Direction(s) as specified in Enzymatic-Reactions
Slots of Reaction Frames • Balance-state • EC-number • Enzymatic-reaction • Generated in protein or reaction editor • In-pathway • Generated in pathway editor • Left and Right • Reaction-Direction • :left-to-right, :right-to-left, :reversible • Spontaneous?
Slots of Enzymatic-Reaction Frames • Enzyme • Reaction • Regulated-By • Cofactors • Kinetic slots: • Vmax, Km, Kcat, Specific-Activity, Temperature-Opt, pH-Opt • Reaction-Direction • Reaction may be reversible, but this enzyme effectively only catalyzes it in one direction.
Genes-of-reaction (rxn) Substrates-of-reaction (rxn) Enzymes-of-reaction (rxn) Lacking-ec-number (organism) Returns list of rxns with no ec numbers in that database Get-reaction-direction-in-pathway (pwy rxn) Reaction-type(rxn) Indicates types of Rxn as: Small molecule rxn, transport rxn, protein-small-molecule rxn (one substrate is protein and one is a small molecule), protein rxn (all substrates are proteins), etc. All-rxns(type) Specify the type of reaction (see above for type) Obtain-rxn-stats Returns six values Length of : all-rxns, transport, non-transport, etc… Semantic Inference Layer
Exercises • Find all small-molecule reactions that have no enzyme but are not spontaneous (“orphan” reactions) • Find all reactions that consume a given compound
Solutions to Exercises • Find all small-molecule reactions that have no enzyme but are not spontaneous (“orphan” reactions): • (loop for r in (all-rxns :small-molecule) when (and (not (slot-has-value-p r 'enzymatic-reaction) (not (get-slot-value r 'spontaneous?))) collect r)
Solutions to Exercises, cont. • Find all reactions that consume a given compound: (defun rxns-consuming-cpd (cpd) (append (loop for r in (get-slot-values cpd ‘appears-in-left-side-of) for dir = (get-rxn-direction r) if (or (eq dir :l2r) (eq dir :both)) collect r) (loop for r in (get-slot-values cpd ‘appears-in-right-side-of) for dir = (get-rxn-direction r) if (or (eq dir :r2l) (eq dir :both)) collect r)))
RNAs • PGDBs only represent RNAs that are “terminal gene products” • tRNAs • rRNAs • Regulatory RNAs • Miscellaneous small RNAs • Slots similar to proteins • tRNAs can have an anticodon
What is a Pathway? • An ordered set of interconnected, directed biochemical reactions • Reactions form a coherent unit, e.g. • Regulated as a single unit • Evolutionarily conserved across organisms as a single unit • When combined, perform a single cellular function • Historically grouped together as a unit • Includes metabolic pathways and signaling pathways • Evidence for all reactions in a single organism • Pathways can be linear, cyclical, branched, or some combination
Internal Representation of Metabolic Pathways • REACTION-LIST: unordered list of reactions that comprise the pathway • PREDECESSORS: list of reaction pairs that define ordering relationships between reactions. E.g. R1 R2 C A B R3 D (R2 R1) : Predecessor of R2 is R1 (R3 R1) : Predecessor of R3 is R1 (R1) : R1 has no predecessor (can be omitted)
What is missing from Pathway Representation? • Reaction directions • Some reactions are unidirectional, but many are reversible – how do we know in which direction to draw the reaction? • Main vs. side substrates A B C D E F • Main compounds form the backbone of the pathway • substrates shared between connecting reactions • major inputs and outputs. • Side compounds omitted from pathway diagrams at low detail levels • Individual reactions do not necessarily have main and side compounds – a particular substrate may be either a main or a side depending on the pathway context.
Computing Directionality and Mains/Sides Our philosophy: Enable curator to specify as little as possible. Compute as much as possible. This reduces redundancy and potential for inconsistencies. Example: Reactions R1: A + B C + D R2: B E Predecessors: (R2 R1) • Only substrate overlap is B • B must be a main substrate • A must be a side substrate, • R1 must proceed from right to left • R2 must proceed from left to right C + D B E A
But… Unfortunately, mains, sides and reaction directions are sometimes ambiguous: • At beginnings and ends of pathways • Use heuristics to determine main/side substrates at beginnings, ends of pathways • Not always what the curator wants • Substrate overlap with both sides of a reaction, e.g. A + B C + D C + B E • Solution: Additional slot PRIMARIES, should only be populated when necessary: PRIMARIES: (R (A B) (C)) says that for reaction R, A and B are both main reactants, and C is a main product.
More Complications… • ENZYMES-NOT-USED: a reaction may be catalyzed by multiple enzymes, but not all the enzymes necessarily participate in a given pathway • Not present in the same compartment with rest of pathway enzymes • Down-regulated or not expressed under conditions in which pathway is active • ENZYMES-NOT-USED slot lists enzymes that are not involved in the pathway even though they catalyze one of its reactions. • LAYOUT-ADVICE: helps software draw pathway correctly, e.g. in a cyclical pathway, tells which substrate should be at the top. • HYPOTHETICAL-REACTIONS: list of reactions in the pathway that are considered hypothetical (i.e. no direct experimental evidence)
Polymerization Pathways … X[n] X[n+1] X[10] • POLYMERIZATION-LINKS: specifies reactions that should be connected by a polymerization link (X R1 R1) --- REACTANT-NAME-SLOT: N-NAME --- PRODUCT-NAME-SLOT: N+1-NAME • CLASS-INSTANCE-LINKS: specifies when a link should be drawn between a substrate class and some instance of it (necessary only if instance is not a member of some reaction, so no predecessor relationship can be defined) R1 --- PRODUCT-INSTANCES: X[10]
Pathway Links • Can be used as an alternative or in addition to defining super-pathways • Link must be to or from some main substrate in the pathway • Other end of link can be a pathway, a reaction, or an arbitrary text string • Software automatically computes direction of link, but curator can override it
Super-Pathways • Collection of pathways that connect to each other via common substrates or reactions, or as part of some larger logical unit • Can contain both sub-pathways and additional connecting reactions • Can be nested arbitrarily • REACTION-LIST: a pathway ID instead of a reaction ID in this slot means include all reactions from the specified pathway • PREDECESSORS: a pathway ID instead of a tuple in this slot means include all predecessor tuples from the specified pathway
Signaling Pathways • Signaling pathways have different layout conventions than metabolic pathways • Layout is done manually by curator using specialized editor • Curator has more control • Lack of automatic layout algorithm means that diagram won’t update automatically when data changes.
Querying Pathways Programmatically • See http://bioinformatics.ai.sri.com/ptools/ptools-resources.html • (all-pathways) • (base-pathways) • Returns list of all pathways that are not super-pathways • (genes-of-pathway pwy) • (unique-genes-of-pathway pwy) • Returns list of all genes of a pathway that are not also part of other pathways • (enzymes-of-pathway pwy) • (compounds-of-pathway pwy) • (get-reaction-list pwy) • (variants-of-pathway pwy) • Returns all pathways in the same variant class as a pathway • (get-predecessors rxn pwy), (get-successors rxn pwy) • (get-rxn-direction-in-pathway pwy rxn) • (pathway-inputs pwy), (pathway-outputs pwy) • Returns all compounds consumed (produced) but not produced (consumed) by pathway (ignores stoichiometry)
Exercises • Find all genes involved in metabolic pathways • Find all compounds that are unique to a single pathway • Find the reactions of a pathway that have multiple isozymes
Solutions to Exercises • Find all genes involved in metabolic pathways: • (remove-duplicates (loop for p in (all-pathways) append (genes-of-pathway p))) • Find all compounds that are unique to a single pathway: • (loop for p in (base-pathways) append (loop for c in (compounds-of-pathway p) when (null (remove p (pathways-of-compound c))) collect (list c p)))
Solutions to Exercises, cont. • Find the reactions of a pathway that have multiple isozymes: • (defun rxns-w-multiple-isozymes (pwy) (loop for rxn in (get-reaction-list pwy) for enzymes = (enzymes-of-reaction rxn) when (> (length enzymes) 1) collect rxn))
Regulation • Class Regulation with subclasses that describe different biochemical mechanisms of regulation • Slots: • Regulator • Regulated-Entity • Mode • Mechanism
Regulation of Enzyme Activity • Class Regulation-of-Enzyme-Activity • Each instance of the class describes one regulatory interaction • Slots: • Regulator -- usually a small molecule • Regulated-Entity -- an Enzymatic-Reaction • Mechanism -- One of: • Competitive, Uncompetitive, Noncompetitive, Irreversible, Allosteric, Unkmech, Other • Mode -- One of: + , - • Physiologically-Relevant?
Transcription-Units, Promoters, Terminators, Binding-Sites • Transcription-Unit • One or more genes, transcribed as a unit • Any gene, promoter, etc. can belong to multiple transcription-units • One or zero promoters • Zero or more terminators, binding-sites • Transcription-Direction: +, - • Promoter • Absolute-plus-1-pos = transcription start site • Binds-Sigma-Factor • Terminator • Rho-Dependent or Rho-Independent • Left-End-Position, Right-End-Position • Binding-Site • DNA-Binding-Site or mRNA-Binding-Site • Left-End-Position, Right-End-Position • Involved-in-Regulation
Transcription Initiation • Class Transcription-Factor-Binding • Slots: • Regulator -- instance of Proteins or Complexes (a transcription-factor) • Regulated-Entity -- instance of Promoters or Transcription-Units or Genes • Mode -- One of: + , - • Associated-Binding-Site
Attenuation • Class Transcriptional-Attenuation • Several subclasses depending on type of attenuation • Slots common to all: • Regulator -- Depends on subtype of attenuation • Regulated-Entity -- instance of Terminators or Genes or Transcription-Units • Mode -- One of: + , - • Slots particular to one or more subclasses: • Associated-Binding-Site • Antiterminator-Start-Pos, Antiterminator-End-Pos • Pause-Start-Pos, Pause-End-Pos
Attenuation Subtypes • Protein-Mediated-Attenuation • RNA-Mediated-Attenuation • Small-Molecule-Mediated-Attenuation • Regulator = A protein/RNA/small molecule • Leader transcript binds protein/RNA/small molecule and determines formation of terminator or antiterminator • Ribosome-Mediated-Attenuation • Regulator = charged tRNA • Ribosome pauses, determining whether terminator or antiterminator forms • RNA-Polymerase-Modification • Regulator = instance of Proteins or Complexes • Regulatory protein binds to site in transcription unit and interacts with RNA polymerase to determine termination • Rho-Blocking-Antitermination
Regulation of Translation • Class Regulation-of-Translation • Several subclasses depending on type of attenuation • Compound-Mediated (i.e. riboswitch) • RNA-Mediated • Protein-Mediated • Slots: • Regulator -- Depends on subtype of attenuation • Regulated-Entity -- Transcription-Unit or Gene • Mode -- One of: + , - • Mechanism – Translation-Blocking, mRNA-Degradation, and/or Translation-Attenuation • Associated-Binding-Site
Transcription-Unit API functions • (transcription-unit-genes tu) • (transcription-unit-promoter tu) • (transcription-unit-binding-sites tu) • (transcription-unit-mrna-binding-sites tu) • (transcription-unit-terminators) • (transcription-unit-transcription-factors) • (terminators-affecting-gene gene) • (containing-tus frame) • (binding-site-transcription-factors bs)
Regulation API Functions • (genes-regulating-gene gene) • (genes-regulated-by-gene gene) • Includes all transcriptional or translational regulation • (direct-activators frame) • (direct-inhibitors frame) • Frame will be an enzymatic-reaction, transcription-unit, promoter, etc. – this is a low-level function • (transcription-unit-activators tu) • (transcription-unit-inhibitors tu) • Includes transcriptional and translational regulators, and regulators of promoter as well as direct regulators of tu
Exercises • Find all DNA-binding-sites that regulate a gene • Find only those substrate-level regulators of an enzyme that are considered physiologically relevant • Find all inhibitors of an enzyme, including transcriptional, translational and substrate-level inhibition.
Solution to Exercises • Find all DNA-binding-sites that regulate a gene: (defun gene-binding-sites (gene) (remove-duplicates (loop for tu in (containing-tus gene) append (transcription-unit-binding-sites tu)))) • Find only those substrate-level regulators of an enzyme that are considered physiologically relevant: (defun relevant-regulators (enz) (loop for enzrxn in (get-slot-values enz ‘catalyzes) append (loop for regframe in (get-slot-values enzrxn ‘regulated-by) when (get-slot-value regframe ‘physiologically-relevant?) collect (get-slot-value regframe ‘regulator) )))
Solution to Exercises, cont. • Find all inhibitors of an enzyme, including transcriptional, translational and substrate-level inhibition: (defun enzyme-inhibitors (enz) (let* ((genes (genes-of-enzyme enz)) (tus (remove-duplicates (loop for g in genes append (containing-tus g)))) (terminators (remove-duplicates (loop for g in genes append (terminators-affecting-gene g)))) (enzrxns (get-slot-values enz 'catalyzes)) ) (append (loop for enzrxn in enzrxns append (direct-inhibitors enzrxn)) (loop for tu in tus append (transcription-unit-inhibitors tu)) (loop for term in terminators append (direct-inhibitors term)) )))