Learn about the concept of generic reactions, instantiation algorithms, practical examples, and debugging methods in biochemical pathways to ensure mass balance and compound classification accuracy. Understand the complexities and requirements of compound instances in BioCyc Pathway/Genome Databases. Dive into the specifics of instantiation IDs, polymerization pathways, and special cases like Electron Transfer Reactions and compartment-specific instantiations.
Instantiation of Generic Reactions by Markus Krummenacker Q4 2011
Generic Reactions
• Some enzymes have broad substrate specificity.
• Therefore, many EC reactions are formulated as generic reactions, with compound classes as some of their substrates. Often, the full specificity range is unknown.
• A generic reaction is a more compact representation than listing every instance reaction explicitly.
• BioCyc PGDBs have many generic reactions.
• Examples:
  • 1.1.1.69: NAD(P)+ + D-gluconate = NAD(P)H + 5-dehydro-D-gluconate + H+
  • 1.3.99.3: a 2,3,4-saturated fatty acyl CoA + FAD -> FADH2 + a 2,3-dehydroacyl-CoA
Problem with FBA
• The reaction network in FBA models is formulated in terms of specific compound instances.
• Problem: there is a disconnect between class frames and instance frames. A generic reaction that refers to a compound class does not connect to reactions that refer to instances of that class.
• Example (to eventually produce cardiolipin): the first reaction produces the instance NADH, while the second consumes the class NAD(P)H.
  • D-glyceraldehyde-3-phosphate + phosphate + NAD+ -> 1,3-bisphospho-D-glycerate + H+ + NADH
  • dihydroxyacetone phosphate + NAD(P)H + H+ -> sn-glycerol-3-phosphate + NAD(P)+
• Remedy: automatically generate instance-based versions of each generic reaction.
• This runs as a preprocessing step. Instance reactions are not saved to the PGDB.
Cases of Generic Reactions
• Individual generic reactions: these can be part of pathways or standalone. This is the most common case.
• Polymerization pathways: a series of reactions needs to be instantiated over several cycles.
• Single polymerization reactions, as in glycogen metabolism: not handled currently.
• The success rate of instantiation depends on how thoroughly a PGDB was curated. There are still many generic reactions for which instantiation does not work well.
Instantiation Algorithm
• Generic reaction: |Xs| + H2O = |Ys|
  • |Xs| is a class with instances X1 X2 X3
  • |Ys| is a class with instances Y1 Y2
• The instantiation code tries to pair all instances on the LEFT and RIGHT sides with each other, substituting them for the classes. This leads to temporary reactions like:
  • X1 + H2O = Y1
  • X2 + H2O = Y1
  • X1 + H2O = Y2
  • etc.
• For a given instance in |Xs|, test whether exactly 1 instance in |Ys| leads to a mass-balanced reaction equation. If so, create the instance-based reaction on the fly.
• (No chemical structure matching yet.)
• If an existing reaction frame for the instance-based reaction can be found, it is used instead of the instantiation.
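The pairing and mass-balance test described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Pathway Tools code: the formulas, the class membership table, and the function names (`expand`, `is_balanced`, `instantiate`) are all made up for the example, and charge is ignored.

```python
from collections import Counter
from itertools import product

# Hypothetical toy formulas (element counts); not taken from any PGDB.
FORMULAS = {
    "H2O": {"H": 2, "O": 1},
    "X1": {"C": 2, "H": 4},
    "X2": {"C": 3, "H": 6},
    "X3": {"C": 4, "H": 8},
    "Y1": {"C": 2, "H": 6, "O": 1},
    "Y2": {"C": 3, "H": 8, "O": 1},
}
# Hypothetical class membership.
CLASSES = {"|Xs|": ["X1", "X2", "X3"], "|Ys|": ["Y1", "Y2"]}


def total(side):
    """Sum the element counts of all compounds on one side."""
    acc = Counter()
    for name in side:
        acc.update(FORMULAS[name])
    return acc


def is_balanced(left, right):
    """Mass balance only: both sides carry the same atoms."""
    return total(left) == total(right)


def expand(side):
    """All ways to substitute instances for the classes on one side."""
    choices = [CLASSES.get(name, [name]) for name in side]
    return [list(combo) for combo in product(*choices)]


def instantiate(left, right):
    """Classify each left-side substitution by how many right sides balance it."""
    successes, ambiguous, failures = [], [], []
    for l in expand(left):
        balanced = [r for r in expand(right) if is_balanced(l, r)]
        if len(balanced) == 1:
            successes.append((l, balanced[0]))   # unique match: instantiate
        elif balanced:
            ambiguous.append((l, balanced))      # several balanced candidates
        else:
            failures.append(l)                   # nothing balances
    return successes, ambiguous, failures


successes, ambiguous, failures = instantiate(["|Xs|", "H2O"], ["|Ys|"])
for l, r in successes:
    print(" + ".join(l), "=", " + ".join(r))
```

With these toy formulas, X1 and X2 each find exactly one balanced partner, while X3 has none and ends up in the failure list. Two instances of |Ys| with the same formula would instead land in the ambiguous list, since mass balance alone cannot tell them apart.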
Requirements
• Reactions have to be fully mass balanced.
• Compound instances need to be created, with structures.
• Compound structures need protonation at pH 7.3.
• Compound instances have to be correctly classified under the classes used in generic reactions.
• Right-click command Edit->Compound Editor
• Multiple instances with an identical chemical formula will be ambiguous, because mass balance alone cannot distinguish them.
Practical Example
• Right-click command “Show reaction’s instantiations in terminal”
• EC# 1.1.1.69
• GLUCONATE-5-DEHYDROGENASE-RXN : NAD(P)+ + D-gluconate = NAD(P)H + 5-dehydro-D-gluconate + H+ [balanced]
• successes:
  • NADP+ + D-gluconate = NADPH + 5-dehydro-D-gluconate + H+
  • NAD+ + D-gluconate = NADH + 5-dehydro-D-gluconate + H+
• failures:
• non-unique-balanced-instantiations (cannot decide which of several instantiations is correct):
• success vs. failures vs. non-unique-balanced-instantiations: 2 / 0 / 0
Debugging of Pathways
• Right-click command “Show pathway’s instantiated reactions in terminal”
• Conveniently shows the results for all reactions of the pathway.
• Debugging: if there are many problems, it helps to add a compound that occurs early in the pathway to the biomass, to see whether at least that compound can be produced.
• Example pathways to instantiate:
  • proline biosynthesis I
  • CDP-diacylglycerol biosynthesis I
Special ETR Instantiation
• Electron Transfer Reactions (ETRs) usually refer to quinone classes.
• Different isoprenoid tail lengths exist in various organisms.
• For now, instantiation is hard-coded to use ubiquinone-8 and menaquinone-8.
• Future plan: use the NCBI taxonomy to select the tail length. For example, B. subtilis uses menaquinone-7.
Special Compartments Instantiation
• A schema change in BioCyc 15.0 altered how the compartments of reactions are represented.
• Now, 1 reaction can be assigned to multiple compartments.
• FBA makes compartment-specific instantiated reactions to differentiate between the compartments.
Syntax of Instantiation IDs
• Every instantiated reaction gets assigned a unique ID.
• Visible in the .sol file.
• Constructed from the ID of the generic reaction and the IDs of the instance compounds on the left and right.
• Format: GEN-RXN-ID-L1/L2//R1/R2.suffix-len.
• Non-default compartments:
  • GEN-RXN-ID[CCO-PERI-BAC]-L1/etc….
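As a rough illustration of how such an ID is assembled, the sketch below joins the pieces in the order shown on the slide. The function name `instantiation_id` is hypothetical, and the trailing `.suffix-len` part is deliberately left out because the slide does not explain how it is computed.

```python
def instantiation_id(generic_id, left_ids, right_ids, compartment=None):
    """Build GEN-RXN-ID-L1/L2//R1/R2 (hypothetical helper, suffix omitted).

    A non-default compartment is appended to the generic reaction ID in
    square brackets, following the GEN-RXN-ID[CCO-PERI-BAC]-... pattern.
    """
    base = generic_id if compartment is None else f"{generic_id}[{compartment}]"
    left = "/".join(left_ids)     # left-side instance compound IDs
    right = "/".join(right_ids)   # right-side instance compound IDs
    return f"{base}-{left}//{right}"


print(instantiation_id("GEN-RXN-ID", ["L1", "L2"], ["R1", "R2"]))
print(instantiation_id("GEN-RXN-ID", ["L1", "L2"], ["R1", "R2"],
                       compartment="CCO-PERI-BAC"))
```

The single slash separates compounds on the same side, and the double slash separates the left side from the right side.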
Polymerization Pathways
• Cyclic pathways of generic reactions
Polymerization Pathway Instantiation
• A series of instantiated reactions is needed to reach a product of a certain length.
• The cycle is run for 8 iterations (hard-coded, for now).
• Structures of class compounds have R groups.
• The hallmark of a polymerization pathway is that 1 reaction, the polymerization step, is unbalanced.
• For now, the chemical formula of the imbalance is determined, which stands for the monomer unit. (No structural information is used yet.)
• Appropriate instance compounds are searched for by replacing the R groups with an integer multiple of the imbalance.
• Still a bit experimental.
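The formula-based search described on this slide might look roughly like the sketch below. It is an assumption-laden illustration, not the actual implementation: the helper names, the choice of "left minus right" as the sign of the imbalance, and the toy formulas are all invented for the example, and charge is ignored.

```python
from collections import Counter


def imbalance(left_formulas, right_formulas):
    """Atoms present on the left but missing on the right (assumed sign).

    For the unbalanced polymerization step, this difference is taken to be
    the monomer unit absorbed into the growing chain.
    """
    diff = Counter()
    for f in left_formulas:
        diff.update(f)
    for f in right_formulas:
        diff.subtract(f)
    return {el: n for el, n in diff.items() if n != 0}


def candidate_formulas(core, monomer, max_units=8):
    """Formulas with the R group replaced by 1..max_units monomer units."""
    out = {}
    for n in range(1, max_units + 1):
        f = Counter(core)
        for el, k in monomer.items():
            f[el] += n * k
        out[n] = dict(f)
    return out


def find_instances(core, monomer, known, max_units=8):
    """Map chain length n to a known compound whose formula matches."""
    by_formula = {frozenset(f.items()): name for name, f in known.items()}
    hits = {}
    for n, f in candidate_formulas(core, monomer, max_units).items():
        name = by_formula.get(frozenset(f.items()))
        if name is not None:
            hits[n] = name
    return hits


# Hypothetical toy data: a core without its R group, and a two-carbon donor.
core = {"C": 2, "H": 3, "O": 2}
monomer = imbalance([core, {"C": 2, "H": 4}], [core])
known = {
    "acid-C4": {"C": 4, "H": 7, "O": 2},
    "acid-C6": {"C": 6, "H": 11, "O": 2},
}
print(monomer)
print(find_instances(core, monomer, known))
```

Because only formulas are compared, two isomers with the same formula would collide in the lookup table, which mirrors the ambiguity caveat that applies to formula-only matching in general.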
Instantiated reactions in .sol
• In the .sol file, instantiated reactions are listed in full detail, in the reactions sections.
• In the .dat file, to be used for the Cellular Overview, the fluxes of instantiated reactions are all combined into a value for the base generic reaction, because the Cellular Overview can only show the latter.
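Combining instance fluxes into one value per generic reaction can be pictured as a simple aggregation. The sketch below assumes that "combined" means summed and that a lookup table from instantiated reaction ID to base generic reaction ID is available. Both are assumptions for illustration, and the reaction IDs are placeholders.

```python
from collections import defaultdict


def combine_fluxes(instance_fluxes, base_of):
    """Sum fluxes of instantiated reactions onto their base generic reaction.

    instance_fluxes: reaction ID -> flux
    base_of: instantiated reaction ID -> generic reaction ID (hypothetical
             lookup table); IDs not listed there are kept unchanged.
    """
    totals = defaultdict(float)
    for rxn_id, flux in instance_fluxes.items():
        totals[base_of.get(rxn_id, rxn_id)] += flux
    return dict(totals)


fluxes = {
    "GEN-RXN-ID-L1//R1": 1.5,   # placeholder instantiated reaction IDs
    "GEN-RXN-ID-L2//R2": 0.5,
    "OTHER-RXN": 3.0,           # an ordinary, non-generic reaction
}
base_of = {
    "GEN-RXN-ID-L1//R1": "GEN-RXN-ID",
    "GEN-RXN-ID-L2//R2": "GEN-RXN-ID",
}
print(combine_fluxes(fluxes, base_of))
```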