
CSC 599: Computational Scientific Discovery

Explore the complexities of chemistry mechanisms and basic problems in the Mechem system, understanding the rules and patterns in molecules. Discover insights into chemical reactions and the challenges of automating the process.

Presentation Transcript


  1. Lecture 6: The Mechem System
     CSC 599: Computational Scientific Discovery

  2. Overview
     Domain: Chemistry Mechanisms
     Basic Mechem
     Problems with Basic Mechem
     Mechem with TEST-STEP and INFER-STRUCTURES

  3. The Domain
     Chemistry is a unique science:
     • The “rules” of the science are worked out (we think!)
       • Stoichiometry: conservation of mass/energy/charge
       • Valences of atoms
       • Patterns in molecules
         • Benzene rings
     • Complexity due to combinatorics
       • Infinite number of possible organic compounds
       • Many possible chemical mechanisms

  4. Example Mechanism
     6CH4 + 3O2 -> 5H2 + CH3CH3 + H2O + CH3OH + CH2O + CO + CO2
     • CH4 + M -> CH3M + H
     • CH3M -> M + CH3
     • O2 + CH3M -> CH3O2M
     • 2H -> H2
     • CH3 + CH3M -> M + CH3CH3
     • H + CH3O2M -> H2O + CH2OM
     (Not finished yet!)

  5. Example Mechanism (2)
     6CH4 + 3O2 -> 5H2 + CH3CH3 + H2O + CH3OH + CH2O + CO + CO2
     • CH3 + CH3O2M -> CH3OH + CH2OM
     • CH2OM -> M + CH2O
     • CH3O2M -> H + CH2O2M
     • CH2O + CH2O2M -> CH3O2M + CHO
     • CH2O2M + CHO -> CH3O2M + CO
     • CO + CH2O2M -> CH2OM + CO2
     (Whew!)
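
The steps on slides 4 and 5 each have to conserve atoms, which is easy to check mechanically. The following is a minimal sketch, not part of the lecture: the helper names `parse_formula`, `parse_side`, and `is_balanced` are made up for illustration, and the catalyst site `M` is simply treated as one more "element".

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a flat formula such as 'CH3O2M' (M = catalyst site)."""
    counts = Counter()
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(digits) if digits else 1
    return counts

def parse_side(side):
    """Sum atoms over one side of a step, e.g. '2H' or 'CH4 + M'."""
    total = Counter()
    for term in side.split("+"):
        coeff, formula = re.fullmatch(r"\s*(\d*)\s*(\S+)\s*", term).groups()
        for element, n in parse_formula(formula).items():
            total[element] += (int(coeff) if coeff else 1) * n
    return total

def is_balanced(step):
    """True if both sides of 'lhs -> rhs' contain the same atoms."""
    lhs, rhs = step.split("->")
    return parse_side(lhs) == parse_side(rhs)

# The six steps listed on slide 4.
slide4_steps = [
    "CH4 + M -> CH3M + H",
    "CH3M -> M + CH3",
    "O2 + CH3M -> CH3O2M",
    "2H -> H2",
    "CH3 + CH3M -> M + CH3CH3",
    "H + CH3O2M -> H2O + CH2OM",
]
for step in slide4_steps:
    print(step, "balanced" if is_balanced(step) else "NOT balanced")
```

The same check applies to the overall reaction: `is_balanced("6CH4 + 3O2 -> 5H2+CH3CH3+H2O+CH3OH+CH2O+CO+CO2")` returns `True`. Note that this only tests conservation of atoms; it says nothing about charge, valence, or whether a chemist would find the step plausible.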

  6. Example Mechanism (3)
     Let's step back and look at what we did:
     • It's “easy” because:
       • Apply rules of chemistry to posit legal steps
       • Assemble legal steps to accomplish overall reaction
     • It's “hard” because:
       • Combinatorics of all the reactions that COULD have been done
       • Why not, for example: 2CH3O2M -> CH3O4CH3O (chemists might think unlikely)
     • How would you search space of mechanisms?
     • Is it worth automating?

  7. Basic Mechem: the Big Picture
     MECHEM searches space of reactions
     • Exhaustive search
     • From simplest to increasingly more complex
     Inputs: all reagents, at least one product
     Outputs: first (i.e., simplest) mechanism from reagents to product(s)
     MECHEM will:
     • Search space of reaction pathways
     • Recall DENDRAL searched space of chemical structures

  8. Basic MECHEM, Issues
     • What is “simplicity” for mechanisms?
       • Number of “species” (atoms, molecules, or radicals) allowed
       • Number of reaction steps
     • Space is combinatorially huge!
       • Use rules of chemistry to limit search
       • Each reaction may have at most 2 reactants and 2 products
       • Disallow violations like C2H6O-1
       • Search formulas, not structures
         C2H6O can be either
         Ethyl alcohol: CH3CH2OH
         Dimethyl ether: CH3OCH3

  9. MECHEM Algorithm (1)
     findPathways(reagents, prods) {
       for ( maxSpeciesCount = reagents.count() + prods.count();
             ;
             maxSpeciesCount++, prods.addNewVariable() ) {
         sequenceList.setToEmpty();
         extendNum = ceiling((maxSpeciesCount - num(reagents)) / 2);
         do {
           sequenceList = sequenceList.extendBy(extendNum, reagents, prods);
           if ( sequenceList.hasSolution() )
             return( sequenceList.getSolution() );
           extendNum = 1;
         } while ( !sequenceList.isEmpty() )
       }
     }

  10. MECHEM Algorithm (2)
      sequenceList::extendBy(downCount, reactants, prods) {
        if (downCount == 0)
          return(this.sequenceList);
        newSequenceList.setToEmpty();
        for ( seq in this.sequenceList ) do
          for ( react in reactants ) do
            for ( prod in prods ) do {
              newSequence = seq.append(makeStep(react, prod));
              newSequence.inferAndInstantiateVarValues();
              if ( newSequence.getIsLegal() )
                newSequenceList.add(newSequence);
            }
        return(newSequenceList.extendBy(downCount-1, reactants, prods));
      }
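
The pseudocode on slides 9 and 10 leaves most helpers undefined, so here is a much-simplified runnable sketch of the same "simplest first, exhaustive" idea. It is not MECHEM itself, and it rests on several assumptions: candidate species come from a fixed `pool` instead of variables like X and Y, a species counts as available once any step has produced it (no bookkeeping of amounts), and "simplest" means fewest steps only. All function names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations_with_replacement

def atoms(side):
    """Total atom counts for a tuple of flat formulas, e.g. ('CH4', 'M')."""
    total = Counter()
    for formula in side:
        for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
            total[element] += int(digits) if digits else 1
    return total

def legal_steps(pool):
    """All balanced steps with 1-2 reactants and 1-2 products from pool."""
    sides = [s for k in (1, 2)
             for s in combinations_with_replacement(sorted(pool), k)]
    return [(lhs, rhs) for lhs in sides for rhs in sides
            if lhs != rhs and atoms(lhs) == atoms(rhs)]

def find_pathway(reagents, targets, pool, max_steps=6):
    """Return the first mechanism found at the smallest step count, or None."""
    steps = legal_steps(pool)

    def extend(available, path, remaining):
        if targets <= available:
            return path
        if remaining == 0:
            return None
        for lhs, rhs in steps:
            # A step is usable if its reactants exist and it makes something new.
            if set(lhs) <= available and not set(rhs) <= available:
                found = extend(available | set(rhs),
                               path + [(lhs, rhs)], remaining - 1)
                if found is not None:
                    return found
        return None

    for depth in range(1, max_steps + 1):  # fewest steps first
        found = extend(frozenset(reagents), [], depth)
        if found is not None:
            return found
    return None

pool = {"CH4", "M", "CH3M", "H", "H2", "CH3", "CH3CH3"}
path = find_pathway({"CH4", "M"}, {"H2", "CH3CH3"}, pool)
for lhs, rhs in path:
    print(" + ".join(lhs), "->", " + ".join(rhs))
```

On this toy pool the search returns the single step `CH4 + CH4 -> CH3CH3 + H2`, which balances but skips the catalyst entirely. That is a small instance of the problem raised on slide 6: a formula-level search happily proposes steps a chemist might consider unlikely.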

  11. Example: n1(C7H9N) + n2(CH2O) --> n3(C17H18N2) + n4(H2O)
      maxSpeciesCount = 6; initMaxReactionSteps = 3
      • Initialize reactions as half steps:
        CH2O -> . . .
        C7H9N -> . . .
        2CH2O -> . . .
        2C7H9N -> . . .
        CH2O + C7H9N -> . . .
        (RECALL: 1 and 2 reagent reactions only)
      • Reject full step reactions that are illegal
        • Example: Reject C7H9N -> H2O
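
The five half steps on slide 11 are exactly the ways to pick one or two reactant molecules, with repetition, from the two reagents. A small sketch of that enumeration (the function name `half_steps` is made up for illustration):

```python
from itertools import combinations_with_replacement

def half_steps(reagents):
    """Left-hand sides built from one or two reactant molecules."""
    return [side for k in (1, 2)
            for side in combinations_with_replacement(reagents, k)]

for side in half_steps(["CH2O", "C7H9N"]):
    print(" + ".join(side), "-> . . .")
```

With n available species this yields n + n(n+1)/2 left-hand sides, so the list grows quadratically as intermediates are added to the species pool.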

  12. Example (2): Legal Whole Steps
      • CH2O -> H2O + X
      • CH2O -> X + Y
      • 2(CH2O) -> X
      • 2(CH2O) -> H2O + X
      • 2(CH2O) -> X + Y
      • 2(C7H9N) -> X
      • 2(C7H9N) -> X + Y
      • CH2O + C7H9N -> X
      • CH2O + C7H9N -> H2O + X
      • CH2O + C7H9N -> X + Y
      • C7H9N -> X + Y
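
When a step has a single unknown product X, conservation of atoms pins down its formula: X is whatever is left after subtracting the known products from the reactants. The sketch below shows one plausible reading of what `inferAndInstantiateVarValues` and `getIsLegal` on slide 10 need to do at the formula level; the names `atoms` and `infer_unknown` are assumptions, not taken from the lecture.

```python
import re
from collections import Counter

def atoms(formulas):
    """Total atom counts over a list of flat formulas."""
    total = Counter()
    for formula in formulas:
        for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
            total[element] += int(digits) if digits else 1
    return total

def infer_unknown(reactants, known_products):
    """Atoms left over for one unknown product X, or None if the step is illegal."""
    left = atoms(reactants)
    right = atoms(known_products)
    if any(right[e] > left[e] for e in right):
        return None  # a known product needs atoms the reactants lack
    remainder = left - right  # Counter subtraction drops zero counts
    return dict(remainder) if remainder else None

# CH2O + C7H9N -> H2O + X leaves C8H9N for X.
print(infer_unknown(["CH2O", "C7H9N"], ["H2O"]))
# C7H9N -> H2O is rejected: there is no oxygen on the left.
print(infer_unknown(["C7H9N"], ["H2O"]))
```

Steps with two unknowns (X + Y) are not determined by subtraction alone; there the remainder only constrains the sum of the two formulas.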

  13. Legal 2 Step Reactions
      Only 2 reagents, only 2 reactions
      • 91 legal two-step reactions (Yikes!)
      Here are the 4 that can be extended:
      • CH2O + C7H9N -> H2O + X
        CH2O + X -> H2O + Y
      • CH2O + C7H9N -> H2O + X
        CH2O + X -> Y
      • CH2O + C7H9N -> H2O + X
        2X -> C7H9N + X
      • CH2O + C7H9N -> H2O + X
        2X -> Y

  14. Basic MECHEM analysis
      Found 5 solutions, 6 steps each
      Another example:
      C3H8 (propane) + O2 (oxygen) -> C2H4O2 (acetic acid) + C3H6O (acetone) + C3H8O (isopropanol) + C2H4O (acetaldehyde) + C4H8O2 (ethyl acetate) + CH4O (methanol) + CO2 (carbon dioxide)
      • 16 pathways, 10 species, 6 steps
      • Several hours on DECstation 3100 (14 MIPS)
      Yikes!

  15. What would you do?
      10 species, 6 steps --> several hours
      That reaction ain't all that big
      How would you speed it up?
      Other forms of knowledge to reduce search?
      (What information did we ignore in Basic Mechem?)

  16. What would you do?
      10 species, 6 steps --> several hours
      That reaction ain't all that big
      How would you speed it up?
      Other forms of knowledge to reduce search?
      (What information did we ignore in Basic Mechem?)
      • Search formulas, not structures
      BINGO!

  17. Mechem with Structural Help
      Idea: Keep track of structure of molecules and radicals
      Heuristics:
      Let N = max. number of (topological) bonds created/destroyed
      By default, for all steps N <= 3

  18. Mechem with Structural Help (2)
      Example
      • Notation: [(#Cs) (#Hs) (#Os) (#Ms)], M = “metal” (catalyst)
      • Reaction has:
        . . . .
        W[1 3 2 1] -> V[0 1 0 0] + Y[1 2 2 1]
        . . . .
        With N = 2: no possible Y
        With N = 3: Y could be 2 things: MCH2OO or MCHOOH
        With N = 4: Y could be 18 things!
      • Handles 12 step reactions (hrs on Silicon Graphics Indigo)
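
In the bracket notation of slide 18, a species is a vector of atom counts, so conservation becomes plain vector arithmetic: Y = W - V. The sketch below checks the slide's example and prints each vector as a flat formula. The names `ELEMENTS`, `to_formula`, and `remaining_species` are illustrative only.

```python
ELEMENTS = ("C", "H", "O", "M")  # order used by the slide's notation

def to_formula(vec):
    """Render a count vector such as (1, 2, 2, 1) as a flat formula."""
    return "".join(sym + (str(n) if n > 1 else "")
                   for sym, n in zip(ELEMENTS, vec) if n > 0)

def remaining_species(w, v):
    """Count vector of Y in W -> V + Y; raises if V needs atoms W lacks."""
    y = tuple(a - b for a, b in zip(w, v))
    if any(n < 0 for n in y):
        raise ValueError("V contains atoms that W does not have")
    return y

W = (1, 3, 2, 1)
V = (0, 1, 0, 0)
Y = remaining_species(W, V)
print(to_formula(W), "->", to_formula(V), "+", to_formula(Y))
```

This fixes only the formula of Y (CH2O2M). The two N = 3 candidates on the slide, MCH2OO and MCHOOH, share that formula and differ in structure, which is exactly the information the bond-count heuristic N is meant to prune.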

  19. Other Ideas for Speeding Mechem
      • Max # of atoms/element, oxidation state, etc.
      • Bi-directional search
      • List of “compounds” that should not appear
        • Can do spectroscopy to see if they really do occur, even if transient
      • A.I. notion of “easy” ≠ chemist notion of “easy”
        • AI researcher: “Minimize number of steps!”
        • Chemists: “Minimize energy of rate determining step!”
        • Compare “cost” of bond-breakings
      • Mechem (rxn step proposer) + ChemNet (reaction network)

  20. Take Home Message: About Combinatorial Search
      • Finds “all” solutions? Finds “the best” solution?
        Yes, when scientists' notion of “simplest” == CSD researcher's notion of “simplest”
      • Takes forever! / only for toy problems
        Use domain knowledge
      Domain knowledge usage
      • Started simply (initially no structures)
      • Used more when needed to solve harder problems (structure knowledge -> solve bigger problems)
      • Eventually added so much knowledge that architecture changed (time to redesign algorithm?)
      • HOW WOULD YOU DO BETTER THAN MECHEM?
