
Use of Chemical Information in Organic Synthesis



Presentation Transcript


  1. Use of Chemical Information in Organic Synthesis
     Reaction Information for the Practicing Synthetic Chemist: The Search for Relevant Answers
     Guenter Grethe, May 2006
     AGENDA:
     • Available information
     • Introduction to reaction data searching
     • Concepts and problems
     • Basis of reaction classification
     • DiscoveryGate
     • Retrieving relevant information for the synthesis of new compounds
     • Questions & Answers

  2. Information Needs of Synthetic Organic Chemists in Basic Research and Development
     • new preparation of intermediates and starting materials
     • well established, high yield preparations (experimental procedures)
     • new synthetic methodologies (new reagents, catalysts etc.)
     • information on starting materials (availability, price, physical data etc.)
     • physical properties of reagents, solvents and catalysts
     • access to the primary, secondary, and tertiary literature
     • spectral information of related compounds
     General: searching for information on molecules precedes retrieval of synthetic methodology data

  3. Differences in Molecule vs. Reaction Searching
     Molecules:
     • Query: Is this particular molecule or similar ones known? Specific data?
     • Answer: Yes or No from existing databases, including patents
     Reactions:
     • Query: How to selectively reduce the nitrile group (transformation)? Reaction conditions?
     • Answer: Pointers to relevant examples in the literature
     • Criteria:
       • Efficient transformation
       • Functional group compatibility
       • Reaction conditions

  4. Available Reaction Databases
     • Online:
       • CASREACT (CAS): ca. 10.5 million reactions, including the Spresi database, 1985 - present
       • Spresi (InfoChem): ca. 4.5 million, 1974 - 2004
       • CrossFireplusReactions (Elsevier MDL, STN): ca. 10 million, 1779 - present
       • ChemInform RX on STN (FIZ Chemie): ca. 0.8 million
       • CCR (Thomson Scientific): ca. 0.6 million
     • In-house:
       • ChemInform Reaction Library (Elsevier MDL)
       • Spresi (InfoChem)
       • CrossFire Beilstein (Elsevier MDL)
       • Specialty Databases (several vendors)
       • Proprietary Databases
     For a good review see: Zass, E. "Reaction Databases". In: Encyclopedia of Computational Chemistry; Schleyer, P. von R.; Allinger, N.L.; Clark, T.; Gasteiger, J.; Kollman, P.A.; Schaefer, H.F.; Schreiner, P.R. (Eds.); Wiley, Chichester, 1998; Vol. 4, 2402-2420 (call number QD39.3.E46 E53 1998).

  5. Use of Available Information in Synthesis
     • Preparation of a distinct compound requires
       • access to information about new synthetic methodologies in journals and databases
       • experimental details for the preparation of known intermediates and starting materials from databases, journals and other sources
       • tools to plan syntheses and select optimal reaction conditions
     • Preparation of a library of diverse compounds requires
       • all of the above
       • knowledge about the characteristics of functional groups
       • information about available building blocks
     • Process development requirements are defined by
       • access to information about various reaction conditions of a reaction
       • knowledge about the characteristics of molecules or their fragments under the required reaction conditions
       • tools to calculate the behavior of reagents, solvents, and catalysts

  6. Barriers Impeding the Use of Available Information by End Users
     • multiple access systems
     • different user interfaces
     • different modi operandi
     • difficult query formulation
     • substructure concept
     • keyword inconsistencies
     • limited post-search management of large hitlists
     • some integrated access to other information sources
     Most importantly: failure of available systems to recognize and to facilitate the integration of the vast knowledge of synthetic chemists

  7. Search Modes
     • Structure-Based Searches
       • Full structure
         • Only for reactions with known molecules (not very useful)
       • Reaction substructure (RSS)
         • Most frequently used mode (difficult for end users to formulate an effective query)
       • Reaction similarity
         • Various methodologies using different parameters (results often vary greatly; good for browsing and idea generation)
       • Reaction classification
         • Several methodologies, mostly based on structural information about reaction centers and their immediate environment (good indexing tool, improvement over reaction similarity)
       • Reagents, Solvents
         • Full structure and substructure searches for molecules (not available in all databases; used mostly in conjunction with other structural searches)
     • Data-Based Searches
       • Keywords
         • Intellectually derived terms for name reactions, reaction types etc. (incomplete, not very useful)
       • Journal, author, title, yields, etc.
         • Text or numeric data searches (mostly used in conjunction with structural searches)

  8. Problems with Reaction Searching
     Synthetic problem: [reaction scheme not reproduced in transcript]
     • Full Structure Search: no hits*
     • Reaction Substructure Search (colored fragment): 119 hits*
     • Class Code Search (broad, reaction center only): 672 hits*
     • Keyword Search "Michael Addition": 2972 hits*
     *Results were obtained from Elsevier MDL's combined reaction databases (ca. 1 million reactions); 2006

  9. Problems with Substructure Searching
     Database size: ca. 1 million reactions
     Narrowly defined query: 0 hits
     Problems:
     • how to avoid excessively large hitlists
     • how to formulate "reasonable" search queries
     Solutions:
     • combination of several queries (expert approach)
     • indexing of reactions (focusing on relevant reactions)
     • facilitating query building (non-expert approach, intuitive)

  10. Goal for an Efficient Reaction Data Management System
     Create an environment that allows for combining the intelligence and creativity of synthetic chemists with the processing and simulating power of computers and the wealth of information in databases to meet the challenges in the laboratory for developing efficient syntheses.

  11. Requirements to Facilitate End-User Searching
     • User interfaces based on users' tasks and capabilities (e.g. CrossFire Web, DiscoveryGate, Reaction Browser, SciFinder) (see "A Framework for the Evaluation of Chemical Structure Databases", Cooke, F.; Schofield, H. J. Chem. Inf. Comput. Sci. 2001, 41, 1131-1140)
     • Hierarchical thesauri for keywords and reaction types
     • Effective indexing of databases (e.g. classification)
     • Simplification of the querying process (natural, not rule dependent)
     • Efficient post-search management tools (e.g. clustering)
     • Seamless integration of various information sources (web environment, point-and-click)
     Most importantly: available tools must simulate the chemist's problem-solving process

  12. Databases in DiscoveryGate

  13. Reaction Classification as Indexing Tool
     'Do We Still Need a Classification of Organic Reactions?'
     Reasons:
     • alternate method for indexing databases; complement to structure-based retrieval systems
     • access to "generic" types of information in retrieval systems
     • post-search management of large hitlists
     • simplification of query generation
     • linking of reaction information from different sources
     • source for deriving knowledge bases for reaction prediction and synthesis design
     • automatic procedures for analyses and correlations, e.g. quality control and overlap studies

  14. Reaction Classification as Indexing Tool: Examples of Some Recent Work
     • Horace: An Automatic System for the Hierarchical Classification of Chemical Reactions. Rose, J.R., Gasteiger, J. J. Chem. Inf. Comput. Sci. 1994, 34, 74
     • COGNOS: A Beilstein-Type System for Organizing Organic Reactions. Hendrickson, J.B., Sander, T. J. Chem. Inf. Comput. Sci. 1995, 35, 251
     • Knowledge Discovery in Reaction Databases: Landscaping Organic Reactions by a Self-Organizing Neural Network. Chen, L., Gasteiger, J. J. Am. Chem. Soc. 1997, 119, 4033
     • Classification of Organic Reactions: Similarity of Reactions Based on Changes in the Electronic Features of Oxygen Atoms at the Reaction Sites. Satoh, H., Sacher, O., Nakata, T., Chen, L., Gasteiger, J., Funatsu, K. J. Chem. Inf. Comput. Sci. 1998, 38, 210
     • Topology-Based Reaction Classification: An Important Tool for the Efficient Management of Reaction Information. Kraut, H., Löw, P., Matuszczyk, H., Saller, H., Grethe, G. Proceed. 5th Internat. Conf. Chem. Struct., Noordwijkerhout, The Netherlands, 1999, 26
     • Analysis of Reaction Information. Grethe, G. In "Handbook of Chemoinformatics", Gasteiger, J. (Ed.), Wiley-VCH, Weinheim, 2003, Volume 4, 1407-1427

  15. Reaction Indexing through Classification
     Based on:
     • Keywords: Michael addition, Michael reaction, ring closure, ...
     • Molecule Type: N-heterocycle, isoquinoline, quinolizidine, ...
     • Reaction Type: reaction centers

  16. Reaction Classification - Background: Rules and Definitions
     • Classify v. 2.5, developed by InfoChem, Munich
     • Based on InfoChem's reaction center perception algorithm
     • A bond is defined as a reaction center if it is made or broken
     • An atom is defined as a reaction center if
       • it changes its number of implicit hydrogens
       • it changes its number of valencies
       • it changes its number of π-electrons
       • it changes its atomic charge
       • the connecting bond is a reaction center
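
The rules on this slide can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not InfoChem's Classify implementation: the `Atom` record, its field names, the use of atom-map numbers as keys, and the toy substitution reaction are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical per-atom record; the field names are assumptions."""
    element: str
    implicit_h: int
    valence: int
    pi_electrons: int
    charge: int


def reaction_centers(reactant_atoms, product_atoms, reactant_bonds, product_bonds):
    """Return (center_atoms, center_bonds) for an atom-mapped reaction.

    Atoms are keyed by atom-map number; each bond is a frozenset of two numbers.
    """
    # A bond is a reaction center if it is made or broken.
    center_bonds = reactant_bonds ^ product_bonds

    center_atoms = set()
    for number, before in reactant_atoms.items():
        after = product_atoms.get(number)
        if after is None:
            continue  # atom not mapped into the product; skipped in this sketch
        changed = (
            before.implicit_h != after.implicit_h
            or before.valence != after.valence
            or before.pi_electrons != after.pi_electrons
            or before.charge != after.charge
        )
        if changed:
            center_atoms.add(number)

    # An atom is also a reaction center if a connecting bond is one.
    for bond in center_bonds:
        center_atoms.update(bond)
    return center_atoms, center_bonds


# Toy example (assumed data): CH3-CH2-Br + OH(-) -> CH3-CH2-OH + Br(-)
reactants = {
    1: Atom("C", 2, 4, 0, 0),    # CH2 carrying the leaving group
    2: Atom("Br", 0, 1, 0, 0),
    3: Atom("O", 1, 1, 0, -1),   # hydroxide
    4: Atom("C", 3, 4, 0, 0),    # spectator CH3
}
products = {
    1: Atom("C", 2, 4, 0, 0),
    2: Atom("Br", 0, 0, 0, -1),  # bromide
    3: Atom("O", 1, 2, 0, 0),
    4: Atom("C", 3, 4, 0, 0),
}
reactant_bonds = {frozenset({1, 2}), frozenset({1, 4})}
product_bonds = {frozenset({1, 3}), frozenset({1, 4})}

center_atoms, center_bonds = reaction_centers(
    reactants, products, reactant_bonds, product_bonds
)
print(sorted(center_atoms))
```

In this toy case the spectator methyl carbon is not flagged, while the carbon that exchanges bromine for oxygen is flagged only through its made and broken bonds.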

  17. Reaction Classification - Background: Rules and Definitions
     • Hashcodes are calculated for all reaction centers, taking into account atom properties:
       • atom type
       • valence state
       • total number of bonded hydrogens (implicit plus explicitly drawn)
       • number of π-electrons
       • aromaticity
       • formal charges
       • reaction center information
     • The sum of all reaction center hashcodes of all reactants and one product of a reaction provides the unique reaction classification code: 'ClassCode'
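
As a rough illustration of summing per-center hashcodes into one code, the sketch below hashes a tuple of the listed atom properties and adds the results. The actual hash function used by Classify is not given on the slide, so SHA-256, the 64-bit truncation, the `CenterDescriptor` fields, and the example descriptors are all assumptions; unlike the real ClassCode, this toy sum makes no uniqueness guarantee.

```python
import hashlib
from typing import NamedTuple

MASK_64 = (1 << 64) - 1  # keep the summed code within 64 bits (assumption)


class CenterDescriptor(NamedTuple):
    """Hypothetical descriptor holding the properties listed on the slide."""
    atom_type: str
    valence_state: int
    total_h: int
    pi_electrons: int
    aromatic: bool
    formal_charge: int
    center_info: str  # e.g. "bond made" or "bond broken" (assumed encoding)


def center_hashcode(descriptor):
    """Deterministic 64-bit hash of one reaction center descriptor."""
    text = "|".join(str(field) for field in descriptor)
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def class_code(centers):
    """Sum the center hashcodes; a sum does not depend on atom order."""
    return sum(center_hashcode(c) for c in centers) & MASK_64


carbon = CenterDescriptor("C", 4, 2, 0, False, 0, "bond made and broken")
bromine = CenterDescriptor("Br", 1, 0, 0, False, 0, "bond broken")
oxygen = CenterDescriptor("O", 1, 1, 0, False, -1, "bond made")

code = class_code([carbon, bromine, oxygen])
same_code = class_code([oxygen, carbon, bromine])
print(code == same_code)
```

Because addition is commutative, the code is the same however the reaction centers are enumerated, which is one plausible reason for using a sum.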

  18. Reaction Classification - Background: Rules and Definitions
     • Inclusion of atoms in the immediate environment (spheres)
       • reaction centers only (0-sphere = BROAD)
       • reaction centers + α-atoms (1-sphere = MEDIUM)
       • reaction centers + α- and β-atoms (2-sphere = NARROW)
       • inclusion of one sp3-atoms during sphere expansion
     • Atom equivalency
       • atoms in the same group of the periodic table, with the exception of row-2 elements, are considered equivalent
     • Multiple occurrences of identical transformations are handled as one
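
The sphere levels and the atom equivalency rule can be pictured with a small graph sketch. Everything here is an assumption made for illustration: the adjacency-dictionary molecule, the partial `GROUPS` table, and the function names. The sp3 rule for sphere expansion is not modeled.

```python
SPHERES = {"BROAD": 0, "MEDIUM": 1, "NARROW": 2}

# Partial, hand-written periodic-table data (assumption, not a full table).
GROUPS = {
    "C": 14, "Si": 14, "Ge": 14,
    "N": 15, "P": 15, "As": 15,
    "O": 16, "S": 16, "Se": 16,
    "F": 17, "Cl": 17, "Br": 17, "I": 17,
}
ROW_2 = {"Li", "Be", "B", "C", "N", "O", "F", "Ne"}


def environment(adjacency, centers, level):
    """Atoms within SPHERES[level] bonds of any reaction center."""
    selected = set(centers)
    frontier = set(centers)
    for _ in range(SPHERES[level]):
        neighbors = {n for atom in frontier for n in adjacency.get(atom, ())}
        frontier = neighbors - selected
        selected |= frontier
    return selected


def equivalence_class(element):
    """Row-2 elements stay distinct; heavier group members are merged."""
    if element in ROW_2 or element not in GROUPS:
        return element
    return "group-%d" % GROUPS[element]


# Linear chain 1-2-3-4-5 with atom 1 as the only reaction center.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
for level in SPHERES:
    print(level, sorted(environment(chain, {1}, level)))
print(equivalence_class("Cl") == equivalence_class("Br"))
```

A wider sphere yields a more specific code, so fewer reactions share it; this is why BROAD searches return more hits than NARROW ones.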

  19. Reaction Classification - Background: Rules and Definitions

  20. Reaction Classification - Clustering of Search Results
     • Classification codes are data
       • stored in the database
       • usable for sorting (clustering)
     RSS search query: (in red) [structure not reproduced in transcript]
     Result: 156 hits
     Clustered by classification code "MEDIUM": 72 clusters
     • Cluster 1 (20 rxns)
     • Cluster 2 (15 rxns)
     • Cluster 3 (13 rxns)
     • Cluster 4 (8 rxns)
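
Since the classification code is stored with each reaction, clustering a hitlist amounts to grouping by that value. A minimal sketch follows, with made-up reaction IDs and code strings standing in for real database fields:

```python
from collections import defaultdict


def cluster_by_class_code(hits):
    """Group (reaction_id, class_code) pairs; largest clusters first."""
    clusters = defaultdict(list)
    for reaction_id, code in hits:
        clusters[code].append(reaction_id)
    # Sort by descending cluster size, then by code for a stable order.
    return sorted(clusters.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical hitlist from a reaction substructure search.
hits = [
    ("rxn-001", "M-17"),
    ("rxn-002", "M-42"),
    ("rxn-003", "M-17"),
    ("rxn-004", "M-17"),
    ("rxn-005", "M-42"),
    ("rxn-006", "M-08"),
]

clusters = cluster_by_class_code(hits)
for rank, (code, members) in enumerate(clusters, start=1):
    print("Cluster %d (%d rxns): %s" % (rank, len(members), code))
```

Browsing one representative per cluster is then much quicker than paging through the full hitlist.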

  21. Classification by Reaction Names
     • Chemists are familiar with Name Reactions (Diels-Alder, Michael etc.)
       • Papers in one issue of JOC (issue 22, 2004) mentioned 20 name reactions, known and lesser known, some multiple times
         • e.g., Mitsunobu reaction, Nazarov reaction, Wolff rearrangement etc.
       • Several books deal exclusively with Name Reactions* (ca. 700 reactions)
     • Use of Name Reactions facilitates reaction retrieval
       • Complementary to other searches
       • Used in combination with other data
       • Easier alternative to formulating complex RSS queries
     • Excellent browsing tool
       • Overview of scope and limitations of a given reaction, e.g. Aldol reaction
       • Combining different reaction types leading to the same compound class
         • Hantzsch pyridine synthesis from dihydropyridines or β-keto esters
         • Fischer indole synthesis from hydrazines or hydrazones
         • Darzens reaction of epoxides from esters, amides, sulfones, or nitriles
     *References
     • Named Organic Reactions, Laue, T. and Plagens, A., Eds., John Wiley & Sons, 1st Edition 1999, 2nd Edition 2005
     • Organic Syntheses Based on Name Reactions, Hassner, A. and Stumer, C., Eds., Elsevier Science, 1st Edition 1994, 2nd Edition 2002
     • Name Reactions, Li, J. J., Ed., Springer, 2002
     • Strategic Applications of Named Reactions, Kürti, L. and Czakó, B., Eds., Elsevier, 2005
     • Name Reactions and Reagents in Organic Synthesis, Mundy, B.P.; Ellerd, M.G. and Favaloro, F.G., Jr., Wiley Interscience, 2005
     Note: The work on classification by reaction names is being developed at InfoChem (Munich) in consultation with G. Grethe

  22. Classification by Reaction Names - Requirements
     • Established electronically, not intellectually
       • NOW: intellectually derived
         • Inclusion of intellectually derived keywords varies greatly from database to database and depends on the abstractors; the keywords are either too inclusive or not comprehensive
         • Example: "Michael addition": 184 hits (keywords) vs. 89 hits (RSS search); 52 hits (reaction name keywords)
       • FUTURE: electronically derived
         • Assignments based on single or multiple RSS searches
         • Boolean logic is applied to combine and/or subtract search results (queries)
         • Assignments are pre-processed and added as data to database(s)
     • Name reactions are aligned in hierarchical order
       • Based on main reaction categories (addition, substitution, rearrangements, eliminations, oxidations, reductions)
       • Reactions can be listed in multiple categories, e.g.:
         • Baeyer-Villiger oxidation in Oxidation and Rearrangement
       • Hierarchy must be able to accommodate non-name reactions (future project)
     • Reactions containing n reactions (e.g., tandem reactions) are listed in n categories
       • Individual name reactions have to be recognizable
       • Otherwise, stored under "Miscellaneous"
     • Queries and corresponding names are stored in a spreadsheet
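
The multi-category listing and the "Miscellaneous" fallback described above could be stored along these lines. The mapping is a small hand-made sample, and the function names and reaction IDs are hypothetical; only the Baeyer-Villiger placement in two categories is taken from the slide.

```python
from collections import defaultdict

# Sample mapping of name reactions to main categories (assumed, partial).
NAME_TO_CATEGORIES = {
    "Michael reaction": ["Addition"],
    "Hofmann rearrangement": ["Rearrangements"],
    "Baeyer-Villiger oxidation": ["Oxidations", "Rearrangements"],
}


def categories_for(recognized_names):
    """Categories for one database reaction, given its recognized name reactions."""
    categories = []
    for name in recognized_names:
        for category in NAME_TO_CATEGORIES.get(name, ["Miscellaneous"]):
            if category not in categories:
                categories.append(category)
    # Nothing recognizable at all: file the reaction under "Miscellaneous".
    return categories or ["Miscellaneous"]


def build_index(assignments):
    """Invert {reaction_id: [name reactions]} into {category: [reaction_ids]}."""
    index = defaultdict(list)
    for reaction_id, names in assignments.items():
        for category in categories_for(names):
            index[category].append(reaction_id)
    return dict(index)


assignments = {
    "rxn-101": ["Baeyer-Villiger oxidation"],
    "rxn-102": ["Michael reaction", "Hofmann rearrangement"],  # made-up tandem case
    "rxn-103": [],                                             # nothing recognized
}
index = build_index(assignments)
print(index)
```

Pre-computing such an index once is what lets the assignments be "added as data" to the database rather than derived at query time.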

  23. Classification by Reaction Names - Hierarchy
     Columns: Main categories | First Level | Second Level | Third Level
     Entries: 1,2-Addition Darzens condensation Sulfones Addition 1,4-Addition Michael reaction Intermolecular Cycloaddition 4+2 Cycloadditions Diels-Alder reaction Aromatic electrophilic Friedel-Crafts acylation Intramolecular Substitution Aliphatic Nucleophilic Schotten-Baumann reaction Free radical Gomberg-Bachmann reaction Intermolecular Nucleophilic Hofmann rearrangement Alkyl Rearrangements Sigmatropic [3,3] Sigmatropic rearrangement Claisen rearrangement Radical Cope reaction Elimination Chugaev reaction Reductions Cannizzaro reaction Intermolecular Oxidations Baeyer-Villiger oxidation Lactones Heterocyclic Synthesis Hantzsch pyridine synthesis Modified Miscellaneous Alper reaction Cyclocarbonylation

  24. Classification by Reaction Names - Keyword Generation
     Example: Intermolecular Mannich reaction with CH-acidic compounds
     Procedure:
     • generate query for general search
     • check hitlist for non-relevant hits
     • formulate queries to eliminate negatives
     • combine queries using Boolean operators
     Queries:
     • Mannich reaction: Query Q1
     • Elimination of negative hits:
       • Biginelli reaction: Query Q2
       • Aza Diels-Alder reaction: Query Q3
     Query set for intermolecular Mannich reaction with CH-acidic compounds: Q1 - (Q2 + Q3)
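
Treating each query result as a set of reaction IDs, the combination Q1 - (Q2 + Q3) is a set difference over a union. The sketch below uses invented IDs; in practice each set would come from running the corresponding RSS query against the database.

```python
def combine(q1, q2, q3):
    """Hits of the general query minus hits of either exclusion query."""
    return q1 - (q2 | q3)


# Hypothetical hitlists, as sets of reaction IDs.
q1_mannich = {"rxn-1", "rxn-2", "rxn-3", "rxn-4", "rxn-5"}
q2_biginelli = {"rxn-2", "rxn-9"}
q3_aza_diels_alder = {"rxn-4"}

mannich_keyword_hits = combine(q1_mannich, q2_biginelli, q3_aza_diels_alder)
print(sorted(mannich_keyword_hits))
```

IDs that appear only in an exclusion query (here "rxn-9") have no effect, so the exclusion queries can safely be broader than the general query.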

  25. Classification by Reaction Names
     Example of query menu (partial view) from InfoChem's SpresiWeb

  26. The Paradigm in an Ideal Electronic World
     "The design of organic syntheses by chemists without the help of computers proceeds in anything but a systematic stepwise manner from the target molecule to available starting materials. A systematic stepwise approach is more the exception than the rule."
     "The human mind solves problems by lateral thinking, jumping from one idea to the next, from one question to a different one, from retrosynthetic thinking to considering the course and outcome of a reaction, etc."
     Gasteiger, J.; Ihlenfeldt, W.D.; Roese, P. Recl. Trav. Chim. Pays-Bas 1992, 111, 270.
     [Diagram labels: Journals; Major Reference Works; Books; Databases; E-Labjournal + Knowledge, Intuition, and Experience of Synthetic Chemist]

  27. Integrated Major Reference Works (iMRW)
     [Diagram, present status. Labels: Reaction Databases (DiscoveryGate; Elsevier MDL, Third Party, Proprietary etc.); Major Reference Works (MRWs); Primary Journals; Tertiary Sources. Link types: ClassCodes; LinkFinderPlus (citations); iMRW links; future links.]

  28. Integrated Major Reference Works - Concept
     • Simulating chemists' approach of gathering information from various sources (lateral approach) for solving synthetic problems through a simple point-and-click mechanism
     • Assisting chemists with the synthesis of new compounds by providing complementary information
       • With examples for synthetic methodologies from reaction databases
       • From summaries, critically evaluated by experts, describing
         • reaction mechanisms
         • principles of stereo-controlled reactions
         • applications, preparations, and properties of reagents
         • and other information generally not found in reaction databases
       • Through one-click linking to the primary literature when combined with LinkFinderPlus

  29. Integrated Major Reference Works - Summary
     iMRW...
     • is a unique collaboration between Elsevier MDL, InfoChem and leading scientific publishers (Elsevier Science, Georg Thieme Verlag, and Springer-Verlag)
     • provides one-click, bi-directional linking based on reaction type between synthetic methodology databases and electronic versions of major reference works (MRWs) or between individual MRWs, i.e. a true integration of information
     • allows text and (sub)structure searching over multiple major reference works from a single user interface

  30. Major Reference Works in iMRW
     • Detailed information about methodologies based on reaction type
     • Information about scope and limitations of reactions
     • Evaluated experimental procedures
     • Information about reaction mechanism, stereo-control, effect of substituents and ligands, and other factors influencing a reaction
     • Information about reagents and catalysts, their preparation and properties
     • Updates for each of them are planned or under consideration by the publishers and will be added when available

  31. Comprehensive Asymmetric Catalysis (CAC) - Summary
     Editors: Eric N. Jacobsen, Andreas Pfaltz, Hisashi Yamamoto (1999)
     CAC is an innovative reference work that reviews in three volumes catalytic methods for asymmetric organic synthesis, a major challenge in synthetic chemistry today. Illustrated by over 6,000 reactions critically evaluated by 60 leading experts in the field, the basic principles, mechanisms, basis for stereoinduction, and scope and limitations of asymmetric reactions are covered in depth.

  32. Comprehensive Organic Functional Group Transformations (COFGT) - Summary
     Editors-in-Chief: Alan R. Katritzky, Otto Meth-Cohn, Charles W. Rees (1995)
     COFGT covers in 40,000 reactions and seven volumes the vast subject of organic synthesis in terms of the introduction and interconversion of functional groups. The editors have adopted a rather rigorous, logical and formal treatment on the basis of structure, which enables a detailed analysis of all known, and indeed of some as yet unknown, functional groups. Therefore, the treatise deals rationally and comprehensively with the method of their construction.

  33. Science of Synthesis - Summary
     Houben-Weyl Methods of Molecular Transformations
     Editorial Board: D. Bellus, S. V. Ley, R. Noyori, M. Regitz, P. J. Reider, E. Schaumann, I. Shinkai, E. J. Thomas, B. M. Trost (2001)
     Science of Synthesis is the authoritative and comprehensive reference work for the entire field of organic and organometallic synthesis. The series of 48 volumes will be published over a period of 8 years; it will present 15,000 selected synthetic methods for all classes of compounds, illustrated by 150,000 reactions, and it includes
     • Methods critically evaluated by leading scientists
     • Background information and detailed experimental procedures
     • Schemes and tables which illustrate the reaction scope

  34. Collecting Information for the Synthesis of a New Compound
     Target molecule: [structure not reproduced in transcript]
     Muray, E.; Rifé, J.; Branchadell, V.; Ortuño, R.M. J. Org. Chem. 2002, 67, 4520-4525
     (The paper describes the syntheses of cyclopropyl nucleosides as potential antiviral and antitumor agents)

  35. Synthesis Plan
     Retrosynthetic Analysis: N1-alkylation of adenine
     • Step 1: general information about the alkylation reaction
     • Step 2: information about the preparation of A, including stereochemistry
     • Step 3: information about scope and limitations, effect of substituents, applicable reagents etc.

  36. Reaction Substructure + Data Search in DiscoveryGate

  37. Use of Chemical Information in Organic Synthesis

  38. Use of Chemical Information in Organic Synthesis

  39. Search for Similar Reactions in iMRW

  40. Literature Linking: COFGT chapter

  41. Text Search in iMRW

  42. Information about Enantioselective Cyclopropanation from CAC

  43. Text Search Results from COFGT and Linking to Literature

  44. Integration of iMRW with Reaction Database

  45. Conclusion
     • DiscoveryGate provides chemists, in a single interactive system, with the relevant information from different sources required for solving synthetic problems
     • Access is provided from an intuitive user interface by a simple point-and-click mechanism
     • The system very closely simulates the lateral information gathering process of synthetic chemists
