
Stephanie Harris, Crystal Grid Workshop, Southampton, 17th September 2004



Presentation Transcript


  1. Stephanie Harris, Crystal Grid Workshop, Southampton, 17th September 2004: Development of Molecular Geometry Knowledge Bases from the Cambridge Structural Database

  2. Cambridge Structural Database • Stored geometric information for ~300,000 structures • Search using Conquest • Substructure search, user input required • Molecular Geometry Knowledge Bases • Library of chemically well-defined geometric information • Limited user input • Rapid retrieval of statistical data

  3. Molecular Geometry Knowledge Base: • Mogul • Bond lengths, valence angles and torsion angles • Compiled from the CSD • Applications • Model building • Refinement restraints • Structure validation • Comparative values • Published bond length tables: • Organic and metal containing structures • Published late 1980s • Compiled from CSD of ~50,000 structures • Cannot be accessed by computer programs

  4. Mogul 1.0 • Whole molecule input • Graphical (cif, SHELX, mol2 files) or command-line interface • Integration with client applications, e.g. Crystals • Quick, automatic retrieval of statistical data, histogram distributions, CSD structures • Search Algorithm • All non-metal fragments in the CSD coded • Set of keys code chemical environments • Fragments with identical keys are chemically identical • Use hierarchical search tree • Generalised searching if insufficient hits
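The key-based lookup described on this slide can be sketched as follows. This is a minimal illustration and not Mogul's actual implementation: the key tuples, the `build_index` and `lookup` helpers, and the bond-length values are all hypothetical stand-ins for the coded chemical environments.

```python
from collections import defaultdict

# Hypothetical fragment records: (key, bond length in Å).
# Each key tuple stands in for the set of keys coding a chemical environment.
# The values are illustrative only and are not taken from the CSD.
FRAGMENTS = [
    (("C", "S", "single", "acyclic"), 1.81),
    (("C", "S", "single", "acyclic"), 1.82),
    (("C", "S", "single", "cyclic"), 1.75),
]

def build_index(fragments):
    """Group observed values by their full key."""
    index = defaultdict(list)
    for key, value in fragments:
        index[key].append(value)
    return dict(index)

def lookup(index, key):
    """Return all values whose key is identical to the query key."""
    return index.get(key, [])

index = build_index(FRAGMENTS)
hits = lookup(index, ("C", "S", "single", "acyclic"))
```

Because fragments with identical keys are treated as chemically identical, retrieval reduces to a dictionary lookup, which is what makes pre-computed statistics quick to return.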

  5. Mogul Search [Figure: search screenshot; atom labels S1 and C7 visible]

  6. Metal-Ligand Bond Lengths: Co-O bond length? • To be considered: • Ligand type: Carboxylate • Metal Oxidation State: Co(II) • Metal coordination number: 6 • Ligand trans: Oxygen ligand • Spin State?

  7. Method • Analysis of M-L bond lengths. • For a range of metal and ligand types identify factors which influence M-L bond lengths and evaluate their importance. • For a defined Metal-Ligand group sub-divide bond length distribution to produce ‘chemically meaningful’ datasets: • Unimodal distributions. • ‘Reasonably small’ sample standard deviations. • From hand-crafted examples develop an algorithm to produce a molecular geometry knowledge base for metal complexes.

  8. Data Tree [Figure: tree descending from the Metal-Ligand Group through successive levels of bins: A1, A2; B1, B2, B3, B4; C1, C2] • Sharpened distributions • Smaller sample standard deviations
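The effect of subdividing a parent dataset into bins can be illustrated with a small sketch. The bin labels follow the slide, but the bond lengths are invented for illustration (not CSD data), and `summarise` is a hypothetical helper.

```python
from statistics import mean, stdev

# Illustrative bond lengths (Å), not CSD data: (bin label, value).
observations = [
    ("A1", 1.90), ("A1", 1.91), ("A1", 1.89),
    ("A2", 2.07), ("A2", 2.08), ("A2", 2.06),
]

def summarise(values):
    """Return (mean, sample standard deviation) for a list of values."""
    return mean(values), stdev(values)

# Statistics for the undivided parent group.
all_values = [value for _, value in observations]
parent_mean, parent_ssd = summarise(all_values)

# Statistics for each bin after division.
bins = {}
for label, value in observations:
    bins.setdefault(label, []).append(value)
child_stats = {label: summarise(values) for label, values in bins.items()}
```

In this toy dataset the parent distribution is bimodal, so each child bin has a much smaller sample standard deviation than the parent, which is the sharpening the slide refers to.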

  9. Criteria Influencing M-L Bond Lengths • Ligand, L • Coordination mode of ligand • Effective Metal Coordination Number • Metal Oxidation State • Metal clusters and cages • Spin state • Jahn-Teller effect • Metal coordination geometry • Ligand trans to L

  10. Ligand Template Library • Ligand: • Non-metal atom or fragment bonded to a metal. • Two ligands are the same if they have the same connectivity (topology) and stereochemistry. • Method: • All ligands in the CSD to be classified. • Classify according to the contact atom coordinated to the metal. • Ligands with multiple contact atoms can be present in more than one ligand group, e.g. SCN-
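Classification by contact atom might look like the sketch below. The ligand table and the `classify_by_contact_atom` helper are hypothetical; a real template library would match connectivity and stereochemistry rather than names.

```python
from collections import defaultdict

# Hypothetical ligand records: name -> atoms that can coordinate to a metal.
LIGANDS = {
    "thiocyanate": ["S", "N"],  # ambidentate: may bind through S or N
    "chloride": ["Cl"],
    "carboxylate": ["O"],
    "pyridine": ["N"],
}

def classify_by_contact_atom(ligands):
    """Place each ligand in one group per possible contact atom."""
    groups = defaultdict(set)
    for name, contact_atoms in ligands.items():
        for atom in contact_atoms:
            groups[atom].add(name)
    return dict(groups)

groups = classify_by_contact_atom(LIGANDS)
```

Here thiocyanate appears in both the S and the N groups, mirroring the SCN- example on the slide.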

  11. Cambridge Structural Database • Approximately 22,000 formulae • Approximately 780,000 ligands • Ligand Template Hierarchy • Exact ligand templates (724) • R-substituted templates (H’s replaced with ‘innocent’ R groups) • Generic templates (ALL ligands classified)

  12. Cobalt Carboxylate Bond Lengths [Figure: histogram of number of fragments against Co-O distance (Å)] • Co-O: 1.929(62) Å • 619 fragments
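Values such as 1.929(62) Å can be read programmatically. The sketch below assumes the usual crystallographic convention that the digits in parentheses give the spread in units of the last quoted decimal places of the mean; in this talk that spread is presumably the sample standard deviation, which is an assumption rather than something the slide states. `parse_value` is a hypothetical helper.

```python
import re

def parse_value(text):
    """Split 'mean(spread)' into (mean, spread), both in the same units."""
    match = re.fullmatch(r"\s*(\d+)\.(\d+)\((\d+)\)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised value: {text!r}")
    whole, decimals, spread_digits = match.groups()
    mean = float(f"{whole}.{decimals}")
    # The bracketed digits are scaled by the number of decimal places quoted.
    spread = int(spread_digits) / 10 ** len(decimals)
    return mean, spread

co_o_mean, co_o_spread = parse_value("1.929(62)")
```

Under that reading, 1.929(62) Å corresponds to a mean of 1.929 Å with a spread of 0.062 Å.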

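  13. [Figure: cobalt carboxylate Co-O bond lengths subdivided, with panels labelled Co(II) and Co(III). Values shown, in transcript order: 2.049(58) Å, 1.904(20) Å, 1.929(62) Å, 2.073(42) Å, 1.904(20) Å, 1.910(15) Å, 2.074(32) Å, 1.895(17) Å]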

  14. Chlorides Fe-Cl 2.242(68) Å 2.189(24) Å • Pyridines e.g. Fe (spin state) Fe(II)L5py High Spin 2.166(84) Å 2.225(29) Å • Copper complexes (Jahn-Teller effect) Standardisation of Cu connectivity Cu(II)-OH2 2.232(225) Å • Tertiary phosphines, Carbon-ligands

  15. Metal-Ligand Knowledge Base • 1. CSD data adjustment: standardisation of metal connections; assignment of metal as part of a metal cluster; assignment of metal oxidation state • 2. Classification of ligands by ligand template library • 3. Perform algorithm on all possible M-L fragments to produce knowledge base

  16. Metal-Ligand Group Algorithm: From ligand template library: Generic or more specific e.g. Carboxylates:

  17. Metal-Ligand Group → ‘Metal Clusters’ → Division on oxidation state → Division on metal effective coordination number → Division on spin and Jahn-Teller effect • Only for particular metals, oxidation states and coordination numbers. • Not found for all ligand types. • Not searchable in CSD. • Flag users; effects evident by bimodal histogram, high SSD, outliers.

  18. Metal-Ligand Group → ‘Metal Clusters’ → Division on oxidation state → Division on metal effective coordination number → Division on spin and Jahn-Teller effect → Division on metal coordination geometry • E.g. 4-coordinate geometry: tetrahedral, square planar, disphenoidal

  19. Metal-Ligand Group → ‘Metal Clusters’ → Division on oxidation state → Division on metal effective coordination number → Division on spin and Jahn-Teller effect → Division on metal coordination geometry → Divide on trans ligand to L → Final ligand division • More specific ligand, e.g. alkyl carboxylate
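Dividing a metal-ligand group on an ordered list of criteria can be sketched as a recursive grouping. This is an illustration under stated assumptions and not the algorithm used in the talk: the record fields, the two criteria chosen, the `build_tree` helper, and the bond lengths are all hypothetical.

```python
# Hypothetical fragment records with illustrative values (not CSD data).
FRAGMENTS = [
    {"oxidation_state": 2, "coordination_number": 6, "length": 2.07},
    {"oxidation_state": 2, "coordination_number": 6, "length": 2.08},
    {"oxidation_state": 2, "coordination_number": 4, "length": 1.98},
    {"oxidation_state": 3, "coordination_number": 6, "length": 1.90},
]

# Order matters: earlier criteria sit higher in the tree.
CRITERIA = ["oxidation_state", "coordination_number"]

def build_tree(fragments, criteria):
    """Recursively divide fragments on each criterion in order."""
    node = {"lengths": [f["length"] for f in fragments], "children": {}}
    if criteria:
        first, rest = criteria[0], criteria[1:]
        groups = {}
        for fragment in fragments:
            groups.setdefault(fragment[first], []).append(fragment)
        node["children"] = {
            value: build_tree(members, rest)
            for value, members in groups.items()
        }
    return node

tree = build_tree(FRAGMENTS, CRITERIA)
```

Every node keeps its own list of lengths, so statistics remain available at each level of specificity and a different criterion order simply yields a different tree.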

  20. Generalised Searching • No hits or insufficient number of hits. • Allows the retrieval of data on related fragments. • Hierarchical search tree structure • Move up to a higher, less specific level of data tree. • Order of algorithm important. • Should order of criteria be changed? • Should order depend on M-L group? E.g. Should oxidation state always be the first main division?
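Generalised searching, moving up the tree when a query returns too few hits, might be sketched like this. The knowledge-base layout, the path labels, the `generalised_search` helper, and the `min_hits` threshold are assumptions made for illustration; the values are not CSD data.

```python
# Hypothetical knowledge base: keys are paths of criteria values,
# from least specific (short paths) to most specific (long paths).
KNOWLEDGE_BASE = {
    ("Co-O carboxylate",): [1.90, 1.91, 2.05, 2.07, 2.08],
    ("Co-O carboxylate", "Co(II)"): [2.05, 2.07, 2.08],
    ("Co-O carboxylate", "Co(II)", "CN 6"): [2.07, 2.08],
}

def generalised_search(kb, path, min_hits):
    """Return (level used, hits), backing off to shorter paths if needed."""
    path = tuple(path)
    while path:
        hits = kb.get(path, [])
        if len(hits) >= min_hits:
            return path, hits
        path = path[:-1]  # move up to a higher, less specific level
    return (), []

level, hits = generalised_search(
    KNOWLEDGE_BASE, ("Co-O carboxylate", "Co(II)", "CN 6"), min_hits=3
)
```

In this example the most specific level holds only two values, so the search falls back one level to the Co(II) dataset. Which criterion is dropped first depends entirely on the order of the path, which is why the slide asks whether that order should vary by M-L group.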

  21. Conclusions • Pre-processing of structural data from the CSD to construct molecular geometry knowledge bases. • Knowledge bases to contain chemically well-defined datasets. • Limited user input required. • Quick, automatic retrieval of statistical data, distributions. • Efficient analysis of large number of chemical fragments. • Outliers, high SSD? • Further Analysis – Computational Chemistry. • Further development to include extra chemical information e.g. computational data.

  22. Acknowledgements • Bristol University: Guy Orpen, Natalie Fey, X-Ray Crystallography Group • Cambridge Crystallographic Data Centre: Robin Taylor, Frank Allen, Ian Bruno, Greg Shields
