Bringing Structure to Biology: Small Molecules and the PDBe
PDBe overview • PDBe is a core molecular database at EMBL-EBI • PDBe is a founding partner of the Worldwide Protein Data Bank (wwPDB) • Founder of the Electron Microscopy Data Bank (EMDB) • Mission: Bringing Structure to Biology • Major activities: • Deposition and annotation site for structural data on biomacromolecules (X-ray, NMR, EM) • Integrated resource of high-quality macromolecular structural data and related information • Tools and services for accessing, exploiting and disseminating structural data to the wider biomedical community
PDB Depositions 10,000th PDBe annotated structure - April 2011 (2yf6) www.pdbe.org/2yf6
Chemical Component Dictionary • Compounds in the PDB • Small molecules bound to macromolecules • Individual components of macromolecules • wwPDB maintains dictionary descriptions for all unique chemical components • Name, synonyms, formula, SMILES, … • Atoms and bonds • Ideal and representative coordinates • Each new component assigned a unique 3-letter identifier • Release coincides with the release of the parent PDB entry
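The fields listed on this slide can be pictured as one record per component. The following is only a minimal sketch: the class name, field names and validation rules are assumptions made for illustration, not the wwPDB dictionary format, and the three-character identifier check simply mirrors the "3-letter identifier" bullet.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChemicalComponent:
    """Hypothetical in-memory view of one dictionary entry."""
    comp_id: str                       # unique identifier, e.g. "HOH"
    name: str
    formula: str
    smiles: str
    synonyms: List[str] = field(default_factory=list)
    atoms: List[str] = field(default_factory=list)                  # atom names
    bonds: List[Tuple[str, str, int]] = field(default_factory=list)  # (atom, atom, order)

    def __post_init__(self):
        # Assumption: identifiers are 1 to 3 upper-case letters or digits.
        if not re.fullmatch(r"[A-Z0-9]{1,3}", self.comp_id):
            raise ValueError(f"bad component id: {self.comp_id!r}")
        # Every bond must refer to atoms declared in this component.
        known = set(self.atoms)
        for atom_a, atom_b, _order in self.bonds:
            if atom_a not in known or atom_b not in known:
                raise ValueError(f"bond refers to unknown atom: {atom_a}-{atom_b}")


# Water, written out by hand for illustration.
water = ChemicalComponent(
    comp_id="HOH",
    name="WATER",
    formula="H2 O",
    smiles="O",
    atoms=["O", "H1", "H2"],
    bonds=[("O", "H1", 1), ("O", "H2", 1)],
)
```

Checking the bonds against the atom list at construction time is a design choice of this sketch; it catches inconsistent records early rather than at search time.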
Molecule search options • Compound name • Ligand 3-letter code • SMILES • Formula (exact or range), e.g. C6-10 N4 O2 S0 • Chemical substructure www.pdbe.org/chem
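To make the formula range option concrete, here is a small sketch of how a query such as C6-10 N4 O2 S0 could be matched against a formula. It is not the PDBeChem implementation; the function names are hypothetical, and it assumes that elements absent from the query (hydrogen in the example) are left unconstrained.

```python
import re


def parse_formula(formula):
    """Turn 'C8 H10 N4 O2' into {'C': 8, 'H': 10, 'N': 4, 'O': 2}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def parse_query(query):
    """Turn 'C6-10 N4 O2 S0' into {'C': (6, 10), 'N': (4, 4), 'O': (2, 2), 'S': (0, 0)}."""
    ranges = {}
    for token in query.split():
        match = re.fullmatch(r"([A-Z][a-z]?)(\d+)(?:-(\d+))?", token)
        if match is None:
            raise ValueError(f"cannot parse query token: {token!r}")
        element, low, high = match.group(1), int(match.group(2)), match.group(3)
        ranges[element] = (low, int(high) if high else low)
    return ranges


def formula_matches(formula, query):
    """True if every element named in the query falls inside its range."""
    counts = parse_formula(formula)
    return all(
        low <= counts.get(element, 0) <= high
        for element, (low, high) in parse_query(query).items()
    )


# Caffeine (C8 H10 N4 O2) satisfies the example query from the slide.
print(formula_matches("C8 H10 N4 O2", "C6-10 N4 O2 S0"))  # True
# Adenine (C5 H5 N5) has too few carbons and no oxygen.
print(formula_matches("C5 H5 N5", "C6-10 N4 O2 S0"))      # False
```

Writing S0 in the query is what excludes sulfur-containing compounds; leaving S out would accept any sulfur count under the assumption above.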
Ligands and the PDBe Open chemistry sketchpad
2D Ligand Interaction Diagrams www.pdbe.org/leview • Interaction diagrams for any given PDB entry • Interactive control of distance criteria • Diagram customisation • Image export (png, jpg, eps…) • Example: S-benzyl-glutathione (GSB), Human Glyoxalase inhibitor (1guh)
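The "distance criteria" behind an interaction diagram come down to a cutoff on atom-to-atom distances. The sketch below shows that idea only; it is not how the viewer is implemented, and the atom labels, coordinates and the 4.0 Å default are invented for illustration.

```python
import math


def contacts(ligand_atoms, protein_atoms, cutoff=4.0):
    """Return (ligand atom, protein atom, distance) for pairs within the cutoff.

    Each atom is a (label, (x, y, z)) tuple with coordinates in angstroms.
    """
    pairs = []
    for lig_label, lig_xyz in ligand_atoms:
        for prot_label, prot_xyz in protein_atoms:
            distance = math.dist(lig_xyz, prot_xyz)
            if distance <= cutoff:
                pairs.append((lig_label, prot_label, round(distance, 2)))
    return pairs


# Hypothetical coordinates, not taken from any PDB entry.
ligand = [("S1", (0.0, 0.0, 0.0))]
protein = [("TYR9:OH", (3.0, 0.0, 0.0)), ("ARG15:NE", (0.0, 6.0, 0.0))]

tight = contacts(ligand, protein, cutoff=4.0)   # only the 3.0 Å pair
loose = contacts(ligand, protein, cutoff=6.5)   # both pairs
```

Raising or lowering `cutoff` is the programmatic counterpart of adjusting the distance control interactively.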
PDBeXpress: rapid access to protein-ligand interaction statistics • Understand and assess binding site interactions • Provide chemists with quick answers to common questions without the need to construct complex search queries • What residues interact? • Which enzymes interact? • What binds here? • www.pdbe.org/express
What residues interact? RTL - Retinol • PDB three-letter ligand code • Ligand name
Which enzymes interact? MAN – Mannose • PDB three-letter ligand code • Ligand name
What binds here? • Search for ligands that interact with a given set of residues • Can specify a partial or exact binding environment
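The difference between a partial and an exact binding environment can be sketched with multisets of residue types. This is an illustration under stated assumptions, not the PDBeXpress query engine: the ligand identifiers and residue lists are made up, and "partial" is taken to mean that the environment contains at least the residues asked for.

```python
from collections import Counter


def binds_here(environment, query, exact=False):
    """Compare a ligand's residue environment with the residues in a query.

    exact=True  : the environment must consist of exactly the query residues.
    exact=False : the environment must contain at least the query residues.
    """
    env, wanted = Counter(environment), Counter(query)
    if exact:
        return env == wanted
    return all(env[residue] >= count for residue, count in wanted.items())


def search(environments, query, exact=False):
    """Return the ids of all ligands whose environment satisfies the query."""
    return [
        ligand_id
        for ligand_id, residues in environments.items()
        if binds_here(residues, query, exact=exact)
    ]


# Hypothetical environments keyed by hypothetical ligand ids.
environments = {
    "LIG_A": ["HIS", "HIS", "CYS", "CYS"],
    "LIG_B": ["HIS", "HIS", "HIS", "ASP"],
    "LIG_C": ["HIS", "HIS"],
}

print(search(environments, ["HIS", "HIS"]))              # ['LIG_A', 'LIG_B', 'LIG_C']
print(search(environments, ["HIS", "HIS"], exact=True))  # ['LIG_C']
```

Using `Counter` rather than `set` keeps track of how many residues of each type are required, so a query for two histidines is not satisfied by one.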
PDBeMotif: powerful and flexible searching • PDBeXpress modules are driven by PDBeMotif • PDBeMotif combines protein sequence, chemical structure and 3D data in a single search
PDBeMotif: powerful and flexible searching • Construct queries based on: • ligands and their 3D environment • secondary structure elements and small 3D motifs • protein φ/ψ angle sequences (a sequential representation of the protein geometry) • Results can be analysed against UniProt, CATH, PFAM or EC
Ligands need careful validation • CCDC analysis of ligand geometries (using Relibase+/Mogul/EDS) • Around 20% of recently determined structures have geometric errors that could lead to a misleading interpretation of the binding interactions • Figure panels: Wrong, Unusual/Strained, Correct • Liebeschuetz, J.W., Hennemann, J. The good, the bad and the twisted: A survey of ligand geometry in protein crystal structures. J. Comput. Aid. Mol. Des., 26, 169-183 (2012)
The solution… • Mogul: a knowledge-based library of molecular geometry derived from the Cambridge Structural Database (CSD) • Enables rapid validation of the complete geometry of a given query structure and identification of unusual features
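The general idea of knowledge-based geometry validation can be sketched as comparing an observed value with a distribution of reference values. The code below is not Mogul and does not use its API; the reference bond lengths are invented numbers, and the z-score threshold of 2.0 is an arbitrary assumption chosen for the example.

```python
from statistics import mean, stdev


def z_score(observed, reference):
    """How many sample standard deviations the observation lies from the mean."""
    return (observed - mean(reference)) / stdev(reference)


def classify(observed, reference, threshold=2.0):
    """Label a geometric parameter 'typical' or 'unusual' (threshold is an assumption)."""
    return "unusual" if abs(z_score(observed, reference)) > threshold else "typical"


# Invented reference bond lengths in angstroms, standing in for values
# that a real tool would retrieve from a structural database.
reference_lengths = [1.52, 1.53, 1.54, 1.53, 1.52, 1.54, 1.53]

print(classify(1.53, reference_lengths))  # typical
print(classify(1.60, reference_lengths))  # unusual
```

A real validation run would repeat this for every bond, angle and torsion in the ligand, and would need enough reference observations for the statistics to be meaningful.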
MoU with CCDC • wwPDB/CCDC Memorandum of Understanding • wwPDB gets to use Mogul for validation of all current and future compounds in the PDB • wwPDB gets to incorporate and redistribute CSD coordinates for all current and future ligand compounds in the PDB • wwPDB gets to use Mogul and CSD coordinates to derive dictionaries for all current and future compounds in the PDB
Prevention is the best cure • Thanks to collaboration with CCDC • We can add CSD coordinates for all existing small molecules in the PDB (and variants, e.g. D-amino acids) that also occur in the CSD • We can use these coordinates and Mogul to derive refinement dictionaries • Grade (Global Phasing; uses Mogul and RM1) • Will improve quality and consistency of the archive • We can provide reasonable starting coordinates and refinement dictionaries for all existing compounds in the PDB
Future of the PDB? • At present the PDB is a historic archive • We have to accept and distribute everything • “Archive”, i.e. what was described in the literature • Essentially provider-centric • We capture X-ray detector type but not ligand function… • Organised by entry rather than molecule/complex/… • Shifting user communities/demands • We must serve the consumers of structural data (non-experts) • They don’t think in terms of PDB entry codes • They can’t tell a good model from a bad one
Thank you! • Tutorials… • http://www.ebi.ac.uk/pdbe/resources/educationTabContent/tutorials/PDBeChem.pdf • http://www.ebi.ac.uk/pdbe-apps/quips?story=XmasFactor&auxpage=XmasChemTut • http://www.ebi.ac.uk/pdbe/docs/Tutorials/PDBeChem.html • Contact us… • www.pdbe.org • pdbehelp@ebi.ac.uk • Follow us… http://www.facebook.com/proteindatabank http://twitter.com/PDBeurope