Overview of Problems with Carbohydrates in the PDB

blaine

Presentation Transcript


  1. Overview of Problems with Carbohydrates in the PDB

  2. “...while the functions of DNA and proteins are generally known... it is much less clear what carbohydrates do...” Ciba Foundation Symposium, 1988

  3. A lesson in doing this project: Performance, Feedback, Revision. http://the273.com/2011/05/24/baba-brinkman-performance-feedback-revision-video/ (link provided by Helen Berman)

  4. Priorities change. No point, you have had this. Major part of PDB but not that interesting. Most interesting chemistry. Important to understand first. Every step you do changes the next steps to be done.

  5. New Schedule: Carbohydrates and the PDB; Natural Product Carbohydrates; N- and O-Glycans; Don't know (see what is appropriate)

  6. How Much More Complex is the Glycome of an Organism in Comparison with its Genome? [Figure: DNA (Genome), RNA (Transcriptome), PROTEINS (Proteome), ENZYMES (Zymome?), LIPIDS (Lipome), GLYCANS / SUGAR CHAINS (Glycome)] Variations in structure, time and space. Changes in response to environment.

  7. Diversity of structures, information-carrying potential. Laine, RA (1994) “A Calculation of all Possible Oligosaccharide Isomers, Both Branched and Linear Yields 1.05 x 10^12 Structures for a Reducing Hexasaccharide: The Isomer Barrier to Development of Single-Method Saccharide Sequencing or Synthesis Systems” Glycobiology 4:759-767.

  8. General Characteristics. In nature, most carbohydrates are found bound to other compounds rather than as simple sugars:
     • Polysaccharides (starch, cellulose, inulin, gums)
     • Glycoproteins and proteoglycans (hormones, blood group substances, antibodies)
     • Glycolipids (cerebrosides, gangliosides)
     • Glycosides
     • Mucopolysaccharides (hyaluronic acid)
     • Nucleic acid polymers

  9. Classification of Carbohydrates. Carbohydrates can be classified by size:
     • Monosaccharides (monoses or glycoses): trioses, tetroses, pentoses, hexoses
     • Oligosaccharides: di, tri, tetra, penta … up to 10 (the disaccharides are the most important)
     • Polysaccharides (or glycans): homopolysaccharides (all the same type) and heteropolysaccharides (mixtures of monomer types)
     • Complex carbohydrates (joined to non-carbohydrate molecules)

  10. Derivatives of monosaccharides with biological activities:
     • Phosphate and sulphate esters
     • Alditols
     • Aldonic and uronic acids
     • Deoxysugars
     • Aminosugars
     • Family of sialic acids
     • N-acetylmuramic acid
     • Glycosides

  11. What are you searching? (June 27th 2011)
     Number PDB entries: 73951
     Number chem_comp: 14206
     HETNAM in pdb files: 132117
     Number chem_comp in HETNAM: 12111
     Number chem_comp Released: 12289
     Number chem_comp Hold: 1363
     Number chem_comp Obsolete: 381
     sum: 14033
     Number in REMOVED list 381 (REMOVED not equal obs) 351 chem_comp missing from num in PDB + OBS + HOLD must be number in remediation either new or obs from antibiotics/inhibitors
     Searching chem_comp in isolation from the PDB entries is not recommended: check if a chem_comp exists, and check the LINK records to see if the instance was built correctly.
     Note: after the remediation release, hundreds of chem_comp change status.

  12. The majority of potential chemical entities in the PDB exist in a small number of entries:
     8074 chem_comp appear in 1 entry
     1554 chem_comp appear in 2 entries
     628 chem_comp appear in 3 entries
     365 chem_comp appear in 4 entries
     226 chem_comp appear in 5 entries
     167 chem_comp appear in 6 entries
     124 chem_comp appear in 7 entries
     98 chem_comp appear in 8 entries
     73 chem_comp appear in 9 entries
     678 chem_comp appear in 10 to 100 entries
     47 chem_comp appear in 101 to 200 entries
     For uncommon groups, check the PDB entry!
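The skew in this table can be made concrete with a little arithmetic. The sketch below only re-uses the bin counts listed on the slide; the variable names are illustrative, and the total covers just the listed bins (the slide stops at 200 entries), so it is not the full chem_comp count.

```python
# Bin counts copied from the slide: entries-per-chem_comp bin -> number of chem_comp.
chem_comp_by_entry_count = {
    "1": 8074,
    "2": 1554,
    "3": 628,
    "4": 365,
    "5": 226,
    "6": 167,
    "7": 124,
    "8": 98,
    "9": 73,
    "10-100": 678,
    "101-200": 47,
}

# Total over the listed bins only (the slide stops at 200 entries).
listed_total = sum(chem_comp_by_entry_count.values())

# Share of the listed chem_comp that occur in exactly one PDB entry.
single_entry_share = chem_comp_by_entry_count["1"] / listed_total

print(f"chem_comp in listed bins: {listed_total}")
print(f"appear in exactly one entry: {single_entry_share:.1%}")
```

On these numbers roughly two thirds of the listed chem_comp occur in a single entry, which is why the slide recommends checking the PDB entry itself for uncommon groups.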

  13. Top 77 chem_comp by count of released PDB entries (includes 8 sugars)

  14. NDG and NAG: a common error in N-linked N-acetyl-D-glucosamine attached to asparagine. There are 429 cases (it was ~200 in May 2007, so annotation/deposition is still not alerting the depositor) for which we have had to assign, by stereochemistry matching, the incorrect 2-(acetylamino)-2-deoxy-α-D-glucopyranose (NDG) rather than the correct 2-(acetylamino)-2-deoxy-β-D-glucopyranose (NAG). Asn-NAG is ALWAYS beta, never alpha.
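A depositor could screen for this case before submission. The following is a minimal sketch, assuming PDB-format LINK records with the standard fixed columns; `format_link`, `parse_link` and `suspicious_asn_links` are hypothetical helper names, and the sample records are made up for illustration. It only flags ASN to NDG links for manual review; it does not check the coordinates.

```python
def format_link(atom1, res1, chain1, seq1, atom2, res2, chain2, seq2):
    """Build a simplified LINK line (hypothetical helper, sample data only)."""
    return (f"LINK        {atom1:<4} {res1:>3} {chain1}{seq1:>4} "
            f"{'':15}{atom2:<4} {res2:>3} {chain2}{seq2:>4}")


def parse_link(line):
    """Read both partners of a LINK record from its fixed columns."""
    return {
        "atom1": line[12:16].strip(),
        "res1": line[17:20].strip(),
        "chain1": line[21],
        "seq1": line[22:26].strip(),
        "atom2": line[42:46].strip(),
        "res2": line[47:50].strip(),
        "chain2": line[51],
        "seq2": line[52:56].strip(),
    }


def suspicious_asn_links(pdb_lines):
    """Return LINK records that join ASN to NDG (alpha), for manual review."""
    hits = []
    for line in pdb_lines:
        if not line.startswith("LINK"):
            continue
        link = parse_link(line.ljust(80))
        if {link["res1"], link["res2"]} == {"ASN", "NDG"}:
            hits.append(link)
    return hits


# Made-up sample records: one ASN-NAG link and one ASN-NDG link.
sample = [
    format_link("ND2", "ASN", "A", "297", "C1", "NAG", "A", "401"),
    format_link("ND2", "ASN", "A", "103", "C1", "NDG", "A", "402"),
]
flagged = suspicious_asn_links(sample)
for link in flagged:
    print(f"check {link['res1']} {link['chain1']}{link['seq1']} - "
          f"{link['res2']} {link['chain2']}{link['seq2']}")
```

A flagged record means either the sugar was built as the alpha anomer or it was labelled with the wrong code; both need a look at the model and density.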

  15. Deposition of PDB Entries. Refinement programs use geometric restraints as part of refinement. For protein structures, accurate bond and angle parameters are derived from a statistical survey of X-ray structures of small compounds from the Cambridge Structural Database (R. A. Engh and R. Huber). Other restraints for proteins, nucleic acids, and other common molecules come from the CCP4 monomer library.

  16. Deposition of PDB Entries. These restraints are used in refinement to prevent distortions of model geometry and to increase the observation-to-parameter ratio. The default restraints are for bond lengths, bond angles, dihedral (torsion) angles, chiral centers, planar groups (such as aromatic rings), and nonbonded (VDW) interactions.

  17. Refinement Restraints for Carbohydrates. Although geometry restraints for carbohydrates exist, they are not always used, with the result that there are geometry errors in deposited files. Many of the stereochemical errors can be detected by reference to conformational studies of glycans and to publicly available resources (http://www.glycosciences.de/tools/). However, these errors also indicate that there is a wide discrepancy in the sophistication of building and validation tools available for protein and carbohydrate models.

  18. The PDB does not contain N-linked glycans unknown to glycobiology. Resources that depositors should use:
     http://www.glycome-db.org/
     http://www.glycostructures.jp/
     http://www.cbs.dtu.dk/databases/OGLYCBASE/
     http://www.glycoforum.gr.jp/
     http://www.genome.jp/ligand/kcam/
     http://www.functionalglycomics.org/static/index.shtml
     http://www.glyco.ac.ru/bcsdb3/
     http://www.casper.organ.su.se/ECODAB/
     http://www.functionalglycomics.org/static/gt/gtdb.shtml
     http://akashia.sci.hokudai.ac.jp/
     http://hexose.chem.ku.edu/sugar.php
     http://www.eurocarbdb.org/
     http://glyco3d.cermav.cnrs.fr/glyco3d/
     http://www.glycosciences.de/modeling/sweet2/doc/index.php

  19. First of all, a GOOD carbohydrate: PDB 1qbb, Di-(N-Acetyl-D-glucosamine)

  20. NOTE the role of aromatic amino acid side chains in controlling stereochemical selection. NOTE also the two positions for the reducing-end O atom: under PDB rules this would be 2 chem_comps, but here alpha- and beta- are the same compound.

  21. Crystallographic Inventions: Man-(1→3)-GlcNAc and GlcNAc-(1→3)-GlcNAc linkages (of indeterminate anomericity) within the trimannosyl core; hybrid-type glycans containing a terminal Man-(1→3)-GlcNAc linkage on the 3-antennae; β-galactosyl motifs capping oligomannose-type glycans. Entry 2H6O

  22. Crystallographic Inventions. The pilin glycans from Neisseria species share a common structure, in particular with respect to the unusual O-linked sugar residue 2,4-diacetamido-2,4,6-trideoxyhexose (DATDH). However, in the PDB (1AY2, 2PIL), the pilin structure from Neisseria gonorrhoeae shows a galactose-α-1,3-N-acetylglucosamine-serine. In later PDB entries (2HI2, 2HIL), the correct sugar, 2,4-bis(acetylamino)-1,5-anhydro-2,4-dideoxy-D-glucitol, is reported O-linked to serine.

  23. [Figure: 1AY2 (incorrect) and 2HI2 (correct)]

  24. PROBLEM 1: D- vs L- Designation. D & L sugars are mirror images of one another. They have the same root name but a different D/L designation (e.g. D-glucose & L-glucose). Other stereoisomers have unique names (e.g. glucose, mannose, galactose, etc.). The number of stereoisomers is 2^n, where n is the number of asymmetric (chiral) centers. The 6-C aldoses have 4 asymmetric centers. Thus there are 16 stereoisomers (8 D-sugars and 8 L-sugars).
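The 2^n count can be checked by enumeration. This is a small sketch, assuming the usual Fischer-projection convention in which the hydroxyl at the highest-numbered chiral centre (C5 in an aldohexose) sits on the right for D-sugars and on the left for L-sugars; the variable names are illustrative only.

```python
from itertools import product

# The four chiral centres of an open-chain aldohexose.
centres = ("C2", "C3", "C4", "C5")

# Each centre has its OH on the right or the left of the Fischer projection.
configurations = list(product(("right", "left"), repeat=len(centres)))

# D/L is decided by the last centre only (C5): right = D, left = L.
d_sugars = [c for c in configurations if c[-1] == "right"]
l_sugars = [c for c in configurations if c[-1] == "left"]

print(len(configurations), len(d_sugars), len(l_sugars))
```

Note that the D/L label fixes only one of the four centres, which is why it says so little about the rest of the stereochemistry.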

  25. D and L tell you nothing about stereochemistry. The result is that authors who refine with a standard residue (e.g. mannose) plus a linkage or alpha-/beta- C1-OH patch don't necessarily deposit the PDB-required chem_comp name for alpha-Mannose (MAN) or beta-Mannose (BMA). If you used R and S per chiral centre, no chemist would understand that you are describing a sugar, but the stereochemistry would be exactly defined and mistakes avoided.

  26. PROBLEM 2: alpha-/beta- at C1. Cyclization of glucose produces a new asymmetric center at C1. The 2 stereoisomers are called anomers, α & β.

  27. Chem_Comp LEAVING ATOM. The PDB has rules to include the LINKAGE in the 3-letter code. Refinement programs (the suppliers of coordinates to the PDB) use “patches” to describe alpha- and beta-, NOT the 3-letter code. Systematic conventions for representing sugars don't rename alpha-Mannose and beta-Mannose to MAN and BMA as the PDB does.

  28. [Figure labels] Alpha-L-Fucose: a NAG-FUC in the PDB. Beta-? This is PDB FUL, Beta-L-Fucose, which doesn't exist in glycans.

  29. The process of identifying a new chem_comp in a PDB entry:
     • Find all atoms belonging to a single entity
     • Detect bond orders by software
     • Add appropriate H-atoms
     • Generate a SMILES
     • Test if the SMILES generates correct ideal coordinates
     • From the ideal coordinates generate a SMILES
     • From the SMILES generate the chemical name
     • The chem_comp CIF file stores the output; it is not used as input in any step

  30. Identifying an existing chem_comp in a PDB entry:
     • The chem_comp connectivity is extracted and a graph is made for each compound.
     • As above, all atoms belonging to a chemical entity are found and its connectivity graph is compared to the dictionary to find the correct match.
     • The crucial step is finding LINKed atoms that may belong to the entity in question. For carbohydrates in the PDB, in a glycosidic bond C1(i) --- X(i+1), the oxygen of C1(i) is named in the (i+1) residue, but during identification it is attached “temporarily” to the sugar to determine the C1 stereochemistry. So in an Asn-NAG, O1 is the Asn nitrogen atom.

  31. LEAVING ATOM: a frequent problem. If the Asn-NAG N to C distance is > 2.0 Å, the sugar would end up as 5AX, the de-hydroxy NAG at C1 (plus the angle is impossible).
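The distance test behind this problem is easy to reproduce. The sketch below assumes the 2.0 Å cutoff quoted on the slide; the function name and the coordinates are hypothetical and chosen only to show one pass and one fail.

```python
import math

MAX_LINK_DISTANCE = 2.0  # Å, the cutoff quoted on the slide


def is_plausible_link(atom_a, atom_b, cutoff=MAX_LINK_DISTANCE):
    """True when two atoms (x, y, z in Å) are close enough to be bonded."""
    return math.dist(atom_a, atom_b) <= cutoff


# Hypothetical coordinates for Asn ND2 and NAG C1.
asn_nd2 = (10.0, 5.0, 3.0)
nag_c1_close = (11.45, 5.0, 3.0)  # 1.45 Å away
nag_c1_far = (12.60, 5.0, 3.0)    # 2.60 Å away

close_ok = is_plausible_link(asn_nd2, nag_c1_close)
far_ok = is_plausible_link(asn_nd2, nag_c1_far)
print(close_ok, far_ok)
```

When the check fails, the Asn nitrogen is not attached to the sugar during identification, so C1 appears to have lost its substituent.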

  32. LEAVING ATOM & alpha-/beta- linkage. Note this is similar to the peptide bond in proteins, but there the leaving atom is assumed in all protein software and the LINK is independent of the 3-letter code, i.e. you can have a cis or trans peptide; trans is assumed, while cis is given external to the residue name as CISPEP. All glycobiology gives the sugar linkage and C1 stereochemistry external to the sugar name. Only the PDB has BMA and MAN to represent beta-mannose and alpha-mannose; everywhere else mannose is mannose, “man”. All refinement software (the suppliers to the PDB) use MAN and a link “patch”. A historical legacy we could do without!

  33. PROBLEM 3: Conformation (minor). Because of the tetrahedral nature of carbon bonds, pyranose sugars actually assume a "chair" or "boat" conformation, depending on the sugar.

  34. Conformational formulas of pyranoses

  35. Conformation. Sugar ring pucker is not always fitted well to density. This does not interfere with identification, except where bond lengths and angles may cause processing software to confuse single and double bonds.

  36. The conformation of the ring is dominated by steric interactions between axial groups. In hexopyranoses this causes a strong preference for the less crowded ⁴C₁ conformation in the D-series (¹C₄ in the L-series), as this places C-6 in an equatorial position. In pentoses, furanoses and unsaturated pyranoses the differences in steric energy between conformations are much smaller, so that the conformation is often determined by the anomeric effect. The term anomeric effect is used to describe the preference for placing electronegative substituents anti to the electron pair of a heteroatom, i.e. oxygen. The debate over the ribose ring pucker in DNA and RNA may have ceased, but it is not resolved.

  37. PROBLEM 4: Glycosidic Bonds. [Figure: B7-1] Glycoprotein carbohydrate moieties are inherently:
     • (a) Variable: variable site occupancy; variable structures at each site
     • (b) Flexible
     These are exactly why glycosylation is avoided in constructs for crystallisation!

  38. Glycosidic Bonds. The anomeric hydroxyl and a hydroxyl of another sugar or some other compound can join together, splitting out water to form a glycosidic bond: R-OH + HO-R' -> R-O-R' + H2O. E.g., methanol reacts with the anomeric OH on glucose to form methyl glucoside (methyl glucopyranoside).
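The condensation can be checked as simple formula bookkeeping. This is a sketch that tracks atom counts only (no structure or stereochemistry), using the standard molecular formulas for glucose, methanol and water; the variable names are illustrative.

```python
from collections import Counter

# Molecular formulas as element -> atom count.
glucose = Counter(C=6, H=12, O=6)   # C6H12O6
methanol = Counter(C=1, H=4, O=1)   # CH4O
water = Counter(H=2, O=1)           # H2O

# R-OH + HO-R' -> R-O-R' + H2O: add the two reactants, remove one water.
methyl_glucoside = glucose + methanol - water

formula = "".join(f"{element}{methyl_glucoside[element]}" for element in "CHO")
print(formula)
```

The same bookkeeping applies to every glycosidic bond in a chain: each link costs one water, which is the leaving atom problem seen from the chemistry side.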

  39. Glycosidic bonds determine structure. Straight chains: good for structure. Bent chains: good for storage.

  40. Both glycosides and oligo-/polysaccharides are built of compounds linked by glycosidic bonds. Glycosides: a (non-sugar) molecule with free –OH or –NH2 groups (aglycone) + a monosaccharide with free –OH at C1. Oligosaccharide / polysaccharide: monosaccharide + monosaccharide.

  41. [Figure: a disaccharide; a glycoside]

  42. Chemical structure: submitted to correct?? A major problem in the PDB is that authors will always check UniProt for the correct amino acid sequence and GenBank for correct DNA/RNA sequences, but never check whether the sugars built into density actually exist for the species under study, either the extracted source or the expression source.

  43. PROBLEM 5: Structural Features: H-bonding opportunities. Cellulose: H-bonds add strength.

  44. Secondary & Tertiary Structure: rotational freedom; hydrogen bonding; oscillations; local (secondary) and overall (tertiary); random coil, helical conformations

  45. Movement around bonds (from http://www.sbu.ac.uk/water/hydro.html)

  46. Frequently used definitions of glycosidic torsion angles. [Figure label: ASN] Well, in modelling if not crystallography.

  47. Polysaccharide equivalents to phi/psi in proteins are not used. Proteins are routinely, without question, validated for allowed phi/psi torsion angles. Polysaccharides have a wider range of allowed torsion angles, but there are clear preferences, all universally ignored.
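Any such validation starts from the torsion angle defined by four atoms. The sketch below is a plain-Python dihedral calculation; which four atoms define a glycosidic phi or psi depends on the convention in use, so the function takes four generic points, and the coordinates in the example are made up. The sign follows the usual rule: looking along the central bond, a clockwise rotation from the first to the last atom is positive.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range -180 to 180."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane p1-p2-p3
    n2 = _cross(b2, b3)  # normal of the plane p2-p3-p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the central bond runs along z.
cis = torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter = torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, quarter, trans)
```

With angles computed this way for each linkage, a model's values can be compared against the preferred regions reported in conformational studies of glycans.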

  48. Tertiary structure: steric/geometrical conformations. Rule of thumb: the overall shape of the chain is determined by the geometrical relationship within each monosaccharide unit.
     • β(1→4): zig-zag, ribbon-like
     • β(1→3) & α(1→4): U-turn, hollow helix
     • β(1→2): twisted, crumpled
     • (1→6): no ordered conformation

  49. Assignment for next lecture. Today has been a general view of sugars in the PDB. For next week: find all instances of the following 4 example groups of sugar compounds. Caution: the compounds may be given as a single 3-letter code or as a LINKed set of chem_comps. Find the common name and, if a natural metabolite, the source organism. EXCLUDE all phosphate and nucleotide examples.
