An analysis of pdb-care (PDB CArbohydrate REsidue check): a program to support annotation of complex carbohydrate structures in PDB files by Thomas Lütteke and Claus-W von der Lieth By David Chapman
Background • The Protein Data Bank (PDB) includes 3-D data for carbohydrate structures as well as amino acid structures • 3-D data for protein-carbohydrate interactions is obtained through X-ray crystallography and nuclear magnetic resonance (NMR) • The absence of 3-D glycan data in PDB does not necessarily mean a potential glycosylation site is unoccupied
Background • The crystallography may have been done on plasmid-expressed proteins, which may not carry the same carbohydrates as the human form • N-glycosylation usually occurs at asparagine residues in Asn-X-Ser/Thr sequons, where X is any amino acid except proline • Approximately 30% of the 1663 PDB entries containing carbohydrates (Sep 2003) contain errors in their glycan description
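The sequon rule on this slide can be sketched in a few lines of Python. This is only an illustration of the Asn-X-Ser/Thr pattern, not part of pdb-care; the function name `find_sequons` and the demo sequence are made up. A match marks a potential site only and says nothing about whether the site is actually occupied.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one character wide, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    demo = "MKNGSAANPTLLNNTS"  # hypothetical sequence, for illustration only
    print(find_sequons(demo))  # → [3, 13, 14]; Asn 8 is skipped (followed by Pro)
```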
Biological Significance • Protein / Carbohydrate interactions are important because they are involved in a variety of biological processes • Fertilization • Embryonic development • Cellular differentiation
Background • The high error rate in PDB glycan descriptions is mainly due to incorrect assignment of saccharide units • Sequences for complex carbohydrates differ significantly from single-letter amino acid sequences • The number of naturally occurring residues is much larger for carbohydrates • Each pair of monosaccharide residues can be linked in several ways • A residue can be connected to three or four others (branching)
Background • Unlike amino acids, carbohydrate residues use three-letter codes, which are defined in the HET dictionary in PDB • A new residue name is required for each stereochemically different sugar unit • This makes correct assignment complicated, tedious and error-prone
Background • Examples of definitions of carbohydrate residues: • AGC alpha-D-Glucopyranose • BGC beta-D-Glucopyranose • FCA alpha-D-Fucose • FCB beta-D-Fucose • There are more than 200 carbohydrate residues used in PDB
Implementation • Pdb-care is based on the pdb2linucs carbohydrate detection program • Pdb2linucs identifies and assigns carbohydrate structures using only the reported atom types and their 3-D coordinates • Its output is in LINUCS notation, which gives a normalized description of complex carbohydrate structures • Pdb-care uses an XML translation table to compare the LINUCS notation from pdb2linucs with the residue assignments in the PDB group dictionary
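The "residue assignments" side of this comparison comes from the three-letter names stored in the PDB file itself. As a rough sketch of that input step (not pdb-care's own code), the Python below collects the assigned names from HETATM records using the fixed PDB column layout (residue name in columns 18-20, chain in column 22, residue number in columns 23-26). The function name `assigned_het_residues` and the demo records are assumptions for illustration.

```python
def assigned_het_residues(pdb_text: str) -> list[tuple[str, int, str]]:
    """Collect (chain, residue number, residue name) from HETATM records.

    Water (HOH) is skipped, and each residue is reported once even though
    it spans many atom records.
    """
    seen = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # columns 18-20: residue name
        chain_id = line[21]             # column 22: chain identifier
        res_seq = int(line[22:26])      # columns 23-26: residue number
        key = (chain_id, res_seq, res_name)
        if res_name != "HOH" and key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    # Hypothetical records, truncated after the residue number;
    # only the first 26 columns are read by this sketch.
    demo = "\n".join([
        "HETATM 1001  C1  BGC A 501",
        "HETATM 1002  C2  BGC A 501",
        "HETATM 2001  O   HOH A 601",
    ])
    print(assigned_het_residues(demo))
```

Each name returned here is what a checker would compare against the residue identity detected from the coordinates.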
Implementation • The translation table contains: • 141 monosaccharides • 31 oligosaccharides • 77 combined residues • Pdb-care was written in the C language • Front end is a web interface implemented in PHP
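The paper does not describe the layout of the translation table, so the following Python sketch uses an invented XML schema (`translation_table`, `residue`, `pdb`, `name` are all assumptions) and the four residue definitions from the earlier slide. It also stands in a plain residue name for the real LINUCS output of pdb2linucs. It only illustrates the idea of the check: look up the assigned three-letter code and compare it with what the coordinates indicate. Pdb-care itself is written in C.

```python
import xml.etree.ElementTree as ET

# Hypothetical table layout; the real pdb-care XML schema is not given in the source.
TABLE_XML = """
<translation_table>
  <residue pdb="AGC" name="alpha-D-Glucopyranose"/>
  <residue pdb="BGC" name="beta-D-Glucopyranose"/>
  <residue pdb="FCA" name="alpha-D-Fucose"/>
  <residue pdb="FCB" name="beta-D-Fucose"/>
</translation_table>
"""


def load_table(xml_text: str) -> dict[str, str]:
    """Map each three-letter PDB residue code to its residue name."""
    root = ET.fromstring(xml_text.strip())
    return {r.get("pdb"): r.get("name") for r in root.iter("residue")}


def check_residue(table: dict[str, str], assigned_code: str, detected_name: str):
    """Return a message describing a mismatch, or None if the assignment agrees."""
    expected = table.get(assigned_code)
    if expected is None:
        return f"{assigned_code}: not in translation table"
    if expected.lower() != detected_name.lower():
        return (f"{assigned_code}: assigned as {expected}, "
                f"but coordinates indicate {detected_name}")
    return None


if __name__ == "__main__":
    table = load_table(TABLE_XML)
    # A residue labelled BGC whose coordinates look like the alpha anomer.
    print(check_residue(table, "BGC", "alpha-D-Glucopyranose"))
```

A dictionary keyed by residue code keeps each lookup constant-time, which is enough for a table of a few hundred entries.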
Implementation • The pdb-care web interface accepts input in three ways: pasting the contents of a PDB file, selecting a file on a local hard drive, or entering a PDB ID • The pdb-care protocol reports the types of problems, inconsistencies and errors detected
Program Example • pdb-care examples
Conclusion • The authors made relevant points regarding the biological significance of protein-carbohydrate interactions and the need for accurate glycan residue information in PDB • However, the authors did not go into detail regarding the actual implementation of the translation table used in pdb-care, so it is difficult to judge the accuracy of their program