Base stacking classification via automated clustering method
Eli Hershkovits (1), Xavier Le Faucheur (1), Neocles Leontis (2), Allen Tannenbaum (1)
(1) Georgia Institute of Technology, (2) BGSU
Data classification
• Coordinate system and parameterization
• Clustering of the data (“by eye” or automated clustering)
Base stacking: ring coordinate system
[Figure: two rings with axis labels X1, Y1, Z1 and X2, Y2, Z2, and the label r12]
• The three orthogonal directions are calculated with the Cremer and Pople method.
• The coordinates y1 and y2 can be used to define the face of the ring (up or down).
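The ring frame described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the normal follows the Cremer and Pople mean-plane construction, while the in-plane x axis (projection of the first ring atom) and the function name `ring_frame` are assumptions made for the example.

```python
import math

def sub(a, b):
    return tuple(a[k] - b[k] for k in range(3))

def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in a)

def ring_frame(atoms):
    """Return (center, x, y, z) for a ring given as ordered (x, y, z) atom positions."""
    n = len(atoms)
    center = tuple(sum(a[k] for a in atoms) / n for k in range(3))
    rel = [sub(a, center) for a in atoms]
    # Cremer-Pople auxiliary vectors: sine- and cosine-weighted sums of positions.
    r_sin = tuple(sum(p[k] * math.sin(2 * math.pi * j / n) for j, p in enumerate(rel))
                  for k in range(3))
    r_cos = tuple(sum(p[k] * math.cos(2 * math.pi * j / n) for j, p in enumerate(rel))
                  for k in range(3))
    z = unit(cross(r_sin, r_cos))  # mean-plane normal
    # Assumption: x points toward the projection of the first atom onto the mean plane.
    first = rel[0]
    x = unit(sub(first, tuple(dot(first, z) * c for c in z)))
    y = cross(z, x)  # completes a right-handed frame
    return center, x, y, z

# Hypothetical test ring: planar regular hexagon, atoms ordered counterclockwise in the xy plane.
HEXAGON = [(math.cos(math.pi * j / 3), math.sin(math.pi * j / 3), 0.0) for j in range(6)]
```

For the counterclockwise hexagon the normal comes out along -z, and reversing the atom order flips it, so a consistent atom numbering is needed before the sign of the normal can be used to label a face.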
Base stacking: relative coordinate system
[Figure labels: ω, r, φ]
• The relative ring coordinates are defined by the spherical coordinates r, φ and θ.
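As a rough sketch of the relative parameterization, the position of the second ring center can be expressed in the local frame of the first ring and converted to spherical coordinates. The slides do not state which angle is polar and which is azimuthal; here θ is assumed to be measured from the ring normal (z) and φ in the ring plane from x. The function name `relative_spherical` is hypothetical.

```python
import math

def relative_spherical(center1, frame1, center2):
    """Return (r, theta, phi) of center2 in the local frame (x, y, z) of ring 1."""
    x, y, z = frame1
    d = tuple(center2[k] - center1[k] for k in range(3))
    # Components of the center-to-center vector along the local axes.
    dx, dy, dz = (sum(d[k] * axis[k] for k in range(3)) for axis in (x, y, z))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        raise ValueError("ring centers coincide")
    # Assumption: theta is the polar angle from the ring normal, phi the azimuth.
    theta = math.acos(max(-1.0, min(1.0, dz / r)))
    phi = math.atan2(dy, dx)
    return r, theta, phi

# Hypothetical frame: ring 1 at the origin, aligned with the global axes.
FRAME = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ORIGIN = (0.0, 0.0, 0.0)
```

A ring sitting directly above the first one, for example at (0, 0, 3.4), has θ = 0 under this convention, so small θ corresponds to face-to-face geometry.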
Primary classification
• For each base stacking candidate, the two closest rings are chosen to represent the pair. This choice gives a classification into four groups: pyrimidine-pyrimidine, pyrimidine-imidazole, imidazole-pyrimidine and imidazole-imidazole.
• There are four possible combinations of face-face interactions: up-up, up-down, down-up and down-down.
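The ring-pair selection above can be illustrated as follows. This is a sketch under assumptions: each base is represented as a dictionary mapping a ring type to its center, and "closest" is taken to mean the smallest center-to-center distance. The name `closest_ring_pair` and the sample coordinates are made up for the example.

```python
import math
from itertools import product

def closest_ring_pair(rings1, rings2):
    """Pick the ring of each base with the smallest center-to-center distance.

    rings1, rings2: dicts mapping a ring type ("pyrimidine" or "imidazole")
    to the (x, y, z) center of that ring.
    Returns (label, distance), e.g. ("imidazole-pyrimidine", 3.5).
    """
    best = min(
        product(rings1.items(), rings2.items()),
        key=lambda pair: math.dist(pair[0][1], pair[1][1]),
    )
    (type1, center1), (type2, center2) = best
    return f"{type1}-{type2}", math.dist(center1, center2)

# Hypothetical coordinates: a purine (two rings) stacked on a pyrimidine base (one ring).
PURINE = {"pyrimidine": (0.0, 0.0, 0.0), "imidazole": (2.1, 0.0, 0.0)}
PYRIMIDINE_BASE = {"pyrimidine": (2.0, 0.0, 3.5)}
```

With these sample coordinates the imidazole ring of the purine is closer to the second base, so the pair is labelled imidazole-pyrimidine.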
Secondary classification
• The spherical coordinates r, φ and θ are correlated and separate into two clusters: “proper stacking” and “improper stacking”.
• Together, these classifications give 4 × 4 × 2 = 32 classes (four ring pairs, four face combinations, two stacking clusters).
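The class count can be checked by enumerating the combinations directly. The label strings and the `class_index` helper below are illustrative choices, not part of the original scheme.

```python
from itertools import product

RING_PAIRS = [
    "pyrimidine-pyrimidine",
    "pyrimidine-imidazole",
    "imidazole-pyrimidine",
    "imidazole-imidazole",
]
FACE_PAIRS = ["up-up", "up-down", "down-up", "down-down"]
STACKING = ["proper", "improper"]

def all_classes():
    """Every (ring pair, face pair, stacking cluster) combination."""
    return list(product(RING_PAIRS, FACE_PAIRS, STACKING))

def class_index(ring_pair, face_pair, stacking):
    """Map one combination to an integer in the range 0..31 (hypothetical numbering)."""
    return (RING_PAIRS.index(ring_pair) * len(FACE_PAIRS)
            + FACE_PAIRS.index(face_pair)) * len(STACKING) + STACKING.index(stacking)
```

Calling `len(all_classes())` gives 32, matching the count on the slide.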
Possible problems
• For stacking of residues that are not neighbors, the distribution of ω is broad.
• The clusters may overlap.
Stacking of RNA on protein
• Stacking interactions between nucleic acids and amino acids are not abundant (9 in the large subunit, RR0033).
• Most of these stacking interactions are with histidine (6). Of the stacking cases, 5 are with the pyrimidine ring.