
Improving Protein-RNA Interface Prediction by Combining a Sequence Homology-based Method with a Naïve Bayes Classifier: Preliminary Results

Li Xue 1,2, Rasna Walia 1,2, Yasser El-Manzalawy 2,4, Drena Dobbs 1,3, Vasant Honavar 1,2



Affiliations: 1 Bioinformatics & Computational Biology Program; 2 Dept. of Computer Science; 3 Dept. of Genetics, Development & Cell Biology, Iowa State University; 4 Dept. of Systems & Computer Engineering, Al-Azhar University, Cairo, Egypt

• Protein-RNA interactions play important roles in cellular processes including protein synthesis, RNA processing, and regulation of gene expression. Reliable identification of the interfaces involved in these interactions is essential for understanding their mechanisms and functional implications, and provides a valuable guide for rational drug discovery and design.
• Experimental determination of interfaces in protein-RNA complexes is time-consuming and expensive, so computational techniques for predicting RNA-binding sites on proteins are valuable. Here we propose a novel family of sequence homology-based methods:
  • HomPRIP uses interface information from putative homologs of a query protein to predict interface residues in the query protein.
  • When no sequence homologs for the query protein can be found, HomPRIP-NB uses a Naïve Bayes (NB) classifier trained on evolutionary information derived from protein sequences in the NCBI nr database to return interface predictions.

http://einstein.cs.iastate.edu/HomPRIP-NB

Method

• NR216 – for analyzing protein interface conservation
• RB199 – for testing the prediction performance of HomPRIP and its combination with an NB classifier
• nr_RNAprot_s2c – for searching for putative sequence homologs using BLASTP

[Flowchart] Query protein sequence → search nr_RNAprot_s2c to find homologous sequences → homologous sequences found?
• Yes: HomPRIP returns predicted interface residues
• No: HomPRIP-NB returns predicted interface residues
(Other labels in the flowchart: Safe zone, Twilight zone, Dark zone)

• Support Vector Machine and Naïve Bayes classifiers were trained using three different features:
  • amino acid identity
  • PSSM profiles
  • smoothed PSSM profiles
  and evaluated using five-fold cross-validation.

Protein-RNA Interface Conservation

An interface conservation score (ICscore) measures the similarity of a homolog's interface residues to those of the query protein. A regression model is used to calculate the ICscore from BLAST sequence alignment statistics.

• Safe zone: a high degree of interface conservation (red data points)
• Twilight zone: moderate conservation of interfaces (yellow and orange data points)
• Dark zone: poor conservation of interfaces (blue data points)

Results & Conclusion

• Performance of HomPRIP is reported only for the 71% of complexes in the RB199 dataset for which homologs could be found; HomPRIP-NB returned predictions for the entire RB199 dataset.

Future Directions

• Ongoing work aims to compare HomPRIP-NB with other publicly available servers that predict RNA-binding sites on proteins (e.g., BindN, PiRaNha, PRIP, RNABindR), using an independent test set.

Acknowledgements

Funding provided by: NIH GM 066387

References

• B.A. Lewis, R.R. Walia, M. Terribilini, J. Ferguson, C. Zheng, V. Honavar, and D. Dobbs. PRIDB: a protein–RNA interface database. Nucleic Acids Research, 39(suppl 1):D277, 2011.
• L.C. Xue, D. Dobbs, and V. Honavar. HOMPPI: A class of sequence homology based protein-protein interface prediction methods. BMC Bioinformatics, 12:244, 2011.
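The decision flow described in the Method panel (use homologs when the search finds any, otherwise fall back to the Naïve Bayes classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Homolog` structure, the majority-vote label transfer, the `min_ic_score` threshold, and the `nb_predict` callable are all hypothetical names and rules introduced here, since the poster does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Homolog:
    """Hypothetical record for one putative homolog returned by the search."""
    interface: List[int]  # 1 = interface residue, 0 = not, already aligned to the query
    ic_score: float       # predicted interface conservation score (ICscore)


def transfer_labels(homologs: Sequence[Homolog]) -> List[int]:
    """Assumed transfer rule: per-residue majority vote, ties count as interface."""
    length = len(homologs[0].interface)
    votes = [sum(h.interface[i] for h in homologs) for i in range(length)]
    return [1 if 2 * v >= len(homologs) else 0 for v in votes]


def predict_interface(
    query: str,
    homologs: Sequence[Homolog],
    nb_predict: Callable[[str], List[int]],
    min_ic_score: float = 0.0,  # hypothetical cutoff, not taken from the poster
) -> Tuple[List[int], str]:
    """Return (per-residue labels, name of the branch that produced them)."""
    usable = [h for h in homologs if h.ic_score >= min_ic_score]
    if usable:
        # "Yes" branch of the flowchart: homology-based prediction.
        return transfer_labels(usable), "HomPRIP"
    # "No" branch: no usable homologs, so the NB classifier answers.
    return nb_predict(query), "HomPRIP-NB"


if __name__ == "__main__":
    found = [Homolog([1, 0, 1], 0.9), Homolog([1, 1, 0], 0.8), Homolog([0, 0, 1], 0.7)]
    stub_nb = lambda seq: [0] * len(seq)  # stand-in for a trained classifier
    print(predict_interface("MKV", found, stub_nb))
    print(predict_interface("MKV", [], stub_nb))
```

Returning the branch name alongside the labels makes it easy to report coverage separately for the homology-based and classifier-based predictions, as the poster does for RB199.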

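The poster lists "smoothed PSSM profiles" as a classifier feature without defining the smoothing. One common reading, assumed here, is that each residue's PSSM row is replaced by the average of the rows in a window centered on that residue. The function name `smooth_pssm` and the default window size are hypothetical, and the two-column matrix in the example stands in for a real 20-column PSSM.

```python
from typing import List


def smooth_pssm(pssm: List[List[float]], window: int = 3) -> List[List[float]]:
    """Average each PSSM row with its neighbors inside a centered window.

    Windows are truncated at the sequence ends, so terminal residues are
    averaged over fewer rows (an assumption; padding is another option).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(pssm)
    smoothed = []
    for i in range(n):
        rows = pssm[max(0, i - half):min(n, i + half + 1)]
        # zip(*rows) iterates column-wise over the rows in the window.
        smoothed.append([sum(col) / len(rows) for col in zip(*rows)])
    return smoothed


if __name__ == "__main__":
    toy = [[0.0, 3.0], [3.0, 0.0], [6.0, 3.0]]
    print(smooth_pssm(toy, window=3))
```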