
Van Hai Van, Cao Thi Ngoc Phuong, Tran Linh Thuoc

Sixth International Conference on Bioinformatics (InCoB2007). Training and applying hidden Markov models and support vector machines for prediction of T-cell epitopes. Van Hai Van, Cao Thi Ngoc Phuong, Tran Linh Thuoc, Faculty of Biology, University of Natural Sciences, VNU-HCMC, Vietnam.


Presentation Transcript


  1. Sixth International Conference on Bioinformatics (InCoB2007). Training and applying hidden Markov models and support vector machines for prediction of T-cell epitopes. Van Hai Van, Cao Thi Ngoc Phuong, Tran Linh Thuoc. Faculty of Biology, University of Natural Sciences, VNU-HCMC, Vietnam

  2. Epitope prediction. “Epitope is the portion of an antigen that is recognized by the antigen receptor on lymphocytes.” Molecular Biology. Epitope prediction: computational methods aid the development of epitope-based vaccines against human pathogens for which no vaccines currently exist. http://www.scripps.edu/newsandviews/e_20050228/hiv.html

  3. T-cell epitope prediction • T-cell epitopes are a subset of MHC-binding peptides, so predicting which peptides bind to MHC is essential for the design of peptide-based vaccines • HLA-A0201 • Sequence • Binding motifs • Artificial neural networks • Quantitative matrices • Hidden Markov models • Support vector machines • Decision tree

  4. HMMs & SVMs. SVMs (Support Vector Machines): learning machines that find the optimal separating hyperplane. HMMs (Hidden Markov Models): statistical models that can capture complex relationships in data sets.

  5. Epitope prediction for dengue virus • Tropical disease • Dengue fever • Dengue hemorrhagic fever • Dengue shock syndrome • Hypotheses of pathogenesis • Antibody-dependent enhancement • Virus virulence • No dengue vaccine is available. In our research: • Develop a procedure for automatically building T-cell epitope prediction models • Find in silico candidates for multivalent vaccines against the 4 types of dengue virus

  6. Building models for predicting T-cell epitopes & applying these models to dengue virus

  7. Building effective prediction models? The predictive ability of HMM and SVM models depends on: • The experimentally determined peptides binding to MHC molecules • The partition of the peptides into a training set and a testing set • The encoding method. This calls for a system that easily and quickly finds the best prediction model when the type of MHC molecule and the quantity of binding peptides change

  8. Processing MHC-binding experimental peptides

  9. Create training and testing sets

  10. Training & testing procedure: HMMs (HMMER), SVMs (SVM_light)

  11. Experiment 1

  12. Result of the training by HMMs. HMM.7.136: AROC = 0.914. Parameters chosen from HMM.7.136, at the point E = 3.4, S = -8.5: SE = 0.91, SP = 0.86, AROC = 0.885
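
The slides report SE, SP, and AROC without defining them. Assuming they stand for sensitivity, specificity, and area under the ROC curve (the usual reading, but not stated in the transcript), the following minimal sketch shows how such figures could be computed from model scores. The scores, labels, and threshold are hypothetical illustration values, not data from the study, and the function names are made up for this example.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (SE, SP) when peptides scoring >= threshold are called binders."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def aroc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (binder, non-binder) pairs in which the binder scores higher.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores: label 1 = experimental binder, 0 = non-binder.
scores = [-5.0, -7.5, -9.0, -8.0, -10.0, -12.0]
labels = [1, 1, 1, 0, 0, 0]

se, sp = sensitivity_specificity(scores, labels, threshold=-8.5)
area = aroc(scores, labels)
print(f"SE={se:.2f} SP={sp:.2f} AROC={area:.3f}")
```

Moving the threshold trades sensitivity against specificity, which is presumably why the slide reports one chosen operating point alongside the AROC.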

  13. Result of the training by SVMs. With Blosum-62 encoding, data set SVM.7.blo62.46: SE = 0.83, SP = 0.90, AROC = 0.87. AROC range by encoding: • Binary encoding: 0.42 to 0.77 • Blosum-62 encoding: 0.47 to 0.87 • Chemical-physical encoding: 0.41 to 0.71
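
The transcript does not spell out how the encodings are built. As a minimal sketch, assuming "binary encoding" means the common scheme in which each residue becomes a 20-element indicator vector (so a 9-mer becomes 180 features), a peptide could be turned into an SVM_light-style input line as follows. The function names, the residue ordering, and the example peptide are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed residue order; any fixed order works


def binary_encode(peptide):
    """Encode each residue as a 20-element indicator vector and concatenate."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    vector = []
    for residue in peptide:
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[index[residue]] = 1
        vector.extend(one_hot)
    return vector


def to_svmlight_line(label, vector):
    """Format a sparse example as '<target> <feature>:<value> ...'.
    Feature numbers start at 1 and zero-valued features are omitted."""
    features = " ".join(f"{i}:{v}"
                        for i, v in enumerate(vector, start=1) if v != 0)
    return f"{label:+d} {features}"


peptide = "GILGFVFTL"           # example 9-mer used only for illustration
encoded = binary_encode(peptide)
line = to_svmlight_line(+1, encoded)
print(len(encoded), line)
```

A Blosum-62 encoding would follow the same shape, replacing each indicator vector with the residue's row of the substitution matrix.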

  14. Experiment 2

  15. Result of the training by HMMs

  16. Training in 6-amino acid homologous groups. HMM.6.78: AROC = 0.883. Parameters of HMM.6.78, at the point E = 42, S = -9.2: SE = 0.91, SP = 0.84, AROC = 0.875

  17. Result of the training by SVMs

  18. Legend: binary encoding; Blosum-62 encoding; binary-Blosum-62 encoding. Training in 7-amino acid homologous groups. At SVM.2.7.85: SE = 0.93, SP = 0.86, AROC = 0.894

  19. Epitope prediction procedure for dengue virus • Do multiple sequence alignment • Extract consensus sequences of at least 9 amino acids • Create overlapping 9-mer sequences • Predict peptides binding to MHC with the HMM profile or the SVM model
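
The step that creates overlapping 9-mers can be sketched as a sliding window over each consensus sequence. This is a minimal illustration, assuming a step of one residue between windows (the transcript does not state the step size). The function name and the input sequence are hypothetical.

```python
def overlapping_kmers(sequence, k=9):
    """Return (start, peptide) for every window of length k, moving
    one residue at a time. Sequences shorter than k yield no windows."""
    return [(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


consensus = "ACDEFGHIKLMN"  # hypothetical 12-residue consensus segment
windows = overlapping_kmers(consensus)
for start, peptide in windows:
    print(start, peptide)
```

Each resulting 9-mer would then be scored by the trained HMM profile or SVM model.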

  20. Result of epitope prediction (prediction of peptide binding to HLA-A0201) for Experiment 1 and Experiment 2: join overlapping 9-amino acid peptides predicted to bind HLA-A0201 molecules
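
Joining overlapping predicted binders can be sketched as an interval merge over the start positions of the 9-mers called positive. This sketch assumes two peptides are joined only when they share at least one residue; the slides do not specify the exact rule. The sequence, positions, and function name are hypothetical.

```python
def join_overlapping(sequence, binder_starts, k=9):
    """Merge predicted k-mers that share residues into longer regions.
    Returns (start, end, subsequence) with end exclusive."""
    regions = []
    for start in sorted(binder_starts):
        end = start + k
        if regions and start < regions[-1][1]:
            # Overlaps the previous region: extend it.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [(s, e, sequence[s:e]) for s, e in regions]


sequence = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"  # hypothetical consensus
joined = join_overlapping(sequence, binder_starts=[0, 2, 15])
print(joined)
```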

  21. Result of prediction • The HMM profile is stable, and its predictive ability increases when additional data sets are available. • The SVM model performs well, but its predictive ability decreases as the amount of training data increases.

  22. http://www.biology.hcmuns.edu.vn/epitope

  23. Conclusion • Successfully built a system for training hidden Markov models and support vector machines • Generating training and testing data by separating the data set into homologous groups gives good results • Consensus epitopes for the 4 types of dengue virus could be predicted from data on peptides binding to HLA-A0201

  24. Future plans • Try other kernels for the SVM method • Survey other encoding methods for sequences of flexible length • Survey other methods for classifying MHC data into homologous groups • Automate the collection and updating of MHC-binding peptide data from databases

  25. Thank you very much!
