TAP Hunter: A SVM-based system for predicting variable length TAP-binding peptides. Lam Tze Hau, Ren Ee Chee, Victor Tong
Background Processing & presentation of endogenous antigen in MHC class I pathway. Abele R. and Tampe R., 1999
Background • Transporter associated with antigen processing (TAP) • Plays a significant role in the major histocompatibility complex (MHC) class I processing and presentation pathway. • Responsible for the translocation of cytosolic peptides for the peptide-loading of MHC class I molecules in the ER.
Background • TAP-peptide binding preferences: • The length of the peptide • N & C terminals of the peptide [Figure: nonamer peptide sequence, N-terminal to C-terminal]
Current work on TAP-binding prediction • Artificial neural network (ANN) model trained on a set of 9-mer peptides of known TAP binding affinity (Daniel et al., 1998). • Stabilized matrix method (Peters et al., 2003). • Additive scoring function method (Doytchinova I. et al., 2004). • ANN and hidden Markov models (Zhang G.L. et al., 2006). • Quantitative matrix-based and support vector machine (SVM) prediction methods (Bhasin M. and Raghava G.P.S., 2003).
Objectives • Identify variable-length peptides that bind to TAP. • Develop a web-based computational system for predicting TAP-binding peptides.
Methods • Novel encoding scheme based on the representations of TAP peptide fragments. • Support vector machine (SVMLight, Joachims, 1999) as the prediction engine. The models are generated with linear, polynomial and radial basis function kernels.
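For reference, the three kernel families named on this slide can be sketched in a few lines. This is a minimal illustration of the standard formulas only, not SVMLight's code or interface; the function names and the default values for `degree`, `scale`, `offset` and `gamma` are assumptions chosen for the example.

```python
import math

def linear_kernel(x, y):
    # Plain dot product of two equal-length feature vectors.
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, scale=1.0, offset=1.0):
    # (scale * <x, y> + offset) ** degree; defaults are illustrative.
    return (scale * linear_kernel(x, y) + offset) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2); gamma default is illustrative.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

Each function takes two numeric vectors of the same length, such as two encoded peptides, and returns a single similarity value.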
Methods • Novel encoding scheme based on the representations of TAP peptide fragments. • Sparse encoding. • Sparse encoding + physicochemical properties.
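Sparse encoding is commonly implemented as a one-hot vector of 20 entries per residue. The sketch below assumes that convention and an alphabetical ordering of the 20 standard amino acids; the slides do not state the ordering TAP Hunter uses, and the physicochemical descriptors are left out because their values are not given here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 standard residues

def sparse_encode(peptide):
    """Return a flat one-hot vector with 20 entries per residue."""
    vector = []
    for residue in peptide.upper():
        # str.index raises ValueError for non-standard residue letters.
        index = AMINO_ACIDS.index(residue)
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[index] = 1
        vector.extend(one_hot)
    return vector
```

Under this convention a nonamer becomes a 180-dimensional vector containing exactly nine ones.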
Methods • Truncation analysis [Figure: nonamer positions P1 P2 P3 P4 P5 P6 P7 P8 P9, 8-mer fragments]
Methods • Truncation analysis [Figure: nonamer positions P1 P2 P3 P4 P5 P6 P7 P8 P9, 7-mer fragments]
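The figures for these two slides are not recoverable, so exactly which fragments were evaluated is unknown. As one possible reading, the sketch below assumes the fragments are contiguous windows of the nonamer; the helper name `contiguous_fragments` is hypothetical.

```python
def contiguous_fragments(peptide, length):
    """List (start_position, fragment) pairs for every contiguous window.

    Positions are 1-based to match the P1..P9 labels on the slides.
    Assumption: fragments are contiguous; the slides do not confirm this.
    """
    last_start = len(peptide) - length
    return [
        (start + 1, peptide[start:start + length])
        for start in range(last_start + 1)
    ]
```

With this reading, a nonamer yields two 8-mer fragments (starting at P1 and P2) and three 7-mer fragments (starting at P1, P2 and P3).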
Methods • Dataset: 493 nonamer peptides, split into a training set (370 nonamer peptides, used for 5-fold cross-validation) and a test set (123 nonamer peptides). • Performance measurement • The area under the receiver operating characteristic curve (Aroc) is used to evaluate the derived models.
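The evaluation setup can be sketched as follows, assuming binary binder/non-binder labels and a real-valued score per peptide. The AUC is computed from its pairwise-ranking definition, and the fold assignment is a simple round-robin split; both helpers are illustrative and are not taken from the TAP Hunter implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a random positive outscores a random
    negative, with ties counted as one half.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def k_fold_indices(n_items, k=5):
    """Assign item indices to k folds in round-robin order."""
    return [list(range(fold, n_items, k)) for fold in range(k)]
```

For the 370-peptide training set, `k_fold_indices(370)` gives five folds of 74 peptides each; each fold is held out once while the model is trained on the remaining four.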
Results • Encoding the peptide fragments at the N & C terminals, which are known to influence TAP binding, improves the predictive ability of the model.
Results • The model that represents the first 3 amino acids and the last amino acid of the TAP peptide sequence, M(123____9), gives the best Aroc value. • It achieves a high Aroc of 0.90 for both the 5-fold cross-validation and the independent test set.
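Because M(123____9) uses only the first three residues and the last residue, the same representation can be extracted from a peptide of any length. The sketch below shows that extraction step; the helper name `terminal_fragment` and its parameters are hypothetical, and the slides do not describe how TAP Hunter implements this internally.

```python
def terminal_fragment(peptide, n_term=3, c_term=1):
    """Keep the first n_term and the last c_term residues of a peptide."""
    if len(peptide) < n_term + c_term:
        raise ValueError("peptide is shorter than the requested fragment")
    # Slice from an explicit index so that c_term=0 yields an empty tail.
    return peptide[:n_term] + peptide[len(peptide) - c_term:]
```

A 9-mer and an 11-mer both reduce to four residues, so a model trained on nonamers can score either one with the same input dimensionality.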
Prediction on variable-length peptides • Peptides (other than nonamers) that are known to bind to HLA-A1, HLA-A3, HLA-A11, HLA-A24 and HLA-B27 are retrieved from the Immune Epitope Database and Analysis Resource (IEDB). • These HLA Class I molecules depend on the TAP pathway for peptide presentation.
TAP Hunter vs. other TAP prediction methods • Binding nonamers to TAP • A set of 38 known TAP-dependent HLA-A1, HLA-A3, HLA-A11, HLA-A24 and HLA-B27 nonamer epitopes is collected from the MHCBN database. • Out of the 38 epitopes, 14 bind to HLA-A1, 9 bind to HLA-A3, 8 bind to HLA-A11, 5 bind to HLA-A24 and 2 bind to HLA-B27. • Non-binding nonamers to TAP • 12 known epitopes (Weinzierl et al., 2008) that bind to HLA-A*0201 and HLA-B*5101; these alleles exhibit partial dependence on TAP for their peptide presentation. • These epitopes are found only on the TAP-deficient LCL721.174 cell line, where they are presented to the HLA Class I molecules in the absence of TAP. • These 50 epitopes are run against publicly available web servers (TAPPred and NetCTL) as well as TAP Hunter.
Conclusion • Encoding the TAP peptide fragments that are known to influence TAP-peptide binding in the model representation improves the predictive ability of the models. • This method allows prediction for variable-length peptides with satisfactory results.