
Boosting Peptide Identification Performance with Multi-Engine Search

Enhance peptide identification sensitivity with the PepArML meta-search engine, which combines seven search engines, machine learning, and spectral matching. PepArML is publicly available, free of charge, from Georgetown University.


Presentation Transcript


  1. Boosting Peptide Identification Performance by Combining Many Search Engines, Spectral Matching, and Proteotypic and Physicochemical Peptide Properties
     US HUPO 2010
     Prabhakar Gubbala and Nathan J. Edwards, Georgetown University Medical Center

     Introduction
     The PepArML meta-search engine provides:
     • A unified MS/MS search interface for Mascot, X!Tandem, OMSSA, KScore, SScore, MyriMatch, and InsPecT.
     • Search job scheduling on independent large-scale heterogeneous computational grids.
     • Additional features including tryptic digest, peptide physicochemical, and proteotypic [1] properties; spectra and precursor isotope cluster properties, plus retention-time modeling.
     • Spectral match to synthetic spectra using Zhang's KineticModel [2,3].
     • Unsupervised, model-free result combining using machine-learning (PepArML [4]).
     The PepArML meta-search engine improves peptide identification sensitivity, significantly increasing the number of peptide ids at 10% FDR.

     Peptide Identification Meta-Search via Grid-Computing
     • Meta-search with seven search engines: Mascot, Tandem, OMSSA, KScore, SScore, MyriMatch, InsPecT
     • Automatic target & decoy searches
     • Heterogeneous compute resources
     • Secure communication
     • Scales to 250+ simultaneous searches
     • Edwards Lab Scheduler & 80+ CPUs
     • NSF TeraGrid 1000+ CPUs
     • Free, instant registration
     • Simple search description
     • Job management
     • Result combining

     Unified MS/MS Search Interface
     • Automatic search engine configuration and execution, parameterized by:
       - Instrument & proteolytic agent
       - Fixed and variable modifications
       - Protein sequence database & MS/MS spectra file
       - Peptide candidate selection
     • MS/MS Spectra Reformatting
       - Charge and precursor enumeration for peptide candidate selection (for charge & 13C peak correction)
       - Search engine formatting constraints (MGF/mzXML)
       - Consistent MS/MS spectrum identifier tracking
       - Spectrum file "chunking"

     Georgetown University OMICS 17 Protein Mix LCQ MS/MS Dataset
     • Semi-tryptic search of SwissProt
     • 39408 spectra searched ~ 36 times:
       - Target + 2 decoys, 7 engines, 1+ vs 2+/3+ charge
     • 3969 search jobs, weeks of CPU time.
     • Total elapsed time (Mascot bottleneck): < 28 hours.
     • All non-Mascot jobs: < 19 hours.

     [Figure: Feature Rankings by Info. Gain]

     [Figure: PepArML – Evaluation of non-Search Engine Features]

     Conclusions
     • Application of meta-search, grid-computing, and machine-learning can significantly improve the sensitivity of peptide identification.
     • The PepArML meta-search engine is publicly available, free of charge, on-line from: http://edwardslab.bmcb.georgetown.edu

     References
     [1] P. Mallick, M. Schirle, S. S. Chen, M. R. Flory, H. Lee, D. Martin, J. Ranish, B. Raught, R. Schmitt, T. Werner, B. Kuster, and R. Aebersold. "Computational prediction of proteotypic peptides for quantitative proteomics." Nature Biotechnology (2006), 25(1).
     [2] Z. Zhang. "Prediction of low-energy collision-induced dissociation spectra of peptides." Anal. Chem. (2004), 76(14).
     [3] Z. Zhang. "Prediction of Low-Energy Collision-Induced Dissociation Spectra of Peptides with Three or More Charges." Anal. Chem. (2005), 77(19).
     [4] N. Edwards, X. Wu, and C.-W. Tseng. "An Unsupervised, Model-Free, Machine-Learning Combiner for Peptide Identifications from Tandem Mass Spectra." Clinical Proteomics (2009), 5(1).
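The poster mentions spectrum file "chunking" and searches split across target and decoy databases, seven engines, and charge groups, but it does not say how jobs are generated. The following is only a minimal sketch of one way such a job list could be enumerated, not the PepArML implementation. The engine names come from the poster; the database labels, charge-group labels, function names, and chunk size are hypothetical. The sketch takes the full cross product of all settings, so its job counts need not match the figures reported on the poster (~36 searches per spectrum, 3969 jobs).

```python
from itertools import product

# Engine names are taken from the poster; the database and charge-group
# labels below are hypothetical placeholders chosen for this sketch.
ENGINES = ["Mascot", "X!Tandem", "OMSSA", "KScore", "SScore", "MyriMatch", "InsPecT"]
DATABASES = ["target", "decoy-1", "decoy-2"]
CHARGE_GROUPS = ["1+", "2+/3+"]


def chunk_spectra(spectrum_ids, chunk_size):
    """Split a list of spectrum identifiers into consecutive chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [spectrum_ids[i:i + chunk_size]
            for i in range(0, len(spectrum_ids), chunk_size)]


def enumerate_jobs(spectrum_ids, chunk_size):
    """Build one job record per (engine, database, charge group, chunk)."""
    chunks = chunk_spectra(spectrum_ids, chunk_size)
    jobs = []
    for engine, database, charge in product(ENGINES, DATABASES, CHARGE_GROUPS):
        for index, chunk in enumerate(chunks):
            jobs.append({
                "engine": engine,
                "database": database,
                "charge": charge,
                "chunk": index,
                # Keeping the original identifiers lets results from different
                # engines be matched back to the same spectrum later.
                "spectrum_ids": chunk,
            })
    return jobs


if __name__ == "__main__":
    ids = [f"scan-{n}" for n in range(1, 11)]  # ten made-up spectrum ids
    jobs = enumerate_jobs(ids, chunk_size=4)
    print(len(jobs), "jobs")
```

Each job record carries its own spectrum identifiers, so jobs can be handed to independent compute resources and their results merged afterwards by identifier. That is one plausible reading of the poster's "consistent MS/MS spectrum identifier tracking".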
