Novel FDR Estimation for PepArML – A Meta-Search Peptide Identification Platform
David Retz and Nathan J. Edwards, Georgetown University Medical Center
Introduction

The PepArML meta-search engine provides:
• A unified MS/MS search interface for Mascot, X!Tandem, OMSSA, K-Score, S-Score, MyriMatch, and InsPecT + MSGF.
• Search job scheduling and execution on cloud, grid, and cluster computing resources.
• Unsupervised, model-free result combining using machine learning (PepArML [1]).
• Additional features including tryptic digest, peptide physicochemical, and proteotypic properties; spectra and precursor isotope cluster properties; plus retention-time modeling.

The PepArML meta-search engine improves peptide identification sensitivity, significantly increasing the number of peptide ids at 10% FDR.

PepArML Combiner's Internal Decoys

• PepArML uses two sets of decoy results:
  • External (reversed target) for final FDR estimation across all methods.
  • Internal (shuffled target) to select the initial set of high-quality peptide identifications and calibrate prediction confidences.
• Internal decoys are robust and reliable, but
• Internal decoys increase search time by 50%.

• Replace initial peptide selection heuristic (D0).
• Replace prediction calibration (D1).
• Eliminate internal decoys!
• Estimate FDR directly from target scores like PeptideProphet [2], but
• Need a new approach as target scores are not easily deconvoluted.

Figure: Distribution of search engine agnostic metrics for unanimous target peptide identifications. Panels: min # decoy hits, min z-score, min quantile. SPMD [3] Mix1/LTQ-FT dataset. [+D0] Min # decoy hits == 0; [-D0] Min z-score < -1.5.

Figure: Distribution of PepArML prediction confidence: Target, Decoy [+D1], and Sampled Target [-D1]. SPMD [3] Mix1/LTQ-FT. Sampled Target [-D1]: 75% of rank 1 non-training "false" target ids, uniformly sampled, rescaled to # of spectra.

Figure: Concordance between decoy [+D1] and sampled target [-D1] FDR estimates for iterations 1-3. Panels: LCQ, QSTAR, LTQ-FT.

Figure: Comparison of search engines, voting with decoy tiebreaker heuristic, and PepArML with and without internal decoys. SPMD [3] Mix1 datasets.

Conclusions

• Meta-search, grid-computing, and machine-learning can significantly improve peptide identification sensitivity.
• PepArML can run successfully with just one decoy search.
• Apply similar technique for external decoys?
• PepArML meta-search engine is publicly available, free of charge: edwardslab.bmcb.georgetown.edu

References

[1] Edwards, Wu, Tseng. Clinical Proteomics (2009), 5(1).
[2] Keller, Nesvizhskii, Kolker, Aebersold. Anal. Chem. (2002), 74(20).
[3] Klimek et al. JPR (2008), 7(1).

Acknowledgements

NJE supported, in part, by NIH/NCI/CPTI grant CA126189.
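
Illustrative sketch: the poster reports peptide ids at 10% FDR and compares decoy-based [+D1] with sampled-target [-D1] FDR estimates, but does not show the computation. The Python below is a minimal, generic sketch of estimating FDR from a null score set (decoy hits, or possibly sampled "false" target ids) at a confidence threshold. It is not PepArML's code. The function names, the `null_scale` parameter, and the reading of "rescaled to # of spectra" as a multiplicative factor on null counts are all assumptions made for illustration.

```python
def estimate_fdr(target_scores, null_scores, threshold, null_scale=1.0):
    """Estimate the FDR among target ids with score >= threshold.

    null_scores: scores of the null set (e.g. decoy hits). Hypothetical input.
    null_scale:  assumed rescaling factor applied to the null count, for use
                 when the null set is a sample rather than a full decoy search.
    """
    n_target = sum(s >= threshold for s in target_scores)
    n_null = sum(s >= threshold for s in null_scores)
    if n_target == 0:
        return 0.0  # no accepted ids, so nothing to be falsely discovered
    # Null hits above threshold approximate false target hits above threshold.
    return min(1.0, null_scale * n_null / n_target)


def ids_at_fdr(target_scores, null_scores, max_fdr=0.10, null_scale=1.0):
    """Return (threshold, n_ids) for the loosest threshold whose estimated
    FDR is at most max_fdr, or (None, 0) if no threshold qualifies."""
    best = (None, 0)
    # Walk thresholds from strictest to loosest; the estimate need not be
    # monotone, so keep the last (loosest) threshold that still qualifies.
    for t in sorted(set(target_scores), reverse=True):
        if estimate_fdr(target_scores, null_scores, t, null_scale) <= max_fdr:
            best = (t, sum(s >= t for s in target_scores))
    return best
```

Called with lists of prediction confidences, `ids_at_fdr(target, decoy, 0.10)` gives the number of target ids accepted at an estimated 10% FDR. Running it once with decoy scores and once with sampled target scores (with a suitable `null_scale`) is one way the two estimates could be compared.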