Improving Peptide Probability Modeling in Scaffold 4
Brian C. Searle, brian.searle@proteomesoftware.com
Scaffold Users Meeting, 2013
Creative Commons Attribution
Scaffold 4 Improvements
• Probability Estimation using LFDR
• Target/Decoy Classification of multiple scores
• Delta Mass Error Modeling Improvements
• Requires Target/Decoy analysis (1:1 … 1:10)
[Figure labels: “Incorrect”, “Correct”]
[Figure axes: Number of Identified Proteins; Protein-Level False Discovery Rate]
[Figure: scores XCorr, DeltaCN, % Ions Identified, …]
Naïve Bayes Classifier
• Trained on each data set
• Simple (can calculate with a formula, no magic!)
• Robust to over-fitting
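The "can calculate with a formula" point can be illustrated with a minimal sketch of a two-class Gaussian Naïve Bayes classifier. This is not Scaffold's implementation: the Gaussian likelihoods, the function names (`fit_gaussian`, `train`, `prob_correct`), and the toy feature rows (XCorr, DeltaCN, % ions identified) are all assumptions made for illustration.

```python
import math

def fit_gaussian(values):
    """Return (mean, variance) of a sequence, with a small variance floor."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)

def log_gaussian(x, mean, var):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def train(rows, labels):
    """Fit one Gaussian per feature and per class (True = correct, False = incorrect)."""
    model = {}
    for cls in (True, False):
        members = [r for r, l in zip(rows, labels) if l == cls]
        prior = len(members) / len(rows)
        params = [fit_gaussian(col) for col in zip(*members)]
        model[cls] = (prior, params)
    return model

def prob_correct(model, row):
    """Posterior probability of the 'correct' class, treating features as independent."""
    log_post = {}
    for cls, (prior, params) in model.items():
        log_post[cls] = math.log(prior) + sum(
            log_gaussian(x, m, v) for x, (m, v) in zip(row, params)
        )
    top = max(log_post.values())  # subtract the max for numerical stability
    num = math.exp(log_post[True] - top)
    den = num + math.exp(log_post[False] - top)
    return num / den

# Hypothetical training rows: (XCorr, DeltaCN, % ions identified)
rows = [
    (3.5, 0.40, 70.0), (3.1, 0.35, 65.0), (2.9, 0.30, 60.0),
    (1.2, 0.05, 20.0), (1.5, 0.08, 25.0), (1.0, 0.02, 30.0),
]
labels = [True, True, True, False, False, False]
model = train(rows, labels)
```

Calling `prob_correct(model, (3.2, 0.33, 66.0))` returns a value close to 1, while a row resembling the low-scoring examples returns a value close to 0. Because each feature contributes one independent term, adding a new score only adds one more term to the sum.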
[Figure axes: Number of Identified Proteins; Protein-Level False Discovery Rate]
[Figure labels: Probability the ID is Correct; Probability the ID is Wrong]
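One common way to turn target/decoy counts into a per-score "probability the ID is wrong" is a binned local FDR (LFDR) estimate. The sketch below is an illustration under stated assumptions, not Scaffold's algorithm: the histogram binning, the function name `local_fdr`, and the `decoys_per_target` parameter (meant to cover decoy databases between 1 and 10 times the target size) are hypothetical.

```python
def local_fdr(target_scores, decoy_scores, bin_edges, decoys_per_target=1.0):
    """Per-bin LFDR: expected incorrect target hits divided by observed target hits.

    Returns None for bins that contain no target hits.
    """
    def histogram(scores):
        counts = [0] * (len(bin_edges) - 1)
        for s in scores:
            for i in range(len(counts)):
                if bin_edges[i] <= s < bin_edges[i + 1]:
                    counts[i] += 1
                    break
        return counts

    target_counts = histogram(target_scores)
    decoy_counts = histogram(decoy_scores)
    result = []
    for t, d in zip(target_counts, decoy_counts):
        if t == 0:
            result.append(None)
        else:
            # Scale decoy counts down when the decoy database is larger than the target one.
            result.append(min(1.0, (d / decoys_per_target) / t))
    return result

# Hypothetical scores, for illustration only
targets = [0.5, 0.8, 1.2, 1.4, 2.1, 2.5, 2.7, 3.1, 3.4, 3.8]
decoys = [0.4, 0.9, 1.1, 1.6, 2.2]
edges = [0, 1, 2, 3, 4]
lfdr = local_fdr(targets, decoys, edges)
```

For each bin, the probability that an identification is correct is then `1 - lfdr[i]`. A real implementation would likely smooth the densities rather than use raw bin counts.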
[Figure axes: Number of Identified Proteins; Protein-Level False Discovery Rate]
1% Peptide FDR > 10% Protein FDR?!?
[Figure axes: Number of Identified Proteins; Protein-Level FDR]
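A small worked calculation shows how a 1% peptide FDR can correspond to a much higher protein FDR. Correct peptides pile up on a limited set of real proteins, while incorrect peptides tend to scatter across the database. The sketch assumes the worst case, in which every incorrect peptide lands on its own otherwise unidentified protein; the function name and all numbers are hypothetical.

```python
def protein_fdr(n_peptide_ids, peptide_fdr, n_true_proteins):
    """Protein-level FDR if each incorrect peptide creates one false protein."""
    wrong_peptides = n_peptide_ids * peptide_fdr
    false_proteins = wrong_peptides  # worst-case assumption: no two share a protein
    return false_proteins / (n_true_proteins + false_proteins)

# 10,000 peptide IDs at 1% FDR, with the correct ones mapping to 900 proteins
example = protein_fdr(10000, 0.01, 900)
```

Here 100 incorrect peptides add 100 false proteins to 900 real ones, so about one protein in ten is wrong even though only one peptide in a hundred is.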
New Search Engines?
• Difficult to add new search engines with PeptideProphet (new seeds)
• Easy to add with Naïve Bayes / LFDR
• mzIdentML interchange (HUPO standard)
New Search Engines in Scaffold 4
• Peaks
• Byonic
• Myrimatch (Tabb Lab)
• SQID (Wysocki Lab)
• MS-GF+ (Pevzner Lab)
• MS-Amanda (Mechtler Lab, PD)
• ... Any engine with decoys & mzIdentML!
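To illustrate why decoys plus mzIdentML are enough, the sketch below splits identification scores into target and decoy lists using the `isDecoy` attribute on `PeptideEvidence`. It uses only Python's standard `xml.etree.ElementTree`. The XML is a hand-written fragment, not a schema-valid file; the mzIdentML 1.1 namespace, the score name `hypothetical:score`, and the function name `split_target_decoy` are assumptions for illustration and do not describe how Scaffold loads files.

```python
import xml.etree.ElementTree as ET

MZID_NS = "http://psidev.info/psi/pi/mzIdentML/1.1"  # assumed: mzIdentML 1.1
NS = {"mz": MZID_NS}

# Hand-written fragment with only the elements this sketch needs
FRAGMENT = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <PeptideEvidence id="PE1" isDecoy="false"/>
  <PeptideEvidence id="PE2" isDecoy="true"/>
  <SpectrumIdentificationItem id="SII1" rank="1">
    <PeptideEvidenceRef peptideEvidence_ref="PE1"/>
    <cvParam name="hypothetical:score" value="3.2"/>
  </SpectrumIdentificationItem>
  <SpectrumIdentificationItem id="SII2" rank="1">
    <PeptideEvidenceRef peptideEvidence_ref="PE2"/>
    <cvParam name="hypothetical:score" value="1.1"/>
  </SpectrumIdentificationItem>
</MzIdentML>"""

def split_target_decoy(xml_text, score_name):
    """Return (target_scores, decoy_scores) for the named cvParam score."""
    root = ET.fromstring(xml_text)
    is_decoy = {
        pe.get("id"): pe.get("isDecoy") == "true"
        for pe in root.iter("{%s}PeptideEvidence" % MZID_NS)
    }
    targets, decoys = [], []
    for item in root.iter("{%s}SpectrumIdentificationItem" % MZID_NS):
        refs = [r.get("peptideEvidence_ref")
                for r in item.findall("mz:PeptideEvidenceRef", NS)]
        score = next((float(p.get("value"))
                      for p in item.findall("mz:cvParam", NS)
                      if p.get("name") == score_name), None)
        if score is None or not refs:
            continue
        # Count a hit as decoy only if every referenced evidence is a decoy.
        if all(is_decoy.get(r, False) for r in refs):
            decoys.append(score)
        else:
            targets.append(score)
    return targets, decoys

targets, decoys = split_target_decoy(FRAGMENT, "hypothetical:score")
```

The two resulting lists are the kind of input that a target/decoy probability model needs, regardless of which search engine produced the file.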
Scaffold 4 Improvements
• New Naïve Bayes / LFDR Probabilities
  • Probability Estimation using LFDR
  • Target/Decoy Classification
  • Delta Mass Error Modeling
  • “Next generation” search engine interpretation
• New mzIdentML File Loading
  • Several newly supported search engines
  • Any search engine with decoys