Spectral matching identifies peptides by comparing spectra with libraries of identified spectra

HMMatch: Peptide Identification by Spectral Matching of Tandem Mass Spectra Using Hidden Markov Models

US HUPO

Xue Wu (1), Chau-Wen Tseng (1), Nathan J. Edwards (2)
(1) Department of Computer Science, University of Maryland, College Park
(2) Dept. of Biochemistry and Mol. & Cell. Biology, Georgetown Medical Center

Introduction
• Spectral matching identifies peptides by comparing spectra with libraries of identified spectra
• We present HMMatch, a new approach to peptide identification that summarizes many MS/MS spectra for a peptide using Hidden Markov Models (HMM)
• HMMs can capture both consensus and variation in peak intensity found in spectra for a peptide
• As a result, HMMatch can confidently identify many spectra for a peptide that other tools miss

HMMs Used For Protein Families
• Protein Family
  • Consensus & variation found in amino acid sequence
  • Sequence logos capture variation
  • HMMs capture both variation & pattern of gaps in sequence
  • HMMs used by Pfam to provide high sensitivity

HMMs Capture Consensus and Variation in MS/MS Spectra
• MS/MS Spectra
  • Two example spectra for peptide DLATVYVDVLK
  • Different peaks present
  • Different peak intensities
• Intensity Box Plot
  • Computed from multiple spectra for peptide
  • Frequency & intensities of peaks vary
  • Can capture using HMMs!
[Figure labels: Intensity; Peak Occurrence Frequency; m/z ratio; Peak Position]

Hidden Markov Models (HMM)
[Figure: HMM state diagram with states Begin, b1, y1, b2, y2, End, D1-D4, and I0-I4]
• HMM constructed with states representing ion peaks, insertions, and deletions
• Spectra preprocessed via peak normalization, noise filtering, intensity scaling, discretization
• Trained w/ high confidence spectra to recognize ion peaks and correlations in peak intensity patterns
• Outputs Viterbi score for each spectrum

Peptide Identification: HMM Viterbi Score & p-value
• Viterbi Scores
  • Compare spectra for target peptide (train & test) vs. non-target peptides (other)
  • Confidence based on Tandem E-values
  • Identifies test spectra
  • Rejects other spectra
  [Figure axes: # spectra vs. Viterbi Score]
• P-values
  • HMMatch computes statistical significance using synthetic spectra
  • Many spectra have low Tandem E-values to target peptide (test), non-target peptide (other), or no peptide (unknown)
  • HMMatch can compute good p-values for many such spectra
  [Figure axes: # spectra vs. -log(p-value)]

Peptide ID Case Studies
• Two spectra for peptide DLATVYVDVLK
• Scores for Tandem, Mascot, NIST MS Match
• Intensity Variation
  • Tandem: 0.0011, HMM: 7.35e-12
  • NIST: 0.813, Mascot: 1.42
  • Extra small peaks reduce search engine E-values
  • NIST MS Search score degraded
• Second spectrum
  • HMM: 9.87e-12, NIST: 0.994
  • Tandem: 0.15, Mascot: 7.52
  • Peaks exceeding m/z tolerance significantly reduced search engine E-values

Model Extrapolation To Nearby Related Peptides
• Extrapolation
  • Compute HMMs for nearby protein sequence
  • Compare p-values for extrapolated HMM vs. trained HMM
  • Extrapolated HMM scores correlate well
  • Can find novel peptides
[Figure axes: Extrapolated HMM vs. Trained HMM]

Observations
• HMMatch confidently identified 3537 spectra (p-value < 10^-5) with low search engine scores (Tandem & Mascot E-values > 0.05)
• NIST MS Search identified 673 (score > 0.9)
• Accounting for intensity variation provides additional useful information in practice
• HMMatch complements existing MS/MS search engines & spectral matching methods
• Growing size of MS/MS libraries can improve applicability & accuracy of HMMatch
• Reference: Wu et al., J of Computational Biology, 14(8):1025-1043, 2007

Acknowledgments
• This research was supported by
  • NIH/NCI grant CA126189
  • USDA cooperative agreement 5812757342
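
The poster describes scoring each discretized spectrum with a Viterbi score and then judging significance against synthetic spectra. The following is a minimal Python sketch of that idea, not the HMMatch implementation. Everything in it is an assumption made for illustration: the two-state toy model ("ion" and "noise") stands in for the poster's ion, insertion, and deletion states, the three symbols stand for absent, low, and high intensity bins, the probabilities are invented, and the synthetic spectra are simply uniformly random symbols, since the poster does not say how its synthetic spectra are generated.

```python
import math
import random

# Hypothetical toy model: symbols 0/1/2 = absent/low/high discretized intensity.
START = {"ion": 0.5, "noise": 0.5}
TRANS = {"ion": {"ion": 0.7, "noise": 0.3},
         "noise": {"ion": 0.4, "noise": 0.6}}
EMIT = {"ion": {0: 0.1, 1: 0.3, 2: 0.6},
        "noise": {0: 0.6, 1: 0.3, 2: 0.1}}


def viterbi_log_score(obs, start, trans, emit):
    """Log-probability of the single best state path for the observations.

    Assumes every probability in the model is strictly positive.
    """
    states = list(start)
    # best[s] = log-probability of the best path that ends in state s
    best = {s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}
    for symbol in obs[1:]:
        best = {
            s: max(best[r] + math.log(trans[r][s]) for r in states)
            + math.log(emit[s][symbol])
            for s in states
        }
    return max(best.values())


def empirical_p_value(score, null_scores):
    """Share of null scores at least as high as score, with add-one correction."""
    hits = sum(1 for x in null_scores if x >= score)
    return (hits + 1) / (len(null_scores) + 1)


# Usage: score one made-up query spectrum against 1000 random "synthetic" ones.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
query = [2, 2, 0, 2, 1, 2]
null_scores = [
    viterbi_log_score([rng.randrange(3) for _ in query], START, TRANS, EMIT)
    for _ in range(1000)
]
p_value = empirical_p_value(
    viterbi_log_score(query, START, TRANS, EMIT), null_scores
)
print(p_value)
```

Working in log space avoids numerical underflow on long observation sequences. The add-one correction keeps the empirical p-value above zero, which also means its smallest reachable value is limited by the number of synthetic spectra scored.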

