1 / 1

Spectral matching identifies peptides by comparing spectra with libraries of identified spectra

D1. D2. D3. D4. I0. I1. I2. I4. I3. Begin. b1. y1. b2. y2. End. HMMatch : Peptide Identification by Spectral Matching of Tandem Mass Spectra Using Hidden Markov Models. US HUPO.

italia
Download Presentation

Spectral matching identifies peptides by comparing spectra with libraries of identified spectra

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. D1 D2 D3 D4 I0 I1 I2 I4 I3 Begin b1 y1 b2 y2 End HMMatch: Peptide Identification by Spectral Matching of Tandem Mass Spectra Using Hidden Markov Models US HUPO Xue Wu1, Chau-Wen Tseng1,Nathan J. Edwards21Department of Computer Science, University of Maryland, College Park2Dept. of Biochemistry and Mol. & Cell. Biology, Georgetown Medical Center Introduction HMMs Capture Consensus and Variation in MS/MS Spectra Peptide ID Case Studies • Two spectra for peptide DLATVYVDVLK • Scores for Tandem, Mascot, NIST MS Match Intensity • MS/MS Spectra • Two example spectra for peptide DLATVYVDVLK • Different peaks present • Different peak intensities • Intensity Box Plot • Computed from multiple spectra for peptide • Frequency & intensities of peaks vary • Can capture using HMMs! Peak Occurrence Frequency • Spectral matching identifies peptides by comparing spectra with libraries of identified spectra • We present HMMatch, a new approach to peptide identification that summarizes many MS/MS spectra for a peptide using Hidden Markov Models (HMM) • HMMs can capture both consensus and variation in peak intensity found in spectra for a peptide • As a result, HMMatch can confidently identify many spectra for a peptide that other tools miss Intensity Variation • Tandem: 0.0011 HMM: 7.35e-12 • NIST: 0.813 Mascot: 1.42 • Extra small peaks reduce search engine E-values • NIST MS Search score degraded Hidden Markov Models (HMM) m/z ratio Peak Position Peptide Identification: HMM Viterbi Score & p-value • Viterbi Scores • Compare spectra for target peptide (train & test) vs. non-target peptides (other) • Confidence based on Tandem E-values • Identifies test spectra • Rejects other spectra # spectra • HMM:9.87e-12 NIST: 0.994 • Tandem: 0.15Mascot:7.52 • Peaks exceeding m/z tolerance significantly reduced search engine E-values • HMM constructed with states representing ion peaks, insertions, and deletions • Spectra preprocessed via peak normalization, noise filtering, intensity scaling, discretization • Trained w/ high confidence spectra to recognize ion peaks and correlations in peak intensity patterns • Outputs Viterbi score for each spectra Viterbi Score • P-values • HMMatch computes statistical significance using synthetic spectra • Many spectra have low Tandem E-values to target peptide (test), non-target peptide (other), or no peptide (unknown) • HMMatch can compute good p-values for many such spectra Observations • HMMatch confidently identified 3537 spectra • (p-value < 10-5) with low search engine scores (Tandem & Mascot E-values > 0.05), • NIST MS Search identified 673 (score > 0.9) • Accounting for intensity variation provides additional useful information in practice • HMMatch complements existing MS/MS search engines & spectral matching methods • Growing size of MS/MS libraries can improve applicability & accuracy of HMMatch • Reference: Wu et al., J of Computational Biology, 14(8):1025-1043, 2007 # spectra HMMs Used For Protein Families • Protein Family • Consensus & variation found in amino acid sequence • Sequence logos capture variation • HMMs capture both variation & pattern of gaps in sequence • HMMs used by Pfam to provide high sensitivity -log(p-value) Model Extrapolation To Nearby Related Peptides Extrapolated HMM • Extrapolation • Compute HMMs for nearby protein sequence • Compare p-values for extrapolated HMM vs. trained HMM • Extrapolated HMM scores correlate well • Can find novel peptides Acknowledgments • This research was supported by • NIH/NCI grant CA126189 • USDA cooperative agreement 5812757342 Trained HMM

More Related