Manual De Novo Peptide MS/MS Interpretation For Evaluating Database Search Results Karl R. Clauser Broad Institute of MIT and Harvard Cold Spring Harbor Proteomics Course July, 2009
Outline
• AA properties
• Fragmentation pathways and ion types
• b/y pairs
• Fragment charge from mass defect
• Non-mobile proton
• Neutral loss ion types
• Phosphosite ambiguity
• Sample handling chemistry artifacts
• Isobaric co-eluters
• Mass tolerance units and isobaric AA’s
• Other Tutorials
• Dominant ions
• AA adjacencies
• Positions
AA structures (nominal residue masses; pK where given)
K 128 (pK 10), H 137 (pK 6), R 156 (pK 12)
D 115 (pK 4.0), E 129 (pK 4.5)
S 87, T 101, Y 163
N 114, Q 128
P 97
L 113, I 113
M 131, C 103 (+57 IAA)
F 147, W 186
G 57, A 71, V 99
N-term pK 7.5; C-term pK 3.5
http://ionsource.com/Card/clipart/aaclipart.htm
Charge-directed Fragmentation Scheme
[Figure: protonated tetrapeptide H2N-CH(R1)-CO-NH-CH(R2)-CO-NH-CH(R3)-CO-NH-CH(R4)-COOH cleaving at an amide bond by b ion formation and/or y ion formation, shown for b3 and y1. In each pathway the neutral fragment is pumped away by the vacuum system.]
Proton Mobility (zpre = precursor charge)
• Mobile: zpre > #Arg + #Lys + #His
• Partially mobile: zpre ≤ #Arg + #Lys + #His and > #Arg
• Non-mobile: zpre ≤ #Arg
For peptides with non-mobile protons, fragmentation tends to proceed via charge-remote mechanisms. MS/MS spectra will be dominated by a few ions, typically:
• C-term side of D, E
• N-term side of P
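The three proton-mobility classes can be written as a small function. This is a minimal sketch, not the author's code: the function name `proton_mobility` is hypothetical, and it assumes the class boundaries are inclusive (≤) so that every precursor charge falls into exactly one class.

```python
def proton_mobility(sequence: str, z_pre: int) -> str:
    """Classify a peptide by proton mobility from its precursor charge.

    Assumption: boundaries are inclusive (<=), so the three classes
    cover every possible precursor charge.
    """
    n_arg = sequence.count("R")
    n_basic = n_arg + sequence.count("K") + sequence.count("H")
    if z_pre > n_basic:
        return "mobile"            # more protons than basic residues
    if z_pre > n_arg:
        return "partially mobile"  # protons held by Lys/His, not all by Arg
    return "non-mobile"            # every proton can sit on an Arg


if __name__ == "__main__":
    # Peptides taken from the slides, all evaluated at z = 2.
    for pep in ("AEDTALYYCAK", "IADAHLDR", "ISRPGDSDDSR"):
        print(pep, proton_mobility(pep, 2))
```

Only unmodified one-letter codes are counted here; lowercase modified residues as used on later slides would need their own handling.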
Sequence-Specific Fragment Ion Types
[Figure: tetrapeptide backbone H2N-CH(R1)-CO-NH-CH(R2)-CO-NH-CH(R3)-CO-NH-CH(R4)-COOH with cleavage sites labeled a3, b3, c3 (N-terminal fragments) and x1, y1, z1 (C-terminal fragments).]
Ion type            Restriction                   Residues   Delta
a-NH3               contains NH3 residue          RK NQ      -17
b-NH3, y-NH3        contains NH3 residue          RK NQ      -17
b-H2O, y-H2O        contains H2O residue          ST DE      -18
b-H3PO4, y-H3PO4    contains H3PO4 residue        st         -98
y++, b++            contains charged residues     RHK
128 99 99 128 E VQ L V|E/S|G|G|GL|V|K|PG G\S\L\R Complementary Ions b/y pairs
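The complementarity of b/y pairs can be checked numerically. The sketch below is an illustration under stated assumptions: it uses standard monoisotopic residue masses, singly charged fragments, unmodified Cys (no +57 IAA), and hypothetical helper names `by_ions` and `precursor_mh`. For every cleavage site, b(i) + y(n-i) should equal the singly protonated precursor mass plus one proton.

```python
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # monoisotopic mass of H2O

# Standard monoisotopic residue masses (Cys unmodified).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def by_ions(peptide):
    """Return singly charged b and y ion m/z lists (b1..b(n-1), y1..y(n-1))."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


def precursor_mh(peptide):
    """Singly protonated precursor mass, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


if __name__ == "__main__":
    pep = "IADAHLDR"  # peptide shown on a later slide
    b, y = by_ions(pep)
    n = len(pep)
    for i in range(1, n):
        pair_sum = b[i - 1] + y[n - i - 1]
        print(f"b{i} + y{n - i} = {pair_sum:.4f}")
    print(f"[M+H]+ + proton = {precursor_mh(pep) + PROTON:.4f}")
```

When reading a spectrum by hand, the same relation lets one confirm a suspected b ion by looking for its y partner at ([M+H]+ + 1.0073) minus the b ion m/z.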
Dual Picket Fence 163 163 113 71 101 115 115 101 71 113 163 163 A E/D|T|A|L|Y|Y|CA\K
Uniqueness of a Peptide Sequence
Clauser, K. R.; Baker, P. R.; Burlingame, A. L. "Role of Accurate Mass Measurement (+/- 10 ppm) in Protein Identification Strategies Employing MS or MS/MS and Database Searching", Anal. Chem. 1999, 71, 2871-2882.
Diagnose Doubly Charged Fragment Ions I/A|D|A|H|L|D|R
Dominant Cleavage Proline N-side 28 87 97 b2 N F|P/S/P V DA A F R y9
Non-mobile proton (zpre ≤ #Arg): Sparse Dominant Fragmentation 202 115 115 202 (K)I S R|P G D|SD|D|SR(S)
Cry Babies (b-H2O & b pairs) P(m/z)-2H2O P(m/z)-H2O E/H/A|V/E|G/D|C D|F Q L L K
Andrea Kaitlin Aidan Jack Interpreting MS/MS Spectra is Fun!!
Sources of Incorrect MS/MS Interpretations
• Major
  • Database
    • Peptide not in database. Mutation. MS/MS not from a peptide.
  • Unanticipated Protein Chemistry
    • Chemical modification, post-translational modification.
  • Enzyme/Ion Source
    • Non-specific cleavage. In-source fragmentation yields MS3.
• Minor
  • Algorithm
    • Fragment ion types of instrument not accounted for. Peak Detection.
  • Instrument Resolution
    • Wrong parent charge. Wrong fragment charge.
  • User Competence
    • Wrong parameters selected.
Phospho Site Ambiguity – S/T P(m/z)-H3PO4-H2O P(m/z)-H3PO4 P(m/z)-H3PO4-2H2O L P/S s/P/V|Y/E/D|A A S F K
Phospho Site Ambiguity – S/T L A G G Q/T/S Q|P T T|P L\T s/P Q R L A G G Q/T/S Q|P T T|P L\t S/P Q R
Reliability of LC/MS/MS Phosphoproteomic Literature
“Resulting sequences were inspected manually …. When the exact site of phosphorylation could not be assigned for a given phosphopeptide, it was tabulated as ambiguous.”
“All spectra supporting the final list of assigned peptides used to build the tables shown here were reviewed by at least three people to establish their credibility.”
“Assignment of phosphorylation sites was verified manually with the aid of PEAK Studio (Bioinformatics Solutions) software.”
“All identified phosphopeptides were manually validated, and localization of phosphorylated residues within the individual peptide sequences were manually assigned…”
Citation | Approach | Instrument | #sites | #ambiguous sites | Scores Shown | Site Ambig. Labeled | Supplem. Spectra Shown
Ballif, BA,…Gygi, SP, 2004 MCP, 3, 1093-1101 | 1D Gel, digest, SCX, LC/MS/MS | LCQ Deca XP | 546 | 86 | yes | yes | no
Rush, J, … Comb, MJ, 2005, Nat Biotech, 23, 94-101 | digest lysate, pTyr Ab, LC/MS/MS | LCQ Deca XP | 628 | 0 | yes | no | no
Collins, MO, …Grant, SGN, 2005, J Biol Chem, 280, 5972-5982 | protein IMAC, peptide IMAC, LC/MS/MS | Q-Tof Ultima | 331 | 42 | no | yes | no
Gruhler, A, … Jensen, ON, 2005 MCP, 4, 310-327 | digest lysate, SCX, IMAC, LC/MS/MS | LTQ-FT | 729 | 0 | yes | no | no
Expect Woes & Nuisances
• Sample Handling Chemistry
  Modification | Delta (Da) | Site | Cause
  Carbamylation | +43 | N-term, Lys | urea in digest buffer
  Deamidation | +1 | N -> D | sample in acid
  Pyroglutamic acid | -17 | N-term Q | sample in acid
  Oxidized Met | +16 | M | gels
  Cys alkylation reagent | +x | N-term, W |
• Data Dependent Acquisition Parameters
• Isobaric Co-eluters
• Protein Isoforms / Family Members
• Isobaric peptides from related proteins
Stinkers (b-NH3) & Pyroglutamic Acid (R)q L/Q|L|A|Q|E|A|A\Q\K(R) -17 Da Q to q (R)Q L/Q/L/A|Q/E/A|A Q\K(R) P(m/z)-NH3
G S/E/S|G|I|F|T|n\T K 18.35 96.9% +0.007 Da G S/E/S|G|I|F|T|D\T K Deamidation G S/E S\G\I\F\T\N/T K 6.62 43.4% +0.986 Da
Deamidation of Asn +1Da Asn –NH + O = Asp ionsource.com
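The nominal +1 Da shift can be worked out exactly from the elemental compositions. The sketch below is only an illustration: it uses standard monoisotopic element masses and the residue formulas for Asn (C4H6N2O2) and Asp (C4H5NO3); the helper name `mono_mass` is hypothetical.

```python
# Standard monoisotopic masses of the elements involved.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def mono_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


ASN_RESIDUE = {"C": 4, "H": 6, "N": 2, "O": 2}  # Asn residue
ASP_RESIDUE = {"C": 4, "H": 5, "N": 1, "O": 3}  # Asp residue = Asn - NH + O

delta = mono_mass(ASP_RESIDUE) - mono_mass(ASN_RESIDUE)

if __name__ == "__main__":
    print(f"Asn residue: {mono_mass(ASN_RESIDUE):.5f}")
    print(f"Asp residue: {mono_mass(ASP_RESIDUE):.5f}")
    print(f"Deamidation shift: {delta:+.5f} Da")
```

The exact shift is about +0.984 Da rather than +1.003 Da (the spacing to the first 13C isotope peak), which is why accurate-mass instruments can tell deamidation apart from a mis-assigned monoisotopic peak.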
Carbamylation N/S/L/E/T/L/L|y/K|P V/D\R +43 18.4 89% V/S T A/Q/D V/I|Q Q t L\C K +0 18.5 93% +0 11.1 68%
Carbamylated N-term I/G/E|G/T/y/G V|V|Y\K P(m/z)-CNHO +43 b ions P(m/z)-CNHO-H2O
Met Oxidation – localizing the site (R)G V D L D Q L/L|D|M|S|Y|E|Q|L|m|Q|L/Y S A R(Q) (R)G V D L D Q L/L|D|m|S/Y/E|Q|L|M|Q|L/Y S A R(Q)
Know Your Chromatographic Peak Widths
Merged 4 spectra: same precursor, 50 sec window, different peptides
Top Database Search Result: (K)E E m E S A E G|L|K\G P/m\K(S) 8.78 71.0% DFwdRev: 3.49
Consequences of Inappropriate Tolerance Units (using Da tolerance when instrument errors are in ppm)
[Figure: three panels labeled too loose, too tight, just right.]
• Isobaric AA’s
  • I = L (C6 H11 N O) = 113.08406
  • K ~ Q (C6 H12 N2 O, C5 H8 N2 O2) 128.09496 ~ 128.05858, Δ = 0.03638
  • F ~ m (C9 H9 N O, C5 H9 N O2 S; m = oxidized Met) 147.06841 ~ 147.03540, Δ = 0.0330
• Isobaric AA combinations
  • GG = N (C4 H6 N2 O2, C4 H6 N2 O2) 114.04293
  • GA = Q ~ K (C5 H8 N2 O2, C5 H8 N2 O2, C6 H12 N2 O) 128.05858 ~ 128.09496, Δ = 0.03638
  • DA ~ W ~ VS (C7 H10 N2 O4, C11 H10 N2 O, C8 H14 N2 O3) 186.06405 ~ 186.07931 ~ 186.10044, Δ = 0.01526, Δ = 0.02113
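To see why the tolerance unit matters, the K/Q mass difference can be compared against a tolerance window at different fragment m/z values. This is a simplified sketch: the function names are hypothetical, and two candidates are treated as distinguishable when their mass difference exceeds the tolerance half-width, which ignores measurement error on the observed peak itself.

```python
def tolerance_in_da(mz, tol, unit):
    """Half-width of the matching window in Da at a given m/z."""
    if unit == "ppm":
        return mz * tol / 1e6   # ppm windows widen with m/z
    if unit == "Da":
        return tol              # Da windows are constant
    raise ValueError(f"unknown tolerance unit: {unit}")


def distinguishable(delta_da, mz, tol, unit):
    """Simplified criterion: the mass difference exceeds the window half-width."""
    return abs(delta_da) > tolerance_in_da(mz, tol, unit)


K_MINUS_Q = 128.09496 - 128.05858  # 0.03638 Da, from the slide

if __name__ == "__main__":
    for mz in (500.0, 2000.0):
        for tol, unit in ((10, "ppm"), (0.05, "Da")):
            window = tolerance_in_da(mz, tol, unit)
            verdict = distinguishable(K_MINUS_Q, mz, tol, unit)
            print(f"m/z {mz:7.1f}  {tol} {unit}: window +/-{window:.4f} Da, "
                  f"K vs Q distinguishable: {verdict}")
```

With a 10 ppm window the half-width is 0.005 Da at m/z 500 and 0.02 Da at m/z 2000, both below the 0.03638 Da K/Q gap, whereas a fixed 0.05 Da window never separates the two residues.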
Additional Resources
Google: “de novo sequencing tutorial”
• Don Hunt and Jeff Shabanowitz - manual: http://www.ionsource.com/tutorial/DeNovo/DeNovoTOC.htm
• Rich Johnson - manual: http://www.abrf.org/ResearchGroups/MassSpectrometry/EPosters/ms97quiz/SequencingTutorial.html
• PEAKS - automated: http://www.bioinformaticssolutions.com/products/peaks/support/tutorials/PEAKS_De_Novo.html
Physicochemical Complications to Computational Interpretation
• Incomplete Fragmentation
• Inconsistent intensity of fragment ion types
  • Instrument type dependent
  • Amino acid dependent
• Isobaric AA’s
  • I = L (C6 H11 N O)
  • K ~ Q (C6 H12 N2 O, C5 H8 N2 O2)
• Isobaric AA combinations
  • GG = N (C4 H6 N2 O2, C4 H6 N2 O2)
  • GA = Q ~ K (C5 H8 N2 O2, C5 H8 N2 O2, C6 H12 N2 O)
  • W ~ DA ~ VS (C11 H10 N2 O, C7 H10 N2 O4, C8 H14 N2 O3)
• Parent charge uncertainty
• Fragment charge uncertainty
• Chemical or post-translational modifications
Frequency of Dominance at Adjacent AA’s – v9, z=2
[Figure: residue-pair maps of (# dominant ions / # total cleavages), binned as >0.8, 0.4 - 0.8, 0.1 - 0.4, and - (<3 obsv). Panels: Mobile (2061 spectra), Partially Mobile (4525 spectra), Non-mobile (114 spectra).]
Frequency and Distribution Dominant Ions v9
[Figure values: 67%, 72%, 76%; 5758, 2974, 177]
Proton Mobility
• Mobile: zpre > #Arg + #Lys + #His
• Partially mobile: zpre ≤ #Arg + #Lys + #His and > #Arg
• Non-mobile: zpre ≤ #Arg
Precursor z=2, 6699 spectra from a trypsin GeLC/MS/MS experiment on an LTQ-FT
Short Peptides Often Yield a Dominant Ion: Cleavage Between Residues 2 & 3
• Bonus C-side b2: residues at position 3: PRKH
• Bonus N-side b2: residues at position 1 or 2: PRKHNQqVILFYW
• Bonus ignore b2: neither of the above, but still dominant
If there is a mobile or partially mobile proton, peptides of length <14 are likely to yield at least one intense fragment ion between residues 2 and 3 (yellow and pink curves shifted to shorter lengths, purple curve shifted to longer lengths). Intense ions are favored by the presence of PRKH at residue 3 or the presence of PRKHNQqVILFYW at residues 1 or 2.
Acknowledgements
Broad Institute: Steve Carr, Terri Addona, Jinyan Du
MIT: Michael Yaffe, Majbrit Hjerrild, Drew Lowery
Frequency of Position Dependent Dominant Ion(s) v9
Proton Mobility
• Mobile: zpre > #Arg + #Lys + #His
• Partially mobile: zpre ≤ #Arg + #Lys + #His and > #Arg
• Non-mobile: zpre ≤ #Arg
(R)R G G/P P\F A\F|V|E|F|E|D|P R(D) (R)N P P R\F A\F|V|E|F|E|D|P\R(D) Related Proteins : Distinct Non-differentiable Peptides
Setting Autovalidation Thresholds
Step 1 - Protein Mode
• 2 or more peptides/protein
• Each spectrum: moderate or better score
Step 2 - Peptide Mode
• 1 peptide/protein
• Each spectrum: excellent score
Distinguishing 7 Family Members (14-3-3 proteins)
Wt/Mut PBD | Gene Symbol
13.4 | YWHAZ
16.5 | YWHAE
13.0 | SFN
11.1 | YWHAB
11.4 | YWHAT
13.3 | YWHAG
13.5 | YWHAH
• None have a PLK-PBD binding motif: S[st]P
• Each has 2-4 PLK phosphorylation motifs: [ED]X[st][FLIYWVM]
• No phosphorylated peptides were recovered.
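Both motifs translate directly into regular expressions for scanning a protein sequence. The sketch below makes two assumptions that are not on the slide: the lowercase s/t (phosphorylated residues) are matched as plain S/T, since database sequences are unmodified, and the example sequence is made up for illustration. The name `find_motifs` is hypothetical.

```python
import re

# Lookahead groups allow overlapping motif occurrences to be reported.
PBD_BINDING = re.compile(r"(?=(S[ST]P))")              # S[st]P
PLK_PHOSPHO = re.compile(r"(?=([ED].[ST][FLIYWVM]))")  # [ED]X[st][FLIYWVM]


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched text) for each occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "AASTPKEASLDD"  # hypothetical sequence, not a real protein
    print("PBD binding sites:", find_motifs(toy_sequence, PBD_BINDING))
    print("PLK phospho sites:", find_motifs(toy_sequence, PLK_PHOSPHO))
```

A match only marks a candidate site; whether the S/T is actually phosphorylated still has to come from the MS/MS evidence.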
Chromatographic Peak Sampling: Abundance/Identity Trade-off
• MS: Periodic Focus on Several Peptides (Abundance)
• MS/MS: Sliding Focus on Each Peptide (Identity)
[Figure: Relative Abundance vs. Retention Time.]
Distinguishing Family Members (ROCK1 & ROCK2)
#Same Peps | #Distinguishing Peps | WT/Mut PBD | Gene Symbol
5 | 39 | 30.8 | ROCK2
5 | 1 | 26.5 | ROCK1
Enabling Integrated Reverse Database Searches
Each database sequence candidate passing the parent mass filter is additionally subjected to “inner sequence reversal” and interpreted against the MS/MS spectrum, e.g. SAMPLER becomes SELPMAR. Search time increases ~1.5X.
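The SAMPLER to SELPMAR example implies that the first and last residues stay in place while everything between them is reversed. A minimal sketch of that operation follows; the function name `inner_reverse` is hypothetical, and this is an illustration of the idea rather than the search engine's actual code.

```python
def inner_reverse(sequence):
    """Reverse the inner residues, keeping the N- and C-terminal residues fixed."""
    if len(sequence) < 3:
        return sequence  # nothing to reverse
    return sequence[0] + sequence[1:-1][::-1] + sequence[-1]


if __name__ == "__main__":
    print(inner_reverse("SAMPLER"))
```

Because the residue composition is unchanged, the reversed candidate has the same parent mass as the forward sequence and therefore passes the same parent mass filter.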
SEQUEST - preliminary search
• Step 1: calculate ALL theoretical fragment ions for EACH sequence in the Sequence Database, giving a Model Spectrum (b, y, a, b-NH3, y-NH3, b-H2O, y-H2O)
• Step 2: Filter the Experimental Spectrum, giving a Filtered Experimental Spectrum
• Step 3: Compare the Filtered Experimental Spectrum with the Model Spectrum
[Figure: spectra plotted as Relative Abundance vs. Mass (m/z), each marked with Pm.]
MS-Tag preliminary search
• Step 1: Transform the Filtered Experimental Spectrum into an N-terminal Sequence Spectrum (a, b, b-NH3, b-H2O) and a C-terminal Sequence Spectrum (y, y-NH3)
• Step 2: calculate partial ladders for EACH sequence in the Sequence Database (Partial N-terminal Ladder, Partial C-terminal Ladder)
• Step 3: Compare each sequence spectrum with the corresponding partial ladder
[Figure: spectra plotted as Relative Abundance vs. Mass (m/z), each marked with Pm.]
Mass Differences Correspond to Amino Acids
[Figure: MS/MS spectrum (Intensity vs. m/z) in which the spacings between peaks are labeled with letters.]
Graph Theory Based de novo Algorithms
• Transform to Spectrum Graph
  • vertices (from peak m/z’s)
  • edges (from mass differences)
• Find best path
• Sequence
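A toy version of the spectrum-graph idea fits in a few lines. The sketch below is a deliberately simplified illustration, not any published algorithm: it assumes all peaks are singly charged ions of one series, treats "best path" as the longest chain of residue-mass gaps, ignores intensities, and reports I/L as L. The function name `longest_residue_path` and the peak list are hypothetical.

```python
# Monoisotopic residue masses; I is omitted because it is isobaric with L.
RESIDUES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def longest_residue_path(peaks, tol=0.01):
    """Vertices are peak m/z values; an edge joins two peaks whose mass
    difference matches a residue mass within tol. Returns the residues
    along the longest path (dynamic programming over the sorted peaks)."""
    nodes = sorted(peaks)
    best_len = [0] * len(nodes)   # longest path ending at each vertex
    back = [None] * len(nodes)    # (previous vertex index, residue)
    for j in range(len(nodes)):
        for i in range(j):
            diff = nodes[j] - nodes[i]
            for aa, mass in RESIDUES.items():
                if abs(diff - mass) <= tol and best_len[i] + 1 > best_len[j]:
                    best_len[j] = best_len[i] + 1
                    back[j] = (i, aa)
    node = max(range(len(nodes)), key=lambda k: best_len[k])
    path = []
    while back[node] is not None:
        node, aa = back[node]
        path.append(aa)
    return "".join(reversed(path))


if __name__ == "__main__":
    # b0..b6 of a hypothetical peptide, plus two noise peaks (250.5, 410.2).
    peaks = [1.00728, 88.03931, 159.07642, 250.5, 290.11691,
             387.16967, 410.2, 500.25373, 629.29632]
    print(longest_residue_path(peaks))
```

Real de novo tools also weigh peak intensities, merge b and y evidence, and allow gaps spanning two residues; none of that is attempted here.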
SpectrumMill Scoring of MS/MS Interpretations
Example: (R)E F E|I|I|W|V T K(H) 9.24 78.1% DEBS: -3 DFwdRev: 1.75
• Peak Selection: De-Isotoping, S/N thresholding, Parent - neutral removal, Charge assignment
• Match to Database Candidate Sequences
• Score = Assignment Bonus (Ion Type Weighted) - Non-assignment Penalty (Intensity Weighted)
• SPI (%): Scored Peak Intensity
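The slide does not give the ion-type weights behind the score, so the sketch below covers only the SPI figure, under the assumption that it is the percentage of selected peak intensity that is matched by the interpretation. The function name, the matching tolerance, and the peak list are all hypothetical; this is not Spectrum Mill's implementation.

```python
def scored_peak_intensity(peaks, theoretical_mz, tol=0.5):
    """Percentage of total peak intensity assigned to a theoretical fragment.

    peaks: list of (m/z, intensity) after peak selection.
    theoretical_mz: fragment m/z values predicted for a candidate sequence.
    tol: matching tolerance in Da (an assumed value).
    """
    total = sum(intensity for _, intensity in peaks)
    if total == 0:
        return 0.0
    matched = sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - t) <= tol for t in theoretical_mz)
    )
    return 100.0 * matched / total


if __name__ == "__main__":
    observed = [(100.0, 50.0), (200.0, 30.0), (300.0, 20.0)]  # made-up peaks
    predicted = [100.1, 300.2]                                # made-up fragments
    print(f"SPI = {scored_peak_intensity(observed, predicted):.1f}%")
```

A high SPI means most of the observed signal is explained by the candidate sequence; a low SPI with a decent score is a cue to inspect the unassigned peaks by hand.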
Peptide Sequencing
[Figure: tetrapeptide backbone H-NH-CH(R1)-CO-NH-CH(R2)-CO-NH-CH(R3)-CO-NH-CH(R4)-COOH, with HO and NH3+ side-chain groups, annotated with N-terminal ions a2, b2, b2-H2O, a3, b3, b3-NH3 and C-terminal ions y1, y2, y2-NH3, y3, y3-H2O.]
Cry Babies (b-H2O w/o b) 113 128 101 160 P(m/z)-2H2O 160 101 128 113 P(m/z)-H2O E/C|L/Q/T/C/R No b/y pairs