1 / 55

Protein Sequencing and Identification by Mass Spectrometry

iliana
Download Presentation

Protein Sequencing and Identification by Mass Spectrometry

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. Protein Sequencing and Identification by Mass Spectrometry

    2. Masses of Amino Acid Residues

    3. Protein Backbone

    4. Peptide Fragmentation Peptides tend to fragment along the backbone. Fragments can also loose neutral chemical groups like NH3 and H2O.

    5. Breaking Protein into Peptides and Peptides into Fragment Ions Proteases, e.g. trypsin, break protein into peptides. A Tandem Mass Spectrometer further breaks the peptides down into fragment ions and measures the mass of each piece. Mass Spectrometer accelerates the fragmented ions; heavier ions accelerate slower than lighter ones. Mass Spectrometer measure mass/charge ratio of an ion.

    6. N- and C-terminal Peptides

    7. Terminal peptides and ion types

    8. N- and C-terminal Peptides

    9. N- and C-terminal Peptides

    10. N- and C-terminal Peptides

    11. N- and C-terminal Peptides

    12. Peptide Fragmentation

    13. Mass Spectra The peaks in the mass spectrum: Prefix Fragments with neutral losses (-H2O, -NH3) Noise and missing peaks.

    14. Protein Identification with MS/MS

    15. Tandem Mass-Spectrometry

    16. Breaking Proteins into Peptides

    17. Mass Spectrometry

    18. Tandem Mass Spectrometry

    19. Protein Identification by Tandem Mass Spectrometry

    20. Tandem Mass Spectrum Tandem Mass Spectrometry (MS/MS): mainly generates partial N- and C-terminal peptides Spectrum consists of different ion types because peptides can be broken in several places. Chemical noise often complicates the spectrum. Represented in 2-D: mass/charge axis vs. intensity axis

    21. De Novo vs. Database Search

    22. De Novo vs. Database Search: A Paradox The database of all peptides is huge ˜ O(20n) . The database of all known peptides is much smaller ˜ O(108). However, de novo algorithms can be much faster, even though their search space is much larger! A database search scans all peptides in the database of all known peptides search space to find best one. De novo eliminates the need to scan database of all peptides by modeling the problem as a graph search.

    23. De novo Peptide Sequencing

    24. Theoretical Spectrum

    25. Theoretical Spectrum (cont’d)

    26. Theoretical Spectrum (cont’d)

    27. Building Spectrum Graph How to create vertices (from masses) How to create edges (from mass differences) How to score paths How to find best path

    28. S E Q U E N C E

    29.

    31.

    32.

    34.

    35. MS/MS Spectrum

    36. Some Mass Differences between Peaks Correspond to Amino Acids

    37. Ion Types Some masses correspond to fragment ions, others are just random noise Knowing ion types ?={d1, d2,…, dk} lets us distinguish fragment ions from noise We can learn ion types di and their probabilities qi by analyzing a large test sample of annotated spectra.

    38. Example of Ion Type ?={d1, d2,…, dk} Ion types {b, b-NH3, b-H2O} correspond to ?={0, 17, 18} *Note: In reality the d value of ion type b is -1 but we will “hide” it for the sake of simplicity

    39. Match between Spectra and the Shared Peak Count The match between two spectra is the number of masses (peaks) they share (Shared Peak Count or SPC) In practice mass-spectrometrists use the weighted SPC that reflects intensities of the peaks Match between experimental and theoretical spectra is defined similarly

    40. Peptide Sequencing Problem Goal: Find a peptide with maximal match between an experimental and theoretical spectrum. Input: S: experimental spectrum ?: set of possible ion types m: parent mass Output: P: peptide with mass m, whose theoretical spectrum matches the experimental S spectrum the best

    41. Vertices of Spectrum Graph Masses of potential N-terminal peptides Vertices are generated by reverse shifts corresponding to ion types ?={d1, d2,…, dk} Every N-terminal peptide can generate up to k ions m-d1, m-d2, …, m-dk Every mass s in an MS/MS spectrum generates k vertices V(s) = {s+d1, s+d2, …, s+dk} corresponding to potential N-terminal peptides Vertices of the spectrum graph: {initial vertex}?V(s1) ?V(s2) ?... ?V(sm) ?{terminal vertex}

    42. Reverse Shifts Two peaks b-H2O and b are given by the Mass Spectrum With a +H2O shift, if two peaks coincide that is a possible vertex.

    43. Reverse Shifts

    44. Edges of Spectrum Graph Two vertices with mass difference corresponding to an amino acid A: Connect with an edge labeled by A Gap edges for di- and tri-peptides

    45. Paths Path in the labeled graph spell out amino acid sequences There are many paths, how to find the correct one? We need scoring to evaluate paths

    46. Path Score p(P,S) = probability that peptide P produces spectrum S= {s1,s2,…sq} p(P, s) = the probability that peptide P generates a peak s Scoring = computing probabilities p(P,S) = ps?S p(P, s)

    47. For a position t that represents ion type dj : qj, if peak is generated at t p(P,st) = 1-qj , otherwise Peak Score

    48. Peak Score (cont’d) For a position t that is not associated with an ion type: qR , if peak is generated at t pR(P,st) = 1-qR , otherwise qR = the probability of a noisy peak that does not correspond to any ion type

    49. Finding Optimal Paths in the Spectrum Graph For a given MS/MS spectrum S, find a peptide P’ maximizing p(P,S) over all possible peptides P: Peptides = paths in the spectrum graph P’ = the optimal path in the spectrum graph

    50. Ions and Probabilities Tandem mass spectrometry is characterized by a set of ion types {d1,d2,..,dk} and their probabilities {q1,...,qk} di-ions of a partial peptide are produced independently with probabilities qi

    51. Ions and Probabilities A peptide has all k peaks with probability and no peaks with probability A peptide also produces a ``random noise'' with uniform probability qR in any position.

    52. Ratio Test Scoring for Partial Peptides Incorporates premiums for observed ions and penalties for missing ions. Example: for k=4, assume that for a partial peptide P’ we only see ions d1,d2,d4. The score is calculated as:

    53. Scoring Peptides T- set of all positions. Ti={t d1,, t d2,..., ,t dk,}- set of positions that represent ions of partial peptides Pi. A peak at position tdj is generated with probability qj. R=T- U Ti - set of positions that are not associated with any partial peptides (noise).

    54. Probabilistic Model For a position t dj ? Ti the probability p(t, P,S) that peptide P produces a peak at position t. Similarly, for t?R, the probability that P produces a random noise peak at t is:

    55. Probabilistic Score For a peptide P with n amino acids, the score for the whole peptides is expressed by the following ratio test:

    56. De Novo vs. Database Search

More Related