
High Throughput and Large Scale Proteomics Analysis

High Throughput and Large Scale Proteomics Analysis. Austin Yang, Ph.D., Department of Pharmaceutical Sciences, University of Southern California. Overview: shotgun proteomics and ESI mass spectrometry; proteomic data mining and data visualization. 12,000 proteins.


Presentation Transcript


  1. High Throughput and Large Scale Proteomics Analysis. Austin Yang, Ph.D., Department of Pharmaceutical Sciences, University of Southern California

  2. Overview • Shotgun proteomics and ESI mass spectrometry • Proteomic data mining and data visualization

  3. 12,000 proteins

  4. Are We Ready for Mammalian Proteomics? Shotgun Proteomics; 2-D Gel. Protein class, concentration, copies/cell: • Cytoskeletal proteins: mM, 1 × 10^9 • Metabolism: 0.1 mM, 1 × 10^8 • Ribosomes: 10 µM, 1 × 10^7 • Kinases: 1 µM, 1 × 10^6 • Cyclins: 0.1 µM, 1 × 10^5 • Transcription factors: 10 nM, 1 × 10^4 • Synaptic markers: 0.1 nM, 1 × 10^3

  5. Advantages of Proteomics Using LC-MS/MS • No pre-selection of biased targets (hypothesis-free, open approach) • Protein variants are detected simultaneously • Protein isolation and detection are on a small scale (~10 fmol from complex mixtures: subcellular fractions, whole cells, or tissue) • Sequence information is obtained for peptides (not just masses), and ~4,000 proteins can be sequenced in a single experiment

  6. Liquid Chromatography Quadrupole Ion Trap Tandem Mass Spectrometer

  7. Electrospray vs Nanospray

  8. Splitless Nano-Liquid Chromatography

  9. Five Independent Loop Injections

  10. 10-cycle MudPIT Analysis

  11. Multidimensional Protein Identification Technology (MudPIT). [Figure: digested protein complexes; SCX column with 0-500 mM NH4OAc steps (100, 200, 300, 400, 500); RP #1 and RP #2.] 1,000-2,000 sequencing attempts in 60 minutes; 20,000 MS/MS spectra/day

  12. Isotope-Coded Affinity Tags (ICAT)

  13. Electrospray Ionization (ESI). [Figure labels: LC; spray tip; ions in solution; ions in gaseous phase; ion source opening for the MS.]

  14. Theoretical CID of a Tryptic Peptide. [Figure: parent ions of the peptide FLGK undergo CID, giving daughter-ion pairs b1/y3, b2/y2, and b3/y1 alongside non-dissociated parent ions. MS/MS spectrum: relative intensity vs. m/z, with peaks labeled y1, y2, y3, b1, b2, b3 and (464.29).]
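The b/y fragment ladder on this slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the presentation: it assumes singly charged ions, standard monoisotopic residue masses, and hypothetical function names (`fragment_ions`, `parent_mz`). Under those assumptions the singly protonated FLGK parent comes out at the 464.29 shown on the slide.

```python
# Monoisotopic residue masses (Da) for the four residues in FLGK.
RESIDUE_MASS = {"F": 147.06841, "L": 113.08406, "G": 57.02146, "K": 128.09496}
WATER = 18.01056   # added once for the intact peptide and for y ions
PROTON = 1.00728   # charge carrier for a singly charged ion


def parent_mz(peptide):
    """m/z of the singly protonated, intact peptide ([M+H]+)."""
    return round(sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON, 2)


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as dicts of label -> singly charged m/z."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = {}, {}
    for i in range(1, len(peptide)):
        # b_i: the first i residues (N-terminal fragment) plus a proton.
        b_ions[f"b{i}"] = round(sum(masses[:i]) + PROTON, 2)
        # y_i: the last i residues (C-terminal fragment) plus water and a proton.
        y_ions[f"y{i}"] = round(sum(masses[-i:]) + WATER + PROTON, 2)
    return b_ions, y_ions


if __name__ == "__main__":
    b_ions, y_ions = fragment_ions("FLGK")
    print("parent:", parent_mz("FLGK"))
    print("b ions:", b_ions)
    print("y ions:", y_ions)
```

Each complementary pair (b1/y3, b2/y2, b3/y1) comes from breaking the same peptide bond, which is why the slide draws them together.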

  15. SequestQueue (6,000 dta × 50 = 300,000 MS/MS scans)

  16. Data Mining through SEQUEST and PAULA. Database: search time • Yeast ORFs (6,351 entries): 52 sec (0.104 sec/s) • Non-redundant protein (100k entries): 3,500 min • EST (100k entries, 3-frames): 5-10,000 min

  17. SEQUEST Algorithm. Step 1. Determine the parent ion molecular mass. Step 2. The 500 peptides with masses closest to that of the parent ion are retrieved from a protein database, and the computer generates a theoretical MS/MS spectrum for each peptide sequence (SEQ 1, 2, 3, 4, …). Step 3. The experimental spectrum is compared with each theoretical spectrum and correlation scores are assigned. Step 4. Scores are ranked and protein identifications are made based on these cross-correlation scores. [Figure labels: experimental MS/MS spectrum; theoretical MS/MS spectra; ZSA charge assignment; unified scoring function.]
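The four steps can be sketched in miniature as follows. This is a simplified illustration, not the SEQUEST implementation: the 1 Da binning, the ±75 bin background window, and all function names (`closest_by_mass`, `bin_spectrum`, `xcorr`, `rank_candidates`) are assumptions made for the example. The score is the zero-offset dot product of the binned experimental and theoretical spectra minus the mean dot product at nearby non-zero offsets, which is the general idea behind a cross-correlation score.

```python
def closest_by_mass(parent_mass, peptide_masses, n=500):
    """Step 2: keep the n candidate peptides whose mass is closest to the parent."""
    ranked = sorted(peptide_masses, key=lambda seq: abs(peptide_masses[seq] - parent_mass))
    return ranked[:n]


def bin_spectrum(peaks, n_bins=2000, bin_width=1.0):
    """Turn (m/z, intensity) pairs into a fixed-length intensity vector."""
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] = max(vec[idx], intensity)
    return vec


def dot_at_offset(a, b, offset):
    """Dot product of vector a with vector b shifted by `offset` bins."""
    return sum(x * b[i + offset] for i, x in enumerate(a) if 0 <= i + offset < len(b))


def xcorr(experimental, theoretical, max_offset=75):
    """Step 3: zero-offset correlation minus the mean off-peak correlation."""
    offsets = [t for t in range(-max_offset, max_offset + 1) if t != 0]
    background = sum(dot_at_offset(experimental, theoretical, t) for t in offsets) / len(offsets)
    return dot_at_offset(experimental, theoretical, 0) - background


def rank_candidates(experimental_peaks, theoretical_peaks_by_seq):
    """Step 4: score every candidate and sort from best to worst."""
    exp_vec = bin_spectrum(experimental_peaks)
    scored = [(xcorr(exp_vec, bin_spectrum(peaks)), seq)
              for seq, peaks in theoretical_peaks_by_seq.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    observed = [(147.11, 1.0), (148.08, 1.0), (204.13, 1.0),
                (261.16, 1.0), (317.22, 1.0), (318.18, 1.0)]
    candidates = {
        "FLGK": observed,  # theoretical spectrum that matches the observed peaks
        "DECOY": [(100.0, 1.0), (250.0, 1.0), (400.0, 1.0)],  # made-up non-match
    }
    for score, seq in rank_candidates(observed, candidates):
        print(f"{seq}: {score:.3f}")
```

Subtracting the off-peak background penalizes candidates whose peaks line up with the experimental spectrum only by chance at arbitrary shifts.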

  18. One spectrum, TWO protein identifications. Spectrum A was searched against the NCBI human database: macrophage inhibitory factor was identified. The same spectrum was searched against the non-redundant database: bovine G-protein gamma was identified. Since the primary amino acid sequence of human G-protein gamma is almost identical to the bovine sequence, this protein was later identified as human G-protein gamma. The initial false ID was due to a missing entry for human G-protein gamma in the human database. The sequence was later re-entered into the human database, and the third search yielded the correct ID. Mol Cell Proteomics. 2003 Jul;2(7):428-42. Fragment ions that match both sequences are indicated by *. Spectrum B has two additional ions matched to G-protein gamma.

  19. Distribution of Xcorr from correctly and incorrectly identified peptides

  20. X-correlation vs Peptide length

  21. Distribution of Xcorr vs Charge State

  22. F-score and probability-based peptide assignment

  23. Identification of modified LRP in APP/PS1 Transgenic Mice

  24. Neurotransmitter Receptors

  25. Proteomic Data Visualization and Future Directions • information overload • data integration • ease of visualization

  26. Network for NMDA and glutamate receptors

  27. Network for NMDA and glutamate receptors (Zoom-in)

  28. SEQUEST and SALSA (Scoring Algorithm for Spectral Analysis). [Figure labels: Raw; Unidentified Spectra (~10,000-100,000); Identified Sequence.]

  29. SALSA Overview • SALSA is a tool for identifying MS-MS spectra in Xcalibur analysis files that display specific user-defined characteristics. Because these characteristics correspond to structural features of a peptide, SALSA allows the user to selectively locate MS-MS spectra of specific peptides or their variant or modified forms. [Figure labels: product ion; charged loss; neutral loss; mass difference; ion series (A G D W T).]

  30. Construction of SALSA ruler. [Figure: two ladders of GAIIGLMGGVV fragments along the m/z axis (GA, GAI, GAII, GAIIG, GAIIGL, GAIIGLM, GAIIGLMG, GAIIGLMGG, GAIIGLMGGV, GAIIGLMGGVV). Methionine oxidation: 16 amu (one oxygen atom).]
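The effect of methionine oxidation on such a ladder can be sketched numerically. The example below is an illustration under stated assumptions, not SALSA itself: it treats each N-terminal prefix on the slide as a singly charged b-type ion, uses standard monoisotopic masses, and the function name `prefix_ladder` is hypothetical. Prefixes that stop before the methionine keep their m/z, while every prefix containing it moves up by about 16.

```python
# Monoisotopic residue masses (Da) for the residues in GAIIGLMGGVV.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "I": 113.08406,
                "L": 113.08406, "M": 131.04049, "V": 99.06841}
PROTON = 1.00728
OXIDATION = 15.99491  # one oxygen atom added to methionine


def prefix_ladder(peptide, oxidized=False):
    """List (prefix, m/z) for every N-terminal prefix of two or more residues.

    Assumes singly charged b-type ions; with oxidized=True, each methionine
    passed so far contributes one extra oxygen to the prefix mass.
    """
    ladder = []
    running = 0.0
    for i, aa in enumerate(peptide, start=1):
        running += RESIDUE_MASS[aa]
        if oxidized and aa == "M":
            running += OXIDATION
        if i >= 2:
            ladder.append((peptide[:i], round(running + PROTON, 2)))
    return ladder


if __name__ == "__main__":
    plain = prefix_ladder("GAIIGLMGGVV")
    oxidized = prefix_ladder("GAIIGLMGGVV", oxidized=True)
    for (prefix, mz), (_, mz_ox) in zip(plain, oxidized):
        print(f"{prefix:<12} {mz:>8.2f} {mz_ox:>8.2f} shift={mz_ox - mz:.2f}")
```

The spacing between adjacent rungs is what makes the ladder usable as a ruler: only the rung that adds the methionine changes its spacing when the residue is oxidized.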

  31. Absolute Quantification Analysis: Quantification of Methionine Oxidation. GAIIGLMVGGVV; GAIIGLMVGGVV: +7 amu
