Principles of Shotgun Proteomics and Proteogenomics

InnoMol Proteomics Workshop, April 8, 2014. Principles of Shotgun Proteomics and Proteogenomics. Boris Maček, Proteome Center Tuebingen. General MS-based proteomics workflow. Aebersold R and Mann M. 2003. Nature 422: 198-207. Principle of protein database search.


Presentation Transcript


  1. InnoMol Proteomics Workshop, April 8, 2014. Principles of Shotgun Proteomics and Proteogenomics. Boris Maček, Proteome Center Tuebingen

  2. General MS-based proteomics workflow. Aebersold R and Mann M. 2003. Nature 422: 198-207

  3. Principle of protein database search. Figure: the translated genomic sequence (database) yields theoretical spectra for proteins (intensity vs. m/z). The theoretical spectra that fall into the defined mass range are selected, and each of them is compared to our fragment ion spectra (example sequence: A L K G A S).
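The matching principle on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring used by any particular search engine: the function names, the 10 ppm precursor tolerance, the 0.5 Da fragment tolerance and the simple shared-peak count are all hypothetical choices made for the example. Proteins are digested in silico with trypsin rules, candidate peptides are filtered by precursor mass, and theoretical b/y fragment ions are compared with the observed fragment m/z values.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(protein):
    """Cleave after K or R, but not before P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_ions(peptide):
    """Singly charged b and y ion m/z values (the theoretical spectrum)."""
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON)
        y_ions.append(sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON)
    return b_ions + y_ions


def search(precursor_mass, observed_mz, database, ppm=10.0, frag_tol=0.5):
    """Return (score, peptide, protein_id) of the best candidate, or None.

    Only peptides whose mass lies within `ppm` of the precursor mass are
    scored; the score is the number of theoretical fragments that have an
    observed peak within `frag_tol` Da (a toy score, chosen for clarity).
    """
    best = None
    for protein_id, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            mass = peptide_mass(peptide)
            if abs(mass - precursor_mass) / mass * 1e6 > ppm:
                continue  # outside the defined mass range
            score = sum(
                any(abs(t - o) <= frag_tol for o in observed_mz)
                for t in fragment_ions(peptide)
            )
            if best is None or score > best[0]:
                best = (score, peptide, protein_id)
    return best
```

Calling `search(peptide_mass("SELVQK"), fragment_ions("SELVQK"), {"P1": "MTMDKSELVQKAK"})` scores only the one tryptic peptide whose mass matches the precursor; real search engines replace the peak count with a probabilistic score.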

  4. Principle of protein database search. Example entries from the protein sequence database (FASTA format):

     >sp|P31946|1433B_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=1 SV=3
     MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLELLDKYLIPNATQPESKVFYLKMKGDYFRYLSEVASGDNKQTTVSNSQQAYQEAFEISKKEMQPTHPIRLGLALNFSVFYYEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGDAGEGEN
     >sp|P62258|1433E_HUMAN 14-3-3 protein epsilon OS=Homo sapiens GN=YWHAE PE=1 SV=1
     MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
     >sp|P62258-2|1433E_HUMAN Isoform SV of 14-3-3 protein epsilon OS=Homo sapiens GN=YWHAE
     MVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
     >sp|Q04917|1433F_HUMAN 14-3-3 protein eta OS=Homo sapiens GN=YWHAH PE=1 SV=4
     MGDREQLLQRARLAEQAERYDDMASAMKAVTELNEPLSNEDRNLLSVAYKNVVGARRSSWRVISSIEQKTMADGNEKKLEKVKAYREKIEKELETVCNDVLSLLDKFLIKNCNDFQYESKVFYLKMKGDYYRYLAEVASGEKKNSVVEASEAAYKEAFEISKEQMQPTHPIRLGLALNFSVFYYEIQNAPEQACLLAKQAFDDAIAELDTLNEDSYKDSTLIMQLLRDNLTLWTSDQQDEEAGEGN
     >tr|F2Z3E5|F2Z3E5_HUMAN Hydroxyacid-oxoacid transhydrogenase, mitochondrial OS=Homo sapiens GN=ADHFE1 PE=4 SV=1
     MAAAARARVAYLLRQLQRAACQCPTHSHTYSQDGCFKY
     >tr|Q5SS58|Q5SS58_HUMAN MHC class I polypeptide-related sequence A OS=Homo sapiens GN=MICA PE=4 SV=2
     MGQRDQGLDRERKGPQDDPGSYQGPERRNFLKEDAMKTKTHYHAMHADCLQELRRYLESGVVLRRTVPPMVNVTRSEASEGNITVTCRASSFYPRNIILTWRQDGVSLSHDTQQWGDVLPDGNGTYQTWVATRICRGEEQRFTCYMEHSGNHSTHPVPSGKVLVLQSHWQTFHVSAVAAGCCYFCYYYFLCPLL
     >tr|Q5T409|Q5T409_HUMAN Disrupted in schizophrenia 1 OS=Homo sapiens GN=DISC1 PE=2 SV=1
     MPGGGPQGAPAAAGGGGVSHRAGSRDCLPPAACFRRRRLARRPGYMRSSTGPGIGFLSPAVGTLFRFPGGVSGEESHHSESRARQCGLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGIQLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDSSETLDASWEAACSDGARRVRAAGSLPSAELSSNSCSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGERGEAEGCPPSREAESHCQSPQEMGAKAASLDGPHEDPRCLSRPFSLLATRVSADLAQAARNSSRPERDMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGDAHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRLKLQKLQEDAVENDDYDKAETLQQRLEDLEQEKISLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQQASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRDWLLQEKQQLQKEIEALQARMFVLEAKDQQLRREIEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKALQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLKEITTKVCMSEKFCSTLRKKVNDIETQLPALLEAKMHAISGNHFWTAKDLTEEIRSLTSEREGLEGLLSKLLVLSSRNVKKLGSVKEDYNRLRREVEHQETAYETSVKENTMKYMETLKNKLCSCKCPLLGKVWEADLEACRLLIQSLQLQEARGSLSVEDERQMDDLEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEKCEDIGKKLLYLEDQLHTAIHSHDEDLIHSLRRELQMVKETLQAMILQLQPAKEAGEREAAASCMTAGVHEAQA

     Figure: MaxQuant software; translated genomic sequence; theoretical spectra for proteins (intensity vs. m/z); example sequence A L K G A S. Database: Homo sapiens reference proteome, 71,434 entries (20,246 reviewed proteins, 51,188 unreviewed).
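The reviewed/unreviewed split quoted for the database follows from the UniProt header prefix: `sp` marks reviewed (Swiss-Prot) entries and `tr` marks unreviewed (TrEMBL) entries. A minimal sketch of reading such a file and counting entries by status is given here; the function names `read_fasta` and `count_by_status` are hypothetical helpers written for this example, not part of any proteomics package.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}; sequences may span lines."""
    entries = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            entries[header] = []
        elif header is not None:
            entries[header].append(line)
    return {h: "".join(parts) for h, parts in entries.items()}


def count_by_status(entries):
    """Count entries by UniProt prefix: sp = reviewed, tr = unreviewed."""
    labels = {"sp": "reviewed", "tr": "unreviewed"}
    counts = {"reviewed": 0, "unreviewed": 0, "other": 0}
    for header in entries:
        prefix = header.split("|", 1)[0]
        counts[labels.get(prefix, "other")] += 1
    return counts
```

Applied to the seven example entries on this slide, `count_by_status` would report four reviewed and three unreviewed entries.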

  5. MS instrumentation in proteomics. Aebersold R and Mann M. 2003. Nature 422: 198-207

  6. Coupling LC to MS for complex mixture analysis. Nanoflow LC/MS interface set-up: Proxeon Easy nLC nanoflow LC system coupled to an LTQ-Orbitrap; no precolumn or split. Column (75 µm)/spray tip (8 µm), 12-15 cm, packed with reverse-phase C18 beads, 3 µm. Sample loading: ~700 nl/min; gradient elution: ~200 nl/min. Platinum wire, 2.0 kV.

  7. Coupling LC to MS for complex mixture analysis. BSA tryptic in-solution digest, 50 fmol on column

  8. LTQ-Orbitrap (2005). Instrument schematic: source, linear ion trap (LTQ), C-trap, octopole collision cell, Orbitrap. LTQ-FT MS/MS optimized scan cycle: Orbitrap-MS full scan (→ peptide mass measurement) followed by five MS2 scans in the LTQ-MS (→ peptide sequencing); time axis 0-1800 msec.

  9. Data processing workflow: MaxQuant

  10. Acquisition speed: LTQ Orbitrap XL vs. LTQ Orbitrap Velos. Legend: □ CID identified; + CID not identified

  11. Acquisition speed: number of MS/MS scans

  12. Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC). "Normal AA" (Lys-12C6): resting cells; "heavy AA" (Lys-13C6): treated cells (drug, GF). Workflow: combine and lyse; protein purification or fractionation; proteolysis (trypsin, Lys-C, etc.); quantitation and identification by MS (nanoscale LC-MS/MS)
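The arithmetic behind SILAC quantitation with the Lys-13C6 label can be sketched as follows. Replacing six 12C atoms by 13C adds about 6.02 Da per lysine, so the heavy partner of a peptide appears at a predictable m/z offset, and the relative abundance is read from the heavy/light intensity ratio. The function names are hypothetical, and the sketch assumes lysine is the only labeled residue and that labeling is complete.

```python
C13_MINUS_C12 = 1.0033548          # mass difference between 13C and 12C (Da)
LYS6_SHIFT = 6 * C13_MINUS_C12     # Lys-13C6 vs. Lys-12C6, about 6.0201 Da


def heavy_mz(light_mz, peptide, charge):
    """m/z of the heavy SILAC partner for a given light precursor m/z."""
    n_lys = peptide.count("K")     # each lysine carries one heavy label
    return light_mz + n_lys * LYS6_SHIFT / charge


def silac_ratio(light_intensity, heavy_intensity):
    """Heavy/light ratio, e.g. treated vs. resting cells."""
    return heavy_intensity / light_intensity
```

For a doubly charged peptide with one lysine the pair is separated by about 3.01 m/z units, which is why a tryptic or Lys-C digest gives easily resolved doublets in the full scan.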

  13. Current research at the PCT
      • Proteogenomics
          • B. subtilis, E. coli (Krug et al, 2011, Mol Biosystems; 2013, MCP)
          • Pristionchus pacificus (Borchert et al, 2010, Genome Res)
          • cancer cell lines/tissues
      • Proteomics for systems biology
          • In-depth sequencing and quantitation of model organisms (B. subtilis, E. coli, S. pombe, A. thaliana) (Soufi et al, 2010, J Prot Res; Schütz et al, 2011, Plant Cell; Soufi et al, 2012, Curr Opinion Microbiol; Soares et al, 2013, JPR)
      • Phosphoproteomics
          • targets of Aurora kinase in S. pombe (Koch et al, 2011, Science Signaling)
          • targets of protein kinase D in human cells (Franz-Wachtel et al., 2012, MCP)
          • targets of S/T/Y kinases and phosphatases in B. subtilis and E. coli
      • Protein modifications
          • ubiquitylation (Ikeda et al, 2011, Nature)
          • lysine acetylation (Carpy et al., in preparation)
      • Clinical proteomics
          • genetic rescue of Fragile X phenotype in FMR1 KO mice

  14. Super-SILAC in Bacteria

  15. Super-SILAC in Bacteria

  16. E. coli: Replicate 1 and 2 *in all phases of growth Soufi et al. in preparation

  17. Biological reproducibility Soufi et al. in preparation

  18. Proteome dynamics during growth Soufi et al. in preparation

  19. Dynamics of stress proteins during growth Soufi et al. in preparation

  20. Estimation of absolute copy numbers. Figure: growth curve (OD 600 vs. time in min, axis marks at 1800 and 5760) with sampling points T1-T7; absolute quantification with the UPS standard (iBAQ). Soufi et al. in preparation
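A minimal sketch of how iBAQ values are commonly turned into absolute amounts with a spiked-in standard of known quantity, assuming the usual definition (summed peptide intensity divided by the number of theoretically observable peptides) and a log-log linear calibration. The function names and all numbers used with them are illustrative assumptions; they are not taken from the study shown on the slide.

```python
import math

AVOGADRO = 6.02214076e23


def ibaq(summed_intensity, n_theoretical_peptides):
    """iBAQ: summed peptide intensity per theoretically observable peptide."""
    return summed_intensity / n_theoretical_peptides


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards):
    """Fit log10(amount in mol) against log10(iBAQ) for the standard proteins.

    `standards` is a list of (ibaq_value, known_amount_mol) pairs.
    """
    xs = [math.log10(value) for value, _ in standards]
    ys = [math.log10(amount) for _, amount in standards]
    return fit_line(xs, ys)


def estimate_amount(ibaq_value, slope, intercept):
    """Predict the amount (mol) of a sample protein from its iBAQ value."""
    return 10 ** (slope * math.log10(ibaq_value) + intercept)


def copies_per_cell(amount_mol, n_cells):
    """Convert a molar amount in the analysed sample to copies per cell."""
    return amount_mol * AVOGADRO / n_cells
```

The conversion to copies per cell needs the number of cells that went into the sample, which is an experimental input (for example derived from the OD 600 reading) rather than something the calibration provides.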

  21. Summary of absolutely quantified proteins Soufi et al. in preparation

  22. Most abundant Proteins (ES) Soufi et al. in preparation

  23. Dynamic range of protein abundance. Figure: count vs. log2 protein copy number; blue: all proteins, red: membrane proteins. Soufi et al. in preparation

  24. Proteogenomics • Application of tandem mass spectrometry to genome re-annotation • Search MS/MS spectra against a database containing the complete genome translated in 6 reading frames
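The six-frame translation mentioned above can be sketched as follows: three reading frames on the forward strand and three on the reverse complement, each translated codon by codon, then split at stop codons into candidate ORFs. The sketch uses the standard genetic code; the frame labels (`+1` ... `-3`) and the `min_length` cut-off are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return the six translations keyed by strand and frame, e.g. '+1'."""
    dna = dna.upper()
    frames = {}
    for strand, sequence in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(sequence[offset:])
    return frames


def stop_to_stop_orfs(translation, min_length=2):
    """Split a translated frame at stop codons ('*') into candidate ORFs."""
    return [orf for orf in translation.split("*") if len(orf) >= min_length]
```

Every stop-to-stop segment from all six frames becomes a database entry, which is what makes a six-frame database so much larger than a database of predicted ORFs.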

  25. Problem: database size and structure • Incompatibility with some data processing programs • Long search times • Decreased sensitivity of database search • Unequal target and decoy search spaces • Most translated frames are in fact decoy sequences • Overestimation of the FDR. Diagram: "usual" proteomics applications search Predicted ORFs (target) and REV_Predicted ORFs (decoy); proteogenomics applications search Predicted ORFs plus Frame1 to Frame6 (target) and REV_Predicted ORFs plus REV_Frame1 to REV_Frame6 (decoy).
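The target-decoy scheme in the diagram can be sketched as follows: every target sequence gets a reversed copy with a `REV_` prefix, and the false discovery rate at a score threshold is estimated from the number of decoy hits relative to target hits. This is a simplified illustration; the helper names and the list-of-tuples PSM format are assumptions, and the decoys/targets estimator is one common convention among several.

```python
def add_reversed_decoys(database, prefix="REV_"):
    """Return the database extended with one reversed decoy per target."""
    combined = dict(database)
    for protein_id, sequence in database.items():
        combined[prefix + protein_id] = sequence[::-1]
    return combined


def search_space_size(database):
    """Total number of residues, a rough proxy for search space size."""
    return sum(len(sequence) for sequence in database.values())


def estimate_fdr(psms, threshold, prefix="REV_"):
    """Estimate FDR as decoy hits / target hits above a score threshold.

    `psms` is a list of (score, protein_id) peptide-spectrum matches.
    """
    accepted = [protein_id for score, protein_id in psms if score >= threshold]
    decoys = sum(protein_id.startswith(prefix) for protein_id in accepted)
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0
```

The estimate assumes that a random match is equally likely to land in the target or the decoy half of the database. The slide's point is that this assumption is strained in a six-frame database, where most translated frames in the target half do not encode real proteins.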

  26. Proteogenomics of E. coli • Model Gram-negative bacterium • Small (4.6 Mb) and well characterized genome • ~4,300 protein coding genes (manually annotated and reviewed) • Comprehensive high accuracy MS dataset comprising >42,000 unique peptide sequences from >2,600 proteins • Hypothesis: genome annotation approaches completeness • Assessment of general properties of a simple proteogenomic experiment

  27. Proteogenomics of E. coli 1.9M peptide mass spectra

  28. Proteogenomics of E. coli. Figure, panels A-D: tracks for annotated genes, detected peptides and six-frame ORFs plotted against genome position (Mb). Panels A and B (ybdZ, fes, fepA region): PEP = 4.02E-08, PP = 0.9999; sequences shown: MFEVTFWWRDPQGSEEY..., VGSESWWQSK, TWGYGVTALKVGSESWWQSKHGPEWQRLNDEMFEVTFWWRDPQGSEEY... Panels C and D (yhjB, yhjA, treF region): PEP = 0.027976, PP = 0.9504; sequences shown: MLNQKIQNPNPDELMIEVDLCYELDPYELKLDEMIEAEP..., KPPQIRISL, ...NAVFKPPQIRISL, LATNFGGWILMLNQKIQNPNPDELMIEVDLCYELDPYELKLDEMIEAEP... Krug et al. Mol Cell Proteomics, 2013

  29. Majority of Novel Peptides are False Positives Krug et al. Mol Cell Proteomics, 2013

  30. Assessment of Processing Workflows Krug et al. Mol Cell Proteomics, 2013

  31. Deep Proteome Coverage of Escherichia coli. MS/MS scans: mean 20 scans, median 7 scans (axis 0-150). 20-fold base coverage of 27.5% of the genome sequence. Krug et al. Mol Cell Proteomics, 2013

  32. Conclusions
      • proteomics reaches analytical capacity to identify and quantify all gene products in microorganisms grown in culture
      • several regulatory protein modifications (e.g. S/T/Y-phosphorylation, lysine acetylation) can routinely be analyzed on a global scale
      • many challenges ahead:
          • analysis of H/D-phosphorylation
          • analysis of environmental samples
          • coverage of genome/protein sequence by detected peptides
      • future developments:
          • faster MS/MS acquisition
          • smarter acquisition software
          • large-scale targeted proteomics
          • metaproteomics and individual proteomics

  33. Acknowledgements. Proteome Center Tuebingen: Boumediene Soufi, Nelson C. Soares, Philipp Spät, Karsten Krug, Alejandro Carpy, Sasa Popic, Silke Wahl. Funding
