Personal Genomics meets Quantitative Proteomics
George Church (Harvard GTL & CEGS Centers)
5-Jan-2006 HPCGG Landsdowne 2 PM
Thanks to: NHGRI Seq Tech 2004: Agencourt, 454, Microchip; 2005: Nanofluidics, Network, VisiGen, Affymetrix, Helicos, Solexa-Lynx
"Open-source" Personal Genome Project (PGP)
• Harvard Medical School IRB Human Subjects protocol: submitted 16-Sep-2004, approved 31-Aug-2005.
• Gradual plan. Start with "highly-informed" individuals consenting to non-anonymous genomes & extensive phenotypes (medical records, imaging, omics).
• Cell lines in Coriell NIGMS Repository
• Diploid genome subsets at $0.1/kb, <3E-7 FP errors
• How? Polony bead Sequencing-by-Ligation (SbL)
Analyses of single chromosomes (single cells, RNAs, particles)
(1) When we only have one cell, as in Preimplantation Genetic Diagnosis (PGD) or environmental samples
(2) Candidate chromosome region sequencing
(3) Prioritizing or pooling (rare) species based on an initial DNA screen
(4) Multiple chromosomes in a cell or virus
(5) RNA splicing
(6) Cell-cell interactions (predator-prey, symbionts, commensals, parasites)
CD44 Exon Combinatorics (Zhu & Shendure)
• Alternatively spliced cell adhesion molecule
• Specific variable exons are up- or down-regulated in various cancers (>2000 papers)
• v6 & v7 enable direct binding to chondroitin sulfate, heparin…
Zhu J, et al. Science 301:836-8.
CD44 RNA isoforms
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
Zhu J, Shendure J, Mitra RD, Church GM. Single molecule profiling of alternative pre-mRNA splicing. Science 301:836-8.
Molecular Weight Assessment of Proteins in Total Proteome Profiles Using 1D-PAGE and LC/MS/MS.
Candidates for alternative splicing (AS), endoproteolytic processing (EPP), & post-translational modifications (PTMs) in lymphoblastoid cells

Protein Name                                 Predicted MW   Observed MW   Difference before & after leader cleavage
Cytochrome c oxidase subunit IV isoform 1    19577          2582          205
NADH dehydrogenase                           21750          5084          334
Coproporphyrinogen oxidase                   50175          13632         357
MHC II, DQ β 1                               29733          25896         404
NADH (ubiquinone) Fe-S protein 2             52545          48185         815
Mito short-chain enoyl-CoA hydratase 1       31371          27499         901
Peptidylprolyl isomerase B (cyclophilin)     23742          19360         940

Ahmad R, Nguyen DH, Wingerd MA, Church GM, Steffen MA. Proteome Sci. 3:6 (2005)
Light-regulated circadian metabolism
[Figure: glycogen branch of central carbon metabolism. Metabolites: α-Glc-1P, ADP-Glc, glycogen (α-1,4-glucosyl-glucan). Genes: glgA, glgB, glgC, glgX, glgP.]
Zinser et al., unpublished.
Viral Photosynthetic Proteins (HLIP, D1)
[Figure: phage genome maps with photosynthesis genes marked. Podovirus P-SSP7 (46 kb); Myovirus P-SSM2 (255 kb); Myovirus P-SSM4 (181 kb). Gene labels: PC, HLIPs, Fd, D1, D2.]
Lindell, Sullivan, Chisholm et al. 2004
Photosynthesis genes in marine viruses yield proteins during host infection.
Lindell D, Jaffe JD, Johnson ZI, Church GM, Chisholm SW. Nature 2005 438:86-9.
Photosynthesis genes in marine viruses yield proteins during host infection.
[Figure labels: host; phage; 15N 13C synthetic standards]
Lindell D, Jaffe JD, Johnson ZI, Church GM, Chisholm SW. Nature 2005 438:86-9.
Improving MS Peptide Coverage
• Ionization efficiency
• Ions outside the mass range of the analyzer
• Chromatographic behavior
• Sample preparation bias
• Instrument duty cycle
• Improve spectra interpretation over current algorithms
• Details of fragmentation patterns
• Dipeptide P, DE/KR, V.G intensity effects
• B & Y ions unequal & co-dependent
• More intense ions in middle of peptides
MDQuest: Mike Chou, Dan Schwartz, Steve Gygi, Josh Elias http://gygi.med.harvard.edu/dpsp/
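For readers less familiar with the b and y ions mentioned above, here is a minimal sketch of how their singly charged m/z values follow from a peptide sequence. It is an illustration only, not the scoring method of any tool named on the slide. The function name `by_ions` and the example peptide are assumptions chosen for the sketch; the residue masses are the standard monoisotopic values, rounded to five decimals.

```python
# Standard monoisotopic residue masses (Da), rounded to 5 decimals.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass of one proton (charge carrier)


def by_ions(peptide):
    """Return (b_index, y_index, b_mz, y_mz) for every backbone cleavage.

    b ions keep the N-terminal residues; y ions keep the C-terminal
    residues plus one water. All ions are assumed singly charged.
    """
    masses = [MONO[aa] for aa in peptide]
    total = sum(masses)
    ions = []
    running = 0.0
    for i in range(1, len(peptide)):
        running += masses[i - 1]
        b_mz = running + PROTON
        y_mz = (total - running) + WATER + PROTON
        ions.append((i, len(peptide) - i, b_mz, y_mz))
    return ions


# Hypothetical example peptide, used only to exercise the function.
ions = by_ions("DAFLSGER")
for b_i, y_i, b_mz, y_mz in ions:
    print(f"b{b_i} = {b_mz:9.4f}   y{y_i} = {y_mz:9.4f}")
```

Each complementary b/y pair sums to the neutral peptide mass plus two protons, which is one simple reason the two series are not independent pieces of evidence.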
MapQuant is a program designed to isolate unique organic species and quantify their relative abundances from an LC/MS experiment.
[Figure: 2-D peptide map. Axes: m/z units vs. time or scans (scan numbers N, N+1, N+2, N+3).]
Scheme: Data from an LC/MS experiment are analyzed after being formatted into a data structure called a 2-D map, analogous to a gray-scale image.
MapQuant Gives a List of All Organic Species in the Sample
[Figure: 2-D map (m/z units vs. retention time) and MapQuant]
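The 2-D map idea can be pictured as a grid with one row per m/z bin and one column per scan, each cell holding a summed intensity. The following is a minimal sketch under that assumption. It is not MapQuant's code or file format; the function name `build_2d_map`, the bin width, and the toy scans are all hypothetical.

```python
def build_2d_map(scans, mz_min, mz_max, bin_width):
    """Bin (m/z, intensity) peaks from consecutive scans into a 2-D grid.

    Returns a list of rows, one per m/z bin. Each row holds one summed
    intensity per scan, so the grid can be read like a gray-scale image.
    """
    n_bins = int(round((mz_max - mz_min) / bin_width))
    grid = [[0.0] * len(scans) for _ in range(n_bins)]
    for col, peaks in enumerate(scans):
        for mz, intensity in peaks:
            if mz_min <= mz < mz_max:
                # Clamp to guard against floating-point edge cases.
                row = min(int((mz - mz_min) / bin_width), n_bins - 1)
                grid[row][col] += intensity
    return grid


# Toy data (made up): three consecutive scans N, N+1, N+2.
scans = [
    [(500.2, 10.0), (501.7, 5.0)],   # scan N
    [(500.3, 30.0), (501.6, 4.0)],   # scan N+1
    [(500.2, 12.0)],                 # scan N+2
]
grid = build_2d_map(scans, mz_min=500.0, mz_max=503.0, bin_width=1.0)
for row in grid:
    print(row)
```

In this picture, a species that elutes over several scans shows up as a horizontal streak in one row, which is what makes image-style peak finding applicable.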
MapQuant is publicly available at http://arep.med.harvard.edu/mapquant.html
MapQuant gives me a list of all organic species in the sample
BUT WHAT ARE THEIR IDENTITIES?
[Figure: 2-D map, m/z units vs. retention time (in min). Some features carry peptide sequences (QEPERSEK, DAFLSGER, EKLAVSAR); others are marked "?".]
Dealing With Many Peptides (Organic Species)
MapQuant identifies approx. 2×10^4 organic species per LC/MS experiment.
ONLY ~500 (3%) organic species have fragmentation (CID) spectra and hence sequence IDs.
[Figure: 2-D map, m/z units vs. retention time (in min). Legend: marker = CID spectrum or MS/MS event. Some features carry peptide sequences (QEPERSEK, DAFLSGER, EKLAVSAR); others are marked "?".]
Dealing With Many Peptides (Organic Species)
Database of 11845 peptides from ALL LC/MS experiments carried out on Prochlorococcus samples
[Figure: 2-D map, m/z units vs. retention time (in min), with features located by (rt, m/z) coordinates. Some features carry peptide sequences (QEPERSEK, DAFLSGER, EKLAVSAR); others are marked "?".]
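One way to read the (rt, m/z) coordinate idea is that a feature without its own CID spectrum can borrow an identity from a database peptide whose coordinates fall within a tolerance window. The sketch below illustrates that reading only; it is not presented as the procedure actually used. The function name `assign_ids`, the tolerances, and every coordinate value are assumptions made up for the example (only the three peptide strings come from the slide).

```python
def assign_ids(features, database, rt_tol=1.0, mz_tol=0.05):
    """Assign each (rt, m/z) feature the closest database peptide in tolerance.

    features: list of (rt_min, mz) tuples for unidentified species.
    database: list of (sequence, rt_min, mz) tuples from earlier experiments.
    Returns one entry per feature: a sequence string, or None if no match.
    """
    assigned = []
    for rt, mz in features:
        best_seq, best_dist = None, None
        for seq, db_rt, db_mz in database:
            d_rt = abs(rt - db_rt)
            d_mz = abs(mz - db_mz)
            if d_rt <= rt_tol and d_mz <= mz_tol:
                # Distance scaled by tolerance so both axes count equally.
                dist = (d_rt / rt_tol) ** 2 + (d_mz / mz_tol) ** 2
                if best_dist is None or dist < best_dist:
                    best_seq, best_dist = seq, dist
        assigned.append(best_seq)
    return assigned


# Hypothetical coordinates, invented for illustration.
database = [
    ("QEPERSEK", 12.4, 494.74),
    ("DAFLSGER", 25.1, 433.71),
    ("EKLAVSAR", 31.8, 430.26),
]
features = [(12.6, 494.75), (25.0, 433.70), (40.0, 600.00)]
ids = assign_ids(features, database)
print(ids)
```

A real workflow would also need retention-time alignment between runs and a way to handle several database peptides landing in the same window; both are left out of this sketch.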
Protein Distribution Among Experiments
TOTAL NUMBER OF ORFS: 1742
[Figure labels: 17, 792, 522, 539, 1314]
Summary
• Open Personal Genome Project (PGP) including Proteomics
• Single molecule RNAs for alternative splicing (AS)
• Gel-MS methods for endoproteolytic processing
• MapQuant for MS quantitation without isotopic labeling
• http://arep.med.harvard.edu
Ahmad R, Nguyen DH, Wingerd MA, Church GM, Steffen MA. Proteome Sci. 3:6 (2005)