
“Cancer Genomics”

“Cancer Genomics”. Richard K. Wilson, Ph.D. Washington University School of Medicine. rwilson@watson.wustl.edu. Next-generation sequencing technology. Cancer Genomics. Human Genome v1.0. Ancillary genomes: mouse chimp etc. Discovery. Technology Software tools Infrastructure.


Presentation Transcript


  1. “Cancer Genomics” Richard K. Wilson, Ph.D., Washington University School of Medicine, rwilson@watson.wustl.edu

  2. Next-generation sequencing technology • Cancer Genomics • Human Genome v1.0 • Ancillary genomes: mouse, chimp, etc. • Discovery • Technology, software tools, infrastructure • Cancer • Other diseases

  3. PCR-based re-sequencing list of candidate genes large collection of patient samples

  4. EGFR mutations in NSCLC [Figure: EGFR domain diagram showing the EGF ligand binding, TM, tyrosine kinase, and autophosphorylation regions; labeled positions 718, 745, 776, 835, 858, Y869, 947, 964; motif and residue labels GXGXXG, K, R, H, DFG, R, Y, M, LREA] • Most TKI responders have EGFR mutations: • Study 1: 8/9 (89%) vs. 0/7 controls • Study 2: 5/5 (100%) vs. 0/4 controls • Study 3: 19/24 (79%) vs. 0/20 controls
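The responder-versus-control counts on this slide can be checked with a one-sided Fisher's exact test. The sketch below is illustrative only: the talk does not say which test, if any, was applied, and the function name `fisher_one_sided` and the 2x2 table layout are assumptions introduced here.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a = responders with an EGFR mutation, b = responders without,
    c = controls with a mutation,        d = controls without.
    Returns P(X >= a) under the hypergeometric null (margins fixed).
    """
    row1 = a + b              # total responders
    col1 = a + c              # total mutated samples
    n = a + b + c + d
    denom = comb(n, row1)
    upper = min(row1, col1)   # largest possible count in the top-left cell
    p = 0.0
    for k in range(a, upper + 1):
        p += comb(col1, k) * comb(n - col1, row1 - k) / denom
    return p

# Counts as quoted on the slide: (mutated responders, unmutated responders,
# mutated controls, unmutated controls).
STUDIES = {
    "Study 1": (8, 1, 0, 7),
    "Study 2": (5, 0, 0, 4),
    "Study 3": (19, 5, 0, 20),
}

p_values = {name: fisher_one_sided(*table) for name, table in STUDIES.items()}

for name, p in p_values.items():
    print(f"{name}: one-sided p = {p:.2e}")
```

Even the smallest study (5/5 vs. 0/4) gives a small p-value under this model, because no control carried a mutation.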

  5. Tumor Sequencing Project ~600 genes of interest ~200 lung adenocarcinoma samples • Sequencing Centers: BCM-HGSC, BI, WUGSC • Cancer Centers: MSKCC, DFCI, SCC, MDA

  6. TSP Target List • Too expensive to sequence the whole genome; therefore, focus on “druggable” targets. • For lung adenocarcinoma TSP: ~600 genes (exons only) • Receptor tyrosine kinases (e.g. EGFR) • Selected serine-threonine kinases • Known oncogenes • Known tumor suppressor genes • EGFR pathway genes • DNA repair genes • Etc.

  7. SNP Arrays

  8. SNP Arrays

  9. DNA Chips/SNP Arrays

  10. Lung Adeno Genomic Events: SNP Array Analysis. Weir et al. Nature (2007)

  11. Lung Adeno Genomic Events Weir et al. Nature (2007)

  12. Lung Adeno Genomic Events Weir et al. Nature (2007)

  13. Lung Adenocarcinoma Amplifications Weir et al. Nature (2007)

  14. Mutations in lung adenocarcinoma • KRAS and TP53 Are Mutated in About 1/3 of Tumor Samples • Indels have not been included in the analysis

  15. Mutations in TP53, ERBB3, and AKT3 appear to correlate with tumor grade [Figure: mutations by tumor grade group; group sizes N=24, N=85, N=71]

  16. Correlations between mutations and clinical features • Mutations in PDGFRA, PTEN, NTRK1 and PRKDC show positive correlation with tumor stage. • Mutations in LRP1B, PRKDC, TP53, and APC correlate with the solid tumor histological subtype of lung adenocarcinoma. • Mutations in EGFR and MYO3B correlate strongly with never-smoker status, and mutations in KRAS and LRP1B with smokers.

  17. EGFR mutations in glioblastoma [Figure: EGFR exon map (exons 1-28) spanning the EC (subdomains I-IV), TM, JM, and KD (kinase) regions, with marked mutations A289V/D/T, R108K, T263P, D46N, G63R, G598V, L861Q, E330K, R324L, P596L; red = somatic, blue = germline, black = unknown] • Screen of kinase domains in glioblastoma: no recurrent mutations. But … • 18/132 glioblastoma (13.6%); + 1 KD • 1/8 glioblastoma cell lines (12.5%) • EGFRvIII (del AA 30-297) • 0/11 lower grade gliomas • 151 total samples • 119 lung tumors: no EC mutations • 270 HapMap normals: no EC mutations

  18. Genomic Studies of Cancer • Hypothesis-driven (biased): • Gene sets with related functions: “kinome”, “phosphatome” • Genes mutated in other cancers • Closely related genes • Investigator-driven ideas • Data-driven (unbiased): • Use genomic platforms to identify loci with recurrent somatic alterations • Array-based RNA profiling • Array CGH • Array-based SNP genotyping R.K.Wilson 2007

  19. Acute myelogenous leukemia • Project initiated in 2002. • Primary tumors, matched normal tissue (i.e., germline variants vs. somatic mutations) • “Discovery set” (46 tumors) + “Validation set” (94 tumors) • Initial target list: 450 genes • Orthogonal technologies (CGH arrays, expression profiling, etc.) for genome characterization and to detect additional sequencing targets.

  20. Acute myelogenous leukemia • FLT3: 29% • NPM1: 25% • NRAS: 9.6% • PTPN11: 4% • RUNX1: 4% • GCSFR: 4% • Others: 2-3%

  21. Is there a better approach? • What are we missing outside of the exons? • PCR-based re-sequencing: • Relatively expensive • Diploid (at best) & low coverage

  22. Solexa/Illumina 1G Analyzer

  23. Solexa/Illumina 1G Analyzer Illumina flow cell • Acts as the microfluidic conduit for cluster generation and sequencing reagents. • 8-lane flow cell configuration. • Separate libraries can be sequenced in each lane, or the same library in all. • ~60M clusters are sequenced per flow cell.

  24. Next Generation Sequencing Technologies

  25. AML: Whole Genome Sequencing • Data types: • Whole genome sequence (tumor genome): Solexa • FL cDNA normalized library: Solexa + 454 • Whole genome sequence (epidermal genome): Solexa • Analysis plans: • Compare sequence to previously identified mutations. • Compare increasing coverage levels to heterozygous SNPs from Affy/Illumina arrays for coverage evaluation. • Devise strategic approaches to find novel variants; validate and characterize.

  26. “933124” • 57 y/o Caucasian female • De novo M1 AML • 100% blasts in initial BM sample • Relapsed and died at 11 months • Normal cytogenetics • No LOH on Affy 500K SNP array • Informed consent for whole genome sequencing

  27. [Figure only; no text recovered]

  28. AML: Whole Genome Sequencing • As of 1/28/08: • 75 Solexa runs completed (32 bp reads) • 62 billion bp (~22X haploid coverage) • 2,123,143 sequence variants detected (Q30) • 492,569 (23.2%) are previously undiscovered SNPs • 46,320 heterozygous (informative) SNPs from Affy and Illumina SNP arrays. • 77% of informative SNPs with both WT and variant alleles were detected in the genome sequence. • 97.4% of informative SNPs of either allele were detected in the genome sequence.
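The headline numbers on this slide are related by simple arithmetic. The sketch below reproduces that arithmetic from the quoted figures; the per-run averages assume that every counted base came from a 32 bp read, which the slide does not state explicitly, and all variable and function names are illustrative.

```python
# Figures as quoted on the slide (status as of 1/28/08).
TOTAL_BASES = 62e9          # "62 billion bp"
HAPLOID_COVERAGE = 22       # "~22X haploid coverage"
RUNS = 75                   # Solexa runs completed
READ_LENGTH_BP = 32         # read length per the slide
TOTAL_VARIANTS = 2_123_143  # Q30 sequence variants
NOVEL_SNPS = 492_569        # previously undiscovered SNPs

def implied_genome_size(total_bases, coverage):
    """Genome size implied by coverage = total bases / genome size."""
    return total_bases / coverage

def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole

genome_size = implied_genome_size(TOTAL_BASES, HAPLOID_COVERAGE)
bases_per_run = TOTAL_BASES / RUNS
# Assumption: all counted bases belong to full-length 32 bp reads.
reads_per_run = bases_per_run / READ_LENGTH_BP
novel_pct = percent(NOVEL_SNPS, TOTAL_VARIANTS)

print(f"Implied haploid genome size: {genome_size / 1e9:.2f} Gbp")
print(f"Average yield per run:       {bases_per_run / 1e6:.0f} Mbp")
print(f"Average reads per run:       {reads_per_run / 1e6:.1f} M")
print(f"Novel share of variants:     {novel_pct:.1f}%")
```

The implied genome size comes out at about 2.8 billion bp, and the novel share matches the 23.2% quoted on the slide.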

  29. AML: Whole Genome Sequencing • “933124” genome sequence: 2,123,143 variants • dbSNP: 1,630,574 • Intergenic: 145,092 • Genic: 334,477 (Splice_site: 99; Coding: 5,056; Other: 329,322) • Coding: Synonymous 1,222; Missense 3,402; Nonsense 320; Nonstop 9 • *Only reporting Q30 variants • *Genic region = gene boundary +/- 50kb

  30. AML: Transcriptome Sequencing • Various cDNA library construction procedures & normalization schemes • 454 cDNA sequencing: number of mapped cDNA reads: 306,267 • Solexa cDNA sequencing: number of mapped reads: 47,153,784

  31. AML: Transcriptome Sequencing Expressed genes: variant:germline frequencies • MYCBP2 1188:345 • HSP90B1 694:1347 • BCCIP 391:394 • NCOR1 256:268 • CHFR 230:52 • DNAJ 218:0 • PTPN11 198:1 • NUMA1 157:2 • CASPASE 7 145:147 • HOX C6 118:2 • PLEKHC1 112:14 • NTRK3 112:10 • CDC2 96:82
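One way to read the variant:germline pairs above is as a variant allele fraction per gene. The sketch below assumes the two numbers are read counts supporting each allele, which the slide does not state outright; the gene labels are copied as shown, and `variant_fraction` is a name introduced here for illustration.

```python
# (variant, germline) pairs as listed on the slide.
# Assumption: each number is a count of cDNA reads supporting that allele.
READ_COUNTS = {
    "MYCBP2": (1188, 345),
    "HSP90B1": (694, 1347),
    "BCCIP": (391, 394),
    "NCOR1": (256, 268),
    "CHFR": (230, 52),
    "DNAJ": (218, 0),
    "PTPN11": (198, 1),
    "NUMA1": (157, 2),
    "CASPASE 7": (145, 147),
    "HOX C6": (118, 2),
    "PLEKHC1": (112, 14),
    "NTRK3": (112, 10),
    "CDC2": (96, 82),
}

def variant_fraction(variant_reads, germline_reads):
    """Fraction of reads at a site that carry the variant allele."""
    total = variant_reads + germline_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return variant_reads / total

fractions = {
    gene: variant_fraction(variant, germline)
    for gene, (variant, germline) in READ_COUNTS.items()
}

# Rank genes from most to least variant-dominated expression.
ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)

for gene, frac in ranked:
    print(f"{gene:10s} {frac:.2f}")
```

Under this reading, a fraction near 0.5 (for example BCCIP) would be consistent with both alleles being expressed about equally, while a fraction near 1.0 indicates that almost all reads carry the variant.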

  32. V194M (C to T) in FLT3 CT CT cDNA sequence Tumor genome sequence

  33. AML: Whole Genome Sequencing • Currently using SXOligoSearchG (Synamatix) to detect small (1-2 bp) indels. • Evaluating software tools for detection of larger indels.

  34. AML: Current status thirsty for knowledge?

  35. AML: Current status • Diploid coverage was obtained for 77% of an AML M1 tumor genome with 22x haploid coverage. • 2.1M sequence variants found (similar to other whole genomes already ‘finished’). • ~495,000 novel variants: SNPs vs. somatic mutations • 10x coverage of epidermis (“normal”) genome just completed; may identify >90% of variants as rare SNPs. • Remaining 50,000 variants are being prioritized by detection in cDNA: should be <1,000 • Very rare somatic mutations in cDNA thus far (only 2 validated). • No mutator (“driver”) phenotype is readily apparent for this AML case; “passenger” mutations appear to be rare. • We continue to sift through the data…
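The narrowing described on this slide (all tumor variants, minus known SNPs, minus variants also seen in the matched normal genome, then prioritized by presence in cDNA) can be pictured as set subtraction. The following is a minimal sketch of that idea, not the pipeline used in the project; the `Variant` type, the function names, and every coordinate in the toy data are hypothetical.

```python
from typing import NamedTuple

class Variant(NamedTuple):
    """A single-nucleotide variant keyed by position and alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str

def candidate_somatic(tumor, known_snps, normal):
    """Tumor variants that are neither known SNPs nor present in the normal genome."""
    return tumor - known_snps - normal

def prioritize_by_cdna(candidates, cdna):
    """Keep only candidates that are also observed in cDNA (i.e. expressed)."""
    return candidates & cdna

# Toy data only: these coordinates are invented for illustration.
tumor = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr2", 2000, "C", "T"),
    Variant("chr3", 3000, "G", "A"),
    Variant("chr4", 4000, "T", "C"),
}
known_snps = {Variant("chr1", 1000, "A", "G")}   # stands in for dbSNP
normal = {Variant("chr2", 2000, "C", "T")}       # stands in for the skin genome
cdna = {Variant("chr3", 3000, "G", "A")}         # stands in for transcriptome calls

candidates = candidate_somatic(tumor, known_snps, normal)
prioritized = prioritize_by_cdna(candidates, cdna)

print(f"{len(tumor)} tumor variants -> {len(candidates)} candidates "
      f"-> {len(prioritized)} expressed candidates")
```

Exact-match set operations are the simplest possible comparison; a real comparison would also have to account for coverage differences between the tumor and normal genomes, which this sketch ignores.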

  36. Cancer Genomics • Exon-targeted sequencing (TSP, glioblastoma) is revealing useful & interesting findings; expensive & slow! • Next Gen sequencing is here and will have a substantial near-term impact on the study of cancer genomes! • Ancillary genome-based technologies (expression profiling, SNP arrays, cDNA sequencing) are crucial for understanding the target genome before considering WGS. • The dream is not hype: a comprehensive understanding of the “cancer genome” is probable, and will change the way that you diagnose & treat your patients.

  37. Acknowledgments • WU Genome Sequencing Center: Elaine Mardis, Li Ding, Dave Dooling, Tracy Miner, Mike McLellan, Ginger Fewell, Jim Eldred, Asif Chinwalla, Yumi Kasai, Lucinda Fulton, Vince Magrini, Matt Hickenbotham, Lisa Cook, Michael Wendl, Michael Province • WU Siteman Cancer Center: Tim Ley, Mark Watson, Matt Walter, Rhonda Ries, Jackie Payton, John DiPersio, Dan Link, Michael Tomasson, Tim Graubert, Sharon Heath • TSP/TCGA Colleagues: Baylor HGSC, Broad Institute, many others… • Funding sources: NHGRI (Wilson), NCI (Ley), Alvin J. Siteman (AML WGS) • genome.wustl.edu
