RARE Germline variability in pediatric leukemia. Cancer Biology Series, January 29, 2013. Todd Druley, MD, PhD, Assistant Professor of Pediatrics and Genetics.
Presenter Disclosure Information • Todd E. Druley, M.D., Ph.D. • Druley Lab / WUSM CGSSB • In compliance with ACCME policy, WU requires the following disclosures to the session audience:
Why study rare variation? • Whole genomes show 2-4 million variants PER PERSON! • Only about 25-33% of these are common (>2% MAF). • There are roughly 22,000 human genes • This equals ~40,000,000 nucleotides total for all of our genes. • ~1.5% of the entire genome • If 2 individual genomes differ by: • 2M x 0.67 = 1,340,000 nucleotides • There are 1.8 x 10^12 possible combinations between the two genomes!!
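The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch, assuming the 1.8 x 10^12 figure is the number of pairings of one rare variant from each genome (1,340,000 squared); the variable names are illustrative and not from the talk.

```python
# Figures are taken from the slide; variable names are illustrative.
variants_per_genome = 2_000_000   # lower end of the 2-4 million range
rare_fraction = 0.67              # roughly 1 minus the ~33% that are common

# Rare variants per genome (rounded to absorb floating-point noise).
rare_variants = round(variants_per_genome * rare_fraction)

# Assumption: "combinations" means one rare variant from each genome, paired.
pairwise_combinations = rare_variants ** 2

print(rare_variants)                   # 1340000
print(f"{pairwise_combinations:.1e}")  # 1.8e+12
```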
Common vs. Rare Variants • Critical differences between common and rare variant analysis include: • Rare variants have greater effect sizes [average OR=3.7] (Bodmer Nat Genet 2008) • Disruptive rare variants are more likely to act dominantly (Fearnhead Cell Cycle 2005) • Rare variants are individually rare, but collectively common when collapsed (binned) within a genetic locus or metabolic pathway (Cohen Science 2004; Ji Nat Genet 2008)
[Figure from Antonarakis SE et al. Nature Rev Genet 2009, annotated "We're operating here" at the "Private" variant category.]
Example: • Cystic Fibrosis • Originally, only the ΔF508 mutation was thought to be causative for CF. • Sequencing of the CFTR gene was initiated. • Now over 1000 mutations in CFTR have been documented. • These cause varying severities of cystic fibrosis. http://www.ccb.sickkids.ca/index.php/cystic-fibrosis-mutation-database.html
Complex diseases demonstrating increased rare variation AJHG 80, 779-791; 2007 • Psychiatric illness, cancer, autoimmune disorders, heart disease, height, extreme longevity, many others… • Obesity • High Cholesterol • Sequenced two groups of 128 individuals each
What about pediatric cancer? • “Early onset cancer” = cancer diagnosed before age 50 • Germline “cancer causing gene alleles” (TP53, APC, BRCA1): average age of disease onset is in the 20s • Somatic mutation cannot explain the incidence of pediatric cancer. • Epidemiologic studies have failed to identify exposures causing these cancers. • Almost all pediatric cancer patients have a negative family history. • So why do we see ~3 children/week with a new cancer??
Infant acute leukemia – worst outcomes • ~50% mortality, 67% with MLL-rearrangements • MLL regulates developmental transcription (HOX genes) • Survivors often left with developmental problems • COG AE24 “Epidemiology of Infant Leukemia” • Largest case-control study to date looking for pre/perinatal exposures associated with infant leukemia • Topoisomerase II inhibitor exposure during pregnancy • Only associated with AML, but didn’t impact survival • Ross JA, J Natl Cancer Inst Monogr 2008
Pilot exome sequencing experiment • GERMLINE exome sequencing from 25 pairs of mothers and infants with MLL-negative acute leukemia • Julie Ross, PhD (PI) and Amy Linabery, PhD. • We are looking at genes with rare variants in affected infants, but also inherited from mothers • These parents typically don’t have leukemia or other cancers. • We hypothesize a combinatorial effect from parental variants contributes to the early onset/short latency of leukemia.
Demographics 25 pairs of Caucasian mothers and infants: 12 ALL, 13 AML
Validated bioinformatics • We analyzed exome data using a validated bioinformatic pipeline: • Align using Novoalign • Call variants with SAMtools • Sensitivity = 97% • Specificity = 99.8%
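For readers less familiar with the two validation metrics quoted above, here is a minimal sketch of how sensitivity and specificity are defined. The counts are hypothetical, chosen only so that the output matches the percentages on the slide; they are not the validation data from the talk.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that the pipeline called."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant positions correctly left uncalled."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not from the talk).
sens = sensitivity(true_pos=97, false_neg=3)
spec = specificity(true_neg=998, false_pos=2)

print(f"sensitivity = {sens:.1%}")  # 97.0%
print(f"specificity = {spec:.1%}")  # 99.8%
```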
Variant calls in COSMIC genes • Prioritize by comparing our variant calls in genes already associated with hematologic malignancies in the COSMIC database. • http://www.sanger.ac.uk/genetics/CGP/cosmic/ • ALL (126 ALL-associated genes) • Infants = 695 total variants (481 known, 214 novel) • Mothers = 728 total (588 known, 140 novel – 65%) • AML (657 AML-associated genes) • Infants = 5517 total (3961 known, 1556 novel) • Mothers = 4735 total (4264 known, 471 novel – 30%)
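The slide does not say what the 65% and 30% refer to. A quick check, sketched below, suggests they are the mothers' novel variant count divided by the infants' novel variant count in each group; that reading is an inference from the arithmetic, not something the slide states. The counts are copied from the slide and the names are illustrative.

```python
# Novel variant counts copied from the slide.
novel_counts = {
    "ALL": {"infants": 214, "mothers": 140},
    "AML": {"infants": 1556, "mothers": 471},
}


def maternal_share(group: str) -> float:
    """Mothers' novel variants as a fraction of infants' novel variants."""
    counts = novel_counts[group]
    return counts["mothers"] / counts["infants"]


for group in novel_counts:
    print(group, f"{maternal_share(group):.0%}")
# ALL 65%
# AML 30%
```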
Permutation testing • Average: ALL = 5 variant genes/infant, AML = 6 variant genes/infant • [Figure: two panels, each labeled "Null distribution"] • Both sets of infants have a statistically significant (P < 10^-7) enrichment of novel, non-synonymous, deleterious germline variants in genes associated with hematopoietic malignancies (COSMIC). Mark Valentine
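The slide does not describe how the permutation test was set up. The following is a generic sketch of a gene-set permutation test, assuming the statistic is the total number of variants falling in the gene set and the null is built from random gene sets of the same size. The function name, the toy data, and the choice of statistic are all assumptions for illustration, not the authors' method.

```python
import random


def permutation_p_value(variant_counts, gene_set, n_perm=10_000, seed=0):
    """Empirical p-value for the variant burden in gene_set.

    variant_counts: dict mapping gene name -> number of variants observed.
    gene_set: genes of interest (all must be keys of variant_counts).
    The null distribution comes from random gene sets of equal size.
    """
    rng = random.Random(seed)
    genes = list(variant_counts)
    observed = sum(variant_counts[g] for g in gene_set)
    size = len(gene_set)
    as_extreme = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, size)
        if sum(variant_counts[g] for g in sample) >= observed:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Toy data (hypothetical): 200 genes, with variants only in 5 of them.
toy_counts = {f"gene{i}": 0 for i in range(200)}
target_set = [f"gene{i}" for i in range(5)]
for gene in target_set:
    toy_counts[gene] = 3

observed, p_value = permutation_p_value(toy_counts, target_set, n_perm=2000)
print(observed, p_value)
```

With this estimator the smallest reportable p-value is 1 / (n_perm + 1), so a bound such as P < 10^-7 would require more than 10^7 permutations or a different way of estimating the tail.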
Validation • No significant enrichment in randomly chosen gene sets in infants • No significant enrichment in random or leukemia gene sets in Caucasian unaffected exomes • Unlikely to see the same novel variant in only related mother : infant pairs by chance. • 45% in ALL; 23% in AML • Consistent with maternal totals of 65% & 30%, respectively • Sanger validation of other variants is ongoing
micro-RNA regulation? • Many variant candidate genes are regulated by MIRs independently associated with leukemia and cell cycle regulation: Nick Sanchez
Pathway Analysis • ABC transporters • Developmental defects • Chloride channel regulator activity • Transcription factor dysregulation • YY1, Cdx, HNF1, MAF, EA2 • TDG glycosylase mediated binding and cleavage of a thymine, uracil or ethenocytosine opposite a guanine
Implications / Conclusions • Supports the hypothesis that infants with leukemia are born with a putatively functional enrichment of variation in genes associated with leukemogenesis. • Infants with AML have an excess of novel, nonsynonymous, deleterious variation not from mother. • Paternal age = de novo mutation during spermatogenesis? • De novo mutation during embryogenesis? • Can we identify discrete biological/developmental and regulatory mechanisms leading to early onset leukemia? • MIRs • ABC transporters • Specific transcription factors
Future work: SHORT TERM: • Complete the bioinformatic analysis • Compare to existing data (TARGET and PCGP) • Exome sequencing of 25 MLL-positive pairs LONG TERM: • Validate results in a second cohort of triads • Establish model systems to study complex genetic interactions • Integrate information into clinical trials?
High-risk pediatric ALL: Pooled sequencing • Patient germline (N=96) • Patient leukemia (N=96) • Unaffected controls (N=93) 55 genes per pool
Candidate genes for pooled sequencing • 55 genes selected for pooled sequencing • All genes have been published in relation to pediatric ALL • 43 were identified near significant tagged-SNPs on the prior array (asterisks) • Various cellular functions
Pooled sequencing pilot project • Sequenced 94.5% of coding regions from all three pools. • 420 kb per person = 1.2 x 10^8 total bases covered
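The total on this slide follows from the per-person target size and the pool sizes. A minimal sketch, assuming the three pools are the ones listed on the pooled-sequencing design slide (96 germline, 96 leukemia, 93 controls); variable names are illustrative.

```python
# Pool sizes from the pooled-sequencing design slide.
pool_sizes = {"germline": 96, "leukemia": 96, "controls": 93}
bases_per_person = 420_000  # 420 kb of coding sequence per person

total_people = sum(pool_sizes.values())
total_bases = total_people * bases_per_person

print(total_people)           # 285
print(f"{total_bases:.1e}")   # 1.2e+08
```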
Validation at 384 base positions by custom Illumina GoldenGate array
Overlap • 49% of called variants are unique to the ALL Germline pool • Only 2.5% of Leukemia variants were NOT seen in the Germline pool (97.5% overlap) • Somatic mutations
Visualizing the dataset • [Figure: variants plotted along amplicons against conservation across species (high to low). Legend: leukemia SNPs (x), germline SNPs (+), control SNPs (Δ).] • No variants in control group • Multiple variants in affected germline • Overlap with highly conserved region • Joe Giacalone, Mark Valentine
Exome variant server overlay Drew Hughes
All looking at known ancestral polymorphisms and the incidence of acute leukemia. • None involve sequencing to demonstrate novel/rare variants in the same genes.
Overexpressed genes: • ATM • CDKN1A • CYP1A1 • CYP3A5 • IKZF1 • MDM2 • MLL • MTHFR • NAT2 • NQO1 • PAX5 • PTPN11 • TCF3 • TPMT 6 of 14 overexpressed genes (43%) are involved in drug metabolism.
Additional gene expression profiles • Similar expression differences in 18 additional genes (5 overexpressed CYPs). • All genes possess ≥1 novel coding variant in P9906 patients. • No clear connection between genetic variation and gene expression. Drew Hughes
Implications / Conclusions: • Overexpression of specific genes involved in metabolism of anti-leukemia agents identifies a subgroup of children with inferior EFS. • Private sequence variation in drug/energy metabolism genes is not coupled to expression profiles, but may predispose to leukemia or modulate therapeutic response through defective metabolism. • Pathogenesis vs. pharmacogenomics? • Therapeutic implications: • Can look for these genomic signatures at diagnosis; existing precedent • Dose modification or direct to bone marrow transplant
Future work: • Validation and identification of individual profiles. • Delve more into the underexpressed genes as well. • Analyze sequencing results of ~700 additional drug/energy metabolism genes. • Functional iPSC-based assays from patient fibroblasts. • Introduction into immune-deficient mice for functional study.
Acknowledgements & Funding Wash U: • Bob Hayashi • Alan Schwartz • Rob Mitra • F. Sessions Cole COG: • Julie Ross • Logan Spector • Mignon Loh • Rick Harvey Druley Lab: • Nick Sanchez • Mark Valentine • Joe Giacalone • Drew Hughes • Andrew Young 1K08CA140720-01A1 Eli Seth Matthews Leukemia Foundation™