DNA copy number variation and cancer risk. John F Pearson. Canterbury Statistics Open Day, University of Canterbury, 2/10/2012. Breast Cancer. Foulkes WD. N Engl J Med 2008; 359:2143-2153. Missing heritability. TA Manolio et al. Nature 461, 747-753 (2009) doi:10.1038/nature08494.
DNA copy number variation and cancer risk. John F Pearson. Canterbury Statistics Open Day, University of Canterbury, 2/10/2012
Breast Cancer. Foulkes WD. N Engl J Med 2008; 359:2143-2153
Missing heritability. TA Manolio et al. Nature 461, 747-753 (2009) doi:10.1038/nature08494
Copy number variation
[Figure: allele 1 and allele 2, illustrating copy number loss and copy number gain. Panels: whole gene, partial gene, contiguous genes, regulatory effects.]
Copy number variants (CNVs)
• 16,000 copy number variant loci cover >50% of the human genome
• CNVs are associated with cancer risk
• Rare CNVs detected in ~50% of familial cancer genes, e.g. BRCA1, BRCA2
• Genome-wide association studies of cancer
  • prostate cancer, hepatocarcinoma, nasopharyngeal carcinoma, and neuroblastoma
• Increased CNV load
  • Li-Fraumeni syndrome (cancer-related genes?)
  • breast cancer (TP53 pathway, ESR1 pathway)
SNP arrays
LRR = log2(R_observed / R_expected)
The B Allele Frequency (BAF) is a somewhat confusing term that actually refers to a normalized measure of relative signal intensity ratio of the B and A alleles (Wang et al. Genome Res. 2007 November; 17(11): 1665–1674).
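The two array measures can be sketched in a few lines. The LRR function follows the formula on the slide directly. The BAF function is an assumption: it uses the commonly described interpolation of a sample's theta value between the canonical AA, AB and BB cluster positions, and the cluster positions in the example are hypothetical, not values from the talk.

```python
import math


def log_r_ratio(r_observed, r_expected):
    """LRR = log2(R_observed / R_expected), as on the slide."""
    return math.log2(r_observed / r_expected)


def b_allele_frequency(theta, theta_aa, theta_ab, theta_bb):
    """Assumed normalisation: interpolate theta between genotype clusters.

    theta is the sample's allelic-ratio angle; theta_aa < theta_ab < theta_bb
    are the canonical cluster positions for that SNP (hypothetical here).
    """
    if theta < theta_aa:
        return 0.0
    if theta < theta_ab:
        return 0.5 * (theta - theta_aa) / (theta_ab - theta_aa)
    if theta < theta_bb:
        return 0.5 + 0.5 * (theta - theta_ab) / (theta_bb - theta_ab)
    return 1.0


if __name__ == "__main__":
    # Half the expected intensity, e.g. a one-copy loss in an ideal assay.
    print(log_r_ratio(0.5, 1.0))
    # A sample sitting on the heterozygous cluster.
    print(b_allele_frequency(0.5, theta_aa=0.1, theta_ab=0.5, theta_bb=0.9))
```

In real data the LRR for a one-copy loss is typically less extreme than the ideal value, which is one reason calling algorithms model LRR with noise rather than thresholding it.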
Copy number
[Figure: genotypes BB, AB, AA under copy number loss, copy neutral LOH, and normal copy number.]
Copy number gain
[Figure: genotypes BBB, ABB, AAB, AAA.]
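The genotype labels on these slides map onto expected BAF values by simple counting: in an idealised assay the BAF is the fraction of B alleles in the genotype. The sketch below is an illustration of that arithmetic only; the function names are hypothetical.

```python
from fractions import Fraction


def genotypes(copy_number):
    """All unordered A/B genotypes for a given copy number (>= 1)."""
    if copy_number < 1:
        raise ValueError("copy number must be at least 1")
    return ["A" * (copy_number - k) + "B" * k for k in range(copy_number + 1)]


def expected_baf(genotype):
    """Idealised BAF: number of B alleles divided by total copies."""
    if not genotype:
        raise ValueError("empty genotype")
    return Fraction(genotype.count("B"), len(genotype))


if __name__ == "__main__":
    for cn in (1, 2, 3):
        print(cn, [(g, str(expected_baf(g))) for g in genotypes(cn)])
```

A copy number gain therefore shows up as extra BAF bands at 1/3 and 2/3, while a one-copy loss removes the heterozygous band at 1/2.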
CNV calling
Illumina bead arrays.
CNV calling algorithms:
• CNVision (workflow software)
• Gnosis
• PennCNV
• QuantiSNP
• cnvPartition
PennCNV, QuantiSNP
Hidden Markov Model
Estimate the copy number at each SNP from:
• Log R ratio
• B allele frequency
• transition probability from the state at the previous SNP
PennCNV
• r_i: LRR at SNP i (1 ≤ i ≤ M)
• b_i: BAF at SNP i
• z_i: copy number state
The likelihood of the observed data is: [equation not shown]
• The LRR emission probability model includes a term for chemical fluctuations and misannotation/assembly
• The BAF emission probability is a complicated mixture model
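The likelihood equation itself is not reproduced in this text. As a hedged reconstruction, assuming the standard hidden Markov model factorisation and assuming LRR and BAF are conditionally independent given the copy number state, the likelihood of the observed data would take the following form. The exact expression on the slide may differ.

```latex
P(r_1,\dots,r_M,\; b_1,\dots,b_M)
  = \sum_{z_1,\dots,z_M}
    P(z_1)\,
    \prod_{i=2}^{M} P(z_i \mid z_{i-1})\,
    \prod_{i=1}^{M} P(r_i \mid z_i)\, P(b_i \mid z_i)
```

Here P(r_i | z_i) and P(b_i | z_i) are the LRR and BAF emission probabilities described in the bullets above, and P(z_i | z_{i-1}) is the transition probability between adjacent SNPs.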
PennCNV
• r_i: LRR at SNP i (1 ≤ i ≤ M)
• b_i: BAF at SNP i
• z_i: copy number state
Transition probabilities between two adjacent SNPs i-1 and i, with copy number states z_{i-1} and z_i, at distance d_i: [equation not shown]
• D = 100 Mb for state 4, 100 kb for other states
• The p are unknowns, estimated by the Baum-Welch algorithm
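The transition equation is not reproduced in this text. A minimal sketch, assuming the distance-dependent form described in Wang et al. (2007), is P(z_i = l | z_{i-1} = j) = p_jl (1 - exp(-d_i / D)) for l ≠ j, with the diagonal entry taking the remaining probability. Two further assumptions are made here: D is chosen by the originating state j, and the toy `p` values are hypothetical rather than trained parameters.

```python
import math


def transition_matrix(p, d, D):
    """Distance-dependent transition probabilities between adjacent SNPs.

    p : square list of lists; p[j][l] is the base probability of moving
        from state j to state l (diagonal entries are ignored).
    d : distance in base pairs between SNP i-1 and SNP i.
    D : list of per-state distance constants (assumed to be indexed by
        the originating state j).
    """
    n = len(p)
    matrix = []
    for j in range(n):
        # Close SNPs (small d) rarely change state; distant SNPs approach p.
        scale = 1.0 - math.exp(-d / D[j])
        row = [p[j][l] * scale if l != j else 0.0 for l in range(n)]
        row[j] = 1.0 - sum(row)  # stay in the same state otherwise
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # Hypothetical three-state toy model: loss, normal, gain.
    p = [[0.0, 0.2, 0.1],
         [0.05, 0.0, 0.05],
         [0.1, 0.2, 0.0]]
    D = [100_000.0, 100_000.0, 100_000.0]
    for distance in (0, 10_000, 1_000_000):
        print(distance, transition_matrix(p, distance, D))
```

With d = 0 the matrix is the identity, so two probes at the same position cannot differ in state; as d grows the off-diagonal entries approach the base values in `p`.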
PennCNV
• r_i: LRR at SNP i (1 ≤ i ≤ M)
• b_i: BAF at SNP i
• z_i: copy number state
• Baum-Welch algorithm used to train the model
• Viterbi algorithm used to infer the most likely path of states
• A CNV is called whenever a stretch of states differs from normal (usually state 3 or 4)
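The decoding and calling steps can be illustrated with a toy example. This is a sketch under strong simplifying assumptions, not PennCNV itself: it uses three hypothetical states (loss, normal, gain), Gaussian emissions on LRR only (no BAF term), fixed transition probabilities, and hand-picked parameters in place of Baum-Welch training.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(obs, log_start, log_trans, log_emit):
    """Most likely state path for obs; log_emit(state, x) gives log P(x | state)."""
    n_states = len(log_start)
    score = [log_start[s] + log_emit(s, obs[0]) for s in range(n_states)]
    back = []
    for x in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda k: prev[k] + log_trans[k][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit(s, x))
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # trace back from the last SNP
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def call_segments(path, normal_state):
    """Return (first_index, last_index, state) for each non-normal stretch."""
    segments, start = [], None
    for i, s in enumerate(path):
        if start is not None and s != path[start]:
            segments.append((start, i - 1, path[start]))
            start = None
        if start is None and s != normal_state:
            start = i
    if start is not None:
        segments.append((start, len(path) - 1, path[start]))
    return segments


# Hypothetical parameters: states 0 = loss, 1 = normal, 2 = gain.
MEANS, SD = [-0.6, 0.0, 0.4], 0.15
log_start = [math.log(v) for v in (0.01, 0.98, 0.01)]
log_trans = [[math.log(0.98 if a == b else 0.01) for b in range(3)] for a in range(3)]

lrr = [0.02, -0.05, 0.01, -0.62, -0.58, -0.65, 0.03, 0.0]  # made-up LRR values
path = viterbi(lrr, log_start, log_trans, lambda s, x: gaussian_logpdf(x, MEANS[s], SD))
segments = call_segments(path, normal_state=1)
print(path, segments)
```

Because leaving the normal state is penalised by the transition probabilities, a single outlying probe is not enough to trigger a call with these parameters; a run of consistent probes is.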
Breast cancer
A characteristic of breast tumour cells is genomic instability.
BRCA1, BRCA2
BRCA1: known large deletions
[Figure legend: Detected / Not detected]
CNV prediction summary:
• cnvPartition: 25% (4/16)
• GNOSIS: 19% (3/16)
• PennCNV: 88% (14/16)
• QuantiSNP: 81% (13/16)
Endometrial cancer
Samples:
• 1343 cases (ANECS, SEARCH)
• 655 female controls (Hunter Community Study)
Want to find:
• CNVs overlapping known susceptibility genes
• novel CNVs in the mismatch repair pathway
• common or rare CNV associations
Workflow:
• QC (1), GWAS criteria: 1279 cases, 619 controls
• CNV calling by 4 algorithms: 1210 cases, 612 controls
• Case vs. control analyses
Association study CNV Regions
Association study CNV overlapping genes
Acknowledgements
• University of Cambridge
  • Deborah Thompson
  • Paul Pharoah
  • Alison Dunning
  • Douglas Easton
  • Studies of Epidemiology and Risk Factors in Cancer Heredity (SEARCH)
• University of Newcastle
  • Rodney Scott
  • Mark McEvoy
  • John Attia
  • Elizabeth Holliday
  • The Hunter Community Study
• CIMBA consortium
• Mayo Clinic
  • Fergus Couch
• University of Otago
  • Gemma Moir-Meyer
  • Logan Walker
  • Mackenzie Cancer Research Group
• Queensland Institute of Medical Research
  • Mandy Spurdle
  • Felicity Lose
  • Yen Tan
  • Alex Metcalf
  • Australian National Endometrial Cancer Study
  • Bryony Thompson