Cambridge, July 16, 2010. Cis-regulatory SNPs altering transcription detected by allelic expression mapping. Tomi Pastinen, MD, PhD, Assistant Professor, Departments of Human and Medical Genetics, McGill University; McGill University and Genome Quebec Innovation Centre.
Outline • Allelic expression: principle and methodology • Catalogs of cis-regulatory SNPs (cis-rSNPs) • Applications
Regulatory SNPs & Phenotypes [Figure: coding and non-coding variants shown for Type 1 Diabetes and Asthma]
Allelic Expression (AE) Relative allele ratios can be used as a quantitative trait for mapping local cis-regulatory variation in phased samples. [Figure: mRNA (or pre-mRNA) transcribed from the C and T alleles of a heterozygote; expected equal expression of allelic transcripts (1:1) versus observed biased allelic expression (2:1)]
Mapping cis-variation by AE test • Population panel of cells (CEU & YRI LCLs from HapMap) • Allelic expression (AE) measurement on Illumina Human 1M Duo (currently 2.5M quad) • AE mapping in phased chromosomes [Figure: phased haplotypes A and B; a positive (+ve) AE association marks a cis-regulatory SNP (cis-rSNP), contrasted with a negative (-ve) AE association]
Normalization of allele ratios Variability of the allele ratio [b = Y/(Y+X)] needs to be accounted for when comparing cDNA to gDNA. [Figure: allele-ratio plots for cDNA and gDNA]
Reproducibility of quantitative AE We use the heterozygote ratio difference in phased gDNA and cDNA genotype data: Dhet ratio = RNA c1/(c1+c2) - DNA c1/(c1+c2) • R² = 0.72 in biological replicates (CEU LCL) • R² = 0.83 in technical replicates (CEU LCL) • R² = 0.76 in biological replicates in the same environment (osteoblast) • R² = 0.61 in the same individual under different cell culture conditions (osteoblast)
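The Dhet ratio and the replicate R² values on this slide can be illustrated with a minimal Python sketch. It assumes c1 and c2 are non-negative allele signals (intensities or counts) for the two phased haplotypes, and that R² is the squared Pearson correlation between replicate series. The function names are hypothetical and are not taken from the talk.

```python
def allele_ratio(c1, c2):
    """Fraction of the total signal coming from haplotype c1: c1 / (c1 + c2)."""
    total = c1 + c2
    if total <= 0:
        raise ValueError("total allele signal must be positive")
    return c1 / total


def dhet_ratio(rna_c1, rna_c2, dna_c1, dna_c2):
    """Heterozygote ratio difference: cDNA allele ratio minus gDNA allele ratio."""
    return allele_ratio(rna_c1, rna_c2) - allele_ratio(dna_c1, dna_c2)


def r_squared(xs, ys):
    """Squared Pearson correlation between two replicate series (assumed definition of R²)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a series with no variation has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Illustrative numbers only: cDNA ratio 0.60, gDNA ratio 0.51, difference about 0.09.
example = dhet_ratio(600, 400, 510, 490)
```

Subtracting the gDNA ratio removes assay bias that affects both templates equally, so a sample with balanced expression should give a Dhet ratio near zero.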
Use of AE phenotypes in mapping rSNPs • Single-point allelic expression is noisy • Heterozygosity is low when using coding SNPs only • Phased RNA (cDNA) and genomic DNA (gDNA) genotyping data from the same individual are averaged across multiple sites in primary transcripts [Figure: Dhet ratio plotted against phased C1C2 genotype (BA, BB/AA, AB); P = 2 x 10^-9]
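As a rough illustration of this step, the sketch below averages per-site Dhet ratios within one transcript and then regresses the averaged values on a phased genotype code. The coding (AB = +1, BA = -1, AA/BB = 0) and all names are assumptions made for the example. The slide does not say which test produced the reported P value, so no P value is computed here.

```python
from statistics import mean

# Hypothetical coding of the phased candidate-SNP genotype (C1C2):
# opposite phases should shift allelic expression in opposite directions,
# and homozygotes should show no allelic bias from this SNP.
PHASE_CODE = {"AB": 1, "BA": -1, "AA": 0, "BB": 0}


def transcript_dhet(site_dhets):
    """Average per-site Dhet ratios across the sites of one primary transcript."""
    if not site_dhets:
        raise ValueError("need at least one heterozygous site")
    return mean(site_dhets)


def ae_association(phased_genotypes, dhets):
    """Least-squares slope and R² of averaged Dhet on the phased genotype code."""
    if len(phased_genotypes) != len(dhets):
        raise ValueError("one Dhet value is needed per individual")
    xs = [PHASE_CODE[g] for g in phased_genotypes]
    mx, my = mean(xs), mean(dhets)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in dhets)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, dhets))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or in Dhet")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Illustrative values only, not data from the talk.
genotypes = ["AB", "AB", "BA", "BA", "AA", "BB"]
dhets = [0.10, 0.12, -0.11, -0.09, 0.00, 0.01]
slope, r2 = ae_association(genotypes, dhets)
```

Averaging across many intronic and exonic sites is what makes the phenotype usable: it reduces single-point noise and means an individual only needs to be heterozygous somewhere in the primary transcript.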
Heritable allelic expression (1) Full transcript (AFF3, ~600 kb), 5’ association. AE measurements across large genes. [Figure: phased haplotypes A and B]
Heritable allelic expression (2) Differential 5’ exon usage (CUGBP2), 5’ association. Allele-specific expression of the long isoform.
Common SNPs govern cis-regulation • On average, >50% of population variance in cis-regulation can be explained by common SNPs in associated loci • 5-10x more functional variation is revealed compared with cis-eQTL mapping • >90% of mapped cis-rSNPs behave as expected in the offspring (Mendelian inheritance)
Summary of AE Associations (CEU) • Large effect sizes are observed for associated variants • Common cis-variants affect >30% of measured RefSeq transcripts • Low-throughput methods show converging data for 75% of genome-wide significant AE mapping results, but a diversity of mechanisms is suggested (Ge et al., Nature Genet. 2009)
Transcription-altering cis-rSNPs 1) Large effect size (>1.2-fold difference between cis-rSNP heterozygotes) across full-length transcripts 2) Most (>75%) of the available SNPs in the primary transcript are above the signal cut-off 3) Consistent allelic effects across introns and exons of the primary transcript (for transcripts fulfilling criteria 1+2, the proportion with exon-intron r² > 0.3 is >90%)
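The three criteria above can be read as a simple filter, sketched below in Python. The record layout and field names are hypothetical, and using exon-intron r² > 0.3 as the pass threshold for criterion 3 is an assumption drawn from the parenthetical on the slide rather than a stated rule.

```python
from dataclasses import dataclass


@dataclass
class TranscriptAE:
    """Hypothetical per-transcript summary of AE mapping results."""
    name: str
    fold_difference: float   # allelic fold difference between cis-rSNP heterozygotes
    snps_total: int          # available SNPs in the primary transcript
    snps_above_cutoff: int   # SNPs whose signal exceeds the cut-off
    exon_intron_r2: float    # agreement of allelic effects between exons and introns


MIN_FOLD = 1.2           # criterion 1: > 1.2-fold
MIN_SNP_FRACTION = 0.75  # criterion 2: > 75% of SNPs above signal cut-off
MIN_EXON_INTRON_R2 = 0.3 # criterion 3 (assumed threshold): r² > 0.3


def passes_effect_and_coverage(t):
    """Criteria 1 and 2: large effect size and good SNP coverage."""
    if t.snps_total <= 0:
        return False
    fraction = t.snps_above_cutoff / t.snps_total
    return t.fold_difference > MIN_FOLD and fraction > MIN_SNP_FRACTION


def is_transcription_altering(t):
    """All three criteria, with criterion 3 applied as an r² threshold."""
    return passes_effect_and_coverage(t) and t.exon_intron_r2 > MIN_EXON_INTRON_R2


# Illustrative records only.
candidates = [
    TranscriptAE("T1", 1.5, 20, 18, 0.6),
    TranscriptAE("T2", 1.1, 20, 18, 0.6),
    TranscriptAE("T3", 1.5, 20, 10, 0.6),
    TranscriptAE("T4", 1.5, 20, 18, 0.1),
]
selected = [t.name for t in candidates if is_transcription_altering(t)]
```

Keeping criteria 1 and 2 in their own function mirrors the slide, which reports the exon-intron agreement only for transcripts that already fulfil the first two criteria.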
Transcriptional Cis-rSNPs in 3 Panels • ~17% of genes
Comparison to cis-eQTLs • Up to 50% of top cis-eQTLs show a converging cis-rSNP in AE-mapping data, but given the high discovery rate of AE mapping, only ~10% of cis-rSNPs yield a significant cis-eQTL (Ge et al. 2009) • However, comparison of AE mapping data in YRI LCLs with YRI RNA-seq data shows converging effects for the vast majority of transcriptional cis-rSNPs
1000 Genomes & Catalogs of Cis-rSNPs • Fine-map the region of shared association to look for causal cis-rSNPs • Instead of resequencing this region, use 1000 Genomes data • Simple scoring based on deviation from expected heterozygosity among samples showing unequal/equal AE [Figure: -log10(P-value) tracks for CEU SNPs (axis label 6.8), YRI SNPs (axis label 6.3) and CEU+YRI combined] [Figure: % 1000G sites scored and % HapMap 0-score sites captured, plotted against the 1000G score cutoff (-5 to 0)]
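The slide does not define the score itself, so the following Python sketch shows only one plausible reading. It assumes a causal cis-rSNP should be heterozygous in samples with unequal AE and homozygous in samples with equal AE, and it scores a site as minus the number of samples that break this expectation. Under that assumption a score of 0 is a perfect match, which would be consistent with the "0-score sites" and the negative cutoffs on the plot. All names are hypothetical.

```python
def heterozygosity_score(is_het, has_unequal_ae):
    """Assumed score: minus the count of samples whose heterozygosity at the
    candidate site disagrees with their AE status (0 means a perfect match)."""
    if len(is_het) != len(has_unequal_ae):
        raise ValueError("one genotype call is needed per AE sample")
    mismatches = sum(1 for het, unequal in zip(is_het, has_unequal_ae) if het != unequal)
    return -mismatches


def sites_passing(scores_by_site, cutoff):
    """Keep candidate sites whose score is at or above the chosen cutoff."""
    return [site for site, score in scores_by_site.items() if score >= cutoff]


# Illustrative values only: four samples, the first two show unequal AE.
unequal_ae = [True, True, False, False]
scores = {
    "site1": heterozygosity_score([True, True, False, False], unequal_ae),
    "site2": heterozygosity_score([True, False, True, False], unequal_ae),
}
kept = sites_passing(scores, cutoff=-1)
```

Relaxing the cutoff below 0 tolerates a few genotyping or AE-calling errors, at the cost of keeping more candidate sites per region.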
Cis-rSNP localization (2) Overlapping transcription-altering cis-rSNPs [Figure legend: Pperm < 0.001 in FB; Pperm > 0.001 in FB]
Cis-variation subtypes (1) 5’ proximal cis-rSNPs altering regulation of DISC1 in a cell-type-independent manner. Most common type of cis-rSNPs.
Cis-variation subtypes (2) 5’ distal cis-rSNPs altering regulation of PTGER4 in a cell-type-dependent manner. Third most common type of cis-rSNPs.
Cis-variation subtypes (3) 3’ distal cis-rSNPs altering regulation of EFNA5 in a cell-type-dependent manner. Least common type of cis-rSNPs. 5’ distal cis-rSNPs altering regulation of EFNA5 in a cell-type-dependent manner.
Cis-rSNP validation: allele-specific DNA-protein interactions • In vitro validation of intronic enhancer rSNP (rs909685) • In vitro validation of promoter rSNP (rs344071) [Figure labels: rs17658686 C/G; Input, FAIRE, MNase; lanes: C* without nuclear extract; C* and G* without competitor; C* and G* each with C competitor, G competitor, or non-specific competitor]
Function of Disease Variants [Figure: genetic association compared with functional association] (Ge et al., Nature Genetics 2009)
Mechanisms of rSNP Action (1) [Figure: functional association and potential mechanism] (Verlaan et al., AJHG 2009)
Dissection of GWAS associations Examples: • Creutzfeldt-Jakob disease: PRNP • LDL cholesterol: HMGCR • CRP levels: IL6R • Crohn’s disease: IL23R • Plasma homocysteine: CBS • Tooth development: HOXB2 Preliminary observations: cis-rSNPs potentially explaining disease associations are enriched for tissue-specific variants
Conclusions • Common haplotypes harbor functional alleles altering cis-regulation in most human genes • Cis-regulatory SNPs altering transcription can be characterized by: • specific assessment of population variation in cis-regulation (AE-mapping) • fine-mapping using sequenced genomes (1000G/imputation for common variants) • intersection with functional genomic data (ENCODE) • Regulatory variation in complex genomic regions (overlapping transcripts), or variation causing post-transcriptional effects, requires other tools (strand-specific assays/RNA-seq) • Large-scale, orthogonal validation tools need to catch up with mapping
Acknowledgements McGill University and Génome Québec Innovation Centre • McGill University • Mathieu Blanchette • Brent Richards • Anna Naumova • Soizik Berlivet • Sanny Moussette • Université de Montréal • Daniel Sinnett • Nina N’Diaye • Manon Ouimet • Vincent Gagné • Patrick Beaulieu • Robert Hamon • Illumina Inc. • Dmitry Pokholok • Jennie Le • Kevin Gunderson • GEFOS Consortium • 1000 Genomes Project • ENCODE Project • Southwest Foundation for Biomedical Research • Harald H.H. Göring • Harvard Medical School • Benjamin Raby • Scott Weiss • Jehyuk Lee • George M. Church Pastinen Lab: Tony Kwan, Véronique Adoue, Lisanne Morcos, Dominique J Verlaan, Tomi Pastinen, Elin Grundberg, Vonda Koka, Kevin Lam, Bing Ge, Alexandre Montpetit, Eef Harmsen, Joana Dias, Rose Hoberman, Ken Dewar
Systematic annotation of cis-rSNPs Potential new transcript RLBP1L1 • top associated SNP from AE-mapping • highest scoring 1000 Genomes site
Systematic annotation of cis-rSNPs Region of active chromatin Histone marks are tissue-specific Region for RNA polymerase 2 binding Highly conserved regions of regulatory potential
Reminder: not all AE is Mendelian. However, the most comprehensive survey of imprinting to date suggests that <100 imprinted loci exist, compared with thousands of loci modulated by cis-rSNPs (Morcos et al., manuscript in prep.)