
Regulatory variation and its functional consequences



Presentation Transcript


  1. Regulatory variation and its functional consequences. Chris Cotsapas, cotsapas@broadinstitute.org

  2. Motivating questions • How do phenotypes vary across individuals? • Regulatory changes drive cellular and organismal traits • Likely also drive evolutionary differences • How are genes (co)regulated? • Pathways, processes, contexts

  3. Regulatory variation • What do “interesting” variants do? • Genetic changes to: • Coding sequence ** • Gene expression levels • Splice isoform levels • Methylation patterns • Chromatin accessibility • Transcription factor binding kinetics • Cell signaling • Protein-protein interactions • Note: ~88% of GWAS hits are regulatory

  4. Genetic variation alters regulation • Protein levels • Maize (Damerval 94) • Expression levels • Yeast, maize, mouse, humans (Brem 02, Schadt 03, Stranger 05, Stranger 07) • RNA splicing • Humans (Pickrell 12, Lappalainen 13) • Methylation and DNase I peak strength • Humans (Degner 12; Gibbs 12)

  5. Genetics of gene expression (eQTL) • cis-eQTL • The position of the eQTL maps near the physical position of the gene. • Promoter polymorphism? • Insertion/Deletion? • Methylation, chromatin conformation? • trans-eQTL • The position of the eQTL does not map near the physical position of the gene. • Regulator? • Direct or indirect? (Modified from Cheung and Spielman 2009 Nat Genet)

  6. Cis-eQTL analysis: test SNPs within a pre-defined distance of the gene [Figure: a 1 Mb window on each side of the gene; labels: probe, gene, SNPs]
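The window rule on this slide can be sketched in a few lines. This is a minimal illustration, not the pipeline behind the slides: the `Snp` record, the `cis_snps` helper, and the choice to measure the 1 Mb window from the gene body (rather than from the TSS or the probe) are all assumptions made for the example.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    """Hypothetical SNP record used only for this illustration."""
    snp_id: str
    chrom: str
    pos: int


def cis_snps(snps, gene_chrom, gene_start, gene_end, window=1_000_000):
    """Return SNPs on the gene's chromosome within `window` bp of the gene body."""
    lo = max(0, gene_start - window)
    hi = gene_end + window
    return [s for s in snps if s.chrom == gene_chrom and lo <= s.pos <= hi]


# Toy data: two SNPs fall inside the window, one is too far, one is on another chromosome.
snps = [
    Snp("rsA", "1", 500_000),
    Snp("rsB", "1", 2_600_000),
    Snp("rsC", "2", 1_500_000),
    Snp("rsD", "1", 1_550_000),
]
selected = cis_snps(snps, "1", 1_500_000, 1_520_000)
print([s.snp_id for s in selected])
```

Restricting each gene to its cis window keeps the number of tests per gene small, which is why cis mapping is far better powered than testing every SNP against every gene.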

  7. QT association • Analysis of the relationship between a dependent or outcome variable (phenotype) and one or more independent or predictor variables (SNP genotype) • Linear regression equation: $Y_i = b_0 + b_1 X_i + e_i$ • Logistic regression equation: $\ln\left(\frac{p_i}{1 - p_i}\right) = b_0 + b_1 X_i + e_i$ [Figure: continuous trait value plotted against number of A1 alleles (0, 1, 2); fitted line with intercept $b_0$ and slope $b_1$]
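As a worked instance of the linear model on this slide, the sketch below fits the intercept b0 and slope b1 by ordinary least squares, with genotype coded as the number of A1 alleles. The function name `linear_fit` and the toy expression values are illustrative assumptions; in practice a tool such as plink reports these estimates together with a p-value.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # change in trait per extra A1 allele
    b0 = mean_y - b1 * mean_x    # expected trait value for 0 copies of A1
    return b0, b1


# Toy data: genotype dosage (0, 1, 2 copies of A1) and a made-up expression level.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
b0, b1 = linear_fit(genotypes, expression)
print(f"b0={b0:.3f}, b1={b1:.3f}")
```

The slope b1 is the quantity of interest in an eQTL test: a slope that differs from zero means expression changes with allele count.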

  8. eQTL analysis: a GWAS for every gene [Figure: one association scan per gene; panels labelled gene 1, gene 2, gene 3, gene 4, gene 5, ..., gene N]

  9. cis-eQTLs are rather common (Nica et al, PLoS Genet 2011)

  10. Cis-eQTLs cluster around the TSS (Stranger et al, PLoS Genet 2012)

  11. trans hotspots (yeast) (Brem et al, Science 2002)

  12. Yvert et al Nat Genet 2003

  13. Application to GWAS: does regulatory variation alter phenotype? • Candidate genes, perturbations underlying organismal phenotypes

  14. Rationale • How do disease/trait variants actually alter biology? • If they change regulation, then: • Change in gene expression/isoform use • Phenotypic consequence*

  15. Compare patterns of association [Figure: association tracks for the GWAS peak, the eQTL for gene 1, and the eQTL for gene 2]

  16. Pearson’s covariance for windows of 51 SNPs between -log(p) in 2 traits (CD GWAS p, eQTL p) • No peak when there are independent hits near each other • Detect a peak when the effect is the same
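The windowed comparison on this slide can be sketched as follows. This is an illustration under stated assumptions: it uses the Pearson correlation coefficient (the slide says "covariance"), -log10 of the p-values, and a window that slides one SNP at a time. The helper names `pearson` and `windowed_correlation` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def windowed_correlation(p_trait1, p_trait2, window=51):
    """Correlate -log10(p) of two traits in each run of `window` consecutive SNPs."""
    if len(p_trait1) != len(p_trait2):
        raise ValueError("both traits must be tested at the same SNPs")
    s1 = [-math.log10(p) for p in p_trait1]
    s2 = [-math.log10(p) for p in p_trait2]
    return [
        pearson(s1[i:i + window], s2[i:i + window])
        for i in range(len(s1) - window + 1)
    ]


# Toy example with a short window: identical association patterns correlate perfectly.
p_gwas = [0.5, 0.1, 0.01, 0.001, 0.2, 0.3, 0.05]
same_effect = windowed_correlation(p_gwas, p_gwas, window=5)
print(same_effect)
```

Two independent signals that happen to sit near each other have differently shaped association patterns, so the windowed score stays low; a shared causal effect produces matching patterns and a peak.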

  17. Crohn’s/eQTL analysis • CD meta-analysis (GWAS only) • CEU HapMap LCL eQTL data • Overlapping SNPs only (eQTL data has 610K SNPs, most in CD meta-analysis) • Test 133 associations (total 1054 tests) [Figure: association tracks for the GWAS peak, the eQTL for gene 1, and the eQTL for gene 2]

  18. Crohn’s/eQTL analysis A peak implies that the same effect drives GWAS and eQTL

  19. MS/eQTL analysis A peak implies that the same effect drives GWAS and eQTL

  20. Open question: does regulatory variation reveal co-regulation? A.K.A. where are the trans-eQTLs?

  21. Whole-genome eQTL analysis is an independent GWAS for expression of each gene [Figure: one association scan per gene; panels labelled gene 1, gene 2, gene 3, gene 4, gene 5, ..., gene N]

  22. Issues with trans mapping • Power • Genome-wide significance is 5e-8 • Multiple testing on ~20K genes • Sample sizes clearly inadequate • Data structure • Bias corrections deflate variance • Non-normal distributions • Sample sizes • Far too small
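The power problem can be made concrete with a short calculation. The sketch assumes a Bonferroni correction across genes on top of the genome-wide threshold; the slide does not say which correction is intended, so treat the combined threshold as an illustration only.

```python
# Numbers taken from the slide.
genome_wide_alpha = 5e-8   # significance threshold for one genome-wide scan
n_genes = 20_000           # approximate number of expression traits

# Assumption: Bonferroni correction across genes, i.e. one GWAS per gene.
per_test_alpha = genome_wide_alpha / n_genes
print(f"per-test threshold: {per_test_alpha:.1e}")
```

With cohorts of roughly a hundred individuals, a threshold of this order is out of reach for all but the very largest effects, which is the motivation for pooling evidence across genes.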

  23. But… • Assume that trans-eQTLs affect many genes… • …and you can use cross-trait methods!

  24. Association data

  25. Cross-phenotype meta-analysis (CPMA) • $S_{CPMA} \sim \frac{L(\text{data} \mid \lambda \neq 1)}{L(\text{data} \mid \lambda = 1)}$ • Cotsapas et al, PLoS Genetics
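One way to read the likelihood ratio on this slide is sketched below. It assumes that, for a single SNP tested against many traits, -ln(p) follows an exponential distribution with rate λ, and that λ = 1 under the null of no association. The maximum-likelihood estimate of λ and the conventional "2 × log likelihood ratio" scaling are assumptions of this sketch; the slide itself only gives the ratio, and the function name `cpma_statistic` is hypothetical.

```python
import math


def cpma_statistic(p_values):
    """Likelihood-ratio statistic comparing a fitted decay rate against rate 1.

    Assumes x = -ln(p) is exponentially distributed with rate lambda.
    """
    x = [-math.log(p) for p in p_values]
    n = len(x)
    total = sum(x)
    if total == 0:
        raise ValueError("all p-values are 1; the rate cannot be estimated")
    lam_hat = n / total  # maximum-likelihood estimate of the rate

    def loglik(lam):
        # Exponential log-likelihood: sum over x of ln(lam) - lam * x.
        return n * math.log(lam) - lam * total

    return 2.0 * (loglik(lam_hat) - loglik(1.0))


# A SNP associated with many traits has an excess of small p-values,
# so its fitted rate moves away from 1 and the statistic grows.
null_like = cpma_statistic([math.exp(-1)] * 10)
pleiotropic = cpma_statistic([1e-6] * 5 + [0.5] * 5)
print(null_like, pleiotropic)
```

The appeal of this kind of statistic is that no single SNP-gene association has to reach genome-wide significance; the evidence comes from the shape of the whole p-value distribution.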

  26. CPMA for correlated traits • Empirical assessment to account for correlation • Simulate Z scores under covariance, recalculate CPMA • Construct distribution of CPMA for dataset, call significance • With Ben Voight, U Penn
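The empirical procedure above can be sketched as follows, under several assumptions that are not in the slides: Z scores are drawn from a multivariate normal with the given trait covariance, converted to two-sided p-values, and scored with an exponential-model likelihood ratio. All function names here are hypothetical, and the 2 × 2 covariance matrix is toy data.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L with L @ L.T == cov (cov must be positive definite)."""
    k = len(cov)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def cpma(p_values):
    """Assumed CPMA-style statistic: 2 * log likelihood ratio, exponential model."""
    x = [-math.log(p) for p in p_values]
    n, total = len(x), sum(x)
    return 2.0 * (n * math.log(n / total) - n + total)


def simulate_null_cpma(cov, n_sims, seed=0):
    """Draw correlated null Z scores and recompute the statistic for each draw."""
    rng = random.Random(seed)
    low = cholesky(cov)
    k = len(cov)
    stats = []
    for _ in range(n_sims):
        e = [rng.gauss(0.0, 1.0) for _ in range(k)]
        z = [sum(low[i][j] * e[j] for j in range(i + 1)) for i in range(k)]
        p = [math.erfc(abs(v) / math.sqrt(2.0)) for v in z]  # two-sided p-value
        stats.append(cpma(p))
    return stats


def empirical_p(observed, null_stats):
    """Share of null statistics at least as large as the observed one (add-one corrected)."""
    exceed = sum(s >= observed for s in null_stats)
    return (exceed + 1) / (len(null_stats) + 1)


cov = [[1.0, 0.5], [0.5, 1.0]]  # toy covariance between two expression traits
null_stats = simulate_null_cpma(cov, n_sims=200, seed=1)
print(empirical_p(5.0, null_stats))
```

Because correlated traits produce correlated p-values, the theoretical null distribution is too optimistic; comparing against simulated statistics that share the same covariance is what keeps the significance call honest.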

  27. Experimental design • Samples: 108 CEU individuals*, 109 YRI individuals* • 8368 transcripts detectable on Illumina arrays • 610,180 SNPs; MAF > 0.15 in CEU and YRI; LD pruned (r2 < 0.2) • CEU and YRI p-values: transcript ~ SNP, sex (plink) • CPMA: CEU and YRI CPMA scores; threshold > 95th percentile of simulated CPMA • * Stranger et al Nat Genet 2007 (LCL data; publicly available)

  28. Target sets of genes • trans-acting variant: SNP with CPMA evidence • Target genes: genes affected by trans-acting variant (i.e. regulon)

  29. Prediction 1 • Allelic effects should be conserved between two populations • Binomial test on paired observations for all genes with P < 0.05 in at least one population [Figure: overlapping gene sets, genes with pCEU < 0.05 and genes with pYRI < 0.05] • True for 1124/1311 SNPs (binomial p < 0.05)
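A minimal sketch of the paired binomial test in Prediction 1, assuming it is a sign test: for one SNP, count the genes whose effect has the same direction in both populations and ask how surprising that count is if agreement were a coin flip. The function names and the effect sizes are made up for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def direction_concordance(beta_ceu, beta_yri):
    """Count genes with same-sign effects in both populations; return (agree, n, p)."""
    pairs = [(a, b) for a, b in zip(beta_ceu, beta_yri) if a != 0 and b != 0]
    agree = sum((a > 0) == (b > 0) for a, b in pairs)
    return agree, len(pairs), binomial_upper_tail(agree, len(pairs))


# Toy effect sizes for one SNP across eight genes (hypothetical values).
beta_ceu = [0.3, -0.2, 0.5, 0.1, -0.4, 0.2, 0.6, -0.1]
beta_yri = [0.2, -0.1, 0.4, 0.3, -0.2, 0.1, 0.5, 0.2]
agree, n_genes, p_value = direction_concordance(beta_ceu, beta_yri)
print(agree, n_genes, p_value)
```

Only the direction of each effect is used, so the test does not depend on effect sizes being comparable between the two populations.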

  30. Prediction 2 • Target genes should overlap • Identify by mixture of Gaussians classification • Empirical p from distribution of overlaps between N_CEU and N_YRI genes across SNPs [Figure: overlapping gene sets, genes with pCEU < 0.05 and genes with pYRI < 0.05] • True for 600/1311 SNPs (empirical p < 0.05)
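One possible reading of the empirical overlap test is sketched below. The null distribution here pairs a SNP's CEU target set with the YRI target sets of other SNPs; that construction is an assumption about what "across SNPs" means, and the mixture-of-Gaussians step that defines the target sets is not shown. Function names and gene labels are hypothetical.

```python
def overlap(a, b):
    """Number of genes shared by two target sets."""
    return len(set(a) & set(b))


def empirical_overlap_p(snp, ceu_targets, yri_targets):
    """Compare a SNP's CEU/YRI overlap with overlaps against other SNPs' YRI targets."""
    observed = overlap(ceu_targets[snp], yri_targets[snp])
    null = [
        overlap(ceu_targets[snp], yri_targets[other])
        for other in yri_targets
        if other != snp
    ]
    exceed = sum(v >= observed for v in null)
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (exceed + 1) / (len(null) + 1)


# Toy target sets keyed by SNP.
ceu_targets = {"snp1": {"A", "B", "C"}, "snp2": {"D", "E"}, "snp3": {"F"}, "snp4": {"G", "H"}}
yri_targets = {"snp1": {"A", "B", "X"}, "snp2": {"D", "Y"}, "snp3": {"C", "Z"}, "snp4": {"H"}}
observed, p_value = empirical_overlap_p("snp1", ceu_targets, yri_targets)
print(observed, p_value)
```

With only a handful of SNPs the smallest attainable p-value is coarse; with the 1311 SNPs on the slide the resolution would be far finer.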

  31. What about the target genes? • Regulons: • Encode proteins more connected than expected by chance (www.broadinstitute.org/mpg/dapple.php; Rossin et al 2011 PLoS Genetics)

  32. What about the target genes? • Regulons: • Encode proteins enriched for TF targets (ENCODE LCL data) • 24/67 filtered TFs significant • Binomial overlap test [Figure: overlap between trans target genes and ChIP-seq LCL target genes]

  33. Summary • Regulatory variation is common • It affects gene expression levels • Likely many other types: • DNA accessibility, chromatin states • Transcript splicing, processing, turnover • Has phenotypic consequences • GWAS • Some cellular assays (not discussed here)

  34. Open questions • Discover regulatory elements (cis) • Promoters, enhancers etc • Gene regulatory circuits (trans) • Dynamics of regulation • Splicing variation, processing, degradation • Phenotypic consequences • Cellular assays required • Tie in to organismal phenotype

  35. Next-gen sequencing data: RNAseq, GTEx

  36. GTEx: Genotype-Tissue EXpression • An NIH Common Fund project • Current: 35 tissues from 50 donors • Scale up: 20K tissues from 900 donors • Novel methods groups: 5 current + RFA

  37. How can we make RNAseq useful? • Standard eQTLs • Montgomery et al, Pickrell et al Nature 2010 • Isoform eQTLs • Depth of sequence! • Long genes are preferentially sequenced • Abundant genes/isoforms ditto • Power!? • Mapping biases due to SNPs

  38. RNAseq combined with other techs • Regulons: TF gene sets via ChIP-seq • Look for trans effects • Open chromatin states (DNase I; methylation) • Find active genes • Changes in epigenetic marks correlated to RNA • Genetic effects • RNA/DNA comparisons • Simultaneous SNP detection/genotyping • RNA editing ???
