Geuvadis RNAseq analysis at UNIGE
Analysis plans
Tuuli Lappalainen
University of Geneva
Geuvadis Analysis Group Meeting, April 16 2012, Geneva
What we will do: Overview
• Coordinate everything
• Get the data together: QC, normalization, data sharing
• Regulation quantitative trait loci (rQTL): Common and rare cis-regulatory variants
• Participate in Loss-of-Function analyses
• Functional annotation of both common and rare regulatory variants
• Population and evolutionary genetic analyses
Genetic effects on regulatory variation
[Diagram labels: Fine-mapping the causal regulatory variants; trans-eQTLs; common/rare cis-variants; independent effects; splicing QTLs; miRNA/mRNA interactions; eQTL analysis; ASE analysis; splicing QTL analyses]
Finding many needles and little hay
• Technical variation reduces our power in eQTL analysis: correction of covariates such as library size, sequencing batches, GC content, % mapping reads…
  • Linear regression of covariates
  • Linear regression of ~10 PCs that are expected to be some sort of summaries of technical covariates
• Population stratification may lead to false genetic associations
  • analyze EUR & YRI separately and correct for population structure within EUR with Eigenstrat
• Reference allele mapping bias
  • ALT reads map worse or not at all
  • simulation results of biased reads & sites
  • remove from ASE test: filter biased reads from sams, redo quantifications & eQTL analysis
[Figure labels: SNP; INDEL; cSNP; reference genome]
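The covariate correction above ("linear regression of covariates") can be sketched as follows. This is a minimal illustration, not the pipeline used in the project: the function names `dot` and `residualize` and the toy data are assumptions made for the example. The same function could in principle be given principal components instead of measured covariates; computing the PCs is not shown.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def residualize(y, covariates):
    """Residuals of y after least-squares regression on covariates plus an intercept.

    y          : one expression value per sample
    covariates : list of covariate vectors, each with one value per sample
    Implemented with modified Gram-Schmidt so that only the standard library is needed.
    """
    n = len(y)
    basis = []  # orthogonal vectors spanning the intercept and the covariates
    for col in [[1.0] * n] + [[float(c) for c in cov] for cov in covariates]:
        v = col[:]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > 1e-12:  # skip covariates that are linearly dependent on earlier ones
            basis.append(v)
    resid = [float(value) for value in y]
    for b in basis:
        coef = dot(resid, b) / dot(b, b)
        resid = [ri - coef * bi for ri, bi in zip(resid, b)]
    return resid


# Hypothetical toy data: six samples, two technical covariates.
library_size = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
batch = [0, 1, 0, 1, 0, 1]
expression = [5.1, 6.2, 8.9, 10.1, 13.2, 13.9]

corrected = residualize(expression, [library_size, batch])
```

The corrected values are uncorrelated with every covariate that was regressed out, so downstream association tests see only the variation the covariates cannot explain.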
eQTLs: genotype association to regulatory phenotypes
• The classical cis-eQTL analysis:
  • all genetic variants >5% MAF
  • within 1 Mb of the transcription start site
  • Spearman rank correlation with (normalized) exon read counts
  • permutations to assess significance
• Expect a few thousand genes with an eQTL
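The test for a single variant and exon in the classical analysis above can be sketched as a Spearman rank correlation with a permutation p-value. This is an illustrative sketch only: the function names, the toy genotypes and expression values, and the choice of 1000 permutations are assumptions, and variant selection (MAF filter, cis window) is assumed to have been done beforehand.

```python
import random


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Two-sided empirical p-value from shuffling expression across samples."""
    rng = random.Random(seed)
    observed = abs(spearman(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # +1 avoids reporting p = 0


# Hypothetical toy data: alternative-allele dosage (0/1/2) and normalized exon counts.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2, 1, 0, 2]
expression = [1.0, 1.2, 0.9, 2.1, 2.3, 1.9, 3.2, 3.0, 3.4, 2.0, 1.1, 3.1]

rho = spearman(genotypes, expression)
p_value = permutation_pvalue(genotypes, expression)
```

Shuffling expression values across samples breaks the genotype-expression link while keeping both marginal distributions intact, which is why the permutation null needs no distributional assumptions about read counts.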
Taking the eQTL approach further
• Other phenotypes:
  • Gene expression levels: exon read counts or transcript quantifications?
  • splicing variation: links between exons (Halit Ongen @ UNIGE), Barcelona’s transcript ratios
  • miRNA quantifications
  • Variation QTLs: variation between independent measures of an individual’s gene expression levels = stochastic variation in gene expression
• Independent regulatory variants affecting the same gene
  • Regress out the first eQTL effect and redo the analysis
• How to integrate eQTLs – sQTLs – vQTLs – miQTLs?
[Diagram labels: trans-eQTLs; common/rare cis-variants; independent effects; splicing QTLs; miRNA/mRNA interactions]
[Plot labels: expr; variance; genotype]
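The step "regress out the first eQTL effect and redo the analysis" can be sketched as below. This is a simplified illustration under stated assumptions: the function names and toy data are hypothetical, the lead effect is removed with a simple additive linear model, and a plain Pearson correlation stands in for whichever association test is rerun on the residuals.

```python
def mean(values):
    return sum(values) / len(values)


def regress_out(expression, genotype):
    """Residuals of expression after simple linear regression on one genotype vector."""
    mg, me = mean(genotype), mean(expression)
    sxx = sum((g - mg) ** 2 for g in genotype)
    slope = sum((g - mg) * (e - me) for g, e in zip(genotype, expression)) / sxx
    return [e - me - slope * (g - mg) for g, e in zip(genotype, expression)]


def correlation(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: two variants near the same gene, each with its own effect.
lead_snp = [0, 0, 0, 1, 1, 1, 2, 2, 2]
second_snp = [0, 1, 2, 0, 1, 2, 0, 1, 2]
noise = [0.1, -0.2, 0.05, 0.0, 0.15, -0.1, -0.05, 0.2, -0.15]
expression = [2 * a + b + e for a, b, e in zip(lead_snp, second_snp, noise)]

# Step 1: remove the effect of the first (lead) eQTL.
residual_expr = regress_out(expression, lead_snp)

# Step 2: redo the association scan on the residuals (one variant shown here).
secondary_signal = correlation(residual_expr, second_snp)
```

A variant that only tags the lead eQTL through linkage disequilibrium loses its signal after step 1, whereas a variant with an independent effect on the same gene keeps it.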
ASE analysis
Statistical testing for ASE
Is the allelic ratio different from 0.5 / 0.5?
[Figure labels: cis eQTL*; coding SNP; mRNA-sequencing; read alleles T G C C T T T T T C A]
Thousands of data points per individual
Less noisy than expression levels
No direct information of the causal variant
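One common way to ask whether the allelic ratio differs from 0.5 / 0.5 is a two-sided binomial test on the reference and alternative read counts at a heterozygous site. The sketch below assumes that test; the slides do not name the test actually used. The function name and the `expected_ref_ratio` parameter are hypothetical, and the parameter is included only because a null ratio other than 0.5 might be wanted where reference mapping bias is a concern. The exact sum is suitable for modest read depths only.

```python
from math import comb


def ase_binomial_pvalue(ref_count, alt_count, expected_ref_ratio=0.5):
    """Two-sided exact binomial p-value for allelic imbalance at one heterozygous site.

    Sums the probabilities of all outcomes that are no more likely than the
    observed reference-read count under the null ratio.
    """
    n = ref_count + alt_count
    p = expected_ref_ratio
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Small relative tolerance so that ties are not lost to rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts at two heterozygous coding SNPs.
balanced = ase_binomial_pvalue(10, 10)   # no evidence of imbalance
imbalanced = ase_binomial_pvalue(18, 2)  # strong excess of reference reads
```

Because every heterozygous coding site with enough reads yields its own test, a single individual contributes many data points, and multiple-testing correction across sites is needed before calling ASE.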
ASE applications: population genetics of regulatory effects
Clustering of individuals (and populations)
[Figure labels: Expression distance; ASE distance; Genetic distance]
Epistasis between regulatory and coding variants
Deficiency of putatively deleterious coding variants with high expression of the derived allele (Lappalainen et al. 2011)
ASE applications: rare regulatory variants
Sharing of a rare ASE effect leads to excess sharing of the haplotype
We have developed a statistical method to look for ASE-genotype concordance to characterize rare regulatory variants (Montgomery et al. PLoS Genetics 2011)
[Figure labels: POOL OF INDIVIDUALS; NO ASE (×4); ASE (×2)]
Stephen Montgomery
Functional annotation of regulatory variants
• Functional annotation of the genome: 1000g annotations, ENCODE, conservation, etc. -> overlap with rQTLs
• Can we finally get the causal variants?