RNA-seq experiences and plans LUMC
Peter A.C. ’t Hoen
Human Genetics, LUMC
Pipelines
• PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data. Bioinformatics 28:479-86 (2012)
• eMiR: pipeline for mapping, 5p-3p resolution and annotation of miRNAs. BMC Genomics 11:716 (2010)
Other studies: Methods
• Tag-based: one read per transcript
  • DeepSAGE: most 3’ CATG
  • DeepCAGE: 5’-end
• RNA-Seq: multiple reads per transcript
  • Whole mRNA sequencing after fragmentation
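As an illustration of the "most 3’ CATG" idea, here is a minimal Python sketch that finds the last CATG site in a transcript sequence and returns the tag that starts there. The function name and the 17-nt tag length after CATG are assumptions made for this example and are not taken from the slides. A CATG too close to the 3’ end to yield a full-length tag is simply reported as missing.

```python
def most_3prime_catg_tag(transcript, tag_len=17):
    """Return CATG plus the next `tag_len` bases at the most 3' CATG site.

    `tag_len` is an assumed value for illustration only.
    Returns None when there is no CATG or the tag would be truncated.
    """
    seq = transcript.upper()
    pos = seq.rfind("CATG")  # last occurrence = most 3' site
    if pos == -1:
        return None
    tag = seq[pos:pos + 4 + tag_len]
    # Only report full-length tags
    return tag if len(tag) == 4 + tag_len else None


# Hypothetical transcript with two CATG sites; only the 3'-most one is used
example = "AACATGTTTTCATG" + "ACGTACGTACGTACGTA" + "GGG"
tag = most_3prime_catg_tag(example)
```

Because each transcript contributes a single tag position, counting tags gives one read per transcript molecule, independent of transcript length.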
DeepSAGE – sample preparation: PCR enrichment and gel purification (~85 bp)
Example gene: Gapd 14542 12555
Expression profiling in a human cohort
• 105 subjects with GWAS and phenotype data
• RNA isolated from whole blood
• Expression profiling by DeepSAGE
• 95 passed all QC
Analysis pipeline
• Trimming / addition of nucleotides
• Genome alignment (Bowtie)
• UCSC genome browser .wiggle files for visualization
• Annotation (ENSEMBL/Biomart)
• Reads summed per gene
• OR tagwise analysis
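The last two steps (reads summed per gene, or tagwise analysis) differ only in whether tag counts are collapsed by annotation. A minimal Python sketch of the per-gene option follows. The function name, tag identifiers, gene names and counts are hypothetical, and the annotation lookup is a plain dictionary standing in for ENSEMBL/Biomart output.

```python
from collections import defaultdict


def sum_reads_per_gene(tag_counts, tag_to_gene):
    """Collapse tag-level counts to gene-level counts.

    tag_counts:  {tag_id: read count}
    tag_to_gene: {tag_id: gene name}; unannotated tags are skipped
    """
    gene_counts = defaultdict(int)
    for tag, count in tag_counts.items():
        gene = tag_to_gene.get(tag)
        if gene is not None:
            gene_counts[gene] += count
    return dict(gene_counts)


# Hypothetical data: two tags in one gene, one tag in another, one unannotated
tag_counts = {"tag1": 120, "tag2": 30, "tag3": 7, "tag4": 2}
tag_to_gene = {"tag1": "GENE_A", "tag2": "GENE_A", "tag3": "GENE_B"}
per_gene = sum_reads_per_gene(tag_counts, tag_to_gene)
```

A tagwise analysis would instead keep `tag_counts` as it is, which preserves differences between tags within one gene (for example alternative 3’ ends) at the cost of more tests.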
Gender-specific gene expression
Figure: normalized expression of Y-chr genes and normalized XIST expression in male and female samples
Genes associated with BMI
• Differential expression analysis
  1. In edgeR (designed for count data)
  2. In limma (designed for microarray data; voom: mean-variance model)
• Gender as confounder
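limma expects continuous expression values, and voom bridges the gap from counts by working on log2 counts per million before modelling the mean-variance trend. The Python sketch below shows only that transformation, using the offsets from the voom definition (0.5 added to each count, 1 added to the library size). It is an illustration, not a replacement for the R packages. The function name and the example counts are hypothetical.

```python
import math


def log2_cpm(counts, prior=0.5):
    """log2 counts per million for one sample.

    counts: {gene: read count}
    Uses log2((count + 0.5) / (library size + 1) * 1e6).
    """
    lib_size = sum(counts.values())
    return {
        gene: math.log2((c + prior) / (lib_size + 1.0) * 1e6)
        for gene, c in counts.items()
    }


# Hypothetical sample with one unexpressed and one highly expressed gene
sample = {"GENE_A": 0, "GENE_B": 999999}
values = log2_cpm(sample)
```

The offsets keep zero counts finite, which matters for tag-based data where many genes have few or no reads in a given sample.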
Limma and edgeR reasonably consistent
Figure: -log10 P-values from limma and edgeR; highly expressed genes in red
Example polyA profiling on Helicos (Eleonora de Klerk)
Oculopharyngeal muscular dystrophy: general switch to shorter 3’-UTRs (Eleonora de Klerk)
Example RNA-Seq (Helicos): ADAMTS8, ADAMTS15, NOV (Peter Henneman)
Analysis of pre-mRNA processing: pre-mRNA, splicing intermediate mRNA, mature mRNA (Irina Pulyakhina)
Pre-mRNA analysis tools
• map to both exon-exon junctions and introns
• prioritize intronic alignments
• report multiple alignments
• deal with both low and high coverage
• deal with indels and mismatches
• find novel exons and splice sites
• look for both canonical and non-canonical splice sites
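The last requirement can be made concrete with a small Python sketch that labels an intron by its terminal dinucleotides. GT-AG is treated as canonical, and GC-AG and AT-AC as the commonly reported non-canonical pairs. The function name and the three output labels are choices made for this example, and the intron sequence is assumed to be given on the transcribed strand.

```python
CANONICAL = ("GT", "AG")
KNOWN_NONCANONICAL = {("GC", "AG"), ("AT", "AC")}


def classify_splice_site(intron):
    """Classify an intron by its first and last two bases.

    Assumes `intron` is the intron sequence on the transcribed strand.
    """
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold donor and acceptor sites")
    pair = (seq[:2], seq[-2:])  # (donor dinucleotide, acceptor dinucleotide)
    if pair == CANONICAL:
        return "canonical"
    if pair in KNOWN_NONCANONICAL:
        return "known non-canonical"
    return "other"


# Hypothetical intron sequences
labels = [
    classify_splice_site("GTAAGTTTTCAG"),
    classify_splice_site("GCAAGTTTTCAG"),
    classify_splice_site("ATATCCTTTTAC"),
    classify_splice_site("CTAAAC"),
]
```

An aligner that only accepts the canonical pair would never report junctions in the second and third classes, which is why the requirement is listed separately.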
GSNAP
Difference between TopHat and GSNAP results
Figure: TopHat alignment vs. GSNAP alignment of short T-rich reads (e.g. TTTTTTTGT) against the sequence GTATCGATTTTTTTGT...
Figure: read pairs classified as pre-splicing, intermediate (pre+post) or post-splicing from the mate combinations of int, ex, e-i and e-e reads (int/int, int/ex, int/e-i, int/e-e, ex/ex, ex/e-i, ex/e-e, e-i/e-i, e-i/e-e, e-e/e-e), shown for normal (standard) and extremely large insert sizes; one class is marked "?"
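One possible way to turn the mate labels (int, ex, e-i, e-e) into the three classes is sketched below in Python. The rule is an assumption made for this example and is not necessarily the scheme used in the original analysis. It treats int and e-i mates as evidence of unspliced RNA and e-e mates as evidence of spliced RNA, calls a pair with both kinds of evidence intermediate, and leaves purely exonic pairs unclassified. Insert size is not modelled.

```python
UNSPLICED = {"int", "e-i"}   # intronic or exon-intron boundary read
SPLICED = {"e-e"}            # exon-exon junction read
VALID = UNSPLICED | SPLICED | {"ex"}


def classify_pair(mate1, mate2):
    """Assign a read pair to a splicing class (assumed rule, see text)."""
    mates = {mate1, mate2}
    if not mates <= VALID:
        raise ValueError("unknown mate label")
    has_unspliced = bool(mates & UNSPLICED)
    has_spliced = bool(mates & SPLICED)
    if has_unspliced and has_spliced:
        return "intermediate"
    if has_unspliced:
        return "pre-splicing"
    if has_spliced:
        return "post-splicing"
    return "uninformative"   # ex/ex: compatible with any state


classes = {
    ("int", "ex"): classify_pair("int", "ex"),
    ("e-i", "e-e"): classify_pair("e-i", "e-e"),
    ("ex", "e-e"): classify_pair("ex", "e-e"),
    ("ex", "ex"): classify_pair("ex", "ex"),
}
```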
Plans for GEUVADIS
• Transcription of repeat sequences such as Macro Satellite Repeats
• Study effect on local and global gene expression
• Study heterogeneity of transcripts expressed from repeats
FSHD: disease mechanism Lemmers et al. Science 329:1650-3 (2010)
Acknowledgements
Rick Jansen, Jeroen van Zanten, Gerard van Grootheest, Brenda Penninx, Jan Smit, Joukejan Hottenga, Gonneke Willemsen, Dorret Boomsma, Eco de Geus, Shoaib Amini, Irina Pulyakhina, Eleonora de Klerk, Henk Buermans, Yanju Zhang, Kai Ye, Jeroen Laros, Johan den Dunnen, Gertjan van Ommen, NTR