Geuvadis WP4: RNA sequencing. Progress, Aims and Data
Tuuli Lappalainen, University of Geneva
Geuvadis Analysis Group Meeting II, July 11, 2012, Barcelona
Genomics, meet transcriptomics
RNA sequencing of ~500 individuals from the 1000 Genomes
Populations: FIN, GBR, CEU, TSI, YRI
Integrated haplotypes of SNPs, indels, structural variants: 38 M total, 10.7 M common (>5% MAF)
1000 Genomes Phase 1 paper has just been submitted to Nature
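The "common (>5% MAF)" cut-off above can be illustrated with a minimal sketch. It assumes a biallelic site and that alternate-allele and total allele counts are already in hand; the function names and the example counts are hypothetical and not taken from the 1000 Genomes or Geuvadis pipelines.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Return the minor allele frequency of a biallelic site.

    alt_count     -- number of observed alternate alleles (assumed input)
    total_alleles -- number of called alleles, i.e. 2 x diploid samples
    """
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= alt_count <= total_alleles:
        raise ValueError("alt_count must lie between 0 and total_alleles")
    alt_freq = alt_count / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def is_common(alt_count, total_alleles, threshold=0.05):
    """True if the site passes the >5% MAF cut-off quoted on the slide."""
    return minor_allele_frequency(alt_count, total_alleles) > threshold


# Hypothetical counts: 30 alternate alleles among 200 called alleles.
print(is_common(30, 200))
print(is_common(5, 200))
```

Taking the minimum of the two allele frequencies keeps the result independent of which allele happens to be labelled as the reference.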
Why are we doing this?
• Technical
  • How to do RNAseq in a distributed setting?
• Biological
  • How does genome variation affect transcriptome variation?
• Resource
  • The biggest genome+transcriptome reference dataset thus far
Progress and timeline
[Timeline figure, 2011 to 2012. Labels: mapping, QC, cell line shipments and growing, Biology of Genomes, sequencing, analysis!, 1st analysis meeting, study design, sample selection, RNA extraction, 10/1212 paper submission]
Sequencing data
[Figure panels: mRNA, miRNA]
Status of the data: mRNA
• Available data
  • fastqs
  • bams: GEM (and bwa)
  • quantifications (Gencode v12)
    • exons: read counts
    • genes: read counts
    • transcripts from flux: read counts & RPKM
    • splice junctions… almost
    • introns… almost
  • QC
    • final set of samples to include & covariate analysis almost done…
• Pending:
  • exon inclusion/exclusion quantifications
  • qualitative/quantitative analysis
  • final exon links
  • N-TARs
  • fusion genes final version?
  • …
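For readers unfamiliar with the RPKM unit listed next to the transcript read counts, here is a minimal sketch of the standard definition (reads per kilobase of feature per million mapped reads). It is not the flux implementation; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads Per Kilobase of feature per Million mapped reads.

    read_count         -- reads assigned to the feature (gene, exon, transcript)
    feature_length_bp  -- feature length in base pairs
    total_mapped_reads -- total mapped reads in the sample
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kb transcript, 20 M mapped reads.
print(rpkm(read_count=500, feature_length_bp=2000, total_mapped_reads=20_000_000))
```

Raw read counts depend on both feature length and sequencing depth, which is why the normalised value is reported alongside them.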
Status of the data: miRNA
• Available data
  • fastqs
  • first run of
    • QC
    • quantifications
    • novel miRNAs
• To Do:
  • final versions of the analysis
  • …
Status of the data: genotypes
• Available data
  • combined QCd genotype files (vcf)
    • 423 individuals from 1000g Phase 1 + 22 individuals imputed from Omni2.5M data
  • annotation of the variants
    • all the variants reannotated using Gencode v12 by LoF group, in a new format in the vcf info field
    • vcf sites file available, documentation coming this week
• To Do
  • Eigenstrat
Data and documentation
• Data storage
  • ftp
  • ENA / ArrayExpress
• Documentation: wiki
  • keep on documenting your work...
  • switch to a new system?
• Writing tools
• Mailing lists:
  • geuvadis_rnaseq@lists.crg.es
  • geuvadis_rna_analysis@lists.crg.es
  • lofgeuvadis@googlegroups.com
The consortium
UNIGE (Geneva): Manolis Dermitzakis, Stylianos Antonarakis, Tuuli Lappalainen, Thomas Giger, Ismael Padioleau, Alisa Yurovsky, Halit Ongen, Alfonso Buil, Emilie Falconnet, Luciana Romano, Alexandra Planchon
CRG/CNAG/USC (Barcelona): Xavier Estivill, Ivo Gut, Roderic Guigo, Angel Carracedo Alvarez, Gabrielle Bertier, Micha Sammeth, Thasso Griber, Paolo Ribeca, Jean Monlong, Pedro Ferreira, Esther Lizano, Marc Friedländer, Marta Gut, Sergi Bertran Agullo
ICMB (Kiel): Stefan Schreiber, Philip Rosenstiel, Matthias Barann, Daniela Esser
MPIMG (Berlin): Hans Lehrach, Ralf Sudbrak, Marc Sultan, Vyacheslav Amstislavskiy
HMGU (Munich): Thomas Meitinger, Tim Strom, Thomas Wieland, Thomas Schwarzmayr
LUMC (Leiden): Gert-Jan van Ommen, Peter 't Hoen, Irina Pulyakhina, Henk Buermans
UU (Uppsala): Ann-Christine Syvänen, Olof Karlberg, Jonas Almlöf, Mathias Brännvall
EBI: Alvis Brazma, Natalja Kurbatova, Mar Gonzalez-Porta, Liliana Greger
Oxford University: Mark McCarthy, Manuel Rivas
Massachusetts General Hospital: Daniel McArthur, Monkol Lek
ECACC: Bryan Bolton, Karen Ball, Edward Burnett, Jim Cooper
The me who is missing!