
Geuvadis Analysis Meeting

Geuvadis Analysis Meeting. 16/02/2012. Micha Sammeth, CNAG – Barcelona. Quantification of Splice-Forms and Variants.


Presentation Transcript


  1. Geuvadis Analysis Meeting, 16/02/2012. Micha Sammeth, CNAG – Barcelona

  2. Quantification of Splice-Forms and Variants
     - Quantified 615 datasets based on the Gencode v7 annotation
     - Sensitivity is a function of sequencing depth
     - For every transcript: normalized RPKM values and number of deconvoluted reads
     - Correlation coeff. 0.87 (Pearson and Spearman)
     - Discussion at the end: if/what to do before uploading
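
The slide does not say how the RPKM values were normalized. As a minimal sketch, assuming the standard RPKM definition (reads per kilobase of transcript per million mapped reads), the per-transcript value could be computed from the deconvoluted read count as follows. The function name, parameter names, and example numbers are hypothetical and not taken from the Geuvadis pipeline.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    read_count may be fractional, since reads deconvoluted between
    overlapping transcripts need not be whole numbers.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million mapped reads)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kb transcript, 10 million mapped reads.
example_rpkm = rpkm(500, 2000, 10_000_000)
print(example_rpkm)  # 25.0
```

Because the library size appears in the denominator, values from datasets of different sequencing depth become comparable, although low-abundance transcripts can still be missed in shallow datasets.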

  3. LoF Definitions [MacArthur et al. 2012]
     - LoF = loss of function of a complete transcript
     - LoF types:
       - SNP that (directly) introduces a stop codon
       - Indels that disrupt/shift the reading frame
       - SNP that disrupts a splice site
       - Larger deletions that remove the 1st exon or >50% of the transcript
     - LoF scope:
       - “partial” LoF affects just some protein-coding transcripts in a locus
       - “full” LoF affects all annotated protein-coding transcripts
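
The scope and large-deletion rules above can be sketched in a few lines. This is only an illustration of the definitions as stated on the slide, not the MacArthur et al. implementation; the function names, the transcript identifiers, and the extra "none" label for variants that hit no protein-coding transcript are assumptions.

```python
def lof_scope(affected, protein_coding):
    """Classify the scope of a LoF variant within one locus.

    affected:       set of transcript IDs disrupted by the variant
    protein_coding: set of annotated protein-coding transcript IDs
    """
    hit = affected & protein_coding
    if not hit:
        return "none"  # assumption: label for "no protein-coding transcript hit"
    return "full" if hit == protein_coding else "partial"


def deletion_is_lof(removes_first_exon, deleted_bp, transcript_length_bp):
    """Large deletion rule: removes the 1st exon or more than 50% of the transcript."""
    return removes_first_exon or deleted_bp > 0.5 * transcript_length_bp


# Hypothetical locus with two protein-coding transcripts, T1 and T2.
print(lof_scope({"T1"}, {"T1", "T2"}))        # partial
print(lof_scope({"T1", "T2"}, {"T1", "T2"}))  # full
print(deletion_is_lof(False, 600, 1000))      # True
```

Note that the scope depends on the annotation used: adding or removing transcripts in a locus can turn a "full" LoF into a "partial" one.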

  4. LoF Estimates [MacArthur et al. 2012]
     - (figure: LoF estimates by type, i.e. stop, splice, frameshift indel, large deletion; panels "across populations" and "in a single individual")

  5. Compare RNA-Seq evidence to LoF predictions
     - Main difference Geuvadis vs. 1000 Genomes: RNA-Seq vs. DNA-Seq
     - (diagram labels: frameshift indel; large deletion; directly from mappings / coverage by mappings; predicted disruption of splice site; indirectly called from mappings)

  6. Confirmation of LoF SNPs in Geuvadis: Stop
     - Take phase 1 samples where polymorphisms have been found by exome sequencing
     - Additionally call SNPs by RNA-Seq (excessive mappings)
     - ~5000 differences, i.e. on average >2 out of 1000 calls differ
     - Example (not Geuvadis): >2 million genotype calls possible in both experiments (sufficient coverage in DNA and sufficient coverage in RNA)
       - ~1000 cases where RNA is homozygous and DNA is not: could be explained by allele-specific expression
       - ~4000 cases where DNA is homozygous and RNA is not (!!!)
     - Remove FPs from computational or experimental artifacts (PCR artifacts?)
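
The comparison of DNA and RNA genotype calls can be sketched as below. This is a toy illustration under stated assumptions: genotypes are given as unphased strings such as "A/G", only sites called in both experiments are compared, and the site names and category labels are hypothetical. The last lines redo the arithmetic from the slide.

```python
from collections import Counter


def is_homozygous(genotype):
    """True if both alleles of an unphased genotype such as 'A/A' are identical."""
    first, second = genotype.split("/")
    return first == second


def compare_calls(dna, rna):
    """Count concordant and discordant genotype calls at sites present in both inputs."""
    counts = Counter()
    for site in dna.keys() & rna.keys():
        d, r = dna[site], rna[site]
        if sorted(d.split("/")) == sorted(r.split("/")):
            counts["concordant"] += 1
        elif is_homozygous(r) and not is_homozygous(d):
            counts["rna_hom_dna_het"] += 1  # candidate allele-specific expression
        elif is_homozygous(d) and not is_homozygous(r):
            counts["dna_hom_rna_het"] += 1  # candidate artifact
        else:
            counts["other_discordant"] += 1
    return counts


# Hypothetical calls; "chr1:400" and "chr1:500" lack coverage in one experiment.
dna_calls = {"chr1:100": "A/G", "chr1:200": "A/A", "chr1:300": "G/G", "chr1:400": "A/G"}
rna_calls = {"chr1:100": "A/A", "chr1:200": "A/G", "chr1:300": "G/G", "chr1:500": "C/C"}
result = compare_calls(dna_calls, rna_calls)
print(dict(result))

# Slide arithmetic: ~5000 differences among ~2 million comparable calls.
differences_per_thousand = 1000 * 5000 / 2_000_000
print(differences_per_thousand)  # 2.5
```

Keeping the two discordant directions separate matters because they have different plausible explanations: an RNA-homozygous call at a DNA-heterozygous site can be biological, while the reverse cannot be explained by expression alone.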

  7. Allele-specific RNA Processing [Montgomery 2010 dataset]
     - (figure: relative abundance distribution of the 1st form and of the 2nd form for genotypes A/A, A/G, G/G, i.e. homozygote common allele, heterozygote, homozygote minority allele; abundance labels 100%, 50%, "0% or 100%", "0% or 50%")

  8. LoF and Alternative Splicing (AS)
     - “28.7% LoF events in a single individual affect only a subset of the known transcripts from the affected gene, emphasizing the need to consider alternative splicing” [MacArthur et al. 2012]
     - (1) classification of AS influences in LoF based on a certain annotation
     - (2) extension of an annotation by RNA-Seq evidence
     - (figure: 5’ frame and 3’ frame of exons; activation of latent splice sites)

  9. (1) Classification of AS: AStalavista
     - (figure: splicing graph for transcripts 1 to 7, with a highlighted bubble)

  10. (2) AS discovery by RNA-Seq
      - Novel exon junctions supported by RNA-Seq add to the graph
      - Novel events extend annotated CDSs
      - (figure: splicing graph for transcripts 1 to 7)
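
Finding novel exon junctions amounts to a set difference between the junctions observed in RNA-Seq mappings and those implied by the annotation. The sketch below illustrates the idea only; it is not the AStalavista or GEM implementation. It assumes a single locus on one strand, exons given as (start, end) pairs sorted by position, and a hypothetical `min_reads` support threshold that does not appear on the slide.

```python
def annotated_junctions(transcripts):
    """Collect (donor, acceptor) pairs between consecutive exons of each transcript.

    transcripts: dict mapping a transcript ID to its exons as (start, end)
                 tuples, sorted by genomic position.
    """
    junctions = set()
    for exons in transcripts.values():
        for (_, donor), (acceptor, _) in zip(exons, exons[1:]):
            junctions.add((donor, acceptor))
    return junctions


def novel_junctions(observed, transcripts, min_reads=2):
    """Return observed junctions with enough read support that are not annotated.

    observed:  dict mapping (donor, acceptor) to the number of supporting reads
    min_reads: hypothetical support threshold, an assumption of this sketch
    """
    known = annotated_junctions(transcripts)
    return {j for j, n in observed.items() if n >= min_reads and j not in known}


# Hypothetical annotation: T2 skips the middle exon of T1.
annotation = {
    "T1": [(100, 200), (300, 400), (500, 600)],
    "T2": [(100, 200), (500, 600)],
}
observed_junctions = {(200, 300): 40, (200, 350): 5, (400, 500): 12, (250, 500): 1}
novel = novel_junctions(observed_junctions, annotation)
print(novel)  # {(200, 350)}
```

Each junction returned this way would become a new edge in the splicing graph, which in turn can create new bubbles, i.e. new AS events.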

  11. My Points
      - Quantifications: do you want a normalization before uploading, or is this the responsibility of the analyzing group?
      - Quantifications:
      - Timeline for studies: main paper Oct to end of the year.
      - Separate publications possible if there is sufficient material for a separate story?
      - What would be the constraints for a separate publication on Geuvadis data?

  12. Acknowledgements
      - Thasso Griebel (PhD): Error Models, Pipelining
      - Paolo Ribeca (PhD), Santiago Marco: GEM mapper + conversion
      - Emanuele Raineri (PhD): SNP calling
