
GenXPro GmbH, Frankfurt am Main genxpro.de

Nucleotide-based Information Analysis. GenXPro GmbH, Frankfurt am Main, www.genxpro.de. Our Service Portfolio: nucleotide-based information. Transcriptomics: RNAseq; SuperSAGE, MACE; Real-Time qPCR service; normalization of cDNA libraries

Presentation Transcript


  1. Nucleotide-based Information Analysis GenXPro GmbH, Frankfurt am Main www.genxpro.de

  2. Our Service Portfolio: nucleotide-based information. Transcriptomics: RNAseq; SuperSAGE, MACE; Real-Time qPCR service; normalization of cDNA libraries; microRNA. Genomics: genetic markers; genotyping; copy number variations; de novo sequencing, hybrid sequencing; exome sequencing; ChIP-Seq. Epigenomics: Methylation-Specific DK (MSDK); MethSeq. Bioinformatics: NGS data handling, assembly, quantification, annotation, SNP discovery, data interpretation.

  3. NGS platforms: read length, throughput, costs. Illumina HiSeq 2000: 1,600 million sequences, length 2 x 100 bp. PacBio: 0.85 million sequences, length ~500-10,000 bp.

  4. Transcriptomics: some frequent, many rare transcripts. Figure panels: frequencies of transcript species; total transcript distribution. Frequent transcripts make up 50 % of all transcripts. Most of the transcript species are expressed at low levels (below 10 copies per million).
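
The "copies per million" scale used on this slide can be illustrated with a short sketch. The transcript names and counts below are made up for illustration; only the threshold of 10 copies per million is taken from the slide.

```python
def copies_per_million(counts):
    """Scale raw transcript counts to copies per million of the library total."""
    total = sum(counts.values())
    return {name: n * 1_000_000 / total for name, n in counts.items()}


def rare_fraction(counts, threshold=10.0):
    """Fraction of transcript species expressed below `threshold` copies per million."""
    cpm = copies_per_million(counts)
    rare = [name for name, value in cpm.items() if value < threshold]
    return len(rare) / len(cpm)


# Hypothetical library: two abundant species and two rare ones.
example_counts = {"abundant_1": 500_000, "abundant_2": 499_990, "rare_1": 5, "rare_2": 5}
example_cpm = copies_per_million(example_counts)
example_rare = rare_fraction(example_counts)
```

In this toy library half of the species are rare, yet together they account for only a tiny share of all transcripts, which is the pattern the slide describes.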

  5. One solution: RNA-Seq

  6. GenXPro's optimized RNAseq protocol • Strand information is maintained: sense and antisense transcripts can be distinguished • Ligation-based method for sample preparation: avoids the usual bias observed when hexamers are used for second-strand synthesis • Synthesis of artificial "antisense" transcripts is avoided

  7. Low-cost, digital gene expression analysis. Problem: RNAseq is still costly compared to microarrays. Our solution = MACE: only the cDNA 3' ends are sequenced. Disadvantage compared to RNAseq: fewer variants can be distinguished; SNPs or fusion transcripts might be missed. Advantages: low costs (only 1/20th of the sequencing depth required for similar resolution); very good quantification; alternative polyadenylation is revealed; concentration on the most polymorphic site in a gene; highly specific for good annotation.

  8. Massive Analysis of cDNA Ends (MACE): How it works. Figure labels: cDNA molecules (5' to 3') with poly-A/poly-T ends (AAAAAAA-3' / TTTTTTT-5'); Streptavidin-Beads.

  9. Massive Analysis of cDNA Ends (MACE): How it works. Fragmentation. Figure labels: cDNA 3' ends (AAAAAAA-3' / TTTTTTT-5') on Streptavidin-Beads; fragments of 100-300 bp.

  10. Massive Analysis of cDNA Ends (MACE): How it works. 2nd generation sequencing of 50-100 bp. Figure labels: cDNA 3' ends (AAAAAAA-3' / TTTTTTT-5').

  11. Massive Analysis of cDNA Ends (MACE): How it works. Assembly & Counting. Figure labels: cDNA 3' ends (AAAAAAA-3' / TTTTTTT-5').

  12. Massive Analysis of cDNA Ends (MACE): How it works. Assembly & Counting. Figure labels: tag counts 1, 1, 4; Counting, BLAST; 50-400 bp. Only one fragment per transcript!
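
Because each transcript molecule contributes only one fragment, the counting step reduces to tallying how many reads were assigned to each transcript. A minimal sketch, assuming the reads have already been assigned to transcripts (for example by BLAST or mapping); the transcript identifiers are hypothetical and the counts 1, 1, 4 mirror the figure labels:

```python
from collections import Counter


def count_tags(assignments):
    """Count reads per transcript; with one fragment per molecule, one read is one transcript copy."""
    return Counter(assignments)


# Hypothetical per-read assignments, one entry per sequenced MACE read.
read_assignments = [
    "transcript_A",
    "transcript_B",
    "transcript_C", "transcript_C", "transcript_C", "transcript_C",
]
tag_counts = count_tags(read_assignments)
```

No length correction is applied here, since the number of reads per transcript does not depend on transcript length when only the 3' end is sequenced.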

  13. Advantages of MACE or SuperSAGE vs. microarrays. Example: 4,455,653 tags from mouse spleen. More than 75 % rare transcripts: this information is lost on microarrays! Only this part is visible for microarrays. >18,000 different transcripts excluding the singletons*. *>13,000 singletons with distinct matches to the NCBI-DB.

  14. TranSNiPtomics: why MACE? High coverage to distinguish between SNP and error. MACE: concentration on the polymorphic 3' end; SNPs with enough coverage: 2. RNA-Seq: reads distributed all over the transcript; SNPs with enough coverage: 0. Figure labels: cDNA with poly-A/poly-T end; reads.
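
Why coverage matters for telling a SNP from a sequencing error can be shown with a toy classifier. This is only a sketch: the thresholds `min_coverage` and `min_alt_fraction` are illustrative assumptions, not values from the presentation, and real variant callers use more elaborate statistics.

```python
from collections import Counter


def call_position(bases, reference, min_coverage=10, min_alt_fraction=0.2):
    """Classify one position from the bases observed in the reads covering it."""
    coverage = len(bases)
    if coverage < min_coverage:
        return "insufficient coverage"
    alt_counts = Counter(b for b in bases if b != reference)
    if not alt_counts:
        return "reference"
    alt_base, alt_reads = alt_counts.most_common(1)[0]
    if alt_reads / coverage >= min_alt_fraction:
        return f"SNP {reference}>{alt_base}"
    return "likely sequencing error"


# Many reads stacked on the same 3' end: the alternative allele is well supported.
deep_site = call_position(list("A" * 20 + "G" * 20), reference="A")
# The same heterozygous site seen by only two reads: no decision possible.
shallow_site = call_position(list("AG"), reference="A")
# A single deviating base among 40 reads looks like an error, not a SNP.
error_site = call_position(list("A" * 39 + "G"), reference="A")
```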

  15. Alternative Polyadenylation. Figure labels: signal hexamer; 3' UTR; MACE tags of short variant; poly-A tail; miRNA; MACE tags of long variant. • Up to 50% of transcripts are differentially polyadenylated (Tian et al., 2005). • Escape from miRNA-based regulation: up to 10x more protein at similar transcript levels! • Influences mRNA nuclear export and cellular localization • Oncogenes often have shorter 3' UTRs because of APA (Mayr & Bartel, 2009) • Stem cells show longer, differentiated cells shorter 3' UTRs (Shepard et al., 2011) • APA is an extremely important feature of a transcript and should be assessed for every transcript.

  16. Coverage for SNP detection. Wheat, nucleosome/chromatin assembly factor C; 160 TPM. MACE, 20 million reads. Sufficient coverage for SNP detection!

  17. Coverage for SNP detection. Wheat, nucleosome/chromatin assembly factor C. RNA-Seq, 20 million reads, same position. Coverage too low!

  18. RNA-Seq vs. MACE. Figure labels: cDNA of transcript A; cDNA of transcript B. RNA-Seq: many reads per transcript, and the number of reads per transcript varies! MACE: one read = one transcript. For the same depth of analysis, RNA-Seq requires about 20-30 times more sequencing*. *Asmann et al., 2009

  19. TrueQuant technology to solve the problem of PCR-introduced bias. Certain tags or fragments are preferentially amplified during PCR, resulting in biased quantification. The solution: GenXPro's bias-proof "TrueQuant" technology: each molecule is individually labeled prior to PCR; copies are eliminated from the dataset, resulting in secure quantification.
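
The idea of labeling each molecule before PCR and discarding copies afterwards can be sketched as follows. This is not GenXPro's implementation, which the slides do not describe in detail; it assumes each read carries a (tag sequence, molecule label) pair and treats identical pairs as PCR copies of one original molecule. All names and values are hypothetical.

```python
from collections import Counter


def raw_counts(reads):
    """Count every read, PCR copies included."""
    return Counter(tag for tag, _label in reads)


def corrected_counts(reads):
    """Count each (tag, label) pair once, so PCR copies of a molecule collapse to one."""
    unique_molecules = set(reads)
    return Counter(tag for tag, _label in unique_molecules)


# Hypothetical reads: tagA came from two molecules, one of them amplified five-fold.
example_reads = (
    [("tagA", "label_1")] * 5
    + [("tagA", "label_2")]
    + [("tagB", "label_3")] * 2
)
uncorrected = raw_counts(example_reads)
corrected = corrected_counts(example_reads)
```

In this toy data the uncorrected ratio of tagA to tagB is 6:2, while the corrected ratio is 2:1, which reflects the number of original molecules.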

  20. Problem of PCR-introduced bias: TrueQuant-corrected vs. uncorrected data. Plotted value: negative common logarithm of the p-value for differential expression in gene expression comparisons (Audic & Claverie, 1997). Data from human pancreatic tumors.
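
The plotted quantity can be sketched from the Audic & Claverie (1997) conditional probability of observing y tags in a library of size N2 given x tags in a library of size N1: p(y|x) = (N2/N1)^y (x+y)! / (x! y! (1 + N2/N1)^(x+y+1)). How the presentation turned this into a p-value is not stated; the one-sided tail used below (the smaller of the lower and upper cumulative tails) is an assumption, as are all function names.

```python
import math


def ac_prob(y, x, n1, n2):
    """p(y | x) after Audic & Claverie (1997), evaluated in log space for stability."""
    r = n2 / n1
    log_p = (
        y * math.log(r)
        + math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
        - (x + y + 1) * math.log(1.0 + r)
    )
    return math.exp(log_p)


def tail_pvalue(x, y, n1, n2, max_terms=100_000):
    """Smaller of the two cumulative tails of p(. | x) at the observed y (assumed convention)."""
    lower = sum(ac_prob(k, x, n1, n2) for k in range(y + 1))
    upper = 0.0
    mode = x * n2 / n1  # terms shrink monotonically beyond roughly this point
    for k in range(y, y + max_terms):
        term = ac_prob(k, x, n1, n2)
        upper += term
        if k > mode and term <= upper * 1e-16:
            break
    return min(lower, upper, 1.0)


def neg_log10_p(x, y, n1, n2):
    """Negative common logarithm of the p-value, the quantity named on the slide."""
    p = tail_pvalue(x, y, n1, n2)
    return math.inf if p == 0.0 else -math.log10(p)
```

With equal library sizes, a gene counted 10 times in both libraries gets a score near zero, while 10 versus 100 counts gives a large score.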

  21. MACE technical replicate: 10 million reads, barley. Pearson correlation coefficient = 0.9983357 (!)
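
For reference, the Pearson correlation coefficient between the per-transcript counts of two replicates can be computed as in the sketch below. The replicate values are invented, and whether the presentation correlated raw or transformed counts is not stated.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical tag counts for the same five transcripts in two technical replicates.
replicate_1 = [120, 15, 3, 980, 44]
replicate_2 = [118, 17, 2, 1010, 41]
r = pearson(replicate_1, replicate_2)
```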

  22. Bioinformatics: automated workflow for model and non-model organisms. Figure labels: tags; annotation / mapping (Gene 1, Gene 2, unknown); assembly of unknown tags; BLASTX (protein DBs); quantification; web tool "MACE2GO"; enrichment analysis.

  23. Normalization of cDNA libraries: frequent transcripts are strongly reduced. Figure panels: cDNA before normalization; cDNA after normalization.

  24. microRNAs and the RNA-degradome. Figure labels: microRNA; mRNA ends; mRNA with poly-A tail (AAAAAAA-3'). Next-Gen-Sequencing, counting, BLAST.

  25. Genomic DNA

  26. Genotyping by Sequencing: Reduced Complexity Genomic Sequencing. 1. Digestion with restriction enzyme. Figure labels: DNA – SAMPLE I; DNA – SAMPLE II.

  27. Methylation-Specific Digital Karyotyping (MSDK). 2. Adapter ligation, sequencing. Figure labels: DNA – SAMPLE I; DNA – SAMPLE II. 3. Counting (CNVs), comparison, annotation.

  28. Methylation-Specific Digital Karyotyping (MSDK). 1. Digestion with methylation-sensitive restriction enzyme. Figure labels: methylation mark (M); DNA – SAMPLE I; DNA – SAMPLE II.

  29. Methylation-Specific Digital Karyotyping (MSDK). 2. Adapter ligation, sequencing. Figure labels: DNA – SAMPLE I; DNA – SAMPLE II. 3. Counting, annotation: counting, BLAST, statistics.

  30. Exome sequencing of genomic DNA. Figure labels: exons; introns. Steps: fragmentation (100-300 bp); denaturation; binding to exon-specific, matrix-bound oligos; elution, sequencing, bioinformatics.

  31. Some References
