BEADS: Bias Elimination Algorithm for Deep Sequencing

NICOLE CHEUNG, THOMAS DOWN, JULIE AHRINGER
The Gurdon Institute, University of Cambridge, Cambridge, UK

Bias in deep sequencing data
• Tag counts for C. elegans genomic DNA and ChIP input sequence are not uniform (tag count biases are also observed in human ChIP input sequence (Rozowsky et al. 2009))
• Analysis of ChIP-seq data needs to take bias into account:
  • Peak calling
    • Input sequence is used for significance scoring of peaks
    • But only regions enriched in ChIP are processed (e.g., MACS, PeakSeq, SPP, …)
  • Plotting signal across features (e.g., TSS, exons, …)
    • Need a method to correct signal at all positions

Patterns correlate with GC content and mappability differences
• Low read counts in regions of low mappability
• Genomic and input DNA patterns are similar to the GC content track

GC rich sequences are over-represented
• Low % GC is under-represented
• High % GC is over-represented
[Figure: % GC of input sequence compared with % GC of genome; x-axis: % GC, 0 to 80]
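The over- and under-representation described on this slide can be quantified by comparing the % GC distribution of sequenced reads with that of the genome. The following is a minimal sketch, not the authors' code: the function names, the 10% bin width, and the idea of sampling read-length windows from the genome are all assumptions made for illustration.

```python
from collections import Counter


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_histogram(seqs, bin_width=10):
    """Fraction of sequences falling in each % GC bin (keyed by bin start)."""
    counts = Counter(int(gc_percent(s) // bin_width) * bin_width for s in seqs)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}


def representation_ratio(read_seqs, genome_windows, bin_width=10):
    """Ratio of read frequency to genome frequency per % GC bin.

    Values above 1 indicate over-representation in the reads,
    values below 1 indicate under-representation.
    Only bins that occur in the genome sample are reported.
    """
    reads = gc_histogram(read_seqs, bin_width)
    genome = gc_histogram(genome_windows, bin_width)
    return {b: reads.get(b, 0.0) / genome[b] for b in sorted(genome)}
```

Under these assumptions, a ratio that rises with % GC would correspond to the pattern on the slide: low % GC bins below 1 and high % GC bins above 1.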

Patterns of % GC, mappability, and ChIP input sequence across C. elegans exons and promoters
[Figure: panels EXONS and TSSs (axis labels: 500 bp intron, >100 bp, 500 bp intron); tracks: % GC, mappability, raw input sequence]
(Peaks in TSSs of human genes observed by Rozowsky et al. 2009, PeakSeq paper)

Tag counts of raw human genomic sequence across exons

Normalization strategy
• 1) G+C count normalization
  • Apply on every read
• 2) Mappability correction
  • Apply on each genomic location
• 3) Residual local effect, e.g. DNA accessibility
  • Correct using information from input
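The three steps above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the BEADS implementation: the slides do not give the formulas, so the per-read weight (genome frequency divided by read frequency for each G+C count), the mappability cutoff `min_map`, and the scaling of input by its mean are all hypothetical choices, as are the function names.

```python
from collections import Counter


def gc_count(seq):
    """Number of G and C bases in a read."""
    s = seq.upper()
    return s.count("G") + s.count("C")


def gc_weights(read_seqs, genome_kmers):
    """Step 1 (assumed form): weight per G+C count = genome freq / read freq."""
    reads = Counter(gc_count(s) for s in read_seqs)
    genome = Counter(gc_count(s) for s in genome_kmers)
    n_reads, n_genome = sum(reads.values()), sum(genome.values())
    return {gc: (genome[gc] / n_genome) / (reads[gc] / n_reads) for gc in reads}


def weighted_coverage(reads, weights, genome_length):
    """Apply the weight of every read, given as (start, sequence), to the positions it covers."""
    cov = [0.0] * genome_length
    for start, seq in reads:
        w = weights[gc_count(seq)]
        for pos in range(start, min(start + len(seq), genome_length)):
            cov[pos] += w
    return cov


def mappability_correct(cov, mappability, min_map=0.25):
    """Step 2 (assumed form): divide by mappability at each genomic location.

    Positions below the hypothetical cutoff min_map are masked with None.
    """
    return [c / m if m >= min_map else None for c, m in zip(cov, mappability)]


def local_correct(chip, inp):
    """Step 3 (assumed form): divide ChIP signal by mean-scaled corrected input."""
    valid = [v for v in inp if v is not None]
    mean_inp = sum(valid) / len(valid)
    out = []
    for c, i in zip(chip, inp):
        if c is None or i is None or i == 0:
            out.append(None)  # no usable input information at this position
        else:
            out.append(c / (i / mean_inp))
    return out
```

In this sketch, the input sample would pass through steps 1 and 2 before being used as the denominator in step 3, so that only the residual local effect remains to be divided out.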

Three-step normalization removes bias in ChIP input sequence
• Signal across exons and TSSs is flat in ChIP input sequence after correction
[Figure: panels EXONS and TSSs (axis labels: 500 bp intron, >100 bp, 500 bp intron); correction applied: raw, GC, GC + map, GC + map + local]

H3K4me3
• Previous knowledge: peaks in promoter regions
• Expected promoter peaks remain after normalization
• Exon peaks removed after normalization
[Figure: panels TSSs and EXONS (axis labels: 500 bp intron, >100 bp, 500 bp intron); correction applied: raw, GC, GC + map, GC + map + local]

H3K36me3
• Previous knowledge: on transcribed regions, enriched in exons
• H3K36me3 signal on gene body
• H3K36me3 enrichment on exons
[Figure: panels TSSs and EXONS (axis labels: 500 bp intron, >100 bp, 500 bp intron); correction applied: raw, GC, GC + map, GC + map + local]
