This project aims to evaluate the reproducibility and quality of ChIP-Seq, DNase, and FAIRE datasets by measuring overlap between target lists, assessing replicate quality, and scoring submitted datasets. It also covers which elements to call and how to compare peak callers.
Assessing Quality of ChIP-Seq/DNase/FAIRE Datasets
• Reproducibility
• Threshold too stringent: overlap between target lists (of varying length)
• Assessing quality of replicates (qualitative vs. quantitative)
• Uniformly assess quality of submitted datasets (scores à la Stam presentation)
• Who is going to do this?
• Number of Reads
• DNase1: 20-25 million per replicate
• ChIP-Seq: 6-12 million per replicate
• Controls
• Matched input DNA control or pooled
• Experimental Verification
• qPCR (necessary for publication; gold standard?)
• ChIP-chip
• Southerns for DNase1
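Because the target lists differ in length, a single overlap number can mislead, so it may help to report the overlap relative to each list separately. The following is a minimal sketch, not a method stated on the slides: the interval tuples, the function names, and the toy replicate lists are all hypothetical, and any shared base counts as an overlap.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Interval], reference: List[Interval]) -> float:
    """Fraction of query intervals that overlap any reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Toy target lists of different length (made-up coordinates).
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr2", 900, 1000)]
rep2 = [("chr1", 150, 250), ("chr2", 100, 120)]

frac_rep1 = fraction_overlapping(rep1, rep2)  # share of replicate 1 targets seen in replicate 2
frac_rep2 = fraction_overlapping(rep2, rep1)  # share of replicate 2 targets seen in replicate 1
print(frac_rep1, frac_rep2)
```

The two fractions differ whenever one list is much longer than the other, which is one way the "threshold too stringent" concern could show up in practice.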
Elements to be called (1)
• DNase1 & FAIRE
• Agreement between DNase1 and FAIRE
• Score as peaks (<1 Kb) and broad regions (multi-Kb)
• Integration across labs?
• Superset of all regulatory elements
• Methylation (Methyl-Seq)
• Is this peak data?
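The peak versus broad-region split above could be expressed as a simple width rule. This is only a sketch: the 1000 bp cutoff is an assumption taken from the "Kb" wording on the slide, the function name is hypothetical, and regions between 1 Kb and "multi-Kb" are lumped with broad regions here for simplicity.

```python
def classify_region(start: int, end: int, peak_max_bp: int = 1000) -> str:
    """Label a called region as 'peak' (narrower than peak_max_bp) or 'broad'.

    peak_max_bp is an assumed cutoff, not a value agreed on in the slides.
    """
    width = end - start
    if width <= 0:
        raise ValueError("end must be greater than start")
    return "peak" if width < peak_max_bp else "broad"


# Made-up example regions.
labels = [classify_region(s, e) for s, e in [(0, 300), (2000, 7000)]]
print(labels)
```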
Elements to be called (2)
• ChIP-Seq
• Sequence-specific factor (motif binding)
• Peak (punctate binding), e.g. NRSF, CTCF…
• “Peaky” (localized region binding)
• H3K4me3
• Pol II
• Broad binding (10 Kb+ with substructure)
• H3K27me3 …
• Mixed (point source and broad)
• Call both ways
• Pol II
Example 2
[Figure: Log (Pol II) signal for Pol II ChIP-Seq, with tracks labeled ChIPSeqMini, MACS, QuEST, PeakSeq, SISSRs, and Input]
Comparison of Peak-Callers
• DNase1/FAIRE data:
• Comparison between F-Seq and HotSpot for regions called as DNase1 HS sites
• Score datasets with both programs
• Consistency comparison of labs and software
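One possible way to put a number on the agreement between two region callers is a base-pair Jaccard index: bases called by both, divided by bases called by either. The sketch below assumes regions on a single chromosome given as hypothetical (start, end) pairs; it does not read F-Seq or HotSpot output formats, and the example coordinates are made up.

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int]  # (start, end), half-open, single chromosome assumed


def merge(regions: Sequence[Region]) -> List[List[int]]:
    """Merge overlapping or touching regions into a sorted, disjoint list."""
    merged: List[List[int]] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_bp(regions: Sequence[Region]) -> int:
    """Number of distinct bases covered by the regions."""
    return sum(end - start for start, end in merge(regions))


def jaccard_bp(a: Sequence[Region], b: Sequence[Region]) -> float:
    """Base-pair Jaccard index between two region sets."""
    bp_union = covered_bp(list(a) + list(b))
    if bp_union == 0:
        return 0.0
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    bp_inter = covered_bp(a) + covered_bp(b) - bp_union
    return bp_inter / bp_union


# Hypothetical calls from two programs on the same dataset.
caller_a = [(100, 300), (1000, 1200)]
caller_b = [(200, 400), (5000, 5100)]
score = jaccard_bp(caller_a, caller_b)
print(round(score, 3))
```

The same function could be applied lab against lab with the software held fixed, or software against software within one lab, to separate the two sources of inconsistency.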
Peak-Caller Comparison for ChIP-Seq
• Sequence-specific factors:
• K562 CTCF (Bernstein, Crawford & Stam)
• Localized region binding:
• K562 Pol II (different labs)
• H3K4me3?
• Consistency analysis (upper rank intersection)
• “Gold” standards to compare against
• Motifs?
• qPCR?
• ChIP-chip?
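The slides do not define "upper rank intersection". One plausible reading is: take the top k peaks from each caller by that caller's own score, and ask what fraction of one top-k list overlaps the other, for increasing k. The sketch below follows that reading only as an assumption; the (start, end, score) tuples, function names, and scores are hypothetical, and a single chromosome is assumed.

```python
from typing import Dict, List, Tuple

Peak = Tuple[int, int, float]  # (start, end, score), single chromosome assumed


def top_k(peaks: List[Peak], k: int) -> List[Peak]:
    """The k highest-scoring peaks; scores are only compared within one caller."""
    return sorted(peaks, key=lambda p: p[2], reverse=True)[:k]


def rank_intersection(peaks_a: List[Peak], peaks_b: List[Peak], k: int) -> float:
    """Fraction of caller A's top-k peaks that overlap any of caller B's top-k peaks."""
    top_a, top_b = top_k(peaks_a, k), top_k(peaks_b, k)
    if not top_a:
        return 0.0
    shared = sum(
        1
        for start_a, end_a, _ in top_a
        if any(start_a < end_b and start_b < end_a for start_b, end_b, _ in top_b)
    )
    return shared / len(top_a)


# Made-up peak lists; the two callers use different score scales on purpose.
caller_a = [(100, 200, 9.0), (500, 600, 7.5), (900, 1000, 3.0), (1500, 1600, 1.0)]
caller_b = [(120, 220, 50.0), (910, 990, 40.0), (520, 580, 10.0), (3000, 3100, 5.0)]

curve: Dict[int, float] = {k: rank_intersection(caller_a, caller_b, k) for k in range(1, 5)}
print(curve)
```

Working with ranks rather than raw scores sidesteps the fact that different peak callers report scores on incomparable scales.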
Questions
• Requirement for quality of submitted data?
• Are DNase1 HS sites and FAIRE sites a superset of all regulatory elements?
• Which peak-callers to compare?
• Just peak-callers used by member labs?
• Single peak-caller for each data type?
• How do we decide which is better?