Special Topics in Genomics: ChIP-chip and Tiling Arrays



Presentation Transcript


  1. Special Topics in Genomics: ChIP-chip and Tiling Arrays

  2. Traditional Method for Understanding Transcription Regulation • Gene expression microarray analysis • Clustering genes by expression profile • Search for conserved sequence motifs in cluster promoters • Very challenging for mammalian genomes

  3. ChIP-chip Technology • Chromatin ImmunoPrecipitation + microarray • Detect genome-wide in vivo location of TF and other DNA-binding proteins • Can learn the regulatory mechanism of a transcription factor or DNA-binding protein much better and faster

  4. Chromatin ImmunoPrecipitation (ChIP) By Richard Bourgon at UC Berkeley

  5. TF/DNA Crosslinking in vivo By Richard Bourgon at UC Berkeley

  6. Sonication (~500bp) By Richard Bourgon at UC Berkeley

  7. TF-specific Antibody By Richard Bourgon at UC Berkeley

  8. Immunoprecipitation By Richard Bourgon at UC Berkeley

  9. Reverse Crosslink and DNA Purification By Richard Bourgon at UC Berkeley

  10. Amplification By Richard Bourgon at UC Berkeley

  11. Genome Tiling Arrays By Xiaole Shirley Liu at Harvard

  12. Genome Tiling Arrays • Affymetrix genome tiling microarrays • Tile the non-repeat regions of the genome • Chr21/22 tiling (earlier version): 1 million probe pairs (PM & MM) at 35 bp resolution on 3 arrays • Whole genome: 42 million PM probes on 7 arrays • Example probe pair: PM CGACATTGATTCAAGACTACATACA, MM CGACATTGATTCTAGACTACATACA [Figure: probes tiled along the chromosome] By Xiaole Shirley Liu at Harvard
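The PM/MM pair on slide 12 differs only at the central (13th) base of the 25-mer, where the mismatch probe carries the complementary base. A minimal sketch of that relationship follows; the function name `make_mismatch_probe` is hypothetical and is not part of any Affymetrix tool.

```python
# Sketch: derive a mismatch (MM) probe from a perfect-match (PM) 25-mer by
# complementing its central base, as in the PM/MM pair shown on the slide.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def make_mismatch_probe(pm: str) -> str:
    """Return the MM probe for a PM probe by complementing its middle base."""
    mid = len(pm) // 2  # index 12 for a 25-mer, i.e. the 13th base
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


pm = "CGACATTGATTCAAGACTACATACA"
print(make_mismatch_probe(pm))  # prints CGACATTGATTCTAGACTACATACA
```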

  13. Chromatin ImmunoPrecipitation (ChIP) By Richard Bourgon at UC Berkeley

  14. ChIP-chip Array Hybridization • Map high intensity probes back to the genome • Locate TF binding location [Figure: ChIP-DNA and noise signal at probes along the chromosome] By Xiaole Shirley Liu at Harvard

  15. Identify ChIP-enriched Region • Controls: sonicated genomic Input DNA • Often 3 ChIP, 3 Ctrl replicates are needed [Figure: ChIP vs. Ctrl] By Xiaole Shirley Liu at Harvard

  16. Mann-Whitney U-test for ChIP-region Detection • Affy TAS, Cawley et al (Cell 2004): • For each probe: rank probes (either PM-MM or PM) within a [-500bp, +500bp] window • Check whether the sum of ChIP ranks is much smaller By Xiaole Shirley Liu at Harvard
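The windowed rank test on slide 16 can be sketched in plain Python. This is an illustration only, not the TAS implementation. It assumes rank 1 is given to the highest intensity (so an enriched ChIP sample has a small rank sum), uses average ranks for ties, and uses a normal approximation without tie correction. All function names are hypothetical.

```python
import math


def window_values(positions, values, center, half_width=500):
    """Values of probes whose position lies within +/- half_width of center."""
    return [v for p, v in zip(positions, values) if abs(p - center) <= half_width]


def chip_rank_sum(chip, ctrl):
    """Pool ChIP and control values, rank them (1 = highest intensity,
    ties share the average rank) and return the sum of the ChIP ranks."""
    pooled = [(v, 0) for v in chip] + [(v, 1) for v in ctrl]
    pooled.sort(key=lambda x: -x[0])
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    return rank_sum


def rank_sum_z(chip, ctrl):
    """Normal approximation to the rank-sum test (no tie correction).
    A strongly negative z means the ChIP ranks are smaller than expected."""
    n1, n2 = len(chip), len(ctrl)
    expected = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (chip_rank_sum(chip, ctrl) - expected) / sd


# Toy data: probes every 35 bp; the ChIP signal is elevated around position 350.
positions = list(range(0, 700, 35))
chip = [1.0 + (3.0 if 280 <= p <= 420 else 0.0) for p in positions]
ctrl = [1.1 for _ in positions]
z = rank_sum_z(window_values(positions, chip, 350, 70),
               window_values(positions, ctrl, 350, 70))
print(z)
```

In practice a library routine such as a Mann-Whitney test from a statistics package would replace the hand-written ranking; it is spelled out here only to show what "sum of ChIP ranks" means.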

  17. TileMap (Ji and Wong, Bioinformatics 2005) STEP 1: Compute a test statistic for each probe to summarize probe level information STEP 2: Combine probe level test statistics of neighboring probes to help infer binding regions

  18. Probe level test statistic: empirical Bayes approach [Diagram: for probes 1, 2, 3, …, I, the sample variances (df), their mean and their sum of squares give a shrinkage factor; the variance shrinkage estimator yields variance estimates, which enter a modified t-statistic used as the probe level test statistic]
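The idea on slide 18 (pull each probe's noisy variance toward the average variance across probes, then use it in a t-like statistic) can be sketched as follows. The slide's actual formula for the shrinkage factor did not survive, so here the factor is a user-supplied number between 0 and 1. That choice, and all function names, are assumptions for illustration and not TileMap's estimator.

```python
from statistics import mean, variance


def pooled_variance(ip, ct):
    """Ordinary pooled sample variance of one probe across IP and control arrays."""
    n1, n2 = len(ip), len(ct)
    return ((n1 - 1) * variance(ip) + (n2 - 1) * variance(ct)) / (n1 + n2 - 2)


def shrunken_variances(sample_vars, shrinkage):
    """Pull every probe's variance toward the mean variance over all probes.
    shrinkage = 0 keeps the raw variances, shrinkage = 1 uses the mean for all."""
    grand = mean(sample_vars)
    return [(1.0 - shrinkage) * v + shrinkage * grand for v in sample_vars]


def modified_t(ip, ct, var_hat):
    """Two-sample t-like statistic computed with a supplied variance estimate."""
    n1, n2 = len(ip), len(ct)
    return (mean(ip) - mean(ct)) / (var_hat * (1.0 / n1 + 1.0 / n2)) ** 0.5


# Toy data: (IP replicates, control replicates) for three probes.
probes = [([8.1, 8.5], [7.0, 7.4]),
          ([9.0, 9.0], [7.0, 7.0]),   # zero sample variance: canonical t is undefined
          ([7.2, 6.8], [7.1, 6.9])]
raw = [pooled_variance(ip, ct) for ip, ct in probes]
shrunk = shrunken_variances(raw, shrinkage=0.5)  # 0.5 is an arbitrary example value
t_stats = [modified_t(ip, ct, v) for (ip, ct), v in zip(probes, shrunk)]
print(t_stats)
```

The second probe shows why shrinkage helps with few replicates: its raw variance is zero, so the unshrunken statistic cannot be computed, while the shrunken one stays finite.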

  19. Combining neighboring probes TileMap (MA) 1. Compute the probe level test statistic t for each probe; 2. Compute a moving average statistic to measure enrichment; 3. Estimate FDR. TileMap (HMM) 1. Compute the probe level test statistic t for each probe; 2. Estimate the distribution of t under H0 and H1; 3. Model t by a Hidden Markov Model, and decode the HMM.
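Both ways of combining neighboring probes on slide 19 can be sketched briefly. The moving average is direct. For the HMM, the sketch assumes two states (0 = background, H0; 1 = bound, H1) with Gaussian distributions for t and fixed, made-up parameters. TileMap estimates the H0 and H1 distributions from the data, and the FDR estimation step is not shown. All names and default values are hypothetical.

```python
import math


def moving_average(t, half_window):
    """Average each probe's statistic with up to half_window neighbors per side."""
    out = []
    for i in range(len(t)):
        lo = max(0, i - half_window)
        hi = min(len(t), i + half_window + 1)
        out.append(sum(t[lo:hi]) / (hi - lo))
    return out


def log_normal_pdf(x, mu, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)


def viterbi_two_state(t, mu=(0.0, 3.0), sd=(1.0, 1.0), stay=0.9, prior_bound=0.1):
    """Most likely state path for a non-empty list of probe statistics t."""
    log_trans = [[math.log(stay), math.log(1 - stay)],
                 [math.log(1 - stay), math.log(stay)]]
    score = [math.log(1 - prior_bound) + log_normal_pdf(t[0], mu[0], sd[0]),
             math.log(prior_bound) + log_normal_pdf(t[0], mu[1], sd[1])]
    back = []  # back[i][s] = best previous state when probe i+1 is in state s
    for x in t[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_normal_pdf(x, mu[s], sd[s]))
        score = new_score
        back.append(ptr)
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # trace the best path backwards
        state = ptr[state]
        path.append(state)
    return path[::-1]


t = [0.1, -0.3, 0.2, 3.5, 4.0, 3.2, 0.0, -0.2]
print(moving_average(t, 1))
print(viterbi_two_state(t))  # prints [0, 0, 0, 1, 1, 1, 0, 0]
```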

  20. Shrinking variance increases statistical power [Figure legend: Moving Average; t-statistic, variance shrinking; t-statistic, canonical; Mean(X1)-Mean(X2)]

  21. Transgenics [Figure: Peak 2 (180bp) transgenics; neural tube expression]

  22. Comparisons between TileMap and previous methods • cMyc ChIP-chip Data: 6 IP + 6 CT1 + 6 CT2 • Gold Standard: using GTRANS and Keles’ method to analyze all 18 arrays • Test data: 4 arrays, 2 IP vs 2 CT1 (s2r2) • TileMap-HMM (Ji & Wong, 2005) • GTRANS or TAS (Kampa et al., 2004): 1. Set a window; 2. Perform a Wilcoxon signed rank test for each window. • Keles et al. (2004): 1. Compute a t-statistic t for each probe (no shrinking, two sample only); 2. Rank probes by a moving average.

  23. Shrinking variance saves money • Using the non-shrinking method (Keles’ method) to analyze all probes • Using the shrinking method to analyze half of the probes, i.e., reducing the information by half

  24. MAT (Johnson W.E. et al., PNAS 2006) • Model-based Analysis of Tiling arrays for ChIP-chip • Goal: • Find ChIP-regions without replicates • Find ChIP-regions without controls • Find ChIP-regions without MM probes • Can analyze data array by array By Xiaole Shirley Liu at Harvard

  25. MAT • Estimate probe behavior by checking other probes with similar sequence on the same array • Probe sequence plays a big role in signal value • Most of the probes in ChIP-chip measure non-specific hybridization By Xiaole Shirley Liu at Harvard

  26. Probe Behavior Model [Equation terms: baseline on number of Ts; A, C, G, T count squared; A, C, G at each position of the 25mer; 25mer copy number along the genome] By Xiaole Shirley Liu at Harvard
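The four kinds of terms named on slide 26 can be read as columns of a regression design matrix for the probe intensity. The sketch below builds one such row for a 25-mer. The layout (81 columns, log of the copy number) follows the slide's labels but is an assumption about the exact parameterization, and the function name `probe_features` is hypothetical. Fitting the coefficients (for example by least squares on each array) is not shown.

```python
import math


def probe_features(seq, copy_number):
    """Feature row for a 25-mer probe:
    [T count] + [A/C/G indicator at each of 25 positions]
    + [squared counts of A, C, G, T] + [log copy number]."""
    assert len(seq) == 25
    feats = [float(seq.count("T"))]                          # baseline on number of Ts
    for base in seq:                                         # position-specific A, C, G
        feats.extend(1.0 if base == b else 0.0 for b in "ACG")
    feats.extend(float(seq.count(b)) ** 2 for b in "ACGT")   # squared base counts
    feats.append(math.log(copy_number))                      # genomic copy number
    return feats


row = probe_features("CGACATTGATTCAAGACTACATACA", copy_number=1)
print(len(row))  # prints 81
```

T has no position-specific indicator because, given the other three, it would be redundant with the baseline term.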

  27. Probe Standardization • Fit the probe model array by array • Divide array probes into bins (3k probes/bin) • Background-subtraction and standardization (normalization) on a single array [Formula labels: observed probe intensity; model predicted probe intensity; observed probe variance within each bin] By Xiaole Shirley Liu at Harvard
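A minimal sketch of the standardization on slide 27: each probe's t-value is its observed intensity minus the model-predicted intensity, divided by the standard deviation of observed intensities in its bin. Forming the bins by sorting probes on predicted intensity is an assumption here (the slide only gives the bin size), and the function name is hypothetical.

```python
from statistics import pstdev


def standardize_probes(observed, predicted, bin_size=3000):
    """t_i = (observed_i - predicted_i) / sd of observed values in probe i's bin.
    Bins hold bin_size probes each, taken in order of predicted intensity."""
    order = sorted(range(len(observed)), key=lambda i: predicted[i])
    t = [0.0] * len(observed)
    for start in range(0, len(order), bin_size):
        bin_idx = order[start:start + bin_size]
        sd = pstdev([observed[i] for i in bin_idx]) or 1.0  # guard for toy data
        for i in bin_idx:
            t[i] = (observed[i] - predicted[i]) / sd
    return t


# Toy example with a tiny bin size so the arithmetic is easy to follow.
observed = [1.0, 3.0, 10.0, 14.0]
predicted = [2.0, 2.0, 12.0, 12.0]
print(standardize_probes(observed, predicted, bin_size=2))
```

Because the result is expressed in units of the bin's own spread, t-values from different arrays are on a comparable scale without a separate between-array normalization step.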

  28. Eliminate Normalization • Probe log(PM) values before and after standardization • If the arrays are normalized before model fitting: the same ChIP-regions are predicted, although with less confidence By Xiaole Shirley Liu at Harvard

  29. ChIP-region Detection • Window-based MATscore • ChIP without Ctrl • TM: trimmed mean • Multiple ChIP with multiple Ctrl • More probes, higher t values in ChIP, less variance (fluctuation) → more confident By Xiaole Shirley Liu at Harvard
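The slide's MATscore formula was an image and is not in the transcript. The sketch below covers the ChIP-without-control case under stated assumptions: the score of a window is the square root of the number of probes times the trimmed mean of their t-values, with 10% trimmed from each end and a 300 bp half-width. These specifics and the function names are assumptions for illustration, chosen so that more probes and higher t-values give a higher score, as the slide describes.

```python
import math


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def mat_score(positions, t_values, center, half_width=300, trim=0.1):
    """sqrt(n) * trimmed mean of the t-values of the n probes in the window."""
    window = [t for p, t in zip(positions, t_values) if abs(p - center) <= half_width]
    if not window:
        return 0.0
    return math.sqrt(len(window)) * trimmed_mean(window, trim)


# One outlier probe (50.0) barely moves the trimmed mean.
print(trimmed_mean([0.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 50.0]))  # prints 2.0
print(mat_score([0, 100, 200, 300, 1000], [2.0, 2.0, 2.0, 2.0, 9.0],
                center=150, half_width=150))  # prints 4.0
```

The trimmed mean is what makes the window score robust to a single misbehaving probe.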

  30. [Figure: three panels. Raw probe values at two spike-in regions with concentration 2X: ChIP_1 Log(PM), Input_1 Log(PM). Sequence-based probe behavior standardization: ChIP_1 t-value, Input_1 t-value. Window-based neighboring probe combination for ChIP-region detection: ChIP_1 MATscore, ChIP_1/Input_1 MATscore, 3 Reps ChIP/Input MATscore] By Xiaole Shirley Liu at Harvard

  31. Statistical Significance of Hits • P-value and FDR cutoff: • P-value from MATscore distribution • Estimate negative peaks under the same P value cutoff • Regional FDR = #negative_peaks / #positive_peaks By Xiaole Shirley Liu at Harvard
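The regional FDR on slide 31 is a ratio of peak counts. The sketch below applies one score cutoff to both tails, calling "negative peaks" where the negated score passes the cutoff. The slide derives the cutoff from a P-value on the MATscore distribution; using a raw score cutoff and treating the null as symmetric are simplifying assumptions, and the function names are hypothetical.

```python
def call_peaks(scores, cutoff):
    """Group consecutive windows with score >= cutoff into peaks.
    Returns a list of (first_index, last_index) pairs."""
    peaks, start = [], None
    for i, s in enumerate(scores):
        if s >= cutoff and start is None:
            start = i
        elif s < cutoff and start is not None:
            peaks.append((start, i - 1))
            start = None
    if start is not None:
        peaks.append((start, len(scores) - 1))
    return peaks


def regional_fdr(scores, cutoff):
    """#negative_peaks / #positive_peaks at the same cutoff (None if no positives)."""
    positive = call_peaks(scores, cutoff)
    negative = call_peaks([-s for s in scores], cutoff)
    if not positive:
        return None  # the ratio is undefined without positive peaks
    return len(negative) / len(positive)


scores = [0.1, 5.0, 6.0, 0.2, -5.5, 0.0, 7.0, 0.3, 8.0, 8.5, 0.1, 6.5]
print(call_peaks(scores, 4.0))    # prints [(1, 2), (6, 6), (8, 9), (11, 11)]
print(regional_fdr(scores, 4.0))  # prints 0.25
```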

  32. MAT summary • Open source Python: http://chip.dfci.harvard.edu/~wli/MAT/ • Runs faster than the array scanner • Can work with single ChIP, multiple ChIP, and multiple ChIP with controls, with increasing accuracy • Use single ChIP on promoter arrays to test antibody and protocol before going whole genome • Can identify individual failed samples By Xiaole Shirley Liu at Harvard

  33. Benchmark for ChIP-chip Target Detection (Johnson D.S. et al., Genome Research 2008) • ENCODE Spike-in experiment: both amplified and un-amplified • Blind test: samples hybridized to different tiling arrays, predictions made before the key was released • ChIP: 96 ENCODE clones, 2, 4, 8, ..., 256X enrichment + total chromatin DNA • Input: total genomic DNA

  34. Comparison of platforms

  35. Comparison of algorithms Combined Johnson D.S. et al. Genome Research 2008 with Ji H. et al. Nature Biotechnology 2008

  36. MBR: Microarray Blob Remover By Xiaole Shirley Liu at Harvard

  37. xMAN: eXtreme MApping of oligoNucleotides • http://chip.dfci.harvard.edu/~wli/xMAN • xMAN maps ~42 M Affymetrix tiling probes to the newest human genome assembly in less than 6 CPU hours • BLAST needs 20 CPU years; BLAT needs 55 CPU days • Probe TCCCAGCACTTTGGGAGGCTGAGGC maps 50,660 times in the genome • Can map long oligos and paired tag high throughput sequencing fragments • Stores the copy number information of every probe • xMAN filters tiling array probes to ensure one unique probe measurement per 1 kb, which improves peak detection By Xiaole Shirley Liu at Harvard
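Two ideas from slide 37, probe copy number and keeping one measurement per 1 kb, can be illustrated on toy data. This is not xMAN's algorithm. It counts exact matches with a hash table over a small string, and it interprets the 1 kb filter as "drop a probe if the same sequence was already kept less than 1 kb upstream", which is an assumption. All names are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq[::-1].translate(_COMPLEMENT)


def kmer_counts(genome, k):
    """Count every k-mer on the forward strand of `genome`."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def copy_number(probe, counts):
    """Exact-match occurrences of the probe on either strand."""
    rc = reverse_complement(probe)
    n = counts[probe]
    if rc != probe:  # avoid double counting palindromic probes
        n += counts[rc]
    return n


def filter_repeated_probes(mapped, min_gap=1000):
    """mapped: (position, probe_seq) pairs sorted by position. Keep a probe only
    if the same sequence was not already kept within the previous min_gap bp."""
    last_kept, kept = {}, []
    for pos, seq in mapped:
        if seq in last_kept and pos - last_kept[seq] < min_gap:
            continue
        last_kept[seq] = pos
        kept.append((pos, seq))
    return kept


counts = kmer_counts("ACGTACGTAA", 4)
print(copy_number("ACGT", counts))  # prints 2
print(filter_repeated_probes([(0, "AAA"), (500, "AAA"), (600, "CCC"), (1200, "AAA")]))
```

A whole-genome index would not fit this in-memory approach comfortably; the sketch only shows what the stored copy number and the 1 kb filter mean.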

  38. CEAS: Cis-regulatory Element Annotation System • Data Analysis Button for Biologists http://ceas.cbi.pku.edu.cn By Xiaole Shirley Liu at Harvard

  39. CisGenome (Ji H. et al., Nature Biotechnology 2008) • Graphic User Interface • CisGenome Browser • Core Data Analysis Programs

  40. Other applications of tiling arrays • Transcriptome mapping • MeDIP-chip • DNase-chip • Nucleosome localization • Array CGH and copy number variation
