ChIP-chip Data
DNA-binding proteins
• Constitutive proteins (mostly histones)
  • Organize DNA
  • Regulate access to DNA
  • Have many modifications
    • Acetylation, methylation, …
• Sporadic proteins (transcription factors)
  • Mediate docking of transcription apparatus
  • Modify histones
  • Methylate DNA
Histones
Histones are an ancient family of proteins that serve as the scaffold for DNA.
Four types of core histones (H2A, H2B, H3, H4) assemble in pairs to form the octamer at the core of a nucleosome.
DNA is wrapped nearly twice (about 1.7 turns, roughly 147 bp) around each nucleosome core.
Histones and Modifications
Histone tails can be modified.
DNA contacts histones on their tails.
Histones can stay loose or assemble tightly; tight assembly compacts the DNA.
Transcription Factors
• General: help to set up transcription of many genes
• Specific: draw in general factors or RNA Pol II to specific genes
Figure: TATA Binding Protein
DNA Methylation
Figure: Adding a methyl group to cytosine
Cytosine methylation is passed on to daughter cells.
Tiling Array
• One probe every n base pairs over some length of chromosome
• Interrupted by repeat regions
• Promoter array: each (known) promoter tiled
Figure: An Affymetrix tiling design
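The tiling layout can be illustrated with a minimal sketch. The function name `tile_probes`, its parameters, and the repeat coordinates are hypothetical, and probe length is ignored: a probe is dropped only if its start position falls inside a repeat interval. Real array designs apply more elaborate filters.

```python
def tile_probes(start, end, spacing, repeats=()):
    """Return probe start positions every `spacing` bp in [start, end).

    `repeats` is a list of (repeat_start, repeat_end) half-open intervals;
    probes starting inside a repeat are skipped (simplifying assumption).
    """
    positions = []
    for pos in range(start, end, spacing):
        in_repeat = any(r_start <= pos < r_end for r_start, r_end in repeats)
        if not in_repeat:
            positions.append(pos)
    return positions


# A 1 kb region tiled every 100 bp, with one made-up repeat at 250-520.
probes = tile_probes(0, 1000, 100, repeats=[(250, 520)])
print(probes)  # [0, 100, 200, 600, 700, 800, 900]
```

The gap between 200 and 600 is the kind of interruption that later smoothing steps have to cope with, since neighboring probes are no longer evenly spaced.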
What the data look like
Figure: Histone acetylation in 15 samples over one promoter (raw data)
Methods and Issues
• Normalization
  • Different enrichment ratios
  • Different probe thermodynamics
  • Dye and probe bias
• Estimation
  • Categorical or continuous?
  • Individual values are noisy
  • For TF binding: where is the peak?
Normalization
• Basic idea: compensate for technical variables
• Technical differences should affect different probes differently
• Try to estimate what part of the signal can be attributed to technical factors
• Easiest variable to access: probe sequence
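One simple way to attribute part of the signal to probe sequence is sketched below: probes are grouped by GC count, and each group's median log intensity is subtracted as a baseline. This GC binning is a deliberately crude stand-in chosen for illustration, not the model used by any particular package; the function names and the example data are made up.

```python
from collections import defaultdict
from statistics import median


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(base in "GC" for base in seq.upper())


def normalize_by_gc(probes):
    """probes: list of (sequence, log_intensity) pairs.

    Returns log intensities with the median of the probe's GC-count
    group subtracted, so sequence-driven baseline differences cancel.
    """
    by_gc = defaultdict(list)
    for seq, value in probes:
        by_gc[gc_count(seq)].append(value)
    baseline = {gc: median(values) for gc, values in by_gc.items()}
    return [value - baseline[gc_count(seq)] for seq, value in probes]


# Made-up probes: GC-rich probes read higher for purely technical reasons.
example = [("ATATAT", 1.0), ("ATTTAA", 1.2),
           ("GCGCGC", 3.0), ("GGCCGC", 3.4), ("CCGGCG", 5.0)]
normalized = normalize_by_gc(example)
print([round(v, 2) for v in normalized])
```

After normalization only the last probe stands well above its own GC group, which is the part of the signal not explained by sequence.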
MAT
• One-color Affymetrix array
• Needs a separate array for comparison
• Normalizes probe thermodynamics and enrichment ratio
• Estimation by (robust) moving average
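A robust moving average can be sketched as a trimmed mean over all probes within a fixed genomic distance of each probe. The window half-width, the trim fraction, and the scaling by the square root of the probe count are assumptions made for illustration, not a statement of MAT's exact settings. The quadratic loop is kept for clarity rather than speed.

```python
import math


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction (trim < 0.5)."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


def window_scores(positions, values, half_width=300, trim=0.1):
    """Score each probe by the trimmed mean of probes within `half_width` bp.

    Multiplying by sqrt(n) makes windows with more probes count for more
    (assumed scaling). Windows are defined in bp, so gaps left by repeat
    regions simply yield fewer probes per window.
    """
    scores = []
    for centre in positions:
        window = [v for p, v in zip(positions, values)
                  if abs(p - centre) <= half_width]
        scores.append(math.sqrt(len(window)) * trimmed_mean(window, trim))
    return scores


# A single outlier probe is discarded by the trimming step.
print(trimmed_mean([1, 2, 3, 4, 100], trim=0.2))  # 3.0
```

A plain mean of the same five values would be 22, so one aberrant probe would dominate the window.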
Estimation
• Try to build an intelligent moving average
• Not all neighbors will be similar
  • A typical TF binds a site of about 8 bp
  • Pol II may spread wider
• Typical fragment is 100-200 bp
  • Cannot resolve features smaller than 200 bp
Figure: Pol II binding on a 100 bp grid
TileMap
• Ignores normalization
• ‘Shrinkage’ estimator of variance
  • Improves individual scores
• Smooths noise by moving average
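The idea behind a shrinkage estimator of variance can be sketched as follows: each probe's own variance estimate, which is unreliable with few replicates, is pulled toward the average variance across all probes before a t-like score is computed. The fixed weight `prior_df` is a hypothetical parameter chosen for illustration; an empirical Bayes method such as TileMap estimates the amount of shrinkage from the data instead.

```python
from statistics import mean, variance


def shrunken_t(treatment, control, prior_df=4.0):
    """Per-probe t-like scores with variances shrunk toward the overall mean.

    treatment, control: lists with one list of replicate values per probe
    (at least two replicates each). `prior_df` is an assumed fixed weight.
    """
    pooled, dfs = [], []
    for t, c in zip(treatment, control):
        df = len(t) + len(c) - 2
        s2 = ((len(t) - 1) * variance(t) + (len(c) - 1) * variance(c)) / df
        pooled.append(s2)
        dfs.append(df)
    prior_var = mean(pooled)  # common variance that every probe is pulled toward

    scores = []
    for t, c, s2, df in zip(treatment, control, pooled, dfs):
        shrunk = (prior_df * prior_var + df * s2) / (prior_df + df)
        se = (shrunk * (1 / len(t) + 1 / len(c))) ** 0.5
        scores.append((mean(t) - mean(c)) / se)
    return scores


# Made-up data: probe 0 has a tiny variance by chance, probe 1 a large one.
treatment = [[1.0, 1.2], [1.0, 3.0]]
control = [[0.0, 0.2], [0.0, 2.0]]
stats = shrunken_t(treatment, control)
print([round(s, 2) for s in stats])
```

Without shrinkage, probe 0 would score about 7 purely because its two replicates happened to agree; with shrinkage its score drops close to that of probe 1. The resulting scores can then be smoothed with a moving average along the chromosome.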