Chromatin Structure & Dynamics

Chromatin Structure & Dynamics Victor Jin Department of Biomedical Informatics The Ohio State University

Chromatin • Walther Flemming first used the term chromatin in 1882. At that time, Flemming assumed that there was some kind of nuclear scaffold within the nucleus. • Chromatin is the complex of DNA and protein that makes up chromosomes. • Chromatin structure: DNA wrapped around nucleosomes, forming a “beads on a string” structure. • In non-dividing cells there are two types of chromatin: euchromatin and heterochromatin.

Chromatin Fibers • Chromatin as seen in the electron microscope: 11 nm fiber (beads) and 30 nm chromatin fiber. (Source: Alberts et al., Molecular Biology of the Cell, 3rd Edition)

Nucleosome • The basic repeating unit of chromatin. • It is made up of five types of histone protein: H2A, H2B, H3, and H4 as core histones and H1 as a linker. • It provides the lowest level of compaction of double-stranded DNA in the cell nucleus. • It is often associated with transcription. • 1974: Roger Kornberg discovers the nucleosome; he won the Nobel Prize in 2006.

Core Histones • Highly conserved proteins that share a structural motif called a histone fold, made up of three α helices connected by two loops, plus an N-terminal tail.

Histone Octamer • The core histones pair into dimers, each containing 3 regions of interaction with dsDNA. • H3 and H4 further assemble into tetramers. • The histone octamer organizes 146 bp of DNA in 1.65 superhelical turns. • 48 nm of DNA is packaged in a disc of 6 x 11 nm (6 nm high, 11 nm across).
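The 48 nm figure on this slide can be checked with a short calculation. The sketch below assumes a rise of about 0.33 nm per base pair, which is not stated on the slide (values of 0.33 to 0.34 nm are commonly quoted for B-form DNA), and the function name is only illustrative.

```python
# Assumed rise per base pair in nm; the slide does not state this value.
BP_RISE_NM = 0.33
CORE_DNA_BP = 146         # DNA wrapped around one histone octamer (from the slide)
DISC_DIAMETER_NM = 11.0   # nucleosome disc diameter (from the slide)


def dna_length_nm(n_bp, rise_nm=BP_RISE_NM):
    """Contour length of n_bp base pairs of extended double-stranded DNA."""
    return n_bp * rise_nm


length_nm = dna_length_nm(CORE_DNA_BP)
# Linear compaction: extended DNA length relative to the disc diameter.
compaction = length_nm / DISC_DIAMETER_NM

print(f"{length_nm:.1f} nm of DNA, roughly {compaction:.1f}-fold linear compaction")
```

Under that assumption, 146 bp comes out close to the 48 nm quoted above, so wrapping around the octamer shortens the DNA by a factor of roughly four to five.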

Nucleosome Assembly In Vitro • 4 core histones + 1 naked DNA template at 4 °C and 2 M salt concentration, from Dyer et al., Methods in Enzymology (2004), 375:23-44.

DNA compaction in a human cell nucleus (figure scale labels: 1 bp (0.3 nm), 11 nm, 30 nm, 10,000 nm)

The N-terminal tails protrude from the core

Histone Modifications (‘Histone Code’) • Acetylation (Ac) • Methylation (Me) • Ubiquitination (Ub) • Sumoylation (Su) • Phosphorylation (P)

Acetylation of Lysines • Acetylation of the lysines at the N terminus of histones removes positive charges, thereby reducing the affinity between histones and DNA. • This makes it easier for RNA polymerase and transcription factors to access the promoter region. • Histone acetylation enhances transcription, while histone deacetylation represses transcription.

Methylation of Arginines and Lysines • Arginine can be methylated to form mono-methyl, symmetrical di-methyl, and asymmetrical di-methylarginine. • Lysine can be methylated to form mono-methyl, di-methyl, and tri-methyllysine.

Methylation of Histone H3-K27 (figure labels: EZH2, SUZ12, EED, PC, DNMT, HDAC, K27)

Functional Consequences of Histone Modification • Establishing the global chromatin environment, such as euchromatin, heterochromatin, and bivalent domains in embryonic stem cells (ESCs). • Orchestration of DNA-based processes such as transcription.

Euchromatin • A lightly packed form of chromatin; • Gene-rich; • At chromosome arms; • Associated with active transcription.

Heterochromatin • A tightly packed form of chromatin; • At centromeres and telomeres; • Contains repetitious sequences; • Gene-poor; • Associated with repressed transcription.

Bivalent Domains • Poised state. The chromatin of embryonic stem cells has “bivalent” domains with marks of both gene activation and repression. In these domains, the tail of histone protein H3 has a methyl group attached to lysine 4 (K4) that is activating and a methyl group at lysine 27 (K27) that is repressive (above). This contradictory state may keep the genes silenced but poised to activate if needed. When the cell differentiates (right), only one tag or the other remains, depending on whether the gene is expressed or not.

DNA Methylation • DNA methyltransferase transfers a methyl group from S-adenosylmethionine to deoxycytosine, producing 5-methylcytosine.

CpG Islands • CpG island: a cluster of CpG residues often found near gene promoters (at least 200 bp and with a GC percentage that is greater than 50% and with an observed/expected CpG ratio that is greater than 0.6). • ~29,000 CpG islands in human genome (~60% of all genes are associated with CpG islands) • Most CpG islands are unmethylated in normal cells.
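The three CpG island criteria above (length, GC percentage, observed/expected CpG ratio) can be sketched as a simple check on a sequence. The slide does not spell out how the observed/expected ratio is computed; the sketch assumes the commonly used definition, (number of CpG × N) / (number of C × number of G), and the function names are illustrative rather than taken from any particular tool.

```python
def cpg_stats(seq):
    """Return (GC percentage, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    # Assumed definition: expected CpG count = (#C * #G) / N.
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_percent, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=50.0, min_ratio=0.6):
    """Apply the slide's thresholds: >= 200 bp, GC% > 50, obs/exp CpG > 0.6."""
    gc_percent, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_percent > min_gc and obs_exp > min_ratio


if __name__ == "__main__":
    print(is_cpg_island("CG" * 100))  # CpG-dense toy sequence of 200 bp
    print(is_cpg_island("AT" * 100))  # no C or G at all
```

Real CpG island finders scan a genome with a sliding window and merge neighboring hits; this sketch only tests one candidate sequence against the thresholds.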

Chromatin modifications

Genome-wide Distribution Pattern of Histone Modification Associated with Transcription • Source: Li et al., Cell (Review, 2007), 128:707-719

ChIP-chip • Step 1: Rapid fixation of cells chemically cross-links DNA-binding proteins to their genomic targets in vivo. • Step 2: Cell lysis releases the DNA-protein complexes, and sonication fragments the DNA. • Step 3: Immunoprecipitation (IP) purifies the protein-DNA fragments, with specificity dictated by antibody choice. • Step 4: Hydrolysis reverses the cross-links within the released DNA fragments. • Step 5: PCR amplification of ChIP DNA. • Step 6: PCR amplification of a known binding-site region for that protein needs to be performed, using either conventional PCR followed by agarose gel electrophoresis or quantitative PCR. • Step 7: Labeling of the pool of protein-DNA fragments. • Step 8: Hybridization of DNA onto microarrays featuring 60-mer oligonucleotide probes.

Major types of array platforms • NimbleGen Arrays: tiling arrays, promoter arrays, whole genome arrays. (http://www.nimblegen.com/products/chip/index.html) • Agilent Arrays: promoter arrays, whole genome arrays. (http://www.chem.agilent.com/Scripts/Phome.asp) • Affymetrix Arrays: tiling arrays, Chr21,22 arrays, whole genome arrays. (http://www.affymetrix.com/index.affx)

Measurement of intensity of probes on the array • The hybridized arrays were scanned on an Axon GenePix 4000B scanner (Axon Instruments Inc.) at wavelengths of 532 nm for the control (Cy3) and 635 nm for each experimental sample (Cy5). • Data points were extracted from the scanned images using the NimbleScan 2.0 program (NimbleGen Systems, Inc.). • Each of the N pairs of probe signals was normalized by converting it into a scaled log ratio using the following formula: • S_i = log2(Cy5(i) / Cy3(i))
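The log-ratio formula above can be sketched in a few lines. The slide does not say how the ratio is scaled, so the median-centering step below is only an assumption standing in for that unspecified step; the function names are illustrative and are not part of NimbleScan.

```python
import math
from statistics import median


def log2_ratios(cy5, cy3):
    """Per-probe S_i = log2(Cy5(i) / Cy3(i)) for paired intensity lists."""
    if len(cy5) != len(cy3):
        raise ValueError("Cy5 and Cy3 must have one value per probe")
    if any(v <= 0 for v in list(cy5) + list(cy3)):
        raise ValueError("intensities must be positive to take a log ratio")
    return [math.log2(e / c) for e, c in zip(cy5, cy3)]


def center(values):
    """Assumed scaling step: subtract the median so unenriched probes sit near 0."""
    m = median(values)
    return [v - m for v in values]


if __name__ == "__main__":
    cy5 = [8.0, 4.0, 2.0]   # experimental sample (635 nm)
    cy3 = [2.0, 2.0, 2.0]   # control (532 nm)
    ratios = log2_ratios(cy5, cy3)
    print(ratios)           # raw log2 ratios
    print(center(ratios))   # centered log2 ratios
```

A positive value means the probe is brighter in the ChIP sample than in the control; each unit corresponds to a two-fold difference.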

Antibody Validation • Confirming on a known target • Different antibodies to same factor • Antibodies to different family members • siRNA-ChIP • Antibodies to two components of a complex • Antibodies to an enzyme/modification pair

Confirming on a known target

Comparison of biological replicates and antibodies to different E2Fs

Loss of E2F6 ChIP signal after siRNA knockdown of E2F6

Reproducibility of promoter arrays using biological replicates • H3me3K27; Ntera2 cells • Top 1000 overlap (figure labels: Promoter 1, Promoter 2)
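One way to read the "top 1000 overlap" measure is as the fraction of promoters shared between the 1000 highest-scoring promoters of each replicate. The slide does not define it, so the sketch below is an assumption about the calculation, with a hypothetical function name and toy data.

```python
def top_n_overlap(scores_a, scores_b, n=1000):
    """Fraction of the top-n promoters (by score) shared by two replicates.

    scores_a and scores_b map promoter IDs to enrichment scores.
    """
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:n])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:n])
    denom = min(n, len(top_a), len(top_b))
    return len(top_a & top_b) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy replicate scores; real input would have thousands of promoters.
    rep1 = {"p1": 5.0, "p2": 4.0, "p3": 1.0, "p4": 0.0}
    rep2 = {"p1": 3.0, "p2": 0.5, "p3": 4.0, "p4": 1.0}
    print(top_n_overlap(rep1, rep2, n=2))
```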

Biological reproducibility on tiling arrays • 500 kb region of chromosome 6 • 500 kb region of chromosome 1

Amount of Sample Per ChIP

Miniaturization • Standard ChIP Protocol (1 x 10^7 cells; WGA2): Promoter Arrays, Genome Tiling Arrays • MicroChIP Protocol (10,000-100,000 cells; WGA4): Promoter Arrays, Genome Tiling Arrays

Reproducibility of MicroChIP Protocol

Peak calling programs • Moving average method by Keles et al. (2004). • A Hidden Markov Model (HMM) approach by Li et al. (2005). • TileMap by Ji and Wong (2005), which uses moving averages or an HMM to account for information from adjacent probes. • PMT by Chung et al. (2007), which integrates a physical model to correct for probe-specific behavior. • ChIPmix by Martin-Magniette et al. (2008), based on a linear regression mixture model.
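The general idea behind the moving-average methods listed above is that neighboring probes on a tiling array should agree, so averaging over adjacent probes suppresses single-probe noise. The sketch below is a generic illustration of that idea and is not the published statistic of Keles et al. or TileMap; the window size, threshold, and function names are assumptions.

```python
def moving_average(values, window=5):
    """Centered moving average; windows are truncated at the array ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def call_peaks(positions, log_ratios, window=5, threshold=1.0):
    """Return (start, end) regions where the smoothed log ratio >= threshold."""
    smoothed = moving_average(log_ratios, window)
    peaks = []
    start = end = None
    for pos, s in zip(positions, smoothed):
        if s >= threshold:
            if start is None:
                start = pos      # open a new region
            end = pos            # extend the current region
        elif start is not None:
            peaks.append((start, end))
            start = None
    if start is not None:        # close a region that runs to the last probe
        peaks.append((start, end))
    return peaks


if __name__ == "__main__":
    positions = list(range(0, 1000, 100))        # probe start coordinates
    ratios = [0, 0, 0, 2, 2, 2, 0, 0, 0, 0]      # toy log2 ratios
    print(call_peaks(positions, ratios, window=3, threshold=1.0))
```

Published methods differ mainly in how they turn the smoothed score into a significance estimate; a fixed threshold is used here only to keep the example short.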

Spike-ins comparison • Mixtures of human genomic DNA and “spike-ins” comprised of nearly 100 human sequences at various concentrations were hybridized to four tiling array platforms by eight independent groups. • Ref: Johnson et al., “Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets”, Genome Research, 18: 393-403, 2008.

Programs in Spike-ins comparison • MAT: Model-based Analysis of Tiling arrays first standardizes each individual Affymetrix tiling array by modeling the effect of each probe’s 25-mer sequence and genome copy number on its signal. • TAS: Affymetrix Tiling Array Software first uses quantile normalization to normalize probes on all the arrays. Then a Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is used across 500 bp sliding windows to identify windows where the spike-in probes have higher signals than the control probes. • Weighted Average (WA): To detect enriched regions, this approach judges the significance of ratios of a contiguous set of probes defining a region by comparing a score based on their weighted average to the distribution of scores of all sets of probes taken in windows of the same predefined size (500 bp in this case). • TAMAL: The algorithm proceeds in two basic steps. First, peaks are found using TAMALPAIS. Then, the enrichment is estimated within the peak by using the maxfour approach described in Krig et al. (2007, J Biol Chem 282:9703). Bieda et al. (2006) describe four levels of stringency, called L1, L2, L3, and L4, with L1 being the most stringent set of detection parameters and L4 the least stringent. • Mpeak: The model-based Mpeak method is used to identify peaks in ChIP-on-chip data. • Wavelet: The algorithm uses a wavelet transform of the signals from the red and green channels of the tiling array. From the approximation coefficients of the wavelet transform, it obtains a clear intensity and length-scale separation between the background signal and the signal coming from regions of biochemical activity.
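The TAS bullet above describes a rank test applied across sliding windows. The sketch below illustrates that idea with a one-sided Mann-Whitney U test in plain Python; it is not the Affymetrix implementation. The normal approximation without tie correction, the minimum of three probes per window, and all function names are assumptions made for the example.

```python
import math


def rank_with_ties(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(treat, control):
    """Return (U, one-sided p) for the alternative 'treat > control'.

    The p-value uses a normal approximation without tie correction (assumption).
    """
    n1, n2 = len(treat), len(control)
    ranks = rank_with_ties(list(treat) + list(control))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, 0.5 * math.erfc(z / math.sqrt(2))


def enriched_windows(positions, treat, control, window_bp=500, alpha=0.05):
    """Test a window centered on each probe; return (center, p) for hits."""
    hits = []
    for center in positions:
        idx = [i for i, p in enumerate(positions) if abs(p - center) <= window_bp / 2]
        if len(idx) < 3:   # assumed minimum number of probes per window
            continue
        _, p = mann_whitney_u([treat[i] for i in idx], [control[i] for i in idx])
        if p < alpha:
            hits.append((center, p))
    return hits


if __name__ == "__main__":
    print(mann_whitney_u([5, 6, 7], [1, 2, 3]))
```

Because the test works on ranks rather than raw intensities, a single very bright probe cannot dominate a window, which is one reason rank tests suit noisy array data.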
