Přírodovědecká fakulta UK (Faculty of Science, Charles University). EPIGENETICS MB150P85. Petr Svoboda. mail: svoboda1@natur.cuni.cz tel: 24106 3147. DNA METHYLATION I (DNA METHYLATION AND ITS DETECTION). EPIGENETICS. Epigenetics deals with the transmission of traits (information) that are not stored in the DNA sequence.
EPIGENETICS
Epigenetics deals with the transmission of traits (information) that are not stored in the DNA sequence.
• This information is carried mainly in:
• chromatin structure and modifications
• chemical modifications of DNA
• RNA molecules
Bacteria - wide range of functions
Eukaryota - numerous effects
Protista, Plantae, (Mammalia?)
Bacteria - protection against restriction enzymes (RE)
See Ratel 2006
Bacteria - protection against RE
Specific nucleotides are modified
An example of a modified restriction enzyme recognition site. These sites are usually modified in organisms with the corresponding restriction activity.
** The fraction of CG dinucleotides or CNG trinucleotides varies with species and, to a lesser extent, with tissue.
DNA methylation
Methylation of cytosine gives 5-methylcytosine.
http://www.med.ufl.edu/biochem/keithr/research.html
Robertson 2002 Mammalian DNA methyltransferases Maintenance DNA methylation Ignore HW: Why? De novo DNA methylation
Methods to study DNA methylation
Global methylation analysis
- HPLC (complete hydrolysis, AP)
- TLC (complete hydrolysis, AP, 32P labeled)
Sequence-specific methylation analysis
- Methylation-sensitive restriction enzymes - a number of methods
- Bisulfite sequencing
- MeDIP
For more details, see Oakeley, 1999
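Bisulfite sequencing
Sodium bisulfite deaminates unmethylated cytosine to uracil, which is read as T after PCR; 5-methylcytosine is left unchanged and is still read as C.
HW: Why isn’t 5mC converted?
dsDNA is resistant to conversion, so the DNA must be fully denatured before treatment!
From Oakeley, 1999 http://www.methods.info/Methods/DNA_methylation/Bisulphite_sequencing.html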
EpiTect Bisulfite Sequencing Kit (Qiagen)
Classical protocol (bisulfite protocol.doc)
- starting material: cells or DNA, up to 200 ng
- extremely sensitive (100 DNA molecules)
- based on agarose embedding
Drawbacks:
- time consuming (approx. 10-11 hours)
- low throughput (typically up to 8 samples/run)
- low yield (200 ng max/reaction)
EpiTect
- starting material: DNA, 1 ng - 2 µg
- sensitivity OK for most applications
- faster (approx. 6 hours), throughput OK
Whatever you choose … the critical component is the primers!!
http://www1.qiagen.com/Products/Epigenetics/Epitect/EpitectBisulfiteKit.aspx?ShowInfo=1
Model case: L1 promoter methylation analysis
- active, autonomous, non-LTR class
- retrotransposition in cis
- the most abundant retrotransposon in mammalian genomes
- 400 000-500 000 insertions in the human genome
- ~100 full-length intact elements
- typically silenced in somatic cells (hypermethylation)
[Figure: L1 element, ~6 kb: 5’ UTR, ORF1, ORF2 (EN, RT), AAAn]
BISULFITE SEQUENCING STEP BY STEP 1) Find your sequence – NCBI Genbank and Pubmed >L1 5’ UTR GGGGGGAGGAGCCAAGATGGCCGAATAGGAACAGCTCCGGTCTACAGCTCCCAGCGTGAGCGACGCAGAAGACGGTGATTTCTGCATTTCCATCTGAGGTACCGGGTTCATCTCACTAGGGAGTGCCAGACAGTGGGCGCAGGCCAGTGTGTGTGCGCACCGTGCGCGAGCCGAAGCAGGGCGAGGCATTGCCTCACCTGGGAAGCGCAAGGGGTCAGGGAGTTCCCTTTCCGAGTCAAAGAAAGGGGTGACGGACGCACCTGGAAAATCGGGTCACTCCCACCCGAATATTGCGCTTTTCAGACCGGCTTAAGAAACGGCGCACCACGAGACTATATCCCACACCTGGCTCGGAGGGTCCTACGCCCACGGAATCTCGCTGATTGCTAGCACAGCAGTCTGAGATCAAACTGCAAGGCGGCAACGAGGCTGGGGGAGGGGCGCCCGCCATTGCCCAGGCTTGCTTAGGTAAACAAAGCAGCAGGGAAGCTCGAACTGGGTGGAGCCCACCACAGCTCAAGGAGGCCTGCCTGCCTCTGTAGGCTCCACCTCTGGGGGCAGGGCACAGACAAACAAAAAGACAGCAGTAACCTCTGCAGACTTAAGTGTCCCTGTCTGACAGCTTTGAAGAGAGCAGTGGTTCTCCCAGCACGCAGCTGGAGATCTGAGAACGGGCAGACTGCCTCCTCAAGTGGGTCCCTGACCCCTGACCCCCGAGCAGCCTAACTGGGAGGCACCCCCCAGCAGGGGCACACTGACACCTCACACGGCAGGGTATTCCAACAGACCTGCAGCTGAGGGTCCTGTCTGTTAGAAGGAAAACTAACAACCAGAAAGGACATCTACACCGAAAACCCATCTGTACATCACCATCATCAAAGACCAAAAGTAGATAAAACCACAAAG
BISULFITE SEQUENCING STEP BY STEP 1) Find your sequence HW: look at CpG density of: EGFP, Bluescript, Actin promoter and transcribed region, IAP LTR, L1 ORF2
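For the CpG density homework, a minimal Python sketch may help. The function name `cpg_density` and the "CpGs per 100 bp" measure are assumptions made for illustration; the slides do not prescribe a particular measure. The test fragment is the first 60 nt of the L1 5’ UTR from step 1.

```python
def cpg_density(seq: str) -> dict:
    """Count CpG dinucleotides in a sequence and report them per 100 bp."""
    s = "".join(seq.split()).upper()   # drop whitespace and line breaks
    n_cpg = s.count("CG")              # "CG" cannot overlap itself, so count() is exact
    per_100 = 100.0 * n_cpg / len(s) if s else 0.0
    return {"length": len(s), "cpg": n_cpg, "cpg_per_100bp": per_100}


# First 60 nt of the L1 5' UTR from step 1
fragment = "GGGGGGAGGAGCCAAGATGGCCGAATAGGAACAGCTCCGGTCTACAGCTCCCAGCGTGAG"
print(cpg_density(fragment))
```

Paste any of the homework sequences (EGFP, Bluescript, ...) in place of `fragment` to compare their densities.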
BISULFITE SEQUENCING STEP BY STEP 2) Word -> REPLACE CG WITH XY >L1 5’ UTR GGGGGGAGGAGCCAAGATGGCXYAATAGGAACAGCTCXYGTCTACAGCTCCCAGXYTGAGXYAXYCAGAAGAXYGTGATTTCTGCATTTCCATCTGAGGTACXYGGTTCATCTCACTAGGGAGTGCCAGACAGTGGGXYCAGGCCAGTGTGTGTGXYCACXYTGXYXYAGCXYAAGCAGGGXYAGGCATTGCCTCACCTGGGAAGXYCAAGGGGTCAGGGAGTTCCCTTTCXYAGTCAAAGAAAGGGGTGAXYGAXYCACCTGGAAAATXYGGTCACTCCCACCXYAATATTGXYCTTTTCAGACXYGCTTAAGAAAXYGXYCACCAXYAGACTATATCCCACACCTGGCTXYGAGGGTCCTAXYCCCAXYGAATCTXYCTGATTGCTAGCACAGCAGTCTGAGATCAAACTGCAAGGXYGCAAXYAGGCTGGGGGAGGGGXYCCXYCCATTGCCCAGGCTTGCTTAGGTAAACAAAGCAGCAGGGAAGCTXYAACTGGGTGGAGCCCACCACAGCTCAAGGAGGCCTGCCTGCCTCTGTAGGCTCCACCTCTGGGGGCAGGGCACAGACAAACAAAAAGACAGCAGTAACCTCTGCAGACTTAAGTGTCCCTGTCTGACAGCTTTGAAGAGAGCAGTGGTTCTCCCAGCAXYCAGCTGGAGATCTGAGAAXYGGCAGACTGCCTCCTCAAGTGGGTCCCTGACCCCTGACCCCXYAGCAGCCTAACTGGGAGGCACCCCCCAGCAGGGGCACACTGACACCTCACAXYGCAGGGTATTCCAACAGACCTGCAGCTGAGGGTCCTGTCTGTTAGAAGGAAAACTAACAACCAGAAAGGACATCTACACXYAAAACCCATCTGTACATCACCATCATCAAAGACCAAAAGTAGATAAAACCACAAAG
BISULFITE SEQUENCING STEP BY STEP 3) Word -> REPLACE C WITH T >L1 5’ UTR GGGGGGAGGAGTTAAGATGGTXYAATAGGAATAGTTTXYGTTTATAGTTTTTAGXYTGAGXYAXYTAGAAGAXYGTGATTTTTGTATTTTTATTTGAGGTATXYGGTTTATTTTATTAGGGAGTGTTAGATAGTGGGXYTAGGTTAGTGTGTGTGXYTATXYTGXYXYAGTXYAAGTAGGGXYAGGTATTGTTTTATTTGGGAAGXYTAAGGGGTTAGGGAGTTTTTTTTTXYAGTTAAAGAAAGGGGTGAXYGAXYTATTTGGAAAATXYGGTTATTTTTATTXYAATATTGXYTTTTTTAGATXYGTTTAAGAAAXYGXYTATTAXYAGATTATATTTTATATTTGGTTXYGAGGGTTTTAXYTTTAXYGAATTTXYTTGATTGTTAGTATAGTAGTTTGAGATTAAATTGTAAGGXYGTAAXYAGGTTGGGGGAGGGGXYTTXYTTATTGTTTAGGTTTGTTTAGGTAAATAAAGTAGTAGGGAAGTTXYAATTGGGTGGAGTTTATTATAGTTTAAGGAGGTTTGTTTGTTTTTGTAGGTTTTATTTTTGGGGGTAGGGTATAGATAAATAAAAAGATAGTAGTAATTTTTGTAGATTTAAGTGTTTTTGTTTGATAGTTTTGAAGAGAGTAGTGGTTTTTTTAGTAXYTAGTTGGAGATTTGAGAAXYGGTAGATTGTTTTTTTAAGTGGGTTTTTGATTTTTGATTTTXYAGTAGTTTAATTGGGAGGTATTTTTTAGTAGGGGTATATTGATATTTTATAXYGTAGGGTATTTTAATAGATTTGTAGTTGAGGGTTTTGTTTGTTAGAAGGAAAATTAATAATTAGAAAGGATATTTATATXYAAAATTTATTTGTATATTATTATTATTAAAGATTAAAAGTAGATAAAATTATAAAG
BISULFITE SEQUENCING STEP BY STEP 4) Word -> REPLACE XY WITH CG >L1 5’ UTR GGGGGGAGGAGTTAAGATGGTCGAATAGGAATAGTTTCGGTTTATAGTTTTTAGCGTGAGCGACGTAGAAGACGGTGATTTTTGTATTTTTATTTGAGGTATCGGGTTTATTTTATTAGGGAGTGTTAGATAGTGGGCGTAGGTTAGTGTGTGTGCGTATCGTGCGCGAGTCGAAGTAGGGCGAGGTATTGTTTTATTTGGGAAGCGTAAGGGGTTAGGGAGTTTTTTTTTCGAGTTAAAGAAAGGGGTGACGGACGTATTTGGAAAATCGGGTTATTTTTATTCGAATATTGCGTTTTTTAGATCGGTTTAAGAAACGGCGTATTACGAGATTATATTTTATATTTGGTTCGGAGGGTTTTACGTTTACGGAATTTCGTTGATTGTTAGTATAGTAGTTTGAGATTAAATTGTAAGGCGGTAACGAGGTTGGGGGAGGGGCGTTCGTTATTGTTTAGGTTTGTTTAGGTAAATAAAGTAGTAGGGAAGTTCGAATTGGGTGGAGTTTATTATAGTTTAAGGAGGTTTGTTTGTTTTTGTAGGTTTTATTTTTGGGGGTAGGGTATAGATAAATAAAAAGATAGTAGTAATTTTTGTAGATTTAAGTGTTTTTGTTTGATAGTTTTGAAGAGAGTAGTGGTTTTTTTAGTACGTAGTTGGAGATTTGAGAACGGGTAGATTGTTTTTTTAAGTGGGTTTTTGATTTTTGATTTTCGAGTAGTTTAATTGGGAGGTATTTTTTAGTAGGGGTATATTGATATTTTATACGGTAGGGTATTTTAATAGATTTGTAGTTGAGGGTTTTGTTTGTTAGAAGGAAAATTAATAATTAGAAAGGATATTTATATCGAAAATTTATTTGTATATTATTATTATTAAAGATTAAAAGTAGATAAAATTATAAAG
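Steps 2-4 (the three Word replacements) can also be sketched in Python. This is an illustration under the same assumptions as the slides: only the top strand is modeled, every CpG is treated as methylated and every other C as converted. The function name `bisulfite_convert` and the `methylated_cpg` switch are hypothetical, and the input is assumed to contain no X or Y characters.

```python
def bisulfite_convert(seq: str, methylated_cpg: bool = True) -> str:
    """In-silico bisulfite conversion of the top strand.

    methylated_cpg=True  -> CpG cytosines are protected (stay C)
    methylated_cpg=False -> every C is converted to T
    """
    s = "".join(seq.split()).upper()
    if methylated_cpg:
        s = s.replace("CG", "XY")   # step 2: mask CpG so it survives step 3
    s = s.replace("C", "T")         # step 3: unmethylated C is read as T
    return s.replace("XY", "CG")    # step 4: restore the protected CpGs


# First 40 nt of the L1 5' UTR
start = "GGGGGGAGGAGCCAAGATGGCCGAATAGGAACAGCTCCGG"
print(bisulfite_convert(start))                        # all CpGs methylated
print(bisulfite_convert(start, methylated_cpg=False))  # nothing methylated
```

Comparing the two outputs shows which positions distinguish methylated from unmethylated templates, which is worth knowing before placing primers.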
BISULFITE SEQUENCING STEP BY STEP 5) Design primers (on the converted >L1 5’ UTR sequence from step 4)
- amplicon size typically up to 500 bp, ideally around 300 bp
- avoid low complexity sequences
- Tm around 55 °C
- select the right region!
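A quick way to screen candidate primers against the rules of thumb in step 5 is sketched below. The Tm estimate uses the Wallace rule, 2(A+T) + 4(G+C), which is a swapped-in approximation meant for short oligos; primer design software uses more accurate models and will give different numbers. The function names, the single-nucleotide-run measure of low complexity and the example primer are all assumptions made for illustration.

```python
def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def longest_run(primer: str) -> int:
    """Length of the longest single-nucleotide run (a crude low-complexity flag)."""
    best = run = 0
    prev = ""
    for base in primer.upper():
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def amplicon_ok(size_bp: int, max_size: int = 500) -> bool:
    """Amplicon size rule from the slide: up to 500 bp (ideally around 300 bp)."""
    return 0 < size_bp <= max_size


# Hypothetical forward primer: the first 20 nt of the converted sequence
primer = "GGGGGGAGGAGTTAAGATGG"
print(wallace_tm(primer), longest_run(primer), amplicon_ok(300))
```

Bisulfite-converted DNA is T-rich, so long T runs are hard to avoid; treat `longest_run` as a warning rather than a hard filter.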
3’ end: dG should be above -13.0 (calculated from the last 7 nucleotides). TTTT doesn’t matter.
Duplex analysis
• ignore all dimers with positive dG
• only dimers with dG below -1.5 should concern you
• a strongly negative dG combined with perfect base pairing at the 3’ end is the worst combination
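The 3’ end part of the duplex check can be approximated without any thermodynamics. The sketch below only asks whether the last few nucleotides of one primer can base-pair perfectly somewhere on another primer (or on itself). It does not compute dG, so it complements rather than replaces the duplex analysis in primer design software. The function names and the default window of 4 nt are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_can_pair(primer_a: str, primer_b: str, n: int = 4) -> bool:
    """True if the last n nt of primer_a pair perfectly with some stretch of primer_b.

    Pairing is antiparallel, so primer_b must contain the reverse complement
    of the 3' end of primer_a.
    """
    return revcomp(primer_a[-n:]) in primer_b.upper()


# Hypothetical primers, for illustration only
print(three_prime_can_pair("AAAAGGCC", "TTGGCCTT"))   # 3' GGCC finds GGCC on the partner
print(three_prime_can_pair("AAAAGGGG", "TTTTAAAA"))   # no CCCC on the partner
```

Calling `three_prime_can_pair(p, p)` checks a primer against itself for 3’ self-dimers.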
Tips on how to improve amplification
HOT START PCR
• add Taq pol only after the denaturation step
• use hot start Taq pol, e.g. AmpliTaq Gold from Perkin Elmer/ABI
• AmpliTaq Gold requires an initial denaturation step of 10-15 min at 95 °C
TOUCHDOWN PCR
94 °C for 15 min
14 cycles (annealing 0.5 °C down/cycle): 94 °C for 30 sec, 62->55 °C for 30 sec, 72 °C for 1 min
36 cycles: 94 °C for 30 sec, 55 °C for 30 sec, 72 °C for 1 min
72 °C for 15-20 min
EXTREMELY SENSITIVE TO CONTAMINATION !!!
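The touchdown arithmetic is easy to check: 14 cycles at 0.5 °C down per cycle take the annealing temperature from 62 °C to 55.5 °C, after which 36 cycles run at 55 °C. A minimal sketch that lists the annealing temperature for every cycle follows; the function name and parameter names are assumptions, the numbers are those on the slide.

```python
def touchdown_annealing(start=62.0, step=0.5, touchdown_cycles=14,
                        final_anneal=55.0, final_cycles=36):
    """Annealing temperature (deg C) for each cycle of the touchdown program."""
    temps = [start - step * i for i in range(touchdown_cycles)]  # 62.0, 61.5, ...
    temps += [final_anneal] * final_cycles                       # constant phase
    return temps


program = touchdown_annealing()
print(len(program), "cycles in total")
print("touchdown phase:", program[:14])
```

The total of 50 cycles is the reason the protocol is so sensitive to contamination.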
BISULFITE SEQUENCING STEP BY STEP
5) Run PCR
6) cut
7) Gel Extraction (Qiagen)
8) TOPO TA II cloning (Invitrogen)
9) Miniprep (Qiagen)
10) Sequencing with SP6 primer
Restriction digest
- cheap and fast
- less information (up to a few CpGs)
- not good for polymorphic sequences
Pyrosequencing http://www.pyrosequencing.com/
- short read (30 nt)
- quantitative ratio of polymorphic nucleotides
- good for one sequence analyzed from many samples
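Once the clones from step 10 are sequenced, each read is compared with the original (unconverted) sequence: at every CpG of the reference, a C in the read means methylated and a T means unmethylated. A minimal sketch follows, assuming the read is already trimmed, in the same orientation as the reference and aligned without gaps; in practice that alignment is done first, e.g. in BioEdit. The function names and the M/U/? coding are assumptions.

```python
def call_cpg_methylation(reference: str, clone: str) -> str:
    """One letter per reference CpG: M (read has C), U (read has T), ? (anything else)."""
    ref, read = reference.upper(), clone.upper()
    if len(ref) != len(read):
        raise ValueError("reference and clone must be aligned to the same length")
    calls = []
    for i in range(len(ref) - 1):
        if ref[i:i + 2] == "CG":
            calls.append({"C": "M", "T": "U"}.get(read[i], "?"))
    return "".join(calls)


def conversion_rate(reference: str, clone: str) -> float:
    """Fraction of non-CpG cytosines read as T (a check on conversion efficiency)."""
    ref, read = reference.upper(), clone.upper()
    if len(ref) != len(read):
        raise ValueError("reference and clone must be aligned to the same length")
    non_cpg = [i for i, b in enumerate(ref) if b == "C" and ref[i + 1:i + 2] != "G"]
    if not non_cpg:
        return 0.0
    return sum(read[i] == "T" for i in non_cpg) / len(non_cpg)


# First 40 nt of the L1 5' UTR and two hypothetical clone reads
reference = "GGGGGGAGGAGCCAAGATGGCCGAATAGGAACAGCTCCGG"
clone_a = "GGGGGGAGGAGTTAAGATGGTCGAATAGGAATAGTTTCGG"
clone_b = "GGGGGGAGGAGTTAAGATGGTTGAATAGGAATAGTTTTGG"
for clone in (clone_a, clone_b):
    print(call_cpg_methylation(reference, clone), conversion_rate(reference, clone))
```

A low conversion rate at non-CpG cytosines suggests incomplete bisulfite conversion, in which case the CpG calls of that clone should not be trusted.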
Direction of sequencing is important!
• GT-rich regions are more difficult to sequence
• they cause problems especially with products >300 bp
[Figure: the same PCR product (amount, purity …) but different strands sequenced; labels 700 and 400]
BioEdit - it’s good and it’s free! http://www.mbio.ncsu.edu/BioEdit/bioedit.html
Vector NTI is good but f*cking expensive
analysis of individual elements single-locus analysis
[Figure: L1 structure (5’ UTR, ORF1, ORF2 with EN and RT, AAAn) with panels for L1 Xp22, L1 Xq22, L1 6p22, L1 6q16, L1 8q24, L1 6p21 and the L1 pool]
sample: female undifferentiated hES
? ? ?
How would you explain it? How would you test your explanation?
[Figure: L1 structure (5’ UTR, ORF1, ORF2 with EN and RT, AAAn); panels: 8q24, Xp22]
HOMEWORK • Oct-4 promoter analysis • PCR cloned in pCR II • sequenced with SP6
DNA METHYLATION ACROSS PHYLA
- a number of conserved genes
- common basic cell types
- highly variable DNA methylation
- different development
- conserved histone modifications
- different sex determination
- different epigenetic mechanisms
[Figure: phylogenetic tree. Taxa: Mus (MAMMALIA), Xenopus (AMPHIBIA), Danio (PISCES), Strongylocentrotus (ECHINODERMATA), Drosophila (ARTHROPODA), Caenorhabditis (NEMATODA). Group labels: CHORDATA, DEUTEROSTOMES, PROTOSTOMES, COELOMATES, PSEUDOCOELOMATES, EUMETAZOA. Divergence labels: >350 MYA, >400 MYA, >600 MYA]
EXTENSIVE DNA METHYLATION LEAVES TRACES …
- genomes carrying CpG methylation show lower frequencies of CpG (Jabbari 2004)
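CpG depletion is commonly expressed as the observed/expected ratio, (number of CpG x length) / (number of C x number of G), which corrects for base composition: a value near 1 means no depletion, a value well below 1 means CpGs are under-represented. A minimal sketch follows; the function name is an assumption and the slide itself does not say which measure Jabbari 2004 used.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    s = "".join(seq.split()).upper()
    c, g, cpg = s.count("C"), s.count("G"), s.count("CG")
    if c == 0 or g == 0:
        return 0.0          # ratio is undefined without both C and G; report 0
    return cpg * len(s) / (c * g)


print(cpg_obs_exp("CCGG"))   # one CpG where one is expected
print(cpg_obs_exp("CATG"))   # C and G present, but no CpG
```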
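C-T CONVERSION
Spontaneous deamination of 5-methylcytosine yields thymine, so methylated CpGs tend to mutate to TpG over time.
http://www.chemsoc.org/chembytes/ezine/2001/pufulete_mar01.htm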
Drosophila 0.1-0.2% of the cytosines methylated
DNA METHYLATION DISTRIBUTION - trends and exceptions
repetitive sequences
- noncoding tandem repeats (satellites): hypermethylated
- coding tandem repeats - rDNA: variable
- interspersed elements - L1, IAP, Alu: hypermethylated
- telomeric repeats: hypermethylated
unique sequences
- promoter - active genes: +/- unmethylated
- promoter - inactive genes: variable
- “gene body”: methylated
rDNA
- pol I transcription
- tandemly arrayed on five pairs of human acrocentric chromosomes
- ~400 copies per haploid genome
- typically half active, half inactive (all active in the oocyte)
[Figure: regions analyzed: upstream of the UCE, UCE, core p.; HEK-293: H4Ac (active), dim-H3K9 (inactive), total DNA]
Methylation               hypo    hyper
transformed cell lines    50%     50%
primary human cells       100%    0%
primary mouse cells       100%    0%
undiff. ES cells          100%    0%
diff. ES cells            100%    0%
[Figure: human and mouse rDNA promoter with UCE and Core; scale 400 bp; 130 bp repeats]
Annotation of UCE and CORE sequences based on Heix and Grummt, 1995
- different CpG density between closely related species
- gene activity does not correlate with methylation in either species
- only inactive genes in transformed lines acquire methylation
- sometimes, methylation is not very informative
CpG islands
A CpG island is a region at least 200 bp long with a GC content greater than 50% and a CpG frequency greater than 6% (genome average is 1%). CpG islands are found in and near approximately 40% of promoters of mammalian genes (about 70% of human promoters). A “typical” CpG island is 300-3000 bp long. The CpG sites in promoter CpG islands are typically unmethylated if the genes are expressed. This observation led to the speculation that methylation of CpG sites in the promoter of a gene may inhibit its expression. Kim et al. BMC Cancer 2006 6:180
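The three criteria on this slide translate directly into a simple test. The sketch below reads "CpG frequency" as CpG dinucleotides per 100 bp, which is an interpretation chosen because it matches the quoted genome average of about 1%. The function name is an assumption and the thresholds are parameters so that other definitions can be tried.

```python
def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 50.0, min_cpg: float = 6.0) -> bool:
    """Apply the slide's criteria: length >= 200 bp, GC% > 50, CpG frequency > 6%."""
    s = "".join(seq.split()).upper()
    if len(s) < min_len:
        return False
    gc_percent = 100.0 * (s.count("G") + s.count("C")) / len(s)
    cpg_percent = 100.0 * s.count("CG") / len(s)   # CpGs per 100 bp (assumed meaning)
    return gc_percent > min_gc and cpg_percent > min_cpg


# Artificial test sequences, for illustration only
print(looks_like_cpg_island("CG" * 100))   # long, GC-rich, CpG-rich
print(looks_like_cpg_island("AT" * 100))   # long, but no G or C at all
```

To scan a longer sequence, the same function can be applied to successive windows, for example 200 bp windows moved in small steps.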
PROMOTERS OF INACTIVE GENES
- hundreds of papers with contradictory data
- methylation correlates with inactivity … but that’s it ...
CpG poor promoters - hypermethylated regardless of activity
strong CpG island promoters - hypomethylated regardless of activity
weak CpG island promoters - distinct …
testis-specific promoters - methylated in somatic cells
Mammalian DNA methyltransferases
Maintenance methylation - DNMT1
substrate: hemimethylated DNA
function: restoration of DNA methylation after replication
De novo methylation - DNMT3a, 3b (3L)
substrate: unmethylated DNA
function: establishment of new DNA methylation patterns
Li 2002
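The difference between the two activities can be illustrated with a toy model in which each strand is a list of flags, one per CpG (True = methylated). This is a deliberately idealized sketch: it assumes perfect maintenance and instantaneous de novo methylation, and all names (`Duplex`, `replicate`, `maintenance`, `de_novo`) are hypothetical.

```python
from typing import List, Set, Tuple

Duplex = Tuple[List[bool], List[bool]]   # (top strand, bottom strand), one flag per CpG


def replicate(duplex: Duplex) -> Tuple[Duplex, Duplex]:
    """Semi-conservative replication: each daughter keeps one old strand,
    the newly synthesized strand starts unmethylated (hemimethylated DNA)."""
    top, bottom = duplex
    n = len(top)
    return (list(top), [False] * n), ([False] * n, list(bottom))


def maintenance(duplex: Duplex) -> Duplex:
    """DNMT1-like step: a CpG methylated on one strand becomes methylated on both;
    fully unmethylated CpGs are left alone."""
    top, bottom = duplex
    restored = [t or b for t, b in zip(top, bottom)]
    return restored, list(restored)


def de_novo(duplex: Duplex, sites: Set[int]) -> Duplex:
    """DNMT3-like step: methylate the chosen CpGs on both strands, whatever their state."""
    top, bottom = duplex
    return ([t or (i in sites) for i, t in enumerate(top)],
            [b or (i in sites) for i, b in enumerate(bottom)])


parent = ([True, False, True], [True, False, True])
daughter, _ = replicate(parent)
print(daughter)                    # hemimethylated
print(maintenance(daughter))       # parental pattern restored
print(de_novo(daughter, {1}))      # a new mark that maintenance alone could not create
```

Note that maintenance never methylates CpG 1, because neither strand carries a mark there; only the de novo step can establish a new pattern.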
Klose 2006 Setting up and interpreting the mark …