Introduction to the Tsinghua University ENCODE Journal Club
Monica C. Sleumer (苏漠) 2012-09-24
Tsinghua ENCODE Journal Club Objectives
• Read and discuss all 31 ENCODE papers
• Discuss the 13 “Threads” in the ENCODE explorer
• Discuss the overall meaning of the ENCODE project
  • Media reactions
• Understand how to apply ENCODE findings to our own research
• Generate a long-term repository for our findings on our journal club website: bioinfo.au.tsinghua.edu.cn/encode/
Human Genome
• 3,101,804,739 base pairs
• 22 autosomes plus the X and Y sex chromosomes
• 21,224 protein-coding genes
• 15,952 ncRNA genes
• 3–8% of bases are under selection
  • Estimated from comparative genomic studies
• Question: What is the rest of the genome doing?
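To put the 3–8% estimate in absolute terms, the short sketch below converts it to base counts using the genome size quoted on the slide. The function name is an illustrative choice, and integer division is used only to avoid reporting fractional bases.

```python
GENOME_SIZE_BP = 3_101_804_739  # genome size quoted on the slide


def bases_at_percent(percent: int, genome_size: int = GENOME_SIZE_BP) -> int:
    """Return the whole number of bases corresponding to an integer percentage."""
    return genome_size * percent // 100


low = bases_at_percent(3)   # lower bound of the "under selection" estimate
high = bases_at_percent(8)  # upper bound of the same estimate
print(f"Bases under selection: roughly {low:,} to {high:,}")
```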
ENCODE Project Objectives
• Find all functional elements (different combination in each cell type)
  • Bound by specific proteins
  • Transcribed
  • Histone modifications
  • DNA methylation
• Use this information to annotate functional regions
  • Genes (coding and non-coding)
  • Promoters
  • Enhancers
  • Specific transcription factor binding sites
  • Silencers
  • Insulators
  • Chromatin states
• Cross-reference data from other studies
  • Comparative genomics
  • 1000 Genomes Project
  • Genome-wide association studies (GWAS)
ENCODE Projects
• ENCODE pilot project: 1% of the genome, 2003-2007
• modENCODE: Drosophila and C. elegans
• Mouse ENCODE: in progress?
• ENCODE main project, 2007-2012
  • 1649 dataset-generating experiments
  • 147 cell types
  • 235 antibodies and assay protocols
  • 450 authors
  • 32 institutes
  • 31 publications on 2012-09-06
    • 6 in Nature (all discussed on 2012-09-19)
    • 18 in Genome Research
    • 6 in Genome Biology (one of these discussed today)
    • 1 in BMC Genetics
www.nature.com/encode/category/research-papers
Materials
• 147 types of human cell lines, in 3 priority levels
  • Tier 1 cell lines: top priority for all experiments
  • Tier 2 cell lines: to be done after Tier 1 (next slide)
  • Tier 3: any other cell lines
Tier 2 Cell Lines http://encodeproject.org/ENCODE/cellTypes.html
Methods
Wu Dingming 2012-09-19, Ma Xiaopeng 2012-09-19, Guo Weilong, He Chao 2012-09-19, Li Yanjian 2012-09-19
• All methods (DNA or RNA sequencing) can be traced back to a genomic location
• Findings vary between cell types
Primary Findings
• 80.4% of the human genome is doing at least one of the following:
  • Bound by a transcription factor
  • Transcribed
  • Modified histone
• 99% is within 1.7 kb of at least one of the biochemical events
• 95% is within 8 kb of a DNA–protein interaction or DNase I footprint
• 7 chromatin states:
  • 399,124 enhancer-like regions
  • 70,292 promoter-like regions
• Correlation between transcription, chromatin marks, and TF binding
• Functional regions contain lots of SNPs
• Disease-associated SNPs in non-coding regions tend to be in functional elements
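A figure such as "80.4% of the genome is doing at least one of the following" is a union-coverage statistic: the intervals from every assay are merged so that overlapping bases are counted once, and the merged length is divided by the genome length. The sketch below illustrates that calculation on a single toy chromosome. The track names, coordinates, and the 100-base length are invented for illustration and are not ENCODE data; this is not the consortium's actual pipeline.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_fraction(tracks, genome_length):
    """Fraction of bases covered by at least one interval in any track."""
    all_intervals = [iv for track in tracks.values() for iv in track]
    covered = sum(end - start for start, end in merge_intervals(all_intervals))
    return covered / genome_length


# Hypothetical toy tracks on a 100-base "genome" (not real data).
TOY_TRACKS = {
    "tf_bound": [(10, 30), (60, 70)],
    "transcribed": [(20, 50)],
    "histone_modified": [(65, 90)],
}

print(covered_fraction(TOY_TRACKS, 100))
```

Summing the per-track lengths instead would count bases hit by several assays more than once, which is why the intervals are merged before measuring.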
Applications
• Visible as genome tracks in the UCSC Genome Browser
• Gene or pathway of interest
• Mutation from
  • Cancer sequencing
  • Genome-wide association studies
• Find out what that part of the genome is doing
• Compare with your cancer data (RNA-seq)
• Comparative genome analysis
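"Find out what that part of the genome is doing" amounts to an overlap query: given the position of a mutation, list the annotated elements that contain it. The following is a minimal sketch assuming BED-style coordinates (0-based start, exclusive end). The `Element` type, the `elements_at` helper, and all annotations shown are hypothetical; in practice the intervals would come from downloaded ENCODE track files.

```python
from typing import List, NamedTuple


class Element(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    label: str


# Hypothetical annotations for illustration only (not real ENCODE calls).
ELEMENTS = [
    Element("chr1", 1000, 1500, "promoter-like"),
    Element("chr1", 1400, 2200, "TF binding site"),
    Element("chr1", 5000, 5600, "enhancer-like"),
    Element("chr2", 300, 900, "enhancer-like"),
]


def elements_at(chrom: str, pos: int, elements: List[Element] = ELEMENTS) -> List[str]:
    """Return labels of all elements whose interval contains the position."""
    return [e.label for e in elements
            if e.chrom == chrom and e.start <= pos < e.end]


# A variant inside two overlapping elements, and one in unannotated sequence.
print(elements_at("chr1", 1450))
print(elements_at("chr1", 3000))
```

A linear scan is adequate for a handful of intervals; for genome-scale tracks an interval tree or a sorted, binary-searched index would be the usual choice.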
Online Resources
• Interactive app on Nature ENCODE main page: www.nature.com/encode/
• Journal club website: bioinfo.au.tsinghua.edu.cn/encode/
Next ENCODE Journal Club Meeting
• Suggested meeting day: Thursday (周四) 2012-10-11
• Speaker: LIANG Zhengyu?
• One more volunteer speaker needed