Core Promoters & Transcription Start Sites • Core promoter • Region of ~100 bp flanking the transcription start site • Recognition site for the basal transcription apparatus • Implicated in gene regulation, not just initiation • Identified sequence motifs • Inr: centered at position +1; ~25% of promoters • TATA box: centered at -40; ~20% of promoters • DPE: centered at +20; ~8% of promoters • DRE: DNA replication-related element; ~25% of promoters • 6 more motifs defined computationally
Transcription Start Sites • Full-Length cDNAs • Cap-trapped libraries • 5’ ESTs & directed cDNA screening • 5’ RACE & primer extension • Whole genome tiling arrays • Tissue- and stage-specificity • Alternative 5’ exons • Comparative genomics may help
Core Promoter Prediction: Example of Ohler et al. 2002 • Selected ~2,000 clusters of 5’ ESTs • ≥ 3 ESTs within 11 bp • ≥ 1 cap-trapped EST • 5’-most EST defines position +1 • Find overrepresented words • Search within -60 to +40 • MEME finds 10 sequence motifs: 4 known and 6 novel • Re-train and run McPromoter • Evaluate on Adh region and chromosome arm 2R • Sensitivity / Specificity ranges from 65% / 29% to 19% / 69% (threshold-dependent) • Conclusion: Good enough to direct wet lab experiments
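The EST clustering criteria on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of Ohler et al.: the `Est` record, the function names, the plus-strand-only coordinates, and the reading of "within 11 bp" as a cluster spanning at most 11 positions from its 5’-most EST are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Est:
    """Hypothetical record for one EST (plus strand assumed)."""
    five_prime: int    # 1-based genomic coordinate of the EST 5' end
    cap_trapped: bool  # True if the EST comes from a cap-trapped library


def cluster_tss(ests: List[Est], max_span: int = 11,
                min_ests: int = 3, min_cap: int = 1) -> List[int]:
    """Return putative +1 positions, one per accepted EST cluster.

    A cluster starts at an EST 5' end and takes every EST whose 5' end
    lies within max_span positions of it. It is accepted when it holds
    at least min_ests ESTs, at least min_cap of them cap-trapped.
    """
    ordered = sorted(ests, key=lambda e: e.five_prime)
    tss = []
    i = 0
    while i < len(ordered):
        start = ordered[i].five_prime
        j = i
        while j < len(ordered) and ordered[j].five_prime - start < max_span:
            j += 1
        group = ordered[i:j]
        n_cap = sum(e.cap_trapped for e in group)
        if len(group) >= min_ests and n_cap >= min_cap:
            tss.append(start)  # the 5'-most EST defines position +1
            i = j              # continue after the accepted cluster
        else:
            i += 1             # retry with the next EST as cluster start
    return tss


def core_window(tss: int, upstream: int = 60,
                downstream: int = 40) -> Tuple[int, int]:
    """Inclusive 1-based interval covering -60 to +40 around +1.

    There is no position 0, so +1 is the TSS itself and +40 is tss + 39.
    """
    return tss - upstream, tss + downstream - 1


if __name__ == "__main__":
    ests = [Est(100, True), Est(103, False), Est(108, False),
            Est(500, False), Est(505, True)]
    for pos in cluster_tss(ests):
        print(pos, core_window(pos))
```

With the toy input, only the three ESTs near coordinate 100 form an accepted cluster; the pair near 500 falls short of the three-EST minimum. The window returned for each cluster is 100 bp long, matching the ~100 bp core promoter region searched for motifs.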
P-elements Target Promoters • > 50,000 mapped insertions • Insertions associated with 40% of genes
Drosophila Promoter Annotation • Figure tracks: P-element insertion sites; promoter predictions; curated gene models (open boxes = 5’ UTRs)
DNase I Hypersensitive Sites • Sites of Accessible or “Open” Chromatin • Southern Blot Assay Method • Partial digestion of chromatin with DNase I • Complete digestion with a restriction enzyme • Southern blot to identify and locate site(s) • DNase digest is partial, so fragments are size-fractionated prior to detection • Related Methods • Restriction enzyme hypersensitivity • Chemical modification
Additional Methods • ENCODE Methods • ChIP-chip of proteins at promoters • RNA pol II, TBP, etc. • Transfection of reporter constructs • DNase I hypersensitive sites • EST-like sequencing of plasmid libraries • Quantitative PCR detection • Other Approaches • In vitro transcription assays • What else?
Sample ENCODE Region (Fig. 3) • Promoter finding methods correlate with each other and with 5’ exons • Functional assays of constructs look good • RNA Pol II ChIP-chip can be very effective and scales to the genome • DNase I hypersensitive sites correlate with promoters • Some genes appear to have promoters at both ends: antisense?
Summary of Methods • Transcription Start Sites • 5’ ESTs, cDNA screening, and 5’ RACE are scalable • Whole genome tiling array hybridization is key • Core Promoters • ChIP-chip with promoter-associated proteins is key • Reporter constructs tested in functional assays in fly cells? • DNase I hypersensitive sites • Will high-throughput methods be effective and scalable? • Are hypersensitive sites mostly promoters? • Computational Methods • Refinements to promoter prediction • Comparative sequence analysis • Comparative promoter prediction?
Promoter Annotation in Drosophila • ~14,000 protein-coding genes • >10,000 associated with 5’ ESTs • How many alternative promoters? • Release 3 annotation suggests ≥ 3% of genes • How many non-protein-coding genes? • Project Scale: ~20,000 fly promoters?