Functional Elements Defined Experimentally and Computationally. Drosophila Encode Workshop, Dec. 4, 2004.
Session 1: 10:45-12:15
• Transcripts
  • Protein-coding: Mark Yandell (C), Kevin White (E)
    Transcription Start: Michael Brent (C), Akilesh Pandey (E)
    Transcription Stop
    UTRs
  • Alternate Splicing: Steve Mount (E)
  • Non-coding: Ian Holmes (C), Eric Lai (E)
    tRNA, rRNAs, snoRNAs, microRNAs
Session 2: 1:15-2:45
• Regulatory elements
  • Promoters, DNase sensitivity: Roger Hoskins (E)
  • Cis Regulatory Elements: Mike Eisen (C), Sue Celniker (E)
    Enhancers, repressors, silencers, insulators
    Transcription factor binding sites
    Conserved sequences of unknown function
  • Chromatin Modification: Gary Karpen (E)
  • Origins of Replication
    Sites of Replication termination
• Genetic variation and Gene Evolution: Andy Clark (C), David Begun (E)
• Wrap-up: Gene Myers
Experiments & Computation
Diagram columns (Experiment, Computation, Experiment):
• Experiment: DNA sequencing, cDNA/EST sequencing, ChIP/chip, DNase hypersensitivity
• Computation: Assembly, WG Alignment, Protein Gene Prediction, Cis-Module finding
• Experiment: 5’ RACE, RT-PCR, Transposon Reporters
Diagram annotations:
• How accurate? Room for improvement? What inputs would help?
• How accurate? Scalability / HTP? Cost/benefit analysis
• Validate vs. Refine? Cost/benefit analysis
• Why flies?
• Outputs: Information & Materials
Why Flies?
• “Flies are most like humans”
  • Gene features are the most similar to human
  • Have a tightly spaced ladder of species, as for mammals
  • Lots of alternative splicing, complex cis-control
  • Closest model for human diseases
• “Flies are easy to play with”
  • Compact genome
  • Short generation times, many labs, …
• “Existing Resources”
  • Genomes of 12 species, 50 melanogaster individuals
  • More known enhancers than any other organism
  • Good array sets, many functional assays
  • Many lines of mutational variants
State of the Art
• Coding Genes
  • Most protein genes located, remainder “unusual”
  • Refinement of model: 5’ + 3’ ends, splice junctions, alternate splicing possible
  • Comparative methods are improving, and could improve further along with alignments
  • Proteomics could further inform transcription
• RNA Genes / The Transcriptome
  • Must have comparative data, look for correlated structure
  • High-throughput validation: not currently available
• Cis-Elements
  • Can predict core promoters sufficiently well to assist experimental validation
  • Enhancers are much, much harder: currently need PWMs or
  • Lots of approaches possible: SELEX, ChIP/chip, Reporter “searches”
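As a rough illustration of the PWM-based approach mentioned under Cis-Elements, here is a minimal sketch of scoring a sequence against a position weight matrix. Everything in it is an assumption made for illustration: the toy count matrix, the uniform background of 0.25, the pseudocount, and the function names `build_pwm` and `scan` are hypothetical and do not come from the workshop material or any particular tool.

```python
import math

BASES = "ACGT"

def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm):
    """Score every window of the motif width; windows with non-ACGT symbols are skipped."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits

# Toy counts for a 4-bp motif (illustrative only, not real binding data).
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]

pwm = build_pwm(TOY_COUNTS)
hits = scan("GGCTATACC", pwm)
best = max(hits, key=lambda hit: hit[2])
print(best[0], best[1], round(best[2], 2))
```

A single high-scoring window like this is only a candidate site. Real enhancer prediction would presumably combine many such matrices with conservation and clustering of sites, which is part of why the slide calls enhancers much harder than core promoters.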
State of the Art
• Higher-Order Structure
  • Histone modification is important
• Population Statistics and Disease Correlation
  • “Good” model for human disease studies, only viable model for understanding functional consequences of variation
  • Long-standing pop-stat model
The Basics
Enable Comparative Informatics
• Extend ladder with 1 or 2 out species
• Finish some of the 12 (to some greater degree)
• EST/cDNA sequencing of all of the 12 (level?)
The Usual Suspects
• RACE and/or RT-PCR verification of all coding genes
• Capped EST sequencing
• Junction arrays + tiling arrays
• Continued refinement of annotation
• Functional understanding of alternate transcription
• Purification of every (most?) TF + SELEX
• ChIP/chip of all TFs + core promoter proteins
• ChIP/chip of histone modification, DH-map on cell types
Novel Components
• MS/MS of the proteome (Pandey)
• Complete annotation of the transcriptome (Mount, Lai, Holmes)
• Dros-map + complex phenotype/genotype correlation project (Clark, Begun)
• Recombinant libraries of all transcripts, all enhancers (Celniker, Bellen)
• Time (development stage) and space (cell type) for all the above