Welcome to ISMB/ECCB Genomics Session, July 31 – August 4, 2004. Ying Xu and Gene W. Myers, Co-Chairs, ISMB/ECCB Genomics session.
Computational Genomics
• Our genome encodes an enormous amount of information about our beings
  • our looks
  • our size
  • how our bodies work
  • ….
  • our health
  • our behaviors
  • … who we are!
[Slide background: a long run of raw DNA sequence, beginning "gcgtacgtacgtagagtgctagtctagtcgtagcgccgtag" and ending "…….".]
Computational Genomics
• As technologies improve, we are able to extract more and more of the information encoded in a genome
[Figure: diagram labeled "biological data" and "bio-complexity", with the levels genes, proteins, complexes, pathways, whole cell, organs, community.]
Computational Genomics
• While the ultimate goal of “functional genomics” is to link the behavior of cells, organisms, and populations to the information encoded in the genome, “computational genomics” is mainly about identifying and characterizing the parts-lists of complex biological systems
[Figure labels: genomics, proteomics, transcriptomics, metabolomics, gene networks, systems biology, molecular structures.]
Computational Genomics
• Genetic parts-list encoded in a genome
  • genome sequence and variations
  • genomic structures
  • protein-coding genes
  • RNA-coding genes
  • pseudogenes
  • homologs/orthologs/paralogs
  • promoters/terminators
  • regulatory elements/binding motifs
  • transposable elements
  • …….
Computational Genomics
• To identify and characterize these elements, a large number of computational techniques have been developed and are widely used in biological research
  • bio-sequence comparison
  • gene prediction
  • prediction of orthologous genes
  • prediction of promoters
  • prediction of transcription factor binding motifs
  • prediction of operons
  • prediction of genome rearrangements
  • prediction of simple and complex repeats
  • prediction of SNPs and haplotype analysis
  • …….
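As an illustration of the first technique in the list, bio-sequence comparison, here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence with a linear gap penalty). The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; they do not come from the talk.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty and
    keeps only two rows of the dynamic-programming table in memory.
    """
    # Row 0: aligning an empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]
```

For example, `global_alignment_score("ACGT", "AGT")` aligns three matching bases and one gap. Practical tools add substitution matrices, affine gap penalties, and heuristics for speed, none of which are shown here.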
Computational Genomics
• Computational genomics is playing an increasingly important role in modern biology
  • suggesting biological functions of predicted genes through homology search
    • e.g., NF1 regulates Ras in human
  • suggesting possible genes associated with a particular disease, and hence reducing the search space for relevant genes
    • e.g., genes involved in retinal disease
  • suggesting an organism’s biology through genome comparison
    • e.g., M. genitalium produces its macromolecules from preformed precursors that are transported into its cytoplasm from its eukaryotic host cell
Computational Genomics
• suggesting candidate component lists and their possible interaction relationships in a biological pathway/network
  • e.g., prediction of operons in microbial genomes
• providing powerful tools for studies of biological evolution
  • sequence/genome comparison
  • phylogenetic profile analysis
• playing key roles in the human and other genome projects
  • genome assembly
  • protein-coding gene prediction
  • genome annotation
These are considered major milestones in the human genome project.
Challenges in Computational Genomics
• One challenge comes directly from the sheer amount of sequence data and the rate at which the data is being generated
  • 207 genomes have been sequenced
  • close to 1,000 genomes are being sequenced
    • 506 prokaryotic genomes
    • 418 eukaryotic genomes
• The amount of information potentially derivable through comparative genome analysis could be enormous, given that functional elements are often conserved among “related” genomes
  • how to derive it effectively?!
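The idea that functional elements tend to be conserved among related genomes can be sketched very crudely as a search for exact substrings shared by every genome in a set. This is only an illustrative stand-in: the function name `shared_kmers` and the default `k=8` are assumptions, and real comparative analysis works with alignments and tolerates mismatches rather than requiring exact k-mer identity.

```python
def shared_kmers(genomes: list[str], k: int = 8) -> set[str]:
    """Return the k-mers that occur in every sequence in `genomes`.

    Exact shared k-mers are a rough proxy for conservation; sequences
    shorter than k contribute an empty set, so nothing is shared.
    """
    kmer_sets = []
    for genome in genomes:
        seq = genome.upper()
        # Collect every length-k window of this sequence.
        kmer_sets.append({seq[i:i + k] for i in range(len(seq) - k + 1)})
    if not kmer_sets:
        return set()
    return set.intersection(*kmer_sets)
```

With hundreds of genomes, storing every k-mer set in memory does not scale, which is one concrete face of the data-volume challenge described above.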
Challenges in Computational Genomics
• Prediction of protein-coding genes still represents a challenging problem
  • accurate prediction of exon/intron boundaries
  • prediction of alternatively spliced gene forms
• Protein-coding genes account for ~3% of the human genome. What and where are the other “functional elements” in the rest of the genome?
  • how to identify them?
  • how to (help to) predict their functions?
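One way to see why exon/intron boundary prediction is hard: most introns begin with GT and end with AG, but those dinucleotides occur all over a sequence, so the motif alone yields far too many candidates. The sketch below enumerates every GT...AG span; the function name and the `min_len` parameter are assumptions for illustration, and this is not a gene predictor.

```python
def candidate_introns(seq: str, min_len: int = 4) -> list[tuple[int, int]]:
    """List (start, end) spans where seq[start:end] begins with GT and ends with AG.

    Spans are half-open, 0-based intervals. A span needs at least 4 bases
    so that the GT and the AG do not overlap.
    """
    seq = seq.upper()
    min_len = max(min_len, 4)
    # Positions where a donor-like GT or an acceptor-like AG dinucleotide starts.
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [
        (d, a + 2)
        for d in donors
        for a in acceptors
        if a + 2 - d >= min_len
    ]
```

The number of candidates grows roughly with the product of the GT and AG counts, which is why practical splice site predictors rely on statistical models of the surrounding sequence context rather than the dinucleotides alone.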
Challenges in Computational Genomics
• Identification of RNA-coding genes
  • what are the identifiable characteristics of RNA genes?
  • particularly, identification of small regulatory RNAs
    • short interfering RNAs (siRNA)
    • microRNA (miRNA)
    • small modulatory RNA (smRNA)
• Identification of regulatory elements/binding sites
  • transcription regulatory binding sites
  • splicing factor binding sites
  • other classes of regulatory elements?
Challenges in Computational Genomics
• Identification of other types of functional elements
  • transposons
  • ….
• Identification of genome variations – polymorphisms
  • identification of SNPs
  • prediction of haplotype blocks
• Recognition of genome structures
  • operons, regulons in microbes
  • genomic structures in eukaryotic genomes
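As a toy illustration of SNP identification, the sketch below scans the columns of a set of already-aligned, equal-length sequences and reports positions where more than one base is observed. The function name, the input format, and the choice to ignore gaps ("-") and ambiguous bases ("N") are assumptions; real SNP calling also has to account for sequencing errors, base quality, and allele frequency.

```python
from collections import Counter


def variable_sites(aligned: list[str]) -> dict[int, dict[str, int]]:
    """Map each variable alignment column (0-based) to its base counts.

    `aligned` must contain equal-length, pre-aligned sequences. Gap
    characters and N are skipped when counting bases in a column.
    """
    if len({len(s) for s in aligned}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    sites = {}
    for pos, column in enumerate(zip(*aligned)):
        counts = Counter(base.upper() for base in column if base not in "-Nn")
        # More than one distinct base in the column marks a candidate SNP.
        if len(counts) > 1:
            sites[pos] = dict(counts)
    return sites
```

For instance, `variable_sites(["ACGT", "ACAT", "ACGT"])` reports only column 2, where two sequences carry G and one carries A.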
Challenges in Computational Genomics
• The genome is not just a linear sequence; it is a 3D structure!
  • accurate identification and characterization of functional elements by looking at the genome as a 3D DNA structure
…. and many other outstanding challenges!
Papers submitted to Genomics Session
• 69 papers submitted to the Genomics session
• 10 papers selected for presentation
  • 8 long papers
  • 2 short papers
• Acceptance rate: 14.5%
Papers submitted to Genomics Session
• Papers submitted to the ISMB/ECCB Genomics session hint at the hot research areas in genomics
  • haplotype prediction and applications
  • prediction of non-protein-coding genes, particularly RNA genes
  • prediction of regulatory binding sites
  • characterization of genomic structures
  • …….
These are the areas with the most paper submissions in “genomics”.
10 papers selected from 69 submissions
• Talk #12: Splice site identification by idlBNs, R. Castelo and R. Guigo
• Talk #14: Improved techniques for the identification of pseudogenes, L. Coin and R. Durbin
• Talk #16: Exploiting conserved structure for faster annotation of non-coding RNAs without loss of accuracy, Z. Weinberg and L. Ruzzo
• Talk #18: Functional inference from non-random distribution of conserved predicted transcription factor binding sites, C. Dieterich, S. Rahmann and M. Vingron
• Talk #19: CHAINER: software for comparing genomes, M.I. Abouelhoda and E. Ohlebusch
• Talk #20: Genomic features in the breakpoint regions between syntenic blocks, P. Trinh, A. McLysaght and D. Sankoff
• Talk #22: Finding conserved primer pair candidates between two genomes using scalable genome joins with the MoBIoS, W. Xu, W. Briggs, J. Padolina, W. Liu, C.R. Linder and D. Miranker
• Talk #24: High density linkage disequilibrium mapping using models of haplotype block variation, G. Greenspan and D. Geiger
• Talk #26: Into the heart of darkness: large scale clustering of human non-coding DNA, G. Bejerano, D. Haussler and M. Blanchette
• Talk #28: CIS: compound importance sampling method for transcription factor binding site p-value estimation, Y. Barash, G. Elidan, T. Kaplan and N. Friedman
Selected Papers
• Prediction of non-protein-coding genes (3)
  • RNA genes (2)
  • pseudogenes
• Prediction of binding or functional sites (3)
  • transcription factor binding sites (2)
  • splice sites
• Genome comparison and structure analysis (3)
  • genome comparison (2)
  • characterization of genomic structures
• Applications of haplotype blocks
Genomics Papers Presented at ISMB’94
• “Computational genomics” papers from ISMB’94 (10 years ago!)
  • genome assembly
  • restriction map construction
  • genetic map construction
  • genome alignment
  • multiple sequence alignment
  • finding repeats in C. elegans
  • prediction of internal exons in human genome
  • exon/intron parsing
  • (protein) gene structure prediction
The field has clearly evolved quite a bit!
ISMB/ECCB Genomics Session
• Part I
  • Lomond Auditorium, starting at 2:50pm on August 1
  • 4 long talks
• Part II
  • Clyde Auditorium, starting at 9:20am on August 2
  • 2 short talks
  • 4 long talks
Acknowledgements
• Reviewers for the Genomics Session
  • Andy Baxevanis
  • Ewan Birney
  • Jeremy Buhler
  • Liming Cai
  • Dannie Durand
  • Roderic Guigo
  • Robert Giegerich
  • Dan Gusfield
  • Eran Halperin
  • Daniel Huson
  • Steve Jones
  • Daphne Koller
  • Anders Krogh
  • Shirley Liu
  • Mihai Pop
  • Isidore Rigoutsos
  • Marie-France Sagot
  • Victor Solovyev
  • Martin Tompa
  • Dong Xu
Genomics Session • Enjoy the talks!