
State of CBIL

This article discusses the current and future research directions in CBIL, including gene discovery, EST analysis, genomic sequence analysis, gene regulation, microarray analysis, promoter/regulatory region analysis, biological data representation, data integration, and ontology.


Presentation Transcript


  1. State of CBIL: Current and Future Directions

  2. Computational Biology and Informatics Laboratory, October 2001

  3. CBIL Research • Gene Discovery • EST analysis • Genomic sequence analysis • Gene Regulation • Microarray analysis • Promoter/regulatory region analysis • Biological data representation • Data integration • Ontology

  4. CBIL: Gene Discovery • Gene Annotation (Kolchanov) • Gene coding potential • Gene function prediction • AllGenes • EPConDB (Kaestner, Permutt, Melton) • StemCellDB/StroCDB (Lemischka, Moore) • Mouse chromosome 5 (Bucan) • PlasmoDB (Roos, Kissinger) • ParaDB (Roos) • Posters

  5. CBIL: Gene Regulation • PaGE • PROM_REC • TESS • Pancreatic development (Kaestner) • TGF-B signaling (Bottinger) • Fetal globin expression in adults (Fortina) • Brain disease and injury (Eberwine, Meaney) • Endothelial cell function (Davies)

  6. CBIL: Biological Data Representation • Genomic Unified Schema • RNA Abundance Database • Connecting to a brain atlas (Nissanov, Davidson) • Microarray ontology (MGED)

  7. CBIL Project Architecture • Sequence & annotation • Gene index (ESTs and mRNAs) • Microarray expression data • Experimental annotation • Relational DB (Oracle) with Perl object layer • GUS • RAD

  8. GUS: Genomics Unified Schema; RAD: RNA Abundance DB. Genomic Sequence: • Genes, gene models • STSs, repeats, etc • Cross-species analysis. Transcribed Sequence: • Characterize transcripts • RH mapping • Library analysis • Cross-species analysis • DOTS. Transcript Expression: • Arrays • SAGE • Conditions. Protein Sequence: • Domains • Function • Structure • Cross-species analysis. Pathways, Networks: • Representation • Reconstruction. Special Features: • Ownership • Protection • Algorithm • Evidence • Similarity • Versioning. Controlled vocabs., free text: • GO • Species • Tissue • Dev. Stage. Legend: under development.

  9. Clusters vs. Contig Assemblies • UniGene • Transcribed Sequences (DOTS) • BLAST: clusters of ESTs & mRNAs • CAP4: consensus sequences (alternative splicing, paralogs)
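
The clustering step named on slide 9 groups ESTs and mRNAs that are linked by BLAST hits. A minimal sketch of that idea as single-linkage clustering with union-find follows. It assumes the pairwise hits have already been computed and filtered; the sequence IDs and the `cluster_by_hits` helper are hypothetical, and this is an illustration rather than the DOTS or UniGene pipeline itself. The CAP4 assembly step that builds consensus sequences is not modeled.

```python
from collections import defaultdict


def cluster_by_hits(seq_ids, hits):
    """Group sequence IDs into single-linkage clusters.

    seq_ids: iterable of sequence identifiers.
    hits: iterable of (id_a, id_b) pairs judged similar
          (for example, filtered BLAST hits).
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for s in parent:
        clusters[find(s)].append(s)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical IDs: est1, est2 and mrna1 are linked; est3 and est4 stay singletons.
example = cluster_by_hits(
    ["est1", "est2", "est3", "est4", "mrna1"],
    [("est1", "est2"), ("est2", "mrna1")],
)
```

Single linkage means one shared hit is enough to merge two groups, which is why a later assembly step is needed to separate paralogs and splice variants that end up in the same cluster.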

  10. Assembled Transcripts • About 3 million human EST and mRNA sequences used • Combined into 797,028 assemblies • Cluster into 150,006 “genes” • Can identify a protein for 76,771 genes • And predict a function for 24,127 genes • About 2 million mouse EST and mRNA sequences used • Combined into 355,770 assemblies • Cluster into 74,024 “genes” • Can identify a protein for 34,008 genes • And predict a function for 15,403 genes
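
The counts on slide 10 are easier to compare as fractions of the clustered "genes" for each species. The short sketch below does that arithmetic; the counts are copied from the slide, while the dictionary layout and the `fractions` helper are assumptions made for illustration.

```python
# Counts copied from slide 10.
counts = {
    "human": {"genes": 150_006, "with_protein": 76_771, "with_function": 24_127},
    "mouse": {"genes": 74_024, "with_protein": 34_008, "with_function": 15_403},
}


def fractions(species_counts):
    """Return the share of clustered genes with a protein and with a predicted function."""
    genes = species_counts["genes"]
    return {
        "with_protein": species_counts["with_protein"] / genes,
        "with_function": species_counts["with_function"] / genes,
    }


summary = {species: fractions(c) for species, c in counts.items()}

for species, f in summary.items():
    print(f"{species}: protein {f['with_protein']:.1%}, function {f['with_function']:.1%}")
```

In both species roughly half of the clustered genes have an identifiable protein, and a smaller share have a predicted function.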

  11. Bridging Fingerprint Contigs and RH Maps on Mouse Chromosome 5 (Crabtree et al., Genome Research 2001) • Fingerprint Map • Chr. 5 RH Map

  12. Predicting Gene Ontology Functions

  13. AllGenes

  14. AllGenes Enhancements: Annotated Entries

  15. AllGenes Enhancements: Genomic Data

  16. http://plasmodb.org

  17. Contig View • OM Restriction Sites • Microsatellites • Self-BLAST • NRDB-BLAST • SAGE Tags • EST/GSS • FullPHAT • GeneFinder • GlimmerM • Annotation (chr2-TIGR)

  18. RAD: RNA Abundance Database • Compliant with the MGED standards • Experiment • Raw Data • Platform • Metadata • Processed Data • Algorithm

  19. Microarray Gene Expression Database group (MGED) International effort on microarray data standards: • Developing standards for storing and communicating microarray-based gene expression data • Defining the minimal information required to ensure reproducibility and verifiability of results and to facilitate data exchange (MIAME, MAGEML-MAGEDOM) • Collecting (and where needed creating) controlled vocabularies/ontologies • Developing standards for data comparison and normalization • http://www.mged.org

  20. EPConDB Pathway query

  21. Microarray Analysis: PaGE

  22. Identify shared TF binding sites • Genomic alignment and comparative sequence analysis • TESS (Transcription Element Search Software) • PROM_REC (Promoter recognition) • RAD • GUS • EST clustering and assembly
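
Slide 22 names the goal of identifying transcription factor binding sites shared between sequences. A minimal sketch of one way to do this is to scan each sequence for IUPAC consensus strings and keep the sites found in both. Everything here is an assumption made for illustration: the site names, the consensus strings, the promoter sequences, and the `find_sites` and `shared_sites` helpers. It is not how TESS or PROM_REC score sites.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def find_sites(sequence, consensus):
    """Return start positions (0-based) where the consensus matches the sequence."""
    pattern = "".join(IUPAC[base] for base in consensus)
    # The lookahead lets overlapping matches be reported as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


def shared_sites(seq_a, seq_b, sites):
    """Return the names of sites that occur at least once in both sequences."""
    return sorted(
        name
        for name, consensus in sites.items()
        if find_sites(seq_a, consensus) and find_sites(seq_b, consensus)
    )


# Illustrative consensus strings and made-up sequences, not curated data.
sites = {"TATA-like": "TATAWAW", "GC-box-like": "GGGCGG"}
promoter_a = "CCGGGCGGAATATAAAAGC"
promoter_b = "TTTATATATGGGCGGTT"
promoter_c = "ACGTACGTACGT"

common_ab = shared_sites(promoter_a, promoter_b, sites)
common_ac = shared_sites(promoter_a, promoter_c, sites)
```

Exact consensus matching is the simplest possible model; it only scans one strand and treats every match as equally good.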

  23. Promoter Analysis: PROM_REC

  24. http://www.cbil.upenn.edu

  25. Acknowledgements • CBIL: Chris Stoeckert, Vladimir Babenko, Brian Brunk, Jonathan Crabtree, Sharon Diskin, Greg Grant, Yuri Kondrakhin, Georgi Kostov, Phil Le, Li Li, Junmin Liu, Elisabetta Manduchi, Joan Mazzarelli, Shannon McWeeney, Debbie Pinney, Angel Pizarro, Jonathan Schug • PlasmoDB collaborators: David Roos, Martin Fraunholz, Jesse Kissinger, Jules Milgram, Ross Koppel (Monash U.), Malarial Genome Sequencing Consortium (Sanger Centre, Stanford U., TIGR/NMRC) • EPConDB collaborators: Klaus Kaestner, Marie Scearce, Doug Melton (Harvard), Alan Permutt (Wash. U) • Comparative Sequence Analysis collaborators: Maja Bucan, Shaying Zhao, Whitehead/MIT Center for Genome Research • CAP4 provided by Paracel

  26. CBIL: Future Directions • Sequence/sequence annotation • Pathways/networks • Gene expression experiment • Proteomics, metabolomics
