geWorkbench John Watkinson Columbia University
geWorkbench • The bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic and Cellular Networks (MAGNet). • Also part of the NCI’s cancer Biomedical Informatics Grid (caBIG) initiative, under which the project was formerly called caWorkbench.
geWorkbench (cont.) • A desktop application for integrative genomics. • Runs on Windows, Linux and Macintosh. • Includes a variety of informatics tools, but specializes in microarray analysis. • Open-source and free for non-commercial use. • Includes an API for plugin development.
Integrative Genomics • Increasingly, researchers need to combine several data sources (microarray assays, DNA/RNA/protein sequences, protein structure, gene ontology, clinical data, etc.) • geWorkbench attempts to move past simple microarray analysis to include integrative methods. • Plugin framework allows geWorkbench to interact with other major software packages, including BioConductor, GenePattern and Cytoscape.
Data Support • Microarray assays (one-color and two-color, as well as caArray assays). • Sequence files. • BLAST queries. • Gene-gene interaction networks (interactomes). • Gene Ontology terms. • caBIO pathways and annotations. • Protein structure files (PDB).
Components • geWorkbench has a plugin interface for the development of third-party components. • Documentation and developer support are available from the geWorkbench team. • All visualizations and analyses have been written using the API. • Several groups at Columbia are developing for the platform.
Microarray Analysis • Summarization of raw chip data (via BioConductor). • Normalization and Filtering. • Differential expression analysis. • Clustering (Hierarchical and Self-Organizing Maps). • Classification (SVM and SMLR). • Many visualization tools.
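As a rough illustration of the differential expression step listed above, the sketch below ranks genes by Welch's t-statistic between two sample classes. It is a minimal standalone example, not geWorkbench code: the function names, the gene names and the expression values are all hypothetical, and a real analysis would also compute p-values and correct for multiple testing.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two samples with possibly unequal variances.

    Assumes each group has at least two values and that the groups are
    not both constant (otherwise the standard error is zero).
    """
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(expression, labels):
    """Rank genes by |t| between the classes labelled 0 and 1."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = welch_t(a, b)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical log-scale expression values: three arrays per phenotype.
expression = {
    "geneA": [5.1, 5.3, 4.9, 8.2, 8.0, 8.4],
    "geneB": [6.0, 6.2, 5.9, 6.1, 6.0, 6.3],
}
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_genes(expression, labels)
for gene, t in ranking:
    print(f"{gene}\tt = {t:.2f}")
```

In this toy data geneA separates the two classes far more strongly than geneB, so it appears first in the ranking.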
Sequence Analysis • BLAST and HMM search interface. • Pattern discovery. • Synteny analysis. • Promoter region analysis. • A variety of sequence viewers.
GO Term Enrichment • Traditional t-tests on microarray data identify genes that are differentially expressed between two phenotypes. • Gene Ontology (GO) term enrichment can then determine which functional or structural categories are significantly over-represented among those genes. • Supported in geWorkbench’s GO Panel component. • A similar technique can be applied to other gene sets, such as KEGG pathways.
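One common way to score enrichment of a gene set is a one-sided hypergeometric test, sketched below. The slides do not say which statistic the GO Panel uses, so treat the choice of test as an assumption; the function names and the toy gene identifiers are hypothetical as well.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, selected).

    population: number of genes on the array
    annotated:  genes in the population carrying the term
    selected:   differentially expressed genes
    overlap:    selected genes that carry the term
    """
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, selected)


def term_enrichment(all_genes, term_genes, hit_genes):
    """Enrichment p-value of one term (or any gene set) among the hits."""
    universe = set(all_genes)
    term = set(term_genes) & universe
    hits = set(hit_genes) & universe
    return hypergeom_enrichment_p(len(universe), len(term), len(hits), len(term & hits))


# Hypothetical example: 20 genes on the array, 5 annotated with the term,
# 4 differentially expressed, 3 of which carry the term.
all_genes = [f"g{i}" for i in range(20)]
term_genes = ["g0", "g1", "g2", "g3", "g4"]
hit_genes = ["g0", "g1", "g2", "g10"]

p_value = term_enrichment(all_genes, term_genes, hit_genes)
print(f"enrichment p-value: {p_value:.4f}")
```

The same function works unchanged for other gene sets such as pathway memberships, since it only needs set overlaps. When many terms are tested, the resulting p-values would need a multiple-testing correction.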
Reverse Engineering • Microarray data can be used to infer biological pathways. • geWorkbench’s Reverse Engineering component uses the ARACNE algorithm to build gene-gene interaction networks. • These can be compared and combined with an online database of interactions, curated by Columbia.
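The core idea behind ARACNE, as published, is to score gene pairs by mutual information and then prune likely indirect interactions using the data processing inequality (DPI). The sketch below illustrates that idea on pre-discretized toy data. It is a simplification and not the geWorkbench implementation: the discrete mutual-information estimate, the `tolerance` handling, the function names and the data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(x, y):
    """Mutual information (in nats) of two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def aracne_like(profiles, mi_threshold=0.0, tolerance=0.0):
    """Build an MI network, then drop the weakest edge of every triangle."""
    genes = sorted(profiles)
    mi = {
        frozenset((a, b)): mutual_information(profiles[a], profiles[b])
        for a, b in combinations(genes, 2)
    }
    edges = {e for e, v in mi.items() if v > mi_threshold}

    # DPI step: in each fully connected triplet, mark the weakest edge as a
    # likely indirect interaction. Edges are only removed after all triplets
    # have been examined, so the result does not depend on visiting order.
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in edges for e in triangle):
            weakest = min(triangle, key=mi.get)
            strongest_other = min(mi[e] for e in triangle if e != weakest)
            if mi[weakest] < strongest_other * (1 - tolerance):
                to_remove.add(weakest)
    return {e: mi[e] for e in edges - to_remove}


# Hypothetical discretized profiles for a chain X -> Y -> Z:
# Y is a noisy copy of X, and Z is a noisy copy of Y.
profiles = {
    "X": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "Y": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0],
    "Z": [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0],
}

network = aracne_like(profiles)
for edge, score in sorted(network.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(edge), round(score, 3))
```

Here X and Z share some information only through Y, so the X-Z edge is the weakest in the triangle and is pruned, leaving the two direct edges.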
Matrix REDUCE • Given microarray data and upstream sequences for genes, transcription factor binding sites can be inferred. • The Matrix REDUCE component in geWorkbench provides this analysis and tools to visualize the results.
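To give a feel for how expression data and upstream sequences can be combined, the sketch below scores candidate motifs by correlating their occurrence counts in each gene's upstream region with that gene's expression value. This is a deliberately simplified stand-in and not the Matrix REDUCE algorithm itself, which fits a position-specific model rather than counting exact words. All names, sequences and expression values are hypothetical.

```python
from math import sqrt


def count_occurrences(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_motifs(upstream, expression, motifs):
    """Rank candidate motifs by |correlation| between counts and expression."""
    genes = sorted(upstream)
    y = [expression[g] for g in genes]
    scored = []
    for motif in motifs:
        counts = [count_occurrences(upstream[g], motif) for g in genes]
        scored.append((motif, pearson(counts, y)))
    return sorted(scored, key=lambda t: abs(t[1]), reverse=True)


# Hypothetical upstream sequences and log expression ratios.
upstream = {
    "g1": "ACGTGACGTG",
    "g2": "TTACGTGTTT",
    "g3": "TTTTTTTTTT",
    "g4": "GGGGCCCCAA",
}
expression = {"g1": 2.1, "g2": 1.0, "g3": 0.1, "g4": -0.1}

ranked = rank_motifs(upstream, expression, ["ACGTG", "TTTT"])
for motif, r in ranked:
    print(f"{motif}\tr = {r:.3f}")
```

In the toy data, genes with more copies of ACGTG upstream have higher expression, so that motif ranks first as a candidate binding site.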
For More Information • http://www.geworkbench.org • Mailing List: geworkbench-users@gforge.nci.nih.gov • John Watkinson: watkin@c2b2.columbia.edu
Acknowledgements • ARACNE algorithm by Califano et al. • Matrix REDUCE algorithm by Bussemaker et al. • geWorkbench team: Aris Floratos, Eileen Daly, Kenneth Smith, Kiran Keshav, Xiaoqing Zhang, Manjunath Kustagi, Matthew Hall, Bernd Jagla, Mary VanGinhoven, John Watkinson.