GenePattern 2.0 is a powerful platform developed at the Broad Institute of MIT and Harvard that offers a repository of analytic and visualization tools for biomedical researchers. It enables the combination of multiple data sources and methods, allowing for reproducible research. GenePattern includes modules for easy analysis, pipelines for method chaining, and a programming environment for customization.
Developed at the Broad Institute of MIT and Harvard. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38, no. 5 (2006): 500-501. GenePattern is supported by funding from the NIH.
Today… • Introduction to GenePattern • Why • What • How • Demonstration • Summary
Challenges • Modern research methods follow a more integrative approach • Tools are not available to biomedical researchers • Tools are difficult to use • Results are difficult to interpret correctly
Purpose • Create tools that are easily accessible to biomedical researchers • Allow multiple data sources and methods to be combined • Allow for “reproducible research”
GenePattern • Offers a repository of analytic and visualization tools: Modules • Easy creation of complex methods from these tools: Pipelines • The rapid development and dissemination of new methods: Programming Environment
1. Modules • Point and click • ~ 60 analysis modules (handout) • Documentation • Designed for Affymetrix data • 14 different file extensions
2. Pipelines • Golub et al. illustrates the need • Records the methods, parameters and data to ensure reproducibility • Allows methods to be “chained” • Use published pipelines or create new ones • Easily shared • Assigns version numbers
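The pipeline idea on this slide (chained methods, recorded parameters, version numbers) can be illustrated with a minimal Python sketch. This is not GenePattern code: the `Step` and `Pipeline` classes, the `threshold` and `scale` functions, and the rule that every added step bumps the version are all hypothetical, chosen only to show the concept.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One chained method plus the exact parameters it runs with."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Hypothetical pipeline record; not the GenePattern implementation."""
    name: str
    version: int = 1
    steps: List[Step] = field(default_factory=list)

    def add(self, name: str, func: Callable[..., Any], **params: Any) -> "Pipeline":
        # Assumption: any change to the recipe produces a new version number.
        self.steps.append(Step(name, func, params))
        self.version += 1
        return self

    def run(self, data: Any) -> Any:
        # Chaining: the output of each step is the input of the next.
        for step in self.steps:
            data = step.func(data, **step.params)
        return data

    def describe(self) -> Dict[str, Any]:
        # The record someone else would need to repeat the analysis.
        return {
            "name": self.name,
            "version": self.version,
            "steps": [{"method": s.name, "params": dict(s.params)} for s in self.steps],
        }


def threshold(values, floor):
    """Raise every value below `floor` up to `floor`."""
    return [max(v, floor) for v in values]


def scale(values, factor):
    """Multiply every value by `factor`."""
    return [v * factor for v in values]


pipeline = (
    Pipeline("demo")
    .add("threshold", threshold, floor=20)
    .add("scale", scale, factor=2)
)
result = pipeline.run([5, 30, 100])
record = pipeline.describe()
```

Sharing `record` together with the input data is what would let a second researcher rerun the same steps with the same parameters.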
3. Programming environment • Libraries allow transparent access to GenePattern modules from R, Matlab and Java • Language independent mechanism to add new tools to the module repository • Tools can be your own or public (e.g. from Bioconductor)
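The slides do not say how the language-independent mechanism works. One common way to get that property, assumed here purely for illustration, is to register each tool as a command-line template so that a tool written in any language can be invoked the same way. The `register` and `build_command` helpers, the `$name` placeholder syntax and the `transpose.R` script are hypothetical and are not part of any GenePattern API.

```python
import shlex
from string import Template

# Hypothetical in-memory repository: module name -> command-line template.
REGISTRY = {}


def register(name, command_template):
    """Add a tool to the repository under the given module name."""
    REGISTRY[name] = command_template


def build_command(name, **params):
    """Return the argument list for a registered module.

    The template is split into tokens first and placeholders are filled in
    per token, so a parameter value containing spaces stays one argument.
    """
    tokens = shlex.split(REGISTRY[name])
    return [Template(token).substitute(params) for token in tokens]


# The tool could be written in R, Java, Perl, or anything else that runs
# from a command line; the caller only sees the module name and parameters.
register("transpose_r", "Rscript transpose.R $input $output")
command = build_command("transpose_r", input="my data.gct", output="out.gct")
```

The resulting list could be handed to `subprocess.run`; the sketch stops before executing anything because the script it names does not exist.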
Functional Architecture (figure taken from Reich et al., Nature Genetics 2006)
Components • The GenePattern server • The Java Client • The Web Client
Software Architecture (figure from Reich et al., Nature Genetics 2006)
GenePattern • Current version • Release: 2.0.1, release date 3/2/2006 • OS compatibility: • Windows: XP, 2000, 2003 • Mac: OS X 10.3.9 or later • Unix: Linux, Solaris, Tru64 • Hardware requirements: • 256MB RAM • 500MB disk space
Demonstration http://www.broad.mit.edu/cancer/software/genepattern/
Gene Expression Analysis • Four broad categories • Differential analysis/Marker selection • Prediction • Class discovery • Pathway analysis • Data Formats • Annotations
Proteomics • SELDI, MALDI and LC-MS in mzXML format • Quality assessment • Peak detection • Spectra comparison • Proteomic analysis pipeline • Data conversion
SNP analysis • In alpha testing • Uses high-density SNP microarray data • Copy number alterations • Loss of heterozygosity (LOH) detection
Data preprocessing and conversion • Importing, exporting and file conversion • Normalization, filtering and imputing • ID conversion and annotation • Row and column extraction, transpose, reorder and split data
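A few of the preprocessing operations listed above (filtering, normalization, transpose) can be sketched in plain Python on a small expression matrix with genes as rows and samples as columns. The function names, the fold-change filter rule and the row-wise standardization are illustrative assumptions, not the behavior of specific GenePattern modules.

```python
def filter_rows(matrix, min_fold):
    """Keep rows whose max/min ratio is at least `min_fold` (a variation filter)."""
    return [
        row for row in matrix
        if min(row) > 0 and max(row) / min(row) >= min_fold
    ]


def normalize_rows(matrix):
    """Center each row on mean 0 and scale it to standard deviation 1."""
    normalized = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = (sum((x - mean) ** 2 for x in row) / len(row)) ** 0.5
        # A constant row has sd 0; map it to zeros instead of dividing by 0.
        normalized.append([(x - mean) / sd if sd else 0.0 for x in row])
    return normalized


def transpose(matrix):
    """Swap rows and columns (genes x samples becomes samples x genes)."""
    return [list(column) for column in zip(*matrix)]


expression = [
    [10, 20, 30],     # varies 3-fold across samples
    [100, 100, 100],  # flat, removed by the filter
    [5, 50, 500],     # varies 100-fold across samples
]
filtered = filter_rows(expression, min_fold=3)
normalized = normalize_rows(filtered)
by_sample = transpose(filtered)
```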
Comparison of Selected Microarray Analysis Software Platforms (table from Reich et al., Nature Genetics 2006)
Summary • Has a few minor problems • Is it something MIBLab can use? • Who is the user? • What is it missing? Missing pieces should be easy to add
Sources • Gould J, Getz G, Monti S, Reich M, Mesirov JP. Comparative Gene Marker Selection suite. Bioinformatics. 2006 May 18; • Liefeld T, Reich M, Gould J, Zhang P, Tamayo P, Mesirov JP. GeneCruiser: a web service for the annotation of microarray data. Bioinformatics. 2005 Sep 15;21(18):3681-2. • Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature Genetics. 2006 May;38(5):500-1.