Expression Data Integration
Microarray Gene Expression Database Meeting
Sunday 14th November 1999
Key Topics
• Incyte’s experience with expression databases
• The need for integrated data management and analysis
• A technology-independent exchange format for expression data
Technologies for Genome-Wide Analysis
• Gene: Genome
• Transcript: Transcriptome
• Protein: Proteome
Key Components of Expression Databases
Incyte has the key components:
• Toxicology; Drug profiles; Pathway analysis; Disease tissues; Normal tissues
• Data Management: >1500 CPU PC-Farm; >75 Terabytes of capacity; Software
• Genes: 4,645,958 Sequences; 109,938 human Genes (5’-3’ confirmed)
• GEM™ cDNA Microarrays: 10,000 genes per GEM; >100,000 genes on GEMs; 19 different GEMs in total
• Proteomics: Proteomics Databases; HTP technology with OGS; Matched RNA/Protein Exp.
Incyte’s Expression Databases Support Target Discovery and Lead Optimization
• Protein; RNA (shown for both phases)
• Target Discovery: Target Idn.; Target Seln. “Accelerating Selection of High Quality Targets”
• Lead Discovery and Optimization: Screen Dev.; Primary Screening; Secondary Screening; Lead Optn.; Make-Test Cycle. “Accelerating Compound Selection and Decreasing the Attrition Rate”
Gene Expression Databases Require Integration
• Analytical Tools
• Integrated DB
• Non-Proprietary Data
• Proprietary Data
Current Players
• Incyte
• Affymetrix
• NEN
• Clontech
Emerging Players
• Motorola
• Hewlett-Packard
• Perkin-Elmer
• Amersham
• Corning
• Roche
The rate of change in microarray technology will accelerate as major players enter the marketplace.
Microarray Data Management and Analysis
• Technology-independent: can store and analyze data from any microarray technology (single or dual channel)
• Provides tools to allow users to load their own microarray data into the database
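As an illustration of what "technology-independent" storage could look like, here is a minimal Python sketch of a record that holds either a single-channel or a dual-channel reading. The class name `Measurement` and its fields are assumptions made for this example; they are not taken from Incyte's software.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """One spot or probe reading (hypothetical, technology-neutral record)."""
    element_id: str                    # clone, gene, or probe identifier
    signal: float                      # channel 1 intensity, or the only channel
    reference: Optional[float] = None  # channel 2 intensity; None if single channel

    @property
    def is_dual_channel(self) -> bool:
        return self.reference is not None

    def log2_ratio(self) -> Optional[float]:
        """Return log2(signal / reference) for dual-channel data, else None."""
        if self.reference is None or self.signal <= 0 or self.reference <= 0:
            return None
        return math.log2(self.signal / self.reference)


# Illustrative values only, not real data.
single = Measurement("probe_A", 1200.0)
dual = Measurement("clone_B", 800.0, 200.0)
```

Keeping the second channel optional lets one table or file layout serve both kinds of platform, at the cost of a null check wherever a ratio is needed.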
Data Management
• Clinical and experiment information
• Sample preparation
• Hybridization conditions
• Genes/clones/sequences
• Expression values
• Summarization/Normalization Methods
• Microarray Design
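The categories listed above map naturally onto a small relational schema. The following is only a sketch using Python's built-in `sqlite3` module; every table and column name is an assumption chosen for illustration and does not describe the actual database.

```python
import sqlite3

# Hypothetical schema: one table per category of managed data.
SCHEMA = """
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    clinical_info TEXT,          -- clinical and experiment information
    preparation TEXT             -- sample preparation
);
CREATE TABLE array_design (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL           -- microarray design
);
CREATE TABLE element (
    id INTEGER PRIMARY KEY,
    design_id INTEGER REFERENCES array_design(id),
    gene_name TEXT,              -- gene / clone / sequence identity
    sequence_id TEXT
);
CREATE TABLE hybridization (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER REFERENCES sample(id),
    design_id INTEGER REFERENCES array_design(id),
    conditions TEXT              -- hybridization conditions
);
CREATE TABLE expression_value (
    hybridization_id INTEGER REFERENCES hybridization(id),
    element_id INTEGER REFERENCES element(id),
    value REAL,
    normalization_method TEXT    -- summarization / normalization method
);
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def table_names(conn: sqlite3.Connection) -> set:
    """Return the names of all tables in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


conn = create_database()
```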
Analytical Tools
• Query on most database attributes
• Average hybridizations; Composite hybridizations
• Data visualization
• Clustering
• Sequence analysis
• User-defined gene groups
• Data export; Spotfire™ integration
• Links to Incyte and PD databases
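Two of the tools above, averaging hybridizations and user-defined gene groups, can be sketched in a few lines of Python. The function names and the dictionary-based data layout are assumptions made for this example and say nothing about how the actual tools are built.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List


def average_hybridizations(hybs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each gene's value over the hybridizations that measured it."""
    values = defaultdict(list)
    for hyb in hybs:
        for gene, value in hyb.items():
            values[gene].append(value)
    return {gene: mean(vals) for gene, vals in values.items()}


def select_group(profile: Dict[str, float], group: Iterable[str]) -> Dict[str, float]:
    """Restrict a profile to a user-defined gene group, skipping absent genes."""
    return {gene: profile[gene] for gene in group if gene in profile}


# Illustrative values only, not real data.
replicates = [{"g1": 1.0, "g2": 4.0}, {"g1": 3.0}]
averaged = average_hybridizations(replicates)
subset = select_group(averaged, ["g1", "g9"])
```

Genes missing from some hybridizations are averaged over the replicates that do report them; a stricter tool might instead require a value in every replicate.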
Take-home message: Central to the successful creation of an expression community will be the ratification of a common data exchange protocol and format.
LifeArray™ Data Import
• Requires no knowledge of database structure
• Minimizes the need for end users to change software when schema changes are required
• No knowledge of user-specific systems required
Data flow: Raw Data File → PMD Driver → PMD File → Database Loader → RDBMS
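The driver/loader split on this slide can be sketched as two independent steps: a driver that understands one raw file layout and emits a neutral intermediate file, and a loader that understands only the intermediate file and the database. The PMD format itself is not described here, so the sketch below substitutes a hypothetical tab-delimited layout, and the raw column names (`Spot`, `Cy3`, `Cy5`) are likewise invented for illustration.

```python
import csv
import io
import sqlite3


def vendor_driver(raw_text: str) -> str:
    """Hypothetical driver: convert a raw 'Spot,Cy3,Cy5' CSV file into
    neutral tab-delimited rows of (element_id, channel, value)."""
    reader = csv.DictReader(io.StringIO(raw_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["element_id", "channel", "value"])
    for row in reader:
        writer.writerow([row["Spot"], "1", row["Cy3"]])
        writer.writerow([row["Spot"], "2", row["Cy5"]])
    return out.getvalue()


def load_neutral(neutral_text: str, conn: sqlite3.Connection) -> int:
    """Loader: knows the neutral layout and the schema, not the raw format."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expression_value "
        "(element_id TEXT, channel INTEGER, value REAL)"
    )
    reader = csv.DictReader(io.StringIO(neutral_text), delimiter="\t")
    rows = [(r["element_id"], int(r["channel"]), float(r["value"])) for r in reader]
    conn.executemany("INSERT INTO expression_value VALUES (?, ?, ?)", rows)
    return len(rows)


# Illustrative values only, not real data.
raw = "Spot,Cy3,Cy5\nclone_1,800,200\nclone_2,150,300\n"
conn = sqlite3.connect(":memory:")
loaded = load_neutral(vendor_driver(raw), conn)
```

With this separation, a schema change touches only the loader and a new instrument format needs only a new driver, which is the benefit the bullets above describe.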
Combining the Power of PMD with the Extensibility of XML
Why XML?
• XML: Extensible Markup Language
• W3C Standard: V1.0, Feb 1998
• Powerful language for defining custom markup languages
• Well suited for PMD content
[Diagram labels: DTD, DB]
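To make the idea of an XML exchange document concrete, the sketch below writes and reads a small expression data file with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`expression_data`, `measurement`, `array`, `element`) are hypothetical and not drawn from any published DTD. Note also that `ElementTree` only checks well-formedness; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def to_xml(array_name: str, values: Dict[str, float]) -> str:
    """Serialize expression values to a small XML document (hypothetical tags)."""
    root = ET.Element("expression_data", {"array": array_name})
    for element_id, value in values.items():
        node = ET.SubElement(root, "measurement", {"element": element_id})
        node.text = repr(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Dict[str, float]:
    """Parse the document back into a mapping of element id to value."""
    root = ET.fromstring(text)
    return {m.get("element"): float(m.text) for m in root.iter("measurement")}


# Illustrative values only, not real data.
doc = to_xml("example_array", {"clone_1": 2.0, "clone_2": -1.0})
round_trip = from_xml(doc)
```

Because the tags are self-describing, a receiving system can ignore elements it does not recognize, which is part of what makes XML attractive as an extensible exchange format.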