This presentation discusses the design and considerations for local research microarray databases. It explores the differences between public repositories and local databases, the components of MIAME standards, data analysis and variability, and quality views to identify variability sources. Examples and validation methods are also provided.
MGED IV Meeting
Considerations in the Design of Local Research Microarray Databases
Jason Gonçalves
Slides available at http://www.iobion.com/slides/
Public Repositories vs. Local Microarray Databases
Aims - Public Microarray Repositories:
• Exchange of published (complete) microarray datasets
• Facilitate meta-analysis of previously published microarray datasets
Aims - Local Microarray Databases:
• Storage of all microarray data (images, etc.)
• Facilitate analysis of microarray data
• Enable validation and debugging of local microarrays, reagents and equipment
Sample Annotation – MIAME++ Detailed description of replication: At what level is this a replicate?
Data Analysis and Variability
Evolving Microarray Analysis
• Develop statistical methods for microarray data analysis
• Study and understand the multiple sources of variability
• Develop methods to reduce variability and develop experimental designs with the sources of variability in mind
“Replication does not ensure duplication of results, of course this is not obvious when replication is not used”
Quality Views To Find Variability Sources
[Figure panels: Time Points, Time Course, Hyb Dates, Print Batch ID, Slide Batch ID]
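One way to picture a quality view is as a per-hybridization quality metric summarized by a single annotation field. The sketch below is only an illustration under assumptions: the record layout, the field names (`hyb_date`, `print_batch_id`), the `replicate_sd` metric and the `quality_view` helper are all hypothetical and are not taken from the slides or from any particular database product.

```python
from collections import defaultdict
from statistics import mean


def quality_view(hybridizations, field, metric="replicate_sd"):
    """Average a per-hybridization quality metric for each value of one annotation field.

    hybridizations: list of dicts, one per hybridization (hypothetical layout).
    field: annotation key to group by, e.g. "print_batch_id" or "hyb_date".
    """
    buckets = defaultdict(list)
    for hyb in hybridizations:
        buckets[hyb[field]].append(hyb[metric])
    return {value: mean(scores) for value, scores in buckets.items()}


# Hypothetical annotation records, for illustration only.
example_hybs = [
    {"hyb_date": "day-1", "print_batch_id": "P1", "replicate_sd": 0.25},
    {"hyb_date": "day-2", "print_batch_id": "P1", "replicate_sd": 0.75},
    {"hyb_date": "day-2", "print_batch_id": "P2", "replicate_sd": 1.0},
]

for field in ("hyb_date", "print_batch_id"):
    print(field, quality_view(example_hybs, field))
```

Running the same summary over each annotation field in turn shows whether the metric shifts with a hybridization date or a print batch. This is only possible when the local database stores those annotations next to the data.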
Dissecting Spatial Effects
[Figure panels: Background - Simulated Original Image Data; Ratio View - Simulated Quantified Microarray Data]
Using Replicate Statistics within a Hybridization Group to Identify Variable Genes
Note: SD of the Hybridization Group Standard Deviations
• Z-score values identify genes that are highly variable relative to the other genes in the group, while individual variance values report the absolute levels
• Identify genes with extreme standard deviations: z-score > 3
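The z-score idea above can be sketched in a few lines. This is a minimal illustration, not the implementation behind the slides. It assumes each gene has a list of replicate log-ratios within one hybridization group, that the per-gene SD is the sample standard deviation, and that the z-score is taken against the mean and SD of all per-gene SDs in the group. The function names and the example data are hypothetical.

```python
from statistics import mean, pstdev, stdev


def sd_z_scores(replicates):
    """Z-score of each gene's replicate SD relative to all gene SDs in the group.

    replicates: dict mapping gene id -> list of replicate values (at least two
    per gene) measured within one hybridization group.
    """
    sds = {gene: stdev(values) for gene, values in replicates.items()}
    sd_values = list(sds.values())
    center = mean(sd_values)
    spread = pstdev(sd_values)  # the "SD of the standard deviations"
    if spread == 0:
        # Every gene is equally variable, so no gene stands out.
        return {gene: 0.0 for gene in sds}
    return {gene: (sd - center) / spread for gene, sd in sds.items()}


def highly_variable(replicates, threshold=3.0):
    """Genes whose SD z-score exceeds the threshold (3 on the slide)."""
    scores = sd_z_scores(replicates)
    return sorted(gene for gene, z in scores.items() if z > threshold)


# Hypothetical group: 20 well-behaved genes and one noisy gene.
group = {f"gene_{i}": [0.0, 0.1, -0.1] for i in range(20)}
group["noisy_gene"] = [2.0, -2.0, 0.0]
print(highly_variable(group))
```

Because the z-score is relative, it depends on how many genes are in the group. With a population SD, no gene can exceed z = 3 unless the group has more than ten genes, so the threshold only makes sense on array-sized data.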
Clustering SD Z-Scores to Identify Random and Systematic Sources of Variability
Gene with Z-Score > 3
• Calculated z-scores for all genes and hybridization groups
• Filtered data set to include genes with at least one observed z-score > 3
• Random Noise:
  • Dust spots
  • Incorrect patch placement
  • Ghost spots
• Systematic Variability:
  • Using poor gene annotation (e.g. using Unigene vs. Clone ID)
  • Microarray production problems
  • PCR failure, double bands, end of print plate errors
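The filtering step above can be sketched as follows, assuming the SD z-scores are already available as a nested mapping from hybridization group to gene. The `flagged_groups` helper and the example values are hypothetical. The final count of flagged groups per gene is a crude stand-in added for illustration. It is not the clustering used in the presentation, and reading a single-group hit as random noise and a many-group hit as systematic is an assumption, not a rule stated on the slide.

```python
def flagged_groups(z_by_group, threshold=3.0):
    """Keep genes with at least one SD z-score above the threshold.

    z_by_group: dict mapping hybridization group id -> {gene id: z-score}.
    Returns a dict mapping gene id -> sorted list of groups where it was flagged.
    """
    hits = {}
    for group, scores in z_by_group.items():
        for gene, z in scores.items():
            if z > threshold:
                hits.setdefault(gene, []).append(group)
    return {gene: sorted(groups) for gene, groups in hits.items()}


# Hypothetical z-scores for three hybridization groups.
example = {
    "hyb_group_1": {"clone_A": 3.5, "clone_B": 0.1, "clone_C": 1.0},
    "hyb_group_2": {"clone_A": 4.0, "clone_B": 3.2, "clone_C": 0.4},
    "hyb_group_3": {"clone_A": 3.1, "clone_B": 0.0, "clone_C": 0.9},
}

for gene, groups in flagged_groups(example).items():
    # A gene flagged in one group may be a one-off artifact; a gene flagged in
    # most groups is a candidate for a systematic problem (assumption).
    print(gene, len(groups), groups)
```

In this made-up example `clone_A` exceeds the threshold in every group, `clone_B` in only one, and `clone_C` is filtered out. Clustering the full z-score matrix, as on the slide, would show the same pattern in a more robust way.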
Systematic Production Problem Example
[Figure labels: Unique PCR Amplifications; Unique Hybridizations]
Multiple Myeloma Cell Lines
[Figure labels: Interesting Genes: p53, x-actin, ras, etc.; Other Genes (49 clones); Lambda Ig light chain (62 clones); Kappa Ig light chain (74 clones); Other Genes (12 clones)]
• Validate with RT-PCR
• All interesting genes are artifacts
Iobion - GeneTraffic www.iobion.com
Acknowledgements
Iobion Informatics LLC.
• Daniel Iordan
• Harry Liu
• WL Marks
• William Roboly
• Faye Barron
• Bogdan Georgescu
University Health Network, University of Toronto
• Mark Takahashi
• Neil Winegarden
• James R. Woodgett
• All UHN Microarray Center Staff