ArrayTrack: Data management, analysis and interpretation tool for DNA microarray and beyond
ArrayTrack – A brief history of the 5-year development cycle
• AT version 1 (2001)
  • Filter arrays; data management tool
• AT version 2 (2002): in-house microarray core facility
  • Customized two-color arrays; data management, analysis and interpretation
  • Opened to the public (late 2003)
• AT version 3.1 (2004): VGDS
  • Affymetrix; analysis capability enhanced
• AT version 3.2 (2005): MAQC
  • Tested on 7 commercial platforms (Affy, Agilent one- and two-color arrays, ABI, CodeLink, Illumina …)
  • Integrated with other software (IPA, MetaCore, DrugMatrix, CEBS, SAS/JMP …)
• AT version 4 (2006 – present)
  • CDISC/SEND standard
  • VGDS VXDS
ArrayTrack: Client-Server Architecture
[Diagram labels]
• CLIENT
• Analysis Tools
• Study data (clinical and non-clinical data)
• Microarray, Proteomics, Metabolomics
• Public data (gene annotation, pathways …)
• SERVER
• CDISC/SEND, MIAME
• NCBI, KEGG, GO …
ArrayTrack: An Integrated Solution
[Diagram: data types integrated in ArrayTrack]
• Microarray data
• Proteomics data
• Metabolomics data
• Public data
• Clinical and non-clinical data
• Chemical data
ArrayTrack Website http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack/
ArrayTrack: MicroarrayDB-LIB-TOOL, an integrated environment for microarray data management, analysis and interpretation
[Diagram: Microarray DB, LIB and TOOL components]
• Uploading
• Exploring
• Gene selection
• Interpretation
  • Pathways
  • GO
ArrayTrack for Microarray Data Management and Analysis
• Experiment stages: hypothesis; experimental design; microarray experiment; data management; data analysis; data interpretation
• ArrayTrack components: Microarray DB, GeneTools, GeneLib
MicroarrayDB – Storing data associated with a microarray experiment
Microarray database:
• Handles both one- and two-channel data, including Affy data
• Only the CEL file is required for Affy data
• Supports toxicogenomics research by storing tox parameters, e.g., dose schedule and treatment, sacrifice time
• Supports MIAME to capture the key data of a microarray experiment
• Will be MAGE-ML compliant to ensure interchangeability between ArrayTrack and other public databases
LIB Component – Containing functional information for microarray data interpretation
[Diagram labels: Human Genome Project; Public Databases; Mirrored Databases]
Functional data:
• Individual gene analysis
• Pathway-based analysis
• Gene Ontology-based analysis
• Linking expression data to the traditional toxicological data
TOOL Component – Containing functionality for microarray data analysis
Analysis tools:
• Four normalization methods
  • Mean/median scaling for Affy data
  • LOWESS for two-color arrays
• Gene selection methods
  • T-test, permutation t-test, …
  • Filtering using fold changes, intensity, flag information …
  • Volcano plot, p-value plot …
• Data exploring (e.g., HCA, PCA)
• Many visualization tools (e.g., flexible scatter plot, bar chart viewer, …)
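The median scaling listed above can be illustrated with a minimal sketch. It is not ArrayTrack's implementation, which the slides do not describe; the function name `median_scale`, the target value of 1000 and the intensities are assumptions made up for illustration.

```python
from statistics import median


def median_scale(arrays, target=1000.0):
    """Rescale each array so that its median intensity equals `target`.

    `arrays` maps an array name to a list of raw intensities.
    The target of 1000 is an arbitrary, assumed value.
    """
    scaled = {}
    for name, intensities in arrays.items():
        factor = target / median(intensities)
        scaled[name] = [value * factor for value in intensities]
    return scaled


# Hypothetical raw intensities for two hybridizations (made-up numbers).
raw = {
    "array_A": [200.0, 400.0, 800.0],
    "array_B": [100.0, 200.0, 400.0],
}
normalized = median_scale(raw)
```

After scaling, both arrays share the same median, so intensities can be compared across hybridizations. Mean scaling follows the same pattern with `statistics.mean` in place of `median`.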
Supporting Eight Platforms
[Workflow diagram: importing data; normalization; gene selection; data exploring; interpretation]
• Affy, Agilent, ABI, Combimatrix, Eppendorf, GE Healthcare, Illumina and customized arrays
• Affy data
  • Probe data (.cel file)
  • Probe-set data
• Individual hyb import
• Batch import
[Workflow diagram: importing data; normalization; gene selection; data exploring; interpretation]
• Data uploading and QC
• Four normalization methods, including LOWESS
• Data exploring: PCA, scatter plot, 2-way HCA, expression pattern using the bar chart plot
• Interpretation: individual gene analysis, pathway analysis, Gene Ontology analysis
Significant genes can be identified based on:
• Cut-off of p-value (with or without Bonferroni correction), fold-change, intensity or combinations thereof
• Volcano plot (considering both p and fold-change)
• P-value plot (considering false positives/negatives)
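The cut-off criteria above can be sketched as a small filter. This is a hedged illustration only: the function names, the default thresholds (alpha of 0.05, two-fold change) and the per-gene numbers are assumptions, and the p-values are taken as given rather than computed.

```python
import math


def select_genes(results, alpha=0.05, min_fold=2.0, bonferroni=True):
    """Return genes that pass both the p-value and the fold-change cut-off.

    `results` maps a gene name to (p_value, fold_change), where
    fold_change is the treated/control ratio. With `bonferroni=True`
    the p-value threshold is divided by the number of genes tested.
    """
    threshold = alpha / len(results) if bonferroni else alpha
    min_log2_fold = math.log2(min_fold)
    selected = []
    for gene, (p_value, fold_change) in results.items():
        # Use |log2 fold change| so up- and down-regulation are treated alike.
        if p_value <= threshold and abs(math.log2(fold_change)) >= min_log2_fold:
            selected.append(gene)
    return selected


def volcano_point(p_value, fold_change):
    """Coordinates of one gene on a volcano plot: (log2 fold change, -log10 p)."""
    return math.log2(fold_change), -math.log10(p_value)


# Hypothetical per-gene results (made-up numbers).
results = {
    "gene_1": (0.0001, 4.0),
    "gene_2": (0.02, 3.0),
    "gene_3": (0.001, 1.2),
    "gene_4": (0.004, 0.25),
}
strict = select_genes(results, bonferroni=True)
lenient = select_genes(results, bonferroni=False)
```

With four genes the Bonferroni threshold becomes 0.05 / 4 = 0.0125, so `gene_2` (p = 0.02) passes only the uncorrected filter, while `gene_3` fails the fold-change cut-off in both cases.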
Data Interpretation – GO-based analysis using GOFFA
• GOFFA – Gene Ontology For Functional Analysis
  • Developed based on the Gene Ontology (GO) database
  • Important for grouping genes into functional classes
• GO – Three ontologies
  • Molecular function: activities performed by individual gene products at the molecular level, such as catalytic activity, transporter activity, binding
  • Biological process: broad biological goals accomplished by ordered assemblies of molecular functions, such as cell growth, signal transduction, metabolism
  • Cellular component: the place in the cell where a gene product is found, such as nucleus, ribosome, proteasome
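Grouping genes into functional classes, as described above, can be sketched with a plain dictionary lookup. This is not GOFFA's algorithm, which the slides do not specify; the gene names and annotations are hypothetical, and only the GO term names come from the slide.

```python
from collections import defaultdict

# Hypothetical annotations: gene -> list of (ontology, GO term) pairs.
ANNOTATIONS = {
    "gene_1": [("molecular function", "catalytic activity"),
               ("cellular component", "nucleus")],
    "gene_2": [("biological process", "signal transduction")],
    "gene_3": [("molecular function", "catalytic activity"),
               ("biological process", "signal transduction")],
}


def group_by_go_term(genes, annotations):
    """Group a gene list by (ontology, term); unannotated genes are skipped."""
    groups = defaultdict(set)
    for gene in genes:
        for ontology, term in annotations.get(gene, []):
            groups[(ontology, term)].add(gene)
    return dict(groups)


# Group a hypothetical list of selected genes; "gene_9" has no annotation.
groups = group_by_go_term(["gene_1", "gene_3", "gene_9"], ANNOTATIONS)
```

Terms shared by several selected genes, here catalytic activity, are the functional classes a reviewer would look at first.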
[Diagram: Study domain (Study DB, TOOL) and Array domain (Microarray DB, TOOL, LIB)]
Data Interpretation
[Workflow diagram: importing data; normalization; gene selection; data exploring; interpretation]
• Pathway-based tools:
  • Ingenuity Pathways Analysis
  • KEGG
  • PathArt
• GOFFA: Gene Ontology-based tool
• Gene annotation
Ingenuity Pathways Analysis (IPA)
• KEGG and PathArt provide canonical pathways
• IPA provides both canonical and de novo pathways
• Conduct statistical analysis
• Interrogate genes or proteins on an "omics" scale
• Elucidate functional pathways
• Understand markers of efficacy and safety
Review Tool for Pharmacogenomics Data Submission: ArrayTrack
• Receive the data; support future regulatory policy
• Analyze the data
• Verify the biological interpretation
ArrayTrack components:
• Microarray DB: data repository
• Tool: analysis
• Lib: interpretation
Future Direction – Toxicoinformatics Integrated System (TIS)
[Diagram labels]
• Databases: Microarray DB, Proteomics DB, Metabonomics DB
• Libraries: GeneLib, ProteinLib, PathwayLib, ToxicantLib
• Tools: GeneTools, ProteinTools, PathwayTools
ArrayTrack – Summary
• An integrated solution for microarray data management, analysis and interpretation
• Review tool for FDA pharmacogenomics data submission
  • A training course is provided to FDA reviewers every two months
  • At present, ~40 reviewers have been trained
• Freely available to the public (http://edkb.fda.gov/webstart/arraytrack)
  • Users at big pharma, academic and government institutions; U.S., Europe and Asia