ArrayExpress - a Public Repository for Microarray Based Gene Expression Data
European Bioinformatics Institute - EMBL outstation and German Cancer Research Centre
The decision to establish a public repository for array-based gene expression data
• Responding to the needs of EBI industry partners, EBI became involved in gene expression data analysis in 1997
• As a result of this work, the need for a repository became apparent about 18 months ago
• After consulting many of the major microarray laboratories worldwide, the decision was made to start designing the database (Nature 398, p. 646, March 1999)
ArrayExpress database
• A public repository for array-based gene expression data
• Draft conceptual design of the repository, modelled in Rational Rose
• Developed in close collaboration with DKFZ and the Sanger Centre
• Discussions with other potential data submitters worldwide
Expression Profiler
• Internet-based tool for gene expression data analysis (www.ebi.ac.uk/arrayexpress), with a database of yeast expression profiles for about 100 experimental conditions from Stanford and MIT
• Option for uploading users' own data
• Available online (the first online tool of its kind, as far as we know)
• Still under development; will be used as an “attractor” for data submissions
Motivation of the design
• When representing data from physical experiments, we have to decide how much pre-processing to apply before storing it
• Two perspectives
  • high-level data analysis
  • not losing essential information
• One number per spot vs. image, gene-centric vs. spot-centric
Conservative compromise
• Raw data captured
• Information structured so that high-level views can be easily precomputed
• ArrayExpress
  • image analysis output for each spot
  • “divide and conquer” approach in annotations
  • reusability of information already in the database
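As an illustration of how a high-level, gene-centric view might be precomputed from stored spot-level data, here is a minimal sketch. The record layout, the sample values, and the choice of the median as the summary are all assumptions made for this example; the slides do not specify how ArrayExpress derives such views.

```python
from collections import defaultdict
from statistics import median

# Hypothetical spot-level records: (spot id, linked gene or None, quantified intensity).
spots = [
    ("s1", "YAL001C", 120.0),
    ("s2", "YAL001C", 100.0),
    ("s3", "YAL002W", 40.0),
    ("s4", None, 15.0),  # a spot that is not linked to any gene
]


def gene_centric_view(spot_records):
    """Collapse spot-level values into one number per gene (median, by assumption)."""
    by_gene = defaultdict(list)
    for _spot_id, gene_id, intensity in spot_records:
        if gene_id is not None:  # unlinked spots stay only in the spot-centric data
            by_gene[gene_id].append(intensity)
    return {gene: median(values) for gene, values in by_gene.items()}


view = gene_centric_view(spots)
```

Because the spot-level records are kept, a different summary (for example a mean, or a per-spot quality filter) could be recomputed later without going back to the submitter.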
Two types of submissions
• Experiment - a set of hybridisations
• Array description
“Divide and conquer” approach
• Experiment - a set of hybridisations
• A single hybridisation - array + sample
• Array description (grid + spots)
  • spots may be linked to genes
• Sample description
• Hybridisation analysis (scanning, quantitation, data)
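The decomposition above can be sketched as a small set of record types. This is only an illustrative model: every class and field name below is hypothetical and is not taken from the actual ArrayExpress schema. It shows how an array description, once submitted, could be reused across several hybridisations.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Spot:
    row: int
    column: int
    gene_id: Optional[str] = None  # spots may be linked to genes


@dataclass
class ArrayDescription:
    name: str
    grid_rows: int
    grid_columns: int
    spots: List[Spot] = field(default_factory=list)


@dataclass
class SampleDescription:
    organism: str
    treatment: str


@dataclass
class HybridisationAnalysis:
    scanner: str
    quantitation_software: str
    # Quantified value per spot, keyed by (row, column) on the grid.
    data: Dict[Tuple[int, int], float] = field(default_factory=dict)


@dataclass
class Hybridisation:
    array: ArrayDescription  # a single hybridisation is array + sample
    sample: SampleDescription
    analysis: HybridisationAnalysis


@dataclass
class Experiment:
    title: str
    hybridisations: List[Hybridisation] = field(default_factory=list)


# One array description shared by two hybridisations (made-up values).
array = ArrayDescription("demo-array", 1, 2, [Spot(0, 0, "geneA"), Spot(0, 1)])
experiment = Experiment("demo experiment", [
    Hybridisation(array, SampleDescription("yeast", "heat shock"),
                  HybridisationAnalysis("scanner-x", "software-y", {(0, 0): 1.5, (0, 1): 0.2})),
    Hybridisation(array, SampleDescription("yeast", "control"),
                  HybridisationAnalysis("scanner-x", "software-y", {(0, 0): 1.0, (0, 1): 0.3})),
])
```

Keeping the array description as a separate object means it is described once and referenced by each hybridisation, which matches the two submission types (experiment and array description) listed earlier.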
Top level structure of ArrayExpress database
[Diagram. Labels inside ArrayExpress: Experiment, Hybridisation, Array, Target, Analysis. External links: Publication (e.g., PubMedCentral), Source (e.g., Taxonomy), Gene (e.g., EMBL)]