
ArrayExpress

ArrayExpress. www.ebi.ac.uk/arrayexpress. Ugis Sarkans, EMBL-EBI.



  1. ArrayExpress
     www.ebi.ac.uk/arrayexpress
     Ugis Sarkans, EMBL-EBI

  2. Outline
     • why the domain model is not simple
     • ArrayExpress object model
     • ArrayExpress implementation status
     • future developments

  3. Underlying principles
     • must be able to accommodate the needs of a technology that is under constant development
     • must be able to manage data in the absence of standard measurement units and of standards for reliability information
     • gene expression data are meaningful only in the context of the experimental conditions
     • controlled vocabularies and ontologies are needed for unambiguous sample annotation
     • MIAME-compliant

  4. ArrayExpress - conceptual overview

  5. Simple version of AE object model - ArrayExpressBasic

  6. Motivation for 2 object models
     • many spots - one gene
     • raw data - cleaned-up data - ratios - normalizations - higher-level analysis
     • how detailed a sample description is needed?
     • for data mining we need ways to unify several datasets:
        • array features across different array platforms
        • samples from different experiments
        • various raw and derived measurements
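The "many spots - one gene" point on slide 6 can be illustrated with a small sketch. Everything here is hypothetical: the spot and gene names, the values, and the choice of the mean as the summary are assumptions made for illustration, not part of the ArrayExpress models.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotation: several spots on an array may report on one gene.
spot_to_gene = {"spotA1": "geneX", "spotA2": "geneX", "spotB1": "geneY"}

# Hypothetical per-spot measurements from a single hybridization.
spot_values = {"spotA1": 1.0, "spotA2": 3.0, "spotB1": 4.0}


def summarize_by_gene(values, annotation):
    """Group spot-level values by gene and average them (assumed summary)."""
    grouped = defaultdict(list)
    for spot, value in values.items():
        grouped[annotation[spot]].append(value)
    return {gene: mean(vals) for gene, vals in grouped.items()}


gene_values = summarize_by_gene(spot_values, spot_to_gene)
```

A gene-level view like this is one way datasets from different array platforms could be compared, since genes, unlike spot identifiers, are shared across platforms.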

  7. ArrayExpressComplete

  8. Scope of ArrayExpress object models
     • usable for a public repository as well as a laboratory database (e.g., as part of a LIMS)
     • implementation of “intermediate” models possible
     • mapping to RDBMS tables - not necessarily straightforward
     • models and documentation available at www.ebi.ac.uk/arrayexpress

  9. ArrayExpress - features
     • able to import MAML format
     • can deal with both raw and processed data
     • independence of:
        • experimental platforms
        • image analysis methods
        • data normalization methods
     • object model-based query mechanism
     • will support upcoming OMG standard for expression data

  10. Key constructs in the AE object model
     • structured sample descriptions
     • notion of ExpressionValueSet
     • several dimensions for ExpressionValues
     • Transformations working on ExpressionValueSets and their dimensions
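As a rough sketch of the constructs on slide 10, the code below models an ExpressionValueSet as values indexed along named dimensions, plus a transformation that derives a new set. The slides do not say which dimensions the model uses, so the three chosen here (feature, sample, value type), the field names, and the `add_ratio` function are all assumptions; only the name ExpressionValueSet and the green/red ratio example (slide 12) come from the presentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ExpressionValueSet:
    """Hypothetical container: values indexed along three assumed dimensions."""
    features: List[str]      # e.g. spots
    samples: List[str]       # e.g. hybridizations
    value_types: List[str]   # e.g. "green", "red"
    values: Dict[Tuple[str, str, str], float]

    def get(self, feature: str, sample: str, value_type: str) -> float:
        return self.values[(feature, sample, value_type)]


def add_ratio(evs: ExpressionValueSet) -> ExpressionValueSet:
    """Assumed transformation: extend the value-type dimension with green/red."""
    new_values = dict(evs.values)
    for f in evs.features:
        for s in evs.samples:
            new_values[(f, s, "ratio")] = evs.get(f, s, "green") / evs.get(f, s, "red")
    return ExpressionValueSet(
        evs.features, evs.samples, evs.value_types + ["ratio"], new_values
    )


raw = ExpressionValueSet(
    features=["spot1", "spot2"],
    samples=["hyb1"],
    value_types=["green", "red"],
    values={
        ("spot1", "hyb1", "green"): 200.0,
        ("spot1", "hyb1", "red"): 100.0,
        ("spot2", "hyb1", "green"): 50.0,
        ("spot2", "hyb1", "red"): 100.0,
    },
)
derived = add_ratio(raw)
```

The transformation returns a new set rather than modifying its input, so raw and derived data stay available side by side, in line with the repository handling both raw and processed data.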

  11. Structured representation of sample and treatment relations
     [Diagram. Node labels: Sample source; A new state of sample source; Primary sample 1; Primary sample 2; Derived sample 1; Derived sample 2; Extract 1; Extract 2; Labeled extract 1; Labeled extract 2; Hybridization. Edge labels: treatment; extraction; labeling.]
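The sample and treatment relations on slide 11 can be thought of as a graph in which each material records the step that produced it and the material it came from. The sketch below is a minimal, hypothetical rendering of that idea: the `Material` class, its fields, and the particular chain of steps are assumptions, since the transcript preserves the diagram's labels but not its connections.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Material:
    """Hypothetical node: a sample, extract, or labeled extract."""
    name: str
    produced_by: Optional[str] = None    # step name, e.g. "extraction"
    parent: Optional["Material"] = None  # material the step was applied to


def lineage(material: Optional[Material]) -> List[str]:
    """Walk parent links back to the source; return names from source onward."""
    names = []
    while material is not None:
        names.append(material.name)
        material = material.parent
    return list(reversed(names))


# Assumed chain; the labels come from slide 11, the connections do not.
source = Material("Sample source")
primary = Material("Primary sample 1", "treatment", source)
extract = Material("Extract 1", "extraction", primary)
labeled = Material("Labeled extract 1", "labeling", extract)
```

Calling `lineage(labeled)` traces a labeled extract back to its sample source, which is the kind of question a structured sample description makes answerable.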

  12. Microarray expression value representation
     [Diagram. Labels: expression value types; primary images; composite images; primary spots; composite spots; primary measurements; derived values; e.g., green/red ratios.]

  13. Current status
     • object model - stable, supports current MIAME
     • physical database schema
     • MAML data loader
     • populated with one dataset from EMBL
     • currently accessible through SQL

  14. In development
     • data loader - changes following MAML evolution
     • annotation & MAML export tool
     • Web interface to ArrayExpress
     • programmatic interface will follow

  15. Proposed architecture
     [Diagram. Component labels: application server; Web server; MAML data; ArrayExpress data warehouse; data submission & curation database; image server?; curation pipeline.]

  16. Future developments
     • will support upcoming OMG standard for gene expression data (XML, queries)
     • diagrammatic interface to sample description submodel
     • integration with other databases
     • analytical tools running on top of ArrayExpress
     • data curation pipeline development

  17. Acknowledgements
     • MGED - MIAME, MAML
     • Incyte - Genomic Knowledge Platform
     • OMG gene expression data proposal submitters - Rosetta & NetGenics
