
Introduction and Applications of Microarray Databases



Presentation Transcript


    1. Introduction and Applications of Microarray Databases
        Chen-hsiung Chan
        Department of Computer Science and Information Engineering
        National Taiwan University

    2. MIAME (Minimum Information About a Microarray Experiment)
        MIAME describes the Minimum Information About a Microarray Experiment that is needed to enable the interpretation of the results of the experiment unambiguously and potentially to reproduce the experiment. [Brazma et al., Nature Genetics]

    3. MIAME
        - Raw data (CEL or GPR files)
        - Final processed (normalized) data
        - Essential sample annotation, including experimental factors and their values
        - Experimental design, including sample data relationships
        - Sufficient annotation of the array
        - Essential laboratory and data processing protocols
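
The six MIAME components listed above can be treated as a submission checklist. The following is a minimal sketch of such a check; the key names and the dictionary layout are assumptions made for illustration and do not come from any repository's submission format.

```python
# Hypothetical key names, one per MIAME component listed on the slide.
MIAME_COMPONENTS = (
    "raw_data",             # e.g. CEL or GPR files
    "processed_data",       # final normalized data
    "sample_annotation",    # experimental factors and their values
    "experimental_design",  # sample data relationships
    "array_annotation",     # description of the array
    "protocols",            # laboratory and data processing protocols
)


def missing_components(submission):
    """Return the MIAME components that are absent or empty in a submission dict."""
    return [name for name in MIAME_COMPONENTS if not submission.get(name)]


# Illustrative submission that only provides the data files.
example_submission = {
    "raw_data": ["sample1.CEL", "sample2.CEL"],
    "processed_data": ["normalized.txt"],
}
print(missing_components(example_submission))
```

A submission would be considered complete under this sketch when the returned list is empty.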

    4. Databases using MIAME
        - ArrayExpress at EBI
        - GEO at NCBI
        - CIBEX at DDBJ

    5. ArrayExpress
        http://www.ebi.ac.uk/microarray-as/aer/
        - Stores transcriptomics and related data
        - Data warehouse stores gene-indexed expression profiles
        - In accordance with MGED recommendations: MIAME

    7. ArrayExpress statistics
        - Experiment repository: 2,914 experiments (each with at least 6 microarrays) and growing
        - Expression profiles: 267 experiments covering 121,891 genes
        - Data warehouse updated every day

    8. Searching ArrayExpress
        - Keywords: breast cancer, cell cycle, etc.
        - Accession numbers: E-XXXX-d, e.g. E-AFFY-1281, E-TIGR-372, etc.
        - Secondary accession numbers: GEO accession, e.g. GSE5389
        - Species names: mainly Latin names (e.g. Homo sapiens); common names may be used as well (e.g. human)
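
To make the accession formats above concrete, here is a small sketch that guesses whether a search term is an accession number or a free-text keyword. The regular expressions are assumptions inferred only from the examples on the slide (E-AFFY-1281, E-TIGR-372, GSE5389); the real repositories may accept other forms.

```python
import re

# Patterns inferred from the slide's examples only (an assumption).
ARRAYEXPRESS_RE = re.compile(r"^E-[A-Z]{4}-\d+$")  # E-XXXX-d
GEO_SERIES_RE = re.compile(r"^GSE\d+$")            # secondary (GEO) accession


def classify_query(term):
    """Classify a search term as an accession number or a keyword."""
    term = term.strip()
    if ARRAYEXPRESS_RE.match(term):
        return "arrayexpress_accession"
    if GEO_SERIES_RE.match(term):
        return "geo_series_accession"
    return "keyword"


for query in ["E-AFFY-1281", "GSE5389", "breast cancer", "Homo sapiens"]:
    print(query, "->", classify_query(query))
```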

    10. ArrayExpress interface

    12. ArrayExpress Search/Browse Result
        Keyword: lung cancer

    13. ArrayExpress Search/Browse Result
        Detailed view

    19. Expression Profile results
        - Thumbnail view
        - BigPlot view
        - Gene ranking (most differentially expressed experiments are top ranked)
        - Similarity search: search for genes with similar expression levels
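
The slide does not say which similarity measure ArrayExpress uses. As one possible illustration of a similarity search, the sketch below ranks genes by the Pearson correlation of their expression profiles with a query gene. The choice of Pearson correlation, the gene names, and the values are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile has no defined correlation
    return cov / sqrt(var_x * var_y)


def most_similar(query_gene, profiles, top_n=3):
    """Return the top_n genes most correlated with query_gene, best first."""
    query = profiles[query_gene]
    scores = [
        (gene, pearson(query, values))
        for gene, values in profiles.items()
        if gene != query_gene
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Made-up profiles: one value per sample.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
print(most_similar("geneA", profiles))
```

Correlation compares the shape of two profiles rather than their absolute levels, so geneB ranks first here even though its values are twice those of geneA.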

    23. Gene Expression Omnibus (GEO)
        http://www.ncbi.nlm.nih.gov/geo/
        - Gene expression/molecular abundance repository
        - MIAME compliant
        - Supports browsing, query and retrieval

    25. GEO record types
        - Platform
        - Sample
        - Series
        - DataSet
        - Profile

    26. GEO Platform
        - A Platform record defines the list of elements that may be detected and quantified in that experiment (e.g., cDNAs, oligonucleotide probesets)
        - Each Platform record is assigned a unique and stable GEO accession number (GPLxxx)
        - A Platform may reference many Samples that have been submitted by multiple submitters

    27. GEO Sample
        - A Sample record describes the conditions under which an individual Sample was handled, the manipulations it underwent, and the abundance measurement of each element derived from it
        - Each Sample record is assigned a unique and stable GEO accession number (GSMxxx)
        - A Sample entity must reference only one Platform and may be included in multiple Series

    29. GEO Series
        - A Series record links together a group of related Samples and provides a focal point and description of the whole study
        - Series records may also contain tables describing extracted data, summary conclusions, or analyses
        - Each Series record is assigned a unique and stable GEO accession number (GSExxx)
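
The relationships between Platform, Sample and Series records described above can be sketched as a small data model: a Sample references exactly one Platform, and the same Sample may appear in several Series. The class layout and the helper names are assumptions for illustration, not GEO's internal schema, and the accession numbers are placeholders that are not meant to refer to real records.

```python
from dataclasses import dataclass, field
from typing import List


def has_prefix(accession, prefix):
    """True if the accession is the prefix followed only by digits."""
    return accession.startswith(prefix) and accession[len(prefix):].isdigit()


@dataclass(frozen=True)
class Platform:
    accession: str  # GPLxxx

    def __post_init__(self):
        if not has_prefix(self.accession, "GPL"):
            raise ValueError(f"not a Platform accession: {self.accession}")


@dataclass(frozen=True)
class Sample:
    accession: str      # GSMxxx
    platform: Platform  # a Sample references exactly one Platform

    def __post_init__(self):
        if not has_prefix(self.accession, "GSM"):
            raise ValueError(f"not a Sample accession: {self.accession}")


@dataclass
class Series:
    accession: str  # GSExxx
    samples: List[Sample] = field(default_factory=list)

    def __post_init__(self):
        if not has_prefix(self.accession, "GSE"):
            raise ValueError(f"not a Series accession: {self.accession}")


def series_containing(sample, all_series):
    """List the accessions of every Series that includes the given Sample."""
    return [s.accession for s in all_series if sample in s.samples]


# Placeholder records: two Samples on one Platform, one Sample shared by two Series.
platform = Platform("GPL001")
sample_a = Sample("GSM001", platform)
sample_b = Sample("GSM002", platform)
series_1 = Series("GSE001", [sample_a, sample_b])
series_2 = Series("GSE002", [sample_a])
print(series_containing(sample_a, [series_1, series_2]))
```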

    31. GEO DataSet
        - Assembled at NCBI
        - Samples are all equivalently measured and normalized
        - Can be viewed and analyzed with NCBI's advanced data display and analysis tools

    33. GEO Profile
        - A Profile consists of the expression measurements for an individual gene across all Samples in a DataSet
        - Profiles can be searched using Entrez GEO Profiles
        - Similar to Expression Profile in ArrayExpress

    36. SOFT (Simple Omnibus Format in Text)
        - Text based
        - Line based
        - Easily parsed with text processing languages, including Perl, Python, Ruby, PHP, etc.
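
Because SOFT is line based, a few lines of Python are enough to pull out record identifiers and attributes. The sketch below relies on the line-prefix conventions from GEO's SOFT documentation rather than from the slide: `^` starts an entity, `!` introduces an attribute, `#` describes a table column, and other lines are tab-delimited table rows. The sample text and its accession numbers are made up, and the parser deliberately ignores data tables.

```python
def parse_soft(text):
    """Collect entities and their attributes from SOFT-formatted text.

    Returns a list of dicts with keys 'type', 'id' and 'attributes'.
    Column descriptions ('#') and table rows are skipped in this sketch.
    """
    entities = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # blank lines, table rows, table begin/end markers
        key, _, value = line[1:].partition("=")
        if line.startswith("^"):
            current = {"type": key.strip(), "id": value.strip(), "attributes": {}}
            entities.append(current)
        elif line.startswith("!") and current is not None:
            # An attribute may repeat, so values are kept in a list.
            current["attributes"].setdefault(key.strip(), []).append(value.strip())
    return entities


# Made-up SOFT text with placeholder accession numbers.
example_soft = (
    "^SAMPLE = GSM0000001\n"
    "!Sample_title = example sample\n"
    "!Sample_platform_id = GPL0000001\n"
    "#ID_REF = probe identifier\n"
    "#VALUE = normalized signal\n"
    "!sample_table_begin\n"
    "ID_REF\tVALUE\n"
    "probe_1\t7.5\n"
    "!sample_table_end\n"
)
records = parse_soft(example_soft)
print(records[0]["id"], records[0]["attributes"])
```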

    39. Network Biology Visualization and Analysis

    40. Cytoscape
        - Open source network visualization and analysis software
        - 'Core' features include network layout and query, and integration of visualizations with state data
        - Can be extended by plugins

    41. Cytoscape developers
        - University of California at San Diego (Trey Ideker)
        - Institute for Systems Biology (Leroy Hood)
        - Memorial Sloan-Kettering Cancer Center (Chris Sander)
        - Institut Pasteur (Benno Schwikowski)
        - Agilent Technologies (Annette Adler)
        - University of California at San Francisco (Bruce Conklin)

    42. Cytoscape
        - A Java application
        - Requires Java 5 or 6 (JDK 5/6 or JRE 5/6)

    44. Simple Interaction Format (SIF)
        - Each line denotes one interaction: InteractorA xx InteractorB
        - 'xx' is the interaction type:
            pp: protein-protein interaction
            pd: protein-DNA interaction (transcription factor/regulation)
            pr: protein-reaction
            rc: reaction-compound
            cr: compound-reaction
            gl: genetic-lethal
            pm: protein-metabolite
            mp: metabolite-protein
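
A SIF file in the form described above can be read with a few lines of Python. This is a minimal sketch, assuming whitespace-separated fields; it also accepts several targets on one line, which Cytoscape's SIF format allows but the slide does not mention. The node names are made up.

```python
def parse_sif(text):
    """Parse SIF text into a list of (source, interaction_type, target) edges."""
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line, or a node listed without any interaction
        source, interaction, *targets = fields
        for target in targets:
            edges.append((source, interaction, target))
    return edges


# Made-up network: one protein-DNA and two protein-protein interactions.
example_sif = "TF1 pd geneX\nproteinA pp proteinB proteinC\n"
for edge in parse_sif(example_sif):
    print(edge)
```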

    45. Other interaction formats supported
        - GML
        - XGMML
        - SBML
        - BioPAX
        - PSI-MI
        - Tab-delimited text tables and Excel

    47. Applications of Gene Expression
        - Gene selection (differentially expressed genes)
        - State annotation in networks (expression level)
        - Gene regulatory network identification
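
The first two applications can be illustrated together: select genes whose expression changes between two conditions, then label each selected gene with a state that could be attached to a network node. The sketch below uses a log2 fold-change threshold on group means, which is an assumption for illustration and not a method named in the slides; a real analysis would normally add a statistical test. Gene names and intensities are made up.

```python
from math import log2
from statistics import mean


def log2_fold_change(control, treated):
    """log2 ratio of mean treated to mean control intensity (values must be > 0)."""
    return log2(mean(treated) / mean(control))


def select_genes(expression, threshold=1.0):
    """Map each gene with |log2 fold change| >= threshold to 'up' or 'down'."""
    states = {}
    for gene, (control, treated) in expression.items():
        change = log2_fold_change(control, treated)
        if abs(change) >= threshold:
            states[gene] = "up" if change > 0 else "down"
    return states


# Made-up intensities: gene -> (control replicates, treated replicates).
expression = {
    "geneA": ([100.0, 120.0], [400.0, 480.0]),  # 4-fold increase
    "geneB": ([200.0, 200.0], [210.0, 190.0]),  # unchanged
    "geneC": ([800.0, 800.0], [100.0, 100.0]),  # 8-fold decrease
}
print(select_genes(expression))
```

The resulting gene-to-state mapping is the kind of table that could be loaded as node attributes when annotating a network.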
