Integration of Heterogeneous Information Sources for Proteomics and Transcriptomics

Explore the integration of heterogeneous information sources for proteomics and transcriptomics analysis, covering gene expression, data flow, sample preparation, measurements, and interpretation. Utilize advanced technologies like mass spectrometry and microarrays, and benefit from agent technology for automated data retrieval and interpretation. Enhance data organization, access MS spectra, and leverage external databases for deeper insights into gene expression patterns and disease relevance. The system fosters communication among researchers, aiding in the identification of genes and pathways significant for further investigations.

Presentation Transcript


  1. Integration of Heterogeneous Information Sources for Proteomics and Transcriptomics • Steffen Möller, University of Rostock, Proteome Center

  2. Data Flow and Motivation • Samples A.1 ... A.a to Z.1 ... Z.z are organised in Groups A ... Z • Results: List of gene products with changed expression level • Description of variants of genes [Diagram labels: Question, Sample Selection, Preparation, Measurements, Analysis, Interpretation]

  3. Data available online • Lab-internal information • Grouping of samples in homogeneous groups • Portioning and preparation of samples • Measurements • Data derived from a preparation • DNA/RNA sequencing • Affymetrix microarrays • 2DE gels • (Tandem) mass spectrometry • Aids for Interpretation of Data • External bioinformatics databases • Internal extensions to the above • Communication of ideas between researchers

  4. Organisation of Samples

  5. Access of MS Spectra • MASCOT peptide identification • MS/MS fragment sequencing

  6. Addition to external data sources • Genes discussed among researchers

  7. Overview on Identified Spots on Gel • Integration of protein expression levels • Spot volume • Spot area • Spot peak intensity • with RNA expression levels • from Affymetrix chips • Figure 1: The figure shows the integration of protein expression, as derived from the analysis of the gel, with the RNA expression from a chip experiment, represented as a linearly scaled yellow bar. The spot volume is depicted in the same way in red, the area in green and the peak intensity in blue.

  8. Application of Agent Technology • Automated retrieval and integration of presumed relevant in-house data • Assistance in interpretation • Heuristics to extend/shrink list of genes presumed relevant • Integration with external online data • Pathways • Known relevance of genes in other diseases

  9. Data Flow Adapted for Agents: • Input: List of Gene IDs • Output: List of (Gene ID, Agent ID, Evaluation, Explanation, History) [Diagram labels: Seed of Genes, Heuristic, Modified List of Genes]
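The input/output contract on this slide can be sketched in Python. This is only an illustration: the slide names the five output fields but not their types, so the field types, the `run_agents` helper, the example agent and the gene IDs below are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class GeneAssessment:
    """One output record: (Gene ID, Agent ID, Evaluation, Explanation, History)."""
    gene_id: str
    agent_id: str
    evaluation: float          # assumed to be a numeric score
    explanation: str
    history: List[str] = field(default_factory=list)


# An agent maps a list of gene IDs to a list of assessment records.
Agent = Callable[[List[str]], List[GeneAssessment]]


def run_agents(seed: List[str], agents: List[Agent]) -> List[GeneAssessment]:
    """Pass the same seed list to every agent and collect all records."""
    results: List[GeneAssessment] = []
    for agent in agents:
        results.extend(agent(seed))
    return results


def make_flagging_agent(agent_id: str, flagged: Set[str]) -> Agent:
    """Hypothetical agent: scores a gene 1.0 if it is in a given set, else 0.0."""
    def agent(gene_ids: List[str]) -> List[GeneAssessment]:
        return [
            GeneAssessment(
                gene_id=g,
                agent_id=agent_id,
                evaluation=1.0 if g in flagged else 0.0,
                explanation="in flagged set" if g in flagged else "not flagged",
                history=[f"evaluated by {agent_id}"],
            )
            for g in gene_ids
        ]
    return agent


records = run_agents(["geneA", "geneB"], [make_flagging_agent("agent-1", {"geneA"})])
```

A downstream step could then extend or shrink the gene list by filtering `records` on `evaluation`, while keeping `explanation` and `history` for the researcher.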

  10. Examples for Heuristics • Towards extension/shrinking of the list of genes under investigation • Gene lies within a chromosomal locus linked to disease • Chromosomal neighbourhood to other genes under investigation • Gene is of presumed low abundance • Guidance of further wet-lab analysis • Comparison of the ratio of RNA to protein levels • Search for pre- or post-transcriptional control
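Two of the heuristics listed above, chromosomal neighbourhood and the RNA/protein ratio, might look roughly like this in Python. The `Locus` type, the 1 Mb default window and the use of a log2 ratio are assumptions made for illustration; the slides do not specify any of them.

```python
import math
from typing import List, NamedTuple, Optional


class Locus(NamedTuple):
    chromosome: str
    start: int   # base-pair coordinates, start <= end
    end: int


def in_neighbourhood(candidate: Locus, seeds: List[Locus],
                     window_bp: int = 1_000_000) -> bool:
    """True if the candidate lies within window_bp of any seed gene
    on the same chromosome (overlapping genes count as neighbours)."""
    for seed in seeds:
        if candidate.chromosome != seed.chromosome:
            continue
        # Distance between two intervals; negative when they overlap.
        gap = max(candidate.start, seed.start) - min(candidate.end, seed.end)
        if gap <= window_bp:
            return True
    return False


def log2_rna_protein_ratio(rna_level: float,
                           protein_level: float) -> Optional[float]:
    """log2(RNA / protein); None when either level is not positive.
    Both levels are assumed to be normalised to comparable scales."""
    if rna_level <= 0 or protein_level <= 0:
        return None
    return math.log2(rna_level / protein_level)
```

A strongly positive or negative ratio could then be used to flag genes for a closer look at post-transcriptional control, though any threshold would have to come from the laboratory's own data.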

  11. Example: Interaction with EnsEMBL • Visualisation of QTLs with expression data (G. Fischer et al. 2002, submitted)

  12. Transfer from Automated Sequence Annotation • EDITtoTrEMBL (Möller et al. 1998) • Introduction of intermediate level for data integration • Hierarchical organisation of agents [Diagram labels: Integration, Program, TrEMBL, Program]

  13. EDITtoTrEMBL: Self-introducing Agents • Dispatchers provide automated planning of the annotation path of entries • Sequence-analysing agents describe their input and their output to dispatching agents • SWISS-PROT syntax and controlled vocabulary • Regular expressions as constraints
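The idea of agents that describe their input as regular-expression constraints, with a dispatcher planning the annotation path from those descriptions, can be sketched as follows. This is not the EDITtoTrEMBL implementation: the `AgentDescription` type, the `plan` function, the agent names and the placeholder entry lines are hypothetical, and only the two-letter line codes (ID, SQ, FT) follow the SWISS-PROT flat-file convention.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class AgentDescription:
    name: str
    requires: List[str]   # regexes; each must match at least one entry line
    provides: str         # placeholder for the line the agent would add


def plan(entry_lines: List[str], agents: List[AgentDescription]) -> List[str]:
    """Return an order in which agents can run, given that each agent's
    output line becomes available to the agents scheduled after it."""
    lines = list(entry_lines)
    pending = list(agents)
    path: List[str] = []
    progressed = True
    while pending and progressed:
        progressed = False
        for agent in list(pending):
            runnable = all(
                any(re.search(pattern, line) for line in lines)
                for pattern in agent.requires
            )
            if runnable:
                path.append(agent.name)
                lines.append(agent.provides)
                pending.remove(agent)
                progressed = True
    return path  # agents whose constraints are never met are left out


entry = ["ID   EXAMPLE_ENTRY", "SQ   SEQUENCE   120 AA;"]
agents = [
    AgentDescription("topology", [r"^FT\s+TRANSMEM"], "FT   TOPO_DOM"),
    AgentDescription("tm_predictor", [r"^SQ\s"], "FT   TRANSMEM"),
    AgentDescription("unreachable", [r"^XX\s"], "CC   never added"),
]
annotation_path = plan(entry, agents)
```

In this sketch the topology agent is scheduled only after the transmembrane predictor, because its constraint matches a line that the predictor provides.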

  14. Application in sequence annotation of transmembrane proteins • A variety of programs exist to predict • membrane spanning regions • direction of insertion into the membrane [Figure labels: Out, In]

  15. Conflict resolution • Implemented with REVISE (C. V. Damasio; 1997); application described in (S. Möller, M. Schroeder; 2000)

  16. Problems with the transfer of these techniques to the wet-lab • Analysers cannot describe themselves or their results • No ontology for methods of expression data analysis has been defined (yet) • The motivation of an analyser to include a gene cannot be formally expressed • No rules for conflict resolution are applicable • Conflicts point to the unexpected, not to artefacts

  17. Discussion • Should I implement the best possible agent system, or rather hunt ASAP for the causative agents of autoimmune diseases? • New agents are recruited from Perl scripts that were written to give quick answers to requests from biological researchers. • Integration on a pragmatic level • The system is accepted by wet-lab researchers. • The system has a PHP-based web frontend. • Communication between agents is implemented via SOAP. • Adaptations and extensions to the system are easily implemented.

  18. Acknowledgements • University of Rostock: Michael Kreutzer, Gertrud Fischer, Bernd Scheidt, Ines Weber, Angelika Allenberg, Björn Damm, Michael Glocker, Hans-Jürgen Thiesen • City University, London: Michael Schroeder • EMBL-EBI, Cambridge: Rolf Apweiler • Funded by the BMBF Leitprojekt „Proteom-Analyse des Menschen“ and the Landesforschungsschwerpunkt „Genomorientierte Biotechnologie“
