
Daehee Hwang, Leroy Hood, Institute for Systems Biology



Presentation Transcript


  1. Daehee Hwang, Leroy Hood, Institute for Systems Biology

  2. Why Prequips for Systems Biology with proteomic data? • Need for visualization, analysis, and integration of multiple proteomic datasets: raw data level, peptide level, protein level, multi sample analysis • Need for an interface between proteomic data and systems biology analytical tools such as network/pathway analyses

  3. Integration of proteomic data at various levels [Diagram: three separate Trans-Proteomic Pipeline instances, each with Raw Data (MS, MS/MS), Peptide Id + Quantitation, and Protein Id + Quantitation; between the instances: "? Communication not possible!"]

  4. Pep3D: Quality Assessment. Properties: • quality assessment • 2D gel-like visualization. [Diagram labels: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; Trans-Proteomic Pipeline; Multi Sample; Pep3D; Prequips; Gaggle; Interaction Database: STRING; Network Analysis: Cytoscape; Microarray Data Analysis: Mayday, TIGR; Pathway Database: KEGG]

  5. Pep3D: Quality Assessment [Diagram: Pep3D Instance 1 and Pep3D Instance 2; "Communication not possible!"]

  6. Interface to Systems Biology [Diagram labels: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; Trans-Proteomic Pipeline; Gaggle; Interaction Database: STRING; Network Analysis: Cytoscape; Microarray Data Analysis: Mayday, TIGR; Pathway Database: KEGG. Note on the diagram: "? Communication not possible!"]

  7. Prequips Overview. Key Properties: • handles multiple samples at all levels • integrates high-level analysis tools • is extensible. [Diagram labels: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; Trans-Proteomic Pipeline; Multi Sample; Prequips; Gaggle; Interaction Database: STRING; Network Analysis: Cytoscape; Microarray Data Analysis: Mayday, TIGR; Pathway Database: KEGG]

  8. Integration of proteomic datasets at various levels [Diagram: Mass Spectrometer; raw data (e.g. mzXML, mzData, ...); Trans-Proteomic Pipeline steps: Database Search, Validation, Peptide Quantification, Protein Inference, Protein Quantitation; peptide-level data (e.g. pepXML, AnalysisXML, ...); protein-level data (e.g. protXML, ...); further analysis results; annotation]

  9. Data model [Diagram: Project; Multi-Sample Analysis; Single-Sample Analysis; Viewers; Perspectives. Data Structures: Protein Level, Peptide Level, Raw Data, each with Core and Meta parts. Data Providers: protein-level data source, e.g. protXML files; peptide-level data source, e.g. pepXML, dta or AnalysisXML files; raw data level, e.g. mzXML or mzData files]
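The layered data model on slide 9 can be pictured with a small sketch. This is an illustration only, written in Python rather than the Java of the actual tool; every class, field, and file name below is hypothetical and not taken from Prequips. It assumes each sample carries three levels (raw, peptide, protein), each with core records and free-form meta information, and that a project groups samples for multi-sample views.

```python
from dataclasses import dataclass, field


@dataclass
class LevelData:
    """One analysis level: core records plus free-form meta information."""
    source: str                                  # hypothetical file name
    core: list = field(default_factory=list)     # e.g. spectra, peptides, proteins
    meta: dict = field(default_factory=dict)     # e.g. annotation, analysis results


@dataclass
class SingleSampleAnalysis:
    """All three levels for one sample."""
    name: str
    raw: LevelData
    peptide: LevelData
    protein: LevelData


@dataclass
class Project:
    """A project groups single-sample analyses for multi-sample work."""
    name: str
    analyses: list = field(default_factory=list)

    def multi_sample_view(self, level: str) -> dict:
        """Collect the same level across all samples, keyed by sample name."""
        return {a.name: getattr(a, level) for a in self.analyses}


def make_sample(name: str) -> SingleSampleAnalysis:
    # File names are placeholders, chosen only to hint at the formats on the slide.
    return SingleSampleAnalysis(
        name,
        raw=LevelData(f"{name}.mzXML"),
        peptide=LevelData(f"{name}.pep.xml"),
        protein=LevelData(f"{name}.prot.xml"),
    )


project = Project("example_project", [make_sample("Mock1"), make_sample("Mock2")])
protein_level = project.multi_sample_view("protein")
print(sorted(protein_level))
```

Keeping core and meta data separate at each level mirrors the "Core / Meta" boxes in the diagram; how the real data structures are organized internally is not stated in the slides.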

  10. Case Study: Toponomic change in drug-treated Mø [Figure: samples Mock1, Mock2, Thapsigargin; 8% and 28% marks over Fraction # 2 to 20; markers Calreticulin, BiP, ATPase, Bcl2, Lamp1; labels 114, 115, 116, 117]

  11. Visualization: Single exp. [Screenshot annotations: project manager; peak map for run 29; CID spectra that have been selected; all scans of Mock 1 experiment; detailed information about one of the level 2 spectra; level 1 spectrum & corresponding CID spectra (level 1, level 2, level 2)]

  12. Visualization: Multiple exps. (Polymer?) contamination in all 4 runs (this would be hard to see with Pep3D). Color scale: green = 0, red = 1.

  13. Visualization: assess, quantify, etc. Mock-up (software is under development). [Figure: peak maps (map 1 to map 6) with axes retention time (min to max) and m/z (min to max); some maps are marked X, with the note "Doesn’t really match the remaining 3 maps!"]

  14. Prequips & the Gaggle [Diagram: Gaggle Boss connected to DAVID, KEGG Browser, Prequips, Cytoscape, Mayday, and the R statistical environment] Exchange of data structures such as name lists, lists of name-value pairs, matrices and networks.
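The hub-and-spoke exchange on slide 14 can be sketched as a toy relay. This is not the Gaggle API: the `Boss` class, its methods, and the payload types below are hypothetical stand-ins written in Python, assuming only what the slide states, namely that tools exchange name lists, name-value pairs, matrices, and networks through a central boss.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# The four payload kinds named on the slide, modeled as plain Python types.
NameList = List[str]
NameValuePairs = Dict[str, float]


@dataclass
class Matrix:
    row_names: List[str]
    column_names: List[str]
    values: List[List[float]]


@dataclass
class Network:
    nodes: List[str]
    edges: List[Tuple[str, str]]


class Boss:
    """Toy hub that relays a payload from one registered tool to the others."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str, object], None]] = {}

    def register(self, tool: str, handler: Callable[[str, object], None]) -> None:
        self._handlers[tool] = handler

    def broadcast(self, sender: str, payload: object,
                  target: Optional[str] = None) -> List[str]:
        """Deliver to every tool except the sender, or only to `target`."""
        delivered = []
        for tool, handler in self._handlers.items():
            if tool == sender or (target is not None and tool != target):
                continue
            handler(sender, payload)
            delivered.append(tool)
        return delivered


received: Dict[str, list] = {}
boss = Boss()
for tool in ("Prequips", "Cytoscape", "Mayday"):
    # Bind `tool` as a default argument so each handler keeps its own name.
    boss.register(
        tool,
        lambda sender, payload, tool=tool: received.setdefault(tool, []).append((sender, payload)),
    )

selection: NameList = ["calreticulin", "BiP"]
to_all = boss.broadcast("Prequips", selection)
to_one = boss.broadcast("Cytoscape", Network(["a", "b"], [("a", "b")]), target="Prequips")
```

Routing everything through one hub means each tool needs only a single connection and a shared set of payload types, which fits the later slides where a selection travels from Prequips to Cytoscape and back.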

  15. Mayday

  16. Cytoscape: overall mouse protein/protein interaction map in Cytoscape

  17. Analysis: Feature extraction [Screenshot annotations: Protein table; Filters; Gaggle plugin for interaction with other tools]

  18. Analysis: Feature extraction [Screenshot annotations: calreticulin; Gaggle plugin: selection for broadcast]

  19. Analysis: Feature selection [Figure labels: Mock1, Mock2, Thapsigargin]

  20. Broadcast to Gaggle

  21. Prequips to Gaggle [Diagram: Gaggle Boss connected to DAVID, KEGG Browser, Prequips, Cytoscape, Mayday, and the R statistical environment] Exchange of data structures such as name lists, lists of name-value pairs, matrices and networks.

  22. Gaggle Boss

  23. Gaggle to Cytoscape [Diagram: Gaggle Boss connected to DAVID, KEGG Browser, Prequips, Cytoscape, Mayday, and the R statistical environment] Exchange of data structures such as name lists, lists of name-value pairs, matrices and networks.

  24. Integration: Network Analysis [Figure labels: chaperones; actin filament regulation; proteasome complex; ribosome large subunit; Thapsigargin 114 iTRAQ ratio]

  25. Cytoscape to Prequips [Diagram: Gaggle Boss connected to DAVID, KEGG Browser, Prequips, Cytoscape, Mayday, and the R statistical environment] Exchange of data structures such as name lists, lists of name-value pairs, matrices and networks.

  26. Analysis: Feature extraction - Module selection [Screenshot annotations: the ids sent from Cytoscape through the Gaggle; proteasome proteins]

  27. Prequips & the Gaggle [Diagram: Gaggle Boss connected to DAVID, KEGG Browser, Prequips, Cytoscape, Mayday, and the R statistical environment] Exchange of data structures such as name lists, lists of name-value pairs, matrices and networks.

  28. Analysis: Functional enrichment. The proteasome complex is enriched compared to a mouse genome background.
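Enrichment against a background, as on slide 28, is commonly scored with a one-sided hypergeometric test. The slides do not say which statistic was used, so the sketch below is only one standard option, and the numbers in the example are made up for illustration. It asks: if a selection of a given size were drawn at random from the background, how likely is it to contain at least the observed number of members of a category?

```python
from math import comb


def hypergeom_enrichment_p(background_size: int, background_hits: int,
                           selection_size: int, selection_hits: int) -> float:
    """One-sided p-value P(X >= selection_hits) under random sampling.

    background_size  total number of genes/proteins in the background
    background_hits  how many of those belong to the category
    selection_size   number of genes/proteins in the selected module
    selection_hits   how many selected ones belong to the category
    """
    upper = min(background_hits, selection_size)
    total = comb(background_size, selection_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, selection_size - k)
        for k in range(selection_hits, upper + 1)
    )
    return tail / total


# Made-up toy numbers: 20 background items, 5 in the category,
# a selection of 4 that contains 3 category members.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"p = {p_value:.4f}")
```

A small p-value means the category is over-represented in the selection relative to the background. With many categories tested at once, a multiple-testing correction would normally be applied on top of this.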

  29. Prequips Summary. Key Properties: • handles multiple samples at all levels • integrates high-level analysis tools • is extensible. [Diagram labels: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; Trans-Proteomic Pipeline; Multi Sample; Prequips; Gaggle; Interaction Database: STRING; Network Analysis: Cytoscape; Microarray Data Analysis: Mayday, TIGR; Pathway Database: KEGG]

  30. Conclusion • General and extensible software for systems biology research with proteomic mass spectrometry data. • Capability to integrate data from various sources for visualization and analysis. • An interactive environment that supports (visual) data exploration.

  31. Software details • implemented in Java • based on Eclipse Rich Client Platform • extremely modular architecture • multiple plugin interfaces (e.g. viewers, data providers, algorithms) • meta information framework (analysis results, sequence information, annotation, ...) • data structures as plugins (requirement to support future analytical tools and data sources)
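The plugin idea on slide 31 can be illustrated with a minimal registry. The slide describes a Java application on the Eclipse Rich Client Platform; this Python sketch does not reproduce that mechanism. All names here (`PluginRegistry`, the plugin kinds, the provider and viewer classes) are hypothetical and serve only to show how viewers and data providers can be added without changing the core.

```python
from typing import Callable, Dict, List


class PluginRegistry:
    """Maps a plugin kind (e.g. 'viewer') and a name to a factory."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Dict[str, Callable]] = {}

    def register(self, kind: str, name: str) -> Callable:
        """Decorator that records a class or function under (kind, name)."""
        def decorator(factory: Callable) -> Callable:
            self._plugins.setdefault(kind, {})[name] = factory
            return factory
        return decorator

    def names(self, kind: str) -> List[str]:
        return sorted(self._plugins.get(kind, {}))

    def create(self, kind: str, name: str, *args, **kwargs):
        return self._plugins[kind][name](*args, **kwargs)


registry = PluginRegistry()


@registry.register("data_provider", "mzXML")
class MzXmlProvider:
    """Placeholder provider; a real one would parse the file."""
    def __init__(self, path: str) -> None:
        self.path = path


@registry.register("data_provider", "pepXML")
class PepXmlProvider:
    def __init__(self, path: str) -> None:
        self.path = path


@registry.register("viewer", "peak_map")
class PeakMapViewer:
    pass


provider = registry.create("data_provider", "mzXML", "example.mzXML")
print(registry.names("data_provider"), registry.names("viewer"))
```

The core only knows plugin kinds and names, so supporting a new file format or viewer means registering one more class, which is the extensibility property the slide emphasizes.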

  32. Acknowledgements • Special thanks to Nils Gehlenborg • Hood Lab: Inyoul Lee • Kay Nieselt • Aebersold Lab: Nichole King, James Eddes, Eric Deutsch, Ning Zhang, David Shteynberg, Wei Yan, and Andrew Garbutt • Paul Shannon for help with the Gaggle

  33. [Diagram labels: Mayday Core; WEKA Library; Machine Learning; SBEAMS installation; SBEAMS; Visualization; R environment; Bioconductor; R; PostgreSQL database; Database; Gaggle; Prequips; MySQL database; anything else; Excel]

  34. Cytoscape
