Richard H. Scheuermann, Ph.D. November 5, 2012


Presentation Transcript


  1. Support for Systems Biology Data in IRD/ViPR - Proteomics Richard H. Scheuermann, Ph.D. November 5, 2012

  2. Projects with Host Factor Data
     • Four systems biology groups funded by NIAID, including:
       • Systems Virology (Michael Katze group, Univ. Washington)
         • Influenza H1N1 and H5N1 and SARS Coronavirus
         • Statistical models, algorithms and software, raw and processed gene expression data, and proteomics data
       • Systems Influenza (Alan Aderem group, Institute for Systems Biology/Seattle Biomed)
         • Various influenza viruses
         • Microarray, mass spectrometry, and lipidomics data
     • ViPR Driving Biological Projects
       • Abraham Brass, Mass. General Hospital
         • Dengue virus host factor database from RNAi screen
       • Lynn Enquist / Moriah Szpara, Princeton University
         • Deep sequencing and neuronal microarrays for functional genomic analysis of Herpes Simplex Virus
       • Richard Kuhn, Purdue University
         • Metabolomics data of Dengue virus infection of human cells and mosquitoes
       • Mike Diamond, Washington University
         • Identification of inhibitory interferon-stimulated genes (ISGs) against flaviviruses and noroviruses using shRNA knockdown
         • Determine the mechanism of action of individual inhibitory ISGs

  3. Strategy for Handling “Omics” Data
     • “Omics” data management (MIBBI vs MIBBI-DB)
       • Project metadata (1 template)
         • Title, PI, abstract, publications
       • Experiment metadata (~6 templates)
         • Biosamples, treatments, reagents, protocols, subjects
       • Primary results data
         • Raw expression values
       • Data processing metadata (1 template)
         • Normalization and summarization methods
       • Processed data
         • Data matrix of fold changes and p-values
       • Data interpretation metadata (1 template)
         • Fold change and p-value cutoffs used
       • Interpreted results (Host factor biosets)
         • Interesting gene, protein and metabolite lists
     • Visualize biosets in context of biological pathways and networks
     • Statistical analysis of pathway/sub-network overrepresentation
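
The slide does not say which statistic is used for pathway overrepresentation. One common choice is a one-sided hypergeometric test, sketched below under that assumption. The function names and the gene identifiers in the usage note are hypothetical and do not come from IRD/ViPR.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: size of the background (all measured genes/proteins)
    K: number of background members annotated to the pathway
    n: size of the bioset
    k: number of bioset members that fall in the pathway
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible combinations contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def pathway_overrepresentation(bioset, pathway, background):
    """Return (overlap, p_value) for one bioset against one pathway.

    Assumption: identifiers outside the background are ignored.
    """
    background = set(background)
    bioset = set(bioset) & background
    pathway = set(pathway) & background
    overlap = len(bioset & pathway)
    p_value = hypergeom_upper_tail(overlap, len(bioset), len(pathway), len(background))
    return overlap, p_value
```

For example, with a background of ten hypothetical identifiers `g0` to `g9`, a pathway containing `g0` to `g4`, and a bioset of `g0` and `g1`, the overlap is 2 and the p-value is 10/45. In practice a multiple-testing correction across pathways would also be needed; that step is omitted here.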

  4. Data Submission Workflows [diagram; labels: Free text metadata; GEO/PRIDE/PNNL/SRA/MetaboLights; Primary results submission; Study metadata; pointer submission; Experiment metadata; ViPR/IRD/PATRIC; Analysis metadata; Processed data matrix; pointer; Host factor bioset; Systems Biology sites]

  5. Metadata Submission Template Examples

  6. Host Factor Data

  7. 8 Studies To Date

  8. Host Factor Bioset

  9. Transcriptomics => Proteomics
     • Metadata fields are largely re-usable, with some exceptions
       • Exp_sample_template (protein).xls
     • Results data differences
       • Peptide-level and protein-level
         • IM005_Peptide_normalization_matrix.V2.xlsx
         • IM005_Protein Normalization matrix.xlsx
       • Statistical measures
         • Results_matrix_ IM005_sig Protein_RM.xlsx

  10. Metadata Field Changes
     • GEO GSM ID => Primary Data Archive + Primary Data Archive ID
     • Semi-structured Experiment Variable to Structured Experiment Variable
       • Free text (1 day) => value unit pairs in separate fields (1/day; 10^4/plaque forming units)
     • Multiple processed data matrix files
       • Concatenated IDs separated by (; |)
     • Reagents and protocols are different but should not require submission template changes
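
The move from free text to structured value/unit fields can be illustrated with a small parser. This is a minimal sketch, not the submission pipeline itself: the function names are hypothetical, the accepted number syntax (integer, decimal, or `base^exponent`) is an assumption, and reading "(; |)" as "either a semicolon or a pipe separates IDs" is also an assumption. The ID strings in the usage note are made up.

```python
import re

# Assumed shape of a free-text experiment variable: a number (optionally
# written as base^exponent), whitespace, then the unit as the remaining text.
_VALUE_UNIT = re.compile(
    r"^\s*(?P<value>\d+(?:\.\d+)?(?:\^\d+)?)\s+(?P<unit>.+?)\s*$"
)


def split_value_unit(text):
    """Split free text such as '1 day' into separate (value, unit) fields."""
    match = _VALUE_UNIT.match(text)
    if match is None:
        raise ValueError("cannot split into value and unit: %r" % text)
    return match.group("value"), match.group("unit")


def split_ids(field):
    """Split a concatenated ID field on ';' or '|' and drop empty parts."""
    return [part.strip() for part in re.split(r"[;|]", field) if part.strip()]
```

Here `split_value_unit("1 day")` gives the pair `("1", "day")`, and `split_value_unit("10^4 plaque forming units")` gives `("10^4", "plaque forming units")`, matching the two examples on the slide. `split_ids("A1; A2|A3")` gives three separate IDs. The value is kept as a string so that notations like `10^4` survive unchanged.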

  11. Normalized Data
     • Archive at BRC (standard format?)
       • Peptide normalized data
       • Protein normalized data
       • Results matrix of significant proteins
     • BRCs derive bioset lists from results matrix
     • Handling different significance measures
       • t-test flag, t-test p-value, g-test flag, g-test p-value, log10 ratio
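
How a bioset list might be derived from a results matrix that carries more than one significance measure can be sketched as follows. The slide names the measures but not the rule for combining them, so everything here is an assumption: the column names (`protein_id`, `ttest_pvalue`, `gtest_pvalue`), the 0.05 cutoff, and the rule that a protein enters the bioset if either test passes, with the t-test checked first.

```python
def passes(p_value, cutoff):
    """True when a p-value is present and at or below the cutoff."""
    return p_value is not None and p_value <= cutoff


def derive_bioset(rows, p_cutoff=0.05):
    """Return (protein_id, measure) pairs for proteins that pass either test.

    rows: iterable of dicts, one per protein in the results matrix.
    The measure that qualified each protein is recorded so that the
    interpretation metadata can state which cutoff was applied.
    """
    bioset = []
    for row in rows:
        if passes(row.get("ttest_pvalue"), p_cutoff):
            measure = "t-test"
        elif passes(row.get("gtest_pvalue"), p_cutoff):
            measure = "g-test"
        else:
            continue
        bioset.append((row["protein_id"], measure))
    return bioset
```

Recording the qualifying measure next to each identifier keeps the bioset traceable to the cutoff that produced it. A log10 ratio threshold could be added as a further filter in the same loop.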

  12. Host Factor Bioset

  13. On Deck
     • Metabolomics and lipidomics data
     • Integration of RNA expression, protein abundance and metabolite abundance
     • Pathway/network visualization and analysis

  14. Acknowledgement
     • Lynn Law, U. Washington
     • Richard Green, U. Washington
     • Peter Askovich, Seattle Biomed
     • Brett Pickett, U.T. Southwestern/JCVI
     • Jyothi Noronha, U.T. Southwestern
     • Eva Sadat, U.T. Southwestern
     • Entire Systems Biology Data Dissemination Task Force, especially Jeremy Zucker
     • NIAID (Alison Yao and Valentina Di Francesco)

  15. Future Development Plans

  16. GO enrichment; network visualization [figure]
