Support for Systems Biology Data in IRD/ViPR - Proteomics. Richard H. Scheuermann, Ph.D. November 5, 2012. Projects with Host Factor Data. Four systems biology groups funded by NIAID, including: Systems Virology (Michael Katze group, Univ. Washington)
Support for Systems Biology Data in IRD/ViPR - Proteomics Richard H. Scheuermann, Ph.D. November 5, 2012
Projects with Host Factor Data
• Four systems biology groups funded by NIAID, including:
  • Systems Virology (Michael Katze group, Univ. Washington)
    • Influenza H1N1 and H5N1 and SARS Coronavirus
    • statistical models, algorithms and software, raw and processed gene expression data, and proteomics data
  • Systems Influenza (Alan Aderem group, Institute for Systems Biology/Seattle Biomed)
    • Various influenza viruses
    • microarray, mass spectrometry, and lipidomics data
• ViPR Driving Biological Projects
  • Abraham Brass, Mass. General Hospital
    • Dengue virus host factor database from RNAi screen
  • Lynn Enquist / Moriah Szpara, Princeton University
    • Deep sequencing and neuronal microarrays for functional genomic analysis of Herpes Simplex Virus
  • Richard Kuhn, Purdue University
    • Metabolomics data of Dengue virus infection of human cells and mosquitos
  • Mike Diamond, Washington University
    • Identification of inhibitory interferon-stimulated genes (ISGs) against flaviviruses and noroviruses using shRNA knockdown
    • Determine the mechanism of action of individual inhibitory ISGs
Strategy for Handling “Omics” Data
• “Omics” data management (MIBBI vs MIBBI-DB)
• Project metadata (1 template)
  • Title, PI, abstract, publications
• Experiment metadata (~6 templates)
  • Biosamples, treatments, reagents, protocols, subjects
• Primary results data
  • Raw expression values
• Data processing metadata (1 template)
  • Normalization and summarization methods
• Processed data
  • Data matrix of fold changes and p-values
• Data interpretation metadata (1 template)
  • Fold change and p-value cutoffs used
• Interpreted results (Host factor biosets)
  • Interesting gene, protein and metabolite lists
• Visualize biosets in context of biological pathways and networks
• Statistical analysis of pathway/sub-network overrepresentation
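The last bullet, pathway overrepresentation, can be illustrated with a small sketch. The slides do not say which statistic IRD/ViPR uses; the version below assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis. The function name and its parameters are hypothetical.

```python
from math import comb


def overrepresentation_p(population: int, pathway_size: int,
                         bioset_size: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    population   -- number of genes/proteins measured in the experiment
    pathway_size -- how many of those belong to the pathway
    bioset_size  -- how many passed the cutoffs (the bioset)
    overlap      -- how many bioset members belong to the pathway
    """
    max_overlap = min(pathway_size, bioset_size)
    total = comb(population, bioset_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, bioset_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

A small p-value suggests the pathway holds more bioset members than chance alone would explain. When many pathways are tested, a multiple-testing correction would normally follow; that step is left out of this sketch.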
Data Submission Workflows
[Workflow diagram. Recoverable labels: Systems Biology sites; GEO/PRIDE/PNNL/SRA/MetaboLights; ViPR/IRD/PATRIC; primary results submission; pointer submission; free text metadata; study metadata; experiment metadata; analysis metadata; processed data matrix; host factor bioset; pointer]
Transcriptomics => Proteomics
• Metadata fields are largely re-usable, with some exceptions
  • Exp_sample_template (protein).xls
• Results data differences
  • Peptide-level and protein-level
    • IM005_Peptide_normalization_matrix.V2.xlsx
    • IM005_Protein Normalization matrix.xlsx
  • Statistical measures
    • Results_matrix_ IM005_sig Protein_RM.xlsx
Metadata Field Changes
• GEO GSM ID => Primary Data Archive + Primary Data Archive ID
• Semi-structured Experiment Variable to Structured Experiment Variable
  • Free text (1 day) => value unit pairs in separate fields (1/day; 10^4/plaque forming units)
• Multiple processed data matrix files
  • Concatenated IDs separated by (; |)
• Reagents and protocols are different but should not require submission template changes
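Two of the changes above are simple string transformations, sketched below under stated assumptions. The function names are hypothetical, the free-text form is assumed to be a number followed by a unit (as in "1 day"), and "(; |)" is read here as "either a semicolon or a pipe", which the slide does not spell out. Values are kept as strings so that a notation such as 10^4 survives unchanged.

```python
import re

# A leading number (optionally decimal, optionally with a ^exponent),
# followed by a unit that starts with a non-digit character.
_VALUE_UNIT = re.compile(
    r"^\s*([0-9]+(?:\.[0-9]+)?(?:\^[0-9]+)?)\s*([^\d\s].*?)\s*$"
)


def split_value_unit(text: str) -> tuple[str, str]:
    """Split a free-text experiment variable into (value, unit)."""
    match = _VALUE_UNIT.match(text)
    if match is None:
        raise ValueError(f"cannot split into value and unit: {text!r}")
    return match.group(1), match.group(2)


def split_ids(field: str) -> list[str]:
    """Split a concatenated ID field on ';' or '|' and drop empty parts."""
    return [part.strip() for part in re.split(r"[;|]", field) if part.strip()]
```

For example, `split_value_unit("1 day")` gives `("1", "day")`, and `split_ids("ID_A; ID_B|ID_C")` gives three separate IDs (the IDs here are placeholders, not real archive accessions).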
Normalized Data
• Archive at BRC (standard format?)
  • Peptide normalized data
  • Protein normalized data
• Results matrix of significant proteins
  • BRCs derive bioset lists from results matrix
  • Handling different significance measures
    • t-test flag, t-test p-value, g-test flag, g-test p-value, log10 ratio
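How a BRC might derive a bioset list from such a results matrix can be sketched as follows. This is an illustration only: the column names (`protein_id`, `ttest_p`, `gtest_p`, `log10_ratio`) and the selection rule are assumptions, not the actual IRD/ViPR procedure. The sketch keeps a protein when either test's p-value passes the cutoff, and it ignores the t-test and g-test flag columns because the slides do not define their meaning.

```python
def derive_bioset(rows, p_cutoff=0.05, min_abs_log10_ratio=0.0):
    """Return IDs of proteins that pass the significance cutoffs.

    rows is a list of dicts, one per protein, with hypothetical keys:
    'protein_id', 'ttest_p', 'gtest_p', 'log10_ratio'.
    Missing measures are represented as None or simply absent.
    """
    bioset = []
    for row in rows:
        # Collect whichever p-values are present for this protein.
        p_values = [row[key] for key in ("ttest_p", "gtest_p")
                    if row.get(key) is not None]
        significant = any(p <= p_cutoff for p in p_values)
        # Assumption: a protein without a ratio is not excluded by the ratio filter.
        ratio = row.get("log10_ratio")
        large_enough = ratio is None or abs(ratio) >= min_abs_log10_ratio
        if significant and large_enough:
            bioset.append(row["protein_id"])
    return bioset
```

Recording the cutoffs passed to such a function is what the data interpretation metadata template would capture, so that a bioset can be regenerated from the archived results matrix.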
On Deck
• Metabolomics and lipidomics data
• Integration of RNA expression, protein abundance and metabolite abundance
• Pathway/network visualization and analysis
Acknowledgement
• Lynn Law, U. Washington
• Richard Green, U. Washington
• Peter Askovich, Seattle Biomed
• Brett Pickett, U.T. Southwestern/JCVI
• Jyothi Noronha, U.T. Southwestern
• Eva Sadat, U.T. Southwestern
• Entire Systems Biology Data Dissemination Task Force, especially Jeremy Zucker
• NIAID (Alison Yao and Valentina DiFrancesco)
[Figure: GO enrichment; network visualization]