A Data Management and Analysis Software Platform for Phospho-Proteomics Data. Larry Lam, Southern California Bioinformatics Summer Institute 2009, Graeber Lab – Crump Institute for Molecular Imaging, UCLA. Outline. Graeber Lab Background Project Objective
A Data Management and Analysis Software Platform for Phospho-Proteomics Data Larry Lam Southern California Bioinformatics Summer Institute 2009 Graeber Lab – Crump Institute for Molecular Imaging UCLA
Outline • Graeber Lab Background • Project Objective • My Experimental Project (Example Dataset) • Software Design • Software Demo • Conclusion / Future Work • Acknowledgements
Systems Biology of Cancer Signaling • Lab Goals • Understand Cancer Signaling Through Systems Biology Approaches • [long term] Improve Cancer Treatment • Signaling Pathway Modeling Through • Kinetics • Phospho-Profiling • Adaptor Complex Analysis
Project Objective • Develop a Software Platform for Convenient Storage and Analysis of Large-Scale Data Sets • Design Database to Collect and Store Large Scale Proteomic Data Sets • Allow for Comprehensive Meta Information • Simplify Access to Multiple Data Sets • Simplify The Use of Common Tools of Analysis
BCR/Abl Leukemia • BCR/Abl fusion protein found in • - 90% to 95% of chronic myeloid leukemia cases • - 20% of adult acute lymphoblastic leukemia cases • - 5% of childhood acute lymphoblastic leukemia cases • Analyze the adaptor proteins in BCR/Abl signaling • - Adaptor proteins mediate protein interactions • [Diagram labels: Complex Capture; Protein Bait; Interacting Protein (Prey)] http://www.annals.org/cgi/content/full/138/10/819
Experimental Workflow Experimental Protocol IPI Proteomics Database Mass Spectrometry Quantitation Pipeline Quantitation Output File Phospho Profiling Complex Purification Manual Organization/ Analysis [Complex] NS Filter/ Consolidation Current Workflow
Identifying Interactions of the Crk Adaptor Proteins • Genetic modification of pro-B-lymphocytes (Baf3) • Express adaptor + streptavidin binding peptide (SBP) • Culture • Lyse each culture for protein complex purification Crk I Lysate Crk L Lysate Crk II Lysate NTAP Lysate
Protein Complex Purification • Separation of protein complex with streptavidin beads • Trypsin digestion of proteins into peptides • Separation of phosphorylated peptides with Fe(III)-NTA beads • Liquid Chromatography + Mass Spectrometry • Quantitation Pipeline
Quantitation Output File Consolidation of quantified peptides and associated proteins per sample • All peptides identified • All adaptor proteins used • Phosphorylation position within the peptide [optional]
NS Filter/Consolidation Quantitation Output File Collapse Peptides To Protein Quantity Remove Insignificant Proteins • Quantity Is Normalized For Each Row Remove Known Contaminants Heatmap Analysis
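The first two steps on this slide (collapsing peptides to a protein quantity and normalizing each row) can be sketched as follows. The slides do not say how peptides are combined or how rows are scaled, so this sketch assumes a simple sum per protein and sample, and scaling by the row maximum. The function names and the input tuple layout are hypothetical, and Python is used only for illustration (the platform itself is described as C# with R).

```python
from collections import defaultdict


def collapse_peptides(rows):
    """Sum peptide quantities per protein and sample.

    rows: iterable of (protein, sample, quantity) tuples (assumed layout).
    Summation is an assumption; the slides do not name the collapse rule.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for protein, sample, quantity in rows:
        totals[protein][sample] += quantity
    return {protein: dict(by_sample) for protein, by_sample in totals.items()}


def normalize_rows(table):
    """Scale each protein row so that its largest value becomes 1.0.

    Row-maximum scaling is an assumption; any per-row scheme could be
    substituted here.
    """
    normalized = {}
    for protein, by_sample in table.items():
        peak = max(by_sample.values())
        normalized[protein] = {
            sample: (quantity / peak if peak else 0.0)
            for sample, quantity in by_sample.items()
        }
    return normalized


peptides = [
    ("Cbl", "Crk I", 2.0),
    ("Cbl", "Crk I", 4.0),
    ("Cbl", "NTAP", 3.0),
]
proteins = collapse_peptides(peptides)
scaled = normalize_rows(proteins)
```

Normalizing per row keeps abundant and scarce proteins comparable on one heatmap color scale, which is presumably why the slide places it before the heatmap step.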
NS Filter/Consolidation Quantitation Output File Collapse Peptides To Protein Quantity Remove Insignificant Proteins Remove Known Contaminants Heatmap Analysis • Enrichment Factor = $\dfrac{\mathrm{Median}_{\mathrm{Protein}} - \mathrm{Median}_{\mathrm{NTAP}}}{\sigma_{\mathrm{Protein}} + \sigma_{\mathrm{NTAP}}}$
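A small sketch of the enrichment factor on this slide, read here as the difference between the protein median and the NTAP control median, divided by the sum of the two spreads. Treating the "s" terms as sample standard deviations is an assumption, as are the function name and the input shape (one list of replicate quantities per condition).

```python
from statistics import median, stdev


def enrichment_factor(protein_values, ntap_values):
    """Return (median_protein - median_ntap) / (sd_protein + sd_ntap).

    Both inputs need at least two replicate values, because the sample
    standard deviation is undefined for a single observation.
    """
    spread = stdev(protein_values) + stdev(ntap_values)
    if spread == 0:
        raise ValueError("enrichment factor is undefined when both spreads are zero")
    return (median(protein_values) - median(ntap_values)) / spread


# Hypothetical replicate quantities for one protein in a Crk pull-down
# and in the NTAP control.
factor = enrichment_factor([10.0, 12.0, 14.0], [1.0, 2.0, 3.0])
```

A protein whose factor falls below some cutoff would be dropped in the "Remove Insignificant Proteins" step; the slides do not state the cutoff, so none is assumed here.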
NS Filter/Consolidation Quantitation Output File Collapse Peptides To Protein Quantity Remove Insignificant Proteins Remove Known Contaminants • Configuration File of Known Contaminants Heatmap Analysis
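The contaminant step could be driven by a configuration file along these lines. The file format is not shown in the slides, so this sketch assumes one protein name per line with `#` comments and case-insensitive matching; the function names and the example entries are hypothetical.

```python
def load_contaminants(lines):
    """Parse contaminant names from an iterable of text lines.

    Assumed format: one name per line, '#' starts a comment,
    blank lines are ignored. Names are lower-cased for matching.
    """
    names = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.add(entry.lower())
    return names


def remove_contaminants(table, contaminants):
    """Drop every protein whose name appears in the contaminant set."""
    return {
        protein: values
        for protein, values in table.items()
        if protein.lower() not in contaminants
    }


config_lines = [
    "# known contaminants (illustrative entries only)",
    "Keratin",
    "  Trypsin  # digestion enzyme",
    "",
]
contaminants = load_contaminants(config_lines)
filtered = remove_contaminants({"Cbl": 1.0, "KERATIN": 2.0}, contaminants)
```

In practice `load_contaminants` would be passed an open file handle; keeping the list in a file lets the lab extend it without touching the program.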
Statistical Analysis: Peptide Quantity Heatmap • [Heatmap figure: samples Crk I, Crk L, Crk II, NTAP; color scale from High Quantity to Low Quantity; labeled clusters: Cbl Peptides, Crk I Peptides; rendered with Java TreeView]
Experimental Workflow Experimental Protocol IPI Proteomics Database Mass Spectrometry Quantitation Pipeline Quantitation Output File Quantitation Import Phospho Profiling Complex Purification External Sources Local DB External Sources External Sources Manual Organization/ Analysis [Complex] NS Filter/ Consolidation Statistical Analysis Current Workflow New Workflow
Program Design • Programming Language: C# • Database: MySQL • Free • Statistical Computing: R • Free, Accessible to C# Quantitation Data Set C# GUI Application Quantitation Output File DATA IMPORT DATA QUERY MySQL Database R Statistical Function
Data Import Methodology Define Meta Data (Descriptors) And Relationships About The Quantitation Values Create The Tables In MySQL Access Using MySQL Connector/Net http://dev.mysql.com/downloads/connector/
Statistical Analysis Methodology • R Language and Environment for Statistical Computing and Graphics • Modeling • Statistical Tests • Clustering • Heatmaps • Develop a Graphical User Interface To R Functions • - Access R Functions Through R-(D)COM Interface http://cran.r-project.org/contrib/extra/dcom/
Conclusion • Management Software • Standardized approach in maintaining lab data • Analyze Data Sets • Analysis tools highly accessible to biologists of various technical levels • Combine Data Sets • Potentially lead to new discoveries
Future Work • Add More Links To External Databases • Enhance Data Query • Include More Analysis Functions
Acknowledgments • Graeber Lab Members • Dr. Thomas Graeber • Dr. Björn Titz • SoCalBSI Faculty and Members • Dr. Jamil Momand • Dr. Sandy Sharp • Dr. Nancy Warter-Perez • Dr. Wendie Johnston • Dr. Beverly Krilowicz • Ronnie Cheng • Funding
Data Import Design Methodology • Define Meta Data (Descriptors) About The Quantitation Values • - Define Relationships • Create The Tables In MySQL • Develop Support for MySQL Access • - MySQL Connector • [Schema diagram: entities Batch, Sample, Feature, Feature Type, Feature Value, Value, Value Type; attributes shown include Label, Description, Experimenter, Date, Quality]
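The import design can be illustrated with a small relational sketch. It is not the lab's schema: the table and column names below are hypothetical, loosely based on the entity names visible in the slide diagram, and SQLite (from Python's standard library) stands in for MySQL with Connector/Net so that the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema: a batch groups samples, and each
# quantitation value links one sample to one feature (e.g. a protein).
SCHEMA = """
CREATE TABLE batch (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    experimenter TEXT
);
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    batch_id INTEGER NOT NULL REFERENCES batch(id),
    label TEXT NOT NULL
);
CREATE TABLE feature (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL UNIQUE,
    feature_type TEXT NOT NULL
);
CREATE TABLE feature_value (
    sample_id INTEGER NOT NULL REFERENCES sample(id),
    feature_id INTEGER NOT NULL REFERENCES feature(id),
    value REAL NOT NULL,
    PRIMARY KEY (sample_id, feature_id)
);
"""


def open_database():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def import_rows(conn, batch_label, experimenter, rows):
    """Store rows of (sample_label, feature_label, feature_type, value)."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO batch (label, experimenter) VALUES (?, ?)",
        (batch_label, experimenter),
    )
    batch_id = cur.lastrowid
    sample_ids = {}
    for sample_label, feature_label, feature_type, value in rows:
        if sample_label not in sample_ids:
            cur.execute(
                "INSERT INTO sample (batch_id, label) VALUES (?, ?)",
                (batch_id, sample_label),
            )
            sample_ids[sample_label] = cur.lastrowid
        # Reuse an existing feature row if this label was seen before.
        cur.execute(
            "INSERT OR IGNORE INTO feature (label, feature_type) VALUES (?, ?)",
            (feature_label, feature_type),
        )
        cur.execute("SELECT id FROM feature WHERE label = ?", (feature_label,))
        feature_id = cur.fetchone()[0]
        cur.execute(
            "INSERT INTO feature_value (sample_id, feature_id, value) VALUES (?, ?, ?)",
            (sample_ids[sample_label], feature_id, value),
        )
    conn.commit()
    return batch_id


def values_for_feature(conn, feature_label):
    """Return (sample_label, value) pairs for one feature, sorted by sample."""
    cur = conn.execute(
        "SELECT sample.label, feature_value.value FROM feature_value "
        "JOIN sample ON sample.id = feature_value.sample_id "
        "JOIN feature ON feature.id = feature_value.feature_id "
        "WHERE feature.label = ? ORDER BY sample.label",
        (feature_label,),
    )
    return cur.fetchall()


conn = open_database()
import_rows(conn, "demo batch", "example user", [
    ("Crk I", "Cbl", "protein", 6.0),
    ("NTAP", "Cbl", "protein", 3.0),
])
cbl_values = values_for_feature(conn, "Cbl")
```

Keeping descriptors (batch, sample, feature) in their own tables is what lets one query pull the same protein across several imported data sets, which matches the "Combine Data Sets" goal in the conclusion.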