NCRR Proteomics, Glycomics, and Mass Spectrometry Centers. Proteomics Research Resource for Integrative Biology PNNL
1. National Resource for Proteomics and Pathways
P41 funded by NCRR
Philip Andrews, Director (UM)
Russ Finley (WSU)
Trey Ideker (UCSD)
George Michailidis (UM)
Curt Wilkerson (MSU)
David States (UM)
2. NCRR Proteomics, Glycomics, and Mass Spectrometry Centers
Proteomics Research Resource for Integrative Biology
PNNL – Richard Smith
National Resource for Proteomics and Pathways
University of Michigan – Philip Andrews
National Center for Glycomics and Glycoproteomics
Indiana University – Milos Novotny
Integrated Technology Resource for Biomedical Glycomics
University of Georgia- J. Michael Pierce
Research Resource for Integrated Glycotechnology
University of Georgia- James Prestegard
Mass Spectrometry Resource for Biology and Medicine
BU – Catherine Costello
Bio-organic Biomedical Mass Spectrometry Resource
UCSF – Alma Burlingame
National Resource for Mass Spectrometric Analysis of Biological Macromolecules
Rockefeller University – Brian Chait
Resource for Biomedical and Bio-organic Mass Spectrometry
Washington University – Michael Gross
National Resource for Biomedical Accelerator Mass Spectrometry
LLNL- Ken Turteltaub
3. The Primary Goals of the NRPP
Develop computational and bioinformatics tools for proteomics.
Provide the datasets required to build predictive organismal models.
Develop and prove technologies needed to produce proteomics data.
4. NRPP Principal Technologies
5. Protein Interaction Maps
All interactions detected by yeast two-hybrid screening.
1 large, 69 small components. Largest single data set yet obtained. Random baits used. Covers nearly half of the proteome.
1 large, 6 small components. Second-largest data set, independently obtained, baits focused on a function, useful to compare with this data.
1 large, 2 small components. New data, proteome screening for a bacterial pathogen, widest coverage for an organism yet.
Concerned about how we view and use this data.
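The "1 large, N small components" counts above describe connected components of the interaction graph. A minimal sketch of how such counts could be computed from a list of pairwise interactions follows; the function name and the toy protein labels are hypothetical and are not taken from the screens.

```python
from collections import defaultdict, deque


def component_sizes(interactions):
    """Return connected-component sizes, largest first, for an undirected graph."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    sizes = []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects every protein reachable from `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# Hypothetical toy data: placeholder protein names, not real screen results.
toy_interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("G", "H")]
sizes = component_sizes(toy_interactions)
print(f"{len(sizes)} components, sizes: {sizes}")
```

On a real map, the first entry would be the one large component and the remaining entries the small ones.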
6. C. jejuni Interaction Map
1,654 ORFs
1,477 cloned
~336,000 assays
16,172 interactions
11,616 confirmed
2,829 high confidence
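The counts above can be read as a screening funnel. The sketch below works out the ratios, assuming (this is not stated on the slide) that the confirmed and high-confidence interactions are each subsets of the 16,172 detected interactions.

```python
# Counts as listed on the slide; the assay count is approximate.
counts = {
    "orfs": 1654,
    "cloned": 1477,
    "assays": 336_000,
    "interactions": 16_172,
    "confirmed": 11_616,
    "high_confidence": 2_829,
}


def fraction(part, whole):
    """Return part/whole as a float."""
    return part / whole


# Share of ORFs that were cloned.
cloned_fraction = fraction(counts["cloned"], counts["orfs"])
# Assumption: confirmed interactions are a subset of all detected interactions.
confirmed_fraction = fraction(counts["confirmed"], counts["interactions"])
# Assumption: high-confidence interactions are a subset of all detected interactions.
high_conf_fraction = fraction(counts["high_confidence"], counts["interactions"])

print(f"cloned ORFs:      {cloned_fraction:.1%}")
print(f"confirmed:        {confirmed_fraction:.1%}")
print(f"high confidence:  {high_conf_fraction:.1%}")
```

Under those assumptions, roughly nine in ten ORFs were cloned, about seven in ten detected interactions were confirmed, and fewer than one in five reached high confidence.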
7. Overlap with Reference Sets
9. Integrating Interaction Data
Drosophila Gene and Protein Interactions Database
10. Conserved Complexes
PathBlast – Ideker
11. Expression Profiling
Protein Expression Profiling
PTM profiling
Integration of mRNA expression data with interaction maps.
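One simple way to combine expression data with an interaction map is to keep only interactions whose two partners are both expressed. The sketch below illustrates that idea only; the slide does not say which integration method the NRPP uses, and the function name, gene labels, values, and threshold are all hypothetical.

```python
def expressed_subnetwork(interactions, expression, threshold):
    """Keep interactions whose two partners both meet the expression threshold.

    Genes missing from `expression` are treated as not expressed (0.0).
    """
    return [
        (a, b)
        for a, b in interactions
        if expression.get(a, 0.0) >= threshold
        and expression.get(b, 0.0) >= threshold
    ]


# Hypothetical toy data: placeholder gene names and expression values.
toy_interactions = [("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneD")]
toy_expression = {"geneA": 8.2, "geneB": 7.5, "geneC": 1.1, "geneD": 9.0}

active = expressed_subnetwork(toy_interactions, toy_expression, threshold=5.0)
print(active)
```

A hard threshold is the crudest choice; weighting edges by co-expression would be another option under the same inputs.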
12. Anthrax Infection, Human Lung 2001
18. Time Course Profiles
A549 cell response to TGF-beta
19. Proteome Informatics
Open Source code (most efficient approach for Proteomics)
Standardized on Java
IO Framework, MS Expedite, Data Extractor, various modules
Proteome Commons- Open Source site
The PC Dissemination System (DFS)
20. Interactions with other NCRR Centers
PNNL – Richard Smith
Proteome Commons
DFS collaborations in progress
Software development- coordination of efforts.
North Dakota INBRE- Donald Sens
Training, Bioinformatics infrastructure
UCSF- Burlingame
PRIME ~ Protein Prospector
DFS- planned.
21. Proteome Commons Services
One stop for most proteomics Open Source tools
Data sets to develop and test new algorithms
Code development tools (versioning, Subversion)
Proteomics news and announcements
Distributed File System (The DFS)
22. Data Sharing and Publishing: Challenges and Solutions
Bandwidth/file transfer rates.
New journal guidelines ask for data access.
The vanishing webserver syndrome.
There are some file and annotation standards and efforts to develop more.
23. DFS
Peer-to-Peer (P2P) Distributed File System
Open, simple, cross-platform protocols
e-Commerce-grade encryption makes it appropriate for scientific research (allows peer-review and traceability)
It can easily grow to accommodate very large amounts of data and users
Commodity hardware! $0.37 per GB storage
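As a rough worked example of the commodity-hardware figure, the sketch below multiplies capacity by the quoted $0.37 per GB. It assumes decimal units (1 TB = 1,000 GB) and covers raw storage only, with no replication, servers, power, or bandwidth; the capacities used are the ~16 TB and 46 TB figures reported under Current Status.

```python
COST_PER_GB_USD = 0.37  # per-GB storage figure quoted on the slide


def storage_cost_usd(terabytes, gb_per_tb=1000):
    """Estimate raw storage cost in USD for a capacity given in terabytes.

    Assumption: decimal terabytes (1 TB = 1,000 GB) unless gb_per_tb is overridden.
    """
    return terabytes * gb_per_tb * COST_PER_GB_USD


cost_16tb = storage_cost_usd(16)
cost_46tb = storage_cost_usd(46)

print(f"16 TB: about ${cost_16tb:,.0f}")
print(f"46 TB: about ${cost_46tb:,.0f}")
```

Passing `gb_per_tb=1024` gives the binary-unit estimate instead.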
26. Current Status
~16 TB over 13 servers (46 TB online in June)
1 BioChem, 8 NRPP, 1 ISB, 1 MSU, 1 UC Davis, 1 UNC
1.3 terabytes of MS/MSMS data (all public data available).
Docs, tools, code, credits and more: http://www.proteomecommons.org/dev/dfs
Data sets
TheGPM, PNNL, Aurum, QqTOF vs QSTAR, sPRG ABRF 2006, HUPO PPP
PeptideAtlas, OPD
27. Acknowledgements
NRPP
Xuequn Chen
Jayson Falkner
Brian Maso
Panagiotis Papoulias
Gary Rymar
Eric Simon
John Strahler
Peter Ulintz
Donna Veine
Angela Walker
29. https://dfs/proteomecommons.org/