Medizinisches Proteom-Center (MPC): Decoy Database Advantages and Protein Balancing for Understanding the Complexity of Life
Dr. Christian Stephan, Medizinisches Proteom-Center, Ruhr-Universität Bochum, Germany
2-7 February 2008
Proteomics
• Proteome: the set of proteins expressed by the genetic material of an organism under a given set of environmental conditions.
• Proteomics can be defined as the qualitative and quantitative comparison of proteomes under different conditions to further unravel biological processes. (Source: Expasy.org)
• … and a valid interpretation of proteomics data
Composite Decoy Database
Definition:
• Decoy ≡ pitfall/trap
• Composite ≡ hybrid/mixed
The goal is to detect random (unspecific) hits by setting a trap for the search engines: random hits will also occur against entries generated by shuffling the amino acids of the real sequences.
Original database entries:
>IPI:IPI00000001.1|SWISS-PROT:O95793-1|TREMBL:Q5JW29|REF
MSQVQVQVQNPSAALSGSQILNKNQSLLSQPLMSIPSTTSSLPSEN
>IPI:IPI00000005.1|SWISS-PROT:P01111|REFSEQ_NP:NP_002515
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVID
>IPI:IPI00000006.1|SWISS-PROT:P01112|TREMBL:Q9UCE2|REFS
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVID
>IPI:IPI00000012.4|TREMBL:Q6XR72;Q9NPW0|REFSEQ_NP:NP_0
MGRYSGKTCRLLFMLVLTVAFFVAELVSGYLGNSIALLSDSFNMFSD
Composite database entries (each target entry is followed by its shuffled decoy, marked by the accession prefix SHD):
>IPI:IPI00000001.1|SWISS-PROT:O95793-1|TREMBL:Q5JW29|REF
MSQVQVQVQNPSAALSGSQILNKNQSLLSQPLMSIPSTTSSLPSEN
>IPI:SHD00000001.1|SWISS-PROT:O95793-1|TREMBL:Q5JW29|RE
VQKTFSNTSPESKPVGEPEYSNTFESIALSAEGIEYTIHLSAQEPCTV
>IPI:IPI00000005.1|SWISS-PROT:P01111|REFSEQ_NP:NP_002515
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVID
>IPI:SHD00000005.1|SWISS-PROT:P01111|REFSEQ_NP:NP_002515
SCDVEYDTVPLYVKGFVGVQLGEEIVAYLEEMLDKQFQMTQMYKG
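The construction of a composite database can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, each decoy is a plain random shuffle of the full target sequence, and the decoy header is derived by replacing the accession prefix `IPI:IPI` with `IPI:SHD`, as the entries on the slide suggest. The shuffling procedure actually used for the talk is not specified.

```python
import random


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    entries, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


def shuffled_decoy(header, sequence, rng):
    """Return a decoy entry with the same amino acid composition.

    Assumption: a plain random shuffle; the SHD prefix mirrors the slide.
    """
    residues = list(sequence)
    rng.shuffle(residues)
    decoy_header = header.replace("IPI:IPI", "IPI:SHD", 1)
    return decoy_header, "".join(residues)


def composite_database(entries, seed=0):
    """Interleave every target entry with its shuffled decoy."""
    rng = random.Random(seed)  # fixed seed keeps the decoys reproducible
    composite = []
    for header, sequence in entries:
        composite.append((header, sequence))
        composite.append(shuffled_decoy(header, sequence, rng))
    return composite


def to_fasta(entries):
    """Serialize (header, sequence) tuples back to FASTA text."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in entries)
```

Searching spectra against the output of `to_fasta(composite_database(...))` means every hit on an SHD entry is known to be a random match, which is what makes the error estimates on the following slides possible.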
Advantages of decoy databases
• You can determine "true positives" (TP): the number of correct hits with scores above the threshold
• You can determine "false positives" (FP): the number of incorrect hits with scores above the threshold
• You can determine "false negatives" (FN): the number of correct hits with scores below the threshold
• You can determine "true negatives" (TN): the number of incorrect hits with scores below the threshold
• You can determine the false discovery rate (FDR)
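The four counts and the false discovery rate can be made concrete with a short Python sketch. The function names are hypothetical. The formula FDR = FP / (TP + FP) is the standard definition; the decoy-based estimate 2 × decoy hits / (target hits + decoy hits) is one common convention for composite databases with equally sized target and decoy parts, and the slides do not state which estimate was used in the talk.

```python
def confusion_counts(hits, threshold):
    """Count TP, FP, FN and TN for hits given as (score, is_correct) pairs.

    "Above threshold" is taken literally as score > threshold.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for score, is_correct in hits:
        if score > threshold:
            counts["TP" if is_correct else "FP"] += 1
        else:
            counts["FN" if is_correct else "TN"] += 1
    return counts


def false_discovery_rate(tp, fp):
    """FDR: fraction of accepted hits that are incorrect."""
    accepted = tp + fp
    return fp / accepted if accepted else 0.0


def estimated_fdr_from_decoys(target_hits, decoy_hits):
    """Estimate the FDR from a composite search.

    Assumption: target and decoy parts are the same size, so each accepted
    decoy hit implies roughly one undetected false hit among the targets.
    """
    accepted = target_hits + decoy_hits
    return 2 * decoy_hits / accepted if accepted else 0.0
```

In a real search the correct/incorrect label is unknown for target hits, which is why the decoy-based estimate is the one that can actually be computed.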
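Several search engines, why? (Proteins)
[Figure: four-set Venn diagram of the proteins identified by SEQUEST, Mascot, ProteinSolver and Phenyx]
• Set sizes shown: 750, 689, 693, 669
• Regions labelled with a percentage: 101 (+8.0%), 132 (+10.4%), 194 (+15.3%), 113 (+8.9%)
• Remaining overlap regions: 56, 31, 29, 38, 118, 295, 52, 25, 27, 28, 31
• All four search engines combined: 1270 (+69.3%)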
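Several search engines, why? (Peptides)
[Figure: four-set Venn diagram of the peptides identified by SEQUEST, Mascot, ProteinSolver and Phenyx]
• Set sizes shown: 3792, 3229, 3203, 3186
• Regions labelled with a percentage: 212 (+4.2%), 486 (+9.6%), 329 (+6.5%), 380 (+7.5%)
• Remaining overlap regions: 179, 168, 40, 348, 501, 1776, 139, 96, 77, 195, 146
• All four search engines combined: 5072 (+33.8%)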
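The benefit of combining engines can be sketched as a set union in Python. The function names are hypothetical, and the reading that the percentage gain is measured against the largest single-engine result is an inference: it is consistent with the totals on the slides (1270 versus 750 proteins, 5072 versus 3792 peptides) but is not stated there.

```python
def percent_gain(combined, best_single):
    """Percentage gain of the combined result over the best single engine."""
    return 100.0 * (combined - best_single) / best_single


def combine_engines(results):
    """Summarize identifications from several search engines.

    results maps an engine name to the set of identifiers it reported
    (must contain at least one engine with at least one identifier).
    """
    union = set().union(*results.values())
    best_single = max(len(ids) for ids in results.values())
    unique = {}
    for name, ids in results.items():
        # identifiers reported by this engine and by no other
        others = set().union(*(o for n, o in results.items() if n != name))
        unique[name] = len(ids - others)
    return {
        "union": len(union),
        "best_single": best_single,
        "gain_percent": percent_gain(len(union), best_single),
        "unique": unique,
    }
```

Calling `combine_engines` with one set of protein or peptide identifiers per engine returns the size of the union, the per-engine unique counts and the gain over the best single engine.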
Several search engines, why? (Peptides, combined with ProteinExtractor)
[Figure: the peptide-level Venn diagram for SEQUEST, Mascot, ProteinSolver and Phenyx, with the combined 5072 spectra (peptides: +33.8%) passed to ProteinExtractor]
• ProteinExtractor: 1068 proteins (+42.4%)
• Normalize scores by a factor for each search engine (SE): 1325 proteins (+76.6%)
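Normalizing scores by a per-engine factor might look like the following Python sketch. Everything specific here is an assumption made for illustration: the function names, the tuple layout, the rule that higher scores are better for every engine, and above all the factor values, which the slides do not give. It is not a description of how ProteinExtractor works internally.

```python
def normalize_scores(hits, factors):
    """Divide each raw score by the factor of the engine that produced it.

    hits: list of (engine, spectrum_id, peptide, raw_score) tuples.
    factors: dict mapping engine name to a positive scaling factor
             (hypothetical values, e.g. each engine's acceptance threshold).
    """
    return [
        (engine, spectrum, peptide, score / factors[engine])
        for engine, spectrum, peptide, score in hits
    ]


def best_hit_per_spectrum(normalized_hits):
    """Keep the highest normalized score for every spectrum.

    Assumption: after normalization, higher scores are better for all engines.
    """
    best = {}
    for engine, spectrum, peptide, score in normalized_hits:
        if spectrum not in best or score > best[spectrum][2]:
            best[spectrum] = (engine, peptide, score)
    return best
```

Once all scores are on a comparable scale, hits from different engines can be ranked in one list and a single threshold applied, for example one chosen from the decoy hits.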
How many decoy peptides?
• 5% decoy proteins
• SEQUEST: 2.82%
• Mascot: 2.16%
• ProteinSolver: 2.15%
• Phenyx: 2.31%
HUPO Test Sample
• 20 human proteins
• Expressed in E. coli BL21 Star™ (DE3)
• Purification
HUPO Test Sample: comparison of two runs
Level      1st run   2nd run   Both runs    Only 1st run   Only 2nd run   Total
Protein    28        30        23 (65%)     5              7              35
Peptide    388       417       291 (57%)    97             126            514
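The run-to-run comparison can be reproduced with a small Python sketch. The function name is hypothetical, and reading the percentage as the shared fraction of the union of both runs is an assumption; it matches the slide values (23 of 35 proteins, 291 of 514 peptides).

```python
def run_overlap(first_run, second_run):
    """Compare the identifications of two runs given as sets of identifiers."""
    shared = first_run & second_run
    union = first_run | second_run
    return {
        "shared": len(shared),
        "only_first": len(first_run - second_run),
        "only_second": len(second_run - first_run),
        "union": len(union),
        # shared identifications as a percentage of everything found
        "percent_shared": 100.0 * len(shared) / len(union) if union else 0.0,
    }
```

With real data, `first_run` and `second_run` would hold the protein accessions or peptide sequences identified in each run.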
Different search engine scores
[Figure: scores of the different search engines. Sum: 1218 spectra]
Result II
[Figure labels: decoy entries, target entry]
Comparison of different transgenic cell lines
Mao L, Zabel C, Herrmann M, Nolden T, Mertes F, Magnol L, Chabert C, Hartl D, Herault Y, Delabar JM, Manke T, Himmelbauer H, Klose J. Proteomic shifts in embryonic stem cells with gene dose modifications suggest the presence of balancer proteins in protein regulatory networks. PLoS ONE. 2007 Nov 28;2(11):e1218.
Theory of "Balancer" and "Effector" Proteins I
Balancer proteins
[Figure label: protein amount]
• Deletion
• Mutation
• Expression change
• Development
• Aging
• etc.
Theory of "Balancer" and "Effector" Proteins II
Effector proteins
[Figure label: protein amount]
• Deletion
• Mutation
• Expression change
• Development
• Aging
• etc.
Balancer and effector proteins: a complex system
Mao et al. 2007
Scheduled tasks
• Establish a data collection center for quantitative data
• Collect quantitative data
  • Own data
  • Publications
  • Collaborations
• Identify "balancer" and "effector" proteins
• Identify "balancer" and "effector" attributes
• Simulate both "balancing" and "effects" to identify new "balancer" and "effector" proteins
• Cross-correlation with other omics
Ministerium für Wissenschaft und Forschung des Landes Nordrhein-Westfalen