
Common parameters


judah

Presentation Transcript


  1. Common parameters • At the beginning, one needs to set up the search parameters. • http://human.thegpm.org

  2. Common parameters • Most important: the input experimental spectra • Self-explanatory.

  3. Common parameters • Taxon and database • Self-explanatory. • E.g. samples from human cells should be queried against a human protein database. • Sometimes protein sequence libraries are available.

  4. Common parameters • Parent mass tolerance • If it is much smaller than the optimal value: • the correct peptide can be eliminated from the search space • execution time decreases • [Figure: spectra comparison]

  5. Common parameters • Parent mass tolerance • If it is much larger than the optimal value: • the significance of the scores decreases • execution time increases • [Figure: spectra comparison]

  6. Common parameters • Parent mass tolerance • Usually around 1 Da. • [Figure: spectra comparison]
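The effect of the parent mass tolerance on the search space can be sketched in a few lines of Python. This is only an illustration: the peptide list, the simulated 0.3 Da measurement error, and the function names are assumptions made for the example, and the residue masses are rounded standard monoisotopic values.

```python
# Rounded monoisotopic residue masses in Da (standard values, subset only).
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
           "T": 101.04768, "L": 113.08406, "D": 115.02694,
           "K": 128.09496, "E": 129.04259, "R": 156.10111}
WATER = 18.01056

def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide: residues plus one water."""
    return sum(RESIDUE[aa] for aa in seq) + WATER

def candidates_within_tolerance(observed_mass, peptides, tol_da):
    """Keep only peptides whose mass lies within tol_da of the observed parent mass."""
    return [p for p in peptides if abs(peptide_mass(p) - observed_mass) <= tol_da]

# Hypothetical candidate peptides; GAVLK is treated as the correct one.
peptides = ["GAVLK", "AGVLK", "SAVLK", "GAVLR", "TEDK"]
observed = peptide_mass("GAVLK") + 0.3  # simulated 0.3 Da measurement error

for tol in (0.05, 1.0, 30.0):
    print(tol, candidates_within_tolerance(observed, peptides, tol))
```

With the smallest tolerance the correct peptide drops out of the search space; with the largest one every candidate has to be scored.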

  7. Common parameters • Fragment ion match tolerance • Depends on the instrument accuracy. • If it is much smaller than the optimum, matches will be lost.

  8. Common parameters • Fragment ion match tolerance • If it is much smaller than the optimal value: • correctly matched peaks will be lost • the FDR increases, false negatives increase, and sensitivity decreases

  9. Common parameters • If the fragment ion match tolerance is much larger than the optimal value: • many theoretical peaks will match an experimental peak • random scores increase, which decreases the statistical significance
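A toy peak-matching routine makes both failure modes visible. The peak lists below are invented for illustration (they are not taken from a real spectrum), and counting matched peaks is only a stand-in for a real scoring function.

```python
def count_matches(theoretical, experimental, tol_da):
    """Count theoretical peaks with at least one experimental peak within tol_da."""
    return sum(
        any(abs(t - e) <= tol_da for e in experimental)
        for t in theoretical
    )

# Hypothetical m/z values.
experimental = [175.30, 246.05, 300.00, 345.50, 443.10, 500.00]
correct_peptide = [175.12, 246.16, 345.22, 444.29]  # theoretical peaks, true peptide
wrong_peptide = [176.90, 247.80, 343.60, 445.00]    # theoretical peaks, unrelated peptide

for tol in (0.05, 0.4, 2.0):
    print(tol,
          count_matches(correct_peptide, experimental, tol),
          count_matches(wrong_peptide, experimental, tol))
```

At 0.05 Da the true peptide loses all of its matches; at 2.0 Da the unrelated peptide matches as many peaks as the true one, so the match count no longer discriminates between them.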

  10. Common parameters

  11. Fragment ion tolerance (T) • T = 0.4 Da (correct) • T = 0.05 Da (too small) • T = 2.0 Da (too large)

  12. Fragment ion tolerance (T) • T = 0.4 Da (correct): 217 proteins, 713 homologs, 930 proteins in total • T = 0.05 Da (too small): 132 proteins, 406 homologs, 538 proteins in total • T = 2.0 Da (too large): 197 proteins, 589 homologs, 786 proteins in total

  13. Common parameters • Instrument • Some database search programs let you select the instrument type, such as ESI-QUAD or Quad-TOF. • This tunes the search engine by determining which fragment ion series are used for scoring. • E.g. immonium ions, a-series ions, b-, c-, x-, a-NH3, z+H series, y-H2O, etc.
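As a rough sketch of what "selecting ion series" means, the snippet below computes singly charged b- and y-ion m/z values and lets the caller choose which series to include. Only the b and y series are implemented; the function names and the idea of passing the series as a list are assumptions for the example, not the interface of any particular search engine.

```python
# Rounded monoisotopic residue masses in Da (standard values, subset only).
RESIDUE = {"G": 57.02146, "A": 71.03711, "V": 99.06841, "T": 101.04768,
           "Q": 128.05858, "F": 147.06841, "R": 156.10111}
WATER = 18.01056
PROTON = 1.00728

def b_ions(peptide):
    """Singly charged b ions: N-terminal fragments, b1 .. b(n-1)."""
    total, ions = 0.0, []
    for aa in peptide[:-1]:
        total += RESIDUE[aa]
        ions.append(total + PROTON)
    return ions

def y_ions(peptide):
    """Singly charged y ions: C-terminal fragments, y1 .. y(n-1)."""
    total, ions = 0.0, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE[aa]
        ions.append(total + WATER + PROTON)
    return ions

SERIES = {"b": b_ions, "y": y_ions}

def theoretical_peaks(peptide, series):
    """Sorted theoretical m/z values for the selected ion series."""
    return sorted(mz for name in series for mz in SERIES[name](peptide))

print(theoretical_peaks("TFGQVVAR", ["b", "y"]))
print(theoretical_peaks("TFGQVVAR", ["y"]))
```

Choosing fewer series gives fewer theoretical peaks to match, which is one way an instrument setting can change the score.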

  14. Common parameters • Enzyme • The enzyme used for digestion during biological sample preparation. • It is also used for the in silico digestion of protein sequences to generate peptides.
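A minimal in silico digestion might look like the following. It assumes trypsin with the commonly used rule "cleave after K or R, but not before P"; real search engines let you configure the rule, and the function name is made up for this example.

```python
def digest(protein):
    """Split a protein after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides

# A short fragment of a protein sequence, used as example input.
print(digest("MNRCWALFLSLCCYLRLVSAEGDPIPEELYEMLSDHSIR"))
```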

  15. Common parameters • E-value cutoff

  16. Common parameters • Ion mass search type • Monoisotopic (default): more accurate • Average: might need a larger fragment ion tolerance
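The gap between the two mass types can be checked directly. The tables below hold rounded standard monoisotopic and average residue masses for the residues needed here; the peptide AELDLNMTR is simply reused as an example input.

```python
# Rounded residue masses in Da (standard values, subset only).
MONO = {"A": 71.03711, "E": 129.04259, "L": 113.08406, "D": 115.02694,
        "N": 114.04293, "M": 131.04049, "T": 101.04768, "R": 156.10111}
AVG = {"A": 71.0788, "E": 129.1155, "L": 113.1594, "D": 115.0886,
       "N": 114.1038, "M": 131.1926, "T": 101.1051, "R": 156.1875}
WATER_MONO = 18.01056
WATER_AVG = 18.01528

def mass(peptide, table, water):
    """Neutral peptide mass using the given residue mass table."""
    return sum(table[aa] for aa in peptide) + water

peptide = "AELDLNMTR"
mono = mass(peptide, MONO, WATER_MONO)
avg = mass(peptide, AVG, WATER_AVG)
print(f"monoisotopic: {mono:.4f} Da, average: {avg:.4f} Da, difference: {avg - mono:.4f} Da")
```

For a nine-residue peptide the two values already differ by more than half a dalton, which is why the mass type has to agree with how the peaks were picked.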

  17. Common parameters • Charge state • Allowing too high a charge state increases the FDR.
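One plausible reading of this slide, sketched below, is that every additional charge hypothesis turns the same measured m/z into another candidate parent mass, so more candidate peptides are scored per spectrum. The function names and the example m/z are assumptions.

```python
PROTON = 1.00728  # Da, rounded

def neutral_mass(mz, charge):
    """Neutral mass implied by a measured m/z under a given charge hypothesis."""
    return charge * (mz - PROTON)

def candidate_masses(mz, max_charge):
    """One candidate parent mass per allowed charge state, 1 .. max_charge."""
    return {z: neutral_mass(mz, z) for z in range(1, max_charge + 1)}

for z, m in candidate_masses(500.0, 4).items():
    print(f"z = {z}: {m:.4f} Da")
```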

  18. Common parameters • Decoy search • Includes a reversed database in the peptide identification. • Provides more accurate p-value and FDR estimation. • Can double the search time.
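The idea can be sketched as follows: reverse every protein to build the decoy entries, search both sets, and use the number of decoy hits above a score threshold to estimate how many target hits are false. The `REV_` prefix, the scores, and the simple decoys/targets ratio are assumptions chosen for illustration; individual tools differ in the exact estimator.

```python
def make_decoys(proteins):
    """Build a reversed-sequence decoy entry for every target protein."""
    return {"REV_" + name: seq[::-1] for name, seq in proteins.items()}

def estimate_fdr(matches, threshold):
    """Estimate FDR as decoy hits / target hits among matches scoring >= threshold.

    matches: list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

targets_db = {"P1": "MNRCWALFLSLCCYLR"}
print(make_decoys(targets_db))

# Hypothetical search results: (score, matched a decoy sequence?).
matches = [(32, False), (15, False), (5, False), (4, True),
           (4, False), (3, True), (3, False)]
for threshold in (10, 4):
    print(threshold, estimate_fdr(matches, threshold))
```

Because the decoy entries double the database, the search has roughly twice as many sequences to score, which matches the slide's remark about search time.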

  19. Common parameters • Error tolerant search • A large number of spectra may remain without a significant score, and a reasonable number of fragment ion peaks may remain unmatched. Possible reasons: • Underestimated mass measurement error (should be visible in the peptide view graphs) • Incorrect determination of the precursor charge state • Peptide sequence is not in the database • Missed cleavage and unexpected cleavage • Unexpected chemical and post-translational modification

  20. Input data • [Workflow figure: Input data, Peptide assignment, Validation, Protein inference, Interpretation, Quantitation] • Input data: experimental spectra and protein sequence DB • Scores: 13. 15; 6. 4; 1. 4; 9. 3; 4. 3; 3. 2; 7. 2; 11. 2; 8. 1; 10. 1; 2. 1; 5. 1; 12. 1 • Score: 32, Peptide: SHLITLLLFLFHSETICR, Cn = (32 - 4) / 32 = 0.875 • Score: 4, Peptide: AELDLNMTR, Cn = (4 - 4) / 4 = 0 • Score: 3, Peptide: MEICRGLR, Cn = (3 - 3) / 3 = 0 • Score: 15, Peptide: LLHGDPGEEDK, Cn = (15 - 4) / 15 = 0.733 • Score: 4, Peptide: MDHPEDESHSEK • Score: 5, Peptide: SAEDLEADK • Score: 3, Peptide: SIEAKLTLR • Keep the peptide assignments that exceed a certain limit.
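The Cn values on this slide are consistent with the formula (best score - second-best score) / best score, computed per spectrum. Under that assumption, the calculation and the final filtering step can be sketched as below; the 0.5 cutoff is an arbitrary value chosen for the example.

```python
def delta_cn(scores):
    """(best - second best) / best for the candidate scores of one spectrum.

    Assumes at least two candidate scores and a positive best score.
    """
    ranked = sorted(scores, reverse=True)
    best, second = ranked[0], ranked[1]
    return (best - second) / best

# Candidate scores per spectrum; the top two values are taken from the slide.
spectra = {
    "SHLITLLLFLFHSETICR": [32, 4],
    "AELDLNMTR": [4, 4],
    "MEICRGLR": [3, 3],
    "LLHGDPGEEDK": [15, 4],
}

CUTOFF = 0.5  # hypothetical limit
kept = [pep for pep, scores in spectra.items() if delta_cn(scores) > CUTOFF]
print(kept)
```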

  21. Peptide assignment • [Workflow figure: Input data, Peptide assignment, Validation, Protein inference, Interpretation, Quantitation] • Unexpected cleavages • Scores: 1. 2 • Candidate peptides for the spectra comparison: TFGQVVAR, FGQVVAR, GQVVAR, QVVAR, VVAR, VAR, AR, TFGQVVA, TFGQVV, TFGQV, TFGQ, TFG, TF • Protein sequence DB: >IPI:IPI00000044.1|SWISS-PROT:P01127 MNRTFGQVVARLVSAEGDPIPEELYEMLSDHSIRSFDDLQRLLHGDPGEEDKAELDLNMTRSHSGGELESLARGRRSLGSLTIAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNVQCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKCETVAAARPVTRSPGGSQEQRAKTPQTRVTIRTVRVRRPPKGKHRKFKHTHDKTALKETLGA
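The candidate list on this slide can be reproduced by truncating the fully tryptic peptide TFGQVVAR from either end. The sketch below does exactly that; the function name and the minimum length of two residues are assumptions inferred from the shortest peptides shown.

```python
def truncations(peptide, min_len=2):
    """The peptide itself, then its N-terminal and C-terminal truncations."""
    # Drop residues from the N-terminus (keeps the enzymatic C-terminus).
    n_truncated = [peptide[i:] for i in range(0, len(peptide) - min_len + 1)]
    # Drop residues from the C-terminus (keeps the enzymatic N-terminus).
    c_truncated = [peptide[:j] for j in range(len(peptide) - 1, min_len - 1, -1)]
    return n_truncated + c_truncated

print(truncations("TFGQVVAR"))
```

One tryptic peptide becomes thirteen candidates, which shows how quickly allowing unexpected cleavages enlarges the search space.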

  22. Peptide assignment • [Workflow figure: Input data, Peptide assignment, Validation, Protein inference, Interpretation, Quantitation] • Missed cleavages • Scores: 1. 2 • Spectra comparison against the protein sequence DB: >IPI:IPI00000044.1|SWISS-PROT:P01127 MNRCWALFLSLCCYLRLVSAEGDPIPEELYEMLSDHSIRSFDDLQRLLHGDPGEEDKAELDLNMTRSHSGGELESLARGRRSLGSLTIAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNVQCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKCETVAAARPVTRSPGGSQEQRAKTPQTRVTIRTVRVRRPPKGKHRKFKHTHDKTALKETLGA

  23. Peptide assignment • [Workflow figure: Input data, Peptide assignment, Validation, Protein inference, Interpretation, Quantitation] • Missed cleavages • Scores: 2, 2 • Spectra comparison against the protein sequence DB: >IPI:IPI00000044.1|SWISS-PROT:P01127 MNRCWALFLSLCCYLRLVSAEGDPIPEELYEMLSDHSIRSFDDLQRLLHGDPGEEDKAELDLNMTRSHSGGELESLARGRRSLGSLTIAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNVQCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKCETVAAARPVTRSPGGSQEQRAKTPQTRVTIRTVRVRRPPKGKHRKFKHTHDKTALKETLGA

  24. Peptide assignment • [Workflow figure: Input data, Peptide assignment, Validation, Protein inference, Interpretation, Quantitation] • Missed cleavages • Scores: 2, 2, 1 • Spectra comparison against the protein sequence DB: >IPI:IPI00000044.1|SWISS-PROT:P01127 MNRCWALFLSLCCYLRLVSAEGDPIPEELYEMLSDHSIRSFDDLQRLLHGDPGEEDKAELDLNMTRSHSGGELESLARGRRSLGSLTIAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNVQCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKCETVAAARPVTRSPGGSQEQRAKTPQTRVTIRTVRVRRPPKGKHRKFKHTHDKTALKETLGA
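Missed cleavages can be modelled by joining neighbouring fully digested peptides. The sketch below starts from the first three tryptic peptides of the sequence above (assuming cleavage after K or R) and allows up to a chosen number of missed sites; the function name is made up for the example.

```python
def with_missed_cleavages(pieces, max_missed):
    """Join up to max_missed + 1 adjacent fully digested peptides."""
    peptides = []
    for i in range(len(pieces)):
        for missed in range(max_missed + 1):
            if i + missed < len(pieces):
                peptides.append("".join(pieces[i:i + missed + 1]))
    return peptides

# First three tryptic peptides of the example protein.
pieces = ["MNR", "CWALFLSLCCYLR", "LVSAEGDPIPEELYEMLSDHSIR"]

for max_missed in (0, 1, 2):
    print(max_missed, with_missed_cleavages(pieces, max_missed))
```

Each extra allowed missed cleavage adds candidates on top of the fully digested set.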

  25. Common parameters • Automatic error tolerant search • Chemical and post-translational modifications (PTMs) • Fixed modifications (always change the mass of the amino acid) • Variable modifications (may or may not change the mass) • Search engines iteratively insert all combinations of the possible PTMs.
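The difference between fixed and variable modifications can be illustrated with a small enumeration. The two modifications used here (carbamidomethylation of C as fixed, oxidation of M as variable) and their rounded mass shifts are common choices picked as assumptions for the example, not settings stated in the slides.

```python
from itertools import product

FIXED = {"C": 57.02146}     # assumed fixed: carbamidomethylation, applied to every C
VARIABLE = {"M": 15.99491}  # assumed variable: oxidation, present or absent on each M

def mass_shifts(peptide):
    """All total mass shifts (Da) a peptide can carry under the settings above."""
    fixed_shift = sum(FIXED.get(aa, 0.0) for aa in peptide)
    sites = [i for i, aa in enumerate(peptide) if aa in VARIABLE]
    shifts = []
    # Every variable site is either unmodified or modified: 2 ** len(sites) forms.
    for pattern in product((False, True), repeat=len(sites)):
        variable_shift = sum(
            VARIABLE[peptide[i]] for i, modified in zip(sites, pattern) if modified
        )
        shifts.append(round(fixed_shift + variable_shift, 5))
    return shifts

print(mass_shifts("MEICRGLR"))  # one C (fixed), one M (variable)
print(len(mass_shifts("MMMK")))  # three variable sites
```

A fixed modification changes the mass without adding candidates, while each variable site doubles the number of forms that have to be searched.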

  26. Common parameters • Automatic error tolerant search • More peptides can be identified. • Enlarges the search space considerably. • Increases the execution time. • Decreases the statistical significance and increases the FDR.

  27. Common parameters • Automatic error tolerant search • To reduce the search space, a two-pass approach is applied. • 1st pass: • Identification of perfect peptides (no PTMs, perfect digestion) • 2nd pass: • Pass on the proteins for which at least one peptide was identified in the 1st pass. • Extensive search in the reduced set of protein sequences, including missed and unexpected cleavages, PTMs, point mutations, etc.
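The two-pass logic can be sketched with a deliberately simplified model in which "observed" peptide strings stand in for spectra, the strict pass uses perfect digestion after K or R, and the relaxed pass additionally allows one missed cleavage. The protein names, sequences, and function names are all hypothetical.

```python
def digest(seq):
    """Perfect digestion: cleave after every K or R (toy rule)."""
    pieces, start = [], 0
    for i, aa in enumerate(seq):
        if aa in "KR":
            pieces.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        pieces.append(seq[start:])
    return pieces

def strict(seq):
    """1st-pass candidates: perfectly digested peptides only."""
    return set(digest(seq))

def relaxed(seq):
    """2nd-pass candidates: perfect peptides plus one missed cleavage."""
    p = digest(seq)
    return set(p) | {p[i] + p[i + 1] for i in range(len(p) - 1)}

def two_pass(observed, proteins):
    """Return peptides identified per protein after both passes."""
    # 1st pass: keep proteins with at least one perfect peptide identified.
    passed = [name for name, seq in proteins.items() if strict(seq) & observed]
    # 2nd pass: extensive search restricted to the proteins that passed.
    return {name: relaxed(proteins[name]) & observed for name in passed}

proteins = {"P1": "AAKCCRDDK", "P2": "GGKHHR"}  # hypothetical database
observed = {"AAK", "CCRDDK", "GGKHHR"}           # stand-in for identified spectra
print(two_pass(observed, proteins))
```

In this toy run P2 never reaches the second pass, so its missed-cleavage peptide stays unidentified; that is the price paid for the smaller search space.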

  28. Common parameters • Output parameters • Mainly about formatting the result files: which details, and how many, you want to see.

  29. Common parameters • Other program-specific parameters. • Different for X!Tandem, Mascot, Sequest, etc.

  30. X!Tandem

  31. Outputs – Browsing the results

  32. Outputs – Browsing the results

  33. Outputs – Browsing the results

  34. Outputs – Browsing the results

  35. Outputs – Browsing the results

  36. OMSSA’s search engine

  37. OMSSA’s output

  38. OMSSA’s result

  39. Good spectrum, good score, bad annotation • Rare if the p-value is significant • Good spectrum, bad score, bad annotation • The peptide might be modified, imperfectly digested, or not in the database.

  40. Bad spectrum, bad score, bad annotation

  41. Good spectrum, good score, good annotation

  42. Trans-Proteomic Pipeline (TPP) • Trans-Proteomic Pipeline (TPP) is a data analysis pipeline for the analysis of LC/MS/MS proteomics data. • TPP includes modules for validation of database search results, quantitation of isotopically labeled samples, and validation of protein identifications, as well as tools for viewing raw LC/MS data, peptide identification results, and protein identification results. • The XML backbone of this pipeline enables a uniform analysis for LC/MS/MS data generated by a wide variety of mass spectrometer types, and assigned peptides using a wide variety of database search engines. 

  43. Trans-Proteomic Pipeline (TPP)

  44. Summary • Protein identification from MS/MS data is not a black box. • Always look at the results and understand how it
