Modification Site Localization

aizza

  1. Modification Site Localization • Why is this a problem? • Calculating localization reliability • Ways of representing reliability • Modification ambiguity

  2. PTM Analysis: An Exploding Field • Large-scale PTM characterization studies are now common • Phosphorylation • O-GlcNAcylation • Acetylation • … • Database search engines can identify modified peptides and report a measure of reliability for peptide IDs • Peptide Level: p-value; e-value • Dataset Level: false discovery rate (FDR) • Most search engines do not assess the reliability of modification site assignments. • There is no standard method for calculating the false localization rate (FLR)

  3. Search Engine Performance for Site Assignment • Database search engines are optimized for peptide identification • Optimal parameters for discriminating between correct and random answers are not the same as those for site identification • More peaks may be needed for site assignment • Reliability of modified peptide identifications is higher than that of PTM site assignments • What most search engines do: • Report a site consistent with the data • There may be more than one site equally consistent with the data • No information about how reliable the site assignment is • Bradshaw et al., J Mass Spectrom (2010) 45(10): 1095-1097

  4. There are Mistakes In The Literature • There are several large-scale PTM datasets where site assignment was 'by manual verification'. • Did the authors carefully look at 1000+ spectra? • Results from publications are used to populate other databases: • PhosphoSite • SwissProt

  5. Evidence for Serine 486 Phosphorylation • Spectrum from publication reporting unambiguous assignment of serine 4 (serine 487) phosphorylation. • Annotated spectra associated with publications are useful!

  6. Why I highlighted this example • I found this modification site in my own data in 2006 • [Figure: SwissProt entry of this protein in 2006]

  7. Site Assignment Scoring Methods (1) • Probability of randomly observing a given peak • A-Score (Gygi) • PTM Score (Mann) • Probability calculation based on unit mass measurement and assuming all masses are equally possible at random: • e.g. if considering 4 peaks per 100 Da, then the probability of a random match of a given peak is 4% • A-Score is a number; PTM Score reports a probability • How valid are these assumptions? • Nominal mass may be appropriate for ion trap data with poor mass accuracy, but not for high mass accuracy data • Could adjust the probability calculation to use more mass 'bins' • All masses are not equally probable; e.g. for b ions (nominal mass, with example residue pairs): • 201 – EA, LS, IS, TV • 202 – NS • 203 – MA, CV, TT • 204 – Not possible • 205 – FG, CT • 206 – Not possible
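The two points on this slide (the flat random-match probability and the uneven distribution of possible masses) can be checked with a small sketch. It assumes integer nominal residue masses, unmodified cysteine, and a b2 nominal mass equal to the two residue masses plus 1 for the proton. The function names are illustrative and are not taken from A-Score or PTM Score; the binomial tail is shown only as the usual way to turn a per-peak random-match probability into the chance of matching at least k of n peaks.

```python
from itertools import combinations_with_replacement
from math import comb

# Nominal (integer) residue masses in Da; Cys is unmodified (an assumption).
NOMINAL = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
           "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
           "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186}


def b2_compositions(nominal_mass):
    """Residue pairs whose b2 ion (two residues + 1) has this nominal mass."""
    pairs = combinations_with_replacement(sorted(NOMINAL), 2)
    return sorted(a + b for a, b in pairs
                  if NOMINAL[a] + NOMINAL[b] + 1 == nominal_mass)


def random_match_probability(peaks_per_100_da):
    """Flat model: every unit mass is equally likely to hold a peak."""
    return peaks_per_100_da / 100.0


def prob_at_least(k, n, p):
    """Binomial tail: chance of k or more random matches out of n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    for mass in range(201, 207):
        print(mass, b2_compositions(mass) or "not possible")
    print(random_match_probability(4))  # 4 peaks per 100 Da
```

Under these assumptions, `b2_compositions(204)` and `b2_compositions(206)` return empty lists, which is the slide's point: a flat per-Dalton probability overstates the chance of a random match at some masses and understates it at others.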

  8. Site Assignment Scoring Methods (2) • Score/probability difference • Compare search engine probabilities for peptide IDs with different site assignments • Mascot Delta Score • SLIP Score • e.g. Top scoring assignment: E-value: 1E-5 • Next best site assignment: E-value 1E-4; SLIP score=10 • Next best site assignment: E-value 1E-3; SLIP score=20 • Advantages: • Can be calculated as part of database search • Accounts for variation of probability of observing different masses • If search engine makes use of mass accuracy, score will adjust to data of different mass accuracy
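The worked numbers on this slide are consistent with taking the difference between two candidate site assignments on a -10·log10(E-value) scale. A minimal sketch under that assumption follows; the function name is illustrative, and the actual Protein Prospector or Mascot implementations may differ in detail.

```python
import math


def score_difference(best_evalue, next_best_evalue):
    """Difference in -10*log10(E) between the best and next-best site assignment."""
    if best_evalue <= 0 or next_best_evalue <= 0:
        raise ValueError("E-values must be positive")
    return 10.0 * math.log10(next_best_evalue / best_evalue)


if __name__ == "__main__":
    print(round(score_difference(1e-5, 1e-4)))  # slide example: 10
    print(round(score_difference(1e-5, 1e-3)))  # slide example: 20
```

Because the score is a ratio of the search engine's own probabilities, it inherits whatever the engine already models, such as mass accuracy and the uneven likelihood of different fragment masses.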

  9. Assessing Reliability of Site Localization Scoring • Data from 180 synthetic phosphopeptides • Tested with a wide range of fragmentation data (CID, HCD, ETD, MSA…) • Comparison of Mascot Delta Score to A-Score • SLIP Score in Protein Prospector • PhosphoRS used a different set of synthetic phosphopeptides • Savitski et al., Mol Cell Proteomics (2011) M110.003830

  10. SLIP Score vs A-Score vs MD-Score • Dataset: QTOF Micro CID data of 180 synthetic phosphopeptides • Modification sites known • Data searched by Mascot: 2174 correct spectrum matches • Data searched by Protein Prospector (PP): 2334 correct spectrum matches • Baker et al., Mol Cell Proteomics (2011) M111.008078

  11. Decoy Sites for Estimating PEP (Local FLR) • Test Dataset: Synaptic phosphopeptides acquired on an LTQ-Orbitrap Velos (IT-CID): 70,000 phosphopeptide spectra identified • Altered Batch-Tag to allow for phosphorylation of Pro and Glu • Filtered results to only phosphopeptide IDs containing one S, T or Y • Modification site known • Local FLR: SLIP score of 6 = 95% correct • Global FLR (matches to phosphoP and phosphoE) similar to QTOF Micro data • A similar score threshold is appropriate for ion trap CID and quadrupole CID data
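The decoy-site idea can be sketched as a simple count: any assignment to a residue that cannot be phosphorylated (here Pro or Glu) must be wrong, so the fraction of such assignments above a score threshold gives a rough handle on the error rate. This is a deliberately simplified illustration. It does not normalize for the relative numbers of decoy and genuine candidate residues in each peptide, which a real FLR estimate would need, and the function name and the toy data are made up for the example.

```python
def decoy_fraction(assignments, threshold, decoy_residues=("P", "E")):
    """Fraction of site assignments at or above `threshold` that fall on decoy residues.

    assignments: iterable of (residue, score) pairs, one per localized site.
    Returns None if nothing passes the threshold.
    """
    kept = [residue for residue, score in assignments if score >= threshold]
    if not kept:
        return None
    decoys = sum(1 for residue in kept if residue in decoy_residues)
    return decoys / len(kept)


if __name__ == "__main__":
    # Toy data, invented for illustration only: (assigned residue, localization score).
    toy = [("S", 12), ("P", 2), ("T", 8), ("E", 7), ("S", 6), ("P", 1)]
    for cutoff in (0, 6, 10):
        print(cutoff, decoy_fraction(toy, cutoff))
```

Scanning the threshold and looking for the score at which the decoy fraction drops to an acceptable level is one way to choose a cutoff comparable across instrument types.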

  12. Representing Ambiguity • VATVSVLATR – Singly phosphorylated • Phospho@5=3: Best site assignment with associated score. No information as to which is the second best site. Example software: A-Score; Mascot Delta Score; SLIP Score • Phospho@3|5: Indicates inability to differentiate between two sites, either due to no information or because confidence is below a defined threshold. Example software: SLIP Score; VML Score • VAT(0.1)VS(0.89)VLAT(0.01)R: Probabilities for all potential site assignments within the peptide are reported. Example software: PTM Score / MaxQuant; PhosphoRS
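The third notation embeds a probability after each candidate residue, so it can be read back into a plain sequence plus a position-to-probability map. The parser below is a minimal sketch that assumes single-letter residues followed by an optional parenthesized number, as in the slide's example; it is not the parser used by MaxQuant or PhosphoRS, and the function name is illustrative.

```python
import re

# One residue letter followed by a parenthesized probability, e.g. "T(0.1)".
_SITE = re.compile(r"([A-Z])\((\d*\.?\d+)\)")


def parse_site_probabilities(annotated):
    """Return (plain_sequence, {1-based position: probability})."""
    sequence = []
    probabilities = {}
    i = 0
    while i < len(annotated):
        match = _SITE.match(annotated, i)
        if match:
            sequence.append(match.group(1))
            probabilities[len(sequence)] = float(match.group(2))
            i = match.end()
        else:
            sequence.append(annotated[i])
            i += 1
    return "".join(sequence), probabilities


if __name__ == "__main__":
    print(parse_site_probabilities("VAT(0.1)VS(0.89)VLAT(0.01)R"))
```

For the slide's example this yields the sequence VATVSVLATR with probabilities at positions 3, 5 and 9, which makes the relationship to the Phospho@5 style of notation explicit.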

  13. Representing Ambiguity • VATVSVLATR – Doubly phosphorylated • Phospho@3=12; Phospho@5=3: Best site assignments with associated scores. A separate score is calculated for each site assignment. Each score is in comparison to the best assignment not containing that particular modification site; i.e. the score for @3 is relative to when residues 5 and 9 are modified. • Phospho@3=12; Phospho@5|9: One site has a confidence measure; the other site does not. • VAT(0.95)VS(0.9)VLAT(0.15)R: Probabilities are combination probabilities for one of the two modifications.

  14. Site-Level or Peptide-Level Assessment for Localization Reliability • All current software reports reliability for individual site localizations, but software could in theory calculate a reliability for the combination of modifications reported • e.g. VAT(0.95)VS(0.9)VLAT(0.15)R could be reported as VAT(phospho)VS(phospho)VLATR with probability 0.95 × 0.9 ≈ 0.86
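The peptide-level figure on this slide is the product of the per-site probabilities for the sites that are reported. A minimal sketch of that arithmetic follows; it treats the site probabilities as independent, which is the slide's own simplification rather than an established method, and the function name is illustrative.

```python
from math import prod


def combination_probability(site_probabilities, n_modifications):
    """Pick the n most probable sites and multiply their probabilities.

    site_probabilities: {1-based position: probability}.
    Returns (sorted positions, product of their probabilities).
    """
    ranked = sorted(site_probabilities.items(), key=lambda kv: kv[1], reverse=True)
    chosen = ranked[:n_modifications]
    positions = sorted(position for position, _ in chosen)
    return positions, prod(probability for _, probability in chosen)


if __name__ == "__main__":
    # Doubly phosphorylated VATVSVLATR from the slide.
    print(combination_probability({3: 0.95, 5: 0.9, 9: 0.15}, 2))
```

For the slide's example the chosen positions are 3 and 5, and the product is about 0.855, which the slide rounds to 0.86.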

  15. Modification Ambiguity • Some modifications are isobaric • Acetyl vs Trimethyl; Phospho vs Sulfo; Ser->Thr vs Methyl • Some combinations of modifications are isobaric/isomeric with a single modification • Methyl + Methyl vs Dimethyl • Carbamidomethyl + Carbamidomethyl vs GlyGly (ubiquitin) • Carbamidomethyl + Methyl vs Propionamide (acrylamide) • Acetyl + K+/Ca2+ adduct vs Phospho
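How hard these pairs are to distinguish depends on the mass gap between them. The sketch below compares monoisotopic mass shifts (values as listed in Unimod) and expresses the gap in ppm at a given peptide mass. The 1500 Da peptide mass is an arbitrary illustrative choice, the function name is hypothetical, and the metal-adduct case from the slide is left out.

```python
# Monoisotopic mass shifts in Da (Unimod values).
DELTA = {
    "Acetyl": 42.010565,
    "Trimethyl": 42.046950,
    "Phospho": 79.966331,
    "Sulfo": 79.956815,
    "Methyl": 14.015650,
    "Dimethyl": 28.031300,
    "Carbamidomethyl": 57.021464,
    "GlyGly": 114.042927,
    "Propionamide": 71.037114,
}


def mass_gap_ppm(mods_a, mods_b, peptide_mass):
    """Gap between two sets of modifications, in ppm of the peptide mass."""
    gap = abs(sum(DELTA[m] for m in mods_a) - sum(DELTA[m] for m in mods_b))
    return gap / peptide_mass * 1e6


if __name__ == "__main__":
    mass = 1500.0  # illustrative peptide mass in Da (an assumption)
    print(mass_gap_ppm(["Acetyl"], ["Trimethyl"], mass))
    print(mass_gap_ppm(["Phospho"], ["Sulfo"], mass))
    print(mass_gap_ppm(["Carbamidomethyl"] * 2, ["GlyGly"], mass))
    print(mass_gap_ppm(["Carbamidomethyl", "Methyl"], ["Propionamide"], mass))
```

At 1500 Da, acetyl vs trimethyl differ by roughly 24 ppm and phospho vs sulfo by roughly 6 ppm, so accurate precursor mass can help separate them. The isomeric combinations have the same elemental composition, so their gap is effectively zero and they cannot be resolved by mass alone.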

  16. Modification Ambiguity • Many published site localization tools were written specifically for phosphorylation, so they will not work for other PTMs. • Site localization scoring based on search engine results should work for all modifications • SLIP Score; Mascot Delta Score; VML Score • However, these scores will only be meaningful if the competing modification alternatives were considered in the initial database search • If carbamidomethyl modification of lysines or N-termini, in addition to cysteines, was not considered, then two carbamidomethyl modifications may not be considered as an alternative to ubiquitination. • Knowing which modifications were considered is relevant to evaluating site localization reliability

  17. PTMs in Crosslinked Peptides • For crosslinked peptides, ambiguity may be between peptides • e.g. CAMKER crosslinked to TMAKER: oxidation could be on the methionine in either peptide.
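For a crosslinked pair, the candidate site list has to span both peptides rather than one. A minimal sketch of that enumeration follows, using the slide's two sequences; the function name is illustrative.

```python
def candidate_sites(peptides, residue):
    """List (peptide index, residue position) for every occurrence of `residue`.

    Both indices are 1-based.
    """
    return [(pep_index, position)
            for pep_index, peptide in enumerate(peptides, start=1)
            for position, amino_acid in enumerate(peptide, start=1)
            if amino_acid == residue]


if __name__ == "__main__":
    # Oxidation candidates (Met) across the crosslinked pair from the slide.
    print(candidate_sites(["CAMKER", "TMAKER"], "M"))
```

Here there is one methionine in each peptide, so a localization score has to discriminate between peptides as well as between positions.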

  18. What is an Acceptable FLR? • The 2012 iPRG study involved identification of modified peptides • Participants were asked to return results with 1% FDR at the PSM level • They were asked to indicate for which peptides they thought PTM site assignments were reliable • Modified peptides were spiked in, so correct site localizations were known • What was the reliability of the results reported?
