In-depth Analysis of Protein Amino Acid Sequence and PTMs with High-resolution Mass Spectrometry
Lian Yang (2); Baozhen Shan (1); Bin Ma (2)
(1) Bioinformatics Solutions Inc, Canada
(2) University of Waterloo, Canada
Protein sequence analysis
• Problem: complete protein sequence coverage
  • antibody confirmation
  • biomarker discovery
• Database search software alone is insufficient
Protein sequence analysis
• Possible reasons for incomplete coverage
  • “non-database” peptides
  • unexpected modifications
  • mutated residues
  • novel peptides
  • database errors
• Meanwhile, a large number of high-quality spectra are not matched.
Proposed workflow for in-depth analysis
• A workflow to identify both database and “non-database” peptides
• Objective
  • Maximize protein sequence coverage
  • Explain more high-quality MS/MS spectra
Proposed workflow for in-depth analysis
• Multiple protein digests with different enzymes
• High-accuracy MS for both precursor and fragment ions
[Figure: workflow diagram; input: multiple-enzyme digests]
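To illustrate why digests with different enzymes complement each other, here is a minimal in-silico digestion sketch. The cleavage rules are the usual textbook simplifications (trypsin after K/R but not before P, LysC after K, GluC after E), missed cleavages are ignored, and the toy sequence and the function name `digest` are assumptions for illustration only, not part of the presented workflow.

```python
# Simplified cleavage rules: each rule sees the residue before and after
# a candidate cut position. Assumption: no missed cleavages are modeled.
RULES = {
    "trypsin": lambda prev, nxt: prev in "KR" and nxt != "P",
    "lysc": lambda prev, nxt: prev == "K",
    "gluc": lambda prev, nxt: prev == "E",
}


def digest(sequence, enzyme):
    """Return (start, end, peptide) tuples with 0-based, half-open positions."""
    cuts_here = RULES[enzyme]
    bounds = [0]
    for i in range(1, len(sequence)):
        if cuts_here(sequence[i - 1], sequence[i]):
            bounds.append(i)
    bounds.append(len(sequence))
    return [(s, e, sequence[s:e]) for s, e in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AKPEDRGKTE"  # hypothetical sequence, not taken from BSA
    for enzyme in RULES:
        print(enzyme, [pep for _, _, pep in digest(toy, enzyme)])
```

Each enzyme cuts the same sequence into different peptides, so a region that yields a peptide too short or too long to identify in one digest may be observable in another.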
Proposed workflow for in-depth analysis
• Identify de novo sequence tags
• Reveal a set of high-quality spectra
Reference: PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom. 2003;17(20):2337-42.
Proposed workflow for in-depth analysis
• Identify database peptides
• Database search results are validated by de novo tags
• Reveal a set of confident proteins
Reference: PEAKS DB: De Novo sequencing assisted database search for sensitive and accurate peptide identification. Mol Cell Proteomics 2012; 11:10.1074, 1–8.
Proposed workflow for in-depth analysis
• Input: spectra with highly confident de novo tags but no significant database matches
• Identify peptides with unexpected modifications
  • Peptides from the set of confident proteins are “modified” in silico by trying all possible modifications in UNIMOD.
  • The search is sped up by de novo tags.
Reference: PeaksPTM: Mass spectrometry-based identification of peptides with unspecified modifications. Journal of Proteome Research 10.7 (2011): 2930-2936.
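The idea of “modifying” a peptide in silico can be sketched as a precursor-mass check: add each candidate modification's mass shift to the unmodified peptide mass and keep the candidates that match the observed mass. This is only a rough illustration, not the PeaksPTM algorithm, which also scores fragment ions. The modification table below is a small hand-picked subset of Unimod entries, and the function names and the 10 ppm tolerance are assumptions.

```python
# Monoisotopic residue masses (Da) and the mass of water.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565

# Small illustrative subset of Unimod: name -> (mass shift in Da, allowed residues).
MODS = {
    "Carbamidomethyl": (57.021464, "C"),
    "Oxidation": (15.994915, "M"),
    "Deamidated": (0.984016, "NQ"),
    "Phospho": (79.966331, "STY"),
    "Methyl": (14.015650, "KR"),
}


def peptide_mass(peptide):
    """Monoisotopic mass of the unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def explain_mass_shift(peptide, observed_mass, tol_ppm=10.0):
    """List (modification, candidate positions) pairs that explain the observed mass."""
    base = peptide_mass(peptide)
    hits = []
    for name, (delta, sites) in MODS.items():
        theoretical = base + delta
        if abs(observed_mass - theoretical) / theoretical * 1e6 <= tol_ppm:
            positions = [i for i, aa in enumerate(peptide) if aa in sites]
            if positions:  # the peptide must contain a residue the mod can sit on
                hits.append((name, positions))
    return hits


if __name__ == "__main__":
    pep = "QTALVELLK"
    print(explain_mass_shift(pep, peptide_mass(pep) + 0.984016))
```

A precursor-mass match alone cannot localize the modification when several residues qualify, which is one reason fragment-ion evidence and de novo tags are needed in practice.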
Proposed workflow for in-depth analysis
• Input: spectra with highly confident de novo tags but no significant database matches
• Identify peptides with mutations, such as residue insertion, deletion, and substitution
  • Screen the protein database to find short sequences similar to de novo tags
  • Use both the de novo tags and the database sequence to reconstruct the most probable sequences that match the spectrum
Reference: SPIDER: software for protein identification from sequence tags with de novo sequencing error. J Bioinform Comput Biol. 2005 Jun;3(3):697-716.
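The screening step can be pictured with a much simpler stand-in: slide the de novo tag along the protein sequence and count identical residues at each offset. This is not the SPIDER algorithm, which models de novo sequencing errors and mass-equivalent substitutions; it is an ungapped identity count under stated assumptions. The function name is hypothetical, and the two sequences are the pair shown later in the deck (with the cleavage dots removed).

```python
def best_ungapped_match(tag, protein):
    """Return (matches, offset) for the best ungapped placement of tag on protein.

    I and L are treated as identical because they have the same mass and
    cannot be told apart by ordinary fragment masses.
    """
    tag = tag.replace("I", "L")
    protein = protein.replace("I", "L")
    best = (-1, -1)
    for offset in range(len(protein) - len(tag) + 1):
        window = protein[offset:offset + len(tag)]
        matches = sum(a == b for a, b in zip(tag, window))
        if matches > best[0]:  # keep the first best offset on ties
            best = (matches, offset)
    return best


if __name__ == "__main__":
    print(best_ungapped_match("DPALVELLKK", "KKQTALVELLKHK"))
```

For this pair the best placement is at offset 2 with 7 identical residues (ALVELLK). A real implementation would also allow insertions and deletions and would weigh mismatches by how well they fit the spectrum.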
Proposed workflow for in-depth analysis
• Unassigned de novo sequence tags are reported as possible novel peptides
Proposed workflow for in-depth analysis • Result integration
In-depth analysis of BSA
Test the workflow with the standard bovine serum albumin (BSA).
• Sample
  • Pure ALBU_BOVIN from Sigma
  • 3 digests with trypsin, LysC, and GluC
• LC-MS/MS
  • Thermo LTQ-Orbitrap XL
• Workflow
  • Implemented in PEAKS 6
  • 3 digests in one project
  • Searched database: Swiss-Prot
Result
• More PSMs are identified in each additional step (5,152 MS/MS spectra in total; PSMs filtered at 1% FDR)
  • Database search: 1,737 PSMs
  • Subsequent steps: 906 PSMs and 44 PSMs
  • Cumulative: 1,737 -> 2,687 PSMs
• De novo only: 38 MS/MS spectra with PEAKS ALC score > 70%
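The numbers on this slide can be checked with a few lines of arithmetic. The sketch assumes that the three PSM counts (1,737, 906, and 44) come from successive steps and are additive, which is consistent with the reported total of 2,687; the variable names are illustrative.

```python
total_spectra = 5152
step_psms = [1737, 906, 44]  # PSM counts as listed on the slide, in order

cumulative = sum(step_psms)
before = step_psms[0] / total_spectra  # fraction explained by the first step
after = cumulative / total_spectra     # fraction explained after all steps

if __name__ == "__main__":
    print(f"cumulative PSMs: {cumulative}")
    print(f"explained spectra: {before:.1%} -> {after:.1%}")
```

Under this reading, the share of explained spectra rises from roughly a third to roughly half of the 5,152 acquired spectra.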
Result
• BSA coverage
  • The uncovered 4% is in the protein N-terminal region, which is most likely cleaved off and not in the purchased sample [1].
[1] Specific binding site (Asp-Thr-His-Lys) for Cu(II) ions. T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574
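Sequence coverage here means the fraction of residues that fall inside at least one identified peptide. A minimal sketch of that calculation follows; the protein length and peptide positions in the example are toy values chosen to leave the first 4% uncovered, not the actual BSA identifications.

```python
def sequence_coverage(protein_length, peptide_spans):
    """Fraction of residues covered by at least one peptide.

    peptide_spans holds (start, end) pairs, 0-based and half-open, so
    overlapping peptides are counted only once per residue.
    """
    covered = [False] * protein_length
    for start, end in peptide_spans:
        for i in range(start, end):
            covered[i] = True
    return sum(covered) / protein_length


if __name__ == "__main__":
    # Toy example: 100 residues, peptides overlap and miss residues 0-3.
    print(sequence_coverage(100, [(4, 60), (50, 100)]))
```

Merging the peptide lists from all three digests before calling such a function is what lets one enzyme fill the gaps left by another.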
Result
• Contaminants (identified with at least 3 unique peptides)
  • Human keratin proteins (K2C1_HUMAN and K1C_HUMAN)
  • Bacterial protein (SSPA_STAAR)
  • Trypsin (TRY1_BOVIN)
Result
• PTMs
  • Unsuspected modifications identified by PTM search
  • Three PTMs specified in database search: Carbamidomethylation (C), Oxidation (M), Deamidation (NQ)
Result
• Mutation
  • 214th amino acid: A -> T
  • Brown 1975, Fed. Proc. 34:591
Result
• Unexplained de novo tags
  • Might be novel peptides outside of the searched database
  • Example alignment (database peptide on top, de novo tag below):
    KK.QTALVELLK.HK
         |||||||
       DPALVELLKK
Summary
• A software workflow is proposed for in-depth protein sequence analysis
• Found many things in a “pure” sample
  • Contaminants
  • Unsuspected PTMs
  • Mutations
• Improved protein sequence coverage
  • BSA coverage: 87% -> 96%
• Explained more high-quality MS/MS spectra
  • Identified MS/MS spectra: 1,737 -> 2,687