Peptide Mass Fingerprinting Manimalha Balasubramani Genomics and Proteomics Core Laboratories
Genomics and Proteomics Core Lab website: www.genetics.pitt.edu
GPCL Inventory • ABI Voyager DE PRO, user operated • ABI 4700 Proteomics Analyzer • Thermoelectron LCQ Deca with Surveyor HPLC • ABI Qstar Elite with Ultimate 3000 HPLC • Bruker micrOTOF with Ultimate 3000 HPLC • Bruker 12 Tesla FTMS with Ultimate 3000 HPLC
Instruments (photo labels): 4700 Proteomics Analyzer, ABI • Voyager DE PRO, ABI • micrOTOF, Bruker • LCQ Deca XP, Thermofisher • 12T FT MS, Bruker • Qstar Elite, ABI
Peptide mass fingerprinting (PMF) is a technique for protein and peptide identification
Outline • PMF Workflow: • Sample preparation • Mass spectra: MS, and MS/MS • Database searches • Examples, hands-on exercises • Contaminants, post-translational modifications, enzyme digestions • Evaluating PMF analysis
Peptide fingerprint PMF: Sample preparation
Mass spectra are acquired with: • MALDI TOF MS (Voyager DE PRO, ABI) • MALDI TOF/TOF MS (4700 Proteomics Analyzer, ABI) • MALDI: Matrix-Assisted Laser Desorption/Ionization • TOF: Time Of Flight • MS: Mass Spectrometry
Mass Spectrum: MS • Plot of intensity versus mass-to-charge ratio (m/z)
FWHM: full width at half maximum of a peak (Source: Wikipedia)
Resolution and mass accuracy • Δm measured at 50% peak height is the full width at half maximum (FWHM) • $R = \frac{M}{\Delta m}$ • R = resolution • M = mass of the peak of interest • Δm = width in daltons of the peak
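Mass accuracy is measured as a parts per million (ppm) value: $\mathrm{ppm} = \frac{10^6\,\Delta m}{M} = \frac{10^6}{R}$. The second equality holds when Δm is the peak width; when reporting the error of a measured mass, Δm is the difference between the measured and theoretical mass.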
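The two quantities can be computed directly. This is a minimal sketch: the function names and the example peak (m/z 1500, FWHM 0.1 Da) are illustrative assumptions, not values from the slides.

```python
def resolution(mass: float, fwhm: float) -> float:
    """Resolution R = M / delta_m, with delta_m the peak width (FWHM) in Da."""
    return mass / fwhm


def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return 1e6 * (measured - theoretical) / theoretical


# Hypothetical peak at m/z 1500 with a FWHM of 0.1 Da.
print(round(resolution(1500.0, 0.1)))
# Hypothetical measurement that is 0.015 Da above the theoretical mass.
print(round(ppm_error(1500.015, 1500.0), 2))
```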
Mass spectrum processing, calibration • External calibration • Internal calibration • trypsin autodigestion peaks • Keratin peaks • Spiking with an internal standard
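Internal calibration with two known peaks can be sketched as a straight-line correction. This is a simplification under stated assumptions: real TOF calibration is usually done in the time domain, the function name is hypothetical, and the reference values 842.5094 and 2211.1040 are the commonly cited porcine trypsin autolysis peaks, which should be verified for the trypsin actually used.

```python
def two_point_calibration(observed, reference):
    """Return a function mapping observed m/z to calibrated m/z.

    observed, reference: pairs of (low, high) m/z values for the same
    two calibrant peaks, as measured and as expected.
    """
    (obs_low, obs_high), (ref_low, ref_high) = observed, reference
    slope = (ref_high - ref_low) / (obs_high - obs_low)
    intercept = ref_low - slope * obs_low
    return lambda mz: slope * mz + intercept


# Assumed reference masses for trypsin autolysis peaks (verify before use).
TRYPSIN_REFERENCE = (842.5094, 2211.1040)

# Hypothetical observed positions of the same two peaks.
calibrate = two_point_calibration((842.52, 2211.12), TRYPSIN_REFERENCE)
print(round(calibrate(1500.0), 4))
```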
Peak List • Spectrum viewer • Compiled from the mass spectra • Mass list • Mass list and intensity • Peak list is submitted for Database searching
Description of database searching using the Mascot program • At GPCL, 4700 Proteomics Analyzer data is presented to the Mascot web server through ProteinPilot • Mascot can be accessed through the web: http://www.matrixscience.com
Mascot scoring (Source: http://www.matrixscience.com/)
A frequency factor matrix, F, is created, in which each row represents an interval of 100 Da in peptide mass, and each column an interval of 10 kDa in intact protein mass. As each sequence entry is processed, the appropriate matrix elements $f_{i,j}$ are incremented so as to accumulate statistics on the size distribution of peptide masses as a function of protein mass. The elements of F are then normalised by dividing the elements of each 10 kDa column by the largest value in that column to give the Mowse factor matrix M:
$$m_{i,j} = \frac{f_{i,j}}{\max_{i} f_{i,j}}$$
After searching the experimental mass values against a calculated peptide mass database, the score for each entry is calculated according to:
$$\mathrm{Score} = \frac{50{,}000}{M_{Prot} \times \prod_{n} m_{i,j}}$$
where $M_{Prot}$ is the molecular weight of the entry and the product term is calculated from the Mowse factor elements for each match between the experimental data and peptide masses calculated from the entry.
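The frequency matrix, its column-wise normalisation, and the score can be sketched in a few lines. This is an illustration only: the function names and the toy database are hypothetical, and the score (50,000 divided by the protein mass times the product of matched Mowse factors) follows the published MOWSE scheme, not Mascot's full probability-based scoring.

```python
from collections import defaultdict
from math import prod

PEPTIDE_BIN = 100.0      # Da per row
PROTEIN_BIN = 10_000.0   # Da per column


def build_mowse_matrix(database):
    """database: iterable of (protein_mass, [peptide_masses]).

    Returns {(row, col): frequency normalised by the column maximum}.
    """
    freq = defaultdict(int)
    for protein_mass, peptide_masses in database:
        col = int(protein_mass // PROTEIN_BIN)
        for mass in peptide_masses:
            freq[(int(mass // PEPTIDE_BIN), col)] += 1
    col_max = defaultdict(int)
    for (_, col), count in freq.items():
        col_max[col] = max(col_max[col], count)
    return {(row, col): n / col_max[col] for (row, col), n in freq.items()}


def mowse_score(protein_mass, matched_peptide_masses, matrix):
    """Score one entry from the calculated masses of its matched peptides."""
    col = int(protein_mass // PROTEIN_BIN)
    product = prod(matrix[(int(m // PEPTIDE_BIN), col)]
                   for m in matched_peptide_masses)
    return 50_000.0 / (protein_mass * product)


# Toy database: two hypothetical proteins in the 40-50 kDa column.
toy_db = [(42_000.0, [1050.0, 1060.0, 1530.0]),
          (45_000.0, [1055.0, 2210.0])]
matrix = build_mowse_matrix(toy_db)
print(mowse_score(42_000.0, [1050.0, 1530.0], matrix))
```

Because the factors are at most 1, matching a peptide mass that is rare for proteins of that size raises the score more than matching a common one.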
Parameters used in database searching • Database searched • Taxonomy • Enzyme • Missed cleavages • Fixed versus variable modifications (PTMs) • MW and pI • Mass tolerance
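The enzyme and missed-cleavage parameters determine the theoretical peptide list. A minimal sketch of an in silico tryptic digest follows, assuming the usual rule (cleave after K or R unless the next residue is P) and standard monoisotopic residue masses; the function names and the toy sequence are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_pieces(sequence):
    """Split after K or R, except when the next residue is P."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])
    return pieces


def digest(sequence, missed_cleavages=1):
    """List peptides containing up to `missed_cleavages` uncut sites."""
    pieces = tryptic_pieces(sequence)
    peptides = []
    for start in range(len(pieces)):
        last = min(start + missed_cleavages, len(pieces) - 1)
        for stop in range(start, last + 1):
            peptides.append("".join(pieces[start:stop + 1]))
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated, unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


for pep in digest("AGKPEPTIDERSAMPLEK", missed_cleavages=1):  # toy sequence
    print(pep, round(mh_plus(pep), 4))
```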
Oxidation of methionine in proteins and peptides: +16 Da (methionine sulfoxide), +32 Da (methionine sulfone). Source: Ionsource.com
S-carboxymethylation of cysteine residues with the alkylating agent iodoacetic acid (+58 Da), or S-carbamidomethylation with iodoacetamide (+57 Da). Source: Ionsource.com
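A search engine applies a fixed modification to every eligible residue and tries a variable modification both with and without the shift. The sketch below illustrates that idea with monoisotopic mass shifts; the function name, dictionary keys, and example values are hypothetical.

```python
# Monoisotopic mass shifts in Da.
MOD_DELTAS = {
    "oxidation_M": 15.99491,        # methionine sulfoxide
    "carbamidomethyl_C": 57.02146,  # alkylation with iodoacetamide
    "carboxymethyl_C": 58.00548,    # alkylation with iodoacetic acid
}


def candidate_masses(unmodified_mass, sequence, fixed_cys="carbamidomethyl_C"):
    """Masses to consider for one peptide.

    Every cysteine carries the fixed modification; each methionine may or
    may not be oxidised, so 0..n oxidations are listed.
    """
    base = unmodified_mass + sequence.count("C") * MOD_DELTAS[fixed_cys]
    n_met = sequence.count("M")
    return [base + k * MOD_DELTAS["oxidation_M"] for k in range(n_met + 1)]


# Hypothetical peptide with one cysteine and one methionine.
print([round(m, 4) for m in candidate_masses(1000.0, "ACMK")])
```

Each variable modification multiplies the number of candidate masses, which is why adding many of them raises the chance of random matches.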
Databases: NCBI nr.*tar.gz: non-redundant protein sequence database with entries from GenPept, Swiss-Prot, PIR, PRF, PDB, and NCBI RefSeq
Submit a peak list to Mascot: http://matrixscience.com/cgi/search_form.pl?FORMVER=2&SEARCH=PMF
1075.513062 1086.581177 1090.547241 1092.517822 1100.630249
1103.572754 1106.553223 1107.529663 1118.498779 1119.519531
1121.509644 1129.604492 1141.572388 1156.586792 1166.537231
1170.607422 1172.612183 1179.590332 1194.604126 1217.567749
1232.610474 1252.583740 1308.654297 1312.705811 1314.744385
1337.672485 1401.651245 1424.745728 1427.830566 1435.718872
1475.762695 1479.710327 1493.734131 1502.774780 1530.834717
1575.850952 1607.807007 1629.868408 1639.935425 1752.863892
1753.904663 1754.915161 1791.744507 1792.805054 1794.820801
1816.801392 1875.976196 1902.006104 1940.941650 1960.053345
1962.928955 2211.118652 2225.130371 2233.105225 2249.076660
Hands-on exercise • Go to Desktop • Open the txt file • Copy and paste into the Mascot search page • Specify search parameters • Allow 100 ppm error for PMFal_100.txt • Allow 25 ppm error for PMFgd_25.txt
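The effect of the ppm tolerance can be seen by matching observed peaks against theoretical masses. This is a minimal sketch: the function name is hypothetical, and the theoretical masses are made-up values chosen to sit near two peaks of the example list.

```python
def match_peaks(observed, theoretical, tol_ppm):
    """Return (observed, theoretical) pairs that agree within tol_ppm."""
    matches = []
    for obs in observed:
        for theo in theoretical:
            if abs(obs - theo) / theo * 1e6 <= tol_ppm:
                matches.append((obs, theo))
    return matches


observed = [1075.513062, 2211.118652]   # two peaks from the example list
theoretical = [1075.500, 2211.104]      # hypothetical calculated masses

print(len(match_peaks(observed, theoretical, tol_ppm=25)))
print(len(match_peaks(observed, theoretical, tol_ppm=10)))
```

A wider tolerance matches more peaks but also admits more random matches, so the tolerance should reflect the calibration quality of the spectrum.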
Not all peaks are matched – why? • Theoretical peptide list • Peptide lengths vs. MS range • Enzyme – missed/non-specific cleavage • Incorrect ORF • Amino acid substitutions • Ion suppression/efficiency
Not all peaks are matched – why? • Experimental peptide list • Contaminants • Trypsin autolysis peptides • Hair, skin keratins • Matrix molecules, clusters • Unknown contaminants • Modifications • PTMs – known and unknown, biological origin • Oxidized methionines – gel-induced artifacts • Chemical – cysteine carbamidomethylation, introduced during sample handling • Adducts • Amino acid substitutions • Splice variants
Database searches can take contaminants and modifications into account.
Evaluating PMF analysis • Acceptable hit • High score • Major peaks accounted for • No hit • Insufficient data – low intensity MS • Single gel band contains >2-3 proteins • Protein not represented in database – ORF/genome • Further analysis • MS/MS confirmation of few major peaks, unaccounted peaks – Ideal • Low score, good spectrum – LC MS/MS • Low score, low intensity spectrum – concentrate sample, reacquire • High score, some unaccounted peaks – MS/MS
MS/MS • Plot of m/z versus intensity • At GPCL: • MALDI TOF/TOF MS • ESI QqTOF MS • ESI IT MS • MALDI/ESI FT ICR MS
Tandem MS 4700 Proteomics Analyzer, Applied Biosystems
MS: MS, followed by precursor ion selection
Fragment ion spectrum Tandem MS
Tandem mass spectrum http://qbab.aber.ac.uk
Tandem mass spectra (MS/MS) can be used for peptide sequencing • Database Searching • Peptide Mass Fingerprinting • Sequence tag approach • De novo sequencing • inspect raw data http://qbab.aber.ac.uk
Mascot Search Results
Search title: SampleSetID: 362, AnalysisID: 567, MaldiWellID: 15790, SpectrumID: 17225, Path= \ Man i \ 102004 \ New Analysis 1
Database: NCBInr 20040606 (1846720 sequences; 611532004 residues)
Timestamp: 20 Oct 2004 at 14:52:50 GMT
Top Score: 681 for gi|180570, creatine kinase [Homo sapiens]
Probability Based Mowse Score: Score is -10*Log(P), where P is the probability that the observed match is a random event. Protein scores greater than 75 are significant (p<0.05).
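The relation Score = -10*Log(P) can be inverted to recover the probability behind a reported score. This sketch assumes a base-10 logarithm and uses hypothetical function names; it does not model how the significance threshold depends on the number of database entries searched.

```python
from math import log10


def score_from_probability(p: float) -> float:
    """Score = -10 * log10(P)."""
    return -10.0 * log10(p)


def probability_from_score(score: float) -> float:
    """P = 10 ** (-Score / 10)."""
    return 10.0 ** (-score / 10.0)


# Probability of a random match implied by the top score and the threshold.
for score in (681, 75):
    print(score, probability_from_score(score))
```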
Top hits from Mascot Search – there are multiple accession numbers for the same protein
Search returns a cluster of proteins with the same matching peptides
Creatine kinase B is the highest scoring protein
Match to: gi|21536286; Score: 681
Creatine kinase - B [Homo sapiens]
Nominal mass (Mr): 42591; Calculated pI value: 5.34
Observed Mass & pI: 43 kDa, 6.2-6.27
Sequence Coverage: 46%
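Sequence coverage is the percentage of residues in the protein that fall inside at least one matched peptide. A minimal sketch follows, with a hypothetical function name and a toy sequence, not the creatine kinase data.

```python
def sequence_coverage(protein: str, matched_peptides) -> float:
    """Percentage of protein residues covered by matched peptides."""
    covered = [False] * len(protein)
    for peptide in matched_peptides:
        start = protein.find(peptide)
        while start != -1:
            # Mark every residue spanned by this occurrence.
            for k in range(start, start + len(peptide)):
                covered[k] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy protein of 18 residues with one matched peptide of 7 residues.
print(round(sequence_coverage("AGKPEPTIDERSAMPLEK", ["SAMPLEK"]), 1))
```

Overlapping peptides, such as a missed-cleavage peptide and its fully cleaved parts, count each residue only once.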
GPCL resources for bioinformatic analysis (selected list) • Mascot version 2.1.0, Matrix Science Ltd • Mascot Daemon • ProteinPilot software 2.0, Applied Biosystems/MDS Sciex • Paragon algorithm • Mascot algorithm • Sequest, Thermoelectron
Resources http://www.hsls.pitt.edu/guides/genetics/obrc/proteomics
...it's high-throughput • 1st dimension: isoelectric focusing • 2nd dimension: SDS-PAGE • Spot picking • Trypsin gel digest
Sample separation.. In-solution Isoelectric focussing 1D or 2D LC MALDI HPLC
GPCL services.. • Fee for service model • Support investigators • Scientific expertise • Technical expertise • Grant submission