Protein Sequencing Research Group: Results of the PSRG 2012 Study
Terminal Sequencing of Standard Proteins in a Mixture
Year 1 of the 2-Year Study
Current PSRG Members
• Henriette Remmer (Co-Chair), University of Michigan
• Jim Walters (Co-Chair), Sigma-Aldrich
• Robert English*, University of Texas Medical Branch
• Pegah Jalili*, Sigma-Aldrich
• Viswanatham Katta, Genentech, Inc.
• Kwasi Mawuenyega, Washington University School of Medicine
• Detlev Suckau, Bruker Daltonics
• Bosong Xiang, Monsanto Co.
• Jack Simpson (EB liaison), United States Pharmacopeia
* New members added in 2011
PSRG 2012/13: Study Background and Design
Status of terminal sequencing:
• The field is in the midst of a technology transition from classical Edman sequencing to mass spectrometry (MS) based sequencing.
• Both techniques have distinct strengths and weaknesses, and both have a role in biochemical research.
• With this complementary role in mind, the study attempts to push the capabilities of the various sequencing techniques, namely terminal sequencing of proteins in a mixture.
Concept of the 2012 study, terminal sequencing of proteins in a mixture:
• Sequencing proteins in a mixture requires separation of the proteins prior to analysis.
• Edman sequencing: SDS-PAGE and electroblotting prior to analysis; well established in most core facilities.
• MS-based sequencing: LC separation necessary prior to analysis; not well established in most core facilities.
=> PSRG designed a 2-year study
YEAR 1: Terminal sequencing and identification of three separated standard proteins
YEAR 2: Same three proteins distributed, this time in a mixture
PSRG 2012 Year 1: Study Objective To obtain N-terminal sequence information on three standard proteins supplied as separated samples.
2011 Study Design: The Samples
• Participants were asked to analyze the samples for terminal sequencing using any technology available.
• Participants received all three proteins, with ID, in amounts sufficient to sequence each protein using all three technologies. Feasibility of the analysis had been validated by PSRG members.
• Participants also filled out a survey; all responses were kept anonymous.
Participation and Survey Results
• 25 laboratories from 12 countries requested samples for Edman sequencing, and most of them (23) also for MS sequencing.
• 14 of the 25 participating laboratories (56%) completed the survey.
• 7 of the 14 labs used Edman sequencing, 6 top-down MS and 6 bottom-up MS.
• Out of 14 respondents:
  • 9 labs analyzed the reference protein BSA; 8 correctly determined the N-terminus.
  • 13 labs analyzed Protein A; 5 correctly determined the N-terminus.
  • 14 labs analyzed Endostatin; 12 correctly determined the N-terminus, but only 7 identified the presence of the second N-terminus.
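The counts on this slide can be turned into per-protein success rates. The following is a minimal sketch in Python; the dictionary layout and the function name are illustrative choices, and only the counts themselves come from the survey.

```python
# Counts from the survey slide: (labs that analyzed, labs with a correct N-terminus)
results = {
    "BSA": (9, 8),
    "Protein A": (13, 5),
    "Endostatin": (14, 12),
}


def success_rate(analyzed, correct):
    """Percentage of analyzing labs that determined the N-terminus correctly."""
    return 100.0 * correct / analyzed


rates = {protein: round(success_rate(*counts), 1) for protein, counts in results.items()}

# 14 of the 25 participating laboratories completed the survey
response_rate = round(100.0 * 14 / 25, 1)

for protein, rate in rates.items():
    print(f"{protein}: {rate}% of analyzing labs correct")
print(f"Survey response rate: {response_rate}%")
```

Seen this way, Protein A stands out as the hardest sample, which matches the later remark that its fusion site required high experience.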
Edman Workflows: PSRG 2012 Samples
• Sample handling: SDS-PAGE and blotting on PVDF (2); sample used as provided (5); blotting on PVDF (1)
• Instruments: ABI Procise (4 × 494 HT, 1 × 492 cLC, 2 × 494 cLC); Shimadzu PPSQ-33A
C10: Edman sequencing of Protein A
Protein A: fusion protein, N-terminus blocked
• De-blocking (PGAP)
• ABI Procise Biosystems Model 494HT
• Polybrene-precycled glass fiber filters
• 100 pmol
A00: Edman sequencing of Endostatin
• H2O with 0.1% TFA
• Shimadzu PPSQ-33A, blotting on PVDF
• Initial yield: 36.95%
• Repetitive yield: 84.98%
• Probability 1: position 4, Proline to Arginine
• Probability 2: position 7, Histidine to Glutamine
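To put the two yield figures in context, a common simplification treats the PTH-amino acid signal as a geometric decay: the first cycle recovers the initial yield, and each later cycle retains the repetitive yield of the previous one. The sketch below assumes that model and a hypothetical load of 100 pmol; the function name and the loaded amount are illustrative and are not reported for this run.

```python
INITIAL_YIELD = 0.3695      # 36.95 %, from the slide
REPETITIVE_YIELD = 0.8498   # 84.98 %, from the slide


def expected_signal(loaded_pmol, cycle,
                    initial_yield=INITIAL_YIELD,
                    repetitive_yield=REPETITIVE_YIELD):
    """Expected PTH-amino acid signal (pmol) at a given cycle (1-based).

    Assumes simple geometric decay: loaded * IY * RY**(cycle - 1).
    """
    if cycle < 1:
        raise ValueError("cycle numbering starts at 1")
    return loaded_pmol * initial_yield * repetitive_yield ** (cycle - 1)


# Hypothetical load of 100 pmol, used here only for illustration
for cycle in (1, 5, 10, 15):
    print(f"cycle {cycle:2d}: {expected_signal(100.0, cycle):.2f} pmol")
```

Under these assumptions the signal drops to roughly a quarter of its starting value by cycle 10, which is one reason Edman reads are typically short.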
Edman sequencing of Endostatin A00 Information about the sequence: SwissProt output Sequence Verification: with Blast P
PSRG 2011 Edman Conclusions & Observations
• Edman sequencing allows for direct determination of the protein’s N-terminal sequence.
• All labs returned N-terminal data that correlate well with the published protein sequences.
• Edman sequencing can produce data with and without prior separation (SDS-PAGE and chromatography).
• No C-terminal data were produced with Edman.
• If the protein is N-terminally blocked, the reaction will not proceed for most, but not all, modifications.
• The reagents for Edman sequencing are very expensive.
Mass Spectrometry Methods Used
Top-Down Sequencing (no digests)
• ISD, T³: AB Sciex 4800 MALDI-TOF/TOF
• MS, ISD, T³: Bruker Ultraflex MALDI-TOF/TOF
• MS, ETD, CID: Bruker maXis 4G UHR-QTOF
Bottom-Up MS/MS (digests)
• MALDI-TOF/TOFs: AB/Bruker
• ESI-Orbitrap: Thermo
Only Top-Down N-term results were returned. Some participants used Bottom-Up MS as a validation step.
Top-Down Experimental
• Top-Down instrumentation: Bruker Ultraflex, Bruker UltrafleXtreme, Bruker Autoflex speed, AB Sciex 4800, Bruker maXis 4G, Triversa Nanomate, Agilent 1200
• Fragmentation: ISD, ISD/T³, ETD, CID
• Sample: as provided, 0.1% TFA, MeOH/H2O/HOAc, 6 M GndHCl, various organic/H2O/acid
• Separation: HPLC, direct infusion
Software Used for MS Top-Down Analysis
• BioTools 3.2: sequence tags, automatic de novo sequencing, trigger Mascot TD searching, result visualization, terminal assignments, TD report generation (Bruker)
• Mascot 2.3: TD and BU database searches (Matrix Science)
• BLAST/MS-BLAST: protein identification based on sequence tags (NIH, Harvard/EMBL)
• ISDetect: sequence tags, semi-automatic de novo sequencing, result visualization (Genentech, Y Gan et al., in prep.)
The Top-Down MS Standard Analysis Strategies
• MW determination: check sample quality + final QC
• ETD/ISD: obtain internal sequence tags
• ID protein: e.g. Mascot search
• Extend sequence towards the N-terminus (and C-terminus alike)
• Compare with obtained protein sequences (incl. PTMs)
• T³-sequencing, i.e. MS/MS analysis of MALDI-ISD fragments
• Edman sequencing
• Problems: unknown terminal modifications (Sample B), fusion proteins (Sample B), ragged ends (Sample C)
BSA ISD Spectrum in DAN Matrix (PSRG123): good calibrant for ISD spectra
Sample A: BSA, ISD + Edman (C10), following the basic strategy
• BSA sequence accession number: AAI02743
• c-ions in the MALDI-ISD spectrum revealed the sequence from Arg10 to Tyr30.
• Edman sequencing provided Asp1 to Gly15.
• Data from the orthogonal methods were combined to obtain 30 residues of BSA sequence.
FINAL SEQUENCE OBTAINED FOR BSA:
1        10        20        30        40
DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPFDEHVKLVNELTEF…
Legend: coverage by Edman; coverage by MALDI-ISD; coverage by both
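How the two orthogonal reads combine into 30 residues can be sketched with plain set arithmetic. The position ranges (Asp1 to Gly15 for Edman, Arg10 to Tyr30 for MALDI-ISD) and the sequence are taken from the slide; the variable and function names are illustrative.

```python
# N-terminal BSA sequence as printed on the slide
bsa = "DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPFDEHVKLVNELTEF"

# 1-based residue positions covered by each method
edman = set(range(1, 16))   # Asp1 .. Gly15
isd = set(range(10, 31))    # Arg10 .. Tyr30


def residues(positions):
    """Return the residues at the given 1-based positions, in sequence order."""
    return "".join(bsa[p - 1] for p in sorted(positions))


combined = edman | isd      # covered by at least one method
both = edman & isd          # covered by both methods

print("Combined coverage:", residues(combined), f"({len(combined)} residues)")
print("Covered by both:  ", residues(both), f"(positions {min(both)}-{max(both)})")
print("Edman only:       ", residues(edman - isd))
print("ISD only:         ", residues(isd - edman))
```

The overlap of six residues is what allows the two reads to be stitched together with confidence rather than simply concatenated.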
Sample C: Endostatin (donated by Sigma)
Issues: ragged N-term, C-term loss of K
Figure labels: "C-term K excised", "added", "C-term K excised"
Endostatin (L36): Annotated ISD spectrum from on/off gradient
Figure label: interfering component
Endostatin (L36): HPLC chromatogram, separation of two variants; ISD of F1, F2 not assigned
• The recovery from the endostatin sample might be lower than 100 pmol.
• Figure labels: 100 pmol Myoglobin standard, F2, F1
• LC separation detected the protein heterogeneity and removed polymeric contamination, but reduced the sample amount and readout length.
Z10: UHR-QTOF MS analysis of Endostatin: 2 components
[Figure: +MS spectrum, intensity (×10^5) vs. m/z 1200-2000; labeled peaks at m/z 1221.9913, 1297.3352, 1390.0011, 1496.8469, 1621.4171, 1768.8184 and 1945.6003]
In contrast to MALDI-ISD, the QTOF-ETD analysis takes place after precursor ion selection.
Z10 ETD Analysis of Endostatin, First Precursor: Mascot Database Search Result Simplest Use of Top-Down Data: Mascot Search
Z10 TDS Analysis of Endostatin, First Precursor: Deconvoluted and Annotated ETD Spectrum
Labeled fragments: c2, c9, c26
Z10 TDS Analysis of Endostatin, First Precursor: Mass Accuracy of Intact Protein
[Figure: deconvoluted (MaxEnt) +MS spectrum, 0.5-20.4 min, intensity (×10^5) vs. m/z 19436-19454; measured spectrum in black, simulated spectrum in red]
• Elemental composition: C866H1340N250O250S6
• Measured monoisotopic mass: 19433.8783
• Theoretical monoisotopic mass: 19433.8151
• Mass error: 3.2 ppm
A precise MW confirms the proper N-terminus and the C-terminal loss of lysine.
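The theoretical mass and the ppm error on this slide can be reproduced from the elemental composition. The sketch below uses standard monoisotopic masses of the most abundant isotopes; the function names are illustrative, and the composition and the two masses are the ones quoted on the slide.

```python
# Monoisotopic masses (u) of the most abundant isotopes
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
    "S": 31.97207100,
}


def monoisotopic_mass(composition):
    """Sum the monoisotopic masses for an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Composition and measured mass as quoted on the slide
endostatin = {"C": 866, "H": 1340, "N": 250, "O": 250, "S": 6}
theoretical = monoisotopic_mass(endostatin)
error = ppm_error(19433.8783, 19433.8151)

print(f"Theoretical monoisotopic mass: {theoretical:.4f}")
print(f"Mass error: {error:.2f} ppm")
```

The computed composition mass agrees with the quoted theoretical value to four decimals, and the error comes out at about 3.25 ppm, in line with the 3.2 ppm reported.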
Endostatin: TDS Sequence 2 (PSRG123)
If ISD spectral quality is good, both sequences can be read directly, and the N- and C-termini can be assigned from THE SAME SPECTRUM.
Rec. Protein A (donated by Repligen)
Issues: N-term methylation, fusion site after residue 18
Figure labels: E. coli β-glucuronidase; SPA_STAAU
C-term sequence does not match intact MW (a nice challenge for Top-Down MS in the future)
E20: ISD Spectrum of Protein A (DAN), manual sequence generation
Residue calls: T R I/L K/Q K/Q I/L D E G A I/L K/Q D H E A K/Q K/Q
E20: Protein A Identification
• The ISD spectrum for Sample #2 (Protein A) was manually interpreted by sequential subtraction of ions.
• The resulting sequence, TRE[IL][KQ][KQ][IL]DG[IL]A[KQ], was BLAST-searched against the Dayhoff public database. Only two sequences matched.
• Homology searching of the N-term tag provided a) β-glucuronidase, b) its N-terminally extended sequence; c) the mass offset indicates N-term methylation.
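Tags read from ISD spectra carry ambiguous calls because Leu/Ile are isobaric and Lys/Gln differ by only about 0.036 Da. One simple way to search with such a tag is to turn each ambiguous call into a regular-expression character class. The sketch below assumes that approach and, since the Protein A fusion sequence is not reproduced in these slides, demonstrates it on the BSA N-terminal sequence from the Sample A slide with a hypothetical tag; the function name and the tag are illustrative, and this is not the BLAST workflow the lab used.

```python
import re


def tag_to_regex(calls):
    """Convert residue calls such as ["A", "K/Q", "I/L"] into a regex string.

    Unambiguous calls stay as single letters; ambiguous calls become
    character classes, e.g. "K/Q" -> "[KQ]".
    """
    parts = []
    for call in calls:
        options = call.split("/")
        parts.append(options[0] if len(options) == 1 else "[" + "".join(options) + "]")
    return "".join(parts)


# BSA N-terminal sequence as printed on the Sample A slide
bsa = "DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPFDEHVKLVNELTEF"

# Hypothetical tag, written the way calls appear on the ISD slides
calls = ["A", "H", "R", "F", "K/Q", "D", "I/L", "G"]
pattern = tag_to_regex(calls)
match = re.search(pattern, bsa)

print("Pattern:", pattern)
if match:
    print(f"Matched {match.group()} starting at residue {match.start() + 1}")
```

The bracketed tag shown on the slide is already in this character-class form, so it could be passed to `re.search` unchanged against any candidate sequence.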
Protein A MS/MS (E20): ISD c-ion m/z 1056.538
T³-sequence analysis of c9 confirms N-term methylation
Protein A (L36): MS/MS of N-terminal tryptic fragment
Validation of the assigned N-term methylation and glucuronidase sequence by Bottom-Up LC-MALDI-TOF/TOF analysis
Protein A (L36): Annotated ISD spectrum
• The N-terminal sequence is β-glucuronidase fused with Protein A.
• The N-terminal methionine is methylated.
• The N-terminal amino acids not confirmed by ISD were confirmed by MS/MS of the N-terminal tryptic fragment.
Results from MS Analyses Please look at poster ##?? For more details
Lessons to Be Learned from This Year's Study: Mass Spec Lessons
• Top-Down with ETD or ISD provides reliable N-term sequences
• Top-Down CID was most easily misinterpreted
• Edman and Top-Down complement each other very well: Edman for the first ~10 residues, Top-Down for the inexpensive extension of calls (e.g. through the fusion site of Protein A)
• Validation of the N-term by either T³-sequencing or Bottom-Up works as well
• Efficient use of Top-Down MS requires good software support
• Bottom-Up was great to confirm N-term results but not to generate them
• Use of protein HPLC resulted in shortened readouts
• Protein A: successful analysis of the fusion required high experience
• Endostatin: ragged N-termini were recognized by those who determined the intact molecular weight(s) or detected heterogeneity by HPLC or Edman
• Top-Down by ETD or ISD permitted detection of the C-terminal removal of lysine; intact MW determination validated the finding
Next Year's ABRF-PSRG 2013 Study: What's Going to Happen?
• Most likely, the same proteins will be provided again!
• But: provided as a stew in a single pot!
• Task: isolate/separate them from the mixture
• Problem: SDS-PAGE works well for Edman, but it is difficult to extract intact proteins
• Hints:
  • Protein LC needs to be established to get to the next level!
  • Always try to get intact MW information!
  • Use high sample amounts, as you lose a lot during LC
The ABRF-PSRG Acknowledges the Following Support
• Recombinant Protein A was obtained as a donation from RepliGen (Waltham, MA)
• Endostatin was obtained as a donation from SIGMA-ALDRICH (St Louis, MO)
• Steve Smith (University of Texas Medical Branch) and Larry Dangott (Texas A&M University) for Edman sequencing to provide reference data for this study
End • Following slides are bonus material
In-Source Decay (MALDI-ISD)
• “Pseudo-MS/MS” technique, no precursor selection
• ISD of the protein in the MALDI plume on a < ns timescale (similar to ETD)
• Fragmentation due to radical transfer from matrix to analyte (Takayama, 2001)
• a, c-ions: N-terminus; y, z+2-ions: C-terminus; simultaneous sequencing
• TOF/TOF allows for T³-sequencing: MS/MS analysis of ISD fragments
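Reading an N-terminal sequence from c-ions amounts to matching the spacing between consecutive peaks to residue masses. As a minimal sketch, the singly protonated c-ion m/z can be computed as the sum of the residue masses plus NH3 plus one proton; an N-terminal modification such as methylation (+14.01565, CH2) shifts the whole series by a constant offset. The function name is illustrative, and the BSA stretch is used purely as input data (BSA is not methylated).

```python
# Monoisotopic residue masses (u) of the 20 standard amino acids
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276        # mass of a proton
AMMONIA = 17.026549      # NH3, the difference between a c-ion and a b-ion
METHYL_SHIFT = 14.01565  # CH2, mass offset of a single methylation


def c_ion_series(sequence, n_term_shift=0.0):
    """Singly protonated c-ion m/z values (c1, c2, ...) for an N-terminal stretch."""
    series = []
    running = n_term_shift
    for residue in sequence:
        running += RESIDUE_MASS[residue]
        series.append(running + AMMONIA + PROTON)
    return series


# First ten BSA residues from the Sample A slide, used only as example input
plain = c_ion_series("DTHKSEIAHR")
shifted = c_ion_series("DTHKSEIAHR", n_term_shift=METHYL_SHIFT)

for index, (mz, mz_shifted) in enumerate(zip(plain, shifted), start=1):
    print(f"c{index}: {mz:.3f}   with N-terminal methylation: {mz_shifted:.3f}")
```

Because the offset is constant along the series, a tag read from peak spacings is unaffected by the modification, while the absolute masses reveal it.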
MALDI-ISD and T³-Sequencing Suckau & Resemann (2003) Anal Chem 75
ESI-ETD (Electron Transfer Dissociation)
• CID
  • Collision with inert gas
  • Protein is internally heated globally
  • It fragments in a statistical process
  • Weak bond cleavages
• ETD
  • Collision with electron-donating gas
  • Perturbs electronic structure locally
  • Resulting in local bond cleavages
  • ETD fragments all bonds (except Pro)
  • For top-down MS/MS of intact proteins with precursor ion selection
ETD Measurement Cycle on QTOF
1. Precursor Ion Accumulation
2. Electron Transfer Reagent Addition
3. ETD Reaction
4. Fragment Ion Transfer and Detection
Figure labels: reaction cell; 10 kHz; n-CI source
Tsybin et al. (2011) Anal Chem 83:8919
E20 ISD Endostatin (DAN): initial manual interpretation I/L N I/L S G G M R G D N K/Q F C K/Q
E20: Database search for [IL]SGGMRGNR[KQ]DF[KQ]CF
• The sequence from the spectrum was found beginning at 1548.694, so we know a handful of residues precede this sequence.
• Figure panels: excerpt from COIA1_HUMAN; excerpt from COIA1_MOUSE
• Differences between human and mouse can be seen at the -2 position from the start of the ISD sequence (i.e. LNSPL in human and LNTPL in mouse).
E20: To confirm the N-terminus not covered in the ISD spectrum, MS/MS was performed on m/z 1364.6
Figure labels: immonium ions (K/Q, I/L, H, P); fragment ions b2, b3, b4, b5, b7, b8, b9, b10, y6, y7, y9
Spectrum file: 010212_B23_10pmol_Endostatin_MSMS_2kV_1364.65
E20: Determination of Endostatin N-termini by Edman degradation
• The major sequence matches COIA1_HUMAN at position 1576.
• A second sequence was found from position 1572.
• Both sequences concur with the ISD findings.
Edman sequencing detected the ragged N-term; ISD confirmed and extended it.
Largely manual analysis of ISD spectra made it difficult to extract full information.