Terminal Sequencing of Standard Proteins in a Mixture
Protein Sequencing Research Group (PSRG): Results of the PSRG 2012/13 Study, Year 2
PSRG Members
Current Members
• Greg Cavey, Southwest Michigan Innovation Center
• Robert English (Co-Chair), University of Texas Medical Branch
• Mark Garfield, NIH/NIAID
• Pegah Jalili, Sigma-Aldrich
• Sara McGrath (Co-Chair), FDA
• Ejvind Mortz, Alphalyse
Outgoing Members
• Henriette Remmer (Co-Chair), University of Michigan
• Jack Simpson (EB liaison), United States Pharmacopeia
• Detlev Suckau, Bruker Daltonics
• Jim Walters (ad hoc), Sigma-Aldrich
• Viswanatham Katta, Genentech, Inc.
Staying on as ad hoc member for the coming year: Henriette Remmer
Study Background and Design
Status of Terminal Sequencing
• N-terminal sequencing is in the midst of a technology transition from classical Edman sequencing to mass spectrometry (MS)-based sequencing.
• Edman sequencing and MS-based techniques both have strengths and weaknesses.
• Recognizing that the two play complementary roles, the PSRG set out to push the capabilities of the various sequencing techniques, specifically toward terminal sequencing of proteins in a mixture.
Concept of the Study: Terminal Sequencing of Proteins in a Mixture
• Sequencing proteins in a mixture typically requires separation of the proteins prior to analysis, for both Edman sequencing and MS-based technology platforms.
• Edman sequencing: SDS-PAGE and electroblotting of the separated proteins (well established everywhere)
• MS-based sequencing: LC separation is necessary prior to analysis (not well established in most core facilities)
The PSRG designed a 2-year study
• YEAR 1: Terminal sequencing and ID of three separated standard proteins
• YEAR 2: Proteins (plus one new protein) distributed in a mixture
Last Year’s Study Objective
To obtain N-terminal sequence information on three standard proteins supplied as separated samples
Last Year’s Study Design: The Samples
• Participants were asked to analyze samples for terminal sequencing using any technology available.
• Participants received all three proteins, together with their identities, in amounts sufficient to sequence each protein with all three technologies. Feasibility of the analysis had been validated beforehand by PSRG members.
• Participants also filled out a survey; all responses were kept anonymous.
Last Year’s Participation & Survey Results
• 25 laboratories from 12 countries requested samples for Edman sequencing, and most of them (23) also requested samples for MS sequencing.
• 14 of the 25 participating laboratories (56%) completed the survey.
• Of the 14 labs, 7 used Edman sequencing, 6 used top-down MS, and 1 used bottom-up MS (5 used bottom-up for confirmation).
• Among the 14 respondents:
  • 9 labs analyzed the reference protein BSA; 8 correctly determined the N-terminus.
  • 13 labs analyzed Protein A; 4 correctly determined the N-terminus (methyl-Met).
  • 14 labs analyzed Endostatin; 12 correctly determined the N-terminus, but only 7 identified the presence of the second N-terminus.
Last Year’s Edman Summary & Observations
Edman sequencing allows for direct determination of the N-terminal sequence of a protein.
• All labs returned N-terminal data that correlated well with published protein sequences.
• Edman can produce data with and without separation (SDS-PAGE and chromatography).
• No C-terminal data are produced with Edman.
• If the protein is N-terminally blocked, the reaction will not proceed for most (but not all) modifications.
MS Lessons Learned from Last Year
• Top-Down with ETD or ISD provided reliable N-terminal sequences
• Top-Down CID was the most easily misinterpreted
• Edman and Top-Down complement each other very well: Edman for the first ~10 residues, Top-Down for the inexpensive extension of calls (e.g. through the fusion site of Protein A)
• Validation of the N-terminus by either T³-sequencing or Bottom-Up
• Efficient use of Top-Down MS requires good software support
• Bottom-Up was great for confirming N-terminal results, but not for generating them
• Use of protein HPLC resulted in shortened readouts
• Protein A: successful analysis of the fusion protein required considerable experience
• Endostatin: ragged N-termini were recognized by those who determined the intact molecular weight(s) or detected heterogeneity by HPLC or Edman
• Top-Down by ETD or ISD permitted detection of the C-terminal removal of lysine; intact MW determination allowed validation of the finding
PSRG 2012/13 (Year 2/2) Study Objective
To obtain N-terminal sequence information on three standard proteins supplied in a mixture
PSRG 2012/13 Study Design: The Samples
• Each participant received 2 vials of the mixture (except one late participant), and each vial contained the protein amounts listed above with buffer components including 4 M urea and PBS. Each participant also received a third vial containing 750 pmol BSA.
• Participants were asked to a) separate the proteins, and b) analyze samples for terminal sequencing using any technology available.
• All protein components showed solubility in traditional proteomic buffers, including water, 0.1% FA, and 0.1% TFA. Specifically, endostatin had shown solubility in 20% ACN, 0.1% TFA, 50% pyridine, and buffers compatible with 1D gel electrophoresis.
• Participants also filled out an online survey (responses were kept anonymous).
* Protein not included in last year’s study.
Protein N-termini
• Two participants, 16X and 20M, submitted Edman sequencing results.
• Participant 16X tried different techniques for protein separation.
• The PSRG used Edman sequencing without prior separation of proteins for comparison.
Edman Sample Preparation Workflows (PSRG 2013 Samples)
• Gel-Eluted Liquid Fraction Entrapment Electrophoresis (GELFrEE) (16X)
• Sample used as provided, no separation (PSRG)
• SDS-PAGE with blotting onto PVDF (16X, 20M)
• HPLC (16X)
Sequencers: ABI Procise (2 × 494 HT, 1 × 492 HT)
16X Edman Results
Entire sample tube A used for HPLC; 50% of fractions 7, 10 and 11 used for Edman sequencing: 30 pmol Protein A, 150 pmol Endostatin, 400 pmol αS1-casein
[Figure panels: Fraction 7, Fraction 10, Fraction 11]
Fraction 10: no amino acid assignments
Edman Results 16X
90% of sample tube B separated on the GELFrEE tube gel eluter
GELFrEE system: Gel-Eluted Liquid Fraction Entrapment Electrophoresis
• Disposable cartridges contain an SDS-polyacrylamide gel matrix
• Proteins are solubilized and electrophoresed
• Size-based separation and liquid-phase recovery
Sample preparation after GELFrEE separation:
• Fractions evaporated to dryness
• Konigsberg acetone precipitation [1] (3x) followed by 2 acetone washes
• Precipitate dissolved in 0.1% TFA
• 50% of the solution used for Edman sequencing, applied to a glass fiber filter with polybrene treatment
• 27 pmol Protein A, 135 pmol Endostatin, 360 pmol αS1-casein
No meaningful Edman data obtained; only Gly and Tris artifact peaks
[1] LE Henderson, S Oroszlan, W Konigsberg; Anal. Biochem. 1979, 93(1), 153-157
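The amounts quoted on the 16X slides follow from simple bookkeeping of how much of a vial was carried forward. A minimal sketch, assuming per-vial amounts of 60, 300 and 800 pmol (inferred here from the statement that 50% of tube A corresponds to 30, 150 and 400 pmol) and assuming no preparative losses; the function name `pmol_applied` is illustrative and not part of the study protocol:

```python
# Per-vial amounts (pmol), INFERRED from the slide: 50% of tube A is quoted
# as 30 / 150 / 400 pmol, so a full vial would hold twice that.
VIAL_PMOL = {"Protein A": 60.0, "Endostatin": 300.0, "alpha-S1 casein": 800.0}


def pmol_applied(vial_pmol, fractions):
    """Scale each vial amount by the product of the fractions carried forward."""
    total_fraction = 1.0
    for fraction in fractions:
        total_fraction *= fraction
    return {name: amount * total_fraction for name, amount in vial_pmol.items()}


# GELFrEE route: 90% of tube B separated, then 50% of the redissolved
# precipitate applied to the sequencer.
gelfree = pmol_applied(VIAL_PMOL, [0.90, 0.50])
for name, amount in gelfree.items():
    print(f"{name}: {amount:.0f} pmol")
```

The result reproduces the 27, 135 and 360 pmol quoted for the GELFrEE route. These are nominal upper bounds; the amount actually reaching the sequencer would be lower after electrophoresis and precipitation losses.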
Edman Results 20M
30% of one sample tube used, with sample preparation by SDS-PAGE/PVDF blotting: 20 pmol Protein A, 100 pmol Endostatin, 250 pmol αS1-casein
[Figure panels: 12 of 15 residues correctly assigned; 15 residues correctly assigned; 25 residues correctly assigned]
Edman Results PSRG
Sequencing of the protein mix without prior separation
Started with 30% of the mix from one tube: 15 pmol Protein A, 80 pmol Endostatin, 200 pmol αS1-casein
Repetitive yield: 95%
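Repetitive yield is the average fraction of signal retained from one Edman cycle to the next. A minimal sketch of what a 95% repetitive yield implies for later cycles, assuming a simple geometric decay model; the function names and the 100 pmol first-cycle signal are hypothetical and not taken from the study data:

```python
def expected_signal(first_cycle_pmol, repetitive_yield, cycle):
    """Signal expected at a given cycle if each cycle keeps a fixed fraction."""
    return first_cycle_pmol * repetitive_yield ** (cycle - 1)


def estimate_repetitive_yield(pmol_i, cycle_i, pmol_j, cycle_j):
    """Average per-cycle yield from the same residue type seen at two cycles."""
    return (pmol_j / pmol_i) ** (1.0 / (cycle_j - cycle_i))


# Hypothetical first-cycle signal of 100 pmol at 95% repetitive yield.
for cycle in (1, 5, 15, 25):
    print(cycle, round(expected_signal(100.0, 0.95, cycle), 1))
```

Under this model roughly half of the first-cycle signal remains by cycle 15, which is one reason low-abundance components of a mixture become hard to call at later cycles.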
Effectiveness of HPLC for sample preparation compared to SDS-PAGE/blotting
PSRG 2013 Edman Conclusions & Observations
• No one detected all 4 N-termini by Edman sequencing of the separated proteins.
• SDS-PAGE with electroblotting to PVDF was the most successful sample preparation.
• HPLC can suffer preparative losses, and electroelution can suffer buffer interference.
• Edman sequencing can produce data without prior separation of proteins, but for complex mixtures protein separation is necessary.
• If the protein is N-terminally modified, the reaction will not proceed for most modifications.
What Is Top-Down Mass Spectrometry, and Why Is It Most Appropriate for Terminal Protein Sequencing?
• All MS analysis is based on the intact, undigested protein
• Intact molecular weight: gross structure validation
• Targeted sequencing of the N- and C-terminus
• Bottom-up analysis covers the termini only by chance
[Figure labels: N-term sequence, C-term sequence]
MALDI Intact Sequencing: MALDI N- & C-Terminal Sequencing
Most terminal sequencing in PSRG studies is based on MALDI in-source decay (MALDI-ISD)
[Figure: laser-induced in-source decay of an intact protein into N-terminal and C-terminal fragment ladders; 1,5-DAN matrix, MALDI TOF/TOF, 40,000 resolution, 30 seconds]
• Confirm N- & C-terminal sequences
• Identify truncations and terminal PTMs
• Generate sequence information without proteolytic digestion
• In less than 1 minute
MALDI-ISD Process is most affected by the choice of MALDI Matrix
Intact Panitumumab N- & C-Terminal Sequencing: 1 ISD Spectrum > 2 Protein Sequences
Sequence match of the light chain (LC), confirming the sequence of both termini
[Figure: annotated ISD spectrum; labels 64 and 21]
Intact Panitumumab N- & C-Terminal Sequencing: 1 ISD Spectrum > 2 Protein Sequences
Sequence match of the heavy chain (HC), confirming N-terminal pyro-glutamylation and C-terminal lysine truncation
[Figure: annotated ISD spectrum; labels pyroGlu, Lys truncation, 72 and 50]
Panitumumab Sequence Covered in a Single MALDI-ISD Spectrum
[Figure: antibody domain map showing c-ion and z,y-ion coverage. Heavy chain: FR1 to FR4, CDR1 to CDR3, CH1, hinge, CH2, CH3. Light chain: FR1 to FR4, CDR1 to CDR3, CL. Annotations: Q/pE (-18 Da), 18 C-C (-36 Da), N295ST glycosylation, 0, 1, 2 K]
Sequence Covered after LC Separation, MALDI-TDS: Middle-Down Increases Sequence Coverage for mAbs
[Figure: same antibody domain map (heavy chain FR1 to FR4, CDR1 to CDR3, CH1, hinge, CH2, CH3; light chain FR1 to FR4, CDR1 to CDR3, CL) showing the extended c-ion and z,y-ion coverage. Annotations: Q/pE (-18 Da), 18 C-C (-36 Da), N295ST glycosylation, 0, 1, 2 K]
Fast and Extensive mAb Product Characterization: Middle-Down Panitumumab Analysis
1. Remove glycan (Endoglycosidase F2)
2. Cleave HC (IdeS cleavage)
3. Reduce (full TCEP reduction)
[Figure labels: HC, Fd, LC, Fc/2]
• IdeS cleaves the HC of mAbs specifically at a conserved Gly-Gly motif in the hinge region
• Works for most mammalian antibodies in their native form
Middle-Down Analysis of IdeS Digest, Panitumumab: LC-MALDI
LC-MALDI-TDS analysis of panitumumab Fabricator digest
• Column: Zorbax C8
• Matrix: sDHB
[Figure: retention-time trace R(t) with MALDI spectra of Fd, LC and Fc; charge states (M+H)1+, (M+2H)2+, (M+3H)3+]
Middle-Down Analysis of IdeS Digest, Panitumumab IdeS + GlycoZERO: TDS of Fc Fragment
LC-MALDI-TDS, Fc fragment
• Localization of glycan
• C-terminal K truncation
• Coverage: 62%
Middle-Down Analysis of IdeS Digest, Panitumumab: TDS of LC
LC-MALDI-TDS, light chain
• 90 AA from the N-terminus
• Coverage: 70%
Middle-Down Analysis of IdeS Digest, Panitumumab: TDS of Fd Fragment
LC-MALDI-TDS, Fd fragment
• 91 AA of the variable N-terminus
• Coverage: 58%
Considerations for Top-Down LC-MALDI-ISD Analysis
• ≥ 50 pmol protein per LC fraction is desired
• ≥ 100 pmol of each protein needs to be applied to the column
• i.e., a 10-protein mix loads ≥ 1 nmol on the column
• Suitable columns: C8 capLC, monolithic columns
• Severe protein losses, typical during LC, decrease signal
• Reduction/alkylation and oxidation typically increase noise
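The loading guideline above is plain arithmetic, sketched here for planning purposes. The two thresholds come from the slide; the function names are hypothetical, and the implied recovery figure additionally assumes that each protein elutes in a single fraction:

```python
MIN_PMOL_ON_COLUMN = 100.0    # per protein applied to the column (from the slide)
MIN_PMOL_PER_FRACTION = 50.0  # per protein per LC fraction (from the slide)


def minimum_column_load_pmol(n_proteins):
    """Total load needed if every protein meets the per-protein minimum."""
    return n_proteins * MIN_PMOL_ON_COLUMN


def implied_minimum_recovery():
    """Fraction of the applied protein that must survive LC to reach the
    per-fraction target, ASSUMING the protein elutes in one fraction."""
    return MIN_PMOL_PER_FRACTION / MIN_PMOL_ON_COLUMN


print(minimum_column_load_pmol(10) / 1000.0, "nmol for a 10-protein mix")
print(implied_minimum_recovery(), "minimum recovery through LC")
```

Read this way, the two thresholds together tolerate about a 50% loss per protein during LC, which matches the slide's warning that severe losses are typical.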
Software for Top-Down Sequence Analysis
Functionality
• Assign expected sequence to TDS spectrum (BioPharma QC)
• ID proteins through standard top-down Mascot searches (discovery, ID)
• Manual/automatic de novo sequencing (Edman-like)
• Manual/automatic de novo sequencing + BLAST (sequencing + ID)
• Generate test hypotheses to explain Δm's (terminal modifications, truncations)
Available software
• BioTools 3.2 (Bruker Daltonik; ECD/ETD/MALDI-ISD)
• ProSight PTM (Kelleher Group; ETD)
• ProSightPC 2.0 (Thermo; ETD)
MALDI Top-Down and Middle-Down Analysis in Routine BioPharma QC
• Intact protein MW to < 10 ppm for MW up to 30 kDa
• Middle-down antibody work is ideally suited for MALDI
• Screen for PTMs or processing errors on the domain level
• Validate processing errors by MALDI-TDS
• Automated screening/validation in BioPharma Compass
• MWs in minutes
• TDS in seconds under full automation
MALDI-ISD Fragments and Database Searching
y − (z+2) = 15.01 Da, c − a = 45.02 Da
Top-Down search principle
• Select an m/z range of ISD fragments as “virtual precursor ions”
• Lower-mass fragments are used as dependent fragments
• Standard MS/MS Mascot search with the MALDI-ISD fragments as the fragment ion set
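The two spacings quoted on this slide can be checked from monoisotopic element masses. A short sketch, assuming the usual fragment conventions (a = b − CO, c = b + NH3, z = y − NH3, and "z+2" = z plus two hydrogens); the variable names are illustrative:

```python
# Monoisotopic element masses in Da.
H, C, N, O = 1.00782503, 12.0, 14.0030740, 15.9949146

NH3 = N + 3 * H   # ammonia
CO = C + O        # carbon monoxide

# c = b + NH3 and a = b - CO, so the b-ion mass cancels in the difference.
c_minus_a = NH3 + CO

# z = y - NH3 and (z+2) = z + 2 H, so y - (z+2) = NH3 - 2 H.
y_minus_z_plus_2 = NH3 - 2 * H

print(f"c - a     = {c_minus_a:.2f} Da")
print(f"y - (z+2) = {y_minus_z_plus_2:.2f} Da")
```

Both values round to the figures on the slide (45.02 Da and 15.01 Da), which is what makes a-ion satellites and y/z+2 pairs usable as consistency checks when assigning ISD ladders.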
MALDI Top-Down Sequencing: Identification from the Protein DB in Seconds
[Figure: ISD spectrum with MH+ label; β-Galactosidase]
MALDI-TDS Result from β-Galactosidase: Terminal Sequences Confirmed in Seconds
N-terminal methionine truncation confirmed
Manual de novo Sequencing
c-ion series assigned by:
• High intensity
• −45 Da a-ion satellites
• Proline gaps
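The criteria above can be illustrated with a small ladder reader. This is a sketch under stated assumptions, not the procedure any participant or vendor software used: it works on singly charged c-ions, uses standard monoisotopic residue masses, treats a missing c-ion before proline as a two-residue step, and tests it on a made-up sequence (`AGDSPVK`). All function names are hypothetical.

```python
# Monoisotopic residue masses (Da). L and I are isobaric and share one entry.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
C_ION_OFFSET = 17.02655 + 1.00728  # NH3 plus one proton (singly charged c-ion)
A_SATELLITE = 45.02146             # c minus a spacing (NH3 + CO)


def c_ion_ladder(sequence):
    """Theoretical c-ions, omitting the c-ion just before each proline."""
    ladder, running = [], 0.0
    for i, aa in enumerate(sequence[:-1]):
        running += RESIDUE_MASS[aa]
        if sequence[i + 1] != "P":
            ladder.append(running + C_ION_OFFSET)
    return ladder


def read_ladder(c_ions, tol=0.02):
    """Call residues from the spacing between consecutive c-ions."""
    calls = []
    peaks = sorted(c_ions)
    for low, high in zip(peaks, peaks[1:]):
        delta = high - low
        single = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        if single:
            calls.append("/".join(single))
            continue
        # Proline gap: the step spans residue X plus the following Pro.
        pairs = [aa + "P" for aa, m in RESIDUE_MASS.items()
                 if abs(m + RESIDUE_MASS["P"] - delta) <= tol]
        calls.append("[" + "/".join(pairs) + "]" if pairs else "?")
    return calls


def has_a_satellite(c_ion, peaks, tol=0.02):
    """True if a peak sits one c-minus-a spacing below the candidate c-ion."""
    return any(abs(c_ion - A_SATELLITE - p) <= tol for p in peaks)


ladder = c_ion_ladder("AGDSPVK")  # hypothetical test sequence
print(read_ladder(ladder))
```

Spacings alone cannot call the first residue, and L/I remain indistinguishable by mass; the bracketed call marks a step that only fits as a residue followed by proline.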
MALDI-TDS De Novo Analysis of the N-terminus
Resemann et al. 2010, Anal Chem 82:3283-92. Top-down de novo protein sequencing of a 13.6 kDa camelid single heavy chain antibody by MALDI-TOF/TOF MS.
Cap-LC (monolithic column, 10% of sample loaded), Dionex PS-DVB 500
UV LC traces allow quantification of proteins in unknown samples
[Figure: UV trace with peaks labeled Protein A intact, His-Tag Casein, Protein A truncation forms, Endostatins]
LC-MALDI MS Analysis of the PSRG 2013 Sample
[Figure: LC-MALDI trace with peaks labeled Endostatins, His-Tag Casein, Protein A intact, Protein A truncation forms]
Workflow Needed This Year for MS Analysis
• Separate proteins by LC
• Determine protein MW after LC separation
• Assign protein IDs based on MW (sequences known)
• Establish top-down sequence analysis online or offline
• Map experimental terminal sequences to the known protein sequences
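The "assign protein IDs based on MW" step amounts to matching each measured mass against a short list of expected masses within a tolerance. A minimal sketch: the candidate names and masses below are hypothetical placeholders rather than the PSRG proteins, and the 50 ppm default tolerance is an assumption.

```python
# HYPOTHETICAL average masses (Da); real work would use masses calculated
# from the known sequences, including expected variants.
CANDIDATES = {
    "protein_X": 20000.00,
    "protein_X_minus_C_term_Lys": 20000.00 - 128.17,  # average Lys residue mass
    "protein_Y": 23600.00,
}


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def assign_by_mw(observed_mw, candidates, tol_ppm=50.0):
    """Return (name, ppm error) for candidates within tolerance, best first."""
    hits = []
    for name, theoretical in candidates.items():
        error = ppm_error(observed_mw, theoretical)
        if abs(error) <= tol_ppm:
            hits.append((name, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


for name, error in assign_by_mw(19872.0, CANDIDATES):
    print(f"{name}: {error:+.1f} ppm")
```

An empty result is informative too: a mass that matches no expected form points to an unanticipated truncation or modification, which is then a target for top-down sequencing.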
MS Strategies Used in 2013 Study
Most respondents
• HPLC of intact proteins with fraction collection
• LC-UV or ESI-MS to determine fractions of interest
• Intact MW by MALDI or ESI
• Sequence determination by ISD (top-down)
Additional work by some
• Trypsin digest of fractions containing proteins
• CID or ISD for sequence determination (bottom-up)
• Some needed more sample cleanup or SDS-PAGE visualization of protein fractions
Goal 1: Good Separation, Detect Expected Proteins
• Data provided show good protein separation
• First use intact MW to detect known proteins
  • MALDI MW usually very accurate
  • ESI MW determination possible with deconvolution software
  • Poor S/N or modifications complicate interpretation
• Most intact MW data provided are good enough to indicate protein variants
  • Cannot ID specific differences
  • Probably not sufficient for samples with unknown proteins
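ESI deconvolution rests on the fact that neighboring peaks in a charge envelope differ by one proton. A minimal sketch of that relationship for two adjacent peaks, using a hypothetical 20,000 Da protein; it is not the algorithm of any particular instrument software, and the function names are illustrative:

```python
PROTON = 1.007276  # proton mass in Da


def mz_of(mass, z):
    """m/z of a protein of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z


def charge_from_adjacent(mz_high, mz_low):
    """Charge of the peak at mz_high, given that the neighboring peak at
    mz_low carries exactly one more proton."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass from one peak once its charge is known."""
    return z * (mz - PROTON)


M_TRUE = 20000.0  # hypothetical protein mass
peak_a, peak_b = mz_of(M_TRUE, 10), mz_of(M_TRUE, 11)

z = charge_from_adjacent(peak_a, peak_b)
print(z, round(neutral_mass(peak_a, z), 3))
```

With poor S/N the peak centroids shift and the charge estimate becomes unstable, which is consistent with the slide's remark that noisy envelopes are harder to interpret; software typically averages over many charge states instead of a single pair.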
Successful Protein Separation
08D
• HPLC system: Agilent 1200
• Column: Agilent Poroshell 300SB-C8, 2.1 x 75 mm, 5 µm
• Solvent A: 0.1% TFA in water; Solvent B: 0.1% TFA in ACN
• Gradient: 25-60% B in 9.5 min; Flow: 0.5 ml/min; Column temperature: 70°C
12P
• HPLC system: Thermo Surveyor
• Column: Waters BioSuite Phenyl 1000, 2.0 x 75 mm, 10 µm
• Solvent A: 0.1% FA in water; Solvent B: 0.1% FA in ACN
• Gradient: 5-95% B in 60 min; Flow: 0.1 ml/min
16X
• HPLC system: Agilent 1260 with Dionex Chromeleon
• Column: Zorbax 300SB-C8, 2.1 x 150 mm
• Solvent A: 0.1% TFA in water; Solvent B: 0.1% TFA in ACN
• Gradient: 5-95% B in 40 min; Flow: 0.2 ml/min; Column temperature: 40°C
16X
• HPLC system: Agilent ChipCube
• Acquisition conditions unknown
• Agilent does offer an Intact Protein Chip: C-8 SB-ZORBAX, 300 Å, 75 μm x 43 mm
08D Identification by Intact MW: Endostatin
• MALDI of fractions 6-9 shows a protein corresponding to Endostatin in all of them
• Variant 2 appears only in fraction 6
[Figure: chromatogram with peaks numbered 1 to 9]
• HPLC system: Agilent 1200
• Column: Agilent Poroshell 300SB-C8, 2.1 x 75 mm, 5 µm
• Solvent A: 0.1% TFA in water
• Solvent B: 0.1% TFA in ACN
• Gradient: 25-60% B in 9.5 min
• Flow: 0.5 ml/min
• Column temperature: 70°C
• MS: Bruker Autoflex Speed
• MALDI with DHB matrix for intact MW
16X Identification by Intact MW: Endostatin
• ESI-MS chromatogram highlighting the 3rd peak
• Multiply charged ESI envelope with good S/N, easy to interpret
• Instrument software allows deisotoping, intact MW determination, and determination of variants
• HPLC system: Agilent ChipCube
• MS: Agilent 6210 ESI-TOF
• Acquisition conditions unknown
12P Identification by Intact MW: Protein A
• ESI-MS chromatogram highlighting the 1st peak
• Multiply charged ESI envelope with poor S/N, harder to see by eye
• Instrument software deisotoping and intact MW determination still good for this protein
• HPLC system: Thermo Surveyor
• Column: Waters BioSuite Phenyl 1000, 2.0 x 75 mm, 10 µm
• Solvent A: 0.1% FA in water; Solvent B: 0.1% FA in ACN
• Gradient: 5-95% B in 60 min; Flow: 0.1 ml/min
• MS: LTQ-FT Ultra
• ESI infusion for intact MW
16X Accurate N-term Sequence: Protein A
[Figure: ISD spectrum; residue labels G, D]
• Sample: 0.1% TFA + 50 mg/mL TCEP
• Mix 1:1 with 10 mg/mL DAN matrix
• MS: Bruker Ultraflex TOF/TOF ("A")
• MALDI-ISD top-down
08D Accurate N-term Sequence: Protein A
Methylation of the N-terminal methionine
• MS: Bruker Autoflex Speed
• MALDI with DHB matrix
• Fractions for ISD selected based on intact MW results