Practical tips for cloning, expressing and purifying proteins for structural biology. Aled Edwards Banting and Best Department of Medical Research University of Toronto, Canada aled.edwards@utoronto.ca Affinium Pharmaceuticals Toronto, Canada aedwards@afnm.com.
Molecular biological approaches to structural biology
• An excellent structural sample usually has the following properties:
  • Lack of conformational heterogeneity
  • Soluble at high concentrations
  • Pure
• Molecular biology is probably the fastest way to transform a “poor” sample into an “excellent” one.
Outline
• Historical perspective on engineering proteins for structural biology
• Practical advice for cloning/purification of structural samples
• Ancillary benefits of high-throughput studies
RNA polymerase II: from 15 Å to 3 Å by eliminating heterogeneity
Another source of sample heterogeneity: eukaryotic proteins comprise multiple domains
• Conformational heterogeneity lowers the probability of crystallization
• Protein domains
  • Are resistant to proteolysis
  • Fold autonomously
  • Can usually be expressed in bacteria
  • Are between 15 and 30 kDa (NMR or X-ray size)
  • Are the fundamental unit of protein function
• Domains are often the only tractable targets for HTP crystallography
RPA domain structure: a collection of OB-folds
[Figure: domain diagrams of RPA70, RPA32 and RPA14, with labels A and B]
RPA crystallization
• Start with full-length protein purified using baculovirus (Wold)
• Identify domain (aa 1-442) soluble in E. coli (Wold)
• Crystallize domain (7 Å)
• Use limited proteolysis to define smaller domain (aa 161-442) (3.5 Å, and same cell as the 7 Å crystal)
• Create many constructs varying N- and C-termini to identify final construct (aa 181-422) (2.2 Å; structure solved)
• Final tally: 15 different constructs
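The idea of scanning N- and C-terminal boundaries can be sketched in a few lines of Python. This is only an illustration: the function name, the minimum-length filter and the intermediate boundaries 171 and 432 are assumptions made for the example, and only 161, 181, 422 and 442 come from the slide.

```python
from itertools import product

def enumerate_constructs(n_termini, c_termini, min_length=50):
    """List (start, end) residue ranges for every N-/C-terminus combination.

    Residue numbers are 1-based and inclusive. min_length is a hypothetical
    filter that drops ranges too short to be a plausible domain.
    """
    constructs = []
    for start, end in product(sorted(n_termini), sorted(c_termini)):
        if end - start + 1 >= min_length:
            constructs.append((start, end))
    return constructs

# Boundaries 171 and 432 are made up to show how a grid of constructs grows.
for start, end in enumerate_constructs([161, 171, 181], [422, 432, 442]):
    print(f"aa {start}-{end} ({end - start + 1} residues)")
```

Three candidate boundaries at each end already give nine constructs, which is in line with the tally of 15 reported for RPA.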
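RPA70 domains A and B: two OB-folds bound to DNA
[Figure: structure of domains A and B with the L12 and L45 loops labeled]
Domain mapping using limited proteolysis
[Figure: TFIIS treated with protease. Label: Integrative Proteomics]
TFIIS domain structure
[Figure: domains I, II and III with boundaries marked at residues 1, 124, 131, 240, 264 and 309. Annotated functions: binds holoenzyme (similar to elongin, CRSP70); RNA polymerase binding; transcript cleavage and read-through (nucleic acid binding?)]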
DomainHunter™: industrialized domain mapping
• Partial proteolysis in 96-well plates
• Optimized set of proteases
• Low protein requirement
• No SDS-PAGE
• No N-terminal sequencing
• Direct identification of domains by mass spectrometry
DomainHunter™
[Figure: mass spectra from a protease titration; x-axis m/z (23000 to 33000), y-axis relative intensity (r.i.); labeled peaks at m/z 20507, 21612, 21952, 23332, 25360, 31650, 33318 and 35057]
DomainHunter applied to NMR sample
[Figure: residue map from the N-terminus to residue 140 (ticks every 20) marking V8 and chymotrypsin cleavage sites and fragments A to D]

Fragment  Mass      Matching sequence  Expression  Solubility
B         10324.0   G[44-133]R         +++         ++
C         12352.0   G[44-150]D         no
A         9131.0    I[55-133]R         ++          ++
D         11159.0   I[55-150]D         no
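Matching an observed fragment mass to a residue range, as in the table above, can be sketched as follows. This is an illustrative search and not the DomainHunter algorithm: the function names, the tolerance and the example sequence are assumptions, while the residue masses are standard average values in Da.

```python
# Standard average residue masses (Da); one water is added per peptide chain.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528

def average_mass(peptide):
    """Average mass of an unmodified peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER

def match_fragments(sequence, observed_mass, tolerance=2.0):
    """Return (start, end, mass) for every subsequence within tolerance.

    Positions are 1-based and inclusive. tolerance (Da) is a hypothetical
    default; a real value depends on the instrument.
    """
    prefix = [0.0]
    for aa in sequence:
        prefix.append(prefix[-1] + AVG_RESIDUE_MASS[aa])
    hits = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            mass = prefix[end] - prefix[start] + WATER
            if mass > observed_mass + tolerance:
                break  # longer fragments from this start are only heavier
            if abs(mass - observed_mass) <= tolerance:
                hits.append((start + 1, end, round(mass, 1)))
    return hits

# Made-up sequence; the "observed" mass is computed from residues 5-20.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
observed = average_mass(seq[4:20])
print(match_fragments(seq, observed, tolerance=0.5))
```

In practice the known cleavage specificity of the protease would be used to discard hits whose ends do not fall at plausible cleavage sites.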
Structural proteomics (Nat. Struct. Biol., Oct/Nov 2000)
[Figure: structures of MTH40, MTH1615, MTH152, MTH1184, MTH1175, MTH538, MTH150, MTH1790, MTH129, MTH1048 and MTH1699]
5 more done; 3 more soon
Molecular biology for crystallization and for large-scale studies
1. Basic steps in creating expression vectors for E. coli
2. Practical tips for making fewer mistakes
3. Application of methods to higher throughput
4. Alternate expression systems
5. Some results
E. coli is the first choice... why?
• Cost effective
• Easy to grow
• Abundance of expertise and reagents
• Easy to incorporate selenomethionine
• High yield
• Rapid doubling time and rapid scale-up
Factors involved in successful expression of recombinant proteins in the Escherichia coli cytoplasm
Expression vector
• Copy number (gene dosage: sometimes less is better than more)
• Promoter choice (T7, Ptac, Plac, Para)
• Little or no expression before induction
• Reliable and adjustable expression
mRNA stability (RNase E- mutant)
Translation
• Consensus SD sequence
• Proper spacing and sequence before the initiation codon
• Possible mRNA secondary structures that block ribosome binding, or internal ribosome binding sites
• Codon bias
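Codon bias is the easiest of these factors to check before ordering primers. The sketch below flags rare codons in an open reading frame; the set of rare codons is an assumption for illustration (a short list commonly cited for E. coli), and the function name and report format are hypothetical.

```python
# Assumption: a short, commonly cited list of codons that are rare in E. coli.
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"}

def rare_codon_report(orf):
    """Report positions of rare codons in an in-frame ORF (DNA or RNA)."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    # 1-based codon positions that use a rare codon
    positions = [i + 1 for i, codon in enumerate(codons) if codon in RARE_CODONS]
    # Runs of adjacent rare codons are usually the bigger concern
    tandem = [p for p, q in zip(positions, positions[1:]) if q == p + 1]
    return {"n_codons": len(codons), "rare_positions": positions, "tandem_starts": tandem}

print(rare_codon_report("ATGAGGAGAAAACTATAA"))
```

A construct with many hits, especially tandem ones, might be a candidate for a strain that supplies rare tRNAs or for codon optimization.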
But which E. coli?
• BL21(DE3): F- ompT hsdSB (rB-, mB-), gal, dcm, (DE3)
• BL21-Star(DE3): F- ompT hsdSB (rB-, mB-), gal, dcm, rne131, (DE3)
• BL21-Gold(DE3): F- ompT hsdS (rB- mB-) dcm+ Tetr gal endA (DE3)
• Tuner(DE3): F- ompT hsdSB (rB- mB-) gal dcm lacY1 (DE3)
Conventional cloning approach
1. Select vector of choice
2. Restriction digest the vector
3. PCR the insert
4. Restriction digest the insert
5. Ligate the vector and insert
6. Transform and plate
7. Pick colonies and screen for insert
8. Screen positive clones for protein expression
9. Sequence positive clones
Which vector/tag?
1. T7 RNA polymerase-based systems are the overwhelming choice
   - Highly specific
   - High yields
   - Exquisitely controlled
2. Choice of vector
   - Restriction sites (are there internal sites in the gene?)
   - Are there many possible sites?
   - Are the enzymes commonly available?
   - Do the enzymes cut near the ends of DNA fragments?
3. Which tag?
   - Relatively little data on which tag generates the best proteins for crystallization
   - His-tag, GST and MBP are all effective for purification
   - His-tag offers the advantage of being able to screen with and without the tag for crystals (double bang for the buck)
   - Make sure there is a protease site to remove the tag
Practical issues with cloning
1. Choice of protease?
   - Thrombin (more difficult to get but highly effective)
   - TEV: recombinant with His-tag, stable mutant with less autoproteolysis activity (Waugh), needs calcium, finicky
   - Factor X, enterokinase: avoid
Practical issues with cloning: restrict the plasmid
- Double digestion often leaves one end undigested, which in turn results in high background due to re-ligation
- Phosphatase treatment and gel purification of a large prep make life much easier in the long run
- Optimize the system to get no background
Practical issues with cloning: PCR the insert
- For HTP studies, conditions need to be optimized for each genome or clone
- Order primers from a reputable supplier (the most common problem is in deprotecting oligos)
- Have someone else double-check the primer sequence
- Order primers with the requisite overhang (be over-cautious)
- Use an error-correcting polymerase
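Building primers programmatically is one way to reduce transcription mistakes and to make the overhang explicit. The sketch below is a minimal illustration under stated assumptions: the function names, the 4-base clamp and the 20-base annealing length are hypothetical defaults, and the code does not check reading frame, melting temperature or secondary structure.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def design_primers(orf, fwd_site, rev_site, clamp="GCGC", anneal_len=20):
    """Return (forward, reverse) primers written 5' to 3'.

    Each primer is clamp + restriction site + gene-specific region.
    The clamp is a hypothetical overhang that gives the enzyme extra
    bases to sit on near the end of the PCR product.
    """
    orf = orf.upper()
    forward = clamp + fwd_site + orf[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse

# Toy ORF with BamHI (GGATCC) and XhoI (CTCGAG) sites; anneal_len is
# shortened only so the output is easy to read.
fwd, rev = design_primers("ATGAAACCCGGGTTTTAA", "GGATCC", "CTCGAG", anneal_len=6)
print(fwd)  # GCGCGGATCCATGAAA
print(rev)  # GCGCCTCGAGTTAAAA
```

Printing both primers next to the gene sequence also makes the second-person check suggested above quicker.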
Practical issues with cloning: digest the PCR insert
- Make sure that there are no internal sites
- Purify the restricted product
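The check for internal sites is easy to automate before ordering primers. The following is a small sketch with hypothetical names; the recognition sequences listed are the standard ones for these four enzymes, and because all four are palindromic, scanning one strand is sufficient.

```python
# Recognition sequences for a few common cloning enzymes (all palindromic).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
}

def find_internal_sites(insert, enzymes):
    """Map enzyme name to 1-based start positions of its site in the insert."""
    insert = insert.upper()
    hits = {}
    for name in enzymes:
        site = SITES[name]
        positions = []
        start = insert.find(site)
        while start != -1:
            positions.append(start + 1)
            start = insert.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[name] = positions
    return hits

# Toy insert containing one EcoRI and one XhoI site.
print(find_internal_sites("ATGGAATTCAAACTCGAGTAA", ["EcoRI", "BamHI", "XhoI"]))
```

An empty result for the enzymes chosen for cloning means the insert can be digested without being cut internally.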
Practical issues with cloning: ligation and transformation
- If the vector control background is low and the PCR product is purified, there should be no problem
- Use highly competent cells
Practical issues with cloning: screen for positive clones
- PCR screen from colony
- Screen by protein expression
- Make note of expression, as well as solubility
Cloning (conventional method)
[Figure: four T7 vector layouts flanking the gene, with element orders 6His-TEV, TEV-6His, 6His-MBP-TEV and 6His-TRX-TEV, each ending in STOP; clones are screened for inserts by PCR]
GATEWAY™ Cloning System Technology: λ phage
[Figure: λ phage integration into and excision from the E. coli chromosome. attB + attP gives attL + attR in the lysogen (IHF, Int); attL + attR gives attB + attP (IHF, Int, Xis)]
GATEWAY™ Cloning System Technology: λ phage
[Figure: engineered att site variants. Pairings shown: attB1 x attP1, attB2 x attP2, attR1 x attL1, attR2 x attL2; reactions are mediated by IHF, Int or by IHF, Int, Xis]
Cloning and test expression
[Figure: 96-well workflow. Ligate, transform, screen clones by PCR (x96). Grow 24 x 3 ml cultures in LB (Kan, Amp) at 37 °C, induce at OD600, grow O/N at 15 °C or 20 °C. Take 300 µl samples (x96): spin one and dissolve the pellet in SDS; spin the other, freeze, lyse with BugBuster™, spin again and keep the supernatant. Analyze by SDS-PAGE]
[Figure: bar chart for 1750 clones; y-axis 0 to 100; bars: cloned, expressed, soluble]
Expression systems for eukaryotic proteins
• Baculovirus infection of insect cells
  • Simple, relatively cost effective, selenomethionine-compatible, not fully able to replicate human post-translational modifications
• Viral infection of human cells
  • Viruses not as easy to work with, high yield, proper modification
• Stable transformation of human cells
  • Usually lower expression. After selection, transcription sometimes goes away. Low throughput due to selection process
• Transfection of human cells
  • High expression in few cells, uses up lots of DNA
Parallel purification of proteins
[Figure: purification scheme with steps labeled 1 to 5 and 1’ to 5’]
ProteoMax – Automated Protein Purification and Concentration System Affinium Pharmaceuticals
Structure determination strategy
[Figure: decision flowchart branching at 20 kDa (< 20 kDa, > 20 kDa), with nodes: 15N-labeled; 15N/13C-labeled; 3-5 weeks of NMR data collection; Se-methionine labeled; synchrotron data]
Orthologues: 68 Escherichia coli, 68 Thermotoga maritima

                              T. maritima     E. coli
Optimal growth temperature    80 °C           37 °C
Genome size                   1,860,725 bp    4,639,221 bp
ORFs                          1,877           4,288
Expressed & soluble           62              48
Concentratable to > 2 mg/ml   50              44

[Venn diagram, T. maritima and E. coli: 15, 35, 9. 9 proteins could not be purified from either species]
Total crystals (30)
[Venn diagram, T. maritima and E. coli: 11, 3, 13]
Total good/promising NMR spectra (14)
[Venn diagram, T. maritima and E. coli: 2, 4, 4]
NMR & crystallography: complementary!
• 24 small proteins for which both crystal trials and NMR data were collected
[Venn diagram, good/promising HSQC and crystals: 10, 3, 6]
• Of 32 proteins that gave poor HSQCs, 7 have crystallized