The folding network of villin headpiece subdomain Hongxing Lei Beijing Institute of Genomics Chinese Academy of Sciences
The importance of protein folding
• Amyloid diseases
• Alzheimer’s disease (AD)
• Parkinson’s disease (PD)
• Huntington’s disease
• Prion diseases
• Amyotrophic lateral sclerosis (ALS)
• Protein structure prediction
• Protein design
[Figure: folding pathways from the unfolded state to the native state. Labels: formation of microdomains; diffusion and collision of microdomains; collapse; formation of a nucleus.]
Folding funnel Onuchic & Wolynes, COSB 2004, 14:70-75
The challenges in all-atom protein folding
Time scale
• Protein folding: seconds
• Simulation: microseconds
• Gap: 10^6
• Solution: ultrafast-folding proteins / supercomputers
Energetic accuracy
• ΔG_fold: a few kcal/mol (on the order of a hydrogen bond)
• Requires a highly accurate force field
Ab initio all-atom protein folding
• 1998: villin headpiece, 36 amino acids, 3+ Å
• 2002/2003:
  • Trp-cage, 20 amino acids, 1 Å
  • Villin headpiece by Folding@Home (3.8 Å)
  • Villin headpiece by Shen et al. (3.0 Å)
  • BBA5 by Folding@Home (2.2-2.5 Å)
• Recently (Scheraga and others):
  • A few small proteins, 2.0-4.0 Å
Best folded structure from simulation: Cα RMSD 0.39 Å
Free energy landscape (panels: 0-2.5 μs, 2.5-5.0 μs, 5.0-7.5 μs, 7.5-10.0 μs)
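A landscape of this kind is commonly estimated from how often each region of a chosen coordinate is visited, using F = kT ln(P_max / P). The sketch below illustrates that relation only. The slides do not state the coordinates, bin width, or temperature used, so the one-dimensional coordinate, the 1 Å bins, the 300 K temperature, and the sample values are all assumptions.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def free_energy_profile(values, bin_width, temperature=300.0):
    """Return {bin_index: free energy in kcal/mol} relative to the most populated bin."""
    # Histogram the sampled coordinate (for example an RMSD time series).
    counts = Counter(math.floor(v / bin_width) for v in values)
    max_count = max(counts.values())
    kt = K_B * temperature
    # F = kT * ln(P_max / P); the most populated bin is the zero of free energy.
    return {b: kt * math.log(max_count / c) for b, c in sorted(counts.items())}

# Hypothetical RMSD samples in angstroms, not data from the slides.
samples = [1.1, 1.2, 1.4, 1.3, 1.6, 4.2, 4.4, 6.1]
profile = free_energy_profile(samples, bin_width=1.0)
for bin_index, f in profile.items():
    print(f"bin {bin_index}: {f:.2f} kcal/mol")
```

Splitting a trajectory into time windows, as in the four panels, would amount to calling the function once per window.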
Scale-free property: R² = 0.786
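A scale-free network has a degree distribution that follows a power law, P(k) ~ k^(-γ), which appears as a straight line on a log-log plot. The sketch below shows one common way to get an R² for such a fit: ordinary least squares on log10 P(k) versus log10 k. The slides do not say how their R² was computed, so this procedure is an assumption, and the degree list is synthetic.

```python
import math
from collections import Counter

def loglog_fit(degrees):
    """Fit log10 P(k) = intercept + slope * log10 k and return (slope, intercept, r2)."""
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    ks = sorted(counts)
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(counts[k] / total) for k in ks]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2

# Synthetic degrees that follow P(k) ~ k^-2 exactly (64, 16, 4, 1 nodes of degree 1, 2, 4, 8).
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope, intercept, r2 = loglog_fit(degrees)
print(f"gamma = {-slope:.2f}, R^2 = {r2:.3f}")
```

The function needs at least two distinct degrees whose frequencies differ; otherwise the variances in the denominators are zero.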
Hubs
• Degree: 45; RMSD-ALL: 7.26 Å; RMSD-CA: 5.90 Å; RMSD-segment A: 5.17 Å; RMSD-segment B: 1.63 Å; RGYR: 9.75 Å; Population: 124243
• Degree: 17; RMSD-ALL: 5.98 Å; RMSD-CA: 4.27 Å; RMSD-segment A: 3.96 Å; RMSD-segment B: 1.18 Å; RGYR: 10.80 Å; Population: 1735
• Degree: 24; RMSD-ALL: 3.75 Å; RMSD-CA: 1.50 Å; RMSD-segment A: 0.36 Å; RMSD-segment B: 0.59 Å; RGYR: 10.17 Å; Population: 32090
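The degree and population reported for each hub can be illustrated with a small sketch. It assumes a common construction that the slides do not spell out: each node is a conformational cluster, an edge joins two clusters whenever the trajectory moves directly from one to the other, degree is the number of distinct neighbours, and population is the number of frames assigned to the cluster. The cluster labels below are hypothetical.

```python
from collections import Counter, defaultdict

def build_network(labels):
    """Return (population, neighbors) from a per-frame sequence of cluster labels."""
    population = Counter(labels)
    neighbors = defaultdict(set)
    # Consecutive frames in different clusters define an undirected edge.
    for a, b in zip(labels, labels[1:]):
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return population, neighbors

def top_hubs(neighbors, n=3):
    """Return the n nodes with the highest degree (ties broken by label)."""
    return sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))[:n]

# Hypothetical cluster assignment of nine frames.
labels = ["U", "I1", "U", "I2", "U", "N", "I1", "N", "N"]
population, neighbors = build_network(labels)
for node in top_hubs(neighbors, n=2):
    print(node, "degree:", len(neighbors[node]), "population:", population[node])
```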
Bottlenecks
• Betweenness: 2.78; RMSD-ALL: 6.24 Å; RMSD-CA: 5.02 Å; RMSD-segment A: 4.40 Å; RMSD-segment B: 1.53 Å; RGYR: 10.86 Å; Population: 550
• Betweenness: 2.95; RMSD-ALL: 5.70 Å; RMSD-CA: 4.34 Å; RMSD-segment A: 3.38 Å; RMSD-segment B: 1.34 Å; RGYR: 10.42 Å; Population: 237
• Betweenness: 4.11; RMSD-ALL: 6.63 Å; RMSD-CA: 4.03 Å; RMSD-segment A: 4.64 Å; RMSD-segment B: 1.07 Å; RGYR: 11.02 Å; Population: 873
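Betweenness centrality measures how many shortest paths between other nodes pass through a given node, which is why high-betweenness, low-population states are read as bottlenecks. The sketch below uses Brandes' algorithm on an unweighted, undirected graph and returns raw path counts. The normalization and any edge weighting behind the values on the slide are not stated, so the numbers here are not directly comparable. The toy graph is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm; adj maps every node to the set of its neighbours."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both endpoints.
    return {v: c / 2.0 for v, c in score.items()}

# Hypothetical graph: two triangles (A, B, C) and (D, E, F) joined through node X.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "X"),
         ("X", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

bc = betweenness(adj)
print(max(bc, key=bc.get))  # node with the highest betweenness
```

In this toy graph the connecting node X carries every path between the two triangles, so it scores highest even though its degree is only 2.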
Free energy landscape (panels: 0-2.5 μs, 2.5-5.0 μs, 5.0-7.5 μs, 7.5-10.0 μs)
Scale-free property: R² = 0.723
Hubs
• Degree: 36; RMSD-ALL: 3.73 Å; RMSD-CA: 1.71 Å; RMSD-segment A: 0.63 Å; RMSD-segment B: 0.69 Å; RGYR: 10.05 Å; Population: 61485
• Degree: 31; RMSD-ALL: 5.99 Å; RMSD-CA: 3.92 Å; RMSD-segment A: 4.13 Å; RMSD-segment B: 0.97 Å; RGYR: 11.50 Å; Population: 2689
• Degree: 30; RMSD-ALL: 6.83 Å; RMSD-CA: 5.83 Å; RMSD-segment A: 4.88 Å; RMSD-segment B: 1.65 Å; RGYR: 9.93 Å; Population: 5991
• Degree: 22; RMSD-ALL: 6.75 Å; RMSD-CA: 5.13 Å; RMSD-segment A: 5.04 Å; RMSD-segment B: 0.61 Å; RGYR: 12.30 Å; Population: 2854
Bottlenecks
• Betweenness: 2.27; RMSD-ALL: 6.22 Å; RMSD-CA: 4.50 Å; RMSD-segment A: 4.84 Å; RMSD-segment B: 1.82 Å; RGYR: 10.97 Å; Population: 890
• Betweenness: 2.46; RMSD-ALL: 7.23 Å; RMSD-CA: 5.80 Å; RMSD-segment A: 5.17 Å; RMSD-segment B: 0.82 Å; RGYR: 10.63 Å; Population: 392
• Betweenness: 2.48; RMSD-ALL: 6.62 Å; RMSD-CA: 4.93 Å; RMSD-segment A: 4.50 Å; RMSD-segment B: 1.13 Å; RGYR: 11.43 Å; Population: 260
SCORING FUNCTIONS
• Knowledge-based functions (compactness; surface area; contact order)
• Physics-based functions (free energy; potential energy; hydrogen bond energy; VDW energy)
OUR SCORING FUNCTION
$F(E) = E_{SE} + a\,E_{FF} + b\,E_{HB}$
• $E_{SE}$ = the statistical energy
• $E_{FF}$ = the force field physical energy with GB solvation model
• $E_{HB}$ = the main chain hydrogen bonding energy
• $a$ = the coefficient of the force field physical energy term
• $b$ = the coefficient of the main chain hydrogen bonding energy term
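The combined score is a weighted sum of three energy terms, and a minimal sketch makes clear how it would rank a set of decoys. The energy values and the coefficients a = 0.1 and b = 0.5 below are hypothetical; the slides do not give the fitted values. The sketch also assumes that a lower score is better, as for an energy.

```python
def combined_score(e_se, e_ff, e_hb, a, b):
    """F(E) = E_SE + a * E_FF + b * E_HB."""
    return e_se + a * e_ff + b * e_hb

def rank_decoys(decoys, a, b):
    """Sort decoys from best (lowest score) to worst; assumes lower is better."""
    return sorted(
        decoys,
        key=lambda d: combined_score(d["e_se"], d["e_ff"], d["e_hb"], a, b),
    )

# Hypothetical decoys with made-up energy terms.
decoys = [
    {"name": "d1", "e_se": -10.0, "e_ff": -200.0, "e_hb": -8.0},
    {"name": "d2", "e_se": -12.0, "e_ff": -150.0, "e_hb": -4.0},
    {"name": "d3", "e_se": -5.0, "e_ff": -220.0, "e_hb": -10.0},
]
ranked = rank_decoys(decoys, a=0.1, b=0.5)
print([d["name"] for d in ranked])
```

Because a and b rescale terms of very different magnitude, the ranking can change with their values, which is why they would need to be fitted on a training set.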
DECOY SETS
http://depts.washington.edu/bakerpg/decoys/
• A wide variety of different proteins
• Close to the native structure
• Produced by a relatively unbiased procedure
Decoy sets
• Acceptable decoys: RMSD < 5 Å
• Total: 534 (38.14%)
• Training sets (14 × 100)
• Testing sets (13 × 100)
• Group a: sets containing 3-11 acceptable decoys
• Group b: sets containing at least 93 acceptable decoys
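The quoted percentage is consistent with 534 acceptable decoys out of the 14 × 100 = 1400 training decoys, although the slide does not state the denominator explicitly. The sketch below applies the 5 Å cutoff to a list of RMSD values and checks that arithmetic; the RMSD list is hypothetical.

```python
def acceptable_fraction(rmsds, cutoff=5.0):
    """Return (count, fraction) of decoys with RMSD strictly below the cutoff."""
    acceptable = sum(1 for r in rmsds if r < cutoff)
    return acceptable, acceptable / len(rmsds)

# Hypothetical RMSD values (in angstroms) for one small decoy set.
rmsds = [2.1, 4.9, 5.0, 7.3, 3.8, 9.6, 6.2, 4.4]
count, fraction = acceptable_fraction(rmsds)
print(count, f"{100 * fraction:.1f}%")

# Check of the slide's figure under the assumed denominator of 14 x 100 decoys.
slide_percent = round(100 * 534 / (14 * 100), 2)
print(slide_percent)
```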
$F(E) = E_{SE} + a\,E_{FF} + b\,E_{HB}$
$E_{FF}$ = the force field physical energy with GB solvation model
Two protocols:
• Minimization only
• Minimization, then a 40 ps molecular dynamics run, then a second minimization
The results from both protocols are very similar, so the less time-consuming protocol was adopted.