500 likes | 651 Views
Boundary-element methods in biological science and engineering. Jaydeep P. Bardhan Dept. of Electrical and Computer Engineering Northeastern University, Boston MA. Four Points For Today. Cellular and molecular biomedical problems also need efficient simulation methods.
E N D
Boundary-element methods in biological science and engineering Jaydeep P. Bardhan Dept. of Electrical and Computer Engineering Northeastern University, Boston MA
Four Points For Today • Cellular and molecular biomedical problems also need efficient simulation methods. • Fast BEM solvers represent an appealing approach even at the molecular scale! • Challenge: Persuading community to abandon beloved ad hoc fast methods for systematic ones. • Strategy: Systematic methods are more flexible as we add new physics and address inverse problems.
Point 1: Efficient solvers are needed not only for macroscopic biomedical problems… 10-5 m 10-1 m 10-9 m 100 m 10-2 m Human body Human cells Molecules Organs/ tissues Electrical Impedance Tomography (EIT) Electrocardiography (ECG) Tumor growth Transport through blood vessel walls Balsim et al. ‘10 Electroencephalography (EEG) Brain-computer interfaces Lowengrub et al. ‘09 Cochlea (ear) ... And MEG Human eye for keratoplasty Hip prostheses Briare et al. ‘00 Peratta et al. ‘08 Prof. Bin He, UMN Sfantos et al ‘07
… but for microscopic ones as well! 10-5 m 10-1 m 10-9 m 100 m 10-2 m Human body Human cells Molecules Organs/ tissues Nanotechnology (quantum dots) Quantum mechanics Biomolecule electrostatics and hydrodynamics Blood flow • Drug binding • Protein folding • Cell physiology • Molecular design Gelbard ‘01 Rahimian, Biros et al (2010) Cell locomotion using flagella (sperm, bacteria) Molecular flexibility Cell “rolling” along tissue surface Cell adhesion to surfaces under shear flow Ramia ‘91 M. Bathe ‘08 Gaver and Kute ‘98 King and Hammer ‘01
d = 0 d = 1 A central molecular-scale modeling problem: water. • Biology uses water to control molecular binding, protein folding, etc. Binding example is simple: Protein
Modeling ions in solution is critical! But today’s focus is on the simpler math of “pure” water. Basic Continuum Electrostatic Theory • 100-1000 times faster than MD • Protein model: • Shape: “union of spheres” (atoms) • Point charges at atom centers • Not very polarizable: = 2-4 • Water model: no fixed charges • Single water: sphere of radius 1.4 Angstrom • Highly polarizable: = 80 • In total: mixed-dielectric Poisson Linearized Poisson- Boltzmann equation
+ - + + + + - - + + - - + - - + Conservation law Constitutive relation A Boundary Integral Method For the Poisson Biomolecule Problem Boundary conditions handled exactly Point charges are treated exactly Meshing emphasis can be placed directly on the interface
Fast BEM Solvers are Essential Solve Ax=bapproximately using Krylov-subspace iterative methods such as GMRES: Compute dense matrix-vector product using O(N) method (fast multipole; tree code; precorrected FFT; FFTSVD) Improve iterative convergence with preconditioning For many problems, use diagonal entries! Iteration converges faster if matrix eigenvalues are “well clustered” P“looks like”A-1 Memory growth is QUADRATIC Time is CUBIC!! Replace quadratic memory and cubic time requirements with LINEAR requirements!
Application-Specific Challenges • “Continuum-solvent dynamics” • Replace water molecules with dielectric • Calculate forces at each time step and integrate • Continuum post-processing of molecular dynamics • Sample structures from explicit-water MD • Compute average continuum energy from samples • Electrostatic component analysis • Compute each atom’s interaction with every other • Useful in drug design and protein engineering! These lead to thousands, or even millions, of electrostatic simulations... Some with identical dielectric boundaries, some with changing boundaries!
Community’s Solution: Fast Ad Hoc Models • Up to 100X faster than solving Poisson • Define effective (nonphysical) parameters • Plug in to ad hoc (nonphysical) formula Find the radius R of a sphere that would have the same energy given a central charge A given charge q in complex molecule gives rise to an energy E Distance between charges Effective radii
Generalized Born theory • Can give blatantly unphysical results … • … exhibits incorrect dependence on dielectric constants… • … needs all manner of handwaving justifications for improvements … • … is VERY, VERY popular.
BIBEE: A New, Rigorous Model of Continuum Electrostatics for Proteins “Boundary Integral Based Electrostatics Estimation” • Idea: Use preconditioner to approximate inverse • No need to compute sparsified operator (saves time and memory) • No need for Krylov solve • Test of elementary charges in a 20-Angstrom sphere: Single +1 charge +1, -1 charges 3 A apart
BIBEE: Introducing Different Variants The preconditioning approximation takes into account the singular character of the electric-field kernel: The Coulomb-field approximation ignores the operator entirely: CFA seems better here… …and worse here.
R1 R2 R3 BIBEE/CFA is the extension of CFA to multiple charges! No ad hoc parameters, no heuristic interpolation BIBEE: Natural, Rigorous Generalized Born + + BIBEE approx. charge includes all contributions “Effective Born radius” - the radius of a sphere with the same solvation energy Coulomb-field approximation: corresponds exactly to ignoring the integral operator. Still equation: the basis of totally nonphysical Generalized Born (GB) models Same approach taken by Borgis et al. in variational CFA
Complementary Regimes of Accuracy V20 V1 V2 • Small molecule’s reaction potential matrix and eigendecomposition (not the integral operator) • Top right: the electric fields induced by several eigenvectors of L, at the dielectric boundary • Charge distributions that generate uniform displacement fields are “like” low-order multipoles: CFA does well here and P does poorly • Small eigenvalues are associated with charge distributions that generate rapidly varying displacement fields; these are approximated well by P, not CFA
Goal: Make Fast Models More Rigorous Many advantages for chemists/biophysicists: • Enables systematic model improvement • Prove approximation properties • Leverage existing fast, scalable algorithms • Can add better physics as we learn them • Natural coupling to inverse problems
i -1/10 -1/6 -1/2 A hundred years of analysis We really want to approximate the dominant modes of the integral operator. • The integral operator has to be split into two terms • BIBEE approximates E’s eigenvalues • P uses 0 (limit for sphere, prolate spheroid) • CFA uses -1/2 (known extremal) Sphere: analytical • Eigenvalues are real • in [-1/2,+1/2) • -1/2 is always an EV • Left, right eigenvectors of -1/2 are constants
i -1/10 -1/6 Dominant energies come from dominant modes: try to capture dipole/quadrupole modes approximately! -1/2 Mathematical Rigor Enables Systematic Improvements Snapshots from MD Many parameters and ad hoc correction terms BIBEE fluctuations track actual ones very closely – possible applications in uncertainty quantification Mean absolute error: 4% ! • This effective parameter is expected to be rigorously determined by approximating protein as ellipsoid (Onufriev+Sigalov, ‘06) Bardhan+Knepley, J. Chem. Phys. (in press)
Goal: Make Fast Models More Rigorous Many advantages for chemists/biophysicists: • Enables systematic model improvement • Prove approximation properties • Leverage existing fast, scalable algorithms • Can add better physics as we learn them • Natural coupling to inverse problems
Feig et al. test set, > 600 proteins BIBEE/CFA Energy Is a Provable Upper Bound • BIBEE/P is an effective lower bound, provable in some cases but not all • Another variant (BIBEE/LB) is a provable LB but too loose to be useful Bardhan, Knepley, Anitescu (2009)
The Reaction-Potential Matrix • A weighted combination of charge distributions in the solute molecule produces a weighted combination of the individual responses: • The “canonical” basis is the natural, atom-based point of view • We can also use the eigenvector basis for analysis! • In comparing models we don’t just have to use the total electrostatic solvation free energy
Reaction-Potential Operator Eigenvectors Have Physical Meaning • Eigenvectors from distinct eigenvalues are orthogonal • Eigenvectors correspond to charge distributions that do not interact via solvent polarization (this confuses chemists) • If an approximate method generates a solvation matrix , its eigenvectors should “line up” well with the actual eigenvectors, i.e. i = j
BIBEE in Separable Geometries • For half-spaces, spheres, ellipsoids, BIBEE exactly reproduces actual eigenvectors. • Proof for spheres, ellipsoids: use appropriate harmonics • Question for future: What about near separable geometries? Bardhan and Knepley, 2011
SGB/CFA GBMV BIBEE/CFA BIBEE Is An Accurate, Parameter-Free Model Snapshots from MD • Peptide example Met-enkephalin BIBEE’s stronger “diagonal” appearance indicates superior reproduction of the eigenvectors of the operator. All models look essentially the same here.
Goal: Make Fast Models More Rigorous Many advantages for chemists/biophysicists: • Enables systematic model improvement • Prove approximation properties • Leverage existing fast, scalable algorithms • Can add better physics as we learn them • Natural coupling to inverse problems
Pre-corrected FFT Algorithm • Potential calculation is a convolution. • Convolutions are “cheap” in frequency space • Green’s function independent! (Laplace, Helmholtz, Stokes, etc.) • Project charges to grid • Point-wise multiplication in frequency space • Interpolate grid potentials • “Pre-correct” so that local interactions are accurate Bioelectromagnetics Proteins Circuit Simulation Aerodynamics Kuo, Altman, Bardhan, Tidor, White (2002) Willis, Peraire, White Cadence Design Systems Phillips and White (1997)
Higher-order Protein BEM A geometry representative of a protein: The barnase-barstar protein complex: Bardhan + Altman et al., 2007 Altman + Bardhan, White, Tidor 2009
Develop scalable protein simulations with leaders in parallel computing + FMM 760-node GPU cluster Degima Parallel GPU FMM code Picture courtesy T. Hamada Cost of cluster: ~ US $420,000 Sustained: 34.6 Tflops Performance/price: 80 Mflops/$ Application to proteins with PetFMM code of Yokota, Cruz, Barba, Knepley, Hamada
800 Å 1 copy 100 copies 10 copies 1000 copies Scalable algorithms enable bigger science Lysozyme: ~2K atom charges, ~15K surface charges • “How do proteins work in the crowded environment of the cell?” 1000 lysozyme molecules: model of a concentrated protein solution Yokota, Bardhan, et al. 2009
Goal: Make Fast Models More Rigorous Many advantages for chemists/biophysicists: • Enables systematic model improvement • Prove approximation properties • Leverage existing fast, scalable algorithms • Can add better physics as we learn them • Natural coupling to inverse problems
We are still adding physics to our models. Speed “Classical” modeling: one can assume the model is right! All simulate same thing! Circuit simulation: Maxwell equations Solid mechanics: elasticity Airplane simulation: Navier-Stokes CAD tools Accuracy Bio-modeling: “All models are wrong, some are useful”* Diverse set of flawed models. To avoid flaws, use expert insight. New models are always evolving! We have to connect multiple models (uncertainty quantification). These are just the models associated with the molecular scale!! --George Box
Lone pair electrons Hydrogen bonds Oxygen Hydrogens Adding physics to the continuum model using nonlocal dielectric theory KNOWN weaknesses of Poisson model: • Linear response assumption Caveat: Nonlinear dielectrics ARE important for some molecules! • Violates continuum-length-scale assumption • Water molecules have finite size • Water molecules form semi-structured networks Test with all-atom molecular dynamics Relatively small deviation! y=x denotes exactly linear response Nina, Beglov, Roux ‘97
Green’s function for Nonlocal Continuum Electrostatics:Lorentzian Model and Promising Tests • Nonlocal response: • Now • Integrodifferential Poisson equation Single parameter fit for gives much better agreement with experiment!! A. Hildebrandt et al. 2004
Molecular surface “Licorice” “Cartoon” Nonlocal Continuum Electrostatics:Reformulation for Fast Simulations • Integrodifferential equations in complex geometries? • Result: No progress on nonlocal model for DECADES Spherical ions, charges near planar half-spaces… nothing else. • Breakthrough in 2004 (Hildebrandt et al.): • Define an auxiliary field: the displacement potential • Approximate the nonlocal boundary condition • Double reciprocity leads to a boundary-integral method
Nonlocal Continuum Electrostatics: Purely BIE Formulation • Three surface variables, two types of Green’s functions, and a mixed first-second kind problem • The derivation uses double reciprocity theory, which can be applied to nonlinear problems as well! • Have derived exact solution for charges in a sphere Hildebrandt et al. 2005, 2007
Required accuracy Dense methods used previously could not achieve useful accuracy! Just as fast, but now with better physics! • Unoptimized code still allows a laptop to solve 10X larger problems than is possible on a cluster with dense methods • Current work: comparing to molecular dynamics simulations Local model Nonlocal model Bardhan and Hildebrandt, DAC ‘11
Local theory needs unrealistically large dielectric constants to match experiment! 3 2 Error in pKa value (RMSD) Measured protein dielectric constants suggest = 2-5 1 0 5 20 40 60 80 Nonlocal Continuum Electrostatics: Charge Burial and the pKa Problem • Understanding charge burial energetics is important! • For protein folding, misfolding (Alzheimer’s), etc. • For two molecules binding (drug-protein, protein-protein, etc.) • For change in environment (pH, temperature, concentration, etc.) Ion or charged chemical group, alone in water Ion or charged chemical group, buried in protein Demchuk+Wade, 1996
Nonlocal Continuum Electrostatics: Charge Burial and the pKa Problem • Nonlocal theory with realistic dielectric constant predicts similar energies as (widely successful) local theories with unrealistic dielectric constants! Bardhan, J. Chem. Phys. (in press)
A Common Framework for Multiple Models BIBEE provides a unifying, scalable approach to testing and extending new physics. GB-like fast nonlocal approximate model Improved GB models Fast GB-like nonlinear approximations Analytical solution of nonlocal model for sphere Explain Coulomb-field approx. Full nonlinear PB via boundary-integrals Coupling to fast, scalable algorithms Advanced PB models (Bikerman, etc.) Biomolecular complexes Linearized PB models Dynamics: hybrid explicit/implicit, and fully implicit Popular quantum methods couple to exactly our Poisson problem (“polarizable continuum model”) Protein 1 Protein 2
Goal: Make Fast Models More Rigorous Many advantages for chemists/biophysicists: • Enables systematic model improvement • Prove approximation properties • Leverage existing fast, scalable algorithms • Can add better physics as we learn them • Natural coupling to inverse problems
The Value of Systematic Approximations in Inverse Problems: Biomolecule Design • The electrostatic contribution to binding is • A total of three simulations is needed.
Mandal and Hilvert, 2003 Lee and Tidor, 2001 Electrostatic Optimization of Biomolecules:Applications in Analysis and Design • E. coli chorismate mutase inhibitors: • Analyzed by Kangas and Tidor • Suggested substitution experimentally verified: result is the tightest-binding inhibitor yet known • Barnase/barstar protein complex: • Tight-binding complex • Optimal charge distribution closely matches “wild-type” charge distribution
Challenge: Optimization is SLOW. 10 min/simulation * 2000 simulations (protein) = 2 CPU weeks!!
A Novel Method: The Reverse-Schur Approach • For these PDE constraints, we really only need to solve multiple systems simultaneously: • The unconstrained problem is therefore • Constraints can be handled using standard methods (Lagrange multipliers, etc.)
New Approach is Dozens to Hundreds of Times Faster, but Formally Exact Method scales comparably with normal PDE-constrained approaches Formally exact calculation 10 min/simulation = 20 min/optimization (no matter how many charges!) Bardhan et al., 2004; Bardhan et al., 2005; Bardhan et al., 2007; Bardhan et al., 2009
BIBEE as Inverse Problem Regularizer Approximated eigenvectors closely match actual ones • Regularization can be performed using “approximate” penalty functions: • No linear solve: Accurate but 10-20X faster than simulation!! • BIBEE/P captures small eigenvalues very accurately identify number of directions to penalize Single +1 charge +1, -1 charges 3 A apart
Red: Optimized charge values Blue: “Wild-type” charges (from 6-31G*/RESP) Application: Cyclin-Dependent Kinase 2 and Inhibitor PDE-constrained optimization is almost 200 times faster for this small molecule Anderson, et al. 2003 (not exactly the optimized ligand) Bardhan et al., J. Chem. Theory Comput. (2009)
Summary: Pushing On All Dimensions 1. Fast, Scalable Numerical Methods 2. Add Realism But Preserve Speed 3. Solve Inverse Problems in Design 4. Unify Theories For New Science
Four Points For Today • Cellular and molecular biomedical problems also need efficient simulation methods. • Fast BEM solvers represent an appealing approach even at the molecular scale! • Challenge: Persuading community to abandon beloved ad hoc fast methods for systematic ones. • Strategy: Systematic methods are more flexible as we add new physics and address inverse problems.
Collaborators and Acknowledgments • Fast methods: Michael Altman (Merck), Matt Knepley (U. Chicago), Rio Yokota (King Abdullah University of Science and Technology), Lorena Barba (Boston U.), Tsuyoshi Hamada (Nagasaki U.) • Nonlocal continuum theory: Andreas Hildebrandt (Johannes Gutenberg U., Mainz), Peter Brune, David Green (SUNY Stony Brook) • Fast optimization: Michael Altman, Bruce Tidor (MIT), Jacob White (MIT), Jung Hoon Lee (Merck), Sven Leyffer (Argonne) , Steve Benson (Argonne), David Green, Mala Radhakrishnan (Wellesley) • Approximation method: Matt Knepley, MihaiAnitescu (Argonne), Mala Radhakrishnan Support from: • Department of Energy (DOE) Computational Science Graduate Fellowship (CSGF) • Wilkinson Fellowship in Math and Computer Science Division of Argonne National Lab • NIH Technology Development (EUREKA) • Rush New Investigator Award