IMA Thematic Year on Mathematics of Materials and Macromolecules
Thanks to Local Organizers: Mitch Luskin, Maria Calderer, Dick James
Effective Theories for Materials and Macromolecules
Sloppy Models: Universality in Data Fitting
Kevin S. Brown, JPS, Rick Cerione, Chris Myers, Kelvin Lee, Josh Waterfall, Fergal Casey, Ryan Gutenkunst, Søren Frederiksen, Karsten Jacobsen, Colin Hill, Guillermo Calero
[Figure panels: Cell Dynamics (+NGF); Error Bars for Interatomic Potentials; Fitting Exponentials, Polynomials]
Fitting Decaying Exponentials
Classic Ill-Posed Inverse Problem
Given Geiger counter measurements from a radioactive pile, can we recover the identity of the elements and/or predict future radioactivity?
Good fits with bad decay rates!
6 Parameter Fit
[Figure: decay data for 32P, 35S, 125I. Panels: Fit; Ensemble: Interpolation; Ensemble: Extrapolation]
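The "good fits with bad decay rates" point can be illustrated with a small sketch. It is not the 6 parameter fit from the slide: the amplitudes, rates, and time grid below are hypothetical values chosen only to show that two decay models whose rates differ by 20% can produce nearly the same curve.

```python
import math

def decay_signal(t, amplitudes, rates):
    """Total signal at time t from a sum of decaying exponentials."""
    return sum(a * math.exp(-g * t) for a, g in zip(amplitudes, rates))

# Time grid t = 0.0, 0.1, ..., 10.0 in arbitrary units (assumed).
times = [0.1 * k for k in range(101)]

# Two hypothetical parameter sets with the same total amplitude.
one_species = dict(amplitudes=[1.0], rates=[1.0])
two_species = dict(amplitudes=[0.5, 0.5], rates=[0.8, 1.2])

# Largest pointwise disagreement between the two model curves.
max_gap = max(
    abs(decay_signal(t, **one_species) - decay_signal(t, **two_species))
    for t in times
)
print(f"largest difference between the two curves: {max_gap:.4f}")
```

The two curves stay within roughly 1% of the initial signal, so measurements with noise at that level could not tell the two sets of decay rates apart.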
PC12 Differentiation
Biologists study which proteins talk to which. Modeling?
[Figure: signaling network with nodes EGFR, NGFR, Sos, Ras, Raf-1, MEK1/2, ERK1/2; annotations "Tunes down signal (Raf-1)" and "Pumps up signal (Mek)"; plots of ERK* versus Time (10') for +EGF and +NGF]
‘Sloppy Model’ Errors for Atoms
Søren Frederiksen, Karsten W. Jacobsen, Kevin Brown, JPS
Bayesian Ensemble Approach to Error Estimation of Interatomic Potentials
Interatomic Potentials V(r1, r2, …)
• Fast to compute
• Limit me/M → 0 justified
• Guess functional form
• Pair potential V(ri-rj) poor
• Bond angle dependence
• Coordination dependence
• Fit to experiment (old)
• Fit to forces from electronic structure calculations (new)
17 Parameter Fit
[Figure: Quantum Electronic Structure (Si) 90 atoms (Mo) (Arias); Atomistic potential, 820,000 Mo atoms (Jacobsen, Schiøtz)]
Why the Name Sloppy Model?
Hessian ∂²C/∂θ² at Best Fit
Eigenvalues Span Huge Range
Each eigenvalue ~three times next
Ill-conditioned
Stiff ~1 cm, Sloppy ~meters, km
Sloppy Directions: Small Eigenvalues
Huge Fluctuations around Best Fit
Local Collinearity of Parameters
Many alternative fits just as good
Huge ranges of allowed parameters
[Figure: eigenvalue spectra for the Tyson, Brown, and Kholodenko models; axes: bare parameters, eigen parameters; labels: Best Fit, Stiff, Sloppy, Eigenvalue]
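A minimal numerical sketch of "eigenvalues span a huge range", assuming a toy model y(t) = sum_i A_i exp(-g_i t) with hypothetical best-fit values and perfect data. With perfect data the residuals vanish at the best fit, so the Hessian of the least-squares cost C = 0.5 * sum(residual²) equals JᵀJ, where J is the Jacobian with respect to the log-parameters. None of the numbers come from the models on the slide.

```python
import numpy as np

# Hypothetical best-fit values for y(t) = sum_i A_i * exp(-g_i * t).
amplitudes = np.array([1.0, 1.0, 1.0])
rates = np.array([1.0, 2.0, 3.0])
t = np.linspace(0.0, 5.0, 51)

# Jacobian of the model with respect to log-parameters (log A_i, log g_i).
decays = np.exp(-np.outer(t, rates))                     # shape (51, 3)
d_log_amplitude = amplitudes * decays                    # dy/d(log A_i)
d_log_rate = -amplitudes * rates * t[:, None] * decays   # dy/d(log g_i)
jacobian = np.hstack([d_log_amplitude, d_log_rate])      # shape (51, 6)

# Hessian = J^T J at the best fit; its eigenvalues are the squared singular
# values of J (the SVD is used here for numerical accuracy).
singular_values = np.linalg.svd(jacobian, compute_uv=False)
eigenvalues = singular_values**2                         # stiffest first

for k, lam in enumerate(eigenvalues):
    print(f"eigenvalue {k}: {lam:.3e}")
print(f"stiffest / sloppiest = {eigenvalues[0] / eigenvalues[-1]:.3e}")
```

Allowed fluctuations along an eigendirection scale as one over the square root of its eigenvalue, which is why the stiff directions are tightly constrained while the sloppy ones are not.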
Many fitting problems are sloppy
Sloppy Model Eigenvalues
• Cell Dynamics (Signal Transduction)
• Molybdenum Interatomic Potential
• Polynomial Fitting
Lessons:
• Sloppy Due to Insufficient Data? No: Perfect Data Sloppy Too
• Survives Anharmonicity? Yes: Principal Component Analysis
[Figure labels: Anharmonic; H; Perfect (Fake) Data]
Ensemble of Models
We want to consider not just minimum cost fits, but all parameter sets consistent with the available data. New level of abstraction: statistical mechanics in model space.
Generate an ensemble of states with Boltzmann weights exp(-C/T) and compute averages for an observable O, where O is a chemical concentration, or rate constant …
Don’t trust predictions that vary
[Figure axes: bare, eigen]
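A minimal sketch of generating such an ensemble with the Metropolis algorithm, which is one standard way to sample Boltzmann weights exp(-C/T); the slides do not say which sampler was used. The quadratic cost, the temperature, the step size, and the function names are all assumptions for illustration.

```python
import math
import random

def cost(theta):
    """Toy quadratic cost with one stiff and one sloppy direction (assumed)."""
    stiff, sloppy = theta
    return 0.5 * (100.0 * stiff**2 + 1.0 * sloppy**2)

def metropolis_ensemble(cost_fn, theta0, temperature, n_steps, step_size, rng):
    """Sample parameter sets with probability proportional to exp(-C/T)."""
    theta = list(theta0)
    current = cost_fn(theta)
    ensemble = []
    for _ in range(n_steps):
        trial = [x + rng.gauss(0.0, step_size) for x in theta]
        proposed = cost_fn(trial)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if proposed <= current or rng.random() < math.exp(-(proposed - current) / temperature):
            theta, current = trial, proposed
        ensemble.append(tuple(theta))
    return ensemble

def mean_and_std(values):
    """Ensemble average and fluctuation of an observable."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

rng = random.Random(0)
ensemble = metropolis_ensemble(cost, (0.0, 0.0), temperature=1.0,
                               n_steps=20000, step_size=0.3, rng=rng)

# Observables: here simply each parameter; any function O(theta) would do.
_, stiff_std = mean_and_std([theta[0] for theta in ensemble])
_, sloppy_std = mean_and_std([theta[1] for theta in ensemble])
print(f"fluctuation along stiff direction:  {stiff_std:.3f}")
print(f"fluctuation along sloppy direction: {sloppy_std:.3f}")
```

The spread of an observable over the ensemble plays the role of its error bar: a prediction that varies widely across acceptable fits should not be trusted.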
48 Parameter “Fit” to Data
Cost is Energy
Ensemble of Fits Gives Error Bars
Error Bars from Data Uncertainty
[Figure: ERK* versus Time (10') for +EGF and +NGF; axes: bare, eigen]
Does the Erk Model Predict Experiments?
Model predicts that the left branch isn’t important
Predictive Despite Sloppy Fluctuations!
[Figure panels: Model Prediction; Brown’s Experiment]
Which Rate Constants are in the Stiffest Eigenvector?
Eigenvector components along the bare parameters reveal which ones are most important for a given eigenvector.
[Figure: components of the stiffest and 2nd stiffest eigenvectors, with starred entries; labels: Oncogenes, Ras, Raf1]
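Reading off the dominant bare parameters of the stiffest eigenvector can be sketched as follows. The Hessian here is a toy matrix and the names k_a through k_d are hypothetical placeholders; they are not rate constants of the ERK model.

```python
import numpy as np

# Hypothetical rate-constant names and a toy Hessian (not the ERK model's).
names = ["k_a", "k_b", "k_c", "k_d"]
direction = np.array([0.9, 0.4, 0.1, 0.0])
direction /= np.linalg.norm(direction)
hessian = 100.0 * np.outer(direction, direction) + 0.01 * np.eye(4)

# eigh returns eigenvalues in ascending order, so the stiffest mode is last.
eigenvalues, eigenvectors = np.linalg.eigh(hessian)
stiffest = eigenvectors[:, -1]

# Rank bare parameters by the magnitude of their component (sign is arbitrary).
ranking = sorted(zip(names, np.abs(stiffest)), key=lambda pair: -pair[1])
for name, weight in ranking:
    print(f"{name}: {weight:.2f}")
```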
Interatomic Potential Error Bars
Ensemble of Acceptable Fits to Data
• Not transferable
• Unknown errors
• 3% elastic constant
• 10% forces
• 100% fcc-bcc, dislocation core
Best fit is sloppy: ensemble of fits that aren’t much worse than best fit. Ensemble in Model Space!
T0 set by equipartition energy = best cost
Error Bars from quality of best fit
[Figure: Green = DFT, Red = Fits; label: T0]
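One way to read "T0 set by equipartition energy = best cost" is sketched below. It assumes a harmonic expansion of the cost around the best fit, N fit parameters, and a best-fit cost C_0; the symbols H, N, and C_0 are introduced here for illustration and do not appear on the slide.

```latex
C(\theta) \approx C_0 + \tfrac{1}{2}\,\delta\theta^{\mathsf T} H\,\delta\theta,
\qquad
\langle C \rangle_T - C_0 = \frac{N T}{2}
\quad\Longrightarrow\quad
\frac{N T_0}{2} = C_0,
\qquad
T_0 = \frac{2 C_0}{N}.
```

Under these assumptions each of the N harmonic modes carries an average cost of T/2, and choosing T_0 so that the total thermal cost equals the best-fit cost ties the width of the ensemble to the quality of the best fit.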
Sloppy Molybdenum: Does it Work?
Comparing Predicted and Actual Errors
Sloppy model error σi gives total error if ratio r = errori/σi distributed as a Gaussian: cumulative distribution P(r) = Erf(r/√2)
• Three potentials
• Force errors
• Elastic moduli
• Surfaces
• Structural
• Dislocation core
• 7% < σi < 200%
Note: tails… MEAM errors underestimated by ~ factor of 2
“Sloppy model” systematic error most of total: ~2 << 200%/7%
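The Gaussian check can be sketched as a comparison between the observed fraction of ratios below r and Erf(r/√2). The list of ratios below is made up for illustration; it is not the molybdenum data.

```python
import math

def gaussian_cumulative(r):
    """P(|x| <= r) for a unit Gaussian, i.e. Erf(r / sqrt(2))."""
    return math.erf(r / math.sqrt(2.0))

def empirical_cumulative(ratios, r):
    """Fraction of observed ratios |error_i / sigma_i| that are <= r."""
    return sum(abs(x) <= r for x in ratios) / len(ratios)

# Hypothetical ratios r_i = error_i / sigma_i, for illustration only.
ratios = [0.2, -0.5, 0.9, 1.3, -0.1, 0.7, -1.8, 0.4, 2.6, -0.6]

for r in (0.5, 1.0, 2.0, 3.0):
    print(f"r = {r}: observed {empirical_cumulative(ratios, r):.2f}, "
          f"Gaussian {gaussian_cumulative(r):.2f}")
```

If the predicted error bars were systematically too small, the observed curve would rise more slowly than the Gaussian one, with excess weight in the tails.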
Fitting Polynomials: Hilbert
• Polynomial fit: L2 norm
• Hessian = 1/(i+j+1) = Hilbert matrix (classic ill-conditioned matrix)
• Monomial coefficients θn sloppy
• Orthonormal shifted Legendre: coefficients an not sloppy
What is Sloppiness?
Sloppiness as Perverse, Skewed Choice of Preferred Basis (Human or Biological)
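A short numerical check of the basis argument, assuming a fit on the interval [0, 1] with six coefficients (both choices are assumptions). In the monomial basis the Hessian is the Hilbert matrix, whose eigenvalues span many decades; in the orthonormal shifted Legendre basis sqrt(2k+1) P_k(2x-1) the same Hessian is the identity.

```python
import numpy as np
from numpy.polynomial import legendre

n = 6  # number of polynomial coefficients (an arbitrary choice)

# Monomial basis x^i on [0, 1]: Hessian_ij = integral of x^i x^j = 1/(i+j+1).
i, j = np.indices((n, n))
hilbert = 1.0 / (i + j + 1)
hilbert_eigs = np.linalg.eigvalsh(hilbert)[::-1]   # stiffest first

# Orthonormal shifted Legendre basis sqrt(2k+1) * P_k(2x - 1) on [0, 1],
# integrated with a Gauss-Legendre rule on u = 2x - 1 in [-1, 1].
nodes, weights = legendre.leggauss(n + 2)
basis = legendre.legvander(nodes, n - 1) * np.sqrt(2 * np.arange(n) + 1)
gram = basis.T @ (basis * (0.5 * weights)[:, None])  # factor 0.5 from dx = du/2

print("Hilbert eigenvalues:", hilbert_eigs)
print("condition number:", hilbert_eigs[0] / hilbert_eigs[-1])
print("Legendre Hessian is the identity:", np.allclose(gram, np.eye(n)))
```

The fitted function is the same in both cases; only the coordinates change, which is the sense in which sloppiness reflects a skewed choice of basis.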
Exploring Parameter Space
• Glasses: Rugged Landscape
• Metastable Local Valleys
• Transition State Passes
• Optimization Hell: Golf Course
• Sloppy Models
• Minima: 5 stiff, N-5 sloppy
• Search: Flat planes with cliffs
Rugged? More like Grand Canyon (Josh)
Work In Progress
Ensemble Fluctuations Along Eigendirections
Monte Carlo Fluctuations Suppressed in Soft Directions: Anharmonicity or Convergence?
[Figure: log_e fluctuations along eigendirection, from stiff to sloppy; reference line: 3x previous]
Work In Progress
Error Bars: Stochastic versus Sensitivity
• Sensitivity Analysis = Harmonic Approximation for Errors
• Yields Much Larger Prediction Fluctuations
• Anharmonicity Constrains Soft Modes
• Mimic w/ modest prior (fluctuations < 10^6, one σ)
• Sensitivity w/Prior Fluctuations Now Close to Monte Carlo
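The effect of a modest prior on harmonic error bars can be sketched with linearized error propagation. Everything below is a toy under stated assumptions: the Hessian is diagonal in log-parameters with each eigenvalue one third of the previous, the gradient of the prediction is made up, and the prior is taken to be Gaussian with a one-sigma width of ln(10^6) in each log-parameter.

```python
import numpy as np

n = 16
temperature = 1.0
# Toy Hessian in log-parameters, already diagonal, each eigenvalue one third
# of the previous one (an assumption echoing the factor-of-three spacing).
hessian = np.diag(3.0 ** -np.arange(n))

# Hypothetical sensitivity of a prediction O to each log-parameter, dO/dtheta.
gradient = np.ones(n) / np.sqrt(n)

def prediction_variance(precision, grad):
    """Linearized error propagation: var(O) = g^T * covariance * g."""
    return grad @ np.linalg.solve(precision, grad)

# Harmonic approximation: covariance = T * H^{-1}.
var_harmonic = prediction_variance(hessian / temperature, gradient)

# Gaussian prior letting each parameter vary by a factor 10^6 at one sigma.
prior_sigma = np.log(1.0e6)
var_with_prior = prediction_variance(
    hessian / temperature + np.eye(n) / prior_sigma**2, gradient)

print(f"harmonic std of O:    {np.sqrt(var_harmonic):.3g}")
print(f"with prior, std of O: {np.sqrt(var_with_prior):.3g}")
```

In this toy the prior caps the contribution of the softest modes, so the predicted fluctuation shrinks by orders of magnitude while the stiff directions are essentially unaffected.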
Work In Progress
Sloppy Model Universality?
Why are all these problems so similar?
Random Matrix GOE Ensemble: many different NxN random symmetric matrices have level repulsion, universal ~Wigner-Dyson spacings as N→∞
Product ensemble: equally spaced logs! Stronger level repulsion
Fitting exponentials: very strong level repulsion! New random matrix ensemble?
[Figure: Strong Level Repulsion; fitting exponentials: stiffest minus second]