Distilling Free-Form Natural Laws from Experimental Data
Michael Schmidt and Hod Lipson, Science, vol. 324, no. 5923, pp. 81-85, April 2009
2010. 5. 25, 박한샘
Outline • Overview of this paper • Background & Motivation • Algorithm • Experiments • Conclusion
Overview of This Paper
[Figure: actual pendulum, data, and results]
• Mining physical systems
  • Capture the angles and angular velocities over time using motion tracking
  • Search for equations that describe a single natural law relating these variables, without any prior knowledge of physics or geometry
  • The result turns out to be the double pendulum's Hamiltonian
• The proposed approach is demonstrated on a simple harmonic oscillator and a chaotic double pendulum
Background: Symbolic Regression
• Symbolic regression
  • Searches for both the parameters and the form of equations, unlike traditional linear and nonlinear regression methods
• Process (evolutionary computation)
  • Initial expressions are formed by randomly combining mathematical building blocks such as algebraic operators {+, -, ×, /}, analytical functions (for example, sine and cosine), constants, and state variables
  • New equations are formed by recombining previous equations and probabilistically varying their sub-expressions
  • The algorithm retains equations that model the experimental data better than others and abandons unpromising solutions
  • After equations reach a desired level of accuracy, the algorithm terminates, returning the set of equations most likely to correspond to the intrinsic mechanisms underlying the observed system
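The evolutionary process on this slide can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation: the tuple-based expression trees, the depth cap `MAX_DEPTH`, the population size, the survival fraction, and the target function are all assumptions chosen to keep the example small, and division is left out of the building blocks to avoid divide-by-zero handling.

```python
import math
import random

BINARY = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
UNARY = {"sin": math.sin, "cos": math.cos}
MAX_DEPTH = 3  # hypothetical cap that keeps expressions small and finite

def random_expr(rng, depth=MAX_DEPTH):
    """Randomly combine building blocks into an expression tree (nested tuples)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2, 2), 2)
    if rng.random() < 0.3:
        return (rng.choice(sorted(UNARY)), random_expr(rng, depth - 1))
    op = rng.choice(sorted(BINARY))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    if len(expr) == 2:
        return UNARY[expr[0]](evaluate(expr[1], x))
    return BINARY[expr[0]](evaluate(expr[1], x), evaluate(expr[2], x))

def crossover(a, b, rng):
    """Recombine: replace a subtree of a with a subtree of b taken from the same level."""
    if not (isinstance(a, tuple) and isinstance(b, tuple)) or rng.random() < 0.3:
        return b
    i = rng.randrange(1, len(a))
    j = rng.randrange(1, len(b))
    return a[:i] + (crossover(a[i], b[j], rng),) + a[i + 1:]

def mutate(expr, rng, depth=MAX_DEPTH):
    """Probabilistically vary a sub-expression by regenerating one subtree."""
    if depth == 0 or not isinstance(expr, tuple) or rng.random() < 0.2:
        return random_expr(rng, depth)
    children = list(expr[1:])
    i = rng.randrange(len(children))
    children[i] = mutate(children[i], rng, depth - 1)
    return (expr[0], *children)

def mse(expr, data):
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)

def evolve(data, generations=40, pop_size=200, seed=0):
    rng = random.Random(seed)
    population = [random_expr(rng) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda e: mse(e, data))
        history.append(mse(population[0], data))
        survivors = population[: pop_size // 4]  # retain the better equations
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children  # the rest are abandoned
    best = min(population, key=lambda e: mse(e, data))
    return best, history + [mse(best, data)]

# Hypothetical target: y = x^2 + sin(x), sampled on [-2, 2]
data = [(x / 5, (x / 5) ** 2 + math.sin(x / 5)) for x in range(-10, 11)]
best, history = evolve(data)
print(best, history[-1])
```

Because the survivors are carried over unchanged, the best error in `history` can never get worse from one generation to the next. A fuller system would also stop once a target accuracy is reached and would return a set of candidate equations rather than a single one.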
Motivation: Challenge
• Identifying nontrivial relations is a major challenge, even for human scientists
• A nontrivial conservation equation should be able to predict connections among the derivatives of groups of variables over time, relations that can also be calculated from new experimental data
• One such metric is the partial derivatives between pairs of variables
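One way to read the partial-derivative metric is the following sketch. For a candidate law f(x, v) = const, implicit differentiation predicts dv/dx = -(∂f/∂x)/(∂f/∂v), which can be compared with the ratio (dv/dt)/(dx/dt) estimated from the data. The sign convention, the log-error form, the `min_slope` cutoff, and the synthetic oscillator data are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math

def central_diff(values, t):
    """Interior central differences d(values)/dt; returns len(values) - 2 samples."""
    return [(values[i + 1] - values[i - 1]) / (t[i + 1] - t[i - 1])
            for i in range(1, len(values) - 1)]

def derivative_error(f, t, x, v, eps=1e-6, min_slope=0.1):
    """Mean log error between observed dv/dx and the dv/dx implied by f(x, v) = const."""
    dx_dt = central_diff(x, t)
    dv_dt = central_diff(v, t)
    errors = []
    for i in range(len(dx_dt)):
        xi, vi = x[i + 1], v[i + 1]  # sample that derivative i belongs to
        f_x = (f(xi + eps, vi) - f(xi - eps, vi)) / (2 * eps)
        f_v = (f(xi, vi + eps) - f(xi, vi - eps)) / (2 * eps)
        if abs(dx_dt[i]) < min_slope or abs(f_v) < min_slope:
            continue  # skip near-singular ratios
        observed = dv_dt[i] / dx_dt[i]
        predicted = -f_x / f_v  # implicit differentiation of f = const
        errors.append(math.log1p(abs(observed - predicted)))
    return sum(errors) / len(errors)

def energy(x, v):  # conserved for the oscillator below
    return x ** 2 + v ** 2

def plain_sum(x, v):  # not conserved
    return x + v

# Synthetic harmonic oscillator: x = cos(t), v = dx/dt = -sin(t)
t = [0.005 * i for i in range(2001)]
x = [math.cos(ti) for ti in t]
v = [-math.sin(ti) for ti in t]

err_energy = derivative_error(energy, t, x, v)
err_sum = derivative_error(plain_sum, t, x, v)
print(err_energy, err_sum)
```

The conserved quantity scores a much smaller error than the arbitrary expression. A constant such as f = 0 has no defined derivative ratio (the sketch skips every sample where ∂f/∂v is close to zero), which is why a ratio-based metric does not reward trivial invariants.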
Method: Algorithm to Detect Conservation Laws
• One can control the type of law, to an extent, by choosing which variables to provide to the algorithm
  • If we provide velocities, the algorithm is biased toward finding energy laws
  • If we additionally supply accelerations, the algorithm is biased toward finding force identities and equations of motion
  • For other types of variables, other or previously unknown analytical laws may exist
Experiments: Data Collection
• The paper collected data from two typical systems: an air-track oscillator and a double pendulum
• Motion-tracking cameras and software were used
  • Infrared markers are placed on the experimental device
  • The device's dynamics are captured
  • The motion-tracking software produces time-series data of 3-dimensional Euclidean position coordinates for each infrared marker
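The step from tracked marker positions to angles and angular velocities might look like the sketch below. It is an assumption about the preprocessing, not a description of the authors' pipeline: it works on 2D coordinates already projected onto the swing plane, takes the pivot position as known, and differentiates with plain finite differences. Real, noisy tracking data would normally be smoothed first. The function name `pendulum_angles` and the synthetic trajectory are hypothetical.

```python
import math

def pendulum_angles(t, pivot, markers):
    """Return (theta, omega): angle from the downward vertical and its time derivative."""
    theta = []
    for mx, my in markers:
        a = math.atan2(mx - pivot[0], pivot[1] - my)  # 0 when hanging straight down
        if theta:  # unwrap so the angle stays continuous across +/- pi
            a += 2 * math.pi * round((theta[-1] - a) / (2 * math.pi))
        theta.append(a)
    # central differences inside, one-sided differences at the two ends
    n = len(theta)
    omega = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        omega.append((theta[hi] - theta[lo]) / (t[hi] - t[lo]))
    return theta, omega

# Synthetic check: a marker 0.3 m from a pivot at (0, 1), swinging as 0.5 * cos(2 t)
t = [0.005 * i for i in range(1001)]
theta_true = [0.5 * math.cos(2 * ti) for ti in t]
markers = [(0.3 * math.sin(a), 1.0 - 0.3 * math.cos(a)) for a in theta_true]
theta, omega = pendulum_angles(t, (0.0, 1.0), markers)
```

For a double pendulum, the same function could be applied a second time with the first marker as the moving pivot of the second arm, passing a per-sample pivot instead of a fixed one.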
Experiments: Setting
• Two configurations of the air track
  • Two-spring, single-mass: minimal noise
  • Three-spring, double-mass: considerable noise
• Two configurations of a pendulum
  • A single pendulum
  • A double pendulum: higher measurement noise
Experiments: Summary of Laws Inferred
• Given position and velocity data over time
  • The algorithm converged on the energy laws of each system (Hamiltonian and Lagrangian equations)
• Given acceleration data as well
  • It produced the differential equation of motion corresponding to Newton's second law for the harmonic oscillator and pendulum systems
• Given only position data for the pendulum
  • The algorithm converged on the equation of a circle, indicating that the pendulum is confined to a circle …
• In the absence of appropriate building blocks, the algorithm developed approximations
  • For example, eliminating cosine but not sine drove the algorithm to converge on the equality cos(θ) = sin(θ + π/2) or more complex equivalences
• One can control the type of law
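As a worked illustration of how these kinds of laws relate, consider the textbook forms for a single mass m on a spring of stiffness k and for a pendulum bob at (x, y) on a rigid arm of length L. These are the standard expressions, not the fitted coefficients reported in the paper; the symbols m, k, a, and L are introduced here for illustration.

```latex
% Energy law (position and velocity) and the equation of motion it implies
H(x, v) = \tfrac{1}{2} m v^{2} + \tfrac{1}{2} k x^{2} = \text{const}
\quad\Longrightarrow\quad
\frac{dH}{dt} = v \, (m a + k x) = 0
\quad\Longrightarrow\quad
m a = -k x

% Geometric invariant from position data alone
x^{2} + y^{2} = L^{2}
```

Differentiating the energy law in time brings in the acceleration a and yields Newton's second law for the oscillator, which matches the slide's point that supplying accelerations steers the search toward equations of motion.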
Experiments: Accuracy/Complexity Tradeoff
• Consider the relationship between equation complexity and accuracy
  • At one extreme: extremely complex equations with near-perfect accuracy, such as Taylor series, neural networks, and Fourier series
  • At the other extreme: simple, single-parameter models with baseline accuracy
• The Pareto front for the double pendulum
  • The equation at the cliff corresponds to the exact energy conservation law
  • A dramatic jump in accuracy indicates that the equation captures a significant relationship of the system
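The Pareto front and its "cliff" can be computed from a list of (complexity, error) pairs, as in the sketch below. The candidate values are made up for illustration, and locating the cliff as the largest single drop in error between neighbouring front points is an assumed heuristic, not necessarily the criterion used in the paper.

```python
def pareto_front(models):
    """Keep the (complexity, error) pairs that no simpler-or-equal model beats on error."""
    front = []
    for complexity, error in sorted(set(models)):
        if not front or error < front[-1][1]:
            front.append((complexity, error))
    return front

def cliff(front):
    """Front point reached by the largest drop in error from its simpler neighbour."""
    drops = [(front[i][1] - front[i + 1][1], front[i + 1]) for i in range(len(front) - 1)]
    return max(drops)[1]

# Hypothetical candidates: (number of nodes in the expression, prediction error)
candidates = [(1, 0.9), (3, 0.8), (5, 0.1), (5, 0.2), (9, 0.09), (7, 0.3)]
front = pareto_front(candidates)
knee = cliff(front)
print(front, knee)
```

Equations beyond the cliff gain little accuracy for their added complexity, which is why the slide singles out the equation at the cliff.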
Experiments: Time to Detect Solutions
• Computation time increases with the dimensionality (number of variables), the complexity of the law equation, and noise
• In the worst case, the time to converge on the law equations
  • Depends exponentially on the complexity of the law expression itself
  • Depends roughly quadratically on the system dimensionality
  • The bootstrapped double pendulum is an exception
• In a 32-core implementation, the time required ranged from a few minutes (the harmonic oscillator) to 30 hours (the double pendulum)
• Noise substantially reduces the ability to find accurate law equations
  • Depending on its strength, noise either simply requires more computation time or obscures the law equation entirely
Experiments: Bootstrapping
• Bootstrapping reduced the search time from 30-40 hours of computation to 7-8 hours
  • It uses the terms from simpler systems as a seed
• Bootstrapping may therefore be critical for detecting laws in higher-order systems that are veiled in complexity
Conclusion
• Summary
  • This paper demonstrated the discovery of physical laws directly from experimentally captured data by means of a computational search
  • The method detected nonlinear energy conservation laws, Newtonian force laws, geometric invariants, and system manifolds
• Discussion
  • The concise analytical expressions that were found
    • Are amenable to human interpretation
    • Help to reveal the physics underlying the observed phenomenon
  • This process will not diminish the role of future scientists, but will help them focus on interesting phenomena more rapidly