This overview discusses the use of Bayesian methods and sequential design of experiments for indentation, specifically for extracting intrinsic material parameters via single crystal indentation. It covers efficient building of a surrogate model and selection of physical experiments for calibration.
Bayesian Methods and Design of Experiments for Indentation
Overview
• Several methods and associated techniques for the sequential design of experiments are presented.
• Extraction of intrinsic material parameters via single crystal indentation is used as a case study. This is separated into two parts:
• Efficient building of a forward (surrogate) model
• Efficient selection of physical experiments to calibrate material parameters
Introduction: Indentation
• The mechanical properties of a polycrystalline material are a result of its constituents.
• To successfully implement multiscale models, it is necessary to ascertain crystal-level parameters that can be fed into those models.
• This brings to mind the crystallography of the material of interest; at the microscale we begin to think of single crystals, orientation, and mechanical properties (the stiffness tensor and slip resistances).
(Figure: length scales from nanoscale through microscale to macroscale)
Introduction: Indentation
• Spherical indentation is a mechanical testing protocol for probing multiscale mechanical properties of materials of interest.
• Experimentally, this is a high-throughput process; however, post-processing with simulations often takes sizeable time and computational resources.
• We are interested in calibrating the stiffness parameters. Extracting these material properties is of great interest: they often feed into larger multiscale models or provide information about new materials.
(Figure: simulation inputs — parameters and features — mapped to a scalar simulation output, the indentation modulus Eind)
Bayesian Updates
• Compute the effective indentation modulus from finite element simulations.
• Establish the likelihood of the simulated observations given a model.
• Establish reduced-order model coefficients which capture the governing physics.
• Bayesian update: writing β for the model coefficients and D for the simulated data, the posterior is proportional to the likelihood times the prior, p(β | D) ∝ p(D | β) p(β).
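As a concrete illustration, here is a minimal sketch of the conjugate Gaussian update for reduced-order model coefficients, assuming a linear surrogate E = Φ(x) β with known noise variance; the feature matrix and all variable names are illustrative placeholders, not taken from the original work.

    import numpy as np

    def bayesian_update(mu_prior, cov_prior, Phi, y, noise_var):
        """Conjugate Gaussian update for linear-model coefficients.

        Prior:      beta ~ N(mu_prior, cov_prior)
        Likelihood: y    ~ N(Phi @ beta, noise_var * I)
        Returns the posterior mean and covariance of beta.
        """
        prec_prior = np.linalg.inv(cov_prior)
        prec_post = prec_prior + (Phi.T @ Phi) / noise_var
        cov_post = np.linalg.inv(prec_post)
        mu_post = cov_post @ (prec_prior @ mu_prior + Phi.T @ y / noise_var)
        return mu_post, cov_post

    # Synthetic example: the posterior from one batch of simulations
    # becomes the prior for the next update.
    rng = np.random.default_rng(0)
    Phi = rng.normal(size=(20, 3))              # features from 20 simulations
    beta_true = np.array([1.0, -0.5, 2.0])
    y = Phi @ beta_true + rng.normal(scale=0.1, size=20)
    mu, cov = bayesian_update(np.zeros(3), np.eye(3), Phi, y, 0.1**2)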
Setting of Hyperparameters
• Evaluate the marginal distribution over the model coefficients to estimate the associated hyperparameters.
• The hyperparameters are chosen to maximize this marginal likelihood (the model evidence).
• The posterior from each Bayesian update becomes the new prior for future updates.
• Note: the associated probabilities are assumed normally distributed.
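For example, scikit-learn's BayesianRidge performs this kind of evidence maximization for a linear model; a minimal sketch with synthetic placeholder data:

    import numpy as np
    from sklearn.linear_model import BayesianRidge

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 3))
    y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(scale=0.1, size=50)

    model = BayesianRidge()    # iteratively maximizes the log marginal likelihood
    model.fit(X, y)
    print(model.alpha_)        # estimated noise precision
    print(model.lambda_)       # estimated prior precision on the coefficients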
Bayesian Update for Governing Physics (continued)
• The predictive distribution of the indentation modulus for a new input x* is found by marginalizing over the model coefficients: p(E* | x*, D) = ∫ p(E* | x*, β) p(β | D) dβ.
• Choose the next simulation input x* for which the predictive uncertainty of E* is largest, to rapidly improve the model.
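A minimal sketch of that selection rule, continuing the BayesianRidge example above (the candidate pool is an illustrative placeholder):

    candidates = rng.normal(size=(200, 3))      # pool of unsimulated inputs
    mean, std = model.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(std)]         # run the next FE simulation here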
Calibration of Model Parameters
• Compute effective indentation moduli from experiments.
• Establish the likelihood of the set of experimental observations given the reduced-order model established above.
• Establish priors over the parameters to reflect domain knowledge.
• Use sampling techniques to draw from the posterior distribution without actually having it in closed form (using only the likelihood and prior).
• In the following slides we describe methods to sample from the posterior distribution. For simplicity, we denote this target posterior as p*.
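A minimal sketch of such an unnormalized log-posterior, assuming Gaussian measurement noise and a Gaussian prior; surrogate_modulus is a hypothetical stand-in for the reduced-order model, not the author's actual function:

    import numpy as np

    def surrogate_modulus(params, g):
        # hypothetical stand-in for the reduced-order model E(params, g)
        return params[0] + params[1] * np.cos(g) ** 2

    def log_posterior(params, g_obs, E_obs, noise_std, prior_mean, prior_std):
        """Unnormalized log p(params | data) = log likelihood + log prior (both Gaussian)."""
        E_pred = np.array([surrogate_modulus(params, g) for g in g_obs])
        log_lik = -0.5 * np.sum(((E_obs - E_pred) / noise_std) ** 2)
        log_prior = -0.5 * np.sum(((np.asarray(params) - prior_mean) / prior_std) ** 2)
        return log_lik + log_prior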
Introduction to MCMC
• Markov chain Monte Carlo: a class of common sampling methods.
• Metropolis algorithm (1953), later the basis of simulated annealing; Metropolis–Hastings algorithm (1970) generalized the Metropolis algorithm to posterior sampling.
• The essentials:
• A Markov chain is a sequence of random variables in which each value depends only on the previous value.
• A finite, ergodic Markov chain has a limiting distribution to which it converges as the length of the chain grows very large.
• GOAL: Construct a Markov chain which samples from the posterior (limiting) distribution of interest, p*.
Markov Chains: Properties and Terminology
• Transition kernel: describes the process of generating a Markov chain; a function providing the probability of moving between two possible states.
• Ergodic Markov chains: a Markov chain is called ergodic if the transition kernel is strictly positive everywhere in the finite domain being considered.
• Given an ergodic Markov chain, there exists a stationary distribution that is invariant with respect to the transition kernel. This is the limiting distribution previously described.
Markov Chain Monte Carlo: Reversibility
• Sampling from a posterior distribution: the posterior is the invariant distribution with respect to the Markov chain.
• We need to generate a Markov chain, which requires constructing the appropriate transition kernel K.
• The easiest way to do so is to satisfy reversibility (detailed balance): p*(x) K(x, y) = p*(y) K(y, x).
• Marginalizing both sides over x we find ∫ p*(x) K(x, y) dx = p*(y), i.e., p* is invariant under K.
Markov Chain Monte Carlo: Candidate Distributions and Acceptance Probability
• Hastings devised an update scheme which satisfies reversibility by recasting the transition kernel into a candidate distribution q(x, y), which describes all possible moves from state x to y, and an acceptance probability α(x, y), which describes the probability that the move from state x to y is actually made: K(x, y) = q(x, y) α(x, y).
• Must satisfy reversibility: p*(x) q(x, y) α(x, y) = p*(y) q(y, x) α(y, x).
Markov Chain Monte Carlo: Update Scheme
• Define the Hastings ratio: r(x, y) = [p*(y) q(y, x)] / [p*(x) q(x, y)].
• Define the acceptance probability: α(x, y) = min(1, r(x, y)).
• We can confirm that this satisfies reversibility: substituting α into p*(x) q(x, y) α(x, y) recovers p*(y) q(y, x) α(y, x).
• We now perform Monte Carlo "simulations" by generating a new state sampled from the candidate distribution, and accepting the move based on the acceptance probability α.
Steps to Generate a Markov Chain from p*
1. Initialize x_0 randomly.
2. Select a form of candidate distribution q.
3. Draw a candidate y from the candidate distribution q(x_j, ·).
4. Calculate the acceptance probability. Recall: α(x_j, y) = min(1, [p*(y) q(y, x_j)] / [p*(x_j) q(x_j, y)]).
Steps to Generate a Markov Chain from p*
5. Compare the acceptance probability to a draw u from a uniform distribution: accept the move (x_{j+1} = y) if u < α, otherwise keep x_{j+1} = x_j.
6. For initial draws, adapt the candidate distribution to achieve a desired acceptance rate over the previous J iterations.*
7. Repeat steps (3–5) to generate the chain.
*Note: the acceptance rate is the fraction of accepted moves over those J iterations.
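Putting the steps together, a minimal runnable sketch of the Metropolis–Hastings loop with a Gaussian random-walk candidate distribution (the target here is a generic stand-in, not the indentation posterior):

    import numpy as np

    def metropolis_hastings(log_p, x0, n_steps, step_size=0.5, seed=0):
        """Random-walk Metropolis: q is symmetric, so the Hastings
        ratio reduces to p*(y) / p*(x)."""
        rng = np.random.default_rng(seed)
        chain = [np.asarray(x0, dtype=float)]
        for _ in range(n_steps):
            x = chain[-1]
            y = x + step_size * rng.normal(size=x.shape)   # step 3: draw candidate
            log_r = log_p(y) - log_p(x)                    # step 4: log Hastings ratio
            if np.log(rng.uniform()) < log_r:              # step 5: accept w.p. min(1, r)
                chain.append(y)
            else:
                chain.append(x)
        return np.array(chain)

    # Example target: a standard 2-D Gaussian.
    chain = metropolis_hastings(lambda x: -0.5 * np.sum(x**2),
                                x0=[3.0, -3.0], n_steps=5000)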
Post-Processing: Examples of Trace Plots for MCMC Chains
• Look at the trace plot of each variable: the value of the accepted moves over the iterations.
• The initial gray section is called the burn-in period, which is discarded to avoid effects of the starting point.
• There are various ways to diagnose the convergence of MCMC chains. Looking at the trace plots allows us to check for appropriate mixing (we are moving around parameter space without getting stuck).
• We can also check convergence to the same distribution using several chains started at different points.
http://web.as.uky.edu/statistics/users/pbreheny/701/s13/notes/3-5.pdf
http://sbfnk.github.io/mfiidd/mcmc_diagnostics.html
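One common multi-chain diagnostic is the Gelman–Rubin statistic; a minimal numpy sketch (values near 1 indicate the chains have converged to the same distribution):

    import numpy as np

    def gelman_rubin(chains):
        """chains: shape (m, n) -- m chains of n post-burn-in samples of one variable."""
        m, n = chains.shape
        chain_means = chains.mean(axis=1)
        W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
        B = n * chain_means.var(ddof=1)            # between-chain variance
        var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
        return np.sqrt(var_hat / W)                # R-hat

    # e.g. four chains from different starting points, after discarding burn-in:
    # r_hat = gelman_rubin(np.stack([c[1000:, 0] for c in chains_list]))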
Summary: MCMC
• With the acceptance probability defined, we now seek to build a chain sampling from the posterior distribution using a selected candidate distribution.
• Pseudo code:
    for j = 1, ..., J:
        y ~ q(x_j, ·)                                      # draw a candidate
        a = min(1, [p*(y) q(y, x_j)] / [p*(x_j) q(x_j, y)])
        u ~ U(0, 1)
        if u < a:  x_{j+1} = y                             # accept the move
        else:      x_{j+1} = x_j                           # reject, stay put
*Note: U(0, 1) is a uniform distribution with range 0–1.
Additional Notes: MCMC
• An additional note about MCMC chains: they allow us to approximate the expectation of a variable or function that may otherwise be difficult to evaluate.
• Generally this is written E[f(x)] = ∫ f(x) p*(x) dx ≈ (1/N) Σ_{j=1..N} f(x^(j)), where j indexes the chain sampled as described previously.
• This fact is taken advantage of in the evaluation of mutual information (later).
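In code this is simply an average over the chain; for instance, continuing the Metropolis–Hastings sketch above:

    burn = 1000
    post = chain[burn:]                                      # discard burn-in
    posterior_mean = post.mean(axis=0)                       # E[x] under p*
    expected_norm = np.mean(np.linalg.norm(post, axis=1))    # E[||x||] under p*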
Selection of Experiments via Bayesian Criteria
• Goal: select the experiments which provide the greatest amount of additional information, allowing the parameters to be calibrated faster.
• We desire a criterion which quantifies the amount of information to be gained about the estimated parameters by observing additional experiments.
Information
• Entropy was proposed by Shannon in 1948 and extended to continuous variables by Jaynes.
• The information content of an outcome is the negative logarithm of its probability density (for continuous variables): I(x) = −log p(x).
• Less probable events carry more surprise, or information, and the joint information of two independent events is additive.
• The amount of information of a variable is the expectation of this quantity, H(X) = −∫ p(x) log p(x) dx; it can also be thought of as the expectation of "surprise".
• Note: the following assumes the probability density of the given variables is essentially zero outside of a given range.
Types of Entropy
• Joint entropy: H(X, Y) = −∫∫ p(x, y) log p(x, y) dx dy
• Conditional entropy: H(X | Y) = −∫∫ p(x, y) log p(x | y) dx dy
Mutual Information
• The mutual information is defined as the information gained about a random variable X by observing another random variable Y: I(X; Y) = H(X) − H(X | Y).
• The conditional entropy can be thought of as the amount of information needed to describe a variable X once Y is given (if this H is small, Y contains a lot of information about X).
• We will use the concept of mutual information, extended to the amount of information gained about the parameters (X as placeholder) when an experimental output is observed (Y as placeholder).
Mutual Information
• Mutual information can be written I(X; Y) = ∫∫ p(x, y) log [ p(x, y) / (p(x) p(y)) ] dx dy = ∫ p(y) [ ∫ p(x | y) log ( p(x | y) / p(x) ) dx ] dy.
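A small numeric sanity check of these identities for a discrete joint distribution (the 2×2 table is arbitrary):

    import numpy as np

    p_xy = np.array([[0.30, 0.10],      # joint distribution p(x, y)
                     [0.15, 0.45]])
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)

    H_x  = -np.sum(p_x * np.log(p_x))
    H_xy = -np.sum(p_xy * np.log(p_xy))
    H_x_given_y = H_xy + np.sum(p_y * np.log(p_y))    # H(X|Y) = H(X,Y) - H(Y)
    I_direct = np.sum(p_xy * np.log(p_xy / np.outer(p_x, p_y)))

    print(I_direct, H_x - H_x_given_y)                # the two expressions agree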
Mutual Information Continued
• Pause to inspect the simplified expression for mutual information.
• Inner (blue) brackets: the Kullback–Leibler divergence, ∫ p(x | y) log ( p(x | y) / p(x) ) dx. This term is often used in statistics as a type of distance measurement to describe how different two distributions are. That is, if p(x) is our prior distribution, or previous knowledge about the parameters x, and p(x | y) is the distribution of the parameters given the data, we are measuring how different these distributions are. If all information about x is contained within the prior, then the ratio p(x | y)/p(x) is 1 and the KL distance is zero, resulting in zero information gain.
• Outer (red) brackets: the expectation of the Kullback–Leibler divergence over the data y. This integration is difficult to do, and approximations are often made in order to evaluate the overall term.
Mutual Information Continued
• Another issue with mutual information is that p(x | y) is generally unknown. To cope with this, Bayes' theorem is applied.
• Recall p(x | y) = p(y | x) p(x) / p(y), with p(y) = ∫ p(y | x) p(x) dx.
• Substituting into the mutual information gives I(X; Y) = ∫∫ p(y | x) p(x) log [ p(y | x) / p(y) ] dx dy.
• There are various approaches to evaluating this utility; the important thing to note is that the distributions are now written entirely in terms of the prior p(x) and the distribution p(y | x), which is an analog to the distribution found using the surrogate models.
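A minimal sketch of the common nested Monte Carlo approximation to this utility, drawing parameter samples from the prior (or an MCMC chain) and estimating the marginal p(y) by a second average; simulate_y and log_lik are hypothetical names standing in for the surrogate-based distribution p(y | x):

    import numpy as np

    def expected_info_gain(x_samples, simulate_y, log_lik, rng):
        """Nested Monte Carlo estimate of I(X; Y) = E_{x,y}[log p(y|x) - log p(y)].

        x_samples  : (N, d) parameter draws (from the prior or an MCMC chain)
        simulate_y : draws y ~ p(y | x), one per parameter sample
        log_lik    : evaluates log p(y | x) (constants cancel in the estimate)
        """
        N = len(x_samples)
        ys = np.array([simulate_y(x, rng) for x in x_samples])
        total = 0.0
        for i in range(N):                      # O(N^2): inner average gives p(y_i)
            ll_i = log_lik(ys[i], x_samples[i])
            ll_all = np.array([log_lik(ys[i], xj) for xj in x_samples])
            log_marg = np.logaddexp.reduce(ll_all) - np.log(N)
            total += ll_i - log_marg
        return total / N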
Extension to Indentation
• We can leverage the idea of mutual information in deciding which grain orientation to indent next.
• In the context of mutual information: we want to maximize the information gained about our parameter estimates C (analog to X) through our observation of indentation moduli E (analog to Y).
• We note that E depends on the parameter estimates C and the grain orientation g, while the parameter estimates C are independent of g. Recasting mutual information, we determine the next grain orientation from g* = argmax_g I(C; E | g) = argmax_g ∫∫ p(E | C, g) p(C) log [ p(E | C, g) / p(E | g) ] dC dE.
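Using the estimator above, selecting the next orientation becomes a loop over candidate orientations; a sketch under the same assumptions, reusing the hypothetical surrogate_modulus and expected_info_gain from the earlier sketches (the orientation grid, noise level, and parameter samples are placeholders):

    rng = np.random.default_rng(2)
    candidate_g = np.linspace(0.0, np.pi / 2, 25)    # candidate orientations
    C_samples = rng.normal(size=(200, 2))            # e.g. draws from p(C) via MCMC
    noise_std = 0.05

    def pick_next_orientation(candidate_g, C_samples):
        gains = []
        for g in candidate_g:
            sim = lambda C, rng: surrogate_modulus(C, g) + noise_std * rng.normal()
            ll = lambda E, C: -0.5 * ((E - surrogate_modulus(C, g)) / noise_std) ** 2
            gains.append(expected_info_gain(C_samples, sim, ll, rng))
        return candidate_g[int(np.argmax(gains))]    # g* with the largest estimated gain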
Example of Shannon Information
• The information criterion is evaluated using the MCMC chain from the posterior of p(C).
• Result: rapid convergence to a final distribution, compared against orientations selected at random and orientations selected via the highest local sensitivity.
(Figure: convergence of the parameter estimates under each selection strategy; the Shannon-information-based selection is shown in blue)
Sampling Method: MCMC
• To prevent getting stuck in a local minimum and to fulfill the reversibility conditions, a probability α(x, y) of accepting a move from one state given another is introduced. This allows us to write p*(x) q(x, y) α(x, y) = p*(y) q(y, x) α(y, x).
• Setting α(y, x) to the maximum probability of 1, we now evaluate the probability of a move as α(x, y) = min(1, [p*(y) q(y, x)] / [p*(x) q(x, y)]).
Sampling Techniques: MCMC
• MCMC is typically a way to sample from a posterior distribution; the goal is to create a chain whose samples are drawn from the posterior.
• MCMC seeks the reversible stationary posterior distribution by walking across the parameter space in accordance with a proposal/jumping distribution q(·), a candidate distribution which describes the probability of moving from one state, x_i, to another, x_{i+1}.
• However, in practice, using the candidate distribution alone we can get stuck in a local minimum, and we are likely to find we are moving to state x_{i+1} from x_i too often, causing us to no longer fulfill reversibility.