
Including Uncertainty Models for Surrogate-based Global Optimization The EGO algorithm


Presentation Transcript


  1. Including Uncertainty Models for Surrogate-based Global Optimization The EGO algorithm

  2. Introduction to optimization with surrogates
  • Based on cycles: each cycle consists of sampling design points by simulations, fitting surrogates to the simulations, and then optimizing an objective.
  • Classical approach: construct the surrogate, optimize the original objective on it, then refine the region and the surrogate. Typically a small number of cycles with a large number of simulations in each cycle.
  • Adaptive sampling: construct the surrogate, then add points by taking into account not only the surrogate prediction but also the uncertainty in that prediction.
  • The most popular adaptive-sampling method is Jones's EGO (Efficient Global Optimization). It is easiest with one added sample at a time.

  3. Background: Surrogate Modeling
  Surrogates replace expensive simulations by simple algebraic expressions fit to data: ŷ(x) is an estimate of the true function y(x). Examples:
  • Kriging (KRG)
  • Polynomial response surface (PRS)
  • Support vector regression (SVR)
  • Radial basis neural networks (RBNN)
  Differences between surrogates are larger in regions of low point density.

  4. Background: Uncertainty
  Some surrogates also provide an uncertainty estimate: the standard error, s(x). Examples: kriging and polynomial response surfaces. Both of these are used in EGO.
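To make this concrete, here is a minimal sketch (not from the slides) of fitting a kriging surrogate and querying both ŷ(x) and s(x). It assumes NumPy and scikit-learn; the name true_function is a hypothetical stand-in for an expensive simulation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def true_function(x):
    # Hypothetical "expensive simulation" (a cheap stand-in for illustration).
    return np.sin(3.0 * x) + 0.5 * x

# Sample design points by "simulation".
X = np.linspace(0.0, 2.0, 6).reshape(-1, 1)
y = true_function(X).ravel()

# Kriging = Gaussian process regression with a stationary correlation kernel.
kernel = ConstantKernel(1.0) * RBF(length_scale=0.5)
krg = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# At any x the surrogate returns an estimate y_hat(x) and a standard error s(x).
x_new = np.array([[0.8]])
y_hat, s = krg.predict(x_new, return_std=True)
print(y_hat[0], s[0])
```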

  5. Kriging fit and defining improvement
  • First we sample the function and fit a kriging model.
  • We note the present best solution (PBS).
  • At every x there is some chance of improving on the PBS.
  • Then we ask: assuming an improvement over the PBS, where is it likely to be largest?

  6. What is Expected Improvement?
  Consider the point x = 0.8 and the random variable Y, whose realizations are the possible values of the function there. Its mean is the kriging prediction, which is slightly above zero.
  • yPBS: present best solution
  • ŷ(x): kriging prediction
  • s(x): prediction standard deviation
  • Φ: cumulative distribution function of the standard normal distribution
  • φ: probability density function of the standard normal distribution

  7. Expected Improvement (EI)
  • The idea can be found in Mockus's work as far back as 1978.
  • EI is the expectation of improving beyond the Present Best Solution (PBS), i.e., beyond the current best function value yPBS:
  E[I(x)] = (yPBS − ŷ(x)) Φ(u) + s(x) φ(u),  where u = (yPBS − ŷ(x)) / s(x)
  • The first term, (yPBS − ŷ(x)) Φ(u), is the predicted difference between the current minimum and the prediction at x, penalized by the area under the curve Φ(u); it is large when ŷ(x) is small with respect to yPBS and so promotes exploitation.
  • The second term, s(x) φ(u), is large when s(x) is large and so promotes exploration.
  • EI thus balances seeking promising areas of the design space against the uncertainty in the model.
  Mockus, J., Tiesis, V. and Zilinskas, A. (1978), "The application of Bayesian methods for seeking the extremum," in L.C.W. Dixon and G.P. Szego (eds.), Towards Global Optimisation, Vol. 2, pp. 117–129, North Holland, Amsterdam.
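The formula above translates directly into code. A minimal sketch, assuming NumPy and SciPy; the small floor on s(x) is an added numerical safeguard, not part of the original formula.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(y_hat, s, y_pbs):
    """E[I(x)] = (y_pbs - y_hat) * Phi(u) + s * phi(u), u = (y_pbs - y_hat) / s."""
    s = np.maximum(s, 1e-12)  # avoid division by zero at already-sampled points
    u = (y_pbs - y_hat) / s
    return (y_pbs - y_hat) * norm.cdf(u) + s * norm.pdf(u)
```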

  8. Exploration and Exploitation
  EGO maximizes E[I(x)] to find the next point to be sampled.
  • The expected improvement balances exploration and exploitation because it can be high either where the uncertainty is high or where the surrogate prediction is low.
  • When can we say that the next point is "exploration"?
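A sketch of the resulting one-point-per-cycle EGO loop, reusing true_function and expected_improvement from the earlier sketches; the dense candidate grid is an illustrative stand-in for a proper global maximization of E[I(x)].

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def ego(true_function, X, y, bounds=(0.0, 2.0), n_iterations=20):
    for _ in range(n_iterations):
        # Refit kriging to all data gathered so far.
        kernel = ConstantKernel(1.0) * RBF(length_scale=0.5)
        krg = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
        # Maximize E[I(x)] over a candidate grid (exploration vs. exploitation).
        candidates = np.linspace(bounds[0], bounds[1], 1001).reshape(-1, 1)
        y_hat, s = krg.predict(candidates, return_std=True)
        ei = expected_improvement(y_hat, s, y_pbs=y.min())
        x_next = candidates[[np.argmax(ei)]]
        # Run the expensive simulation at the selected point and augment the data.
        X = np.vstack([X, x_next])
        y = np.append(y, true_function(x_next).ravel())
    return X, y
```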

  9. Recent developments in EI
  • Sasena MJ, "Flexibility and Efficiency Enhancements for Constrained Global Design Optimization with Kriging Approximations," Ph.D. Thesis, University of Michigan, 2002. (Cool criterion for EI.)
  • Sobester A, Leary SJ, and Keane AJ, "On the design of optimization strategies based on global response surface approximation models," Journal of Global Optimization, 2005. (Weighted EI.)
  • Huang D, Allen TT, Notz WI, and Zeng N, "Global optimization of stochastic black-box systems via sequential kriging meta-models," Journal of Global Optimization, 2006. (EGO with noisy data: augmented EI.)

  10. Probability of targeted Improvement (PI)
  • First proposed by Harold Kushner (1964).
  • PI is given by
  PI(x) = Φ((yTarget − ŷ(x)) / s(x))
  • yTarget: target to be achieved (yTarget = yPBS − TI)
  • TI: target of improvement (usually a constant)
  • ŷ(x): kriging prediction
  • s(x): prediction standard deviation
  • Φ: cumulative distribution function of the standard normal distribution
  Kushner HJ, "A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise," Journal of Basic Engineering, Vol. 86, pp. 97–106, 1964.
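The PI formula in code, under the same assumptions as the earlier sketches (NumPy and SciPy; the floor on s(x) is an added safeguard):

```python
import numpy as np
from scipy.stats import norm

def probability_of_improvement(y_hat, s, y_target):
    """PI(x) = Phi((y_target - y_hat(x)) / s(x)) for minimization."""
    s = np.maximum(s, 1e-12)  # avoid division by zero at sampled points
    return norm.cdf((y_target - y_hat) / s)
```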

  11. Probability of targeted Improvement (PI)
  [Figure: kriging fit with the PBS marked; not reproduced here.]

  12. Sensitivity to target setting
  [Figure: PI for a target of 25% improvement on the PBS versus a target of 50% improvement on the PBS, illustrating a target that is too modest and one that is too ambitious; not reproduced here.]

  13. How to deal with target setting in PI?
  • One way is to change to EI.
  • Approach proposed by Jones: use several targets corresponding to low, medium, and high desired improvements (see the sketch below).
  • This leads to the selection of multiple points per iteration: local and global search at the same time.
  • It also takes advantage of parallel computing.
  • Recent development: Chaudhuri A, Haftka RT, and Viana FAC, "Efficient Global Optimization with Adaptive Target for Probability of targeted Improvement," 8th AIAA Multidisciplinary Design Optimization Specialist Conference, Honolulu, Hawaii, April 23–26, 2012.
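A sketch of the multiple-target idea, reusing probability_of_improvement from the earlier sketch. The improvement fractions and the form yTarget = yPBS − f·|yPBS| are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def multi_target_points(candidates, y_hat, s, y_pbs, fractions=(0.01, 0.10, 0.25)):
    """One PI-maximizing point per target: modest, medium, and ambitious."""
    points = []
    for f in fractions:
        y_target = y_pbs - f * abs(y_pbs)  # assumed form of "f% improvement on PBS"
        pi = probability_of_improvement(y_hat, s, y_target)
        points.append(candidates[np.argmax(pi)])
    return points  # the selected points can be simulated in parallel
```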

  14. Expanding EGO to surrogates other than kriging
  Comparing the root mean square error, eRMS, of (a) kriging and (b) support vector regression: we want to run EGO with the most accurate surrogate, but we have no uncertainty model for SVR.
  [Figure: eRMS for (a) kriging and (b) support vector regression; not reproduced here.]

  15. Importation of uncertainty model
  Idea: pair the prediction ŷ(x) of a surrogate that lacks an uncertainty model (e.g., SVR) with the standard error s(x) imported from kriging, so that EI can be computed for any surrogate.
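A sketch of the imported-uncertainty idea: the SVR prediction is paired with the kriging standard error inside E[I(x)]. It assumes scikit-learn and reuses expected_improvement from the earlier sketch; the SVR hyperparameters are illustrative.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def ei_with_imported_uncertainty(X, y, candidates):
    # Surrogate without its own uncertainty model (illustrative hyperparameters).
    svr = SVR(kernel="rbf", C=100.0).fit(X, y)
    # Kriging fitted to the same data supplies the uncertainty estimate.
    krg = GaussianProcessRegressor(
        kernel=ConstantKernel(1.0) * RBF(length_scale=0.5), normalize_y=True
    ).fit(X, y)
    y_hat_svr = svr.predict(candidates)                  # prediction from SVR
    _, s_krg = krg.predict(candidates, return_std=True)  # s(x) imported from kriging
    return expected_improvement(y_hat_svr, s_krg, y_pbs=y.min())
```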

  16. Hartmann 3 example
  Hartmann 3 function (initially fitted with 20 points): f(x) = −Σ_{i=1..4} α_i exp(−Σ_{j=1..3} A_ij (x_j − P_ij)²), a three-variable test function with four local minima.
  After 20 iterations (i.e., a total of 40 points), the improvement (I) over the initial best sample (IBS) is compared. [Results table not reproduced here.]

  17. Two other designs of experiments
  [Figures of the first and second designs of experiments; not reproduced here.]

  18. Summary: Hartmann 3 example
  Box plot of the difference between the improvements offered by different surrogates after 20 iterations (out of 100 DOEs). In 34 DOEs (out of 100) KRG outperforms RBNN; in those cases the difference between the improvements has a mean of only 0.8%.

  19. Potential of EGO with multiple surrogates
  Hartmann 3 function (100 DOEs with 20 points). Overall, the surrogates are comparable in performance.

  20. EGO with Multiple Surrogates (MSEGO)
  Traditional EGO uses kriging to generate one point at a time. We use multiple surrogates, each paired with the imported uncertainty model, to get multiple points per cycle.
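A sketch of one MSEGO cycle under the assumptions above: each surrogate's prediction, paired with the kriging standard error, yields its own EI-maximizing point. The dictionary layout and helper names are illustrative; expected_improvement comes from the earlier sketch.

```python
import numpy as np

def msego_cycle(surrogate_predictions, s_krg, candidates, y_pbs):
    """surrogate_predictions: dict name -> y_hat array on the candidate grid."""
    new_points = {}
    for name, y_hat in surrogate_predictions.items():
        # Same imported s(x) for every surrogate; each gives its own EI maximizer.
        ei = expected_improvement(y_hat, s_krg, y_pbs)
        new_points[name] = candidates[np.argmax(ei)]
    return new_points  # e.g., {"krg": ..., "svr": ..., "rbnn": ...}, run in parallel
```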

  21. EGO with Multiple Surrogates (MSEGO)
  "krg" runs EGO for 20 iterations adding one point at a time. "krg-svr" and "krg-rbnn" run 10 iterations adding two points per iteration. Multiple surrogates offer good results in half the time!

  22. EGO with Multiple Surrogates and Multiple Sampling Criteria
  • Multiple sampling (or infill) criteria, each of which can also add multiple points per optimization cycle:
  • EI: expectation of improvement beyond the present best solution (multipoint EI (q-EI), Kriging Believer, Constant Liar; see the sketch below).
  • PI: probability of improving beyond a set target (multiple targets, multipoint PI).
  • Multiple sampling criteria with multiple surrogates have very high potential for:
  • providing insurance against failed optimal designs,
  • providing insurance against inaccurate surrogates, and
  • leveraging parallel computation capabilities.
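A sketch of the Kriging Believer heuristic mentioned above, reusing expected_improvement from the earlier sketch: after selecting a point, the kriging prediction there is treated as if it had been observed, the model is refit, and the next point is selected, yielding q points per cycle without intermediate simulations.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def kriging_believer(X, y, candidates, q=2):
    X, y = X.copy(), y.copy()
    batch = []
    for _ in range(q):
        krg = GaussianProcessRegressor(
            kernel=ConstantKernel(1.0) * RBF(length_scale=0.5), normalize_y=True
        ).fit(X, y)
        y_hat, s = krg.predict(candidates, return_std=True)
        ei = expected_improvement(y_hat, s, y_pbs=y.min())
        i = np.argmax(ei)
        batch.append(candidates[i])
        X = np.vstack([X, candidates[[i]]])  # "believe" the model: pretend we sampled
        y = np.append(y, y_hat[i])           # there at the predicted value
    return batch  # q points to simulate in parallel
```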

  23. Problems
  • What is expected improvement?
  • What happens if you set very ambitious targets while using the probability of targeted improvement (PI)? How could this feature be useful?
  • What are the potential advantages of adding multiple points per optimization cycle using multiple surrogates and multiple sampling criteria?
