
Towards Optimal Solvers for PDE-Constrained Optimization



  1. Towards Optimal Solvers for PDE-Constrained Optimization Volkan Akcelik Omar Ghattas Carnegie Mellon University George Biros New York University TOPS Winter Meeting Livermore, CA January 25-26, 2002

  2. Simulation vs. Optimization • PDE model • Simulation (forward) problem • Given “data” x (e.g. material coefficients, domain or boundary sources, geometry), find state variables u (velocity, stress, temperature, magnetic or electric field, displacement, etc.) • Optimization (inverse) problem • Given desired goal involving u, find x • Optimal design, optimal control, or parameter estimation

  3. PDE-Constrained Optimization

  4. First-order Optimality Conditions

  5. Examples • Optimal design • Shape optimization of viscous flow system • Nd = O(1) • Optimal control • Boundary flow control • Nd = O(Nu^(2/3)) • Parameter estimation • Heterogeneous inverse wave propagation • Nd = O(Nu)

  6. Optimal Design: Artificial Heart Shape Optimization 1 Goal: Find pump geometry that minimizes blood damage (J. Antaki & G. Burgreen, Univ Pittsburgh Medical Center) [image credit: G. Burgreen]

  7. Optimal Design: Artificial Heart Shape Optimization 2 Burgreen & Antaki, 1996

  8. Optimal Control: Boundary Flow Control 1 MHD flow control using boundary magnetic field

  9. Optimal Control: Boundary Flow Control 2 Goal: find boundary suction/injection that minimizes energy dissipation in a viscous incompressible flow

  10. Optimal Control: Viscous Boundary Flow Control 2 (G. Biros, 2000)

  11. Optimal Control: Viscous Boundary Flow Control 3 [figures: no control vs. optimal control]

  12. Parameter Estimation: Seismic Inversion 1 • Forward Problem: Given soil material and earthquake source parameters, find earthquake ground motion. • Inverse Problem: Given earthquake observations, estimate material and source parameters. • Part of Quake KDI Project at CMU (J. Bielak, D. O’Hallaron), Berkeley (J. Shewchuk), and SDSU (S. Day, H. Magistrale)

  13. Parameter Estimation: Seismic Inversion 2

  14. Parameter Estimation: Seismic Inversion 3 Simulation of 1994 Northridge Earthquake

  15. Parameter Estimation: Seismic Inversion 4 Surface Visualization

  16. Lagrange-Newton-Krylov-Schur method • Consider the discrete form of the problem: minimize an objective function over the states and controls, subject to the state equations (discretized PDEs) as constraints
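The formulation itself appears to have been an equation image on the original slide. A hedged reconstruction, with symbols chosen here for illustration (u for the states, x for the decision variables, f for the objective, c for the state equations), is:

```latex
% Hedged reconstruction of the discrete problem (symbols assumed, not verbatim from the slide)
\begin{aligned}
\min_{u \in \mathbb{R}^{N_u},\; x \in \mathbb{R}^{N_d}} \quad & f(u, x)
  && \text{objective function} \\
\text{subject to} \quad & c(u, x) = 0
  && \text{state equations (discretized PDEs)}
\end{aligned}
```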

  17. Optimality Conditions • Lagrangian function: • First order optimality conditions: • Define:
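The expressions themselves are missing from the transcript. In the notation assumed above, with λ denoting the Lagrange multipliers (adjoints), the standard forms would read:

```latex
% Hedged reconstruction (notation assumed)
% Lagrangian function:
L(u, x, \lambda) = f(u, x) + \lambda^{T} c(u, x)

% First-order optimality conditions:
\nabla_u L = \nabla_u f + A^{T}\lambda = 0, \qquad
\nabla_x L = \nabla_x f + B^{T}\lambda = 0, \qquad
\nabla_\lambda L = c(u, x) = 0

% Definitions:
A = \partial c/\partial u \ \text{(state Jacobian)}, \quad
B = \partial c/\partial x \ \text{(decision Jacobian)}, \quad
W = \nabla^{2}_{(u,x)} L \ \text{(Hessian of the Lagrangian)}
```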

  18. Newton’s Method for Optimality Conditions (SQP, Newton-Lagrange) • A Newton step: (KKT system) • Two possibilities for solution: • Full space method: solve KKT system with a Krylov method—need good preconditioner • Reduced space method: eliminate state variables and Lagrange multipliers
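The KKT system on the slide was displayed as an equation; in the assumed notation it is the symmetric indefinite system

```latex
% Hedged sketch of one Newton (KKT) step
\begin{pmatrix}
W_{uu} & W_{ux} & A^{T} \\
W_{xu} & W_{xx} & B^{T} \\
A      & B      & 0
\end{pmatrix}
\begin{pmatrix} p_u \\ p_x \\ p_\lambda \end{pmatrix}
= -
\begin{pmatrix}
\nabla_u f + A^{T}\lambda \\
\nabla_x f + B^{T}\lambda \\
c(u, x)
\end{pmatrix}
```

whose size and indefiniteness are what make the choice between full-space and reduced-space strategies interesting.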

  19. Quasi-Newton Reduced SQP • LU factorization of (permuted) KKT matrix • Problems: • Reduced Hessian Wz needs m (number of decision variables) PDE solves per optimization iteration • 2nd derivatives needed • Quasi-Newton RSQP: • Replace reduced Hessian by quasi-Newton approximation • Discard other 2nd derivative terms • Properties • Just 2 PDE solves required per optimization iteration • 2-step superlinear convergence
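For reference, a hedged sketch (in the assumed notation) of where the m PDE solves come from: eliminating the states through the null-space basis of the constraint Jacobian gives the reduced Hessian

```latex
% Hedged sketch (notation assumed)
Z = \begin{pmatrix} -A^{-1}B \\ I \end{pmatrix}, \qquad
W_z = Z^{T} W Z
    = W_{xx} - W_{xu}A^{-1}B - B^{T}A^{-T}W_{ux} + B^{T}A^{-T}W_{uu}A^{-1}B
```

and forming A^{-1}B column by column costs one linearized PDE solve per decision variable, which is exactly what the quasi-Newton approximation avoids.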

  20. Approximate QN-RSQP Preconditioner • Use Schur-based approximate LU factorization as KKT preconditioner • Since it need only act as a preconditioner, exact PDE solves can be replaced by actions of a PDE preconditioner • Properties: • No PDE solves per iteration (2 preconditioner applications required per inner Krylov iteration) • Newton convergence for outer iteration • Inspiration: • Schur complement domain decomposition preconditioners (Keyes & Gropp, 1987)
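A minimal Python sketch of one application of such a preconditioner, written for the quasi-Newton variant that drops the second-derivative cross terms; every callable below is a placeholder standing in for an operator action, not Veltisto's actual interface:

```python
def apply_qn_rsqp_preconditioner(r_u, r_x, r_c,
                                 apply_Atilde_inv,    # PDE preconditioner (approximate A^{-1})
                                 apply_Atilde_inv_T,  # its transpose (approximate A^{-T})
                                 apply_B, apply_BT,   # decision Jacobian B and B^T
                                 apply_Wz_inv_approx  # quasi-Newton approx of W_z^{-1}
                                 ):
    """Hedged sketch: one application of a QN-RSQP-style KKT preconditioner.

    (r_u, r_x, r_c) are the residual blocks for the state-gradient,
    decision-gradient, and constraint equations. Dropping the Hessian
    cross terms (as in QN-RSQP) leaves exactly two applications of the
    PDE preconditioner: one transpose (adjoint-like), one forward.
    """
    # 1. Adjoint-like application: approximate multiplier block
    p_lam = apply_Atilde_inv_T(r_u)
    # 2. Reduced (decision-space) residual and approximate reduced-Hessian solve
    r_z = r_x - apply_BT(p_lam)
    p_x = apply_Wz_inv_approx(r_z)
    # 3. Forward-like application: approximate state block
    p_u = apply_Atilde_inv(r_c - apply_B(p_x))
    return p_u, p_x, p_lam
```

This is the exact inverse of the KKT matrix with the cross terms zeroed, A replaced by its preconditioner, and the reduced Hessian replaced by its approximation, which is why only two preconditioner applications per inner Krylov iteration are needed.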

  21. Effectiveness of Schur Preconditioner • Preconditioned KKT system (w/exact A) • Schur preconditioner results in: • Reduced condition number • Clustering of eigenvalues • But not a normal matrix—convergence rate bounds depend on condition number of eigenvector matrix • In practice, works very well

  22. How to Precondition Reduced Hessian? • Requirements • Cannot compute reduced Hessian, only its (approximate) action on a vector • Possibilities • Quasi-Newton formula (limited memory for large m) • Fixed small number of stationary iterations • Sparse approximate inverse preconditioner (SPAI) • Discrete approximation of underlying continuous Wz • Sherman-Morrison-Woodbury, if reduced Hessian is of form “compact operator + SPD”

  23. 2-step Stationary Iterations as Reduced Hessian Preconditioner • Properties • Same complexity as CG but a constant preconditioner • Needs only Matvecs (here with approximate reduced Hessian) • Needs estimates of extremal eigenvalues • Use to initialize L-BFGS preconditioner (effectively H0)

  24. Inexact Solution of KKT System • Solve KKT system iteratively to a tolerance proportional to η times the norm of the Lagrangian gradient • Eisenstat & Walker inexact Newton theory applies for reduction of norm of Lagrangian gradient • But will it satisfy merit function criteria? • Can show that in vicinity of local minimum (for augmented Lagrangian merit function): • If η small enough, Armijo condition satisfied with unit step lengths • If η is O(Lagrangian gradient), quadratic convergence preserved
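A small illustrative sketch (constants assumed) of the two ingredients named here, the gradient-proportional inner tolerance and the Armijo sufficient-decrease test:

```python
def inner_tolerance(grad_lagr_norm, eta=1e-2, tol_min=1e-12):
    """Hedged sketch: stop the inner KKT (Krylov) solve once its residual is
    below eta times the Lagrangian gradient norm, so the tolerance tightens
    automatically as the outer Newton iteration converges."""
    return max(eta * grad_lagr_norm, tol_min)

def armijo_accepts(merit_old, merit_new, step_length, dir_derivative, c1=1e-4):
    """Hedged sketch of the Armijo test for the backtracking line search on
    the merit function (augmented Lagrangian or L1); dir_derivative is the
    directional derivative of the merit function along the search direction."""
    return merit_new <= merit_old + c1 * step_length * dir_derivative
```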

  25. Putting it all together: Lagrange-Newton-Krylov-Schur Method • Continuation loop • Optimization iteration (Lagrange-Newton) • Estimate extremal eigenvalues of (approximate) reduced Hessian using Lanczos (retreat to QN-RSQP if negative) • Inexact KKT solution via symmetric QMR (Krylov) • Quasi-Newton RSQP preconditioner (Schur) • 2-step stationary iterations + L-BFGS approximation of inverse reduced Hessian • PDE solve replaced by PDE preconditioner • Backtracking line search on augmented Lagrangian or L1 merit function • If no sufficient descent use QN-RSQP • Compute derivatives, objective function, and residuals • Update solution, tighten tolerances
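Read as pseudocode, the algorithm above might be organized roughly as follows; every `problem.*` call is a placeholder for a step named on this slide, not an actual Veltisto routine:

```python
def lnks_outer_loop(problem, grids, tol_outer=1e-6, eta=1e-2):
    """Hedged pseudocode sketch of the LNKS flow described on the slide."""
    x = problem.initial_guess()
    for grid in grids:                                   # continuation loop
        x = problem.transfer_to(grid, x)
        while problem.lagrangian_gradient_norm(x) > tol_outer:
            lam_min, lam_max = problem.lanczos_extremal_eigs(x)
            if lam_min <= 0.0:
                step = problem.qn_rsqp_step(x)           # retreat to QN-RSQP
            else:
                # inexact Newton: symmetric QMR on the KKT system with the
                # QN-RSQP (Schur) preconditioner, to a gradient-scaled tolerance
                tol_inner = eta * problem.lagrangian_gradient_norm(x)
                step = problem.sqmr_kkt_solve(x, tol_inner, lam_min, lam_max)
            # backtracking line search on the merit function; fall back to
            # QN-RSQP if the step gives no sufficient descent
            x = problem.line_search_update(x, step)
        problem.tighten_tolerances()
    return x
```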

  26. Application: Optimal Boundary Control of Viscous Flow Goal: find boundary suction/injection that minimizes energy dissipation in a viscous incompressible flow

  27. Fixed-size efficiency (n = 120,000) [table: algorithmic, parallel, and overall efficiency; Veltisto + PETSc implementation]

  28. Isogranular algorithmic efficiency ("textbook" Newton mesh independence) • Mesh independence of Krylov iterations with exact PDE solves (implies the reduced Hessian preconditioner is effective) • ~4x cost of PDE solve • Moderate growth of Krylov iterations with approximate PDE solves (implies the PDE preconditioner is moderately effective)

  29. Veltisto: A Library for PDE-Constrained Optimization • PETSc extension • Object oriented • Nonlinearly constrained optimization • Targets steady PDE constraints • Supports domain-based parallelism • Matrix-free implementation • In use at Sandia-Albuquerque

  30. Veltisto Features • Optimizers: LNKS, LNKS-IP, QN-RSQP, N-RSQP • Preconditioners: LM-BFGS, SR1, Broyden, 2-step stationary • Merit functions: augmented Lagrangian, L1, L1+second order correction • Symmetric QMR (allows indefinite preconditioning)

  31. Veltisto Example

  32. Veltisto Example

  33. Conclusions • Schur-complementing the optimality system permits capitalizing on investment in PDE solvers • Lagrange-Newton-Krylov-Schur methods can exhibit mesh-independent outer iterations for mesh-based control spaces • Inner iteration potentially mesh independent • Depends on effective PDE and reduced Hessian preconditioners • Very good scalability • >10x speedup over quasi-Newton • Optimization in small multiple of simulation cost

  34. Future Work • Approximate block elimination preconditioners • Other reduced Hessian preconditioners • Direct preconditioning of KKT system • DD preconditioners directly on KKT system • Multigrid on KKT system (Barry) • Multigrid preconditioner on block diagonal approximation of KKT system • Reformulation of KKT system to make it SPD • FOSLS? • Augmented Lagrangian? • Time dependence and inequalities remain a challenge

  35. Acoustic inversion
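The problem statement itself is not in the transcript. A hedged reconstruction, inferring the notation from slide 44 (pressure u, adjoint l, squared wave speed q) and assuming a standard constant-density acoustic model, is:

```latex
% Hedged reconstruction (model form, symbols, and weights assumed)
\min_{q}\ \ \frac{1}{2}\sum_{r=1}^{N_{\mathrm{rec}}}\int_{0}^{T}
  \big(u(x_r, t) - u^{*}(x_r, t)\big)^{2}\, dt \;+\; \rho\, R(q)
\qquad \text{subject to} \qquad
u_{tt} - \nabla\cdot\big(q\,\nabla u\big) = f
```

with appropriate initial and boundary conditions, u* the recorded seismograms at the N_rec receivers, and R(q) a Tikhonov or total-variation regularization functional (both appear on later slides).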

  36. First order optimality conditions

  37. First order optimality conditions, cont.

  38. Summary: first order optimality conditions (Karush-Kuhn-Tucker)

  39. Newton solution of the KKT conditions

  40. Schur complement method • Schur complement is self adjoint & positive definite near minimum • Action of Schur complement on q requires 1 forward & 1 adjoint wave prop

  41. Gauss-Newton-Schur method • GN Schur complement is self adjoint & positive definite everywhere • Quadratic convergence for good fit problems, linear otherwise • Action of Schur complement on q requires 1 forward & 1 adjoint wave prop
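A minimal matrix-free Python sketch of that Schur-complement (Gauss-Newton reduced Hessian) action; all callables are placeholders for the discretized wave-propagation solves, not the authors' code:

```python
def gn_schur_action(dq, forward_sensitivity_solve, adjoint_solve,
                    restrict_to_receivers, reg_weight=0.0, reg_apply=None):
    """Hedged sketch: apply the Gauss-Newton Schur complement to a parameter
    direction dq using one forward and one adjoint wave propagation."""
    # 1. Forward: linearized (sensitivity) wave propagation driven by dq
    du = forward_sensitivity_solve(dq)
    # 2. Restrict the state perturbation to the receiver locations
    d_at_receivers = restrict_to_receivers(du)
    # 3. Adjoint: propagate the receiver perturbation backward in time and
    #    assemble the resulting parameter-space vector
    Hdq = adjoint_solve(d_at_receivers)
    # 4. Optional regularization contribution (e.g. Tikhonov)
    if reg_apply is not None:
        Hdq = Hdq + reg_weight * reg_apply(dq)
    return Hdq
```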

  42. Spectrum of GN Schur complement (least-squares term only; no regularization) [plots: 400 eigenvalues; first 12 eigenvalues; smooth vs. rough eigenvectors]

  43. W-(Gauss)-Newton-Schur-CG Solver • Multiscale continuation over u, l, q grids and source frequency • (Gauss)-Newton nonlinear iteration • Conjugate gradient solution of Schur-decomposed linear system • Currently no preconditioner
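A minimal sketch of that inner solve, wrapping a matrix-free Schur-complement action (such as the `gn_schur_action` sketch above) in SciPy's CG; sizes, names, and limits are illustrative:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def gn_cg_step(n_params, schur_action, neg_gradient, maxiter=20):
    """Hedged sketch: unpreconditioned CG on the (Gauss-)Newton Schur-complement
    system H dq = -g. Each CG iteration costs one forward plus one adjoint
    wave propagation through schur_action."""
    H = LinearOperator((n_params, n_params), matvec=schur_action,
                       dtype=np.float64)
    dq, info = cg(H, neg_gradient, maxiter=maxiter)   # info == 0 means converged
    return dq
```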

  44. 2D/3D Acoustic Wave Equation Example • Piecewise bi(tri)linear approximation of pressure u, adjoint l, and square of wave speed q • Explicit time integration • Up to 512x512 and 64x64x64 grids, 2000 time steps • 50 receivers, 1 harmonic source • Homogeneous initial guess • PETSc (www.petsc.anl.gov) implementation • Performance (Tikhonov regularization) • 6 hours on 64 processors of T3E-900 at PSC • Number of inner conjugate gradient iterations 10~20 • Number of outer Newton iterations 10~20

  45. Effect of different types of regularization

  46. Total variation regularization, finer grids

  47. Effect of different regularization weights (Tikhonov regularization)

  48. Multilevel vs. single level continuation (Tikhonov regularization) [figures: single level solution vs. multilevel solution]

  49. Effect of data noise on solution
