690 likes | 864 Views
Integration Methods for Multidimensional Probability Integrals. Abebe Geletu abebe.geletu@tu-ilmenau.de www.tu-ilmenau.de/simulation. IFAC2011 Pre-Conference Tutorial August 27-28, 2011, Milano. Group of S imulation and O ptimal P rocesses ( SOP )
E N D
Integration Methods for MultidimensionalProbability Integrals Abebe Geletu abebe.geletu@tu-ilmenau.de www.tu-ilmenau.de/simulation IFAC2011 Pre-Conference Tutorial August 27-28, 2011, Milano Group of Simulation and Optimal Processes (SOP) Institute for Automation and Systems Engineering Technische Universität Ilmenau
Introduction One-dimensional probability integrals Cubature rules for multidimensional 3.1. Full-grid cubature rules 3.2. Sparse-grid cubature rules Application Conclusions Resources and References Content
1. Introduction Task: to solve the optimization problem Define:
Introduction … The problem CCOPT is equivalently written as • Question: How to solve the problem NLP? • Use either gradient-based or gradient-free optimization algorithms or a combination of both. • (A) Any optimization algorithm requires • value of the objective function • Value of the constraint function • for each given .
Introduction … • The most difficult task in solving CCOPT is the evaluation of the values of the chance constraint • for a given . • Note that is a random variable, since is random. • If is non-linear w.r.t. it is difficult to determine the distribution of the random variable from that of . • Approach : back-projection through a montony relation • Let be a 1D random variable with a known distribution. • Let and is a strictly increasing real valued function. • Then The latteris simpler tocompute!!
Introduction … • In higher dimensions (experimentally or analytically) study the equation • and among find a which has a strict monotony relation with ; so that : • is either strictly increasing or striclty decreasing
Introduction… • (B) Gradient-based algorithms further require • gradient of the objective function and • Gradient of the constraints function • when these derivatives exist. • In both (A) and (B) values and gradient are computed through evaluation of multidimensional integrals of the form • Except in some special cases • these integrals cannot be computed analytically • integrals on higher dimensions are computationally expensive
Introduction… Numerical methods for evaluation of multidimensional integrals. • Fast and efficient evaluation of probability integrals reduces computational expenses in the overall optimization strategy for CCOPT. • Since integration is done with respect to we drop the parameterization with and consider only • for the sake of simplicity.
Introduction … Methods for Multidimensional Integrals Deterministic Methods • Quadrature rules – for 1D integrals • Full or Spars-grid cubature rules – for MD integrals Sampling-based Methods • Monte Carlo Methods • Quasi-Monte Carlo methods • Deterministic integration rules for multidimensional integrals (commonly called cubature rules) are usually constructed from 1D quadrature rules.
2. Qudrature Rules Given the 1D integral where is a finite or an infinite interval and is a non-negative weight function. • The weight function corresponds to a probability density function of a random variable on the set . • represents expected value. Weights and integration domains for some standard 1D probability integrals
Qudrature Rules … Idea: to approximate the integral by a weighted sum • so that • the approximation error • is as small as possible; • the quadrature rule be capable of approximating 1D integrals • for a large class of functions . Important issues: How to generate • quadrature nodes: • corresponding weights: and • the number of function evaluations ?
Qudrature Rules … • The number of nodes is a trade-off between accuracy and efficiency. In general, the use of several nodes may not reduce the approximation error!! Methods for 1D integrals: • Newton-Cotes Formulas • -less accurate for probability integrals • Monte-Carlo and Quasi-Monte Carlo • - mainly used for multidimensional integrals • Gauss quadrature rules and Kronord/Patterson extenstions • - highly accurate and suitable for probability integrals • Clenshaw-Curtis quadrature rule • - suitable for probability integrals • Required property: • All the quadrature weights are non-negative.
Quadrature Rules … • Non-negative quadrature weights have advantages: • the use of non-negative weights reduces the danger of numerical cancellations in ; • in stochastic optimization, if is convex w.r.t. u for each fixed , then • is convex and due to the non-negative of weights the approximation • preserves the convexity.
Orthogonal Polynomials and Gauss Quadrature Rules Ref: Walter Gatushi : Orthogonal Polynomials • For two functions and • defines a scalar product w.r.t. on the set . • A degree n polynomial and a degree m polynomial are orthogonal on w.r.t. if • Different pairs lead to different sets of orthogonal polynomials. • Example: Sets of orthogonal polynomials for standard pair: • Hermit polynomials - • Jacobi polynomials - etc. • Quadrature nodes and corresponding weights • are computed based on orthogonal polynomials with respect to a given pair .
Generation of orthogonal polynomials Ref: Walter Gatushi : Orthogonal Polynomials • Given any pair . Let and all moments • exist and finite. Then • there is a unique set of orthogonal polynomials • corresponding to satisfying • (ii) the set of polynomials are uniquely determined by the three-term recurrence relation using and • where and
Computation of Gauss quadrature nodes and weights Ref: Golub, G.H., Welsch, J.H. Calculation of Gauss quadrature rules. • the computation of the coefficients of the recurrence relation requires algorithms (commonly known as Steleje’s procedures see Gander & Karp for a stable algorithm. • For known recurrence coefficients the N nodes • and weights of a Gauss • quadrature rules are computed from the Jacobi matrix • Note that: all the quadrature weights computed above are nonnegative Gauss quadrature rules have non-negative weights. • the quadrature nodes all lie in the interior of .
Standard Gauss Quadrature Rules Ref: Abramowitz, M. and Stegun, I. A. Handbook of Mathematical Functions with Furmulas, Graphs, and Mathematical Tables • There are software and lookup tables for standard quadrature nodes and corresponding weights. Table : Standard Gauss-quadrature rules NB: All 1D probability integrals corresponding to standard Gauss-quadrature rules can be computed with non-negative quadrature weights.
Gauss Quadrature Rules – Exactness • An N-point Gauss quadrature rule computes all polynomials with degree less or equal to exactly; i.e. • the quadrature rule has a degree of (polynomial) exactness equal to 2N-1 a probability integrals with a polynomial integrand can be computed exactly. • if the integrand is a general function of the uncertain variable, the approximation error depends on the smoothness of the function and the number N of integration nodes; • integration nodes and weights are generated independent of the integrand; • quadrature rules with odd number N of nodes are usually more preferred.
Example: Gauss-Hermite quadrature Ref: Abramowitz, M. and Stegun, I. A. Handbook of Mathematical Functions with Furmulas, Graphs, and Mathematical Tables Gauss-Hermite quadrature nodes and weights 3-Point Gauss-Hermite quadrature Polynomial exactness up to degree 7-Point Gauss-Hermite quadrature Polynomial exactness up to degree
Example: Gauss-Hermite quadrature Suppose to evaluate where Variable transformation :
Example: Gauss-Hermite quadrature Using quadrature nodes based on the weight function we have For special case A 3-Point quadrature is not efficient to compute
Embedded quadrature rules Ref: Trefethen: Is Gauss quadrature better than Clenshaw-Curtis? • Let If the set of nodes for is a • subset of the set of nodes for ; i.e., • the quadature rule is embedded. Clenshaw-Curtis QuadratureRule on • Quadrature nodes • All quadrature weights in are positive; • have polynomial exactness equal to only (less than Gauss-quadrature rules); • it is an embedded quadrature rule Given , construction of requires only additional points; values of already computed for can be reused in saves time.
Advantages of Embedded quadrature rules • Unfortunately, Gaussquadratturerulesarenotembedded. Example: Gauss-Hermite quadrature Someadvantages of embeddedrules: • nodes of lowerdegree quadrature canbeusedwhenconstructinghigherdegreequadratures; • provideseasiererrorestimationfor quadrature rule; • embeddednessis a highlydesiredpropertyfortheconstruction of multidimensional cubaturetechniques, etc.
Kronord and Patterson Extensions Ref: Laurie, D.P. Calculation of Gauss-Krnord quadrature rules. • ExtendGaussquadraturerulesto makethemembedded. Kronord‘s Extension (Gauss-Krnord quadrature) • Given Gauss quadrature nodes , between every two nodes add one new node: • so that the new set of nodes • embeds the former ones. • the new quadrature weights are non-negative • Nodes of lower degree quadrature can be used in constructing higher degree quadratures. Degree of exactenss
Patterson’s Extensions Ref: Patterson, T.N. L. The optimum addition of points to quadrature formula. • To existing Gauss quadrature nodes add p new quadrature nodesso that the resulting rule has a maximum degree of accuracy • and the weights so that the new set of are non-negative. • Hence, • In general, • pre-fixing integration nodes reduces degree of exactness; • construction of embedded Gauss quadrature rules is not a trivial task. (see Laurie 1997)
Quadrature for non-standard integrals Note that: transform integrals on non-standard intervals onto the standard ones. • Examples of some possible transformations: • etc. • transformation is done in such a way the resulting integral is easier to compute; • when possible to try to match the resulting weight function and integration domain, so that available results can be easily used.
Quadrature for non-standard integrals Example This can be applied to a chance constraint as • The transformed integral can be computed using either Gauss-Legendre or Gauss-Chebychev quadrature rule.
From Chance to Expected Value Constraint Ref: Geletu et al: Monotony Analysis and Sparse-Grid Integration for nonlinear Chance Constrained Process Optimization Example: with Monotony relation
From Chance to Expected Value Constraint … where The inner integral can be transformed into using the change of variables
3. Multidimensional probability integrals Problem: Given a (continuous) function and a non-negative weight function how to compute the integral: In many practical applications the indefinite integral does not have analytic expression. The domain of integration commonly has a product form
Multidimensional probability integrals … Standard integration domains Note: Transform non-standard integrals into standard forms. Example: Let be a random variable w.r.t. the probability measure such that and Then represents the expected value of w.r.t. the probability measure .
Multidimensional probability integrals … Assumptions: (A1) The weight (probability density) function can be written as where Assumption (A1) holds true if the are independent random variables. (A2) The domain of integration Example:
Numerical method for multidimensional Integrals Two major approaches • Cubature Techniques (rules) • 3.1. Full-Grid Integration Techniques • 3.2. Sparse-Grid Integration Techniques • Sampling based Techniques will not be discusses here • (IIIA) Monte-Carlo (MC) Integration Techniques • (IIIB) Quasi-Monte-Carlo (QMC) Integration Techniques • Cubature techniques are constructed based on one dimensional quadrature rules. • One-dimensional interpolatory Gauss quadrature rules (and their extensions) are found to be efficient, due to their higher degree of accuracy. • MC methods use randomly generated samples from • QMC methods use sequence of integration nodes from with lower discrepancy.
Suppose assumptions (A1) and (A2 ). Let for 3.1. Full-Grid Cubature techniques are quadrature nodes; corresponding weights; for the one-dimensional integral on with the weight function A full-grid cubature rule to compute I[f] is called full-grid tensor-product of one dimensional quadrature-rules or product rule.
Important questions: How many grid-points are there in the full-grid cubature rule ? That is, the set Is it necessary to use all the grid points in ? Is there redundancy in the full-grid scheme? What is the polynomial (or degree) of exactness of ? How good are full-grid cubature techniques? The number of grid-points (integration nodes) in : If then the number of grid points will be exponential growth!!
Examples – full-grid Techniques Example: For a 5-dimensional integral, a full-grid quadrature rule using 11-quadrature nodes in each dimension uses cubature nodes. • requires large number of function evaluations even for moderate dimensions; • not efficient for problems of higher dimensions. In particular, computationally expensive for stochastic optimization. Example: Let 7-Point 2D Full-grid nodes
Multidimensional polynomials and exactness of cubature techniques • One measure of quality for a cubature rule is related with the largest degree polynomial that it can integrate exactly. • For the variables a monomial of degree in the variables is an expression of the form • where • Example:For two variables • the following are monomials of degree 3 • all monomials in the two • variable of degree less equal to 3 are
Multidimensional polynomials and exactness of cubature techniques … • The number of distinct monomials in variables, degree less than equal to is equal to • A multidimensional polynomialof degree in the variables is a linear combination of monomials of degree less or equal to ; i.e. Hence, Example: (a) (b)
Multidimensional polynomials and exactness of cubature techniques … Ref: Cools, R: Advances in multidimensional Integration. • A cubature rule is said to be exact for a polynomial if • That is, • A cubature rule is said to have a polynomialexactness (or degree of accuracy) if it is exact for all polynomials of degree less than equal to .
Multidimensional polynomials and exactness of cubature techniques … Ref: Cools, R: Advances in multidimensional Integration. Theorem(Cools 2002). A cubature rule constructed as a tensor-product of one-dimensional Gauss-quadrature rules: with degree of exactness of the quadrature rule equal to then the degree of exactness of is equal to In particular, if , then the degree of exactness of will be equal to Good Idea: Let be an arbitrary function. For an accurate evaluation of use a cubature rule with a higher degree of accuracy . If itself is a polynomial with degree less or equal to , then
Multidimensional polynomials and exactness of cubature techniques … Fact:Higher accuracy in computing can be achieved by using a cubature rule with higher degree of exactness. Example: Consider a full-grid 2D cubaturer for the integral Almost equal result from two full-grids
Redundancy in the full-grid integration technique Ref: Davis, P.J., Rabinowitz, P.: Methods of numerical integration. The use of Gauss-Hermite full-grid cubature nodes leads to, too many function evaluations only with a little gain in accuracy. • Redundancy in the full-grid cubature techniques. • The use of cubature nodes can lead the curse of dimensions. Question: How many integration nodes are sufficient to obtain a polynomial exactness ? Answer: Theorem (Möller 1976, Mysovskikh 1968, Tchakaloff 1957) To attain a polynomial exactness equal to , the (optimal) required number of grid points in has lower and upper bounds given by • Known as Möller’s lower bound, while is Mysovskikh’s upper bound • (for unbounded ) or Tchakaloff upper bound (for bounded ).
Efficient cubature rules - represents the largest integer less than or equal to a; Definition(Davis & Rabinowitz) A cubature rule is said to be efficient (optimal) if it uses integration nodes. • represents the smallest integer greater or equal to a; • eg. • Now it is obvious that , for large n.
How many of cubature nodes? • For a 10-dimensiona integral we find the following values • For the computation of a 10–dimensional probability integral, using the full-grid technique with a degree of precision d=7, requires 1,048,576 function evaluations. Unafforable in the conext of stochastic optimization.
Construction of Efficient Cubature Rules Ref: Smolyak, S.:Quadrature and interpolation formulas for tensor products of certain classes of functions. Remark: The Theorems of Möller, Mysovskikh or Tchakaloff are non-constructive. Question: How to construct cubature rules with minimal number of nodes; i.e. number of nodes near or equal to ? If not, rules with number of nodes lying between the bounds and ? • The construction of cubature techniques with minimum number of integration nodes and higher polynomial exactness is still a hot research topic!! In fact, construction of cubature rules is partly an art as well as a science (Cools 1997). In 1963 Smolyak gave a scheme for construction of cubature techniques with number of nodes between and . Leading to a class of cubature rules known as Smolyak’s tensor-product integration rules or sparse-grid integration techniques.
3.2. Sparse-grid integration techniques Recall the integral with assumptions (A1) and (A2). Assumption(A3): For the sake of simplicity, we assume The random variables are independent and identically distributed. • In general, assumption (A3) is not required. Sparse-grid cubature rules can be constructed for independent but non-identically distributed random variables • However, correlated variables need to be de-correlated (or transformed ) for construction a sparse-grid integration rule. Now, according assumption (A3), consider the same quadrature rules on each using , so drop the index .
Sparse-grid integration techniques… Assumption(A4): For each one dimensional cubature rule on , there is a sequence of sets of quadrature nodes with The 1D quadrature rule with nodes is for • If the quadrature nodes in assumption (A4) satisfy the property that, • Then corresponding sequence of quadrature rules is called a nested or embedded quadrature rules. • Such sequence of quadrature rules can be constructed based on • Curtis-Clenshaw, Krnord/Patterson extension rules, etc .
Construction of sparse-grid integration rules Let be a multi-index such that • Smolyak 1963 (also Wasilkowski & Woznikowski 1995 ): • A sparse-grid rule based on the sequence of quadrature rules • for the approximation of the n-dimensional integral with a degree of accuracy is • where • for
Sparse-grid integration rules - Examples • For n=2 and d=7, the sparse grid technique takes the form • Observe that
Sparse-grid integration rules - Examples For n=2 and d=7, the sparse-grids are spread as follows: Fig: with 29 grid-points.