Higher-order Confidence Intervals for Stochastic Programming using Bootstrapping • Cosmin G. Petra • Joint work with Mihai Anitescu • Mathematics and Computer Science Division, Argonne National Laboratory • petra@mcs.anl.gov • INFORMS Annual Meeting 2012
Outline • Confidence intervals • Motivation: statistical inference for the stochastic optimization of power grids • Our statistical estimator for the optimal value • Bootstrapping • Second-order bootstrapped confidence intervals • Numerical example
Confidence intervals (CIs) for a statistic • Want an interval [L, U] in which the statistic lies with high probability • Requires knowledge of the statistic's probability distribution • Example: confidence intervals for the mean of a Gaussian (normal) random variable • Figure: normal distribution, also called the Gaussian or “bell curve” distribution. Image source: Wikipedia.
Approximating CIs • In many cases the distribution function is not known • The intervals are then approximated using the central limit theorem (CLT) • Normal approximation for an equal-tailed 95% CI • Notation
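The CLT-based interval mentioned on this slide can be sketched as follows. This is a minimal illustration of the textbook normal approximation for the mean (sample mean plus or minus a normal quantile times the standard error); the function name `normal_ci` and the sample values are made up for the example and do not come from the talk.

```python
import math
from statistics import NormalDist, mean, stdev


def normal_ci(sample, level=0.95):
    """Equal-tailed CLT-based confidence interval for the mean of `sample`."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # Two-sided critical value, about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = mean(sample)
    # Standard error of the mean, using the sample standard deviation (n - 1 divisor).
    half_width = z * stdev(sample) / math.sqrt(n)
    return center - half_width, center + half_width


if __name__ == "__main__":
    lo, hi = normal_ci([1.0, 2.0, 3.0, 4.0, 5.0])
    print(f"95% CI for the mean: [{lo:.3f}, {hi:.3f}]")
```

The interval is symmetric around the sample mean by construction, which is one reason it can be unreliable for skewed statistics and small samples.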
Optimal value in stochastic programming • Stochastic programming (SP) problem • Sample average approximation (SAA) • Properties of the SAA optimal value: monotonically shrinking negative bias; consistency; arbitrarily slow convergence; non-normal bias
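The negative, monotonically shrinking bias of the SAA optimal value can be seen on a toy problem. The sketch below is not from the talk: it assumes the hypothetical problem min over x of E[(x - ξ)²] with ξ ~ N(0, 1), whose true optimal value is 1. The SAA minimizer is the sample mean, so the SAA optimal value is the divisor-N sample variance, with expectation (N - 1)/N, which is below 1 and increases toward it.

```python
import random
from statistics import fmean


def saa_optimal_value(xi):
    """Optimal value of the toy SAA problem min_x (1/N) * sum_i (x - xi_i)^2.

    The minimizer is the sample mean, so the value is the divisor-N variance.
    """
    m = fmean(xi)
    return sum((v - m) ** 2 for v in xi) / len(xi)


def mean_saa_value(n, replications, rng):
    """Monte Carlo estimate of E[v_N] for xi ~ N(0, 1) (toy assumption)."""
    values = []
    for _ in range(replications):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        values.append(saa_optimal_value(xi))
    return fmean(values)


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (2, 5, 20):
        est = mean_saa_value(n, 20000, rng)
        # Exact expectation for this toy problem is (N - 1) / N; the true value is 1.
        print(f"N={n:2d}  estimated E[v_N] = {est:.3f}  exact = {(n - 1) / n:.3f}")
```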
Stochastic unit commitment with wind power • Wind forecast: WRF (Weather Research and Forecasting) model • Real-time, grid-nested 24h simulation • 30 samples require 1h on 500 CPUs (Jazz@Argonne) • Figure labels: thermal generator, wind farm • Slide courtesy of V. Zavala & E. Constantinescu
Specifics of stochastic optimization of energy systems • Uncertainty: discrete or continuous • SAA: sampling is expensive • Statistical inference: only a small number of samples is available
Standard methodology for stochastic programming: Linderoth, Shapiro, Wright (2004) • Lower-bound CI: CI for the expected SAA optimal value (a lower bound, by the negative bias), based on M batches of N samples • Upper-bound CI: CI for an upper bound on the optimal value (obtained similarly) • Needs a relatively large number of samples (2MN) • First-order correct, and therefore unreliable for a small number of samples • Correctness of a CI is measured by an order k
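A batch-based lower-bound interval of the kind described on this slide might look like the sketch below. It is an illustration under stated assumptions, not the cited authors' code: the toy problem min over x of E[(x - ξ)²] with ξ ~ N(0, 1), the helper names `saa_value` and `lower_bound_ci`, and the use of a normal quantile (a Student t quantile is the more common choice for small M) are all choices made for the example.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def saa_value(xi):
    """Optimal value of the toy SAA problem min_x (1/N) * sum_i (x - xi_i)^2."""
    m = fmean(xi)
    return sum((v - m) ** 2 for v in xi) / len(xi)


def lower_bound_ci(draw_sample, solve_saa, m_batches, n_samples, level=0.95):
    """One-sided lower confidence limit for E[v_N], which lies below v*.

    Solves M independent SAA problems with N samples each and returns
    (lower_limit, point_estimate). Uses M * N samples in total.
    """
    values = [solve_saa(draw_sample(n_samples)) for _ in range(m_batches)]
    center = fmean(values)
    std_err = stdev(values) / math.sqrt(m_batches)
    # Normal quantile for simplicity; a t quantile with M - 1 degrees of
    # freedom is the usual choice when M is small.
    z = NormalDist().inv_cdf(level)
    return center - z * std_err, center


if __name__ == "__main__":
    rng = random.Random(0)
    draw = lambda n: [rng.gauss(0.0, 1.0) for _ in range(n)]
    lo, est = lower_bound_ci(draw, saa_value, m_batches=10, n_samples=20)
    print(f"estimate of E[v_N] = {est:.3f}, lower limit = {lo:.3f} (true v* = 1)")
```

An upper-bound interval built the same way needs its own M batches, which is where the 2MN sample count comes from.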
Our approach for SP with small samples • 1. Novel estimator: converges one order faster than the SAA optimal value, except on a set whose measure converges exponentially to 0 • 2. Bootstrapping: allows the construction of reliable CIs when only small samples are available; bootstrap CIs are second-order correct • M. Anitescu, C. Petra: “Higher-Order Confidence Intervals for Stochastic Programming using Bootstrapping”, submitted to Math. Prog.
The estimator • L is the Lagrangian of SP and J is the Jacobian of the constraints • The solution of the SAA problem is obtained using N samples • Intended for nonlinear recourse terms • Theorem 1 (Anitescu & P.): stated under some regularity and smoothness conditions • Proof: based on the theory of large deviations • The CIs are constructed from a second batch of N samples • A total of 2N samples is needed when using bootstrapping
Bootstrapping: a textbook example • Known: the US population in 1920 and the 1930 population of 49 cities • Want: 1. an estimate of the 1930 population; 2. CIs for the estimate • Solution 1: 1930 population = 1920 population × mean of the ratios • Solution 2: needs the distribution of the ratios, but there are not enough samples, hence bootstrapping • Sample the existing samples (with replacement) • For each resample, compute the mean • This yields the bootstrap distribution • Build CIs based on the bootstrap distribution • Figure: histogram of the ratio of 1930 to 1920 populations for N=49 US cities • The “bootstrapped” distribution is clearly not Gaussian • Bootstrap CIs outperform normal CIs.
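The resampling steps listed on this slide can be sketched as a plain percentile bootstrap for the mean. The 49 ratios below are synthetic placeholders generated for the example (the slide's actual city data is not reproduced here), and the helper names are hypothetical.

```python
import math
import random
from statistics import fmean


def bootstrap_means(data, n_resamples, rng):
    """Means of `n_resamples` resamples drawn from `data` with replacement."""
    n = len(data)
    return [fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_ci(replicates, level=0.95):
    """Equal-tailed percentile interval read off the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    b = len(ordered)
    alpha = 1.0 - level
    lo_idx = max(0, math.floor(alpha / 2 * b))
    hi_idx = min(b - 1, math.ceil((1 - alpha / 2) * b) - 1)
    return ordered[lo_idx], ordered[hi_idx]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for the 49 population ratios (NOT the real data).
    ratios = [rng.lognormvariate(0.2, 0.3) for _ in range(49)]
    reps = bootstrap_means(ratios, 2000, rng)
    lo, hi = percentile_ci(reps)
    print(f"mean ratio = {fmean(ratios):.3f}, bootstrap 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Multiplying the interval endpoints by the known 1920 population would turn this into an interval for the 1930 population estimate.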
The methodology of bootstrapping • BCa (bias-corrected and accelerated) confidence intervals • Second-order correct • The method of choice when an accurate estimate of the variance is not available
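A minimal sketch of a BCa interval, following the standard textbook construction (bias correction from the bootstrap distribution, acceleration from jackknife values), is given below. It is an illustration only and not the implementation used in the talk; the function name `bca_ci`, the exponential test data, and the clamping of extreme proportions are assumptions made for the example.

```python
import random
from statistics import NormalDist, fmean

_STD_NORMAL = NormalDist()


def bca_ci(data, stat, n_resamples, rng, level=0.95):
    """Bias-corrected and accelerated bootstrap interval for stat(data)."""
    n = len(data)
    theta_hat = stat(data)
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction z0: normal quantile of the share of replicates below theta_hat.
    below = sum(r < theta_hat for r in reps)
    frac = min(max(below / n_resamples, 1.0 / (n_resamples + 1)),
               n_resamples / (n_resamples + 1.0))  # keep inv_cdf well defined
    z0 = _STD_NORMAL.inv_cdf(frac)

    # Acceleration from jackknife (leave-one-out) values of the statistic.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_level(p):
        # Map a nominal level p to the BCa-adjusted percentile level.
        z = _STD_NORMAL.inv_cdf(p)
        return _STD_NORMAL.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    alpha = 1.0 - level
    lo_p, hi_p = adjusted_level(alpha / 2), adjusted_level(1 - alpha / 2)
    lo_idx = min(n_resamples - 1, max(0, int(lo_p * n_resamples)))
    hi_idx = min(n_resamples - 1, max(0, int(hi_p * n_resamples)))
    return reps[lo_idx], reps[hi_idx]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.expovariate(1.0) for _ in range(30)]  # skewed toy sample
    lo, hi = bca_ci(data, fmean, 2000, rng)
    print(f"mean = {fmean(data):.3f}, BCa 95% CI = [{lo:.3f}, {hi:.3f}]")
```

With zero bias correction and zero acceleration the adjusted levels reduce to the nominal ones, so the result coincides with the plain percentile interval.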
What does bootstrapping do? • Edgeworth expansions for cdfs • Bootstrapping also accounts for the second term in the expansion • The quantiles are also second-order correct (Cornish-Fisher inverse expansions) • (Some) bootstrapped CIs are second-order correct • Reference: Peter Hall, “The Bootstrap and Edgeworth Expansion”, 1994.
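The argument on this slide can be written out in the generic form found in the bootstrap literature. The notation is an assumption made for illustration: T_N is a studentized statistic built from N samples, T_N^* its bootstrap counterpart, Φ and φ the standard normal cdf and pdf, q_1 and q_2 polynomials whose coefficients depend on the moments of the underlying distribution, and \hat q_1 the same polynomial with sample moments plugged in.

```latex
\begin{align*}
P(T_N \le x) &= \Phi(x) + N^{-1/2} q_1(x)\,\phi(x) + N^{-1} q_2(x)\,\phi(x) + \cdots \\
P^*(T_N^* \le x) &= \Phi(x) + N^{-1/2} \hat q_1(x)\,\phi(x) + O_p(N^{-1}),
  \qquad \hat q_1(x) - q_1(x) = O_p(N^{-1/2}) \\
P^*(T_N^* \le x) - P(T_N \le x) &= O_p(N^{-1})
  \quad\text{whereas}\quad \Phi(x) - P(T_N \le x) = O(N^{-1/2}) \\
t_\alpha &= z_\alpha - N^{-1/2} q_1(z_\alpha) + O(N^{-1})
\end{align*}
```

The last line is the Cornish-Fisher inversion for the α-quantile t_α of T_N in terms of the normal quantile z_α. The bootstrap quantile has the same form with \hat q_1 in place of q_1, so it matches t_α up to O_p(N^{-1}), while the normal quantile z_α alone misses the N^{-1/2} term.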
Bootstrapping the estimator Theorem 2: (Anitescu & P.) Let be a second order bootstrapping confidence interval for . Then for any
Numerical order of correctness • Observed orders of correctness (from the plots): 0.32, 0.82, 2.11, 1.14
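One plausible way to obtain an observed order of correctness from simulation output is a log-log fit of coverage error against sample size. The sketch below assumes the convention that a CI of order k has coverage error proportional to N^(-k/2); both this convention and the helper name `observed_order` are assumptions, and the input numbers are synthetic, not the talk's results.

```python
import math


def observed_order(sample_sizes, coverage_errors):
    """Least-squares slope of log|error| against log N, mapped to an order k.

    Assumes |coverage error| ~ C * N**(-k/2), so k = -2 * slope.
    """
    xs = [math.log(n) for n in sample_sizes]
    ys = [math.log(abs(e)) for e in coverage_errors]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return -2.0 * sxy / sxx


if __name__ == "__main__":
    sizes = [10, 20, 40, 80]
    # Synthetic coverage errors decaying like 1/N, i.e. second-order behavior.
    errors = [0.5 / n for n in sizes]
    print(f"observed order: {observed_order(sizes, errors):.2f}")
```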
Concluding remarks and future work • Proposed and analyzed a novel statistical estimator for the optimal solution of nonlinear stochastic optimization problems • Almost second-order correct confidence intervals using bootstrapping • Theoretical properties confirmed by numerical testing • Some assumptions are rather strict and can/should be relaxed • Parallelization of the CI computations is needed for large problems
Thank you for your attention! Questions?
Bootstrapping - theory • Edgeworth expansions for pdfs • Cornish-Fisher expansion for quantiles (inverting the Edgeworth expansion) • Bootstrapped quantiles possess a similar expansion • Bootstrapping also accounts for the second term in the expansion • (Some) bootstrap CIs are second-order correct (Hall’s book is really detailed on this)