Random Number Generation Using Low Discrepancy Points
Donald Mango, FCAS, MAAA, Centre Solutions
June 7, 1999
1999 CAS/CARe Reinsurance Seminar, Baltimore, Maryland
What is Discrepancy?
• Large number of points inside a unit hypercube: an n-dimensional hypercube of length 1 on each side
• For any “sub-volume” of the hypercube, Discrepancy = the difference between the proportion of points inside the volume and the volume itself
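As a rough illustration of this definition, the sketch below measures the discrepancy for a single sub-volume: an axis-aligned box anchored at the origin. The function name `box_discrepancy` and both sample point sets are hypothetical, chosen only to show an evenly spread set next to a clustered one.

```python
from math import prod

def box_discrepancy(points, upper):
    """|proportion of points inside [0, upper) - volume of that box|."""
    inside = sum(all(x < u for x, u in zip(p, upper)) for p in points)
    volume = prod(upper)
    return abs(inside / len(points) - volume)

# Hypothetical point sets in the unit square (2 dimensions).
even = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
clustered = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (0.4, 0.4)]

# The box [0, 0.5) x [0, 0.5) has volume 0.25.
print(box_discrepancy(even, (0.5, 0.5)))       # 1 of 4 points inside
print(box_discrepancy(clustered, (0.5, 0.5)))  # 4 of 4 points inside
```

A full discrepancy measure would take the worst case over all such sub-volumes; this sketch checks only one box at a time.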
Low Discrepancy Point Generator:
• Method to generate a set of points which fills out a given n-dimensional unit hypercube, with as little discrepancy as possible
• Attempt to be systematic and efficient in filling a space, given the number of points
• My paper discusses “Faure” Points, just one of many alternatives
• Faure method relies on prime numbers
Other Low Discrepancy Point Generators:
• Named after number theorists: Sobol’, Niederreiter, Halton, Hammersley, ...
• More advanced methods use “irreducible polynomials” -- polynomial equivalents of prime numbers (cannot be factored)
• More complex algorithms
• Less flexible than Faure
Linear Congruential Generator:
• X_{n+1} = (a X_n + c) mod m
• Used in spreadsheets -- RAND() in Excel, @RAND in Lotus
• Sequential
• Cyclical, with a long cycle length or “period”
• “Randomized” in spreadsheets by using a random seed value (X_0) = the system clock
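The recurrence above can be sketched in a few lines. The default constants below are the widely published Numerical Recipes values and are used purely for illustration; they are not claimed to be the parameters Excel or Lotus use. The tiny second generator (a = 5, c = 3, m = 16) is a hypothetical toy chosen so the cycle is short enough to see.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield uniforms in [0, 1) from X_{n+1} = (a * X_n + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

# Same seed, same stream: the generator is sequential and deterministic.
gen = lcg(seed=12345)
sample = [next(gen) for _ in range(5)]
print(sample)

# Toy generator with modulus 16: it visits all 16 states, then repeats.
small = lcg(seed=0, a=5, c=3, m=16)
cycle = [next(small) for _ in range(17)]
print(len(set(cycle[:16])), cycle[16] == cycle[0])
```

Seeding from the clock, as the slide describes, would amount to something like `lcg(seed=int(time.time()))`.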
LDPMAKER Excel 97 Workbook:
• Available in the 1999 Spring Forum section of the CAS Website: www.casact.org/pubs/forum/99spforum/99spftoc.htm
• Includes both:
  • A spreadsheet-only calculation (recalc-driven), and
  • A Visual Basic for Applications (VBA) macro-driven generator (run with a button)
LDPMAKER Excel 97 Workbook:
• “Example” sheet is the spreadsheet-only calculation
• Demonstrates formulas
• Not very flexible
Example: 4 Dimensions, 24 Iterations
• Dimension #1:
  • First, convert each iteration number N to base Prime (= 5)
  • Iteration 1 = 01 (base 5); Iteration 10 = 20 (base 5)
  • F(N, 1) = Faure point (Iteration N, Dimension 1)
  • F(1, 1) = 0/5^2 + 1/5 = 0.20
  • F(10, 1) = 2/5^2 + 0/5 = 0.08
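The Dimension #1 calculation can be sketched as follows. The function names `base_digits` and `first_dimension` are hypothetical; digits are stored least significant first, so the units digit lands on the 1/Prime term, matching the two worked values on the slide.

```python
def base_digits(n, prime):
    """Digits of n in base `prime`, least significant digit first."""
    digits = []
    while n > 0:
        digits.append(n % prime)
        n //= prime
    return digits

def first_dimension(n, prime):
    """F(N, 1): units digit over prime, next digit over prime**2, and so on."""
    return sum(d / prime ** (k + 1) for k, d in enumerate(base_digits(n, prime)))

print(base_digits(10, 5))      # iteration 10 is 20 in base 5
print(first_dimension(1, 5))   # 1/5 + 0/25
print(first_dimension(10, 5))  # 0/5 + 2/25
```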
Example: 4 Dimensions, 24 Iterations
• Dimension #2:
  • Start with the base Prime digits from Dimension #1 and “shuffle” them
  • Using combinations, sum of digits and MOD operator
  • First digit in Dimension #2 = [ Sum (first digit, second digit) from Dimension #1 ] MOD Prime
  • Dimension #1, Iteration 10 = 20 (base 5); Dimension #2, Iteration 10 = 22 (base 5)
  • Formula for F(N, 2) is the same, applied to the shuffled digits
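A minimal sketch of the digit shuffle follows. The slide states the rule for only one digit; the binomial-coefficient weights used here for the remaining digits come from the standard Faure construction and are assumed, not confirmed, to match the workbook. The sketch also assumes that “first digit” means the units digit. The name `shuffle_digits` is hypothetical.

```python
from math import comb

def shuffle_digits(digits, prime):
    """Next dimension's digits: b_j = sum over i >= j of C(i, j) * a_i, mod prime.

    `digits` holds a_0, a_1, ... with the units digit first.
    """
    m = len(digits)
    return [
        sum(comb(i, j) * digits[i] for i in range(j, m)) % prime
        for j in range(m)
    ]

dim1 = [0, 2]                   # iteration 10 = 20 in base 5, units digit first
dim2 = shuffle_digits(dim1, 5)  # units digit becomes (0 + 2) mod 5
print(dim2)

# Same formula as Dimension #1, applied to the shuffled digits.
f_10_2 = sum(d / 5 ** (k + 1) for k, d in enumerate(dim2))
print(f_10_2)
```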
Example: 4 Dimensions, 24 Iterations
• Dimensions #3 and higher:
  • Start with the base Prime digits from the previous dimension and “shuffle” them
  • Formula for F(N, 3), ... is the same
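Putting the dimensions together, one complete Faure point might be generated as in the sketch below. Two assumptions are worth flagging: the prime is taken as the smallest prime at least as large as the number of dimensions (consistent with 4 dimensions using Prime = 5 in the example, but not stated on the slides), and the shuffle uses the standard binomial-coefficient form. The names `smallest_prime_at_least` and `faure_point` are hypothetical.

```python
from math import comb

def smallest_prime_at_least(k):
    """Smallest prime >= k, by trial division (assumed rule for choosing Prime)."""
    n = max(k, 2)
    while any(n % d == 0 for d in range(2, int(n ** 0.5) + 1)):
        n += 1
    return n

def faure_point(n, dims, prime=None):
    """Faure point for iteration n: one coordinate per dimension."""
    prime = prime or smallest_prime_at_least(dims)
    digits = []  # base-prime digits of n, units digit first
    while n > 0:
        digits.append(n % prime)
        n //= prime
    point = []
    for _ in range(dims):
        # Same formula in every dimension: digit k goes over prime**(k + 1).
        point.append(sum(d / prime ** (k + 1) for k, d in enumerate(digits)))
        # Shuffle the previous dimension's digits to get the next dimension's.
        m = len(digits)
        digits = [sum(comb(i, j) * digits[i] for i in range(j, m)) % prime
                  for j in range(m)]
    return point

for n in (1, 10):
    print(n, faure_point(n, dims=4))
```

The spreadsheet and VBA versions in the workbook may organize the calculation differently; this is only one way to express the same digit arithmetic.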
Loops in the Faure Algorithm:
• Fills out the space in ever-larger loops of ever-smaller spacing
• Fills out the space sequentially
• There MAY be an issue with ending the iterations in the middle of one of these loops
• Examples later in the test results...
Visual Basic for Applications (VBA) Version:
• VBA = real programming language
• Recursive algorithm using “dynamic arrays” - arrays which are dimensioned (sized) at run-time
• Generalization of spreadsheet-only calculations
• FAST
Performance Test #1: Sum of Limited Paretos
Table 2 (from Paper) - Pareto Parameters
Table 3: Sum of 2 Limited Paretos
Table 4: Sum of 5 Limited Paretos
Performance Test #2: Sum of Poissons
Table 5: Sum of 2 Poissons (λ = 8)
Table 6: Sum of 5 Poissons (λ = 8)
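The table contents are not reproduced in this transcript, so the following is only a sketch of how a test of this kind might be set up, not the paper's procedure. It maps 2-dimensional Faure points through a Poisson inverse CDF with λ = 8 and compares the simulated mean of the sum with the exact value of 16. The base (2), the run length (2^10 - 1), the choice of the mean as the statistic, and all function names are assumptions made for illustration.

```python
from math import comb, exp

def faure_point(n, dims, prime):
    """Faure point for iteration n (digits stored units digit first)."""
    digits = []
    while n > 0:
        digits.append(n % prime)
        n //= prime
    point = []
    for _ in range(dims):
        point.append(sum(d / prime ** (k + 1) for k, d in enumerate(digits)))
        m = len(digits)
        digits = [sum(comb(i, j) * digits[i] for i in range(j, m)) % prime
                  for j in range(m)]
    return point

def poisson_inverse_cdf(u, lam):
    """Smallest k whose cumulative Poisson probability exceeds u."""
    k, pmf = 0, exp(-lam)
    cdf = pmf
    while cdf <= u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

LAMBDA, DIMS, PRIME = 8, 2, 2  # one dimension per Poisson variable
ITERATIONS = 2 ** 10 - 1       # hypothetical run length

totals = [
    sum(poisson_inverse_cdf(u, LAMBDA) for u in faure_point(n, DIMS, PRIME))
    for n in range(1, ITERATIONS + 1)
]
estimated_mean = sum(totals) / len(totals)
print(estimated_mean, "vs exact mean", DIMS * LAMBDA)
```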
Performance Test #3: Low Frequency Events
Table 7 - Pareto Parameters used for Severity
Table 8: One Event, 5% Prob of Occurrence
Table 9: Two Events, each with 5% Prob of Occurrence
Performance Test #4: 99th Percentile of Sum of Normals
Table 10 - Normal Parameters
Table 11 - 99th Pctle of Sum of 2 Normals
Table 12 - 99th Pctle of Sum of 5 Normals
Performance Test #5: Mixed Bag
• Sum of 5 each from:
  • LogNormal
  • Pareto
  • Uniform
  • Normal
• Testing variability of estimates over 10 runs
Performance Test #5: Mixed Bag
Table 14 - Avg % Error and Std Dev of % Error over 10 runs
Possible Concerns in Using LDPs
• Unused Dimensions:
  • Example: modeling Excess Claims
  • # of Excess claims between 0 and 30
  • requires 30 dimensions
  • If # claims < 30, are the “used” dimensions still filled out with low discrepancy?
  • Dr. Tom?
Possible Concerns in Using LDPs
• Time Series:
  • Example: Probability of 2 consecutive years of loss ratio exceeding 75%
  • How many dimensions is this problem?
  • Can’t use a single dimension of LDPs, because they are sequentially dependent
  • Need to know “over how many years”, then set dimensions
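The sequential dependence can be seen in a small experiment. The sketch below uses a stand-in event rather than the loss-ratio example: a “bad year” is a uniform draw of at least 0.8, so two independent bad years should occur about 4% of the time. The threshold, base 5, the run length of 5^4 - 1, and the function name are all assumptions made for illustration.

```python
from math import comb

def faure_point(n, dims, prime):
    """Faure point for iteration n (digits stored units digit first)."""
    digits = []
    while n > 0:
        digits.append(n % prime)
        n //= prime
    point = []
    for _ in range(dims):
        point.append(sum(d / prime ** (k + 1) for k, d in enumerate(digits)))
        m = len(digits)
        digits = [sum(comb(i, j) * digits[i] for i in range(j, m)) % prime
                  for j in range(m)]
    return point

PRIME, N = 5, 5 ** 4 - 1  # hypothetical run length
THRESHOLD = 0.8           # stand-in event with a 20% annual probability

# Wrong: consecutive iterations of ONE dimension treated as consecutive years.
one_dim = [faure_point(n, 1, PRIME)[0] for n in range(1, N + 1)]
serial_hits = sum(a >= THRESHOLD and b >= THRESHOLD
                  for a, b in zip(one_dim, one_dim[1:]))

# Better: one dimension per year, one iteration per simulated two-year path.
two_dim = [faure_point(n, 2, PRIME) for n in range(1, N + 1)]
path_hits = sum(y1 >= THRESHOLD and y2 >= THRESHOLD for y1, y2 in two_dim)

print(serial_hits / (N - 1), path_hits / N)
```

In a single dimension a value of at least 0.8 is always followed by one below 0.2, because the units digit of the iteration number rolls over from 4 to 0, so the event never occurs twice in a row. With one dimension per year the estimated frequency comes out close to 0.04.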
Possible Concerns in Using LDPs
• Correlation:
  • If two variables are
    • 100% correlated ==> 1 dimension
    • 0% correlated ==> 2 dimensions
    • x% correlated ==> ? dimensions
  • Is promise of “low discrepancy” still fulfilled?
  • How to implement?
Possible Concerns in Using LDPs
• Loop Boundaries:
  • Faure algorithm fills out space sequentially in ever-expanding loops of ever-finer granularity
  • If iteration count does not finish on a loop boundary (depends on Prime), there may be potential bias...
  • See Appendix B of paper
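A small experiment illustrates the loop-boundary concern for Dimension #1 with Prime = 5. It assumes that a loop ends at iteration Prime^k - 1, which is consistent with the 24-iteration example but is an interpretation rather than something the slides state. Exact fractions are used so the comparison is free of rounding; the names `first_dimension` and `summarize` are hypothetical.

```python
from fractions import Fraction

def first_dimension(n, prime):
    """Dimension-1 Faure value for iteration n, as an exact fraction."""
    value, scale = Fraction(0), prime
    while n > 0:
        value += Fraction(n % prime, scale)  # current digit over prime**(k + 1)
        n //= prime
        scale *= prime
    return value

def summarize(count, prime=5):
    """Mean of the first `count` values and how many fall in each fifth of [0, 1)."""
    xs = [first_dimension(n, prime) for n in range(1, count + 1)]
    per_fifth = [sum(1 for x in xs if int(x * prime) == b) for b in range(prime)]
    return sum(xs) / count, per_fifth

print(summarize(24))  # stops on an assumed loop boundary (5**2 - 1)
print(summarize(14))  # stops in the middle of a loop
```

Stopping at 24 iterations gives a mean of exactly 1/2 with the fifths of the unit interval filled almost evenly. Stopping at 14 gives a mean of 33/70, a little below 1/2, which is the kind of bias the slide warns about.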