Evaluation of modeling and solution techniques
• Theoretical
  • worst case, average case, partial orders
  • shortcomings:
    • worst case seldom occurs
    • unrealistic assumptions
• Empirical
  • computational experiments
Principles
• Results presented must be sufficient to justify claims
  • e.g., don’t confuse an algorithm with an implementation
• Sufficient detail to allow reproducibility of results
  • give actual code
  • keep an experimental notebook (see the logging sketch below)
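As one concrete way to keep an "experimental notebook", here is a minimal sketch that logs each run with the metadata needed to rerun it (code version, parameters, seed, machine). The function name record_run and the JSON-lines format are illustrative conventions, not something prescribed by the slides.

```python
import json
import platform
import time


def record_run(solver_name, params, seed, result, runtime_s, log_path="experiments.jsonl"):
    """Append one experiment record, with enough metadata to rerun it, to a JSON-lines log."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "solver": solver_name,           # name plus version/commit of the code actually run
        "params": params,                # full parameter settings, not only the "interesting" ones
        "seed": seed,                    # random seed, so stochastic runs can be replayed
        "machine": platform.platform(),  # hardware/OS context for timing results
        "result": result,                # objective value, SAT/UNSAT, etc.
        "runtime_s": runtime_s,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")


# Example: log a single, hypothetical solver run.
record_run("my_solver v1.2", {"restarts": 100, "heuristic": "dom/wdeg"}, seed=42,
           result="SAT", runtime_s=3.7)
```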
Test problems
• Benchmark sets
  • from practice
  • specially constructed
• Randomly generated
  • simple random (see the generator sketch below)
  • model a real problem
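For the "simple random" option, a sketch of a uniform random 3-SAT generator, one common kind of randomly generated test problem. The particular model and parameter choices here are illustrative assumptions, not taken from the slides.

```python
import random


def random_3sat(num_vars, num_clauses, seed=None):
    """Generate a uniform random 3-SAT instance.

    Each clause picks 3 distinct variables and negates each with probability 1/2.
    Returns a list of clauses; a literal is +v or -v for variable v in 1..num_vars.
    """
    rng = random.Random(seed)
    clauses = []
    for _ in range(num_clauses):
        chosen = rng.sample(range(1, num_vars + 1), 3)
        clause = [v if rng.random() < 0.5 else -v for v in chosen]
        clauses.append(clause)
    return clauses


# "Simple random" generation: 100 variables near the hard clause/variable ratio of about 4.26.
instance = random_3sat(num_vars=100, num_clauses=426, seed=0)
print(instance[:3])
```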
Advantages & disadvantages • Benchmark sets • sometimes representative of real world • expensive to collect, thus sets often small • biased • Randomly generated • can explore entire space of problems • allows statistically valid conclusions • lack of realism
Performance measures
• Efficiency (see the instrumented-search sketch below)
  • CPU time
  • nodes visited
  • constraint checks
• Robustness, scope
  • class of problems that can be effectively solved
• Scalability
  • size of problems
• Accuracy, solution quality
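A minimal sketch of how the efficiency measures above can be instrumented in a generic backtracking search. The CSP representation and counter names are assumptions for illustration, not a prescribed interface.

```python
import time


def backtrack(domains, constraints):
    """Minimal backtracking search over a binary CSP, instrumented with
    CPU time, nodes visited, and constraint checks.

    domains:     {var: list of values}
    constraints: {(x, y): predicate(vx, vy)} for pairs of variables
    """
    stats = {"nodes": 0, "checks": 0}
    variables = list(domains)

    def consistent(assignment, var, value):
        for (x, y), pred in constraints.items():
            if var == x and y in assignment:
                stats["checks"] += 1
                if not pred(value, assignment[y]):
                    return False
            elif var == y and x in assignment:
                stats["checks"] += 1
                if not pred(assignment[x], value):
                    return False
        return True

    def search(assignment):
        if len(assignment) == len(variables):
            return dict(assignment)
        var = variables[len(assignment)]
        for value in domains[var]:
            stats["nodes"] += 1
            if consistent(assignment, var, value):
                assignment[var] = value
                result = search(assignment)
                if result is not None:
                    return result
                del assignment[var]
        return None

    start = time.process_time()          # CPU time, not wall-clock time
    solution = search({})
    stats["cpu_s"] = time.process_time() - start
    return solution, stats


# Tiny example: 3 variables, all pairs must differ (3-colouring a triangle).
doms = {v: [0, 1, 2] for v in "abc"}
cons = {(x, y): (lambda vx, vy: vx != vy) for x, y in [("a", "b"), ("a", "c"), ("b", "c")]}
print(backtrack(doms, cons))
```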
Performance claims
A claim that…
• a new algorithm is feasible and promising
  • needs only preliminary testing on several hand-picked problems
• an algorithm/implementation is better
  • needs a detailed comparison with prominent methods already available, on a broad range of problems
Pitfalls
• Straw algorithms
  • only compare against the “best”
• Easy problems
• Unfair comparisons
  • different languages, programmers, optimization efforts, machines, ...
• Test set tuning
  • e.g., parameter tuning
  • solution: divide into “training” and test sets (see the split sketch below)
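A sketch of the suggested training/test split, assuming instances are simply named items in a list; tune_parameters, run, and report are hypothetical stand-ins for your own tuning and evaluation code.

```python
import random


def train_test_split(instances, test_fraction=0.5, seed=0):
    """Split a set of instances so parameters are tuned on one half
    and all reported results come from the held-out half."""
    rng = random.Random(seed)
    shuffled = instances[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical workflow: tune on the training set, then evaluate once on the test set.
instances = [f"instance_{i:03d}" for i in range(40)]
train, test = train_test_split(instances)

# best_params = tune_parameters(solver, train)   # hypothetical tuning step
# report(run(solver, test, best_params))         # only these numbers go in the paper
```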
Competitive testing vs. scientific testing
• Drawbacks of competitive testing
  • enormous amount of work
  • dictates implementation language
  • tells us which algorithm is better, but not why
  • negative results are considered uninteresting
• Scientific testing
  • experiments designed to contribute to understanding (see the factorial-design sketch below)
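In the spirit of scientific testing, a sketch of a small factorial design that varies a single algorithmic factor against controlled instance parameters; generate_instance and solve are hypothetical placeholders for your own code.

```python
import itertools

# A designed experiment: vary one algorithmic factor (heuristic on/off) against
# controlled instance parameters, rather than racing on a fixed benchmark set.
sizes = [50, 100, 200]            # instance size
densities = [0.1, 0.3, 0.5]       # constraint density
heuristic_on = [False, True]      # the single factor under study
seeds = range(5)                  # replications per cell

results = []
for n, d, h, s in itertools.product(sizes, densities, heuristic_on, seeds):
    # instance = generate_instance(n, d, seed=s)                 # hypothetical generator
    # nodes = solve(instance, heuristic=h).nodes_visited         # hypothetical solver
    # results.append({"n": n, "density": d, "heuristic": h, "seed": s, "nodes": nodes})
    pass

# Analysing how the effect of `heuristic` changes with size and density tells us
# when and why the heuristic helps, not merely which configuration wins overall.
```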
References
• Crowder, H.P., Dembo, R.S., and Mulvey, J.M. “On reporting computational experiments with mathematical software,” ACM Transactions on Mathematical Software, 5:193-203, 1979.
• Jackson, R.H.F., Boggs, P.T., Nash, S.G., and Powell, S. “Guidelines for reporting results of computational experiments,” Mathematical Programming, 49:413-426, 1990.
• Hooker, J.N. “Needed: An empirical science of algorithms,” 1993.
• Hooker, J.N. “Testing heuristics: We have it all wrong,” 1995.