Randomized Accuracy-Aware Program Transformations for Efficient Approximate Computations
Sasa Misailovic
Joint work with Zeyuan Allen Zhu, Jonathan Kelner, Martin Rinard
MIT CSAIL
• Nodes represent computation
• Edges represent flow of data
• Functions process individual data
• Reduction nodes aggregate data (avg and min in the diagram)
Function substitution
• Multiple implementations (f1, f2, f3 in the diagram)
• Each has expected error/time
Sampling inputs of reduction nodes
• Reductions consume fewer inputs
Tradeoff Space
[Figure: tradeoff space, axes Time and Error]
Optimal Tradeoff Curve
[Figure: optimal tradeoff curve, axes Time and Error]
Using the tradeoff curve:
• Minimize time subject to error bound
• Minimize error subject to time bound
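The two uses of the curve can be sketched as simple lookups. This is only an illustration: it assumes the curve is stored as a list of (error, time) points, and the names `min_time_for_error` and `min_error_for_time` are hypothetical, not taken from the talk.

```python
from typing import List, Optional, Tuple

# Assumed representation: one (expected error, expected time) pair per curve point.
Point = Tuple[float, float]

def min_time_for_error(curve: List[Point], error_bound: float) -> Optional[Point]:
    """Fastest point whose error does not exceed error_bound, or None."""
    feasible = [p for p in curve if p[0] <= error_bound]
    return min(feasible, key=lambda p: p[1]) if feasible else None

def min_error_for_time(curve: List[Point], time_bound: float) -> Optional[Point]:
    """Most accurate point whose time does not exceed time_bound, or None."""
    feasible = [p for p in curve if p[1] <= time_bound]
    return min(feasible, key=lambda p: p[0]) if feasible else None

# Made-up curve for illustration only.
curve = [(0.0, 10.0), (0.1, 6.0), (0.3, 3.0), (0.6, 1.0)]
```

Both queries scan the same stored points, so one curve serves either optimization direction.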
Our Result
Input: original program and error bound
Stages: transformations, analysis, optimization
Output: optimized program
• Randomized computation
• Guaranteed expected error/time tradeoff
• Approximation of the optimal tradeoff to within a chosen precision
Outline
• Model of Computation
• Tradeoff Curve Construction
• Optimized Program Selection
• Related Work
Model of Computation
[Figure: computation tree. Inputs t, u, v, w feed function nodes f and g; their outputs feed avg reduction nodes, which feed a final min reduction node. Edge labels m, n and 1 give input multiplicities.]
Structure of Computation
• Computation nodes
  • DAGs of functions
  • Functions: arbitrary code
  • Process individual inputs
• Reduction nodes
  • Aggregation functions
  • Average, min, max, sum…
• Computation tree
  • Computation nodes and reduction nodes
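A toy instance of this structure, as a minimal sketch: the functions, the grouping of inputs, and the helper names (`computation_node`, `reduction_node`, `run_tree`) are all assumptions made for illustration.

```python
from statistics import mean
from typing import Callable, Sequence

def computation_node(f: Callable[[float], float],
                     g: Callable[[float], float]) -> Callable[[float], float]:
    """A tiny DAG of two functions applied to one input: g(f(x))."""
    return lambda x: g(f(x))

def reduction_node(aggregate: Callable, inputs: Sequence[float]) -> float:
    """Aggregate many values into one (average, min, max, sum, ...)."""
    return aggregate(inputs)

def run_tree(groups: Sequence[Sequence[float]]) -> float:
    """Computation tree: per-input functions, then avg per group, then min."""
    node = computation_node(lambda x: x * x, lambda y: y + 1.0)
    averages = [reduction_node(mean, [node(x) for x in group]) for group in groups]
    return reduction_node(min, averages)

result = run_tree([[1.0, 2.0, 3.0], [0.0, 2.0]])
```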
Accuracy-Aware Transformations
Function substitution
• Multiple versions of each function
• Each version executes with an assigned probability
• Each version has an error/time specification
Reduction sampling
• Consume only a sample of the inputs
• Each input has a probability of being selected
• Derived error/time specifications for average and for min/max
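Both transformations can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: `substitute` and `sampled_average` are hypothetical names, the `rough` version is invented, and uniform sampling without replacement is an assumption.

```python
import random
from statistics import mean
from typing import Callable, Sequence

def substitute(versions: Sequence[Callable[[float], float]],
               probabilities: Sequence[float],
               rng: random.Random) -> Callable[[float], float]:
    """Function substitution: each call runs one version drawn from probabilities."""
    def randomized(x: float) -> float:
        chosen = rng.choices(versions, weights=probabilities, k=1)[0]
        return chosen(x)
    return randomized

def sampled_average(inputs: Sequence[float], sample_size: int,
                    rng: random.Random) -> float:
    """Reduction sampling: average over a uniform sample instead of all inputs."""
    return mean(rng.sample(list(inputs), sample_size))

rng = random.Random(0)
exact = lambda x: x ** 0.5   # accurate version
rough = lambda x: x / 2.0    # hypothetical cheaper, less accurate version
f = substitute([exact, rough], [0.7, 0.3], rng)
values = [f(16.0) for _ in range(5)]
approx = sampled_average(range(100), 10, rng)
```

Setting a probability to 1.0 or the sample size to the full input count recovers the original, untransformed behavior.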
Program Configuration Vector
Defines the transformed program (specifies the program version)
• Functions: probability of executing each version
• Reductions: number of elements to sample
Find optimal program = find the configuration vector that achieves the optimal accuracy vs. performance tradeoff
Tradeoff Curve Construction: Algorithm
Divide and conquer
• For each subcomputation, construct a tradeoff curve
• Dynamic programming
Properties
• Polynomial time
• Approximation of the true tradeoff curve to within a chosen precision
Computation Node Optimization
Variables:
• Probability of executing each version of the function
• Range: each probability lies in [0, 1]
• Sum: the probabilities sum to 1
Objective: minimize expected time
Constraint: expected error within the error bound
Linear program over the version specifications (E0,T0), (E1,T1), (E2,T2)
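For a single function with independent versions, a linear program of this shape can be solved without an LP library. The sketch below is a simplification under stated assumptions: the slide's exact objective and constraint formulas did not survive extraction, so it assumes "minimize expected time subject to expected error at most the bound", and `best_mix` is a hypothetical name. Because there are only two constraints besides nonnegativity, some optimal solution uses at most two versions, so enumerating single versions and pairs is sufficient.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Spec = Tuple[float, float]  # (expected error E_i, expected time T_i)

def best_mix(specs: List[Spec], error_bound: float) -> Optional[Tuple[float, List[float]]]:
    """Minimize sum(p_i * T_i) s.t. sum(p_i * E_i) <= error_bound, sum(p_i) = 1, p_i >= 0.

    Returns (expected time, probabilities), or None if no version mix is feasible.
    """
    n = len(specs)
    best: Optional[Tuple[float, List[float]]] = None
    # Candidate vertices that use a single version.
    for i, (e, t) in enumerate(specs):
        if e <= error_bound and (best is None or t < best[0]):
            probs = [0.0] * n
            probs[i] = 1.0
            best = (t, probs)
    # Candidate vertices that mix two versions so the error bound is met exactly.
    for i, j in combinations(range(n), 2):
        (e1, t1), (e2, t2) = specs[i], specs[j]
        if e1 == e2:
            continue
        p = (error_bound - e2) / (e1 - e2)  # probability of version i
        if 0.0 < p < 1.0:
            t = p * t1 + (1.0 - p) * t2
            if best is None or t < best[0]:
                probs = [0.0] * n
                probs[i], probs[j] = p, 1.0 - p
                best = (t, probs)
    return best

# Made-up specifications standing in for (E0,T0), (E1,T1), (E2,T2).
time, probs = best_mix([(0.0, 10.0), (0.2, 4.0), (0.5, 1.0)], error_bound=0.1)
```

With these made-up numbers the best mix runs the exact version and the middle version half the time each, which is faster than always running the exact version while keeping the expected error at the bound.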
The Algorithm: Reduction Nodes
Given an error bound, find the number of elements to sample such that the bound is met
• Error and time of the reduction's inputs come from their approximate tradeoff curve
• Univariate optimization problem
• Analogously, minimize error subject to a time bound
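Since the only unknown is the sample size, the search is one-dimensional. A minimal sketch, assuming the expected error never increases as more inputs are sampled and that time grows with the sample size (so the smallest feasible size is the cheapest). The error model `toy_error` is invented for illustration and is not the talk's derived specification.

```python
from typing import Callable, Optional

def smallest_sample_size(m: int, error_bound: float,
                         expected_error: Callable[[int], float]) -> Optional[int]:
    """Smallest s in 1..m with expected_error(s) <= error_bound, or None.

    Assumes expected_error is non-increasing in s, so binary search applies.
    """
    if expected_error(m) > error_bound:
        return None
    lo, hi = 1, m
    while lo < hi:
        mid = (lo + hi) // 2
        if expected_error(mid) <= error_bound:
            hi = mid        # mid is feasible; try fewer samples
        else:
            lo = mid + 1    # mid is infeasible; need more samples
    return lo

def toy_error(s: int, m: int = 100) -> float:
    """Hypothetical error model: shrinks with s and vanishes at s = m."""
    return (1.0 / s - 1.0 / m) ** 0.5

s = smallest_sample_size(100, 0.25, toy_error)
```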
Approximating the Tradeoff Curve
[Figure: tradeoff curve and its discretized approximation, axes Time and Error]
Bidimensional discretization
• Take elements at regular intervals
Randomized configuration
• Execute one configuration with some probability, and another configuration otherwise
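One simple way to picture both ideas is a grid over the (error, time) plane plus a probabilistic mix of two curve points. This is a simplified grid-based thinning chosen for illustration, not the talk's exact discretization procedure; `discretize` and `mix` are hypothetical names.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (expected error, expected time)

def discretize(points: List[Point], error_step: float, time_step: float) -> List[Point]:
    """Keep one representative (the fastest) per grid cell of the error/time plane."""
    cheapest: Dict[Tuple[int, int], Point] = {}
    for error, time in points:
        cell = (int(error // error_step), int(time // time_step))
        if cell not in cheapest or time < cheapest[cell][1]:
            cheapest[cell] = (error, time)
    return sorted(cheapest.values())

def mix(a: Point, b: Point, p: float) -> Point:
    """Expected (error, time) when running configuration a with probability p, else b."""
    return (p * a[0] + (1.0 - p) * b[0], p * a[1] + (1.0 - p) * b[1])

# Made-up points for illustration only.
points = [(0.0, 10.0), (0.1, 9.5), (0.3, 5.0), (0.4, 4.5), (0.6, 1.0)]
thinned = discretize(points, error_step=0.25, time_step=2.0)
blend = mix((0.0, 10.0), (0.6, 1.0), 0.5)
```

Mixing two stored points yields any point on the segment between them in expectation, which is why a thinned set of points can still cover the curve.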
Properties of the Algorithm
Performance
• Running time depends on the number of tradeoff curve points
• Most expensive operation: bidimensional discretization
• The LP solver is called repeatedly, each call over multiple variables
Precision
• Precision decreases linearly with the number of nodes
• To obtain the target overall approximation, set the intermediate precision accordingly
Space
• Storing the tradeoff curves
Obtaining Optimized Programs
Tradeoff curves for all subcomputations:
• Each curve point contains a partial configuration
  • Probability of executing local function nodes
  • Number of inputs to sample from the reduction node
  • Error tolerated by the subcomputation
• Together they give a distribution over optimal program configurations
Incrementally construct the configuration vector:
• For every execution, traverse the tree starting from the root to obtain the full vector
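The traversal can be sketched as follows. The sketch makes several assumptions that are not in the slides: each node has at most one child subcomputation, each curve point stores the error it tolerates from that child, and the point chosen at each node is the fastest feasible one. It also omits the random draw over configurations. The classes `CurvePoint` and `Node` and the function `build_configuration` are hypothetical.

```python
from typing import Dict, List, Optional

class CurvePoint:
    """One tradeoff curve point plus the partial configuration that achieves it."""
    def __init__(self, error: float, time: float, partial_config: dict,
                 child_error: float = 0.0) -> None:
        self.error = error
        self.time = time
        self.partial_config = partial_config  # version probabilities or a sample size
        self.child_error = child_error        # error tolerated by the child subcomputation

class Node:
    def __init__(self, name: str, curve: List[CurvePoint],
                 child: Optional["Node"] = None) -> None:
        self.name = name
        self.curve = curve
        self.child = child

def build_configuration(root: Node, error_bound: float) -> Dict[str, dict]:
    """Walk from the root, picking a curve point per node and passing the budget down."""
    config: Dict[str, dict] = {}
    node, budget = root, error_bound
    while node is not None:
        feasible = [p for p in node.curve if p.error <= budget]
        if not feasible:
            raise ValueError(f"no curve point of {node.name} meets error {budget}")
        point = min(feasible, key=lambda p: p.time)
        config[node.name] = point.partial_config
        node, budget = node.child, point.child_error
    return config

# Made-up curves for a function node "f" feeding a reduction node "avg".
leaf = Node("f", [CurvePoint(0.0, 8.0, {"probs": [1.0, 0.0]}),
                  CurvePoint(0.2, 3.0, {"probs": [0.0, 1.0]})])
root = Node("avg", [CurvePoint(0.0, 80.0, {"sample_size": 10}, child_error=0.0),
                    CurvePoint(0.3, 20.0, {"sample_size": 5}, child_error=0.2)],
            child=leaf)
loose = build_configuration(root, 0.3)
tight = build_configuration(root, 0.1)
```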
Related Work
Accuracy-aware transformations
• Empirical justification: training/test input set [Rinard ICS '06, Rinard OOPSLA '07, Ansel et al. PLDI '09, Misailovic et al. ICSE '10, Baek & Chilimbi PLDI '10, Hoffmann et al. ASPLOS '11, Sidiroglou et al. FSE '11]
• Probabilistic accuracy analysis for loop perforation [Misailovic et al. SAS '11, Chaudhuri et al. FSE '11]
Ensuring safety of transformed programs
• Separating critical and approximate parts of program [Carbin & Rinard ISSTA '10, Sampson et al. PLDI '11]
• Verifying relaxed semantics of programs [Carbin et al. CSAIL-TR '11]
Analytic properties of programs [Majumdar & Saha RTSS '09, Chaudhuri et al. POPL '10, Ivancic et al. MEMOCODE '10, Reed & Pierce ICFP '10, Chaudhuri & Solar-Lezama PLDI '10, Chaudhuri et al. FSE '11]