Task Runtime Response Optimization Using Cost-Based Operation Motion
Abdallah Tabbara, Bassam Tabbara, Alberto Sangiovanni-Vincentelli
University of California at Berkeley
EECS 249, Fall 1999

Embedded System
• Electronic “brain” found in many applications, e.g.
  • Consumer electronics
  • Telecommunications
• Consists of:
  • Software: flexibility
  • Hardware: performance
• Application requirements on the system:
  • Small
  • Efficient
  • Power
  • Other metrics

Hardware/Software Co-design
[Figure: co-design flow with the stages Design Specification, Design Representation, HW/SW Partitioning, Synthesis, Evaluation, and Implementation (SW on a microprocessor, HW on an ASIC)]

Problem Statement
• Target: heterogeneous control-dominated embedded system applications
• Functional decomposition captures the design as a network of Finite State Machines extended with data computations (EFSMs) (e.g. Esterel front-end)
• Goal: run-time optimization for the synthesis of each individual task
• No assumptions on how tasks are composed in the whole system

Intermediate Design Representation
• Function Flow Graph (FFG) / C-Like Intermediate Format (CLIF) [Tabbara 99]
• Able to represent EFSM
• Suitable for control and data flow analysis
[Figure: EFSM → FFG → Optimized FFG → SW/HW Synthesis, with Data Flow/Control Optimizations applied to the FFG]

Function Flow Graph (FFG)
• An FFG is a triple G = (V, E, N0) where
  • V is a finite set of nodes
  • E ⊆ V × V is the set of edges; (x, y) ∈ E is an edge from x to y, where x ∈ Pred(y), the set of predecessor nodes of y
  • N0 ∈ V is the start node, corresponding to the EFSM initial state
• Operations are associated with each node N:
  • TESTs performed on the EFSM inputs and internal variables
  • ASSIGNs of computations on the input alphabet (inputs/internal variables) to the EFSM output alphabet (outputs and internal (state) variables)

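The triple G = (V, E, N0) maps naturally onto a small data structure. The following is a minimal sketch, not the authors' FFG/CLIF implementation: the class names `Node` and `FFG`, the string encoding of operations, and the helper `build_example` are all assumptions made for illustration. The sample states S8, S9, S10 are borrowed from the motivating example later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One FFG node with its associated operations (hypothetical encoding)."""
    name: str
    tests: list = field(default_factory=list)    # TEST operations, e.g. "a > b"
    assigns: list = field(default_factory=list)  # ASSIGN operations, e.g. "z = a + b"


@dataclass
class FFG:
    """G = (V, E, N0): nodes, edges, and the start node."""
    nodes: dict  # V: node name -> Node
    edges: set   # E: set of (x, y) pairs, a subset of V x V
    start: str   # N0: name of the start node

    def __post_init__(self):
        # Enforce N0 in V and E as a subset of V x V.
        if self.start not in self.nodes:
            raise ValueError("start node must be in V")
        for x, y in self.edges:
            if x not in self.nodes or y not in self.nodes:
                raise ValueError(f"edge ({x}, {y}) leaves V")

    def pred(self, y):
        """Pred(y): the set of nodes x such that (x, y) is an edge."""
        return {x for (x, z) in self.edges if z == y}


def build_example():
    """Build a three-node chain S8 -> S9 -> S10 (illustrative only)."""
    nodes = {n: Node(n) for n in ("S8", "S9", "S10")}
    nodes["S8"].assigns += ["z = a + b", "a = c"]
    nodes["S9"].assigns += ["x = a + b"]
    return FFG(nodes, {("S8", "S9"), ("S9", "S10")}, "S8")
```

Storing E as a set of pairs keeps the sketch close to the definition; a real tool would more likely keep per-node successor and predecessor lists so that Pred(y) does not require a scan over all edges.
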
EFSM in FFG Form (An Example in State Tree Form)
[Figure: state tree with nodes S0, S1, S2 and F0 through F8]

Previous Work (1)
• Code motion (hoisting) from the software (HLS) domain(s)
  • Avoid unnecessary re-computations at runtime
  • Temporary variables (“registers”) at certain program points
• Must be safe: Main strategy
  • As early as possible [Morel 1979], [Knoop 1992]
• Practice: register pressure
  • Temporary lifetime minimization [Knoop 1994]
• Limitations: not cost based, laborious, and involves addition of “synthetic nodes” in the control structure

Previous Work (2)
• [Hailperin 98]: “cost” extension to [Knoop 94]
  • Metric based on individual operations (+, *, …)
  • No concept of I/O preservation (Embedded Systems)
  • We need task level runtime cost
• [Castellucia 96]: probabilities of inputs/tests guide ordering/restructuring of EFSM nodes in Esterel single automata
• Cost-guided Relaxed Operation Motion
  • Use code motion techniques: safe (correct), fast
  • Guidance from runtime (average/worst-case) statistics

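To make "task level runtime cost" with average and worst-case statistics concrete, here is one possible way such a cost could be computed over an acyclic graph of nodes. This is an illustrative sketch only: the function `task_costs`, the per-node cost table, and the branch probabilities are assumptions, not the cost model used in the talk.

```python
def task_costs(node_cost, succ, start):
    """Return (average, worst_case) cost of one task invocation.

    node_cost: node -> cost of executing that node's operations
    succ:      node -> list of (next_node, probability); missing or
               empty for exit nodes
    Assumes the graph reachable from `start` is acyclic.
    """
    memo = {}

    def visit(n):
        if n in memo:
            return memo[n]
        avg = 0.0
        worst = 0.0
        for nxt, p in succ.get(n, []):
            a, w = visit(nxt)
            avg += p * a          # expected cost of the rest of the path
            worst = max(worst, w)  # most expensive continuation
        memo[n] = (node_cost[n] + avg, node_cost[n] + worst)
        return memo[n]

    return visit(start)


# Hypothetical numbers: S0 branches to S1 (rare, expensive) or S2 (common, cheap).
example_cost = {"S0": 1, "S1": 4, "S2": 2}
example_succ = {"S0": [("S1", 0.25), ("S2", 0.75)]}
average, worst_case = task_costs(example_cost, example_succ, "S0")
```

For these numbers the average is 3.5 and the worst case is 5. Moving an operation between nodes changes `node_cost`, so comparing the two figures before and after a candidate motion is one way a cost-guided pass could decide whether the motion pays off.
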
(Cost-guided) Relaxed Operation Motion
• Our approach (polynomial complexity in FFG nodes) consists of 4 steps:
  • Data Flow and Control Optimizations
  • Reverse Sweep (as early as possible/cost guided)
    • Dead operation addition
    • Normalization
    • Available operation elimination
    • Copy propagation
    • Dead elimination
  • Forward Sweep (register lifetime minimization)
  • Final optimization pass

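The three clean-up passes named in the reverse sweep (available operation elimination, copy propagation, dead elimination) can be sketched on a single straight-line block. What follows are simplified textbook-style local versions, assumed here for illustration; the talk's passes operate on the whole FFG. The statement encoding `(dst, expr)`, the function names, and the choice of live-out variables are all hypothetical. The input block is the S8 sequence after dead addition from the deck's example.

```python
def operands(expr):
    """Variables read by an expression such as ("+", "a", "b") or ("copy", "c")."""
    return expr[1:]


def available_elimination(block):
    """Drop or simplify recomputations of expressions that are still available."""
    avail = {}  # expression -> variable currently holding its value
    out = []
    for dst, expr in block:
        holder = avail.get(expr) if expr[0] != "copy" else None
        if holder == dst:
            continue  # dst already holds this value: drop the recomputation
        if holder is not None:
            expr = ("copy", holder)  # reuse the value instead of recomputing
        out.append((dst, expr))
        # Writing dst invalidates expressions that read dst or are held in dst.
        avail = {e: v for e, v in avail.items()
                 if v != dst and dst not in operands(e)}
        if expr[0] != "copy" and dst not in operands(expr):
            avail[expr] = dst
    return out


def copy_propagation(block):
    """Replace uses of a copied variable by the variable it was copied from."""
    copies = {}  # variable -> source variable
    out = []
    for dst, expr in block:
        expr = (expr[0],) + tuple(copies.get(a, a) for a in operands(expr))
        out.append((dst, expr))
        # Writing dst invalidates copies to or from dst.
        copies = {v: s for v, s in copies.items() if v != dst and s != dst}
        if expr[0] == "copy" and expr[1] != dst:
            copies[dst] = expr[1]
    return out


def dead_elimination(block, live_out):
    """Remove assignments whose result is never used (backward scan)."""
    live = set(live_out)
    out = []
    for dst, expr in reversed(block):
        if dst not in live:
            continue  # value is never read: the assignment is dead
        live.discard(dst)
        live.update(operands(expr))
        out.append((dst, expr))
    out.reverse()
    return out


def normalize(block, live_out):
    """Run the three passes in the order listed on the slide."""
    block = available_elimination(block)
    block = copy_propagation(block)
    return dead_elimination(block, live_out)


AB = ("+", "a", "b")
CB = ("+", "c", "b")
# S8 after dead addition, as it appears in the example slides.
S8 = [("_T30", AB), ("z", ("copy", "_T30")), ("a", ("copy", "c")),
      ("_T30", AB), ("_T29", CB), ("_T30", AB)]
# Assumption: z and a are task variables, and _T30 is read by the next state.
result = normalize(S8, live_out={"z", "a", "_T30"})
```

With these assumptions `result` is the block `_T30 = a + b; z = _T30; a = c; _T30 = c + b;`: the duplicate `_T30 = a + b` is dropped, `a + b` after `a = c` is rewritten to `c + b`, and the unused `_T29` is removed.
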
Motivating Example [Knoop 94]
    …
    S8:  z = a + b;
         a = c;
         goto S9;
    S9:  x = a + b;
         goto S10;
    S10: …

Relaxed Operation Motion

Optimization Pass:
    S8: _T30 = a + b; z = _T30; a = c; goto S9;
    S9: _T30 = a + b; x = _T30; goto S10;

Dead addition:
    S7: _T30 = a + b; y = _T30; _T30 = a + b; _T29 = c + b; _T30 = a + b; goto S8;
    S8: _T30 = a + b; z = _T30; a = c; _T30 = a + b; _T29 = c + b; _T30 = a + b; goto S9;
    S9: _T30 = a + b; x = _T30; _T30 = a + b; a = c; _T29 = c + b; _T30 = a + b; a = c; goto S10;

Relaxed Operation Motion

Available Elimination:
    S8: z = _T30; a = c; _T30 = a + b; goto S9;
    S9: x = _T30; a = c; _T30 = a + b; goto S10;

Copy Propagation:
    _T30 = a + b;
    …
    S8: z = _T30; a = c; _T30 = c + b; goto S9;
    S9: x = _T30; a = c; _T30 = c + b; goto S10;

Optimization Pass:
    S1: _T31 = a + b; H = _T31; _T29 = c + b;
    …
    S8: z = H; H = _T29; goto S9;
    S9: x = H; goto S10;

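A quick way to sanity-check the end result is to execute the original S8/S9 fragment and the final optimized version side by side. The sketch below is a hand translation into Python, assuming that a, b, and c are not modified in the elided states between S1 and S8; the function names and the addition counter are illustrative and not part of the original flow.

```python
def original(a, b, c):
    """S8 and S9 of the motivating example, counting additions on that path."""
    adds = 0
    # S8
    z = a + b
    adds += 1
    a = c
    # S9
    x = a + b
    adds += 1
    return z, x, adds


def optimized(a, b, c):
    """Final optimized version: both additions are hoisted to S1."""
    # S1 (runs earlier, before the S8/S9 path is entered)
    t31 = a + b
    h = t31
    t29 = c + b
    adds = 0  # counts additions executed in S8 and S9 only
    # S8
    z = h
    h = t29
    # S9
    x = h
    return z, x, adds


def same_outputs(values):
    """Check that both versions produce the same z and x for all inputs."""
    for a in values:
        for b in values:
            for c in values:
                if original(a, b, c)[:2] != optimized(a, b, c)[:2]:
                    return False
    return True
```

Both versions yield the same z and x, while the S8/S9 path goes from two additions to none. The optimized version no longer updates a, which is only acceptable if a is not read afterwards; the sketch takes that on trust from the slide.
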
Optimization and Synthesis Flow
[Figure: flow diagram with the blocks FFG (back-end), User Input, Profiling, CDFG (SHIFT), Attributed FFG, Inference Engine or Relaxed Operation Motion, Cost Estimation, Design Optimization, HW/SW Co-Synthesis, Software Compilation producing Object Code (.o), and Hardware Synthesis producing a Netlist]

Work In Progress
• Cost estimation methodology
• Operation motion
  • Guidance
  • Lifetime optimality (forward sweep)
• Results collection on motivating example
  • We already beat [Knoop 94]
• Evaluate with various cost scenarios
• Collect synthesis results

Cost Estimation Using Bayesian Belief Networks (1)
Cost Estimation Using Bayesian Belief Networks (2)

Conclusions
• Novel approach for task runtime response optimization:
  • Code motion from the software domain is limited mostly to loop invariants, with no real task runtime cost guidance
  • Our approach: Relaxed Code Motion
    • Is “natural” in a control/data flow optimization framework
    • Specializes to embedded domain tasks, e.g. I/O preservation across invocations
    • Applies application/environment driven costs to optimization