High-Level Synthesis: Scheduling, Allocation, Assignment. Note: Several slides in this lecture are from Prof. Miodrag Potkonjak, UCLA CS
Overview • High Level Synthesis • Scheduling, Allocation and Assignment • Estimations • Transformations
Allocation, Assignment, and Scheduling Techniques Well Understood and Mature
Scheduling and Assignment (figure: operations assigned to control steps)
ASAP: Another Example • Sequence graph and its ASAP schedule
ALAP: Another Example • Sequence graph and its ALAP schedule (latency constraint = 4)
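The ASAP and ALAP schedules above can be sketched in a few lines of Python. This is a minimal illustration, assuming unit-delay operations and no resource constraints; the four-operation graph in the usage example is hypothetical, not the one in the slide's figure.

```python
# Minimal ASAP/ALAP scheduling sketch for a DAG of unit-delay operations.
# succ/pred map each operation to its sets of successors/predecessors.

def asap(succ, pred):
    # As Soon As Possible: schedule each op one step after its latest predecessor
    start = {}
    ready = [v for v in succ if not pred[v]]
    while ready:
        v = ready.pop()
        start[v] = 1 + max((start[p] for p in pred[v]), default=0)
        for s in succ[v]:
            if all(p in start for p in pred[s]):
                ready.append(s)
    return start

def alap(succ, pred, latency):
    # As Late As Possible: schedule each op one step before its earliest successor,
    # working backward from the latency constraint
    start = {}
    ready = [v for v in succ if not succ[v]]
    while ready:
        v = ready.pop()
        start[v] = min((start[s] for s in succ[v]), default=latency + 1) - 1
        for p in pred[v]:
            if all(s in start for s in succ[p]):
                ready.append(p)
    return start

# Hypothetical 4-op sequence graph: a and b feed c; c feeds d
succ = {'a': {'c'}, 'b': {'c'}, 'c': {'d'}, 'd': set()}
pred = {'a': set(), 'b': set(), 'c': {'a', 'b'}, 'd': {'c'}}
s_asap = asap(succ, pred)
s_alap = alap(succ, pred, latency=4)
mobility = {v: s_alap[v] - s_asap[v] for v in succ}
```

The difference `alap - asap` per operation is its mobility (slack), which the later force-directed discussion relies on.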
Observation about ALAP & ASAP • No priority is given to nodes on the critical path • As a result, less critical nodes may be scheduled ahead of critical nodes • This is no problem if hardware is unlimited • However, if resources are limited, the less critical nodes may block the critical nodes and thus produce inferior schedules • List scheduling techniques overcome this problem by using a more global node-selection criterion
List Scheduling Algorithm using Decreasing Criticalness Criterion
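The list-scheduling idea can be sketched as follows. This is an illustrative version only, assuming unit-delay operations and a single resource type with `k` instances, and using the length of the longest path to a sink as the criticality priority; the example graph is hypothetical.

```python
# Sketch of resource-constrained list scheduling with a decreasing-criticalness
# priority: ops on longer paths to the sink are scheduled first.

def list_schedule(succ, pred, k):
    # priority = length of the longest path from the op to a sink
    prio = {}
    def path(v):
        if v not in prio:
            prio[v] = 1 + max((path(s) for s in succ[v]), default=0)
        return prio[v]
    for v in succ:
        path(v)

    start, done, step = {}, set(), 0
    while len(done) < len(succ):
        step += 1
        # ready = unscheduled ops whose predecessors all finished earlier
        ready = [v for v in succ if v not in done and pred[v] <= done]
        ready.sort(key=lambda v: -prio[v])     # most critical first
        for v in ready[:k]:                    # at most k ops per control step
            start[v] = step
        done.update(ready[:k])
    return start

# Hypothetical graph: a and b feed c; c feeds d; e is independent
succ = {'a': {'c'}, 'b': {'c'}, 'c': {'d'}, 'd': set(), 'e': set()}
pred = {'a': set(), 'b': set(), 'c': {'a', 'b'}, 'd': {'c'}, 'e': set()}
schedule = list_schedule(succ, pred, k=2)
```

With two units, the critical ops a and b are picked ahead of the independent op e, so the critical path is not blocked.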
Scheduling • NP-complete problem • Optimal techniques • Heuristics - iterative improvement • Heuristics - constructive • Various versions of the problem: unconstrained minimum latency; resource-constrained minimum latency; timing-constrained • If all resources are identical, the problem reduces to multiprocessor scheduling • The minimum-latency multiprocessor problem is intractable
Scheduling - Optimal Techniques • Integer Linear Programming • Branch and Bound
Integer Linear Programming • Given: an integer-valued m×n matrix A, vectors B = ( b1, b2, … , bm ) and C = ( c1, c2, … , cn ) • Minimize: CTX • Subject to: AX ≥ B, where X = ( x1, x2, … , xn ) is an integer-valued vector
Integer Linear Programming • Problem: For a set of (dependent) computations {t1, t2, ..., tn}, find the minimum number of units needed to complete the execution in k control steps. • Formulation: Let y0 be an integer variable. For each control step i ( 1 ≤ i ≤ k ): define variable xij as xij = 1 if computation tj is executed in the ith control step, and xij = 0 otherwise; define variable yi = xi1 + xi2 + ... + xin
Integer Linear Programming • For each computation dependency "ti has to be done before tj", introduce a constraint: k·x1i + (k−1)·x2i + ... + xki ≥ k·x1j + (k−1)·x2j + ... + xkj + 1 (*) • Minimize: y0 • Subject to: x1i + x2i + ... + xki = 1 for all 1 ≤ i ≤ n; yi ≤ y0 for all 1 ≤ i ≤ k; all dependency constraints of type (*)
An Example • 6 computations c1, c2, c3, c4, c5, c6 • 3 control steps
An Example • Introduce variables: xij for 1 ≤ i ≤ 3, 1 ≤ j ≤ 6; yi = xi1 + xi2 + xi3 + xi4 + xi5 + xi6 for 1 ≤ i ≤ 3; y0 • Dependency constraints: e.g., to execute c1 before c4: 3·x11 + 2·x21 + x31 ≥ 3·x14 + 2·x24 + x34 + 1 • Execution constraints: x1i + x2i + x3i = 1 for 1 ≤ i ≤ 6
An Example • Minimize: y0 • Subject to: yi ≤ y0 for all 1 ≤ i ≤ 3, the dependency constraints, and the execution constraints • One solution: y0 = 2 with x11 = 1, x12 = 1, x23 = 1, x24 = 1, x35 = 1, x36 = 1, and all other xij = 0
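Because the example is tiny (3^6 = 729 assignments), the ILP's optimum can be checked by brute force. The sketch below assumes only the dependency the slide lists explicitly (c1 before c4); the full example may have more edges, so this is an illustration of the model, not a reproduction of the exact instance.

```python
# Brute-force check of the small scheduling ILP: assign 6 computations to
# 3 control steps, minimizing the peak number of units per step (y0).
from itertools import product

deps = [(1, 4)]                       # (i, j): ci must finish before cj starts

best = min(
    max(steps.count(s) for s in (1, 2, 3))            # y0 for this assignment
    for steps in product((1, 2, 3), repeat=6)         # steps[i] = step of c(i+1)
    if all(steps[i - 1] < steps[j - 1] for i, j in deps)
)
print(best)  # → 2, matching the ILP solution y0 = 2
```

Two units suffice, e.g. c1, c2 in step 1; c3, c4 in step 2; c5, c6 in step 3, and no fewer are possible since 6 computations must fit in 3 steps.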
ILP Model of Scheduling • Binary decision variables xil, for i = 0, 1, …, n and l = 1, 2, …, λ + 1 (λ is the latency bound) • xil = 1 iff operation i starts in step l • Start time is unique: Σl xil = 1 for all i
ILP Model of Scheduling (contd.) • Sequencing relationships must be satisfied: for each edge (i, j), Σl l·xjl ≥ Σl l·xil + di • Resource bounds must be met: let the upper bound on the number of resources of type k be ak; then for each type k and each step l, the operations of type k executing in step l must number at most ak
Minimum-latency Scheduling Under Resource Constraints • Let t be the vector whose entries are the operation start times (ti = Σl l·xil) • Formal ILP model: minimize cTt subject to the uniqueness, sequencing, and resource constraints
Example • Two types of resources: multiplier, and ALU (performs addition, subtraction, and comparison) • Both have a 1-cycle execution time
Example (contd.) • Heuristic (list scheduling) gives latency = 4 steps • Use ALAP and ASAP (with no resource constraints) to get bounds on start times • ASAP matches latency of heuristic • so heuristic is optimum, but let us ignore it! • Constraints?
Example (contd.) • Start time is unique
Example (contd.) • Sequencing constraints • note: only non-trivial ones listed • those with more than one possible start time for at least one operation
Example (contd.) • Resource constraints
Example (contd.) • Consider c = [0, 0, …, 1]T • Gives a minimum-latency schedule • Since the sink has no mobility (xn,5 = 1), any feasible schedule is optimum • Consider c = [1, 1, …, 1]T • Finds the earliest start times for all operations • Equivalently, minimizes the sum of the start times of all operations
Example Solution: Optimum Schedule Under Resource Constraint
Example (contd.) • Assume multiplier costs 5 units of area, and ALU costs 1 unit of area • Same uniqueness and sequencing constraints as before • Resource constraints are in terms of unknown variables a1 and a2 • a1 = # of multipliers • a2 = # of ALUs
Example (contd.) • Resource constraints
Example Solution • Minimize cTa = 5·a1 + 1·a2 • Solution with cost 12
Precedence-constrained Multiprocessor Scheduling • All operations done by the same type of resource • intractable problem • intractable even if all operations have unit delay
Scheduling - Iterative Improvement • Kernighan-Lin (deterministic) • Simulated Annealing • Lottery Iterative Improvement • Neural Networks • Genetic Algorithms • Tabu Search
Scheduling - Constructive Techniques • Most Constrained • Least Constraining
Force Directed Scheduling • Goal is to reduce hardware by balancing concurrency • Iterative algorithm; one operation is scheduled per iteration • Information (e.g., speed & area) is fed back into the scheduler
Step 1 • Determine the ASAP and ALAP schedules (figure: ASAP and ALAP schedules of the example data-flow graph)
Step 2 • Determine the time frame of each op • Length of box ~ possible execution cycles • Width of box ~ probability of assignment • Uniform distribution; area assigned = 1 (figure: time frames over C-steps 1-4, with probabilities such as 1/2 and 1/3)
Step 3 • Create distribution graphs • Sum the probabilities of each op type in each C-step • Indicates the concurrency of similar ops: DG(i) = Σ Prob(Op, i) (figures: DG for multiply; DG for add, sub, comp)
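Steps 2 and 3 can be sketched together: each operation contributes a uniform probability over its time frame, and the distribution graph sums those contributions per control step. The time frames below are illustrative assumptions, not the values from the slide's figure.

```python
# Sketch of building a distribution graph (DG) from operation time frames.
# frames: op -> (asap, alap), the inclusive time frame of the op.

def distribution_graph(frames, steps):
    dg = {s: 0.0 for s in steps}
    for (lo, hi) in frames.values():
        p = 1.0 / (hi - lo + 1)          # uniform probability over the frame
        for s in range(lo, hi + 1):
            dg[s] += p                   # DG(s) = sum of Prob(op, s)
    return dg

# Hypothetical multiplies: m1 fixed in step 1, m2 movable over steps 1-2,
# m3 movable over steps 2-3
frames = {'m1': (1, 1), 'm2': (1, 2), 'm3': (2, 3)}
dg = distribution_graph(frames, steps=range(1, 4))
```

Here dg is {1: 1.5, 2: 1.0, 3: 0.5}: step 1 is the most crowded, which is exactly the imbalance the force-directed scheduler tries to smooth out.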
Conditional Statements • Operations in different branches are mutually exclusive • Operations of the same type can be overlapped onto the DG • The probability of the most likely operation is added to the DG (figure: fork/join branches and the resulting DG for add)
Self Forces • Scheduling an operation will affect the overall concurrency • Every operation has a 'self force' for every C-step of its time frame • Analogous to the force of a spring: f = Kx • A desirable scheduling will have negative self force • It will achieve better concurrency (lower potential energy) • Force(i) = DG(i) * x(i), where DG(i) is the current distribution-graph value and x(i) is the change in the operation's probability in step i • Self Force(j) = Σi Force(i), summed over the operation's time frame
Example • Attempt to schedule the multiply in C-step 1 • Self Force(1) = Force(1) + Force(2) = ( DG(1) * x(1) ) + ( DG(2) * x(2) ) = 2.833*(+0.5) + 2.333*(−0.5) = +0.25 • This is positive, so scheduling the multiply in the first C-step would be bad (figure: time frames and the DG for multiply)
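The self-force arithmetic above can be reproduced directly. The DG values 2.833 and 2.333 are the ones quoted on the slide; the op's time frame is steps 1-2 with uniform probability 1/2, so fixing it in a step changes its probability by +1/2 there and by −1/2 elsewhere.

```python
# Sketch of the self-force computation for fixing an op into one step of its
# time frame: Self Force = sum over the frame of DG(s) * (change in probability).

def self_force(dg, frame, target):
    lo, hi = frame
    n = hi - lo + 1                       # frame length
    total = 0.0
    for s in range(lo, hi + 1):
        before = 1.0 / n                  # uniform probability over the frame
        after = 1.0 if s == target else 0.0
        total += dg[s] * (after - before) # Force(s) = DG(s) * x(s)
    return total

dg = {1: 2.833, 2: 2.333}                 # DG values from the slide
sf1 = self_force(dg, frame=(1, 2), target=1)
print(sf1)  # → +0.25, matching the slide
```

Fixing the multiply in C-step 2 instead gives −0.25, a negative (desirable) self force, so the force-directed scheduler would prefer step 2.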
Predecessor & Successor Forces • Scheduling an operation may affect the time frames of other linked operations • This may negate the benefits of the desired assignment • Predecessor/successor forces = sum of the self forces of any implicitly scheduled operations