Lic Presentation Memory Aware Task Assignment and Scheduling for Multiprocessor Embedded Systems Radoslaw Szymanek / Embedded System Design Radoslaw.Szymanek@cs.lth.se http://www.cs.lth.se/home/Radoslaw_Szymanek
Outline • Introduction • Problem Formulation and Motivational Example • CLP Introduction • CLP Modeling • Optimization Heuristic and Experimental Results • Conclusions
System Level Synthesis (SLS) • Multiprocessor embedded systems are designed using CPUs, ASICs, buses, and interconnection links • The application areas range from signal and image processing to multimedia and telecommunication • The application is represented as a task graph • The main design activities are task assignment and scheduling for a given architecture • Memory constraints (code and data memory)
SLS with memory constraints [Figure: target architecture (components P1, P2, P3, A1, bus B1, links L1 and L2, each processing unit with ROM and/or RAM) and annotated task graph]
Problem Assumptions and Formulation • Data-dominated application represented as a directed bipartite acyclic task graph • Each task is annotated with its execution time and its code and data memory requirements • Heterogeneous architecture • Both tasks and communications are atomic and must be performed in one step • Find a good CLP model • Find a good heuristic for memory-constrained, time-minimizing task assignment and scheduling that satisfies all constraints
Motivation • SoC multiprocessor architectures • Co-design methodology needs tool support • Memory consideration to decrease cost and power consumption • System Level design for fast evaluation
Motivating example (memory) [Figure: task graph, architecture (P1, P2, L1), and panels "Schedule" and "Data Memory" plotted over time t for communications C1 to C3 and data DC1 to DC3] Task - 1kB code memory, 4kB data memory, Communication - 2kB data memory
CLP Introduction “Constraint programming represents one of the closest approaches computer science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it.” Eugene C. Freuder CONSTRAINTS, April 1997
CLP Introduction • Relatively young and attractive approach for modeling many types of optimization problems • Many heterogeneous applications of constraint programming exist today • State the decision variables which constitute the solution • State the constraints which must be satisfied by the solution • Search for solutions using knowledge you can derive from the constraints
Constraint properties • may specify partial information: a constraint need not uniquely specify the values of its variables, • non-directional: typically one can infer a constraint on each variable present, • declarative: specify a relationship, not a procedure to enforce this relationship, • additive: the order of imposing constraints does not matter, • rarely independent: typically they share variables.
A simple constraint problem 1. Specify all decision variables and their initial domains Natural language description There are three tasks, namely, T1, T2, and T3. Each of these tasks can execute on any of two available processors, P1 and P2. Tasks T1 and T2 send data to task T3. The tasks should be assigned and scheduled in such a way that the schedule length does not exceed 10 seconds. CLP description TP1, TP2, TP3 :: 1..2, TS1, TS2, TS3 :: 0..10, Cost :: 0..10,
A simple constraint problem 2. Specify all constraints and additional variables The execution time of task T1 is four seconds on processor P1 and two seconds on processor P2. Task T2 requires three and five seconds to complete execution on processors P1 and P2 respectively. Task T3 always needs three seconds for execution. If TP1 = 1 then TD1 = 4, If TP1 = 2 then TD1 = 2, If TP2 = 1 then TD2 = 3, If TP2 = 2 then TD2 = 5, TD3 = 3,
A simple constraint problem Tasks T1 and T2 must execute on different processors. Tasks T1 and T2 send data to task T3. If two communicating tasks are executed on different processors there must be at least one second delay between them so the data can be transferred. The tasks should be assigned and scheduled in such a way that the schedule length does not exceed 10 seconds. TP1 != TP2, If TP1 != TP3 then D1 = 1 else D1 = 0, TS1 + TD1 + D1 <= TS3, […], Cost >= TS1 + TD1, Cost >= TS2 + TD2, Cost >= TS3 + TD3.
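The three-task problem above is small enough to check by exhaustive enumeration. The following is a minimal Python sketch, not the CLP/CHIP model from the slides: it simply tries every value in the stated domains and keeps the cheapest feasible combination. The constraint for T2 (the variable `d2` and the check `ts2 + td2 + d2 <= ts3`) is an assumption taken from the natural language description ("Tasks T1 and T2 send data to task T3"), since the CLP listing elides it; the names `DURATION`, `HORIZON`, and `solve` are hypothetical.

```python
from itertools import product

# Execution times from the slides: DURATION[task][processor] in seconds.
DURATION = {1: {1: 4, 2: 2}, 2: {1: 3, 2: 5}, 3: {1: 3, 2: 3}}
HORIZON = 10  # the schedule length must not exceed 10 seconds


def solve():
    """Enumerate all domain values; return (cost, (TP1, TP2, TP3), (TS1, TS2, TS3))."""
    best = None
    for tp in product((1, 2), repeat=3):              # TP1, TP2, TP3 :: 1..2
        tp1, tp2, tp3 = tp
        if tp1 == tp2:                                # TP1 != TP2
            continue
        td1 = DURATION[1][tp1]
        td2 = DURATION[2][tp2]
        td3 = DURATION[3][tp3]
        d1 = 1 if tp1 != tp3 else 0                   # transfer delay T1 -> T3
        d2 = 1 if tp2 != tp3 else 0                   # assumed analogous delay T2 -> T3
        for ts in product(range(HORIZON + 1), repeat=3):  # TS1, TS2, TS3 :: 0..10
            ts1, ts2, ts3 = ts
            if ts1 + td1 + d1 > ts3:                  # TS1 + TD1 + D1 <= TS3
                continue
            if ts2 + td2 + d2 > ts3:                  # assumed: TS2 + TD2 + D2 <= TS3
                continue
            cost = max(ts1 + td1, ts2 + td2, ts3 + td3)  # smallest Cost satisfying Cost >= ...
            if cost > HORIZON:                        # Cost :: 0..10
                continue
            if best is None or cost < best[0]:
                best = (cost, tp, ts)
    return best


if __name__ == "__main__":
    print(solve())
```

Under these assumptions the enumeration finds a schedule of length 6: T1 on P2 and T2 on P1 both start at time 0, and T3 runs on P1 from time 3. A constraint solver reaches the same answer by propagating the constraints instead of enumerating all combinations.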
Modeling • Constraint Logic Programming (finite domain, CHIP solver) • Global constraints (cumulative, diffn, sequence, etc.) reduce model complexity of the synthesis problem and exploit specific features of the problem • Global constraints are useful for modeling placement problems and graph problems • Problem-specific search heuristic for NP-hard problem
CLP Model Decision variables for task • TS – start time of the task execution • TP – resource on which task is executed • TDP – exact placement of task local data in memory Additional variables for task • TD – task duration • TCM and TDM denote the amount of code and data memory for task execution
CLP Model Decision variables for data • DS – start time of the data communication • DB – resource on which data is communicated • DCP and DPP – exact placement of data in memory of the producer and consumer processor Additional variables for data • DD – data communication duration
CLP Model – Task Requirements [Figure: a task drawn as a rectangle in three panels: a) execution time, b) code memory, c) data memory; labels: TS, TD, TP, TDP, TCM, TDM, PU, CM, DM, time]
CLP Model – Data Requirements [Figure: a data item drawn as a rectangle in three panels: data mem (prod), data mem (cons), communication time; labels: DS, DD, DS+DD, TSp, TSc + TDc, DA, DB, DCP, DPP, DM, CU, time]
Simple Example [Figure: tasks T1, T2, T3 and data D1, D2 mapped onto P1, P2, B1, C1, with data memory rectangles D1_p, D1_c, D1_e, D2_p, D2_c, D2_e, D3_e] Diffn constraint
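The data memory model treats each data item as a rectangle spanning a time interval and an address interval, and diffn requires that no two rectangles overlap. The following Python sketch only checks that condition for fixed values; it is not the CHIP diffn constraint, which also prunes variable domains during search. The names `Rect`, `overlaps`, `diffn_holds`, and `fits_memory` are hypothetical.

```python
from itertools import combinations
from typing import NamedTuple


class Rect(NamedTuple):
    start: int     # time at which the data starts occupying memory (e.g. TS or DS)
    address: int   # placement of the data in memory (e.g. TDP, DCP, or DPP)
    duration: int  # how long the memory is occupied
    size: int      # amount of data memory occupied


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share some (time, address) area."""
    in_time = a.start < b.start + b.duration and b.start < a.start + a.duration
    in_memory = a.address < b.address + b.size and b.address < a.address + a.size
    return in_time and in_memory


def diffn_holds(rects) -> bool:
    """Pairwise non-overlap check, the condition that diffn enforces."""
    return not any(overlaps(a, b) for a, b in combinations(rects, 2))


def fits_memory(rects, memory_size: int) -> bool:
    """Every rectangle must lie inside the processor's data memory."""
    return all(r.address >= 0 and r.address + r.size <= memory_size for r in rects)


if __name__ == "__main__":
    a = Rect(start=0, address=0, duration=4, size=4)
    b = Rect(start=2, address=4, duration=4, size=4)   # same time, different addresses
    c = Rect(start=3, address=2, duration=2, size=2)   # collides with a
    print(diffn_holds([a, b]), diffn_holds([a, b, c]), fits_memory([a, b], 8))
```

In this toy data, `a` and `b` overlap in time but occupy disjoint addresses, so they can coexist in an 8-unit memory, while `c` is placed inside the area already used by `a`.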
Code Memory Constraint [Figure: tasks T1 to T8 placed per processor below the Code Memory Limit; horizontal axis: Processor]
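Code memory is occupied for the whole schedule, so the check reduces to summing the code memory of all tasks assigned to a processor and comparing the sum with that processor's limit. The Python sketch below illustrates that check for a fixed assignment; it assumes one limit per processor, and the function and parameter names are hypothetical.

```python
from collections import defaultdict


def code_memory_ok(assignment, code_mem, limit):
    """Check that the tasks assigned to each processor fit into its code memory.

    assignment: task -> processor (the TP values)
    code_mem:   task -> code memory requirement (the TCM values)
    limit:      processor -> code memory limit (a missing processor counts as 0)
    """
    used = defaultdict(int)
    for task, proc in assignment.items():
        used[proc] += code_mem[task]
    return all(used[proc] <= limit.get(proc, 0) for proc in used)


if __name__ == "__main__":
    assignment = {"T1": "P1", "T2": "P1", "T3": "P2"}
    code_mem = {"T1": 1, "T2": 1, "T3": 1}            # 1 kB per task, as in the motivating example
    print(code_memory_ok(assignment, code_mem, {"P1": 2, "P2": 1}))
    print(code_memory_ok(assignment, code_mem, {"P1": 1, "P2": 1}))
```

The second call fails because two 1 kB tasks are assigned to P1 while only 1 kB is available there.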
Constraint types • precedence constraints • processing resource constraints • communication resource constraints • pipelining constraints • code memory constraints • data memory constraints
Task Assignment and Scheduling Heuristic • Choose a task from the ready task set with min(max(Ti)) to minimize the schedule length • Assign the task to the processor with the minimal implementation cost ci • Schedule communications so that Ti is minimal • Assign data memory • Check: data memory estimate no. 1 holds? (Y/N) • Check: data memory estimate no. 2 holds? (Y/N) • On failure: undo all decisions and choose a task which consumes the most data
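The overall loop (pick a ready task, choose a processor, fix its start time) can be illustrated with a plain earliest-finish-time list scheduler. This Python sketch is a deliberately simplified stand-in, not the heuristic from the slides: it ignores the implementation cost ci, communication resources, both data memory estimates, and the undo step, and it assumes a fixed transfer delay between different processors. All names are hypothetical.

```python
def list_schedule(duration, preds, comm_delay=1):
    """Greedy earliest-finish-time list scheduling.

    duration:   task -> {processor: execution time}
    preds:      task -> list of predecessor tasks
    comm_delay: assumed delay when producer and consumer run on different processors
    Returns task -> (processor, start, finish).
    """
    procs = sorted({p for times in duration.values() for p in times})
    proc_free = {p: 0 for p in procs}   # time at which each processor becomes idle
    placed = {}
    remaining = set(duration)
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in sorted(remaining)
                 if all(q in placed for q in preds.get(t, []))]
        if not ready:
            raise ValueError("task graph contains a cycle")
        best = None
        for t in ready:
            for p in procs:
                data_ready = max(
                    (placed[q][2] + (comm_delay if placed[q][0] != p else 0)
                     for q in preds.get(t, [])),
                    default=0,
                )
                start = max(proc_free[p], data_ready)
                candidate = (start + duration[t][p], t, p, start)
                if best is None or candidate < best:
                    best = candidate    # keep the earliest finishing choice
        finish, t, p, start = best
        placed[t] = (p, start, finish)
        proc_free[p] = finish
        remaining.remove(t)
    return placed


if __name__ == "__main__":
    duration = {1: {1: 4, 2: 2}, 2: {1: 3, 2: 5}, 3: {1: 3, 2: 3}}
    print(list_schedule(duration, preds={3: [1, 2]}))
```

On the three-task example from the CLP introduction this greedy rule places T1 on P2, T2 on P1, and T3 on P1 starting at time 3, for a schedule length of 6. Extending the sketch toward the slide's heuristic would mean adding a memory feasibility check after each placement and backtracking when it fails.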
Execution Cost: Ind = LowTS/PTS – LowCM/PCM (i-th task, n-th processor); ATS = available time slots, ACM = available code memory
Data and Communication Cost i-th task, n-th processor
Estimates • Estimate no. 1, where S (Sn) is the set of tasks already scheduled on a processor (processor Pn), tasks tj are direct successors of task ti, and dij is the amount of data communicated between ti and tj. • Estimate no. 2 uses the global constraint diffn and takes time into account
Synthesis Results - H.261 example [Figure: task graph of the H.261 video coding algorithm with nodes IN, FIR, PRAE, BMA, DCT, Q, IQ, IDCT, REK, FB1, FB2, C; numeric annotations 96 (most), 64, and 553]
Main Contributions • Definition of the extended task assignment and scheduling problem • Inclusion of memory constraints to decrease the cost for data dominated applications • Specialized search heuristic to solve resource constrained task assignment and scheduling • CLP modeling framework to facilitate an efficient, clean, and readable problem definition
Conclusions and Future Work • The synthesis problem is modeled as a constraint satisfaction problem and solved by the proposed heuristic • Good coupling between model and search method gives efficient search space pruning • Memory constraints and pipelined designs are taken into account • Heterogeneous constraints can be modeled in CLP, an important advantage over other approaches • Need for our own constraint engine implementation, approximate solutions, and a mixture of techniques • Need for better lower bounds, problem-specific global constraints, and designer interaction during search
Related Work • J. Madsen, P. Bjorn-Jorgensen, “Embedded System Synthesis under Memory Constraints”, CODES '99 (GA, only RAM) • S. Prakash and A. Parker, “Synthesis of Application-Specific Heterogeneous Multiprocessor Systems”, VLSI Signal Processing, '94 (MILP, no ASICs, optimal)