Design & Co-design of Embedded Systems
Distributed System Co-synthesis (2)
Maziar Goudarzi

Today’s Program
• Introduction
• Preliminaries
• Hardware/Software Partitioning
• Distributed System Co-Synthesis (part 2)
References:
• W. Wolf, “Hardware/Software Co-Synthesis Algorithms,” Chapter 2 in Hardware/Software Co-Design: Principles and Practice, J. Staunstrup and W. Wolf, Eds., Kluwer Academic Publishers, 1997.
• W. Wolf, “An architectural co-synthesis algorithm for distributed, embedded computing systems,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 5, no. 2, pp. 218-229, 1997.

Topics
• Introduction
• An Integer Linear Programming Model
• A Heuristic Algorithm
  • On ordinary task graphs
  • On an Object-Oriented model

Co-Synthesis Algorithms: Distributed System Co-Synthesis
Wolf’s Heuristic Algorithm on Ordinary Task Graphs

Wolf’s Heuristic Algorithm
• As ever, the topics of importance:
  • System Specification Language/Model
  • Target Architecture
  • Functionality (Allocation/Scheduling) Quantum
  • Allocation Strategy
  • Scheduling Strategy
  • Cost Estimation
  • Performance Estimation
  • Algorithm Details

Wolf’s Heuristic Algorithm (cont’d)
• System Specification Language/Model
  • Algorithm input: single-rate task graph
• Target Architecture
  • Heterogeneous multiprocessor architecture
• Allocation
  • Primal approach: performance is the major objective
• Scheduling
  • ?
• Functionality Quantum
  • Processes in a single-rate task graph

Wolf’s Heuristic Algorithm (cont’d)
• Performance Estimation
  • Component Technology Library
  • The run-time of each process on each available PE is assumed to be known
• Cost Estimation
  • Component Technology Library
  • $\text{Total Cost} = \sum_i \text{Cost}(\text{PE}_i) + \sum_j \text{Cost}(\text{Device}_j) + \sum_k \text{Cost}(\text{Comm. Channel}_k)$
• Algorithm Details

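As a minimal sketch of the cost estimate above, the total cost is the sum of the library prices of the selected PEs, devices, and communication channels. The component names and prices below are made up for illustration and do not come from Wolf's component technology library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str  # hypothetical library entry
    cost: int  # hypothetical price, arbitrary units


def total_cost(pes, devices, channels):
    """Sum of PE, device, and communication-channel costs."""
    return (
        sum(c.cost for c in pes)
        + sum(c.cost for c in devices)
        + sum(c.cost for c in channels)
    )


pes = [Component("cpu_fast", 100), Component("cpu_slow", 40)]
devices = [Component("timer", 5)]
channels = [Component("bus", 10)]
print(total_cost(pes, devices, channels))  # 155
```
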
Wolf’s Heuristic Algorithm Details
• Four major steps in co-design
  • Partitioning: dividing the specification into smaller parts (e.g. processes)
  • Allocation: assigning each process to a multiprocessor node (PE)
  • Scheduling: serializing the processes assigned to each PE
  • Mapping: selecting a particular component for each PE
• Problem: these steps (especially allocation, scheduling, and mapping) have a circular relationship
• Solution: break the loop

Wolf’s Heuristic Algorithm Details (cont’d)
• Wolf:
  • Give an initial allocation
  • Refine it to reduce cost
• Order of satisfying design criteria:
  • Satisfy all deadlines
  • Minimize PE cost
  • Minimize comm. port cost
  • Minimize device cost

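One way to read the ordering above is as a lexicographic comparison: a design that meets its deadlines always beats one that does not, and the cheaper criteria only break ties among the earlier ones. The sketch below shows that reading with Python tuples. Treating the order as strictly lexicographic is an assumption, and the candidate designs and their field names are hypothetical.

```python
def design_key(design):
    """Sort key: feasible designs first, then cheaper PEs, ports, devices."""
    return (
        not design["meets_deadlines"],  # False sorts before True
        design["pe_cost"],
        design["port_cost"],
        design["device_cost"],
    )


# Hypothetical candidate designs with made-up costs.
candidates = [
    {"name": "A", "meets_deadlines": False,
     "pe_cost": 50, "port_cost": 5, "device_cost": 5},
    {"name": "B", "meets_deadlines": True,
     "pe_cost": 120, "port_cost": 10, "device_cost": 8},
    {"name": "C", "meets_deadlines": True,
     "pe_cost": 120, "port_cost": 20, "device_cost": 2},
]

best = min(candidates, key=design_key)
print(best["name"])  # B
```

Design A is the cheapest but misses deadlines, so it loses to both B and C; B then wins over C on port cost even though C has cheaper devices.
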
Wolf’s Heuristic Algorithm Details (cont’d)
• First ignore communication costs; take them into account later
• Steps:
  1. Create an initial feasible solution, and perform an initial scheduling on it.
     • Initial feasible solution: assign each process to a separate PE
  2. Reallocate processes to PEs to minimize total PE cost.
     • Possibly eliminate PEs from the initial feasible solution
  3. Reallocate processes again to minimize the amount of communication required between PEs
  4. Allocate communication channels
  5. Allocate I/O devices (internal or external to PEs)

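Step 1 can be sketched as follows. The slide only says that each process gets its own PE; picking the fastest PE type for each process is an assumption made here so that the starting point has the best chance of meeting deadlines. With one PE per process and communication ignored, no two processes compete for a PE, so each finish time is just the longest path through the task graph. The task graph, PE type names, and run-times are hypothetical.

```python
def initial_allocation(run_time):
    """Give every process its own PE, using the fastest PE type for it.

    run_time[process][pe_type] is the known execution time.
    """
    return {p: min(times, key=times.get) for p, times in run_time.items()}


def finish_times(preds, run_time, alloc):
    """Finish time of each process when every process has a private PE.

    preds[process] lists the processes that must complete first.
    """
    memo = {}

    def finish(p):
        if p not in memo:
            start = max((finish(q) for q in preds[p]), default=0)
            memo[p] = start + run_time[p][alloc[p]]
        return memo[p]

    return {p: finish(p) for p in preds}


# Hypothetical single-rate task graph: a -> b, a -> c, (b, c) -> d
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
run_time = {
    "a": {"fast": 2, "slow": 5},
    "b": {"fast": 3, "slow": 4},
    "c": {"fast": 1, "slow": 6},
    "d": {"fast": 2, "slow": 3},
}

alloc = initial_allocation(run_time)
times = finish_times(preds, run_time, alloc)
print(times["d"])  # 7
```

If the finish time of the last process exceeds the deadline even here, no reallocation onto fewer PEs can fix it, which is why this solution is a natural feasibility check before cost reduction starts.
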
Wolf’s Heuristic Algorithm Details (cont’d)
• The most important step: step 2, initial reallocation
  • Reason: PE cost is the dominant hardware cost
• Initial reallocation
  1. PE cost reduction:
     1.1 Scan the PEs, starting with the least-utilized PE.
     1.2 Try to reallocate that PE’s processes to other existing PEs.
     1.3 If no process is left on the PE, eliminate it; otherwise, replace the PE with a suitable lower-cost one.
  2. Pair-wise merge: merge a pair of PEs into a single, more powerful one
  3. Load balancing

Wolf’s Heuristic Algorithm Details (cont’d)
• Initial reallocation (cont’d)
  • The “PE cost reduction” phase tries to reallocate multiple processes at a time
  • The three phases above are repeated for as long as possible

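The PE cost reduction phase (steps 1.1 to 1.3) can be sketched roughly as below. This is a simplified illustration, not Wolf's implementation: it assumes a single PE type, uses a utilization bound as a stand-in for real rescheduling against deadlines, picks the fullest PE that still has room as the move target, and leaves out the replacement of a non-empty PE by a cheaper one. All names and numbers are hypothetical.

```python
def utilization(procs):
    """Total utilization (in percent) of the processes on one PE."""
    return sum(procs.values())


def reduce_pe_cost(alloc, capacity=100):
    """Try to empty PEs, starting with the least-utilized one.

    alloc maps PE name -> {process name: utilization in percent}.
    Returns a new allocation with emptied PEs eliminated.
    """
    alloc = {pe: dict(procs) for pe, procs in alloc.items()}
    # 1.1 Scan the PEs, least-utilized first (order fixed at the start).
    for pe in sorted(alloc, key=lambda p: utilization(alloc[p])):
        # 1.2 Try to move each process, largest first, to another PE in use.
        for proc, u in sorted(alloc[pe].items(), key=lambda kv: -kv[1]):
            targets = [
                q for q in alloc
                if q != pe and alloc[q]
                and utilization(alloc[q]) + u <= capacity
            ]
            if targets:
                best = max(targets, key=lambda q: utilization(alloc[q]))
                alloc[best][proc] = u
                del alloc[pe][proc]
    # 1.3 Eliminate PEs that have no process left.
    return {pe: procs for pe, procs in alloc.items() if procs}


start = {"pe1": {"a": 20}, "pe2": {"b": 30}, "pe3": {"c": 60}}
result = reduce_pe_cost(start)
print(len(start), "->", len(result))  # 3 -> 2
```

Several processes can move in one pass over a PE, which mirrors the multiple-move idea mentioned above; a real implementation would reschedule after the moves and check deadlines instead of the utilization bound.
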
Wolf’s Heuristic Algorithm: Experimental Results

Wolf’s Heuristic Algorithm: Experimental Results (cont’d)
• Finds optimal solutions for most of the examples that were also solved by ILP
• Finds near-optimal solutions for the remaining examples
• Showed good results on larger examples
• Requires very little run-time
  • Due to the multiple-move strategy during the PE cost minimization phase

Co-Synthesis Algorithms: Distributed System Co-Synthesis
Wolf’s Heuristic Algorithm for Object-Oriented Models

Introduction
• Target
  • Co-synthesis of a distributed system from an object-oriented specification
• Significance
  • OO is a promising approach to designing embedded systems at the electronic system level (ESL)
Reference: W. Wolf, “Object-Oriented Co-Synthesis of Distributed Embedded Systems,” ACM Transactions on Design Automation of Electronic Systems, pp. 301-314, 1996.

OO Co-Synthesis Algorithm
• Again, our eight topics
  • System Specification Language/Model
  • Target Architecture
  • Functionality (Allocation/Scheduling) Quantum
  • Allocation Strategy
  • Scheduling Strategy
  • Cost Estimation
  • Performance Estimation
  • Algorithm Details

OO Co-Synthesis Algorithm (cont’d)
• System Specification Model/Language
  • An object-oriented specification as input
  • Method dataflow graph as model
[Figure: example specification with three objects. Object O1: method m1, variables v1, v2. Object O2: method m4, variables v10, v20. Object O3: method m2, variables v2, v3; method m3, variables v8, v9.]

OO Co-Synthesis Algorithm (cont’d)
• Target Architecture
  • Distributed system
  • An arbitrary-topology network of PEs
• Functionality Quantum
  • Methods of objects in an OO specification
  • As far as possible, keeps all methods of an object together
  • Partitioning is done during algorithm execution

OO Co-Synthesis Algorithm (cont’d)
• Cost and Performance Estimation
  • Pre-specified
  • A technology description of the available components is input to the algorithm
• Allocation, Scheduling, and Algorithm Details
  • Much like Wolf’s previous heuristic algorithm
  • Includes modifications in order to:
    • handle large sets of methods
    • consider the effects of splitting objects across PEs

OO Co-Synthesis Algorithm (cont’d)
• Allocation, Scheduling, and Algorithm Details
  1. Initial allocation and scheduling. Allocate processes to PEs such that all tasks are placed on PEs fast enough to ensure that all deadlines are met, keeping objects together as much as possible.
  2. Minimize PE cost. Reallocate processes to PEs to minimize PE cost, splitting objects when necessary.
  3. Minimize communication. Reallocate processes again to minimize inter-PE communication, taking into account the traffic generated by splitting objects across PEs.

OO Co-Synthesis Algorithm (cont’d)
• Allocation, … Details (cont’d)
  4. Allocate channels. Allocate communication channels.
  5. Allocate devices. Allocate devices either as on-chip devices or as external devices on communication channels.

OO Co-synthesis Details
• Step 1 (initial allocation)
  • One PE per object
• Step 2 (minimize PE cost)
  • oo_balance_load()
    • Tries to redistribute methods to better balance the system load
  • PE_replacement()
    • Uses a cheaper PE without disturbing the allocation
  • oo_pairwise_merge()
    • Tries to eliminate a PE by moving its methods to other PEs
  • Step 2 is done repeatedly
  • Methods are re-scheduled after each new allocation

OO Co-synthesis Details (cont’d)
Note: This operation may cause “hidden communication”.

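The sketch below illustrates the step 1 rule (one PE per object) and a simple way to detect objects that a later reallocation has split across PEs. It assumes that “hidden communication” refers to traffic between methods of one object that end up on different PEs, which is an interpretation rather than something these slides state. The object and method names loosely follow the example figure; the PE naming scheme and both function names are hypothetical.

```python
def initial_oo_allocation(objects):
    """Step 1: give each object its own PE, so its methods stay together.

    objects maps object name -> list of method names.
    Returns a mapping method name -> PE name.
    """
    return {
        method: f"PE_{obj}"
        for obj, methods in objects.items()
        for method in methods
    }


def split_objects(objects, alloc):
    """Objects whose methods are allocated to more than one PE."""
    return sorted(
        obj for obj, methods in objects.items()
        if len({alloc[m] for m in methods}) > 1
    )


objects = {"O1": ["m1"], "O2": ["m4"], "O3": ["m2", "m3"]}
alloc = initial_oo_allocation(objects)
print(split_objects(objects, alloc))  # []

# Moving m3 off its object's PE, as step 2 may do, splits O3.
moved = dict(alloc)
moved["m3"] = "PE_O1"
print(split_objects(objects, moved))  # ['O3']
```

A split object is a candidate source of extra inter-PE traffic, so step 3 would need to account for it when minimizing communication.
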
OO Co-synthesis Details (cont’d)

OO Co-Synthesis Algorithm (cont’d)
• Experimental Results
  • Algorithm implemented in C++
    • Using the NIH class library
    • 8600 lines of code
  • Executed on an SGI Indigo workstation
  • Algorithm applied to examples from software engineering books on OO design

  Example   #objects/methods   CPU Time
  cfuge     2/3                0.05
  dye       3/15               2.0
  juice     3/4                0.05
  train     5/6                0.05

• Reason for the highest CPU time (dye): it has the most methods, so scheduling is required in each inner loop of step 2, and this implementation had a simple, inefficient scheduler.

OO Co-Synthesis Algorithm (cont’d)
• Main contribution
  • An OO specification is an important aid to automatic partitioning
  • The specification is naturally divided into two levels of granularity
    • A system is composed of objects
    • Objects are composed of data members and methods
• The heuristic:
  • Preserve the specification’s partitioning as much as possible

What we learned today
• Distributed System Co-Synthesis
  • A heuristic approach
    • Non-OO algorithm
    • Customization to OO specifications
  • Heuristic: first minimize the PE cost, since it is the dominant factor