
Architectural-level Design Exploration for Power Aware System

Dexin Li, October 2000.


Presentation Transcript


  1. Architectural-level Design Exploration for Power Aware System Dexin Li October 2000

  2. Background
     • Component-level low-power design alone cannot meet system-level design goals
     • A system needs not only low-power designs but also power-aware features

  3. Motivation
     • System architecture is important for power-aware system designs
     • Our micro-rover example shows that the bus and bus interfaces consume 25.65% of total system power
     • As more low-power design techniques and low-power components are adopted, architectural optimization becomes more important

  4. Application Example
     • Microrover: robot exploring Mars
     • Solar power: 15W @noon
     • Electronics system:
       • Processor, microcontroller
       • Camera
       • Radio frequency modem
       • Non-volatile memory/hard drive
       • Scientific equipment: APXS & ASI/MET
       • Bus drivers
     • System tasks:
       • Steering and driving
       • Capture pictures and send compressed data
       • Perform scientific experiments, store data on media, and send data

  5. Previous Work
     • Many low-power design techniques
       • Voltage scaling, frequency scaling, clock gating
       • Bus encoding, bus segmentation
       • Algorithm transformation, imprecise arithmetic
     • Other power-aware methodologies
       • PACT: on-demand control of power consumption and performance
       • µAMPS: adaptive energy-aware distributed microsensors

  6. IMPACCT methodology
     • A framework to enable power-aware design
     • Behavioral-level optimization
       • Scheduling, partitioning, migration
     • Architectural-level design exploration
       • Constraint-driven design space exploration
       • Meet power and performance constraints
     • Different views of system behavior lead to different solutions
       • Static: system behaviors are known prior to architecture exploration
       • Mixed (hybrid): prepare solutions for a few scenarios and pick one at run time
       • Dynamic: determine the system behaviors and explore the design space, both at run time

  7. Assumptions for the problem
     • Use COTS components to construct the system
     • Communication
       • Un-directional
       • Components’ stand-alone time is absorbed into communication time (at coarse granularity)
     • Static view of the system behavior

  8. Problem statement
     • Design a tool (algorithm) that produces an architectural topology and a power management scheme that satisfy system-level power, workload, and schedule constraints
     • Input:
       • Component property
       • Workload graph
       • Behavioral schedule
       • System constraints
     • Output:
       • A feasible architecture
       • Power management scheme

  9. Component property
     • Component name
     • Power modes
     • Communication bandwidth
     • Mapping table
       • Performance / power
       • Clock frequency / power
       • Supply voltage / power
     • Bus interface
       • Maximum fanout
       • Root node eligibility (can be root node or not)

  10. Workload graph
     • A representation of communication
     • Vertices: components
     • Edges: workload (data transfer rate)
     • Weight: required communication bandwidth
     [Figure: workload graph over the system components (cpu1, cpu2, mc1, sc1, cam, hd, rf, ...) with numeric edge weights]

  11. Behavioral schedule
     • Mission-level schedules
     • From behavioral scheduling or system specification
     • Communication-active and stand-alone-active
       • Granularity related
       • Here they are assumed to be the same
     [Figure: schedule chart with one row per component (CPU1, MC1, MC2, CPU2, CAM, RF, HD, SC1, SC2), time axis 10 to 50 min]

  12. System constraints
     • Power
       • Maximum power, constant
       • Maximum power, function of time
       • Power range, constant or function of time
     • Protocol
       • Topology: e.g. tree for 1394 bus
       • Communication bandwidth
         • 100, 200, 400Mbps for different 1394 bus components
         • Up to 80% of bandwidth for isochronous transfers
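
The constraint kinds on this slide can be sketched as small checks. This is a minimal illustration, not the tool's actual code: the function names, the `solar_budget` profile, and its numbers are hypothetical; only the 15W figure and the 80% isochronous share come from the slides.

```python
def _bound_at(bound, t_min):
    """A bound is either a constant (W) or a function of time (min) returning W."""
    return bound(t_min) if callable(bound) else bound

def power_ok(power_w, t_min, max_power, min_power=None):
    """Check a power value against a maximum and an optional minimum (a range)."""
    if power_w > _bound_at(max_power, t_min):
        return False
    if min_power is not None and power_w < _bound_at(min_power, t_min):
        return False
    return True

def isochronous_ok(required_mbps, bus_speed_mbps, percent=80):
    """Isochronous traffic may use at most `percent` of the bus bandwidth."""
    return required_mbps * 100 <= percent * bus_speed_mbps

def solar_budget(t_min):
    """Hypothetical time-varying budget: 15 W early on, 12 W after minute 30."""
    return 15.0 if t_min < 30 else 12.0

print(power_ok(14.9, 0, 15.0))            # constant maximum
print(power_ok(13.0, 40, solar_budget))   # maximum as a function of time
print(power_ok(6.0, 0, 15.0, min_power=7.0))  # range with a minimum
print(isochronous_ok(300, 400))
```

Treating a bound as "constant or callable" keeps the three power-constraint variants behind one check, which is one possible design among several.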

  13. Output topology
     • A feasible topology meets all system constraints, if any
     [Figure: tree topology over mc2, sc2, cpu1, mc1, hd, cpu2, rf, cam, sc1]

  14. Output: PM scheme
     • Power management scheme
     • Works together with the output topology
     • Indicates results for each component at each schedule interval:
       • Power mode
       • Power consumption number
       • Required bandwidth
     • Used as feedback to behavioral scheduling or software development

  15. Problem Formulation
     • Tool elements:
       • Component library (CL)
       • Topology generator (TG)
       • Power management inspector (PMI)
       • Power calculator (PC)
     • Starting from the workload graph, TG first generates a graph from which different topologies are extracted. PMI assigns working modes to each component and checks whether they form legal combinations. PC computes the power number for the entire system and checks whether it meets the power constraints. If it does, the problem is solved; if not, different working modes or a different topology are tried and checked again.

  16. Component Model
     • Component composition:
       • Functional unit (FU)
       • Bus interface (BI)
     • Power management model:
       • Layered power modes
       • Mode correspondence between FU and BI
     • Assumption: whenever the FU is working, it is communicating with other components
     [Figure: two components, each an FU plus a BI, joined by the bus media; BI layers labeled Application, LNK, PHY, SUS; FU modes labeled Full-on, sleep, Deep sleep]

  17. Bus Model
     • Sender and receiver
     • Service layers
     • Transfer property (modes, speed, bandwidth)
     • Configuration process
     [Figure: sender and receiver stacks (application, TRS, LNK, PHY) connected through the bus media]

  18. Configuration Management I
     • Power mode constraints:
       • Intra-component constraints
       • Inter-component constraints
     • Example: data is to be transferred from node A to C, so node B can’t be put in SUS mode
     [Figure: nodes A, B, C on the bus media; BI layers Application, LNK, PHY, SUS against FU modes Full-on, sleep, Deep sleep]

  19. Configuration Management II
     • Bandwidth constraints:
       • Example (nodes A, B, C): data is to be transferred from node A to C at 400Mbps, so node B’s transfer speed should be 400Mbps, too
       • Example (nodes A, B, C, D): data transfer rates are A to D: 150Mbps and B to D: 80Mbps, so the bandwidth for C is no less than 230Mbps; for the FireWire bus: 400Mbps
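
The arithmetic in the four-node example can be written out as a short sketch. It assumes, as the slide's numbers suggest, that both flows pass through C and that a node picks the lowest 1394 speed that covers its load; the function names are hypothetical.

```python
# Transfer speeds named in the slides for 1394 bus components (Mbps).
SPEEDS_MBPS = (100, 200, 400)

def required_bandwidth(flows_mbps):
    """Total bandwidth a node must carry: the sum of the flows routed through it."""
    return sum(flows_mbps)

def min_bus_speed(required_mbps, speeds=SPEEDS_MBPS):
    """Lowest available speed that covers the requirement, or None if none does."""
    for speed in sorted(speeds):
        if speed >= required_mbps:
            return speed
    return None

# A -> D at 150 Mbps and B -> D at 80 Mbps both pass through C.
need_c = required_bandwidth([150, 80])
speed_c = min_bus_speed(need_c)
print(need_c, speed_c)  # 230 400
```

The same rule gives 200Mbps for a 120Mbps requirement. The sketch ignores the 80% isochronous limit.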

  20. Low power design techniques I
     • Bus segmentation
       • Improves communication bandwidth
       • Reduces power by disabling unused components or clusters
       • Enables other low-power design techniques
     [Figure: bus segmentation]

  21. Low power design techniques II
     • Clock scaling and voltage scaling
       • Trade-off between performance and power
       • Two or more levels of frequency or voltage to select from
       • Extra hardware needed to implement the techniques

  22. Using low power design techniques
     • Bus segmentation with clock scaling
       • With a clustered bus, we can keep the same bandwidth while lowering the clock frequency used for communication
     [Figure: a 400Mbps bus segmented into a 200Mbps cluster, a 100Mbps cluster, and a suspended cluster]

  23. Algorithm I
     • Creating the Communication-Scheduling Table (CST)
       • Combines schedule and communication information
       • Used to find the constraint set for each component
     • Format:
       • CST: (tuple1, tuple2, ...)
       • Tuple1: (workload_path, interval, required_bandwidth)
     • Example:
       (('cpu2','mc1'),((20,30), 10)),
       (('cpu2','mc2'),((0,15), 20)),
       (('cpu1','cam'),((10,20), 20)), ...
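
One way such a table could be built is sketched below. It makes assumptions the slides do not state: a transfer is taken to be active whenever both endpoints are scheduled at the same time, the `workload` and `schedule` inputs are hypothetical values picked to reproduce the three tuples in the slide, and each entry uses the flat `(workload_path, interval, required_bandwidth)` layout.

```python
def overlap(a, b):
    """Intersection of two (start, end) intervals, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None

def build_cst(workload, schedule):
    """Combine workload edges with component schedules into CST tuples."""
    cst = []
    for (src, dst), bandwidth in workload.items():
        for src_iv in schedule.get(src, []):
            for dst_iv in schedule.get(dst, []):
                interval = overlap(src_iv, dst_iv)
                if interval is not None:
                    cst.append(((src, dst), interval, bandwidth))
    return tuple(cst)

# Hypothetical inputs: edge -> required bandwidth, component -> active intervals (min).
workload = {("cpu2", "mc1"): 10, ("cpu2", "mc2"): 20, ("cpu1", "cam"): 20}
schedule = {
    "cpu1": [(10, 20)], "cpu2": [(0, 30)],
    "mc1": [(20, 30)], "mc2": [(0, 15)], "cam": [(10, 20)],
}
cst = build_cst(workload, schedule)
for entry in cst:
    print(entry)
```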

  24. Algorithm II
     • Building the Constraint Set
       • Find legal modes
         • Working mode
         • Power mode
         • Bandwidth level
       • Constrained by
         • Topology
         • System schedule
         • Communication
     • Example: Cam: ON: LNK; Cam: WL: 120
       • The camera must be working in at least link-layer-on mode
       • Required bandwidth is 120Mbps, thus the bus driver should work at no less than 200Mbps

  25. Algorithm III
     • Enumerating topology
       • Complexity: pick |Et| edges out of |Eg|, where |Et| is the number of edges in the tree and |Eg| is the number of edges in the graph
     • Steps:
       1. Start from workload graph G;
       2. Add some redundant edges to G to get G’;
       3. Extract a valid topology T from G’;
       4. Append T to topology library TL
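
A brute-force version of this enumeration can be sketched as follows. It assumes the valid topologies are spanning trees of G’ (the slides name a tree topology for the 1394 bus) and ignores fanout and root-eligibility limits; the four-node graph and all function names are hypothetical.

```python
from itertools import combinations
from math import comb

def is_spanning_tree(nodes, edges):
    """True if `edges` (exactly len(nodes) - 1 of them) contain no cycle."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # this edge would close a cycle
        parent[root_a] = root_b
    return True  # acyclic with |V| - 1 edges, hence connected

def enumerate_topologies(nodes, graph_edges):
    """Try every choice of |Et| = |V| - 1 edges out of the |Eg| edges of G'."""
    n_tree_edges = len(nodes) - 1
    return [c for c in combinations(graph_edges, n_tree_edges)
            if is_spanning_tree(nodes, c)]

nodes = ["A", "B", "C", "D"]
g_edges = [("A", "C"), ("B", "C"), ("C", "D")]   # workload graph G
g_prime = g_edges + [("A", "B")]                 # G' with one redundant edge
library = enumerate_topologies(nodes, g_prime)
print(comb(len(g_prime), len(nodes) - 1), "candidates,", len(library), "valid trees")
```

The candidate count grows as the binomial coefficient of |Eg| over |Et|, which is why the number of redundant edges added to G has to stay small.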

  26. Enumerating topology

  27. Algorithm IV
     • Traversing power management schemes
       • Group nodes into three classes:
         • Transferring (C1)
         • Passing (C2)
         • Idle (C3)
       • Traverse different combinations
       • Try bus segmentation and clock scaling techniques
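
The three node classes could be derived from a tree topology and the set of active transfers roughly like this. It is a sketch under the assumption that endpoints of a transfer are "transferring", nodes strictly between them on the tree path are "passing", and everything else is "idle"; the adjacency map and function names are hypothetical.

```python
from collections import deque

def tree_path(adj, src, dst):
    """Path from src to dst in a tree given as an adjacency map (breadth-first search)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if dst not in prev:
        return []
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

def classify(adj, transfers):
    """Split nodes into transferring (C1), passing (C2), and idle (C3) sets."""
    transferring, passing = set(), set()
    for src, dst in transfers:
        transferring.update((src, dst))
        passing.update(tree_path(adj, src, dst)[1:-1])
    passing -= transferring  # an endpoint of one transfer stays in C1
    idle = set(adj) - transferring - passing
    return transferring, passing, idle

# Hypothetical tree: B is the hub; data flows from A to C.
adj = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
c1, c2, c3 = classify(adj, [("A", "C")])
print(sorted(c1), sorted(c2), sorted(c3))  # ['A', 'C'] ['B'] ['D']
```

Only the idle class is a candidate for the deepest power modes, which matches the earlier observation that a node forwarding data between two others cannot be suspended.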

  28. Algorithm: top level
     1. Read in component property, workload graph, system schedule, and system constraints
     2. Create Communication-Scheduling Table
     3. Build Constraint Set
     4. Enumerate topologies, building topology library TL
     5. For Ti in TL:
     6.     For interval in schedule:
     7.         Traverse power management schemes PMSi;
     8.         Run power_calculator to find power number P for PMSi
     9.         If P satisfies power_constraint:
     10.            Print “found a feasible solution”, Ti, PMSi
     11.            Stop
     12. Print “can’t find a feasible solution”
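
The search loop can be sketched in Python as below. This is an illustration, not the original tool: it requires a feasible scheme in every schedule interval before accepting a topology (one reading of the steps above), and the mode table, bus overheads, `legal_modes`, and `power_of` are all hypothetical stand-ins for the component library, constraint set, and power calculator.

```python
from itertools import product

def find_feasible(topologies, intervals, legal_modes, power_of, max_power):
    """Return (topology, per-interval schemes) for the first feasible topology, else None."""
    for topo in topologies:
        plan = []
        for interval in intervals:
            options = legal_modes(topo, interval)   # component -> allowed modes
            names = sorted(options)
            chosen = None
            for combo in product(*(options[n] for n in names)):
                scheme = dict(zip(names, combo))
                if power_of(topo, scheme) <= max_power:
                    chosen = scheme
                    break
            if chosen is None:
                break            # no scheme fits this interval: try the next topology
            plan.append(chosen)
        else:
            return topo, plan    # every interval found a feasible scheme
    return None

# Hypothetical data: per-mode power (W) and a per-topology bus overhead (W).
MODE_POWER = {"cpu": {"on": 6.0, "sleep": 1.0}, "cam": {"on": 5.0, "sleep": 0.5}}
BUS_OVERHEAD = {"T1": 1.0, "T2": 0.5}

def legal_modes(topo, active):
    """Active components must be on; the rest may sleep (tried first) or stay on."""
    return {c: ["on"] if c in active else ["sleep", "on"] for c in MODE_POWER}

def power_of(topo, scheme):
    return BUS_OVERHEAD[topo] + sum(MODE_POWER[c][m] for c, m in scheme.items())

intervals = [{"cpu"}, {"cpu", "cam"}]   # sets of active components per interval
result = find_feasible(["T1", "T2"], intervals, legal_modes, power_of, max_power=11.6)
print(result)
```

With these numbers T1 fails in the second interval (12.0 W), so the search falls through to T2.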

  29. Example
     • FireWire 1394 bus architecture
       • Tree topology
       • Transfer speed 100, 200, 400Mbps
     • Application: micro-rover
       • 9 nodes
       • System schedule: walking, taking picture, walking and collecting scientific data
       • Workload graph
     • Power constraints:
       • Constant value
       • Function of time
       • A range with max and min value or function
     [Figure: workload graph and schedule chart (time axis 10 to 50 min) for MC1, MC2, SC1, SC2, CPU1, CPU2, NVM/HD, RF, CAM]

  30. Experimental methodology
     • Constraint-driven design space exploration
     • Pre-given schedule from the behavioral level to break the iteration loop
     • Enlarge the exploration space by adding some edges to the original graph
     • Use both scheduling and communication information as knowledge to build the constraint set
     [Figure: tool flow with blocks labeled schedule, workload, topology iterator, topology, constraint set, power modes traversor, power calculator, component library, solution]

  31. Experiments
     • Experiment 1: MAX_POWER constraint = 15.0W, actual MAX_POWER = 14.9W
     [Figure: workload graph and resulting topology over CAM, SC, HD, CPU, MC, RF]

  32. Experiments
     • MAX_POWER constraint = 14.0W, actual MAX_POWER = 13.94W
     [Figure: topology and schedule chart (time axis 10 to 50 min) over SC, CAM, CPU, HD, MC, RF]

  33. Experiments

  34. Experiments

  35. Experimental Results
     [Figure: power (W, axis 7 to 15) versus time (min, axis 10 to 60) for CPU1, CPU2, HD, MC1, MC2, RF, CAM, SC1, SC2, with the power constraints marked]

  36. [Figure: topology over sc1, sc2, cam, hd, mc2, cpu1, mc1, rf, cpu2]

  37. [Figure: topology over cam, sc1, sc2, cpu2, hd, mc2, cpu1, mc1, rf]

  38. Summary and future work
     • A tool to explore the design space for power-aware architecture
     • Meets different kinds of power constraints
     • Incorporates low-power design techniques
     • Interacts with behavioral scheduling to refine the solution
     • Future work: hybrid and dynamic exploration

  39. Algorithm
     1. Read in component property, communication graph, system schedule, and system constraints;
     2. Construct searching graph (SG); if |SG| > Max_SG then stop;
     3. Construct schedule intervals Si;
     4. Enumerate all the topologies Ti from searching graph SG;
     5. For each Ti do:
     6.     If Ti is topologically illegal then next Ti;
     7.     Build configuration constraints set (CCS) for each component;
     8.     Initialize first schedule interval S1, all components in Full-on modes;
     9.     For each Si do:
     10.        If (Si != S1) copy power modes sets (PMSi) from previous interval;
     11.        While (PMSi not exhausted):
     12.            If PMSi is legal then run power_calculator:
     13.                If system power satisfies power constraints then next Si;
     14.                Else next Ti;
     15.            Else find next PMSi;
     16.        Next Ti;
     17.    Print “find a solution:”; output Ti, PMS; stop
     18. Go to step 2;
