Integrated Management of Power Aware Computing & Communication Technologies Kickoff review meeting Nader Bagherzadeh, Pai H. Chou, Fadi Kurdahi University of California, Irvine, ECE Dept. DARPA Contract F33615-00-1-1719 September 27, 2000
Agenda • Introduction and overview • Management status, financial, milestones, schedule. • Technical presentation • Task progress • Architecture • Applications • CAD • Lessons learned, challenges, issues. • Questions + action items review.
Outline • Introduction • Program goals • Project overview • Management status • Personnel and teaming plans • Plans and milestones • Financial information • Technical presentation • Background • Technical approach • Status and accomplishments • Current detailed schedule • Program impact and anticipated transitions
Program Goals • Power-aware system-level design • Enhance mission success (time, task) • Rapid customization for different missions • Design tool • Exploration & evaluation • Optimization & specialization • Technique integration • System architecture • Statically configurable • Dynamically adaptive • Use COTS parts & protocols
Technical approach • High-level specification • Separate behavior from architecture • Explicit constraints (timing, power) • Library characterization • System synthesis tool • Source-aware power usage scheduling • Bus topology transformation and communication scheduling • Configurable architecture • Task migration & selective shutdown • Bus segmentation and voltage scaling • Domain knowledge • Encompass mechanical / thermal power • Aware of power supply model
Quad Chart [Figure: design-flow diagram with labels Behavior: behavioral system model (high-level components, composition operators); Architecture: system architecture (parameterizable components, busses, protocols); high-level simulation; functional partitioning & scheduling; mapping; system integration & synthesis; static configuration; dynamic power management] • Innovations • Component-based power-aware design • Exploit off-the-shelf components & protocols • Best price/performance, reliable, cheap to replace • CAD tool for global power policy optimization • Optimal partitioning, scheduling, configuration • Manage entire system, including mechanical & thermal • Power-aware reconfigurable architectures • Reusable platform for many missions • Bus segmentation, voltage / frequency scaling • Schedule (timeline labels: Kickoff, 2Q 00, 2Q 01, 2Q 02) • Year 1 • Static & hybrid optimizations • partitioning / allocation • scheduling • bus segmentation • voltage scaling • COTS component library • FireWire and I2C bus models • Static composition authoring • Architecture definition • High-level simulation • Benchmark identification • Year 2 • Dynamic optimizations • task migration • processor shutdown • bus segmentation • frequency scaling • Parameterizable components library • Generalized bus models • Dynamic reconfiguration authoring • Architecture reconfiguration • Low-level simulation • System benchmarking • Impact • Enhanced mission success • More task for the same power • Dramatic reduction in mission completion time • Cost saving over a variety of missions • Reusable platform & design techniques • Fast turnaround time by configuration, not redesign • Confidence in complex design points • Provably correct functional/power constraints • Retargetable optimization to eliminate overdesign • Power protocol for massive scale
Innovations • Component-based power-aware design • Exploit off-the-shelf components & protocols • COTS offer best price/performance, reliable, cheap to replace • CAD tool for global power policy optimization • Optimal partitioning, scheduling, configuration • Manage entire system, including mechanical & thermal • Power-aware reconfigurable architectures • Reusable platform for many missions • Bus segmentation, voltage / frequency scaling
Impact • Enhanced mission success • More task for the same power • Dramatic reduction in mission completion time • Cost saving over a variety of missions • Reusable platform & design techniques • Fast turnaround time by configuration, not redesign • Confidence in complex design points • Provably correct functional/power constraints • Retargetable optimization to eliminate overdesign • Power protocol for massive scale
Personnel & teaming plans • UC Irvine, Co-PI's - Design tools • Nader Bagherzadeh • Pai Chou • Fadi Kurdahi • UC Irvine, research assistants • Dexin Li • Jinfeng Liu • Afshin Niktash • USC - Component power optimization • Jean-Luc Gaudiot • Seong-Won Lee • JPL - Applications and benchmarking • Nazeeh Aranki • Nikzad “Benny” Toomarian
Previous work • Design tools • System-level: the Chinook HW/SW codesign tool • Architectural synthesis (w/ physical design considerations) • Components • Reconfigurable computing: the MorphoSys Chip • Parameterizable components: PCL • Simultaneous MultiThreading vs. Chip MultiProcessing • Architectural platform • Segmented bus X-2000, Mars Pathfinder • Configurable SMP
Responsibilities • Bagherzadeh, Chou, Kurdahi -- co-PIs • Oversee project operation • Integration into curriculum and related research efforts • Li, Liu, Niktash -- RAs • Development of CAD tools • Modeling of demonstrator examples • Authoring of component / protocol library • JPL • Furnish example specifications • Co-develop optimization techniques • USC • Supporting link to low-level technologies
External collaborations • JPL • X-2000 multi-mission architecture • Mars Pathfinder as baseline • JPL to provide COTS testbed • JPL to evaluate IMPACCT optimizations • USC • Parameterizable components • Low-level power estimation • Consystant Design Technologies (Seattle, WA) • Framework for component-based design • IMPACCT plugins to support power management
Background: MorphoSys project • Reconfigurable processor array • MIPS-like RISC processor • High-bandwidth data interface • 100 MHz clock • 0.35 µm 4-metal CMOS • Software support • Platform for dynamic power management [Figure: MorphoSys block diagram with labels Advanced RISC Processor, MorphoSys Reconfigurable Processor Array, System Bus, Instr./Data Cache (L1), High Bandwidth Data Interface, External Memory (e.g. SDRAM, RDRAM)]
RC Array and Context Memory • Context Memory • 2 blocks • 8 sets in each block • A set controls 1 row or column (SIMD) • 16 contexts in 1 set • Possible to overlap context broadcast with context reloading [Figure: grid of RC cells with a column block and a row block of context memory, 16 contexts per set]
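The context-memory organisation listed on this slide can be checked with a little arithmetic. The sketch below only multiplies out the counts given on the slide; the flat addressing function is a hypothetical layout added for illustration and is not taken from the MorphoSys documentation.

```python
# Counts as listed on the slide.
BLOCKS = 2             # the figure labels them "row block" and "column block"
SETS_PER_BLOCK = 8     # each set controls one row or one column (SIMD)
CONTEXTS_PER_SET = 16

total_contexts = BLOCKS * SETS_PER_BLOCK * CONTEXTS_PER_SET


def context_address(block: int, set_index: int, context: int) -> int:
    """Flatten (block, set, context) into one index. Hypothetical layout."""
    if not (0 <= block < BLOCKS
            and 0 <= set_index < SETS_PER_BLOCK
            and 0 <= context < CONTEXTS_PER_SET):
        raise ValueError("index out of range")
    return (block * SETS_PER_BLOCK + set_index) * CONTEXTS_PER_SET + context
```

Under these counts the context memory holds 256 contexts in total, so the last valid index in this layout is 255.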
Software environment [Figure: MorphoSys software flow with labels App. (C Code); TR_app (a = b + c, p = a + 1); RC Array functions (Z=RC_F(X), W=RC_F(Y)); Configuration context; Context Lib.; mcc; mSched; mLoad; mView; Executable; MuLate, MorphoSim (C++, VHDL); MorphoSys Chip (TinyRISC, RC Array)]
Background on USC's SMT work • High performance processors • Superscalar processor (SSP) • Single chip multiprocessor (CMP) • Very long instruction word (VLIW) • Simultaneous multithreading (SMT) • Performance and power dissipation • High performance requires high power consumption • Recent applications need low-power, high-performance processors
Microarchitectural tradeoffs • Power tradeoffs between different architectures • SMT vs. SSP: • SMT has more modules than SSP • SMT has better performance and consumes more power • SMT vs. CMP: • SMT has better utilization • They have similar performance, but SMT consumes less power • SMT vs. VLIW: • SMT consumes more power • SMT is compatible with conventional architectures • Design of simple SMT • A simplified SMT may consume less power and still have the advantage of TLP • Analysis of architectural features • Power drain of modern processors (control vs. data path)
SMT design methodology • Measuring power consumption of a processor • Checking transitions of signals and module operations • Hardware implementation of the processor simulator • Measuring performance of modules • The contribution of each module to the total performance • Performance-power ratio of each module • Comparison between architectures • Design of a low power processor
Measuring performance • Finding the performance per power of each module • Simulate and measure the performance without a module • Calculate the performance per power for each module • Classify modules if more than two modules cooperate with each other • Find the solution for the low power high performance processor
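The "simulate without a module" step above can be sketched as a small calculation. This is an illustration of the idea only, not USC's actual tooling: the `Measurement` type, the units, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    performance: float  # e.g. instructions per cycle (hypothetical unit)
    power: float        # watts


def performance_per_power(baseline: Measurement, without_module: Measurement) -> float:
    """Performance contributed by a module per watt it costs.

    Both arguments come from simulation runs: one with the full processor,
    one with the module under study removed.
    """
    perf_gain = baseline.performance - without_module.performance
    power_cost = baseline.power - without_module.power
    if power_cost <= 0:
        raise ValueError("removing the module did not reduce power")
    return perf_gain / power_cost


# Hypothetical numbers: removing the module costs 0.5 IPC and saves 2 W.
full = Measurement(performance=2.0, power=10.0)
reduced = Measurement(performance=1.5, power=8.0)
ratio = performance_per_power(full, reduced)
```

Modules with a low ratio are candidates for removal or simplification in a low-power design. Modules that only pay off in combination would need to be measured as a group, as the slide notes.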
Background: Chinook project • Component-based HW/SW codesign framework • Specification, simulation, synthesis • Motivated by IP reuse, system integration • Problem: IP reuse forces modification • Reason: components have hardwired coordination protocols • Approach • Adaptable components • Separate coordination protocols from components • Benefits • Reuse without modification • Enable system-level optimizations
Example protocol: Subsumption • Must handle three cases: • Subsuming, yielding, idle • Hardwired protocol • Generalization: • Adaptable components (by mode mapping) • Separate protocols & components [Figure: subsumption example with labels sensors (joystick, bumper, sonar), decision modules (override, escape, avoid), decision composition, actuators (wheels); each subsumption interface has the modes subsuming (s), yielding (y), and idle (i); state diagram of the Bumper process]
Architectural mapping • Single processor or multiple processors • Multiple mappings to an architecture [Figure: mode manager and modal processes]
Distributed mode managers • Automatically partitioned among processors • Synthesized control communication • Comm. tradeoffs: synchronization, replication [Figure: mode managers and modal processes]
Past missions – Mars Pathfinder “Sojourner” The Mars Pathfinder Microrover Flight Experiment Alpha Proton X-ray Spectrometer (APXS)
Application requirements • System specification • 6 wheel motors • 4 steering motors • System health check • Hazard detection • Power supply • Battery (non-rechargeable) • Solar panel • Power consumption • Digital • Computation, imaging, communication, control • Mechanical • Driving, steering • Thermal • Motors must be heated in low-temperature environment
System-level power budget (Function: Energy Required = Time and Calculation)
• Motor heating, 1 motor at a time: 7.51W-hr = 7.51W x 1hr
• Motor heating, 2 motors at a time: 5.63W-hr = 11.26W x 0.5hr
• Driving (extreme terrain @ -80degC): 6.92W-hr = 13.85W x 0.5hr
• Hazard detection: 1.83W-hr = 7.33W x 0.25hr
• Imaging (3 images @ 2 min/image): 0.45W-hr = 4.5W x 0.1hr
• Image compression (compress 3 images @ 6 min/image): 1.2W-hr = 3.7W x 0.3hr
• 6Mbit communication @ 50min/sol: 5.2W-hr = 6.27W x 0.8hr
• 42, 10 sec health checks during day: 0.63W-hr = 6.27W x 0.1hr
• Remainder of 7 hr daytime CPU operation: 15.0W-hr = 3.7W x 4hr
• WEB heating (as needed): 50W-hr
• Total: 95W-hr
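The budget above is a sum of power x time products, which is easy to recompute. The sketch below uses only the figures on this slide; the pairing of each function with its power, time, and energy value follows the order in which they appear, which is an assumption about the original table layout.

```python
# (function, watts, hours, energy listed on the slide in W-hr)
# WEB heating is listed as a fixed energy with no power x time breakdown.
budget = [
    ("motor heating, 1 motor at a time", 7.51, 1.0, 7.51),
    ("motor heating, 2 motors at a time", 11.26, 0.5, 5.63),
    ("driving (extreme terrain @ -80degC)", 13.85, 0.5, 6.92),
    ("hazard detection", 7.33, 0.25, 1.83),
    ("imaging (3 images @ 2 min/image)", 4.5, 0.1, 0.45),
    ("image compression (3 images @ 6 min/image)", 3.7, 0.3, 1.2),
    ("6Mbit communication @ 50min/sol", 6.27, 0.8, 5.2),
    ("42, 10 sec health checks during day", 6.27, 0.1, 0.63),
    ("remainder of 7 hr daytime CPU operation", 3.7, 4.0, 15.0),
    ("WEB heating (as needed)", None, None, 50.0),
]


def energy_whr(watts, hours, listed):
    """Recompute energy as power x time; fall back to the listed value."""
    return listed if watts is None else watts * hours


computed_total = sum(energy_whr(w, h, listed) for _, w, h, listed in budget)
listed_total = sum(listed for *_, listed in budget)
```

The recomputed products sum to about 93.9 W-hr and the listed energies to about 94.4 W-hr, both slightly under the 95 W-hr figure on the slide. The slide's values appear to be rounded, so small differences of this kind are expected.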
Design issues • Timing constraints • System health check 10s/10min • Heating motor for 5s, 50s prior to driving • Hazard detection 10s – steering 5s – driving 10s • Power management • Low-power electronics alone do not yield significant power savings • No system-level management tool available • Conservative hand-crafted schedule • Serialize all operations to avoid power surges • Long execution time • Solar power wasted
Present missions – Athena/Mars ’03 Rover configuration Pancam/Mini-TES Instrument Arm Cluster : Raman Spectrometer Alpha-Proton-X-Ray Spectrometer (APXS) Mössbauer Spectrometer Microscopic Imager Mini-Corer
Athena/Mars ’03 Rovers - power subsystem • Power utilization: • 38 W = 19 W (CPU&I/O) + 9 W (accel and gyro) + 10 W (wheel motors) for driving • 75 W = 19 W (CPU&I/O) + 55 W (transmission) for orbiter communication • 30 W = 19 W (CPU&I/O) + 10 W (transmission) for lander relay communication • 55 W = 19 W (CPU&I/O) + 33 W (peak motor) for drilling • 29 W = 23 W (CPU&I/O) + 6 W (cameras) required for imaging • 11 W Raman, 1.4 W APXS and 2.3 W for nighttime spectrometer operation • 141 Whr daily for housekeeping engineering • 75 Whr limit for nighttime operations
Present missions – MUSES-CN Asteroid NanoRover • Completely solar powered • Requiring only 1 watt, including an RF telecommunications system for communications between the rover and a lander or small-body orbiter for relay to Earth. • Power source • 500 grams of commercial, non-rechargeable, replaceable lithium batteries, with energy density of 750 joules per gram.
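As a rough worked example of what the battery figures above imply, the sketch below converts the stated mass and energy density into watt-hours and an idealized run time at the 1 W load. It assumes the full rated energy is usable at constant power, and ignores temperature, discharge-rate, and conversion losses, none of which are given on the slide.

```python
BATTERY_MASS_G = 500.0             # from the slide
ENERGY_DENSITY_J_PER_G = 750.0     # from the slide
LOAD_W = 1.0                       # "requiring only 1 watt"

energy_j = BATTERY_MASS_G * ENERGY_DENSITY_J_PER_G
energy_whr = energy_j / 3600.0         # 1 W-hr = 3600 J
ideal_hours = energy_whr / LOAD_W      # upper bound under the stated assumptions
```

This gives 375 kJ, or roughly 104 W-hr, which is about 104 hours at 1 W in the idealized case.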
Power-aware designs • Subsume low power as a special case • Minimize power consumption • Minimal application specific knowledge, limited reconfiguration space • Conservative • Make best use of available power • Use MAX solar power while it's available • Increase parallelism, perform more tasks, reduce mission time • Both MIN and MAX power constraints • Application-specific knowledge • Multiple mission requirement • Adapt to run-time power supply, operating environment
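The idea of scheduling against both a MIN and a MAX power constraint can be illustrated with a small checker. This is a minimal sketch, not the IMPACCT scheduler: the task durations come from the earlier "Design issues" slide, while every wattage, both power bounds, and the assumption that hazard detection may overlap driving are hypothetical.

```python
def power_profile(tasks, horizon):
    """Total power drawn in each 1-second slot; tasks are (start, duration, watts)."""
    profile = [0.0] * horizon
    for start, duration, watts in tasks:
        for t in range(start, start + duration):
            profile[t] += watts
    return profile


def within_bounds(profile, p_min, p_max):
    """True if every slot uses at least p_min (free solar) and at most p_max."""
    return all(p_min <= p <= p_max for p in profile)


P_MIN, P_MAX = 8.0, 16.0  # hypothetical: solar output, and solar plus battery

# Serialized: hazard detection 10 s, then steering 5 s, then driving 10 s.
serial = [(0, 10, 5.0), (10, 5, 8.0), (15, 10, 10.0)]
# Overlapped: the next hazard detection runs while driving.
overlapped = [(0, 5, 8.0), (5, 10, 10.0), (5, 10, 5.0)]

serial_ok = within_bounds(power_profile(serial, 25), P_MIN, P_MAX)
overlapped_ok = within_bounds(power_profile(overlapped, 15), P_MIN, P_MAX)
```

With these made-up numbers the serialized schedule drops below the minimum, leaving solar power unused, while the overlapped schedule stays inside both bounds and finishes in 15 s instead of 25 s.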
System-level power management • Amdahl's law -- extended to power • Component-level improvements must be scaled by % contributions • Synergy between inter-component interactions • Scope of system power model • Digital, mechanical, thermal • Battery model - control power surge • Renewable source - solar panel, etc • Mission-driven tradeoffs • Execution time vs. power saving • Adapt to operating environment
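The "Amdahl's law extended to power" point can be made concrete: a component-level saving only counts in proportion to that component's share of the total. The sketch below uses the driving breakdown from the Athena power subsystem slide (19 W CPU & I/O, 9 W accel and gyro, 10 W wheel motors); the 50% CPU reduction is a hypothetical improvement chosen for illustration.

```python
def system_power_saving(component_watts, reductions):
    """Fraction of total system power saved.

    component_watts: power drawn by each component.
    reductions: fractional reduction per component (0.5 means halved);
                components not listed are unchanged.
    """
    total = sum(component_watts.values())
    saved = sum(watts * reductions.get(name, 0.0)
                for name, watts in component_watts.items())
    return saved / total


driving = {"cpu_io": 19.0, "accel_gyro": 9.0, "wheel_motors": 10.0}
saving = system_power_saving(driving, {"cpu_io": 0.5})
```

Halving CPU and I/O power saves 9.5 W of 38 W, so the system-level saving is 25%, not 50%. Further gains have to come from the mechanical and sensor loads.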
What's needed? • Reconfigurable system architecture • Statically configurable for different missions • Reconfiguration for dynamic power management • Support state-of-the-art power management policies • System-level design tool • Support design space exploration • Take full advantage of COTS components • Optimize mission-specific system configuration • Synthesize system-level power manager • Support simulation for early validation
X2000 avionics system architecture • Symmetric COTS multiprocessors • Low cost component with strong commercial support • Widely accepted specification, design, application and testing • Reduced development cost • Dual system bus architecture • High speed data rate with moderate power • Low speed control with low power • Industry standard bus protocols • FireWire (IEEE 1394) bus • I2C bus • Reconfigurable bus topology
PA system architecture: The NASA X2000 Avionics System [Figure: block diagram with labels high-rate input, camera, symmetric multiprocessor modules, reconfigurable hardware blocks, communication module (CDMA), high-speed bus (e.g. IEEE 1394), low-speed bus (e.g. I2C), bus power controller, microcontroller-directed subnet (power regulations & control, analog telemetry sensors, safety inhibits, valve & pyro drive), altimeter subnet]
Applicable power optimizations • Application level • Scheduling under timing and power constraints • Task partitioning, allocation, migration • Algorithm selection • Architecture level • Bus segmentation / clustering • Communication scheduling • Component level • Voltage / frequency scaling • Power down • X-2000 goals • Digital electronics power: 10x decrease • Analog electronics power: 2x decrease • Computer performance: 10 to 20x increase [Callout: both static & dynamic versions]
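To show why voltage / frequency scaling appears among the component-level optimizations, the sketch below uses the standard first-order CMOS model in which dynamic power is proportional to V^2 * f. This model is a textbook approximation brought in for illustration, not something stated in the slides, and it ignores leakage and the dependence of maximum frequency on voltage.

```python
def relative_dynamic_power(v_scale: float, f_scale: float) -> float:
    """Dynamic power relative to nominal, assuming P is proportional to V^2 * f."""
    return v_scale ** 2 * f_scale


def relative_energy_per_task(v_scale: float, f_scale: float) -> float:
    """Energy for a fixed number of cycles; run time grows as 1 / f_scale."""
    return relative_dynamic_power(v_scale, f_scale) / f_scale


half_v_half_f_power = relative_dynamic_power(0.5, 0.5)
half_v_half_f_energy = relative_energy_per_task(0.5, 0.5)
half_f_only_energy = relative_energy_per_task(1.0, 0.5)
```

Under this model, halving both voltage and frequency cuts power to one eighth and energy per task to one quarter, while halving frequency alone leaves energy per task unchanged and only stretches the execution time.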
The need for a system-level CAD tool • Avoid pitfalls with manual design • Overdesign (too conservative) • Hardwired assumptions in implementation (hard to change/adapt) • System integration (bottleneck in projects) • Scalable methodology • Specification: separation of concerns • Behavior vs. architecture • Policy vs. mechanism • Constraint vs. implementation • Exploration • Framework for technique integration • Rapid feedback • Manage complexity • Knowledge base for component/bus details • Consistent knowledge propagation through design stages
Design tool • Library • Components and bus protocols • Provides power estimation • Defines configuration space • Authoring • Behavioral description, architecture description • Mapping from behavior to architecture • Synthesis • Scheduling, partitioning • Bus segmentation, voltage scaling • Synthesis of power manager with task scheduler • Simulation • High-level: explore design space • Detailed-level: power/performance for a given design point
IMPAC2T overview [Figure: design-flow diagram with labels Behavior: behavioral system model (high-level components, composition operators); Architecture: system architecture (parameterizable components, busses, protocols); high-level simulation; functional partitioning & scheduling; mapping; system integration & synthesis; static configuration; dynamic power management]
Library: low-level components • Supported components • COTS • Parameterizable • Levels of abstraction • Parameterizable • Simulatable • Synthesizable • Reconfigurable [Figure: VHDL code; Bus width = 8; Bus width = 16]
Library: component definition • Component interface • Physical: pin interface • Functional: data and control interface • Power, current, voltage • Power/mode characterization • Mode governs power usage • Restrictions on mode changes allowed • High-level yet refined power estimation • Aggregation • Smaller components combined into larger ones • New external parameters, interfaces, modes
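A library entry of the kind described above, where the mode governs power usage and only some mode changes are allowed, might look like the following. This is a sketch under stated assumptions, not the IMPACCT library format: the `Component` class and its fields are hypothetical, the mode names and powers are borrowed from the FireWire transceiver example in this deck, and the transition restrictions are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    mode_power_mw: dict   # mode name -> power in mW
    allowed: set          # permitted (from_mode, to_mode) pairs
    mode: str             # current mode

    @property
    def power_mw(self) -> float:
        """Power estimate for the current mode."""
        return self.mode_power_mw[self.mode]

    def switch(self, new_mode: str) -> None:
        """Change mode, rejecting transitions the library does not allow."""
        if (self.mode, new_mode) not in self.allowed:
            raise ValueError(f"{self.name}: {self.mode} -> {new_mode} not allowed")
        self.mode = new_mode


phy = Component(
    name="FireWire transceiver",
    mode_power_mw={"Full-on": 400, "Standby": 50, "Crystal-disable": 16},
    # Hypothetical restriction: the crystal must restart via Standby.
    allowed={("Full-on", "Standby"), ("Standby", "Full-on"),
             ("Standby", "Crystal-disable"), ("Crystal-disable", "Standby")},
    mode="Full-on",
)
phy.switch("Standby")
phy.switch("Crystal-disable")
```

Keeping the transition table next to the power table lets a scheduler or synthesized power manager ask both "how much does this mode cost" and "can I get there from here" from a single library entry.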
Example components • Processor: • PowerPC, ARM, Pentium, MIPS • Microcontroller • StrongARM, Intel 8051, Motorola 68HC11, 68332 • Bus controller/transceiver: • FireWire controller & transceiver • I2C bus controller, GPIB • Memory • SRAM • DRAM • Flash memory
Example component definition • FireWire bus transceiver: National Semi CS4103 • Working voltage: 3.3 V • Power modes • Full-on (400mW) • PHY-on (150mW) • Standby (50mW) • CLK-disable (21mW) • Crystal-disable (16mW) • FireWire bus controller: National Semi CS4210 • Working voltage: 3.3 V • Power modes • Full-on (300mW) • Standby (17mW) • Aggregated bus transceiver/controller • Up to ten working modes to play with • Flexibility in power management
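The "up to ten working modes" figure follows from pairing the five transceiver modes with the two controller modes. The sketch below enumerates the pairs using the powers listed on this slide. It assumes that every combination is valid and that the aggregate power is the simple sum of the two parts, neither of which the slide confirms.

```python
from itertools import product

# Mode powers in mW, as listed on the slide.
transceiver_mw = {
    "Full-on": 400, "PHY-on": 150, "Standby": 50,
    "CLK-disable": 21, "Crystal-disable": 16,
}
controller_mw = {"Full-on": 300, "Standby": 17}

# Aggregated component: one mode per (transceiver mode, controller mode) pair.
aggregated_mw = {
    (t_mode, c_mode): transceiver_mw[t_mode] + controller_mw[c_mode]
    for t_mode, c_mode in product(transceiver_mw, controller_mw)
}

lowest = min(aggregated_mw.values())
highest = max(aggregated_mw.values())
```

Under these assumptions the aggregated component spans 33 mW to 700 mW across its ten modes, which is the range a system-level power manager would have to work with.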
Library: bus protocols • Architecture • Parallelism (parallel or serial) • Topology (serial, tree, ring) • Service layers (physical, link, transaction, application) • Communication • Data transfer mode (asynchronous, isochronous) • Data transfer speed • Response mode (need acknowledgement or not) • Arbitration mode • Configuration • Configuration process (deterministic or random) • Reconfigurability (static, hybrid, dynamic) • Power • Power mode (full-on, standby, deep-sleep, shutdown) • Media (cable, wireless, backplane)