This paper discusses synthesis techniques for optimizing leakage power in embedded memories on FPGAs, focusing on temporal and spatial information. It presents different operating modes and location assignment schemes to achieve optimal leakage power management.
Leakage Power Reduction of Embedded Memories on FPGAs through Location Assignment Yan Meng, Tim Sherwood, and Ryan Kastner University of California, Santa Barbara Department of Electrical & Computer Engineering ExPRESS Group: http://express.ece.ucsb.edu
Outline • Motivation • The leakage problem of embedded memories on FPGAs is of growing importance • Synthesis techniques for leakage power optimization of embedded memories • Conclusions
Motivation • FPGAs are attractive options • High processing power, flexibility, reconfigurability • Power is becoming critical • Why worry about power? • Heat dissipation, portability • Where does power go in CMOS? • Dynamic power consumption • Switching power due to charging and discharging load capacitors • Short circuit currents between supply rails when both transistors are on during switching • Leakage power consumption
Technology Scaling and Leakage Power Dissipation • Leakage is dominating over dynamic power as technology scales down (improving speed, transistor density and functionality) [Figure axis: technology nodes 130nm, 70nm, 50nm, 30nm, 20nm, 15nm]
On-chip Memory Leakage Control • Why control leakage through on-chip memory? • Huge portion of chip area • Leakage is proportional to the number of transistors • Major source of leakage consumption [Roy01, Hu01, Flautner02, Mudge04] • Caches on microprocessors • 50% 2005 [ITRS 02] • Dynamic reshuffling due to cache replacement policies • Cache hierarchy with data replication • Memories on FPGAs • Configuration SRAMs: not on critical paths, high Vth • Embedded memories • Accesses are usually statically scheduled • Not necessarily part of a memory hierarchy with inclusion
Leakage Power Optimization of Embedded Memories on FPGAs • Embedded memory bits/logic cells > 20x • Leakage problem of embedded memories is of growing importance
Motivating Example • Temporal information • Spatial information
Outline • Motivation • Synthesis techniques for leakage power optimization of embedded memories • Temporal • Temporal + spatial • Conclusions
Temporal Information • Precedence order between variables • Saving power on variables • Keep frequently accessed lines active to ensure high performance • Turn off lines that are not used for a long time • Use low supply voltage to save power for the rest • Using the generalized model to calculate maximal leakage power savings for variables [Meng HPCA’05]
Definitions – Intervals • Live interval • time between two successive accesses to the same variable v within a memory entry • Dead interval • time before the first access or after the last access to a variable [Figure: timeline marking access(v), last use, interval length |Ii|, and a dead interval]
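The two interval definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function name `split_intervals`, the integer time steps, and the `horizon` parameter (total schedule length) are assumptions made for the example.

```python
from typing import List, Tuple


def split_intervals(accesses: List[int], horizon: int) -> Tuple[List[int], List[int]]:
    """Return (live, dead) interval lengths for one variable.

    accesses: time steps at which the variable is accessed (hypothetical input)
    horizon:  total length of the static schedule (hypothetical input)
    """
    if not accesses:
        # Never accessed: the whole schedule is one dead interval.
        return [], [horizon]
    times = sorted(accesses)
    # Live intervals: gaps between successive accesses to the same variable.
    live = [later - earlier for earlier, later in zip(times, times[1:])]
    # Dead intervals: before the first access and after the last access.
    dead = [d for d in (times[0], horizon - times[-1]) if d > 0]
    return live, dead


live, dead = split_intervals([2, 5, 9], horizon=12)
```

Here a variable accessed at steps 2, 5, and 9 in a 12-step schedule has two live intervals and two dead intervals.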
Definitions – Operating Modes • Active mode • Power on the whole line • No power saving • Sleep mode [Roy01, Hu01] • Sleep/“turn off” transistors • Lose data • Drowsy mode [Flautner02, Mudge04] • Use low supply voltage to save power when it is not needed • Preserve data for fast reaccess • Wake up to the high voltage and return data
Choosing Operating Modes [Figure: active, sleep, and drowsy modes versus interval length |Ii|]
Inflection Points • Which mode to apply on each interval? • Active-drowsy inflection point a • The least amount of time drowsy mode needs to save energy • Sleep-drowsy inflection point b • The time where sleep and drowsy modes consume the same amount of energy
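To make the two break-even points concrete, here is a sketch under a deliberately simplified energy model, which is an assumption of this example and not the generalized model cited on the slides. Spending an interval of length t in a mode is assumed to cost a fixed transition energy plus leakage power times t, and wake-up latency is ignored. All parameter names are hypothetical.

```python
from typing import Tuple


def inflection_points(p_active: float, p_drowsy: float, p_sleep: float,
                      e_drowsy: float, e_sleep: float) -> Tuple[float, float]:
    """Return (a, b) for the assumed linear model E_mode(t) = e_mode + p_mode * t.

    a: active-drowsy point, where  p_active * t == e_drowsy + p_drowsy * t
    b: sleep-drowsy point,  where  e_drowsy + p_drowsy * t == e_sleep + p_sleep * t
    """
    if not p_active > p_drowsy > p_sleep:
        raise ValueError("expected p_active > p_drowsy > p_sleep")
    if e_sleep < e_drowsy:
        raise ValueError("expected e_sleep >= e_drowsy")
    a = e_drowsy / (p_active - p_drowsy)
    b = (e_sleep - e_drowsy) / (p_drowsy - p_sleep)
    return a, b


# Illustrative numbers only, not measurements from the study.
a, b = inflection_points(p_active=10.0, p_drowsy=2.0, p_sleep=0.5,
                         e_drowsy=16.0, e_sleep=46.0)
```

With these made-up numbers, drowsy mode starts to pay off after 2 time units and sleep mode overtakes drowsy mode after 20.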
Selecting Operating Modes with Inflection Points • 0 < |I| ≤ a: active interval, active mode • a < |I| ≤ b: drowsy interval, drowsy mode • |I| > b: sleep interval, sleep mode
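The selection rule reads directly as a three-way threshold test. A minimal sketch follows; the function name `select_mode` and the string labels are chosen for illustration, and the inflection points a and b are assumed to be given.

```python
def select_mode(length: float, a: float, b: float) -> str:
    """Pick the operating mode for an interval of the given length.

    a: active-drowsy inflection point
    b: sleep-drowsy inflection point (assumed to satisfy a <= b)
    """
    if length <= 0:
        raise ValueError("interval length must be positive")
    if length <= a:
        return "active"   # too short for drowsy mode to save energy
    if length <= b:
        return "drowsy"   # long enough for drowsy, not long enough for sleep
    return "sleep"        # long interval: turn the line off


modes = [select_mode(n, a=2, b=20) for n in (1, 2, 3, 20, 21)]
```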
Optimal Leakage Management Policy • Oracle knowledge of all interval lengths based on static scheduling • Applying the appropriate operating mode on each variable interval • Obtaining maximal leakage power saving • Formal proof of the optimality [Meng HPCA’05]
Outline • Motivation • Synthesis for leakage power optimization of embedded memories • Temporal • Temporal + spatial • Conclusions
Spatial Information • Spatial layout of data leads to different power-saving potential [Figure: two layouts compared: one variable per entry, and minimal number of entries]
Memory Leakage Optimization Techniques [Figure: six panels of BRAM line versus time t: the state-of-the-art, used-active, min-entry, sleep-dead, drowsy-long, path-place]
Location Assignment Schemes (I): Full-active • The state of the art: no leakage control
Location Assignment Schemes (II): Used-active • Turning off the unused part
Location Assignment Schemes (III): Min-entry • Packing variables into the minimal number of entries and turning off the rest
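The slides do not say how the minimal packing is computed. One standard way to pack lifetimes into the fewest entries is the left-edge (interval partitioning) approach, sketched here as an assumption. The function name, the variable names, and the half-open lifetime convention `[start, end)` are all hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def pack_min_entries(lifetimes: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Assign each variable to a memory entry, reusing entries whose
    previous occupant is no longer live.

    lifetimes: variable name -> (start, end), treated as half-open [start, end)
    returns:   variable name -> entry index
    """
    assignment: Dict[str, int] = {}
    busy: List[Tuple[int, int]] = []  # min-heap of (end time, entry index)
    next_entry = 0
    # Visit variables by increasing start time (the "left edge").
    for name, (start, end) in sorted(lifetimes.items(), key=lambda kv: kv[1][0]):
        if busy and busy[0][0] <= start:
            # The earliest-finishing entry is free again: reuse it.
            _, entry = heapq.heappop(busy)
        else:
            # Every entry in use is still live: open a new one.
            entry = next_entry
            next_entry += 1
        assignment[name] = entry
        heapq.heappush(busy, (end, entry))
    return assignment


layout = pack_min_entries({"a": (0, 4), "b": (1, 3), "c": (4, 6), "d": (3, 5)})
```

In this example four variables fit into two entries; any remaining entries of the memory block could then be turned off.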
Location Assignment Schemes (IV): Sleep-dead • Min entry + sleep dead intervals
Location Assignment Schemes (V): Drowsy-long • Min entry + sleep dead + drowsy long
Extended DAG Modeling • Temporal information • + Spatial information [Figure: extended DAG from start to end with interval vertices I1, I2, I3 (weights w1, w2, w3), edges e1 to e5 and E1 to E4; 4 entries]
Path-place Algorithm • Greedily covering DAG with N node-disjoint paths. The length of a path indicates the power saving of a memory entry. • First sort all vertices in topological order • A vertex is covered each time to calculate the longest path reaching it, iff not adjacent to other nodes • Sum the weights of the final level vertices, edges, and virtual edges from start to end if k < N • Complexity: O((n+e)*N)
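The slide gives only an outline, so the following is one possible reading of the greedy covering and not the authors' implementation. It repeatedly finds the heaviest path among still-uncovered vertices by dynamic programming in topological order, records it as one memory entry, and removes its vertices, for at most N rounds, which matches the stated O((n+e)*N) bound. The virtual start and end vertices are omitted for brevity, and all names and weights are illustrative.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]


def topological_order(nodes: List[str], edges: List[Edge]) -> List[str]:
    """Kahn's algorithm; assumes the graph is acyclic."""
    indegree = {v: 0 for v in nodes}
    successors: Dict[str, List[str]] = {v: [] for v in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(v for v in nodes if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return order


def path_place(node_w: Dict[str, float], edge_w: Dict[Edge, float],
               n_entries: int) -> List[Tuple[List[str], float]]:
    """Greedily cover the DAG with up to n_entries node-disjoint paths.

    Returns (path, weight) pairs; a path's weight stands for the saving
    of the memory entry that holds its intervals.
    """
    order = topological_order(list(node_w), list(edge_w))
    preds: Dict[str, List[str]] = {v: [] for v in node_w}
    for u, v in edge_w:
        preds[v].append(u)
    covered = set()
    paths = []
    for _ in range(n_entries):
        best: Dict[str, float] = {}          # heaviest path weight ending at v
        back: Dict[str, Optional[str]] = {}  # predecessor on that path
        for v in order:
            if v in covered:
                continue
            best[v], back[v] = node_w[v], None
            for u in preds[v]:
                if u in best and best[u] + edge_w[(u, v)] + node_w[v] > best[v]:
                    best[v] = best[u] + edge_w[(u, v)] + node_w[v]
                    back[v] = u
        if not best:
            break  # every vertex is already covered
        end = max(best, key=best.get)
        path, v = [], end
        while v is not None:
            path.append(v)
            v = back[v]
        path.reverse()
        covered.update(path)
        paths.append((path, best[end]))
    return paths


paths = path_place(
    {"I1": 5, "I2": 3, "I3": 4, "I4": 2},
    {("I1", "I2"): 1, ("I1", "I3"): 2, ("I2", "I3"): 1, ("I1", "I4"): 1},
    n_entries=2,
)
```

A greedy cover of this kind is not guaranteed to maximize the total weight over all N paths; it trades optimality for the low per-round cost.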
Location Assignment Schemes (VI): Path-place • Data layout with leakage awareness • Power savings on unused entries, dead and live intervals
Location Assignment Schemes [Figure: six panels of BRAM line versus time t: the state-of-the-art, used-active, min-entry, sleep-dead, drowsy-long, path-place]
Embedded Memory Leakage-aware Design Flow • Exploring temporal and spatial information • Path traversal and location assignment • Introduced for deciding the best data layout within embedded memory to achieve the maximal leakage saving
Radix-2 FFT Example [Figure labels: location assignment, scheduling, path traversal, compilation]
Empirical Study • Experimental setup • Simulation of a configurable double-port synchronous RAM with 18K-bits • Read/write ports: both ports can read the same memory cell simultaneously, but can’t write to the same location (no write conflict). • Configurable: 1-bit, 2-bit, 4-bit, 9-bit, or 18-bit • eCACTI [Dutt’04]: modeling transistor leakage • DSP benchmarks: dft, idft, fft-2, fft-4, filter, mp
Comparing Different Schemes [Figure: comparison of schemes, with labeled values 95%, 76%, and 37%]
Conclusions • Leakage is dominating dynamic power as technology scaling trends hold • Leakage problem of embedded memories is of growing importance • Explored temporal and spatial information for optimizing leakage power, achieving significant leakage savings (95%)
Multimedia, Internet, Cellular Telephony • Won’t work • The machine is too hot. • The battery is too heavy. [Figure label: BATTERY (50+ lbs)]
Saving Leakage Power without Performance Degradation • Deriving the interval lengths with static scheduling • Scheduling any needed data just before it is needed • Avoiding any performance impact
The Generalized Model • Parameterized model • Inputs • Wake-up latencies • Interval distribution • Leakage power of each state • Transition energy between states • Output • Maximal power saving [Meng HPCA’05] [Figure labels: EAS, P(Active), P(Sleep)]
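As a rough illustration of how such a parameterized model turns its inputs into a saving figure, the sketch below picks the cheapest mode for every interval and compares the total against keeping everything active. The linear energy formula, the omission of wake-up latencies, and all names and numbers are assumptions of this example, not the model from the cited paper.

```python
from typing import Dict, List


def max_leakage_saving(intervals: List[float],
                       power: Dict[str, float],
                       transition: Dict[str, float]) -> float:
    """Fraction of leakage energy saved versus an always-active baseline.

    intervals:  interval lengths (the interval distribution)
    power:      mode -> leakage power while in that mode; must include "active"
    transition: mode -> energy to enter and leave that mode (0 for "active")
    Assumes energy for a mode over length t is transition + power * t.
    """
    baseline = sum(power["active"] * t for t in intervals)
    if baseline <= 0:
        raise ValueError("baseline energy must be positive")
    # With full knowledge of interval lengths, take the cheapest mode per interval.
    optimal = sum(min(transition[m] + power[m] * t for m in power)
                  for t in intervals)
    return 1.0 - optimal / baseline


# Illustrative numbers only, not measurements from the study.
saving = max_leakage_saving(
    intervals=[1, 10, 100],
    power={"active": 10.0, "drowsy": 2.0, "sleep": 0.5},
    transition={"active": 0.0, "drowsy": 16.0, "sleep": 46.0},
)
```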
Example of path-place • TopList: {I4, I1, I2, I3} [Figure: DAG from start to end with vertices I1 to I4 (weights w1 to w4), edges e1 to e6 and E1 to E8]
Outline • Motivation • Synthesis for leakage power optimization of embedded memories • Temporal • Temporal + spatial • Conclusions