Reducing Power Consumption of the Issue Logic Daniele Folegnani and Antonio González Universitat Politècnica de Catalunya
MOTIVATION • Power consumption • High performance microarchitecture • Cooling systems • Reliability • Embedded systems • Battery life
OUTLINE • Power Consumption in Superscalar Processors • IPC-based Instruction Queue Resize • Results • Conclusions
Power Evaluation Methodology • Dynamic Power Estimator [Cai, Lim, MICRO-32] • Architectural design partition • Each architectural block maps to a circuit block • Power consumption evaluation at block level • Power density of blocks (SPICE, input sets, technology and circuit style definitions) • Block and sub-block activity (execution-driven) • Area (feedback from VLSI design)
The Power Model • 0.18 µm CMOS • 5 types of logic (static, dynamic, SRAM, clock, PLA) • 32 blocks with associated area • Custom design • Power densities (APD, IPD)
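The block-level estimate described on the two slides above can be sketched as follows. This is only an illustration, not the estimator from the cited work: it assumes that APD and IPD denote the power density of a block when it is active and when it is idle, and that activity is the fraction of cycles in which the block is active. The `Block` type, the function names, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    area_mm2: float   # block area (from VLSI design feedback)
    apd: float        # assumed: power density when active, W/mm^2
    ipd: float        # assumed: power density when idle, W/mm^2
    activity: float   # assumed: fraction of cycles the block is active (0..1)


def block_power(b: Block) -> float:
    """Area times the activity-weighted mix of the two power densities."""
    return b.area_mm2 * (b.activity * b.apd + (1.0 - b.activity) * b.ipd)


def total_power(blocks: list[Block]) -> float:
    """Chip-level estimate: sum of the per-block estimates."""
    return sum(block_power(b) for b in blocks)


# Made-up numbers, for illustration only.
blocks = [
    Block("instruction queue", area_mm2=2.0, apd=1.0, ipd=0.5, activity=0.5),
    Block("integer ALUs", area_mm2=1.0, apd=2.0, ipd=0.0, activity=0.25),
]
estimate = total_power(blocks)
```

In a setup like this, the execution-driven simulator would supply `activity` per block, while the densities would come from circuit-level characterization.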
EXPERIMENTAL FRAMEWORK • 4-instruction fetch, issue and commit width • 128-entry instruction queue • I-Cache: 128 Kbytes, direct mapped, 32-byte line, 1-cycle hit, 3-cycle miss • D-Cache: 128 Kbytes, 4-way set associative, 32-byte line, 1-cycle hit, 3-cycle miss • UL2-Cache: 1024 Kbytes, 4-way set associative, 64-byte line, 3-cycle hit • Combined predictor of 1K entries: Gshare with 1K 2-bit counters and 8-bit global history, plus bimodal predictor of 2K entries with 2-bit counters • 4 int ALU, 4 fp ALU, 1 int mul/div, 1 fp mul/div • Out-of-order issue, oldest-ready-first selection policy
ANALYSIS • Power Analysis • IQ + ROB = 53% of total consumption • Almost independent of the instruction mix • Trends in Superscalar • Increasing IW • Increasing number of entries in the window • IQ power contribution may grow in the future
ANALYSIS • Considering • Periods of execution with low parallelism • Some parts of the IQ have negligible impact on total IPC • Periods of execution with high parallelism • A few parts of the IQ can satisfy the issue width
IPC-based Instruction Queue Resize • IQ Resize • Based on IPC contribution • Avoid wake-up on disabled parts • The IQ is a circular FIFO without collapsing
IPC-based Instruction Queue Resize • IQ Resize • IQ physically divided into 16 parts of 8 entries • Add a limit pointer, updated in the same way as the head pointer • At resize time, move the limit by one part • If the tail reaches the limit, stop inserting new instructions
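The limit-pointer mechanism on this slide can be modeled with a small sketch. It is a simplification under stated assumptions, not the hardware design: the pointers are kept as monotonically increasing counters (the physical slot is the counter modulo the queue size), the limit is taken to sit a fixed number of enabled entries ahead of the head, and at least one part is kept enabled so the queue can still accept instructions. The class and method names are hypothetical.

```python
PARTS = 16                 # number of physical parts (from the slide)
PART_SIZE = 8              # entries per part (from the slide)
SIZE = PARTS * PART_SIZE   # 128 entries in total


class ResizableIQ:
    """Circular FIFO without collapsing, with a movable limit pointer."""

    def __init__(self) -> None:
        self.head = 0               # oldest entry (monotonic counter)
        self.tail = 0               # next insertion slot (monotonic counter)
        self.enabled_parts = PARTS  # start with the whole queue enabled

    @property
    def limit(self) -> int:
        # Assumption: the limit follows the head at a distance equal to
        # the currently enabled capacity.
        return self.head + self.enabled_parts * PART_SIZE

    def can_insert(self) -> bool:
        # Dispatch stalls once the tail reaches the limit.
        return self.tail < self.limit

    def insert(self):
        """Return the physical slot used, or None if dispatch must stall."""
        if not self.can_insert():
            return None
        slot = self.tail % SIZE
        self.tail += 1
        return slot

    def release_oldest(self) -> None:
        # Freeing the oldest entry advances the head, and the limit with it.
        if self.head < self.tail:
            self.head += 1

    def shrink(self) -> None:
        # Move the limit back by one part (assumed floor of one part).
        self.enabled_parts = max(1, self.enabled_parts - 1)

    def grow(self) -> None:
        # Move the limit forward by one part, up to the full queue.
        self.enabled_parts = min(PARTS, self.enabled_parts + 1)
```

For example, after shrinking to two parts the queue accepts 16 instructions and then stalls until `release_oldest()` advances the head.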
Heuristics • Heuristic to reduce size • Count the committed instructions that were issued from the youngest part in every time quantum => add a bit to each ROB entry • Threshold-based resize decision • No limit on how much of the queue can be disabled • Heuristic to grow size • Grow by one part every 5 quanta • The threshold-based scheme checks in the next quantum whether growing was correct
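The two heuristics above can be combined into a small controller, sketched below. The slides do not give the threshold value or the quantum length, so `threshold` is a hypothetical parameter; letting the periodic growth take precedence over shrinking in the same quantum, and keeping at least one part enabled, are further assumptions of this sketch.

```python
class ResizeController:
    """Per-quantum resize decision for the instruction queue."""

    def __init__(self, threshold: int, grow_period: int = 5,
                 total_parts: int = 16) -> None:
        self.threshold = threshold      # hypothetical tuning parameter
        self.grow_period = grow_period  # grow every 5 quanta (from the slide)
        self.total_parts = total_parts
        self.enabled_parts = total_parts
        self.quanta = 0

    def end_of_quantum(self, youngest_part_commits: int) -> int:
        """Update the size given the number of committed instructions that
        were issued from the youngest enabled part (the marked ROB entries).
        Returns the new number of enabled parts."""
        self.quanta += 1
        if self.quanta % self.grow_period == 0:
            # Periodic growth; later quanta shrink again if it did not help.
            self.enabled_parts = min(self.total_parts, self.enabled_parts + 1)
        elif youngest_part_commits < self.threshold:
            # The youngest part contributed little to IPC: disable it.
            self.enabled_parts = max(1, self.enabled_parts - 1)
        return self.enabled_parts


ctrl = ResizeController(threshold=10)
sizes = [ctrl.end_of_quantum(0) for _ in range(5)]
```

With a youngest part that never contributes, the controller shrinks for four quanta and then grows once in the fifth, so the queue keeps probing whether more entries have become useful.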
Conclusions • The IQ is a critical point for power consumption in superscalar processors • Dynamically adapting the IQ size based on IPC contribution can save about 15% of total power with negligible impact on performance