Kaxiras and Martonosi: “Computer Architecture Techniques for Power Efficiency,” Synthesis Lectures on Computer Architecture, 2008, Ch. 4.8 – 4.13 Xavier de Gunten
Sections • 4.8 Idle-Capacity Switching Activity: Caches • 4.9 Parallel Switching-Activity in Set-Associative Caches • 4.10 Cacheable Switching Activity • 4.11 Speculative Activity • 4.12 Value-Dependent Switching Activity: Bus Encodings • 4.13 Dynamic Work Steering
Idle-Capacity Switching Activity • Cache resizing that trades memory between 2 cache levels • Selective Cache Ways – Resizing through associativity • Accounting Cache – Combination of the two techniques above • CAM-tag Resizing
Cache Resizing: trading memory between 2 cache levels • Structures partitioned in segments with buffered wires • Trading between L1 and L2 by altering associativity • Use CPI and “phase” to determine organization of caches
Selective Cache Ways – Resizing through associativity • Resize 1 big cache through associativity • Large Cache partitioned into subarrays • Disabling a cache way => ignore cache accesses but tags remain active
Accounting Cache • Cross between selective ways cache and variable L1/L2 division • Disabled ways act as a “fake” L2 cache • True LRU replacement policy • One-shot configuration • Energy savings of 35 – 53 % with a performance loss of 1 – 4 %
CAM-Tag Cache Resizing • RAM tags -> high performance, CAM tags -> power efficiency • Resizing is more advantageous for highly associative CAM-tag caches • Resizing granularity is finer for CAM • CAM-tag cache resizing can be done individually per set because bit lines run across the ways of a set • Control policy -> performance-based feedback loop • Based on the # of misses in a given time window
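The performance-based feedback loop above can be sketched as a simple controller: count misses over a fixed access window, then grow or shrink the number of active CAM-tag ways. The class name, thresholds, and window size below are illustrative, not from the book.

```python
# Hypothetical sketch of the miss-count feedback policy for CAM-tag
# resizing. Too many misses in a window -> enable another way;
# very few misses -> disable a way to save power.

class ResizeController:
    def __init__(self, total_ways=32, miss_hi=100, miss_lo=20, window=10000):
        self.total_ways = total_ways
        self.active_ways = total_ways  # start with the full cache enabled
        self.miss_hi = miss_hi         # above this: grow
        self.miss_lo = miss_lo         # below this: shrink
        self.window = window           # accesses per decision window
        self.accesses = 0
        self.misses = 0

    def record(self, hit):
        self.accesses += 1
        if not hit:
            self.misses += 1
        if self.accesses == self.window:
            self._decide()
            self.accesses = 0
            self.misses = 0

    def _decide(self):
        # Fine granularity: a CAM-tag cache can grow/shrink one way at a time.
        if self.misses > self.miss_hi and self.active_ways < self.total_ways:
            self.active_ways += 1
        elif self.misses < self.miss_lo and self.active_ways > 1:
            self.active_ways -= 1
```

In a real design the decision would also drain or invalidate the disabled way's lines; the sketch only models the sizing decision itself.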
Parallel Switching Activity in Set-Associative Caches • A set-associative cache consumes power roughly in proportion to its associativity • Phased Cache • Phase 1: access all tags • Phase 2: access data only in the matching way
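A back-of-envelope model shows why the phased organization saves energy: phase 1 still reads all tag arrays, but phase 2 reads only the one data array that matched, instead of reading every way's data in parallel. The per-array energy constants below are invented units for illustration.

```python
# Toy energy model for parallel vs. phased set-associative access.
TAG_E, DATA_E = 1, 4   # assumed per-array read energies (illustrative)

def parallel_access_energy(ways):
    # Conventional access: all tag and data arrays read at once.
    return ways * (TAG_E + DATA_E)

def phased_access_energy(ways, hit=True):
    energy = ways * TAG_E   # phase 1: read all tags
    if hit:
        energy += DATA_E    # phase 2: read only the hitting way's data
    return energy
```

The saving comes at a latency cost: the data access waits for the tag compare, which is why phased caches are more attractive at lower levels of the hierarchy.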
Sequentially Accessed Set-Associative Cache • Only most likely way to produce hit is probed (MRU) • Similar power/performance to direct-mapped cache • Misses are expensive • Way prediction • Prediction structure to hold MRU information
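The MRU scheme above can be sketched as follows: probe only the predicted way first (direct-mapped-like energy on a correct prediction) and fall back to probing the remaining ways on a first-probe miss, which is where the extra latency comes from. Structure sizes and the replacement policy are simplified assumptions for the sketch.

```python
# Illustrative MRU way prediction for a set-associative cache.
class MRUPredictedCache:
    def __init__(self, num_sets=4, ways=4):
        self.ways = ways
        self.tags = [[None] * ways for _ in range(num_sets)]  # None = empty
        self.mru = [0] * num_sets        # predicted (MRU) way per set
        self.first_probe_hits = 0        # cheap: one way probed
        self.slow_hits = 0               # extra probe round needed
        self.misses = 0

    def access(self, set_idx, tag):
        pred = self.mru[set_idx]
        if self.tags[set_idx][pred] == tag:
            self.first_probe_hits += 1   # direct-mapped-like energy
            return True
        for w in range(self.ways):       # fall back: probe remaining ways
            if w != pred and self.tags[set_idx][w] == tag:
                self.slow_hits += 1
                self.mru[set_idx] = w    # update the prediction
                return True
        self.misses += 1
        victim = pred                    # trivial replacement: prefer empty way
        for w in range(self.ways):
            if self.tags[set_idx][w] is None:
                victim = w
                break
        self.tags[set_idx][victim] = tag
        self.mru[set_idx] = victim
        return False
```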
Advanced Way-Prediction Mechanisms • Selective direct-mapping • Set-associative for tags • Direct-mapped for data • Separating conflicting and non-conflicting lines
Multi-MRU • Allow multiple MRU predictors to disambiguate among tags
Way Selection • Location Cache • Store position of L2 cache lines for L1 misses • Way Halting • Halt parallel access once hit and location are determined in partial tag compare • Decaying Bloom Filters • Determine which lines are live and dead • Decreases the number of ways that need to be searched
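A minimal sketch of the decaying-Bloom-filter idea, assuming one small bit-vector filter per way: a zero bit means the way definitely does not hold the line, so that way can be skipped; periodic decay clears stale entries so dead lines stop being probed. All sizes and the decay policy are illustrative.

```python
# Hypothetical per-way decaying Bloom filters for way selection.
class DecayingBloomWaySelector:
    def __init__(self, ways=4, bits=64, decay_interval=1000):
        self.ways = ways
        self.bits = bits
        self.filters = [[0] * bits for _ in range(ways)]
        self.decay_interval = decay_interval
        self.ticks = 0

    def _h(self, addr):
        return hash(addr) % self.bits    # single hash for simplicity

    def insert(self, way, addr):
        self.filters[way][self._h(addr)] = 1   # line is live in this way

    def candidate_ways(self, addr):
        # Only ways whose filter bit is set need to be probed.
        i = self._h(addr)
        return [w for w in range(self.ways) if self.filters[w][i]]

    def tick(self):
        self.ticks += 1
        if self.ticks % self.decay_interval == 0:
            # Decay: clear all bits. Live lines are re-inserted on their
            # next access; dead lines silently disappear from the filter.
            for f in self.filters:
                for i in range(len(f)):
                    f[i] = 0
```

A production design would decay entries individually (e.g. small counters) rather than wiping the whole filter, but the energy argument is the same: fewer candidate ways means fewer parallel probes.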
Cache Coherence Protocols • Exclude-Jetty • Small tag-cache to determine what is not cached in L2 • Include-Jetty • Hash table to capture superset of what is cached in L2 • Hybrid-Jetty • Consult both an Exclude and Include-Jetty
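The exclude-Jetty can be sketched as a tiny table of addresses known *not* to be in the local L2, consulted before the expensive L2 tag lookup on a snoop. The structure below (a small FIFO list) is a deliberate simplification of the real tag-cache organization.

```python
# Illustrative exclude-Jetty: filter snoops that are known to miss in L2.
class ExcludeJetty:
    def __init__(self, entries=8):
        self.entries = entries
        self.absent = []  # FIFO of line addresses known absent from L2

    def snoop(self, addr, l2_lookup):
        if addr in self.absent:
            return False                # filtered: no L2 tag access needed
        present = l2_lookup(addr)       # fall back to the real L2 tags
        if not present:
            self.absent.append(addr)    # remember the negative result
            if len(self.absent) > self.entries:
                self.absent.pop(0)
        return present

    def on_l2_allocate(self, addr):
        # L2 just allocated this line: it is no longer "known absent".
        if addr in self.absent:
            self.absent.remove(addr)
```

The include-Jetty is the dual: a hash table holding a superset of what *is* cached, so a snoop that misses the table can be filtered instead.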
Cacheable Switching Activity • Work Reuse (Reduce Repetitive Computing) • Operation Level: Memoization • Instruction Level: Instruction Reuse Buffers • Basic Block Level: Block History Buffer • Trace Level: Groups of consecutive instructions based on dynamic execution
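Operation-level memoization can be illustrated with a bounded result table for an expensive functional unit (a divider stands in here): a repeated operand pair skips the computation entirely. The table size and eviction policy are assumptions for the sketch.

```python
# Minimal memoization sketch: cache functional-unit results by operands.
class MemoDivider:
    def __init__(self, entries=16):
        self.table = {}     # (a, b) -> result; bounded like a real table
        self.entries = entries
        self.hits = 0       # reused results (no divider activity)
        self.computes = 0   # actual divider activations

    def divide(self, a, b):
        key = (a, b)
        if key in self.table:
            self.hits += 1
            return self.table[key]
        self.computes += 1
        result = a // b                  # the "expensive" operation
        if len(self.table) >= self.entries:
            # Evict the oldest entry (dicts preserve insertion order).
            self.table.pop(next(iter(self.table)))
        self.table[key] = result
        return result
```

Instruction reuse buffers, block history buffers, and trace-level reuse generalize the same idea from a single operation to progressively larger units of work.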
Cacheable Switching Activity • Filter Cache • 128-256 byte cache inserted before L1 • Loop Cache • Software/compiler controlled • Filter cache for Instructions • Trace Cache • Pentium 4 • Stores blocks of instructions
Speculative Activity • Incorrect speculation is costly • Instruction reuse buffer • Pipeline gating • Stall the whole pipeline when confidence in branch prediction is low • Trade-off: energy saved by fetching fewer wrong-path instructions vs. the performance penalty of stalling • Selective Throttling
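Pipeline gating can be sketched as a counter of low-confidence branches in flight: once the count crosses a threshold, fetch is stalled until some of those branches resolve. The threshold and the confidence test below are illustrative, not the book's exact policy.

```python
# Hedged sketch of pipeline gating driven by branch-confidence estimates.
class PipelineGate:
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.low_conf_in_flight = 0   # unresolved low-confidence branches

    def on_branch_fetched(self, confidence):
        # confidence in [0, 1] from a branch-confidence estimator
        if confidence < 0.5:
            self.low_conf_in_flight += 1

    def on_branch_resolved(self, was_low_confidence):
        if was_low_confidence:
            self.low_conf_in_flight -= 1

    def fetch_enabled(self):
        # Gate (stall fetch) when too much in-flight speculation is
        # likely to be wrong-path work.
        return self.low_conf_in_flight < self.threshold
```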
Value-Dependent Switching Activity: Bus Encodings • Two factors that drive power consumption • Average # of signal transitions • Capacitance of wires • Dynamic Base Register Caching • Gray Encoding • T0 Encoding
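Gray encoding's benefit on sequentially striding address buses follows from its defining property: consecutive values differ in exactly one bit after encoding, so a sequential address stream toggles one wire per transfer instead of several.

```python
# Standard binary-to-Gray conversion: n XOR (n >> 1).
def to_gray(n):
    return n ^ (n >> 1)
```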
Address and Data Buses • Bus-inversion encoding • Dictionary-based solutions • Frequent-Value Encoding
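Bus-inversion encoding can be sketched in a few lines: if transmitting the new value would flip more than half the wires relative to the bus's current state, send its complement and assert an extra invert line, bounding worst-case transitions at w/2 + 1. The bus width below is an arbitrary example.

```python
WIDTH = 8  # example bus width

def hamming(a, b):
    # Number of wires that would toggle between the two values.
    return bin(a ^ b).count("1")

def bus_invert(prev_wires, value):
    # Returns (wires_to_drive, invert_line).
    mask = (1 << WIDTH) - 1
    if hamming(prev_wires, value) > WIDTH // 2:
        return (~value) & mask, 1   # send complement, invert line high
    return value, 0                 # send as-is, invert line low
```

The receiver simply re-inverts the wires when the invert line is high, so the extra line pays for itself whenever data values swing widely between transfers.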
Dynamic Work Steering • Circuit Level: Precomputation • Microarchitectural Level: Deal with idle-width activity • Processor core Level: Activity Migration