SHiP: Signature-based Hit Predictor for High Performance Caching
Carole-Jean Wu*, Aamer Jaleel#, William Hasenplaugh#+, Margaret Martonosi*, Simon Steely Jr.#, Joel Emer#+
*Princeton University, #Intel Corporation, VSSAD, +MIT
IEEE/ACM International Symposium on Microarchitecture (MICRO’2011)
Motivation
• Factors making caching important
  • Increasing ratio of CPU speed to memory speed
  • Multi-core makes shared cache management more challenging
• LRU has been the standard LLC replacement policy
• However, LRU has problems!
Problems with LRU Replacement
• Working set larger than the cache causes thrashing
  [Figure: working set (Wsize) larger than the LLC (LLCsize); every access misses]
• References to non-temporal data (scans) discard the frequently referenced working set
  [Figure: working set (Wsize) smaller than the LLC (LLCsize); accesses hit until scans intervene, then miss]
• Scans occur frequently in commercial workloads
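Both LRU problems can be reproduced with a few lines of simulation. The sketch below is illustrative only: it assumes a small fully associative cache, and the function name `lru_hits` and the traces are made up for this example.

```python
from collections import OrderedDict

def lru_hits(trace, capacity):
    """Count hits for a fully associative LRU cache holding `capacity` lines."""
    cache = OrderedDict()              # keys ordered from LRU (front) to MRU (back)
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)    # promote the line to MRU
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the LRU line
            cache[addr] = True
    return hits

# Thrashing: a cyclic working set of 5 lines in a 4-line cache never hits.
thrash_trace = list(range(5)) * 10
thrash_hits = lru_hits(thrash_trace, capacity=4)

# Scan: the working set {0, 1} is reused, then a 4-line scan evicts it.
scan_trace = [0, 1, 0, 1] + [100, 101, 102, 103] + [0, 1]
scan_hits = lru_hits(scan_trace, capacity=4)

print(thrash_hits, scan_hits)
```

In the scan trace the only hits are the two reuses before the scan; the two accesses after the scan miss because the scan pushed lines 0 and 1 out of the cache.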
Desired Behavior from Cache Replacement
• Working set larger than the cache: preserve some of the working set in the cache
  [Figure: Wsize larger than LLCsize; part of the working set hits, the rest misses]
  [ DIP (ISCA’07) and DRRIP (ISCA’10) achieve this effect ]
• Recurring scans: preserve the frequently referenced working set in the cache
  [Figure: working-set accesses keep hitting across scans]
  [ SRRIP (ISCA’10) achieves this effect ]
Dynamic Re-Reference Interval Prediction (DRRIP)
• SRRIP: Scan-Resistant
• BRRIP: Thrash-Resistant
[Figure: re-reference prediction chain 0 (immediate), 1 (intermediate), 2 (far), 3 (distant), with insertion arrows for SRRIP and BRRIP, re-reference arrows, “No Victim” at positions 0 to 2, and eviction at 3 (distant)]
[ Jaleel et al., ISCA’10 ]
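The SRRIP half of this scheme can be sketched for a single cache set as follows. This is a minimal illustration, not the authors' implementation: it assumes 2-bit prediction values, insertion at 2 (far), promotion to 0 (immediate) on a hit, and aging of all lines until some line reaches 3 (distant). The class name `SRRIPSet` and the traces are hypothetical.

```python
class SRRIPSet:
    """One cache set managed with 2-bit re-reference prediction values."""
    IMMEDIATE, FAR, DISTANT = 0, 2, 3

    def __init__(self, ways):
        self.tags = [None] * ways
        self.rrpv = [self.DISTANT] * ways   # empty ways are immediately evictable

    def access(self, tag):
        """Return True on a hit; on a miss, insert the line and return False."""
        if tag in self.tags:
            self.rrpv[self.tags.index(tag)] = self.IMMEDIATE  # promote on re-reference
            return True
        # No line predicted "distant": age every line until one is.
        while self.DISTANT not in self.rrpv:
            self.rrpv = [v + 1 for v in self.rrpv]
        victim = self.rrpv.index(self.DISTANT)
        self.tags[victim] = tag
        self.rrpv[victim] = self.FAR        # SRRIP insertion prediction
        return False

def count_hits(trace, ways=4):
    cache = SRRIPSet(ways)
    return sum(cache.access(tag) for tag in trace)

working_set = ["a", "b"]
short_scan = [f"s{i}" for i in range(4)]   # scan as long as the set has ways
long_scan = [f"s{i}" for i in range(8)]    # scan twice as long

short_scan_hits = count_hits(working_set * 2 + short_scan + working_set)
long_scan_hits = count_hits(working_set * 2 + long_scan + working_set)
print(short_scan_hits, long_scan_hits)
```

With the 4-line scan the working set survives and is hit again afterwards; with the 8-line scan the repeated aging eventually pushes `a` and `b` to "distant" and they are evicted before they are reused.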
SRRIP Not Always Scan Resistant…
• LONG scans in access pattern
  [Figure: hit/miss outcomes for working-set accesses around a “short” scan and a “long” scan]
• Active working set MUST be RE-REFERENCED at least ONCE between scans
  [Figure: hit/miss outcomes for working-set accesses between successive scans]
• Can We Be More Intelligent in Dealing with Scans?
Closer Look at Scan Access Patterns
[Figure: access pattern with two scans, annotated “No Future References” and “Future Reference”]
• Assuming Perfect Knowledge of Re-Reference Pattern
Improving RRIP on Cache Insertions
[Figure: re-reference prediction chain 0 (immediate), 1 (intermediate), 2 (far), 3 (distant), with “Improve Insertion” marked on the insertion of scan lines, re-reference arrows, “No Victim” at positions 0 to 2, and eviction at 3 (distant)]
• Need to Assign DIFFERENT Re-Reference Predictions on Cache Insertion
Focus of this Paper…
• Goal: Learn the re-reference interval of a cache line
  [Figure: a cache access enters the PREDICTOR, which outputs a re-reference prediction (0: immediate, 1: intermediate, 2: far, 3: distant)]
• How Best to Learn the Re-Reference Interval?
Learning Re-Reference Behavior
[Figure: access pattern with two scans, annotated “REFERENCE SAME MEMORY REGION” and “REFERENCED BY SIMILAR SET OF PCs”]
• Can We Learn Re-References By Correlating Accesses With Some Other Information?
Using Signatures to Correlate Re-Reference
• Different types of information:
  • Memory Region
  • Memory Instruction PC
  • Instruction Sequence
• Observation: LLC accesses by the same “signature” tend to have similar re-reference patterns
  [Figure: two scans associated with the same “signature”]
• OBSERVE, LEARN and PREDICT Re-Reference Pattern of a Signature
Observe Signature Re-Reference Behavior
• Observe re-reference pattern in the baseline cache
[Figure: Address and Load/Store feed the LLC; each line holds Cache Tag, Replacement State and Coherence State]
Observe Signature Re-Reference Behavior
• Observe re-reference pattern in the baseline cache
• Hardware Required:
  • Was the line re-referenced after cache insertion (1 bit)
  • “Signature” responsible for cache insertion (14 bits)
[Figure: Signature, Address and Load/Store feed the LLC; each line gains metadata: reuse bit and signature_insert]
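The per-line state listed above could be modeled as in the sketch below. The slides give the field widths (1 bit and 14 bits) but not how a signature is computed, so `pc_signature` is a purely hypothetical hash that XOR-folds a program counter into 14 bits; the class and field layout are likewise assumptions for illustration.

```python
from dataclasses import dataclass

SIGNATURE_BITS = 14   # signature width given on the slide

def pc_signature(pc: int) -> int:
    """Hypothetical hash (not from the slides): XOR-fold a PC into 14 bits."""
    mask = (1 << SIGNATURE_BITS) - 1
    sig = 0
    while pc:
        sig ^= pc & mask
        pc >>= SIGNATURE_BITS
    return sig

@dataclass
class LineMetadata:
    signature_insert: int   # signature responsible for the insertion (14 bits)
    reuse: bool = False     # set once the line is hit after insertion (1 bit)

# A line inserted by the instruction at an example PC, later re-referenced.
line = LineMetadata(signature_insert=pc_signature(0x400A3C))
line.reuse = True
```

Any function that maps a memory region, PC or instruction sequence onto 14 bits would fit the same slot; the folding hash is only one convenient choice.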
Learn Signature Re-Reference Behavior
• Learn signature re-reference behavior
• Hardware Required:
  • Signature History Counter Table (SHCT) (16K, 2-bit counters)
• SHCT Training:
  • If evicted line reused: SHCT[signature_insert]++
  • If evicted line NOT reused: SHCT[signature_insert]--
• Counter interpretation:
  • counter = 0: signature NOT re-referenced
  • counter != 0: signature re-referenced
[Figure: SHCT alongside the Last Level Cache (LLC)]
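The training rule can be sketched as a table of saturating counters. Table size and counter width follow the slide; saturating at 0 and 3, the initial counter value of 1, and the class and method names are assumptions made for this illustration.

```python
SHCT_ENTRIES = 16 * 1024   # 16K counters, as on the slide
COUNTER_MAX = 3            # largest value of a 2-bit counter

class SHCT:
    """Signature History Counter Table with saturating 2-bit counters."""

    def __init__(self, initial=1):          # initial value is an assumption
        self.counters = [initial] * SHCT_ENTRIES

    def train_on_eviction(self, signature_insert, reused):
        """Update the counter of the signature that inserted the evicted line."""
        i = signature_insert % SHCT_ENTRIES
        if reused:
            self.counters[i] = min(COUNTER_MAX, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

    def predicts_reuse(self, signature):
        """A non-zero counter means lines of this signature tend to be re-referenced."""
        return self.counters[signature % SHCT_ENTRIES] != 0

shct = SHCT()
shct.train_on_eviction(42, reused=False)    # a scan-like signature: never reused
for _ in range(5):
    shct.train_on_eviction(7, reused=True)  # a frequently reused signature
```

After this training, signature 42 sits at 0 and is predicted not to be re-referenced, while signature 7 saturates at 3.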
Signature-based Hit Predictor (SHiP)
• Predict re-reference interval of line using SHCT
[Figure: the signature and cache hit/miss enter SHiP (SHCT), which outputs a re-reference prediction (0: immediate, 1: intermediate, 2: far, 3: distant)]
Signature-based Hit Predictor (SHiP)
• Predict re-reference interval using SHCT on CACHE MISS
• SHiP Re-Reference Predictions On Miss:
    if ( SHCT [ signature ] == 0 )
      predict DISTANT (i.e. 3)
    else
      predict FAR (i.e. 2)
[Figure: the signature and a cache miss enter SHiP, which outputs a re-reference prediction (0: immediate, 1: intermediate, 2: far, 3: distant)]
Signature-based Hit Predictor (SHiP)
• Predict re-reference interval on CACHE HIT
• SHiP Re-Reference Predictions On Hit:
    Always predict IMMEDIATE (i.e. 0)
[Figure: the signature and a cache hit enter SHiP, which outputs a re-reference prediction (0: immediate, 1: intermediate, 2: far, 3: distant)]
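The miss and hit rules from these slides translate directly into two small functions. In this sketch the SHCT is stood in for by a plain dictionary with made-up counter values, and the function names are hypothetical.

```python
IMMEDIATE, INTERMEDIATE, FAR, DISTANT = 0, 1, 2, 3

def predict_on_miss(shct, signature):
    """Insertion prediction: DISTANT if the signature's counter is 0, else FAR."""
    return DISTANT if shct[signature] == 0 else FAR

def predict_on_hit():
    """A hit always predicts an immediate re-reference."""
    return IMMEDIATE

# Stand-in SHCT with illustrative counter values for two signatures.
example_shct = {0x0B3C: 0, 0x1234: 2}

scan_like = predict_on_miss(example_shct, 0x0B3C)
reused_like = predict_on_miss(example_shct, 0x1234)
```

A line inserted under a signature whose counter is 0 becomes an eviction candidate right away, which is how scan lines are kept from displacing the working set.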
SHiP – High Level Architectural Overview
• Per-Line Overhead Can Be Reduced by using Set Sampling (need only 32 - 64 sets)
[Figure: Address, Access Type and Signature feed the Last Level Cache (LLC) and SHiP; the LLC returns data and hit/miss; LLC hit/miss, signature_insert and reuse_bit drive SHCT Training; the SHCT supplies the Re-Reference Prediction to the LLC; annotations “~6 KB” and “NO CHANGE”]
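The slide does not say how the ~6 KB figure is composed. As a rough sanity check, the sketch below simply totals the structures named on the earlier slides, assuming 64 sampled sets in a 16-way cache and 15 bits of metadata per sampled line (14-bit signature plus 1 reuse bit). The breakdown and variable names are assumptions, not the authors' accounting.

```python
# SHCT: 16K counters of 2 bits each (from the slides).
shct_bits = 16 * 1024 * 2

# Per-line metadata kept only in the sampled sets (assumed: 64 sets, 16 ways).
sampled_sets = 64
ways = 16
bits_per_line = 14 + 1          # signature_insert + reuse bit
sampled_bits = sampled_sets * ways * bits_per_line

total_bytes = (shct_bits + sampled_bits) // 8
total_kib = total_bytes / 1024

print(f"SHCT: {shct_bits // 8} B, sampled sets: {sampled_bits // 8} B, "
      f"total: {total_bytes} B ({total_kib:.3f} KiB)")
```

Under these assumptions the counters take 4096 bytes and the sampled-set metadata 1920 bytes, which lands close to the quoted figure.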
Performance Comparison of Replacement Policies
[Figure: performance of replacement policies on a 16-way 2MB LLC, Core i7 type hierarchy]
• SHiP Significantly Improves Performance Across All Workload Categories
Performance Comparison of Replacement Policies: CRC Results Comparison
• 16-way 1MB Private Cache
  • 65 Single-Threaded Workloads
• 16-way 4MB Shared Cache
  • 165 4-core Workloads
• Averaged Across PC Games, Multimedia, Enterprise Server, SPEC CPU2006 Workloads
[Figure: performance of replacement policies in both configurations, with SHiP highlighted]
• SHiP Has 2X the Performance Improvement of Prior State-of-the-Art Policies
Total Storage Overhead (16-way Set Associative Cache)
• LRU: 4 bits / cache block
• Pseudo-LRU: 1 bit / cache block
• RRIP [ ISCA’10 ]: 2 bits / cache block
• Seg-LRU [ CRC’10 ]: ~8 bits / cache block
• SDBP [ MICRO’10 ]: ~10 bits / cache block
• SHiP [ MICRO’11 ]: ~5 bits / cache block
SHiP Outperforms State-of-the-Art with HW Similar to LRU
Summary
• Scan-resistance is an important problem in commercial workloads
• State-of-the-art policies do not fully address scan-resistance
• Signatures help improve re-reference predictions to address scans
• Need fine-grained re-reference predictions at insertion
• Proposed a Simple and Practical Scan-Resistant Replacement
• SHiP significantly outperforms winner of CRC Championship
• SHiP requires less storage than CRC winner
• HW overhead of SHiP is comparable to LRU
Re-Reference Interval Prediction (RRIP)
• CAN INSERTION BE MORE INTELLIGENT?
[Figure: Scan-Resistant insertion on the re-reference prediction chain 0 (immediate), 1 (intermediate), 2 (far), 3 (distant), with re-reference arrows, “No Victim” at positions 0 to 2, and eviction at 3 (distant)]
Using Signatures to Correlate Re-Reference Behavior
• Example Signatures: Memory Region, Program Counter, Instruction Decode History
[Figure: accesses labeled by SIGNATURE (a, b, c, d) across two scans, annotated “No Future Cache Hits” and “Future Cache Hits”]
LRU vs. Re-Reference Interval Prediction (RRIP)
[Figure: two 8-way sets (Physical Way # 0 to 7) with their Cache Tags; the LRU set stores an “LRU Chain” position per line, the RRIP set stores a Re-Reference Prediction per line]
• RRIP Outperforms LRU with Storage Less Than LRU
Signature-based Hit Predictor (SHiP)
• Goal: Predict the re-reference behavior of a signature
• Learn Re-Reference Behavior:
[Figure: Signature, Address and Access Type feed the LLC, which returns data and hit/miss]