Bank Conflict • One point worth noting for SMT is cache bank conflict, which would cause a 3.4% performance loss
Symptom I • If the PCs of different threads point to the same bank, a bank conflict occurs. Only one thread can fetch from a given bank at a time, so the anticipated fetch bandwidth cannot be achieved
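A minimal sketch of Symptom I, to make the conflict condition concrete. The line size, the bank count, the line-interleaved bank mapping, and the lowest-thread-id tie-break are all assumptions for illustration; the slides do not specify them.

```python
from collections import defaultdict

LINE_BYTES = 64   # assumed cache line size
NUM_BANKS = 8     # assumed number of cache banks


def bank_of(pc: int) -> int:
    """Map a fetch address to a bank, interleaving consecutive lines across banks."""
    return (pc // LINE_BYTES) % NUM_BANKS


def schedule_fetch(pcs: dict[int, int]) -> tuple[list[int], list[int]]:
    """Return (granted, stalled) thread ids for one fetch cycle.

    pcs maps thread id -> PC. Each bank serves one thread per cycle;
    as an assumed policy, the lowest thread id wins a conflict.
    """
    by_bank = defaultdict(list)
    for tid in sorted(pcs):
        by_bank[bank_of(pcs[tid])].append(tid)
    granted = sorted(tids[0] for tids in by_bank.values())
    stalled = sorted(t for tids in by_bank.values() for t in tids[1:])
    return granted, stalled
```

With these assumed parameters, PCs 0x1000 and 0x3000 both map to bank 0, so only one of the two threads fetches in that cycle; PCs 0x1000 and 0x1040 map to different banks and both fetch.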
Symptom II • If one thread takes a cache miss and the slot being replaced belongs to another thread that will use it in the near future, that other thread then takes a cache miss of its own, which decreases performance
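Symptom II can be illustrated with a toy direct-mapped cache model. This is only a sketch under stated assumptions: the direct-mapped organization, the per-thread set partitioning, and the function name `count_misses` are hypothetical, and threads are assumed not to share lines.

```python
def count_misses(accesses, num_sets, partition_by_thread=False, num_threads=2):
    """Count misses in a toy direct-mapped cache.

    accesses: list of (thread_id, line_address) pairs in program order.
    When partition_by_thread is True, each thread gets its own equal
    slice of the sets, so one thread can never evict another's line.
    """
    cache = {}  # set index -> (thread_id, line_address) currently stored
    misses = 0
    for tid, line in accesses:
        if partition_by_thread:
            sets_per_thread = num_sets // num_threads
            index = tid * sets_per_thread + line % sets_per_thread
        else:
            index = line % num_sets
        tag = (tid, line)
        if cache.get(index) != tag:
            misses += 1
            cache[index] = tag  # replace whatever was in the slot
    return misses


# Two threads alternate between lines that map to the same shared slot.
trace = [(0, 0), (1, 8), (0, 0), (1, 8)]
shared_misses = count_misses(trace, num_sets=8)
partitioned_misses = count_misses(trace, num_sets=8, partition_by_thread=True)
```

In the shared case every access evicts the line the other thread is about to reuse, so all four accesses miss; with the sets split per thread only the two cold misses remain.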
Passive/Active • We already have logic for bank conflict detection (comparing the PCs of different threads), but it is passive. We need an active mechanism to prevent conflicts from happening
How to solve it • We can construct a mechanism that distributes different threads to different banks, which would solve the bank conflict issue between threads
Heuristic (for two threads) • Static: divide the cache banks evenly between the two threads • Dynamic: divide the cache banks among the threads using two thresholds, one to increase the quota of the fast thread and one to protect the quota of the slow thread
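The two heuristics might be sketched as follows. This is one possible reading, not the authors' mechanism: the slides do not say what the thresholds measure, so the sketch assumes a recent-IPC ratio (`grow_ratio`) as the threshold that grows the fast thread's quota and a minimum bank count (`min_banks`) as the threshold that protects the slow thread. Both names and their default values are hypothetical.

```python
def static_partition(num_banks: int) -> tuple[int, int]:
    """Static heuristic: split the banks evenly between two threads."""
    half = num_banks // 2
    return half, num_banks - half


def dynamic_partition(quota, ipc, grow_ratio=1.5, min_banks=2):
    """Dynamic heuristic: adjust the two threads' bank quotas by one step.

    quota: (banks_thread0, banks_thread1) currently assigned.
    ipc:   (ipc_thread0, ipc_thread1) measured over a recent window.
    grow_ratio: assumed threshold; the fast thread gains a bank only if
                its IPC exceeds the slow thread's IPC by this factor.
    min_banks:  assumed threshold; the slow thread never drops below it.
    """
    fast = 0 if ipc[0] >= ipc[1] else 1
    slow = 1 - fast
    new_quota = list(quota)
    if ipc[fast] > grow_ratio * ipc[slow] and new_quota[slow] > min_banks:
        new_quota[fast] += 1
        new_quota[slow] -= 1
    return tuple(new_quota)
```

For example, starting from the static split of eight banks, a thread running at twice the other's IPC would gain one bank per adjustment step until the slow thread reaches its protected minimum.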
SMT Overview Reading • D.M. Tullsen et al., Simultaneous Multithreading: Maximizing On-Chip Parallelism, ISCA 1995 • S.J. Eggers et al., Simultaneous Multithreading: A Platform for Next Generation Processors, IEEE Micro 1997 • Wen-Mei Hwu et al., Simultaneous Multithreading: Unlocking the Magic, UIUC Tech Report 2002 • Roger Espasa and Mateo Valero, Simultaneous Multithreading Vector Architecture: Merging ILP and DLP for High Performance
Related Topic Reading • Trace Cache: J.E. Smith et al., A Trace Cache Microarchitecture and Evaluation, IEEE Trans. on Computers, 1999 • CMP: V. Krishnan and J. Torrellas, A Chip-Multiprocessor Architecture with Speculative Multithreading, IEEE Trans. on Computers, 1999 • Multiscalar: G.S. Sohi et al., Multiscalar Processors, ISCA 1995 • IRAM: David Patterson et al., A Case for Intelligent RAM: IRAM, IEEE Micro 1997 • HiDISC: Wonwoo Ro et al., A High-Performance, Hierarchical Decoupled Architecture