A Novel Cache Architecture with Enhanced Performance and Security. Zhenghong Wang and Ruby B. Lee.
Introduction • Introduction • Problems with current designs • Attempts to mitigate information leakage • Proposed Design • Cache Miss • Replacement Policy • Address Decoder • Results
Introduction • Current cache designs are susceptible to cache-based attacks. • Caches should have low miss rates and short access times while remaining power efficient. • The authors used the SPEC2000 suite to evaluate cache miss behaviour, and CACTI and HSPICE to validate the circuit design. • The proposed cache architecture has miss rates comparable to those of a highly associative cache, with access times and power efficiency close to those of a direct-mapped cache.
Problems with Current Designs • Hardware caches in processors introduce interference between programs and users. • One process can evict the cache lines of other processes, causing them to miss on later cache accesses. Critical information can leak out through this common cache behaviour. • Cache-based attacks allow recovery of the full secret cryptographic key and require much less time and computation power than traditional cryptanalysis. • A remote computer user can become an attacker without the need for special equipment.
Past Attempts to Mitigate Information Leakage • Software: mostly involves rewriting code to prevent known attacks from succeeding. • Hardware: cache line locking and cache partitioning. • Both hardware approaches prevent undesirable cache evictions when objects are placed in a private partition or locked in the cache, which helps achieve constant execution time. • The drawback is cache underutilization; a randomized approach avoids this.
Design • New address decoder • New SecRAND replacement algorithm • Adopts the direct-mapped architecture and extends it with dynamic memory-to-cache remapping and a larger cache index.
Design (2) • Index width: n+k bits [n bits: as in traditional direct mapping] [n+k bits: equivalent to mapping the memory space to a large logical direct-mapped (LDM) cache with 2^(n+k) lines] • So the cache has 2^n physical cache lines, but 2^(n+k) logical lines that can be mapped onto them. • RMT (Re-Mapping Table): allows different processes to have different memory-to-cache mappings.
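The longer index can be illustrated with a small address-splitting sketch. This is only an illustration under assumed parameters: the function name `split_address`, the 64-byte line size (`offset_bits = 6`), and the values `n = 9`, `k = 4` are hypothetical and do not come from the slides.

```python
def split_address(addr, n, k, offset_bits):
    """Split an address into (tag, index, offset) using an (n+k)-bit index.

    A conventional direct-mapped cache with 2**n lines would use an n-bit
    index; here k extra bits are moved from the tag into the index.
    """
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << (n + k)) - 1)
    tag = addr >> (offset_bits + n + k)
    return tag, index, offset


# Hypothetical parameters, chosen only for illustration.
n, k, offset_bits = 9, 4, 6
physical_lines = 1 << n        # 2^n lines that physically exist
logical_lines = 1 << (n + k)   # 2^(n+k) lines in the logical direct-mapped cache

tag, index, offset = split_address(0x12345678, n, k, offset_bits)
print(physical_lines, logical_lines, tag, index, offset)
```

With these assumed values, the logical cache has 16 times as many lines as the physical cache, so an index alone no longer identifies a physical line; the stored index bits have to be matched at lookup time.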
Cache Miss • Because the design uses dynamic remapping, a cache replacement algorithm is needed. • Index miss: none of the LNregs (line number registers) matches the given RMT_ID and index, so no cache line is selected. [Unique to this cache design] • Tag miss: essentially the same as an ordinary miss in a traditional direct-mapped cache. • Note from figure: a protection bit is also included.
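The three lookup outcomes (hit, tag miss, index miss) can be modelled in software as follows. This is a minimal sketch, not the authors' circuit: the `Line` class, its field names, and the `lookup` function are hypothetical, and the sequential loop stands in for what the hardware decoder would compare in parallel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    # LNreg contents: which mapping (RMT_ID) and which (n+k)-bit index
    # this physical line currently holds. None means the line is empty.
    rmt_id: Optional[int] = None
    index: Optional[int] = None
    tag: Optional[int] = None
    protected: bool = False  # protection bit mentioned on the slide


def lookup(lines, rmt_id, index, tag):
    """Classify an access as 'hit', 'tag_miss', or 'index_miss'."""
    for pos, line in enumerate(lines):
        if line.rmt_id == rmt_id and line.index == index:
            # The LNreg matched, so a physical line is selected.
            if line.tag == tag:
                return "hit", pos
            return "tag_miss", pos
    # No LNreg matched: no physical line is selected at all.
    return "index_miss", None


lines = [Line() for _ in range(4)]
lines[2] = Line(rmt_id=1, index=37, tag=5)
print(lookup(lines, 1, 37, 5))
print(lookup(lines, 1, 37, 6))
print(lookup(lines, 0, 37, 5))
```

The last call uses the same index under a different RMT_ID, so it selects no line; this is the case a traditional direct-mapped cache does not have.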
Replacement Policy • Because of the dynamic remapping, a cache replacement algorithm is needed. • Tag misses are conflict misses in the LDM cache, since the addresses of the incoming data line and the line in cache have the same index but different tags. Because no two LNregs can contain the same index bits, either the original line is replaced with the incoming line, or the incoming line is not cached at all. • The proposed replacement method is a modified random replacement policy (SecRAND).
Replacement Policy (2) • Tag miss (C, the line in cache, and D, the incoming line, most likely belong to the same process): to avoid information-leaking interference, a random cache line is selected for eviction if either C or D is protected. Since D cannot replace lines other than C, it is sent directly to the CPU core without being cached. • Index miss: C and D may or may not belong to the same process. The new memory block D can replace any cache line, and the victim is selected at random. • The random replacement algorithm requires less hardware than other commonly used replacement algorithms (such as LRU and FIFO) because it is stateless.
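The miss handling described above can be sketched as a small software model. This is a hedged illustration, not the authors' implementation: the `Line` class, the `secrand_access` function, and the outcome strings are hypothetical names. The branch where neither C nor D is protected (C is simply replaced by D) is an assumption based on the earlier statement that either the original line is replaced or the incoming line is not cached.

```python
import random
from dataclasses import dataclass


@dataclass
class Line:
    rmt_id: int = -1
    index: int = -1
    tag: int = -1
    protected: bool = False
    valid: bool = False


def secrand_access(lines, rmt_id, index, tag, protected, rng=random):
    """Model one access for incoming block D; returns (outcome, position)."""
    # Find C: the line whose LNreg matches (rmt_id, index), if any.
    c = None
    for pos, line in enumerate(lines):
        if line.valid and line.rmt_id == rmt_id and line.index == index:
            c = pos
            break

    if c is not None and lines[c].tag == tag:
        return "hit", c

    if c is None:
        # Index miss: D may replace any line, chosen at random.
        victim = rng.randrange(len(lines))
        lines[victim] = Line(rmt_id, index, tag, protected, True)
        return "index_miss", victim

    # Tag miss: C and D share an index but differ in tag.
    if lines[c].protected or protected:
        # Evict a random line and pass D to the core without caching it.
        victim = rng.randrange(len(lines))
        lines[victim] = Line()
        return "tag_miss_uncached", None

    # Assumption: with no protection involved, C is replaced by D
    # as in an ordinary direct-mapped cache.
    lines[c] = Line(rmt_id, index, tag, protected, True)
    return "tag_miss", c


rng = random.Random(0)
lines = [Line() for _ in range(4)]
first = secrand_access(lines, 1, 10, 7, False, rng)
second = secrand_access(lines, 1, 10, 7, False, rng)
third = secrand_access(lines, 1, 10, 8, False, rng)
fourth = secrand_access(lines, 1, 10, 9, True, rng)
print(first, second, third, fourth)
```

Note that the policy keeps no per-line history: each decision needs only the protection bits and a random number, which is consistent with the slide's point that random replacement is stateless.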
Cache Miss Rates • As reported by a cache simulator derived from sim-cache and sim-cheetah of the SimpleScalar toolset. • Run on all 26 SPEC2000 benchmarks.
Overall Power Consumption Obtained using CACTI 5.0
Additional Benefits/Conclusion • Fault tolerance: due to dynamic remapping. • Hot-spot mitigation: due to spatial and temporal locality. • Ability to optimize for power efficiency: due to the ability to turn off unused cache lines. • Many challenging and even conflicting design goals, such as security and high performance, can be achieved at the same time. • The proposed architecture is secure, yet has a lower hardware cost.