Increasing TLB Reach by Exploiting Clustering in Page Translations

Increasing TLB Reach by Exploiting Clustering in Page Translations
Binh Pham§, Abhishek Bhattacharjee§, Yasuko Eckertǂ, Gabriel H. Lohǂ
§Rutgers University ǂAMD Research
Binh Pham - Rutgers University

Address Translation Overview
[Figure: address generation produces a VA, which goes to the TLB and then to the page table walker (x86: four-level page tables in memory, one PTE read per level); the resulting PA is used for the cache access. The address translation time is marked on the timeline.]
VA: Virtual Address; PA: Physical Address; PTE: Page Table Entry
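
As a rough illustration of the four-level walk named on this slide, the sketch below splits a 48-bit x86-64 virtual address into its four 9-bit page-table indices and its 12-bit page offset, assuming 4 KB pages. The function name `split_va` and the constants are hypothetical, chosen only for this example.

```python
PAGE_SHIFT = 12   # 4 KB pages: the low 12 bits are the page offset
INDEX_BITS = 9    # each x86-64 page-table level is indexed by 9 bits
LEVELS = 4        # four levels of page tables

def split_va(va):
    """Return ([level-4, level-3, level-2, level-1] indices, page offset)."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    vpn = va >> PAGE_SHIFT
    indices = []
    for level in range(LEVELS):
        # The topmost level uses the highest 9 bits of the 36-bit VPN.
        shift = INDEX_BITS * (LEVELS - 1 - level)
        indices.append((vpn >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset
```

Each index selects one PTE at its level, so a TLB miss can cost up to four memory references before the cache access can proceed.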

Address Translation Performance Impact
• Address translation performance overhead – 10-15%
  • Clark & Emer [Trans. on Comp. Sys. 1985]
  • Talluri & Hill [ASPLOS 1994]
  • Barr, Cox & Rixner [ISCA 2011]
• Emerging software trends
  • Virtualization – up to 89% overhead [Bhargava et al., ASPLOS 2008]
  • Big Memory workloads – up to 50% overhead [Basu et al., ISCA 2013]
• Emerging hardware trends
  • LLC capacity to TLB capacity ratios increasing
  • Manycore/hyperthreading increases TLB and LLC PTE stress

TLB Miss Elimination Approaches
• Increasing TLB size?
  • Latency
  • Power
• Increasing TLB reach
  • Using large pages
  • Using “CoLT: Coalesced Large-Reach TLBs” (Pham et al., MICRO 2012)

Contiguous Locality in Page Tables
[Figure: a page table next to the CoLT TLB, with PTEs labeled as sequential groups, holes, singletons, and out-of-order (OoO) entries]

Clustered Locality in Page Tables
[Figure: a page table under clustering, with PTEs labeled as clustered groups, holes, and out-of-order (OoO) entries]
• Clustered locality can deal with “holes” between PTEs
• Clustered locality does NOT care about PTEs’ order
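
The contrast between contiguous and clustered locality can be sketched with a toy model. It assumes that a cluster is a group of PTEs whose VPNs agree in all but their low 3 bits and whose PPNs do the same (8 pages per cluster, in line with the 3-bit VPN(2:0) field on the lookup slide). The page-table contents and function names are made up for illustration.

```python
from collections import defaultdict

CLUSTER_BITS = 3  # assumed: 8 pages per cluster

def clustered_groups(page_table):
    """Group VPN->PPN mappings by (VPN >> 3, PPN >> 3); order and holes are ignored."""
    groups = defaultdict(list)
    for vpn, ppn in page_table.items():
        groups[(vpn >> CLUSTER_BITS, ppn >> CLUSTER_BITS)].append(vpn)
    return {key: sorted(vpns) for key, vpns in groups.items()}

def contiguous_runs(page_table):
    """Group mappings into runs where VPN and PPN both advance by exactly 1."""
    runs, prev = [], None
    for vpn in sorted(page_table):
        ppn = page_table[vpn]
        if prev is not None and vpn == prev[0] + 1 and ppn == prev[1] + 1:
            runs[-1].append(vpn)
        else:
            runs.append([vpn])
        prev = (vpn, ppn)
    return runs

# Hypothetical page table: VPN 3 is unmapped (a hole) and the PPNs are out of order.
page_table = {0: 10, 1: 9, 2: 12, 4: 8, 5: 15}
```

For this table, `contiguous_runs` finds only five singletons, while `clustered_groups` places all five translations in a single cluster, because every PPN lies in the same aligned 8-page physical region.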

Spatial Locality Characterization
[Figure: a page table example and a PTE distribution plot showing the % of PTEs in the same group against group length]
• Clustered locality is abundant and surpasses contiguous locality
• Clustered locality increases with clustered spatial region size

Outline
• How do we exploit clustered locality in hardware?
• How much can our design improve performance?
• Conclusion

Clustered TLB: Miss and Fill
[Figure: on a miss, the page table walker fetches a 64B cache line of 8B PTEs from the page table; coalescing logic packs them into a clustered TLB entry made of sub-entries]
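
One way to picture the fill path: the walk returns a 64 B cache line holding eight 8 B PTEs, and coalescing logic keeps the neighbours that fall in the same physical cluster as the requested page. The sketch below assumes 8-page clusters and a plain dictionary as the entry layout. Neither the field names nor the function come from the actual hardware design.

```python
CLUSTER_BITS = 3          # assumed cluster size: 8 pages
PTES_PER_LINE = 64 // 8   # a 64 B cache line holds eight 8 B PTEs

def coalesce(line_base_vpn, line_ppns, miss_vpn):
    """Build a clustered entry from one cache line of PTEs.

    line_ppns[i] is the PPN mapped to VPN line_base_vpn + i, or None if unmapped.
    The missing page itself is assumed to be mapped.
    """
    assert len(line_ppns) == PTES_PER_LINE
    miss_ppn = line_ppns[miss_vpn - line_base_vpn]
    base_ppn = miss_ppn >> CLUSTER_BITS
    low_mask = (1 << CLUSTER_BITS) - 1
    # Keep only the low PPN bits of neighbours in the same physical cluster.
    sub_entries = [
        ppn & low_mask if ppn is not None and ppn >> CLUSTER_BITS == base_ppn else None
        for ppn in line_ppns
    ]
    return {"base_vpn": line_base_vpn >> CLUSTER_BITS,
            "base_ppn": base_ppn,
            "sub_entries": sub_entries}
```

For example, `coalesce(0, [8, 9, 10, 12, 12, 13, None, 40], 3)` keeps six translations that share the physical base 1 and drops both the unmapped slot and the page that maps outside the cluster.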

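Clustered TLB Look Up
[Figure: worked lookup. The VPN is split into Base VPN (0) and VPN(2:0) (011). The Base VPN is compared (=?) with the entry's Base V tag, and VPN(2:0) selects a sub-entry. Clustered TLB entry shown: Base V = 0, Base P = 01, sub-entries 000 001 010 100 100 101 X X X. On a sub-entry hit, Base P is concatenated with the sub-entry's PPN lower bits, giving PPN = 12 (01100).]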
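
The lookup on this slide can be sketched as follows. The entry layout (a dictionary with eight sub-entry slots, `None` standing for an invalid slot marked X) and the function name are assumptions made for illustration; the numeric values follow the slide's example.

```python
CLUSTER_BITS = 3  # VPN(2:0) selects one of eight sub-entries

def lookup(entry, vpn):
    """Return the PPN on a hit, or None on a miss."""
    if entry["base_vpn"] != vpn >> CLUSTER_BITS:
        return None                       # base VPN tag mismatch
    low = entry["sub_entries"][vpn & ((1 << CLUSTER_BITS) - 1)]
    if low is None:
        return None                       # sub-entry is invalid
    # Concatenate the base PPN with the sub-entry's low PPN bits.
    return (entry["base_ppn"] << CLUSTER_BITS) | low

# Entry modeled on the slide: Base V = 0, Base P = 01.
entry = {"base_vpn": 0, "base_ppn": 0b01,
         "sub_entries": [0b000, 0b001, 0b010, 0b100, 0b100, 0b101, None, None]}
```

With this entry, `lookup(entry, 0b011)` selects the sub-entry 100 and returns 0b01100, that is PPN 12, as on the slide.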

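Multi-Granular TLB Design
[Figure: a VPN is looked up in the L1-TLB and then in an L2-TLB made of a Clustered TLB and a C0 TLB; a clustered hit (Base VPN, Base PPN) or a C0 hit yields the PPN (TLB hit). On the fill path, coalescing logic examines the cache line of 8B PTEs: if len >= Θ (Y), a Clustered-TLB entry is filled; otherwise (N), the C0 TLB is filled.]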
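
The `len >= Θ` decision in the diagram can be sketched as a small fill policy. This assumes that `len` counts the valid translations in the coalesced group and that the C0 TLB holds uncoalesced, single-translation entries. Both points are readings of the diagram rather than confirmed design details, and the function name is hypothetical.

```python
THETA = 2  # threshold; a later slide reports Θ = 2 as the best-performing setting

def choose_tlb(sub_entries, theta=THETA):
    """Pick the fill target for a coalesced group.

    sub_entries uses None for invalid slots. Returns "clustered" if the group
    has at least theta valid translations, else "c0".
    """
    length = sum(1 for s in sub_entries if s is not None)
    return "clustered" if length >= theta else "c0"
```

Under this policy, a lone translation such as `[None, None, None, 4, None, None, None, None]` goes to the C0 TLB instead of occupying a mostly empty clustered entry.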

Methodology
• Workloads: SPEC CPU2006, CloudSuite, Server
• Full System Simulation:
  • Baseline: 64-entry L1 ITLB, 64-entry L1 DTLB, 512-entry L2 TLB
  • Roughly equal hardware for baseline, CoLT, and MG-TLB
[Figure: three configurations compared. Baseline: L1-TLB + L2-TLB; CoLT: L1-TLB + CoLT-TLB; MG-TLB: L1-TLB + MG-TLB]

Miss Elimination
[Figure: miss elimination results; one annotation reads -123%]
Best design gives 7% performance improvement on average

Insert to Clustered TLB or C0 TLB?
Θ = 2 gives best performance
[Figure: results for different insertion choices, with annotations -51% and -218%; entry labels: C0 Entry, C0 Entry, Cluster3 Entry]

Prefetching versus Capacity Benefit
[Figure: prefetching benefit versus capacity benefit, with annotations -32% and -21%]
MG-TLB combines prefetching and capacity to get best performance

Conclusion
• We observe a more general type of locality (clustered locality) in page translations
• Multi-granular TLB
  • Eliminates nearly half of TLB misses
• Our approach requires no OS modification and provides robust performance gains

Thanks for listening! Questions?
