
Cache-Conscious Structure Definition

By Trishul M. Chilimbi, Bob Davidson, and James R. Larus. Presented by Shelley Chen, March 10, 2003.


Presentation Transcript


  1. Cache-Conscious Structure Definition • By Trishul M. Chilimbi, Bob Davidson, and James R. Larus • Presented by Shelley Chen, March 10, 2003

  2. Motivation • Processor-Memory Performance Gap • Growing at 53% per year! • Chilimbi and Larus's previous work • Placed objects with high temporal locality in the same cache block • Works best with objects smaller than half a cache block • The current paper proposes techniques for larger data structures

  3. Improving Cache Performance

  4. Two Cache-Conscious Definition Techniques • Structure Splitting • Split large data structures into two smaller structures • Field Reordering • Group the fields of a structure that have high temporal locality into the same cache block

  5. Structure Splitting • Split large structures into two smaller structures • The “hot” structure contains frequently accessed fields • The “cold” structure contains rarely accessed fields • Allows more “hot” structures to fit into the cache • Has been done manually in the past; this paper is the first to automate it
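The hot/cold split described on this slide can be illustrated with a small layout sketch. This is not the paper's automated class-splitting algorithm; it is a hand-written example using Python's `ctypes` to mimic C-style layouts. The field names (`key`, `count`, `label`, `created`, `modified`), the choice of which fields are hot, and the 64-byte block size are all assumptions made for illustration.

```python
import ctypes

CACHE_BLOCK = 64  # bytes; an assumed block size for this illustration


class Unsplit(ctypes.Structure):
    """Original layout: hot and cold fields live in one object."""
    _fields_ = [
        ("key", ctypes.c_int32),        # hot (assumed): read on every lookup
        ("count", ctypes.c_int32),      # hot (assumed)
        ("label", ctypes.c_char * 40),  # cold (assumed): rarely read
        ("created", ctypes.c_int64),    # cold (assumed)
        ("modified", ctypes.c_int64),   # cold (assumed)
    ]


class Cold(ctypes.Structure):
    """Rarely accessed fields, moved out of the hot object."""
    _fields_ = [
        ("label", ctypes.c_char * 40),
        ("created", ctypes.c_int64),
        ("modified", ctypes.c_int64),
    ]


class Hot(ctypes.Structure):
    """Frequently accessed fields plus one reference to the cold part."""
    _fields_ = [
        ("key", ctypes.c_int32),
        ("count", ctypes.c_int32),
        ("cold", ctypes.POINTER(Cold)),
    ]


def per_block(struct_type, block=CACHE_BLOCK):
    """How many whole instances of struct_type fit in one cache block."""
    return block // ctypes.sizeof(struct_type)


# Cold fields stay reachable through the extra indirection.
cold = Cold(b"rarely used", 0, 0)
hot = Hot(7, 1, ctypes.pointer(cold))

print("unsplit objects per block:", per_block(Unsplit))
print("hot objects per block:   ", per_block(Hot))
print("cold label via pointer:  ", hot.cold.contents.label)
```

The trade-off visible here is the one the slide names: the hot part shrinks, so more hot objects share a cache block, at the cost of one pointer per object and an extra dereference whenever a cold field is needed.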

  6. Class Splitting Overview

  7. Experimental Setup • Ran compiled programs on a Sun Ultraserver E5000 • 2 GB of memory • L1 data cache: 16 KB, direct-mapped (DM), 16-byte blocks • L2 cache: 1 MB, direct-mapped (DM), 64-byte blocks

  8. Results: Structure Splitting • Reduces L2 miss rates by 10-27% • Improves execution time by 10-20%

  9. Field Reordering • The logical ordering of fields in a program is usually not consistent with its data access patterns • Frequently accessed fields may be declared next to rarely accessed fields, putting them in the same cache block • This can cause excessive cache misses • Reorder the field definitions of a structure • Place fields with high temporal affinity in the same cache block
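A small layout sketch makes the effect of reordering concrete. The example below uses Python's `ctypes` to mimic a C structure and reports which cache blocks a pair of hot fields falls into before and after reordering. The structure, its field names, the 64-byte block size, and the assumption that each object starts on a block boundary are all hypothetical choices for illustration.

```python
import ctypes

CACHE_BLOCK = 64  # bytes; an assumed block size for this illustration


class Original(ctypes.Structure):
    """Declaration order follows program logic: a large, rarely used
    field sits between two counters that are used together."""
    _fields_ = [
        ("hits", ctypes.c_int32),              # hot (assumed)
        ("description", ctypes.c_char * 60),   # cold (assumed)
        ("misses", ctypes.c_int32),            # hot (assumed)
    ]


class Reordered(ctypes.Structure):
    """Same fields, with the two hot counters declared side by side."""
    _fields_ = [
        ("hits", ctypes.c_int32),
        ("misses", ctypes.c_int32),
        ("description", ctypes.c_char * 60),
    ]


def blocks_touched(struct_type, field_names, block=CACHE_BLOCK):
    """Cache-block indices covered by the named fields, assuming the
    object itself starts on a block boundary."""
    touched = set()
    for name in field_names:
        field = getattr(struct_type, name)  # ctypes field descriptor
        first = field.offset // block
        last = (field.offset + field.size - 1) // block
        touched.update(range(first, last + 1))
    return touched


hot_fields = ["hits", "misses"]
print("original: ", sorted(blocks_touched(Original, hot_fields)))
print("reordered:", sorted(blocks_touched(Reordered, hot_fields)))
```

Both layouts have the same size; only the placement of the hot fields relative to block boundaries changes.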

  10. bbcache • Tool that produces structure field reordering recommendations • Constructs a database containing both static and dynamic information about structure field accesses • Processes the database to construct a field affinity graph for each structure • Produces structure field order recommendations from the affinity graphs • Attempts to group fields with high temporal affinity into the same cache block
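The pipeline on this slide (access data, then affinity graph, then recommended order) can be sketched in a few lines. This is a simplified stand-in, not bbcache's actual algorithm: the trace format, the fixed time window used to decide that two accesses are "close", and the greedy ordering rule are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def affinity_graph(trace, window):
    """Build edge weights between fields accessed close together in time.

    trace  -- list of (timestamp, field_name), sorted by timestamp
    window -- two accesses count as related if at most this far apart
    Returns {frozenset({field_a, field_b}): weight}.
    """
    weights = defaultdict(int)
    start = 0
    for end, (t_end, f_end) in enumerate(trace):
        # Slide the window start forward past accesses that are too old.
        while trace[start][0] < t_end - window:
            start += 1
        for _, f_prev in trace[start:end]:
            if f_prev != f_end:
                weights[frozenset((f_prev, f_end))] += 1
    return dict(weights)


def greedy_order(fields, weights):
    """Recommend a field order: seed with the field of highest total
    affinity, then repeatedly append the field with the highest affinity
    to those already placed (ties keep declaration order)."""
    def w(a, b):
        return weights.get(frozenset((a, b)), 0)

    remaining = list(fields)
    seed = max(remaining,
               key=lambda f: sum(w(f, g) for g in remaining if g != f))
    order = [seed]
    remaining.remove(seed)
    while remaining:
        nxt = max(remaining, key=lambda f: sum(w(f, g) for g in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical trace: "a" and "c" are used together; "b" and "d" are not.
trace = [(0, "a"), (1, "c"), (2, "a"), (3, "c"), (100, "b"), (200, "d")]
weights = affinity_graph(trace, window=5)
order = greedy_order(["a", "b", "c", "d"], weights)
print(order)
```

Fields that end up adjacent in the recommended order are the ones most likely to share a cache block once the structure is laid out in that order.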

  11. Experimental Setup • 4-processor 400 MHz Pentium II Xeon system • 1 MB L2 cache per processor • 4 GB memory • Ran TPC-C on Microsoft SQL Server 7.0 to collect traces as input to bbcache • Chose the 5 structures that showed the largest potential benefit from reordering

  12. Results: Field Reordering • Modified SQL Server was consistently better by 2-3% • Cache block pressure = $\frac{\sum_{i=1}^{n} b_i}{n}$ • Cache block utilization = $\frac{\sum_{i=1}^{n} \sum_{j=1}^{b_i} f_{ij}}{\sum_{i=1}^{n} b_i}$
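The two metrics on this slide can be computed directly once the symbols are given a meaning. The slide does not define them, so the sketch below assumes one reading: there are n samples (for example, n invocations of the code that touches a structure), b_i is the number of cache blocks touched in sample i, and f_ij is the fraction of block j actually used in sample i. The function names and the input format are hypothetical.

```python
def block_pressure(fractions):
    """Average number of cache blocks touched per sample.

    fractions[i][j] is f_ij under the assumed reading, so
    b_i = len(fractions[i]) and n = len(fractions).
    """
    n = len(fractions)
    return sum(len(row) for row in fractions) / n


def block_utilization(fractions):
    """Average fraction of each touched cache block that is used."""
    total_blocks = sum(len(row) for row in fractions)
    return sum(sum(row) for row in fractions) / total_blocks


# Hypothetical data: three samples touching 2, 1, and 3 blocks.
samples = [[0.5, 0.25], [1.0], [0.25, 0.5, 0.5]]
print("pressure:   ", block_pressure(samples))     # 6 blocks / 3 samples
print("utilization:", block_utilization(samples))  # 3.0 used / 6 blocks
```

Under this reading, a good field order lowers pressure (fewer blocks touched per sample) and raises utilization (more of each fetched block is actually used).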

  13. Conclusion • 2 techniques to improve cache performance • Both change the internal organization of fields in a data structure • Structure splitting • Field reordering • Structure layout is better left to the compiler • Easier to determine field access order

  14. Questions?

  15. Class Splitting Algorithm

  16. Structure Access Database

  17. Building Field Affinity Graphs
