180 likes | 316 Views
Cache-Conscious Structure Definition. By Trishul M. Chilimbi, Bob Davidson, and James R. Larus Presented by Shelley Chen March 10, 2003. Motivation. Processor-Memory Performance Gap Growing at 53% per year! Chilimbi and Larus previous work
E N D
Cache-ConsciousStructure Definition By Trishul M. Chilimbi, Bob Davidson, and James R. Larus Presented by Shelley Chen March 10, 2003
Motivation • Processor-Memory Performance Gap • Growing at 53% per year! • Chilimbi and Larus previous work • Placed objects with high temporal locality in the same cache block • Works best with objects < ½ cache block • Current paper proposes techniques for larger data structures
Two Cache-Conscious Definition Techniques • Structure Splitting • Split large data structures into two smaller structures • Field Reordering • Group fields in a structure with high temporal locality into the same cache block
Structure Splitting • Split large structures into two smaller structures • “hot” structure contains frequently accessed fields • “cold” structure contains rarely accessed fields • Allow more “hot” structures to fit into the cache • Has been done manually in the past, this paper is first to automate
Experimental Setup • Ran compiled programs on Sun Ultraserver E5000 • 2 GB of memory • L1 dcache, 16 KB DM, 16 byte blocks • L2 cache, 1 MB DM, 64 byte blocks
Results: Structure Splitting Reduces L2 miss rates by 10-27%; improves execution time by 10-20%
Field Reordering • Logical ordering of the program is not usually consistent with its data access patterns • Frequently accessed fields may be coded next to rarely accessed fields, putting them in the same cache block • cause excessive cache misses • Reorder field definitions of structure • fields with high temporal affinity in same cache block
bbcache • Tool that produces structure field reordering recommendations • Construct a database containing both static and dynamic information about the structure field accesses • Process database to construct field affinity graphs for each structure • Produce the structure field order recommendations for the affinity graphs • Attempts to group fields with high temporal affinity into the same cache block
Experimental Setup • 4 processor 400MHz Pentium II Xeon system • 1MB L2 cache/processor • 4GB memory • Ran TPC-C on Microsoft SQL Server 7.0 to collect traces as input to bbcache • Chose 5 structures which showed largest potential from the benefits of reordering
Results: Field Reordering Modified SQL Server was consistently better by 2-3% Cache block pressure =Σ (b1, …,bn)/n Cache block utilization = Σ(f11, …, fnbn)/ Σ(b1, …,bn)
Conclusion • 2 techniques to improve cache performance • change the internal organization of fields in a data structure • Structure splitting • Field reordering • Structure Layouts better left to compiler • Easier to determine field access order