Whole Program Paths James R. Larus
Outline • Find acyclic path fragments • Convert into whole-program path • Determine hot subpaths
Acyclic Paths • As in the Ball & Larus paper, which we implemented
Calculating Acyclic Paths • Instrument chords • Sum along paths is unique • Postprocess for functions • Each loop iteration is a new path • New: so is each function call • Dump path ID to file
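The "sum along paths is unique" bullet can be illustrated with a minimal Python sketch of Ball & Larus style edge numbering. It assumes an acyclic control-flow graph with a single entry, given as a successor map with at most one edge between any two blocks. The names `number_paths`, `path_id`, `all_paths` and the graph `CFG` are hypothetical, and the sketch places a value on every edge rather than only on the chords of a spanning tree, so it shows the numbering idea and not the cheaper instrumentation placement.

```python
def number_paths(succ, entry):
    """Assign a value to each edge so every entry-to-exit path has a unique sum."""
    num_paths = {}   # node -> number of paths from node to an exit
    edge_val = {}    # (src, dst) -> increment added when the edge is taken

    def visit(v):
        if v in num_paths:
            return num_paths[v]
        if not succ[v]:              # exit block: exactly one (empty) path
            num_paths[v] = 1
            return 1
        total = 0
        for w in succ[v]:
            edge_val[(v, w)] = total  # paths through earlier successors come first
            total += visit(w)
        num_paths[v] = total
        return total

    visit(entry)
    return num_paths, edge_val


def all_paths(succ, v):
    """Enumerate every path from v to an exit (for checking only)."""
    if not succ[v]:
        yield [v]
    for w in succ[v]:
        for rest in all_paths(succ, w):
            yield [v] + rest


def path_id(path, edge_val):
    """The path ID is the sum of the edge values along the path."""
    return sum(edge_val[(a, b)] for a, b in zip(path, path[1:]))


# Hypothetical six-block acyclic CFG used only for illustration.
CFG = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"],
       "D": ["E", "F"], "E": ["F"], "F": []}

num_paths, edge_val = number_paths(CFG, "A")
ids = sorted(path_id(p, edge_val) for p in all_paths(CFG, "A"))
```

For this graph there are six entry-to-exit paths, and `ids` holds the six distinct values 0 through 5, so a single counter incremented along the way identifies the path taken.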
Outline • Find acyclic path fragments • Convert into whole-program path • Compress output string • Coalesce common substrings • Store efficiently • Determine hot subpaths
Grammatical Benefits • Express output string as a context-free grammar: • Efficient compression (~20x) • Automatic subsequence grouping • Grammar creation • Append symbols to start rule • Digrams appear at most once • Rules must be used at least twice • Example: 121213121214
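To make the two grammar constraints concrete, here is a small Python sketch applied to the example string 121213121214. It is not the online, symbol-by-symbol construction the slide describes: it swaps in a simpler offline approach that repeatedly replaces the most repeated digram with a new rule and then inlines rules used only once. The function names and the rule names `S`, `R0`, `R1`, ... are hypothetical, terminals are assumed to be integers, and the sketch does not guarantee digram uniqueness after inlining for arbitrary inputs.

```python
def replace(body, digram, name):
    """Replace non-overlapping occurrences of digram in body, left to right."""
    out, i = [], 0
    while i < len(body):
        if tuple(body[i:i + 2]) == digram:
            out.append(name)
            i += 2
        else:
            out.append(body[i])
            i += 1
    return out


def most_repeated_digram(rules):
    """Return the digram with the most non-overlapping uses, or None if none repeats."""
    candidates = {}  # insertion-ordered, so ties go to the digram seen first
    for body in rules.values():
        for pair in zip(body, body[1:]):
            candidates.setdefault(pair, 0)
    for pair in candidates:
        mark = object()  # sentinel that cannot collide with a real symbol
        candidates[pair] = sum(replace(b, pair, mark).count(mark)
                               for b in rules.values())
    best = max(candidates, key=candidates.get, default=None)
    return best if best is not None and candidates[best] >= 2 else None


def inline_single_use(rules):
    """Enforce rule utility: a rule referenced only once is folded into its user."""
    while True:
        uses = {}
        for body in rules.values():
            for s in body:
                if s in rules:
                    uses[s] = uses.get(s, 0) + 1
        once = [r for r in rules if r != "S" and uses.get(r, 0) == 1]
        if not once:
            return
        body = rules.pop(once[0])
        for key, b in rules.items():
            if once[0] in b:
                i = b.index(once[0])
                rules[key] = b[:i] + body + b[i + 1:]


def build_grammar(symbols):
    """Offline stand-in for the online grammar construction on the slide."""
    rules = {"S": list(symbols)}
    next_id = 0
    while (digram := most_repeated_digram(rules)) is not None:
        name = f"R{next_id}"
        next_id += 1
        for key in list(rules):
            rules[key] = replace(rules[key], digram, name)
        rules[name] = list(digram)
    inline_single_use(rules)
    return rules


def expand(rules, symbol="S"):
    """Regenerate the original string from the grammar."""
    out = []
    for s in rules[symbol]:
        out.extend(expand(rules, s) if s in rules else [s])
    return out


trace = [int(c) for c in "121213121214"]
grammar = build_grammar(trace)
```

On the slide's example this yields the rules S → R2 3 R2 4, R2 → R0 R0 1 and R0 → 1 2: no digram repeats, every rule other than S is used twice, and `expand(grammar)` reproduces the original twelve symbols.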
Execution Representation • Not a control-flow graph! • Execution sequence = post-order traversal of DAG
Whole Paths • Efficient representation • Create grammar online • Execution context information • e.g., A runs after B • Frequency information • Simple path aggregation
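The "frequency information" bullet can be sketched as follows: because the grammar is a DAG, the number of times each rule or path fragment occurs in the full execution can be computed by pushing counts from the start rule downward, without expanding the trace. The grammar below is a hand-written one for the string 121213121214, and `GRAMMAR` and `occurrence_counts` are hypothetical names chosen for this illustration.

```python
from collections import Counter

# Hand-written grammar for 121213121214: S -> C 3 C 4, C -> A A 1, A -> 1 2.
GRAMMAR = {"S": ["C", 3, "C", 4], "C": ["A", "A", 1], "A": [1, 2]}


def occurrence_counts(grammar, start="S"):
    """Count how often each rule and terminal occurs in the expanded sequence."""
    order, seen = [], set()

    def dfs(rule):
        # Post-order DFS; reversing it gives parents before children.
        if rule in seen:
            return
        seen.add(rule)
        for s in grammar[rule]:
            if s in grammar:
                dfs(s)
        order.append(rule)

    dfs(start)
    counts = Counter({start: 1})
    for rule in reversed(order):
        # Every occurrence of a rule contributes one occurrence of each body symbol.
        for s in grammar[rule]:
            counts[s] += counts[rule]
    return counts


counts = occurrence_counts(GRAMMAR)
```

Here rule A turns out to occur four times and path fragment 1 six times, which matches a direct count over the twelve-symbol string while touching only the three rules.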
Outline • Find acyclic path fragments • Convert into whole-program path • Determine hot subpaths • Find short frequent subsequences • ??? • Profit!
Outline • Find acyclic path fragments • Convert into whole-program path • Determine hot subpaths • Find short frequent subsequences • Heavily optimize that 1% • Applies to 75% of cache misses
Hot Subpaths • Looking for minimal hot subpaths • L or fewer consecutive acyclic path fragments with cost of C or greater • Cost = execution frequency × costs of acyclic path fragments • Path fragment cost = number of instructions
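The cost definition above can be made concrete with a brute-force Python sketch that works on the uncompressed sequence of path IDs rather than on the grammar. Several things here are assumptions and not taken from the slides: frequency counts overlapping occurrences, the cost of a subpath is its frequency times the summed instruction counts of its fragments, and a hot subpath is treated as "minimal" when none of its proper contiguous pieces is itself hot. The instruction counts and the name `hot_subpaths` are made up for illustration.

```python
from collections import Counter


def hot_subpaths(trace, instr_cost, max_len, min_cost):
    """Return {subpath: cost} for minimal hot subpaths of at most max_len fragments."""
    # Frequency of every window of 1..max_len consecutive path fragments.
    freq = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(trace) - n + 1):
            freq[tuple(trace[i:i + n])] += 1

    def cost(sub):
        # Cost = execution frequency x summed instruction counts of the fragments.
        return freq[sub] * sum(instr_cost[p] for p in sub)

    hot = {sub for sub in freq if cost(sub) >= min_cost}

    def proper_pieces(sub):
        # All shorter contiguous pieces of sub.
        for m in range(1, len(sub)):
            for i in range(len(sub) - m + 1):
                yield sub[i:i + m]

    return {s: cost(s) for s in hot
            if not any(p in hot for p in proper_pieces(s))}


# Path IDs from the earlier example string; instruction counts are hypothetical.
trace = [int(c) for c in "121213121214"]
instr_cost = {1: 4, 2: 6, 3: 30, 4: 5}
result = hot_subpaths(trace, instr_cost, max_len=3, min_cost=40)
```

With L = 3 and C = 40, the subpaths (1, 2) and (2, 1) each occur four times at ten instructions per occurrence, so both have cost 40 and are reported; longer hot subpaths such as (1, 2, 1) are dropped because they contain a shorter hot one.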
Finding Hot Subpaths • Recursively look for hot minimal subpaths • Split between children • Processed at lower recursive level
Results • Typically: • 30MB/sec program trace (@200MHz) • 1 MB/sec program path • 30 grammar rules per path fragment • 100,000 rules in grammar • Number of hot paths grows slowly with maximum length • Space sublinear in input size, time supralinear
Summary • Contributions • Stream out acyclic path fragments in order • Compress and structure with grammar • Find hot subpaths from whole program path • Limitations • 15x runtime slowdown • Space-based limits on runtime • High number of hot paths found
Questions • What other potentially-useful information does this data structure give? • Order-dependent code errors • What potential for optimization does this open up? • Other applications? • Experimental hot-path results?