
A Dictionary Construction Technique for Code Compression Systems with Echo Instructions

A Dictionary Construction Technique for Code Compression Systems with Echo Instructions. Philip Brisk. Jamie Macbeth. Ani Nahapetian. Majid Sarrafzadeh. {philip, macbeth, ani, majid}@cs.ucla.edu. Embedded and Reconfigurable Systems Lab. Computer Science Department.


Presentation Transcript


  1. A Dictionary Construction Technique for Code Compression Systems with Echo Instructions • Philip Brisk, Jamie Macbeth, Ani Nahapetian, Majid Sarrafzadeh • {philip, macbeth, ani, majid}@cs.ucla.edu • Embedded and Reconfigurable Systems Lab, Computer Science Department, University of California, Los Angeles • LCTES ’05, June 16, 2005, Chicago, IL

  2. Outline • Introduction: Code Compression • Dictionary Compression • Dictionary Construction • Overview of the Algorithm • Experimental Methodology and Results • Summary

  3. Introduction: Code Compression For Embedded Systems • Why Reduce Program Size? • Reduces Memory Requirements • Silicon Cost of Program Storage in on-chip ROMs • As Embedded Systems Become More Complex, Ever-More Functionality Will Migrate to Software • Costs of Runtime Decompression • Performance Overhead • Area of the Decoder Circuitry

  4. Dictionary Compression • Find Repeated Code Sequences • Place Each Sequence Into a Dictionary • Replace Each Sequence in the Program with a Codeword that Accesses the Dictionary [Figure: a program and its dictionary]

  5. CALD and Echo Instructions • CALD Instructions • Place each sequence in a dictionary • All Codewords Point to the Dictionary • Echo Instructions • Leave one Instance of the Sequence Inline • All Codewords Point to the Sequence [Figure: program with a separate dictionary (CALD) versus program with the sequence kept inline (echo)]
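The difference between the two codeword styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: instructions are modeled as strings, a single repeated sequence is given in advance, and the tuple encodings `("CALD", address, length)` and `("ECHO", address, length)` are hypothetical stand-ins, not the actual instruction formats.

```python
def find_occurrences(program, seq):
    """Return start indices of non-overlapping occurrences of seq in program."""
    hits, i, n = [], 0, len(seq)
    while i + n <= len(program):
        if program[i:i + n] == seq:
            hits.append(i)
            i += n
        else:
            i += 1
    return hits


def compress_cald(program, seq):
    """CALD style: seq moves to a dictionary; every occurrence becomes a codeword."""
    hits = set(find_occurrences(program, seq))
    out, i = [], 0
    while i < len(program):
        if i in hits:
            # Hypothetical codeword: (opcode, dictionary address, length).
            out.append(("CALD", 0, len(seq)))
            i += len(seq)
        else:
            out.append(program[i])
            i += 1
    return out, list(seq)


def compress_echo(program, seq):
    """Echo style: the first occurrence stays inline; later ones point back to it."""
    hits = find_occurrences(program, seq)
    later = set(hits[1:])
    out, i = [], 0
    while i < len(program):
        if i in later:
            # Nothing before the first occurrence is replaced, so its index
            # in the compressed program equals its index in the original.
            out.append(("ECHO", hits[0], len(seq)))
            i += len(seq)
        else:
            out.append(program[i])
            i += 1
    return out


program = ["a", "b", "c", "x", "a", "b", "c", "y", "a", "b", "c"]
seq = ["a", "b", "c"]
cald_program, dictionary = compress_cald(program, seq)
echo_program = compress_echo(program, seq)
print(len(cald_program) + len(dictionary))  # 8 slots: 5 in the program, 3 in the dictionary
print(len(echo_program))                    # 7 slots: no separate dictionary
```

In this toy case the echo variant saves one slot because the inline instance doubles as the dictionary entry.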

  6. Compression Algorithms • The Traditional Approach: Compression Performed at Link Time • Substring Matching [Fraser et al., 1984] • + Register Renaming [Cooper and McIntosh, 1999], [Debray et al., 2000] • + Instruction Rescheduling [De Sutter et al., 2002] • Our Approach is Somewhat Different… • Identify Repeated Isomorphic Patterns that Occur within the Intermediate Representation PRIOR TO Register Allocation [Brisk et al., 2004]

  7. Dictionary Construction • Sequence 1 (DAG 1): A: R1 ← R2 + R3; B: R4 ← R5 + R6; C: R7 ← R1 + R4 • Sequence 2 (DAG 2): A: R1 ← R2 + R3; C: R7 ← R1 + R4 • DAG 2 is isomorphic to a subgraph of DAG 1 • 2 Schedules Exist for DAG 1 • Dictionary 1 (DAG 1 scheduled as A, B, C, with a separate entry A, C for DAG 2): 5 instructions • Dictionary 2 (DAG 1 scheduled as B, A, C, with DAG 2 sharing the A, C portion): 3 instructions

  8. Isomorphic Pattern Generation • Edge Contraction • Add an Operation to a Pattern • Combine 2 Patterns into a Larger One • Build a Subgraph Hierarchy (SH) [Figure, animated across slides 8 to 19: patterns T1 through T4 are generated by edge contraction and added to the SH one at a time]

  20. An SH Grammar • The SH is also a DAG • Generate a pattern Tk from sub-patterns Ti and Tj • Contract edge (Ti, Tj) • Create a Production: Tk → Ti Tj • Examples: T2 → x T1, T4 → T3 T2 [Figure: SH over x, T1, T2, T3, and T4]

  21. Derivations and Scheduling • Grammar: G1 → G2 G3; G2 → G4 b; G3 → G5 g; G4 → a c; G5 → G6 f; G5 → G7 e; G6 → d e; G7 → d f • Derivations: using G5 → G6 f yields the schedule acbdefg; using G5 → G7 e yields the schedule acbdfeg [Figure: DAG over operations a through g and the two derivation trees rooted at G1]
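The link between derivations and schedules can be checked mechanically. The sketch below encodes the slide's grammar as a Python dictionary (the names `GRAMMAR` and `schedules` are illustrative, not from the presentation) and enumerates every terminal string derivable from a symbol; each string is one schedule of the pattern.

```python
from itertools import product

# Productions from the slide; lowercase letters are operations (terminals).
GRAMMAR = {
    "G1": [["G2", "G3"]],
    "G2": [["G4", "b"]],
    "G3": [["G5", "g"]],
    "G4": [["a", "c"]],
    "G5": [["G6", "f"], ["G7", "e"]],
    "G6": [["d", "e"]],
    "G7": [["d", "f"]],
}


def schedules(symbol, grammar=GRAMMAR):
    """Return every terminal string derivable from symbol."""
    if symbol not in grammar:
        return [symbol]  # a terminal derives only itself
    results = []
    for rhs in grammar[symbol]:
        # Expand each right-hand-side symbol, then concatenate in order.
        parts = [schedules(s, grammar) for s in rhs]
        for combo in product(*parts):
            results.append("".join(combo))
    return results


print(schedules("G1"))  # ['acbdefg', 'acbdfeg']
```

G5 is the only symbol with two productions, which is why exactly two schedules appear.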

  22. Compatibility • Ti, Tj: patterns • Si, Sj: schedules for Ti, Tj • Assume Ti is a Subgraph of Tj • We want Ti and Tj to Share the Same Dictionary Entry • Then Si must be a Contiguous Subsequence of Sj • AC is a Contiguous Subsequence of BAC but not ABC [Figure: schedules A, B, C and B, A, C of A: R1 ← R2 + R3; B: R4 ← R5 + R6; C: R7 ← R1 + R4, next to the sub-pattern schedule A, C]
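The compatibility condition on this slide reduces to a substring test. A minimal sketch, assuming schedules are represented as strings or lists of operation labels (the function name is illustrative):

```python
def is_contiguous_subsequence(si, sj):
    """True if schedule si appears as one unbroken run inside schedule sj."""
    n = len(si)
    return any(sj[k:k + n] == si for k in range(len(sj) - n + 1))


print(is_contiguous_subsequence("AC", "BAC"))  # True: A and C are adjacent
print(is_contiguous_subsequence("AC", "ABC"))  # False: B separates A and C
```

Scheduling the larger pattern as BAC therefore lets both patterns share one dictionary entry, while ABC does not.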

  23. Convex Cuts in DAGs • Let G = (V, E) be a DAG • A Cut is a Partition of V • A Convex Cut cannot have edges that cross the boundary of a cut in BOTH directions • SH Construction Ensures Convex Cuts [Figure: a scheduling DAG with a convex cut and a non-convex cut]
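The convexity condition stated here can be tested directly on an edge list. The sketch below assumes a two-way cut described by one side of the partition; the function name and the example graphs are illustrative.

```python
def is_convex_cut(edges, part):
    """Check the slide's condition for the cut (part, V - part).

    edges: iterable of directed (u, v) pairs of a DAG.
    part:  the vertices on one side of the cut.
    The cut is convex unless edges cross the boundary in both directions.
    """
    part = set(part)
    leaving = any(u in part and v not in part for u, v in edges)
    entering = any(u not in part and v in part for u, v in edges)
    return not (leaving and entering)


# DAG of the three-instruction example: A and B both feed C.
print(is_convex_cut([("A", "C"), ("B", "C")], {"A", "C"}))  # True

# Hypothetical chain a -> b -> c: {a, c} skips over b, so edges cross both ways.
print(is_convex_cut([("a", "b"), ("b", "c")], {"a", "c"}))  # False
```

A non-convex side cannot be scheduled as one contiguous block, because some operation outside it must run between two operations inside it.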

  24. Convex Cuts and Compatibility [Figure: the productions G1 → G2 G3 and G1 → G4 G5 each induce a convex cut of G1 (operations a through g), labeled G1→(2,3) and G1→(4,5); superimposing both cuts, G1→(2,3),(4,5), produces a CYCLE]

  25. Generalized Compatibility • Given a Set of Productions with G1 on the LHS: G1 → G2 G3, G1 → G4 G5, …, G1 → G2k G2k+1 • How can we Tell if they are Compatible? • Three Criteria Equivalent to Compatibility: • G1→(2,3),(4,5),…,(2k,2k+1) is Acyclic • G2 ⊂ G4 ⊂ … ⊂ G2k • G2k+1 ⊂ … ⊂ G5 ⊂ G3 • The Pragmatic Question: If all Productions are NOT Compatible, what is the Largest Compatible Subset?

  26. The Subset/Subgraph View of Compatibility and Scheduling • Construct a Schedule Si for Gi • Construct a Schedule Sj-i for Gj - Gi • Construct a Schedule Sj = Si Sj-i for Gj [Figure: Gi nested inside Gj, with Sj formed by concatenating Si and Sj-i]

  27. A Production Compatibility Graph • Represent the Subgraph Relation as a DAG called the Production Compatibility Graph (PCG) • Productions G1 → Gi… and G1 → Gj… create vertices Gi and Gj • Add an Edge (Gi, Gj) to the PCG if: 1. Gi ⊂ Gj; 2. There is no Gk such that Gj ⊃ Gk ⊃ Gi • Any PATH in the PCG Corresponds to a Subset of Patterns that can be Scheduled Contiguously within a Dictionary entry for G1
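The two edge conditions amount to keeping only the immediate containments among the sub-patterns. The sketch below makes a simplifying assumption that is not in the slides: each pattern is modeled as the set of operations it covers, so the subgraph relation becomes strict set containment. The function name and the pattern `H` are hypothetical; G4 = {a, c} and G2 = {a, b, c} follow the grammar on slide 21.

```python
def build_pcg(patterns):
    """Build PCG edges from a dict mapping pattern name to its operations.

    An edge (Gi, Gj) is added when Gi is strictly contained in Gj and no
    other pattern Gk lies strictly between them.
    """
    sets = {name: frozenset(ops) for name, ops in patterns.items()}
    edges = []
    for i, gi in sets.items():
        for j, gj in sets.items():
            if gi < gj and not any(gi < gk < gj for gk in sets.values()):
                edges.append((i, j))
    return sorted(edges)


patterns = {
    "G4": "ac",    # from G4 -> a c
    "G2": "acb",   # from G2 -> G4 b
    "H": "acbd",   # hypothetical larger sub-pattern, for illustration only
}
print(build_pcg(patterns))  # [('G2', 'H'), ('G4', 'G2')]
```

No edge (G4, H) appears because G2 sits between them, so the path G4, G2, H already records that all three can be nested in one schedule.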

  28. PCG Example [Figure: pattern G1 over operations a through g, its sub-patterns G2 through G11, and the PCG built from them]

  29. Algorithm Overview • Recall that the Subgraph Hierarchy is a DAG • Process SH Entries in Topological Order • All Sub-Patterns Processed Before Each Pattern • Construct a PCG for each SH Entry • Assign Vertex Weights to Each Pattern based on the Number of Sub-Patterns in the Dictionary Entry • Find Max Vertex-Weighted Path in the PCG • Determine the Maximum Gain Pattern in the SH • Remove the Max Gain Pattern – and all Sub-Patterns Selected for its Dictionary Entry • Repeat until the SH is Empty
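The "Max Vertex-Weighted Path" step is a standard dynamic program over a topological order of a DAG. The sketch below is one way to write it, not the authors' implementation; it assumes non-negative vertex weights, uses Python's standard `graphlib` module (Python 3.9+), and the vertex names and weights in the example are made up.

```python
from graphlib import TopologicalSorter


def max_weight_path(weights, edges):
    """Return (path, total) for a maximum vertex-weighted path in a DAG.

    weights: dict vertex -> non-negative weight (assumption of this sketch).
    edges:   iterable of directed (u, v) pairs.
    """
    preds = {v: set() for v in weights}
    for u, v in edges:
        preds[v].add(u)

    best, prev = {}, {}
    # static_order() yields every vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        p = max(preds[v], key=lambda u: best[u], default=None)
        best[v] = weights[v] + (best[p] if p is not None else 0)
        prev[v] = p

    end = max(best, key=best.get)
    total = best[end]
    path = []
    while end is not None:  # walk back along the chosen predecessors
        path.append(end)
        end = prev[end]
    return path[::-1], total


weights = {"A": 1, "B": 2, "C": 1, "D": 3}  # hypothetical PCG vertex weights
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(max_weight_path(weights, edges))  # (['A', 'B', 'D'], 6)
```

Each vertex and edge is visited once, so the step is linear in the size of the PCG.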

  30. Experimental Framework • Algorithm Built into the Machine SUIF Compiler • Consolidate Each Application using link_suif Pass • All Unrolled Loops Manually Re-rolled • Standard Front End Compilation Script • One Round of Constant Folding/DCE • Instruction Selection for Alpha Architecture • ARM Back End Recently Released… • Detect Recurring Isomorphic Patterns in the IR • Analysis described in [Brisk et al., 2004] • Dictionary Construction as Described Here

  31. Experimental Methodology • Cannot Compare with Substring Matching • Many Schedules Exist for Each DAG • Substring Matching Assumes Scheduled Code • How to Determine the Best Schedule for Each DAG? • Our Algorithm Determines a Schedule for the Entire Set of DAGs to Maximize Pattern Overlap • Naïve Approach – Each Pattern Gets Its Own Dictionary Entry • Our Approach - Isomorphism/Scheduling

  32. Experimental Results • Applications Taken from MediaBench [Lee et al., 1997]

  33. Compilation Time

  34. Conclusion • Algorithm Given for Dictionary Construction • What Is Built is Actually an Intermediate Representation of a Dictionary • Combination of 3 Classically Hard Problems • Graph/Subgraph Isomorphism • Scheduling • Dictionary Construction/Compression • Future Work: Register Allocation and Assignment • Make a Best Effort to Assign Registers So that Isomorphic Patterns have Identical Register Usage

  35. References • 1. Brisk, P., Nahapetian, A., and Sarrafzadeh, M. Instruction Selection for Compilers that Target Architectures with Echo Instructions, SCOPES 2004. • 2. Fraser, C. W., Myers, E., and Wendt, A. Analyzing and Compressing Assembly Code, Symposium on Compiler Construction, 1984. • 3. Cooper, K. D., and McIntosh, N. Enhanced Code Compression for Embedded RISC Processors, PLDI 1999. • 4. De Sutter, B., De Bus, B., and De Bosschere, K. Sifting out the Mud: Low-Level C++ Code Reuse, OOPSLA 2002. • 5. Debray, S., Evans, W., Muth, R., and De Sutter, B. Compiler Techniques for Code Compaction, TOPLAS, 2000. • 6. Lee, C., Potkonjak, M., and Mangione-Smith, W. H. MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems, MICRO-30, 1997.

  36. Questions ?
