Cost Effective Memory Dependence Prediction Using Speculation Levels and Color Sets

Cost Effective Memory Dependence Prediction Using Speculation Levels and Color Sets. Soner Önder Michigan Technological University, Houghton MI www.cs.mtu.edu/~soner. Outline. Background Memory dependence prediction. Pairing based approach. Store sets. Color sets Notion of color sets.

Presentation Transcript


  1. Cost Effective Memory Dependence Prediction Using Speculation Levels and Color Sets Soner Önder Michigan Technological University, Houghton MI www.cs.mtu.edu/~soner

  2. Outline • Background • Memory dependence prediction. • Pairing based approach. • Store sets. • Color sets • Notion of color sets. • Color set implementation. • Color set predictor. • Instruction window modifications. • Experimental evaluation • Basic policy. • Aggressive policy.

  3. Memory Dependence Prediction • Assume ST-2, ST-p and LD-s all access the same memory location. • If we issue LD-s at this point in time, we’ll get a memory order violation. • If we know load LD-s is dependent on store ST-p, we can issue the load at the right time. [Figure: instruction window, listed as Seq. / Instruction / Ready: 1 / ST-1 / No; 2 / ST-2 / Yes; 3 / ST-3 / No; p / ST-p / Yes; p+1 / ST-p+1 / Yes; p+2 / ST-p+2 / No; p+3 / LD-s / Yes. Separate entry: p / St-p / No.]

  4. Dynamic Memory Disambiguation • Problem: • In the presence of unresolved stores in the instruction window, which load(s) must be held? • Ideal Solution: • Wait only for the producer store. • Simple Solutions: • Wait for all - no speculation. • Issue blindly - blind speculation.
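The three options on this slide can be contrasted with a minimal Python sketch. The `Store` record, the addresses, and the three-entry window are hypothetical and only loosely follow the example on slide 3; the "ideal" check reads store addresses that real hardware would not yet know, which is exactly why a predictor is needed.

```python
from dataclasses import dataclass


@dataclass
class Store:
    name: str      # label only, e.g. "ST-p"
    addr: int      # address the store writes (an oracle's view)
    issued: bool   # True once the store has issued


def wait_for_all(older_stores):
    """No speculation: the load issues only after every older store has issued."""
    return all(s.issued for s in older_stores)


def issue_blindly(older_stores):
    """Blind speculation: the load never waits for older stores."""
    return True


def wait_for_producer(older_stores, load_addr):
    """Ideal: wait only for the youngest older store to the same address."""
    for s in reversed(older_stores):  # youngest store first
        if s.addr == load_addr:
            return s.issued
    return True  # no older store writes this address


# Hypothetical window: ST-2 and ST-p write the address the load reads.
window = [
    Store("ST-1", 0x10, issued=False),
    Store("ST-2", 0x40, issued=True),
    Store("ST-p", 0x40, issued=False),
]
```

With this window, blind speculation would issue the load before ST-p and cause a memory order violation, waiting for all stores would also stall on the unrelated ST-1, and the ideal policy waits for ST-p alone.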

  5. Memory dependence prediction (Moshovos et al. 1997-1998) • Earlier work concentrated mainly on predicting precise dependencies among pairs of load/store instructions: • To enable early issuing of loads through memory dependence prediction. • To streamline communication so that values can be passed directly from producers to consumers instead of through memory. • Emphasis has been given to identifying the precise store instruction a load may depend on.

  6. Store-set Memory Dependence Predictor (Chrysos & Emer - 1998) • A store set is the set of all stores a load has been observed to be dependent on. • Initially employ blind speculation for loads. • Upon a memory order violation, create a store set for the offending load and store. • Next time the same load is encountered, make the load wait until the store issues. • A store set may contain multiple stores: chain the stores and make the load dependent upon the last store.

  7. Store-set Implementation • Dependence information is digested to create SETS of colliding instructions. • Each set tells exactly which stores a load should wait for. • Sufficiently large tables yield performance of an ORACLE. [Figure labels: PC, SSID, LFST]
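The store-set mechanism of slides 6 and 7 can be sketched as two lookups: the PC selects a store set ID (SSID), and the SSID selects the last fetched store (LFST) of that set. The Python below is a simplified illustration, not the published design: it uses unbounded dictionaries instead of fixed-size tables, a naive rule for assigning SSIDs, and hypothetical method names and instruction numbers (`inum`).

```python
class StoreSetPredictor:
    """Simplified store-set sketch: PC -> SSID -> last fetched store."""

    def __init__(self):
        self.ssit = {}        # PC -> store set ID (SSID)
        self.lfst = {}        # SSID -> inum of last fetched, not yet issued store
        self._next_ssid = 0

    def on_violation(self, load_pc, store_pc):
        """Put the offending load and store into the same store set."""
        ssid = self.ssit.get(load_pc, self.ssit.get(store_pc))
        if ssid is None:
            ssid = self._next_ssid
            self._next_ssid += 1
        self.ssit[load_pc] = ssid
        self.ssit[store_pc] = ssid

    def fetch_store(self, pc, inum):
        """Return the earlier store this store is chained behind, if any."""
        ssid = self.ssit.get(pc)
        if ssid is None:
            return None
        previous = self.lfst.get(ssid)
        self.lfst[ssid] = inum  # this store becomes the last one in the chain
        return previous

    def fetch_load(self, pc):
        """Return the inum of the store the load must wait for, or None."""
        ssid = self.ssit.get(pc)
        if ssid is None:
            return None  # no known dependence: issue speculatively
        return self.lfst.get(ssid)

    def store_issued(self, pc, inum):
        """Clear the LFST entry once the last store of the set has issued."""
        ssid = self.ssit.get(pc)
        if ssid is not None and self.lfst.get(ssid) == inum:
            del self.lfst[ssid]
```

Because stores of the same set are chained through the LFST entry, a load only needs to wait for the last store, as the final bullet of slide 6 describes.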

  8. Color Set predictor • Instead of • predicting precise dependencies among pairs of loads/stores or • constructing sets of store and load instructions which collided in the past, • We assign the processor, load and store instructions various speculation levels (colors) and predict the speculation level (i.e., the color) at which a load or store can be issued without a collision. [Figure label: Predictor size]

  9. Color Set predictor • Since we only try to predict the speculation level, we expect to have: • smaller storage for the predictor, • better performance at smaller hardware budgets, • faster implementations, • power savings and • more collisions.

  10. So, it is something like this • The rules governing the color change: policies. • We investigate two policies, a basic policy and an aggressive policy. [Figure: color scales 00 01 10 11 for the processor and for a load]

  11. Load instruction selection Eligible load instructions 00 01 10 11 Current processor color

  12. Load instruction selection Eligible load instructions 00 01 10 11 Current processor color

  13. Load instruction selection Eligible load instructions 00 01 10 11 Current processor color

  14. Load instruction selection Eligible load instructions 00 01 10 11 Current processor color

  15. Instruction window extensions [Figure: window details. Labels: Inhibit color, Global color, <= comparison, Issue?, Instructions entering window]
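The figure labels on this slide (inhibit color, global color, a `<=` comparison, and an "Issue?" output) suggest a per-entry comparison in the instruction window. The Python sketch below assumes that a load is eligible when its inhibit color is less than or equal to the global (processor) color. The direction of the comparison, the function names, and the sample window are assumptions, and how color 00 ("No speculation") is treated is not recoverable from the transcript, so the comparison is applied uniformly.

```python
def may_issue(inhibit_color, global_color):
    """Assumed rule: an entry may issue when its color does not exceed the global color."""
    return inhibit_color <= global_color


def eligible_loads(window, global_color):
    """Select the loads in the window that pass the color comparison."""
    return [name for name, color in window if may_issue(color, global_color)]


# Hypothetical window: one load of each 2-bit color.
window = [("LD-a", 0b00), ("LD-b", 0b01), ("LD-c", 0b10), ("LD-d", 0b11)]
```

Under this assumption, raising the processor color by one step makes exactly one more color class of loads eligible, which matches the progression suggested by the load instruction selection slides.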

  16. Collisions [Figure labels: load, store, load, store with colors 01, 01, 10, 01; current processor color on the scale 00 01 10 11]

  17. Color Set Predictor Basic Policy • 1. Basic policy gradually becomes aggressive when port utilization is low. • 2. Upon a collision, the load instruction is given a higher color and the store instruction is given a lower color. • 3. Processor runs at the smaller of the current processor color and the color of the store instructions. • 4. Rules 2 & 3 together run the processor at a lower speculation level than the level at which the prior collision occurred.
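Rules 1 to 3 of the basic policy can be written down as two small update functions. This is a sketch under stated assumptions: colors are 2-bit values that saturate at 00 and 11, each adjustment is a single step, and the low-utilization step is applied before the minimum over store colors. None of these details is confirmed by the slides, and the function names are hypothetical.

```python
MIN_COLOR, MAX_COLOR = 0b00, 0b11  # 2-bit colors, as on the slides


def on_collision(load_color, store_color):
    """Rule 2: raise the load's color and lower the store's color (saturating)."""
    new_load = min(load_color + 1, MAX_COLOR)
    new_store = max(store_color - 1, MIN_COLOR)
    return new_load, new_store


def next_processor_color(current, store_colors_in_window, port_utilization_low):
    """Rules 1 and 3: step up when ports are idle, then cap at the lowest store color."""
    if port_utilization_low:
        current = min(current + 1, MAX_COLOR)  # rule 1: gradually more aggressive
    # Rule 3: run at the smaller of the processor color and the store colors.
    return min([current, *store_colors_in_window])
```

Since a colliding store comes back with a lower color, the minimum in rule 3 pulls the processor below the level at which the collision happened, which is the effect rule 4 describes.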

  18. Color Set Predictor Aggressive Policy • 1. Aggressive policy switches to maximum speculation level when port utilization is low. • 2. The load instruction is given a higher color and a store instruction is specifically marked upon a collision. • 3. Processor decrements the current processor color when a colliding store is detected. • 4. As a result, the processor runs at the highest speculation level that won’t result in a collision and at a different color than the color it had during the collision.
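The aggressive policy differs from the basic one in how the processor color moves. The Python sketch below assumes 2-bit saturating colors, uses the 11 encoding as the colliding-store mark (the predictor slide lists "11 Level 3/Colliding store"), and applies the low-utilization jump before the decrement. The function names and that ordering are assumptions.

```python
MIN_COLOR, MAX_COLOR = 0b00, 0b11
COLLIDING_STORE = 0b11  # assumed mark for a store that has collided before


def on_collision(load_color):
    """Rule 2: raise the load's color and mark the store as colliding."""
    return min(load_color + 1, MAX_COLOR), COLLIDING_STORE


def next_processor_color(current, entering_store_color, port_utilization_low):
    """Rules 1 and 3: jump to maximum when ports are idle, back off on a marked store.

    entering_store_color is None when no store enters the window this cycle.
    """
    if port_utilization_low:
        current = MAX_COLOR  # rule 1: switch to the maximum speculation level
    if entering_store_color == COLLIDING_STORE:
        current = max(current - 1, MIN_COLOR)  # rule 3: decrement on a colliding store
    return current
```

Compared with the basic policy, the processor here returns to full speculation in one step and backs off only while stores with a collision history are entering the window.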

  19. Color Set Predictor • Accessed early in the pipeline using L/S PC • Updated upon collision/successful speculation • Table: L/S PC → L/S color (example entry: 10) • Basic Policy colors: 00 No speculation, 01 Level 1, 10 Level 2, 11 Level 3 • Aggressive Policy colors: 00 No speculation, 01 Level 1, 10 Level 2, 11 Level 3/Colliding store

  20. Processor’s colorful perspective: Basic policy • When port utilization is low, the processor moves on to next color. • Processor assumes the lowest ranking store’s color. [Figure: processor color states 00 01 10 11 with transitions labeled Low port utilization and Colliding stores]

  21. Processor’s colorful perspective: Aggressive policy • When a colliding store enters the window, the processor decrements its color. • When port utilization is low, processor switches to red. [Figure: processor color states 00 01 10 11 with transitions labeled Low port utilization and Colliding stores]

  22. Load instruction color states: Both policies [Figure: load color states 00 01 10 11 with transitions labeled Successful speculation and Collision]
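The predictor described on slides 19 and 22 can be pictured as a small PC-indexed table of 2-bit colors that is updated on collisions and on successful speculation. In the Python sketch below, the slides confirm only that a collision raises a load's color. The index function, the initial color, the table size of 128 entries, and the choice to lower the color by one step on successful speculation are assumptions made for illustration.

```python
MIN_COLOR, MAX_COLOR = 0b00, 0b11


class ColorSetPredictor:
    """Sketch of a PC-indexed table of 2-bit saturating colors."""

    def __init__(self, entries=128, initial_color=MIN_COLOR):
        self.entries = entries
        self.table = [initial_color] * entries

    def _index(self, pc):
        # Hypothetical index: drop the low 2 bits of the PC, then wrap.
        return (pc >> 2) % self.entries

    def lookup(self, pc):
        """Read the predicted color early in the pipeline using the load/store PC."""
        return self.table[self._index(pc)]

    def on_collision(self, pc):
        """A collision moves the entry one color up (saturating at 11)."""
        i = self._index(pc)
        self.table[i] = min(self.table[i] + 1, MAX_COLOR)
        return self.table[i]

    def on_success(self, pc):
        """Assumed: successful speculation moves the entry one color down."""
        i = self._index(pc)
        self.table[i] = max(self.table[i] - 1, MIN_COLOR)
        return self.table[i]
```

Each entry holds only two bits and no tags, so instructions whose PCs map to the same entry share a color.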

  23. Simulation Framework • Aggressive out-of-order superscalar processor: • 8 instructions/cycle fetch/dispatch • 16 instructions/cycle retire width • 64 entry centralized reservation station • 8 symmetric functional units • Multi-block gshare fetch unit • 2 memory ports r/w • Perfect D-cache • Simulated using cycle-accurate simulators generated automatically from ADL descriptions using the FAST system.

  24. Performance Spec Fp Arithmetic Mean

  25. Performance Spec Fp Harmonic Mean

  26. Performance Spec Int Arithmetic Mean

  27. Performance Spec Int Harmonic Mean

  28. Individual benchmarks 128-Fp

  29. Individual benchmarks 4096-Fp

  30. Individual benchmarks 128-Int

  31. Individual benchmarks 4096-Int

  32. So ... • Cost effective dependence prediction. • Why does it work? • Design space: • Number of colors/number of entries. • Confidence mechanisms. • Other policies. • Power consumption • Disable chunks of predictor and use basic policy; • Enable and become aggressive.

  33. Have a colorful evening Soner Önder Michigan Technological University Antalya, Turkey
