
Circuit-Level Timing Speculation: The Razor Latch




  1. Circuit-Level Timing Speculation: The Razor Latch Developed by Trevor Mudge’s group at the University of Michigan, 2003

  2. We’ve Already Encountered Speculation in ECE 568 • Branch prediction • When a branch is encountered, guess whether it is taken or not • If the guess is correct, we have gained time • If the guess is incorrect, we must undo any incorrectly executed instructions and move on • Multi-word cache lines • When a cache miss is encountered, we bring in the entire cache line, not just the word we’re looking for • If the access pattern shows spatial locality, we are prefetching other words that the program will soon ask for, thereby saving time. • If the speculation is too aggressive (i.e., the cache lines are too long), we’ll fetch many words uselessly.

  3. Speculation (contd.) • Value Prediction • (Not covered in this course) • Idea is to predict what the value of a variable will be and use the predicted value. • If the predicted value was right, we gain some time; if it was wrong, we did some useless execution. If this execution changed processor state, these changes will have to be undone. • Not used in practice (to my knowledge): mainly an academic exercise so far.

  4. Speculating on Time • The pipeline clock cycle is the time by which each stage is guaranteed to complete its assigned operation • This time is a function of • Actual hardware parameters: Gate and wire delays vary within the same die, from one die to another, and from one wafer to another. • Data involved in the computation: • Example: Ripple-carry adder. Worst-case execution time is the time it takes to ripple the carry from the carry-in of the least significant stage to the carry-out of the most significant stage. Actual execution times may vary considerably. • Requiring the worst-case delays to be accounted for often forces designers to be overly conservative in setting the clock rates
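The data dependence of the ripple-carry example can be illustrated with a small experiment. This is a minimal sketch, assuming that adder delay is proportional to the longest carry chain; the function names and the 32-bit width are illustrative choices, not part of the original slides.

```python
import random

def longest_carry_chain(a: int, b: int, width: int = 32) -> int:
    """Longest run of adder stages that a single carry travels through."""
    carry = 0      # is a carry entering the current stage?
    run = 0        # stages the current carry has travelled so far
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:                  # generate: a new carry starts here
            carry, run = 1, 1
        elif (x ^ y) and carry:    # propagate: the incoming carry ripples on
            run += 1
        else:                      # kill, or nothing to propagate
            carry, run = 0, 0
        longest = max(longest, run)
    return longest

def average_chain(samples: int, width: int = 32, seed: int = 0) -> float:
    """Average longest carry chain over random operand pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        total += longest_carry_chain(rng.getrandbits(width),
                                     rng.getrandbits(width), width)
    return total / samples

if __name__ == "__main__":
    print("worst case:", longest_carry_chain(2**32 - 1, 1))
    print("random average:", average_chain(1000))
```

Under this model the worst case (all ones plus one) ripples through every stage, while random operands typically produce much shorter chains, which is the slack that timing speculation tries to exploit.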

  5. Timing Speculation: Basic Idea • Suppose F is the frequency at which the pipeline is guaranteed to function correctly • Run the pipeline at a somewhat higher rate, f. • Much of the time, this clock period, t_p=1/f, will be sufficient for all pipeline stages, and we’ll gain in execution speed • Some of the time, we may need more time: • Need to discover when this is the case • When this is the case, provide additional time by allowing the pipeline stages additional cycles to complete their operation

  6. Implementation • Recall that pipeline stages are separated by latches • Duplicate each pipeline latch by introducing a shadow latch • Consider any stage of the pipeline. Suppose it starts some activity at time 0. • At time t_p=1/f, latch the output of that stage into the regular pipeline latch. • At time T_p=1/F, latch the output of the stage into the shadow latch. • Compare the results of the regular and shadow latches • If they agree, • do nothing: running at a higher speed has paid off • If they don’t agree, • Use the result of the shadow latch as the correct one • Squelch the computation that the following stage began on the basis of the incorrect regular latch results • Restart the computation in the following stage using the correct results, as stored in the shadow latch
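The sample-and-compare step above can be sketched behaviorally. This is a simplified model, not the circuit from Ernst et al.: it assumes the stage output holds a stale value until the stage's delay has elapsed and the correct value afterwards, and the name `razor_sample` and its parameters are hypothetical.

```python
def razor_sample(stage_delay, correct_value, stale_value, t_p, T_p):
    """Model one stage: regular latch samples at t_p, shadow latch at T_p.

    Returns (value_to_use, error_detected).
    """
    if stage_delay > T_p:
        # The shadow latch is only trustworthy if the stage always
        # finishes within the guaranteed period T_p = 1/F.
        raise ValueError("stage delay exceeds the guaranteed period T_p")
    # The regular latch sees the correct value only if the stage was done by t_p.
    regular = correct_value if stage_delay <= t_p else stale_value
    # The shadow latch samples late enough to always see the correct value.
    shadow = correct_value
    error = regular != shadow
    # On a mismatch the shadow value is taken as the correct one.
    return shadow, error

if __name__ == "__main__":
    print(razor_sample(0.8, 42, 7, t_p=1.0, T_p=1.5))  # fast enough
    print(razor_sample(1.2, 42, 7, t_p=1.0, T_p=1.5))  # too slow for t_p
```

Note that if the stale value happens to equal the correct one, no error is flagged, and none is needed.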

  7. Unless otherwise stated, all figures are from Ernst, et al., MICRO-36, 2003.

  8. Issues to Consider • How aggressive should we be? • If f is too high, a large fraction of the results will require correction with the shadow latch and we’ll actually lose time • If f is too low, the clock will be unnecessarily slow and we won’t gain much
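The tradeoff in choosing f can be made concrete with a toy throughput estimate. This is a sketch under stated assumptions: every timing error costs a fixed number of extra cycles (one by default), and the list of stage delays is invented for illustration, not measured data.

```python
def effective_rate(f, delays, penalty_cycles=1):
    """Useful operations per unit time when clocking at frequency f.

    delays: sampled stage delays (same time unit as 1/f).
    Any delay longer than t_p = 1/f counts as a timing error that
    costs penalty_cycles extra cycles to correct.
    """
    t_p = 1.0 / f
    error_fraction = sum(d > t_p for d in delays) / len(delays)
    return f / (1.0 + error_fraction * penalty_cycles)

if __name__ == "__main__":
    # Hypothetical delay samples; the worst case (1.0) gives F = 1.0.
    delays = [0.6] * 8 + [0.8, 1.0]
    for f in (1.0, 1.25, 1.6, 1.8):
        print(f, effective_rate(f, delays))
```

With these made-up delays the rate rises as f moves above F, peaks at an intermediate f, and drops below the f = F baseline once nearly every operation needs correction.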

  9. Issues to Consider (contd.) • What about F? • The worst-case path (for the worst-case inputs) sets an upper bound on F: the shadow-latch period T_p=1/F must be at least the worst-case delay • What happens if F is too small? [This is one of the few instances in design when being too conservative at one level affects correctness of functioning!] • T_p=1/F may be so long that the results of the next computation propagate through the stage and arrive at the shadow latch before it samples • We’d then be comparing the results of two different operations!
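The short-path concern above can be written as a simple inequality check. This is a minimal sketch assuming an idealized model (no setup or hold times, no clock skew) in which the next operation is launched at t_p and the shadow latch samples at T_p; `shadow_latch_is_safe` and `min_path_delay` are hypothetical names.

```python
def shadow_latch_is_safe(f, F, min_path_delay):
    """True if the next operation cannot corrupt the shadow latch.

    The next operation starts at t_p = 1/f, so its earliest results
    reach the stage output at t_p + min_path_delay. That must be
    later than the shadow-latch sampling time T_p = 1/F.
    """
    t_p = 1.0 / f
    T_p = 1.0 / F
    return t_p + min_path_delay > T_p

if __name__ == "__main__":
    print(shadow_latch_is_safe(f=1.25, F=1.0, min_path_delay=0.3))
    print(shadow_latch_is_safe(f=1.25, F=0.5, min_path_delay=0.3))
```

In this model, lowering F (lengthening T_p) or raising f both shrink the margin, so the shortest path through each stage limits how far apart the two sampling times can be.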

  10. Metastability • If the input data is not stable when the clock transition happens, the output of the latch may float at a voltage that is in neither the 0 nor the 1 logic range • Duration of the metastable state is not bounded • Different gates may interpret such indeterminate voltages differently (in terms of logic values) • Cannot reduce the probability of metastability to zero: all we can do is to keep it sufficiently low for all practical purposes

  11. Recovery Technique 1: Global Clock Gating • If any stage detects a timing problem • Stall the entire pipeline for one clock cycle. • Use this additional clock cycle to recompute using the correct shadow-latch values
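The cost of global clock gating can be estimated with a simple cycle count. This is a sketch assuming that any cycle in which at least one stage flags a timing error costs exactly one extra stall cycle, regardless of how many stages erred; the function name and input layout are illustrative.

```python
def cycles_with_global_gating(error_flags_per_cycle):
    """Total cycles consumed, given per-cycle, per-stage error flags.

    error_flags_per_cycle: list of lists of booleans, one inner list
    per cycle with one flag per pipeline stage.
    """
    total = 0
    for flags in error_flags_per_cycle:
        total += 1              # the cycle itself
        if any(flags):
            total += 1          # whole pipeline stalls for one cycle
    return total

if __name__ == "__main__":
    trace = [
        [False, False, False],  # no errors
        [False, True, False],   # one stage errs: one stall
        [True, True, False],    # two stages err: still one stall
    ]
    print(cycles_with_global_gating(trace))
```

Because the stall is global, simultaneous errors in several stages share a single recovery cycle, but every stage pays for an error in any one of them.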

  12. Recovery Technique 2: Counterflow Pipelining • When a mismatch (between regular and shadow latch contents) is detected: • Assert a bubble signal, to specify that the erring pipeline slot is now to be considered a bubble. • In the subsequent cycle, inject the shadow latch value into the next stage, allowing the errant operation to continue with the correct values • Trigger a flush train, traveling backwards from the errant stage, flushing operations at each stage it visits (Question: Is this flush operation necessary? Can we do something else to avoid it?)

  13. Power Consumption: Using a Processor to Fry an Egg From: www.phys.ncku.edu.tw/~htsu/humor/fry_egg.html

  14. Power Density From: Hsu and Feng, “A Power-Aware Real-Time System…”, 2005

  15. Power Implications: Dynamic Power From Krishna & Lee: IEEE Trans. Computers, 2003.
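The slide's figure is not reproduced in this transcript. As a reminder of the standard first-order CMOS model (not taken from the cited paper), dynamic power scales as activity × capacitance × Vdd² × frequency. The sketch below uses that textbook formula; the parameter values are arbitrary.

```python
def dynamic_power(activity, capacitance_f, vdd, freq_hz):
    """First-order CMOS switching power: P = a * C * Vdd^2 * f (watts)."""
    return activity * capacitance_f * vdd ** 2 * freq_hz

if __name__ == "__main__":
    # Arbitrary illustrative values: 10% activity, 1 nF switched, 1 GHz.
    nominal = dynamic_power(0.1, 1e-9, 1.0, 1e9)
    scaled = dynamic_power(0.1, 1e-9, 0.9, 1e9)
    print(scaled / nominal)
```

In this model a 10% reduction in supply voltage at fixed frequency cuts switching power by about 19%, which is why supply voltage is such an attractive knob.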

  16. Static Power • Even when there is no switching, transistors leak current • Leakage power increases strongly with temperature and supply voltage; it falls as the threshold voltage rises (subthreshold leakage depends roughly exponentially on the threshold voltage).
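The temperature and threshold-voltage trends can be illustrated with the usual first-order subthreshold model, I ∝ exp(-Vth / (n·kT/q)). This is a simplified sketch: it ignores the supply-voltage dependence and the temperature dependence of Vth itself, and the prefactor `i0` and slope factor `n` are placeholder values.

```python
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, in V/K

def subthreshold_leakage(vth, temp_k, i0=1.0, n=1.5):
    """Relative subthreshold leakage current (arbitrary units).

    vth: threshold voltage in volts; temp_k: temperature in kelvin.
    i0 and n are placeholder model parameters, not fitted values.
    """
    thermal_voltage = K_OVER_Q * temp_k
    return i0 * math.exp(-vth / (n * thermal_voltage))

if __name__ == "__main__":
    print(subthreshold_leakage(0.3, 300.0))
    print(subthreshold_leakage(0.3, 350.0))  # hotter: more leakage
    print(subthreshold_leakage(0.4, 300.0))  # higher Vth: less leakage
```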

  17. Subthreshold leakage vs temperature From: Do, et al: Tech Report 2007-06, Dept of CSE, Chalmers Instt of Tech

  18. Leakage Current vs Vdd From Do et al., op cit.

  19. Voltage Control for Razor Latch System
