CS 152, Spring 2010 Section 3. Andrew Waterman. University of California, Berkeley. Agenda. Precise Exceptions Problem Set 1 review. Precise Exceptions. Goal: provide illusion of sequential, non-overlapping instruction execution
CS 152, Spring 2010
Section 3
Andrew Waterman
University of California, Berkeley
Agenda
• Precise Exceptions
• Problem Set 1 review
Precise Exceptions
• Goal: provide the illusion of sequential, non-overlapping instruction execution
  • All instructions before the exceptional one appear to have executed completely
  • The exceptional instruction and all instructions after it appear not to have executed at all
Precise Exceptions
• Two requirements for precise exceptions:
  • Keep architected state consistent
  • Take the correct (oldest) exception
• How to keep architected state consistent:
  • Don’t update it until the instruction is guaranteed to commit
  • Need to be able to draw a single “commit line”:
    • Before the commit line, no architected state is modified
    • At the commit line, it is known whether or not an exception will occur
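The two requirements above can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not a description of any real pipeline: the `Instr` class, the `run` function, and the register names are all hypothetical. Results are computed ahead of time, the register file is written only at the commit line, and commit happens in program order so that the oldest exception wins even if a younger instruction faulted too.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """One in-flight instruction (hypothetical model, not a real ISA)."""
    name: str
    dest: Optional[str] = None        # architected register to write, if any
    value: int = 0                    # result computed before the commit line
    exception: Optional[str] = None   # set by whichever stage detected a fault


def run(program, regs):
    """Commit instructions in program order; stop at the oldest exception.

    Returns (index, cause) of the faulting instruction, or None.
    """
    for index, instr in enumerate(program):
        # Commit line: every earlier stage has already reported its faults.
        if instr.exception is not None:
            # This instruction and all younger ones are squashed;
            # none of them has touched regs.
            return index, instr.exception
        if instr.dest is not None:
            regs[instr.dest] = instr.value   # the only architected update
    return None


regs = {"r1": 0, "r2": 0, "r3": 0}
program = [
    Instr("add", dest="r1", value=5),
    Instr("lw", dest="r2", value=9, exception="page fault"),
    Instr("sub", dest="r3", value=7, exception="illegal opcode"),  # younger
]
print(run(program, regs))
print(regs)
```

In this run the `lw` fault is reported rather than the younger `sub` fault, `r1` holds the result of the completed `add`, and `r2` and `r3` are untouched.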
Problem Set 1 Review – P1
• Skipping 1.A-1.C (solutions online)
• 1.D: What did you notice about relative code size for CISC, RISC, and stack machines?
• 1.E: What optimization strategies proved effective? Did anyone beat my solution (7 instructions)?
Problem Set 1 Review – P2
• For microcode problems, the key is to get the pseudocode right
  • Control signals follow readily from the pseudocode
• Sanity checks:
  • Only one device may drive the bus
  • The bus should probably be driven every cycle
  • Don’t read from a register whose write-enable was a don’t-care
Problem Set 1 Review – P2
• Most people got P2 A/B correct, but didn’t use don’t-cares aggressively
  • If you won’t read the A/B/MA registers again, their write-enables should be don’t-cares
  • If enMem is off, Mem Wr is a don’t-care
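Some of these checks are mechanical enough to script. The sketch below is only an illustration: it covers the two bus-driver checks and the enMem rule, and it assumes each microcode row is a dictionary of control signals. Only `enMem` is named in the slides; `enReg`, `enALU`, `enImm`, the key `MemWr` (standing in for "Mem Wr"), and the `"*"` don't-care marker are assumed names.

```python
# Hypothetical names for the tri-state enables that can drive the bus.
# Only enMem appears in the slides; the others are assumed for illustration.
BUS_DRIVERS = ("enReg", "enALU", "enMem", "enImm")
DONT_CARE = "*"


def check_microinstruction(signals):
    """Return a list of warnings for one row of a microcode table."""
    warnings = []
    driving = [name for name in BUS_DRIVERS if signals.get(name) == 1]
    if len(driving) > 1:
        warnings.append("multiple bus drivers: " + ", ".join(driving))
    elif not driving:
        warnings.append("bus is not driven this cycle")
    # With memory disabled, the write signal cannot matter.
    if signals.get("enMem") == 0 and signals.get("MemWr") != DONT_CARE:
        warnings.append("MemWr could be a don't-care while enMem is off")
    return warnings


good_row = {"enReg": 1, "enALU": 0, "enMem": 0, "enImm": 0, "MemWr": "*"}
bad_row = {"enReg": 1, "enALU": 1, "enMem": 0, "enImm": 0, "MemWr": 0}
print(check_microinstruction(good_row))
print(check_microinstruction(bad_row))
```

The third sanity check (reading a register whose write-enable was a don't-care) needs state carried across rows, so it is left out of this sketch.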
Problem Set 1 Review – P2
• P2A: M[rd] <- M[rs] + M[rt]
  • MA <- R[rs]
  • A <- Mem
  • MA <- R[rt]
  • B <- Mem
  • MA <- R[rd]
  • Mem <- ALU (A+B); uBr=J
• Note efficiency: 9 cycles vs. 18 for ld, ld, add, st
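To see why the sequence needs three separate address loads, here is a small Python walk-through of the execute phase. It is a sketch under assumptions: the function name `addm` is made up, `M[rd]` is read as "memory at the address held in R[rd]", the second operand address is taken from R[rt] as the instruction semantics require, and memory is modeled as a dictionary with single-cycle access.

```python
def addm(regs, mem, rs, rt, rd):
    """Model M[R[rd]] <- M[R[rs]] + M[R[rt]], one bus transfer per line.

    Returns the number of execute-phase micro-operations, assuming
    single-cycle memory; instruction fetch is not modeled here.
    """
    ma = regs[rs]        # MA  <- R[rs]
    a = mem[ma]          # A   <- Mem
    ma = regs[rt]        # MA  <- R[rt]
    b = mem[ma]          # B   <- Mem
    ma = regs[rd]        # MA  <- R[rd]
    mem[ma] = a + b      # Mem <- ALU (A+B), then jump back to fetch
    return 6


regs = {"r1": 100, "r2": 104, "r3": 108}
mem = {100: 3, 104: 4, 108: 0}
steps = addm(regs, mem, rs="r1", rt="r2", rd="r3")
print(steps, mem[108])
```

The six micro-operations here are fewer than the 9 cycles quoted above; the difference is presumably instruction fetch and dispatch, which this sketch leaves out.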
Problem Set 1 Review – P2
• P2B: if(--rs != 0) then branch
  • A <- R[rs]
  • R[rs] <- ALU (A-1); uBr=Z
  • A <- signext(imm)
  • PC <- A+B; uBr=J
• Recall that B <- PC+4 happened for free
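The same style of walk-through works for the decrement-and-branch microcode. Again this is only a sketch: the name `dbnez` is hypothetical, `imm` is assumed to be an already sign-extended byte offset, and the zero microbranch is assumed to return to fetch so that execution falls through to PC+4.

```python
def dbnez(regs, rs, imm, pc):
    """Decrement R[rs]; return the next PC.

    Branches to (pc + 4) + imm when the decremented value is nonzero,
    otherwise falls through to pc + 4.
    """
    b = pc + 4               # B <- PC+4, done during fetch ("for free")
    a = regs[rs]             # A <- R[rs]
    regs[rs] = a - 1         # R[rs] <- ALU (A-1); uBr=Z tests this result
    if regs[rs] == 0:
        return b             # zero: back to fetch, no branch taken
    a = imm                  # A <- signext(imm)
    return a + b             # PC <- A+B; uBr=J back to fetch


regs = {"r1": 2}
print(hex(dbnez(regs, "r1", imm=-8, pc=0x100)), regs)  # counter still nonzero
print(hex(dbnez(regs, "r1", imm=-8, pc=0x100)), regs)  # counter reaches zero
```

Because B already holds PC+4 from the fetch phase, the taken path needs only one extra register load before the final add.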
Problem Set 1 Review – P3
• 3.A: Load-use stalls are gone (lw -> add)
• 3.B: Address-calculation and store-data stalls appear (add -> lw/sw)
• 3.C: The compiler can schedule around load-use stalls, but now address calculation is costly, so the old pipeline was better
• 3.D: Anyone have a favorite solution?
• 3.E: What is the problem with precise state?
Problem Set 1 Review – P4
• Pipeline depth and microcoded vs. hardwired control are clearly NOT ISA visible
• CISC vs. RISC is ISA visible (it IS the ISA)
• The delay slot is ISA visible
• A stack machine’s number of physical registers isn’t ISA visible, provided the spill mechanism is automatic
Problem Set 1 Review – P5
(I/P = instructions per program, CPI = cycles per instruction, T = cycle time)
• Deeper pipelining
  • Doesn’t affect I/P, increases CPI, reduces T
• Adding a complex instruction
  • Reduces I/P if the compiler can use it; increases CPI, T?
• Reducing bypasses
  • Doesn’t affect I/P, increases CPI, reduces T
• Improving memory access speed
  • Doesn’t affect I/P, reduces either CPI or T
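These trade-offs all feed into the same product: execution time = (instructions per program) × CPI × cycle time. The short sketch below plugs in made-up numbers to show how a deeper pipeline can still win when the cycle-time reduction outweighs the CPI increase. Every number here is a hypothetical chosen for illustration, not data from the problem set.

```python
def execution_time_ns(instructions, cpi, cycle_time_ns):
    """Execution time = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cpi * cycle_time_ns


# Hypothetical baseline design.
baseline = execution_time_ns(instructions=1_000_000, cpi=1.2, cycle_time_ns=1.0)

# Hypothetical deeper pipeline: same instruction count, more stall
# cycles (higher CPI), but a shorter clock period (lower T).
deeper = execution_time_ns(instructions=1_000_000, cpi=1.5, cycle_time_ns=0.7)

speedup = baseline / deeper
print(f"baseline: {baseline:.0f} ns, deeper pipeline: {deeper:.0f} ns, "
      f"speedup: {speedup:.2f}x")
```

With different hypothetical numbers (say, CPI rising to 1.8 for the same cycle time of 0.7 ns) the deeper pipeline would lose, which is why each change has to be judged on all three factors at once.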
Questions? (short of “what’s on the quiz”)