Computer Architecture Lecture 3 – Part 2 15th May, 2006. Abhinav Agarwal, Veeramani V. Preamble. Last Quiz Webpage – Lec Slides, Quiz Sol, Verilog Lab Optional Lectures Limited Projects. Outline. Simple Pipeline – hazards and solution Out of order execution Reg Renaming
Computer Architecture Lecture 3 – Part 2 15th May, 2006 Abhinav Agarwal, Veeramani V.
Preamble • Last Quiz • Webpage – Lec Slides, Quiz Sol, Verilog Lab • Optional Lectures • Limited Projects
Outline • Simple Pipeline – hazards and solution • Out of order execution • Reg Renaming • In order Commit
Quick recap – Pipelining source: http://cse.stanford.edu/class/sophomore-college/projects-00/risc/pipelining/
Control Hazard • Branch delay slot • bnz r1, label • add r1, r2, r3 • label: sub r1, r2, r3 • Saves one stall cycle. • Fetch on the negative clock edge to save another. [Figure: pipeline diagram for bez r1, label (IF ID EX MEM WB), with labels Bubble, IF, and Target IF]
Branch Prediction • Deeper pipelines make a mispredicted branch cost more cycles. • Static compiler techniques such as the branch delay slot would not work there. • Dynamically remember the last targets of each branch and decide on the basis of its history
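One way to picture "remember the last targets and decide from history" is the sketch below. It is only an illustration: the slide does not say which scheme is meant, so the 2-bit saturating counter, the class name `BranchPredictor`, and the dictionary-based tables are all assumptions made for this example.

```python
class BranchPredictor:
    """Toy dynamic predictor: per-branch 2-bit counter plus last taken target.

    The 2-bit scheme and all names here are assumptions, not from the slides.
    """

    def __init__(self):
        self.counters = {}  # branch pc -> counter in 0..3 (>= 2 means "taken")
        self.targets = {}   # branch pc -> target of the last taken execution

    def predict(self, pc):
        # Unknown branches start at 1 ("weakly not taken").
        taken = self.counters.get(pc, 1) >= 2
        return taken, (self.targets.get(pc) if taken else None)

    def update(self, pc, taken, target=None):
        # Move the counter toward the real outcome, saturating at 0 and 3.
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(c + 1, 3) if taken else max(c - 1, 0)
        if taken:
            self.targets[pc] = target


bp = BranchPredictor()
first_guess = bp.predict(0x40)       # nothing known yet
bp.update(0x40, True, 0x80)          # branch was actually taken to 0x80
second_guess = bp.predict(0x40)      # history now says "taken"
```

Because the counter has two bits, a single not-taken outcome after a run of taken ones does not immediately flip the prediction.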
Data Hazards • RAW hazard – Read after Write • add r1, r2, r3 • store r1, 0(r4) • WAW hazard – Write after Write • div r1, r3, r4 • … • add r1, r10, r5 • WAR hazard – Write after Read • Generally not relevant in simple pipelines
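The three hazard types above can be told apart by comparing the registers two instructions read and write. The sketch below is a hypothetical helper (`classify_hazards` and the `(dest, sources)` tuple encoding are assumptions for illustration), applied to the instruction pairs from the slide.

```python
def classify_hazards(first, second):
    """Return the hazards between two instructions.

    Each instruction is a (dest, sources) pair; `first` comes earlier in
    program order. A dest of None means the instruction writes no register.
    """
    d1, s1 = first
    d2, s2 = second
    hazards = set()
    if d1 is not None and d1 in s2:
        hazards.add("RAW")  # second reads what first writes
    if d1 is not None and d1 == d2:
        hazards.add("WAW")  # both write the same register
    if d2 is not None and d2 in s1:
        hazards.add("WAR")  # second overwrites what first still reads
    return hazards


# add r1, r2, r3 followed by store r1, 0(r4)
raw_pair = classify_hazards(("r1", ["r2", "r3"]), (None, ["r1", "r4"]))
# div r1, r3, r4 followed (later) by add r1, r10, r5
waw_pair = classify_hazards(("r1", ["r3", "r4"]), ("r1", ["r10", "r5"]))
```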
Remedies • Bypass values (Data forwarding) • RAW hazards are tackled this way • Not all RAW hazards can be solved by forwarding. E.g.: Load delay • load r1, 0(r2) • add r3, r1, r4 • Solutions: • Software – Compiler Techniques • Hardware – Out of order Execution
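The load delay can be illustrated with a small stall counter. This is a sketch under an assumption the slide does not state: a classic five-stage pipeline with full forwarding, where a loaded value arrives one cycle too late for the instruction immediately after the load. The function name `load_use_stalls` and the `(op, dest, sources)` encoding are hypothetical.

```python
def load_use_stalls(program):
    """Count one stall for every instruction that uses a register loaded
    by the instruction directly before it (assumed 5-stage pipeline)."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "load" and dest in curr[2]:
            stalls += 1
    return stalls


# load r1, 0(r2) followed directly by add r3, r1, r4
back_to_back = [
    ("load", "r1", ["r2"]),
    ("add", "r3", ["r1", "r4"]),
]
# Same pair with an independent instruction scheduled in between,
# as a compiler might do (the sub instruction is made up for the example).
scheduled = [
    ("load", "r1", ["r2"]),
    ("sub", "r5", ["r6", "r7"]),
    ("add", "r3", ["r1", "r4"]),
]
stalls_before = load_use_stalls(back_to_back)
stalls_after = load_use_stalls(scheduled)
```

The second program shows the software remedy: filling the slot after the load with independent work hides the delay.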
Out of Order Execution source: EV8 DEC Alpha Processor, (c) Intel
Register Renaming • lw r4, 0(r1) → lw p2, 0(p7) • addi r2, r4, 0x20 • and r3, r4, r1 • xor r4, r2, r4 • sub r2, r4, r3 [Figure: Register Map]
Register Renaming • lw r4, 0(r1) → lw p2, 0(p7) • addi r2, r4, 0x20 → addi p1, p2, 0x20 • and r3, r4, r1 • xor r4, r2, r4 • sub r2, r4, r3 [Figure: Register Map]
Register Renaming • lw r4, 0(r1) → lw p2, 0(p7) • addi r2, r4, 0x20 → addi p1, p2, 0x20 • and r3, r4, r1 → and p3, p2, p7 • xor r4, r2, r4 • sub r2, r4, r3 [Figure: Register Map]
Register Renaming • lw r4, 0(r1) → lw p2, 0(p7) • addi r2, r4, 0x20 → addi p1, p2, 0x20 • and r3, r4, r1 → and p3, p2, p7 • xor r4, r2, r4 → xor p5, p1, p2 • sub r2, r4, r3 → sub p6, p5, p3 • WAW hazards eliminated (WAR hazards too, since every write gets a fresh register) • Useful for newer processors, which have a larger number of physical registers [Figure: Register Map]
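The renaming walk-through can be reproduced with a small sketch. The function `rename`, the tuple encoding of instructions, the starting map `r1 → p7`, and the order of the free list are assumptions chosen so that the output matches the physical register numbers on the slide; a real register map would be a hardware table.

```python
from collections import deque


def rename(program, reg_map, free_regs):
    """Rename architectural registers to physical ones, in program order."""
    reg_map = dict(reg_map)
    free = deque(free_regs)
    renamed = []
    for op, dest, srcs in program:
        # Sources use the current mapping, before the destination is remapped.
        new_srcs = [reg_map[s] for s in srcs]
        # Every write gets a fresh physical register from the free list.
        new_dest = free.popleft()
        reg_map[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed, reg_map


program = [
    ("lw", "r4", ["r1"]),
    ("addi", "r2", ["r4"]),
    ("and", "r3", ["r4", "r1"]),
    ("xor", "r4", ["r2", "r4"]),
    ("sub", "r2", ["r4", "r3"]),
]
initial_map = {"r1": "p7"}                  # assumed starting register map
free_list = ["p2", "p1", "p3", "p5", "p6"]  # assumed free-list order
renamed, final_map = rename(program, initial_map, free_list)
```

After renaming, the two writes to r4 land in p2 and p5 and the two writes to r2 land in p1 and p6, so the later writes no longer conflict with the earlier ones.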
In order Retirement • After execution, each instruction is queued up in a table (the re-order table) • This table ensures that the original program order is maintained • Instructions are allowed to become permanent only when they reach the top of the re-order table
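A minimal sketch of such a re-order table is shown below. The class name `ReorderBuffer` and its methods are hypothetical, and entries are allocated in program order as the table would need; the point is only that instructions may finish in any order but leave the table strictly from the top.

```python
from collections import deque


class ReorderBuffer:
    """Toy re-order table: out-of-order completion, in-order commit."""

    def __init__(self):
        self.entries = deque()  # program order, oldest entry at the left

    def allocate(self, name):
        self.entries.append({"name": name, "done": False})

    def mark_done(self, name):
        for entry in self.entries:
            if entry["name"] == name:
                entry["done"] = True
                return
        raise KeyError(name)

    def commit(self):
        # Only the oldest instruction may commit, and only once it has executed.
        committed = []
        while self.entries and self.entries[0]["done"]:
            committed.append(self.entries.popleft()["name"])
        return committed


rob = ReorderBuffer()
for inst in ["i1", "i2", "i3"]:
    rob.allocate(inst)
rob.mark_done("i3")          # youngest finishes first
early = rob.commit()         # nothing commits: i1 is still pending
rob.mark_done("i1")
middle = rob.commit()        # i1 commits, i2 blocks i3
rob.mark_done("i2")
late = rob.commit()          # i2 and i3 commit together, in order
```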
Remedies to Structural hazards • Simplest solution: Increase resources, functional units (Silicon allows us to do this) • Another solution: Pipeline the functional units • Pipelining is not always possible/feasible.
Superscalar execution! • Execute more than one instruction every cycle. • Make better use of the functional units • Fetch and commit more than one instruction every cycle.
Memory Organization in processors • Caches inside the chip • Faster – ‘Closer’ • SRAM cells • They contain recently-used data • They contain data in ‘blocks’
Rationale behind caches • Principle of spatial locality • Principle of temporal locality • Replacement policy (LRU, LFU, etc.) • Principle of inclusivity
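The two locality principles and an LRU replacement policy can be seen together in a small simulation. This is a sketch, assuming a fully associative cache; the class name `LRUCache`, the block size of 16 bytes, and the capacity of two blocks are made up for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache of fixed-size blocks with LRU replacement."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()  # block number -> True, least recent first

    def access(self, address):
        """Return True on a hit, False on a miss."""
        block = address // self.block_size
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            return True
        if len(self.blocks) == self.num_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[block] = True
        return False


cache = LRUCache(num_blocks=2, block_size=16)
hits = [cache.access(a) for a in [0, 4, 16, 0, 32, 16]]
```

Address 4 hits because it shares a block with address 0 (spatial locality), the second access to 0 hits because the block was used recently (temporal locality), and the final access to 16 misses because its block was the least recently used when 32 was brought in.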
References • http://en.wikipedia.org/wiki/Hazard_(computer_architecture) • http://www.csee.umbc.edu/~plusquel/611/slides/chap3_3.html