Course Review

14 Course Review Kai Bu kaibu@zju.edu.cn http://list.zju.edu.cn/kaibu/comparch2018

THANK YOU

Email, LinkedIn, Twitter, Weibo... Keep in touch :)

Chapters 1 & 5, Appendices A-D • Disclaimer :) Exam questions may be beyond the coverage of review slides.

Lectures 02-03 Fundamentals of Computer Design

Classes of Parallel Architectures according to the parallelism in the instruction and data streams called for by the instructions: SISD, SIMD, MISD, MIMD

Instruction Set Architecture (ISA) • the actual programmer-visible instruction set • the boundary between software and hardware

ISA: Class • Most are general-purpose register architectures whose operands are either registers or memory locations • Two popular versions: register-memory ISA (e.g., 80x86), in which many instructions can access memory; load-store ISA (e.g., ARM, MIPS), in which only load or store instructions can access memory

ISA: Memory Addressing • Byte addressing: supports accessing individual bytes of data rather than only larger units called words • Aligned address: an object of width s bytes at address A is aligned if A mod s = 0

Each misaligned object requires two memory accesses
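The alignment rule above can be sketched in a few lines of Python. This is a minimal illustration that assumes memory is read in aligned blocks exactly as wide as the object; the helper names `is_aligned` and `memory_accesses` are hypothetical and do not come from the slides.

```python
def is_aligned(address: int, size: int) -> bool:
    """An object of `size` bytes at `address` is aligned if address mod size == 0."""
    return address % size == 0


def memory_accesses(address: int, size: int) -> int:
    """Count the aligned `size`-byte blocks that the object touches.

    Assumption: memory is read one aligned block of `size` bytes at a time.
    """
    first_block = address // size
    last_block = (address + size - 1) // size
    return last_block - first_block + 1


if __name__ == "__main__":
    for addr in (8, 6):
        print(addr, is_aligned(addr, 4), memory_accesses(addr, 4))
```

Under this assumption an aligned 4-byte object touches one block, while a misaligned one straddles two, which matches the two-access cost stated above.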

ISA: Addressing Modes Specify the address of a memory object • Register Add R2, R1; R2<-R2+R1 • Immediate Add R2, #3; R2<-R2+3 • Displacement Add R2, 100(R1); R2<-R2+M[100+R1]
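The three modes above differ only in where the second operand of Add comes from. A minimal Python sketch follows; the register contents, the memory contents, and the function names `fetch_operand` and `add` are assumptions made up for illustration.

```python
regs = {"R1": 4, "R2": 10}   # hypothetical register contents
mem = {104: 7}               # hypothetical memory, so that M[100 + R1] = M[104] = 7


def fetch_operand(mode, operand, regs, mem):
    """Return the value of the second source operand for the given mode."""
    if mode == "register":          # Add R2, R1
        return regs[operand]
    if mode == "immediate":         # Add R2, #3
        return operand
    if mode == "displacement":      # Add R2, 100(R1)
        offset, base = operand
        return mem[offset + regs[base]]
    raise ValueError(f"unknown addressing mode: {mode}")


def add(dest, mode, operand, regs, mem):
    """dest <- dest + operand, with the operand located by the addressing mode."""
    regs[dest] = regs[dest] + fetch_operand(mode, operand, regs, mem)
    return regs[dest]
```

For example, `add("R2", "displacement", (100, "R1"), regs, mem)` reads M[104] and adds it to R2.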

Measuring Performance • Execution time: the time between the start and the completion of an event • Throughput: the total amount of work done in a given time

Measuring Performance • Computer X and Computer Y • X is n times faster than Y: Execution time_Y / Execution time_X = n = Performance_X / Performance_Y
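"X is n times faster than Y" is conventionally read as a ratio of execution times, with performance taken as the reciprocal of execution time. A small Python sketch under that reading; the function name `times_faster` and the sample timings are made up for illustration.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    n = exec_time_y / exec_time_x, since performance = 1 / execution time.
    """
    return exec_time_y / exec_time_x


if __name__ == "__main__":
    # Hypothetical timings: X runs the program in 10 s, Y in 15 s.
    print(times_faster(10.0, 15.0))
```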

Quantitative Principles • Parallelism • Locality: temporal locality means recently accessed items are likely to be accessed in the near future; spatial locality means items whose addresses are near one another tend to be referenced close together in time

Quantitative Principles • Amdahl’s Law

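Quantitative Principles • Amdahl’s Law: two factors 1. Fraction_enhanced: the fraction of the original execution time that can use the enhancement; e.g., 20/60 if 20 seconds of a 60-second program can be enhanced 2. Speedup_enhanced: the improvement gained on the enhanced portion; e.g., 5/2 if the enhanced portion takes 2 seconds where it originally took 5 seconds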
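The two factors combine in the standard form of Amdahl's Law, Speedup_overall = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). A minimal Python sketch follows. Plugging in the two example values from the slide together is an assumption made only for illustration, since the slide gives them as separate examples.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    unenhanced = 1.0 - fraction_enhanced
    enhanced = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced + enhanced)


if __name__ == "__main__":
    # Fraction 20/60 and enhanced-portion speedup 5/2 give roughly 1.25 overall.
    print(amdahl_speedup(20 / 60, 5 / 2))
```

Even an arbitrarily large Speedup_enhanced cannot push the overall speedup past 1 / (1 - Fraction_enhanced), which is 1.5 for a fraction of 20/60.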

Quantitative Principles • The Processor Performance Equation

IC_i: the number of times instruction i is executed in a program • CPI_i: the average number of clocks per instruction for instruction i
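With those definitions, the usual form of the equation is CPU clock cycles = sum over i of (IC_i x CPI_i), and CPU time = CPU clock cycles x clock cycle time. The Python sketch below works through one example; the instruction mix, the CPI values, and the 2 ns cycle time are invented for illustration.

```python
def cpu_clock_cycles(mix):
    """Sum IC_i * CPI_i over every instruction class in `mix`.

    `mix` maps a class name to a pair (IC_i, CPI_i).
    """
    return sum(ic * cpi for ic, cpi in mix.values())


def overall_cpi(mix):
    """Average CPI = total clock cycles / total instruction count."""
    instruction_count = sum(ic for ic, _ in mix.values())
    return cpu_clock_cycles(mix) / instruction_count


def cpu_time_ns(mix, clock_cycle_time_ns):
    """CPU time = CPU clock cycles * clock cycle time."""
    return cpu_clock_cycles(mix) * clock_cycle_time_ns


# Hypothetical instruction mix: class -> (IC_i, CPI_i)
example_mix = {"alu": (50, 1), "load": (20, 2), "branch": (30, 3)}
```

For `example_mix`, the cycle count is 50*1 + 20*2 + 30*3 = 180 over 100 instructions, so the average CPI is 1.8 and the CPU time at 2 ns per cycle is 360 ns.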

Lecture 04 Instruction Set Principles

ISA Classification • Classification basis: the type of internal storage (stack, accumulator, or register) • ISA classes: stack architecture, accumulator architecture, general-purpose register architecture (GPR)

ISA Classes: Stack Architecture • implicit operands on the Top Of the Stack • C = A + B: Push A; Push B; Add; Pop C • Add removes the first operand from the stack and replaces the second operand with the result
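The Push/Push/Add/Pop sequence can be traced with a toy stack machine. This is only a sketch: the tuple encoding of instructions and the function name `run_stack_machine` are assumptions, not part of any real ISA.

```python
def run_stack_machine(program, memory):
    """Execute a list of (opcode, operand...) tuples on an operand stack."""
    memory = dict(memory)          # work on a copy of the caller's memory
    stack = []
    for opcode, *operands in program:
        if opcode == "push":       # push the value of a memory location
            stack.append(memory[operands[0]])
        elif opcode == "add":      # both operands are implicit: the top two entries
            second = stack.pop()
            first = stack.pop()
            stack.append(first + second)
        elif opcode == "pop":      # store the top of the stack to memory
            memory[operands[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# C = A + B, as on the slide
program = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
```

Note that Add names no operands at all, which is what "implicit operands" means here.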

ISA Classes: Accumulator Architecture • one implicit operand: the accumulator; one explicit operand: a memory location • C = A + B: Load A; Add B; Store C • the accumulator is both an implicit input operand and the result

ISA Classes: General-Purpose Register Arch • Only explicit operands: registers or memory locations • Operand access: direct memory access, or loaded into temporary storage first

ISA Classes:General-Purpose Register Arch Two Classes: • register-memory architecture any instruction can access memory • load-store architecture only load and store instructions can access memory

ISA Classes: General-Purpose Register Arch Two Classes: • register-memory architecture: any instruction can access memory • C = A + B: Load R1, A; Add R3, R1, B; Store R3, C

ISA Classes: General-Purpose Register Arch Two Classes: • load-store architecture: only load and store instructions can access memory • C = A + B: Load R1, A; Load R2, B; Add R3, R1, R2; Store R3, C
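The register-memory and load-store sequences for C = A + B can be compared with a toy interpreter. In this sketch, operand names starting with "R" are treated as registers and anything else as a memory location; that convention, the tuple encoding, and the name `run_gpr` are all assumptions for illustration.

```python
def run_gpr(program, memory):
    """Execute (opcode, operand...) tuples on a toy general-purpose register machine."""
    memory = dict(memory)
    regs = {}

    def read(operand):
        # Assumed convention: names starting with "R" are registers.
        return regs[operand] if operand.startswith("R") else memory[operand]

    for instr in program:
        opcode = instr[0]
        if opcode == "load":       # Load Rd, X
            regs[instr[1]] = memory[instr[2]]
        elif opcode == "add":      # Add Rd, src1, src2
            regs[instr[1]] = read(instr[2]) + read(instr[3])
        elif opcode == "store":    # Store Rs, X
            memory[instr[2]] = regs[instr[1]]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Register-memory: Add takes one operand directly from memory.
register_memory = [("load", "R1", "A"), ("add", "R3", "R1", "B"), ("store", "R3", "C")]
# Load-store: both Add operands must already be in registers.
load_store = [("load", "R1", "A"), ("load", "R2", "B"),
              ("add", "R3", "R1", "R2"), ("store", "R3", "C")]
```

Both programs leave the same value in C; the load-store version needs one extra instruction because Add cannot touch memory.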

Addressing Modes • How instructions specify addresses of objects to access • Types: constant, register, memory location (effective address)

Lectures 05-07 Pipelining

Pipelining: start executing one instruction before completing the previous one

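Pipelined Laundry • Four tasks (A, B, C, D in task order), each with stages of 30, 40, and 20 minutes, finish in 3.5 hours when pipelined (30 + 40 + 40 + 40 + 40 + 20 minutes) • Observations • No speedup for an individual task; e.g., A still takes 30+40+20=90 minutes • But speedup for average task execution time; e.g., 3.5*60/4=52.5 < 30+40+20=90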
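The 3.5-hour figure can be reproduced with a short scheduling sketch in Python. It assumes each stage serves one task at a time, tasks move on in order, and a finished task may wait between stages; the function name `pipeline_finish_times` is hypothetical.

```python
def pipeline_finish_times(stage_minutes, num_tasks):
    """Return the completion time (in minutes) of each task in a simple pipeline."""
    stage_free_at = [0] * len(stage_minutes)   # when each stage next becomes free
    finish_times = []
    for _ in range(num_tasks):
        ready = 0                              # every task is available at time 0
        for s, duration in enumerate(stage_minutes):
            start = max(ready, stage_free_at[s])   # wait for the task and the stage
            ready = start + duration
            stage_free_at[s] = ready
        finish_times.append(ready)
    return finish_times


stages = [30, 40, 20]                          # stage durations from the slide
finish = pipeline_finish_times(stages, 4)      # tasks A, B, C, D
total_pipelined = finish[-1]                   # 210 minutes = 3.5 hours
total_sequential = sum(stages) * 4             # 360 minutes without pipelining
average_per_task = total_pipelined / 4         # 52.5 minutes
```

Task A still finishes at minute 90, but every later task finishes 40 minutes after the previous one, because the 40-minute stage is the bottleneck.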

MIPS Instruction • at most 5 clock cycles per instruction • IF ID EX MEM WB

MIPS Instruction: IF (instruction fetch) • IR ← Mem[PC]; NPC ← PC + 4;

MIPS Instruction: ID (instruction decode/register fetch) • A ← Regs[rs]; B ← Regs[rt]; Imm ← sign-extended immediate field of IR (lower 16 bits)

MIPS Instruction: EX (execution/effective address) • memory reference: ALUOutput ← A + Imm; • register-register ALU: ALUOutput ← A func B; • register-immediate ALU: ALUOutput ← A op Imm; • branch: ALUOutput ← NPC + (Imm<<2); Cond ← (A == 0);

MIPS Instruction: MEM (memory access/branch completion) • load: LMD ← Mem[ALUOutput]; • store: Mem[ALUOutput] ← B; • branch: if (cond) PC ← ALUOutput;

MIPS Instruction: WB (write-back) • register-register ALU: Regs[rd] ← ALUOutput; • register-immediate ALU: Regs[rt] ← ALUOutput; • load: Regs[rt] ← LMD;
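The five steps above can be walked through for one instruction at a time in Python. This is a rough sketch, not a MIPS implementation: instructions are already-decoded dictionaries rather than 32-bit words, only the opcodes `lw`, `sw`, `add`, `addi`, and `beqz` are handled, and the function name `execute` is hypothetical.

```python
def execute(instr, regs, mem, pc):
    """Run one decoded instruction through IF, ID, EX, MEM, WB; return the next PC."""
    # IF: the instruction is passed in already fetched; compute NPC.
    npc = pc + 4

    # ID: read the register operands and the immediate.
    a = regs[instr.get("rs", 0)]
    b = regs[instr.get("rt", 0)]
    imm = instr.get("imm", 0)

    # EX: the ALU operation depends on the instruction type.
    op = instr["op"]
    cond = False
    if op in ("lw", "sw"):            # memory reference: effective address
        alu_output = a + imm
    elif op == "add":                 # register-register ALU
        alu_output = a + b
    elif op == "addi":                # register-immediate ALU
        alu_output = a + imm
    elif op == "beqz":                # branch: target address and condition
        alu_output = npc + (imm << 2)
        cond = (a == 0)
    else:
        raise ValueError(f"unsupported opcode: {op}")

    # MEM: loads read memory, stores write it, taken branches update the PC.
    lmd = None
    if op == "lw":
        lmd = mem[alu_output]
    elif op == "sw":
        mem[alu_output] = b
    next_pc = alu_output if cond else npc

    # WB: write the result into the register file.
    if op == "add":
        regs[instr["rd"]] = alu_output
    elif op == "addi":
        regs[instr["rt"]] = alu_output
    elif op == "lw":
        regs[instr["rt"]] = lmd
    return next_pc
```

For instance, with `regs[1] = 100` and `mem[108] = 42`, calling `execute({"op": "lw", "rs": 1, "rt": 2, "imm": 8}, regs, mem, 0)` loads 42 into register 2 and returns 4 as the next PC.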

MIPS Instruction Demo • Prof. Gurpur Prabhu, Iowa State Univ http://www.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/DLXimplem.html • Load, Store • Register-register ALU • Register-immediate ALU • Branch

Load

Store
