
Course Review

14. Course Review. Kai Bu kaibu@zju.edu.cn http://list.zju.edu.cn/kaibu/comparch2017


Presentation Transcript


  1. 14 Course Review Kai Bu kaibu@zju.edu.cn http://list.zju.edu.cn/kaibu/comparch2017

  2. THANK YOU

  3. Email, LinkedIn, Twitter, Weibo... Don't hesitate to keep in touch :)

  4. Lectures 02-03 Fundamentals of Computer Design

  5. Classes of Parallel Architectures • Classified according to the parallelism in the instruction and data streams called for by the instructions: SISD, SIMD, MISD, MIMD

  6. SISD • Single instruction stream, single data stream • Uniprocessor • Can exploit instruction-level parallelism

  7. SIMD • Single instruction stream, multiple data streams • The same instruction is executed by multiple processors using different data streams • Exploits data-level parallelism • Each processor has its own data memory, but there is a single instruction memory and a single control processor

  8. MISD • Multiple instruction streams, single data stream • No commercial multiprocessor of this type has been built yet

  9. MIMD • Multiple instruction streams, multiple data streams • Each processor fetches its own instructions and operates on its own data • Exploits task-level parallelism

  10. Instruction Set Architecture (ISA) • The actual programmer-visible instruction set • The boundary between software and hardware

  11. ISA: Class • Most are general-purpose register architectures whose operands are either registers or memory locations • Two popular versions: register-memory ISA (e.g., 80x86), in which many instructions can access memory; load-store ISA (e.g., ARM, MIPS), in which only load or store instructions can access memory

  12. ISA: Memory Addressing • Byte addressing supports accessing individual bytes of data rather than only larger units called words • Aligned address: an object of width s bytes at address A is aligned if A mod s = 0

  13. Each misaligned object requires two memory accesses
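The alignment rule from slides 12 and 13 can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the access count assumes memory is read in aligned chunks exactly as wide as the object, which the slides do not state.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if an object of `size` bytes at `address` is aligned (A mod s == 0)."""
    return address % size == 0


def accesses_needed(address: int, size: int) -> int:
    """Count the aligned chunks of `size` bytes that the object overlaps.

    Assumption (not from the slides): memory is read in aligned chunks
    that are exactly as wide as the object itself.
    """
    first_chunk = address // size
    last_chunk = (address + size - 1) // size
    return last_chunk - first_chunk + 1


# A 4-byte object at addresses 0 and 4 (aligned) and 6 (misaligned).
for addr in (0, 4, 6):
    print(addr, is_aligned(addr, 4), accesses_needed(addr, 4))
```

Under that assumption an aligned object always needs one access and a misaligned one always needs two, which matches the statement on slide 13.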

  14. ISA: Addressing Modes • Specify the address of a memory object • Register: Add R2, R1 ; R2 <- R2 + R1 • Immediate: Add R2, #3 ; R2 <- R2 + 3 • Displacement: Add R2, 100(R1) ; R2 <- R2 + M[100 + R1]
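The three addressing modes on slide 14 can be mimicked with a tiny Python model. This is only an illustrative sketch: the `add` helper, the tuple encoding of operands, and the register and memory contents are all hypothetical, not part of any real ISA.

```python
def add(regs: dict, mem: dict, dest: str, operand: tuple) -> None:
    """Execute `Add dest, operand`, where the operand uses one of three modes.

    Operand encodings (hypothetical, for illustration only):
      ("reg", "R1")        register mode
      ("imm", 3)           immediate mode
      ("disp", 100, "R1")  displacement mode, i.e. M[100 + R1]
    """
    mode = operand[0]
    if mode == "reg":
        value = regs[operand[1]]
    elif mode == "imm":
        value = operand[1]
    elif mode == "disp":
        offset, base = operand[1], operand[2]
        value = mem[offset + regs[base]]  # effective address = offset + base register
    else:
        raise ValueError(f"unknown addressing mode: {mode}")
    regs[dest] += value


regs = {"R1": 4, "R2": 10}
mem = {104: 7}                          # M[100 + R1] with R1 = 4

add(regs, mem, "R2", ("reg", "R1"))       # R2 <- R2 + R1
add(regs, mem, "R2", ("imm", 3))          # R2 <- R2 + 3
add(regs, mem, "R2", ("disp", 100, "R1")) # R2 <- R2 + M[100 + R1]
print(regs["R2"])
```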

  15. Trends in Cost • Cost of an Integrated Circuit • A wafer is tested and chopped into dies, which are then packaged

  16. Trends in Cost • Cost of an Integrated Circuit • Yield: the percentage of manufactured devices that survive the testing procedure

  17. Trends in Cost • Cost of an Integrated Circuit

  18. Trends in Cost • Cost of an Integrated Circuit

  19. Trends in Cost • Cost of an Integrated Circuit • N: the process-complexity factor, a measure of manufacturing difficulty
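The formula images on slides 15 to 19 did not survive in this transcript. For reference, the textbook relations for integrated-circuit cost are sketched below; it is an assumption that these are the ones the slides showed, though the exponent N matches the process-complexity factor named on slide 19.

```latex
\begin{align*}
\text{Cost of IC} &= \frac{\text{Cost of die} + \text{Cost of testing die} + \text{Cost of packaging and final test}}{\text{Final test yield}} \\[4pt]
\text{Cost of die} &= \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}} \\[4pt]
\text{Dies per wafer} &= \frac{\pi \times (\text{Wafer diameter}/2)^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}} \\[4pt]
\text{Die yield} &= \text{Wafer yield} \times \frac{1}{(1 + \text{Defects per unit area} \times \text{Die area})^{N}}
\end{align*}
```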

  20. Dependability • Two measures of dependability: module reliability and module availability

  21. Dependability • Module reliability: a measure of continuous service accomplishment from a reference initial instant • MTTF: mean time to failure • MTTR: mean time to repair • MTBF: mean time between failures (from the first failure to the second failure) • MTBF = MTTF + MTTR

  22. Dependability • Module reliability • FIT: failures in time, i.e., failures per billion hours • An MTTF of 1,000,000 hours = 10^9 / 10^6 = 1000 FIT

  23. Dependability • Two measures of dependability: module availability
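The dependability measures above are easy to compute. The sketch below reproduces the FIT example from slide 22 and adds an availability calculation; the availability formula MTTF / (MTTF + MTTR) is the usual textbook definition and is an assumption here, since the formula on slide 23 is missing from the transcript. The 24-hour MTTR is a made-up value.

```python
HOURS_PER_BILLION = 10**9


def fit_from_mttf(mttf_hours: float) -> float:
    """Failures in time: expected failures per billion hours of operation."""
    return HOURS_PER_BILLION / mttf_hours


def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures: MTBF = MTTF + MTTR."""
    return mttf_hours + mttr_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the module delivers service.

    Assumed definition (not in the transcript): MTTF / (MTTF + MTTR).
    """
    return mttf_hours / (mttf_hours + mttr_hours)


print(fit_from_mttf(1_000_000))      # slide 22 example: 1000 FIT
print(mtbf(1_000_000, 24))           # hypothetical 24-hour repair time
print(availability(1_000_000, 24))
```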

  24. Measuring Performance • Execution time: the time between the start and the completion of an event • Throughput: the total amount of work done in a given time

  25. Measuring Performance • Computer X and Computer Y • X is n times faster than Y
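The formula on slide 25 is missing from the transcript. The usual definition of "X is n times faster than Y", assumed here to be what the slide showed, treats performance as the reciprocal of execution time:

```latex
\[
n \;=\; \frac{\text{Execution time}_Y}{\text{Execution time}_X}
  \;=\; \frac{1/\text{Performance}_Y}{1/\text{Performance}_X}
  \;=\; \frac{\text{Performance}_X}{\text{Performance}_Y}
\]
```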

  26. Quantitative Principles • Parallelism • Locality: temporal locality (recently accessed items are likely to be accessed in the near future); spatial locality (items whose addresses are near one another tend to be referenced close together in time)

  27. Quantitative Principles • Amdahl’s Law

  28. Quantitative Principles • Amdahl’s Law: two factors 1. Fraction_enhanced: e.g., 20/60 if 20 seconds of a 60-second program can be enhanced 2. Speedup_enhanced: e.g., 5/2 if the enhanced portion takes 2 seconds where it originally took 5 seconds
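The two factors combine into the overall speedup as 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced), the standard statement of Amdahl's Law (the formula image on slide 27 is missing, so this form is an assumption about what it showed). A minimal Python sketch, combining the two example values from slide 28 into a single hypothetical program:

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    unenhanced_part = 1.0 - fraction_enhanced
    enhanced_part = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced_part + enhanced_part)


# 20 of 60 seconds can be enhanced, and that portion becomes 5/2 times faster.
overall = amdahl_speedup(20 / 60, 5 / 2)
print(overall)  # about 1.25: the program drops from 60 s to 40 + 20/2.5 = 48 s
```

Even an unbounded Speedup_enhanced cannot push the overall speedup past 1 / (1 - Fraction_enhanced), which is 1.5 for this example.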

  29. Quantitative Principles • The Processor Performance Equation

  30. IC_i: the number of times instruction i is executed in a program • CPI_i: the average number of clocks per instruction for instruction i
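With IC_i and CPI_i defined as above, the processor performance equation is usually written as CPU clock cycles = sum over i of IC_i × CPI_i, and CPU time = CPU clock cycles / clock rate. The equation image is missing from the transcript, so this form is an assumption; the instruction mix and the 1 GHz clock below are made-up values for illustration.

```python
def cpu_clock_cycles(mix: dict) -> float:
    """Sum IC_i * CPI_i over all instruction classes.

    `mix` maps a class name to a pair (IC_i, CPI_i).
    """
    return sum(ic * cpi for ic, cpi in mix.values())


def overall_cpi(mix: dict) -> float:
    """Average clocks per instruction across the whole program."""
    instruction_count = sum(ic for ic, _ in mix.values())
    return cpu_clock_cycles(mix) / instruction_count


def cpu_time_seconds(mix: dict, clock_rate_hz: float) -> float:
    """CPU time = clock cycles * clock cycle time = clock cycles / clock rate."""
    return cpu_clock_cycles(mix) / clock_rate_hz


# Hypothetical instruction mix: class -> (IC_i, CPI_i)
mix = {"alu": (50, 1), "load_store": (30, 2), "branch": (20, 3)}

print(cpu_clock_cycles(mix))        # 50*1 + 30*2 + 20*3 = 170
print(overall_cpi(mix))             # 170 cycles / 100 instructions
print(cpu_time_seconds(mix, 1e9))   # at an assumed 1 GHz clock
```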

  31. Lecture 04 Instruction Set Principles

  32. ISA Classification • Classification basis: the type of internal storage (stack, accumulator, or register) • ISA classes: stack architecture, accumulator architecture, general-purpose register architecture (GPR)

  33. ISA Classes: Stack Architecture • Implicit operands on the top of the stack (TOS) • C = A + B: Push A; Push B; Add; Pop C • On Add, the first operand is removed from the stack and the second operand is replaced by the result
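The stack sequence for C = A + B can be traced with a toy interpreter. This is a minimal sketch: the function name, the tuple encoding of instructions, and the values of A and B are hypothetical.

```python
def run_stack_program(program: list, memory: dict) -> dict:
    """Run Push/Add/Pop instructions on a stack machine with implicit operands."""
    stack = []
    for op, *args in program:
        if op == "Push":
            stack.append(memory[args[0]])   # copy a memory value onto the stack
        elif op == "Add":
            first = stack.pop()             # first operand is removed from the stack
            stack[-1] = stack[-1] + first   # second operand is replaced by the result
        elif op == "Pop":
            memory[args[0]] = stack.pop()   # move the top of the stack back to memory
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 2, "B": 3, "C": 0}
program = [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]
print(run_stack_program(program, memory)["C"])
```

Note that Add names no operands at all: both inputs and the result location are implied by the top of the stack.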

  34. ISA Classes: Accumulator Architecture • One implicit operand: the accumulator; one explicit operand: a memory location • C = A + B: Load A; Add B; Store C • The accumulator is both an implicit input operand and the result

  35. ISA Classes: General-Purpose Register Arch • Only explicit operands: registers or memory locations • Operand access: direct memory access, or loaded into temporary storage first

  36. ISA Classes: General-Purpose Register Arch • Two classes: • register-memory architecture: any instruction can access memory • load-store architecture: only load and store instructions can access memory

  37. ISA Classes: General-Purpose Register Arch • Two classes: • register-memory architecture: any instruction can access memory • C = A + B: Load R1, A; Add R3, R1, B; Store R3, C

  38. ISA Classes: General-Purpose Register Arch • Two classes: • load-store architecture: only load and store instructions can access memory • C = A + B: Load R1, A; Load R2, B; Add R3, R1, R2; Store R3, C

  39. GPR Classification • Does an ALU instruction have 2 or 3 operands? 2 = 1 result-and-source operand + 1 source operand; 3 = 1 result operand + 2 source operands • How many operands of an ALU instruction may be memory addresses: 0, 1, 2, or 3?

  40. Addressing Modes • How instructions specify the addresses of the objects they access • Types: constant, register, memory location (effective address)
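The difference between the two GPR classes shows up when both C = A + B sequences run on the same toy machine. This sketch is illustrative only: the interpreter, its `load_store_only` flag, and the convention that register names start with "R" are all assumptions made for the example.

```python
def is_register(name: str) -> bool:
    """Assumed naming convention: registers are R1, R2, ...; anything else is memory."""
    return name.startswith("R")


def run_gpr_program(program: list, memory: dict, load_store_only: bool = False) -> dict:
    """Run Load/Add/Store instructions; optionally forbid memory operands in Add."""
    regs = {}
    for op, *args in program:
        if op == "Load":
            dest, src = args
            regs[dest] = memory[src]
        elif op == "Store":
            src, dest = args
            memory[dest] = regs[src]
        elif op == "Add":
            dest, *sources = args
            values = []
            for source in sources:
                if is_register(source):
                    values.append(regs[source])
                elif load_store_only:
                    raise ValueError(f"Add cannot access memory operand {source}")
                else:
                    values.append(memory[source])   # register-memory: ALU op reads memory
            regs[dest] = sum(values)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


register_memory = [("Load", "R1", "A"), ("Add", "R3", "R1", "B"), ("Store", "R3", "C")]
load_store = [("Load", "R1", "A"), ("Load", "R2", "B"),
              ("Add", "R3", "R1", "R2"), ("Store", "R3", "C")]

print(run_gpr_program(register_memory, {"A": 2, "B": 3, "C": 0})["C"])
print(run_gpr_program(load_store, {"A": 2, "B": 3, "C": 0}, load_store_only=True)["C"])
```

Running the register-memory sequence with `load_store_only=True` raises an error, which is the restriction slide 38 describes: the load-store version needs one extra Load to bring B into a register first.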

  41. Lectures 05-07 Pipelining

  42. Pipelining • Start executing one instruction before completing the previous one

  43. Pipelined Laundry • Four tasks (A, B, C, D), each with stages of 30, 40, and 20 minutes, finish in 3.5 hours (30 + 4 × 40 + 20 = 210 minutes) • Observations • No speedup for an individual task; e.g., A still takes 30 + 40 + 20 = 90 minutes • But the average task execution time improves; e.g., 3.5 × 60 / 4 = 52.5 < 30 + 40 + 20 = 90
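The laundry numbers can be reproduced with a small pipeline timing model. This is a sketch under stated assumptions: each stage handles one task at a time, and a finished task can wait between stages without blocking the previous stage. The function names are hypothetical.

```python
def pipelined_finish_times(stage_minutes: list, num_tasks: int) -> list:
    """Return the finish time of each task when tasks overlap in a pipeline."""
    stage_free_at = [0] * len(stage_minutes)   # when each stage next becomes idle
    finish_times = []
    for _ in range(num_tasks):
        ready = 0                               # when this task can enter the next stage
        for stage, duration in enumerate(stage_minutes):
            start = max(ready, stage_free_at[stage])
            ready = start + duration
            stage_free_at[stage] = ready
        finish_times.append(ready)
    return finish_times


def sequential_total(stage_minutes: list, num_tasks: int) -> int:
    """Total time when each task finishes completely before the next one starts."""
    return sum(stage_minutes) * num_tasks


stages = [30, 40, 20]                           # stage durations from slide 43
finishes = pipelined_finish_times(stages, 4)    # tasks A, B, C, D
print(finishes)                                 # A alone still takes 90 minutes
print(finishes[-1] / 4)                         # average time per task when pipelined
print(sequential_total(stages, 4))              # without pipelining
```

The 40-minute stage is the bottleneck: once the pipeline is full, one task completes every 40 minutes, so the total is 30 + 4 × 40 + 20 = 210 minutes rather than 4 × 90 = 360.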

  44. MIPS Instruction • At most 5 clock cycles per instruction • IF, ID, EX, MEM, WB

  45. MIPS Instruction: IF, ID, EX, MEM, WB • IF: IR ← Mem[PC]; NPC ← PC + 4;
