Parallel Architecture • 10.9 Parallel Architecture • 10.10 Case Study: Parallel Processing in the Sega Genesis • Semantic gap • Most complex instructions and addressing modes went largely unused by compilers • Better to spend effort optimizing the instructions that account for the greatest percentage of execution time than to focus on inherently complex instructions that rarely occur • The bulk of programs are very simple at the instruction level • Little or no payoff in increasing the complexity of the instructions
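The case for optimizing the frequent instructions can be made concrete with a small weighted-time calculation. The sketch below is illustrative only: the instruction mix, the class names, and the cycle counts are hypothetical numbers chosen for the example, not data from the slides.

```python
# Hypothetical instruction mix: (fraction of executed instructions, cycles each).
# These numbers are invented for illustration, not measured data.
SIMPLE = "simple (load/store, add, branch)"
COMPLEX = "complex (rare, multi-step)"
MIX = {
    SIMPLE: (0.95, 4),
    COMPLEX: (0.05, 20),
}


def average_cycles(mix):
    """Average cycles per instruction, weighted by how often each class runs."""
    return sum(fraction * cycles for fraction, cycles in mix.values())


def with_speedup(mix, name, factor):
    """Return a copy of the mix with one instruction class made `factor` times faster."""
    new_mix = dict(mix)
    fraction, cycles = new_mix[name]
    new_mix[name] = (fraction, cycles / factor)
    return new_mix


baseline = average_cycles(MIX)
simple_faster = average_cycles(with_speedup(MIX, SIMPLE, 2))
complex_faster = average_cycles(with_speedup(MIX, COMPLEX, 2))

print(f"baseline:          {baseline:.2f} cycles/instruction")
print(f"simple 2x faster:  {simple_faster:.2f} cycles/instruction")
print(f"complex 2x faster: {complex_faster:.2f} cycles/instruction")
```

Under these assumed numbers, doubling the speed of the frequent simple class lowers the average far more than doubling the speed of the rare complex class.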
Instruction Frequency • Frequency of occurrence of instruction types for a variety of languages. The percentages do not sum to 100 due to roundoff. (Adapted from Knuth, D. E., An Empirical Study of FORTRAN Programs, Software—Practice and Experience, 1, 105-133, 1971.)
Complexity of Assignments • Percentages showing complexity of assignments and procedure calls. (Adapted from Tanenbaum, A., Structured Computer Organization, 4/e, Prentice Hall, Upper Saddle River, New Jersey, 1999.)
Trends in Computer Architecture • Making the frequent case fast means making it simple • Concentrate on making assignment statements fast • load/store (ld/st) • Simple instruction set • From CISC to RISC • CISC instructions do not fit a pipelined architecture • RISC characteristics • (pg. 390) • large number of registers; simple instructions and addressing modes; pipelining; load/store (ld/st) architecture
Pipeline Behavior • Pipeline behavior during a memory reference and during a branch.
Filling the Load Delay Slot • SPARC code: (a) with a nop inserted, and (b) with srl migrated to the nop position.
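The idea of migrating an instruction into the load delay slot can be sketched as a tiny reordering pass. This is a simplified model, not a real SPARC scheduler: the `Instr` record, the helper names, and the four-instruction program are assumptions made for illustration, and the pass only tracks register dependences in straight-line code (no branches, no memory aliasing).

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str
    reads: frozenset   # registers (or flags) the instruction reads
    writes: frozenset  # registers (or flags) the instruction writes


def instr(op, reads=(), writes=()):
    return Instr(op, frozenset(reads), frozenset(writes))


def independent(a, b):
    """True if a and b may be reordered: neither writes anything the other touches."""
    return not (a.writes & (b.reads | b.writes)) and not (b.writes & a.reads)


def find_filler(code, slot):
    """Index of an earlier instruction that may legally move into code[slot], or None."""
    for j in range(slot - 2, -1, -1):
        cand = code[j]
        if cand.op in ("nop", "ld", "st"):  # kept in place in this simple sketch
            continue
        # The candidate must be independent of everything it would move past,
        # including the load itself.
        if all(independent(cand, code[k]) for k in range(j + 1, slot)):
            return j
    return None


def fill_load_delay_slots(code):
    """Replace each nop that follows a load with an earlier independent instruction."""
    code = list(code)
    i = 1
    while i < len(code):
        if code[i].op == "nop" and code[i - 1].op == "ld":
            j = find_filler(code, i)
            if j is not None:
                code[i] = code[j]  # candidate takes the nop's place
                del code[j]        # and leaves its old position
                continue           # list shrank by one; i now names the next instruction
        i += 1
    return code


# Hypothetical program in the spirit of case (a): a nop sits after the load.
program = [
    instr("srl", reads=["r3"], writes=["r5"]),
    instr("addcc", reads=["r1"], writes=["r1", "cc"]),
    instr("ld", reads=["r1"], writes=["r2"]),
    instr("nop"),
]
scheduled = fill_load_delay_slots(program)
print([i.op for i in scheduled])
```

In this example `addcc` cannot move because the load reads the register it writes, so the pass picks `srl`, which touches neither the load's address register nor its destination.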
Parallel Processing • multiple processors are coordinated to work on a single problem
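A minimal sketch of coordinating several workers on a single problem: the data is split into chunks, each worker computes a partial result, and the partial results are combined. Worker threads stand in for processors here, and the function names and the sum-of-squares workload are assumptions for illustration. In CPython, threads do not give a real speedup for CPU-bound work, so the example shows only the decomposition, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done independently by one worker."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    """Split `values` into chunks, sum each chunk in a worker, then combine."""
    size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


total = parallel_sum_of_squares(list(range(1000)))
print(total)
```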
PowerPC 601 • Superscalar architecture • IUs (integer units) • FPUs (floating-point units) • BPUs (branch processing units) • Branch instructions, especially conditional branch instructions, pose a bottleneck • the condition must first be ascertained to be true • the branch address must then be computed • this often involves address arithmetic • RISC machine (pg. 405)
PowerPC 601 • 32 32-bit general-purpose registers (GPRs) • RISC • 32 64-bit floating-point registers (FPRs) • 8 4-bit condition code (CC) registers • Nearly 50 special-purpose 32-bit registers • control memory management and the OS • Over 250 instructions • 32KB cache • MMU and memory unit assist in fetching both instructions and data
IU • 32 32-bit GPRs • 1 XER • holds exceptions that arise within the IU
BPU • 8 4-bit CC registers • allows 8 instructions to have separate CC bits, so they do not interfere with each other's ability to set CCs • Looks in the instruction queue (IQ) • if a conditional branch instruction is found, it computes the branch target address ahead of time and fetches instructions at the branch target • results in a 0-cycle branch • Link register • can store a subroutine return address
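The benefit of eight separate 4-bit CC fields can be sketched by modeling them as one 32-bit word in which each compare writes only its own field. This is a simplified model: the helper names are hypothetical, field 0 is placed in the most significant bits as an assumption of the sketch, and the fourth bit of each field (summary overflow) is simply left clear.

```python
FIELD_BITS = 4
NUM_FIELDS = 8
# Bit meanings inside one 4-bit field, most significant bit first.
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


def _shift(field):
    # Assumption of this sketch: field 0 occupies the top 4 bits of the word.
    return FIELD_BITS * (NUM_FIELDS - 1 - field)


def set_field(cr, field, value):
    """Return cr with the chosen 4-bit field replaced; all other fields untouched."""
    mask = 0xF << _shift(field)
    return (cr & ~mask & 0xFFFFFFFF) | ((value & 0xF) << _shift(field))


def get_field(cr, field):
    return (cr >> _shift(field)) & 0xF


def compare(cr, field, a, b):
    """Record the outcome of comparing a with b in the chosen CC field."""
    flag = LT if a < b else GT if a > b else EQ
    return set_field(cr, field, flag)  # SO bit left clear in this sketch


cr = 0
cr = compare(cr, 0, 3, 7)  # field 0 records "less than"
cr = compare(cr, 5, 9, 9)  # field 5 records "equal"; field 0 is not disturbed
print(f"{cr:#010x}", get_field(cr, 0) == LT, get_field(cr, 5) == EQ)
```

Because the second compare targets a different field, a later conditional branch can still test the result of the first compare.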
FPU • 32 64-bit FPRs • 1 FPSCR • can store exceptions • FPU is pipelined
Parallel Processing in Sega Genesis • External view of the Sega Genesis home video game system.
Sega Genesis Architecture • External view of the Sega Genesis home video game system.