An Instruction Set and Microarchitecture for Instruction Level Distributed Processing (Ho-Seop Kim and James E. Smith) Haiying Qu, Electrical and Computer Engineering, University of Alberta
Introduction 1 • ILP: Instruction Level Parallelism • Achieved significant performance gains • ILDP: Instruction Level Distributed Processing • Technology trend
Introduction 2 • Proposed microarchitecture • Short pipelines • Distributed processing elements: in-order instruction processing within each element enables out-of-order execution overall • Strand: a chain of dependent instructions • Accumulator • Inter-instruction communication
Instruction Set • 64 General Purpose Registers (GPRs): R0-R63, source or destination • 8 Accumulators: A0-A7 • Dead accumulator
Load/store Instruction • One accumulator value • One GPR • One parcel • Ai <- mem(Aj) • Ai <- mem(Rj) • mem(Ai) <- Rj • mem(Rj) <- Ai
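The four load/store forms above can be sketched in Python as follows. This is a minimal illustration, not the paper's definition: the function name `exec_mem`, the string tags for each form, and the dictionary used as memory (unwritten locations read as 0) are all assumptions made for the example.

```python
def exec_mem(form, acc, gpr, mem, i, j):
    """Execute one load/store form.

    acc and gpr are lists of register values; mem is a dict used as a
    toy memory (hypothetical model: unwritten addresses read as 0).
    """
    if form == "Ai<-mem(Aj)":      # load, address taken from an accumulator
        acc[i] = mem.get(acc[j], 0)
    elif form == "Ai<-mem(Rj)":    # load, address taken from a GPR
        acc[i] = mem.get(gpr[j], 0)
    elif form == "mem(Ai)<-Rj":    # store a GPR, address in an accumulator
        mem[acc[i]] = gpr[j]
    elif form == "mem(Rj)<-Ai":    # store an accumulator, address in a GPR
        mem[gpr[j]] = acc[i]
    else:
        raise ValueError(f"unknown form: {form}")


def demo_mem():
    acc, gpr, mem = [0] * 8, [0] * 64, {}
    gpr[1] = 0x100                                  # address held in R1
    acc[0] = 42                                     # value held in A0
    exec_mem("mem(Rj)<-Ai", acc, gpr, mem, 0, 1)    # mem(R1) <- A0
    exec_mem("Ai<-mem(Rj)", acc, gpr, mem, 2, 1)    # A2 <- mem(R1)
    return acc[2]
```

Calling `demo_mem()` stores A0 through R1 and loads the same location back into A2.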
Register Instruction • Operation: accumulator and GPR/immediate • Result: accumulator or GPR • Ai <- Ai op Rj • Ai <- Ai op immed • Ai <- Rj op immed • Rj <- Ai • Rj <- Ai op immed
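A minimal Python sketch of the five register-instruction forms above. The `State` class, the function names, and the two-entry operation table are hypothetical; the slide only says "op", so `add` and `sub` stand in for the full operation set.

```python
import operator
from dataclasses import dataclass, field

# Illustrative operation table (assumption: the slide does not list the ops).
OPS = {"add": operator.add, "sub": operator.sub}


@dataclass
class State:
    acc: list = field(default_factory=lambda: [0] * 8)    # A0-A7
    gpr: list = field(default_factory=lambda: [0] * 64)   # R0-R63


def acc_op_gpr(s, i, op, j):               # Ai <- Ai op Rj
    s.acc[i] = OPS[op](s.acc[i], s.gpr[j])


def acc_op_imm(s, i, op, imm):             # Ai <- Ai op immed
    s.acc[i] = OPS[op](s.acc[i], imm)


def gpr_op_imm_to_acc(s, i, j, op, imm):   # Ai <- Rj op immed
    s.acc[i] = OPS[op](s.gpr[j], imm)


def acc_to_gpr(s, j, i):                   # Rj <- Ai
    s.gpr[j] = s.acc[i]


def acc_op_imm_to_gpr(s, j, i, op, imm):   # Rj <- Ai op immed
    s.gpr[j] = OPS[op](s.acc[i], imm)


def demo_reg():
    s = State()
    s.gpr[1] = 10
    gpr_op_imm_to_acc(s, 0, 1, "add", 5)   # A0 <- R1 + 5
    acc_op_gpr(s, 0, "sub", 1)             # A0 <- A0 - R1
    acc_op_imm_to_gpr(s, 2, 0, "add", 1)   # R2 <- A0 + 1
    return s.acc[0], s.gpr[2]
```

In `demo_reg()` the three instructions form one dependence chain through A0, and only the final result is written out to a GPR.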
Branch/jump Instruction • Conditional branch: compare Ai with 0 or a GPR (all the usual predicates) • Program counter (P) • Indirect jump: through Ai or a GPR • Return address: saved in a GPR • P <- P + immed; Ai pred Rj • P <- P + immed; Ai pred 0 • P <- Ai • P <- Rj • P <- Ai; Rj <- P++
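The program-counter updates above can be sketched as three small Python functions. Everything here is illustrative: the function names and the predicate table are assumptions, P is modelled as an instruction index, and "P++" is read as "the next sequential instruction", which ignores the real encoding's instruction sizes.

```python
import operator

# Illustrative subset of "all the usual predicates".
PREDICATES = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "ge": operator.ge,
}


def cond_branch(pc, imm, pred, a_i, other=0):
    """P <- P + immed if (Ai pred other); other is Rj or the constant 0."""
    taken = PREDICATES[pred](a_i, other)
    return pc + imm if taken else pc + 1   # assumed fall-through: pc + 1


def indirect_jump(target):
    """P <- Ai  or  P <- Rj: the register value becomes the new P."""
    return target


def jump_and_link(pc, target):
    """P <- Ai; Rj <- P++.

    Returns (new P, return address to be written into Rj).
    """
    return target, pc + 1
```

For example, `cond_branch(100, 8, "lt", -1)` takes the branch because A_i is less than 0, while `jump_and_link(100, 40)` yields the new P together with the return address that would be written to the GPR.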
Strand • Figure 3. Types of values and associated registers
Two strands intersect: copy one to a GPR • Output is a static global register • New strand • Strand ends • Figure 4. Issue timing
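To make the idea of a strand concrete, the sketch below labels each instruction of a short sequence with a strand id. It assumes a simplified rule that is not spelled out on the slide: an instruction that writes an accumulator without reading one starts a new strand, and an instruction that reads an accumulator continues the strand currently held in that accumulator. The `Instr` class and `label_strands` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    text: str
    acc_read: Optional[int]    # accumulator used as a source, if any
    acc_write: Optional[int]   # accumulator written, if any


def label_strands(program):
    """Return one strand id per instruction (None if no strand applies)."""
    current = {}     # accumulator number -> id of the strand it holds
    next_id = 0
    labels = []
    for ins in program:
        if ins.acc_read is not None:
            # Continues whichever strand currently lives in that accumulator
            # (None if the accumulator has not been written yet).
            sid = current.get(ins.acc_read)
        elif ins.acc_write is not None:
            # Writes an accumulator without reading one: a new strand begins.
            sid = next_id
            next_id += 1
        else:
            sid = None
        if ins.acc_write is not None and sid is not None:
            current[ins.acc_write] = sid
        labels.append(sid)
    return labels


EXAMPLE = [
    Instr("A0 <- R1 + 4", None, 0),    # starts strand 0 in A0
    Instr("A1 <- mem(R2)", None, 1),   # starts strand 1 in A1
    Instr("A0 <- A0 + R3", 0, 0),      # continues strand 0
    Instr("R4 <- A1", 1, None),        # strand 1 copies its result to a GPR
    Instr("A0 <- R5 + 1", None, 0),    # A0 is reused: strand 2 starts
    Instr("R6 <- A0", 0, None),        # continues strand 2
]
```

Running `label_strands(EXAMPLE)` gives `[0, 1, 0, 1, 2, 2]`: two strands are interleaved in program order, and the accumulator A0 is reused once its earlier strand has ended.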
Stages • Fetch: 4 words, which may hold more than 4 instructions • Parceling: break the fetched words into individual instructions • Renaming: GPRs only • Steering: into FIFOs according to the accumulators
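The steering stage can be sketched as follows. The input format, the function name `steer`, the default of four FIFOs, and the "shortest FIFO" choice for a new strand are all assumptions made for illustration; the slide only states that instructions are steered into FIFOs according to their accumulators.

```python
from collections import deque


def steer(instrs, num_fifos=4):
    """Steer instructions into per-PE FIFOs by accumulator number.

    instrs: iterable of (name, acc, starts_strand) tuples.
    Returns the FIFO contents as a list of lists, oldest entry first.
    """
    fifos = [deque() for _ in range(num_fifos)]
    acc_to_fifo = {}   # accumulator number -> FIFO index it is mapped to
    for name, acc, starts_strand in instrs:
        if starts_strand or acc not in acc_to_fifo:
            # Assumed policy: a new strand goes to the currently shortest
            # FIFO (the lowest index wins ties).
            acc_to_fifo[acc] = min(range(num_fifos),
                                   key=lambda k: len(fifos[k]))
        # Later instructions of the strand follow their accumulator, so each
        # FIFO sees the strand's instructions in program order.
        fifos[acc_to_fifo[acc]].append(name)
    return [list(f) for f in fifos]


EXAMPLE_STREAM = [
    ("i0", 0, True),    # new strand in A0
    ("i1", 1, True),    # new strand in A1
    ("i2", 0, False),   # continues the A0 strand
    ("i3", 1, False),   # continues the A1 strand
    ("i4", 0, True),    # A0 reused by a new strand
]
```

With two FIFOs, `steer(EXAMPLE_STREAM, num_fifos=2)` returns `[["i0", "i2", "i4"], ["i1", "i3"]]`. Each FIFO issues in order, but the two FIFOs proceed independently, which is how in-order processing elements give out-of-order execution overall.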
Some Concepts • PE: Processing Element • IR: Issue Register (a single reservation station) • ICN: Interconnection Network
Table 1. Complexity comparison • Note: the ILDP figures are for a single PE
Evaluation 1 • Figure 7. Types of register values • Figure 8. Average strand length
Evaluation 2 • Figure 9. Strand end • Figure 10. Instruction size
Evaluation 3 • Figure 11. Cumulative strand re-use • Figure 12. IPC
Evaluation 4 • Figure 13. Global register rename map read/write bandwidth