A Combinatorial Architecture for Instruction-Level Parallelism

A Combinatorial Architecture for Instruction-Level Parallelism. Prepared by: HongJun Yu. Regulated Elements By Universal Scheme (REBUS). EXECUTABLE PROGRAM. Partitioned Instruction Streams. Processing Elements with Replicated Scratchpad Registers. Combinatorial Interconnection Structure.

Presentation Transcript


  1. A Combinatorial Architecture for Instruction-Level Parallelism. Prepared by: HongJun Yu

  2. Regulated Elements By Universal Scheme (REBUS)
     [Diagram labels, in order: EXECUTABLE PROGRAM; Partitioned Instruction Streams; Processing Elements with Replicated Scratchpad Registers; Combinatorial Interconnection Structure; MCU, MCU, MCU; Sliced Memory Hierarchy; MEMORY SYSTEM]

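  3. Processing Elements (PE) and Memory Coordination Units (MCU)
     [Diagram over (X1, X2, X3, X4, X5, X6, X7) using (7, 7, 3, 3, 1). Register (Reg) numbers, read in groups of three: (1, 7, 5), (2, 1, 6), (3, 2, 7), (4, 3, 1), (5, 4, 2), (6, 5, 3), (7, 6, 4). The PE row and the MCU row are each labeled 2 3 4 5 6 7 1.]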
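Read in groups of three, the register numbers on this slide are (1, 7, 5), (2, 1, 6), (3, 2, 7), (4, 3, 1), (5, 4, 2), (6, 5, 3), (7, 6, 4), which are cyclic shifts of a single base triple. The following is a minimal sketch of that construction and of the register replication it implies. It assumes PE #i holds the i-th triple, which the slide does not state explicitly, and the function names `cyclic_triples` and `replication_map` are hypothetical.

```python
def cyclic_triples(base=(1, 7, 5), v=7):
    """Shift a base triple by 0..v-1, keeping register labels in 1..v."""
    return [tuple((x - 1 + s) % v + 1 for x in base) for s in range(v)]


def replication_map(triples):
    """Map each register to the 1-based PEs that hold a copy of it.

    Assumption: PE #i holds the i-th triple in the list.
    """
    holders = {}
    for pe, triple in enumerate(triples, start=1):
        for reg in triple:
            holders.setdefault(reg, []).append(pe)
    return holders


triples = cyclic_triples()
holders = replication_map(triples)

for pe, triple in enumerate(triples, start=1):
    print(f"PE #{pe}: registers {triple}")
for reg in sorted(holders):
    print(f"R{reg} is replicated in PEs {holders[reg]}")
```

Under this assumption every register ends up in exactly three PEs, which is the r = 3 of the (7, 7, 3, 3, 1) configuration and one way to read "replicated scratchpad registers".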

  4. Structure of MCU and its connections
     [Diagram labels: Processing Element, Processing Element, Processing Element; To other MCUs; Scratchpad Registers Unit Controller; Cache Memory; To and From Main Memory]

  5. Structure of PE
     [Diagram labels: Global Signals Management Processor With Private Memory; Queues of Scratchpad Copies (R1, R2, R3); MCU Interface]

  6. Pairwise-balanced combinatorial interconnection
     • On X = {x1, x2, x3, x4, x5, x6, x7, x8, x9}, a Balanced Incomplete Block (BIB) with configuration (b, v, r, k, λ)
     • v: number of elements; b: number of k-subsets; r: each element appears in exactly r subsets; λ: each pair of elements appears in exactly λ subsets
     • v*r = b*k
     • For example, (12, 9, 4, 3, 1) is a BIB
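The definition above can be checked mechanically. The sketch below counts element and pair occurrences in a collection of blocks and reports (b, v, r, k, λ) when the collection is balanced. The slide does not say which nine-element design it has in mind; the lines of the 3×3 affine plane are used here only as one well-known collection with parameters (12, 9, 4, 3, 1). The names `bib_parameters` and `affine_plane_order3` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def bib_parameters(blocks):
    """Return (b, v, r, k, lam) if `blocks` is balanced, else raise ValueError."""
    blocks = [frozenset(bl) for bl in blocks]
    points = sorted(set().union(*blocks))
    sizes = {len(bl) for bl in blocks}
    point_counts = Counter(p for bl in blocks for p in bl)
    pair_counts = Counter(
        pair for bl in blocks for pair in combinations(sorted(bl), 2)
    )
    reps = {point_counts[p] for p in points}
    lams = {pair_counts[pair] for pair in combinations(points, 2)}
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        raise ValueError("blocks do not form a balanced incomplete block")
    return len(blocks), len(points), reps.pop(), sizes.pop(), lams.pop()


def affine_plane_order3():
    """Lines of the 3x3 grid over the integers mod 3 (illustrative choice)."""
    lines = [
        [(x, (m * x + c) % 3) for x in range(3)]
        for m in range(3)
        for c in range(3)
    ]
    lines += [[(c, y) for y in range(3)] for c in range(3)]  # vertical lines
    return lines


b, v, r, k, lam = bib_parameters(affine_plane_order3())
print((b, v, r, k, lam))  # (12, 9, 4, 3, 1)

# The counting identity from the slide, plus the standard companion identity
# lam * (v - 1) == r * (k - 1), which is not on the slide.
assert v * r == b * k
assert lam * (v - 1) == r * (k - 1)
```

The same function can be applied to any candidate block list to confirm its configuration before using it as an interconnection pattern.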

  7. Cont'd
     • A program can be partitioned among the PEs by letting each instruction's operand pair determine the PE to which the instruction is assigned
     • ADD R1 R7 → PE #1
     • MULT R2 R6 → PE #2
     • DIV R4 R5 → PE #5
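Because each pair of registers occurs in exactly one triple when λ = 1, an operand pair identifies a single PE. The following is a minimal sketch of that lookup. It assumes the register triples read from slide 3 and assumes PE #i holds the i-th triple; the names `PE_REGISTERS`, `build_pair_table`, and `assign_pe` are hypothetical, and instructions with two identical operands are not handled.

```python
from itertools import combinations

# Register triples per PE, as read from slide 3.
# Assumption: PE numbering follows the order in which the triples are listed.
PE_REGISTERS = {
    1: (1, 7, 5),
    2: (2, 1, 6),
    3: (3, 2, 7),
    4: (4, 3, 1),
    5: (5, 4, 2),
    6: (6, 5, 3),
    7: (7, 6, 4),
}


def build_pair_table(pe_registers):
    """Map every unordered register pair to the one PE that holds both."""
    table = {}
    for pe, regs in pe_registers.items():
        for pair in combinations(sorted(regs), 2):
            if pair in table:
                raise ValueError(f"pair {pair} is held by more than one PE")
            table[pair] = pe
    return table


def assign_pe(instruction, table):
    """Parse 'OP Ra Rb' and return the PE that holds both operands."""
    _op, a, b = instruction.split()
    pair = tuple(sorted((int(a[1:]), int(b[1:]))))
    return table[pair]


PAIR_TABLE = build_pair_table(PE_REGISTERS)

for instr in ("ADD R1 R7", "MULT R2 R6", "DIV R4 R5"):
    print(f"{instr} -> PE #{assign_pe(instr, PAIR_TABLE)}")
```

With these triples the lookup reproduces the slide's three assignments (PE #1, PE #2, PE #5), and the table covers all 21 register pairs.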

  8. Excellent ideas:
     • Implements ultra-parallelism using a balanced incomplete block (BIB);
     • Expands parallelism from the instruction level to the assembly-code level;
     • Parallelism is not restricted to a small “window” of code;
     • Supports parallelism among a group of connected processors;
     • Compatible with current technologies used in compilers and superscalar processors;
     • Could benefit both RISC and CISC.

  9. Characteristics:
     • Uses a fixed assembly-code format;
     • Uses memory coordination units (MCUs);
     • Requires data replication.

  10. Future work
      • Apply to multi-threaded processing
      • Support various instruction formats
