
A Reconfigurable Functional Unit for an Adaptive Dynamic Extensible Processor



Presentation Transcript


  1. A Reconfigurable Functional Unit for an Adaptive Dynamic Extensible Processor

Hamid Noori*, Farhad Mehdipour†, Norifumi Yoshimatsu‡, Kazuaki Murakami*, Koji Inoue* and Morteza Saheb Zamani†
*Department of Informatics, Kyushu Univ., Japan
‡Fukuoka Laboratory for Emerging & Enabling Technology of SoC, Japan
†Computer Engineering and Information Technology Department, Amirkabir Univ. of Technology, Iran
E-mail: noori@c.csce.kyushu-u.ac.jp, nyoshimatsu@fleets.jp, {murakami,inoue}@i.kyushu-u.ac.jp, {mehdipur,szamani}@aut.ac.ir

General Overview of the architecture

[Figure: Adaptive Dynamic Extensible Processor. Labels: Base Processor (N-way in-order general RISC; Fetch, Decode, Execute, Memory, Write; Reg File); Augmented Hardware (Profiler, RFU, Sequencer). Annotations: "Detects start addresses of Hot Basic Blocks (HBBs)", "Switches between main processor and RFU", "Executes Custom Instructions".]

Integrating RFU with the Base Processor

[Figure: Two versions of the integration. Labels: Reg0 ... Reg31, Config Mem, Decoder, Sequencer, DEC/EXE Pipeline Registers, RFU (ACC), FU1, FU2, FU3, FU4, EXE/MEM Pipeline Registers. Legend: Functional Unit, Base connection, Optimized connection.]

Operation Modes

• Normal mode
  • Profiling (optional)
  • Executing Custom Instructions on the RFU and other parts of the code on the base processor
• Training mode
  • Profiling
  • Detecting start address of Hot Basic Blocks (HBBs)
  • Generating Custom Instructions
  • Generating Configuration Data for the RFU
  • Binary rewriting
  • Initializing the Sequencer Table
  ♦ Online
    • Needs simple hardware for profiling
    • All tasks are run on the base processor
  ♦ Offline
    • Needs a PC trace after taken branches/jumps

[Figure: Operation modes. Panels: Training Mode (two panels), Normal Mode. Labels: Applications, Binary-Level Profiling, Processor, Profiler, RFU, Sequencer, Detecting Start Address of HBBs, Binary Rewriting, Executing CIs. Annotations: "Running Tools for Generating Custom Instructions, Generating Configuration Data for ACC and Initializing Sequencer Table", "Monitors PC and Switches between main processor and ACC".]

[Figure: Tool Chain]

Generating Custom Instructions

• Custom instructions
  1- Exclude floating point, multiply, divide and load instructions
  2- Include at most one STORE, at most one BRANCH/JUMP and all other fixed point instructions
• Finding the biggest sequence of instructions in the HBB that can be executed on the ACC
• Moving the instructions and appending supportable instructions to the head of the detected instruction sequence after checking flow-dependency and anti-dependency
• Moving the instructions and appending supportable instructions to the tail of the detected instruction sequence after checking flow-dependency and anti-dependency
• Rewriting object code if instructions have been moved
• Moving instructions must not modify the logic of the application
• Custom instruction generation is done without considering any other constraints.
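The first step of custom instruction generation listed above (finding the biggest sequence of instructions in the HBB that the ACC can execute) can be sketched roughly as follows. This is a minimal illustration, not the authors' tool: the function names, the mnemonic sets, and the rule that any instruction touching an `$f` register counts as floating point are all assumptions made for the example. It covers only the contiguous-sequence search and does not attempt the later steps of moving instructions to the head or tail of the sequence.

```python
# Assumed MIPS/PISA-style mnemonic classes; the slides do not list them.
LOADS = {"lw", "lh", "lhu", "lb", "lbu"}
STORES = {"sw", "sh", "sb"}
BRANCHES = {"beq", "bne", "blez", "bgtz", "bltz", "bgez", "j", "jr", "jal", "jalr"}
MUL_DIV = {"mult", "multu", "div", "divu"}


def is_supportable(mnemonic, operands):
    """Rule 1 from the slides: no floating point, multiply, divide or load."""
    # Assumption: a '.' suffix (mov.d) or an $f operand marks a floating point op.
    floating_point = "." in mnemonic or "$f" in operands
    return not (floating_point or mnemonic in LOADS or mnemonic in MUL_DIV)


def longest_candidate(block):
    """Return (start, end) of the longest contiguous run of supportable
    instructions with at most one store and at most one branch/jump.
    `block` is a list of (mnemonic, operands) pairs; `end` is exclusive."""
    best = (0, 0)
    start = 0
    stores = branches = 0
    for end, (mnemonic, operands) in enumerate(block):
        if not is_supportable(mnemonic, operands):
            # An unsupported instruction ends the current run.
            start = end + 1
            stores = branches = 0
            continue
        stores += mnemonic in STORES
        branches += mnemonic in BRANCHES
        # Rule 2: shrink from the left until the limits hold again.
        while stores > 1 or branches > 1:
            dropped = block[start][0]
            stores -= dropped in STORES
            branches -= dropped in BRANCHES
            start += 1
        if end + 1 - start > best[1] - best[0]:
            best = (start, end + 1)
    return best


# A slice of the instruction listing in this transcript, used as sample input.
example_block = [
    ("sw", "$16,16($29)"), ("mfc1", "$16,$f0"), ("mfc1", "$17,$f1"),
    ("srl", "$6,$17,0x14"), ("andi", "$6,$6,2047"), ("sltiu", "$2,$6,2047"),
    ("addu", "$6,$6,$18"), ("sltiu", "$2,$6,2047"), ("lui", "$2,32783"),
    ("and", "$17,$17,$2"), ("andi", "$2,$6,2047"), ("sll", "$2,$2,0x14"),
    ("or", "$17,$17,$2"), ("mtc1", "$16,$f0"),
]

if __name__ == "__main__":
    first, last = longest_candidate(example_block)
    for mnemonic, operands in example_block[first:last]:
        print(mnemonic, operands)
```

On this sample the selected run is the fixed point sequence from `srl` through `or`, bounded on both sides by the `mfc1`/`mtc1` register moves, which the sketch treats as floating point.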
    4052c0  addiu  $29,$29,-32
    4052c8  mov.d  $f0,$f12
    4052d0  sw     $18,24($29)
    4052d8  addu   $18,$0,$6
    4052e0  sw     $31,28($29)
    4052e8  sw     $16,16($29)
    4052f0  mfc1   $16,$f0
    4052f8  mfc1   $17,$f1
    405300  srl    $6,$17,0x14
    405308  andi   $6,$6,2047
    405310  sltiu  $2,$6,2047
    405318  addu   $6,$6,$18
    405320  sltiu  $2,$6,2047
    405328  lui    $2,32783
    405330  and    $17,$17,$2
    405338  andi   $2,$6,2047
    405340  sll    $2,$2,0x14
    405348  or     $17,$17,$2
    405350  mtc1   $16,$f0
    405358  mtc1   $17,$f1
    405360  lw     $31,28($29)
    405370  lw     $16,16($29)
    405378  addiu  $29,$29,32
    405380  jr     $31

[Figure: Speedup]

[Figure: RFU Architecture. Labels: Input from register file, Output to register file.]
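Returning to the profiling step of the training mode: the offline variant needs a PC trace recorded after taken branches/jumps, from which the start addresses of HBBs are detected. The slides do not say how "hot" is decided, so the following is only a sketch under stated assumptions: each trace entry is taken to be the target address of a taken branch or jump, and a block start counts as hot once it appears at least `min_count` times. The function name, the threshold, and the sample addresses other than `0x4052c0` are hypothetical.

```python
from collections import Counter


def hot_block_starts(taken_targets, min_count):
    """Count how often each branch/jump target appears in the trace and
    return (address, count) pairs that reach the assumed hotness threshold,
    most frequent first."""
    counts = Counter(taken_targets)
    return [(addr, n) for addr, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Hypothetical trace; 0x4052c0 is borrowed from the listing purely for illustration.
    trace = [0x4052C0] * 5 + [0x400100] * 2 + [0x4052C0] * 3 + [0x400200]
    for addr, n in hot_block_starts(trace, min_count=2):
        print(f"{addr:#x} executed {n} times")
```

In the online variant the slides assign this job to simple profiling hardware; a software counter like this one would correspond only to the offline path.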
