Journal Paper Presentation
Application-Specific Customization of FPGA Soft-core Processors
Presented by: Ahmad Sghaier
Course Instructor: Dr. Shawki Areibi
Course: ENGG 6090*6 – Winter07
Date: Apr. 5th, 2007
Outline
• Introduction
• Parameterized Soft-cores
• Micro-architectural Trade-offs and ISA Subsetting
• Fast Application-specific Customization
• Conclusion
Resources
• P. Yiannacouras, J. Steffan and J. Rose, “Exploration and Customization of FPGA-Based Soft Processors,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 26, No. 2, Feb. 2007.
• D. Sheldon, R. Kumar, R. Lysecky, F. Vahid and D. Tullsen, “Application-Specific Customization of Parameterized FPGA Soft-Core Processors,” IEEE/ACM Int. Conf. on Computer-Aided Design, Nov. 2006.
Soft-core vs. Hard-core
• A hard-core processor is laid out on the chip next to the FPGA’s configurable logic fabric.
• A soft-core processor is synthesized onto the FPGA’s fabric, just like any other circuit.
• Soft-core processor advantages:
  • Uses standard, mass-produced FPGA devices
  • Enables a custom number of microprocessors
• Soft-core processor disadvantages:
  • Reduced processor performance
  • Higher power consumption
  • Larger size
Commercial Soft-cores
• Xilinx MicroBlaze
  • A 32-bit soft-core processor
  • A single-issue, in-order execution processor
  • Five configurable components: multiplier, barrel shifter, divider, floating-point unit (FPU), and data cache
• Altera Nios II, with three mostly unparameterized variations:
  • Nios II/e: a small unpipelined processor at 6 cycles per instruction (CPI), with a serial shifter and software multiplication
  • Nios II/s: a five-stage pipeline with a multiplier-based barrel shifter, hardware multiplication, and an instruction cache
  • Nios II/f: a large six-stage pipeline with dynamic branch prediction and instruction and data caches
Parameterized Soft-cores
• Configurability
• Application-specific
• Size, performance, and power constraints
• Configurable parameters:
  • Instantiating functional units (0 or 1)
  • Unit-specific parameters (cache type/size)
  • Instruction set architecture (ISA)
  • Pipelining (depth)
Exploration and Customization of FPGA-Based Soft Processors
• Exploration of the micro-architectural trade-offs for soft processors
• A set of customization techniques to improve the performance/area of a soft processor for a specific application:
  • Tuning the micro-architecture to the application
  • Subsetting the ISA
  • Hybrid approach
• A CAD tool
Approach
• Develop a customization tool that generates a soft core tailored to the application.
• SPREE (Soft Processor Rapid Exploration Environment)
• Targets functional-unit customization and ISA subsetting.
SPREE
• Input: textual description (ISA and datapath)
• ISA and datapath verification
• Datapath construction
• Control generation
• Output: synthesizable RTL (Verilog)
Framework
• Altera Stratix I
• Comparison with Nios II variations (e, s, and f)
• MIPS instruction set
• Performance metrics
  • Area in logic elements (LEs)
  • Performance in MIPS (millions of instructions per second)
  • Efficiency in MIPS/LE
  • Equal weight for performance and area
• Benchmarks
  • 20 varied applications (including FIR, FFT, DES, CRC, QSORT, bubble sort)
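The efficiency metric on this slide is performance divided by area. A minimal sketch of that calculation follows; the class name `SoftCoreResult`, the helper `best_by_efficiency`, and all numbers are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftCoreResult:
    """One measured design point (all values here are illustrative)."""
    name: str
    area_le: int   # area in logic elements (LEs)
    mips: float    # performance in millions of instructions per second

    def efficiency(self) -> float:
        """Performance per area, in MIPS per LE."""
        return self.mips / self.area_le


def best_by_efficiency(results):
    """Return the design point with the highest MIPS/LE."""
    return max(results, key=lambda r: r.efficiency())


# Hypothetical design points, not data from the paper.
candidates = [
    SoftCoreResult("small", area_le=600, mips=30.0),
    SoftCoreResult("medium", area_le=1000, mips=60.0),
    SoftCoreResult("large", area_le=1800, mips=90.0),
]
best = best_by_efficiency(candidates)
```

Because the ratio weights performance and area equally, the fastest design is not automatically the most efficient one.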
Micro-architecture Exploration (1)
• Functional units
  • Shifter implementation (serial, shared multiplier)
  • Multiplication (software, hardware)
Micro-architecture Exploration (2)
• Pipelining
  • Depth
  • Organization
Micro-architecture Customization
• 6 micro-architectural axes
• Exhaustive search over the generated solutions
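An exhaustive search over micro-architectural axes can be sketched as a Cartesian product of axis values. The slides do not list the six axes, so the three axes below, their values, and the toy cost model in `evaluate` (a stand-in for real synthesis and benchmark runs) are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical axes; the slides do not list the six axes explored.
AXES = {
    "shifter": ["serial", "multiplier_based"],
    "multiplier": ["software", "hardware"],
    "pipeline_depth": [2, 3, 5],
}


def evaluate(config):
    """Toy stand-in for synthesis plus benchmark runs: returns (area_le, mips)."""
    area = 500.0
    mips = 20.0
    if config["shifter"] == "multiplier_based":
        area += 50
        mips += 5
    if config["multiplier"] == "hardware":
        area += 200
        mips += 15
    area += 100 * config["pipeline_depth"]
    mips += 8 * config["pipeline_depth"]
    return area, mips


def exhaustive_search(axes):
    """Evaluate every combination of axis values; keep the best MIPS/LE."""
    names = list(axes)
    best_config, best_eff = None, float("-inf")
    for values in product(*(axes[n] for n in names)):
        config = dict(zip(names, values))
        area, mips = evaluate(config)
        eff = mips / area
        if eff > best_eff:
            best_config, best_eff = config, eff
    return best_config, best_eff


best_config, best_eff = exhaustive_search(AXES)
num_points = len(list(product(*AXES.values())))
```

The number of design points is the product of the axis sizes, which is why this approach is only practical for a small number of axes.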
ISA Subsetting
• Eliminate unused instructions
• Simplify the control unit and reduce area
• Less than 50% utilization of the ISA
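The first step of ISA subsetting, finding which instructions an application never uses, can be sketched as a set difference. The instruction list `FULL_ISA`, the trace format, and the function names below are hypothetical; a real flow would work from the compiled binary or a disassembly.

```python
# Hypothetical ISA listing and instruction trace; both are illustrative.
FULL_ISA = {"add", "sub", "and", "or", "sll", "srl",
            "mult", "div", "lw", "sw", "beq", "j"}


def used_instructions(trace):
    """Collect the opcodes that actually appear in the application."""
    return {line.split()[0] for line in trace if line.strip()}


def subset_isa(full_isa, trace):
    """Return (kept, removed): instructions to keep and those safe to drop."""
    used = used_instructions(trace)
    unknown = used - full_isa
    if unknown:
        raise ValueError(f"trace uses instructions outside the ISA: {sorted(unknown)}")
    return used, full_isa - used


trace = [
    "lw r1, 0(r2)",
    "add r3, r1, r4",
    "sw r3, 0(r2)",
    "beq r3, r0, done",
    "add r5, r5, r3",
]
kept, removed = subset_isa(FULL_ISA, trace)
utilization = len(kept) / len(FULL_ISA)
```

Hardware that only serves instructions in `removed` (here, for example, the multiplier and divider paths) is then a candidate for elimination.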
Impact of ISA Subsetting
[Figures not reproduced: impact on performance; impact on area]
Results
• Fine customization environment
• An improvement in performance per area of 14.1% on average across all benchmarks
• The combined approach improved performance per area by 24.5% on average across all applications
Application-Specific Customization of Parameterized FPGA Soft-Core Processors
• A methodology for fast application-specific customization of a parameterized FPGA soft core
• Target runtime of 1-2 hours
• Near-optimal results
• Traditional CAD with a 0-1 knapsack algorithm
• Synthesis-in-the-loop exploration
Framework
• Xilinx MicroBlaze (MB) on a Virtex-II Pro FPGA
• Comparison with base and full MB configurations
• Performance metrics
  • Area in equivalent LUTs
  • Performance as application runtime (ms)
• Benchmarks
  • 11 applications from EEMBC
Approach 1
• Traditional CAD approach
• 0-1 knapsack problem
  • Maximize performance
  • Constraint on area
• 6 synthesis/execution runs
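The 0-1 knapsack formulation treats each optional component as an item with an area cost and a performance benefit, then picks the subset with the highest total benefit that fits the area budget. The sketch below is the textbook dynamic-programming solution, not necessarily the paper's implementation. The component costs and benefits are hypothetical, and the model assumes that benefits add up independently.

```python
def knapsack_01(items, area_budget):
    """Classic 0-1 knapsack by dynamic programming.

    items: list of (name, area_cost, benefit) with integer area_cost.
    Returns (total_benefit, chosen_names) with total area <= area_budget.
    """
    # best[a] = (benefit, chosen) achievable with area at most a
    best = [(0.0, [])] * (area_budget + 1)
    for name, cost, benefit in items:
        new_best = list(best)
        for a in range(cost, area_budget + 1):
            cand_benefit = best[a - cost][0] + benefit
            if cand_benefit > new_best[a][0]:
                new_best[a] = (cand_benefit, best[a - cost][1] + [name])
        best = new_best
    return best[area_budget]


# Hypothetical (name, area in LUTs, benefit) values, not measured data.
components = [
    ("multiplier", 300, 40.0),
    ("barrel_shifter", 200, 25.0),
    ("divider", 400, 10.0),
    ("fpu", 1500, 60.0),
    ("data_cache", 800, 50.0),
]
total_benefit, chosen = knapsack_01(components, area_budget=1400)
```

In practice the cost and benefit of each component would come from the synthesis/execution runs mentioned on the slide.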
Approach 2
• Synthesis-in-the-loop
• Pre-determines the impact each parameter individually has on the design metrics
• Then searches the parameters in sequence, ordered from highest impact to lowest
• Two orderings (fixed-order and application-specific impact-ordered trees)
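The impact-ordered idea can be sketched as a greedy, one-parameter-at-a-time search. This is a simplified illustration, not the paper's algorithm: impact is measured here as runtime reduction alone, the base configuration is assumed to fit the area budget, and `toy_evaluate` is a hypothetical stand-in for synthesizing the core and running the application.

```python
def impact_ordered_search(params, evaluate, area_budget):
    """Greedy search over parameters, ordered by measured individual impact.

    params: dict name -> list of candidate values (first value = base setting).
    evaluate: callable(config) -> (runtime_ms, area).
    """
    base = {name: values[0] for name, values in params.items()}
    base_runtime, _ = evaluate(base)

    # Step 1: measure each parameter's individual impact on runtime.
    def impact(name):
        best = base_runtime
        for value in params[name][1:]:
            runtime, _ = evaluate({**base, name: value})
            best = min(best, runtime)
        return base_runtime - best

    order = sorted(params, key=impact, reverse=True)

    # Step 2: fix parameters in that order, keeping the fastest feasible value.
    config, runtime = dict(base), base_runtime
    for name in order:
        for value in params[name]:
            trial = {**config, name: value}
            t_runtime, t_area = evaluate(trial)
            if t_area <= area_budget and t_runtime < runtime:
                config, runtime = trial, t_runtime
    return order, config, runtime


# Hypothetical parameters and cost model.
PARAMS = {
    "multiplier": [False, True],
    "barrel_shifter": [False, True],
    "data_cache": [False, True],
}


def toy_evaluate(config):
    runtime, area = 100.0, 1000
    if config["multiplier"]:
        runtime -= 30
        area += 300
    if config["barrel_shifter"]:
        runtime -= 10
        area += 200
    if config["data_cache"]:
        runtime -= 20
        area += 800
    return runtime, area


order, config, runtime = impact_ordered_search(PARAMS, toy_evaluate, area_budget=1600)
```

The number of evaluations grows with the sum of the parameter value counts rather than their product, which is the source of the runtime savings over exhaustive search.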
Results
• Exhaustive search took 11 hours.
• The fixed-order impact-ordered tree approach had the fastest runtime, at 108 minutes.
• The knapsack algorithm gave results similar to the fixed-order impact-ordered tree approach.
• Similar results for a 50% constraint.
[Figures not reproduced: no constraint; fixed 80% constraint; per-application 80% constraint]
Results
• Reimplementation on a Spartan2 FPGA
• 1.5 hours runtime for the fixed-order impact-ordered tree
• 200 minutes for the application-specific impact-ordered tree
Scalability
• Increasing the number of parameters increases the runtime.
• The fixed-order impact-ordered tree and knapsack approaches scale well.
Conclusion
• Impact of customization on performance and area
• Emphasis on performance
• Customizable parameters span the micro-architecture and the ISA
• Use of near-optimal solutions to save on runtime
• Finer customization is possible, but scalability has to be addressed
• Finer customization might consider 0-1 parameters or multi-valued parameters
THANK YOU Q&A