A Monte Carlo Simulation Accelerator using FPGA Devices. Final Year Project: LHW0304. Ng Kin Fung and Ng Kwok Tung. Supervisor: Professor LEONG, Heng Wai Philip.
Overview • Objective • Background • Software-only Implementation • Hardware Implementation • FPGA • Soft-Core Micro-Processor
Overview • Background • Interest Rate Modeling • Brace-Gatarek-Musiela (BGM) Model • Motivation and Contribution • System Design • System Design Overview • System Components • System Operations
Overview • Experiment and Result • Resources • Performance • Data Transmission Overhead • Conclusion • Future Improvement • Q & A Section
Objective • What we achieved last semester • Studied and became familiar with the development tools • Implemented some simple examples to gain experience in developing FPGA systems with a Soft-core Micro-processor • First ever successful port of the Microblaze system to the Celoxica RC200 development board • Studied the performance and power consumption of the system
Objective • What about this semester • Build a Monte Carlo Simulation Accelerator using FPGA technology and a Soft-core Micro-processor • Study the speed-up and performance • Study the transmission overhead of the channel between the user core and the Soft-core Micro-processor
Software-only implementation • The performance is NOT satisfactory • Sequential execution of instructions instead of parallel execution • Slow memory access • No ability to customize hardware • No way to save power by switching off hardware modules • The problem needs to be solved with another approach
FPGA Technology • Increasingly popular in system design • Higher degree of parallelism • Fewer clock cycles required
FPGA Technology • Explicitly hardwired to perform a certain operation • Optimized for a specific purpose, hence higher performance • Enables customization of hardware modules • Power saving • Reconfigurable • Enables reuse of hardware • Able to simulate and synthesize circuits from a high-level, program-like description • Easy system development and testing • Shorter time to market, hence higher profit
Soft-Core Micro-Processor • Most systems use a PC plus an FPGA accessed through a PCI bus • The bus is a bottleneck for the entire system • Use of a Soft-Core Micro-Processor • Everything is implemented in the FPGA • Data transmission stays within the FPGA • Higher transmission bandwidth and lower latency
Soft-Core Micro-Processor • Other advantages • Easier to develop • Retain the advantage of using FPGA • Flexible • Retargetable • Conclusion • FPGA technology + Soft-Core Micro-Processor
Interest Rate Modeling • Importance of interest rate modeling • Simulate market behavior with historical parameter values • Explain interest rate movements in terms of an underlying model • Decision making on economic policy • Risk management
Brace-Gatarek-Musiela (BGM) Model • One of the most popular interest rate models • Based on the Monte Carlo method • Looping part (the most computationally expensive)
Implementing BGM Model using FPGA and Soft Core Microprocessor • The BGM core generates 50 paths with 9 fixed points
Implementing BGM Model using FPGA and Soft Core Microprocessor • Implemented on the FPGA in a parallel style • Post-processing calculation by Microblaze • Average and standard error • Fast Simplex Link bus for data transmission between the BGM core and Microblaze
Contribution • Improve the performance of the system
Microblaze • A soft-core Microprocessor • Delivered as HDL source code for synthesis • Designed in VHDL • Specially optimized for Xilinx FPGAs • A reduced instruction set computer (RISC) • [Table: speed of Microblaze across different Xilinx devices; source: Xilinx statistics]
User Core – BGM • Connect the core designed in VHDL to the Microblaze system • Solve the most computationally expensive task fully in hardware • Must follow the signals and timing of the connected bus • A Microprocessor Peripheral Definition (MPD) file • Defines the interface of the peripheral • Ports, buses • A Peripheral Analyze Order (PAO) file • A list of the HDL files needed for synthesis, in order of compilation
Fast Simplex Link (FSL) • 32-bit wide bus • Unidirectional point-to-point data streaming interfaces • Control and data communication support • FIFO-based communication • Fast internal data and control transmission • Peak bandwidth of 300 MB/s
Fast Simplex Link (FSL) • Source: Xilinx Fast Simplex Link Channel Product Specification, DS449 (v1.1), Aug 06, 2003
Fast Simplex Link (FSL) • Source: Xilinx Fast Simplex Link Channel Product Specification, DS449 (v1.1), Aug 06, 2003 • Use the read macro microblaze_bread_datafsl(val, id) to read data from the FSL FIFO into Microblaze
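A minimal sketch of how the read macro might be used to drain a batch of words from the BGM core. Several things here are assumptions, not taken from the slides: the helper name `read_bgm_words`, the use of FSL channel 0, the header name `mb_interface.h`, and the toolchain defining `__MICROBLAZE__`. The host-side stand-in for the macro exists only so the sketch can be compiled and tried on a PC; it is not part of any Xilinx API.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __MICROBLAZE__
#include "mb_interface.h" /* assumed to provide microblaze_bread_datafsl */
#else
/* Hypothetical host-side stand-in: pops words from a small simulated FIFO. */
static const uint32_t sim_fifo[4] = {10u, 20u, 30u, 40u};
static size_t sim_pos = 0;
#define microblaze_bread_datafsl(val, id) \
    do { (val) = sim_fifo[sim_pos++ % 4u]; } while (0)
#endif

/* Blocking read of n 32-bit words from the FSL FIFO into dst.
 * The channel id is written as a literal because the real macro
 * may paste it directly into an assembly instruction. */
void read_bgm_words(uint32_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t word;
        microblaze_bread_datafsl(word, 0); /* channel 0 is an assumption */
        dst[i] = word;
    }
}
```

Because the read is blocking, the loop naturally waits until the BGM core has pushed the next word into the FIFO.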
On-Chip Memory, Local Memory Bus and Memory Bus Controller • On-Chip Memory • Storage medium for data and instructions • Minimizes the transmission overhead between the Microblaze and the memory • Local Memory Bus (LMB) • Single-cycle access to on-chip dual-port block RAM • Performance of 125 MHz • LMB BRAM Interface Controller • Interface between the LMB and the bram_block peripheral • Separate controllers for data and instructions
On-Chip Peripheral Bus (OPB Bus) • Connection between the main system and the peripherals • Make Microblaze System More Functional • In this project • UART • OPB Timer • GPIO
Universal Asynchronous Receiver-Transmitter (UART) • Handles asynchronous serial communication • Libgen allows the mapping of standard input and output • Use of scanf and printf for the communication with user
OPB Timer • Facilitates accurate measurement of performance • Initialize timer: XStatus XTmrCtr_Initialize • Start timer: void XTmrCtr_Start • Stop timer: void XTmrCtr_Stop • Get timer value: Xuint32 XTmrCtr_GetValue
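The value returned by XTmrCtr_GetValue is a raw counter reading, so it has to be converted to a time before it can be reported. The helper below is a sketch of that conversion only; the name `ticks_to_ms`, an up-counting 32-bit counter, and a timer clock frequency supplied by the caller are all assumptions, since the slides do not state the timer configuration.

```c
#include <stdint.h>

/* Convert two raw counter readings into elapsed milliseconds.
 * Assumes a 32-bit up-counting timer clocked at clock_hz.
 * Unsigned subtraction gives the right result across one wrap-around. */
double ticks_to_ms(uint32_t start_ticks, uint32_t stop_ticks, double clock_hz)
{
    uint32_t elapsed = stop_ticks - start_ticks;
    return (double)elapsed * 1000.0 / clock_hz;
}
```

For example, with an illustrative 50 MHz clock, `ticks_to_ms(0, 50000000, 50e6)` evaluates to 1000.0 ms.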
General Purpose Input Output (GPIO) • Problem found on the FSL bus • Reset signal is connected to ground • No way to reset the BGM core through the FSL bus • Solution • Modify the VHDL source code • Use GPIO
General Purpose Input Output (GPIO) • [Diagram: "Reset by FSL" (Microblaze, FSL, BGM core; the reset path is crossed out) compared with "Reset by GPIO" (Microblaze, GPIO, BGM core)]
System Operations • Start of Microblaze system • BGM core is reset • Timer is started • BGM process • Any more data? Yes: repeat the BGM process; No: continue • Post-processing calculation by Microblaze • Timer is stopped • Result is printed out • End of Microblaze system
System Operations: BGM Process • Start • BGM core generates a path • Data transfer from the BGM core to the Microblaze system • Data format transform • Temporary storage of data • End of Microblaze System
Resources • Unable to fit the whole system on the FPGA board • System simulated with ModelSim instead
Performance • Comparison of the BGM core running on the FPGA and on a PC (by Dr. Zhang) • Speed-up factor: 19.87
Performance • Comparison of the BGM core running on the FPGA and on a PC for different numbers of generated paths (by Dr. Zhang) • Stable performance across different path numbers
Performance • Simulation of the Microblaze system • Total time required for generating 50 paths: 2.871 ms • Speed-up factor: 21.94
Transmission Bandwidth • On the FSL bus, 32 bits of data are sent in about 40000 ps • Transmission bandwidth is around 100 MB per second • Same order of magnitude as the peak transmission bandwidth stated in the specification
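The bandwidth figure follows directly from the measured transfer time. The arithmetic is written out below, assuming 1 MB = 10^6 bytes, and compared with the 300 MB/s peak quoted on the FSL slide.

```latex
\[
\frac{32\ \text{bit}}{40\,000\ \text{ps}}
= \frac{4\ \text{B}}{40 \times 10^{-9}\ \text{s}}
= 1 \times 10^{8}\ \text{B/s}
= 100\ \text{MB/s},
\qquad
\frac{100\ \text{MB/s}}{300\ \text{MB/s}} \approx 0.33
\]
```

The measured rate is therefore about one third of the quoted peak.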
Conclusion • A Monte Carlo Simulation Accelerator was implemented using FPGA technology and the Xilinx Microblaze Soft-core Micro-processor • A speed-up factor of 21.94 compared with the software-only implementation • Higher bandwidth and lower latency can be achieved using the FSL between Microblaze and the BGM core • High performance, parallel execution of instructions, reconfigurability, reusability and short development time…
Future Development • Put the whole system on the FPGA board • Implement other applications for which high performance and short development time are the major considerations • Study the other IP cores included and improve the system