Experiencing MicroBlaze Hardware and Software
RAMP Summer Retreat, June 2006
Outline
• Introduction and Background
• Advantages of MicroBlaze
• Hardware Experiences with Multicore
• Lessons Learned from Software Porting
• Q&A
Introduction and Background
• Initial prototyping of the Internet in a Box (IIAB) project
  • IIAB: a RAMP cluster-based distributed-system testbed at O(1000) nodes
  • MicroBlaze as the first processor and basic building block
• What have we done in Internet in a Box version 0?
  • A small cluster of Xilinx XUP boards
  • Virtex-II Pro XC2VP30-7: half the size and the same speed grade as the XC2VP70 on BEE2
  • 4 MicroBlazes @ 100 MHz per FPGA under heavy workload (stable)
  • More detail and a demo tomorrow
• Experiences with MicroBlaze
MicroBlaze Advantages
• Easy to use with EDK
• Linux/GCC support (with limitations)
• High-“performance” soft-core processor
  • Most instructions complete in 1 cycle
  • Shorter pipeline, higher working frequency (>100 MHz on Virtex-II)
  • For comparison, LEON3: 7-stage pipeline, 5000 LUTs @ 90 MHz on Virtex-II
• FPGA-optimized implementation
  • Fast carry chain MUX, hardware multiplier
  • RLOC placement constraints
Poor quality of IP cores kills most of the development time
What else can I say here?
• Most IP cores are not multicore compatible (e.g. bus arbitration problems)
• A long bug list: opb_ddr, mch_opb_ddr, opb_ethernet
• Poorly written documentation makes things worse
• Open source/commercial IP core bugs vs. open source software bugs
  • More time to find the problem
  • More difficult to fix (fewer updates, smaller community)
Scaling difficulty inside a large FPGA (tussle with soft cores)
• Timing issues become the second time killer!
  • Is 100 MHz the upper bound?
  • It took quite a while to get the 4-core design working
  • The 6-core design appears unstable under heavy load
  • 16 cores per FPGA on BEE2 might be ambitious!
• Shared resources (e.g. the memory controller) become the critical components
• Routing delay dominates (60%-70%)
  • Floorplanning highly connected components is hard
• Too many fast-carry-chain-style MUXes in IP cores
  • One level of logic, so faster? No! Without RLOC constraints they make things even worse!
• Be careful with signal naming in your RTL code!
  • OPB0 and OPB1 will be treated as signals from the same bus, which affects the register mapping
• Place and route time is so long: O(hours)
Timing summary
• Adding a new IP core is not only a resource and functionality problem
• Embedding physical information in RTL code is preferred
  • Can't write the code without timing/placement constraints (argument to other high-level synthesis tools)
• Advanced physical synthesis is preferred
  • EDK is easy to use, but not friendly to physical synthesis software (e.g. Synplify Premier, Precision Physical)
  • Tool compatibility issues
    • RTL information is hidden by non-standard netlist files (Xilinx NGC files)
    • Can't cross-probe between RTL code and the mapped design
• For QoR and full control, EDK is not the best choice
  • The effort spent on timing tuning exceeds the signal-connection effort that EDK saves
  • What about RDL?
• An easy solution:
  • Lower the frequency!
Architecture limitations of MicroBlaze
• No MMU support
  • No protection among processes
  • Can't run the full version of Linux
• No double-precision floating point
  • No full floating-point library support in libc
• No atomic instructions
  • Hard to implement locks
  • Non-blocking FIFO instruction problem
• No cache coherence support
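The "hard to implement locks" point can be illustrated with a lock that needs no test-and-set or compare-and-swap instruction. Below is a minimal sketch of Peterson's two-party algorithm, which relies only on ordinary loads and stores. It is not the approach used in the project: the names `peterson_lock`, `peterson_acquire`, and `peterson_release` are hypothetical, and C11 `<stdatomic.h>` is used here only to get sequentially consistent loads and stores on a host machine. On a system without cache coherence, the lock variables would also have to live in uncached shared memory, which this sketch assumes rather than enforces.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Peterson's two-party lock: mutual exclusion built from plain loads and
 * stores only (no read-modify-write instruction is needed).
 * Hypothetical illustration; assumes sequentially consistent accesses. */
typedef struct {
    atomic_bool wants[2]; /* wants[i]: party i is trying to enter      */
    atomic_int  turn;     /* which party must wait when both want in   */
} peterson_lock;

static void peterson_init(peterson_lock *l) {
    atomic_init(&l->wants[0], false);
    atomic_init(&l->wants[1], false);
    atomic_init(&l->turn, 0);
}

/* self must be 0 or 1 */
static void peterson_acquire(peterson_lock *l, int self) {
    assert(self == 0 || self == 1);
    int other = 1 - self;
    atomic_store(&l->wants[self], true); /* announce intent             */
    atomic_store(&l->turn, other);       /* let the other party go first */
    while (atomic_load(&l->wants[other]) && atomic_load(&l->turn) == other) {
        /* spin until the other party leaves or yields its turn */
    }
}

static void peterson_release(peterson_lock *l, int self) {
    assert(self == 0 || self == 1);
    atomic_store(&l->wants[self], false);
}
```

The algorithm only covers two parties; generalizing to four or more cores requires a different scheme (for example a tournament of two-party locks), which is part of why the lack of atomic instructions is costly.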
The abuse of BRAM
• Most BRAM is used for cache
• Why not use external SRAM?
• High power consumption
• High chip cost
• Unbearable place/route time
The Missing 5%...
• No protection between processes
  • A nightmare for software debugging
• Lack of fork()
  • vfork() does not have the same semantics
  • pthreads sometimes work, at the cost of rewriting the application
• No shared library support
  • Applications suffer from jumbo file sizes
    • “Simple” i3 applications: 25 KB vs. 1.8 MB
  • Some applications will not run without shared libraries
    • Ruby interpreter (libdl)
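To make the fork()/vfork() difference concrete: after vfork() the child borrows the parent's address space and the parent is suspended until the child calls an exec function or _exit(), so the child must do essentially nothing else. The sketch below shows the one pattern that ports cleanly, spawning a program and waiting for it. The helper name `spawn_and_wait` is hypothetical and not taken from the project; code that relies on the child continuing to run a copy of the parent has to be restructured instead.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes vfork() under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` with arguments `argv` and return its exit
 * status, or -1 if it could not be started or did not exit normally.
 * Hypothetical helper for illustration. */
static int spawn_and_wait(const char *path, char *const argv[]) {
    assert(path != NULL && argv != NULL);

    pid_t pid = vfork();
    if (pid < 0) {
        return -1;            /* could not create the child */
    }
    if (pid == 0) {
        /* Child: shares the parent's memory, so only exec or _exit here. */
        execv(path, argv);
        _exit(127);           /* exec failed; never return from this frame */
    }

    /* Parent: resumes only after the child has called exec or _exit. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with `"/bin/sh"` and the argument vector `{"sh", "-c", "exit 3", NULL}` returns the shell's exit status.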
“Auto”config
• Makefiles/configuration files cannot recognize the MicroBlaze target
• Too much architecture-dependent code in existing applications
  • Running Java is hard
• Reconfigurable hardware confuses common build tools
• Some exception handlers are crucial (e.g. unaligned access)
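The unaligned-access point typically shows up in ported code that casts a byte pointer into a network buffer to a wider integer type. As a hedged illustration (not the fix used in the project, which relied on exception handlers), such accesses can also be rewritten so that no unaligned load is ever issued. The helper names `load_u32_unaligned` and `load_be32` below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in native byte order from an address with any
 * alignment. memcpy leaves it to the compiler to emit accesses that are
 * legal on the target instead of a single, possibly trapping, word load. */
static uint32_t load_u32_unaligned(const void *p) {
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Read a big-endian (network order) 32-bit value byte by byte. This is
 * independent of both the alignment of p and the host byte order. */
static uint32_t load_be32(const uint8_t *p) {
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Either helper can replace an expression such as `*(uint32_t *)(buf + 1)`; the byte-wise version is the safer choice when parsing wire formats.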
50 MIPS vs. 1000 MIPS
• Many applications are designed to run on CPUs faster than 1000 MIPS
  • Porting them is not straightforward
• Talking to the real world
  • i3 pings all fingers every “second”
  • How to dilate the “second”?
    • Many places to change in the code
    • Not done in this project
• Time dilation is our future work
  • Emulate machines over 1000 MIPS
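The arithmetic behind dilating the "second" can be sketched as follows. Time dilation was left as future work, so this is only an assumption about how it might be parameterized: it supposes that virtual time scales linearly with the MIPS ratio, and the function name `dilate_ms` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Wall-clock milliseconds that must elapse on the slow core so that
 * `virtual_ms` of time appears to pass on the emulated faster machine.
 * Assumes a simple linear scaling by the MIPS ratio. */
static uint64_t dilate_ms(uint64_t virtual_ms,
                          uint32_t emulated_mips,
                          uint32_t actual_mips) {
    assert(actual_mips > 0);
    return virtual_ms * emulated_mips / actual_mips;
}
```

Under this assumption, a 50 MIPS core emulating a 1000 MIPS machine has a dilation factor of 20, so a timer meant to fire every virtual second (such as the i3 finger ping) would fire every 20 wall-clock seconds: `dilate_ms(1000, 1000, 50)` evaluates to 20000.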
Non-technical challenges
• Maturity level of tools
• The software community for MicroBlaze is too small
• Research code is even worse than general software
  • Portability
  • Convention