A Shared Memory Microblaze Manycore Multiprocessor
Murali Vijayaraghavan
vmurali@mit.edu
MIT Computer Science and Artificial Intelligence Laboratory
RAMP Retreat, UC Berkeley, January 11, 2007
Why? A Hardware Emulator for InfiniCore
• InfiniCore is a homogeneous manycore research processor under development at MIT
• Designed to support both general-purpose computation and high-performance embedded apps
• Lots of small cores on one chip
• Target is 1K 64-bit cores at 32nm technology node (2012?)
• InfiniCore has completely new software stack (new OS, new parallel virtual machines) => Essential to have fast platform for software development
• But also, want to help develop more RAMP infrastructure components in the memory system
Goals for First Version
• Design a distributed uncached shared memory system containing Microblaze cores
• Microblaze OK to get going and has high density, but will eventually migrate cores to new InfiniCore ISA
• An RTL model but timing doesn’t necessarily match target
• No support for caches: An on-chip network connecting on-chip scratchpad memories and off-chip DRAM
• Support parallel synchronization primitives
• Map & run MIT bthreads (a stripped down version of pthreads sufficient to run many parallel applications)
• Suitable for experiments in embedded manycore application mapping
Caveats
• This is very preliminary work
• Not everything is worked out
• Implements small fraction of InfiniCore’s functionality
• This is not InfiniCore…
One Tile
[Figure: block diagram of one tile. Labels: mblaze core; LMB Data Bus; LMB Inst Bus; Local Data Scratchpad (in BRAM); Local Inst Scratchpad (in BRAM); Memory Interface; To DRAM / From DRAM; To Router / From Router]
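The tile's Memory Interface has to steer each load or store to the local scratchpad, the router, or DRAM. The slides do not give the address map, so the following is only a minimal sketch under an invented layout: the bit positions, the `decode` function, and the `Target` names are all hypothetical and chosen purely for illustration.

```python
from enum import Enum

class Target(Enum):
    LOCAL = "local scratchpad"
    REMOTE = "router"
    DRAM = "dram"

# Hypothetical 32-bit address layout (NOT taken from the slides):
#   bit 31      : 1 = off-chip DRAM, 0 = on-chip scratchpad
#   bits 30..16 : id of the tile that owns the scratchpad word
#   bits 15..0  : byte offset inside that tile's scratchpad
DRAM_BIT = 1 << 31
TILE_SHIFT = 16
TILE_MASK = 0x7FFF
OFFSET_MASK = 0xFFFF

def decode(addr: int, my_tile: int):
    """Classify an address as seen from tile `my_tile`.

    Returns (Target.DRAM, dram_offset), (Target.LOCAL, offset),
    or (Target.REMOTE, (owner_tile, offset)).
    """
    if addr & DRAM_BIT:
        return (Target.DRAM, addr & 0x7FFFFFFF)
    tile = (addr >> TILE_SHIFT) & TILE_MASK
    offset = addr & OFFSET_MASK
    if tile == my_tile:
        return (Target.LOCAL, offset)
    return (Target.REMOTE, (tile, offset))
```

Under this assumed layout, a request classified as `REMOTE` would be handed to the router together with the owning tile id, while `LOCAL` requests never leave the tile.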
2-D Array of Tiles on one BEE2 FPGA
[Figure: 2-D array of tiles. Labels: Tile; DRAM (shown four times)]
Network
• Load-Store network, fixed size (header + 32 bits) messages
• Separate Request and Reply networks
• Static dimension-ordered routing
  • Route in one dimension first, then route in the other
• Flow control in the routers
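Static dimension-ordered routing can be illustrated with a small software model. This is a sketch, not the Bluespec router: the `(x, y)` tile coordinates, the choice to route in X before Y, and the names `next_hop` and `route` are assumptions made for the example.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) tile coordinate; this scheme is an assumption

def next_hop(cur: Coord, dst: Coord) -> Coord:
    """Return the next tile on a dimension-ordered route (X first, then Y)."""
    cx, cy = cur
    dx, dy = dst
    if cx != dx:                       # finish the X dimension first
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:                       # then move along Y
        return (cx, cy + (1 if dy > cy else -1))
    return cur                         # already at the destination

def route(src: Coord, dst: Coord) -> List[Coord]:
    """List every tile a message visits from src to dst, inclusive."""
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1], dst))
    return path
```

Because the decision depends only on the current and destination coordinates, every message between the same pair of tiles takes the same path, which is what makes the routing static.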
Design tools
• Xilinx IP cores for Microblaze core
• BEE2 DDR2 controller core for DRAM controller
• Rest of the code in Bluespec
Current status
• Designed Memory interface and Router
• Stubs written in Bluespec to encapsulate interfaces
  • LMB buses
  • DDR2 controller interfaces
  • BRAM
• Finished interconnecting components
In Progress – Hardware synchronization
• Atomic read-modify-write operations built from the PUT and GET instructions in the Microblaze ISA plus a custom FSL core (in Bluespec)
• Rest of the design is almost the same
[Figure: block diagram. Labels: mBlaze core; put; get; Atomic operation core (FSL core); Lock for Read-Modify-Write; Local Memory Interface; Remote Memory Interface; Memory]
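The put/get scheme can be mimicked in software to show why a lock around the read-modify-write makes the operation atomic. This is a toy model under stated assumptions, not the FSL core itself: the class `AtomicOpCore`, the choice of fetch-and-add as the operation, the per-core reply queues, and the `worker` helper are all hypothetical.

```python
import threading
from collections import deque

class AtomicOpCore:
    """Toy model of an atomic-operation unit placed in front of memory."""

    def __init__(self, memory, num_cores):
        self.memory = memory
        self._lock = threading.Lock()   # stands in for "Lock for Read-Modify-Write"
        # One reply queue per core, modelling each core's own return link.
        self._replies = [deque() for _ in range(num_cores)]

    def put(self, core_id, addr, delta):
        """Request: atomically add delta to memory[addr], latch the old value."""
        with self._lock:
            old = self.memory[addr]
            self.memory[addr] = old + delta
        self._replies[core_id].append(old)

    def get(self, core_id):
        """Reply: return the value latched by this core's earliest pending put."""
        return self._replies[core_id].popleft()

def worker(core, core_id, addr, count, seen):
    """Issue `count` fetch-and-add requests and record every old value."""
    for _ in range(count):
        core.put(core_id, addr, 1)
        seen.append(core.get(core_id))
```

If several threads run `worker` against the same address, each fetch-and-add should observe a distinct old value and the final memory word should equal the total number of increments.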
Future version plans: Load, store buffers & caches
• Interaction of load/stores with synchronization instructions
• Fences before atomic memory operation
• Caches
  • Microblaze core’s caches with InfiniCore hardware/software coherence protocol