Hardware Design, Synthesis, and Verification of a Multicore Communication API
Ben Meakin, Ganesh Gopalakrishnan
{meakin, ganesh}@cs.utah.edu

Services provided by modern computer systems
• Computation oriented: fast, low power cost
• Communication oriented: slow, high power cost
University of Utah School of Computing

Project Objectives
• Objectives of this project
  • Research and implement efficient means of performing on-chip communication
  • Evaluate the impact of instruction set extensions enabling explicit data transfer
  • Apply these to a modern communication API
  • Study the use of semi-formal HW verification tools to verify realistic multicore HW

Multicore Communication API
• Multicore Association Communication API (MCAPI)
  • Lightweight messaging API designed for embedded multicore systems
• Implementation
  • Messages and packet channels use pointers to shared memory
  • Scalar channels copy data
  • Uses in-line assembly code

[Figure: MIPS Core Data-path]

Hardware Verification
• Application of IBM's Sixthsense semi-formal verification tool to complex multicore hardware
  • Promises simulator usability with MUCH higher coverage
  • Ability to verify large designs due to non-exhaustive state space exploration

[Figure: Simulation / Formal Verification]

Semi-Formal Verification
• Cache coherence protocol verification at RTL
  • Can SXS find bugs not found by simulation?
• Further application to pipeline control
• Work in progress...

Custom On-Chip Network Synthesis
• Workload driven synthesis of NoC given a model of an MCAPI target application
  • Paper under review for HiPEAC '10
• Algorithmic objectives
  • Generate custom topology to minimize average hops / flit for application
  • Synthesize deadlock free routing tables based on shortest path
  • Given approximate node sizes find a physical placement such that average wire distance is minimized
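
To make the "average hops / flit" objective concrete, the following minimal sketch computes that metric for a candidate topology under a given workload, using breadth-first search for shortest paths. It is only an illustration: the function names, the 4-node topologies, and the traffic numbers are hypothetical and are not taken from the poster, and the code evaluates a topology rather than synthesizing one.

```python
from collections import deque


def hop_counts(adjacency, source):
    """Return the shortest hop distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_hops_per_flit(adjacency, traffic):
    """traffic maps (src, dst) -> flits sent; returns the flit-weighted mean hop count."""
    total_flits = sum(traffic.values())
    total_hops = 0
    for (src, dst), flits in traffic.items():
        total_hops += flits * hop_counts(adjacency, src)[dst]
    return total_hops / total_flits


# Hypothetical workload model: node 0 sends most of its flits to node 2.
traffic = {(0, 2): 80, (0, 1): 10, (3, 1): 10}

# Baseline: a 4-node ring. Custom: the same ring plus a direct 0-2 link.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
custom = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}

ring_cost = average_hops_per_flit(ring, traffic)
custom_cost = average_hops_per_flit(custom, traffic)
print(ring_cost, custom_cost)
```

In this toy case the extra link sits on the heaviest flow, so the weighted hop count drops; a synthesis algorithm would search over such candidate topologies subject to hardware cost.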
8-Core MIPS System-on-Chip
• 8 processor tiles on a Xilinx Virtex5 FPGA
  • 16-bit MIPS cores (6-stage pipelines)
  • Private 2KB instruction and 2KB data caches
  • Shared 4KB slice of L2 data cache
  • Network interface unit
• NUCA
• MSI Directory based cache coherence
• Various I/O interfaces

Implementing Inter-core Communication
• Physical transport layer
  • Asynchronous network-on-chip
  • Dual networks; one for user, one for cache controllers
• MIPS instruction set extension
  • Enables explicit data transfer
  • Reduces some hardware complexity

Future Work
• Evaluation of SXS and other tools as applied to multicore RTL descriptions
• Extensive benchmarking of MCAPI implementation and interconnect technology
• Research additional applications of proposed ISA extension in parallel programming methods
• Research hardware mechanisms for increasing observability of multicore processors
  • Deterministic replay

More Information
Wiki page with link to read-only SVN checkout: www.cs.utah.edu/formal_verification/mediawiki (under “MCAPI Hardware Implementation”)
Ben Meakin's web-page: www.cs.utah.edu/~meakin
Multicore Association web-page: www.multicore-association.com

• Results highly encouraging
  • From baseline, our algorithms achieved for specific application (> 16 cores)
    • ~50% reduction in avg. hops / flit
    • ~50% reduction in avg. wire distance / flit
    • ~17% increase in throughput
    • Comparable hardware cost
• Cache Architecture
  • Direct mapped, 8 words per block
  • L2 physically distributed/logically shared (NUCA)
  • L1 private
  • MSI directory coherence protocol
  • Write invalidate policy
• Simplified form of modern architecture
• Performed at least as well as baseline for general purpose
• Better scalability
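
As an illustration of the MSI directory coherence protocol with a write invalidate policy listed under Cache Architecture, here is a minimal behavioral sketch of the directory side. It follows the textbook MSI scheme, not the poster's RTL: the class names, the invalidation log, and the block address are hypothetical, and write-backs and network messages are only implied.

```python
class DirectoryEntry:
    """Directory state for one memory block."""

    def __init__(self):
        self.state = "I"      # I (invalid), S (shared), or M (modified)
        self.sharers = set()  # ids of cores currently holding a copy


class Directory:
    """Toy MSI directory with write-invalidate semantics (assumed, simplified)."""

    def __init__(self):
        self.entries = {}
        self.invalidations = []  # log of (core, block) invalidation messages

    def _entry(self, block):
        return self.entries.setdefault(block, DirectoryEntry())

    def read(self, core, block):
        entry = self._entry(block)
        if entry.state == "M" and core in entry.sharers:
            return  # the owner reads its own modified copy; nothing changes
        # Any previous owner is downgraded to a sharer (write-back implied).
        entry.sharers.add(core)
        entry.state = "S"

    def write(self, core, block):
        entry = self._entry(block)
        # Write invalidate: every other copy is invalidated before the write.
        for other in sorted(entry.sharers - {core}):
            self.invalidations.append((other, block))
        entry.sharers = {core}
        entry.state = "M"


directory = Directory()
directory.read(0, 0x40)   # core 0 loads the block: I -> S
directory.read(1, 0x40)   # core 1 loads it too: still S, two sharers
directory.write(2, 0x40)  # core 2 writes: cores 0 and 1 are invalidated, S -> M
entry = directory.entries[0x40]
print(entry.state, entry.sharers, directory.invalidations)
```

The directory keeps one sharer set per block, so a write only sends invalidations to cores that actually hold a copy instead of broadcasting to all tiles.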