SLAAC/ACS API: Control of Systems of Adaptive Computing Nodes Virginia Tech Configurable Computing Lab SLAAC Retreat March 1999
Dr. Peter Athanas Dr. Mark Jones Heather Hill Emad Ibrahim Zahi Nakad Kuan Yao Diron Driver Karen Chen Chris Twaddle Jonathan Scott Luke Scharf Lou Pochet John Shiflett Peng Peng Sarah Airey Chris Laughlin The Virginia Tech SLAAC Team
Problem Definition • A single adaptive computing board is insufficient for many applications • insufficient power & functionality • Difficult to move an application from a research reference platform to deployment in a field system • Need for application to move to new platforms as they become available without unreasonable effort
Research Reference Platform • Network of ACS-accelerated workstations. • Inexpensive, readily available platform for ACS development. • Tracks performance advances in workstations and cluster computing. • ACS hardware is PCI-based. • OS is NT or Unix. • Network is simple Ethernet or high-speed, such as Myrinet.
Representative Field System • Embedded, distributed system • sensor nodes • actuator nodes • adaptive computing nodes • Limited OS/microprocessor support on most nodes • Heterogeneous network • Simplest carrier is a VME cluster of single-board computers. • ACS hardware is VME-based. • OS is VxWorks. • Network is Myrinet SAN.
Solution Approach • Define a platform-independent API that allows for configuration and control of a multi-board ACS • Provide efficient implementations of the API for research & field platforms • exploit high-speed networking • modular design that performs more complex control tasks on an OS-equipped host
Capabilities of SLAAC/ACS API • Allows control of a distributed system of adaptive computing nodes from a single host through functions defined in the API • Allows migration between platforms w/o modification of host source code • lightweight runtime environment on nodes • Channel-based model of computation allows for flexible, efficient combining of high-performance networks & ACS nodes
Programming Model • ACS API defines a system of nodes and channels. • System dynamically allocated at runtime. • Channels stream data between FIFOs on host/nodes. • API provides common control primitives to a distributed system: configure, readback, set_clock, run, etc. [Figure: hosts and nodes connected by a network]
Network Channels • Boards operate on individual clocks, but are data-synchronous. • Channels can apply back-pressure to stall producers. • Use network channels in place of physical point-to-point connections. [Figure: boards of M/F elements joined by crossbars, linked by a network channel]
Programmable Topology • Adds multiple dimensions of scalability. • Channel topology can be changed dynamically. • Channels allow data to flow through the system with a programmable topology. [Figure: the same crossbar-connected boards with a reprogrammed channel topology]
System Creation Functions • ACS_Initialize: parses command line; initializes globals. • ACS_System_Create: allocates nodes and channels; creates opaque system object in host program. • Same host program can manage multiple systems. • Nodes and channels are logically numbered in order of creation. • Host is node zero. [Figure: example system with logically numbered nodes and channels]
Memory Access Functions • ACS_Read(): gets block of memory from (system, node, address, count) into user buffer. • ACS_Write(): puts block of memory from user buffer to (system, node, address, count). • ACS_Copy(): copies memory from (node1, address1) to (node2, address2) directly. • ACS_Interrupt(): generates an interrupt signal at node.
Streaming Data Functions • Each node/system has a set of FIFO buffers. • Channels connect two FIFO buffers. • Arbitrary streaming-data topologies supported. • ACS_Enqueue(): put user data into FIFO. • ACS_Dequeue(): get user data from FIFO. [Figure: numbered nodes with FIFO 0–3 buffers connected by channels]
Implementation Strategy • Communication • Rely on MPI for high-performance communication where available • When MPI not available or convenient, tightly couple network & ACS hardware • Portability • limited new code is required to extend API implementation for a new ACS board • control program for compute nodes is simple enough to run w/o complex OS
API Implementation Status • Completed implementation of v1.0 of the API • Implemented in C++ (callable from C) • Software: NT + MPI (WMPI & MPI-FM) • Hardware: WildForce • Runs on the Tower of Power: 16-node cluster of PCs, WildForce board on each PC, Myrinet network connecting all PCs
Performance Monitor • Dynamic topology display • Performance metrics • Playback (future) • Use to confirm the configuration of the system • Use to identify performance bottlenecks
ACS Multiboard Debugger • Based on BoardScope and JBits • Will provide • Waveforms • State Status • Channel Status • Interfaces through the SLAAC API
Project Timeline [Gantt chart, Nov 98 – Aug 99: Multiboard API, Multiboard Debugger, and Applications tracks running from Kickoff toward SLAAC-1 & 2 Integration and Intervention-Free Operation; milestones at Nov 98, Feb 99, May 99, Aug 99]
Why Use This API? • Single-Board Systems • API closely matches accepted APIs, e.g. AMS WildForce & Splash • Virtually no overhead • Your application will port to SLAAC • Multi-Board Systems • Single program for multi-node applications • Inherent management of the network • Zero-sided communication • It’s FREE
Future Work • Support for Linux in addition to NT • Support for Run-Time Reconfiguration (RTR) • Extension to SLAAC-1 & 2 boards • API implementation for embedded systems • System-level management of multiple programs
Summary • Latest versions of source code and design documents available for download • For more information visit the TOP website: http://acamar.visc.ece.vt.edu/