
CARMA: A Comprehensive Management Framework for High-Performance Reconfigurable Computing


Presentation Transcript


    1. CARMA: A Comprehensive Management Framework for High-Performance Reconfigurable Computing
       Ian A. Troxel, Aju M. Jacob, Alan D. George, Raj Subramaniyan, and Matthew A. Radlinski
       High-performance Computing and Simulation (HCS) Research Laboratory
       Department of Electrical and Computer Engineering
       University of Florida, Gainesville, FL

    2. CARMA Motivation
       - Key missing pieces in RC for HPC
         - Dynamic RC fabric discovery and management
         - Coherent multitasking, multi-user environment
         - Robust job scheduling and management
         - Design for fault tolerance and scalability
         - Heterogeneous system support
         - Device independent programming model
         - Debug and system health monitoring
         - System performance monitoring into the RC fabric
         - Increased RC device and system usability
       - Our proposed Comprehensive Approach to Reconfigurable Management Architecture (CARMA) attempts to unify existing technologies as well as fill in missing pieces

    3. CARMA Framework Overview
       CARMA seeks to integrate:
       - Graphical user interface
       - Flexible programming model
         - COTS application mapper(s): Handel-C, Impulse-C, Viva, System Generator, etc.
         - Graph-based job description: DAGMan, Condensed Graphs, etc.
       - Robust management tool
         - Distributed, scalable job scheduling
         - Checkpointing, rollback and recovery
         - Distributed configuration management
       - Multilevel monitoring service (GEMS)
         - Networks, hosts, and boards
         - Monitoring down into RC fabric
       - Device independent middleware API
       - Multiple types of RC boards: PCI (many), network-attached, Pilchard
       - Multiple high-speed networks: SCI, Myrinet, GigE, InfiniBand, etc.

    4. Application Mapper Evaluation
       - Evaluating on basis of ease of use, performance, hardware device independence, programming model, parallelization support, resource targeting, network support, stand-alone mapping, etc.
       - C-based tools
         - Celoxica - SDK (Handel-C)
           - Provides access to in-house boards: ADM-XRC (x1), Tarari (x4), RC1000 (x4)
           - Good deal of success after lessons learned
           - Hardware design focused
         - Impulse Accelerated Technologies - Impulse-C
           - Provides an option for hardware independence
           - Built upon open source Streams-C from LANL
           - Supports ANSI standard C
       - Graphical tools
         - StarBridge Systems - Viva
         - Nallatech - Fuse / DIMEtalk
         - Annapolis Micro Systems - CoreFire
       - Xilinx - ISE compulsory
         - Evaluating the role of Jbits, System Generator, and XHWIF
       - Evaluations still ongoing
         - Programming model a fundamental issue to be addressed

    5. CARMA Interface
       - Simple graphical user interface
         - Preliminary basis for graphical user interface via the Simple Web Interface Link Library (SWILL) from the University of Chicago*
         - User view for authentication and job submission/status
         - Administration view for system status and maintenance
       - Applications supported
         - Single or multiple tasks per job (via CARMA DAGs**)
         - CARMA registered (via CARMA API and DAGs) or not
           - Provides security, fault tolerance
         - Sequential and parallel (hand-coded or via MPI)
       - C-based application mappers supported
         - CARMA middleware API provides architecture independence
         - Any code that can link to the CARMA API library can be executed (Handel-C and ADM-XRC API tested to date)
         - Bit files must be registered with the CARMA Configuration Manager (CM)
       - All other mappers can use “not CARMA registered” mode
       - Plans for linking Streams/Impulse-C, System Generator, et al.

    6. CARMA User Interface

    7. CARMA Job Manager (JM)
       - Prototyping effort (CARMA interoperability)
         - Completed first version of CARMA JM
         - Task-based execution via Condor-like DAGs
         - Separate processes and message queues for fault tolerance
         - Checkpointing enabled with rollback in progress
         - Links to all other CARMA components
         - Fully distributed multi-node operation with job/task migration
         - Links to CARMA monitor and GEMS to make scheduling decisions
         - Tradeoff studies and analyses underway
       - External extensions to COTS tools (COTS plug and play)
         - Expand upon preliminary work @ GWU/GMU*
         - Striving for “plug and play” approach to JM
         - CARMA Monitor provides board info. (via ELIM)
         - Working to link to CARMA CM
         - Tradeoff studies and analysis underway
         - Integration of other CARMA components in progress
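
The slide above mentions task-based execution via Condor-like DAGs. As a rough illustration only, the sketch below orders the tasks of a job so that each task runs after the tasks it depends on (Kahn's algorithm). The function name, the dictionary layout, and the example job are assumptions made for this sketch; they are not the CARMA JM's actual data structures or API.

```python
from collections import deque


def topological_order(dependencies):
    """Return task names in an order that respects their dependencies.

    `dependencies` maps each task to the list of tasks it must wait for.
    This layout is a hypothetical stand-in for a DAG job description.
    """
    # Collect every task, including ones that only appear as a parent.
    tasks = set(dependencies)
    for parents in dependencies.values():
        tasks.update(parents)

    # pending[t] = number of distinct parents that have not finished yet.
    pending = {t: len(set(dependencies.get(t, ()))) for t in tasks}
    children = {t: [] for t in tasks}
    for task, parents in dependencies.items():
        for parent in set(parents):
            children[parent].append(task)

    # Start with tasks that have no parents; sort for a repeatable order.
    ready = deque(sorted(t for t in tasks if pending[t] == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(children[task]):
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("job graph contains a cycle")
    return order


# Hypothetical job: two independent tasks feed a final merge step.
job = {"add": [], "add_one": [], "merge": ["add", "add_one"]}
print(topological_order(job))
```

A real job manager would dispatch every ready task concurrently rather than one at a time; the queue of ready tasks is where that decision would be made.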

    8. CARMA CM Design
       - Builds upon previous design concepts*
       - Execution Manager (EM)
         - Forks tasks from JM and returns results to JM
         - Requests and releases configurations
       - Configuration Manager (CM)
         - Manages configuration transport and caching
         - Loads, unloads configurations via BIM
       - Board Interface Module (BIM)
         - Provides board independence
         - Allows for configuration temporal locality benefits
       - Communication Module
         - Handles all inter-node communication
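
The slide above says the CM manages configuration transport and caching and benefits from temporal locality. The slides do not state a replacement policy, so the sketch below assumes a least-recently-used (LRU) cache purely for illustration. The class name `ConfigCache`, the `fetch` callback standing in for configuration transport, and the file name `Other.bit` are all hypothetical.

```python
from collections import OrderedDict


class ConfigCache:
    """Small LRU cache of configuration files (assumed policy, not CARMA's)."""

    def __init__(self, capacity, fetch):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.fetch = fetch            # hypothetical transport: name -> bytes
        self.entries = OrderedDict()  # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def get(self, name):
        if name in self.entries:
            # Cache hit: mark the configuration as most recently used.
            self.entries.move_to_end(name)
            self.hits += 1
            return self.entries[name]
        # Cache miss: transport the configuration, then evict if over capacity.
        self.misses += 1
        data = self.fetch(name)
        self.entries[name] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return data


# The lambda stands in for fetching a bit file from a remote node.
cache = ConfigCache(capacity=2, fetch=lambda name: name.encode())
for name in ["AddOne.bit", "NQueens.bit", "AddOne.bit", "Other.bit"]:
    cache.get(name)
print(list(cache.entries), cache.hits, cache.misses)
```

Repeated requests for the same configuration are served locally, which is the temporal-locality benefit the slide refers to.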

    9. Distributed CM Management Schemes

    10. CM System Recommendations

    11. CARMA Monitoring Services
       - Monitoring service
         - Statistics Collector
           - Gathers local and remote information
           - Updates GEMS* and local values
         - Query Processor
           - Processes task scheduling requests from JM
           - Maintains local information
         - Round-Robin Database
           - Compact way to store performance logs
           - Supports simple query interface
       - CARMA Diagnostic
         - System watchdog alerts based on defined heuristics of failure conditions
         - Provides system monitoring and debug
       - Initial monitor version is complete
         - Studying FPGA monitoring options
         - Increasing the scheduling options
         - Tradeoff studies and analyses underway
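
The slide above describes a round-robin database as a compact way to store performance logs. The general idea, sketched below under assumed names, is a fixed number of slots in which the newest sample overwrites the oldest, so storage never grows. `RoundRobinLog` and its methods are hypothetical and do not reflect the CARMA monitor's actual interface.

```python
class RoundRobinLog:
    """Fixed-size log in which new samples overwrite the oldest ones."""

    def __init__(self, slots):
        if slots <= 0:
            raise ValueError("slots must be positive")
        self.slots = [None] * slots
        self.count = 0  # total number of samples ever recorded

    def record(self, timestamp, value):
        # The write position wraps around once every slot has been used.
        self.slots[self.count % len(self.slots)] = (timestamp, value)
        self.count += 1

    def samples(self):
        """Return the retained samples, oldest first."""
        size = len(self.slots)
        if self.count <= size:
            return self.slots[:self.count]
        start = self.count % size  # index of the oldest retained sample
        return self.slots[start:] + self.slots[:start]

    def average(self):
        """Simple query: mean of the retained values, or None if empty."""
        data = self.samples()
        if not data:
            return None
        return sum(value for _, value in data) / len(data)


log = RoundRobinLog(slots=3)
for t, load in enumerate([10, 20, 30, 40, 50]):
    log.record(t, load)
print(log.samples(), log.average())
```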

    12. CARMA End-to-End Service Description
       - Functionality demonstrated to date
         - Graphical user interface
         - Job/task scheduling based on board requirements and configuration temporal locality
         - Parallel and serial jobs
         - CARMA registered and non-registered tasks
         - Remote execution and result retrieval
         - Configuration caching and management
         - Mixed RC and “CPU-only” tasks
         - Heterogeneous board execution (3 types thus far)
         - System and RC device monitoring
         - Inter-node communication via SCI or TCP/IP/GigE
         - Fault-tolerant design
           - Processes can be restarted while running
       - Virtually no system impact from CARMA overhead despite use of unoptimized code
         - Less than 5 MB RAM per node
         - Less than 0.1% processor utilization on a 2.4 GHz Xeon server
         - Less than 200 Kbps network utilization
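
The slide above lists scheduling based on board requirements and configuration temporal locality. One plausible reading, sketched below, is to filter nodes by the required board type and then prefer a node that already holds the needed configuration, breaking ties by load. The node dictionary fields, the function `pick_node`, and the load values are assumptions for illustration; the slides do not describe the actual scheduling rule.

```python
def pick_node(nodes, board_type, config):
    """Choose a node name for a task, or None if no node has the board.

    Each node is a dict with hypothetical fields:
      'name'   - node identifier
      'boards' - set of RC board types installed on the node
      'cached' - set of configuration files already held locally
      'load'   - current load, lower is better
    """
    # Board requirement: only nodes with the right board type qualify.
    candidates = [n for n in nodes if board_type in n["boards"]]
    if not candidates:
        return None
    # Temporal locality: a cached configuration sorts first (False < True),
    # then the less loaded node wins.
    best = min(candidates, key=lambda n: (config not in n["cached"], n["load"]))
    return best["name"]


nodes = [
    {"name": "n1", "boards": {"RC1000"}, "cached": set(), "load": 0.1},
    {"name": "n2", "boards": {"RC1000"}, "cached": {"NQueens.bit"}, "load": 0.5},
    {"name": "n3", "boards": {"Tarari"}, "cached": {"NQueens.bit"}, "load": 0.0},
]
print(pick_node(nodes, "RC1000", "NQueens.bit"))
```

Here the busier node `n2` is chosen over `n1` because it avoids transporting and reloading the configuration; whether that tradeoff is worthwhile in practice depends on measured transfer and load costs.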

    13. CARMA Framework Verification
       - Several test jobs executed concurrently
         - Parallel Add Test composed of
           - ADD.exe, a “CPU-only” task to add two numbers
           - AddOne.bit, an RC task to increment input value
         - Parallel N-Queens Test composed of
           - ADD.exe, a “CPU-only” task to add two numbers
           - NQueens.bit, an RC1000 task to calculate a subset of the total number of solutions for an N×N board
           - 4 RC1000s and 4 Tararis communicating via MPI
         - Parallel Sieve of Eratosthenes (on Tarari)
         - Parallel Monte Carlo Pi Generator (on Tarari)
         - Blowfish encrypt/decrypt (on ADM-XRC)
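
The N-Queens test above has each RC task calculate a subset of the total number of solutions. The slides do not say how the work is divided, so the sketch below assumes one common split: each task counts only the solutions whose first-row queen sits in a given column, and a final step adds the partial counts (the role an add task could play). The function names are hypothetical, and the subproblems run sequentially here rather than on boards communicating via MPI.

```python
def count_from_first_column(n, first_col):
    """Count N-Queens solutions whose row-0 queen is in column first_col."""

    def place(row, cols, diag1, diag2):
        # All n rows filled: one complete solution found.
        if row == n:
            return 1
        total = 0
        for col in range(n):
            d1, d2 = row - col, row + col
            # Skip squares attacked along a column or either diagonal.
            if col in cols or d1 in diag1 or d2 in diag2:
                continue
            total += place(row + 1, cols | {col}, diag1 | {d1}, diag2 | {d2})
        return total

    # Row 0 is fixed, so the search starts at row 1.
    return place(1, {first_col}, {-first_col}, {first_col})


def count_all(n):
    """Sum the independent partial counts, one per first-row column."""
    return sum(count_from_first_column(n, col) for col in range(n))


print(count_all(8))  # 92, the well-known solution count for the 8x8 board
```

Because the partial counts share no state, each call to `count_from_first_column` could be handed to a separate worker, which is what makes this kind of split attractive for parallel execution.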

    14. Conclusions
       - First working version of CARMA complete & tested
       - Numerous features supported
         - Simple GUI front-end interface
         - Coherent multitasking, multi-user environment
         - Dynamic RC fabric discovery and management
         - Robust job scheduling and management
         - Fault-tolerant and scalable services by design
         - Performance monitoring down into the RC fabric
         - Heterogeneous board support with hardware independence
         - Linking to COTS job management service
       - Initial testing shows the framework to be sound with very little overhead imposed upon the system

    15. Future Work and Acknowledgements
       - Continue to fill in additional CARMA features
         - Include support for other boards, application mappers, and languages
         - Complete JM rollback feature and finish linkage to LSF
         - Include broker and caching mechanisms for the peer-to-peer distributed CM scheme
         - Include more intelligent scheduling algorithms (e.g. Last Release Time)
         - Expand RC device monitoring and include debug and opt. mechanisms
         - Enhance security including secure data transfer and authentication
       - Deploy on a large-scale test facility
       - Develop CARMA instantiations for other RC domains
         - Distributed shared-memory machines with RC (e.g. SGI Altix)
         - Embedded RC systems (e.g. satellite/aircraft systems, munitions)
       - We wish to thank the following for supporting this research:
         - Department of Defense
         - Xilinx
         - Celoxica
         - Alpha Data
         - Tarari
         - Key vendors of our HPC cluster resources (Intel, AMD, Cisco, Nortel)
