230 likes | 383 Views
Institute for Software Integrated Systems. Vanderbilt University Nashville, Tennessee. Resource Allocation & Control Engine (RACE). Nishanth Shankar nshankar@dre.vanderbilt.edu www.dre.vanderbilt.edu/~nshankar. Jai Balasubramanian jai@dre.vanderbilt.edu www.dre.vanderbilt.edu/~jai.
E N D
Institute for Software Integrated Systems Vanderbilt University Nashville, Tennessee Resource Allocation & Control Engine(RACE) Nishanth Shankar nshankar@dre.vanderbilt.edu www.dre.vanderbilt.edu/~nshankar Jai Balasubramanian jai@dre.vanderbilt.edu www.dre.vanderbilt.edu/~jai Dr. Douglas C. Schmidt schmidt@dre.vanderbilt.edu www.dre.vanderbilt.edu/~schmidt
How RACE fits with ARMS Mission Scenario Deployment ( in ) feasible & Mission Planner (IA + ASM) • Decompose mission goal(s) into tasks • Implement tasks as CORBA components • Deploy components using CIAO/DAnCE middleware Condition Monitors • Observe mission performance System Resource Monitors • Observe resource utilization • System wide • Per application basis Adaptive Resource Manager • Allocate resources to new components – Resource Allocator • Modify existing component deployment based on system performance – ASM ( un ) successful Mission Planner Mission Goals Deploy components System Status Condition Adaptive System Resource Monitors Resource Monitors Manager Mission System Performance Performance Physical Resources
Genesis of RACE (1/2) Context • Deploy components • Varying resource requirements • Varying resource availability/capability • Need for resource allocation • Bootstrap system • Need for adaptation • Changes in mission goals (modes) as execution progresses • Changes in mission goals because tasks cannot be completed on time • Degradation in system performance • Task execution times & resource requirements may be revised dynamically Application Components with Varying Resource Requirements Uniform interface to parse component requirements Multiple Resource Allocation Algorithms Uniform interface to deploy components Target Platform with Varying Resource Availabilities & Capabilities
Genesis of RACE (2/2) Problem • Develop effective allocation & adaptation algorithms (policies) • Implement allocation & adaptation algorithms (mechanisms) • Tight coupling between policies & mechanisms • Monolithic implementations increases memory footprint & complexity Solution • Separation of concerns • e.g., separate policy & mechanism • Framework support for plugging in allocation/adaptation algorithms Application Components With VaryingResource Requirements Uniform interface to parse component requirements Allocation Adaptation Algorithms Algorithms Resource Allocation & Control Engine (RACE) Uniform interface to deploy components Target Platform With Varying Resource Availabilities & Capabilities Resource Allocation & Control Engine (RACE)
Why RACE/DAnCE for ARMS ? (1/2) • ARMS’ Primary Goal • Develop Adaptive Resource Management technology for DD(X) • Research Goals • Develop general purpose Adaptive Resource Management into standardized software services • Benefits • Life of technology is not limited to the lifetime of ARMS program • MLRM technology can be reused in other areas research other than ARMS • Increases ease of technology transfer to DD(X) if DRM capability is available via standardized service compared to custom application architectures • Leverages the latest development in CCM technology, and potentially enhances it • As DARPA PM Joseph Cross says: • “We deliver Technology, and not Software!” • “If we can improve the design, lets improve it as now is the time to do so!”
Why RACE/DAnCE for ARMS ? (2/2) • Limitations with current MLRM infrastructure • Software Architecture for Phase I MLRM is brittle and difficult to reuse in contexts outside ARMS • Provide custom services for functions where emerging standardized services are available • e.g. Node Provisioner is a ARMS Node Daemon in DnC specification • Provides an all or nothing solution • Can not reuse individual component such as AMS, IA, or Pool Manager separately out slide the MLRM framework • Proposed Approach • Package ARMS adaptive resource management capabilities into a set of a modular, general purpose, DnC spec compliant / coordinated services • RACE • Design framework for plugging algorithms specializing in resource allocation and adaptive resource management • Provides an abstraction to address the interdependencies between resource allocation and string management in a scalable and configurable manner • Is NOT • One monolithic software application • Algorithms or the core science
Overview of RACE • Resource allocation framework atop DAnCE • Generate deployment plans to allocate components to available resources • Configure components to satisfy QoS requirements based on dynamic mission goals • Run-time adaptation • Coarse-grained mechanisms • React to new missions, drastic changes in mission goals, or unexpected circumstances such as loss of resources • e.g., component re-allocation or migration • Fine-grained mechanisms • Compensate for drift & smaller variations in resource usage • e.g., adjustment of application parameters such as QoS settings Application Components With Varying Resource Requirements Uniform Interface to parse Component Requirements Resource Allocation & Control Engine (RACE) Deploy & Configure Components CIAO/DAnCE Monitor & Adapt Deploy on Target Target Domain
Overview of RACE Architecture Deployment Manager • Collects target domain utilization periodically • Uses a configured resource allocation algorithm & makes new deployment planning decisions Target Manager • Collects utilization information about various domain resources • Allow efficient retrieval of resource utilization information about the domain & individual nodes Resource Monitors • Tracks use of individual resources Deployment Manager Domain resource utilization Target Manager Deploy & configure components CIAO/DAnCE Resource utilization Deploy on target Monitors Target Domain Plug & play framework for resource allocation & adaptation algorithms
Deployment Manager Utilization Data Data Ready Target Manager Monitor Monitor Monitor Target Domain Application Application Component Component Detailed RACE Architecture Deployment Manager • CCM component that encapsulates a configurable framework • Deploy application components • Allocate resources to components • Configure application QoS parameters • Pluggable resource allocation algorithms • Manage overall system performance • Resource utilization • Maintains system resource utilization within specified bounds • Application performance • Ensure desired application performance by tuning different QoS parameters • Pluggable adaptation algorithms
Target Manager CCM component that receives periodic resource updates from monitors One per domain Tracks resource usage of all resources in the domain Uniform interfaces for retrieving information From individual components From the whole component assembly Resource Monitors CORBA components/objects >= 1 monitor per domain resource Periodically observe resource utilization per application and/or overall system utilization e.g. of resource monitors CPU utilization monitor Network bandwidth utilization monitor Memory utilization monitor Deployment Manager Utilization Data Data Ready Target Manager Monitor Monitor Monitor Target Domain Application Application Component Component Detailed RACE Architecture
Deployment Manager Design • Implemented as CCM assembly-based component containing the following individual (monolithic) components • Allocators • Plug-in points for Resource Allocation algorithms • Adapters • Plug-in point for Adaptation algorithms • Manage system/application performance • Lookup Table • Associates assembly with the allocation & adaptation algorithm • Decorators • Weave resource/application monitors into assembly • Allocation History & System State • History of previous allocation & current resource allocation
RACE Runtime View • Receives component assembly to deploy from the planning module in the form of XML descriptors & parses those descriptors. • Decorate assembly with application monitors & QoS decorators • Determine allocation & adaptation algorithm based on assembly information • Consider CPU or network or both? • Use allocation & adaptation algorithm that considers all dimensions • Obtain current resource utilization from Target Manager • Prepare allocation plan using allocation history
RACE Runtime View • Deploy component assembly using generated deployment plan • Monitor system performance • Target Manager • Compute modifications to individual components or the whole assembly using adaptation algorithms • Modify component assembly properties
Intelligent Mission Planner RACE RACE CIAO/DAnCE Resource Monitors Target Domain Current Status & Implementation Plan Completed • CIAO/DAnCE infrastructure middleware In design phase • Architecture of RACE • CCM interfaces of RACE • Allocation & adaptation algorithms Plan of action • Implement • Resource monitors • Target Manager • RACE • Allocation & Adaptation algorithms • Integrate with Intelligent Mission Planner to achieve science mission goals Adaptation Algorithms Adaptation Algorithms Allocation Algorithms Allocation Algorithms Target Manager Deploy Components
Existing ARMS Architecture • Non standard entities • Custom built for ARMS • Monolithic implementation of each layer • Reduced flexibility • Increases complexity • Number of layers in MLRM is fixed to three • Infrastructure Layer • Pool Layer • Node Layer • Possibly limits scalability • Roles performed by entities at Infrastructure Layer and Pool Later are similar, but differ only in “scope”
Proposed MLRM Architecture with RACE/DnC • DnC Spec compliant entities • Wide applicability • Can be reused outside ARMS • Pluggable Framework • Supports multiple • Allocation Algorithms • Adaptation Algorithms • Number of layers in MRLM architecture is flexible • Increases scalability • Logical grouping of related entities • Provides a template for each (higher) layers in the MLRM architecture • MLRM is truly a Multi-Layer Resource Management Middleware, and not limited to three layers
Milestones Details (1/2) • Deploy existing MLRM components using DAnCE • Port MLRM Components to DAnCE • Deploy MLRM Components using DnC Tools (Node Daemon, Execution Manager, and Plan Launcher) instead of P&D tools (CIAO Daemon, Assembly Manager, and Assembly Deployer) • Test Phase I GM2 Experiment (formerly GM3) on EMULAB (and at six machines set up at ISIS in an Emulab-like environment) • Convert the AIM to DAnCE descriptors • Verify that generic property facility is sufficiently rich to capture AIM-like information • Determine mapping of AIM-like properties to generic DAnCE properties • Customize PICML user interface to support intuitive entry and manipulation of AIM-like properties • Develop the RT-HIGH2.1 application string using the new DAnCE descriptors and the new component workload generators • Convert the DIM to DAnCE descriptors • Convert the DIM_Writer and DIM_Reader (XML handling façade classes) so that they produce and consume DAnCE descriptors
Milestones Details (2/2) • Convert the NodeProvisioner to standard DnC entity Node Daemon • Migrate proxy objects to mandatory component facets for component applications; and proxy components for non-component applications • Implement the Target Manager • Map the OMG standard Target Manager IDL interface for resource monitoring to the functionality provided by RSS • Determine interface between Target Manager and Deployment Manager • Implement Resource Monitors and determine an interface between them and the Target Manager • Implement the Deployment Manager • Determine external interfaces (with other MLRM components) • Determine internal interfaces (among Deployment Manager Core, Allocators, and Adapters) • Implement Deployment Manager Core and sample instances of Allocators and Adapters • Parallel development of specialized allocation and adaptation algorithms, adhering to the agreed-upon interface
Convert the AIM to DAnCE descriptors 10% finish by July 1 Convert the DIM to DAnCE descriptors Convert the NodeProvisioner to standard DAnCE tools finish by July 1 Implement Deployment Manager & Target Manager finish by September 1 Proposed Plan of Action Deploy existing MLRM components using DAnCE 100% Completed! Integrate RACE into ARMS MLRM finish by October 1