NextGRID Monitoring and Fabric Management Requirements
SLA Management Example: SweGrid Accounting System and Test-bed
Thomas Sandholm, KTH, sandholm@pdc.kth.se
NextGRID How do we make the Grid sustainable?
Outline
• NextGRID WP4: Grid Foundations - Advanced Deployment, Service Management and Migration
• SLA Management Lifecycle: Construction, Negotiation, Attainment, Charging
• Towards Adaptive Systems: SLA Manager Bag of Services
• Example Test-bed: SweGrid & SGAS
• Example SLA Usage: SLA Management in SGAS
• Requirements Checklist
NextGRID WP4: Grid Foundations - Advanced Deployment, Service Management and Migration
• Work Package - Grid Foundations:
  • Address basic properties, protocols, and core services of individual OGSA services, e.g., QoS & Manageability – engineer reference solution
• Task - advanced deployment, service management and migration:
  • Requirement: Decentralized automatic control needed over the hardware comprising the Grid fabric as well as the applications and services running on that fabric
  • Requirement: Incremental evolution to avoid loss of service
  • Focus on autonomous service management and SLA management
  • Phase 1: analyse available monitoring and supervision solutions. Requirements from existing Grid projects, e.g., Framework 5 Projects, GRASP, Android, SweGrid
  • Phase 2: develop management framework, SLA + negotiation
  • Phase 3: integrate monitoring and management solution and introduce intelligent decision-making process
• NG Partners: British Telecom (UK), HLRS (Germany), KTH (Sweden)
SLA Management Lifecycle
• Construction Phase: offers prepared by service providers (or their agents) with fixed and negotiable terms; service requests with QoS requirements prepared by customers (or their agents)
• Negotiation Phase: negotiation protocol needed to settle on negotiable terms and sign SLA. SLA-SLS mapping.
• Attainment Phase: monitoring, policing, re-negotiation, re-configuration, obligation fulfillment
• Charging Phase: accounting, usage recording, auditing, archiving, price rating, billing
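The four phases above can be read as a small state machine. The following is a minimal sketch, not part of the original slides: the class names, the transition table, and in particular the loop from attainment back to negotiation (to model re-negotiation) are assumptions made for illustration.

```python
from enum import Enum


class Phase(Enum):
    CONSTRUCTION = "construction"
    NEGOTIATION = "negotiation"
    ATTAINMENT = "attainment"
    CHARGING = "charging"


# Allowed transitions (an assumption for illustration): the phases run in
# order, and attainment may loop back to negotiation for re-negotiation.
TRANSITIONS = {
    Phase.CONSTRUCTION: {Phase.NEGOTIATION},
    Phase.NEGOTIATION: {Phase.ATTAINMENT},
    Phase.ATTAINMENT: {Phase.NEGOTIATION, Phase.CHARGING},
    Phase.CHARGING: set(),
}


class SlaLifecycle:
    """Tracks which lifecycle phase a single (hypothetical) SLA is in."""

    def __init__(self):
        self.phase = Phase.CONSTRUCTION
        self.history = [self.phase]

    def advance(self, target):
        # Reject any move that the transition table does not list.
        if target not in TRANSITIONS[self.phase]:
            raise ValueError(
                f"cannot move from {self.phase.name} to {target.name}"
            )
        self.phase = target
        self.history.append(target)


if __name__ == "__main__":
    sla = SlaLifecycle()
    for step in (Phase.NEGOTIATION, Phase.ATTAINMENT, Phase.CHARGING):
        sla.advance(step)
    print([p.name for p in sla.history])
```

Keeping the transitions in one table makes it easy to tighten or relax the lifecycle without touching the class itself.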
Towards Adaptive Systems: SLA Manager Bag of Services
SLA Manager
• P2P Event Manager
• Pricing Manager (GridBank Trader Service)
• Access Flow Policing/Shaping (DiffServ Packet Dropping)
• SLA Provider (WS-Agreement, WSLA)
• Usage Tracker/Analyzer (GGF-UR, Network Traffic Analyzers)
• Policy Manager (PAP, PIP, PDP, PEP)
• Service Registration/Discovery (WS-RF, UDDI)
• Negotiation Agent (Contract Net, WS-AgreementNegotiation)
• Service Monitor/Controller (GGF-CMM, WSDM, WS-RF)
• Policy Rule Base (XACML, FuzzyLogic)
• Knowledge Repository
• Meta-Data Repository (Ontologies, WSDL)
• Usage Repository
Example Test-bed: SweGrid & SGAS
• Swedish nation-wide computational resource comprising 600 Intel P4 at 6 HPC centers, interconnected by the 10 Gb/s GigaSunet network
• Resources allocated to promising research projects with demanding computational and storage needs by the national allocations committee (SNAC)
• SweGrid Accounting System (SGAS) provides soft real-time allocation enforcement across all centers in the Grid, based on SNAC quota
• 3-party policy-driven resource access (user resource specification, local resource policy, allocation authority policy)
• Standards-based infrastructure: Java Web services, OGSA, WS-Security, GSI, GGF-UR, XACML
• Integration platform for workload managers and local accounting systems/schedulers
• Currently built with GT3 (OGSI); transition to GT4 (WS-RF) next year
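The 3-party access decision described above (user resource specification, local resource policy, allocation authority policy) can be sketched as follows. This is an illustration only and does not reflect the actual SGAS interfaces: all class, field, and function names are hypothetical, the quota is reduced to CPU hours, and the soft real-time aspect of enforcement is ignored.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """User resource specification (hypothetical, reduced to CPU hours)."""
    user: str
    project: str
    cpu_hours: float


@dataclass
class LocalPolicy:
    """Local resource policy, set by the resource administrator."""
    max_cpu_hours_per_job: float

    def permits(self, req):
        return req.cpu_hours <= self.max_cpu_hours_per_job


class AllocationAuthority:
    """Holds per-project quotas, standing in for a SNAC-style allocation."""

    def __init__(self, quotas):
        self.remaining = dict(quotas)

    def permits(self, req):
        return self.remaining.get(req.project, 0.0) >= req.cpu_hours

    def reserve(self, req):
        self.remaining[req.project] -= req.cpu_hours


def authorize(req, local, authority):
    """Grant access only if both the local and the allocation policy agree."""
    if not local.permits(req):
        return False, "denied by local resource policy"
    if not authority.permits(req):
        return False, "denied: project quota exhausted"
    authority.reserve(req)  # hold the quota once access is granted
    return True, "granted"


if __name__ == "__main__":
    authority = AllocationAuthority({"proj-a": 100.0})
    local = LocalPolicy(max_cpu_hours_per_job=80.0)
    print(authorize(JobRequest("alice", "proj-a", 60.0), local, authority))
    print(authorize(JobRequest("alice", "proj-a", 60.0), local, authority))
```

Checking the local policy first means a request that the site would reject never touches the shared quota.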
Example NextGRID Deliverable Use: SLA Management in SGAS
[Diagram; component labels: Negotiation Agent (x2), Resource Specification, 3rd Party (ARC/Globus), Service Registration/Discovery, Service Monitor, Resource, Remote Execution Service, Usage Tracker, Bank, Reservation Manager, Allocation Authority, Policy Manager (x2)]
Requirements Checklist (incomplete, in random order)
• Decentralized automatic control needed over hardware comprising Grid fabric as well as applications and services running on that fabric (WP4)
• Incremental evolution to avoid loss of service (WP4)
• Common information models for service level agreements and for the management information that is required to deliver end-to-end application quality (WP3)
• Techniques for adapting the representation of information according to its context (WP3)
• Standardized QoS ontologies to allow monitoring on predefined SLA parameters with well-defined metrics
• Sensors and controllers on various levels (e.g. resource, workflow) wrapping instrumented code – accessed using standard protocols defined in WSDL
• Registration/discovery of sensors and controllers – using standard protocols defined in WSDL
• Both push and pull event handling of messages of various criticality (filterable)
• Virtualization of resources, abstract runtime (hosting) environments
• Back-end SLS control: CPU, bandwidth, storage, memory
• Front-end SLA request: availability, run time, jitter, cost
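The requirement for both push and pull event handling with filtering by criticality could look roughly like the sketch below. It is an in-process illustration only: the `EventChannel` class, the three criticality levels, and the bounded log size are assumptions, and a real deployment would expose this through service interfaces rather than Python callbacks.

```python
from collections import deque
from enum import IntEnum


class Criticality(IntEnum):
    # Hypothetical levels; the slides only say "various criticality".
    INFO = 1
    WARNING = 2
    CRITICAL = 3


class EventChannel:
    """Delivers events by push (callbacks) and keeps them for pull (poll)."""

    def __init__(self, max_retained=1000):
        self._subscribers = []                  # (min_level, callback) pairs
        self._log = deque(maxlen=max_retained)  # bounded store for pull

    def subscribe(self, callback, min_level=Criticality.INFO):
        # Push: the callback fires only for events at or above min_level.
        self._subscribers.append((min_level, callback))

    def publish(self, level, message):
        event = (level, message)
        self._log.append(event)
        for min_level, callback in self._subscribers:
            if level >= min_level:
                callback(event)

    def poll(self, min_level=Criticality.INFO):
        # Pull: the caller asks for retained events, filtered the same way.
        return [e for e in self._log if e[0] >= min_level]


if __name__ == "__main__":
    channel = EventChannel()
    channel.subscribe(print, min_level=Criticality.WARNING)
    channel.publish(Criticality.INFO, "cpu load nominal")
    channel.publish(Criticality.CRITICAL, "SLA availability target missed")
    print(len(channel.poll()))
```

Using one filter rule for both paths keeps push subscribers and pull clients consistent about which events count as relevant.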
Example Test-bed Experience: SGAS Resource Administrator Interaction and Policy Introduction
• Involve resource administrators (RAs) early in the process with surveys
• Feedback from the running system is crucial to move from prototype to production
• Use a phased, low-risk, low-intrusion deployment approach
• Allow all stakeholders (e.g. RAs, users, resource owners) to customize local policies easily through XML document-centric configurations and transformations, e.g. RSL, XACML, GGF-UR style sheets. Provide sensible defaults.