10 likes | 136 Views
Fork. TDWB. SGE. LL/Fork. TDWB. Fork. WS Interface. BSC Grid Cluster. Fork. Enabling Autonomic Meta-Scheduling in Grid Environments. Yanbin Liu 1 , S. Masoud Sadjadi 2 , Liana Fong 1 , Ivan Rodero 3 , David Villegas 2 , Selim Kalayci 2 , Norman Bobroff 1 , Juan Carlos Martinez 2
E N D
Fork TDWB SGE LL/Fork TDWB Fork WS Interface BSCGrid Cluster Fork Enabling Autonomic Meta-Scheduling in Grid Environments Yanbin Liu1, S. Masoud Sadjadi2, Liana Fong1, Ivan Rodero3, David Villegas2, Selim Kalayci2, Norman Bobroff1, Juan Carlos Martinez2 1: IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 2: Florida International University, Miami, FL 33199 3: Barcelona Supercomputing Center, Barcelona, Spain I: Objectives II: P2P Meta scheduling III: Autonomic Meta scheduling • Objective • Enable autonomic meta-scheduling over different organizations • Strategic Importance • Enhance usability: common job control language to different resource domains • Drive interoperability of schedulers: proprietary and open-source • Provide integrated scheduling views for enterprise and grid customers • Technology Benefits • Meet various user service objectives: policy driven (e.g. capability based, response time based) • Maximize resource availability to users with transparency of locations • Optimize utilization of resources across domains • Use of a common protocol for communication • Initial negotiation phase • Meta scheduler roles • Heartbeat rate • Type of authentication • Quality of Service • Job execution • Based on user policies • Resource matching • Physical resources • Scheduler capabilities (FCFS, priority-based) FIU FIU-GCB • Some key aspects of the Metascheduler Protocol: • Heterogeneous sites; inner structure of domains doesn’t effect the functionality of the protocol. • Site autonomy; each metascheduler is responsible from its own site, and offers as much information as it wants to other sites. • Peer-to-peer; no centralized body, no single-point of failure. LAGrid Meta-Scheduler Peer-to-peer Peer-to-peer IBM Meta-Scheduler Meta-Scheduler BSC Peer-to-peer CEPBA IBM-USA BSCgrid IBM-India P: Job flow is from C to P, resource info flow is from P to C C IV: System architecture FIU IBM • Connection API • Establish and terminate connections between domain meta-schedulers. • Negotiate roles and connection parameters using the interface • Provider roles: provide resources for job execution; is responsible of sending out resource information • Consumer roles: use resources provided by providers; route job request to providers. • Send heart beats: exchanged to guarantee the healthy state of the connection. • Resource exchange API • Exchange the scheduling capability and capacity of the domain controlled by the meta-scheduler • Exchanged information can be a complete or incremental set of data • Job management API • Submit, re-route and monitor job executions across schedulers WS Client Meta-TDWB Connection API JSDL Web Console Connection Management Conn Records Connection Management User Client Global Scheduling manager Resource manager Command-ine Job Mgmt API Job Records Job Management Job Management JSDL 1. User Client takes the job request from the local User. This request is forwarded to Global Scheduling Manager (GSM). 2. GSM queries the Resource Manager (RM) for resources. RM stores information about local and remote resources. 3. If available resources are found on local site, job request is forwarded to Site Scheduling Manager (SSM). 4. SSM leverages Gridway functionality to submit the job to the Grid Middleware (Globus). 5. If there are not available resources locally, job request is sent to a remote site through WS Client 6. Alternatively, job requests from other peers can be received from the WS layer. Scheduler Site scheduling manager Resource Repository Resource exchange API Gridway Resource Management Resource Management Sched Policy Globus Globus LL Proxy Resource Management Job Management Connection Management Web Services Tivoli Agent 1. Users send job requests in JSDL format to Meta-TDWB using either TDWB web console or command line. 2. The connection with other meta-schedulers is created and recorded. 3. The resource information from other meta-schedulers, local schedulers such as LoadLeveler and TDWB, and local resources are recorded. 4. Combined with chosen scheduling policy, scheduler matches job requests to resources. 5. Based on the allocated resources, job requests are either forwarded to other schedulers or dispatched to the resources directly. SGE Fork Load Leveler TDWB GCB Cluster LAGrid Cluster Resources Apache Axis2 Server LAGrid Plugin LoadLeveler WS Client eNANOS Client LAGridRP Globus 1. The eNANOS Client forwards the user requests to the eNANOS Broker. 2. The remote request from the P2P infrastructure are managed by regular WS (Axis2) acting as a wrapper to a GT4 service that implements the LAGrid APIs and protocols. Connections and other data is stored in Resource Properties. 3. Jobs and resources (aggregated data) obtained from local and remote sites are used in the eNANOS Resource Broker scheduling. Jobs are executed under the local domain through Globus services, or are forwarded to other meta-scheduler. 4. eNANOS provide its resources data, forwards jobs and performs other operations (such as sending heart beats) through a WS Client. CEPBA Resources JSDL eNANOS Resource Broker Command-line Globus Java API GT4 Container BSC