Grid Coordination by Using the Grid Coordination Protocol R. Harakaly, F. Bonnassieux, P. Primet Presented by: Laurent LEFEVRE CNRS-UREC, Lyon, FRANCE INRIA RESO, LIP (UMR CNRS, ENS, INRIA, UCB), Lyon, FRANCE
Outline • Why do we need grid scheduling? • Grid Coordination Protocol • Features • Architecture • Multiple ring support • Robustness • Security • One time token • User Interface • Implementation and Results • Network monitoring • Configuration coordination • Network Topology Discovery • Summary GAN 2004
Why do we need grid scheduling? • Centralized services: VO servers, CRL distribution servers, configuration servers • Distributed services: network monitoring and discovery
Grid Coordination Protocol • Based on the Probes Coordination Protocol (PCP) • Generalized functions, not limited to network monitoring • Ring-with-token approach • Multiple ring support with inter-ring host locking for scalability • Used for: • Network monitoring synchronization • Coordination of configuration updates • Scheduling of information distribution
Features • Openness: any required service can be scheduled • Flexibility/Customizability: full and easy (re)configuration/parametrization of the service on the remote nodes • Robustness/Reliability: the service provided must be fully reliable • Scalability: a large number of members can be scheduled • Security: distributed information and participating member nodes must be secure • One-time token: information distribution on demand
GCP Architecture • Distributed architecture • No central information source • No single point of failure • Distributed token registration • Distributed functions • Scalability • Ring: a logical group of services • Support for multiple rings • Possibility to build a hierarchy of rings
Multi-ring support • Required for: • Scalability, through the creation of a ring hierarchy • Scheduling of different services (e.g. CRL update, topogrid, Iperf, etc.) • Multiple independent rings: danger of collisions • Critical for active network measurements
Inter-Ring Experiment Collision • Collision possibility: • In case of multiple independent rings sharing one or more hosts • Ring1 members {1, 2, 6, 7} • Ring2 members {3, 4, 5, 7} • Solution: • Inter-ring host locking [Figure: hosts 1 to 7 in two rings; two measurements collide on the same host]
GCP host locking mechanism • Source and destination host locking • Conflicting experiments are delayed due to the lock on the host [Figure: hosts 1 to 7 in two rings; one experiment is BLOCKED because it is unable to lock its destination]
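The locking rule on this slide can be illustrated with a small sketch. It assumes a single in-memory lock table, which is a simplification for illustration only (GCP itself is distributed), and the names `HostLockTable`, `try_lock_pair` and `unlock_pair` are hypothetical, not part of the GCP API. The host numbers follow the Ring1/Ring2 example, where host 7 belongs to both rings.

```python
from dataclasses import dataclass, field


@dataclass
class HostLockTable:
    """Hypothetical table recording which ring holds the lock on each host."""
    owners: dict = field(default_factory=dict)  # host -> ring name

    def try_lock_pair(self, ring: str, source: int, destination: int) -> bool:
        """Lock source and destination together, or lock nothing at all."""
        for host in (source, destination):
            holder = self.owners.get(host)
            if holder is not None and holder != ring:
                return False  # conflicting experiment has to be delayed
        self.owners[source] = ring
        self.owners[destination] = ring
        return True

    def unlock_pair(self, ring: str, source: int, destination: int) -> None:
        """Release only the locks that this ring actually holds."""
        for host in (source, destination):
            if self.owners.get(host) == ring:
                del self.owners[host]


locks = HostLockTable()
first = locks.try_lock_pair("Ring1", source=1, destination=7)
blocked = locks.try_lock_pair("Ring2", source=4, destination=7)  # host 7 is busy
locks.unlock_pair("Ring1", source=1, destination=7)
retry = locks.try_lock_pair("Ring2", source=4, destination=7)
```

Checking both hosts before taking either lock means a blocked experiment never holds a partial lock, so it can simply be retried later.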
GCP Robustness • Distributed architecture • No single point of failure • If one measurement host fails, GCP bypasses it without any impact on the service periodicity • For a reliable service, a failure report can be created so that the task can be completed successfully later • Protocols based on token passing face problems with lost and/or duplicated tokens: • Timeout-based token recovery mechanism • Duplicate token elimination based on Token_ID and regenerating_host_ID
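A minimal sketch of the two token-recovery ideas follows. The slide names the fields Token_ID and regenerating_host_ID but does not give the exact rule, so the rule used here is an assumption: a regenerated token gets the next token id, and of two competing tokens the one with the larger (token id, regenerating host id) pair survives. All function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    token_id: int              # assumed to increase at each regeneration
    regenerating_host_id: int  # host that (re)generated this token


def should_regenerate(last_seen: float, now: float, timeout: float) -> bool:
    """Timeout-based recovery: presume the token lost after `timeout`."""
    return now - last_seen > timeout


def regenerate(old: Token, host_id: int) -> Token:
    """Create a replacement token stamped with the regenerating host."""
    return Token(old.token_id + 1, host_id)


def keep_token(current: Token, incoming: Token) -> Token:
    """Assumed duplicate-elimination rule: the larger pair wins."""
    return max(current, incoming,
               key=lambda t: (t.token_id, t.regenerating_host_id))


lost = Token(token_id=940, regenerating_host_id=3)
expired = should_regenerate(last_seen=0.0, now=601.0, timeout=600.0)
not_expired = should_regenerate(last_seen=0.0, now=600.0, timeout=600.0)

# Two hosts time out independently and both regenerate the token.
from_host_5 = regenerate(lost, host_id=5)
from_host_2 = regenerate(lost, host_id=2)
survivor = keep_token(from_host_2, from_host_5)
# The old token reappearing later is dropped in favour of the survivor.
after_stale = keep_token(survivor, lost)
```

Because every host applies the same deterministic comparison, all hosts that see both duplicates discard the same one.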
GCP Security • Three main security issues: • Host Security: it must be impossible to start a non-approved service on the host, or any action that compromises host security • Token Security: the token must not be modifiable in transit, so that its integrity is preserved • User Authentication: an owner is assigned to each token, and any token manipulation or service execution is based on this information
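The host-security and user-authentication points can be sketched as two checks. This is an illustration under assumptions, not the GCP implementation: the per-host allow-list, the `Token` fields and the user names are hypothetical, and token integrity protection is not covered here. Only the command name `edg-pinger` comes from the presentation.

```python
from dataclasses import dataclass

# Hypothetical per-host allow-list of services that may be started.
APPROVED_COMMANDS = frozenset({"edg-pinger"})


@dataclass(frozen=True)
class Token:
    ring: str
    owner: str    # authenticated identity assigned to the token
    command: str  # service the token asks each host to run


def may_execute(token: Token, approved=APPROVED_COMMANDS) -> bool:
    """Host security: only approved services may be started."""
    return token.command in approved


def may_modify(token: Token, authenticated_user: str) -> bool:
    """User authentication: only the owner may manipulate the token."""
    return authenticated_user == token.owner


good = Token(ring="pinger", owner="alice", command="edg-pinger")
bad = Token(ring="pinger", owner="alice", command="rm-everything")

good_runs = may_execute(good)
bad_runs = may_execute(bad)
owner_edit = may_modify(good, "alice")
other_edit = may_modify(good, "mallory")
```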
One Time Token • New feature • The token passes once through all member nodes • Used for: • Non-periodic/on-demand/interactive services • On-demand CRL update • Ad hoc monitoring measurements • On-demand/interactive active network monitoring probes • Plan: add the possibility to define an arbitrary number of passes
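As a rough sketch, a one-time token can be modeled as a token that visits every member once and then stops, with the planned extension expressed as a `passes` parameter. The function name `circulate` and its signature are hypothetical, and the member list reuses the Ring1 example.

```python
from typing import Callable, Sequence


def circulate(members: Sequence[int],
              run: Callable[[int], None],
              passes: int = 1) -> int:
    """Pass the token around the ring `passes` times, running the
    service once per visit. Returns the number of visits made."""
    if passes < 1:
        raise ValueError("passes must be at least 1")
    visits = 0
    for _ in range(passes):
        for host in members:
            run(host)  # the token holder executes the service
            visits += 1
    return visits      # the token is discarded afterwards


ring1 = [1, 2, 6, 7]

one_time_log = []
one_time_visits = circulate(ring1, one_time_log.append)

two_pass_log = []
two_pass_visits = circulate(ring1, two_pass_log.append, passes=2)
```

With the default `passes=1` this is the one-time token described above, while a periodic ring would instead keep circulating until it is deleted.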
User Interface • A set of utilities is provided for easy manipulation (creation, deletion, update, ...) of rings and for external GCP host (un)locking. • C and Java APIs for embedding GCP client functionality (ring creation, modification, etc.) are prepared.
edg-gcpd-admin output
[hary@ccwp7 bin]$ ./edg-gcpd-admin -L grid-nm.ifae.es
GCP daemon version: 2.0.7
Reporting node: 192.101.162.78
Ring name: pinger, token id: 940, options = 0
Token status: NORMAL
Token state: WAITING
Period 1800, Delay 60, Timeout 600
Command: edg-pinger
Last execution timestamp: Fri Apr 9 10:50:14 2004
Members: 134.158.105.254 137.138.225.18 141.52.160.24 130.246.187.145 193.136.90.138 193.206.210.133 131.154.99.101 192.101.162.78 192.16.186.229 ...
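For scripting around this report, the fields can be pulled out with a few regular expressions. This is a sketch that assumes only the field labels visible in the sample above; `parse_ring_report` is a hypothetical helper, not part of the GCP tools, and `SAMPLE` is an abbreviated excerpt of the output shown on the slide.

```python
import re

# Abbreviated excerpt of the report shown above (member list shortened).
SAMPLE = (
    "Ring name: pinger, token id: 940, options = 0 "
    "Token status: NORMAL Token state: WAITING "
    "Period 1800, Delay 60, Timeout 600 Command: edg-pinger "
    "Members: 134.158.105.254 137.138.225.18 141.52.160.24"
)


def parse_ring_report(text: str) -> dict:
    """Extract ring name, token id, token state and timing values."""
    ring = re.search(r"Ring name: ([^,\s]+), token id: (\d+)", text)
    timing = re.search(r"Period (\d+), Delay (\d+), Timeout (\d+)", text)
    status = re.search(r"Token status: (\S+)", text)
    state = re.search(r"Token state: (\S+)", text)
    if not (ring and timing and status and state):
        raise ValueError("report does not match the expected layout")
    # Everything after "Members:" is scanned for dotted-quad addresses.
    members_part = text.split("Members:", 1)[1] if "Members:" in text else ""
    members = re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", members_part)
    return {
        "ring": ring.group(1),
        "token_id": int(ring.group(2)),
        "status": status.group(1),
        "state": state.group(1),
        "period": int(timing.group(1)),
        "delay": int(timing.group(2)),
        "timeout": int(timing.group(3)),
        "members": members,
    }


report = parse_ring_report(SAMPLE)
```

The patterns do not depend on line breaks, so the same function works whether the report is read as separate lines or as one flattened string.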
Implementation and results • Most of the presented use cases are already deployed on the application testbed of the European DataGrid project.
Network monitoring • Scheduling of the set of distributed network monitoring sensors • Scalability problems solved by a multilayer monitoring architecture • Inter-ring locking used to avoid concurrent measurements between two rings [Figure: rings labeled Fr, Es and It together with a Backbone ring]
Experiment periodicity measurement [Figure: plot over Periodicity [s], axis range 118 to 130, with series labeled "period" and "token regeneration count"]
Network monitoring configuration • Network monitoring management cannot be completely distributed. It is always centralized in one (or several) network operation centers. • Monitoring nodes then download the configuration files from these centers. • GCP makes it possible to create easily maintainable and configurable upgrade scenarios • This approach is easily applicable to any service that publishes information on a central node (CA CRL updates, VO servers, etc.)
Network Topology Discovery
Summary • GCP is a generic coordination protocol for grid control and management services • Stability and usability were demonstrated on the use cases already implemented in the European DataGrid (EDG) project • Download: http://ccwp7.in2p3.fr • Questions: robert.harakaly@urec.cnrs.fr