NorduGrid Architecture and tools CHEP2003 – UCSD Anders Wäänänen waananen@nbi.dk
NorduGrid project • Launched in spring of 2001, with the aim of creating a Grid infrastructure in the Nordic countries. • Idea to have a Monarch architecture with a Nordic tier 1 center • Partners from Denmark, Norway, Sweden, and Finland • Initially meant to be the Nordic branch of the EU DataGrid (EDG) project • 3 full-time researchers, with a few more externally funded
Motivations • NorduGrid was initially meant to be a pure deployment project • One goal was to have the ATLAS data challenge run by May 2002 • Should be based on the Globus Toolkit™ • Available Grid middleware: • The Globus Toolkit™ • A toolbox – not a complete solution • European DataGrid software • Not mature in the beginning of 2002 • Architecture problems
Architecture requirements • No single point of failure • Should be scalable • Resource owners should have full control over their resources • As few site requirements as possible: • Local cluster installation details should not be dictated • Method, OS version, configuration, etc… • Compute nodes should not be required to be on the public network • Clusters need not be dedicated to the Grid
NorduGrid components • Grid Manager – Manages Grid jobs in the cluster • Job control and data management • Information system • Patched Globus MDS with improved schema • User interface • Job submission and personal broker • Grid monitor • Web-based interface to the information system • Globus replica catalog
Grid manager features 1 • Staging of executables and input/output data • Supported protocols: • Local files, gridftp, ftp, http(s), Replica Catalog, Replica Location Service • Data transfer control including retries • Caching of input data • Cache size control • Private (per UNIX user) and shared caches • Data access control based on user’s credentials • Support for runtime environments (e.g. software installations) • Full job information available for auditing, accounting and debugging
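The "data transfer control including retries" and "cache size control" bullets can be illustrated with a minimal Python sketch. It is not the Grid Manager's code: the function and class names, the retry count, and the least-recently-used eviction policy are all assumptions made for illustration, and the `fetch` callable stands in for whatever protocol (gridftp, ftp, http(s), ...) actually moves the data.

```python
import time
from collections import OrderedDict
from typing import Callable


def transfer_with_retries(fetch: Callable[[], bytes],
                          max_attempts: int = 3,
                          delay_s: float = 0.0) -> bytes:
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError as err:  # treat I/O errors as retryable (assumption)
            last_error = err
            if attempt < max_attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"transfer failed after {max_attempts} attempts") from last_error


class InputCache:
    """Size-limited cache of input data, keyed by source URL.

    Evicts the least recently used entries first (an assumed policy).
    """

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self._items = OrderedDict()  # url -> bytes, oldest first
        self._size = 0

    def get(self, url: str, fetch: Callable[[], bytes]) -> bytes:
        if url in self._items:
            self._items.move_to_end(url)  # mark as recently used
            return self._items[url]
        data = transfer_with_retries(fetch)
        self._store(url, data)
        return data

    def _store(self, url: str, data: bytes) -> None:
        if len(data) > self.max_bytes:
            return  # larger than the whole cache: hand back uncached
        self._items[url] = data
        self._size += len(data)
        while self._size > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self._size -= len(evicted)
```

A private (per-user) cache and a shared cache could then simply be two `InputCache` instances with different size limits.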
Grid manager features 2 • Globus building blocks used • GridFTP – fast, reliable and secure data access • GASS transfer – http(s)-like data access protocol • Replica catalog • Replica Location Service (with EDG) • RSL – expandable Resource Specification Language • Limitations • Data handling is currently only supported at job start and job end when cluster nodes are on a private network
Grid Manager architecture • [Figure; labels: frontend, computing node, NorduGrid gridftp server (submission, file access, job control), Grid Manager (stagein downloader, stageout uploader), LRMS, NFS, job session directory, cache (link or copy)]
User interface • The NorduGrid user interface provides a set of commands for interacting with the grid • ngsub – for submitting jobs • ngstat – for states of jobs and clusters • ngcat – to see stdout/stderr of running jobs • ngget – to retrieve the results from finished jobs • ngkill – to kill running jobs • ngclean – to delete finished jobs from the system • ngcopy – to copy files to, from and between file servers and replica catalogs • ngremove – to delete files from file servers and RC’s
Information system • The nervous system of the Grid - information is a critical resource! • Complications: • Large number of resources -> scalability • Heterogeneous resources -> characterization • Decentralized • Efficient access to dynamic data • Quality and reliability of information • Compromise between: • Up-to-date data vs. load on the Grid
NorduGrid information system • Use Globus MDS • Improved schemas with natural representation of resources: • Clusters (queues, jobs and users) • Storage elements • Replica Catalogs • Use efficient providers • Each resource runs a GRIS • GRISes are organized into a dynamic country-based GIIS hierarchy • Have enough information to do brokering
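DIT of a cluster • [Figure: a cluster entry with two queue entries beneath it; each queue has a jobs branch and a users branch. One queue holds job-01 to job-03 and user-01 and user-02; the other holds job-04 and job-05 and user-01 to user-03]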
Job entry • Job status monitoring = information system query
Another job entry • The job entry is generated on the execution cluster • When the job is completed and the results are retrieved, the job entry disappears from the information system
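The idea that job status monitoring is just an information system query, and that a job entry vanishes once its results are retrieved, can be sketched in Python with a nested dictionary standing in for the cluster's DIT. This is only a toy model: the queue names, the status strings, and both function names are hypothetical, and the real system answers such queries through MDS rather than an in-memory structure.

```python
from typing import Optional

# Hypothetical snapshot of one cluster's subtree: cluster -> queues -> jobs/users.
cluster = {
    "queues": {
        "queue-a": {
            "jobs": {
                "job-01": {"status": "RUNNING"},
                "job-02": {"status": "QUEUED"},
            },
            "users": {"user-01": {}, "user-02": {}},
        },
        "queue-b": {
            "jobs": {"job-04": {"status": "FINISHED"}},
            "users": {"user-01": {}},
        },
    }
}


def job_status(cluster: dict, job_id: str) -> Optional[str]:
    """Look up a job entry under any queue; None means no such entry."""
    for queue in cluster["queues"].values():
        entry = queue["jobs"].get(job_id)
        if entry is not None:
            return entry["status"]
    return None


def retrieve_results(cluster: dict, job_id: str) -> bool:
    """Model result retrieval: the job entry is removed from the tree."""
    for queue in cluster["queues"].values():
        if job_id in queue["jobs"]:
            del queue["jobs"][job_id]
            return True
    return False
```

After `retrieve_results(cluster, "job-04")`, a later `job_status(cluster, "job-04")` returns `None`, mirroring the job disappearing from the information system.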
Personalized information • User-based information is essential on the Grid: • Users are not really interested in the total number of CPUs of a cluster, but in how many of those are available to them! • The number of queuing jobs is irrelevant if the submitted job gets executed immediately • Instead of the total disk space, the user's quota is what matters • The nordugrid-authuser objectclass provides: • freecpus • diskspace • queuelength
GIIS hierarchy • [Figure: hierarchy of GRISes/GIISes]
Brokering & job submission • Searches through the NorduGrid Testbed for available clusters • Loops through all the clusters and selects those queues (possible targets) where: • The user is authorized to run • Job requirements can be satisfied • Selects a job destination from the matching targets • Randomly selects among the free resources (where user-freecpus>0) • In case there are no free matching resources, some of the “load” attributes (e.g. user-queuelength) are taken into account
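The selection steps above can be sketched in Python as follows. This is a simplified illustration, not the broker's actual code: the `Queue` fields and the `select_target` function are hypothetical, job requirements are reduced to a set of required runtime environments, and the fallback of picking the shortest per-user queue is only one plausible reading of "load attributes are taken into account".

```python
import random
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Queue:
    name: str
    authorized_users: Set[str]
    runtime_environments: Set[str]
    user_freecpus: int      # CPUs currently free for this user
    user_queuelength: int   # jobs queued ahead of this user


def select_target(queues: Iterable[Queue], user: str,
                  required_envs: Set[str], rng=random) -> Optional[Queue]:
    # Step 1: keep queues where the user is authorized and the
    # job requirements can be satisfied.
    targets = [q for q in queues
               if user in q.authorized_users
               and required_envs <= q.runtime_environments]
    if not targets:
        return None
    # Step 2: pick randomly among targets with free CPUs for this user.
    free = [q for q in targets if q.user_freecpus > 0]
    if free:
        return rng.choice(free)
    # Step 3 (assumed policy): otherwise prefer the shortest queue.
    return min(targets, key=lambda q: q.user_queuelength)
```

Choosing at random among free resources spreads jobs across clusters without any coordination between independently running personal brokers.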
NorduGrid job submission • [Figure; labels: RSL, Gatekeeper, GridFTP, Grid Manager, MDS, RC, SE]
Quick client installation/job run • As a normal user: • Retrieve nordugrid-standalone-0.3.17.rh72.i386.tgz, then run:
    tar xfz nordugrid-standalone-0.3.17.rh72.i386.tgz
    cd nordugrid-standalone-0.3.17
    source ./setup.sh
• Maybe get a certificate:
    grid-cert-request
• Install the certificate per instructions, then run:
    grid-proxy-init
    ngsub '&(executable=/bin/echo)(arguments="Hello World")'
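The job description passed to ngsub above is an RSL string. As a rough illustration of its shape, the following Python sketch assembles such a string from a dictionary. The helper names `rsl_value` and `to_rsl` are hypothetical, only the simple `&(name=value)` form used on the slide is covered, and values containing double quotes are rejected rather than escaped.

```python
def rsl_value(value: str) -> str:
    """Quote a value if it contains whitespace; reject embedded quotes."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled in this sketch")
    if any(ch.isspace() for ch in value):
        return f'"{value}"'
    return value


def to_rsl(attributes: dict) -> str:
    """Render {name: value} pairs as a conjunctive RSL string."""
    parts = [f"({name}={rsl_value(str(value))})"
             for name, value in attributes.items()]
    return "&" + "".join(parts)


job = {"executable": "/bin/echo", "arguments": "Hello World"}
print(to_rsl(job))
```

The printed string is the same job description as in the ngsub call above and would be passed to it as a single quoted argument.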
Future development or integration • Better Authorization • Accounting • Optimize brokering • More intelligent data management and replication service • Handle network requests from running jobs on “private” networks • Grid portal interface – in testing • Move towards Grid services and improved community compatibility
Future • The committee of Nordic natural science ministers NOS-N has decided to fund a new common Nordic Grid Project based on the work done by the NorduGrid project. This project should work on a proposal/recommendation for a Nordic DataGrid facility. • Support for the toolkit in the future • This will be supported in each country by local Grid initiatives • Collaboration with the Nordic computing centers has already been initiated with the deployment of the toolkit at several large centers. • Use it for future ATLAS production in the Nordic countries • Move towards OGSA and better community compatibility
Resources • Documentation and source code are available for download • Main Web site: • http://www.nordugrid.org/ • Repository • ftp://ftp.nordugrid.org/pub/nordugrid/
The NorduGrid core group • Александр Константинов • Balázs Kónya • Mattias Ellert • Оксана Смирнова • Jakob Langgaard Nielsen • Trond Myklebust • Anders Wäänänen