Framework-Grid interfaces: technical survey
K.Harrison and A.Soroko
Cosener’s House, Abingdon, UK, 22 May 2002
– Need for Framework-Grid interfaces
– Outline of required functionality
– Tools for software installation and configuration
– Production tools
– Grid interfaces currently under development
– Conclusions
Aim to give general background, and brief overview of software products relevant to a Framework-Grid interface for ATLAS and LHCb
Many items covered in more detail in later presentations
Need for Framework-Grid interfaces
– Resources for Grid activities are becoming available in increasing numbers
  Want to take advantage of these resources as early as possible
– Take Cambridge as an example:
  For Grid activities, have now:
    32 × 400 MHz Pentium II processors
    20 × 1.13 GHz Pentium III processors
    About 0.5 Tbyte of disk space
    Globus 2.0 installed
  In near future:
    will add 2 Tbyte file server
    will install EDG middleware
    will connect to UK eScience Grid and EU Testbed
– Physics at Cambridge using Grid resources:
  ATLAS: already submitting ATLFAST simulation jobs; plan to participate in data challenges
  LHCb: participating in data challenges (initially non-Grid; later with Grid)
  NA48: preparing to simulate 10^8 events for evaluation of backgrounds in rare kaon decays (300 days of CPU time)
To fully exploit possibilities for physics studies, need a tool that simplifies Grid access and job configuration: Framework-Grid interface
– First ideas for a Grid interface with built-in knowledge of the Gaudi/Athena framework used by ATLAS and LHCb were developed in summer 2001, in particular by P.Mato and C.Tull
  Gaudi/Athena and Grid Alliance (GANGA)
– GANGA might eventually be:
  a completely new Grid interface
  an adaptation/evolution of an existing Grid interface
– In all cases, expect GANGA to be modular and to make use of tools/services developed by others
This workshop should help us understand how to proceed
Outline of required functionality
– A Framework-Grid interface for ATLAS and LHCb will need to provide access to services that can be logically divided into two categories
– Grid services are developed in the context of many groups and work packages:
  Security services
  Job submission
  Job decomposition
  Resource allocation and management
  Data replication and cataloguing
  Application-independent monitoring
  Would hope to use these as they are (assume no further development needed)
– Framework-related services (specific to ATLAS and LHCb) will need to be developed in parallel with the interface implementation:
  Job configuration (algorithms to run, properties, input/output requests)
  Management of software environment (executables, libraries, databases, etc)
  Automatic creation of job-description files
  Error recovery
  Application-specific monitoring
  Bookkeeping
Tools for software installation and configuration
– In general, Grid resources will not be dedicated to a single experiment: might run jobs for ATLAS one day, CDF the next and LHCb the day after
  Framework-Grid interface will need access to a tool that allows setting up of the user’s software environment
– Tools of interest include:
  LCFG: developed in context of EU WP4, based on rpm files
  DAR: developed at FNAL, based on tarballs
  pacman: developed at Boston University; fetches, installs and manages packages based on rpm files or tarballs; makes use of software cache
    See presentation by S.Youssef
Production tools
– Production tools already in use can provide ideas for implementing some of the services to be offered by a Framework-Grid interface
– As an example, consider Simulation for LHCb and its Integrated Control Environment (SLICE)
  see presentation of G.Kuznetsov
– Working in a non-Grid environment:
  Production requests to distributed facilities are submitted via a web page
  Java servlets create job scripts and options files
  Production is monitored using control system based on PVSS
  Update of bookkeeping database, transfer of output data to mass storage and quality checks performed automatically
– Grid-based system at experimental stage
LHCb production strategy using SLICE (From E.vanHerwijnen)
– physicist: request for production: nr of events, channel, datatype (implies a workflow), configuration, deadline for completion
– physics coordinator: ratifies production request, which gets added as outstanding request to the database
– production manager:
  Job creation/submission (via Web): identify outstanding requests, select workflow(s), give nr of events, create scripts
  Monitoring (via PVSS): submit jobs to distributed sites, see what jobs are running, how many, channel, datatype, site, current event nr, configuration used by job, submit time, kill jobs
  • Create required nr of jobs (500 evts each)
  • Determine configuration
  • Determine/create runtime environment
  • Run executable
  • Check data
  • Copy data/logs
  • Flag production as completed, prepare updating of bookkeeping db
– bookkeeping database

  Servlet     Purpose
  Maprunmc    sicbmc for rawh production
  Brunelrun   Brunel for DST production
  Bbinclrun   Sicbmc + sicbdst for physics production
  Mcbrunel    Sicbmc v249 + Brunel v9r1 for data challenge tests, dbase v243r1p1, v243r3
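The first production-manager step above, creating the required number of jobs at 500 events each, can be sketched as a ceiling division. This is a minimal illustration, not SLICE code: the function names are hypothetical, and the treatment of a request that is not a multiple of 500 (one final, shorter job) is an assumption the slide does not confirm.

```c
/* Events per job, as quoted on the slide ("500 evts each"). */
#define SLICE_EVENTS_PER_JOB 500L

/* Number of jobs needed to cover n_events (hypothetical helper).
 * Assumption: a request that is not a multiple of 500 gets one
 * final, shorter job, i.e. ceiling division. */
long slice_jobs_needed(long n_events)
{
    if (n_events <= 0)
        return 0;
    return (n_events + SLICE_EVENTS_PER_JOB - 1) / SLICE_EVENTS_PER_JOB;
}

/* Events assigned to job number job_index (0-based); 0 if out of range. */
long slice_events_in_job(long n_events, long job_index)
{
    long n_jobs = slice_jobs_needed(n_events);

    if (job_index < 0 || job_index >= n_jobs)
        return 0;
    if (job_index < n_jobs - 1)
        return SLICE_EVENTS_PER_JOB;
    /* The last job takes whatever is left over. */
    return n_events - (n_jobs - 1) * SLICE_EVENTS_PER_JOB;
}
```

Under these assumptions a request for 1001 events yields three jobs, the last of which processes a single event.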
[Diagram of SLICE workflow: submit jobs remotely; execute on farm; monitor performance of farm via Web; data quality check; update bookkeeping database; transfer data to mass store] (From E.vanHerwijnen)
Grid interfaces currently under development
– Middleware (Globus, EDG, PPDG, other) provides an interface to grid services via command-line instructions given in a particular sequence
– More user-friendly interfaces are being developed by several groups:
  Alice Environment (AliEn)
    see also presentations of P.Buncic and L.Goosens
  EDG GUI
    see also presentation of D.Colling
  Grid Enabled Web Environment for Site-Independent User Job Submission (GENIUS)
  Grid Access Portal for Physics Applications (Grappa)
    see also presentation of C.Tull
  Others?
AliEn
• General characteristics of AliEn:
  • Under development by Alice Offline Group, but not specific to Alice
  • Uses iVDGL or EDG middleware, Globus toolkit, and a variety of external modules (SOAP, PAM, SWIG, etc)
  • Based on Perl
• User access via machine on which AliEn is installed:
  • Command-line interface allows authentication, access to distributed catalogue, job submission, etc
  • With appropriate module installed, also have GUI interface
  • Web interface is under development
Functionality of AliEn (I)
• File Catalogue:
  • To access the catalogue, user types: alien
  • To authenticate to the server, user must have either a globus certificate, or ssh keys
  • User can browse the catalogue using UNIX-like commands
  • Catalogue entries seen by user are Logical File Names (LFN)
  • Each user has a home directory, and can register files by giving LFN, PFN, and size
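Functionality of AliEn (II)
• Getting a file (from local SE)
[Diagram: Client issues "Get lfn"; via Proxy, Authen is asked "Lfn?" and answers with Pfn and SE; the SE at the site of the client is asked "Pfn?" and returns the File; steps numbered 1 to 3] (From P.Saiz)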
Functionality of AliEn (III)
• Job submission:
  • Jobs may be executed on any cluster of AliEn
  • Output is accessible through the AliEn catalogue
  • alien StartMonitor starts a daemon that forwards job requests to a central server
  • alien login gives user the AliEn prompt, which allows access to the AliEn Catalogue and provides commands to submit jobs
  • User gives job description using Classads (name of the executable, possible arguments or input files, extra requirements for the job, etc)
Functionality of AliEn (IV)
• Submitting jobs
[Diagram: Client ("submit"), Cluster Monitor, Proxy, Authen, IS and CPUServer; steps numbered 1 to 4, with step 4 labelled "Registering stdin"] (From P.Saiz)
Functionality of AliEn (V)
• Executing a job
• Possible Local Queues:
  • LSF
  • PBS
  • BQS
  • Globus
  • CONDOR
  • DQS
[Diagram: IS, Proxy, CPUServer, Cluster Monitor, Process Monitor and CE; labels "One per organization" and "One per element"; steps numbered 1 to 3] (From P.Saiz)
AliEn GUI
• AliEn xfiles
  • alien xfiles creates a window for browsing the catalogue
AliEn C API
• AliEn C API will provide C++ (ROOT) binding
• Proposed types

    typedef unsigned long Alien_t;  // opaque handle to Alien connection
                                    // associated struct contains ALIEN connection state

    typedef struct AlienResultStruct {
        char **results;      // array of result strings
        int    result_count; // number of results
        int    current;      // current result
    } AlienResult_t;

    typedef struct AlienAttrStruct {
        char **attribute;    // array of attribute names
        char **values;       // array of attribute values
        int    atrr_count;   // number of attribute pairs
        int    current;      // current attribute
    } AlienAttr_t;
AliEn C API
• Some function declarations

    // Connect to ALIEN server. Return handle to ALIEN instance, 0 in case of failure.
    Alien_t AlienConnect(const char *alien_server, const char *user, const char *passwd);

    // Close connection to ALIEN server. Returns -1 in case of error.
    int AlienClose(Alien_t srv);

    // Return ALIEN version string.
    const char *AlienGetInfo(Alien_t srv);

    // Add physical file to catalog and associate logical file name. Returns -1 on error, like
    // lfn,pfn already exists, illegal handle, etc.
    int AlienAddFile(Alien_t srv, const char *lfn, const char *pfn);

    // Delete lfn and associated pfn's. Returns -1 on error, like illegal handle, lfn not existing,
    // no perm, etc.
    int AlienDeleteFile(Alien_t srv, const char *lfn);
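To illustrate the calling convention these declarations imply, the sketch below backs four of them with a small in-memory mock and runs a connect / add / delete / close sequence. It is not the AliEn implementation: the mock catalogue, the `MOCK_*` names, `example_session`, the server name, and the LFN and PFN strings are all hypothetical. A return value of 0 on success is also an assumption, since the slides only state the failure values. `AlienGetInfo` is left out.

```c
#include <string.h>

typedef unsigned long Alien_t;          /* opaque handle, as proposed above */

/* Hypothetical in-memory stand-in for the AliEn server and catalogue. */
#define MOCK_MAX_FILES 16
#define MOCK_NAME_LEN  128
#define MOCK_HANDLE    ((Alien_t)1)     /* the only valid handle in this mock */

struct MockEntry { char lfn[MOCK_NAME_LEN]; char pfn[MOCK_NAME_LEN]; int used; };
static struct MockEntry mock_catalogue[MOCK_MAX_FILES];
static int mock_connected = 0;

/* Index of lfn in the mock catalogue, or -1 if it is not registered. */
static int mock_find(const char *lfn)
{
    int i;
    for (i = 0; i < MOCK_MAX_FILES; i++)
        if (mock_catalogue[i].used && strcmp(mock_catalogue[i].lfn, lfn) == 0)
            return i;
    return -1;
}

Alien_t AlienConnect(const char *alien_server, const char *user, const char *passwd)
{
    if (!alien_server || !user || !passwd)
        return 0;                       /* 0 signals failure, as on the slide */
    memset(mock_catalogue, 0, sizeof mock_catalogue);
    mock_connected = 1;
    return MOCK_HANDLE;
}

int AlienClose(Alien_t srv)
{
    if (srv != MOCK_HANDLE || !mock_connected)
        return -1;
    mock_connected = 0;
    return 0;                           /* assumed success value */
}

int AlienAddFile(Alien_t srv, const char *lfn, const char *pfn)
{
    int i;
    if (srv != MOCK_HANDLE || !mock_connected || !lfn || !pfn)
        return -1;                      /* illegal handle or arguments */
    if (strlen(lfn) >= MOCK_NAME_LEN || strlen(pfn) >= MOCK_NAME_LEN)
        return -1;
    if (mock_find(lfn) >= 0)
        return -1;                      /* lfn already exists */
    for (i = 0; i < MOCK_MAX_FILES; i++) {
        if (!mock_catalogue[i].used) {
            strcpy(mock_catalogue[i].lfn, lfn);
            strcpy(mock_catalogue[i].pfn, pfn);
            mock_catalogue[i].used = 1;
            return 0;
        }
    }
    return -1;                          /* mock catalogue is full */
}

int AlienDeleteFile(Alien_t srv, const char *lfn)
{
    int i;
    if (srv != MOCK_HANDLE || !mock_connected || !lfn)
        return -1;
    i = mock_find(lfn);
    if (i < 0)
        return -1;                      /* lfn not existing */
    mock_catalogue[i].used = 0;
    return 0;
}

/* Example session: returns 0 if every call succeeded, -1 otherwise. */
int example_session(void)
{
    Alien_t srv = AlienConnect("alien.example.org", "someuser", "secret");
    if (srv == 0)
        return -1;
    if (AlienAddFile(srv, "/home/someuser/run1.dst",
                     "file://se.example.org/data/run1.dst") == -1)
        return -1;
    if (AlienDeleteFile(srv, "/home/someuser/run1.dst") == -1)
        return -1;
    return AlienClose(srv);
}
```

The point of the sketch is the error convention: a zero handle from `AlienConnect` and -1 from the other calls are the only failure signals the declarations expose, so a caller has to check every return value.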
GENIUS
• GENIUS general characteristics:
  • Under development by NICE s.r.l. (Italy) and INFN
  • Uses EDG middleware, Globus toolkit and the EnginFrame framework of NICE s.r.l.
  • Based on Java and XML, which is translated by EnginFrame into HTML, WML, PDF and enriched XML
  • Unix/NT integration makes extensive use of the available Internet standards (HTML, HTTP, JAVA, XML, etc.)
  • User must obtain an account on an interface machine where GENIUS is installed and upload globus certificate
  • Testbed access is provided via web page from anywhere (desktop, laptop, PDA, WAP telephone, etc)
GENIUS
• GENIUS modules:
  • Service: XML representations of computing-related facilities
  • Client Tier: any browser and its extensions, the layer with which users interact
  • Server Tier: one or more servlet-enabled web servers, providing contents and services to the clients, and controlling resource activities in the back-end
  • Resource Tier: where a number of "Agents" control the actual computing resources (clusters, stand-alone hosts, etc) and provide correctly formatted results to the servers
  • Plug-ins: developed for the Resource Tier: LSF, AFS, Nfuse, Globus and DataGrid
GENIUS • GENIUS modules: The EnginFrame work-flow
GENIUS
• GENIUS architecture: GENIUS is built on top of the already existing DataGrid command-line interface
[Diagram: WEB Browser on Local WS, connected via https+java/xml+rfb to GENIUS (EnginFrame, Apache) on the EDG UI, connected via EDG+GSI to the Grid] (From R.Barbera)
GENIUS functionality (I)
• GENIUS services:
  • File Services
  • Security Services
  • Job Services
  • Information Services
  • Monitoring Services
  • Interactive Services (Virtual Network Computing package)
  • VO services
  • Statistics
GENIUS functionality (II)
• File Services:
  • Create a File
  • View a File
  • Edit a File
  • Rename a File/Directory
  • Delete a File/Directory
  • Create a Directory
  • Upload a File
  • Show the Environment
GENIUS functionality (III)
• Security Services:
  • Upload Your Certificate
    • Upload .globus Tar ball
    • Upload Your .p12 Certificate
  • Information on proxy
  • Renew proxy
  • Change GENIUS Password
  • Change X.509 PEM phrase
GENIUS functionality (IV)
• Job Services:
  • Single Job
    • Job Submission
      • The user has to provide the JDL file
      • Select one of the possible Computing Elements
      • Press the button “Submit job”
    • Job Queue
      • Job identifier, JDL file, time, Computing Element, present status, possible action
    • Job Output (The user has to press the button “Get Output”)
    • Job Data (The user can inspect personal spooler area)
  • Clean Job Queues
  • List Available Resources
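Since the user has to provide the JDL file, and automatic creation of job-description files is one of the framework-related services listed earlier, a generator for a minimal JDL might look like the sketch below. The `JobSpec` struct and `write_jdl` function are hypothetical. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `OutputSandbox`) follow the usual EDG JDL examples, and whether they are sufficient for a given testbed is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical description of a single job; field names are illustrative. */
struct JobSpec {
    const char *executable;   /* required */
    const char *arguments;    /* optional, may be NULL */
    const char *std_output;   /* required */
    const char *std_error;    /* required */
};

/* This sketch does no escaping, so values containing '"' are rejected. */
static int has_quote(const char *s)
{
    return s != NULL && strchr(s, '"') != NULL;
}

/* Write a minimal JDL for job into buf (capacity size).
 * Returns the number of characters written, excluding the terminating
 * NUL, or -1 if a required field is missing, a value contains a double
 * quote, or buf is too small. */
int write_jdl(char *buf, size_t size, const struct JobSpec *job)
{
    const char *args;
    int n;

    if (!buf || !job || !job->executable || !job->std_output || !job->std_error)
        return -1;
    if (has_quote(job->executable) || has_quote(job->arguments) ||
        has_quote(job->std_output) || has_quote(job->std_error))
        return -1;

    /* The Arguments line is emitted only when arguments were given. */
    args = job->arguments;
    n = snprintf(buf, size,
                 "Executable = \"%s\";\n"
                 "%s%s%s"
                 "StdOutput = \"%s\";\n"
                 "StdError = \"%s\";\n"
                 "OutputSandbox = {\"%s\", \"%s\"};\n",
                 job->executable,
                 args ? "Arguments = \"" : "", args ? args : "", args ? "\";\n" : "",
                 job->std_output, job->std_error,
                 job->std_output, job->std_error);

    if (n < 0 || (size_t)n >= size)
        return -1;                      /* output was truncated */
    return n;
}
```

Writing into a caller-supplied buffer keeps the sketch free of file handling; an interface would presumably write the text to a file and pass that file to the submission step.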
GENIUS functionality (V) • Job Services: • Job Submission
GENIUS functionality (VI) • Job Services: • List Available Resources
GENIUS functionality (VII)
• Information Services:
  • Sites belonging to the test-bed
  • Computing Elements present at each site with the information on the local resource manager
  • Storage Elements present at each site, the connection port, the size and the mount point
Grappa
• General characteristics of Grappa:
  • Under development in context of Grid Physics Network (GriPhyN) Project and ATLAS
  • Prototype based on XCAT Science Portal
  • Allows user to submit jobs to US-ATLAS testbed resources
  • Provides file staging, remote job-option file editing, basic monitoring
  • Provides a set of tools for collaborative data analysis
  • Packaged with pacman
Grappa
• Grappa current architecture:
[Diagram: Athena Notebook, XCAT Science Portal, Grappa, Tomcat Server] (From R.Gardner)
Grappa
• XCAT architecture:
[Diagram: User’s Web Browser; Portal Web Server (tomcat server + java servlets) with GSI Authentication, Jython Interpreter and Notebook Database; Grid] (From S.Smallen)
• Jython - access to Java classes:
  • Globus Java CoG kit
  • XCAT
  • XMESSAGES
Grappa functionality
• Provided via Athena Active Notebook
Users can:
  • Submit Athena Jobs to the GRID
  • Manage resources
  • Submit a sequence of jobOptions files to the GRID
  • Monitor status of running jobs
Conclusions
– Framework-Grid interfaces will be of immediate use for physics studies
– Various tools and services relevant to a Framework-Grid interface are already available
– User-friendly (GUI-based) Grid interfaces are being developed by several groups
Workshop should help us understand how to proceed with development of Framework-Grid interface for ATLAS and LHCb