Grid interoperability using JSAGA, by Sylvain Reynaud and Pascal Calvat (CC-IN2P3). JSAGA is an API for uniform access to grids. JJS and JUX are tools using JSAGA.
Grid interoperability using JSAGA
Sylvain Reynaud, Pascal Calvat (CC-IN2P3)
Plan
• demo of JJS
• overview of JSAGA
• demo of JUX
• summary and perspectives
JSAGA is an API for uniform access to grids. JJS and JUX are tools using JSAGA.
JJS – Overview
• JJS was developed by Pascal Calvat (CC-IN2P3) in 2003 to submit jobs to the DATAGRID infrastructure
  • it has since evolved to submit jobs to the EGEE infrastructure
• JJS is designed to ease job submission from web servers hosted in laboratories
  • it is an alternative to User Interface + Resource Broker (or to gLite-UI + gLite-WMS)
• JJS is optimized for submitting short-lived jobs
  • based on the observed QoS of sites: JJS gives a score to the selected sites and uses it for subsequent match-making
  • but it can also be used with long-lived jobs
20/08/2014
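The QoS-based scoring mentioned above can be sketched as follows. This is only an illustration, not the JJS implementation: the class name `SiteScoreBoard`, the neutral starting score of 0.5 and the exponential moving average are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: keeps a running quality score per execution site. */
class SiteScoreBoard {
    private final Map<String, Double> scores = new HashMap<>();
    private final double alpha; // weight given to the most recent observation

    SiteScoreBoard(double alpha) {
        this.alpha = alpha;
    }

    /** Records one observation: 1.0 if the job completed in time, 0.0 otherwise. */
    void record(String site, boolean success) {
        double observed = success ? 1.0 : 0.0;
        // Exponential moving average; an unseen site starts from a neutral 0.5 (assumption).
        double previous = scores.getOrDefault(site, 0.5);
        scores.put(site, (1 - alpha) * previous + alpha * observed);
    }

    /** Current score of a site, or the neutral 0.5 if nothing was observed yet. */
    double score(String site) {
        return scores.getOrDefault(site, 0.5);
    }

    /** Best-scored site among those observed so far, if any. */
    Optional<String> best() {
        return scores.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

A match-making step could then call `best()` (or sort the sites by `score`) before each submission, so that sites which recently failed or timed out are selected less often.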
JJS – Demo
[Figure: overall performance for short-lived jobs; each job installs povray on the fly, then generates part of the image]
JJS – Overview
• JJS was initially developed on top of the cog-jglobus API
• cog-jglobus is being replaced with JSAGA for…
  • security (done)
  • data management (done)
  • execution management (in the near future)
  • job collection management (in the near future)
• Using JSAGA enables JJS to become independent of gLite middleware evolutions
  • from Globus proxy to VOMS proxy (done)
  • from GSIFTP to SRM (work in progress…)
  • from LCG-CE to gLite-CREAM (in the near future)
JSAGA – targeted use cases
• Motivations for using several grid infrastructures:
  • increasing the number of computing resources available to the user
  • need for resources with specific constraints
    • super-computer
    • confidentiality
    • small overhead (e.g. consolidation)
    • interactivity
  • availability, on a given grid, of:
    • the data
    • the software
[slide label: cluster]
[Diagram: abstraction layers, with the SAGA layer labelled]
• Ready-to-use software, adapted to the targeted scientific field
• Hide heterogeneity between grid infrastructures
• Hide heterogeneity between middlewares
• As many interfaces as ways to implement each functionality
• As many interfaces as used technologies
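SAGA: code example

    import java.util.List;
    import org.ogf.saga.namespace.NSDirectory;
    import org.ogf.saga.namespace.NSFactory;
    import org.ogf.saga.session.Session;
    import org.ogf.saga.session.SessionFactory;
    import org.ogf.saga.url.URL;
    import org.ogf.saga.url.URLFactory;

    // use factories to create SAGA objects
    Session session = SessionFactory.createSession();
    URL url = URLFactory.createURL("gsiftp://cclcgseli01.in2p3.fr/tmp/");
    NSDirectory dir = NSFactory.createNSDirectory(session, url);

    // use SAGA objects: list the directory and print each entry
    List<URL> result = dir.list();
    for (URL r : result) {
        System.out.println(r);
    }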
[Diagram: the same abstraction layers, now labelled with SAGA and the JSAGA core engine + plug-ins, and with the roles end user, application developer and plug-ins developer]
• Ready-to-use software, adapted to the targeted scientific field
• Hide heterogeneity between grid infrastructures
• Hide heterogeneity between middlewares
• As many interfaces as ways to implement each functionality
• As many interfaces as used technologies
JSAGA: core engine + plug-ins
SAGA
• close to application developer needs
  • object-oriented
  • high-level
  • uniform interface to all the supported technologies
• design objectives
  • easy to use
  • … but "certainly not simple to implement" (T. Kielmann)
  • engine code = 2 x plug-ins code
JSAGA plug-ins interfaces
• close to existing middleware APIs
  • service-oriented
  • low-level
  • as many interfaces as ways to implement each functionality
  • optional interfaces
• design objectives
  • easy to implement
  • enable efficient usage of middleware APIs
Plug-ins: execution management (JSAGA core engine + plug-ins; legend: done / construction / planned)
Job control
• SAGA user interface: getInput / getOutput / getError
• Streaming plug-in interfaces: direct / buffered / redirected streams, used before / during / after execution
  • set stream for interactive
  • set stream for non-interactive
  • get stream for interactive
• plug-ins: gatekeeper gLite-WMS wsgram unicore6 ssh fork cream PBS remote naregi
Job monitoring
• SAGA user interface: getState / waitFor
• Monitoring plug-in interfaces: querying / listening, for an individual job / list of jobs / filtered jobs
  • query status for individual job
  • listen status for individual job
  • query status for filtered jobs
• plug-ins: gatekeeper gLite-LB wsgram unicore6 ssh fork cream …
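The idea of optional plug-in interfaces ("as many interfaces as ways to implement each functionality") can be sketched as below. The interface and class names are hypothetical and are not the actual JSAGA plug-in interfaces; the sketch only shows how an engine could expose one uniform call while using the most efficient capability a plug-in happens to implement.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical; they are not the JSAGA plug-in interfaces.

/** Optional capability: the middleware reports the state of one job per request. */
interface QueryIndividualJob {
    String getState(String jobId);
}

/** Optional capability: the middleware reports the states of many jobs in one request. */
interface QueryListOfJobs {
    Map<String, String> getStates(List<String> jobIds);
}

/** Engine side: offers one uniform call and uses the cheapest capability available. */
class MonitorEngine {
    private final Object plugin;
    int requests = 0; // number of calls made to the plug-in, kept for illustration

    MonitorEngine(Object plugin) {
        this.plugin = plugin;
    }

    Map<String, String> states(List<String> jobIds) {
        if (plugin instanceof QueryListOfJobs) {
            // One request covers all jobs.
            requests++;
            return ((QueryListOfJobs) plugin).getStates(jobIds);
        }
        if (plugin instanceof QueryIndividualJob) {
            // Fall back to one request per job.
            Map<String, String> result = new LinkedHashMap<>();
            for (String id : jobIds) {
                requests++;
                result.put(id, ((QueryIndividualJob) plugin).getState(id));
            }
            return result;
        }
        throw new IllegalArgumentException("plug-in offers no monitoring capability");
    }
}
```

With this split, a plug-in developer implements only the interfaces that map naturally onto the middleware API, and the engine carries the extra logic, which is in line with the "easy to implement" objective of the plug-in side.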
Plug-ins provided (JSAGA core engine + plug-ins; legend: done / construction / planned)
• Security: InMemCred Globus G. Legacy G. RFC820 MyProxy VOMS X509 SSH Login / pwd JKS
• Data (physical files, logical files): catalog rns lfn srb / irods http https sftp rbyteio file zip gsiftp tar ftp mail cache srm
• Exec. (control): gatekeeper gLite-WMS wsgram unicore6 ssh fork cream PBS remote naregi
• Exec. (monitor): gatekeeper gLite-LB wsgram unicore6 ssh fork cream …
• Expression Language: basic default JEP BeanShell JSDL+ext. SAGA JDL RSL-2 RSL-4
This is still not enough…
• the JSAGA core engine + plug-ins hide middleware heterogeneity (e.g. gLite, Globus, Unicore)
[Diagram: a job description goes through JSAGA; the gLite plug-ins and the Globus plug-ins are shown with JDL and RSL]
This is still not enough…
• hide middleware heterogeneity (e.g. gLite, Globus, Unicore)
• hide infrastructures heterogeneity (e.g. EGEE, OSG, DEISA)
[Diagram: a job description goes through JSAGA (gLite plug-ins, Globus plug-ins) and yields JDL, RSL and a staging graph for two infrastructures, EGEE and OPlast. Labels: delegate selection & files staging, WMS, SRM, input data, GridFTP, LCG-CE, WS-GRAM, firewall, job]
[Diagram: the same abstraction layers, now labelled with the JSAGA jobs collection, SAGA and the JSAGA core engine + plug-ins, and with the roles end user, application developer and plug-ins developer]
• Ready-to-use software, adapted to the targeted scientific field
• Hide heterogeneity between grid infrastructures
• Hide heterogeneity between middlewares
• As many interfaces as ways to implement each functionality
• As many interfaces as used technologies
Description of infrastructures (JSAGA jobs collection)
• Middleware heterogeneity
  • e.g. CREAM, WMS, SSH, GK
• Infrastructures heterogeneity
  • Grid/site policy, e.g. network filtering, shared FS
  • Environment variables, e.g. $VO_?_SW_DIR, /usr/local
  • Configuration attributes (client), e.g. monitor service URL, shell path on cygwin, default SE URL
  • Command line interfaces (worker), e.g. globus-url-copy, srmcp, scp, wget, tar
[Diagram, example for execution management: a tree of infrastructures with nodes labelled World, localhost, CC-IN2P3, EGEE, OpenPlast and Grid, annotated with security contexts (VOMS, Globus), job services (gatekeeper, WMS, wsgram) and data protocols (srb://, srm://, lfn://, gsiftp://, http://, tar://)]
Transfer path depends on…
• When using a single grid infrastructure
  • all files can be transported to/from the worker nodes through a single storage node
• When using several grid infrastructures
  • need to dynamically build a more complex transfer graph, according to…
[Diagram: the description of infrastructures (World, localhost, CC-IN2P3, EGEE, OpenPlast), the plug-ins and a job description with url:// references, all feeding the JSAGA jobs collection]
Transfer path depends on…
• grid or site
  • network filtering policy
  • commands available on workers
  • services available from workers (close Storage Element, shared FS)
  • supported context instances
• data to stage
  • shared by several jobs
  • installed on some worker nodes
  • file size
  • required data protection level
• execution service
  • protocols supported for staging
• transfer protocol
  • access mode (RO, WO, RW)
  • third-party transfer
  • supported data protection level
[Diagram: the description of infrastructures, the plug-ins and a job description with url:// references, all feeding the JSAGA jobs collection]
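Building a transfer path from such constraints can be viewed as a search in a graph whose edges are the direct transfers that policies and protocols allow. The sketch below is a simplified illustration under that assumption, not the JSAGA algorithm: the class `TransferGraph` is hypothetical, and it reduces all the criteria above to a plain "transfer allowed / not allowed" relation between nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: finds the shortest chain of allowed transfers between two nodes. */
class TransferGraph {
    private final Map<String, List<String>> allowed = new HashMap<>();

    /** Declares that a file can be copied directly from 'from' to 'to'. */
    void allow(String from, String to) {
        allowed.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Breadth-first search; returns the nodes to traverse, or an empty list if unreachable. */
    List<String> path(String source, String target) {
        Map<String, String> cameFrom = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(target)) {
                // Walk back from the target to the source to rebuild the path.
                LinkedList<String> result = new LinkedList<>();
                for (String n = target; n != null; n = cameFrom.get(n)) {
                    result.addFirst(n);
                }
                return result;
            }
            for (String next : allowed.getOrDefault(node, Collections.emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, node);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

For instance, if a worker cannot read from an SRB server directly but can read from a GSIFTP server that the SRB data can be copied to, the search returns the two-hop path through the GSIFTP server. A fuller model would attach the protocol, access mode and protection level to each edge.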
[Diagram: the same "Transfer path depends on…" criteria applied to an example. Files C and C' (common), R1 (result) and E1 (std-error) are exchanged between jobs on OPlast and EGEE over SMTP, SRB, GSIFTP and HTTP, next to the description of infrastructures (World, localhost, CC-IN2P3, EGEE, OpenPlast)]
[Diagram: the same "Transfer path depends on…" criteria applied to a larger example. Files E src, E (executable), D1 (input data), C, C' and C'' (common), R1 (result) and E1 (std-error) are exchanged between jobs on OPlast and EGEE over SMTP, SRB, GSIFTP, HTTP, TAR and iGet]
Example of generated graph
[Diagram: generated staging graph for jobs on OPlast, with E src, E (executable), D1 (input data), C, C' and C'' (common), R1 (result) and E1 (std-error)]
Data flow example with several protocols used, but only 3 jobs submitted on 1 grid…
[Diagram: the same abstraction layers, now labelled with Applications, the JSAGA jobs collection, SAGA and the JSAGA core engine + plug-ins, and with the roles end user, application developer and plug-ins developer]
• Ready-to-use software, adapted to the targeted scientific field
• Hide heterogeneity between grid infrastructures
• Hide heterogeneity between middlewares
• As many interfaces as ways to implement each functionality
• As many interfaces as used technologies
Applications: command line interfaces
JSAGA provides command line interfaces for…
• security: jsaga-context-init, jsaga-context-info, jsaga-context-destroy
• execution management: jsaga-job-run, jsaga-job-status, jsaga-job-cancel
• data management: jsaga-cat, jsaga-cp, jsaga-ls, jsaga-mkdir, jsaga-mv, jsaga-rm, jsaga-rmdir, jsaga-stat, jsaga-test, jsaga-logical
Applications: related projects
• JSAGA is used by…
  • Elis@
    • a web portal for submitting jobs to industrial and research grid infrastructures
  • JJS (Java Job Submission)
    • a tool for submitting jobs to EGEE
    • optimized for short-lived jobs (resource selection based on QoS observed while submitting jobs)
  • JUX (Java Universal eXplorer)
    • a multi-protocol file browser /
JUX – Overview
• JUX is a file explorer designed to be independent of
  • Operating System (full Java code)
    • tested on Windows, Scientific Linux, Ubuntu, Mac
  • Data management protocol
    • tested with gsiftp, srb, irods, http, https, sftp, zip, (srm)
  • Security mechanism
    • tested with GSI, VOMS, Login/Password, X509, SSH
  • File content viewer
    • provided viewers: text file, image viewer (png, gif, jpg, bmp, tiff, dicom), audio player (mp3, wav)
    • can use local applications (only for protocol "file://" on OS "Windows")
JUX – Overview
• Data management and security
  • JUX does not only use the SAGA API
  • it also uses the JSAGA introspection API to discover…
    • the list of available protocols
    • the list of configured security contexts
    • the list of supported security context types, for each protocol
  • this allows JUX to be completely independent of the technologies used
    • just copy your own JSAGA plug-in into the JUX "lib/" directory to add support for a new technology!
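The kind of lookup that such introspection makes possible can be sketched as follows. This does not use the JSAGA introspection API; `ContextRegistry` and its methods are hypothetical stand-ins that only show how "supported context types per protocol" and "configured contexts" combine into the list of contexts usable with a given protocol.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: which configured security contexts can be used with a protocol. */
class ContextRegistry {
    private final Map<String, Set<String>> typesByProtocol = new HashMap<>();
    private final Map<String, String> typeByContext = new LinkedHashMap<>();

    /** Declares the security context types that a protocol supports. */
    void declareProtocol(String protocol, String... contextTypes) {
        typesByProtocol.put(protocol, new HashSet<>(Arrays.asList(contextTypes)));
    }

    /** Registers a context configured by the user, with its type. */
    void configureContext(String name, String type) {
        typeByContext.put(name, type);
    }

    /** Names of the configured contexts whose type is supported by the protocol. */
    List<String> contextsFor(String protocol) {
        Set<String> types = typesByProtocol.getOrDefault(protocol, Collections.emptySet());
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : typeByContext.entrySet()) {
            if (types.contains(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Because the application only queries the registry, adding a plug-in that declares a new protocol is enough for that protocol to show up, without any change to the application code.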
Demo of JUX … and then conclusion about JSAGA
Software quality
• Build process fully automated, including…
  • build tools installation
  • code generation
  • testing
    • unit tests
    • integration tests
  • project web site generation: http://grid.in2p3.fr/jsaga/
  • installer GUI generation (see next slide…)
• Plug-ins
  • external dependencies reduced
    • e.g. gLite-UI not needed
  • most plug-ins support [icons not preserved in the transcript]
  • a maven 'archetype' generates the skeleton of a new plug-in project
  • plug-ins automatically validated with a reusable SAGA test suite

    # SAGA protocols test-suite configuration
    gsiftp.base=gsiftp://ccrugceli01.in2p3.fr/tmp/
    gsiftp.base2=gsiftp://agena.c-s.fr/grid/tmp/
    gsiftp.context=OpenPlast_proxy
    https.base=http://grid.in2p3.fr/html/Private/
    https.context=Web_X509
    file.base=file:///c:/tmp/
    file.base2=file:///c:/
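The test-suite configuration above uses keys of the form `<protocol>.<attribute>`. As a minimal sketch of how such a file could be read (the class `TestSuiteConfig` is hypothetical and is not part of the JSAGA test suite), the standard `java.util.Properties` parser can load it and the entries can then be grouped per protocol:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sketch: groups "<protocol>.<attribute>=<value>" entries by protocol. */
class TestSuiteConfig {
    static Map<String, Map<String, String>> parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Map<String, String>> byProtocol = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            int dot = key.indexOf('.');
            if (dot < 0) {
                continue; // not of the form <protocol>.<attribute>
            }
            String protocol = key.substring(0, dot);
            String attribute = key.substring(dot + 1);
            byProtocol.computeIfAbsent(protocol, p -> new TreeMap<>())
                      .put(attribute, props.getProperty(key));
        }
        return byProtocol;
    }
}
```

A reusable test suite could then iterate over the protocols found in the file and run the same tests against each `base` URL with the named security context.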
Installer GUI
License(s)
• LGPL license for the core engine and most plug-ins
• Optional licenses for plug-ins having external dependencies whose license is not compatible with LGPL
  • then, the end user must…
    • either accept the terms of the license agreement
    • or uncheck these plug-ins (see previous slide)
Summary: main assets of JSAGA
• Implement standard specifications
  • SAGA
  • JSDL
• Provide a high-level abstraction layer with no sacrifice on efficiency or scalability
  • thanks to design (definition of plug-ins interface)
  • thanks to cache mechanisms
• Use grid infrastructures as they are (i.e. no pre-requisite)
  • thanks to [not preserved in the transcript]
• Hide heterogeneity
  • of middlewares
  • of grid infrastructures
[Diagram: the description of infrastructures (World, localhost, CC-IN2P3, EGEE, OpenPlast, Grid) with security contexts (VOMS, Globus), job services (gatekeeper, WMS, wsgram) and data protocols (srb://, srm://, lfn://, gsiftp://, http://, tar://)]
Perspectives
• Support new technologies
  • develop plug-ins
    • gLite-CREAM
    • French research grid middleware ?
    • …
  • integrate plug-ins developed by partners
• Implement new specifications
  • SAGA Extension: Service Discovery API
    • discussions on the candidate spec. have just finished, the final spec. should be available soon
    • JSAGA has no equivalent for this
    • plug-in based implementation
  • JSDL Extension: Parameter Sweep Job
    • proposed for public comments
    • JSAGA does this in a non-standard way
Backup slides
JJS – Performance
For short-lived jobs, grid overhead is not negligible, so each step of job submission needs to be optimized:
→ job submission: multi-threaded
→ data staging: input/output files are grouped in tarballs
→ monitoring: get all job statuses with a single request
→ job life-time: waiting and running jobs have a timeout limit
…and last but not least: select the execution sites that are the most efficient for short-lived jobs (based on observed QoS)
JJS – Performance (submission)
[Figure: time elapsed before entering state WAITING, i.e. time for transferring the input sandboxes + submitting the jobs]
JJS – Performance (monitoring)
Use a naming convention on the GSIFTP server instead of Globus monitoring (detecting job failure is not needed because all the jobs time out shortly…)
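One way to picture monitoring through a naming convention is sketched below. The slide does not say which convention JJS uses, so the `.done` marker suffix, the class `MarkerFileMonitor` and the two state names are assumptions made for the example. The point is that a single directory listing is enough to derive the state of every job.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch: derives job states from one directory listing. */
class MarkerFileMonitor {
    // Assumed convention: a finished job leaves a file named "<jobId>.done".
    static final String DONE_SUFFIX = ".done";

    /**
     * @param jobIds  identifiers of the submitted jobs
     * @param listing file names returned by a single listing of the output directory
     * @return job identifier mapped to "DONE" or "PENDING"
     */
    static Map<String, String> states(List<String> jobIds, List<String> listing) {
        Set<String> names = new HashSet<>(listing);
        Map<String, String> result = new TreeMap<>();
        for (String id : jobIds) {
            result.put(id, names.contains(id + DONE_SUFFIX) ? "DONE" : "PENDING");
        }
        return result;
    }
}
```

A failed job simply never produces its marker and stays "PENDING" until its timeout expires, which matches the remark above that failure detection is not needed.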
JJS – Summary
• Optimized for short-lived jobs
  • QoS-based selection of execution sites
  • pragmatic usage of deployed grid technologies
• Easy to install, configure and use
• Robust
  • designed not to be sensitive to grid middleware failures
  • because it was developed when the grid was not mature (DATAGRID)
http://cc.in2p3.fr/docenligne/269
JJS – Perspectives
• Finish integration of JSAGA
  • for job submission (SAGA)
  • for job collection management (JSDL Parameter Sweep Job Extension)
    • job description: independent of language
    • data staging: independent of protocols and infrastructure constraints
• JJS is also waiting…
  • for the SRM data management JSAGA plug-in
  • for Service Discovery API (SAGA Extension) support in JSAGA
    • in order to enable efficient usage of SRM with short-lived jobs (by discovering GSIFTP servers through the SRM web service)
JUX – Screenshots
The connection manager lets the user create connection profiles with a URL and a security context. Only the security contexts compatible with the selected protocol appear in the popup list.
JUX – Screenshots
The connection is kept open until the nodes are collapsed (left side). Several files can be copied with a single drag-and-drop.
JUX – Related work
• Similar tools exist
  • HERMES (Australia)
  • VBrowser (Holland)
• Using JSAGA for JUX enables
  • to factorize development efforts with JJS (for data staging)
  • to manage logical files through a common interface (SAGA)
  • protocol-specific optimizations
    • e.g. third-party transfer, filtered file list
  • to automatically recover from some errors
    • e.g. create parent directory if missing, retry if error is IncorrectState
[slide callout: based on Apache Commons VFS]
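The two recovery examples above (create the parent directory if missing, retry on a transient state error) can be sketched with plain `java.nio` calls. This is an illustration, not JSAGA code: the class `Recovery` is hypothetical, it works on local files only, and `IllegalStateException` is used as a stand-in for the SAGA IncorrectState error.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.function.Supplier;

/** Hypothetical sketch of two automatic recovery strategies. */
class Recovery {
    /** Writes a file; if the parent directory is missing, creates it and tries once more. */
    static void writeCreatingParent(Path file, byte[] content) {
        try {
            try {
                Files.write(file, content);
            } catch (NoSuchFileException e) {
                Files.createDirectories(file.getParent());
                Files.write(file, content);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Retries an action while it fails with IllegalStateException (stand-in for IncorrectState). */
    static <T> T retry(Supplier<T> action, int maxAttempts) {
        IllegalStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (IllegalStateException e) {
                last = e; // remember the failure and try again
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }
        throw last;
    }
}
```

Placing this logic in a shared layer means that every tool built on it (a file browser as well as a job submission tool) gets the same recovery behaviour without reimplementing it.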
JUX – Summary
• JUX can work with potentially any (you can develop the plug-ins missing for your use-case)
  • protocol
  • security mechanism
  • file content
• JUX is easy to use
  • targeted users are scientists
• JUX is lightweight
  • currently 11 MB with all plug-ins
http://cc.in2p3.fr/docenligne/821
JUX – Perspectives (meta-data)
[Screenshot mock-up: a SEARCH form combining criteria on entry name (*.txt), Study Date, Patient's Name (John S*), Patient's Sex (M), Patient's Age and size, with Search and Recursive controls]