Report from the WP1 hands-on meeting
Massimo Sgaravatto, INFN Padova
Agenda
• Extract the code from CVS, compile it, and install the software on the WP1 testbed
• Integration tests
WP1 testbed
• lx01.hep.ph.ic.ac.uk
  • UI, client RB, LB (server + client)
• grid004f.cnaf.infn.it
  • RB server, JSS, LB (server + client)
• grid001f.cnaf.infn.it
  • II
• CEs
  • Milano: PBS
  • Padova: LSF, Condor
  • Prague: PBS
• New Globus jobmanager (with logging functionalities) installed
  • CESNET modifications done on the “Condor” jobmanager
GIS Schema
• Tests based on the old schema specification
• Agreed to wait for approval from WP3 before modifying the schema in the WP1 testbed
• A. Martin agreed with our proposal
• Use of lists deprecated (use of “multiple” attributes suggested):
    AuthorizedUserList: user1$user2$user3
  becomes
    AuthorizedUser: user1
    AuthorizedUser: user2
    AuthorizedUser: user3
• Problems with the SEprotocol objectclass
• Action: new schema specification in a few days, then use it in our testbed?
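As a rough illustration of the suggested change, the sketch below (Python, not part of the WP1 software) splits a `$`-separated list attribute into repeated single-valued attributes. The function name `split_list_attribute` and the rule of dropping a trailing `List` from the attribute name are assumptions made for this example, based only on the AuthorizedUserList case above.

```python
def split_list_attribute(line, separator="$", singular_name=None):
    """Convert 'NameList: a$b$c' into [(Name, a), (Name, b), (Name, c)].

    Hypothetical helper: the naming rule (strip a trailing 'List') is an
    assumption, not something defined by the schema specification.
    """
    # Split the line into attribute name and raw value at the first colon.
    name, _, value = line.partition(":")
    name = name.strip()

    # Derive the single-valued attribute name unless one is given explicitly.
    if singular_name is None:
        singular_name = name[:-len("List")] if name.endswith("List") else name

    # Drop empty entries such as those caused by a trailing separator.
    values = [v.strip() for v in value.split(separator) if v.strip()]
    return [(singular_name, v) for v in values]


if __name__ == "__main__":
    old_style = "AuthorizedUserList: user1$user2$user3"
    for attribute, user in split_list_attribute(old_style):
        print(f"{attribute}: {user}")
```

Run on the example line, this prints one `AuthorizedUser: ...` entry per user, which is the “multiple” attribute form suggested at the meeting.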
Integration tests
• dg-job-submit
  • Seems to work
  • Submission to Milano (PBS), Padova (LSF) and Prague (PBS) tested
  • Matchmaking OK in the RB (but only the MDS is queried)
  • Input sandbox: UI → RB → CE
  • Output sandbox: CE → RB
• dg-job-submit specifying the resource to submit the job to tested as well
  • Seems to work
  • More tests needed
Integration tests
• dg-list-job-match
  • Seems to work
• dg-job-cancel
  • Not tested yet (missing code in RB)
• dg-job-get-output
  • Not tested yet (missing code in RB)
• dg-job-status and dg-get-logging-info tested
  • OK, but for some jobs there are problems with the job state machine
Missing functionalities/problems
• Matchmaking that also queries the RC
  • To be done and tested with the new MDS schema
• Missing persistence in the RB
  • To be fixed in the next few days (use of PostgreSQL)
• Implementation of dg-job-cancel in the RB missing
• Implementation of dg-job-get-output in the RB missing
Missing functionalities/problems
• Problems with time synchronization in L&B
• Problems in finding the job status (for some jobs) from the info stored in the L&B database
  • To be investigated
Missing functionalities/problems
• JSS doesn’t manage RetryCount
  • Still waiting for a modification in Condor-G
• JSS is not very robust against malformed JDL expressions
  • To be fixed
Missing functionalities/problems
• User proxy transferred from the submitting machine to the RB machine using gsiftp
  • Use of GSI mechanisms (see CESNET examples)
• gsiftp to be replaced with gridftp to transfer the input and output sandboxes
  • Installation of gridftp needed in our testbed
• Missing Replica Catalog (filled with some data) to perform tests
• Proxy renewal
  • Not for PM9
• Implement and test the new functionalities
• More stress tests needed
• Integration of the real information providers in our testbed (some information providers for LSF are ready to be tested)