EU DataGrid Project
TestBed Status and Plans
Bob Jones
EU DataGrid Technical Coordinator
CERN

Project Schedule
• EU contract signed on December 29th, 2000
• TestBed 0 (early 2001)
  • International test bed 0 infrastructure deployed
  • Globus 1 only - no EDG middleware
• TestBed 1 (now)
  • First release of EU DataGrid software based on Globus 2 to defined users within the project:
    • HEP experiments (WP 8)
    • Biology applications (WP 9)
    • Earth Observation (WP 10)
• Project Review by EU (1st March 2002)
  • External reviewers inspect deliverables & demo, partners' contributions etc.
• TestBed 2 (Sept-Oct. 2002)
  • Builds on TestBed 1 to extend facilities of DataGrid
• TestBed 3 (March 2003) & 4 (Sept 2003)
EU DataGrid

Grid aspects covered by EDG testbed 1

TestBed 1 layout at CERN
• Replica Catalog: testbed0226
• Resource Broker & Job Submission Service: testbed011
• User Interface: testbed006
• Logging & Bookkeeping: testbed012
• LDAP servers for VOs: NIKHEF
• Info Service (MDS/ftree): lxshare0225
• PBS cluster
  • GateKeeper: lxshare0220
  • Storage Element: lxshare0219
  • Worker Node: lxshare0221
• LSF cluster (initially fork)
  • GateKeeper: lxshare0223
  • Storage Element: lxshare0222
  • Worker Node: lxshare0224

Generic HEP application flowchart
• Job arguments
  • Data Type: raw/dst
  • Run Number: xxxxxx
  • Number of evts: yyyyyy
  • Number of wds/evt: zzzzzz
  • Rep Catalog flag: 0/1
  • Mass Storage flag: 0/1
• Raw/dst?
  • raw: Generate raw events on local disk (raw_xxxxxx_dat.log) -> Move to SE, MS? -> Add lfn/pfn to Rep Catalog -> Write logbook on client node
  • dst: Get pfn from Rep Catalog -> pfn local? (if n: Copy raw data from SE to local disk) -> Read raw events (raw_xxxxxx_dat.log) -> Write dst events (dst_xxxxxx_dat.log) -> Move to SE, MS? -> Add lfn/pfn to Rep Catalog -> Write logbook on client node

Testbed usage to date
Physicists from LHC experiments are submitting jobs with their application software that uses:
• User interface (job submission language etc.)
• Resource Broker & Job submission service
• Information Service & Monitoring
• Data Replication

First simulated ALICE event generated by using the DataGrid Job Submission Service

  [reale@testbed006 JDL]$ dg-job-submit gridpawCNAF.jdl
  Connecting to host testbed011.cern.ch, port 7771
  Transferring InputSandbox files...done
  Logging to host testbed011.cern.ch, port 15830
  =========dg-job-submit Success ============
  The job has been successfully submitted to the Resource Broker.
  Use dg-job-status command to check job current status.
  Your job identifier (dg_jobId) is:
  https://testbed011.cern.ch:7846/137.138.181.253/185337169921026?testbed011.cern.ch:7771
  ========================================
  [reale@testbed006 JDL]$ dg-job-get-output https://testbed011.cern.ch:7846/137.138.181.253/185337169921026?testbed011.cern.ch:7771
  Retrieving OutputSandbox files...done
  ============ dg-get-job-output Success ============
  Output sandbox files for the job:
  - https://testbed011.cern.ch:7846/137.138.181.253/185337169921026?testbed011.cern.ch:7771
  have been successfully retrieved and stored in the directory:
  /sandbox/185337169921026

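The generic HEP application flowchart on this slide can be read as a small decision procedure. The Python sketch below is one possible reading, not the experiments' actual code: the names `JobArgs` and `plan_steps` are hypothetical, and it assumes that the Mass Storage flag selects MS over SE as the destination and that the Rep Catalog flag controls whether the lfn/pfn pair is registered. The slide itself does not spell out either rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobArgs:
    """Job arguments as listed on the slide (field names are hypothetical)."""
    data_type: str        # "raw" or "dst"
    run_number: int
    n_events: int
    words_per_event: int
    rep_catalog: bool     # Rep Catalog flag (0/1)
    mass_storage: bool    # Mass Storage flag (0/1)


def plan_steps(args: JobArgs, pfn_is_local: bool = False) -> List[str]:
    """Return the ordered flowchart steps for one job (assumed reading)."""
    if args.data_type not in ("raw", "dst"):
        raise ValueError("data_type must be 'raw' or 'dst'")

    steps: List[str] = []
    if args.data_type == "raw":
        steps.append("generate raw events on local disk")
    else:
        steps.append("get pfn from Replica Catalog")
        if not pfn_is_local:
            # Only needed when the physical file is not already local.
            steps.append("copy raw data from SE to local disk")
        steps.append("read raw events")
        steps.append("write dst events")

    # Assumption: the Mass Storage flag picks MS, otherwise SE.
    target = "MS" if args.mass_storage else "SE"
    steps.append(f"move output to {target}")

    # Assumption: the Rep Catalog flag decides whether to register the file.
    if args.rep_catalog:
        steps.append("add lfn/pfn to Replica Catalog")

    steps.append("write logbook on client node")
    return steps


if __name__ == "__main__":
    job = JobArgs("dst", run_number=1, n_events=100, words_per_event=50,
                  rep_catalog=True, mass_storage=False)
    for step in plan_steps(job):
        print(step)
```

Returning the steps as a list rather than executing them keeps the sketch independent of any real Grid service.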
Groups Involved in Testbed 1

Integration Team
• WP1: Elisabetta Ronchieri
• WP2: Shahzad Muzaffar
• WP3: Alex Martin
• WP4: Maite Barroso Lopez
• WP5: Jean Philippe Baud
• WP6: Brian Coghlan, Flavia Donno, Eric Fede, Fabio Hernandez, Nadia Lajili, Charles Loomis, Pietro Paolo Martucci, Andrew McNab, Sophie Nicoud, Yannik Patois, Anders Waananen
• WP7: Frank Bonnassieux
• WP8: PierGiorgio Cerello, Eric Van Herwijnen
• WP9: Julian Lindford, Andrea Parrini
• WP10: Yannick Legre

Users (ALICE, ATLAS, CMS, LHCb)
Eric van Herwijnen, Claude Charlot, Piergiorgio Cerello, Silvia Resconi, Dave Newbold, J Closier, Yves Schutz, Craig Tull, G Patrick, Roberto Barbera, Andrea Sciaba, Oxana Smirnova, Predrag Buncic, D Galli, Claudio Grandi, Stan Thompson, Federico Carminati, V Vagnoni, Igor Semeniouk, Fairouz Ohlsson-Malek, Marisa Luvisetto, S Klous, Paolo Capiluppi, H van Bulten, Daniele Mura, Olga Kodolova, G D Patel, Fons Rademakers, N. Kruglov, Mario Sitta, N Brook, A. Kryukov, A Khan, L. Shamardin, F Harris, V. Kolosov, D McPherson, E. Tikhonenko, V. Mitsyn, Marco Verlato, Massimo Sgaravatto, A. Edunov, B. Berdnikov

VO Admin
• ALICE: Daniele Mura
• ATLAS: Alessandro de Salvo
• CMS: Andrea Sciaba
• LHCb: Joel Closier
• EO: Yannick Legre
• Bio-Info: John van de Vegte

Site Admin
• CERN: Markus Schulz
• Lyon: Fabio Hernandez
• CNAF: A. Chierici et al.
• NIKHEF: Jeff Templon & David Groep
• RAL: B. Saunders

• ALICE: Ingo Augstin & Steve Burke
• ATLAS: Mario Reale & Alessia Tricomi
• CMS & WP10: JJ Blaising
• LHCb: Jeff Templon

Security Certificates
• The project software supports 12 Certification Authorities from the various partners involved in the project
  • http://marianne.in2p3.fr/datagrid/ca/ca-table-ca.html
• For a machine to participate as a Testbed 1 resource, all the CAs must be enabled
  • All CA certificates can be installed without compromising local site security
• Each host running a Grid service needs to be able to authenticate users and other hosts
  • Site manager has full control over security for local nodes
• A Virtual Organisation represents a community of users
  • 6 VOs: 4 HEP (ALICE, ATLAS, CMS, LHCb), 1 EO, 1 Biology
Account Registration
Usage guidelines

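The rule that a machine must have all the project CAs enabled before it can act as a Testbed 1 resource amounts to a set comparison. The Python sketch below illustrates that check only. The function name `missing_cas` and the placeholder identifiers `ca-01` to `ca-12` are hypothetical, and how a real site lists its enabled CAs is not covered by the slide.

```python
from typing import Iterable, List


def missing_cas(required: Iterable[str], enabled: Iterable[str]) -> List[str]:
    """Return the required CAs that are not enabled on a host, sorted by name."""
    return sorted(set(required) - set(enabled))


def can_join_testbed(required: Iterable[str], enabled: Iterable[str]) -> bool:
    """A host qualifies only if no required CA is missing."""
    return not missing_cas(required, enabled)


if __name__ == "__main__":
    # Placeholder names standing in for the 12 project CAs (hypothetical).
    required = [f"ca-{i:02d}" for i in range(1, 13)]
    enabled = [ca for ca in required if ca != "ca-07"]

    print(missing_cas(required, enabled))
    print(can_join_testbed(required, enabled))
```

Extra locally trusted CAs do not affect the result, since only the required set is checked.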
Node configuration and installation tools
• For reference platform (Linux RedHat 6.2)
• Initial installation tool using system image cloning
• LCFG (Edinburgh University) for software updates and maintenance
• Total of ~750 RPMs
• With a 10 Mbit/sec link, need just 10 mins to install a node
[Figure: Node configuration tools. Server node: LCFG configuration files, mkxprof, Web Server. HTTP. Client nodes: rdxprof, ldxprof, DBM File, Generic Component, LCFG Components. XML Profile (one per client node).]

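The installation figures on this slide allow a quick sanity check. The Python sketch below computes the most data a 10 Mbit/sec link can carry in 10 minutes and divides it by the ~750 RPMs quoted. It is an upper bound under stated assumptions: the link is fully used for the whole 10 minutes, protocol overhead is ignored, and megabytes are decimal. The slide does not give the actual size of the installed software, and the function name is hypothetical.

```python
def max_transfer_megabytes(link_mbit_per_s: float, minutes: float) -> float:
    """Upper bound on data moved over a link, in decimal megabytes."""
    bits = link_mbit_per_s * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    total_mb = max_transfer_megabytes(10, 10)
    rpm_count = 750  # "~750 RPMs" from the slide

    print(f"at most {total_mb:.0f} MB in 10 minutes")
    print(f"at most {total_mb / rpm_count:.1f} MB per RPM on average")
```

Under these assumptions the bound is 750 MB, or about 1 MB per RPM on average.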
Lessons learnt from testbed 1
• The raw ingredients exist; we just need to be sure of the recipe
  • Sufficient expertise exists in the different institutes to cover all aspects of the project
  • Expertise and enthusiasm need to be channeled using an agreed framework
• CERN central role underestimated and under-resourced
• Integration and deployment is a labour-intensive task
  • More planning is needed, and WP6 (sw integration) needs reinforcement (especially at CERN)
  • Better done in small steps using iterative releases
  • Support by and relationship with Globus developers is very important
• International Aspects
  • Already an international testbed; need to agree plans with similar US activities
  • Underestimated the administrative effort involved in running an international testbed
• Need more emphasis on testing
  • More unit & integration testing
  • Middleware WPs need to develop a test plan (also WP6 for external packages & integration tests) and involve applications from an early stage

Iterative Releases
• Planned intermediate release schedule
  • TestBed 1: October 2001
  • Release 1.1: January 2002
  • Release 1.2: March 2002
  • Release 1.3: May 2002
  • Release 1.4: July 2002
  • TestBed 2: September 2002
• Each release includes
  • feedback from use of previous release by application groups
  • agreed high-priority improvements/extensions
  • use of software infrastructure
  • feeds into architecture group
• Similar schedule will be organised for 2003

Software Release Procedure
• Coordination meeting
  • Gather feedback on previous release
  • Review plan for next release
• WP meeting
  • Take basic plan and clarify effort/people/dependencies
• Sw development
  • Performed by WPs in dispersed institutes, which also run unit tests
• Software integration
  • Performed by WP6 on frozen sw
  • Integration tests run
• Acceptance tests
  • Performed by Loose Cannons et al.
• Roll-out
  • Present sw to application groups
  • Deploy on testbed
[Figure: release cycle. Labels: Release feedback, Coord. meeting, Release Plan, WP meetings, themed tech. meets, Release Plan++, Component 1 ... Component n, WP1, WP3, WP7, Globus, EDG release, Distributed EDG release, Roll-out meeting (testbed 1: Dec 11 2001, ~100 participants).]

Development & Production testbeds
• Development
  • Initial set of 5 sites will keep a small cluster of PCs for development purposes, to test new versions of the software, configurations etc.
• Production
  • More stable environment for use by application groups
    • more sites
    • more nodes per site (grow to meaningful size at major centres)
    • more users per VO
  • Usage already foreseen in Data Challenge schedules for LHC experiments
    • harmonize release schedules
Participating in InterGrid discussions on testbed organisation (Antonia Ghiselli, Bob Jones, Francesco Prelz)
Analysis of interface with US testbeds to be performed by end of April (GriPhyN/PPDG meeting)

Future Plans
• Prepare for first project EU review, 1st March 2002 ("managed panic")
• Expand testbed
  • More nodes per site, more sites (including US), more users
• Evolve architecture and software on the basis of TestBed usage and feedback from users
  • Closer integration of the software components
  • Improve software infrastructure toolset and test suites
  • Look for convergence with PPDG/GriPhyN architecture
• Enhance synergy with US via DataTAG-iVDGL and InterGrid
• Address shortcomings in plan by collaborating with other EU projects (DataTAG, GridSTART, CrossGrid)
• Promote early standards adoption through participation in GGF WGs
• Final software release by end of 2003