
Status of the ALICE Experiment

Patricia Méndez Lorenzo, CERN (IT-GD) / INFN (CNAF). FZK T1-T2 Workshop, Forschungszentrum Karlsruhe, 19-20 October 2005.


Presentation Transcript


  1. Status of the ALICE Experiment. Patricia Méndez Lorenzo, CERN (IT-GD) / INFN (CNAF). FZK T1-T2 Workshop, Forschungszentrum Karlsruhe, 19-20 October 2005.

  2. Outlook
     ◘ PDC04 Results
     ◘ Introduction: Some Generalities
       ➸ Production on different Grids, available resources, interfacing AliEn and LCG
     ◘ Scope and Planning for PDC05
     ◘ Goals of the new DC
     ◘ Baseline Services
     ◘ ALICE Requirements
     ◘ Use of LCG and SC3 infrastructure
     ◘ Next Steps
     ◘ Support
     ◘ Summary
     Thanks to S. Bagnasco, P. Buncic, L. Betev, F. Carminati, P-G. Cerello and P. Saiz
     FZK 20th October Patricia Mendez Lorenzo

  3. Some Results of the last PDC04
     ◘ Phase 1: Production of RAW data at CERN (Mar-May 2004)
       ➸ Large output files
       ➸ 1a: Central events (long jobs, large files)
       ➸ 1b: Peripheral events (short jobs, smaller files)
     (S. Bagnasco, SC3 Detailed Planning Workshop, CERN, 13 June 2005)

  4. Some Results of the last PDC04
     ◘ Statistics after phase 1 (ended April 4, 2004):
       ➸ ALICE::CERN::LCG is the interface to LCG-2
       ➸ ALICE::Torino::LCG is the interface to GRID.IT
     ~1.3 million files, 26 TB data volume
     (S. Bagnasco, SC3 Detailed Planning Workshop, CERN, 13 June 2005)

  5. Generalities about ALICE Production
     ◘ ALICE has its own Task Queue and related services and uses the LCG RB to submit the job agents
       ➸ Pull model: a server holds a master queue of jobs, and it is up to the CE that provides the CPU cycles to ask for the jobs
       ➸ No Information System is included
       ➸ It offers a single interface for ALICE users into the complex, heterogeneous (multiple Grids) and fast-evolving Grid reality
     ◘ Several Grid infrastructures are available during the Data Challenges
       ➸ LCG, INFNGRID, possibly others in the US
       ➸ Lots of resources, but different middleware
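The pull model on this slide can be illustrated with a minimal Python sketch. It is not AliEn code: the class `TaskQueue`, the function `run_job_agent` and the job names are hypothetical, and the only idea taken from the slide is that the central queue holds the jobs while agents on the computing side ask for them.

```python
from collections import deque
from typing import List, Optional, Tuple


class TaskQueue:
    """Hypothetical central master queue: it holds jobs and never pushes them."""

    def __init__(self, jobs: List[str]) -> None:
        self._jobs = deque(jobs)

    def request_job(self, site: str) -> Optional[str]:
        """Called by a job agent at `site`; returns a job or None if the queue is empty."""
        return self._jobs.popleft() if self._jobs else None


def run_job_agent(queue: TaskQueue, site: str, max_jobs: int) -> List[Tuple[str, str]]:
    """A job agent pulls jobs until the queue is empty or max_jobs is reached."""
    done = []
    while len(done) < max_jobs:
        job = queue.request_job(site)
        if job is None:
            break  # nothing left to pull
        done.append((site, job))
    return done


queue = TaskQueue(["job-1", "job-2", "job-3"])
first = run_job_agent(queue, "ALICE::CERN::LCG", max_jobs=2)
second = run_job_agent(queue, "ALICE::Torino::LCG", max_jobs=2)
print(first)
print(second)
```

Because the agents ask for work, the queue needs no knowledge of site state, which is consistent with the slide's remark that no Information System is included.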

  6. Production on different Grids
     Design Strategy:
     ◘ Use AliEn as a general front-end
       ➸ The resource is used transparently and independently of the middleware system behind
     ◘ Minimize points of contact between the systems
       ➸ No need to re-implement services
       ➸ No special requirements to run on remote CE/WNs
     ◘ Make full use of the provided services
       ➸ Let the Grids do their work
     ◘ Use high-level tools and APIs to access Grid resources
       ➸ Developers put a lot of abstraction effort into hiding the complexity and shielding the user from implementation changes
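The front-end idea in this design strategy can be sketched as a single user-facing interface that dispatches to interchangeable back-ends. All class names and return strings below are hypothetical illustrations, not the AliEn or LCG APIs.

```python
from abc import ABC, abstractmethod
from typing import Dict


class GridBackend(ABC):
    """Hypothetical single point of contact with one middleware system."""

    @abstractmethod
    def submit(self, job: str) -> str:
        ...


class LcgBackend(GridBackend):
    def submit(self, job: str) -> str:
        # A real back-end would hand the job to the LCG RB here.
        return f"LCG: {job} submitted"


class NativeBackend(GridBackend):
    def submit(self, job: str) -> str:
        # A real back-end would hand the job to a native AliEn CE here.
        return f"native: {job} submitted"


class FrontEnd:
    """Users see one interface; the middleware behind each site stays hidden."""

    def __init__(self, backends: Dict[str, GridBackend]) -> None:
        self._backends = backends

    def submit(self, job: str, site: str) -> str:
        return self._backends[site].submit(job)


front_end = FrontEnd({"site-a": LcgBackend(), "site-b": NativeBackend()})
result_a = front_end.submit("job-1", "site-a")
result_b = front_end.submit("job-2", "site-b")
print(result_a)
print(result_b)
```

Keeping each back-end behind one small method is one way to minimize the points of contact between the systems: adding another Grid only means adding another `GridBackend` subclass.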

  7. Available Resources
     ◘ LCG-2 core sites
       ➸ CERN, CNAF, FZK, NIKHEF, RAL, Lyon, Taiwan (more than 1000 CPUs)
       ➸ Each LCG site with a VO-BOX is seen as an independent site
       ➸ AliEn services (CE + SE) sited in each VO-BOX
     ◘ INFNGRID
       ➸ LNL.INFN, PD.INFN and several smaller ones (400 CPUs, not including CNAF)

  8. Interfacing AliEn and LCG
     [Diagram: ALICE SC3 layout in terms of LCG/SC3. Labels: Task Queue, File Catalogue, User (Production Manager), LCG site (T1) with VO-BOX (UI), CE list, LCG CE, WN, SE (local) and RB (external); three T2 sites, each with a VO-BOX (UI) and a CE list.]

  9. Submission to LCG+AliEn sites
     “Double access” for selected sites (CNAF and CT.INFN)
     [Diagram labels: user submits jobs, submission server, AliEn CE, LCG UI, LCG RB, LCG CE/SE, WNs.]
     (S. Bagnasco, SC3 Detailed Planning Workshop, CERN, 13 June 2005)

  10. New PDC05

  11. Scope and Planning of DC05
     Physics Data Challenge: Phases
     ◘ First Phase: Simulation of Monte Carlo events on all available resources
       ➸ Flow (production over), p+p (current production), Hijing+ (next)
       ➸ Registration of all the outputs in the ALICE File Catalog (central catalog) and storage at CERN-CASTOR (for SC3)
       ➸ Production foreseen using LCG resources as soon as the VO-BOXES are all configured
     ◘ Second Phase: Reconstruction of the raw events stored at CERN
       ➸ Test of file transfer utilities (FTS)
       ➸ Use of the local catalog at each site (LFC)
     ◘ Third Phase: Analysis phase
     ◘ From the LCG resources, ALICE will use only those sites involved in SC3

  12. Timeline of PDC05/SC3
     [Timeline figure, Aug-Dec 2005. Items shown: Event production (Phase 1); job submission through the LCG interface; ALICE data ‘push’ with reserved/shared bandwidth and test of FTS (Phase 2); prototype data analysis (Phase 3); SC3 start of service phase.]
     (L. Betev, F. Carminati, GDB Meeting in Bologna, October 2005)

  13. Primary and Secondary Goals
     ◘ Fundamental Goals:
       ➸ Use of the deployed LCG SC3 infrastructure for the ALICE DC05
       ➸ Test of the data transfer and storage services (SC3)
       ➸ Test of the distributed reconstruction and calibration model (ALICE)
       ➸ Integrate the use of LCG resources with other resources available to ALICE within one single VO interface for different Grids
       ➸ Analysis of reconstructed data

  14. Baseline Services (I)
     Services provided during SC3 (I. Bird, LCG PEB, 7th 2005)
     ◘ Storage management services
       ➸ Based on SRM as the interface
     ◘ Basic transfer services
       ➸ gridftp, srmcopy
     ◘ Reliable file transfer service
     ◘ Grid catalogue services
     ◘ Catalogue and data management tools
     ◘ Database services
       ➸ Required at T1 and T2
     ◘ Compute resource services
     ◘ Workload management

  15. Baseline Services (II)
     ◘ Clear need for VOMS: roles, groups, subgroups
     ◘ POSIX-like I/O service
       ➸ For local files, including links to catalogues
     ◘ Grid monitoring tools and services
       ➸ Focused on job monitoring
     ◘ VO agent framework
     ◘ Application software installation service
     ◘ Reliable messaging service
     ◘ Information system

  16. ALICE Requirements
     • VO-BOXES: deployed in all T1 and T2
       - PIII 2 GHz, 1024 MB RAM. Any Linux flavour, kernel 2.4+.
       - User accounts for SGMs, via gsissh
       - UI functionality (including FTS and access to the local catalog)
       - Access to the experiment software installation area
     • Agents and services
       - Site service interfaces and monitoring agents:
         Storage Element Service (SES), File Transfer Daemon (interface to FTS)
         Cluster Monitor (CM), MonALISA, agents monitoring
         AliEn Computing Element (interface to LCG RB)
         PackMan (PM), xrootd
     • Connectivity
       - Outbound connectivity + access to local storage (direct or SRM)
       - Inbound connectivity on some fixed network ports
         From CERN, for CM and PM (e.g. 8084 and 9991)
         From the world, for SES and xrootd (e.g. 8082 and 51234)
     • Local data buffer for intermediate input/output of jobs (SES service)
       - Size: at least the number of job slots on the site × 3 GB
       - Not necessary if xrootd is running on the site SE (may be included in DPM)
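The buffer sizing rule above is simple arithmetic; a small Python sketch makes it concrete. The function name `min_buffer_gb` and the example of 200 job slots are assumptions for illustration. The port numbers are the examples quoted on the slide, grouped by origin only, since the slide does not state which port belongs to which service.

```python
GB_PER_JOB_SLOT = 3  # from the slide: "number of job slots on the site * 3GB"

# Example inbound ports quoted on the slide, grouped by where traffic comes from.
EXAMPLE_INBOUND_PORTS = {
    "CERN": [8084, 9991],    # for CM and PM
    "World": [8082, 51234],  # for SES and xrootd
}


def min_buffer_gb(job_slots: int, xrootd_on_site_se: bool = False) -> int:
    """Minimum local data buffer in GB for the SES service.

    Returns 0 when xrootd runs on the site SE, because the slide says
    the buffer is not necessary in that case.
    """
    if xrootd_on_site_se:
        return 0
    return job_slots * GB_PER_JOB_SLOT


buffer_for_200_slots = min_buffer_gb(200)
buffer_with_xrootd = min_buffer_gb(200, xrootd_on_site_se=True)
print(buffer_for_200_slots, buffer_with_xrootd)
```

A hypothetical site with 200 job slots would therefore need at least 600 GB of buffer, unless xrootd runs on the site SE.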

  17. ALICE Requirements
     VO-BOXES: Current Status
     Deployed at:
       ➸ CERN, CNAF, NIKHEF/SARA, IN2P3, Catania, Torino, Bari, GSI, FZK, RAL
       ➸ AliEn-specific services and software deployed in all VO-BOXES
       ➸ Submissions through the VO-BOX
         - Job submission to the RB is possible
         - Some infrastructure is still missing (environment variables, lcg-infosites...)
         - To be completed for the next release (LCG 2.7.0 in October)

  18. ALICE Requirements
     ◘ The configuration of the currently deployed VO-BOX allows job submission, but with some considerations:
       ➸ Some job submissions were performed last week from the VO-BOX at CERN
       ➸ An RB is defined by default
         - Not clear whether this is the case in the other VO-BOXES
       ➸ A VO configuration file is nevertheless mandatory in the submission command line
       ➸ /tmp/jobOutput is not available: the output directory must be specified
       ➸ lcg-infosites has been installed by hand
         - Probably missing at the other sites

  19. Use of the SC3 and LCG infrastructure
     ALICE is beginning the production in LCG without FTS
     ◘ FTS: Deployed in all sites
       ➸ Perl API implemented in the ALICE framework
       ➸ Tests among T0-T1 performed this summer
       ➸ FTS will be used as an FTD plug-in
       ➸ FTD was tested between native AliEn sites
         - Already used in DC04
     ◘ Current Status:
       ➸ FTS through FTD not yet ready
       ➸ ALICE is testing FTS standalone

  20. Use of the SC3 and LCG infrastructure
     ◘ LFC: Deployed in all sites
       ➸ Considered the local ALICE catalog
       ➸ Central AliEn storage index
       ➸ Perl API implemented in the AliEn framework
         - More than 10000 entries (LFC as unique catalog)
         - Too many authentications slow the process
     ◘ Current Status of the LFC interface:
       ➸ OK, but not used by the current production
       ➸ Will exercise it with special jobs
     ◘ SRM: Deployed in all sites

  21. Next Steps for SC3
     ◘ Get the SC3 production started
       ➸ Integration of the latest version of AliRoot v.2.3
       ➸ Build the distribution
       ➸ Deployment at T1 (CC-IN2P3, CNAF, FZK, RAL)
       ➸ Waiting for NIKHEF
       ➸ Run massively on T1
       ➸ Timescale: already running at CERN, then running continuously
       ➸ Issues: FTS not yet included
     “Next Steps” reported by P-G. Cerello during the TF Meeting, 06/10/05

  22. Next Steps for SC3
     ◘ Extend to T2s
       ➸ Foreseen T2s: Bari, Catania, GSI and Torino
       ➸ Deployment on T2
         - LCG VO-BOX: site managers
         - ALICE services on VO-BOXES
       ➸ Operation:
         - Monitoring and error reporting by ALICE Task Force support on sites
       ➸ Timescale: 1-2 weeks

  23. Next Steps for SC3
     ◘ FTS Tests:
       ➸ Massively test all the T0 <-> T1 and T1 <-> T2 connections/endpoints involved in SC3
         - Configure and test the script execution on VO-BOXES
         - Run several threads of the testing script, so as to reach the highest throughput
         - Issues: with or without LFC registration? ALICE has decided to run FTS in the simpler mode: NO automatic update of the catalog
       ➸ Timescale: 2 weeks
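Running several threads of a testing script, as planned above, might look like the following Python sketch. It does not call FTS: `fake_transfer` is a placeholder that only sleeps, and the channel list, file size and thread count are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Channel = Tuple[str, str]

# Hypothetical channel list mirroring the T0 <-> T1 and T1 <-> T2 directions.
CHANNELS: List[Channel] = [("T0", "T1"), ("T1", "T0"), ("T1", "T2"), ("T2", "T1")]
FILE_SIZE_MB = 10.0  # assumed size of each test file


def fake_transfer(channel: Channel) -> Tuple[Channel, float]:
    """Placeholder for a real transfer request; it sleeps instead of moving data."""
    time.sleep(0.01)
    return channel, FILE_SIZE_MB


def run_test(channels: List[Channel], n_threads: int, per_channel: int = 5):
    """Run per_channel transfers on every channel using n_threads worker threads."""
    work = [channel for channel in channels for _ in range(per_channel)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(fake_transfer, work))
    elapsed = time.perf_counter() - start
    total_mb = sum(size for _, size in results)
    return len(results), total_mb, total_mb / elapsed  # count, MB, MB/s


count, total_mb, rate_mb_s = run_test(CHANNELS, n_threads=4)
print(f"{count} transfers, {total_mb} MB, {rate_mb_s:.1f} MB/s")
```

Repeating the run with different `n_threads` values is one simple way to look for the thread count that gives the highest aggregate rate.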

  24. Next Steps for SC3
     ◘ Steps:
       ➸ While FTS is being tested, complete the Perl API integration in AliEn
       ➸ Test on a T1 <-> T2 connection
       ➸ Update the AliEn distribution
       ➸ Deploy on VO-BOXES
       ➸ Start jobs with FTS transfers included
       ➸ Timescale: 1 month?

  25. In terms of support
     ◘ SC3 Weekly Meeting with site and experiment representatives
     ◘ AliEn central services: ALICE responsibility
     ◘ Task Force Weekly Meeting: LCG, sites, ALICE and ARDA
     ◘ Periodic Action List updates to cover the ALICE needs
     (L. Betev, F. Carminati, GDB Meeting in Bologna, October 2005)

  26. Action List
     ◘ Updated every week during the TF Meeting
     Format: Task number: Task Label: Description. (Assigned to). Entry Date-Outing Date: Current Status
     1st Group: VO BOXES (VOBOX)
     1.1 VOBOX-conf: Full configuration of the LCG VO-BOXES. (P. Cerello and P. Mendez). 06/10/05-(00/00/00): Work is ongoing. A brief how-to document describing the configuration of the UI inside the VO-BOX is being written; it is then planned to be passed to all sites.
     1.2 VOBOX-test: Configuration tests of the LCG tools. (F. Donno). 06/10/05-(00/00/00): Work is ongoing.
     1.3 VOBOX-lcg-is: Installation of lcg-infosites in all VO-BOXES. (P. Mendez). 06/10/05-(00/00/00): Work almost finished; all the sites have been contacted.
         Done: CERN, Torino, FZK, RAL, Bari
         To be done shortly: IN2P3, CNAF and GSI
         Waiting for answers from: NIKHEF and Catania
     1.4 VOBOX-AliEn: Deploy AliEn 2.3. (P. Buncic). 06/10/05-(00/00/00): The new version is ready. Deployed at CERN; deployment at the remaining sites is ongoing.
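The entry format of the Action List can be modelled as a small record type. This is only a sketch: the class `ActionItem`, its field names and the reading of `00/00/00` as "not yet closed" are assumptions, not something the slide states.

```python
from dataclasses import dataclass

OPEN_DATE = "00/00/00"  # assumption: placeholder date meaning the task is still open


@dataclass
class ActionItem:
    number: str
    label: str
    description: str
    assigned_to: str
    entry_date: str
    outing_date: str
    status: str

    def is_open(self) -> bool:
        """True while the outing date is still the placeholder value."""
        return self.outing_date == OPEN_DATE

    def render(self) -> str:
        """Format the item following the slide's entry layout."""
        return (
            f"{self.number} {self.label}: {self.description}. "
            f"({self.assigned_to}). {self.entry_date}-({self.outing_date}): {self.status}"
        )


item = ActionItem(
    number="1.2",
    label="VOBOX-test",
    description="Configuration tests of the LCG tools",
    assigned_to="F. Donno",
    entry_date="06/10/05",
    outing_date=OPEN_DATE,
    status="Work is ongoing",
)
print(item.render())
```

Keeping the items as records rather than free text would make it straightforward to list, for example, all tasks that are still open before each weekly meeting.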

  27. Summary
     ◘ PDC05 is on a good track:
       ➸ Testing the baseline services provided by LCG (workload management system, SRM, FTS)
       ➸ Development and testing of AliEn interfaces to LCG and beyond: ARC (Nordic), OSG (US)
     ◘ Coordination of activities:
       ➸ Fully integrated with LCG SC3
       ➸ Operation of the DC is managed by the ALICE-LCG TF
         - LCG, ARDA, site experts, ALICE
     ◘ PDC05 tasks:
       ➸ Flow events (completed), starting with p+p
       ➸ Test of FTS: file replications (in 2 weeks)
       ➸ Prototype of analysis: end 2005/beginning 2006
