DØ RACE Status Report

This report discusses the progress and challenges of the DØRACE project, which aims to facilitate remote code development, analysis, and data sharing within the DØ collaboration. It covers topics such as software distribution, hardware infrastructure, regional analysis centers, and software infrastructure.


Presentation Transcript


  1. DØ RACE Status Report Jae Yu July 10, 2002 DØ IB, Oklahoma Workshop • Introduction • Software Distribution (DØRACE Setup) • Hardware Infrastructure • DØRAM Architecture • Regional Analysis Centers • Software Infrastructure (SAM-Grid) • Conclusions DØRACE Status Report, Jae Yu DØ IB Meeting, OU Workshop

  2. DØRACE (Remote Analysis Coordination Effort) • Computing hardware is rather inexpensive • CPU and storage media are inexpensive • Small institutions can afford to own reasonable size computing facilities • DØ collaboration is larger and more international • Most of the collaborating institutions are remote • Code development can occur at remote stations • Promote contribution of available human resources for software development • Give ownership to collaborators from remote institutes • Optimal and efficient access to data is of utmost importance to expedite analyses • Minimize travel around the globe for data access • Exploit existing but scattered computing resources • Sociological issue of HEP people at the home institutions • Sharing 15-20 fb⁻¹ worth of raw and reconstructed data/MC (~9 PB) efficiently is a big issue • Primary goal is empowering individual desktop users

  3. What do we need? • Efficient remote DØ software distribution • Allow remote participation in code development, which is otherwise a bottleneck in expediting physics results • Allow remote analysis for histogram production • Allow remote reconstruction or production environment • Sufficient compute and storage hardware infrastructure • Optimized resource management tools • Allow maximal utilization of offsite resources • Allow participation of remote resources for the collaboration’s needs • Efficient and transparent data delivery and sharing • Allow location-independent access to sufficiently large data sets throughout the entire network of the collaboration • Minimize central data storage dependence • Alleviate load on central data storage and servers

  4. Software Distribution (DØRACE Setup)

  5. From the Nov. Survey • Identified Difficulties • Having a hard time setting up initially • Lack of updated documentation • Rather complicated setup procedure • Lack of experience; no forum to share experiences • OS version differences (RH6.2 vs 7.1), let alone OS differences • Most of the established sites have an easier time updating releases • Network problems affecting successful completion of large releases (a 4 GB release takes a couple of hours) (SA) • No specific responsible person to ask questions • Availability of all necessary software via UPS/UPD • Time difference between continents affecting efficiency
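
The network complaint on this slide can be turned into a rough number. The following is only a back-of-the-envelope sketch: it assumes "a couple of hours" means 2 hours and 1 GB = 10^9 bytes, neither of which is stated in the slides, and the function names are made up for illustration.

```python
# Rough estimate of the average throughput implied by
# "a 4 GB release takes a couple of hours".
# Assumptions (not from the slides): 2 hours, 1 GB = 10**9 bytes.

RELEASE_SIZE_BYTES = 4 * 10**9
TRANSFER_TIME_S = 2 * 3600


def effective_throughput_mbit_s(size_bytes: int, seconds: float) -> float:
    """Average throughput in megabits per second."""
    return size_bytes * 8 / seconds / 1e6


def transfer_time_hours(size_bytes: int, mbit_s: float) -> float:
    """Time in hours to move size_bytes at a sustained rate of mbit_s."""
    return size_bytes * 8 / (mbit_s * 1e6) / 3600


rate = effective_throughput_mbit_s(RELEASE_SIZE_BYTES, TRANSFER_TIME_S)
print(f"Implied average throughput: {rate:.1f} Mbit/s")
```

Under these assumptions the implied rate is only a few Mbit/s, which is consistent with the later decision to split releases into smaller packages.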

  6. DØRACE Setup Strategy • Categorized remote analysis system setup by functionality • Desktop only • A modest analysis server • Linux installation • UPS/UPD installation and deployment • External package installation via UPS/UPD • CERNLIB • Kai-lib • Root • Download and install a DØ release • Tar-ball for ease of initial setup? • Use of existing utilities for latest release download • Installation of cvs • Code development • KAI C++ compiler • SAM station setup • Phases: Phase 0 Preparation; Phase I Rootuple Analysis; Phase II Executables; Phase III Code Dev.; Phase IV Data Delivery

  7. Regular bi-weekly meetings (9am, on-week Thursdays, in 9th circle and in the Saturn VR) • Updates for releases • Sites report in the meetings • Status • Difficulties/Issues • Solutions • Featured topics of common interest • Slides 100% posted on the web before the meeting • Meeting 100% on VRVS since June 5, 2002 (thanks to VCTF&SVCC) • Instructions for setup regularly updated and posted on the DØRACE web page → Setup is easier • Automatic release-ready notification system in place • Releases are split into two packages (binaries and sources) to alleviate network dependencies

  8. Progressive (figure; not reproduced in the transcript)

  9. Incomplete DØRACE Deployment Map (US and EU only) • Map legend: Processing Center; Analysis Site w/ SAM; Analysis Site w/o SAM; No DØRACE • Excuse my poor geography and the missing continents! You are welcome to send me updates to the map.

  10. d0-remote-analysis@fnal.gov distribution list used for exchange of information from experience • Help each other • Find bugs and report for fixes • Accumulate expertise • Given the number of sites with setup, the load on the release managers (2) is not large • Run-time environment packages for consistent executables • McFarm control software distributed to a few new farms and tested running this afternoon with basic grid tools • A total of 38 institutions ready for code development • About a dozen SAM sites active → Must use them now!!! • We have established a rather stable software distribution system • What next???

  11. DØRAM Hardware Infrastructure (DØRAC)

  12. DØ RAC Working Group • Formed after the Feb. DØRACE workshop • Fully characterize RACs • Address issues • Members: • Iain Bertram (Lancaster), Chip Brock (MSU), Frank Filthaut (NIKHEF), Lee Lueking (FNAL), Peter Maettig (Wuppertal), Meena Narain (Boston), Bruno Thooris (Saclay), Jae Yu (UTA), Christian Zeitnitz (Mainz) • The group met every week via video for about 1.5 months prior to the Computing review to give timely input to the management • The review committee commended our aggressive approach to RAM • The hard work of the group resulted in … • Proposal for DØ Regional Analysis Centers (DØ Note #3984) • http://www-hep.uta.edu/~d0race/d0rac-wg/d0rac-final.pdf

  13. The DØRAC Proposal • Characterize use of RAC in a representative analysis • Proposed DØRAM • Characteristics of the data to be stored • Services to be provided by DØRACs • Requirements for DØRACs • Storage space justification • Compute resource justification • Possible candidate sites and their capabilities • Prototype RAC project • Organizational and bureaucratic issues • Implementation time scale • Conclusions, with many issues that still need to be addressed

  14. DØ Remote Analysis Model (DØRAM) • Central Analysis Center (CAC) • Regional Analysis Centers (RAC): provide various services • Institutional Analysis Centers (IAC) • Desktop Analysis Stations (DAS) • Diagram legend: Normal Interaction Communication Path; Occasional Interaction Communication Path
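
The tiered model on this slide can be pictured as a tree. The sketch below is illustrative only: the `Center` class, the `normal_path` helper, and the site names are hypothetical, and it assumes the normal interaction path runs from each center to its immediate parent tier (DAS to IAC to RAC to CAC), which is how the diagram reads but is not spelled out in the transcript.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Center:
    """One node in the DØRAM hierarchy (hypothetical representation)."""
    name: str
    tier: str  # "CAC", "RAC", "IAC" or "DAS"
    # Exclude parent from repr/eq to avoid infinite recursion in the tree.
    parent: Optional["Center"] = field(default=None, repr=False, compare=False)
    children: List["Center"] = field(default_factory=list)

    def add(self, name: str, tier: str) -> "Center":
        """Attach a child center one tier below and return it."""
        child = Center(name, tier, parent=self)
        self.children.append(child)
        return child


def normal_path(station: Center) -> List[str]:
    """Centers visited when a request climbs tier by tier to the root."""
    path = []
    node: Optional[Center] = station
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


# Hypothetical example: one station under one IAC under one RAC.
cac = Center("CAC", "CAC")
rac = cac.add("RAC-1", "RAC")
iac = rac.add("IAC-1", "IAC")
das = iac.add("DAS-1", "DAS")
print(" -> ".join(normal_path(das)))
```

An occasional interaction path would correspond to a request that skips a tier, for example a DAS contacting a RAC directly; that case is not modeled here.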

  15. What is a DØRAC? • An institute with large concentrated and available computing resources • An institute willing to provide services to a few small institutes (IAC) in the region and to the collaboration • An institute willing to provide increased infrastructure as the data from the experiment grows • An institute willing to provide support personnel

  16. What services does a DØRAC provide? • Services to IACs: • Accept and execute analysis batch job requests • Provide cache and storage space • Store and provide access to desired data sets • Provide database access • Provide intermediary code distribution • Services to the Collaboration: • Generate and reconstruct MC data sets • Participate in re-reconstruction of data • Provide manpower support for the above activities

  17. Regional Analysis Center Requirements • Located in a geographically and infrastructurally sensible place • Sufficiently large bandwidth to FNAL, other RACs, and IACs • Large storage space (robotic tape and/or disk) to store • 100% TMB in each RAC • 100% DST in the sum of all RACs, distributed randomly • Store MC data set • Sufficiently large compute resources • Support for the infrastructure and maintenance
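
The two storage rules on this slide (a full TMB copy at every RAC, and DST spread randomly so that the RACs together hold one full copy) can be sketched as follows. This is a toy illustration under assumptions: the function names, file names, and RAC names are invented, and uniform random assignment of each DST file to exactly one RAC is only one possible reading of "distributed randomly".

```python
import random
from typing import Dict, List


def place_tmb(tmb_files: List[str], racs: List[str]) -> Dict[str, List[str]]:
    """Every RAC stores a full copy of the thumbnail (TMB) data."""
    return {rac: list(tmb_files) for rac in racs}


def place_dst(dst_files: List[str], racs: List[str],
              seed: int = 0) -> Dict[str, List[str]]:
    """Assign each DST file to one randomly chosen RAC.

    The union over all RACs is the complete DST set by construction.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    placement: Dict[str, List[str]] = {rac: [] for rac in racs}
    for name in dst_files:
        placement[rng.choice(racs)].append(name)
    return placement


# Hypothetical inputs for illustration only.
racs = ["RAC-A", "RAC-B", "RAC-C"]
tmb_files = [f"tmb_{i:03d}" for i in range(10)]
dst_files = [f"dst_{i:03d}" for i in range(30)]

tmb_placement = place_tmb(tmb_files, racs)
dst_placement = place_dst(dst_files, racs)
for rac in racs:
    print(rac, len(tmb_placement[rac]), "TMB files,",
          len(dst_placement[rac]), "DST files")
```

With this scheme no single RAC needs the full DST volume, at the cost of routing some DST requests to another RAC.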

  18. Chip’s Sears Model of Categorization • Best RACs: • Gbit or better network bandwidth • Robotic tape storage: ~170 TB • Disk storage space: ~110 TB • Compute resources: ~50 cpu/year/RAC • Provide database proxy service • Cost: ~$1M/year • Good RACs: • Gbit or better network bandwidth • Disk storage: ~60 TB • Compute resources: ~50 cpu/year/RAC • Provide database proxy service • Cost: $300k~$1M • Better
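
To relate the storage figures on this slide to the quoted network bandwidth, the sketch below estimates how long it would take to fill each category's storage over the link. It assumes exactly 1 Gbit/s sustained with no protocol overhead and 1 TB = 10^12 bytes; these are idealizations not stated in the slides, so real transfers would take longer.

```python
# Idealized time to fill a RAC's storage over its network link.
# Assumptions (not from the slides): 1 Gbit/s fully sustained,
# no protocol overhead, 1 TB = 10**12 bytes.

TB = 10**12
LINK_BIT_S = 1e9
SECONDS_PER_DAY = 86400

BEST_RAC_TB = 170 + 110  # robotic tape + disk, from the slide
GOOD_RAC_TB = 60         # disk only, from the slide


def fill_time_days(storage_tb: float, link_bit_s: float = LINK_BIT_S) -> float:
    """Days needed to transfer storage_tb terabytes at link_bit_s."""
    return storage_tb * TB * 8 / link_bit_s / SECONDS_PER_DAY


best_days = fill_time_days(BEST_RAC_TB)
good_days = fill_time_days(GOOD_RAC_TB)
print(f"Best RAC: {best_days:.1f} days, Good RAC: {good_days:.1f} days")
```

Even in this idealized case the fill times are measured in days to weeks, which illustrates why the "Gbit or better" requirement appears in both categories.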

  19. Other Issues • Obtaining personnel support commitment • Serious MOU structure to spell out commitment • Sharing resources with other experiments and disciplines • Emergency resource loans • Technical conflicts, such as differences in OS • Need a world-wide management structure • How do we resolve and allocate resources? • How is the priority within the experiment between physics groups determined? • How do we address issues that affect other experiments and disciplines?

  20. Proposed DØRAC Implementation Timescale • Dec. 1, 2002: Implement Prototype RAC (pRAC project) • Cluster associated IAC’s • Transfer Thumbnail data set constantly from CAC to the RAC • Implement services to IAC’s • Monitor activities • Jan. 2003: Workshop on RAC • Mar. – Aug. 2003: Establish and initiate site selection process • Mar. – Oct. 2003: Establish and negotiate MOU agreements with RAC institutes • Jan. 31, 2004: Fully deploy and activate RACs with sufficient capacity

  21. Funding… • The proposal is a good place to point back at, but • It needs to be adopted as something the collaboration wants implemented • This process must be completed swiftly (by Oct. 1?) so that we can tap into the funding agencies with a few proposals • European, South American, and Asian countries seem to already have established national policies for funding HEP computing, independent of experiments • The U.S. needs 3 or so RACs • UTA has won a ~$1.35M MRI through CSE primarily for DØRAC → Barely enough for Run IIa • More funding necessary to establish two more sites and software development support • MRI and ITR funding

  22. DØRAM Software Infrastructure (SAM-Grid) • Project to include Job and Information Management with the SAM Data Management System • Project started in 2001 as part of the PPDG collaboration to handle DØ’s expanded needs • Recently included CDF • Current SAM-Grid team includes: • Andrew Baranovski, Gabriele Garzoglio, Lee Lueking, Dane Skow, Igor Terekhov, Rod Walker (Imperial College), Jae Yu (UTA), Drew Meyer (UTA), Tomasz Wlodek, in collaboration with the U. Wisconsin Condor team • http://www-d0.fnal.gov/computing/grid

  23. Conclusions • DØ software setup deployed (38 sites w/ code development) → Must start using the setup • DØRAC Working Group has completed its work and submitted a proposal (DØ Note #3984) to the collaboration • Time for the collaboration to act upon it and start implementation • A committee to look into and evaluate the proposal w/ recommendations • Need a volunteer site as the pilot RAC site → Does not have to be a full site later on, though desirable • Need to prove implementation of concepts • Work out issues prior to full site implementations • Must write proposals to acquire funds for the next sites • UTA MRI is a good starting point but… • We need more sites in Europe, S. America, Asia and 2-3 more US sites • Need funds for software and personnel support • Need serious participation in software infrastructure development
