
Using the CVMFS for Distributing Data Analysis Applications for Fermilab Scientific Programs



  1. Using the CVMFS for Distributing Data Analysis Applications for Fermilab Scientific Programs. A. Norman & A. Lyon, Fermilab Scientific Computing Division

  2. CVMFS for Fermilab Science • Not going to talk about the internals of CVMFS • See J.Blomer et al (CHEP 2012) • Will cover key design choices for running in the FNAL IF environment • Server Infrastructure • FNAL Based (for LAN ops) • OSG Based (for WAN ops) • Case Study of Migration to CVMFS Operations: • NOvA Experiment's code base • Operational Experiences

  3. Goal in Deploying CVMFS • Introduce a new infrastructure to distribute input files to analysis jobs on the FNAL LAN • Something that was NOT susceptible to the failure modes seen with central (NFS) disk • Meet the security needs of the lab environment • Avoid overloading the central (BlueArc) disk

  4. Requirements • Scalable to the level of 10k's of concurrent jobs • Compatible with the experiments' analysis suites • Able to distribute files up to 2-4 GB in size (static data tables, flux ntuples, event template libraries) • Centrally managed • Compartmentalized failure modes (i.e. experiments don't adversely affect each other) • Secure access for experiment librarians to update/maintain code repositories • Extensible to non-local file distribution

  5. 2 Servers for 2 Environments
  Oasis server (hosted by the OSG-GOC): • Designed for WAN operation across all of the OSG • Designed to aggregate VO's into a single mount point to simplify site management and VO software availability • Scalable to meet the concurrency needs of the OSG • Sized to accommodate diverse code bases • Client integrated into the general OSG client software bundles
  FNAL server (hosted by the Scientific Computing Division): • Designed for LAN operations at FNAL • Scalable to meet the concurrency needs of FermiGrid • Sized to accommodate large experiment code bases • Designed to compartmentalize the experiments' impact on each other (avoid BlueArc-style failures) • Designed to meet the data preservation needs of Tevatron Run II

  6. FNAL Infrastructure • Redundant Stratum 1 servers for aggregating the individual repositories • Separate repository servers for each experiment • Both the FNAL cfs repos and the OSG oasis repo are available to the client and buffered by FNAL squids • The CVMFS Master's role is ONLY to sign the repos

  7. FNAL Infrastructure • Total outbound bandwidth can be scaled through additional Stratum 1's and squid proxies • Each experiment is insulated from failures in the other repositories • This includes lock-outs due to repo updates • Permits scalability through the addition of repo servers
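  A minimal sketch of the client-side configuration this layout implies, assuming hypothetical squid and Stratum 1 host names (the real FNAL service names are not given on the slides), and using present-day CVMFS client keys, which may differ slightly from the 2013-era syntax:

      # /etc/cvmfs/default.local -- worker-node client (hypothetical host names)
      CVMFS_REPOSITORIES=novacfs.fnal.gov,oasis.opensciencegrid.org
      CVMFS_HTTP_PROXY="http://squid1.fnal.gov:3128|http://squid2.fnal.gov:3128"
      CVMFS_QUOTA_LIMIT=20000        # local cache limit in MB

      # /etc/cvmfs/domain.d/fnal.gov.local -- fail over between the redundant Stratum 1s
      CVMFS_SERVER_URL="http://stratum1a.fnal.gov:8000/cvmfs/@fqrn@;http://stratum1b.fnal.gov:8000/cvmfs/@fqrn@"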

  8. OSG Infrastructure • Central repository for all VO's • Each VO has a directory hierarchy available under the common /cvmfs/oasis.opensciencegrid.org/ base path, which maps to working areas within the repository servers • This allows for common/shared code across VO's (e.g. the fermilab common products from ouser.fermilab are transparently accessible to the nova and g-2 VO's) • Each VO has an independent logon to the repository server and can affect only files in its area • Updates/publication of the repository are unified (the update process rebuilds all VO's areas) [Figure: published view of the VO's vs. private view of the VO sandboxes]
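  For illustration, a sketch of how the shared base path looks from a worker node; only ouser.fermilab is named on the slide, so the VO subdirectory names below are assumptions:

      # one directory hierarchy per VO under the common mount point
      ls /cvmfs/oasis.opensciencegrid.org
      #   fermilab/   nova/   gm2/   ...
      # a nova job can reference the common fermilab area through the same base path
      ls /cvmfs/oasis.opensciencegrid.org/fermilab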

  9. Code Repository Structure • Transitioning to a CVMFS infrastructure required that both the experiment code base and the external product dependencies be: • Runnable from a read-only file system • Fixing improper transient/temp files (made within module homes & within user areas) • Ownership/access corrections and enforcement • Fully re-locatable at runtime or build time • Removal of absolute path references, assumptions about directory hierarchies, relocation of referenced files, etc. • External products need to resolve link-time dependencies correctly
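  As an example of the kind of change involved (the product name and paths here are hypothetical, not taken from the NOvA code base), a hard-coded central-disk path in a setup script becomes a reference relative to the distribution base:

      # before: absolute BlueArc path baked into a setup script (not relocatable)
      export GENIE_XSEC_DIR=/grid/fermiapp/nova/externals/genie_xsec/v2_6_4/data

      # after: resolved against the detected distribution base, so the same script
      # works from central disk, /cvmfs/novacfs.fnal.gov, or the Oasis mount
      export GENIE_XSEC_DIR=$CVMFS_DISTRO_BASE/externals/genie_xsec/v2_6_4/data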

  10. Code Repository Structure
  Experiment code: • Experiment specific, daily changes • Represents many different job classes • Many small files • Required for build systems
  External products: • Seldom change • Used by the framework and different job classes • Large storage footprint • Very small run-time footprint
  Static data files: • Experiment specific but rarely change • Common across many job classes • Large file sizes (Gig+)

  11. NOvA Code Repository Structure • $CVMFS_DISTRO_BASE • $CVMFS_DISTRO_BASE/nova • $CVMFS_DISTRO_BASE/externals • $CVMFS_DISTRO_BASE/novasvn • Frozen releases for all active analyses • Development snapshots (7 days) • SRT release management • Full ART suite distributions for compatibility with all frozen NOvA releases • System compatibility libraries (SLF5/6, Ubuntu) • Merge-able with other UPS trees (common tools) • Neutrino flux file libraries (200 GB) • Neutrino interaction templates (140 GB) • Database support files
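  One plausible reading of how these bullets map onto the subtrees; the slide lists the paths and the content separately, so this grouping is an assumption:

      ls $CVMFS_DISTRO_BASE
      #   novasvn/    frozen releases, 7-day development snapshots, SRT release management
      #   externals/  ART suite distributions, SLF5/6 + Ubuntu compatibility libraries, UPS product trees
      #   nova/       flux file libraries (200 GB), interaction templates (140 GB), database support files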

  12. CVMFS Distribution/Release Deployment • Goal is for everything to be deployable as "tar"-style archives to specific boxes in the distribution (i.e. products, releases, etc.) • Design of the repository allows for independent updating of products & releases • Also allows for patches to individual trees within boxes • Deployment model uses arch-specific build machines • Builds specific tagged/frozen releases for the major architectures • Auto (nightly) builds of experiment-specific development
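  A minimal sketch of packaging one such archive on an arch-specific build machine; the release tag, build path, and naming convention are hypothetical:

      RELEASE=S13-06-18                  # hypothetical frozen-release tag
      ARCH=slf6                          # target architecture of this build machine
      tar -C /build/releases -cf nova_${RELEASE}_${ARCH}.tar ${RELEASE}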

  13. FNAL CVMFS Deployments • Trivial deployment • The librarian can copy tarballs to a staging area on the repository node • Kick off an untar operation of the archives to the proper top node in the area to be published • Initiate a rebuild of the CVMFS catalogs • The rebuild process returns a status to allow for error checking by the librarian • Deployment/release archives are 2-400 GB per archive [Diagram: user-initiated flow from the release tar to the staging volume on the experiment repository node, untar to the destination tree on the publication volume, CVMFS catalog update, and publication to the Stratum 1]
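  A sketch of this flow in shell form; the host name, repository name, and paths are hypothetical, and the catalog rebuild is shown with the generic cvmfs_server tooling, which may differ from the scripts actually used at FNAL:

      # copy the release archive to the staging volume on the repository node
      scp nova_S13-06-18_slf6.tar repo-node.fnal.gov:/staging/

      ssh repo-node.fnal.gov '
        cvmfs_server transaction novacfs.fnal.gov &&                        # open the repo for writing
        tar -C /cvmfs/novacfs.fnal.gov/nova -xf /staging/nova_S13-06-18_slf6.tar &&
        cvmfs_server publish novacfs.fnal.gov                               # rebuild catalogs and publish
      ' && echo "deployment OK" || echo "deployment FAILED"                 # status check by the librarian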

  14. OSG CVMFS Deployments • Dedicated staging space is limited • Can not stage large files (i.e. 200 GB archives) • Instead the librarian must do a streaming transfer/unpack of the tarballs directly into an area that is mirrored to the actual repository:

      cat nova_release.tar | gsissh oasis-opensciencegrid.org "tar -C <deploy_path> -xf -"

  • Initiate a sync of the mirror area to the real server • The master catalog is rebuilt • Note: a rebuild on oasis affects ALL VO's; repo maintenance cannot be re-triggered for your VO until existing builds are done • Deployment/release archives are 2-400 GB per archive [Diagram: release tar streamed to the Oasis login node, untarred into the mirror volume, synced to the Oasis repository publication volume, CVMFS catalog update, and publication to the Stratum 1]

  15. Distribution Setup • The entire NOvA offline distribution can be set up (anywhere) using:

      function setup_novaoffline {
        export CVMFS_DISTRO_BASE=/cvmfs/oasis.opensciencegrid.org
        export EXTERNALS=$CVMFS_DISTRO_BASE/externals
        export CODE=$CVMFS_DISTRO_BASE/novasvn
        source $CODE/srt/srt.sh
        source $CODE/setup/setup_novasoft.sh "$@"
      }

  • The CVMFS base path is detected using a block like:

      distrolist=(/cvmfs/novacfs.fnal.gov /cvmfs/oasis.opensciencegrid.org /nusoft/app/externals … )

      for distro in ${distrolist[*]}
      do
        if [ -r $distro/setup ]
        then
          export CVMFS_DISTRO_BASE=$distro
          return 0
        fi
      done
      return 1

  This gives hierarchical support for the different CVMFS repos as well as central disk (BlueArc) legacy distributions.
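  For example (the wrapper file name is hypothetical; any arguments are simply forwarded to setup_novasoft.sh by the function above):

      source setup_novaoffline.sh    # hypothetical script holding the two blocks above
      setup_novaoffline              # sets up externals + NOvA offline from whichever base path was found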

  16. Job Submission/Running • After initializing the CVMFS-resident code distribution, running analysis is transparent to the end user • All search and library load paths are resolved properly • Base release files are picked up from the CVMFS distro • Test release files in local storage (overriding the base release) • Configuration files are pulled from the base or test releases properly • Output is generated in the proper writable area • Scripts for data handling are stored in CVMFS to simplify offsite copy-back
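  A minimal sketch of what a grid job script might look like under this scheme, assuming the setup function from the previous slide; the fcl configuration, output name, and copy-back destination are hypothetical:

      #!/bin/bash
      source ./setup_novaoffline.sh        # hypothetical file containing the setup blocks above
      setup_novaoffline                    # code and static data now come from CVMFS

      # run the ART executable against a hypothetical job configuration;
      # output lands in the job's local (writable) scratch directory
      nova -c prodcosmics_overlay.fcl -o cosmics_overlay.root -n 100

      # copy-back uses the data handling scripts kept in CVMFS; destination path is hypothetical
      ifdh cp cosmics_overlay.root /pnfs/nova/scratch/users/$USER/cosmics_overlay.root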

  17. First Run Cache Footprint [Plots: CVMFS cache footprint during a first run of Test Monte Carlo Job #1 and Test Monte Carlo Job #2]

  18. Correlation with NOvA Job Flow [Plot: cache growth annotated with the stages of the job: ART library load, ART module load (readout sim), ROOT geometry load, Geant init, Geant cross-section loading, Geant detector sim, readout simulation]

  19. Repeat Run Cache Size • Repeated runs of the same (or similar) jobs start with a fully populated cache and take no startup penalty

  20. Repeat Run Cache Size • Cache growth occurs as the simulation enters the detector simulation (detsim) stage and loads different interaction tables
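  One way to watch this cache behaviour on a client node (the exact output format varies between CVMFS versions):

      cvmfs_config stat -v novacfs.fnal.gov   # per-repository cache and network statistics
      du -sh /var/lib/cvmfs                   # total size of the local client cache (default location)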

  21. Startup Overhead • Measured the job startup overhead using jobs which generate a single Monte Carlo event • This is the minimum ratio of work to overhead • Average job time (empty cache): 241.8 seconds • Average job time (pre-loaded cache): 279.6 seconds • Variation in the processing time per event completely dominates the measurement [Histogram: CVMFS job run time over 3343 trials; average event generation time = 198 s; generation time for 100 events shown in minutes]
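  A sketch of how the empty- vs. pre-loaded-cache comparison can be reproduced on a single node; wiping the cache requires root, and the job script name is hypothetical:

      sudo cvmfs_config wipecache                                      # start from an empty local cache
      /usr/bin/time -f "cold cache: %e s" ./run_single_event_mc.sh     # first run pays the CVMFS startup cost
      /usr/bin/time -f "warm cache: %e s" ./run_single_event_mc.sh     # repeat with the cache now populated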

  22. FermiGrid Running • Large-scale running against the FNAL CVMFS has been successful • Demonstrated peak concurrent running of 2000 NOvA jobs using CVMFS for program + static data delivery • 2000 concurrent jobs (from a single submission) is a limitation of the queue system (not CVMFS) and is being addressed [Plot: 2000 concurrent CVMFS-based jobs running on FermiGrid]

  23. OSG Oasis Running • Started a pilot campaign with OSG to migrate generation of NOvA Monte Carlo to the Open Science Grid using Oasis-hosted CVMFS • Phase 1: Generate 1 million cosmic ray interaction overlay events using specific sites • Phase 2: Generate 16 million cosmic ray interaction overlay events using general OSG resources • Phase 3: Reconstruct 16 million cosmic ray events using output from Phase 2

  24. Phase 1: Results • Generated 934,500 events across 10 OSG sites • Oasis CVMFS was used for distribution of all job code • At peak had ~1250 concurrent jobs using Oasis • Used a combination of NOvA-dedicated sites and other general OSG sites

  25. Summary • Fermilab has designed, implemented and deployed a new data handling infrastructure based on CVMFS • Major legacy code bases have been ported/restructured to work with the system (including the NOvA & g-2 experiments) • The new system is compatible with both the Fermilab and OSG grid environments • Large-scale production has been successfully initiated using this system
