
File sharing requirements of the physics community

This document outlines the file sharing requirements of the physics community, including transparent file access, customizable protections, authentication, backups, and access for visitors, laptops, software development, and web services.



Presentation Transcript


  1. File sharing requirements of the physics community
     • Background
     • General requirements
     • Visitors
     • Laptops
     • Software development and physics analysis
     • Web services
     • Summary
     Marco Cattaneo - DTF - 28th February 2001

  2. Background
     • IT divisional planning report (CERN-IT-DLO-2000-005)
       • “A three person team, composed of representatives from the physics user community, PDP and IS groups should produce a concise document consolidating the existing user requirements for Data Sharing by April 2001”
       • MC, Bernd Panzer-Steindel, Alberto Pace
       • Produce early draft, to gather feedback from DTF and FOCUS.
     • Draft URD written by M. Cattaneo
       • With input from a few users in a number of experiments and roles
         • Atlas, CMS, LHCb, (Aleph), IT/CO
         • Physicists, Librarians, Software developers, external users
       • Not yet discussed widely in EP and/or FOCUS
       • Not official position of anybody!
     • Not addressing sharing of bulk physics data and “group n-tuples”

  3. General requirements
     • Transparent file access from any node on site
       • Interactive and batch, Windows and Unix
       • Same files and directory structure, same access rights, no ‘stale’ files
     • “Native” access to files
       • Use native OS commands, native access to files from applications
     • Customisable protections
       • ACLs for individuals and groups
       • Modifiable by users, via scriptable tools, at file or directory granularity
       • Groups can be “corporate”, user defined, overlapping
     • Authentication
       • Single site-wide login, remote command execution
       • Transmitted to batch and scheduled jobs
     • Fast and reliable
       • << 1 sec for file retrieval and directory browsing, >> 99% up time
       • Regular backups (including open files), restore in few minutes
     • Possibility to execute unique site-wide login script
       • With user, group, experiment customisation
     • Source control and versioning
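
The customisable-protection requirements above (ACLs for individuals and groups, file or directory granularity, overlapping groups) can be illustrated with a minimal Python sketch. It is not taken from the presentation: the group names, paths, rights letters, and the "nearest entry wins" lookup rule are all assumptions made for illustration, and a real file system would define its own semantics.

```python
from pathlib import PurePosixPath

# Hypothetical groups: one "corporate" (experiment-wide), one user defined.
# They overlap: bob belongs to both.
GROUPS = {
    "lhcb": {"alice", "bob"},
    "alice-friends": {"bob", "carol"},
}

# Hypothetical ACL table: entries may be attached to directories or to files.
ACLS = {
    "/home/alice": {"user:alice": {"r", "w"}},
    "/home/alice/shared": {
        "user:alice": {"r", "w"},
        "group:alice-friends": {"r"},
    },
    "/home/alice/shared/notes.txt": {
        "user:alice": {"r", "w"},
        "group:lhcb": {"r", "w"},
    },
}


def principals(user):
    """Return the user principal plus every group the user belongs to."""
    result = {f"user:{user}"}
    result.update(f"group:{g}" for g, members in GROUPS.items() if user in members)
    return result


def effective_acl(path):
    """Assumed rule: the nearest ACL (the file itself, else closest ancestor) applies."""
    p = PurePosixPath(path)
    for candidate in (p, *p.parents):
        acl = ACLS.get(str(candidate))
        if acl is not None:
            return acl
    return {}


def allowed(user, path, right):
    """True if any of the user's principals holds the right in the effective ACL."""
    acl = effective_acl(path)
    return any(right in acl.get(pr, set()) for pr in principals(user))
```

In this sketch, `allowed("carol", "/home/alice/shared/plot.ps", "r")` is granted through the directory-level entry for the user-defined group, while the file-level entry on `notes.txt` replaces the directory entry for that one file.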

  4. Visitors
     • Efficient read/write access to personal files
       • Home institute files from CERN, CERN files from home institute
       • Ideally, the same physical home directory (or mirror)
         • Not yet possible due to network response
       • Transparent read/write access to remote directory acceptable alternative
         • Maximum once/day authentication at remote site
     • Technology chosen at CERN should be:
       • Installable also at home labs
       • At reasonable cost
       • Ideally as single HEP-wide solution
     • Allow simple porting of scripts and other tools to home-lab file system
     • Allow simple mirroring of selected directories to home-lab file system

  5. Laptops
     • Possibility to install authentication and file-sharing software on laptop
       • Access to CERN files when on CERN intranet
       • Access to home-lab files when on home-lab intranet
       • without major reconfiguration of laptop
     • Automatic synchronisation of selectable sets of files upon connection to CERN intranet
       • Allow laptop user to continue working seamlessly in CERN-like environment when not connected.
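
The "automatic synchronisation of selectable sets of files" requirement above could look roughly like the following Python sketch. Everything here is an assumption for illustration: the function name `sync_selected`, the two-directory model (a laptop copy and a site copy), and the "newer modification time wins" rule. The presentation does not specify a mechanism, and this sketch ignores conflicts and deletions.

```python
import shutil
from pathlib import Path


def sync_selected(local_root, remote_root, selected):
    """Two-way sync of the selected subdirectories; the newer file wins.

    Returns a list of (relative path, direction) pairs for the files copied.
    """
    local_root, remote_root = Path(local_root), Path(remote_root)
    copied = []
    for sub in selected:
        # Collect every file found under this subdirectory on either side.
        rel_files = set()
        for root in (local_root, remote_root):
            base = root / sub
            if base.is_dir():
                rel_files.update(
                    p.relative_to(root) for p in base.rglob("*") if p.is_file()
                )
        for rel in sorted(rel_files):
            local, remote = local_root / rel, remote_root / rel
            # A missing file is treated as older than any existing file.
            local_m = local.stat().st_mtime if local.exists() else -1.0
            remote_m = remote.stat().st_mtime if remote.exists() else -1.0
            if local_m == remote_m:
                continue
            src, dst = (local, remote) if local_m > remote_m else (remote, local)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append((str(rel), "to_remote" if src is local else "to_local"))
    return copied
```

Directories not named in `selected` are never touched, which is one way to model the "selectable sets" wording; a call such as `sync_selected(laptop_dir, site_dir, ["analysis"])` would be triggered by whatever detects the connection to the intranet.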

  6. Software development (including for physics analysis)
     • Requirements for code repository
       • Experiment wide read access from anywhere in the world
       • Write access rights based on individual and/or group identification
     • Requirements for software build and release
       • Highly efficient, native, source code access by build process
         • Not necessarily shared
         • But must be possible to install binaries on shared file system
       • Automatic submission of builds on other platforms
         • And other sites?
       • Access to same files via multiple paths
         • aka Unix soft links
     • Requirements for release areas
       • Site-wide, native, access to source code and binaries
         • Compile, link against released software, load shareable libraries at run-time
       • Site-wide access to both general purpose and experiment specific software
         • On all interactive and batch nodes
       • World-wide access to official release areas
         • Via world-wide access to CERN file system
           • Especially for “nightly builds” or rapidly evolving software
         • Via automatic mirroring, or distribution kits
       • World wide access to automatic code documentation

  7. Web access
     • Possibility to make any file accessible from web, by simple manipulation
     • Possibility to make any directory accessible from the web
       • With or without directory browsing
     • Possibility to restrict access
       • With similar range of protection categories as for home directories
       • Ideally with same authentication mechanism
     • Possibility to have multiple authors
     • Possibility to maintain web sites remotely
       • i.e. from outside CERN for site hosted at CERN
     • Possibility to edit sites from any platform
       • i.e. from both Windows and Linux
       • Especially important for sites with multiple authors

  8. Summary
     • Site wide, cross platform (Unix/Windows) data sharing
     • Application transparent file access, including file manipulation with native commands
     • Access control from all platforms, with authentication based on groups as well as individuals.
     • Access from outside CERN
     • Document aggregation (“directories”) to allow sharing of a large number of (small) files as a single object.
     • Fast and reliable file access
     • Source control and automated versioning
