File sharing requirements of the physics community
• Background
• General requirements
• Visitors
• Laptops
• Software development and physics analysis
• Web services
• Summary
Marco Cattaneo - After C5 - 1st June 2001

Background
• IT divisional planning report (CERN-IT-DLO-2000-005)
• “A three person team, composed of representatives from the physics user community, PDP and IS groups should produce a concise document consolidating the existing user requirements for Data Sharing by April 2001”
• MC, Bernd Panzer-Steindel, Alberto Pace
• Produce early draft, to gather feedback from DTF and FOCUS.
• Draft URD written by M.Cattaneo
• With input from a few users in a number of experiments and roles
• Atlas, CMS, LHCb, (Aleph), IT/CO
• Physicists, Librarians, Software developers, external users
• Not yet discussed widely in EP and/or FOCUS
• Not official position of anybody!
• Not addressing sharing of bulk physics data and “group n-tuples”

Timetable
• Presentation to DTF on 28th February
• “IT should digest the information already collected”
• “More fact-finding is needed, especially on available technologies”
• Repeat the presentation for C5 (today)
• Gather feedback from implementation point of view
• Produce a more realistic, prioritized, set of requirements
• Review modified requirements at EP Forum (18th June)
• Seek approval of EP Division as a whole
• Present to FOCUS (28th June)
• Only the requirements, or also available technologies?

General requirements
• Transparent file access from any node on site
• Interactive and batch, Windows and Unix
• Same files and directory structure, same access rights, no ‘stale’ files
• “Native” access to files
• Use native OS commands, native access to files from applications
• Customisable protections
• ACLs for individuals and groups
• Modifiable by users, via scriptable tools, at file or directory granularity
• Groups can be “corporate”, user defined, overlapping
• Authentication
• Single site-wide login, remote command execution
• Transmitted to batch and scheduled jobs
• Fast and reliable
• << 1 sec for file retrieval and directory browsing, >> 99% up time
• Regular backups (including open files), restore in few minutes
• Possibility to execute unique site-wide login script
• With user, group, experiment customisation
• Source control and versioning
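
As a purely illustrative sketch of the "customisable protections" item above, the following Python models ACL entries for individuals and overlapping groups, set at file or directory granularity. Every name here (`Acl`, `grant`, `is_allowed`, the example paths, users and groups) is hypothetical and does not describe any existing CERN system. The rule that the nearest entry on the path or its parent directories wins is an assumption made for the example, not something the requirements specify.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Acl:
    """Toy access-control store (hypothetical, for illustration only)."""
    groups: dict = field(default_factory=dict)   # group name -> set of users
    entries: dict = field(default_factory=dict)  # path -> {principal: rights}

    def grant(self, path, principal, rights):
        """Attach rights (e.g. "r", "w") for a user or group to a file or directory."""
        key = str(PurePosixPath(path))
        self.entries.setdefault(key, {})[principal] = set(rights)

    def principals_of(self, user):
        """A user acts as themselves plus every group they belong to (groups may overlap)."""
        return {user} | {g for g, members in self.groups.items() if user in members}

    def is_allowed(self, user, path, right):
        """Check the path itself, then each parent directory; the nearest entry wins (assumed rule)."""
        who = self.principals_of(user)
        p = PurePosixPath(path)
        for candidate in [p, *p.parents]:
            entry = self.entries.get(str(candidate))
            if entry is not None:
                return any(right in entry.get(name, set()) for name in who)
        return False


# Example: "bob" belongs to two overlapping groups.
acl = Acl(groups={"lhcb": {"alice", "bob"}, "librarians": {"bob"}})
acl.grant("/exp/lhcb", "lhcb", "r")                 # directory-level entry for a group
acl.grant("/exp/lhcb/release", "librarians", "rw")  # narrower entry on a subdirectory
```

Because the store is plain data behind two small functions, it could be driven from scripts, which is the property the "modifiable by users, via scriptable tools" item asks for.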

Visitors
• Efficient read/write access to personal files
• Home institute files from CERN, CERN files from home institute
• Ideally, the same physical home directory (or mirror)
• Not yet possible due to network response
• Transparent read/write access to remote directory acceptable alternative
• Maximum once/day authentication at remote site
• Technology chosen at CERN should be:
• Installable also at home labs
• At reasonable cost
• Ideally as single HEP-wide solution
• Allow simple porting of scripts and other tools to home-lab file system
• Allow simple mirroring of selected directories to home-lab file system

Laptops
• Possibility to install authentication and file-sharing software on laptop
• Access to CERN files when on CERN intranet
• Access to home-lab files when on home-lab intranet
• without major reconfiguration of laptop
• Automatic synchronisation of selectable sets of files upon connection to CERN intranet
• Allow laptop user to continue working seamlessly in CERN-like environment when not connected.
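
To make the "automatic synchronisation of selectable sets of files" item more concrete, here is a minimal Python sketch of the planning step only: given the user's selection and a listing of each side, decide which files to copy in which direction. The function name `plan_sync`, the glob-pattern selection format, and the "newer copy wins" rule are all assumptions for illustration; the requirements do not prescribe a conflict policy, and a real tool would also have to handle simultaneous edits and deletions.

```python
from fnmatch import fnmatch


def plan_sync(selected, laptop, server):
    """Return (to_laptop, to_server) lists of paths to copy.

    selected: glob patterns chosen by the user (hypothetical selection format)
    laptop, server: dicts mapping relative path -> modification time in seconds
    Assumed rule: the newer copy wins; a file missing on one side is copied over.
    """
    to_laptop, to_server = [], []
    for path in sorted(set(laptop) | set(server)):
        if not any(fnmatch(path, pattern) for pattern in selected):
            continue  # not part of the selected set, leave untouched
        l_time = laptop.get(path)
        s_time = server.get(path)
        if l_time is None or (s_time is not None and s_time > l_time):
            to_laptop.append(path)
        elif s_time is None or l_time > s_time:
            to_server.append(path)
        # equal timestamps: nothing to do
    return to_laptop, to_server


# Example listings (made-up paths and timestamps).
laptop = {"notes/talk.tex": 200, "analysis/cuts.py": 100, "tmp/scratch.dat": 50}
server = {"notes/talk.tex": 150, "analysis/cuts.py": 300, "analysis/plots.py": 10}
to_laptop, to_server = plan_sync(["notes/*", "analysis/*"], laptop, server)
```

In this example `tmp/scratch.dat` is ignored because it is outside the selected set, which is what "selectable" is taken to mean here.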

Software development (including for physics analysis)
• Requirements for code repository
• Experiment wide read access from anywhere in the world
• Write access rights based on individual and/or group identification
• Requirements for software build and release
• Highly efficient, native, source code access by build process
• Not necessarily shared
• But must be possible to install binaries on shared file system
• Automatic submission of builds on other platforms
• And other sites?
• Access to same files via multiple paths
• aka Unix soft links
• Requirements for release areas
• Site-wide, native, access to source code and binaries
• Compile, link against released software, load shareable libraries at run-time
• Site-wide access to both general purpose and experiment specific software
• On all interactive and batch nodes
• World-wide access to official release areas
• Via world-wide access to CERN file system
• Especially for “nightly builds” or rapidly evolving software
• Via automatic mirroring, or distribution kits
• World wide access to automatic code documentation
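
The "access to same files via multiple paths" item refers to Unix soft links. The short Python sketch below shows the behaviour being asked for, using a temporary directory: a release directory is reachable both under its own name and under an alias. The directory names `v1r2` and `pro` are hypothetical examples, and the code assumes a Unix-like system (creating symbolic links on Windows may need extra privileges).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    release = os.path.join(root, "v1r2")  # hypothetical release directory
    os.mkdir(release)
    with open(os.path.join(release, "README"), "w") as f:
        f.write("release v1r2\n")

    # "pro" is a hypothetical alias pointing at the current release.
    alias = os.path.join(root, "pro")
    os.symlink("v1r2", alias)  # relative target, resolved next to the link

    # The same file is now visible through both paths.
    with open(os.path.join(alias, "README")) as f:
        via_alias = f.read()
    same_file = os.path.samefile(
        os.path.join(release, "README"),
        os.path.join(alias, "README"),
    )
```

Repointing the alias at a newer release directory would let build scripts keep a stable path while the underlying files change.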

Web access
• Possibility to make any file accessible from web, by simple manipulation
• Possibility to make any directory accessible from the web
• With or without directory browsing
• Possibility to restrict access
• With similar range of protection categories as for home directories
• Ideally with same authentication mechanism
• Possibility to have multiple authors
• Possibility to maintain web sites remotely
• i.e. from outside CERN for site hosted at CERN
• Possibility to edit sites from any platform
• i.e. from both Windows and Linux
• Especially important for sites with multiple authors

Summary
• Site wide, cross platform (Unix/Windows) data sharing
• Application transparent file access, including file manipulation with native commands
• Access control from all platforms, with authentication based on groups as well as individuals.
• Access from outside CERN
• Document aggregation (“directories”) to allow sharing of a large number of (small) files as a single object.
• Fast and reliable file access
• Source control and automated versioning