Automated software packaging and installation for the ATLAS experiment
Simon George, Royal Holloway, University of London
Christian Arnault, LAL Orsay; Michael Gardner, RHUL; Roger Jones, University of Lancaster; Saul Youssef, Boston University
e-Science All Hands Meeting, Nottingham, 2-4 September 2003
S.George@rhul.ac.uk
ATLASexperiment.org
Introduction
• This talk is about packaging, distribution and installation for a large software project
• It is essential because
  • The project computing resources are distributed across 140 institutes, all of which want to use the software
  • We want to be able to use Grid resources that do not have locally managed installations of the software
  • Our working model also requires the ability to deploy user code that is not part of an official distribution
• I’ll describe the process developed and the tools used.
Simon George RHUL
Contents
• ATLAS and its software
• Requirements
• Tools and formats
• Metadata
• Naming conventions
• Creating and installing the kits
• Conclusions and outlook
The ATLAS Experiment
• A Particle Physics experiment at the Large Hadron Collider, CERN
• 1600 physicists, 140 institutes, 6 continents
• Studies include
  • search for the origin of mass
  • excess of matter over antimatter in the universe
  • evidence for Supersymmetry
  • other new physics
ATLAS software suite
• Simulation, data processing and analysis
• 500 “packages”, 50 external, inter-dependent
• 100s of developers and 1000s of users in 140 institutes
• One release build is 2.5 GB of files
• It takes 10 hours to build
• Build types and frequencies
  • Production release 3-4 times per year
  • Developer release every 2-3 weeks
  • Nightly build of snapshot
• Build configuration permutations
  • Optimised, debug and sometimes also profile builds
  • Two platforms (RedHat 7.3 on Intel x86, Solaris 8 on SPARC)
  • One or more compilers (gcc 3.2)
• Config. management, build and install handled by CMT
• So not a trivial task to package, distribute and install
CMT (www.cmtsite.org)
• Configuration management tool
  • Concerned with setting up user’s environment to build and run software
  • A large project needs the help of tools for this
• CMT helps to define and impose conventions
  • For naming packages, files, directories
  • For describing their relationships
  • In other words, package metadata
  • This is the key feature exploited for this project
• Useful features to manage sub-projects, dependencies
• A broad user base, especially in Particle Physics and Astronomy experiments
Packaging Requirements
• Three types of kit required
  • Binary kit
    • Pre-built executables, libraries and configuration files needed to run the software
    • Used for data challenges, production, basic users
  • Developer’s kit
    • Binary kit plus headers, libraries and configuration needed to build against it
    • For developers and most users
  • Full source kit
    • To rebuild from scratch on binary-incompatible platforms
    • When local source code browsing is required
• For each permutation of platform, config, compiler
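To give a feel for how many kits "each permutation" implies, here is a minimal Python sketch that enumerates the variants. The label strings are hypothetical, invented for this example from the platforms, build configurations and compiler listed on the software suite slide; they are not names used by the ATLAS tools, and the sketch takes the last bullet literally by applying it to all three kit types.

```python
from itertools import product

# Hypothetical labels, chosen only for illustration.
KIT_TYPES = ["binary", "developer", "source"]
PLATFORMS = ["redhat73-x86", "solaris8-sparc"]
BUILD_CONFIGS = ["opt", "dbg"]
COMPILERS = ["gcc32"]


def kit_variants(kit_types=KIT_TYPES, platforms=PLATFORMS,
                 configs=BUILD_CONFIGS, compilers=COMPILERS):
    """Return one (kit type, platform, config, compiler) tuple per kit."""
    return list(product(kit_types, platforms, configs, compilers))


variants = kit_variants()
```

With these illustrative lists, `len(variants)` is 3 x 2 x 2 x 1 = 12 kits per package, which is one reason the packaging has to be automated.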
Installation requirements
• For large facilities: unattended, push button deployment
• For normal user: relocatable, no root access
• Automatic configuration
• Updates, multiple versions
• Avoid duplication and unnecessary downloads
• Possibility to take subset of software
• Self contained, apart from …
  • Prerequisite software: modest list and automatic check
• Set up user’s environment (e.g. LD_LIBRARY_PATH)
• Reversible: uninstall
• Install and work disconnected from network, e.g. install onto a laptop from CDs
Constraints
• ATLAS software is divided into sub-projects
  • Currently ATLAS and Gaudi
  • Could be more in the future, e.g. split ATLAS into simulation and reconstruction
  • Each sub-project consists of many packages
• External/Internal package distinction
  • Internal packages are developed and managed within the ATLAS software project
  • External packages are the opposite, e.g. software from the Particle Physics community, public domain software or commercial products
• Interface packages for externals
  • Pure metadata package
  • Actual external software can be installed anywhere, any way
  • Gives it the outside appearance of an internal package
Constraints, continued
• Existing use of CMT
  • Package structure already in place
  • Metadata provided by packages or implied by default policies is already enough for automated packaging
• Problems
  • ATLAS software is written by a large community with mixed levels of experience
  • All such software projects will have small flaws introduced in each release
  • These must be worked around when they affect the packaging
  • For example, one problem of particular relevance to packaging and installation is cyclic dependencies
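To illustrate the cyclic dependency problem, here is a minimal Python sketch of cycle detection by depth-first search. It is not how CMT does it; the `uses` mapping (package name to the packages it uses) and the function name `find_cycle` are assumptions made for this example.

```python
def find_cycle(uses):
    """Return one dependency cycle as a list of package names, or None.

    `uses` maps each package name to the packages it uses.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(pkg):
        colour[pkg] = GREY
        path.append(pkg)
        for dep in uses.get(pkg, ()):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: a cycle closes here.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[pkg] = BLACK
        return None

    for pkg in uses:
        if colour.get(pkg, WHITE) == WHITE:
            found = visit(pkg)
            if found:
                return found
    return None
```

For example, `find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]})` reports the path A, B, C, A, while an acyclic graph gives `None`. The recursion depth is bounded by the longest dependency chain, which should be acceptable for a few hundred packages.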
Packaging: starting point
• One kit per package
  • Follow existing granularity
• Separate metadata and payload
  • Two parts to each kit
• Performed by librarian as integral part of release procedure
• Distribution by web or distributed filesystem (e.g. AFS)
Tools used
• CMT
  • Define and impose conventions on packages
  • Query the metadata needed for packaging
• Pacman
  • Metadata format
  • Tool used to manage kit installation
• Tar and RPM
  • Payload format: the package itself
• “Deployment tools” shell scripts
  • Construct the kits using CMT
  • Control location of Pacman cache and distribution
  • Post-installation configuration
Overview of process and tools
[Diagram. Roles: librarian, local s/w manager, developer. Components: CMT, deployment tools, Pacman, web server or AFS, local computers. Steps: create kits, fetch packages, install and configure, configure environment, compile and link own library, run software.]
Pacman (http://physics.bu.edu/~youssef/pacman)
• A package manager
• Packager defines how the software should be fetched, installed, configured, updated, in a “Pacman” file
  • The package itself can be in any format as that file is separate
• A directory of these files is known as a cache, usually available on the web
• Pacman tool is used to install the software
• Pacman’s feature list is a good match to the requirements for installation
• Already used by several Particle Physics and Grid projects
Package distribution format
• Tar vs. RPM
  • Both can be made relocatable
• Feature set
  • Tar has a simple feature set but is complementary to CMT and Pacman
  • RPM overlaps with CMT and Pacman, e.g. RPM also handles dependencies and prerequisites
• Platforms
  • RPM is only widely used on Linux, while tar is standard on pretty much any Unix
• Annoyances
  • Default RPM database needs root access to write to it
  • There are workarounds for this, but they are not pretty
• Conclusion
  • Decided to use tar, but retained RPM as an option
Metadata
• For each package
  • Other packages it uses (dependencies)
  • Location of constituents
    • Applications and libraries
    • Header files
    • Run time/config files
    • CMT requirements file
• External packages
  • Pure metadata “glue” packages
  • Just define paths to export
• All defined in CMT requirements files
  • or implied by default conventions of ATLAS
• Can be queried through cmt
  • cmt show uses
  • cmt show macro <package>_export_paths
Naming and structure
• Package naming convention
  • Packages in a sub-project: <package name>-<sub-project release id>
  • External packages: <package name>-<version id>
  • These names are used when expressing the inter-package dependencies
• Directory structure within each kit
  • <sub-project>/<release-id>/InstallArea/
    • Contains the sub-directories bin, lib, include, share
  • <sub-project>/<release-id>/<package>/<version>/cmt/
    • Contains the configuration management files
  • <external-package>/
    • Assumed to have their own internal structure for versions and builds
• This is designed to support coexistence of:
  • Different versions of every piece of software
  • Different binary versions (platform and build config)
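The naming and directory conventions above can be written down as a few small helpers. This is a minimal Python sketch; the function names are hypothetical, and the example values (ExamplePkgA, release 6.5.0, tag ExamplePkgA-01-07-02, sub-project ATLAS) are placeholders in the style of the Examples slide rather than a real kit. Whether the <version> directory is spelled exactly like the tag is an assumption.

```python
from pathlib import PurePosixPath


def internal_kit_name(package, release_id):
    """Name of a kit for a package inside a sub-project."""
    return f"{package}-{release_id}"


def external_kit_name(package, version_id):
    """Name of a kit for an external package."""
    return f"{package}-{version_id}"


def install_area(sub_project, release_id):
    """Directory holding the bin, lib, include and share sub-directories."""
    return PurePosixPath(sub_project) / release_id / "InstallArea"


def cmt_dir(sub_project, release_id, package, version):
    """Directory holding a package's configuration management files."""
    return PurePosixPath(sub_project) / release_id / package / version / "cmt"
```

For instance, `internal_kit_name("ExamplePkgA", "6.5.0")` gives `ExamplePkgA-6.5.0`, and `install_area("ATLAS", "6.5.0")` gives `ATLAS/6.5.0/InstallArea`. Because the release id is part of both the name and the path, two releases can sit side by side under the same installation directory.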
Examples
CMT requirements file:
    package ExamplePkgA
    author A. Person <ap@cern.ch>
    use ExamplePkgB
    use ExampleExtPkg
    library ExamplePkgA *.cxx
    apply_pattern component_library
    apply_pattern declare_runtime
Annotations:
• package, author: package name and author
• use: inter-package dependencies
• library: instruction to build a library from source files
• apply_pattern component_library: type of library to build, implies library file names
• apply_pattern declare_runtime: default location implied
Pacman file:
    description='Package ExamplePkgA-01-07-02 in release 6.5.0'
    url='http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO'
    source='../dist'
    download = { '*':'ExamplePkgA-6.5.0.tar.gz' }
    depends = [ 'ExamplePkgB-6.5.0', 'ExampleExtPkg-v1' ]
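As a rough illustration of how a Pacman file like the one above could be produced from package metadata, here is a minimal Python sketch. It only mirrors the five fields shown in the slide's example; the function name `pacman_file` and its parameters are hypothetical, and nothing here should be read as the actual deployment scripts or as a full description of the Pacman file syntax.

```python
def pacman_file(package, tag, release, url, depends, source="../dist"):
    """Return Pacman file text with the fields used in the slide's example.

    `depends` is a list of kit names that already follow the naming
    convention, e.g. 'ExamplePkgB-6.5.0' or 'ExampleExtPkg-v1'.
    """
    dep_list = ", ".join(f"'{d}'" for d in depends)
    lines = [
        f"description='Package {package}-{tag} in release {release}'",
        f"url='{url}'",
        f"source='{source}'",
        # '*' is copied from the example; the tar file is named after the kit.
        f"download = {{ '*':'{package}-{release}.tar.gz' }}",
        f"depends = [ {dep_list} ]",
    ]
    return "\n".join(lines) + "\n"


text = pacman_file(
    "ExamplePkgA", "01-07-02", "6.5.0",
    "http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO",
    ["ExamplePkgB-6.5.0", "ExampleExtPkg-v1"],
)
```

Called with the values from the example, the sketch reproduces the same five lines, which shows that the metadata CMT already holds (package name, tag, release, dependencies) is sufficient input.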
Creating the kits
• First, build a release
• Discover cycles in the dependencies
  • Use a feature of CMT to find them, as they must not be propagated to the kits
  • Record the output in a file
• Then, use a feature of CMT to visit every package in a dependency tree and apply a command there
  • cmt broadcast <command>
• Usage of the script to create a kit:
  • create_kit.sh -release <release-id> -cycles <file> [-rpm] <target distribution directory>
  • Creates a Pacman file and tar file, optional RPM file
• Finally, there are often a few things to fix by hand specific to each release
• Note that CMT itself is included as a kit
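The "visit every package and apply a command" step can be pictured with a minimal Python sketch. This is an illustration of the idea only, not the behaviour of cmt broadcast: the function name, the `uses` mapping, the `cycle_edges` set (standing in for the recorded cycles file) and the dependencies-first visiting order are all assumptions made for this example.

```python
def broadcast(top, uses, action, cycle_edges=frozenset()):
    """Apply `action` once to every package reachable from `top`.

    `uses` maps a package to the packages it uses. Pairs listed in
    `cycle_edges` as (package, used package) are not followed, so that
    known cycles are not propagated. Returns the visiting order.
    """
    seen = set()
    order = []

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        for dep in uses.get(pkg, ()):
            if (pkg, dep) in cycle_edges:
                continue          # edge recorded as part of a cycle
            visit(dep)
        action(pkg)               # dependencies are handled first
        order.append(pkg)

    visit(top)
    return order


created = []
order = broadcast(
    "Release",
    {"Release": ["A", "B"], "A": ["B"], "B": ["A"]},
    created.append,               # stand-in for "create a kit here"
    cycle_edges={("B", "A")},
)
```

In this toy graph A and B use each other; with the B to A edge recorded as a cycle, the visiting order is B, A, Release, and each package is handled exactly once.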
Installation
• Performed by site software manager or end user on desktop or laptop
• Straightforward procedure:
  • Install Pacman, if not already done
  • Install prerequisite software
    • Currently just RedHat 7.3 OS, gcc-3.2 and Java SDK 1.4.1
  • Choose directory for the installation
    • Probably the same as before
  • Choose which release to install
    • Available releases are listed on a web page
  • Use Pacman to download, install and configure it, e.g. pacman -get ATLAS:AtlasRelease-6.5.0
    • Dependencies followed automatically to get everything you need
  • Optionally, run script to set up a user environment and run a test
• User configures software in the usual way
  • Just choose release and private working area as normal
  • Run a setup script provided by CMT
Conclusions
• Procedures and tools have been developed for the packaging, distribution and installation of ATLAS software
  • Based on Pacman, CMT, tar/rpm and some shell scripts
• The basic principles could be applied more generally
  • Using some or all of the same tools
• It satisfies most of the requirements for run-time and developers’ kits and for installation
  • Full source kit still to be done
• Early adopters have given useful feedback and it is now being imported into Grid production systems
• Must now move to its use as part of the standard release procedure in ATLAS
  • by December 2003, for our global “Data Challenge 2”
Future developments
• Better handling of prerequisite software and platform compatibility checks
• EDG WP4 configuration management task
• Potential to work with an installation on demand mechanism for Grid farms
• LCG/EDG/iVDGL GLUE
• Meta packaging proposal for Grid middleware and applications, O. Barring et al.
• Pacman version 3