Learn about DAR, a tool developed at Fermilab for quick and easy deployment of CMS software applications onto systems without a pre-existing CMS environment.
Software Packaging with DAR Natalia Ratnikova, Anzar Afaq, Greg Graham Fermilab Tony Wildish, Veronique Lefebure CERN
Introduction ► Motivation • The Compact Muon Solenoid (CMS) HEP experiment will run on the LHC accelerator at CERN. • CMS is using GRID technologies to utilize available computing resources for the worldwide distributed Monte Carlo event production. • To make this possible, CMS software applications must be brought to the production sites.
Introduction ► Scope • CMS software includes a wide range of inter-related projects and external tools managed by the Software Configuration, Release And Management tool (SCRAM). • Complete installation of the CMS software and environment on the remote sites is not an easy task, and it is not necessarily required in order to run ready-made applications.
Introduction ► Goal • The USCMS software and computing project goal was to move CMS MC production in the US completely onto the GRID computing resources. • We wanted to have an automated way to create self-consistent distributions of the applications, based on the software released at CERN. • The Distribution After Release (DAR) tool was developed at Fermilab for quick and easy deployment of the CMS software applications, which can then run on systems that do not have a pre-existing CMS environment.
DAR concept • DAR automatically creates and installs software applications based on the runtime environment. • An application is a complete, self-contained software program, including all required shared libraries and other files, that can be executed in a particular environment to accomplish a particular computing task. • A runtime environment is a set of UNIX shell environment variables used by the program at runtime.
Concept ► Choices, Decisions • There is a class of tools and utilities, such as the operating system kernel and the loader, that, though needed by the applications, are usually present on the remote computing node. • It’s hard to draw a clear border between the application and the operating system, so one sometimes has to decide what to include in the distribution and what must be pre-installed by the local system administrator. • In CMS software these issues are controlled through the project configuration, which specifies the required tools and the corresponding environment.
Concept ► Conditionals • Application software must be relocatable. • This is the most important and most natural requirement. Most high-quality software products are relocatable, and the location is usually controlled through a shell environment variable. • No hard-coded absolute paths in the program or in the shared libraries (except those referring to the system area). • All executables are found in the PATH. • DAR distributions rely on system compatibility.
DAR Implementation • DAR is implemented in scripting languages; no compilation is required. • The core of the DAR code is written in Perl. • Interfaces and extensions are written in Python. • The DAR code can simply be downloaded from the CVS repository or from the web, and can be used immediately. In the CMS environment: • dar -c <top release directory> <temporary directory> On the remote site: • dar -i <distribution darball> <installation directory>
Implementation ► Shared Libraries • DAR walks through the directories specified in the $LD_LIBRARY_PATH environment variable and packages all libraries found there into the distribution. It ensures that, upon installation, the runtime environment scripts set the proper $LD_LIBRARY_PATH in the correct order. • DAR does not rely on the output of the ldd <executable> command, as this is considered unsafe in the case of dynamically loaded libraries.
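The directory walk described on this slide can be illustrated with a minimal Python sketch. It is not DAR's actual code (the core of DAR is written in Perl); the function name `collect_shared_libraries` and the rule for recognising a library by its file name (`*.so` or `*.so.<version>`) are assumptions made for illustration only.

```python
import os


def collect_shared_libraries(ld_library_path):
    """Walk the directories of an LD_LIBRARY_PATH-style string.

    Returns (ordered_dirs, libraries): the existing directories in their
    original search order (duplicates dropped) and the library files found.
    """
    ordered_dirs = []
    libraries = []
    for directory in ld_library_path.split(os.pathsep):
        # Skip empty entries, repeated entries and directories that do not exist.
        if not directory or directory in ordered_dirs or not os.path.isdir(directory):
            continue
        ordered_dirs.append(directory)
        for name in sorted(os.listdir(directory)):
            full_path = os.path.join(directory, name)
            # Assumption: a shared library is any file named *.so or *.so.<version>.
            if os.path.isfile(full_path) and (name.endswith(".so") or ".so." in name):
                libraries.append(full_path)
    return ordered_dirs, libraries
```

Keeping `ordered_dirs` in the original order is what would let an installer regenerate $LD_LIBRARY_PATH with the same search precedence, for example by calling `collect_shared_libraries(os.environ.get("LD_LIBRARY_PATH", ""))` in the release environment.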
Implementation ► Executables • By default DAR walks through the directories in the $PATH environment variable (only the portion added for this particular application) and includes the contents of those directories in the DARball. • This behavior can be overridden by setting the $DAR_runtime_PATH environment variable, in which case the associated files and directories are included in the distribution and added to the $PATH in the DAR runtime environment scripts.
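One way to picture "only the portion added for this particular application" is as the difference between the $PATH seen before and after the application environment is set up. The slides do not say how DAR determines this portion, so the comparison below is an assumption, and the function name `runtime_path_entries` and the example directories are hypothetical.

```python
def runtime_path_entries(base_path, app_path, override=None):
    """Return the PATH directories to package, in search order.

    base_path: $PATH before the application environment was set up.
    app_path:  $PATH inside the application environment.
    override:  value of $DAR_runtime_PATH, if set; it replaces the default.
    """
    if override:
        # An explicit override wins over the computed difference.
        return [entry for entry in override.split(":") if entry]
    base_entries = {entry for entry in base_path.split(":") if entry}
    added = []
    for entry in app_path.split(":"):
        # Keep entries that are new relative to the base PATH, without repeats.
        if entry and entry not in base_entries and entry not in added:
            added.append(entry)
    return added
```

For instance, `runtime_path_entries("/usr/bin:/bin", "/cms/app/bin:/usr/bin:/bin")` keeps only `/cms/app/bin`, leaving the system directories to the remote host.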
Implementation ► Other Variables • DAR distinguishes between three types of runtime environment variables: • Simple values (flags) • Variables associated with a path to an existing file or directory in the local file system • Variables associated with several paths in the local file system ($PATH-like variables, where entries are separated by the colon delimiter) • All physical files and directories found in the specified paths are included, preserving the underlying directory structure.
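The three variable types listed above could be told apart roughly as follows. This is a sketch under stated assumptions, not DAR's own logic: the function name `classify_variable`, the returned labels, and the rule that a value counts as a path only if it exists on the local file system are all illustrative choices. It assumes UNIX-style colon-separated values.

```python
import os


def classify_variable(value):
    """Classify an environment variable value as 'simple', 'path' or 'path_list'."""
    entries = [entry for entry in value.split(":") if entry]
    # Assumption: a value refers to the file system only if at least one
    # of its colon-separated entries exists locally.
    if not any(os.path.exists(entry) for entry in entries):
        return "simple"
    return "path" if len(entries) == 1 else "path_list"
```

Applied to every item of `os.environ`, this gives a first pass at deciding which variables are plain flags to copy and which point at files or directories that would have to be packaged.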
General Practices, Tests • All the sophisticated work is done by DAR while creating the distribution. • The installation procedure is extremely simple. • Friendly user interface: simple commands, built-in help, backward compatibility.
Tests • Run the same application in the native environment • Install the DARball and run on the same node • Install and run the application on a remote host without a pre-installed CMS environment The same output in all three cases means success. The second type of test is optional, and can be used to identify any discrepancies in the operating system configuration.
Using DAR in Production • DAR-created distributions have been used as the mandatory way to install software for the official CMS Monte Carlo production. • Using the same set of applications and consistent software distribution mechanisms ensured stable performance and trustworthy results. • The RefDB2DAR interface has been developed to formalize the requests for applications and provide bookkeeping of the available distributions.
CMS production over GRID (fall 2002) The CMS Integration GRID Testbed produced 1.2 million CMS Monte Carlo events, from generation with the PYTHIA physics generator through simulation with GEANT and digitization with Objectivity-based applications. All results shown here were run on Red Hat 6 systems, though some GEANT-only production was also run on newer Red Hat 7 systems.
Next steps ► Bookkeeping • The RefDB2DAR interface allows a request file to be downloaded from the RefDB. • The refdbdar utility is then used to • parse and validate the RefDB request file • call a packager: CMSIM_packager, CMKIN_packager, or DAR_packager for SCRAM-managed projects • the packager builds executables as requested and creates the distribution
Next steps ► Optimizations • The runtime environment contains some superfluous directories and files. However, detecting files that can be safely excluded requires expert knowledge of the software application. • A number of new expert options allow the contents to be filtered, but it may take several iterations to figure out what can be removed, and whether removing it is efficient and safe.
Next steps ► Optimizations • Space optimizations: • Avoid duplication (all duplicated files are replaced by symbolic links) • Expert options introduced: • The runtime environment contains some superfluous directories and files. • However, detecting files that can be safely excluded requires expert knowledge of the software application. • Time optimizations: • Automating tests
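The space optimization "all duplicated files are replaced by symbolic links" can be sketched as below. How DAR detects duplicates is not stated in the slides; comparing SHA-256 digests of file contents, the function names `file_digest` and `link_duplicates`, and the use of relative link targets are assumptions made for this illustration.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def link_duplicates(root):
    """Replace duplicate regular files under root with relative symlinks.

    The first copy met during the walk is kept; later copies with identical
    content become links to it. Returns the number of files replaced.
    """
    first_seen = {}
    replaced = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            original = first_seen.setdefault(file_digest(path), path)
            if original != path:
                os.remove(path)
                # A relative target keeps the tree valid after it is moved.
                os.symlink(os.path.relpath(original, dirpath), path)
                replaced += 1
    return replaced
```

Relative link targets are chosen here because the distribution is meant to be relocatable: an absolute target would break as soon as the tree is unpacked under a different installation directory.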
Distribution process • The Production Coordinator fills in a web form to create a DARball request. The generated request is stored in the RefDB, and a notification is sent by e-mail. • The DARball is then created using refdbdar and the request file, based on the software release installation at CERN. • The application is installed and tested in the DAR runtime environment.
Distribution process • The DARball is put into SRB for distribution and is ready for the production assignments. • Production sites get the assignments with an indication of the DARball (by name). The DARball is then downloaded from the SRB and installed, using DAR, on the worker nodes. • The McRunJob tool creates a job based on the application and submits it to the production GRID.
Using DAR in MOP • MOP is a system for distributing CMS Monte Carlo production jobs over the GRID. • MOP is capable of running any type of script (job) at remote GRID sites, called Worker Sites. • MOP runs jobs as DAGs (Directed Acyclic Graphs), which can be combined to create complex workflows.
Using DAR in MOP • In general every DAG contains four stages. • Stage-in: Bring the required input files (from several sources) to the worker site. • Run: Execute the job itself, producing results, logs, data. • Stage-out: Send out the produced results/data/logs. • Clean-up: Clean up the leftover files/directories at the worker site.
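The four stages form a simple chain in a directed acyclic graph, which can be sketched with Python's standard `graphlib` module (Python 3.9 or later). This only illustrates the ordering idea and is not how MOP itself is implemented; the names `STAGE_DEPENDENCIES` and `run_stages`, and the convention of one callable per stage, are assumptions.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it starts.
STAGE_DEPENDENCIES = {
    "stage-in": set(),
    "run": {"stage-in"},
    "stage-out": {"run"},
    "clean-up": {"stage-out"},
}


def run_stages(dependencies, actions):
    """Call actions[stage]() for every stage in dependency order.

    Returns the list of stages in the order they were executed.
    """
    completed = []
    for stage in TopologicalSorter(dependencies).static_order():
        actions[stage]()
        completed.append(stage)
    return completed
```

With placeholder actions such as `{stage: (lambda: None) for stage in STAGE_DEPENDENCIES}`, `run_stages` visits stage-in, run, stage-out and clean-up in that order. Expressing the order as dependencies rather than a fixed list is what allows several such graphs to be combined into larger workflows.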
Using DAR with MOP • DAR installation at a worker site is achieved by • creating a special MOP job • that first pulls the DAR tool and the application DAR distribution in the stage-in stage, • runs the installation by invoking DAR in the run stage, • brings the results of the installation back to the submission site in the stage-out stage, • and then performs a clean-up operation at the worker site.
Summary • The DAR-based distribution scheme has been used successfully in CMS event production for an extended period of time. • It makes it possible to keep pace with software developments and to deliver software applications to the production sites with ease and in a timely fashion. • Once re-packaged into RPM files, applications can be re-used within different distribution approaches (e.g. LCFG).
Acknowledgements • The main credit for this work goes to the core CMS software developers, architects and release managers for their constant care about software quality. • We would like to thank the CMS and USCMS software and computing managers for the attention they paid to this project, the CMS Production Team for providing an excellent working environment, and all CMS colleagues from many countries and institutions for their useful feedback. • My special thanks to Dr. Yujun Wu for presenting this talk to you, and for numerous fruitful discussions. THANK YOU