CMS Software Installation. Bockjoo Kim, U of Florida. CMSSW Installation Sites on OSG: 9 T3, 9 T2. CMSSW on US T3 Sites (as of 10/21/08).
CMS Software Installation Bockjoo Kim U of Florida
CMSSW Installation Sites on OSG • 9 T3 sites, 9 T2 sites [Map of OSG sites labeled T2 or T3; labeled T3 sites include Caltech and FIU]
Centralized CMS Installation • CMS requires CMSSW to be installed centrally at T2 sites • T3 sites can install it locally or centrally • At non-CMS OSG sites, it can be installed centrally • All installations use a uniform path ($OSG_APP/cmssoft/cms/$SCRAM_ARCH/cms/cmssw/CMSSW_X_Y_Z) • Central installation on OSG is done by me • The LCG/EGEE counterpart (several people) does this on LCG/EGEE
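The uniform layout above can be written as a small path helper. This is a minimal sketch in Python, assuming OSG_APP and SCRAM_ARCH are set in the environment as on an OSG worker node; the function name and the example values in the usage note are hypothetical, not part of the deployment tool.

```python
import os
from pathlib import PurePosixPath


def cmssw_release_path(release, osg_app=None, scram_arch=None):
    """Build the install path of one CMSSW release under the uniform layout.

    Falls back to the OSG_APP and SCRAM_ARCH environment variables when the
    corresponding argument is not given.
    """
    osg_app = osg_app or os.environ["OSG_APP"]
    scram_arch = scram_arch or os.environ["SCRAM_ARCH"]
    # $OSG_APP/cmssoft/cms/$SCRAM_ARCH/cms/cmssw/CMSSW_X_Y_Z
    return str(
        PurePosixPath(osg_app) / "cmssoft" / "cms" / scram_arch / "cms" / "cmssw" / release
    )
```

For example, cmssw_release_path("CMSSW_2_1_9", "/osg/app", "slc4_ia32_gcc345") gives /osg/app/cmssoft/cms/slc4_ia32_gcc345/cms/cmssw/CMSSW_2_1_9 (release, root, and architecture chosen only for illustration).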
CMS Software Life Cycle • Software Development • Release Build • APT Packaging • Tagging Release in XML for Publication • Release Announcement • Deployment on Grids/Local • Release Deprecation Pre-Announcement • Tagging Release for Deprecation • Release Deprecation Announcement • Removal of Deprecated Release
Features of Software Deployment Tool • Condor-G Job Submission with Customized Installation/Verification Scripts • Central Run and Bookkeeping DB • Installation Possible via Grid-Proxy-Based Portal: Different DNs Can Install on Different Sites Simultaneously • Cron-Driven Installation, in Parallel per Release • Production CMS Software Releases on OSG T2/T3 • Twiki Pages : https://twiki.cern.ch/twiki/bin/view/CMS/CMSSoftDeployOSG • Portal : https://dev01.ihepa.ufl.edu:8443/csdogrid/csdogrid/
Considerations for Installation • Different Linux Flavors • 64-bit Hosts Running in 32-bit Mode • Network Isolation (non-CMS sites) • Missing Tools on WN : apt-get, rpm, rpmbuild (non-CMS sites) • Different Shared File Systems (Lustre, AFS, etc.) • Dedicated Slot : Installation needs to finish before any other CMS job starts • Disk Space Issue : ~50GB required typically • Many files ==> Installation time gets longer • Keeping many releases ==> Deprecation becomes painful, so it should be done on schedule
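The disk space consideration lends itself to a pre-flight check in an installation script. A minimal sketch in Python, assuming the ~50GB figure from the slide as the default threshold; the function name is hypothetical and the actual verification scripts may check this differently.

```python
import shutil


def has_enough_space(path, required_gb=50):
    """Return True if the file system holding `path` has at least
    `required_gb` gigabytes free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3
```

A script could call has_enough_space on the application area before starting and exit early with a clear message instead of failing partway through the RPM installation.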
APT and File Systems • APT (RPM tool) uses lots of locks • At least four different FS across OSG sites : NFS, AFS, GPFS, Lustre • GPFS, Lustre, and other less common FS require special treatment for locks using a local FS (there are not many exotic FS, though) • Many files ==> Installation time gets longer • Keeping many releases ==> Deprecation becomes very important ==> (64-bit OS/8GB required in the worst case)
Deprecation • Mostly the same as installation • The most time-consuming part is finding dependencies and dependency bookkeeping • NFS stale file handles and GPFS : trouble with ‘rm -rf’ • Reuse the dependency calculation from one site • Deprecation is also automated and run from cron
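The dependency bookkeeping can be illustrated with set arithmetic: a package may be removed only if no remaining release still needs it. This is a minimal sketch in Python, not the tool's actual implementation; the function name and the shape of the `deps` mapping are assumptions made for illustration.

```python
def removable_packages(deps, deprecated):
    """Return the packages needed only by deprecated releases.

    deps       -- dict mapping release name -> set of package names it needs
    deprecated -- set of release names scheduled for removal
    """
    keep = set()  # packages still needed by releases that stay installed
    for release, pkgs in deps.items():
        if release not in deprecated:
            keep |= pkgs

    drop = set()  # packages used by the releases being removed
    for release in deprecated:
        drop |= deps.get(release, set())

    # Shared packages must survive; only the remainder is safe to remove.
    return drop - keep
```

Because the installations are uniform across sites, a result computed this way for one site can plausibly be reused elsewhere, which is the idea behind reusing the dependency calculation.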
CMS Software Packaging Tool • CMS employs APT packaging for software distribution • CMS provides packaging tools for initial setup and update • RPMs are installed in a non-root area • CMS provides release publication : this lets deployers start deployment immediately
Automated CMS Software Deployment • Well-established OSG CMS sites need software deployed promptly • CRON is used for automation • List of well-established OSG CMS sites • Database (DB) for bookkeeping • Scripts check and execute: • New releases that need to be deployed • Deployment status from DB • Deployment job submission/resubmission • Installation job self-monitoring • Email notification • Repetition of all steps if necessary
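One pass of the cron-driven check described above can be sketched as follows. This is an illustration in Python under stated assumptions: the status values ("installed", "running", "failed"), the function name, and the in-memory dictionary standing in for the bookkeeping DB are all hypothetical.

```python
def jobs_to_submit(releases, sites, status):
    """Decide which (site, release) pairs need a deployment job.

    releases -- published releases that should be deployed
    sites    -- sites on the central deployment list
    status   -- dict mapping (site, release) -> state from the bookkeeping DB
    """
    todo = []
    for site in sites:
        for release in releases:
            state = status.get((site, release))
            # Submit when nothing is recorded yet, resubmit after a failure;
            # skip pairs that are already installed or still running.
            if state is None or state == "failed":
                todo.append((site, release))
    return todo
```

Each cron run would compute this list, submit the jobs, record the new state, and send email on changes, so repeating the cycle converges on every listed site having every published release.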
Implementation of the Deployment Tool [Diagram: tool design. Central side: Site List, CRON, X509 Web Portal (CMS Grid Users), Site Catalog (GridCat Client), Site Availability, DB Check, MySQL DB (Job Status, DB Update), Email, Deploy Script, Condor-G Job. Remote OSG site: Execution Script, OSG Software, Local Scripts, Application Area, CMS Pkg Tool, RPMs from the CMS APT Repository, Info. Publication]
SAM and SW Installation Monitoring • SAM monitors SW installations [Screenshot: SAM tests related to SW installation]
List of Problems and Solutions • RPM version mismatch -> Rebuild RPM DB • rpm-wrapper error 88 -> Insufficient disk • rpm-wrapper error 92 -> Permission problem • “Could not get lock” -> File system issue, use local disk • memory alloc (4 byte) -> Remove releases • memory alloc (8 byte) -> Use 64-bit apt-get
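A table like this can be turned into a simple log triage helper. The sketch below, in Python, covers only the three symptoms whose text is quoted on the slide; it assumes those strings appear verbatim in the job log, which is not confirmed by the source, and the names are hypothetical.

```python
# (symptom substring, suggested action); the substrings are assumed to
# appear verbatim in the installation log.
KNOWN_PROBLEMS = [
    ("rpm-wrapper error 88", "insufficient disk: free up space"),
    ("rpm-wrapper error 92", "permission problem: check ownership of the install area"),
    ("Could not get lock", "file system locking issue: use local disk for locks"),
]


def diagnose(log_text):
    """Return the suggested actions for every known symptom found in the log."""
    return [action for symptom, action in KNOWN_PROBLEMS if symptom in log_text]
```

An empty result means the failure is not one of the listed cases and needs a manual look.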
Statistics of CMS SW Deployment Years 2006 - 2008
Summary • More than 717 (1024) installations/removals of CMS software have been performed on OSG, 2006-2008 • Automated installation works quite efficiently, with almost no problems these days • Recently, most installation problems come from RPM DB limitations and can be fixed by using 64-bit apt-get • T3 sites are not required to deploy CMSSW centrally. At the moment, there are 9 sites on the regular/central deployment list. • If other T3 sites wish to be included, please let me know