Quattor-for-Castor Jan van Eldik Sept 7, 2005
Outline
• Overview of Quattor @ CERN
• Central bits
  • CDB
  • template structure
  • SWREP
• Local bits
  • Updating profiles
  • SPMA
  • NCM
• Exercises
  • Browsing CDB templates
  • Changing RPMs with the SPMA
  • Updating Lemon monitoring configuration
• Later: giving root access, adding new RPMs to the repository, configuring castor
Disclaimer(s)
• This is very incomplete!
• Focus is on “standard operations”
• I am Just Another Quattor user
• Tip of the day: Monkey see, monkey do
CDB configuration database
• “Global schema” to describe a node
• Written in the homegrown PAN language (declarative and procedural bits)
• Host templates profile_<hostname>.tpl, which include more general templates pro_*.tpl
• Compiled into XML files
• Every node has a local copy of its configuration information
• Very node-centric (by design)
CDB - 2
• CDB updates can be (very) slow…
• Most information is also available in CDBSQL
  • i.e. in an Oracle database
  • asynchronous updates, can be very slow too…
Example: profile_tpsrv901.tpl
• From node specific to site specific
• Overwriting of certain values
• If-then-else, functions
• Include-files make it hard to find where information comes from
• Software packages
• Configuration info
• Hardware description
• Administrative information
  • serial console, info derived from landb, …
Modify templates with cdbop
• Useful commands:
  • help
  • list profile_tpsrv*
  • get profile_tpsrv901.tpl pro_system_tapeserver.tpl
  • !sh # drops you to a shell
  • !emacs profile_tpsrv901.tpl pro_system_tapeserver.tpl
  • update *.tpl
  • commit
• Versioning available
• Run on lxplus, NICE authentication
Software repository SWREP
• If you want to provide new RPMs…
  swrep-client put i386_slc3 myfile.rpm /cern/cc
• Separate repositories
  • per architecture: {i386,ia64,x86_64}_slc3
  • per OS: i386_{slc3,rhes3}
• On lxplus, uses ssh authentication
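Because the repositories are separate per platform, an RPM that is needed on several platforms has to be uploaded once per platform. A minimal sketch of that loop follows. It uses only the `swrep-client put <platform> <file> <third argument>` form shown on the slide; the function name `swrep_put_all`, the `SWREP_CLIENT` variable and the name `area` for the third argument are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: upload one RPM to several SWREP platforms in turn.
# swrep_put_all, SWREP_CLIENT and "area" are hypothetical names; the
# command line itself is copied from the slide.
SWREP_CLIENT="${SWREP_CLIENT:-swrep-client}"

swrep_put_all() {
    rpm_file="$1"
    area="$2"
    shift 2
    # The remaining arguments are platform names, e.g. i386_slc3.
    for platform in "$@"; do
        echo "uploading $rpm_file to $platform ($area)"
        # Stop at the first failed upload so the repositories are not
        # left in a state nobody noticed.
        "$SWREP_CLIENT" put "$platform" "$rpm_file" "$area" || return 1
    done
}

# Example (only sensible for an RPM that is valid on every listed platform):
#   swrep_put_all myfile.rpm /cern/cc i386_slc3 ia64_slc3 x86_64_slc3
```

Stopping at the first failure is a design choice; a variant that continues and reports all failures at the end would work just as well.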
On the nodes…
• Synchronization between CDB and local profile is crucial!!!
• But automagic:
  • Hosts are notified of profile changes
  • Hourly cron job, just in case…
  • You can run /usr/bin/ccm-fetch by hand
• List local cache, as root:
  ncm-query --dump / | less
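The manual path (fetch the profile, then inspect the local cache) can be wrapped in one small helper. This is a sketch only: `ccm-fetch` and `ncm-query --dump` are the commands from the slide, while the function name `refresh_and_dump` and the two override variables are hypothetical.

```shell
#!/bin/sh
# Sketch: refresh the local profile cache, then dump one subtree of it.
# refresh_and_dump, CCM_FETCH and NCM_QUERY are hypothetical names.
CCM_FETCH="${CCM_FETCH:-/usr/bin/ccm-fetch}"
NCM_QUERY="${NCM_QUERY:-ncm-query}"

refresh_and_dump() {
    path="${1:-/}"    # configuration path to dump, default: everything
    # If the fetch fails, the cache may be stale, so do not show it.
    "$CCM_FETCH" || {
        echo "ccm-fetch failed, local cache may be stale" >&2
        return 1
    }
    "$NCM_QUERY" --dump "$path"
}

# Example, as root on the node:
#   refresh_and_dump /software/packages | less
```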
Invoking SPMA to change RPM-set
• PAN functions pkg_add(), pkg_del(), pkg_repl()
• SPMA can be configured to not touch packages it does not know about:
  "/software/components/spma/userpkgs" = "yes";
  "/software/components/spma/userprio" = "yes";
• SPMA can be forcefully disabled:
  echo 'disabling SPMA - JvE' > /etc/nospma
• Run (as root): spma_wrapper.sh [--noaction]
  [INFO] The following package operations are required:
  replace- SINDES 0.9 11 noarch with http://swrep/swrep/i386_slc3/ SINDES 0.9 12 noarch
  install http://swrep/swrep/i386_slc3/ stk-ssi-devel 2.3 0.cern i386
  [INFO] Please be patient... 2 operation(s) to verify/execute.
  [OK] SPMA finished successfully.
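A cautious way to use the two facts above together is to look for the disable file first and to preview with `--noaction` before the real run. The sketch below assumes that the mere existence of `/etc/nospma` means SPMA should not be run, as the slide suggests; the function name `run_spma` and the override variables are hypothetical.

```shell
#!/bin/sh
# Sketch: refuse to run SPMA when the disable file exists.
# run_spma, NOSPMA_FILE and SPMA_WRAPPER are hypothetical names.
NOSPMA_FILE="${NOSPMA_FILE:-/etc/nospma}"
SPMA_WRAPPER="${SPMA_WRAPPER:-spma_wrapper.sh}"

run_spma() {
    if [ -e "$NOSPMA_FILE" ]; then
        # Show the reason that whoever disabled SPMA left in the file.
        echo "SPMA is disabled on this node: $(cat "$NOSPMA_FILE")" >&2
        return 1
    fi
    # Pass any options (such as --noaction) straight through.
    "$SPMA_WRAPPER" "$@"
}

# Example, as root:
#   run_spma --noaction   # preview the package operations
#   run_spma              # apply them
```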
Configuring the node
• NCM components configure services:
  afs, sendmail, tapeserver, fmonagent, spma, …
• All components on the node:
  ncm-ncd --list
• Configure access control and grub:
  ncm-ncd --configure access_control grub
• All-in-one: spma_ncm_wrapper.sh
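When several components need reconfiguring, running them one at a time makes it obvious which one failed. A minimal sketch, using only the `ncm-ncd --configure <component>` form from the slide; `configure_each` and `NCM_NCD` are hypothetical names.

```shell
#!/bin/sh
# Sketch: configure NCM components one by one, stopping at the first failure.
# configure_each and NCM_NCD are hypothetical names.
NCM_NCD="${NCM_NCD:-ncm-ncd}"

configure_each() {
    for component in "$@"; do
        echo "configuring $component"
        "$NCM_NCD" --configure "$component" || {
            echo "component $component failed" >&2
            return 1
        }
    done
}

# Example, as root:
#   configure_each access_control grub
```

Passing all components to a single `ncm-ncd --configure` call, as on the slide, is shorter; the loop only trades speed for clearer error reporting.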
some hints ‘n tips…
• Use wassh to run commands on multiple hosts:
  wassh -s slc3 root@castoradm uptime
  wassh root@lxfs60\[01-04] shutdown -r now
• Quick check to see if Lemon is happy:
  check-this-host
• Disable alarm sending:
  echo 'JvE did this' > /home/operator/nomorealarms
• Use serial console:
  connect2console.sh l3006d # and pray :)
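It is easy to forget to re-enable alarms after an intervention. The sketch below wraps a command so that the marker file exists only while the command runs. It assumes that removing `/home/operator/nomorealarms` is what re-enables alarm sending, which the slides do not state explicitly; `with_alarms_disabled` and `NOALARM_FILE` are hypothetical names.

```shell
#!/bin/sh
# Sketch: disable alarm sending only for the duration of one command.
# ASSUMPTION: deleting the marker file re-enables alarm sending.
# with_alarms_disabled and NOALARM_FILE are hypothetical names.
NOALARM_FILE="${NOALARM_FILE:-/home/operator/nomorealarms}"

with_alarms_disabled() {
    reason="$1"
    shift
    # Leave a note saying who disabled the alarms and why.
    echo "$reason" > "$NOALARM_FILE" || return 1
    "$@"
    status=$?
    # Always clean up, whether or not the command succeeded.
    rm -f "$NOALARM_FILE"
    return "$status"
}

# Example, as root:
#   with_alarms_disabled 'JvE did this' service atd restart
```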
#1 – use cdbop
• Log on to lxplus, start cdbop, get the node profile of a tapeserver of my choice
• Log on as root to that tapeserver, and run:
  ncm-query --dump /hardware | less
• Locate the serial number of the machine in both sessions
• Locate the serial numbers of the harddisks in both sessions
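Instead of paging through the output by eye, the same filter can be applied in both sessions. This is a sketch: the exact key names used for serial numbers in the profile are not given on the slides, so it simply matches the word "serial" case-insensitively (which will also match serial-console entries). `show_serials` is a hypothetical name.

```shell
#!/bin/sh
# Sketch: list every line that mentions "serial", with line numbers.
# Works on files given as arguments, or on standard input when none are given.
show_serials() {
    # -i: ignore case, -n: prefix each match with its line number
    grep -i -n 'serial' "$@"
}

# On the tapeserver, as root:
#   ncm-query --dump /hardware | show_serials
# On lxplus, in the directory where cdbop stored the templates:
#   show_serials profile_tpsrv901.tpl
```

Because of include-files, the hardware description may live in an included template rather than in the host profile itself, so an empty result on the lxplus side does not mean the information is missing.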
#2 – adding an RPM
• Add to your tapeserver profile:
  "/software/packages" = pkg_add("CERN-CC-PrepareInstall","2.3-0","noarch");
• Commit the change
• Use ncm-query --dump /software/packages | less and try to find the new package
• Run spma_wrapper.sh to install the package
• Run rpm -e CERN-CC-PrepareInstall to remove it
• Re-run spma_wrapper.sh
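The point of this exercise is presumably to watch SPMA bring the node back to the state described in the profile after a manual removal. Checking the package state between steps makes that visible. A minimal sketch using `rpm -q`; `report_pkg` and the `RPM` override variable are hypothetical names.

```shell
#!/bin/sh
# Sketch: report whether a package is currently installed.
# report_pkg and RPM are hypothetical names; "rpm -q" exits non-zero
# when the package is not installed.
RPM="${RPM:-rpm}"

report_pkg() {
    if "$RPM" -q "$1" >/dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: not installed"
    fi
}

# Suggested checkpoints during the exercise:
#   report_pkg CERN-CC-PrepareInstall   # after the first spma_wrapper.sh
#   report_pkg CERN-CC-PrepareInstall   # after rpm -e
#   report_pkg CERN-CC-PrepareInstall   # after re-running spma_wrapper.sh
```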
#3 – change Lemon alarm
• Disable alarm sending on your tapeserver
• Check the Sure alarms
• Stop the atd daemon with service atd stop
• Check the Sure alarms again
• Locate the “atd” monitoring configuration, starting at the Lemon website
• De-activate it in the tapeserver profile:
  "/system/monitoring/….…./active" = false;
• Reconfigure fmonagent
• The Sure alarm should be gone…
• Reactivate atd monitoring, start the daemon, re-enable alarm sending
Follow-up
• Adding new RPMs to SWREP, and deploying them on a cluster
• Giving (root) access