gLite 3.1 Windows Worker Node Installation
Eng. Dario Russo – I.N.F.N Catania
Installing WINWN on VM machines
• Open a connection to our Windows 2003 VMs:
  • Start->Programmi->Connessione Desktop Remoto (on an English system: Start->Programs->Remote Desktop Connection)
  • Connect to infn-win-win-XX.ct.pi2s2.it with XX=06-27
  • User: Administrator, password: tutorial.007
• Open iexplore and download http://gilda-forge.ct.infn.it/frs/download.php/138/Grid2WinWorkerNodeInstaller-0.1.2.exe, or follow the links for Wninstaller from gilda-forge.ct.infn.it
• Run the installer and, when a shell spawns, WAIT! If you enter anything wrong you have to uninstall everything and start all over again
Script Questions
• First 2 questions asked:
  • Worker node hostname: enter infn-winwn-XX.ct.pi2s2.it
  • CE hostname: always enter infn-wince-01.ct.pi2s2.it (no mistakes: the script will change ownership (to an account that cannot log in) and permissions so badly that you would rather reinstall from scratch than fix it later)
• Sshd questions:
  • If the question is yes/no, answer yes
  • If the question is "Enter the value of CYGWIN for the daemon: [ntsec]:", answer ntsec
• "Please insert supported VO:"
  • Answer gilda (no mistakes: it is going to create 80 users named gildaXXX)
• Note: the script just executed is located at /sbin/gliteEdgPbsInstallation.sh
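Since a wrong answer means reinstalling, it can help to check the worker node hostname before typing it into the installer. The function below is only a sketch: valid_wn_hostname is a hypothetical helper, not part of the installer, and it assumes the XX=06-27 range given for the tutorial machines also applies to the worker node hostname.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the installer): succeed only if the
# argument looks like infn-winwn-XX.ct.pi2s2.it with XX between 06 and 27.
# Assumption: the XX=06-27 range from the tutorial applies to WN hostnames.
valid_wn_hostname() {
    host=$1
    case "$host" in
        infn-winwn-[0-9][0-9].ct.pi2s2.it) ;;
        *) return 1 ;;
    esac
    # Extract the two-digit machine number XX.
    num=${host#infn-winwn-}
    num=${num%%.*}
    # Drop one leading zero so the value is compared as a plain decimal.
    num=${num#0}
    [ "$num" -ge 6 ] && [ "$num" -le 27 ]
}
```

For example, `valid_wn_hostname infn-winwn-12.ct.pi2s2.it && echo ok` prints ok, while a CE hostname or a number outside the range is rejected.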
Software just installed
• Cygwin is needed to run a worker node
• Most of the GNU software, linked against Cygwin
• Globus 4.0.5 needs IPv6 support, which Cygwin doesn’t offer: cygwin1.dll is patched from http://win6.jp/Cygwin/index.html
• gLite User Interface software + some clients (globus-gass-cache, edg-gridftp-rm, etc.)
• Torque (PBS GPL version) ported to Cygwin
Services just started
• SSHD, to get WN ssh keys (and, optionally, for failsafe administration)
• CRON, to update CRLs
• PBS_MOM (Torque daemon responsible for the LS)
• syslog-ng
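A quick way to confirm that these services are up is to query each one with cygrunsrv from a Cygwin shell. This is a rough sketch: the service names in SERVICES are assumptions (run `cygrunsrv -L` to see the names actually registered on your node), and check_services is a hypothetical helper.

```shell
#!/bin/sh
# Assumed service names; adjust them to what `cygrunsrv -L` reports.
SERVICES="sshd cron pbs_mom syslog-ng"

# Succeed if `cygrunsrv -Q <name>` reports the service state as Running.
service_running() {
    cygrunsrv -Q "$1" 2>/dev/null | grep -q "Running"
}

# Print one status line per service; return non-zero if any is not running.
check_services() {
    rc=0
    for svc in $SERVICES; do
        if service_running "$svc"; then
            echo "OK      $svc"
        else
            echo "MISSING $svc"
            rc=1
        fi
    done
    return $rc
}
```

Calling `check_services` lists each service and returns a non-zero status if at least one is missing or stopped, so it can be used in an `if` or a cron check.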
Why is Windows a pain?
• Pbs_mom needs to release privileges and su to the selected VO user account
• Windows doesn’t agree: setuid gives "permission denied" even if the caller process is owned by Administrator or SYSTEM/local (the services spawner)
• Cygwin sets up a special account on Windows 2003 called sshd_server which is able to do it; the pbs_mom Torque service will use this user
• If the installer fails OR you are installing over a Windows XP WN, create the account by running these commands from a Windows rxvt:
  • net user sshd_server /ADD
  • editrights -a SeAssignPrimaryTokenPrivilege -u sshd_server
  • editrights -a SeCreateTokenPrivilege -u sshd_server
  • editrights -a SeTcbPrivilege -u sshd_server
  • editrights -a SeDenyInteractiveLogonRight -u sshd_server
  • editrights -a SeDenyNetworkLogonRight -u sshd_server
  • editrights -a SeDenyRemoteInteractiveLogonRight -u sshd_server
  • editrights -a SeIncreaseQuotaPrivilege -u sshd_server
  • editrights -a SeServiceLogonRight -u sshd_server
• Remove the pbs_mom service and reinstall it with:
  • cygrunsrv -I pbs_mom -p $PBS_HOME/sbin/pbs_mom.exe -u sshd_server -w <passwd> --stdout \
    $PBS_HOME/mom_logs/pbsmomwinservicestdout --stderr \
    $PBS_HOME/mom_logs/pbsmomwinservicestderr; net start pbs_mom
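The eight editrights calls differ only in the right being granted, so they can be driven from a list. The sketch below assumes a Cygwin shell with editrights on the PATH; grant_rights and the DRY_RUN switch are hypothetical names introduced here for illustration. By default it only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Account and rights exactly as listed on the slide.
ACCOUNT=sshd_server
RIGHTS="SeAssignPrimaryTokenPrivilege
SeCreateTokenPrivilege
SeTcbPrivilege
SeDenyInteractiveLogonRight
SeDenyNetworkLogonRight
SeDenyRemoteInteractiveLogonRight
SeIncreaseQuotaPrivilege
SeServiceLogonRight"

# Hypothetical helper: with DRY_RUN=1 (the default) print each command,
# with DRY_RUN=0 run editrights and stop at the first failure.
grant_rights() {
    for right in $RIGHTS; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "editrights -a $right -u $ACCOUNT"
        else
            editrights -a "$right" -u "$ACCOUNT" || return 1
        fi
    done
}

# Dry run: show what would be executed.
grant_rights
```

Once the printed commands look right, set `DRY_RUN=0` and call `grant_rights` again (after `net user sshd_server /ADD`) to apply them.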
Troubleshooting
• What can go wrong:
  • Pbs_mom not present: check that the sshd_server user exists and has the right privileges
  • CE complains that submit-helper can’t find globus-url-copy or globus_gass_cache: check that $PBS_HOME/pbs_environment has GLOBUS_LOCATION correctly exported (typically /opt/globus)
  • CE complains that submit-helper can’t find any *ftp rm: check that the edg-gridftp-* commands are reachable through the PATH in $PBS_HOME/pbs_environment (they should be in /opt/edg/bin and /opt/edg/libexec)
• Jobs failing mysteriously:
  • Check that the pool accounts are correctly installed in /home
  • Check that cron is updating the CRLs: run crontab -l as Administrator; if there is no output, type crontab /etc/crontabadministrator. You can force a CRL update by hand by typing edg-fetch-crl in /etc/grid-security/certificates
  • Check the /etc/grid-security/vomsdir/<voname>/<voserver>.lsc files (I am getting rid of the .pem files, since they tend to expire)
  • Check that the ssh keys are correctly exchanged (check /etc/ssh/ssh_known_hosts on the CE and /etc/ssh_known_hosts on the WNs)
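The two pbs_environment checks above can be scripted. This is a minimal sketch that assumes the file holds one NAME=value entry per line; check_pbs_env is a hypothetical helper, and the directories it looks for are the ones named in the slide.

```shell
#!/bin/sh
# Hypothetical helper: verify that a pbs_environment file sets
# GLOBUS_LOCATION and that its PATH contains the edg directories.
# Assumption: the file uses plain NAME=value lines.
check_pbs_env() {
    env_file=$1
    rc=0
    [ -r "$env_file" ] || { echo "cannot read $env_file"; return 2; }

    # GLOBUS_LOCATION must be present and non-empty (typically /opt/globus).
    if ! grep -q '^GLOBUS_LOCATION=.' "$env_file"; then
        echo "GLOBUS_LOCATION is not set"
        rc=1
    fi

    # Take the last PATH= line and look for each required directory in it.
    path_value=$(sed -n 's/^PATH=//p' "$env_file" | tail -n 1)
    for dir in /opt/edg/bin /opt/edg/libexec; do
        case ":$path_value:" in
            *":$dir:"*) ;;
            *) echo "PATH is missing $dir"; rc=1 ;;
        esac
    done
    return $rc
}
```

On a worker node it would be called as `check_pbs_env "$PBS_HOME/pbs_environment"`; it prints one line per problem found and returns non-zero if there is any.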