CE + WN installation and configuration Vanessa Hamar Universidad de Los Andes – Mérida, Venezuela 9th EELA Tutorial Bogota, 05-09 March,2006
Outline • What is a Computing Element (CE) ? • What is a Torque Server ? • What is a Worker Node? • How to install and configure a Computing Element with Torque Server. • How to install and configure a Worker Node with Torque
What is CE? • The CE is a service representing a computing resource. • Its main functionality is job management (job submission, job control, etc.). • For job submission, the CE can work in: • push model (where the job is pushed to a CE for its execution). • pull model (where the CE asks the WMS for jobs).
What is Torque? • TORQUE (Tera-scale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources. • The Torque system is composed of: • pbs_server, which provides the basic batch services such as receiving/creating a batch job or protecting the job against system crashes. • job_scheduler, which contains the site's policy used to decide which job must be executed. • pbs_mom, which places the job into execution. It is also responsible for returning the job’s output to the user.
What is a Worker Node? • The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes the: • gLite I/O Client, • the Logging and Bookkeeping Client, • the R-GMA Client and • the WMS Checkpointing library.
Installing CE + Torque Server WN + Torque
Preliminary and common steps • Start from an installation of SLC 3.0.X • Install the Java SDK • Remove LAM and Postfix • Check the hostname • Install and configure the ntp daemon • Install the latest version of YAIM • Install the latest version of GILDA YAIM • Install the X.509 host certificates in /etc/grid-security and check their file permissions. • Install the middleware
Installing pre-requisites • Java is not included in the distribution. Install it separately (>= 1.4.2_08) from http://java.sun.com/j2se/1.4.2/download.html chmod +x j2sdk-1_4_2_13-linux-i586-rpm.bin ./j2sdk-1_4_2_13-linux-i586-rpm.bin rpm -ivh j2sdk-1_4_2_13-linux-i586.rpm Preparing... ########################################### [100%] 1:j2sdk ########################################### [100%]
Installing pre-requisites • Depending on the package set you selected when installing the operating system, the lam package may be installed on your WN. If so, remove it: apt-get remove lam • There is a known installation conflict between the 'torque-clients' rpm and the 'postfix' mail client (Savannah bug #5509). If you are going to install Torque, uninstall the postfix package: apt-get remove postfix
Installing pre-requisites • Check the FQDN hostname • Ensure that the hostnames of your machines are correctly set. Run the command: hostname -f
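The hostname check above can be scripted. This is a minimal sketch for a POSIX shell; the helper name is_fqdn is hypothetical (it is not part of gLite or YAIM), and it only tests that the name has at least two dot-separated labels, not that the name resolves in DNS.

```shell
# Hypothetical helper: succeeds only if the argument looks fully qualified.
is_fqdn() {
    case "$1" in
        ""|.*|*.|*..*) return 1 ;;     # empty, leading/trailing dot, or empty label
        *[!A-Za-z0-9.-]*) return 1 ;;  # characters not allowed in a hostname
        *.*) return 0 ;;               # at least one dot: host plus domain
        *) return 1 ;;                 # bare short name such as "localhost"
    esac
}

# Check the local machine; fall back to a placeholder if hostname is unavailable.
current="$(hostname -f 2>/dev/null || echo unknown)"
if is_fqdn "$current"; then
    echo "hostname OK: $current"
else
    echo "WARNING: '$current' is not a fully qualified hostname" >&2
fi
```

If the warning appears, the hostname should be corrected before the middleware is installed.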
Installing pre-requisites • Time synchronization among all gLite nodes is mandatory. Install ntp if it is not already available on your system: • apt-get install ntp • Add your time server to /etc/ntp.conf • restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery • server <time_server_name> • (you can use ntp-1.infn.it, IP 193.206.144.10) • Edit /etc/ntp/step-tickers, adding the hostname of each of your time servers • If you are running a firewall, you will have to allow inbound communication on the NTP port: • -A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT • Activate the ntpd service with the following commands: • ntpdate <your ntp server name> • service ntpd start • chkconfig ntpd on • You can check ntpd’s status with: • ntpq -p
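As an illustration of the /etc/ntp.conf step, the sketch below appends the restrict/server pair for one time server and skips the append if that server is already listed. The function name add_ntp_server is hypothetical, and the demonstration writes to a scratch file rather than the real configuration.

```shell
# Append the restrict/server pair for one time server to an ntp.conf-style
# file, unless that server is already listed (safe to run repeatedly).
add_ntp_server() {
    conf="$1"; name="$2"; ip="$3"
    if grep -qxF "server $name" "$conf" 2>/dev/null; then
        return 0
    fi
    printf 'restrict %s mask 255.255.255.255 nomodify notrap noquery\n' "$ip" >> "$conf"
    printf 'server %s\n' "$name" >> "$conf"
}

# Demonstration on a scratch file; on a real node the target would be /etc/ntp.conf.
demo_conf="$(mktemp)"
add_ntp_server "$demo_conf" ntp-1.infn.it 193.206.144.10
add_ntp_server "$demo_conf" ntp-1.infn.it 193.206.144.10   # second call adds nothing
cat "$demo_conf"
rm -f "$demo_conf"
```

Making the step repeatable means the same script can be rerun on every node without duplicating entries.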
Installing pre-requisites • Install the glite-yaim and gilda_ig-yaim packages on your nodes • Download and install the latest version of glite-yaim-3.0.0-* on all your grid nodes from http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/rhel30/RPMS.Release3.0/ rpm -hiv glite-yaim-3.0.0-16.noarch.rpm Preparing... ########################################### [100%] 1:glite-yaim ########################################### [100%]
Installing pre-requisites • Download and install the latest version of gilda_ig-yaim-3.0.0-* on all your grid nodes from http://grid018.ct.infn.it/apt/gilda_app-i386/utils [root@eelatut37 root]# rpm -hiv gilda_ig-yaim-3.0.0-11.noarch.rpm Preparing... ########################################### [100%] 1:gilda_ig-yaim ########################################### [100%]
Installing pre-requisites • Request host certificates for the CE from a CA • https://gilda.ct.infn.it/CA/mgt/restricted/srvreq.php • Copy the host certificate and key (hostcert.pem and hostkey.pem) to /etc/grid-security. • Change the permissions • chmod 644 hostcert.pem • chmod 400 hostkey.pem • If you plan to use certificates released by CAs not supported by EGEE, be sure that their public keys and CRLs (usually distributed as an rpm) are installed. • The CRL of the GILDA VO is available from https://gilda.ct.infn.it/RPMS/ca_GILDA-1.0-1.i386.rpm
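The permission step can be applied and verified in one go. This sketch assumes GNU stat (the -c option); the function name secure_hostcert is hypothetical, and the demonstration uses empty placeholder files in a scratch directory instead of real credentials.

```shell
# Apply and verify the permissions from the slide: 644 on the certificate,
# 400 on the private key. Assumes GNU stat (the -c option).
secure_hostcert() {
    dir="$1"
    chmod 644 "$dir/hostcert.pem" || return 1
    chmod 400 "$dir/hostkey.pem" || return 1
    [ "$(stat -c '%a' "$dir/hostcert.pem")" = "644" ] &&
        [ "$(stat -c '%a' "$dir/hostkey.pem")" = "400" ]
}

# Demonstration with placeholder files; on a real node the argument would be
# the directory holding the host certificate and key.
demo_dir="$(mktemp -d)"
touch "$demo_dir/hostcert.pem" "$demo_dir/hostkey.pem"
secure_hostcert "$demo_dir" && echo "permissions OK"
rm -rf "$demo_dir"
```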
Installing CE+Torque Server via apt • All site-specific configuration values must be set in a site configuration file using key-value pairs. • This file is shared among all the different gLite node types, so edit it once and keep it in a safe place. • Copy the /opt/glite/yaim/examples/gilda_ig-site-info.def template (provided by the gilda_ig-yaim RPM) to your reference directory for the installation (e.g. /root): • cp /opt/glite/yaim/examples/gilda_ig-site-info.def /root/gilda_ig-site-info.def • A good syntax test for your site configuration file is to source it manually by running the command: • source gilda_ig-site-info.def
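The sourcing test can be wrapped so that a broken file does not affect the calling shell. In this sketch the function name check_site_info is hypothetical, and the two keys in the sample file are illustrative only; the real template defines many more.

```shell
# Source a site configuration file in a subshell, so that a broken file
# can neither pollute nor terminate the calling shell.
check_site_info() {
    ( . "$1" ) >/dev/null 2>&1
}

demo_dir="$(mktemp -d)"
# Illustrative contents only: one well-formed file, one with an unclosed quote.
printf 'SITE_NAME="my-site"\nCE_HOST="ce.example.org"\n' > "$demo_dir/good.def"
printf 'CE_HOST="unterminated\n' > "$demo_dir/bad.def"

for f in good.def bad.def; do
    if check_site_info "$demo_dir/$f"; then
        echo "$f: OK"
    else
        echo "$f: syntax error"
    fi
done
rm -rf "$demo_dir"
```

Note that sourcing executes the file, so this only suits configuration files you trust.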
Installing CE+Torque Server via apt • Edit the Worker Node list and add the FQDN of each Worker Node, one per line: vi /opt/glite/yaim/examples/gilda_wn-list.conf eventogridXXX.uniandes.edu.co …..
Installing CE+Torque Server via apt • Install the node /opt/glite/yaim/scripts/gilda_ig_install_node gilda_ig-site-info.def GILDA_ig_CE_torque • Configure the node /opt/glite/yaim/scripts/gilda_ig_configure_node gilda_ig-site-info.def GILDA_ig_CE_torque
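The install and configure steps can be chained so that configuration only runs if installation succeeds. This is a sketch: the function yaim_node and the RUN and YAIM_SCRIPTS variables are hypothetical conveniences, and by default the commands are only printed (dry run) rather than executed.

```shell
YAIM_SCRIPTS="${YAIM_SCRIPTS:-/opt/glite/yaim/scripts}"
RUN="${RUN:-echo}"   # dry run by default; set RUN="" to really execute

# Install, then configure, one node type; stop if the install step fails.
yaim_node() {
    site_info="$1"; node_type="$2"
    $RUN "$YAIM_SCRIPTS/gilda_ig_install_node" "$site_info" "$node_type" || return 1
    $RUN "$YAIM_SCRIPTS/gilda_ig_configure_node" "$site_info" "$node_type"
}

yaim_node gilda_ig-site-info.def GILDA_ig_CE_torque
```

The same function would serve a Worker Node by passing GILDA_ig_WN_torque as the node type.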
Installing CE+Torque Server via apt • If the installation completes successfully, the following components are installed: • gLite in /opt/glite • Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version) • Globus in /opt/globus • Tomcat in /var/lib/tomcat5 • Torque in /var/spool/pbs
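A quick way to confirm the result is to check that each of the listed directories exists. In this sketch the function check_components is hypothetical; its argument is an optional root prefix (empty on a real node), and the versioned Condor directory is matched with a glob.

```shell
# Report which of the expected installation directories exist under a root
# prefix. Returns non-zero if any is missing.
check_components() {
    root="$1"; missing=0
    for d in /opt/glite /opt/globus /var/lib/tomcat5 /var/spool/pbs; do
        if [ -d "$root$d" ]; then
            echo "found   $d"
        else
            echo "missing $d"; missing=1
        fi
    done
    # Condor's directory carries a version suffix, so match it with a glob.
    set -- "$root"/opt/condor-*
    if [ -d "$1" ]; then
        echo "found   ${1#"$root"}"
    else
        echo "missing /opt/condor-*"; missing=1
    fi
    return "$missing"
}

check_components "" || echo "some components are missing"
```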
Installing CE+Torque Server via apt • Edit /etc/ssh/sshd_config and add the following lines at the end: HostbasedAuthentication yes IgnoreUserKnownHosts yes IgnoreRhosts yes • Restart the server with: /sbin/service sshd restart
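Before restarting sshd, it can help to confirm that the three options are present. The sketch below is a simple text check with a hypothetical function name, check_hostbased; it does not account for an earlier conflicting line in the same file, which sshd would honour first.

```shell
# Report whether an sshd_config-style file sets the three options from the
# slide to "yes". Returns non-zero if any of them is missing.
check_hostbased() {
    conf="$1"; rc=0
    for key in HostbasedAuthentication IgnoreUserKnownHosts IgnoreRhosts; do
        if grep -Eq "^[[:space:]]*$key[[:space:]]+yes[[:space:]]*\$" "$conf"; then
            echo "ok      $key"
        else
            echo "missing $key"; rc=1
        fi
    done
    return "$rc"
}

# Demonstration on a scratch file; on a real node the argument would be
# /etc/ssh/sshd_config.
demo_conf="$(mktemp)"
printf 'HostbasedAuthentication yes\nIgnoreUserKnownHosts yes\nIgnoreRhosts yes\n' > "$demo_conf"
check_hostbased "$demo_conf"
rm -f "$demo_conf"
```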
Installing CE+Torque Server via apt • On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running: • edg-pbs-shostsequiv • edg-pbs-knownhosts • Copy that file to all the Worker Nodes.
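The copy to the Worker Nodes can be driven from a list of hostnames, one per line. This is a sketch under stated assumptions: the function push_known_hosts and the RUN variable are hypothetical, copying as root over scp is an assumption about the site setup, and by default the commands are only printed.

```shell
RUN="${RUN:-echo}"   # dry run by default; set RUN="" to really copy

# Copy the CE's known-hosts file to every host named in a list file.
push_known_hosts() {
    list="$1"
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r wn || [ -n "$wn" ]; do
        case "$wn" in ""|\#*) continue ;; esac   # skip blank lines and comments
        $RUN scp /etc/ssh/ssh_known_hosts "root@$wn:/etc/ssh/ssh_known_hosts" \
            < /dev/null || return 1
    done < "$list"
}

# Demonstration with placeholder hostnames in a scratch list.
demo_list="$(mktemp)"
printf 'wn01.example.org\nwn02.example.org\n' > "$demo_list"
push_known_hosts "$demo_list"
rm -f "$demo_list"
```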
Installing WN via apt • Install the node /opt/glite/yaim/scripts/gilda_ig_install_node gilda_ig-site-info.def GILDA_ig_WN_torque • Configure the node /opt/glite/yaim/scripts/gilda_ig_configure_node gilda_ig-site-info.def GILDA_ig_WN_torque