Installation and configuration of CE+WN
Alicia Acero Fernández, CIEMAT
6th EELA Tutorial, Madrid, 16-20/10/2006

Outline
• What is a Computing Element (CE)?
• What is a Torque Server?
• What is a Worker Node?
• How to install and configure a Computing Element with Torque Server.
• How to install and configure a Worker Node with Torque.
• Computing Element testing

What is a CE?
• The CE is a service representing a computing resource.
• Its main functionality is job management (job submission, job control, etc.).
• For job submission, the CE can work in:
  • push model (where the job is pushed to a CE for its execution).
  • pull model (where the CE asks the WMS for jobs).

What is Torque?
• TORQUE (Tera-scale Open-source Resource and QUEue Manager) is a resource manager that provides control over batch jobs and distributed compute resources.
• The Torque system is composed of:
  • pbs_server, which provides the basic batch services, such as receiving or creating a batch job and protecting the job against system crashes.
  • job_scheduler, which contains the site's policy used to decide which job must be executed.
  • pbs_mom, which places the job into execution. It is also responsible for returning the job's output to the user.

What is a Worker Node?
• The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes:
  • the gLite I/O Client,
  • the Logging and Bookkeeping Client,
  • the R-GMA Client and
  • the WMS Checkpointing library.

Installing CE + Torque Server and WN + Torque

Preliminary and common steps
• Start from an installation of SLC 3.0.X
• Install the Java SDK
• Install and configure the ntp daemon
• Install the latest version of YAIM
• Install the latest version of GILDA YAIM
• Install the X.509 host certificates (required for the CE, SE and RB) in /etc/grid-security and check their file permissions.
• Install the middleware

Installing pre-requisites
• Java is not included in the distribution. Install it separately (version >= 1.4.2_08) from http://java.sun.com/j2se/1.4.2/download.html
    chmod +x j2sdk-1_4_2_12-linux-i586-rpm.bin
    ./j2sdk-1_4_2_12-linux-i586-rpm.bin
    rpm -ivh j2sdk-1_4_2_12-linux-i586.rpm
    Preparing...   ########################################### [100%]
       1:j2sdk     ########################################### [100%]

Installing pre-requisites
• Depending on the package set you selected when installing the operating system, the lam package may be installed on your WN. If so, remove it:
    apt-get remove lam
• There is a known installation conflict between the 'torque-clients' rpm and the 'postfix' mail client (Savannah bug #5509). If you are going to install Torque, uninstall the postfix package:
    apt-get remove postfix
• Check the FQDN hostname. Ensure that the hostnames of your machines are correctly set by running:
    hostname -f

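The hostname check can also be scripted. The following is a minimal sketch, assuming that a usable FQDN has at least two dot-separated labels made of letters, digits or hyphens; `is_fqdn` is a hypothetical helper and not part of YAIM.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the argument looks like a fully qualified
# host name (two or more dot-separated labels of letters, digits or hyphens).
is_fqdn() {
    [[ "$1" =~ ^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$ ]]
}

# Check this machine; `hostname -f` is the command used in the tutorial.
name="$(hostname -f 2>/dev/null || true)"
if is_fqdn "$name"; then
    echo "OK: $name"
else
    echo "WARNING: '$name' is not fully qualified; check /etc/hosts or DNS" >&2
fi
```

The same function can be reused later to sanity-check the entries of the worker node list.
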
Installing pre-requisites
• Download the latest version of glite-yaim-latest.rpm from http://grid-deployment.web.cern.ch/grid-deployment/gis/yaim/ and install it on all your grid nodes:
    rpm -ivh glite-yaim-latest.rpm
• Download the latest version of gilda_ig-yaim-latest from http://grid018.ct.infn.it/apt/gilda_app-i386/utils and install it on all your grid nodes:
    rpm -ivh gilda_ig-yaim-latest

Installing pre-requisites
• Request host certificates for the CE from a CA:
  • https://gilda.ct.infn.it/CA/mgt/restricted/srvreq.php
• Create the directory /etc/grid-security and copy the host certificate files (hostcert.pem and hostkey.pem) into it.
• Change the permissions:
    chmod 644 hostcert.pem
    chmod 400 hostkey.pem
• If you plan to use certificates released by CAs that are not supported by EGEE, be sure that their public keys and CRLs (usually distributed as an rpm) are installed.
• The CRLs of the GILDA VO are available from https://gilda.ct.infn.it/RPMS/ca_GILDA-1.0-1.i386.rpm

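Wrong permissions on the host key are a common cause of later failures, so it can help to verify them after copying. This is a minimal sketch, not a YAIM tool: `check_host_cert_perms` is a hypothetical helper, and it assumes GNU `stat` (as found on Linux).

```shell
#!/bin/bash
# Hypothetical helper: verify that hostcert.pem has mode 644 and hostkey.pem
# has mode 400 in the given directory (default: /etc/grid-security).
# Assumes GNU coreutils, where `stat -c %a` prints the octal mode.
check_host_cert_perms() {
    local dir="${1:-/etc/grid-security}" rc=0 spec file want have
    for spec in hostcert.pem:644 hostkey.pem:400; do
        file="$dir/${spec%%:*}"
        want="${spec##*:}"
        if [ ! -f "$file" ]; then
            echo "MISSING: $file" >&2
            rc=1
            continue
        fi
        have="$(stat -c %a "$file")"
        if [ "$have" != "$want" ]; then
            echo "BAD MODE: $file is $have, expected $want" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage on a real node (as root): check_host_cert_perms /etc/grid-security
```

The function returns a non-zero status if either file is missing or has a different mode, so it can be used in an `if` or chained with `&&`.
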
Installing pre-requisites
• Time synchronization among all gLite nodes is mandatory. Install ntp if it is not already available on your system:
    apt-get install ntp
  or
    yum install ntp
• Add your time server to /etc/ntp.conf:
    restrict hora.rediris.es mask 255.255.255.255 nomodify notrap noquery
    server hora.rediris.es
  (you can also use ntp-1.infn.it, IP 193.206.144.10)
• Edit /etc/ntp/step-tickers and add the hostname of each of your time servers.
• If you are running a firewall, you will have to allow inbound communication on the NTP port:
    -A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT
• Activate the ntpd service with the following commands:
    ntpdate <your ntp server name>
    service ntpd start
    chkconfig ntpd on
• You can check ntpd's status with:
    ntpq -p

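When several nodes have to be configured, the two ntp.conf lines can be generated rather than typed. A small sketch under that assumption follows; `ntp_conf_lines` is a hypothetical helper, and the server name is the one used in the tutorial.

```shell
#!/bin/bash
# Hypothetical helper: print the two /etc/ntp.conf lines that the tutorial
# adds for one time server.
ntp_conf_lines() {
    local server="$1"
    printf 'restrict %s mask 255.255.255.255 nomodify notrap noquery\n' "$server"
    printf 'server %s\n' "$server"
}

# Show the lines for the tutorial's time server.
ntp_conf_lines hora.rediris.es

# To apply them for real (as root):
#   ntp_conf_lines hora.rediris.es >> /etc/ntp.conf
```
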
Installing CE+Torque Server
• All site configuration values have to be set in a site configuration file (site-info.def) using key-value pairs.
• This file is shared among all the different gLite node types, so edit it once and keep it in a safe place.
• Copy the template /opt/glite/yaim/examples/gilda_ig-site-info.def (coming from the lcg-yaim RPM) to your reference directory for the installation (e.g. /root):
    cp /opt/glite/yaim/examples/gilda_ig-site-info.def /root/my-site-info.def
• A good syntax test for your site configuration file is to source it manually:
    source my-site-info.def

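Sourcing the file only catches syntax errors; it does not tell you whether a value you need was left empty. The sketch below extends the manual test under that assumption. `check_site_info` is a hypothetical helper, and the list of variables to check is chosen by the caller, not prescribed by YAIM.

```shell
#!/bin/bash
# Hypothetical helper: source a site-info.def file in a subshell (so the
# calling shell is not polluted) and report which of the named variables
# are unset or empty. Exit status: 0 all set, 1 something missing,
# 2 the file could not be sourced.
check_site_info() {
    local file="$1"
    shift
    (
        source "$file" >/dev/null 2>&1 || { echo "cannot source $file" >&2; exit 2; }
        rc=0
        for var in "$@"; do
            # ${!var} is bash indirect expansion: the value of the variable
            # whose name is stored in $var.
            if [ -z "${!var:-}" ]; then
                echo "unset or empty: $var" >&2
                rc=1
            fi
        done
        exit "$rc"
    )
}

# Usage on a real node:
#   check_site_info /root/my-site-info.def MY_DOMAIN CE_HOST BDII_HOST
```

Running the check in a subshell means a broken file cannot leave half-defined variables behind in your login shell.
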
Customize gilda_ig-site-info.def
• Open the file /root/my-site-info.def and set the values according to your grid environment:
    MY_DOMAIN=fdi.ucm.es
    CE_HOST=grid-xx.$MY_DOMAIN
    RB_HOST=arquimedes.rediris.es
    WMS_HOST=glite-rb.ct.infn.it
    PX_HOST=grid001.ct.infn.it
    BDII_HOST=euclides.rediris.es
    MON_HOST=aliserv6.ct.infn.it
    FTS_HOST=fts.ct.infn.it
    REG_HOST=rgmasrv.ct.infn.it
    :

Customize gilda_ig-site-info.def
    OS_REPOSITORY="rpm http://sunsite.rediris.es/ftp/volumes/vol2/grid.infn.it/rep slc306-i386 os updates extras "
    LCG_REPOSITORY="rpm http://sunsite.rediris.es/ftp/volumes/vol2/grid.infn.it/rep glite_sl3-i386 3_0_0 3_0_0_externals 3_0_0_updates"
    IG_REPOSITORY="rpm http://sunsite.rediris.es/ftp/volumes/vol2/grid.infn.it/rep ig_sl3-i386 3_0_0 utils"
    GILDA_REPOSITORY="rpm http://sunsite.rediris.es/ftp/volumes/vol2/grid.infn.it/rep gilda_app-i386 app 3_0_0"
    CA_REPOSITORY="rpm http://sunsite.rediris.es/ftp/volumes/vol2/grid.infn.it/rep glite_sl3-i386 security"

Customize gilda_ig-site-info.def
    JAVA_LOCATION="/usr/java/j2sdk1.4.2_12"
    MYSQL_PASSWORD=rediris06
    APEL_DB_PASSWORD="APELDB_PWD"
    SITE_EMAIL=info@rediris.es
    SITE_NAME="EELA Tutorial"
    SITE_LOC="Madrid,Spain"
    SITE_LAT=37.5
    SITE_LONG=15.152
    SITE_WEB="http://www.eu-eela.org."
    SITE_TIER="EELA GILDA tutorial"
    SITE_SUPPORT_SITE="root@localhost"

Installing CE+Torque Server
    JOB_MANAGER=lcgpbs
    CE_BATCH_SYS=pbs
    BATCH_BIN_DIR=/usr/bin
    BATCH_VERSION=torque-1.0.1b
    CE_CPU_MODEL=PIII
    CE_CPU_VENDOR=intel
    CE_OS="Scientific Linux SL"
    CE_OS_RELEASE="SL"
    CE_OS_VERSION=3.0.6
    CE_MINPHYSMEM=1024
    CE_MINVIRTMEM=2048
    CE_SMPSIZE=2
    CE_SI00=1000
    CE_SF00=1200
    CE_OUTBOUNDIP=TRUE
    CE_INBOUNDIP=TRUE
    CE_RUNTIMEENV="a list of tags"

Installing CE+Torque Server
• Edit the file /opt/glite/yaim/etc/gilda_ig-wn-list.conf and insert your WN hostnames, one per line:
    grid-xx.fdi.ucm.es

Installing CE+Torque Server
• Copy the files site-info.def, users.conf and wn-list.conf from /opt/lcg/yaim/examples to /opt/lcg/yaim and modify them according to your site.
• Install the node:
    /opt/glite/yaim/scripts/gilda_ig_install_node /root/my-site-info.def GILDA_ig_CE_torque
• Configure the node:
    /opt/glite/yaim/scripts/gilda_ig_configure_node /root/my-site-info.def GILDA_ig_CE_torque

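The configure step only makes sense if the install step succeeded, so the two commands can be chained. The sketch below does that with a dry-run switch; the `run` and `setup_node` functions and the `DRY_RUN` variable are hypothetical, and the script names are taken as written on this slide.

```shell
#!/bin/bash
# Sketch: run the install step and then the configure step for one node type,
# stopping at the first failure. With DRY_RUN=1 (the default here) the
# commands are only printed, so the script is safe to try anywhere.
YAIM_SCRIPTS=/opt/glite/yaim/scripts   # path as given in the tutorial
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_node() {
    local site_info="$1" node_type="$2"
    run "$YAIM_SCRIPTS/gilda_ig_install_node" "$site_info" "$node_type" &&
        run "$YAIM_SCRIPTS/gilda_ig_configure_node" "$site_info" "$node_type"
}

setup_node /root/my-site-info.def GILDA_ig_CE_torque
```

To execute the commands for real, run the script as root with `DRY_RUN=0` in the environment.
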
Installing CE+Torque Server
• If the installation completes successfully, the following components are installed:
  • gLite in /opt/glite
  • Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version)
  • Globus in /opt/globus
  • Tomcat in /var/lib/tomcat5
  • Torque in /var/spool/pbs

Installing CE+Torque Server
• Edit /etc/ssh/sshd_config and add the following lines at the end:
    HostbasedAuthentication yes
    IgnoreUserKnownHosts yes
    IgnoreRhosts yes
• Restart the server with:
    /sbin/service sshd restart

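If the node is reconfigured more than once, appending these lines by hand tends to produce duplicates. A minimal sketch of an idempotent append follows; `append_once` is a hypothetical helper, not part of OpenSSH or YAIM.

```shell
#!/bin/bash
# Hypothetical helper: append each given line to a file unless an identical
# line is already present, so that re-running it does not duplicate entries.
# It does NOT detect a conflicting earlier setting (for example
# "HostbasedAuthentication no"); sshd uses the first value it finds for a
# keyword, so review the file by hand for such lines.
append_once() {
    local file="$1" line
    shift
    for line in "$@"; do
        grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
    done
}

# Usage on the CE (as root):
#   append_once /etc/ssh/sshd_config \
#       "HostbasedAuthentication yes" "IgnoreUserKnownHosts yes" "IgnoreRhosts yes"
#   /sbin/service sshd restart
```
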
Installing CE+Torque Server
• On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:
    /opt/edg/sbin/edg-pbs-knownhosts
• Copy that file to all the Worker Nodes.

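Copying the file to every worker node can be driven from the WN list. The sketch below prints one `scp` command per host by default. `push_known_hosts` is a hypothetical helper; copying as `root` and skipping blank or `#` lines are assumptions, not documented YAIM behaviour.

```shell
#!/bin/bash
# Sketch: print (or run) one scp command per worker node listed in a
# wn-list file. Blank lines and lines starting with '#' are skipped.
KNOWN_HOSTS=/etc/ssh/ssh_known_hosts

push_known_hosts() {
    local wn_list="$1" dry_run="${2:-1}" wn
    # The `|| [ -n "$wn" ]` part handles a last line without a newline.
    while IFS= read -r wn || [ -n "$wn" ]; do
        case "$wn" in
            ''|'#'*) continue ;;
        esac
        if [ "$dry_run" = 1 ]; then
            echo "scp $KNOWN_HOSTS root@$wn:$KNOWN_HOSTS"
        else
            # </dev/null keeps scp from consuming the rest of the host list.
            scp "$KNOWN_HOSTS" "root@$wn:$KNOWN_HOSTS" </dev/null
        fi
    done < "$wn_list"
}

# Usage on the CE: preview first, then pass 0 as second argument to copy.
#   push_known_hosts /opt/glite/yaim/etc/gilda_ig-wn-list.conf
#   push_known_hosts /opt/glite/yaim/etc/gilda_ig-wn-list.conf 0
```
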
Installing WN Server
• Install the node:
    /opt/glite/yaim/scripts/gilda_ig-install_node /root/my-site-info.def GILDA_ig_WN_torque_noapp
• Configure the node:
    /opt/glite/yaim/scripts/gilda_ig-configure_node /root/my-site-info.def GILDA_ig_WN_torque_noapp

Computing Element Testing

CE Testing
• Log in as a newly created user (e.g. gilda003).
• Create a file with the following content:
    #!/bin/sh
    sleep 10
    hostname
• Save it as test.sh and make it executable:
    chmod 700 test.sh
• Submit the job:
    qsub -q short test.sh
• Check the status of the job:
    qstat -a

CE Testing
• After the job has run, there are two output files:
    [dteam048@bhp0101 dteam048]$ ls
    test.sh  test.sh.e12  test.sh.o12
• Display these files:
  • Error file
    [dteam048@bhp0101 dteam048]$ cat test.sh.e12
    [dteam048@bhp0101 dteam048]$
  • Output file
    [dteam048@bhp0101 dteam048]$ cat test.sh.o12
    bhp0101.ciemat.es

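The number in the file names (12 above) is the sequence part of the job identifier that qsub prints, which has the form <sequence>.<server>. A small sketch of that mapping follows; `job_output_files` is a hypothetical helper, and it assumes the default naming in which the job name is the script name.

```shell
#!/bin/bash
# Hypothetical helper: given the job identifier printed by qsub
# (<sequence>.<server>) and the script name, print the names of the
# standard output and standard error files written by default.
job_output_files() {
    local job_id="$1" script="$2"
    local seq="${job_id%%.*}"   # keep only the part before the first dot
    echo "$script.o$seq"
    echo "$script.e$seq"
}

# Usage on the CE:
#   job_id="$(qsub -q short test.sh)"
#   job_output_files "$job_id" test.sh
```
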
References
• https://gilda.ct.infn.it/docs/GILDAsiteinstall-3_0_0.html
• https://grid.ct.infn.it/twiki/bin/view/GILDA/GliteElementInstallation

Questions