Installation and configuration of CE+WN. Angelines Alberto, angelines.alberto@ciemat.es, CIEMAT Grid Tutorial, Sept. 2007.
Outline
• What is a Computing Element (CE)?
• What is a Torque Server?
• What is a Worker Node?
• How to install and configure a Computing Element with Torque Server
• How to install and configure a Worker Node with Torque
• Computing Element Testing
What is a CE?
• The CE is a service representing a computing resource. Its main functionality is job management (job submission, job control, etc.).
• For job submission, the CE can work in:
  • push model (the job is pushed to a CE for execution).
  • pull model (the CE asks the WMS for jobs).
What is Torque?
• TORQUE (Tera-scale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources.
• The Torque system is composed of:
  • pbs_server, which provides the basic batch services such as receiving/creating a batch job or protecting the job against system crashes.
  • job_scheduler, which contains the site's policy used to decide which job must be executed.
  • pbs_mom, which places the job into execution. It is also responsible for returning the job's output to the user.
What is a Worker Node?
• The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes:
  • the gLite I/O Client,
  • the Logging and Bookkeeping Client,
  • the R-GMA Client and
  • the WMS Checkpointing library.
Installing CE + Torque Server and WN + Torque
root GildaVM.06 ./netconfig
Preliminary and common steps
• Start from an installation of SLC 3.0.X
• Install the Java SDK
• Install and configure the ntp daemon
• Install the latest version of YAIM
• Install the latest version of GILDA YAIM
• Install the X.509 host certificates (required for CE, SE and RB) in /etc/grid-security and check their file permissions.
• Install the middleware
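
The permission check on the host certificates can be scripted. The sketch below is only an illustration: the helper name check_mode is hypothetical, it relies on GNU stat, and the file names and modes in the usage comment (hostcert.pem at 644, hostkey.pem at 400) are the usual grid convention rather than something stated on the slide.

```sh
#!/bin/sh
# check_mode FILE MODE: succeed if FILE exists and has exactly the octal MODE.
# (Hypothetical helper; uses GNU stat, as found on Scientific Linux.)
check_mode() {
    [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
    actual=$(stat -c '%a' "$1")      # prints the mode in octal, e.g. 644
    [ "$actual" = "$2" ] || { echo "$1 has mode $actual, expected $2" >&2; return 1; }
}

# Usage on a grid host (assumed conventional file names and modes):
#   check_mode /etc/grid-security/hostcert.pem 644
#   check_mode /etc/grid-security/hostkey.pem  400
```
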
Installing pre-requisites
• Check the FQDN hostname
• Ensure that the hostnames of your machines are correctly set. Run the command:
    hostname -f
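
As a rough illustration of what "correctly set" means here, the following sketch tests whether a name has at least two dot-separated labels. The function name is_fqdn is an assumption made for this example, and the test is purely syntactic: it does not verify that the name resolves in DNS.

```sh
#!/bin/sh
# is_fqdn NAME: succeed if NAME looks like a fully qualified domain name,
# i.e. it has at least two dot-separated, non-empty labels.
is_fqdn() {
    case "$1" in
        ""|.*|*.|*..*) return 1 ;;   # empty, leading/trailing dot, or empty label
        *.*)           return 0 ;;   # at least one dot between labels
        *)             return 1 ;;   # bare short name such as "grid24"
    esac
}

# Usage on a real host (not executed here):
#   is_fqdn "$(hostname -f)" || echo "hostname is not fully qualified" >&2
```
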
Installing pre-requisites
• Edit the list file of your WNs:
    vi /opt/glite/yaim/examples/gilda_wn-list.conf
  and insert one line for each WN, containing its Fully Qualified Domain Name.
Installing pre-requisites
• Copy the gilda_wn-list.conf file to the yaim folder:
    cp /opt/glite/yaim/examples/gilda_wn-list.conf /opt/glite/yaim/
• Check that the file contents are correct, for example:
    gridxx.cica.es
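
Checking the WN list by eye is error-prone once there are many nodes. A minimal sketch of an automated check follows; the function name check_wn_list and the host-name pattern are assumptions for illustration, not part of YAIM.

```sh
#!/bin/sh
# check_wn_list FILE: print every non-empty line of FILE that does not look
# like a fully qualified host name, and return non-zero if any is found.
check_wn_list() {
    bad=$(grep -v -E '^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$' "$1" | grep -v '^$')
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad" | sed 's/^/not fully qualified: /' >&2
        return 1
    fi
    return 0
}

# Usage (path taken from the slide):
#   check_wn_list /opt/glite/yaim/gilda_wn-list.conf
```
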
Customize gilda_ig-site-info.def
• Open the file /root/my-site-info.def and set the values according to your grid environment:
    MY_DOMAIN=cica.es
    CE_HOST=gridXX.$MY_DOMAIN
    RB_HOST=glite-rb.ct.infn.it
    WMS_HOST=glite-rb.ct.infn.it
    PX_HOST=grid001.ct.infn.it
    BDII_HOST=grid004.ct.infn.it
    MON_HOST=aliserv6.ct.infn.it
    FTS_HOST=fts.ct.infn.it
    AMGA_HOST=amga.ct.infn.it
    REG_HOST=rgmasrv.ct.infn.it
    NTP_HOSTS_IP="130.206.1.3"
Customize gilda_ig-site-info.def
    OS_REPOSITORY="rpm http://sunsite.rediris.es/mirror/grid.infn.it/rep slc306-i386 os updates extras"
    LCG_REPOSITORY="rpm http://sunsite.rediris.es/mirror/grid.infn.it/rep glite_sl3-i386 3_0_0 3_0_0_externals 3_0_0_updates"
    IG_REPOSITORY="rpm http://sunsite.rediris.es/mirror/grid.infn.it/rep ig_sl3-i386 3_0_0 utils"
    GILDA_REPOSITORY="rpm http://sunsite.rediris.es/mirror/grid.infn.it/rep gilda_app-i386 app 3_0_0"
    CA_REPOSITORY="rpm http://sunsite.rediris.es/mirror/grid.infn.it/rep glite_sl3-i386 security"
Customize gilda_ig-site-info.def
    MYSQL_PASSWORD=sevillaxx
    JOB_MANAGER=lcgpbs
    CE_BATCH_SYS=pbs
    BATCH_BIN_DIR=/usr/bin
    BATCH_VERSION=torque-1.0.1b
    VOS="gilda"
    ALL_VOMS_VOS="gilda"
Installing CE+Torque Server
    CE_CPU_MODEL=PIII
    CE_CPU_VENDOR=intel
    CE_OS="Scientific Linux SL"
    CE_OS_RELEASE="SL"
    CE_OS_VERSION=3.0.6
    CE_MINPHYSMEM=1024
    CE_MINVIRTMEM=2048
    CE_SMPSIZE=2
    CE_SI00=1000
    CE_SF00=1200
    CE_OUTBOUNDIP=TRUE
    CE_INBOUNDIP=FALSE
    CE_RUNTIMEENV="a list of tags"
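
Since the site-info file is a plain list of shell variable assignments, a quick sanity check before running the installer can catch a forgotten or empty value. The sketch below is an assumption-laden illustration: check_site_info is a hypothetical helper, not a YAIM tool, and the list of variables to verify is whatever the caller passes in. It sources the file in a subshell so nothing leaks into the calling shell.

```sh
#!/bin/sh
# check_site_info FILE VAR...: source FILE in a subshell and report each VAR
# that is unset or empty. Returns non-zero if any variable is missing.
# FILE should be given with a path (e.g. /root/my-site-info.def), because
# "." searches PATH for names that contain no slash.
check_site_info() {
    file=$1; shift
    (
        . "$file" || exit 2
        status=0
        for var in "$@"; do
            eval "value=\${$var:-}"      # indirect lookup of the variable named in $var
            if [ -z "$value" ]; then
                echo "missing or empty: $var" >&2
                status=1
            fi
        done
        exit $status
    )
}

# Usage (variable names taken from the slides):
#   check_site_info /root/my-site-info.def MY_DOMAIN CE_HOST JOB_MANAGER CE_BATCH_SYS VOS
```

Sourcing executes the file, so this should only be run on a site-info file you trust.
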
Installing CE+Torque Server
• Copy the following files from /opt/glite/yaim/examples to /opt/glite/yaim and modify them according to your site:
  • users.conf (the example file is correct as it is)
  • wn-list.conf (one line per WN, for example gridxx.cica.es)
Installing CE+Torque Server
• Install the node:
    /opt/glite/bin/gilda_ig_install_node /root/my-site-info.def GILDA_ig_CE_torque
• Configure the node:
    /opt/glite/bin/gilda_ig_configure_node /root/my-site-info.def GILDA_ig_CE_torque GILDA_ig_BDII_site
Installing CE+Torque Server
• Edit /etc/ssh/sshd_config and add the following lines at the end:
    HostbasedAuthentication yes
    IgnoreUserKnownHosts yes
    IgnoreRhosts yes
• Restart the server with:
    /sbin/service sshd restart
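
If the sshd_config change has to be repeated on several machines, it helps to make it idempotent so that re-running the step does not duplicate lines. The following is a minimal sketch; the helper names ensure_line and add_hostbased_auth are hypothetical.

```sh
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line is already there.
ensure_line() {
    grep -q -x -F -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# add_hostbased_auth FILE: add the three options listed on the slide to FILE.
add_hostbased_auth() {
    ensure_line "$1" 'HostbasedAuthentication yes'
    ensure_line "$1" 'IgnoreUserKnownHosts yes'
    ensure_line "$1" 'IgnoreRhosts yes'
}

# Usage as root on the CE (not executed here):
#   add_hostbased_auth /etc/ssh/sshd_config && /sbin/service sshd restart
```

Note that sshd uses the first value it finds for each keyword, so if the file already sets one of these options to a different value earlier on, that earlier line needs to be edited by hand; this sketch only appends.
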
Installing CE+Torque Server
• On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:
    /opt/edg/sbin/edg-pbs-knownhosts
• Copy this file to each WN.
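
Copying the known-hosts file to every WN can be driven from the WN list. The sketch below only prints the commands instead of running them, so they can be reviewed first. The function name known_hosts_copy_cmds, the use of scp, and the root login are all assumptions for illustration; the slides do not say how the copy is done.

```sh
#!/bin/sh
# known_hosts_copy_cmds WN_LIST: print one scp command per worker node
# listed in WN_LIST (one host name per line, blank lines ignored).
known_hosts_copy_cmds() {
    while IFS= read -r wn || [ -n "$wn" ]; do
        [ -n "$wn" ] || continue
        printf 'scp /etc/ssh/ssh_known_hosts root@%s:/etc/ssh/ssh_known_hosts\n' "$wn"
    done < "$1"
}

# Usage: review the output, then pipe it to sh to actually run the copies.
#   known_hosts_copy_cmds /opt/glite/yaim/wn-list.conf
```
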
Installing WN Server
• Install the node:
    /opt/glite/bin/gilda_ig-install_node /root/my-site-info.def GILDA_ig_WN_torque
• Configure the node:
    /opt/glite/bin/gilda_ig-configure_node /root/my-site-info.def GILDA_ig_WN_torque
Computing Element Testing
CE Testing
• Log in as a newly created user (e.g. gilda003)
• Create a file with the following contents:
    #!/bin/sh
    hostname
    sleep 10
• Save it as test.sh and make it executable:
    chmod +x test.sh
• Submit the job:
    qsub -q short test.sh
• Check the status of the job:
    qstat -a
CE Testing
• After the job has run there are two output files next to the script:
    ls
    test.sh test.sh.e12 test.sh.o12
• Display these files:
  • Error file:
      cat test.sh.e12
  • Output file (it contains the name of the host that ran the job):
      more test.sh.o12
      grid24.cica.es
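
The numeric suffix of the two files is the sequence number of the job. Assuming qsub prints a job identifier of the form number.server (for example 12.grid24.cica.es, an illustrative value consistent with the file names above), the expected file names can be derived as in this sketch; job_output_files is a hypothetical helper.

```sh
#!/bin/sh
# job_output_files SCRIPT JOBID: print the default stdout and stderr file
# names for a job, assuming JOBID has the form "<number>.<server>".
job_output_files() {
    num=${2%%.*}                      # keep only the part before the first dot
    printf '%s.o%s\n%s.e%s\n' "$1" "$num" "$1" "$num"
}

# Usage on the CE (not executed here):
#   jobid=$(qsub -q short test.sh)
#   job_output_files test.sh "$jobid"
```
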