gLite Information System
UNIANDES OOD Team
Daniel Alberto Burbano Sefair, dburbano@uniandes.edu.co
Michael Angel Pérez Cabarcas, mic-pere@uniandes.edu.co
Universidad de Los Andes (Colombia), 26-27 February 2009
Overview
• BDII introduction
• BDII structure of oper.vo.eu-eela.eu and prod.vo.eu-eela.eu
• Getting information with lcg-infosites and lcg-info
• Learned experiences
• Questions to consolidate knowledge
• References
• Questions
BDII Introduction
• What is it?
  • A system that collects information on the state of the grid resources
• What is it used for?
  • Discovering the resources of the grid and their state
  • Workload management (WMS)
  • Monitoring (health status of the resources)
• How does it work?
  • By monitoring the state of the resources locally and publishing fresh data to the information system
  • By adopting a common data model (the GLUE schema) used by all components
• BDII (Berkeley Database Information Index)
  • BDII is an information system based on LDAP (Lightweight Directory Access Protocol).
  • LDAP is an application-level protocol that provides access to a directory service.
BDII Structure (from the lcg-info point of view)
The resource behind a GRIS can be, for example, an LFC, a WMS, or an LB.
lcg-infosites using TopBDIIs (First level)
• Getting information about the CEs from the TopBDII bdii.eela.ufrj.br

[michael@yali ~]$ echo $LCG_GFAL_INFOSYS
bdii.eela.ufrj.br:2170
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu ce
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
 771     270             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
 771     270             1            1       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
 771     270           201          200       1  gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
 771     270             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
  96      50             0            0       0  ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
 104     104             0            0       0  ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
 146      78             1            1       0  ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaprod
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
 290      86             0            0       0  grid012.ct.infn.it:2119/jobmanager-lcglsf-prod
 112      88             0            0       0  ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaprod
  22      22             0            0       0  ramses.dsic.upv.es:2119/jobmanager-lcgpbs-eela
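The CE listing printed by lcg-infosites is easy to post-process in a script. The following is a minimal Python sketch, not part of lcg-infosites itself: it assumes the column layout shown on this slide (five numeric columns followed by the CE identifier), and the names `CEInfo` and `parse_ce_table` are hypothetical helpers introduced only for illustration.

```python
from typing import List, NamedTuple


class CEInfo(NamedTuple):
    """One row of the `lcg-infosites ... ce` listing (hypothetical helper type)."""
    total_cpus: int
    free_cpus: int
    total_jobs: int
    running: int
    waiting: int
    ce_id: str


def parse_ce_table(text: str) -> List[CEInfo]:
    """Turn the printed CE listing into a list of CEInfo records."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have five integer columns followed by the CE identifier.
        # The header, the dashed rule and blank lines do not match and are skipped.
        if len(fields) == 6 and all(f.isdigit() for f in fields[:5]):
            numbers = [int(f) for f in fields[:5]]
            rows.append(CEInfo(*numbers, fields[5]))
    return rows


# Two rows copied from the slide.
SAMPLE = """\
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
 771     270           201          200       1  gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
"""

ces = parse_ce_table(SAMPLE)
# Queues with no waiting jobs, for example as candidates for a quick test job.
idle = [ce.ce_id for ce in ces if ce.waiting == 0]
print(idle)
```

In practice the text would come from running the command and capturing its standard output; the sketch keeps the sample inline so that it runs without grid access.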
lcg-infosites using TopBDIIs (First level)
• Getting information about the CEs from the TopBDII bdii-eela.ceta-ciemat.es

[michael@yali ~]$ echo $LCG_GFAL_INFOSYS
bdii.eela.ufrj.br:2170
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu ce --is bdii-eela.ceta-ciemat.es
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
 771     269             1            1       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
 771     269             3            3       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
 771     269           201          200       1  gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
 771     269             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
  96      49             2            2       0  ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
 104     104             0            0       0  ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
 146      78             1            1       0  ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaprod
  52      28             0            0       0  kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
 290      81             0            0       0  grid012.ct.infn.it:2119/jobmanager-lcglsf-prod
 112      88             0            0       0  ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaprod
  22      22             0            0       0  ramses.dsic.upv.es:2119/jobmanager-lcgpbs-eela
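The two TopBDIIs do not return the same set of CEs: kuragua.uniandes.edu.co appears only in the answer from bdii-eela.ceta-ciemat.es. A small Python sketch of how such a difference could be detected automatically is given here. It assumes the same column layout as the listings on these slides, uses abbreviated copies of them as input, and `ce_ids` is a hypothetical helper name.

```python
def ce_ids(listing):
    """Return the set of CE identifiers (last column) in an lcg-infosites CE listing."""
    ids = set()
    for line in listing.splitlines():
        fields = line.split()
        # Data rows: five numeric columns, then the CE identifier.
        if len(fields) == 6 and all(f.isdigit() for f in fields[:5]):
            ids.add(fields[5])
    return ids


# Abbreviated copies of the two listings shown on the slides.
FROM_UFRJ = """\
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
  22      22             0            0       0  ramses.dsic.upv.es:2119/jobmanager-lcgpbs-eela
"""
FROM_CETA = """\
  52      28             0            0       0  kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
  22      22             0            0       0  ramses.dsic.upv.es:2119/jobmanager-lcgpbs-eela
"""

# CEs published by the second TopBDII but missing from the first one.
missing = ce_ids(FROM_CETA) - ce_ids(FROM_UFRJ)
print(sorted(missing))
```

A non-empty difference is only a hint that one TopBDII is not collecting a site; the cause still has to be checked on the BDII side.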
lcg-infosites using TopBDIIs (First level)
• Getting information about the CEs for the VO oper.vo.eu-eela.eu

TopBDII: bdii.eela.ufrj.br (the default set in $LCG_GFAL_INFOSYS), VO: oper.vo.eu-eela.eu

[michael@yali ~]$ lcg-infosites --vo oper.vo.eu-eela.eu ce
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
 771     273             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
 771     273             1            1       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
 771     273           201          200       1  gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
 771     273             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
  96      52             0            0       0  ce-eela.ciemat.es:2119/jobmanager-lcgpbs-oper
  22      22             0            0       0  ramses.dsic.upv.es:2119/jobmanager-lcgpbs-edteam
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-oper
 290      87             0            0       0  grid012.ct.infn.it:2119/jobmanager-lcglsf-oper
 146      78             0            0       0  ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaoper
 104     104             0            0       0  ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-oper
 112      88             0            0       0  ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaoper

TopBDII: bdii-eela.ceta-ciemat.es (selected with --is), VO: oper.vo.eu-eela.eu

[michael@yali ~]$ lcg-infosites --vo oper.vo.eu-eela.eu ce --is bdii-eela.ceta-ciemat.es
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
 771     273             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
 771     273             1            1       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
 771     273           201          200       1  gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
 771     273             0            0       0  gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
  96      52             0            0       0  ce-eela.ciemat.es:2119/jobmanager-lcgpbs-oper
  22      22             0            0       0  ramses.dsic.upv.es:2119/jobmanager-lcgpbs-edteam
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-oper
 290      87             0            0       0  grid012.ct.infn.it:2119/jobmanager-lcglsf-oper
 146      78             0            0       0  ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaoper
 104     104             0            0       0  ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-oper
 112      88             0            0       0  ce.eela.cesga.es:2119/jobmanager-lcgsge-eelaoper
lcg-infosites using SiteBDIIs (Second level)
A SiteBDII publishes the information of the site's CE and SE. In the second case (2), the SiteBDII runs on the CE host.

(1)
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is piaroa.uniandes.edu.co
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
  52      28             0            0       0  kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod

Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
427900000       42              n.a     moboro.uniandes.edu.co

(2)
[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is ce-eela.ceta-ciemat.es
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
 104     104             0            0       0  ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod

Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
66020000        1936426         n.a     se-eela.ceta-ciemat.es
lcg-infosites using GRIS (Third level)

[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is kuragua.uniandes.edu.co
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
  52      28             0            0       0  kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod

[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is ce01.eela.if.ufrj.br
#CPU    Free    Total Jobs      Running Waiting ComputingElement
----------------------------------------------------------
  28      28             0            0       0  ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod

[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu all --is lnx105.eela.if.ufrj.br
Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
837640000       3281642         n.a     lnx105.eela.if.ufrj.br
Getting information between CE and SE
This command returns the SE closest to each CE, that is, the SE that belongs to the same site.
What happens when two SEs belong to the same site? In theory, an algorithm weighs the geographic location against the speed of the access path to the storage element. In practice, a variable declares which SE is closest to the CE.

[michael@yali ~]$ lcg-infosites --vo prod.vo.eu-eela.eu closeSE

Name of the CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
        se-eela.ciemat.es
        se-eela.ciemat.es

Name of the CE: ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
        se-eela.ceta-ciemat.es

Name of the CE: ce01.macc.unican.es:2119/jobmanager-lcgpbs-eelaprod
        se01.macc.unican.es

Name of the CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
        moboro.uniandes.edu.co

Name of the CE: ce01.eela.if.ufrj.br:2119/jobmanager-lcgpbs-prod
        lnx105.eela.if.ufrj.br

Name of the CE: grid012.ct.infn.it:2119/jobmanager-lcglsf-prod
        aliserv1.ct.infn.it
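The closeSE answer is a simple CE-to-SE mapping, and a CE may list more than one SE. The Python sketch below shows one possible way to load it into a dictionary. It assumes the text layout printed on this slide (a "Name of the CE:" line followed by one SE per line), and `parse_close_se` is a hypothetical helper name.

```python
def parse_close_se(text):
    """Map each CE identifier to the list of close SEs printed below it."""
    prefix = "Name of the CE:"
    mapping = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator lines
        if line.startswith(prefix):
            current = line[len(prefix):].strip()
            mapping[current] = []
        elif current is not None:
            # Any other non-empty line is an SE that belongs to the current CE.
            mapping[current].append(line)
    return mapping


# Two entries copied from the slide.
SAMPLE = """\
Name of the CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
        se-eela.ciemat.es
        se-eela.ciemat.es

Name of the CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
        moboro.uniandes.edu.co
"""

close = parse_close_se(SAMPLE)
# Collapse repeated SE names while keeping one sorted list per CE.
unique = {ce: sorted(set(ses)) for ce, ses in close.items()}
print(unique)
```

Keeping a list per CE, rather than a single value, leaves room for sites that publish several close SEs.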
lcg-info (List attributes)
This command lists the attributes that lcg-info can use to retrieve information from the information system.

[michael@yali ~]$ echo $LCG_GFAL_INFOSYS
bdii.eela.ufrj.br:2170

The first column is the name used with the lcg-info command to get information about the grid infrastructure. The third column is the name used in conditions in the JDL.

[michael@yali ~]$ lcg-info --list-attrs
Attribute name      Glue object class   Glue attribute name
WorstRespTime       GlueCE              GlueCEStateWorstResponseTime
CEAppDir            GlueCE              GlueCEInfoApplicationDir
TotalCPUs           GlueCE              GlueCEInfoTotalCPUs
MaxRunningJobs      GlueCE              GlueCEPolicyMaxRunningJobs
CE                  GlueCE              GlueCEUniqueID
WaitingJobs         GlueCE              GlueCEStateWaitingJobs
MaxCPUTime          GlueCE              GlueCEPolicyMaxCPUTime
LRMSVersion         GlueCE              GlueCEInfoLRMSVersion
MaxTotalJobs        GlueCE              GlueCEPolicyMaxTotalJobs
CEStatus            GlueCE              GlueCEStateStatus
LRMS                GlueCE              GlueCEInfoLRMSType
CEVOs               GlueCE              GlueCEAccessControlBaseRule
AssignedJobSlots    GlueCE              GlueCEPolicyAssignedJobSlots
FreeCPUs            GlueCE              GlueCEStateFreeCPUs
RunningJobs         GlueCE              GlueCEStateRunningJobs
EstRespTime         GlueCE              GlueCEStateEstimatedResponseTime
FreeJobSlots        GlueCE              GlueCEStateFreeJobSlots
Cluster             GlueCE              GlueCEInfoHostName
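To illustrate how the first and third columns relate, the Python sketch below translates conditions written with the short lcg-info names into a JDL Requirements line that uses the Glue attribute names. It is only a sketch under assumptions: the `other.` prefix and the `&&` operator follow the usual JDL (ClassAd) convention for referring to CE attributes and should be checked against the JDL guide for your middleware version, and `GLUE_NAME` and `jdl_requirements` are hypothetical names.

```python
# Subset of the table printed by `lcg-info --list-attrs`:
# first column (lcg-info name) -> third column (Glue attribute name).
GLUE_NAME = {
    "TotalCPUs": "GlueCEInfoTotalCPUs",
    "FreeCPUs": "GlueCEStateFreeCPUs",
    "WaitingJobs": "GlueCEStateWaitingJobs",
    "MaxCPUTime": "GlueCEPolicyMaxCPUTime",
}


def jdl_requirements(conditions):
    """Build a JDL Requirements line from (short_name, operator, value) triples.

    Assumption: CE attributes are referenced as other.<GlueAttribute> and
    conditions are combined with && (logical AND).
    """
    terms = [
        "other.{} {} {}".format(GLUE_NAME[name], op, value)
        for name, op, value in conditions
    ]
    return "Requirements = " + " && ".join(terms) + ";"


line = jdl_requirements([("TotalCPUs", ">=", 52), ("WaitingJobs", "==", 0)])
print(line)
```

The mapping is deliberately partial; a short name that is not in the table raises a KeyError instead of silently producing an invalid attribute.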
lcg-info (Applications)
This command lists the CEs together with the applications that can be executed in the VO (prod.vo.eu-eela.eu).

[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-ce --attrs Tag
- CE: ce-eela.ceta-ciemat.es:2119/jobmanager-lcgpbs-prod
  - Tag                 LCG-2
                        LCG-2_1_0
                        LCG-2_1_1
                        LCG-2_2_0
                        LCG-2_3_0
                        LCG-2_3_1
                        LCG-2_4_0
                        LCG-2_5_0
                        LCG-2_6_0
                        LCG-2_7_0
                        GLITE-3_0_0
                        GLITE-3_0_1
                        GLITE-3_1_0
                        R-GMA
                        MPICH

- CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
  - Tag                 LCG-2
                        LCG-2_1_0
                        LCG-2_1_1
lcg-info (List applications of a site)
This command shows the applications that can be executed on the CE kuragua.uniandes.edu.co in the VO prod.vo.eu-eela.eu.

[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-ce --attrs Tag --bdii piaroa.uniandes.edu.co:2170
- CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
  - Tag                 LCG-2
                        LCG-2_1_0
                        LCG-2_1_1
                        LCG-2_2_0
                        LCG-2_3_0
                        LCG-2_3_1
                        LCG-2_4_0
                        LCG-2_5_0
                        LCG-2_6_0
                        LCG-2_7_0
                        GLITE-3_0_0
                        GLITE-3_0_1
                        GLITE-3_0_2
                        R-GMA
                        GAUSSIAN03
                        RASTER3D
                        ANGA-1.2.10
                        VO-cms-CMSSW_1_6_12
                        VO-cms-CMSSW_1_8_4
lcg-info (List the closest queues)
This command lists the CE queues that are closest to each SE.

[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-se --attrs CloseCE
- SE: gridstore.cs.tcd.ie
  - CloseCE             gridgate.cs.tcd.ie:2119/jobmanager-pbs-gridwebcom
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-bthirtym
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-asixhour
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-dthreeday
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-himem
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-coneday
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-twoweek
                        gridgate.cs.tcd.ie:2119/jobmanager-pbs-test

- SE: moboro.uniandes.edu.co
  - CloseCE             kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
                        kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-cms
                        kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-oper
                        kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-dteam
                        kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-ops

Using a specific site:

lcg-info --vo prod.vo.eu-eela.eu --list-ce --attrs CloseSE --bdii ce-eela.ciemat.es:2170
lcg-info (given a query)
This command looks for a CE that has 52 CPUs and shows its attributes RunningJobs and FreeCPUs.

[michael@yali ~]$ lcg-info --vo prod.vo.eu-eela.eu --list-ce --query 'TotalCPUs = 52' --attrs 'RunningJobs,FreeCPUs'
- CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
  - RunningJobs         0
  - FreeCPUs            27
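The lcg-info answers can also be turned into a data structure for further processing. The following Python sketch assumes the layout printed on these slides: a "- CE:" line opens a record, each "- Attribute value" line starts an attribute, and additional values of a multi-valued attribute (such as Tag) continue on the following lines. `parse_lcg_info` is a hypothetical helper, and all values are kept as strings.

```python
def parse_lcg_info(text):
    """Parse `lcg-info --list-ce --attrs ...` output into {ce: {attr: [values]}}."""
    records = {}
    ce = None
    attr = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- CE:"):
            # A new CE record starts here.
            ce = line[len("- CE:"):].strip()
            records[ce] = {}
            attr = None
        elif ce is None:
            continue  # ignore anything printed before the first CE
        elif line.startswith("- "):
            # "- <Attribute> <first value>"; the value may be absent.
            parts = line[2:].split(None, 1)
            attr = parts[0]
            records[ce][attr] = parts[1:]
        elif attr is not None:
            # Continuation line: one more value of the current attribute.
            records[ce][attr].append(line)
    return records


# Output copied (and shortened) from the slides.
SAMPLE = """\
- CE: kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod
  - RunningJobs         0
  - FreeCPUs            27

- CE: ce-eela.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
  - Tag                 LCG-2
                        LCG-2_1_0
                        LCG-2_1_1
"""

info = parse_lcg_info(SAMPLE)
free = int(info["kuragua.uniandes.edu.co:2119/jobmanager-lcgpbs-prod"]["FreeCPUs"][0])
print(free)
```

Storing every attribute as a list keeps single-valued and multi-valued attributes uniform; the caller converts to numbers where that makes sense.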
Learned experiences
• The following files are used to publish the list of applications installed at the site. They are located on the CE.
  • /opt/edg/var/info/prod.vo.eu-eela.eu
  • /opt/edg/var/info/oper.vo.eu-eela.eu
• Restart the machine for the new values to be published. We do not know which job controls this update; if you do, please send us an email.
Learned experiences
• Error message 1

[dburbano@nobsa ~]$ glite-wms-job-list-match -a -o id --vo VO-name job.jdl
Warning - --vo option ignored
Connecting to the service https://bache.uniandes.edu.co:7443/glite_wms_wmproxy_server
Error - Operation failed
Unable to perform the operation: The Operation is not allowed: Error during matchmaking: Problems during rank evaluation (e.g. GRISes down, wrong JDL rank expression, etc.)
Method: jobListMatch

• Description
  • The WMS cannot select a CE, but the job can still be submitted with glite-wms-job-submit -a -o output -r <CE> job.jdl, where the -r option names the target CE explicitly.
• Detection
  • One possible cause, as in this case, is a wrong value of the BDII_HOST variable in site-info.def. It must be set to the hostname of the TopBDII.
Learned experiences (Solution to error message 1)
In theory, what should mds-vo-name be?
  TopBDII: must be local
  SiteBDII: must be the site name
  GRIS: must be resource
The next slide shows more information.
• Change the value of BDII_HOST in site-info.def to the hostname of the TopBDII (bdii.eela.ufrj.br or bdii-eela.ceta-ciemat.es), as listed at http://eoc.eu-eela.eu/doku.php?id=central_services, and run YAIM again.
• Verify that the following variables are correctly declared in site-info.def:
  • BDII_REGIONS="CE SE LFC PX MON" # list of the services provided by the site
  • BDII_CE_URL="ldap://$CE_HOST:2170/mds-vo-name=local,o=grid"
  • BDII_SE_URL="ldap://$DPM_HOST:2170/mds-vo-name=local,o=grid"
  • BDII_RB_URL="ldap://$RB_HOST:2170/mds-vo-name=local,o=grid"
• Verify that the mds-vo-name setting of the BDIIs (TopBDII, SiteBDII, GRIS) in /opt/bdii/etc/bdii.conf is configured with the same value:
  BDII_MODIFY_DN=yes
  BDII_BIND=mds-vo-name=local,o=grid
Question: Are these parameters correct for both gLite 3.1 and gLite 3.0? No.
  With gLite 3.1 we use XX_HOST:2170/mds-vo-name=local
  With gLite 3.0 we use XX_HOST:2135/mds-vo-name=resource
gLite Tutorial
Resource information: GRIS
Generic Information Provider (GIP): a configurable information provider that separates static from dynamic information. It produces LDIF files and publishes them in LDAP servers.
The information can be retrieved by contacting a given port:

globus-mds
ldapsearch -x -H ldap://<Resource DN>:2135 -b mds-vo-name=local,o=grid

BDII
ldapsearch -x -H ldap://<Resource DN>:2170 -b mds-vo-name=resource,o=grid

Site BDII
ldapsearch -x -H ldap://<SiteBDII DN>:2170 -b mds-vo-name=<SITE NAME>,o=grid

Top BDII
ldapsearch -x -H ldap://<Resource DN>:2170 -b mds-vo-name=local,o=grid
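The four ldapsearch variants differ only in the port and in the mds-vo-name part of the search base, so they can be generated from a small table. The Python sketch below does exactly that, using only the ports and bases listed on this slide. The level keys, the function name `ldapsearch_argv`, and the site name "MY-SITE" are hypothetical and introduced for illustration.

```python
# Port and mds-vo-name for each level, as listed on the slide.
# The Site BDII has no fixed name: its mds-vo-name is the site name.
LEVELS = {
    "globus-mds": (2135, "local"),
    "bdii": (2170, "resource"),
    "top-bdii": (2170, "local"),
}


def ldapsearch_argv(host, level, site_name=None):
    """Return the ldapsearch argument list for the given host and level."""
    if level == "site-bdii":
        if not site_name:
            raise ValueError("a Site BDII query needs the site name")
        port, vo_name = 2170, site_name
    else:
        port, vo_name = LEVELS[level]
    return [
        "ldapsearch", "-x",
        "-H", "ldap://{}:{}".format(host, port),
        "-b", "mds-vo-name={},o=grid".format(vo_name),
    ]


# "MY-SITE" is a placeholder; use the name your site publishes.
argv = ldapsearch_argv("piaroa.uniandes.edu.co", "site-bdii", "MY-SITE")
print(" ".join(argv))
```

Returning an argument list rather than a single string makes it straightforward to pass the command to a process launcher without shell quoting issues.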
Learned experiences (SiteBDII files) 1/2
SiteBDII: configuration of the bdii.conf file.
The configuration step sets this file up correctly, so do not edit it by hand. Its values come from the site-info.def file.
Learned experiences (SiteBDII files) 2/2
SiteBDII: configuration of the bdii-update.conf file.
Are the framed lines after the GIP line necessary? Do not publish the framed lines to the TopBDIIs, because doing so allows VO users to use these resources.
References
• GILDA Tutorials
  • https://grid.ct.infn.it/twiki/bin/view/GILDA/InformationSystems
• BDII Documentation
  • https://twiki.cern.ch/twiki/bin/view/EGEE/BDII
  • https://twiki.cern.ch/twiki/bin/view/LCG/BdiiNotes
• LCG-2 User Guide
  • https://edms.cern.ch/file/454439//LCG-2-UserGuide.html
• GLUE Schema
  • http://infnforge.cnaf.infn.it/glueinfomodel
Some Exercises
• Topic-related wiki pages:
  • https://grid.ct.infn.it/twiki/bin/view/GILDA/InformationSystems