Problem of Application Job Monitoring in GRID Systems
V. Kalyaev (kalyaev@theory.sinp.msu.ru), A. Kryukov (kryukov@theory.sinp.msu.ru)
SINP MSU, Moscow
A. Kryukov, NEC-2003, Varna, 15-20 September
Outline
• Introduction
• Impala/McRunJob solution
• GRID and Application Job Monitoring
• Conclusion
Introduction: Job Monitoring in GRID
The GRID provides some monitoring facilities. However, these facilities report only the general status of a job:
• Scheduled
• Running
• Canceled
• Finished
This is completely insufficient for complex applications.
What is Application Job Monitoring?
Consider a very simple example: CMSIM. The program's summary information is the number of generated events. The user can use this number to diagnose how event generation is progressing. It is therefore very important to supply the user with application-specific information in real time.
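As an illustration of what "application-specific information" could look like in practice, the sketch below extracts an event counter from one line of application output. The line format ("completed 20 from 200 events") and the function name `parse_progress` are assumptions made for this example; the actual CMSIM output format is not shown in the slides.

```python
import re

# Assumed progress-line format, for illustration only:
# "completed <done> from <total> events"
PROGRESS_RE = re.compile(r"completed\s+(\d+)\s+from\s+(\d+)\s+events")


def parse_progress(line):
    """Return (done, total) if the line is a progress report, else None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A monitoring tool could call `parse_progress` on every output line and publish only the lines that yield a counter.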
MC Event Simulation for LHC (CMS example)
• Simulation of physical events
  • Pythia
• Detector simulation
  • GEANT-3/4
• Digitization (overlap, noise)
  • ORCA
• Reconstruction
  • ORCA
Impala/McRunJob scheme
[Diagram: two JOBs, each a MySQL client, connected to a MySQL server]
• Insecure.
• The user has to know where the information is stored.
• The type of monitoring information is predefined.
MC event generation with GRID
[Diagram labels: GRID middleware, RB, PC farm, PC farm]
MC event generation with GRID
[Diagram labels: MySQL server, GRID middleware, RB, PC farm, PC farm]
Application Job Monitoring Scheme
[Diagram labels]
• Nodes: UI, WN, RB, CE
• Components: atm-user-register, atm-job-wrapper, atm-job-register, original job, atm-jdl-parser, edg-job-submit, monitor, atm-job-register-c, atm-register-s, atm-user-register-c, atm-user-register-s, atm-job-monitor-s
• Databases: ATM DB, Allowed user DB, Allowed job DB, Job status DB
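The scheme places a wrapper (atm-job-wrapper) around the original job on the worker node. The slides do not show how the wrapper is implemented, so the following is only a minimal sketch of the general idea: start the original job as a child process and hand each line of its standard output to a reporting function as soon as it arrives. The names `run_and_monitor` and `report` are hypothetical; in a real deployment `report` would send the line to the monitoring service.

```python
import subprocess
import sys


def run_and_monitor(command, report):
    """Run the original job and pass each stdout line to report() as it arrives.

    `report` is a hypothetical callback standing in for whatever
    transport delivers the line to the monitoring service.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the monitored stream
        text=True,
        bufsize=1,  # line-buffered on the reading side
    )
    for line in process.stdout:
        report(line.rstrip("\n"))
    return process.wait()  # exit code of the original job


if __name__ == "__main__":
    # Example: wrap a trivial job and echo what it prints.
    run_and_monitor([sys.executable, "-c", "print('job started')"], print)
```

The wrapper only sees output that the child has actually written to the pipe, so the job's own output buffering still determines how promptly lines arrive.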
Application Job Monitoring: Web Interface
[Diagram labels: Job status DB, Authentication, Web Server, Web Client]
JDL Example
Executable = "atm-wrapper";
StdOutput = "aliroot.out";
StdError = "aliroot.err";
InputSandbox = {"atm-wrapper", "start_aliroot2.sh", "rootrc", "grun2.C", "Config.C"};
OutputSandbox = {"aliroot.err", "aliroot.out", "galice.root"};
RetryCount = 10;
Arguments = "-id=123 -password=567 -site=test.domain /bin/sh start_aliroot.sh 3.02.04 3.07.01";
Requirements = Member(other.RunTimeEnvironment, "ALICE-3.07.01");
The old JDL file is converted to the new one automatically.
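The automatic conversion can be pictured as a rewrite of two attributes: the wrapper becomes the executable, and the original executable moves to the front of the arguments after the monitoring options. The sketch below is an assumption about what such a conversion might do, not the actual atm-jdl-parser; the function name `wrap_jdl` is hypothetical, the option names follow the example above, and the sketch does not add the wrapper to InputSandbox.

```python
import re


def wrap_jdl(jdl, job_id, password, site, wrapper="atm-wrapper"):
    """Rewrite Executable/Arguments so that the job runs under the wrapper."""
    exe = re.search(r'Executable\s*=\s*"([^"]*)"\s*;', jdl)
    args = re.search(r'Arguments\s*=\s*"([^"]*)"\s*;', jdl)
    if exe is None:
        raise ValueError("JDL has no Executable attribute")

    old_args = args.group(1) if args else ""
    prefix = f"-id={job_id} -password={password} -site={site}"
    # Original command line goes after the monitoring options.
    new_args = f"{prefix} {exe.group(1)} {old_args}".strip()

    jdl = jdl.replace(exe.group(0), f'Executable = "{wrapper}";')
    if args:
        jdl = jdl.replace(args.group(0), f'Arguments = "{new_args}";')
    else:
        jdl += f'\nArguments = "{new_args}";'
    return jdl


old_jdl = (
    'Executable = "/bin/sh";\n'
    'Arguments = "start_aliroot.sh 3.02.04 3.07.01";\n'
)
new_jdl = wrap_jdl(old_jdl, 123, 567, "test.domain")
```

A regular-expression rewrite like this handles only simple, single-line attributes; a full converter would need a real JDL parser.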
Problems of IO Buffering
• If a program writes something like “completed 20 from 200 events” to standard output, the output buffer may fill up, and the message become visible, only after 20 hours of work.
  • modify the code to flush the IO buffer explicitly
  • forbid the use of IO buffering
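The first remedy, flushing explicitly after each progress message, can be sketched as follows. The function name `report_progress` is hypothetical, and Python stands in here for whatever language the application is written in.

```python
import sys


def report_progress(done, total, stream=sys.stdout):
    """Write one progress line and flush it so a monitor sees it at once."""
    stream.write(f"completed {done} from {total} events\n")
    stream.flush()  # without this, the line may sit in the buffer for hours


if __name__ == "__main__":
    report_progress(20, 200)
```

For the second remedy, a Python program can be started with the interpreter's `-u` option, which makes standard output unbuffered without changing the code.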
Conclusions
• Security
  • GSI
  • A user can monitor only his own jobs.
• Monitoring information
  • In the current implementation: standard output.
• There is a Web interface for authorized access to application job status.
• We plan to re-implement the scheme using OGSA/Globus3.