EGEE Middleware: The Resource Broker
EGEE project members
Contents
• Short review of concepts
• Requirements of the applications communities
• Overview of the main grid services
• A closer look
EGEE ResourceBroker
Current production middleware
[Diagram. Components: “User interface”, Resource Broker, Information Service, LCG File Catalogue (LFC), Authorization & Authentication, Logging & Book-keeping, Computing Element, Storage Element. Labelled flows: input “sandbox”, output “sandbox”, input “sandbox” + broker info, DataSets info, SE & CE info, job submit event, job query, publish, job status.]
Building on basic tools and Information Service
Example JDL file:
    Executable = "gridTest";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    InputSandbox = {"/home/joda/test/gridTest"};
    OutputSandbox = {"stderr.log", "stdout.log"};
    …
Submit the job to the grid via the “resource broker”:
    edg-job-submit my.jdl
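A JDL file is a flat list of `Attribute = value;` statements, so the example above can also be generated from a script. The following is only a sketch: `render_jdl` is a hypothetical helper, not part of the EGEE tools, and it handles just strings and lists of strings.

```python
def render_jdl(attrs):
    """Render a dict of JDL attributes as JDL text (strings and string lists only)."""
    lines = []
    for name, value in attrs.items():
        if isinstance(value, (list, tuple)):
            # JDL lists are written as {"a", "b"}
            rendered = "{" + ", ".join(f'"{v}"' for v in value) + "}"
        else:
            rendered = f'"{value}"'
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines)


# Attribute values taken from the example JDL file on the slide.
job = {
    "Executable": "gridTest",
    "StdError": "stderr.log",
    "StdOutput": "stdout.log",
    "InputSandbox": ["/home/joda/test/gridTest"],
    "OutputSandbox": ["stderr.log", "stdout.log"],
}

jdl_text = render_jdl(job)
print(jdl_text)
```

The resulting text could be saved as my.jdl and passed to the submit command shown on the slide. Expression-valued attributes such as Requirements and Rank are not quoted in JDL and would need separate handling.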
The user’s interface to the Grid
Command-line interface to:
• Proxy server
• Job operations: submit a job, monitor its status, retrieve output
• Data operations: upload a file to an SE, create a replica, discover replicas
• Other grid services
Also C++ and Java APIs.
To run a job, the user creates a JDL (Job Description Language) file.
[Diagram: a JDL file on the UI (User Interface node)]
Building on basic tools and Information Service
Submit the job to the grid via the “resource broker” (RB):
    edg-job-submit my.jdl
Returns a “job-id” used to monitor the job and retrieve its output.
Example JDL file:
    Executable = "gridTest";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    InputSandbox = {"/home/joda/test/gridTest"};
    OutputSandbox = {"stderr.log", "stdout.log"};
    InputData = "lfn:/grid/VOname/mydir/testbed0-00019";
    DataAccessProtocol = "gridftp";
    Requirements = other.Architecture=="INTEL" && \
                   other.OpSys=="LINUX" && other.FreeCpus >=4;
    Rank = other.GlueHostBenchmarkSF00;
Building on basic tools and Information Service: input data
    InputData = "lfn:/grid/VOname/mydir/testbed0-00019";
    DataAccessProtocol = "gridftp";
lfn: logical file name. The RB uses the catalogue to find the replica locations.
Building on basic tools and Information Service: requirements and rank
    Requirements = other.Architecture=="INTEL" && \
                   other.OpSys=="LINUX" && other.FreeCpus >=4;
    Rank = other.GlueHostBenchmarkSF00;
Matching against Requirements and Rank uses the BDII Information System.
Job submission
[Diagram: UI; RB node (Network Server, Workload Manager, Job Controller - CondorG); LFC; Information Service (CE and SE characteristics and status); Computing Element; Storage Element]
UI: allows users to access the functionality of the WMS (via command line, GUI, C++ and Java APIs)
WMS: Workload Management System
[Diagram: job submission components, with a Job Status panel]
edg-job-submit myjob.jdl
myjob.jdl:
    JobType = "Normal";
    Executable = "$(CMS)/exe/sum.exe";
    InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
    OutputSandbox = {"sim.err", "test.out", "sim.log"};
    Requirements = other.GlueHostOperatingSystemName == "linux" &&
                   other.GlueHostOperatingSystemRelease == "Red Hat 7.3" &&
                   other.GlueCEPolicyMaxCPUTime > 10000;
    Rank = other.GlueCEStateFreeCPUs;
Job Description Language (JDL): used to specify job characteristics and requirements
Job status: submitted
[Diagram: job submission components; the catalogue is labelled Replica Location Server on this slide]
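To make the structure of such a file concrete, here is a minimal sketch that splits the JDL above into attribute/value pairs. It is not the parser used by the EGEE tools; `parse_jdl` and `parse_string_list` are hypothetical helpers, and the sketch assumes that no `;` occurs inside a quoted string.

```python
import re


def parse_jdl(text):
    """Split JDL text into {attribute: raw value string}.

    Assumption: no ';' appears inside a quoted string.
    """
    attrs = {}
    for statement in text.split(";"):
        match = re.match(r"\s*(\w+)\s*=\s*(.*\S)\s*$", statement, re.S)
        if match:
            # Collapse line breaks and indentation inside multi-line expressions.
            attrs[match.group(1)] = " ".join(match.group(2).split())
    return attrs


def parse_string_list(raw):
    """Turn a raw JDL list such as {"a", "b"} into a Python list of strings."""
    return re.findall(r'"([^"]*)"', raw)


MYJOB = '''
JobType = "Normal";
Executable = "$(CMS)/exe/sum.exe";
InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
OutputSandbox = {"sim.err", "test.out", "sim.log"};
Requirements = other.GlueHostOperatingSystemName == "linux" &&
               other.GlueHostOperatingSystemRelease == "Red Hat 7.3" &&
               other.GlueCEPolicyMaxCPUTime > 10000;
Rank = other.GlueCEStateFreeCPUs;
'''

job = parse_jdl(MYJOB)
print(job["Rank"])
print(parse_string_list(job["OutputSandbox"]))
```

Requirements and Rank are kept as raw expression strings here; evaluating them against resource information is the job of the matchmaking step.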
NS (Network Server): network daemon responsible for accepting incoming requests
Job status: submitted, waiting
[Diagram: the job and its Input Sandbox files arrive at the Network Server; RB storage appears on the RB node]
WM (Workload Manager): responsible for taking the appropriate actions to satisfy the request
Job status: submitted, waiting
[Diagram: job submission components, with the job on the RB node]
Job submission
Match-Maker/Broker: where must this job be executed?
Job status: submitted, waiting
[Diagram: a Match-Maker/Broker appears on the RB node alongside the Workload Manager]
Job submission
Matchmaker: responsible for finding the “best” CE to which to submit a job
Job status: submitted, waiting
Job submission
Questions the Match-Maker/Broker has to answer:
• Where are the needed data (on which SEs)? (LFC)
• What is the status of the Grid? (Information Service)
Job status: submitted, waiting
Job submission
Result of the matchmaking: CE choice
Job status: submitted, waiting
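The matchmaking steps on the preceding slides (check the requirements, find where the data are, pick the “best” CE) can be sketched as follows. This is an illustration under stated assumptions, not the broker's implementation: the real broker evaluates the JDL Requirements and Rank expressions against the Information Service, whereas here they are plain Python functions. The CE records, host names, the `CloseSEs` field and the rule "keep only CEs close to an SE holding a replica" are all hypothetical.

```python
def choose_ce(ces, requirements, rank, replica_ses=None):
    """Return the best-ranked CE that satisfies the requirements, or None."""
    candidates = [ce for ce in ces if requirements(ce)]
    if replica_ses:
        # Assumed policy: keep only CEs with a close SE that holds the input data.
        candidates = [ce for ce in candidates if replica_ses & set(ce["CloseSEs"])]
    return max(candidates, key=rank, default=None)


# Hypothetical snapshot of CE characteristics and status.
ces = [
    {"Id": "ce-a.example.org", "GlueCEPolicyMaxCPUTime": 20000,
     "GlueCEStateFreeCPUs": 12, "CloseSEs": ["se-a.example.org"]},
    {"Id": "ce-b.example.org", "GlueCEPolicyMaxCPUTime": 5000,
     "GlueCEStateFreeCPUs": 40, "CloseSEs": ["se-b.example.org"]},
    {"Id": "ce-c.example.org", "GlueCEPolicyMaxCPUTime": 50000,
     "GlueCEStateFreeCPUs": 30, "CloseSEs": ["se-c.example.org"]},
]


def requirements(ce):
    # Mirrors: other.GlueCEPolicyMaxCPUTime > 10000
    return ce["GlueCEPolicyMaxCPUTime"] > 10000


def rank(ce):
    # Mirrors: Rank = other.GlueCEStateFreeCPUs
    return ce["GlueCEStateFreeCPUs"]


best_without_data = choose_ce(ces, requirements, rank)
best_with_data = choose_ce(ces, requirements, rank,
                           replica_ses={"se-a.example.org"})
print(best_without_data["Id"], best_with_data["Id"])
```

With no input data the CE with the most free CPUs among those meeting the requirements is chosen; once replica locations are taken into account, the choice moves to the CE near the data.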
Job submission
JA (Job Adapter): responsible for the final “touches” to the job before performing submission (e.g. creation of the wrapper script)
Job status: submitted, waiting
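As a rough idea of what "creation of a wrapper script" could mean, the sketch below generates a small shell wrapper that runs the user's executable and redirects its output streams to the files named in the JDL. The content of the real wrapper is not described in these slides; `make_wrapper` and the script it produces are hypothetical.

```python
import shlex


def make_wrapper(executable, stdout, stderr, arguments=()):
    """Return the text of a shell wrapper that runs the job and captures its output."""
    command = " ".join(shlex.quote(part) for part in ("./" + executable, *arguments))
    return "\n".join([
        "#!/bin/sh",
        "# Hypothetical wrapper: run the user executable from the job's working directory.",
        f"chmod +x {shlex.quote(executable)}",
        f"{command} > {shlex.quote(stdout)} 2> {shlex.quote(stderr)}",
        "exit $?",
        "",
    ])


# File names taken from the first JDL example in this presentation.
wrapper = make_wrapper("gridTest", "stdout.log", "stderr.log")
print(wrapper)
```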
Job submission
JC (Job Controller): responsible for the actual job management operations (done via CondorG)
Job status: submitted, waiting, ready
Job submission
Job status: submitted, waiting, ready, scheduled
[Diagram: the job and its Input Sandbox files are shown at the Computing Element]
Job status: submitted, waiting, ready, scheduled, running
[Diagram: Input Sandbox; “Grid enabled” data transfers/accesses between the Computing Element and the Storage Element]
Job status: submitted, waiting, ready, scheduled, running, done
[Diagram: Output Sandbox files shown between the Computing Element and the RB node]
edg-job-get-output <dg-job-id>
Job status: submitted, waiting, ready, scheduled, running, done
[Diagram: Output Sandbox]
Job status: submitted, waiting, ready, scheduled, running, done, cleared
[Diagram: Output Sandbox files]
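The status sequence built up over the preceding slides can be written down as a small state machine. This sketch covers only the success path shown in the walkthrough; failure paths are not shown there and are left out. The names `JobState`, `HAPPY_PATH` and `next_state` are illustrative.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    WAITING = "waiting"
    READY = "ready"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    DONE = "done"
    CLEARED = "cleared"


# Enum members iterate in definition order, which is the order used in the walkthrough.
HAPPY_PATH = list(JobState)


def next_state(state):
    """Return the state that follows `state` on the success path, or None after 'cleared'."""
    index = HAPPY_PATH.index(state)
    return HAPPY_PATH[index + 1] if index + 1 < len(HAPPY_PATH) else None


state = JobState.SUBMITTED
trace = [state.value]
while (state := next_state(state)) is not None:
    trace.append(state.value)
print(" -> ".join(trace))
```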
Job monitoring
Commands run from the UI:
    edg-job-status <dg-job-id>
    edg-job-get-logging-info <dg-job-id>
LB (Logging & Bookkeeping): receives and stores job events; processes the corresponding job status
LM (Log Monitor): parses the CondorG log file (where CondorG logs information about jobs) and notifies the LB
[Diagram: UI; RB node (Network Server, Workload Manager, Job Controller - CondorG, Log Monitor); Logging & Bookkeeping (job status, log of job events); Computing Element]
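The division of labour described above (components log events, the LB stores them and derives a status) can be sketched as follows. This is a simplified model, not the LB service: the event names and the `Bookkeeper` class are hypothetical, and the status is taken from the most recent event only.

```python
from collections import defaultdict

# Hypothetical event names mapped to the job state they imply;
# the real LB event set is not listed in these slides.
EVENT_TO_STATE = {
    "registered": "submitted",
    "accepted": "waiting",
    "matched": "ready",
    "transferred": "scheduled",
    "started": "running",
    "finished": "done",
    "output_retrieved": "cleared",
}


class Bookkeeper:
    """Stores job events in arrival order and derives each job's current status."""

    def __init__(self):
        self._events = defaultdict(list)

    def log(self, job_id, source, event):
        """Record that `source` (e.g. the Log Monitor) reported `event` for the job."""
        self._events[job_id].append((source, event))

    def status(self, job_id):
        """Rough analogue of a status query: the state implied by the latest event."""
        events = self._events.get(job_id)
        if not events:
            return None
        return EVENT_TO_STATE[events[-1][1]]

    def logging_info(self, job_id):
        """Rough analogue of a logging-info query: the full event history."""
        return list(self._events.get(job_id, []))


lb = Bookkeeper()
lb.log("job-1", "NetworkServer", "registered")
lb.log("job-1", "NetworkServer", "accepted")
lb.log("job-1", "LogMonitor", "started")
print(lb.status("job-1"))
print(lb.logging_info("job-1"))
```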
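Possible job states
[Diagram not recoverable; the states shown in the walkthrough are submitted, waiting, ready, scheduled, running, done and cleared]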