JIM and SAMGrid for CDF
Mòrag Burgon-Lyon
University of Glasgow
JIM for CDF

Contents
• What is JIM?
• What is SAMGrid?
• How does JIM relate to SAMGrid?
• Components of JIM
• Using JIM
• Job Types
• Station Setup
• Deployment Plan

What is JIM?
Job and Information Management
• Job Management Infrastructure is the framework that allows jobs to be submitted for execution on a cluster with enough resources to complete them.
• Information Management is knowing which resources are available and what the status of the jobs is.

What is SAMGrid?
• SAMGrid is a grid infrastructure whose goal is to enable globally distributed computing for the current experiments at Fermilab: DØ, CDF and MINOS.
• In communication with the LHC experiments CMS (Fermilab) and ATLAS (Brookhaven)

How does JIM relate to SAMGrid?
JIM complements SAM (Sequential Access via Metadata) to provide complete grid services:
• Job Management
• Information and Monitoring
• Data Handling

How does JIM relate to SAMGrid?
How does JIM relate to SAMGrid?
• JIM allows a user to submit jobs to SAMGrid and to access the output files on completion.
• JIM chooses which resources will be used to execute a job. The decision is based on how much of the data required by the job is already cached at each site.
• JIM submits the job to the local batch system (BS), and SAM provides the I/O data management for the files.

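The cache-based choice of resources described above can be illustrated with a minimal sketch. The `Site` class, the function names and the "largest cached fraction wins" rule are assumptions made for illustration only; they are not the JIM broker's actual code or ranking formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical description of an execution site (not a SAMGrid type)."""
    name: str
    cached_files: frozenset  # dataset files already in this site's cache


def cached_fraction(site: Site, dataset_files: set) -> float:
    """Fraction of the job's input files that the site already holds."""
    if not dataset_files:
        return 0.0
    return len(site.cached_files & dataset_files) / len(dataset_files)


def choose_site(sites: list, dataset_files: set) -> Site:
    """Pick the site with the largest cached fraction (first one wins ties)."""
    if not sites:
        raise ValueError("no execution sites available")
    return max(sites, key=lambda s: cached_fraction(s, dataset_files))


if __name__ == "__main__":
    dataset = {"f1", "f2", "f3", "f4"}
    sites = [
        Site("site_a", frozenset({"f1"})),
        Site("site_b", frozenset({"f1", "f2", "f3"})),
    ]
    print(choose_site(sites, dataset).name)
```

A real broker would presumably weigh further criteria as well; the sketch only shows the data-locality idea.
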
How does JIM relate to SAMGrid?
• The Condor Match Making Service (MMS) was extended for SAMGrid:
  • the broker can query a SAM station to see how much data is already present
  • a Globus gatekeeper is selected dynamically
  • the match is determined by calling external custom code, e.g. in SAMGrid the SAM station is invoked to determine the rank of a match
• The Globus Toolkit is used for job transfer and monitoring

How does JIM relate to SAMGrid?
• Distinguishes grid-level (global) scheduling (selecting a cluster to run on) from local scheduling (distributing the job within the cluster).
• Distinguishes structured jobs (where the details are known to the Grid middleware) from unstructured jobs (where the whole job is mapped onto a single cluster).

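The split between global and local scheduling can be sketched as two separate functions. Everything here is an assumption for illustration: the "most worker nodes" criterion stands in for whatever the grid-level broker really uses, and the round-robin placement stands in for the local batch system.

```python
from itertools import cycle


def global_schedule(clusters: dict) -> str:
    """Grid-level step: select one cluster for the whole job.

    `clusters` maps a cluster name to its list of worker-node names.
    Choosing the cluster with the most nodes is a placeholder criterion.
    """
    if not clusters:
        raise ValueError("no clusters available")
    return max(clusters, key=lambda name: len(clusters[name]))


def local_schedule(nodes: list, instances: int) -> dict:
    """Local step: spread the job's instances over the nodes, round-robin."""
    assignment = {node: [] for node in nodes}
    for instance, node in zip(range(instances), cycle(nodes)):
        assignment[node].append(instance)
    return assignment


if __name__ == "__main__":
    clusters = {"small": ["n1"], "large": ["n1", "n2", "n3"]}
    chosen = global_schedule(clusters)
    print(chosen, local_schedule(clusters[chosen], 5))
```

The point of the separation is that the grid level never needs to know how a cluster lays out work internally.
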
Components of JIM
• All Sites
  • sam_gsi_config (includes sam_gridftp and globus_dh_client and server) for grid security
  • xmldb as a SAMGrid interface
• Client Site - used to submit jobs to SAMGrid
  • typically a remote server or workstation that is used to send jobs to a Submission site
  • jim-client

Components of JIM
• Submission Site - maintains a spool of jobs
  • acts as a client to the broker, periodically sending jobs to available resources at an Execution site
  • jim_broker_client
  • www_jim_sandbox (optional)

Components of JIM
• Execution Site - runs the job
  • sam and sam_station
  • sam_batch_adapter
  • globus_rm_server
  • jim_jobmanager
  • jim_sandbox
  • jim_config and jim_advertise

Components of JIM
• Monitoring Site - provides information on the state of each submitted job and allows the output of completed jobs to be downloaded
  • globus_is_server and globus_is_client
  • jim_info_providers

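The component lists for the four site types can be collected into a small lookup table, which makes it easy to see what a host serving several roles would need. The package names are those listed on the slides; the role keys and the `packages_for` helper are hypothetical and are not part of JIM.

```python
# Packages needed at every site, as listed under "All Sites".
COMMON = ["sam_gsi_config", "xmldb"]

# Required packages per site role (role keys are illustrative names).
REQUIRED = {
    "client": ["jim-client"],
    "submission": ["jim_broker_client"],
    "execution": [
        "sam", "sam_station", "sam_batch_adapter", "globus_rm_server",
        "jim_jobmanager", "jim_sandbox", "jim_config", "jim_advertise",
    ],
    "monitoring": ["globus_is_server", "globus_is_client", "jim_info_providers"],
}

# Packages the slides mark as optional.
OPTIONAL = {"submission": ["www_jim_sandbox"]}


def packages_for(roles: list, include_optional: bool = False) -> list:
    """Return the ordered, duplicate-free package list for the given roles."""
    packages = list(COMMON)
    for role in roles:
        if role not in REQUIRED:
            raise ValueError(f"unknown site role: {role!r}")
        wanted = list(REQUIRED[role])
        if include_optional:
            wanted += OPTIONAL.get(role, [])
        for package in wanted:
            if package not in packages:
                packages.append(package)
    return packages


if __name__ == "__main__":
    print(packages_for(["client", "submission"], include_optional=True))
```
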
Using JIM – Submitting a job
User creates a jdl file such as the example shown:
  sam_dataset = jpmm08-1file
  executable = retrieve.sh
  input_sandbox = /home_scotgrid/m/mlyon/test/testjob
  cpu-per-event = 1s
  job_manager = sam
  job_type = sam_analysis
  sam_universe = prd
  sam_experiment = cdf
  log = testjob.log
  output = testjob.out
  error = testjob.err
  arguments = Download Output Yet?
  group = test
  instances = 1

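The job description above is a flat list of key = value settings. As a minimal sketch, assuming one setting per line and that everything after the first "=" is the value, it can be read into a dictionary for checking before submission. The parser is illustrative only and is not the one JIM uses.

```python
EXAMPLE = """\
sam_dataset = jpmm08-1file
executable = retrieve.sh
input_sandbox = /home_scotgrid/m/mlyon/test/testjob
cpu-per-event = 1s
job_manager = sam
job_type = sam_analysis
sam_universe = prd
sam_experiment = cdf
log = testjob.log
output = testjob.out
error = testjob.err
arguments = Download Output Yet?
group = test
instances = 1
"""


def parse_job_description(text: str) -> dict:
    """Parse 'key = value' lines into a dict, skipping blank lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"expected 'key = value', got: {line!r}")
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    job = parse_job_description(EXAMPLE)
    print(job["job_type"], job["instances"])
```
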
Using JIM – Submitting a job
• The job is submitted by typing: samg submit testjobfile.jdf
• The progress of the job can be viewed at http://samgrid.fnal.gov:8080 by selecting the submission site from the list.
• Select the job from the list. Details of the job state are displayed.
• Once the job has completed, the output may be downloaded.

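For scripted submission, the command above could be wrapped as in the sketch below. Only the `samg submit <file>` form shown on the slide is used; the helper names are hypothetical, and the assumption that a zero exit code means success is not confirmed by the slides.

```python
import shlex
import subprocess


def build_submit_command(job_file: str) -> list:
    """Build the argument list for the `samg submit` command shown above."""
    if not job_file:
        raise ValueError("a job description file is required")
    return ["samg", "submit", job_file]


def submit(job_file: str) -> int:
    """Run the command and return its exit code (needs `samg` on PATH)."""
    completed = subprocess.run(build_submit_command(job_file), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Only print the command here, so the sketch runs without a JIM client.
    print(shlex.join(build_submit_command("testjobfile.jdf")))
```
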
Using JIM – Viewing Map
Using JIM – Viewing submission sites
Using JIM – Viewing submitted jobs
Using JIM – Downloading output
Types of Jobs
• Monte Carlo
  • events are generated, passed through detector simulation and reconstructed
  • typically no input files and one output file per job
  • each job is part of a well-defined (generator, parameters) set going to a given dataset
• Reconstruction
  • real data reconstruction
  • in general, one input file from a dataset goes to one output file in the corresponding dataset
• Analysis
  • an entire dataset is the input, so there are many input files

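The three job types differ mainly in how input files map onto jobs. The sketch below shows one way to express that mapping. The type names, the `plan_inputs` helper and the choice to treat an analysis as a single job over the whole dataset are assumptions made for illustration, not how JIM itself splits work.

```python
def plan_inputs(job_type: str, dataset_files: list, n_jobs: int = 1) -> list:
    """Return one list of input files per job for the given job type."""
    if job_type == "monte_carlo":
        # Typically no input files: each job generates its own events.
        return [[] for _ in range(n_jobs)]
    if job_type == "reconstruction":
        # In general one input file per job, giving one output file each.
        return [[name] for name in dataset_files]
    if job_type == "analysis":
        # The entire dataset is the input (assumed here to be one job).
        return [list(dataset_files)]
    raise ValueError(f"unknown job type: {job_type!r}")


if __name__ == "__main__":
    files = ["run1.dat", "run2.dat", "run3.dat"]
    for kind in ("monte_carlo", "reconstruction", "analysis"):
        print(kind, plan_inputs(kind, files, n_jobs=2))
```
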
Station Setup
• Initial installation uses the current versions of all products.
• Care must be taken when upgrading
  • to avoid version incompatibility
  • to preserve the existing configuration

What will JIM do once complete?
• Current functionality allows job submission and output retrieval
• Next steps:
  • Deployment of secure web download
  • User support and defect fixing through the testing phase
  • Roll-out of SAMGrid to all CDF sites with available resources
  • Addition of more brokering criteria
• The Director's review of Run II computing has recommended expanding SAM into a lab-wide product

Deployment Plan
• Glasgow University has a complete installation of SAMGrid on both ScotGrid and the CDF cluster. This installation is being tested with Monte Carlo simulation.
• Oxford University has SAMGrid installed.
• An installation workshop is organised for 20th-22nd Jan 04.

Credits
• Thanks to the JIM team for providing material for this presentation.