Distributed Processing
Craig E. Tull, HCG/NERSC/LBNL (US)
ATLAS Grid Software Workshop, BNL - May 7, 2002
Distributed Processing
• ATLAS distributed processing, PPDG year 2 program
• role of MOP, other middleware & third-party tools
• objectives: deliverables to users
• job description language: status, objectives, options
• EDG JDL
Architecture (pink: WP1, yellow: WP2)
• Local Computing: Local Application, Local Database
• Grid Application Layer: Job Management, Data Management, Metadata Management, Object to File Mapper
• Collective Services: Grid Scheduler, Replica Manager (Replica Catalog Interface, Replica Optimization), Information & Monitoring
• Underlying Grid Services: Computing Element Services, Storage Element Services, Replica Catalog, SQL Database Service, Authorisation, Authentication and Accounting, Service Index
• Grid Fabric Services: Resource Management, Configuration Management, Node Installation & Management, Fabric Storage Management, Fabric Monitoring and Fault Tolerance
Jul'01: Pseudocode for ATLAS Short-Term UC01

  Logical File Name:   LFN = "lfn://" hostname "/" any_string
  Physical File Name:  PFN = "pfn://" hostname "/" path
  Transfer File Name:  TFN = "gridftp://" PFN_hostname "/" path

  JDL:
    InputData = {LFN[]}
    OutputSE  = host.domain.name

  Worker Node:
    LFN[] = WP1.LFNList()
    for (i = 0; i < LFN.length; i++) {
      PFN[] = ReplicaCatalog.getPhysicalFileNames(LFN[i])
      j = Athena.eventSelectionSrv.determineClosestPFN(PFN[])
      localFile = GDMP.makeLocal(PFN[j], OutputSE)
      Athena.eventSelectionSrv.open(localFile)
    }

  Replica catalog interface:
    PFN[]    = getPhysicalFileNames(LFN)
    PFN      = getBestPhysicalFileName(PFN[], String[] protocols)
    TFN      = getTransportFileName(PFN, String protocol)
    filename = getPosixFileName(TFN)
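For readers who prefer working code, here is a minimal Python sketch of the worker-node loop above. The replica_catalog, gdmp, and event_selector objects are hypothetical stand-ins for the real middleware clients, injected as parameters; none of these names come from an actual API.

  # Minimal Python sketch of the UC01 worker-node loop above.
  # replica_catalog, gdmp, and event_selector are hypothetical
  # stand-ins for the real middleware clients.
  def process_input_files(lfns, output_se, replica_catalog, gdmp, event_selector):
      for lfn in lfns:
          # Resolve the logical name to all known physical replicas.
          pfns = replica_catalog.get_physical_file_names(lfn)
          # Pick the replica "closest" to this worker node.
          best = event_selector.determine_closest_pfn(pfns)
          # Stage the chosen replica locally (relative to the output SE).
          local_file = gdmp.make_local(best, output_se)
          # Hand the staged file to the event selection service.
          event_selector.open(local_file)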
Sample Use Case: Simple Grid Job
• Submit and run a simple batch job that processes one input file to produce one output file.
• The user specifies the job via a JDL file:

    Executable   = /usr/local/atlas.sh
    Requirements = TS >= 1GB
    Input.LFN    = lfn://atlas.hep/foo.in
    argv1        = TFN(Input.LFN)
    Output.LFN   = lfn://atlas.hep/foo.out
    Output.SE    = datastore.rl.ac.uk
    argv2        = TFN(Output.LFN)

• and the submitted "job" itself is:

    #!/bin/sh
    gridcp $1 $HOME/tmp1
    grep higgs $HOME/tmp1 > $HOME/tmp2
    gridcp $HOME/tmp2 $2
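To make the TFN(...) expansion concrete, here is a toy Python sketch of how a broker might turn this JDL into the script's argument vector. The dictionary keys mirror the JDL above; the tfn resolver is a hypothetical placeholder, not a real service call.

  # Toy sketch: expand the JDL above into the job's argv.
  def build_argv(jdl, tfn):
      # tfn(lfn) maps a logical file name to a transport file name
      # (e.g. a gridftp URL) that gridcp on the worker node can use.
      return [jdl["Executable"], tfn(jdl["Input.LFN"]), tfn(jdl["Output.LFN"])]

  argv = build_argv(
      {"Executable": "/usr/local/atlas.sh",
       "Input.LFN":  "lfn://atlas.hep/foo.in",
       "Output.LFN": "lfn://atlas.hep/foo.out"},
      # Hypothetical resolver; a real one would consult the replica catalog.
      tfn=lambda lfn: lfn.replace("lfn://", "gridftp://"),
  )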
[Sequence diagram: Steps for Simple Job Example. Actors: User, Grid Scheduler, Replica Manager, Replica Catalogue, Compute Element & Storage Element at site A, Compute Element & Storage Element at site B. Flow: the User sends the job to the Grid Scheduler; the Scheduler gets the LFN-to-SFN mapping from the Replica Manager/Replica Catalogue, selects a CE and SE, copies the input file and allocates the output file, starts the job, and copies the output file when the job is done.]
Steps to Execute this Simple Grid Job
1. The user submits the job to the Grid Scheduler.
2. The Grid Scheduler asks the Replica Manager for a list of all PFNs of the specified input file.
3. The Grid Scheduler determines whether the job can run at a Compute Element that is "local" to one of those PFNs.
4. If not, it locates the best CE for the job and creates a new replica of the input file on an SE local to that CE.
5. The Grid Scheduler then allocates space for the output file and "pins" the input file so that it is not deleted or staged to tape until the job has completed.
6. The job is then submitted to the CE's job queue.
7. When the Grid Scheduler is notified that the job has completed, it tells the Replica Manager to create a copy of the output file at the site specified in the job description file.
8. The Replica Manager then tags this copy of the output file as the "master" and makes the original file a "replica".
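The control flow above can be condensed into a short Python sketch. All service objects and methods here are hypothetical stand-ins, not EDG interfaces; the comments map each call back to the numbered steps.

  # Hedged sketch of the scheduling steps above; every API here is
  # an assumed stand-in, not an EDG interface.
  def schedule_simple_job(job, scheduler, replica_mgr):
      pfns = replica_mgr.get_pfns(job.input_lfn)        # step 2
      ce = scheduler.find_ce_local_to(pfns)             # step 3
      if ce is None:                                    # step 4
          ce = scheduler.best_ce_for(job)
          replica_mgr.replicate(job.input_lfn, ce.local_se)
      ce.local_se.allocate(job.output_lfn)              # step 5
      replica_mgr.pin(job.input_lfn)
      ce.submit(job)                                    # step 6

  def on_job_completed(job, replica_mgr):
      copy = replica_mgr.replicate(job.output_lfn, job.output_se)  # step 7
      replica_mgr.promote_to_master(copy)                          # step 8
      replica_mgr.unpin(job.input_lfn)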
WP1: Job Status
• SUBMITTED -- The user has submitted the job to the User Interface.
• WAITING -- The Resource Broker has received the job.
• READY -- A Computing Element matching the job requirements has been selected.
• SCHEDULED -- The Computing Element has received the job.
• RUNNING -- The job is running on a Computing Element.
• CHKPT -- The job has been suspended and check-pointed on a Computing Element.
• DONE -- The execution of the job has completed.
• ABORTED -- The job has been terminated.
• CLEARED -- The user has retrieved all output files successfully. Bookkeeping information is purged some time after the job enters this state.
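Read as a state machine, the list above suggests the following forward transitions. The transition table is my inference from the state descriptions, not the normative WP1 definition.

  # Assumed forward transitions between the WP1 job states above.
  JOB_TRANSITIONS = {
      "SUBMITTED": {"WAITING", "ABORTED"},
      "WAITING":   {"READY", "ABORTED"},
      "READY":     {"SCHEDULED", "ABORTED"},
      "SCHEDULED": {"RUNNING", "ABORTED"},
      "RUNNING":   {"CHKPT", "DONE", "ABORTED"},
      "CHKPT":     {"RUNNING", "ABORTED"},
      "DONE":      {"CLEARED"},
  }

  def advance(state, new_state):
      # Reject any transition the table does not allow.
      if new_state not in JOB_TRANSITIONS.get(state, set()):
          raise ValueError(f"illegal transition {state} -> {new_state}")
      return new_state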
WP1: Job Submission Service (JSS)
• Strictly coupled with a Resource Broker; one JSS is deployed for each installed RB.
• Single (non-blocking) interface, used by the RB:
  • job_submit() -- submit a job to the specified Computing Element, also managing the input and output sandboxes.
  • job_cancel() -- kill a list of jobs, identified by their dg_jobId.
• Logging and Bookkeeping (LB) Service -- stores & manages the logging and bookkeeping information generated by the Scheduler & JSS components (an Information and Monitoring service).
  • Bookkeeping: currently active jobs -- job definition (expressed in JDL), status, resource consumption, user-defined data(?).
  • Logging: status of the Grid Scheduler & related components. These data are kept for a longer term and are used mainly for debugging, auditing, and statistical purposes.
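As an illustration only, the two-call interface described above could be rendered like this in Python. The real JSS is not a Python API; all class, method, and attribute names here are assumptions.

  # Illustrative rendering of the JSS interface; not the real service.
  class JobSubmissionService:
      def __init__(self, resource_broker, logging_bookkeeping):
          self.rb = resource_broker      # one JSS per installed RB
          self.lb = logging_bookkeeping  # LB service for status records

      def job_submit(self, job, ce, input_sandbox, output_sandbox):
          # Non-blocking: hand the job and its sandboxes to the given CE.
          ce.enqueue(job, input_sandbox, output_sandbox)
          self.lb.record(job.dg_job_id, "SCHEDULED")

      def job_cancel(self, dg_job_ids):
          # Kill a list of jobs, identified by their dg_jobId.
          for job_id in dg_job_ids:
              self.rb.kill(job_id)
              self.lb.record(job_id, "ABORTED")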
WP1: Job Description Language (JDL)
• Condor classified advertisements (ClassAds) adopted as the Job Description Language (JDL).
• Semi-structured data model: no specific schema is required.
• Symmetry: all entities in the Grid, in particular applications and computing resources, should be expressible in the same language.
• Simplicity: the description language should be simple both syntactically and semantically.

    Executable     = "simula";
    Arguments      = "1 2 3";
    StdInput       = "simula.config";
    StdOutput      = "simula.out";
    StdError       = "simula.err";
    InputSandbox   = {"/home/joe/simula.config", "/usr/local/bin/simula"};
    OutputSandbox  = {"simula.out", "simula.err", "core"};
    InputData      = "LF:test367-2";
    ReplicaCatalog = "ldap://pcrc.cern.ch:2010/rc=Replica Catalog, dc=pcrc, dc=cern, dc=ch";
    DataAccessProtocol = {"file", "gridftp"};
    OutputSE       = "lxde01.pd.infn.it";
    Requirements   = other.Architecture == "INTEL" && other.OpSys == "LINUX";
    Rank           = other.AverageSI00;
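The Requirements/Rank pair is what drives ClassAd matchmaking: each resource advertisement is bound to "other", Requirements filters the candidates, and Rank orders the survivors. Here is a toy Python sketch of the idea, an illustration rather than Condor's actual evaluator.

  # Toy matchmaking sketch: filter resource ads by Requirements,
  # then order the survivors by Rank (higher is better).
  def match_and_rank(job_ad, resource_ads):
      matches = [r for r in resource_ads if job_ad["Requirements"](r)]
      return sorted(matches, key=job_ad["Rank"], reverse=True)

  job_ad = {
      "Requirements": lambda other: other["Architecture"] == "INTEL"
                                    and other["OpSys"] == "LINUX",
      "Rank": lambda other: other["AverageSI00"],
  }

  ces = [
      {"Architecture": "INTEL", "OpSys": "LINUX",   "AverageSI00": 380},
      {"Architecture": "SPARC", "OpSys": "SOLARIS", "AverageSI00": 410},
  ]
  best = match_and_rank(job_ad, ces)  # -> only the INTEL/LINUX CE survives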
WP1: Sandbox
• Working area (input & output) replicated on each CE to which a Grid job is submitted.
• Very convenient & natural.
• My concerns:
  • Requires network access (with associated privileges) to all CEs on the Grid.
  • Could be a huge security issue with local administrators.
  • Not (yet) coordinated with WP2 services.
  • Sandbox contents not customizable to the local (CE/SE/PFN) environment.
  • Temptation to abuse (sandboxes are not meant for data files).
EDG JDL
• job description language: status, objectives, options
• Status:
  • Working in the EDG testbed.
• Objectives:
  • Provide the WP1 Scheduler with enough information to locate the resources (CE, SE, data, software) needed to execute the job.
• Options: