Migration to the GRID: A Case Study
Nick West
Why “A Case Study?”
• Why not “How we are migrating to the GRID?”
• Because there isn’t “The” GRID, but many.
  • In Europe it is LCG (LHC Computing GRID), aka (almost) EGEE (Enabling Grids for E-sciencE)
  • In the US it is OSG (Open Science GRID)
  • There is some shared middleware, but it is inadequate for a complete solution.
• It depends on your objectives
  • In the UK: retain access to computing + storage at RAL after it “goes GRID”, with minimum disruption
  • In the US: use Fermigrid from within FNAL
• That’s why it’s a Case Study
  • Why we in the UK are doing what we are doing.
Why bother to give this talk in the US?
• If it’s UK/Europe specific what’s the point?
• Because there are generic issues in this computing model:-
  • UI (User Interface): the machine from which to submit jobs
  • CE (Computing Element): a remote batch farm with “close” SEs
  • SE (Storage Element): a range of storage technologies e.g. dCache, CASTOR, DPM
  • Input and output sandboxes (~ few KB each) carry the job, its resource requirements and its results; event data flows (~ GB) go via the SEs
• You might find it interesting
  • But then you need to get out more
The Core Activities
• There are generic core activities
  • Installing software on a CE.
  • Managing data and meta-data:-
    • Moving event data in/out/between SEs.
    • Setting up and maintaining databases.
  • Setting up and running job production.
• For each of these the GRID presents challenges.
  • The subject of this talk
• Will not deal with other aspects common to all off-line operations such as:-
  • Software development.
  • Production bookkeeping.
Software Installation
• The challenge
  • Install on a virgin system using standard GRID job submission (can pass only a few KB) – interactive login discouraged.
• Mitigation
  • Can wget from the Internet from the CE.
  • Can install s/w on disk visible to all WNs (Worker Nodes) on the CE.
• Our solution: RSD (Remote Software Deployment)
  • Have talked about it before.
  • Nothing fundamental has changed.
  • Bootstraps first RSD then the application tars from the Internet (a sketch of the idea follows below).
  • Nothing very clever
    • It’s the application developer who writes the assemble and install tools
    • But only that; RSD deals with the chores e.g. dependencies, the install/validate/remove cycle, software tags
• Has been used for
  • Frozen and Snapshot releases on the LCG GRID
  • VMC
  • Genie
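To make the bootstrap idea concrete, here is a minimal sketch of what such a job might do on a worker node: fetch a release tarball over HTTP, unpack it into the software area shared by the WNs, and leave a tag marking the install. This is not the real RSD code; the URL, directory layout and tag file are invented for illustration.

    # Sketch of an RSD-style bootstrap on a worker node (illustrative only).
    import os
    import tarfile
    import urllib.request

    SOFT_AREA = os.environ.get("VO_SW_DIR", "/opt/exp_soft/minos")  # shared s/w area (assumed)
    RELEASE = "snapshot-2007-06"                                    # hypothetical release name
    TARBALL = RELEASE + ".tar.gz"
    URL = "http://example.org/minos/releases/" + TARBALL            # placeholder download URL

    def bootstrap():
        dest = os.path.join(SOFT_AREA, RELEASE)
        tag = os.path.join(dest, ".installed")
        if os.path.exists(tag):
            return                                   # already installed, nothing to do
        os.makedirs(dest, exist_ok=True)
        local_tar = os.path.join(dest, TARBALL)
        urllib.request.urlretrieve(URL, local_tar)   # stands in for the wget step
        with tarfile.open(local_tar) as tar:
            tar.extractall(dest)
        # The real system would now run the application's own install/validate tools;
        # here we just record a tag so later jobs see the release is present.
        open(tag, "w").close()

    if __name__ == "__main__":
        bootstrap()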
Managing event data – pre-GRID era
• We use DCM (Data Cache Manager)
• Data query
  • SAM query – resolve using the SAM web client.
  • By file name – resolve using an ASCII file-list dump of Enstore.
• DCM
  • First resolves the query into file names.
  • If not on local NFS disk, finds free space, transfers the file and adds it to the catalogue (see the sketch below).
• Data Discovery
  • Nightly scan of local disk to see what users have sneaked in and add it to the catalogue.
  • Was the original function – to try to prevent duplication without trying to get users to follow rules!
• Note
  • Query by SAM or file name, not by directory.
  • Mapping directory/file name can be complex e.g. raw data.
[Diagram: the application sends a data query to DCM, which consults its catalogue (a directory of soft links) on local NFS disks and, via wget, FNAL.]
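The two-stage lookup above might look roughly like the sketch below; it illustrates the catalogue-of-soft-links idea rather than the real DCM, and the catalogue path is hypothetical.

    # Illustrative sketch of the pre-GRID DCM lookup (not the real code).
    import os

    CATALOGUE_DIR = "/data/minos/dcm_catalogue"      # hypothetical directory of soft links

    def resolve_query(query):
        """Turn a data query into plain file names, e.g. via the SAM web client
        or the ASCII Enstore dump. Here we just accept explicit file names."""
        return [query] if query.endswith(".root") else []

    def locate(filename):
        link = os.path.join(CATALOGUE_DIR, filename)
        if os.path.islink(link) and os.path.exists(link):
            return os.path.realpath(link)            # already on local NFS disk
        return None   # caller would find free space, transfer from FNAL and add a link

    for name in resolve_query("F00037000_0000.mdaq.root"):
        path = locate(name)
        print(name, "->", path or "needs transfer from FNAL")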
Managing event data – GRID era: The challenge
• Complications
  • Multiple SEs
  • Multiple technologies (in fact 2 of each)
  • Available protocols depend on both the SE and the CE, e.g. accessing dCache from the RAL T1 differs from the RAL T2
  • We will eventually lose most of our NFS disk.
[Diagram: FNAL, the RAL T1 CE with its dCache SE and the RAL T2 CE with its Castor SE; the RAL UI’s NFS disk (most of which will eventually go) is not visible from the CEs.]
Managing event data – GRID era: The “Official” Solution
• It’s based on LFC and lcg-utils
  • LFC (LCG File Catalogue)
    • Presents all data as a single UNIX-like directory structure: /grid/minos
    • Can handle replication, i.e. the same data on multiple sites, so a CE can find the closest SE.
    • Set of commands to manage the catalogue e.g. lfc-ls, lfc-mkdir
  • lcg-utils
    • Technology-neutral interface to move data in/out/between SEs and update the LFC, e.g. lcg-cp, lcg-del (see the sketch below)
  • TGFALFile
    • ROOT layer over GFAL, a technology-neutral interface to both dCache and Castor.
• Snags (there are always snags)
  • Does not incorporate SAM or FNAL dCache
  • Presents a directory-structure API, not file names.
  • More than one expert has warned me the technology is still maturing; avoid it, if we can, in the short term.
    • We can: we are within the RAL firewall so can use technology-specific, lower-level protocols
  • When I tried TGFALFile it failed
    • To be fair it was a while ago
    • There are lower-level alternatives: TDCacheFile, TRFIOFile
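For orientation, the sketch below shows how these tools might be driven from a script: lfc-ls to inspect the catalogue and lcg-cp to pull a replica to local disk. The LFN, VO name and local path are illustrative, and exact options should be checked against the installed lcg-utils version.

    # Illustrative use of the LFC / lcg-utils command-line tools from Python.
    import subprocess

    VO = "minos"
    LFN = "lfn:/grid/minos/fardet_data/2006-11/F00037000_0000.mdaq.root"  # example catalogue entry

    # List a catalogue directory (the LFC presents a UNIX-like tree under /grid/minos).
    subprocess.run(["lfc-ls", "-l", "/grid/minos/fardet_data/2006-11"], check=True)

    # Copy the "closest" replica to local disk; lcg-cp resolves the LFN through
    # the LFC and chooses a suitable SE and protocol behind the scenes.
    subprocess.run(
        ["lcg-cp", "--vo", VO, LFN, "file:///tmp/F00037000_0000.mdaq.root"],
        check=True)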
Managing event data – GRID era: Our Strategy…
• We already have a data access API (DCM), so develop it.
• Perform nightly scans of each local SE
  • Got permission to do this “sympathetically”
  • Not just ls -lR!
• Combine these with a scan of local disk + the FNAL Enstore listing to form a complete set of catalogues (see the sketch below).
[Diagram: DCM builds ASCII catalogues of the NFS disks, the RAL dCache and Castor SEs, and FNAL.]
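A sketch of that catalogue-building step, assuming each nightly scan (and the FNAL Enstore dump) produces an ASCII listing of "path size" lines; the listing file names and format are invented for illustration.

    # Merge per-source ASCII listings into one catalogue: file name -> locations.
    import os
    from collections import defaultdict

    LISTINGS = {                                   # hypothetical listing files
        "nfs-disk":     "catalogues/nfs_disk.lst",
        "ral-dcache":   "catalogues/ral_dcache.lst",
        "ral-castor":   "catalogues/ral_castor.lst",
        "fnal-enstore": "catalogues/fnal_enstore.lst",
    }

    def build_catalogue():
        catalogue = defaultdict(list)              # name -> [(source, path, size), ...]
        for source, listing in LISTINGS.items():
            with open(listing) as f:
                for line in f:
                    path, size = line.split()
                    catalogue[os.path.basename(path)].append((source, path, int(size)))
        return catalogue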
Managing event data – GRID era: …Our Strategy…
• Production Model
  • On the UI use DCM to locate files with the same 2-stage approach
    • Resolve the query by SAM and/or the local catalogues
    • Locate files by “best” source: disk, then SE, then FNAL
    • The approach allows dataset definition using FNAL but still uses local files.
  • But don’t retrieve the data; instead return a DCM URL:-
    • Encodes SE, directory, file name and file size, e.g.:- dcm://fnal-dcache-enstore/fardet_data/2006-11/F00037000_0000.mdaq.root#29730062 (parsed in the sketch below)
  • Submit jobs to WNs and pass the URLs
  • On the CEs the WNs run DCM to pull the data (including from FNAL).
  • Where possible the system can use ROOT URLs instead.
[Diagram: the UI resolves a data query against the DCM catalogues and passes DCM URLs to the RAL T1 and T2 CEs, whose DCM instances pull from the dCache and Castor SEs.]
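The DCM URL quoted above packs the SE name, path and expected file size into one string. A worker-node DCM could decode it along the lines of this sketch (an illustration of the encoding, not the real parser):

    # Decode a DCM URL of the form dcm://<se>/<dir>/<file>#<size>.
    from urllib.parse import urlparse
    import posixpath

    def parse_dcm_url(url):
        parts = urlparse(url)
        assert parts.scheme == "dcm"
        directory, filename = posixpath.split(parts.path.lstrip("/"))
        return {"se": parts.netloc, "directory": directory,
                "file": filename, "size": int(parts.fragment)}

    print(parse_dcm_url(
        "dcm://fnal-dcache-enstore/fardet_data/2006-11/F00037000_0000.mdaq.root#29730062"))
    # -> {'se': 'fnal-dcache-enstore', 'directory': 'fardet_data/2006-11',
    #     'file': 'F00037000_0000.mdaq.root', 'size': 29730062}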
Managing event data – GRID era: …Our Strategy
• The DCM protocol used to access an SE is data driven
  • A global file describes each SE, the protocols it supports and the commands to invoke them.
  • A file per CE or UI lists the SEs it can access and the protocols to use.
  • The DCM URL only encodes a location, so DCM selects the right protocol when run on the CE to access that SE (see the sketch below).
• Gives us a migration strategy for when the LFC and lcg-utils are “mature”
  • Just introduce new protocols
  • But unless we also scan the LFC we would have to start using directories for data queries.
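The data-driven protocol choice can be pictured as two small tables, as in the sketch below: a global one describing what each SE supports and a per-CE/UI one saying which of those to use locally. The SE names, protocols and command templates here are purely illustrative.

    # Pick the copy command for a given SE on this CE/UI from two config tables.
    GLOBAL_SE = {   # SE -> protocol -> command template (illustrative)
        "ral-t1-dcache": {"dcap": "dccp {src} {dst}",
                          "srm":  "lcg-cp {src} file://{dst}"},
        "ral-castor":    {"rfio": "rfcp {src} {dst}",
                          "srm":  "lcg-cp {src} file://{dst}"},
    }

    LOCAL_SITE = {  # what this particular CE or UI is allowed to use
        "ral-t1-dcache": "dcap",
        "ral-castor":    "rfio",
    }

    def copy_command(se, src, dst):
        protocol = LOCAL_SITE[se]                  # chosen per CE/UI
        return GLOBAL_SE[se][protocol].format(src=src, dst=dst)

    print(copy_command("ral-castor", "/castor/example/file.root", "/tmp/file.root"))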
Managing meta-data i.e. Databases
• SAM
  • Our model only requires SAM installed on the UI.
• MySQL
  • We are local (within the firewall) at RAL
  • RAL T1 and RAL T2 can both see the database.
  • So we only need to ensure that our UI can continue to manage it.
Setting up and running job production
• Generic software is very basic
  • Along with the execution script/executable you pass a JDL (Job Description Language) file that specifies the input and output sandboxes and the resources required.
  • Submit the job and get back a URL.
  • Query the URL to check on job status and, when complete, retrieve the job output.
• All LHC experiments have layered their own systems on top.
  • They could not risk waiting for a generic solution.
  • Most systems are experiment specific but I found one that wasn’t…
• Ganga
  • A collaboration between ATLAS and LHCb
  • It’s free (GPL).
  • It’s used by two big experiments so it will be well supported.
  • Because there are two experiments there are clear internal experiment interfaces
    • So the framework is experiment neutral.
    • We know where we have to add MINOS extensions.
  • There are plug-ins for multiple back-ends including PBS, LSF, LCG and Condor.
  • It’s a Python-based OO system built on the concept of a ‘job’ object.
    • The job can be created, configured, submitted, monitored and used to retrieve output (a minimal example follows below).
  • The user interacts with the system using IPython, so can develop production systems as Python scripts
• I have tried it
  • It ran straight out of the box.
  • The developers are friendly and helpful. They have answered questions and are considering a change I suggested.
  • Although we can use it ‘as is’ I would like to learn how to develop a MINOS application (which means I get to learn Python)
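To give a flavour of the job-object interface, here is a minimal Ganga-style session of the kind described above; exact class names and options should be checked against the Ganga version in use, and the executable is just a stand-in.

    # Typed at the ganga (IPython) prompt: a minimal job submitted to the LCG back-end.
    j = Job()
    j.application = Executable(exe='/bin/echo', args=['Hello from the GRID'])
    j.backend = LCG()                 # could equally be a PBS, LSF or Condor back-end
    j.submit()

    jobs                              # monitor: lists all jobs and their status
    # When the job completes, its output sandbox appears under j.outputdir.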
Current Status
• As of early June we have proof of principle:-
  • Used Ganga to launch an RSD job which installed the latest Snapshot Release on the RAL Tier 1 CE.
  • Used Ganga to run a loon job that:-
    • Used DCM to read a file from RAL dCache
    • Ran loon
    • Used DCM to write the output back to RAL dCache
• There is still a lot to do before the system is really ready
  • Add support for neugen, gminos, flux files
  • Add support for private software and “Home Delivery” data.
  • Work on a production system
  • For the full work program see:- http://www-pnp.physics.ox.ac.uk/~west/minos/WebDocs/GRID_UK_work_program.html
• Beyond that there is
  • Battle hardening
    • Fixing bugs in my system.
    • Where possible, automatically recovering from failures of aspects of the GRID.
    • Where not possible, providing tools to make manual intervention as easy as possible.
  • User education
    • It’s way different from running on a PBS farm reading and writing to NFS disk!!