Using the Grid - a user's perspective
Ivan Hollins
University of Birmingham
What I have been doing…
• Using the Light Job Submission Framework (LJSF)
  • https://uimon.cern.ch/twiki/bin/view/Atlas/LJSFGettingStarted
• Doing Full Simulation with ATLAS software 10.0.1 only
• Have run ~400k Full Sim events, equating to ~32k jobs on the Grid, from July to date

Warning
• Am very much a “novice” user
• Experiences are heavily dependent on LJSF, which sits on top of the Grid tools and may be as much responsible for some of my experiences as the Grid itself
Summary of Nice things…
• Good documentation REALLY helps to understand the tools
• Hello world example very nice
• Once the “language” is mastered, operations seem fairly intuitive
• Ranking and Interactive features very nice (although sometimes felt a little in the dark as to which was the most appropriate way to rank sites)
• The Ganglia tool RAL has is nice, also Gstat
  • http://ganglia.gridpp.rl.ac.uk/specials/pbs.php?h=OpenPBS%20server%2fcsflnx353.rl.ac.uk&m=%5Bnone%5D&r=day&s=descending
  • http://goc.grid.sinica.edu.tw/gstat/FZK-LCG2/
Summary of Not so Nice things…

Getting Started
• Getting all my certificates and registering seemed complicated / unclear. Would be useful to have an idiot's guide, step by step. Also unclear which VO I should join (ATLAS has different options)

Running
• Sites that advertise software versions that aren’t installed properly
  • Could there be a monitoring tool to periodically test the sites to weed these out?
• Some jobs just hang! Absolutely no idea why. Would be nice to be able to see how much CPU jobs are using; then could easily tell which have hung
• Resource Brokers can be a bit temperamental. Had to keep changing these; not obvious at first how to do this. Can the RBs be hidden?
• Registering files to the Grid: overall works well, with a couple of exceptions
  • You seem to be able to overwrite a catalogue entry with a new file; however, the storage element doesn’t allow files to be rewritten. End result: Cat ID != File ID
  • Can also register a non-existent file! Not sure if this is a Grid or LJSF thing…
• Storage Elements
  • Not immediately obvious where to store data, what the choices were, etc.
  • CASTOR: obviously not directly a Grid problem, but has been a big source of pain. Once a file goes onto tape, the retrieval time is long, so your job times out and ends
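
The wish above, to see how much CPU a job is using so that hung jobs stand out, could be sketched roughly as follows. This is only an illustration, not part of LJSF or any Grid tool: the `CpuSample` record, the `is_hung` function and the 30-minute window are all hypothetical, and it assumes something already collects periodic (wall-clock time, cumulative CPU time) readings per job. A job is flagged when its CPU time has barely advanced over the window.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """Hypothetical reading for one job: wall-clock time and cumulative CPU seconds."""
    wall_time_s: float
    cpu_time_s: float


def is_hung(samples, window_s=1800.0, min_cpu_gain_s=1.0):
    """Return True if CPU time grew by less than min_cpu_gain_s over at least window_s.

    `samples` must be ordered by increasing wall_time_s. The thresholds are
    illustrative assumptions, not values taken from any Grid middleware.
    """
    if len(samples) < 2:
        return False  # not enough history to judge

    latest = samples[-1]

    # Most recent sample that is at least window_s older than the latest one.
    baseline = None
    for sample in reversed(samples):
        if latest.wall_time_s - sample.wall_time_s >= window_s:
            baseline = sample
            break

    if baseline is None:
        return False  # history does not yet span a full window

    return latest.cpu_time_s - baseline.cpu_time_s < min_cpu_gain_s


# Made-up readings for two jobs, sampled every 15 minutes.
jobs = {
    "job-healthy": [CpuSample(0, 0), CpuSample(900, 850), CpuSample(1800, 1700)],
    "job-stalled": [CpuSample(0, 0), CpuSample(900, 850),
                    CpuSample(1800, 850.2), CpuSample(2700, 850.2)],
}
hung_jobs = [name for name, history in jobs.items() if is_hung(history)]
print(hung_jobs)
```

Comparing against a sample a full window back, rather than only the previous reading, avoids flagging jobs that are briefly waiting on I/O.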