Summary of the Analysis Systems David Colling Imperial College London
Outline • Slightly unusual to be asked to summarise a session that everyone has just sat through, so: • I will try to summarise the important points of each model • This will be a personal view • I “manage” a distributed Tier 2 in the UK that currently supports all LHC experiments • I am involved in CMS computing/analysis • Then there will be further opportunity to question the experiment experts about the implementation of the models on the Tier 2s.
Comparing the Models • Firstly, only three of the four LHC experiments plan to do any analysis at the Tier 2s • However, conceptually, those three have very similar models • They have the majority (if not all) of end user analysis being performed at the Tier 2s. This gives the Tier 2s a crucial role in extracting the physics of the LHC • Analysis shares the Tier 2s with Monte Carlo production
Comparing the Models • The experiments want to be able to control the fraction of the Tier 2 resources that are used for different purposes (analysis vs production, analysis A vs analysis B) • They all realise that data movement, and knowledge of the content and location of those data, is vitally important • They all separate the data content and data location databases • All have jobs going to the data • The experiments all realise that there is a need to separate the user from the complexity of the WLCG
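To make the shared "jobs go to the data" idea concrete, here is a minimal, purely illustrative Python sketch: a hypothetical location catalogue maps datasets to the Tier 2s holding them, and a broker simply restricts the candidate sites for a job to that list. The dataset names, site names and load figures are invented for illustration and do not come from any experiment's real databases.

```python
# Minimal sketch of the "jobs go to the data" principle shared by the three models.
# The catalogue contents and site names are hypothetical illustrations.

# Data location catalogue: dataset name -> sites holding a complete replica
location_catalogue = {
    "/Higgs/Signal/RECO": ["T2_London_IC", "T2_Germany_DESY"],
    "/MinBias/MC/RECO":   ["T2_Italy_Pisa"],
}

def candidate_sites(dataset, site_load):
    """Return the sites holding the dataset, least-loaded first."""
    sites = location_catalogue.get(dataset, [])
    return sorted(sites, key=lambda s: site_load.get(s, 0.0))

if __name__ == "__main__":
    load = {"T2_London_IC": 0.8, "T2_Germany_DESY": 0.3}
    print(candidate_sites("/Higgs/Signal/RECO", load))
    # ['T2_Germany_DESY', 'T2_London_IC']
```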
So where do they differ? • Implementation, where they differ widely on: • What services need to be installed/maintained at each Tier 2 • What additional software they need above the “standard” grid installations • Details of the job submission system (e.g. pilot jobs or not, very different UIs etc.) • How they handle different Grids • Maturity: • CMS has a system capable of running >100K jobs/month whereas Atlas only has a few hundred GB of appropriate data
Implementations • Let's start with Atlas… • Different implementations on different Grids • Looking at the EGEE Atlas implementation: • No services required at the Tier 2, only software installed by the SGM • All services (file catalogue, data moving services of Don Quixote etc.) at the local T1 • As a Tier 2 “manager” this makes me very happy, as it minimises the support load at the Tier 2 and means that it is left to experts at the Tier 1. It means that all sites within the London Tier 2 will be available for Atlas analysis
[Diagram: Accessing data for analysis in the Atlas EGEE installation. Shows the dataset catalogue (queried over http), the VO box, LRC and FTS at the Tier 1, the CE and SE at the Tier 2, and the file access protocols (rfio, dcap, gridftp, nfs, lrc).]
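The flow in the diagram can be paraphrased as a short sketch: resolve a dataset to its file GUIDs via the central catalogue over http, then ask the replica catalogue at the local Tier 1 for the site-local physical file names. The endpoint URLs, response format and field names below are hypothetical; the real Don Quixote / LRC interfaces differ, so this is only an illustration of the flow, not the actual API.

```python
# Illustrative sketch of the data-access flow in the diagram above. Endpoints and
# field names are hypothetical placeholders, not the real Atlas services.
import json
import urllib.request

DATASET_CATALOGUE = "http://t1.example.org/dq/datasets"   # hypothetical endpoint
REPLICA_CATALOGUE = "http://t1.example.org/lrc/replicas"  # hypothetical endpoint

def files_in_dataset(dataset):
    """Ask the central dataset catalogue which file GUIDs make up a dataset."""
    with urllib.request.urlopen(f"{DATASET_CATALOGUE}?name={dataset}") as r:
        return json.load(r)["guids"]

def local_replica(guid, site):
    """Ask the Tier 1 replica catalogue for the physical file name at a given site."""
    with urllib.request.urlopen(f"{REPLICA_CATALOGUE}?guid={guid}&site={site}") as r:
        return json.load(r)["pfn"]  # e.g. a dcap:// or rfio:// URL on the Tier 2 SE
```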
Atlas Implementations • Example share of Tier 2 resources: Production 70%, CE Long 20%, CE Short 9%, Software 1% • The prioritisation mechanism will come from the EGEE Priorities Working Group
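As a purely illustrative aside, the quoted shares translate into per-queue slot targets at a site as in the sketch below; the total slot count and queue names are assumptions, and a real site would configure this in its batch scheduler rather than in a script.

```python
# Illustration only: turning the 70/20/9/1 shares into per-queue slot targets.
SHARES = {"production": 70, "analysis_long": 20, "analysis_short": 9, "software": 1}

def slot_targets(total_slots, shares=SHARES):
    """Split the site's job slots according to the percentage shares."""
    total_share = sum(shares.values())
    return {queue: round(total_slots * pct / total_share) for queue, pct in shares.items()}

print(slot_targets(400))
# {'production': 280, 'analysis_long': 80, 'analysis_short': 36, 'software': 4}
```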
Atlas Implementations and Maturity • US: using the Panda system • Much more work at the Tier 2 • However, US Tier 2s seem to be better endowed with support effort, so this may not be a problem • NorduGrid: implementation still ongoing • Maturity: • Only a few hundred GB of appropriate data • Experience of SC4 will be important, especially for Don Quixote
CMS Implementation • Requires installation of some services at Tier 2s: PhEDEx & the trivial file catalogue • However, it is possible to run the instances for different sites within a distributed T2 at a single site • So as a distributed Tier 2 “manager” I am not too unhappy … for example in the UK I can see all sites in the London Tier 2 and in SouthGrid running CMS analysis, but this is less likely in NorthGrid and ScotGrid
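The "trivial" file catalogue is essentially a short set of site-local rules that rewrite a logical file name into a physical file name, rather than a database. The sketch below shows that idea only; the rules, protocols and paths are invented and this is not the real CMS rule syntax, which is an XML file maintained by the site.

```python
# Sketch of the idea behind a trivial file catalogue: a few site-local rules map a
# logical file name (LFN) to a physical file name (PFN). Rules and paths are invented.
import re

TFC_RULES = [
    # (protocol, LFN pattern, PFN template)
    ("dcap",    r"^/store/(.*)$", r"dcap://dcache.example.ac.uk/pnfs/cms/store/\1"),
    ("gridftp", r"^/store/(.*)$", r"gsiftp://gridftp.example.ac.uk/data/cms/store/\1"),
]

def lfn_to_pfn(lfn, protocol):
    """Apply the first matching rule for the requested access protocol."""
    for proto, pattern, template in TFC_RULES:
        if proto == protocol and re.match(pattern, lfn):
            return re.sub(pattern, template, lfn)
    raise LookupError(f"no {protocol} rule matches {lfn}")

print(lfn_to_pfn("/store/data/run1234/file.root", "dcap"))
```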
CMS Implementation across Grids • Installation as similar as possible across EGEE and OSG • Same UI for both … called CRAB • Can use CRAB to submit to OSG sites via an EGEE WMS or directly via Condor-G
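Before submission, a tool like CRAB splits the user's analysis of a dataset into many grid jobs. The sketch below illustrates only that splitting idea with invented file names and chunk sizes; it is not CRAB's real interface or configuration.

```python
# Illustration of dataset splitting prior to submission: group the dataset's files
# into chunks and create one grid job per chunk. Not CRAB's actual code.
def split_into_jobs(input_files, files_per_job):
    """Group the dataset's files into per-job chunks."""
    return [input_files[i:i + files_per_job]
            for i in range(0, len(input_files), files_per_job)]

files = [f"/store/data/run1234/file_{n}.root" for n in range(10)]
for job_id, chunk in enumerate(split_into_jobs(files, files_per_job=4)):
    print(f"job {job_id}: {len(chunk)} files")
# job 0: 4 files, job 1: 4 files, job 2: 2 files
```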
CMS Maturity • PhEDEx has proved to be very reliable since DC04 • CRAB in use since the end of 2004 • A hundred thousand jobs a month • Tens of sites, both for execution and submission • Note that there are still failures
Alice Implementation • Only really running on EGEE and Alice-specific sites • Puts many requirements on a site: xrootd, a VO Box running the AliEn SE and CE, the package management server, a MonALISA server, an LCG UI and AliEn file transfer • All jobs are submitted via AliEn tools • All data is accessed only via AliEn
Tier-2 Infrastructure/Setup Example • Services that can run on the VO Box: SA, CE, FTD, MonALISA, PackMan, LCG-UI • Storage: disk servers behind an xrootd redirector • Required incoming network access: port 1094 from the world (xrootd file transfer), port 8082 from the world (SE, Storage Element), port 8083 from the world (FTD, FileTransferDaemon), port 8084 from CERN (CM, ClusterMonitor), port 9991 from CERN (PackMan) • Also uses the LCG CE and LCG FTS/SRM-SE/LFC • Worker node configuration/requirements are the same as for batch processing at Tier 0/1/2 centres (2 GB RAM per CPU, 4 GB local scratch space)
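As a small practical aside, a site admin could verify from an external host that the incoming ports listed above are actually reachable; the sketch below does a simple TCP connect test. The hostname is a placeholder, and the port list is taken from this slide.

```python
# Check that the incoming ports listed above are reachable on the VO Box / redirector.
# Run from outside the site firewall; the hostname is a placeholder.
import socket

PORTS = {1094: "xrootd", 8082: "SE", 8083: "FTD", 8084: "ClusterMonitor", 9991: "PackMan"}

def check_ports(host, ports=PORTS, timeout=3.0):
    for port, service in sorted(ports.items()):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                print(f"{host}:{port} ({service}) open")
        except OSError:
            print(f"{host}:{port} ({service}) NOT reachable")

check_ports("vobox.example.ac.uk")
```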
Alice Implementation • All data access is via xrootd … this allows innovative access to data, but it is a requirement on the site • May be able to use xrootd front ends to a standard SRM • Batch analysis implicitly allows prioritisation through a central job queue • However, this does involve using glexec-like functionality
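For orientation, reading a file through an xrootd redirector from an analysis job looks roughly like the sketch below, assuming a ROOT installation built with xrootd support (PyROOT). The redirector hostname and file path are placeholders, not a real Alice endpoint.

```python
# Sketch of reading a file through an xrootd redirector, assuming ROOT with xrootd
# support is available. Hostname and path are placeholders.
import ROOT

url = "root://xrootd-redirector.example.ac.uk//alice/data/run1234/esd.root"
f = ROOT.TFile.Open(url)      # the redirector forwards the client to the disk server
if f and not f.IsZombie():
    f.ls()                    # list the objects in the file
    f.Close()
else:
    print("could not open", url)
```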
[Diagram: Alice Implementation – Batch analysis. Shows the central AliEn services (Task Queue, AliEn File Catalogue, optimisation of splitting, requirements, FTS replication and policies) and the Tier-2 side (AliEn CE, LCG UI, RB, LCG CE, batch system, agents, xrootd and the Tier-2 SE), with jobs described in JDL, matched against the Task Queue, and data accessed via the ROOT API.]
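The diagram describes a pull model: an agent running at the site asks the central task queue for a job whose requirements the site can satisfy. The toy sketch below illustrates that matching step only; the queue contents, fields and matching logic are invented and do not reflect AliEn's real protocol.

```python
# Toy illustration of the agent "pull" model: an agent on a worker node fetches the
# first queued job whose requirements match what the site offers. All names invented.
TASK_QUEUE = [
    {"id": 101, "requires": {"site": "ANY", "packages": {"AliRoot-v4"}}},
    {"id": 102, "requires": {"site": "T2_London_IC", "packages": {"AliRoot-v5"}}},
]

def pull_matching_job(site, installed_packages):
    """Return (and remove) the first queued job this site can run."""
    for job in list(TASK_QUEUE):
        req = job["requires"]
        site_ok = req["site"] in ("ANY", site)
        if site_ok and req["packages"] <= installed_packages:
            TASK_QUEUE.remove(job)
            return job
    return None

print(pull_matching_job("T2_London_IC", {"AliRoot-v4", "AliRoot-v5"}))  # job 101
```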
Alice Implementation • As a distributed Tier 2 “manager”, this set-up does not fill me with joy • I cannot imagine installing such VO boxes within the London Tier 2, and would be surprised if any UK Tier 2 sites (with the exception of Birmingham) install such boxes
Alice Maturity • Currently, only a handful of people are trying to perform Grid-based analysis • Not a core part of the SC4 activity for Alice Alice Implementation: Interactive Analysis • More important to Alice than to the others • A novel and interesting approach based on PROOF and xrootd
Conclusions • Three of the four experiments plan to use Tier 2 sites for end user analysis • These three experiments have conceptually similar models (at least for batch analysis) • The implementations of these similar models have very different implications for the Tier 2s supporting the VOs
Discussion… David Colling Imperial College London