Status GridKa & ALICE T2 in Germany
Kilian Schwarz, GSI Darmstadt
ALICE T2
• Present status
• Plans and timelines
• Issues and problems
Status GridKa
• Pledged: 600 KSI2k; delivered: 133% of the pledge (see the arithmetic below)
• 11% of ALICE jobs ran at GridKa during the last month
[Plot: ALICE job share per site, FZK vs. CERN]
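In absolute terms, a back-of-the-envelope using only the numbers quoted above:

```latex
% Delivered CPU at GridKa, from the pledge and utilisation above
600\,\mathrm{KSI2k} \times 1.33 \approx 800\,\mathrm{KSI2k}
```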
GridKa – main issue
• Resources are provided according to the megatable.
• The share among the Tier-1s follows automatically from the Tier-2s connecting to each Tier-1.
• GridKa pledges for 2008: tape 1.5 PB, disk 1 PB.
• Current megatable: tape 2.2 PB! Much more than pledged, and more than all other experiments together; most of the additional demand is due to the Russian T2 (0.8 PB).
• The point is: the money is fixed. In principle a switch between tape/disk/CPU should be possible, though not on short notice. For 2009 things can still be changed.
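A quick subtraction with the numbers above shows where the overshoot comes from: without the Russian T2 demand, the request would still fit within the pledge.

```latex
% GridKa tape, 2008: megatable request vs. pledge
2.2\,\mathrm{PB}_{\text{requested}} - 0.8\,\mathrm{PB}_{\text{Russian T2}}
  = 1.4\,\mathrm{PB} \;\le\; 1.5\,\mathrm{PB}_{\text{pledged}}
```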
[Diagram: ALICE T2 – present status. GSI is connected to CERN and GridKa via a 150 Mbps Grid link. On site: a vobox and LCG RB/CE; ALICE::GSI::SE::xrootd (30 TB + 120 TB); the GSI batch farm (39 nodes / 252 cores for ALICE) and GSIAF (14 nodes) for PROOF/batch, with directly attached disk storage (55 TB) serving as ALICE::GSI::SE_tactical::xrootd.]
Present status
• ALICE::GSI::SE::xrootd:
• > 30 TB disk on fileservers (8 fileservers with 4 TB each)
• + 120 TB disk on fileservers: 20 fileservers, 3U, 15 × 500 GB disks each, RAID 5, about 6 TB user space per server (see the consistency check below)
• Batch farm/GSIAF and ALICE::GSI::SE_tactical::xrootd nodes dedicated to ALICE:
• 15 D-Grid-funded boxes, each with 2 × 2-core 2.67 GHz Xeons, 8 GB RAM, and 2.1 TB local disk space on 3 disks plus a system disk
• additionally 24 new boxes, each with 2 × 4-core 2.67 GHz Xeons, 16 GB RAM, and 2.0 TB local disk space on 4 disks including the system disk
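The per-server and total figures are consistent, assuming one disk's capacity goes to RAID 5 parity plus some filesystem overhead (the slide quotes only the end results; the overhead split is my assumption):

```latex
% Per fileserver: 15 x 500 GB in RAID 5, one disk's worth lost to parity
(15 - 1) \times 0.5\,\mathrm{TB} = 7\,\mathrm{TB}_{\text{net}}
  \approx 6\,\mathrm{TB}_{\text{user space}}
% Across all 20 servers, matching the quoted capacity:
20 \times 6\,\mathrm{TB} = 120\,\mathrm{TB}
```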
ALICE T2 – short-term plans
• Extend GSIAF to all 39 nodes.
• Study the coexistence of interactive and batch processes on the same machines; develop a way to increase/decrease the number of batch jobs on the fly so that analysis gets priority (see the sketch after this list).
• Add the newly bought fileservers (about 120 TB disk space) to ALICE::LCG::SE::xrootd.
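A minimal sketch of the intended balancing logic, purely illustrative: the slide only states the goal, and `set_batch_slots` below is a hypothetical stand-in for whatever reconfiguration command GSIAF's batch system actually offers.

```python
"""Sketch: shrink the batch share when interactive PROOF analysis ramps up.

Only the core count comes from the slides; everything else is an assumption,
not the actual GSIAF implementation.
"""

TOTAL_CORES = 252           # 39 nodes / 252 cores for ALICE (from the slides)


def desired_batch_slots(active_proof_workers: int) -> int:
    """PROOF workers get priority; batch receives the remaining cores."""
    return max(0, TOTAL_CORES - active_proof_workers)


def set_batch_slots(n: int) -> None:
    """Hypothetical stand-in for the real batch-system reconfiguration."""
    print(f"(would reconfigure the batch system to {n} job slots)")


if __name__ == "__main__":
    # Example: an analysis session starts 100 PROOF workers,
    # so the number of batch slots is reduced accordingly.
    set_batch_slots(desired_batch_slots(active_proof_workers=100))
```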
ALICE T2 – medium-term plans
• Add 25 additional nodes to the GSI batch farm/GSIAF, to be financed via a third-party project (D-Grid).
• Upgrade the GSI network connection to 1 Gb/s, either as a dedicated line to GridKa (a direct T2 connection to the T0 is problematic) or as a general internet connection.
ALICE T2 – ramp-up plans
http://lcg.web.cern.ch/LCG/C-RRB/MoU/WLCGMoU.pdf
Plans for the ALICE Tier 2 & 3 at GSI:
• Remarks:
• 2/3 of that capacity is for the Tier 2 (ALICE central use, fixed via the WLCG MoU)
• 1/3 is for the Tier 3 (local usage; may be used via Grid)
• according to the ALICE computing model there is no tape for a Tier 2
• tape for the Tier 3 is independent of the MoU
• heavy-ion run in October -> upgrades must be operational by Q3 of each year
ALICE T2/T3
• Language definition according to the GSI interpretation:
• ALICE T2: central use
• ALICE T3: local use; resources may be used via Grid, but they are not pledged
• Remarks related to ALICE T2/3:
• At the T2 centres sit the physicists who know what they are doing.
• Analysis can be prototyped quickly, with the experts close by.
• GSI requires flexibility for optimising the ratio of calibration/analysis and simulation at the Tier 2/3.
Data transfers CERN -> GSI
• Motivation: the calibration model and the algorithms need to be tested before October.
• Test the functionality of the current T0/T1 -> T2 transfer methods.
• At GSI the CPU and storage resources are available, but how do we bring the data here?
Data transfer CERN -> GSI
• The system is not yet ready for generic use; expert control by a "mirror master" at CERN is therefore necessary.
• In principle, individual file transfer now works fine (a minimal example follows after this list). Plan: run the next transfers with Pablo's new collection-based commands, plus a web page where transfer requests can be entered and the transfer status can be followed up.
• So far about 700 ROOT files have been successfully transferred, corresponding to about 1 TB of data; 30% of the newest request is still pending.
• Maximum speed achieved so far: 15 MB/s (almost the complete GSI bandwidth), but only for a relatively short time.
• Since August 8 no relevant transfers have taken place. Reasons:
• August 8: pending xrootd update at the CASTOR SE
• August 14: GSI SE failure due to network problems
• August 20: instability of the central AliEn services (production comes first); until recently: AliEn update
• GSI plans to analyse the transferred data ASAP and to continue with more transfers. PDC data also need to be transferred for prototyping and testing of analysis code.
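For illustration, a single-file transfer of the kind reported to work might look as below. `xrdcp` is the standard xrootd copy tool; the source and destination URLs are placeholders, since the slides do not name the actual CERN and GSI endpoints.

```python
"""Sketch of one CERN -> GSI file copy via xrootd (placeholder endpoints)."""
import subprocess

# Hypothetical URLs -- the real CASTOR SE at CERN and the GSI xrootd SE
# are not named on the slides.
SRC = "root://castor-se.cern.ch//castor/cern.ch/alice/run123/AliESDs.root"
DST = "root://se.gsi.de//alice/data/run123/AliESDs.root"

# One file at a time: the "individual file transfer" mode that works today.
# The planned collection-based commands would batch many such copies.
subprocess.run(["xrdcp", SRC, DST], check=True)
```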