The LHCb Tier-2 at INFN
Domenico Galli, Bologna
INFN CCRWS06, Otranto, 7.6.2006
Outline
• The LHCb data flow
• Aim of LHCb Tier-2s
• Why LHCb Analysis at Tier-1?
• Why LHCb Italian Tier-2 at CNAF?
• LHCb Tier-2 Exploitation in the Forthcoming Years
• LHCb Tier-2 Size and Cost
• Overall remarks
The LHCb Dataflow
[Figure: data-flow diagram. Labels: sites (On-line Farm, CERN, Tier-1s, Tier-2s, Tier-3s); processing steps (MC, reconstruction, pre-selection analysis, physics analysis, local analysis); data products (RAW data, RAWmc data, calibration data, rDST, selected DST+RAW, TAG, user DST, n-tuple, user TAG, paper); job types (scheduled job, chaotic job).]
The LHCb Dataflow (II)
Aim of LHCb Tier-2s
• In the LHCb Computing Model, Tier-2s are used (at least in countries that also have a Tier-1) as a pure, centrally scheduled Monte Carlo simulation facility.
• In the LHCb Computing Model, analysis is not performed at Tier-2s.
• An LHCb Tier-2 consists of a PC farm with a small disk buffer (3 TB in steady state), used as a temporary cache until the data are transferred to a Tier-1.
• LHCb-Italy proposes to build one Tier-2, hosted at Bologna-CNAF together with the Italian Tier-1.
Why LHCb Analysis at Tier-1?
• LHCb analysis jobs consist of selecting events from the stripped DST stored at the Tier-1 in order to focus on one particular analysis channel:
  • Typical analysis jobs run on a ~10^6 event sample.
  • Some analysis jobs will run on a larger ~10^7 event sample.
  • Average event reduction of a factor of 5.
• The analysis input is stored in full at each Tier-1.
• The analysis output (20-200 GB) can be processed by a small Tier-3 facility.
[Figure: Selected DST+RAW (119 TB) and Event Tag Collection (20 TB), both available at each Tier-1; Physics Analysis; n-tuple / User DST + User TAG (typical: 20 GB, large: 200 GB); Local Analysis; Paper.]
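The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only illustrative and rests on two assumptions that the slide does not state: that the "factor of 5" reduction applies to the number of events, and that the 20 GB and 200 GB outputs correspond to the ~10^6 and ~10^7 event samples respectively. The function names and the decimal definition of GB are also choices made for this example.

```python
GB = 10**9  # decimal gigabyte (assumption)


def selected_events(sample_events: int, reduction_factor: int = 5) -> int:
    """Events left after the selection, assuming the factor of 5 is an event-count reduction."""
    return sample_events // reduction_factor


def bytes_per_selected_event(output_gb: int, sample_events: int) -> float:
    """Implied output size per selected event, under the assumptions above."""
    return output_gb * GB / selected_events(sample_events)


typical = bytes_per_selected_event(20, 10**6)   # typical job
large = bytes_per_selected_event(200, 10**7)    # large job
print(f"typical: {typical / 1000:.0f} kB/event, large: {large / 1000:.0f} kB/event")
```

Under these assumptions both job sizes imply the same per-event output size, which suggests the quoted numbers are mutually consistent.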
Why LHCb Analysis at Tier-1? (II)
• Comparison between 2 models:
  • Analysis job at Tier-2: [Diagram: Tier-1 (DST+RAW+TAG, 139 TB); WAN; Tier-2 buffer (139 TB); LAN; output 20-200 GB.]
  • Analysis job at Tier-1: [Diagram: Tier-1 (DST+RAW+TAG, 139 TB); LAN; output 20-200 GB.]
• The data accessed at the Tier-1 per analysis job are the same in both models.
• But the second is faster and less expensive in terms of hardware, infrastructure, staff resources and WAN load.
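To give a feeling for the WAN load in the Tier-2 model, the sketch below estimates how long it would take to copy the 139 TB analysis input over a wide-area link. The 1000 Mb/s link speed is a hypothetical value chosen for illustration, not a figure from the slides, and TB is taken as 10^12 bytes.

```python
TB = 10**12  # decimal terabyte (assumption)
SECONDS_PER_DAY = 86_400


def transfer_days(volume_tb: float, link_mbps: float) -> float:
    """Days needed to move volume_tb terabytes at a sustained rate of link_mbps Mb/s."""
    bits = volume_tb * TB * 8
    seconds = bits / (link_mbps * 10**6)
    return seconds / SECONDS_PER_DAY


ASSUMED_WAN_MBPS = 1000  # hypothetical sustained throughput, not from the slides
days = transfer_days(139, ASSUMED_WAN_MBPS)
print(f"139 TB at {ASSUMED_WAN_MBPS} Mb/s: about {days:.1f} days")
```

Even with a fully dedicated link of that speed, the copy would take roughly two weeks, while the Tier-1 model needs no such transfer.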
Why Tier-2 at CNAF?
• This model would allow maximum flexibility in moving resources back and forth between Tier-2 and Tier-1 to optimize resource exploitation (to satisfy peak MC/analysis requests).
  • Simply by means of a software operation.
• No competition among Italian sites for Tier-2 resources, since MC production is centrally scheduled.
• The Italian LHCb people involved in computing are mainly located in Bologna.
Why Tier-2 at CNAF? (II)
• LHCb experiment staff from Bologna:
  • Already heavily involved in LHCb computing;
  • Have established a close collaboration with the CNAF staff.
• Same trained and skilled management and technical staff.
• Given the growing profile, the LHCb Tier-2 is at most a +10% perturbation of the CNAF Tier-1 resources.
• Same building.
• Same cooling system.
• Same fire alarm and surveillance.
• Same network.
• Same electric power and UPS.
• Etc.
LHCb Tier-2 Exploitation in the Next Years
• From now on, an almost continuous MC production is foreseen for LHCb, mainly for:
  • Physics studies;
  • HLT studies.
• Of the order of hundreds of Mevents/year.
LHCb Tier-2 Size and Cost
• We are sizing the Italian Tier-2 to be 15% of the total Tier-2 resources of the whole LHCb collaboration.
  • 15% = Italian fraction of CORE funds;
  • 15% = fraction of Italian physicists with respect to the total involved in the LHCb experiment.
LHCb Tier-2 (@CNAF): Additional Size and Cost (according to LHCb Computing Model)
• 3.2 GHz Xeon = 1.2 kSi2k
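The conversion factor on this slide lets a CPU capacity expressed in kSi2k be translated into a number of processors. The sketch below is a minimal illustration; the function name and the 100 kSi2k input are hypothetical, since the sizing table itself is not reproduced in this text. Decimal arithmetic is used so that exact multiples of 1.2 are not rounded up by floating-point error.

```python
from decimal import Decimal, ROUND_CEILING

KSI2K_PER_CPU = Decimal("1.2")  # one 3.2 GHz Xeon, as quoted on the slide


def cpus_needed(capacity_ksi2k: str) -> int:
    """Smallest whole number of 3.2 GHz Xeons providing at least capacity_ksi2k."""
    cpus = Decimal(capacity_ksi2k) / KSI2K_PER_CPU
    return int(cpus.to_integral_value(rounding=ROUND_CEILING))


# Hypothetical capacity, for illustration only.
print(cpus_needed("100"))
```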
LHCb Tier-2 (@CNAF): Additional Infrastructures
• Power consumption:
  • 50 W/kSi2k in 2006;
  • 25 W/kSi2k in 2010;
  • 70 W/TiB.
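The power figures above can be combined into a rough estimate of the electrical load of a farm. The sketch below is illustrative only: the CPU capacity and disk size passed in are hypothetical, and the function deliberately accepts only the two years quoted on the slide rather than guessing values for the years in between.

```python
WATTS_PER_KSI2K = {2006: 50, 2010: 25}  # CPU power figures quoted on the slide
WATTS_PER_TIB = 70                      # disk power figure quoted on the slide


def power_watts(cpu_ksi2k: float, disk_tib: float, year: int) -> float:
    """Estimated power draw in watts for the given CPU and disk capacity."""
    if year not in WATTS_PER_KSI2K:
        raise ValueError(f"no W/kSi2k figure quoted for {year}")
    return cpu_ksi2k * WATTS_PER_KSI2K[year] + disk_tib * WATTS_PER_TIB


# Hypothetical farm: 100 kSi2k of CPU and 3 TiB of disk.
for year in (2006, 2010):
    print(year, power_watts(100, 3, year), "W")
```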
LHCb Tier-1/Tier-2 at INFN Comparison
LHCb-INFN Network Bandwidth
[Figure: network diagram linking CERN, Tier-1 INFN, Tier-2 INFN and the other Tier-1s. Link labels, in extraction order: 3.2; 3.2; DT: 50/50/72, RP: 168/232/232, STR: 120/120/120; DT: 88/88/176, RP: 176/480/480, STR: 608/608/608; 6.4.]
• Throughput in Mb/s for the years 2008/2009/2010.
• DT: Data Taking;
• RP: Reprocessing: winter shut-down (2 months);
• STR: Re-stripping: 3 times a year: after data taking (1 month), after reprocessing (2 months), before the next year's data taking (1 month).
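A sustained throughput can be turned into a data volume once the duration of the activity is known. The sketch below shows the conversion for one of the quoted values, 608 Mb/s during a one-month re-stripping pass. It assumes that Mb/s means 10^6 bits per second, that a month lasts 30 days, and that TB means 10^12 bytes; none of these conventions is stated on the slide.

```python
SECONDS_PER_DAY = 86_400


def volume_tb(throughput_mbps: float, days: float) -> float:
    """Terabytes (10^12 bytes) moved at a sustained throughput_mbps over the given days."""
    bytes_per_second = throughput_mbps * 10**6 / 8
    return bytes_per_second * days * SECONDS_PER_DAY / 10**12


# One re-stripping month (assumed 30 days) at 608 Mb/s.
restripping_tb = volume_tb(608, 30)
print(f"{restripping_tb:.0f} TB")
```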
LHCb software needs at Tier-2
• The functioning of LHCb Tier-2s is well known, since they only perform Monte Carlo production.
  • Already at the start-up of the Tier-1, the LHCb Bologna group set up a farm for MC production.
  • Not I/O intensive.
• Services to be implemented include CE, SE, RB, G-PBox.
• Tier-2s are not technically critical in LHCb:
  • They only have to produce the required amount of MC data.
Overall remarks
• On October 10, 2005, LHCb asked CSN1 to allocate resources for the Tier-2 at CNAF.
• The request was partly (100 k€) approved sub judice by CSN1, but the availability of these resources now depends on the resolution of general CNAF infrastructure problems.
• Fixing the problems could take a while. In the meantime, LHCb needs resources for Monte Carlo production.
• Until now, LHCb has made up for its lack of dedicated resources by exploiting unused INFN-Grid resources.
• If the exploitation policy for Tier-2 resources is not closed and computing resources continue to be available to LHCb, no further problems will arise while waiting for the CNAF problems to be fixed.
Overall remarks (II)
• The Tier-1 center must work.
  • Infrastructural and organizational problems must be resolved.
  • The Tier-2 federation could not make up for a malfunctioning Tier-1.
• Over the last 5 years the Tier-1 personnel have acquired skills that other centers would take a few years to build up.
  • Everything seems simple until one comes up against scalability problems.
• The need for dedicated personnel cannot be set aside.
  • This will be more evident during steady state. In the set-up phase there are many mobilized troops that will be missing in steady state.
[Figure: job efficiency.]
Overall remarks (III)
• In order for the Tier-2 federation to work efficiently:
  • Tier-2 coordination must exist above the sites and the experiments.
  • A policy enforcement mechanism (G-PBox) must work.
• Otherwise we could have unused resources and resource shortages at the same time.
Overall remarks (IV)
• At the Tier-1 (as for the other experiments at the Tier-2s), LHCb needs high-performance disk storage solutions for analysis jobs.
• High availability and high I/O throughput;
• Possibility to access files on the SE using a “file” protocol:
  • i.e. without copying the file from/to the local disk;
  • we would not like to rely on the local disks of the WNs at all!
• Dynamic management of disk storage allocation:
  • lifetime, pinning, space pre-allocation provided by SRM-v2;
  • see G. Donvito and A. Forti at this workshop for the status of SRM-v2.
• Various solutions are on the market, each with its pros and cons; the perfect product does not exist yet!
Overall remarks (V)
• LHCb-INFN investigated, together with CNAF, parallel file systems (GPFS, Lustre) as a valuable solution, if provided together with an appropriate SRM-v2, e.g. StoRM (developed by INFN/EGRID):
  • Works at a lower level;
  • Less increase of complexity with scale;
  • Fewer maintenance/configuration needs.
  • See V. Sapunenko and A. Brunengo at this workshop.
• Other valuable solutions include dCache, DPM and xrootd:
  • Developed respectively in US/DESY, CERN and US (the latter also with a developer from INFN-Padova).