IFAE-Atlas Workshop, Dec 21st 2005 Data Access Models: Can remote groups compete? ATLAS raw event size will be 1.6 MB. Raw events will flow from the Event Filter at a rate of 200 Hz, so ATLAS will produce 320 MB/s = 27.6 TB/day. Andreu Pacheco / IFAE, Release 6
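As a quick sanity check of these numbers, a minimal back-of-the-envelope sketch; the 1.6 MB event size and the 200 Hz Event Filter rate are the figures quoted above:

```python
# Sanity check of the ATLAS raw data rate quoted on the title slide.
event_size_mb = 1.6      # raw event size in MB
filter_rate_hz = 200     # event rate out of the Event Filter in Hz

throughput_mb_s = event_size_mb * filter_rate_hz    # 320 MB/s
daily_volume_tb = throughput_mb_s * 86400 / 1e6     # 86400 s/day, 1e6 MB/TB

print(f"{throughput_mb_s:.0f} MB/s, {daily_volume_tb:.1f} TB/day")  # 320 MB/s, 27.6 TB/day
```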
Some questions from Ilya • Will the grid work? Really? When? • Will PIC be of any help for the IFAE-Atlas group? • Should the IFAE-Atlas group deploy additional storage and processing? • Should we all go to CERN instead? • What data access model will we use?
Minimal introduction to the grid • The grid we plan to use today is a galaxy of Linux clusters. • In order for a Linux cluster to join the grid, a set of software packages and configuration files must be installed. • In our case, as of Dec 2005, this set of software packages is known as “LCG version 2.6”.
Components of a Linux cluster • User Interfaces – Desktops and public login machines. • Computing Element – Server hosting the batch queue manager. • Worker Nodes – Servers executing the batch jobs. • Storage Elements – Disk servers.
Common grid components for an experiment • VOMS Server – Authorizes users to become members of the project or experiment and to use its resources. • Resource Broker – Manages the generic batch queues pointing to several computing elements. • Catalog Server – Maintains a common directory catalog for all data files, tracking data locations and replicas.
Common components for a country • Certification Authority Server – Needed to obtain the digital certificate and to maintain the list of valid certificates. • Grid Monitoring Server – Needed to check that all clusters are working correctly. • User Support Server – Needed to follow up on user problems.
What can Atlas users do now with the grid? • Get a digital certificate. • Register with an experiment (Atlas). • Store, retrieve and replicate data files. • Register data files in the Atlas catalog. • Submit jobs to a specific site or to any free worker node across the sites. • Use a tool for submitting and tracking a massive number of jobs.
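As an illustration only, a minimal sketch of how these steps might be scripted from an LCG-2.6 user interface. The command names follow the standard LCG/EDG client tools of the time (voms-proxy-init, lcg-cr, lcg-rep, edg-job-submit), but the exact options, the storage element names and the logical file name are assumptions and placeholders, not a tested recipe:

```python
# Illustrative sketch of typical LCG-2.6 user operations; the options,
# storage element names and LFN below are hypothetical placeholders.
import subprocess

def run(cmd):
    """Echo and run a grid command on the user interface."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Create a VOMS proxy as a member of the ATLAS VO (needs a valid grid certificate).
run(["voms-proxy-init", "--voms", "atlas"])

# 2. Copy a local file to a storage element and register it in the catalog.
run(["lcg-cr", "--vo", "atlas", "-d", "srm.pic.es",
     "-l", "lfn:/grid/atlas/users/example/myfile.root",
     "file:///tmp/myfile.root"])

# 3. Replicate the registered file to a second storage element.
run(["lcg-rep", "--vo", "atlas", "-d", "castorgrid.cern.ch",
     "lfn:/grid/atlas/users/example/myfile.root"])

# 4. Submit a job described in a JDL file through the resource broker.
run(["edg-job-submit", "--vo", "atlas", "myjob.jdl"])
```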
Current issues with the grid • The commands and the instructions on how to use the grid will change. • The stability of grid services is difficult to achieve with young code and a geographically very distributed set of computing resources. • Procedures for adding users, security, operations, ... are still incipient. • Training and support are also still an issue.
LCG-Atlas Computing Model • Several years ago it was clear that the computing infrastructure required for the LHC experiments did not exist and had to be created. • There are four types of roles for computing facilities: Tier-0, Tier-1, Tier-2 and Tier-3. • At IFAE, all the computing effort was focused on creating PIC (3 years of work) as the Spanish Tier-1.
Simplest Tier definition • Tier-0 is CERN. The big one. • The Tier-1’s are the distributed storage facilities for experimental and Monte Carlo data with computing resources to reprocess and calibrate data. • The Tier-2’s are the distributed analysis and Monte Carlo generation facilities of the experiments open to Atlas users. • The Tier-3’s are the private computing resources of the IFAE-Atlas group.
Clarification on Tier-1 and Tier-2 • Tier-1: Hosting raw data, ESD, AOD and TAG datasets, and hosting the Monte Carlo data produced by the Tier-2s. These facilities will provide the reprocessing of the data and will run the calibration jobs. • Tier-2: Hosting some of the AOD data and the full TAG samples. These facilities will provide simulation and analysis capacity for the physics working groups.
[Diagram: the LCG tier hierarchy linking CERN (Tier-0), the Tier-1 centres and the Tier-2 sites (RAL, IN2P3, FNAL, CNAF, FZK, PIC, ICEPP, BNL, TRIUMF, Taipei, NIKHEF, MSU, CIEMAT, IFCA, UB, Cambridge, Budapest, Prague, UAM, IFIC, IFAE, Legnaro, USC, Krakow), down to small centres, desktops and portables.]
• Tier 0 – CERN • Tier 1 – PIC (~33% for ATLAS) • Tier 2 – We have a federated one in Spain (IFIC 50% + UAM 25% + IFAE 25%) • Tier 3 – There is one at each Spanish Atlas group.
Atlas computing resources in Bellaterra (from 1st Jan 2008) • Note: one Pentium IV CPU as used in our desktops is around 1.5 kSI2000; one rack server (dual-core, dual-CPU) can hold about 6 kSI2000.
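For a sense of scale, a back-of-the-envelope sketch using the kSI2000 figures in the note above; the 30 kSI2000 target is a purely hypothetical number chosen for illustration:

```python
import math

# kSI2000 figures quoted in the note above.
desktop_ksi2000 = 1.5      # one Pentium IV desktop CPU
rack_server_ksi2000 = 6.0  # one dual-core, dual-CPU rack server

target_ksi2000 = 30.0      # hypothetical capacity target, for illustration only

servers_needed = math.ceil(target_ksi2000 / rack_server_ksi2000)
desktop_equivalent = target_ksi2000 / desktop_ksi2000

print(f"{servers_needed} rack servers ~ {desktop_equivalent:.0f} desktop CPUs")  # 5 rack servers ~ 20 desktop CPUs
```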
What will be our data access model? • Data (raw and Monte Carlo) will be at the Tier-0 and the Tier-1’s. • We will have replicas of some of the needed data at the Tier-2 and Tier-3. • The MC data we generate and reconstruct will be copied to the Tier-1’s. • We will analyze the data by sending jobs to the Tier-2 and Tier-3 from our desktops or public login machines.
What will happen with the Tier-1 computing power? • In the long term, the Tier-1 batch queues will be used for large data processing jobs such as re-reconstruction of data, and jobs will be under the control of a production manager. • In the short term, the Tier-1’s are being used for distributed analysis and Monte Carlo production at low priority. This will probably change during 2006/2007.
What do we have to do at IFAE? • In order to fund the Tier-2 and Tier-3 computing facilities, a 2-year project (2006-2007) has been approved by the Spanish HEP program. • We need to deploy the grid on the desktops at IFAE, install a local cluster in the IFAE building (Tier-3) and install a grid cluster in the PIC computer room (Tier-2).
What else do we have to do at IFAE? • The IFAE Tier-2 and Tier-3 facilities must be reusable for Event Filter tests. • The IFAE Tier-2 and Tier-3 facilities must be optimised for use by the IFAE group working on Tilecal and on the specific IFAE activities. • We must identify and train a good Data Manager, key to success...
Conclusions • Atlas users can use the grid now. • However, the grid will not be stable until Sep 2006 at the earliest. • The IFAE Atlas group must deploy its Tier-2 and Tier-3 in cooperation with PIC as the Tier-1. • We should start having talks or meetings on how to use grid computing for Atlas at IFAE.