Data GRID deployment in HEPnet-J. Takashi Sasaki, Computing Research Center, KEK
Who are we? • KEK stands for • Kou = High • Enerugi = Energy • Kasokuki Kenkyu Kiko = Accelerator Research Organization • Since 2004 we have been a governmental agency, like the other national universities and national laboratories in Japan • We are an Inter-University Research Institute Corporation
Major projects at KEK • Belle: CP violation, at KEK • K2K, T2K: neutrino oscillation, KEK/Tokai to Kamioka • CDF: hadron collider, top quark, Fermilab, US • ATLAS: hadron collider, SUSY, CERN, Switzerland • J-PARC: joint project with JAEA, being built at Tokai • ILC (International Linear Collider): site not yet decided, international competition, Japan is interested in hosting • Lattice QCD: dedicated IBM Blue Gene, 57.3 TFlops • Material and life science: synchrotron radiation, muon and meson science • Technology transfer: medical applications, simulation, accelerator technology
HENP institutes in Japan • KEK is the only central laboratory in Japan • Smaller-scale centers also exist • ICEPP (U. of Tokyo), RIKEN, Osaka Univ. and a few others • The majority are small groups at universities • Mostly 1-3 faculty members and/or researchers, plus graduate students • No engineers or technicians for IT • This is not specific to HENP, but commonly observed • KEK has a role in offering them the necessary assistance • Unfortunately, graduate students in physics are the main human resource supporting IT
HEPnet-J • Originally, KEK organized the HEP institutes in Japan to provide networking among them • We started with 9600 bps DECnet in the early 1980s • KEK was one of the first Internet sites and hosted the first web site in Japan (1983? and 1992) • This year, Super SINET3 will be introduced, with 20 Gbps and 10 Gbps links to the main nodes, as the final upgrade • The focus is shifting to applications rather than bandwidth • GRID deployment is an issue • Virtual Organization for HEP Japan
History of HEPnet-J • [timeline figure] 2003: Super SINET backbone, IP, 10 Gbps
Data flow model in Belle • At every beam crossing an interaction between particles may happen, and the final-state particles are observed by the detector • Event • A different type of interaction may happen at each beam crossing • Events are kept in time sequence • Something like one frame of a movie film • Run • Something like a roll of movie film • Cut at a convenient file size for later processing (historically the size of a tape, 2 GB or 4 GB) • Data from the detector (signals) are called "raw data" • The physical properties of each particle are "reconstructed" • Vectorization of images and conversion of units • A signal-processing step • Events are classified by type of interaction (pattern matching) • Data Summary Tape (DST) • More condensed event samples are selected from the DST • Something like knowledge discovery in images • Called mini DST • Detector signals are stripped off • Sometimes a subset of the mini DST, the micro DST, is produced
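To make the raw data → DST → mini DST chain above concrete, here is a minimal sketch in Python; the class and function names are illustrative assumptions, not the Belle analysis software.

```python
# Minimal sketch of the Belle-style data hierarchy described above.
# All names (Event, Run, reconstruct, ...) are illustrative assumptions,
# not the actual Belle software.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    """One beam crossing: ~40 KB of raw detector signals plus derived quantities."""
    raw_signals: bytes
    tracks: list = field(default_factory=list)   # filled by reconstruction
    event_type: str = ""                         # filled by classification

@dataclass
class Run:
    """A time-ordered sequence of events, cut at a convenient file size (~2-4 GB)."""
    number: int
    events: List[Event]

def reconstruct(run: Run) -> Run:
    """Raw data -> DST: turn detector signals into physical properties and classify."""
    for ev in run.events:
        ev.tracks = [len(ev.raw_signals)]        # stand-in for real track fitting
        ev.event_type = "hadronic" if ev.tracks[0] > 100 else "other"  # stand-in pattern matching
    return run

def make_mini_dst(run: Run, wanted_type: str) -> List[Event]:
    """DST -> mini DST: keep only selected event types and strip the detector signals."""
    selected = [ev for ev in run.events if ev.event_type == wanted_type]
    for ev in selected:
        ev.raw_signals = b""                     # detector signals are stripped
    return selected
```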
Belle data analysis • Frequency of reprocessing • Reconstruction from raw data: once a year or less • DST production: twice a year or less • Mini DST production: many times • Micro DST production: many times • End-user analysis: every day, very many times • Monte Carlo production: more events than the real data • Mostly CPU-intensive jobs • Full simulation and fast simulation • Event size: 40 KB of raw data (signals only) • Record rate: 10 MB/s • Accumulated data in total: 1 PB
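The quoted numbers allow a quick consistency check; a minimal sketch, assuming 10^7 seconds of data taking per year, which is a common rule of thumb and not a figure from this talk.

```python
# Back-of-envelope check of the Belle numbers quoted above.  The 1e7 s of
# data taking per year is an assumed rule of thumb, not a figure from the
# talk, and the 1 PB total presumably includes processed data as well as raw,
# so the final figure is only indicative.
event_size    = 40e3        # bytes per event, raw data (signals only)
record_rate   = 10e6        # bytes per second written to storage
total_volume  = 1e15        # ~1 PB accumulated in total

events_per_second = record_rate / event_size            # ~250 events/s
seconds_per_year  = 1e7                                  # assumed live time per year
raw_per_year      = record_rate * seconds_per_year       # ~100 TB of raw data per year

print(f"{events_per_second:.0f} events/s recorded")
print(f"{raw_per_year / 1e12:.0f} TB of raw data per nominal year")
print(f"~{total_volume / raw_per_year:.0f} nominal years' worth of raw data in 1 PB")
```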
Event processing • Reconstruction and DST production are done on site because of the large data size • Physics analysis jobs are executed locally against the mini DST or micro DST, and also against MC • What they mainly do is statistical analysis and visualization of histograms (see the sketch below) • Also software development • Official jobs, such as MC production, cross these levels • CPU-intensive jobs • Mini DST and micro DST production are done by sub-groups and can be localized • Most jobs are integer intensive rather than floating point • Many branches in the code
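As an illustration of the end-user analysis step, a minimal histogramming sketch with NumPy and Matplotlib; the sample, variable and cut are invented for illustration, and real analyses use the experiment's own framework.

```python
# Minimal illustration of the end-user analysis step: histogramming a
# quantity from a (here randomly generated) mini DST-like sample.
# The variable and the cut are invented for illustration only.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
invariant_mass = rng.normal(loc=5.28, scale=0.01, size=10_000)  # fake signal-like sample

# a typical analysis selection: keep events in a mass window
selected = invariant_mass[(invariant_mass > 5.2) & (invariant_mass < 5.3)]

plt.hist(selected, bins=50)
plt.xlabel("invariant mass [GeV/c$^2$]")
plt.ylabel("events / bin")
plt.title("mini DST analysis sketch")
plt.show()
```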
Data distribution model in Belle • Level 0 (a few PB): only KEK holds the raw data and reconstructed data, plus the whole MC data set • Collaboration-wide data set: raw data, reconstructed data, DST, MC events (background + signal) • Level 1 (a few 10 TB): big institutions may want a replica of the DST and join MC production • Level 2 (a few 100 GB): most institutions are satisfied with the mini DST and may join MC production • Smaller institutions may even be satisfied with the micro DST • Sub-group-wide data set: mini DST, micro DST, MC events (signals)
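The three levels can also be written down as a small data structure; the layout below is ours, for illustration only, and is not a Belle or LCG configuration format.

```python
# The three-level Belle data distribution model restated as a data structure.
# Keys and layout are ours, for illustration; this is not a Belle/LCG config format.
DISTRIBUTION_MODEL = {
    "level_0": {
        "scale": "a few PB",
        "who": "KEK only",
        "datasets": ["raw data", "reconstructed data", "DST",
                     "MC events (background + signal)"],   # collaboration-wide set
    },
    "level_1": {
        "scale": "a few 10 TB",
        "who": "big institutions",
        "datasets": ["DST replica", "MC production share"],
    },
    "level_2": {
        "scale": "a few 100 GB",
        "who": "most institutions",
        "datasets": ["mini DST", "micro DST", "MC events (signals)"],  # sub-group-wide set
    },
}
```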
GRID deployment at KEK • Bare Globus: used up to GT2, then we gave up following it • We have our own GRID CA • In production since this January • Accredited by the APGrid PMA • Two LCG sites and one test bed • KEK-LCG-01: for R&D • KEK-LCG-02: for production, with an interface to HPSS • Test bed: training and tests • NAREGI test bed: under construction • SRB (UCSD): GSI authentication or password • SRB-DSI became available: works as an SRM for the SRB world from the LCG side • Performance tests will be done • Performance tests among RAL, CC-IN2P3 and KEK are ongoing • Gfarm: collaboration with AIST
GRID deployment • ATLAS definitely requires LCG/gLite • ICEPP (International Center for Elementary Particle Physics), U. of Tokyo, will be a tier-2 center of ATLAS • They were downgraded from tier-1 • One professor, one associate professor and a few assistant professors are working on the tier-2 center • No technicians, no engineers, no contractors, only "physicists" • Can you believe this? • How can the other ATLAS member institutes, mostly smaller groups, survive? • Belle • Some of the collaborators have asked us to support a GRID environment for data distribution and efficient analysis • Sometimes these collaborators also belong to one of the LHC experiments • They want to use the same thing for both
LCG/gLite • LCG (LHC Computing GRID) is now based on gLite 3.0 • The only middleware available today that satisfies HEP requirements • US groups are also developing their own • Difficulties • Support • Language gaps • Quality assurance • Assumes rich manpower
NAREGI • What we expect from NAREGI • Better quality • Easier deployment • Better support in our native language • What we need but which still does not appear to be in NAREGI • File/replica catalogue and other data-GRID functionality • Needs more assessment • It comes a little late • Earlier is better for us; we need something working today! • The β release requires the commercial version of PBS
First stage plan • Ask NAREGI to implement LFC on their middleware • We assume job submission between the two will be realized • Share the same file/replica catalogue space between LCG/gLite and NAREGI • Move data between them using GridFTP (see the sketch below) • Try some things ourselves • Brute-force porting of LFC onto NAREGI • NAREGI <-> SRB <-> gLite will also be tried • Assessments will be done for • Command-level compatibility (syntax) between NAREGI and gLite • Job description languages • Experiment software, especially ATLAS: how much does it depend on LCG/gLite?
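A rough sketch, under the shared-catalogue assumption above, of how a single file might move between the two domains: a GridFTP transfer followed by registration in the common LFC namespace. All host names, paths and the VO name are placeholders, and the exact options of the gLite catalogue commands should be checked against the installed client tools.

```python
# Rough sketch of the intended LCG/gLite <-> NAREGI data path:
# GridFTP for the transfer, a shared LFC namespace for the catalogue.
# Host names, paths and the VO are placeholders; the option details of the
# lcg_util/LFC client commands should be verified against the local install.
import os
import subprocess

os.environ["LFC_HOST"] = "lfc.example.kek.jp"   # placeholder shared LFC server

src = "gsiftp://se01.kek.jp/data/belle/mdst/run1234.mdst"            # gLite-side SE (placeholder)
dst = "gsiftp://naregi-gw.example.jp/data/belle/mdst/run1234.mdst"   # NAREGI-side server (placeholder)

# GridFTP transfer between the two middleware domains
# (requires a valid grid proxy on the submitting host).
subprocess.run(["globus-url-copy", "-p", "4", src, dst], check=True)

# Register the new replica under a logical file name in the shared LFC
# namespace so that both gLite and NAREGI jobs can resolve it.  Depending
# on the storage element, lcg-rf may require an srm:// SURL instead of a
# plain gsiftp:// URL.
subprocess.run(
    ["lcg-rf", "--vo", "belle",
     "-l", "lfn:/grid/belle/mdst/run1234.mdst",
     dst],
    check=True,
)

# Check that the logical name is visible in the shared catalogue.
subprocess.run(["lfc-ls", "-l", "/grid/belle/mdst"], check=True)
```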
Future strategy • ILC, the International Linear Collider, will be the target • Interoperability among gLite, OSG and NAREGI will be required
Conclusion • HE(N)P has problems that must be solved today • GRID seems to be the solution; however, the large consumption of human resources is a problem • We expect much from NAREGI • Still, we cannot escape from gLite • Interoperability is the issue • We are working on this issue together with NAREGI and IN2P3