
Grid Deployment at KEK

Grid Deployment at KEK. Go Iwai, Yoshimi Iida, Setsuya Kawabata, Takashi Sasaki and Yoshiyuki Watase. Joint Meeting of Pacific Region Particle Physics Communities (DPF2006 and JPS2006), Oct. 29 to Nov. 3, 2006, Sheraton Waikiki Hotel, Honolulu, Hawaii.


Presentation Transcript


  1. Grid Deployment at KEK
     Go Iwai, Yoshimi Iida, Setsuya Kawabata, Takashi Sasaki and Yoshiyuki Watase
     Joint Meeting of Pacific Region Particle Physics Communities, DPF2006 and JPS2006
     Oct. 29 ~ Nov. 3, 2006, Sheraton Waikiki Hotel, Honolulu, Hawaii

  2. Outline
     • Introduction
       • KEK
       • Network in Japan
     • Grid Deployment
       • Current Status
       • Strategy
       • Hosted VOs: Belle, ILC, APDG and so on
     • Grid Inter-operability
       • NAREGI: NAtional REsearch Grid Initiative
       • Relationship among LCG, NAREGI and KEK
     • Summary
     "Grid Deployment at KEK", DPF2006 & JPS2006

  3. Next Topics: Introduction (KEK, Network in Japan)

  4. KEK Computing Research Center
     • Our mission related to computing and networking
       • Providing computing facilities
         • KEK-B/Belle
         • ILC
         • J-PARC
         • Proton Synchrotron
         • K2K, T2K (long baseline neutrino detection)
         • Accelerator design
         • Application at Synchrotron Radiation Facility (material science, life science, etc.)
       • Networking
       • Security
       • Support for university groups in the field
         • As an Inter University Research Institute Corporation
     Figure: Overview of CRC Super Computer System

  5. HEPnet-J
     • Originally, KEK organized the HEP institutes in Japan to provide networking among them.
     • We started with 9600 bps DECnet in the early 1980s.
     • KEK is one of the first Internet sites and the first web site in Japan (1983? and 1992).
     • The current network infrastructure is SuperSINET, operated by NII (National Institute of Informatics).
     • SuperSINET will be upgraded to SINET3 in April 2007. SINET3 will provide a multi-layered network service with a 10 - 40 Gbps backbone.
     Figure: The 1st Homepage in Japan

  6. HEPnet-J (cont.)
     Figure: Network Topology Map of SINET/SuperSINET (Feb. 2006)
     Table: Line Speeds
     Table: Number of SINET Participating Organizations (Feb. 2006)

  7. Next Topics: Grid Deployment (Current Status, Strategy, Hosted VOs)

  8. Strategy on GRID
     • Deployment at KEK for major groups
       • BELLE: ongoing experiment
       • ILC: near-future target
     • University support
       • education and training
       • Deployment at smaller centers
       • HEPNET-J VO
     Figures: Overview of KEK-B accelerator; Design of ILC accelerator/detector

  9. Recent Events
     • Nov. 2005: HEP Data Grid Workshop, at KEK
       • training and cooperation in the Asia-Pacific region
     • Mar. 2006: First meeting on NAREGI/EGEE Interoperability, at CERN
       • launched interoperability projects between NAREGI and EGEE
       • NAREGI is discussed later in this talk
     • Aug. 2006: Belle workshop on Grid, at Nagoya Univ.
       • to share information among Belle collaborators
     • Sep. 2006: Japan-France Workshop on Grid Computing, at IN2P3/Lyon Univ.

  10. Summary of LCG Deployment
      JP-KEK-CRC-01 (Pre-Production System)
      • In operation since Nov. 2005.
      • Registered with GOC and ready for WLCG (Worldwide LCG).
      • Operated by KEK staff.
      • Site role: practice for the production system JP-KEK-CRC-02; test use among university groups in Japan.
      • Resources and components: SL-3.0.5 with LCG-2.7 (upgrade to gLite-3.0 is done); CPU: 14; Storage: 1 TB.
      • Supported VOs: belle, apdg, dteam and ops.
      JP-KEK-CRC-02 (Production System)
      • In operation since early 2006.
      • Registered with GOC and ready for WLCG.
      • Operation is outsourced to IBM Co., Ltd.
      • Resources and components: SL or SLC with LCG-2.7 (upgrade to gLite-3.0 is done); CPU: 48; Storage: 6 TB (not including HPSS).
      • Supported VOs: belle, apdg, atlasj, ilc, dteam and ops.
      JP-KEK-CRC-00 (Testbed System)
      • In operation since Jun. 2005.
      • A closed environment in comparison with the other sites; easy to access and configure.
      • Resources and components: SL-3.0.5 with gLite-3.0 (100% pure).
      • Supported VOs: belle, apdg, atlasj and g4med.
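
The site summary on slide 10 can be read as a mapping from each LCG site to the VOs it supports. The following is a minimal Python sketch of that reading; the dictionary contents are transcribed from the slide, while the names `SITE_VOS`, `sites_supporting` and `common_vos` are illustrative and not part of any Grid middleware.

```python
# Supported VOs per LCG site, transcribed from slide 10.
SITE_VOS = {
    "JP-KEK-CRC-00": {"belle", "apdg", "atlasj", "g4med"},
    "JP-KEK-CRC-01": {"belle", "apdg", "dteam", "ops"},
    "JP-KEK-CRC-02": {"belle", "apdg", "atlasj", "ilc", "dteam", "ops"},
}


def sites_supporting(vo: str) -> list[str]:
    """Return the sites (sorted by name) that list `vo` as supported."""
    return sorted(site for site, vos in SITE_VOS.items() if vo in vos)


def common_vos() -> set[str]:
    """Return the VOs supported by every site."""
    return set.intersection(*SITE_VOS.values())


if __name__ == "__main__":
    print(sites_supporting("ilc"))
    print(sorted(common_vos()))
```

A lookup of this kind answers questions such as which site an ILC user can submit to, or which VOs can be tested on the testbed before moving to production.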

  11. SRB: Storage Resource Broker
      • The SRB is client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets. (Bonny Strong, RAL)
      • We started working on SRB earlier than on LCG.
      • A zone federation among the Belle member institutes has been established.
      • SRB-DSI works as a GridFTP server and is easily integrated into LCG services. (http://www.globus.org/)
      • SRB exports existing data without physically copying it, which will be useful for existing projects.

  12. Other Grid-related Services
      • We have our own GRID CA (http://gridca.kek.jp/)
        • started in Feb. 2006 and recognized by LCG
        • accredited by APGRID PMA
        • # of issued user certificates: 25
        • # of issued host certificates: 74
      • VO Membership Service, supported VOs:
        • apdg is the VO for the Asia-Pacific Data Grid.
        • belle is the VO for the Belle experiment.
        • atlasj is the VO for the ATLAS experiment in Japan.
        • g4med is the VO for Geant4 medical applications.
      • Local Mirror Service (http://hepdg.cc.kek.jp/mirror/)
        • SL, SLC, LCG, gLite
        • An update with apt-get takes ~30 minutes from the CERN or FNAL repositories
        • ~3 minutes with the KEK repository
      • Semi-automatic Installation Service (http://hepdg.cc.kek.jp/install/)
        • WNs can be installed semi-automatically using PXE (Preboot eXecution Environment) and a kickstart configuration file.
      • Site Portal (http://grid.kek.jp/)
      Figure: KEK Grid CA Web Repository

  13. Belle VO
      • Started using SRB and LCG
        • LCG site: JP-KEK-CRC-01/02
      • Data distribution service using SRB-DSI
        • Belle already has a few PB of data in total, including hundreds of TB of DST and MC
        • Bulk file registration (Sregister) helps us: we do not move any of the files
        • Benefits both native SRB users and LCG users
      • The VO is supported by KEK
        • Nagoya (JP), Melbourne (AU), Academia Sinica (TW), Krakow (PL), etc.

  14. New B Factory Computer System
      Figure: New B Factory Computer System, since March 23, 2006
      Figure: History of B Factory Computer System
      Moore's Law: 1.5 y = x2.0, 4 y = x~6.3, 5 y = x~10
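
The Moore's Law factors quoted on slide 14 follow from assuming that capacity doubles every 1.5 years. The short Python sketch below reproduces the arithmetic; the function name `growth_factor` is illustrative, and the 1.5-year doubling period is the only input taken from the slide.

```python
def growth_factor(years: float, doubling_period: float = 1.5) -> float:
    """Capacity multiplier after `years` if capacity doubles every `doubling_period` years."""
    return 2.0 ** (years / doubling_period)


if __name__ == "__main__":
    # The three time spans quoted on the slide.
    for years in (1.5, 4.0, 5.0):
        print(f"{years} y -> x{growth_factor(years):.1f}")
```

With these inputs the factors come out at about 2.0, 6.3 and 10.1, consistent with the rounded values on the slide.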

  15. Photo of New B System
      • Computing Server: ~42,500 SPECint2K
      • Storage System (DISK): 1 PB
      • Storage System (HSM): 3.5 PB

  16. Belle Grid Deployment Plan
      • We are planning a two-phase deployment for the BELLE experiment.
      • Phase 1: BELLE users use the VO at JP-KEK-CRC-02, sharing it with other VOs.
        • JP-KEK-CRC-02 consists of the “Central Computing System” maintained by IBM corporation.
        • Available resources: CPU: 72 processors (Opteron); SE: 200 TB (with HPSS)
      • Phase 2: Deployment of JP-KEK-CRC-03 as the BELLE production system
        • JP-KEK-CRC-03 uses a part of the “B Factory Computer System” resources.
        • Available resources (maximum estimate): CPU: 2200 CPUs; SE: 1 PB (disk), 3.5 PB (HSM)
        • This system will be maintained by CRC and NetOne corporation.

  17. Belle Grid Deployment Plan (cont.)
      • We are planning to federate with Japanese universities.
      • KEK hosts the BELLE experiment and acts as Tier-0.
      • Universities with reasonable resources: full LCG site (Tier-1)
      • Universities without resources: UI
      • The central services such as VOMS, LFC and FTS are provided by KEK.
      • KEK also covers the web information and support services.
      • Grid operation is carried out jointly with 1~2 staff members at each full LCG site.
      Figure labels: preliminary design; deploy at phase-2; Tier-0 (JP-KEK-CRC-02, JP-KEK-CRC-03); Tier-1 (universities); UIs (universities)

  18. ILC VO
      • GRID for ILC is sponsored by
        • the GAKUJYUTSU SOUSEI budget (a grant from MEXT)
        • the French-Japan Joint Lab. Program
      • Initial goal
        • A tool to share data of total size 1~10 TB among institutes in Japan, Asia and worldwide.

  19. APDG VO
      • Asia Pacific Data GRID
      • Collaboration among Academia Sinica (TW), Center for HEP-Korea, University of Melbourne and KEK
      • Regular meetings, workshops and conferences
      • We are seeking tighter collaboration with ASGC (GOC in Asia)

  20. ATLAS JAPAN VO
      • Federation among ICEPP, Kobe Univ., Nagoya Univ. and KEK
        • Okayama Univ. and Hiroshima-IT are also potential sites
      • VO usage
        • Testing inter-connectivity among ICEPP, Kobe Univ. and KEK
        • Testing middleware functions
        • Measuring data-sharing performance
      • The ATLAS RC will be hosted by ICEPP, not by KEK

  21. Next Topics: Grid Inter-operability (NAREGI; Relationship among LCG, NAREGI and KEK)

  22. NAREGI: NAtional REsearch Grid Initiative
      • National Research Grid Initiative (NAREGI)
        • Apr. 2003: MEXT funded NAREGI as a 5-year project
        • Led by Prof. Ken Miura (NII)
        • Development of Grid infrastructure and an application for the promotion of the national economy
        • Target application is nanoscience and technology for new material design
      • Players
        • Computing & networking: NII, AIST, TITEC
        • Material scientists: IMS, U. Tokyo, Tohoku U., Kyushu U., KEK, ..
        • Companies: Fujitsu, Hitachi, NEC
      • Distributed facility: Computing Grid up to 100 TFLOPS in total
      • Extended to 2010 as a part of the National Peta-scale Computing Project
      Figure labels (as of 2004): Nagoya, Tokyo, SuperSINET, IMS, NII; Software Testbed: 5 TFLOPS (896 CPU); Application Testbed: 10 TFLOPS (1,618 CPU)

  23. Collaboration with NAREGI
      • What we expect from NAREGI
        • Better quality
        • Easier deployment
        • Better support in the native language
      • What we need but NAREGI does not yet appear to provide
        • File/replica catalogue and data GRID related functionalities
        • Need more assessments
      • It comes a little bit late
        • Earlier is better for us
        • We need something working today!
        • Requires a commercial version of PBS for β1
      • LCG (LHC Computing GRID) is now based on gLite 3.
        • The only middleware available today that satisfies HEP requirements
        • US people are also developing their own
      • Difficulty
        • Support
        • Language gaps
        • Quality assurance
        • Assumes abundant manpower

  24. LCG and NAREGI Inter-operability
      • NAREGI has a strong interest in interoperability because it came late and has decided to establish it on its side
      • First meeting at CERN
        • March 2006
        • NAREGI, LCG and people from KEK
      • Second meeting at GGF Tokyo

  25. KEK Plan on GRID Inter-operability
      • NAREGI will implement LFC on their middleware
        • We assume job submission between them will be realized soon
        • Share the same file/replica catalogue space between LCG and NAREGI
        • Move data between them using GridFTP
      • NAREGI <--> SRB <--> LCG will also be tried
        • using SRB-DSI
      • Assessments will be done for
        • Command-level compatibility (syntax) between NAREGI and LCG
        • Job description languages
        • Software in experiments
      • ILC, the International Linear Collider, will be a target
        • interoperability among LCG, OSG and NAREGI will be required

  26. Next Topics: Summary

  27. Summary
      • We, the KEK Computing Research Center, are working on GRID mainly for Belle and ILC
        • Belle has started to use LCG
        • ILC will follow soon
        • University support as well
      • Three LCG sites are in operation at KEK.
      • Other Grid-related services are also in operation.
        • CA, VOMS, Mirror, Installation and Documentation
      • Grid interoperability between NAREGI and LCG will be established.

  28. Thank You
      K. Amako (1,2), J. Ebihara (3), Y. Iida (1), K. Inami (4), K. Ishikawa (5), M. Kaga (4), S. Kameoka (1,2), S. Kawabata (1), K. Kawagoe (6), A. Kimura (7), Y. Kiyamura (6), M. Matsui (5), K. Murakami (1,2), H. Sakamoto (8), T. Sasaki (1,2), S. Suzuki (1), Y. Watase (1), S. Yashiro (1) and H. Yoshida (9)
      (1) High Energy Accelerator Research Organization (KEK)
      (2) Japan Science and Technology Agency (JST)
      (3) SOUM Co., Ltd.
      (4) Nagoya University
      (5) IBM Japan Systems Engineering Co., Ltd.
      (6) Kobe University
      (7) Ashikaga Institute of Technology
      (8) ICEPP, University of Tokyo
      (9) Naruto University of Education

  29. Backup

  30. Introduction to SRB The SDSC Storage Resource Broker (SRB) is a client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets. -- Bonny Strong, RAL

  31. What is SRB
      • Software product developed by the San Diego Supercomputing Centre (SDSC), as part of the U.S. National Partnership for Advanced Computational Infrastructure.
      • It has been operational at San Diego since 1997, where currently 200 TB of data are shared between 30 participating universities.
      -- Bonny Strong, RAL

  32. SRB History at KEK
      • 2003 Jun. 06: Start of operation of test system 1 with PostgreSQL
        • HPSS interface
      • 2003 Oct. 22: Test system 2 with DB2
        • PostgreSQL looks better than DB2
      • 2004 Apr. 02: SRB in the BELLE computer system
        • Interface to SONY Peta Site via NFS
        • Test with Melbourne
      • 2004 Dec. 06: Previous workshop
        • Together with Michael Wan (SDSC), an SRB federation was established among AU, KR, TW, CN, PL and JP
      • 2005 Jul. 28: Nagoya joined the federation
      • 2006 Apr. 18: Replaced with the new system
        • Interface to HPSS via NFS/VFS
      • 2006 Jun. 01: Federated with IN2P3

  33. SRB Main Features
      • Allows users to access files and database objects across a distributed environment.
      • The actual physical location and the way the data is stored are abstracted from the user.
      • Can manage replication and movement of data.
      • Allows the user to add user-defined metadata describing the scientific content of the information, which can be searched for data discovery.
      • Metadata held includes the physical and logical details of the data held and its replicas, user information, and security rights and access control.
      -- Bonny Strong, RAL
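
The features listed on slide 33 (a logical name space, replicas, searchable user metadata) can be illustrated with a toy catalogue. The Python sketch below is not the SRB API; the class `ToyCatalogue`, its methods and the example paths used with it are all hypothetical, chosen only to show how a catalogue can register data in place by storing pointers rather than copying files.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    replicas: list[str] = field(default_factory=list)       # physical locations
    metadata: dict[str, str] = field(default_factory=dict)  # user-defined attributes


class ToyCatalogue:
    """Maps logical names to physical replicas; it stores pointers, never file contents."""

    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def register(self, logical: str, physical: str, **metadata: str) -> None:
        """Record that `physical` holds a copy of `logical`; the data itself is not moved."""
        entry = self._entries.setdefault(logical, Entry())
        if physical not in entry.replicas:
            entry.replicas.append(physical)
        entry.metadata.update(metadata)

    def locate(self, logical: str) -> list[str]:
        """Return every known physical replica of a logical name."""
        return list(self._entries[logical].replicas)

    def search(self, **criteria: str) -> list[str]:
        """Return the logical names whose metadata matches all criteria."""
        return sorted(
            name
            for name, entry in self._entries.items()
            if all(entry.metadata.get(k) == v for k, v in criteria.items())
        )
```

A user of such a catalogue asks for a logical name or a metadata query and never needs to know which storage system holds the bytes, which is the abstraction the slide describes.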

  34. SRB-DSI Architecture The Storage Resource Broker Data Storage Interface (SRB-DSI) is an extension to the GridFTP server that allows it to interact with SRB. Plugging this extension into a GridFTP server allows the server to access an SRB resource and serve it to any GridFTP client as though it were a filesystem.
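
The plug-in idea on slide 34 is an instance of a pluggable storage backend: the transfer server talks to one interface, and the backend behind it decides where the bytes come from. The Python sketch below illustrates only that pattern under stated assumptions. It is not the GridFTP DSI API, and every class name in it is hypothetical.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical stand-in for the storage interface a transfer server calls."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        """Return the bytes stored under `path`."""


class LocalFilesBackend(StorageBackend):
    """Default case: paths map directly to stored files (simulated with a dict)."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self._files = files

    def read(self, path: str) -> bytes:
        return self._files[path]


class BrokerBackend(StorageBackend):
    """Broker case: a catalogue resolves a logical path to (resource, physical path)."""

    def __init__(
        self,
        catalogue: dict[str, tuple[str, str]],
        resources: dict[str, dict[str, bytes]],
    ) -> None:
        self._catalogue = catalogue
        self._resources = resources

    def read(self, path: str) -> bytes:
        resource, physical = self._catalogue[path]
        return self._resources[resource][physical]


class TransferServer:
    """Serves reads without knowing which backend is plugged in."""

    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def get(self, path: str) -> bytes:
        return self._backend.read(path)
```

Because `TransferServer` depends only on `StorageBackend`, a client sees the same `get` call whether the data sits on a local disk or behind a broker, which is why the slide says the resource appears "as though it were a filesystem".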

  35. Performance of SRB-DSI
      Figure: measured transfer rates of ~60 MB/sec, ~60 MB/sec, ~30 MB/sec and ~40 MB/sec
      Bandwidth: 117 MB/sec (with iperf)
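
To put the rates on slide 35 in context, each one can be expressed as a fraction of the 117 MB/sec link bandwidth measured with iperf. The Python sketch below does this arithmetic; the slide transcript does not say which transfer path each rate belongs to, so the rates are treated as an unlabeled list, and the name `utilization` is illustrative.

```python
LINK_MB_PER_S = 117.0  # raw bandwidth measured with iperf, from the slide


def utilization(rate_mb_per_s: float, link_mb_per_s: float = LINK_MB_PER_S) -> float:
    """Fraction of the raw link bandwidth that a transfer rate achieves."""
    return rate_mb_per_s / link_mb_per_s


if __name__ == "__main__":
    # The four rates quoted on the slide, in the order they appear.
    for rate in (60.0, 60.0, 30.0, 40.0):
        print(f"{rate:.0f} MB/sec -> {utilization(rate):.0%} of the link")
```

On these numbers the fastest transfers use roughly half of the raw link bandwidth.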

  36. LCG History at KEK
      • 2005
        • Jun: Testbed project was started with LCG-2.6.
          • This site, named JP-KEK-CRC-00, is used for tests.
        • Nov: Workshop held at KEK.
          • JP-KEK-CRC-01 with LCG-2.7
          • Pre-production site
          • APDG
      • 2006
        • Feb: CA service started.
        • Mar: JP-KEK-CRC-01 site registered with GOC.
        • May: JP-KEK-CRC-02 started with LCG-2.7.
        • Jun: JP-KEK-CRC-02 site registered with GOC.
          • CRC-00 being reconstructed with gLite-3.0 from scratch.
          • not ready to start.
        • Aug: BELLE Grid Workshop
          • CRC-01/02 updated to gLite-3.0

  37. JP-KEK-CRC-00
      Figure: network diagram, 7 nodes & 2 WNs via NAT
      130.87.208.0/22 (cc.kek.jp): dg01.cc.kek.jp to dg07.cc.kek.jp
      192.168.162.0/24 (kekgrid.jp): wn001.kekgrid.jp, wn002.kekgrid.jp
      • We have positioned JP-KEK-CRC-00 as the testbed site.
      • We use it for:
        • practice in installing LCG/gLite.
        • functional tests.
      • This site is a closed environment in comparison with the other sites.
        • easy to access and configure
      • We installed gLite-3.0 (100% pure)
        • but it doesn't work yet
      • Configured VOs
        • belle (BELLE experiments)
        • apdg (Asia-Pacific Data Grid)
          • test VO for AP region people
        • g4med (Geant4 Medical Application)
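
The two address blocks on slide 37 can be inspected with Python's standard `ipaddress` module. The sketch below is only a reading aid: the two CIDR blocks come from the slide, while the helper name `summarize` and the dictionary keys are illustrative.

```python
import ipaddress

# Address blocks as printed on the slide.
PUBLIC_NET = ipaddress.ip_network("130.87.208.0/22")    # cc.kek.jp side
PRIVATE_NET = ipaddress.ip_network("192.168.162.0/24")  # kekgrid.jp side, behind NAT


def summarize(net: ipaddress.IPv4Network) -> dict:
    """Collect the basic facts about an IPv4 block."""
    return {
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "addresses": net.num_addresses,
        "private": net.is_private,
    }


if __name__ == "__main__":
    for net in (PUBLIC_NET, PRIVATE_NET):
        print(net, summarize(net))
```

The worker-node block falls in private address space, which is why the WNs reach the outside only through NAT, while the /22 block spans 130.87.208.0 to 130.87.211.255.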

  38. JP-KEK-CRC-01
      Figure: network diagram (Last-mod: 2006-08-03)
      Hosts dg09 to dg18 at 130.87.208.169 to 130.87.208.178
      Roles shown: DNS, NAT, RB, CE, SE (1TB RAID), MON, VOMS, UI, PX, BDII, LFC, NFS, VOBOX
      Private network xxx.keklcg.jp (192.168.1.0/24): ce01.keklcg.jp 192.168.1.1, se01.keklcg.jp 192.168.1.2, mon.keklcg.jp 192.168.1.2, ce02.keklcg.jp 192.168.1.4, se02.keklcg.jp 192.168.1.5
      WN Farm 1: wn001~wn015
      • We have positioned JP-KEK-CRC-01 as the pre-production site.
      • We use it for:
        • practice for the production site JP-KEK-CRC-02
        • test use among university groups in Japan
      • SL305/LCG-2.7
        • upgrade to gLite-3.0 is done
      • CPU: 14
      • SE: 1TB
      • under GOC monitoring
      • supported VOs
        • belle (BELLE experiments)
        • apdg (Asia-Pacific Data Grid)
        • dteam
        • ops
      • # of users: ~20 (for WLCG operation)

  39. JP-KEK-CRC-02
      Figure: system diagram
      Grid service nodes rls01 to rls09 and rac01; roles shown: GRID UI, CE, RB, LFC, VOMS, LSF (UI), PX/BDII, SE, MON
      LCG file server + DISK (AIX5L 5.3); HPSS system; PBS; 200TB; 48CPU
      WNs: LCG compute servers (CERN SL) rlc01~rlc18 and rlc19~rlc36
      • We use it as the production site.
      • SL305/LCG-2.7
        • upgrade to gLite-3.0 is done.
      • CPU: 48
      • SE: 6TB (w/o HPSS)
        • HPSS is now connected to a part of the DPM pool.
        • Kohki will talk about using HPSS later.
      • CE: we plan to use LSF as the LRMS.
      • under GOC monitoring
      • supported VOs
        • belle (BELLE experiments)
        • ilc (International Linear Collider experiments)
        • apdg (Asia-Pacific Data Grid)
        • atlas_j (Atlas Japan community)
        • dteam
        • ops
      • # of users: ~20 (for WLCG operation)
