KEK-ICEPP data transfer test
ICEPP-KEK CRC collaboration
Objective
• Establish a simple data grid between KEK and ICEPP (U-Tokyo) over the Super-SINET WAN.
• NorduGrid/Globus
• HPSS data access from NorduGrid
• Test the performance of HPSS over the WAN.
• Transfer LCD-0 data generated by the ICEPP PC farm to KEK users.
1st PacifiGrid
Hardware
ICEPP-KEK Grid Testbed: Hardware
• ICEPP
  • Computing Elements: 4 nodes with 8 CPUs (Athlon MP 2000+ 1.7 GHz, 512 MB memory)
• KEK
  • Computing Elements: 50 nodes with 100 CPUs (Pentium III 1 GHz, 512 MB memory)
ICEPP-KEK Grid Testbed: Network
• 1 GbE connection over Super-SINET between the ICEPP PC farm, the KEK PC farm and the HPSS servers, all in a single subnet.
• RTT ~ 4 ms; the link quality is quite good.
GRID testbed environment with HPSS through GbE-WAN
[Diagram: the ICEPP and KEK sites, ~60 km apart. Labels: CE (6 CPUs), CE (100 CPUs), SE (0.2 TB), SE (HPSS, 120 TB), HPSS servers, PBS clients, user PCs; links of 1 Gbps and 100 Mbps. One site runs NorduGrid (grid-manager, gridftp-server), Globus-mds, Globus-replica and a PBS server; the other runs NorduGrid (grid-manager, gridftp-server), Globus-mds and a PBS server.]
Software
ICEPP-KEK Grid Testbed: Software
• Globus 2.2.2
• NorduGrid 0.3.12 + PBS 2.3.16
• HPSS 4.3 + GSI-enabled pftp (GSI-pftp)
NorduGrid
• Used as the Grid middleware.
• NorduGrid (The Nordic Test bed for Wide Area Computing and Data Handling)
• http://www.nordugrid.org
• “The NorduGrid architecture and tools”, presented by A. Waananen et al. at CHEP03
Why NorduGrid
• A natural application of the Globus Toolkit to PBS.
• PBS clients do NOT need a Globus/NorduGrid installation.
• We installed NorduGrid/Globus on just 3 nodes (ICEPP CE, KEK CE, KEK HPSS SE) but can use more than 60 nodes.
• Simple, but sufficient functionality.
• Actually used for the ATLAS DC in the Nordic countries.
• A good starting point for testing basic regional center functionality.
HPSS as NorduGrid Storage Element
• HPSS does not speak ‘Globus’, so we need something in between.
• GridFTP for HPSS
  • In the design phase at Argonne Lab.
  • Others are also being developed? (SDSC?)
• GSI-enabled pftp (GSI-pftp)
  • Developed at LBL.
• SRB
• GSI-pftp is not GridFTP. But…
GSI-pftp as NorduGrid SE
• Both GridFTP and GSI-pftp are a kind of FTP; only their extended protocols differ.
GSI-pftp as NorduGrid SE
• The protocols for parallel transfer and buffer management are different.
• DCAU (Data Channel Authentication) is unique to GridFTP, but it is optional for the user.
• GSI-pftpd and a GridFTP client can communicate with each other successfully, except for parallel transfer.
Sample XRSL
&(executable=gsim1)
(arguments="-d")
(inputfiles=("data.in" "gsiftp://dt05s.cc.kek.jp:2811/hpss/ce/chep/manabe/data2"))
(stdout=datafiles.out)
(join=true)
(maxcputime="36000")
(middleware="nordugrid")
(jobname="HPSS access test")
(stdlog="grid_debug")
(ftpThreads=1)
Performance measurement
Players in HPSS
[Diagram: HPSS server, GSI-pftp server, disk movers (disk cache, x3), tape movers (x3) and the Computing Element (GridFTP client) in ICEPP/KEK. The HPSS side is shared by many users. Tape: 3590 (14 MB/s, 40 GB). Machine labels shown: 2-CPU Power3 375 MHz, AIX 4.3, HPSS 4.3, Globus 2.0; 2-CPU Pentium III 1 GHz, RedHat 7.2, Globus 2.2; 2-CPU Power3 375 MHz, AIX 4.3, HPSS 4.3.]
Possible HPSS Configuration 1
• Put the ‘disk mover (cache)’ near the HPSS server.
• A cache should be near the consumer, but here the ‘disk mover’ is far from the CE.
• Gets the high performance of the SP switch.
[Diagram: HPSS server and disk mover at KEK, connected by an SP switch (150 MB/s); Computing Element at ICEPP, 60 km away over Super-SINET 1 GbE.]
Possible HPSS Configuration 2
• Put a ‘remote disk mover (cache)’ near the CE.
• Fast access between the CE and cached files.
• If a CE on the KEK side accesses the same file, a long detour occurs.
[Diagram: ICEPP and KEK connected by Super-SINET 1 GbE. Labels: Computing Element (at each site), HPSS server, disk mover, LAN 1 GbE.]
Possible HPSS Configuration 3
• To avoid a long access delay for the CE at KEK, the disk layer can be divided into two hierarchies. However, this is a complicated configuration.
[Diagram: KEK and ICEPP. Labels: Computing Element (at each site), HPSS server, HPSS hierarchy 1, HPSS hierarchy 2, HPSS hierarchy 3.]
Possible HPSS Configuration 1
• Current setup
[Diagram: HPSS server and disk mover at KEK with a Computing Element on the LAN (1 GbE); a Computing Element at ICEPP, 60 km away over the WAN (Super-SINET 1 GbE).]
Performance
• Basic network performance.
• HPSS Client API performance.
• pftp client - pftp server performance.
• GridFTP client - pftp server performance.
Note: HPSS access methods
• Client API (HPSS original)
• Parallel FTP (HPSS original)
• GridFTP (by GSI-pftp)
Basic Network Performance
• RTT ~ 4 ms, free of packet loss, MTU = 1500.
• The CPU/NIC is the bottleneck.
• The maximum TCP buffer size (256 kB) in the HPSS servers cannot be changed (it is optimized for the IBM SP switch).
[Plots: LAN and WAN]
Basic network performance on Super-SINET
• More than 4 TCP sessions reach the maximum transfer speed.
• With a large enough TCP buffer, ~1 session gets almost the maximum speed.
[Plot: network transfer vs. number of TCP sessions over the WAN, ICEPP client to KEK HPSS mover. x-axis: # of TCP sessions (0-10); y-axis: aggregate Tx speed (Mbit/s, 0-600). Curves: client buffer size = 1 MB and client buffer size = 100 KB; HPSS mover buffer size = 256 kB.]
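As a rough cross-check on the buffer-size effect, a single TCP session cannot move more than one window of data per round trip. The sketch below applies that textbook bound (window / RTT) to the buffer sizes quoted on these slides. The function name is illustrative, the usable window is assumed to be the smaller of the two endpoint buffers, and protocol overhead and the 1 GbE line rate are ignored, so the results are ceilings rather than predictions of the measured curves.

```python
def window_limited_mbit_per_s(client_buf_bytes, server_buf_bytes, rtt_s):
    """Upper bound on one TCP session's throughput in Mbit/s.

    Assumes at most one window of data is in flight per round trip and
    that the usable window is the smaller of the two endpoint buffers.
    """
    window = min(client_buf_bytes, server_buf_bytes)
    return window * 8 / rtt_s / 1e6


RTT = 0.004              # ~4 ms round-trip time quoted for the WAN path
MOVER_BUF = 256 * 1024   # HPSS mover TCP buffer, fixed at 256 kB

for label, client_buf in [("100 KB", 100 * 1024), ("1 MB", 1024 * 1024)]:
    bound = window_limited_mbit_per_s(client_buf, MOVER_BUF, RTT)
    print(f"client buffer {label}: <= {bound:.0f} Mbit/s per session")
```

Under these assumptions a 100 KB client buffer caps one session at roughly 205 Mbit/s, while with a 1 MB client buffer the 256 kB mover buffer becomes the limit at roughly 524 Mbit/s. That would be consistent with several sessions being needed to fill the link when the client buffer is small.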
Disk mover disk performance
• HPSS SSA raw disk performance: read/write ~ 50/100 MB/s
• PC farm disk performance: read/write ~ 30-40 MB/s
HPSS Client API
[Plots: HPSS disk <-> CE memory transfer speed, LAN and WAN]
HPSS Client API
• Network latency affects file transfer speed.
• The maximum raw TCP speed was almost the same, but the data transfer speed dropped to 1/2 on the WAN with RTT ~ 4 ms.
• The reason is not clear yet. Perhaps there is frequent communication between the HPSS core server and the HPSS client (for every chunk (= 4 kB)?).
• Write overhead for single-buffer transfer was bigger than for read.
• A 64 MB buffer size was enough for the RTT ~ 4 ms network.
Pftpd → pftp: HPSS mover disk -> client
[Plot: transfer to /dev/null on the client. x-axis: # of file transfers in parallel (0-10); y-axis: transfer speed (MB/s, 0-80). Curves: KEK client (LAN), ICEPP client (WAN), ICEPP client with pwidth.]
pftpd-pftp ‘get’ performance
• As with Client API transfer, even with a large enough buffer, the transfer speed over the WAN is 1/2 of that on the LAN.
• Transferring multiple files simultaneously (>4) increases the aggregate transfer bandwidth. We had 2 disk movers with 2 disk paths each (2x2=4).
• Single-file transfer with multiple TCP sessions (the pftp pwidth command) was not effective on the RTT = 4 ms network with a large enough FTP buffer.
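The multi-file pattern described here, several independent transfers in flight at once, can be sketched with a thread pool. This is only an illustration of the idea, not the tooling used in the measurements (those used pftp clients). `fetch` is a hypothetical callable that transfers one file and returns the number of bytes moved, and the default of 4 workers simply mirrors the 2x2 disk paths mentioned on the slide.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transfer_all(fetch, names, parallel=4):
    """Run fetch(name) for every name, with up to `parallel` at once.

    `fetch` is a hypothetical callable that moves one file and returns
    the number of bytes transferred.
    Returns (total_bytes, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        total = sum(pool.map(fetch, names))
    return total, time.perf_counter() - start


def aggregate_mb_per_s(total_bytes, elapsed_s):
    """Aggregate rate over the whole batch, in MB/s (1 MB = 10**6 bytes)."""
    return total_bytes / elapsed_s / 1e6
```

Calling `transfer_all` with `parallel` set to 1, 2, 4 and so on, and passing the result to `aggregate_mb_per_s`, would give the kind of aggregate-speed-versus-parallelism curve shown in the plots.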
Pftp → pftp: HPSS mover disk -> client disk
[Plot: x-axis: # of file transfers in parallel (0-10); y-axis: aggregate transfer speed (MB/s, 0-80). Curves: KEK client (LAN) and ICEPP client (WAN), each to /dev/null and to client disk. FTP buffer = 64 MB. Client disk speed 35-45 MB/s: 48 MB/s at KEK, 33 MB/s at ICEPP.]
Pftpd → pftp ‘get’ performance (2)
[Diagram: transfer path with a CPU at each end; labels 100 MB/s, 640 Mbps = 80 MB/s, 40 MB/s.]
• Even if each component (disk, network) performs well, the total throughput becomes poor because the accesses are done serially.
Total speed = 1/(1/100 + 1/80 + 1/40) ≈ 21 MB/s
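The serial-access formula can be checked with a few lines of Python. The sketch below reproduces the figure of about 21 MB/s from the three stage speeds on the slide and, for comparison only, prints the idealized bound that would apply if the stages overlapped perfectly (the slowest stage alone). The function names are illustrative, and the pipelined case is a textbook comparison, not something measured in this test.

```python
def serial_throughput(stage_speeds_mb_s):
    """Effective MB/s when every byte passes through each stage in turn
    with no overlap between stages, so the per-byte times add up."""
    return 1.0 / sum(1.0 / s for s in stage_speeds_mb_s)


def pipelined_throughput(stage_speeds_mb_s):
    """Idealized MB/s if all stages overlapped perfectly: the slowest
    stage alone sets the rate."""
    return min(stage_speeds_mb_s)


stages = [100, 80, 40]  # MB/s figures quoted on the slide
print(f"serial:    {serial_throughput(stages):.1f} MB/s")
print(f"pipelined: {pipelined_throughput(stages):.1f} MB/s")
```

The gap between roughly 21 MB/s and the 40 MB/s of the slowest stage indicates how much throughput is lost to the lack of overlap rather than to any single slow component.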
HPSS ‘get’ with Tape Library: pftp-pftp get performance
• Thanks to HPSS multi-file transfer between the tape and disk hierarchies, and a sufficient number of tape drives, multiple-file transfer was sped up even when the data was on tape.
[Plot: x-axis: # of file transfers in parallel (0-10); y-axis: elapsed time (sec, 0-300). Curves: data on the HPSS mover disk (disk cache); data on a tape mounted in an HPSS mover (tape in drive); data on an unmounted tape (tape off drive).]
pftp → pftpd ‘put’ performance
[Plot; curve labels: 1 file, N files, aggregate of N files, 1 file (pwidth).]
GSI pftpd – GSI pftp client
• We compared GSI pftpd – GSI pftp client transfer with the normal Kerberos-authenticated (kerb) pftp-pftp transfer. Both had equivalent transfer speeds. Since only the initial authentication differs, this was the expected result.
GSI pftpd – GridFTP client
• We compared GSI pftpd – GSI pftp client (pwidth=1) transfer with GSI pftpd – GridFTP client transfer. Both had equivalent transfer speeds in many cases, but…
• The difference between ‘GSI pftpd – GSI pftp client’ and ‘GSI pftpd – GridFTP client’ is the parallel multiple-TCP-session transfer feature (pwidth > 1), so the result seems reasonable.
GridFTP client and GSI-pftp server
[Plot; curve labels: disk mover (!= pftpd) to client, pftp-pftpd; disk mover (= pftpd) to client, gridftp-pftpd; disk mover (!= pftpd) to client, gridftp-pftpd.]
GSI-pftpd with GridFTP client
• It works!
• But it is less secure than GridFTP – GridFTPd (data path authentication is omitted).
• In our environment, GridFTP parallel TCP transfer is not needed.
• With multiple disk movers, all data transfers go through the single pftpd server (when used with a GridFTP client).
Path difference: pftp - pftpd vs. GridFTP - GSI-pftpd
[Diagram: two panels, each showing a pftp server, disk movers (x3) and tape movers (x3); one panel with the CE as pftp client, the other with the CE as GridFTP client.]
Summary
• ICEPP and KEK configured a NorduGrid testbed with an HPSS storage server over a high-speed GbE WAN.
• Network latency affected HPSS data transfer speed, especially for the HPSS client API.
• ‘GSI-pftpd’, developed by LBL, was successfully adopted as the interface between NorduGrid and HPSS, but it has room for performance improvement with multiple disk movers.
From HPSS developers
• For an API client with an old version of HPSS, there was flow control on every 4 kB of data transferred, in order to support not only TCP but also IPI-3. The present version of HPSS has a TCP-only mode, and using hpss_ReadList()/WriteList() will also help performance.
• For pftp access over the WAN, the pdata-only and pdata-push protocols, introduced in HPSS ver. 5.1, will increase performance.