
  1. Optimisation of Grid Enabled Storage at Small Sites • Jamie K. Ferguson – University of Glasgow (J.Ferguson@physics.gla.ac.uk) • Graeme A. Stewart – University of Glasgow • Greig A. Cowan – University of Edinburgh

  2. Typical Tier 2 & Purpose of the Inbound Transfer Tests Details of the hardware/software configuration for the File Transfers Analysis of Results Introduction

  3. LHC and the LCG • LHC – the most powerful instrument ever built in the field of physics • Generates huge amounts of data every second it is running • Around 10 PB is retained annually for processing at sites • The use case is typically files of ~GB size, many of which are cascaded down to be stored at T2s until analysis jobs process them

  4. Typical Tier 2 – Definition • Limited hardware resources: (in GridPP) using dCache or dpm as the SRM, a few (one or two) disk servers, a few terabytes of RAIDed disk • Limited manpower: not enough time to configure and/or administer a sophisticated storage system • Ideally want something that just works “out of the box”

  5. Importance of Good Write (and Read) Rates • Experiments require good inbound/outbound rates; writing is more stressful than reading, hence our focus • Expected data transfer rates (T1 ==> T2) will be directly proportional to the storage at a T2 site: a few hundred Mbps for small(ish) sites, up to several Gbps for large CMS sites (see the illustration below) • The limiting factor could be one of many things – known from recently coordinating 24-hour tests between all 19 of the GridPP T2 member institutes • Those tests also yielded file transfer failure rates
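
A quick back-of-the-envelope illustration (not taken from the slides themselves) of why rate and storage scale together: a sustained few hundred Mbps fills a few terabytes of T2 disk in roughly a day. The 200 Mbps figure below is an assumed example of "a few hundred Mbps".

```python
# Illustrative only: convert a sustained inbound rate into TB filled per day.
rate_mbps = 200                        # assumed example rate, not a measured value
mb_per_s = rate_mbps / 8               # megabits/s -> megabytes/s
tb_per_day = mb_per_s * 86400 / 1e6    # MB/day -> TB/day
print(f"{rate_mbps} Mbps sustained ~= {tb_per_day:.1f} TB per day")
```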

  6. gLite File Transfer Service • Used FTS to manage transfers – easy-to-use file transfer management software • Uses SURLs for source and destination • The experiments shall also use this software • Able to set the channel parameters Nf and Ns • Able to monitor each job, and each transfer within each job: Pending, Active, Done, Failed, etc. (see the sketch below)
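
As a rough sketch of the monitoring loop described above, the following Python polls a job until it reaches a terminal state. The fts_job_state() helper is a placeholder (an assumption, not the FTS API): in practice the state would come from the gLite FTS client tools, whose exact invocation depends on the FTS version.

```python
# Minimal sketch of watching an FTS job move through its states
# (Pending -> Active -> Done/Failed). fts_job_state() is a placeholder
# for whatever query the site uses against its FTS server.
import time

TERMINAL_STATES = {"Done", "Failed"}  # extend with other terminal states your FTS reports

def fts_job_state(job_id: str) -> str:
    """Placeholder: return the current overall state string for job_id."""
    raise NotImplementedError("query the FTS server here, e.g. via the gLite client tools")

def wait_for_job(job_id: str, poll_seconds: int = 30) -> str:
    while True:
        state = fts_job_state(job_id)
        print(f"job {job_id}: {state}")
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
```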

  7. What Variables Were Investigated? • Destination SRM: dCache (v1.6.6-5) or dpm (v1.4.5) • The underlying filesystem on the destination: ext2, ext3, jfs, xfs • Two transfer-channel parameters: number of parallel files (Nf) and number of GridFTP streams (Ns) • Example => Nf=5, Ns=3

  8. Software Components • dcap and rfio are the transport layers for dCache and dpm respectively • Under this software stack is the filesystem itself, e.g. ext2 • Above this stack was the filetransfer.py script • See http://www.physics.gla.ac.uk/~graeme/scripts/|filetransfer#filetransfer

  9. Software Components – dpm • All the daemons of the destination dpm were running on the same machine • dCache had a similar setup, in that everything was housed on a single node

  10. Hardware Components • Source was a dpm on a high-performance machine • Destination was a single node with a dual-core Xeon CPU • Machines were on the same network, connected via a 1 Gbit/s link which had negligible other traffic • No firewall between source and destination; no iptables loaded • Destination had three 1.7 TB partitions: RAID 5, 64 KB stripe

  11. Kernels and Filesystems • A CERN-contributed rebuild of the standard SL kernel was used to investigate xfs • This differs from the standard kernel only in the addition of xfs support • Instructions on how to install the kernel are at http://www.gridpp.ac.uk/wiki/XFS_Kernel_Howto • The necessary RPMs are available from ftp://ftp.scientificlinux.org/linux/scientific/305/i386/contrib/RPMS/xfs/

  12. Method • 30 source files, each of size 1 GB, were used – typical of the LCG file sizes that the LHC experiments shall use • Both dCache and dpm were used during testing • Each kernel/filesystem pair was tested – 4 such pairs • Values of 1, 3, 5 and 10 were used for the number of files and the number of streams, giving a matrix of 16 test results • Each test was repeated 4 times to obtain a mean; outlying results (~ < 50% of the other results) were retested • This prevented failures in higher-level components, e.g. FTS, from adversely affecting the results (see the sketch below)
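
A minimal sketch of the test loop just described, assuming a hypothetical run_transfer_test() helper that launches the 30 x 1 GB transfer set (e.g. by driving filetransfer.py) and returns the achieved rate in Mbps; the Nf/Ns values, the 4 repeats and the 50% outlier rule are taken from the slide.

```python
# Hypothetical driver for the Nf/Ns test matrix; run_transfer_test()
# is a stand-in for whatever actually launches the 30 x 1 GB FTS
# transfers and reports the achieved rate in Mbps.
from statistics import mean

NF_VALUES = [1, 3, 5, 10]   # number of parallel files on the channel
NS_VALUES = [1, 3, 5, 10]   # number of GridFTP streams per file
REPEATS = 4                 # each matrix cell is averaged over 4 runs

def run_transfer_test(nf: int, ns: int) -> float:
    """Placeholder: run 30 x 1 GB transfers with the given Nf/Ns, return Mbps."""
    raise NotImplementedError("wire this up to the site's FTS test script")

def measure_cell(nf: int, ns: int) -> float:
    rates = [run_transfer_test(nf, ns) for _ in range(REPEATS)]
    # Retest outliers: a run well below the others (< 50% of the mean of
    # the rest) is treated as a higher-level failure (e.g. FTS) and repeated.
    for i, r in enumerate(rates):
        others = rates[:i] + rates[i + 1:]
        if r < 0.5 * mean(others):
            rates[i] = run_transfer_test(nf, ns)
    return mean(rates)

# Example of sweeping the full 4 x 4 matrix once run_transfer_test() is implemented:
# results = {(nf, ns): measure_cell(nf, ns) for nf in NF_VALUES for ns in NS_VALUES}
```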

  13. Results – Average Rates • All results are in Mbps

  14. Average dCache rate vs. Nf

  15. Average dCache rate vs. Ns

  16. Average dpm rate vs. Nf

  17. Average dpm rate vs. Ns

  18. Results – Average Rates • In our tests dpm outperformed dCache for every averaged Nf, Ns combination

  19. Results – Average Rates • Transfer rates are greater when using jfs or xfs rather than ext2 or ext3 • Rates for ext2 are better than for ext3 because ext2 does not suffer from journalling overheads

  20. Results – Average Rates • Having more than one parallel file (Nf > 1) on the channel substantially improves the transfer rate for both SRMs and for all filesystems; for both SRMs the average rate is similar for Nf = 3, 5, 10 • dCache: Ns = 1 is the optimal value for all filesystems • dpm: Ns = 1 is the optimal value for ext2 and ext3; for jfs and xfs the rate seems independent of Ns • For both SRMs, the average rate is similar for Ns = 3, 5, 10

  21. Results – Error (Failure) Rates • Failures in both cases tended to be caused by a failure to correctly call srmSetDone() in FTS, resulting from high machine load • It is recommended to separate the SRM daemons and disk servers, especially at larger sites

  22. Results – Error (Failure) Rates • dCache: a small number of errors for the ext2 and ext3 filesystems, caused by high machine load; no errors for the jfs and xfs filesystems • dpm: all filesystems had errors – as in the dCache case, caused by high machine load • The error rate for jfs was particularly high, but this was down to many errors in one single transfer

  23. Results – FTS Parameters • Nf: initial tests indicate that setting Nf to a high value (15) causes a large load on the machine when the first batch of files completes, and subsequent batches time out – caused by post-transfer SRM protocol negotiations occurring simultaneously • Ns: Ns > 1 caused slower rates for ¾ of the SRM/filesystem combinations • Multiple streams cause a file to be split up and sent down different TCP channels, which results in “random writes” to the disk; with a single stream the data packets arrive sequentially and can be written sequentially as well (illustrated below)
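
A toy illustration (not GridFTP itself; the block/stream layout is an assumption) of the write patterns the last bullet describes: with one stream the receiver appends blocks in order, while with N streams each stream carries every N-th block and the receiver must seek to the matching offset, giving the scattered-write pattern that hurts the disk.

```python
# Toy model of the destination-side write pattern for one vs. many streams.
import os

BLOCK = 1024 * 1024           # 1 MiB blocks
FILE_SIZE = 16 * BLOCK        # toy 16 MiB "transfer"

def single_stream(path):
    # One stream: blocks arrive in order and are appended sequentially.
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE // BLOCK):
            f.write(os.urandom(BLOCK))

def multi_stream(path, n_streams=4):
    # N streams: each stream carries every n-th block, so the receiver
    # seeks to the right offset for each block it gets ("random writes").
    with open(path, "wb") as f:
        f.truncate(FILE_SIZE)
        for stream in range(n_streams):
            for block in range(stream, FILE_SIZE // BLOCK, n_streams):
                f.seek(block * BLOCK)
                f.write(os.urandom(BLOCK))

single_stream("single.dat")
multi_stream("multi.dat")
```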

  24. Future Work • Use SL4 as the OS, which allows testing of the 2.6 kernel • Different stripe sizes for the RAID configuration • TCP read and write buffer sizes and other Linux kernel-networking tuning parameters (see the example below) • Additional hardware, e.g. more disk servers • More realistic simulation: simultaneous reading/writing, local file access • Other filesystems? e.g. reiser, but this filesystem is more applicable to holding small files, not the sizes that shall exist on the LCG
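
For the TCP buffer tuning mentioned above, the kernel parameters usually involved are net.core.rmem_max / wmem_max and net.ipv4.tcp_rmem / tcp_wmem. The snippet below only reads the current values via /proc/sys as a sketch; the actual tuning (and the values to use) is not specified in the slides and would be done by the site admin with sysctl.

```python
# Read the TCP-buffer-related kernel parameters that are typical tuning
# targets; this only inspects current values, it does not change them.
from pathlib import Path

PARAMS = [
    "net/core/rmem_max",
    "net/core/wmem_max",
    "net/ipv4/tcp_rmem",
    "net/ipv4/tcp_wmem",
]

for p in PARAMS:
    value = (Path("/proc/sys") / p).read_text().strip()
    print(p.replace("/", "."), "=", value)
```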

  25. Conclusions • The choice of SRM application should be made at site level based on the resources available • Using a newer high-performance filesystem (jfs or xfs) increases the inbound rate • How to move to the xfs filesystem without losing data: http://www.gridpp.ac.uk/wiki/DPM_Filesystem_XFS_Formatting_Howto • Use a high value for Nf (although too high a value will cause other problems) and a low value for Ns • I recommended Ns=1 and Nf=8 for the GridPP inter-T2 tests that I'm currently conducting
