dCache Locality Performance Testing Sarah Williams Indiana University 2013-09-13
Test Cases • Downloading the files via xrdcp • 5 MB and 2 GB dummy files, filled with zeros • Directly accessing files via readDirect • The test file is NTUP_SMWZ.01120689._000089.root.1, a fairly typical user input file of size 4.6 GB. • The tests read 10%-90% of the file, in 10% increments. • readDirect does not let us control the number of bytes read, only the number of events read. I ran a series of tests to find the number of events that corresponds to each target of 10%-90% of bytes read.
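The event-to-byte calibration described above can be sketched as a linear interpolation over measured points. This is only an illustration: the function name `events_for_bytes` and the calibration pairs are hypothetical and are not taken from the tests or from ref_points.txt.

```python
from bisect import bisect_left


def events_for_bytes(calibration, target_bytes_pct):
    """Interpolate the event percentage expected to read target_bytes_pct.

    calibration is a list of (events_pct, bytes_pct) pairs sorted by
    strictly increasing bytes_pct.
    """
    bytes_pcts = [b for _, b in calibration]
    if not bytes_pcts[0] <= target_bytes_pct <= bytes_pcts[-1]:
        raise ValueError("target is outside the calibrated range")
    i = bisect_left(bytes_pcts, target_bytes_pct)
    if bytes_pcts[i] == target_bytes_pct:
        # The target was measured directly; no interpolation needed.
        return calibration[i][0]
    (e0, b0), (e1, b1) = calibration[i - 1], calibration[i]
    return e0 + (e1 - e0) * (target_bytes_pct - b0) / (b1 - b0)


# Hypothetical measurements (events %, bytes %), NOT results from these tests.
calibration = [(0.0, 0.0), (20.0, 10.0), (60.0, 50.0), (100.0, 100.0)]

if __name__ == "__main__":
    for target in range(10, 100, 10):
        print(target, events_for_bytes(calibration, float(target)))
```

In practice the interpolated value would only be a starting guess, to be confirmed by running readDirect and checking the bytes actually read.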
Test Conditions • The test clients are worker nodes temporarily removed from the cluster. They are running only the OS (Scientific Linux 6.4), Condor in an offline mode, and the tests themselves. • The tests were run serially and single-threaded. • The servers are our production servers. The tests are spread across 10 servers. • The cluster was running a normal production load. The LAN and WAN connections showed typical usage. Tests run during non-typical conditions were discarded.
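A serial, single-threaded timing run of the kind described here might look like the sketch below. It is not the test scripts themselves; `time_command` and `run_serial` are hypothetical names, and the placeholder command stands in for a real transfer such as an xrdcp invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_serial(cmd, repeats):
    """Run cmd repeats times, one after another, and summarize the timings."""
    samples = [time_command(cmd) for _ in range(repeats)]
    return {
        "samples": samples,
        "mean": statistics.mean(samples),
        # stdev needs at least two samples.
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. A real test would pass
    # a transfer command instead, e.g. ["xrdcp", source_url, destination],
    # where source_url and destination are assumptions of this sketch.
    print(run_serial([sys.executable, "-c", "pass"], repeats=3))
```

Running the samples strictly one after another keeps the tests from competing with each other for network or disk, which matches the serial setup above.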
Network conditions • Clients ran at IU, with local storage at IU and remote storage at UC. • Note this route does not use LHCOne. • Near-future network improvements are not shown. [Diagram: test worker nodes, LAN storage, remote storage]
[root@iut2-c200 ~]# traceroute uct2-s20.uchicago.edu
traceroute to uct2-s20.uchicago.edu (128.135.158.170), 30 hops max, 60 byte packets
 1 149.165.225.254 (149.165.225.254) 20.328 ms 20.325 ms 9.372 ms
 2 et-10-0-0.2012.rtr.ictc.indiana.gigapop.net (149.165.254.249) 0.448 ms 0.491 ms 0.484 ms
 3 149.165.227.22 (149.165.227.22) 5.087 ms 5.196 ms 5.273 ms
 4 10.4.247.230 (10.4.247.230) 5.229 ms 5.396 ms 5.547 ms
 5 uct2-s20.uchicago.edu (128.135.158.170) 5.323 ms 5.347 ms 5.220 ms
Test Code • FAX End-user tutorial code base: • https://github.com/ivukotic/Tutorial/ • readDirect opens a ROOT file and reads the specified percentage of events from it. • Locality-caching specific code: • https://github.com/DHTC-Tools/ATLAS/tree/master/Performance%20Tests/Locality%20Caching • test_local.sh is run directly on the worker node, and runs the local disk test with memory flushing. The line that executes ‘releaseFileCache’ can be commented out to do non-flushed tests. • test_direct.sh, test_hit.sh and test_miss.sh are run on the management host and ssh to the test hosts to run the tests. • ref_points.txt contains the percentages of events to read in order to get the desired percentages of bytes read with readDirect. • releaseFileCache removes a specific file from the Linux page cache. • pooltest*.txt contain the basenames of the files used for the tests. The directories are hardcoded in the scripts. The filename is also the name of the data server the file resides on, e.g. ‘uct2-16_1’. • testshosts.txt contains the hostnames of the worker nodes to use for testing.
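For readers unfamiliar with evicting a single file from the Linux page cache, one common way to do it is the POSIX_FADV_DONTNEED hint. The sketch below illustrates that technique only; it is an assumption that releaseFileCache works along these lines, and `release_file_cache` is a hypothetical name.

```python
import os


def release_file_cache(path):
    """Ask the kernel to drop cached pages of path (Linux, Python 3.3+).

    The request is advisory: the kernel may keep pages that are still in use.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so write them back first.
        os.fsync(fd)
        # offset=0 and len=0 together mean "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Calling this between test runs would force the next read to go back to disk rather than memory, which is the effect the memory-flushed tests rely on.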
Small file downloads In these small-file tests, the optimum strategy is no caching. The chart shows that cache-hit downloads take about 700 ms longer than non-cached downloads. This is caused by the small overhead that caching imposes on the central dCache manager, which must look up which data servers a client is permitted to read from and check whether the requested file is on one of those servers.
Large file downloads Larger files show the advantages of caching in an environment where files will be reused. Note that the time to download a file on a cache miss is roughly the sum of a non-cached download and a cache hit download. Cached performance is much more consistent than non-cached.
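The relationship above (a cache miss costs roughly a non-cached download plus a cache hit) implies a simple break-even point for file reuse. The sketch below works it out under that assumption only; the function names and the timings in the example are hypothetical, not measured values.

```python
def total_time_no_cache(t_uncached, reads):
    """Total time when every read is a non-cached download."""
    return reads * t_uncached


def total_time_cached(t_uncached, t_hit, reads):
    """Total time with caching: one miss, then cache hits.

    A miss is modeled as a non-cached download plus a cache hit.
    """
    if reads == 0:
        return 0.0
    return (t_uncached + t_hit) + (reads - 1) * t_hit


def break_even_reads(t_uncached, t_hit):
    """Smallest number of reads at which caching is strictly faster.

    Returns None if a cache hit is no faster than a non-cached download.
    """
    if t_hit >= t_uncached:
        return None
    reads = 1
    while total_time_cached(t_uncached, t_hit, reads) >= total_time_no_cache(
        t_uncached, reads
    ):
        reads += 1
    return reads


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(break_even_reads(t_uncached=100.0, t_hit=20.0))
```

Under this model a single read never benefits from caching, since the miss alone already costs more than a non-cached download; the gain comes entirely from later hits.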
Local disk (memory cached) tests The test file was pre-loaded to local disk, and then the tests were run on it sequentially. Due to memory caching in the Linux kernel, there was no actual disk IO during these tests. This is not necessarily a realistic test case. In production there would be N jobs competing for memory, and it would be impossible for them all to keep their input files in memory. A more realistic case would be to remove the file from memory between tests.
Local disk (memory flushed) tests The test file was pre-loaded to local disk, and then the tests were run on it sequentially. Between each test, the releaseFileCache utility was used to remove the test file from memory. The tests were narrowed down to the two faster processors, the X5660 and E5440, to simplify testing.
WAN reads (caching disabled) WAN tests are vulnerable to network conditions on inter-campus links, making them more variable than LAN tests.
Cache Hits Reads over the LAN show more consistency than WAN tests.
Cache Misses Cache miss test results are equivalent to cache hit results plus 90 s. We can infer that it takes 90 s to transfer the 4.6 GB file from remote storage into local storage, at about 52 MB/s. Standard deviations are higher due to the inclusion of the inter-site link, the remote storage server and the local storage server, all of which are also serving the active cluster.
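The transfer rate quoted above follows from the 4.6 GB file size and the 90 s difference. A quick check, assuming 1 GB = 1024 MB (an assumption about the units; with 1 GB = 1000 MB the figure is about 51 MB/s):

```python
def throughput_mb_per_s(size_gb, seconds, mb_per_gb=1024):
    """Average transfer rate in MB/s for size_gb moved in the given time."""
    return size_gb * mb_per_gb / seconds


if __name__ == "__main__":
    print(round(throughput_mb_per_s(4.6, 90)))        # prints 52
    print(round(throughput_mb_per_s(4.6, 90, 1000)))  # prints 51
```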
Direct comparison of caching strategies No caching is the optimum strategy when less than 75% of the file is read. When more than 75% is read, caching becomes optimal.
Conclusions • Caching is less effective for very small files and for jobs that read only part of the file. • Caching is most effective for medium to large files, when the entire file is downloaded, or when the file is reused. • Next steps: • Turn on locality caching for production nodes and monitor. Monday 9/16?