Explore the potential of Xcache in different applications, including file transfer, remote access, data streaming, HPC, and optimum workflow.
Xcache Possibilities
4th US ATLAS HPC Meeting, LBNL, Berkeley, CA
September 26, 2019
Andrew Hanushevsky, SLAC
http://xrootd.org
Xcache and File Transfer
• Writable Xcache caches not supported
• This is a strictly pull model
• We do support FRM caching for full files
• Push or pull mode possible
• Not clear we need this at all
• Well, beyond what we have, unless…
• We want a mix of Rucio and on-demand styles
• Possible to do but would require FTEs
Xcache and Remote Access
• Perhaps the most used model
• Essentially provides a CDN for remote data
• Definitely the most successful application
• However…
• Maintaining data integrity expectations is difficult
• Bad data is sticky in a cache and hard to find
• Work on the way to improve this situation
• Use TLS to weed out transmission errors
• Overkill but the fastest solution for now
• Enhance file system integrity for data at rest
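To illustrate why bad data is "sticky" and how integrity checks for data at rest can help, here is a minimal sketch in Python. It is not how Xcache works internally; the function names `store_block` and `load_block`, the in-memory dictionary, and the choice of CRC32 are all assumptions made for illustration. The idea is only that a checksum recorded when a block enters the cache lets a later read detect corruption and evict the block instead of serving it again.

```python
import zlib


def store_block(cache: dict, idx: int, data: bytes) -> None:
    """Store a block together with a checksum computed at write time."""
    cache[idx] = (data, zlib.crc32(data))


def load_block(cache: dict, idx: int):
    """Return the block if its checksum still matches, else evict it.

    Returning None tells the (hypothetical) caller to refetch the block
    from the origin rather than keep serving a corrupted copy.
    """
    data, expected = cache[idx]
    if zlib.crc32(data) != expected:
        del cache[idx]  # bad data no longer sticks in the cache
        return None
    return data
```

Without the check in `load_block`, a block corrupted on disk would be served on every subsequent read until the file is purged.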
Xcache and Data Streaming
• Block caching simulates true streaming
• Prefetching practically eliminates data jitter
• Anything missing?
• Perhaps deletion upon close
• Technically, single use streams so data not needed
• However, purge takes care of this eventually
• Anything else to consider?
• Server-less Xcache may be very relevant
• Certainly applicable for single use streams
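The interplay of block caching and prefetching can be sketched with a toy model. This is an illustration only, not the Xcache implementation: the class `ToyBlockCache`, the 4-byte block size, and the read-ahead depth are hypothetical. It shows that for a sequential reader only the first read waits on the origin, because read-ahead has already pulled in the following blocks.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose, for illustration only


class ToyBlockCache:
    """Toy block cache with sequential read-ahead (hypothetical model)."""

    def __init__(self, origin: bytes, prefetch: int = 2):
        self.origin = origin        # stands in for the remote file
        self.prefetch = prefetch    # blocks to read ahead on each read
        self.blocks = {}            # block index -> cached bytes
        self.remote_fetches = 0     # blocks pulled from the origin
        self.demand_misses = 0      # reads that had to wait for the origin

    def _fetch(self, idx: int) -> None:
        start = idx * BLOCK_SIZE
        if start >= len(self.origin) or idx in self.blocks:
            return  # past end of file, or already cached
        self.blocks[idx] = self.origin[start:start + BLOCK_SIZE]
        self.remote_fetches += 1

    def read_block(self, idx: int) -> bytes:
        if idx not in self.blocks:
            self.demand_misses += 1
            self._fetch(idx)
        # Read ahead so the next sequential reads are already local.
        for nxt in range(idx + 1, idx + 1 + self.prefetch):
            self._fetch(nxt)
        return self.blocks.get(idx, b"")
```

Reading a 16-byte origin block by block with this model incurs one demand miss while every block is still fetched exactly once. A real prefetcher runs asynchronously, which this synchronous sketch does not attempt to model.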
Xcache and HPC
• Optimum access uses HPC FS as cache
• Xcache runs on DTNs
• Except for random outages, best location
• Allows RDMA access to fully cached files
• E.g. Lustre Xcache + direct cache access @ NERSC
• Workable but not the best solution
• Requires file to be fully cached (low probability)
• Has security implications in terms of access
• Best to add RDMA support to Xcache
• Requires additional FTE effort
Xcache Effective Use
• The following is true of any cache
• Effective use is proportional to data reuse
• Only two known proposals on this
• Virtual Placement from Ilija
• This is simply a Rucio placement optimization
• Non-simulated (i.e. real) effectiveness unknown
• Cache affinity scheduling from Andy
• Requires Panda to add cache as a scheduling resource
• Concept is effective for LSST query scheduling
• Will it work with Panda?
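The general idea behind cache affinity scheduling can be sketched as follows. This is a hedged illustration, not Panda's scheduler or the actual proposal: the function `pick_site`, the site names, and the notion that a scheduler can query each cache's contents are assumptions. The sketch sends a job to the site whose cache already holds the largest share of its inputs, which is what raises data reuse.

```python
def pick_site(job_inputs, site_cache_contents):
    """Return the site whose cache already holds the most of the job's inputs.

    job_inputs: iterable of file names the job will read.
    site_cache_contents: dict mapping site name -> set of cached file names.
    Ties are broken alphabetically by site name so the choice is deterministic.
    """
    wanted = set(job_inputs)
    return max(
        sorted(site_cache_contents),
        key=lambda site: len(wanted & site_cache_contents[site]),
    )


# Hypothetical usage: site "B" caches two of the three inputs, so it wins.
sites = {"A": {"f1"}, "B": {"f1", "f2"}}
chosen = pick_site(["f1", "f2", "f3"], sites)
```

A production scheduler would weigh cache overlap against queue depth and CPU availability; this sketch considers overlap alone.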
Xcache Optimum Workflow
• Since we need high reuse…
• Caches most suitable to analysis
• Will not help production, unless…
• Used as a streaming appliance for event delivery
• To reduce disk usage and steady the stream
• Refer to the previous slide on streaming