XROOTD Storage Recent directions Fabrizio Furano
The ALICE recipe for storage • Many sites, exposing the XROOTD protocol • Native XROOTD • A few with DPM+XROOTD • One with CASTOR+XROOTD • A few with dCache’s Xrootd protocol implementation • Native XROOTD + 2 plugins + MonALISA • In a simple bundled setup • AliEn points directly to single SEs • Privileges local data according to the catalogue’s content • OCDB accessed via WAN | XROOTD storage and ALICE AF - Recent directions
A unique protocol • Having a unique WAN+LAN compliant protocol makes it possible to do the right thing • Exploit locality whenever possible (=most of the time) • Do not worry too much if a job accesses some data files that are not at the same site. This has to be possible and foreseen. • Explicitly creating 100s of replicas just for a job takes much more time and carries more risk. • Access condition data ONLY via WAN
The XROOTD way • Each server manages a portion of the storage • many servers with small disks, or • fewer servers with huge disks • Low-overhead, DB-free aggregation of servers • Provides the functionality of a single system • A non-transactional file system • Efficient LAN/WAN byte-level data access • Protocol/architecture built on the tough HEP requirements
What we can do • Build efficient storage clusters • Aggregate storage clusters into WAN federations • Access remote data efficiently • Build proxies that can cache an external repository • And increase the data access performance (or decrease the WAN traffic) through a decent ‘hit rate’ • Build hybrid proxies • Caching an external repository while storing local data locally • In practice, the ‘Jamboree demonstrator’
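As a rough illustration of the ‘hit rate’ remark above, the sketch below estimates how much traffic still crosses the WAN when a caching proxy serves part of the requested bytes locally. The function name and the numbers are illustrative assumptions, not measurements from any ALICE site.

```python
def wan_traffic(requested_bytes: float, hit_rate: float) -> float:
    """Bytes that still cross the WAN when a fraction `hit_rate` is served from the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return requested_bytes * (1.0 - hit_rate)


# Hypothetical example: jobs request 10 TB, and 80% of it is already cached.
remaining = wan_traffic(10e12, 0.8)
print(f"{remaining / 1e12:.1f} TB still fetched over the WAN")
```

The same relation reads the other way round: with a fixed WAN budget, a higher hit rate lets the proxy serve proportionally more data to local jobs.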
Aggregated sites • Suppose that we can easily aggregate sites • And provide an efficient entry point that “knows them all natively” • We could use it to access data directly • We could use it as a building block for a proxy-based structure called VMSS • If site A is asked for file X, A will fetch X from some other ‘friend’ site, through the unique entry point • A itself is a potential source, accessible through the same entry point
The VMSS: Virtual Mass Storage System … built on data globalization • [Diagram: several Xrootd sites, each running xrootd and cmsd, joined into a globalized cluster under the ALICE global redirector] • Local clients work normally at each site • A smart client could point at the global redirector • Missing a file? The storage asks the global redirector, gets redirected to the right collaborating cluster, and fetches it. Immediately.
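The fetch-on-miss behaviour described for the VMSS can be sketched as a toy model. This is not the XROOTD implementation: `GlobalRedirector`, `Site` and their methods are hypothetical names invented for the illustration, and files are plain in-memory byte strings.

```python
class GlobalRedirector:
    """Toy stand-in for a global redirector: it only knows which sites subscribed."""

    def __init__(self):
        self.sites = []

    def subscribe(self, site):
        self.sites.append(site)

    def locate(self, name, asking=None):
        # No catalogue database here: simply ask each subscribed site.
        for site in self.sites:
            if site is not asking and site.has(name):
                return site
        return None


class Site:
    """Toy storage cluster that fetches missing files through the redirector."""

    def __init__(self, label, redirector, files=None):
        self.label = label
        self.files = dict(files or {})
        self.redirector = redirector
        redirector.subscribe(self)

    def has(self, name):
        return name in self.files

    def open(self, name):
        if not self.has(name):
            source = self.redirector.locate(name, asking=self)
            if source is None:
                raise FileNotFoundError(name)
            # Keep a local copy, so this site becomes a potential source too.
            self.files[name] = source.files[name]
        return self.files[name]


redirector = GlobalRedirector()
site_a = Site("A", redirector)
site_b = Site("B", redirector, {"/data/x.root": b"payload"})
print(site_a.open("/data/x.root"))  # fetched from B on the first access
print(site_a.has("/data/x.root"))   # A now holds its own copy
```

Local clients keep talking to their own site; only the storage itself goes through the common entry point when something is missing.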
The ALICE CAF storage • Data is proxied locally to adequately feed PROOF • From the 91 AliEn sites • [Diagram: the ALICE CAF nodes, each running xrootd and cmsd, fed from AliEn through the data management tools]
The SKAF/AAF storage • Take a PROOF cluster, with XROOTD storage, make it easily installable and well monitored (MonALISA) • Add the xrd-dm plugin by M. Vala • Transform your AF into a proxy of the ALICE globalized storage, through the ALICE GR • If something needed is not present, it will be fetched in, FAST • Also supports sites not seen by the GR, through internal dynamic prioritization of the AliEn sites. • Data management: how does the data appear? • (Pre)staging requests • This means that it works with the usual ROOT tools but also without • Suppose that a user always runs the same analysis several times • Which is almost always true • The first round will not be so fast, but it will work; the subsequent ones will be fast • The first one was the ALICE SKAF (Kosice, Slovakia)
AliEn LFN/PFN • Often a source of misunderstandings • AliEn has LFNs • They are the user-readable names • AliEn converts them to PFNs • The ugly filenames with numbers • An AliEn PFN is considered by XROOTD as an XROOTD LFN • XROOTD takes care internally of its PFN translation • Hiding the internal mount points • In the end: • USERS see AliEn LFNs • SYSADMINS see XROOTD PFNs (= AliEn PFNs with a prefix)
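The two translation steps above can be sketched as follows. Everything concrete here is hypothetical: the catalogue entry, the numeric PFN, the mount prefix and the function names are made up for illustration, and the real AliEn lookup goes through its catalogue service rather than a dictionary.

```python
# Hypothetical catalogue content: AliEn LFN -> AliEn PFN.
CATALOGUE = {
    "/alice/sim/example/AliESDs.root": "/02/12345/3f2a9c1e-0000-0000-0000-000000000000",
}


def alien_lfn_to_pfn(lfn: str) -> str:
    """Step 1 (AliEn side): user-readable name -> numeric file name."""
    return CATALOGUE[lfn]


def xrootd_lfn_to_pfn(xrootd_lfn: str, local_prefix: str) -> str:
    """Step 2 (XROOTD side): prepend the server's internal mount point."""
    return local_prefix.rstrip("/") + "/" + xrootd_lfn.lstrip("/")


user_name = "/alice/sim/example/AliESDs.root"
alien_pfn = alien_lfn_to_pfn(user_name)                       # what XROOTD treats as its LFN
on_disk = xrootd_lfn_to_pfn(alien_pfn, "/data/disk1/xrd")     # what the sysadmin sees
print(on_disk)
```

Users only ever handle `user_name`; the prefix stays private to each storage server, which is why the same AliEn PFN can live under different mount points at different sites.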
The ALICE PROOF farms • Historically, the *AF admins didn’t like to deal with the AliEn PFNs • The ugly filenames made of numbers • They wanted to store only LFNs (i.e. the human-readable filenames) • So, AFs are ALREADY storing native LFNs • If these XROOTD-based storages get aggregated by the Global Redirector: • Their content will be accessible as a whole, with no need to translate names through AliEn: the files are there with their true names. • Interesting wild experiment (pioneered by SKAF) • The *AFs could give data to each other, by using the VMSS • ATLAS-US is doing this as a demonstrator for tier-3s • So, a part of the ALICE storage could be accessed directly with nice names, skipping the overhead of the AliEn translation.
What’s needed… at the end? • The storage part acting as an automatic stager (an LFN-based proxy of the ALICE storage) • Like this by default now! • Looks for friend AFs hosting the LFN through the GR • Eventually, looks in AliEn-only SEs • Through the AliEn mechanism (lfn->guid->pfn) • And keeps the file named with the LFN • Internally prioritizes sites with a “penalty” mechanism • WAN accessibility of the cluster, without NATs • OR: a small set of machines that proxy it through the firewall (maybe a future dev)
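The slide only says that sites are prioritized internally with a “penalty” mechanism, without giving details. The sketch below is one possible reading, entirely an assumption: failures add penalty points, successes slowly remove them, and sites are tried in order of increasing penalty. The class name, the function name and the scoring constants are hypothetical.

```python
class SitePrioritizer:
    """Hypothetical penalty table: the lower the penalty, the earlier a site is tried."""

    def __init__(self, sites, fail_cost=5, success_gain=1):
        self.penalty = {site: 0 for site in sites}
        self.fail_cost = fail_cost
        self.success_gain = success_gain

    def ranked(self):
        # sorted() is stable, so sites with equal penalty keep their original order.
        return sorted(self.penalty, key=lambda site: self.penalty[site])

    def report(self, site, ok):
        if ok:
            self.penalty[site] = max(0, self.penalty[site] - self.success_gain)
        else:
            self.penalty[site] += self.fail_cost


def fetch_with_penalties(prioritizer, fetch, name):
    """Try sites from least to most penalized; `fetch(site, name)` raises OSError on failure."""
    for site in prioritizer.ranked():
        try:
            data = fetch(site, name)
        except OSError:
            prioritizer.report(site, ok=False)
            continue
        prioritizer.report(site, ok=True)
        return data
    raise FileNotFoundError(name)


def demo_fetch(site, name):
    # Simulated transfer: site "s1" is broken, every other site answers.
    if site == "s1":
        raise OSError("site unreachable")
    return b"data from " + site.encode()


prio = SitePrioritizer(["s1", "s2", "s3"])
print(fetch_with_penalties(prio, demo_fetch, "/some/lfn"))
print(prio.ranked())  # the failing site has moved to the end of the list
```

The point of such a scheme is that a weak or broken site is not banned outright; it simply drifts down the list and can recover once it starts answering again.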
PROOF on the GRID? • GRID WNs to start PROOF interactive workers • Ongoing interesting developments, e.g. PoD by Anar Manafov • Data globalization/proxying seems an interesting match to feed them with data • Ideas are welcome • The purpose is: • Give handles to build a lightweight/dynamic Data Management structure • Whose only goal is to work well • Enable interactivity for users
Proxy sophistication • Proxying is a concept; there are basically two ways it could work: • Proxying whole files (e.g. the VMSS) • The client waits for the entire file to be fetched into the SE • Proxying chunks (or data pages) • The client’s requests are forwarded, and the chunks are cached in the proxy as they pass through • In HEP we do have examples of the former • It makes sense to make the latter possible as well • Some work has been done (the original XrdPss proxy or the newer, better prototype plugin by A. Peters)
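The second approach, proxying chunks, can be sketched in a few lines. This is a toy model and not the XrdPss code or the prototype plugin mentioned above: `ChunkCachingProxy`, the `read_remote` callback and the tiny page size are assumptions chosen to keep the example readable.

```python
class ChunkCachingProxy:
    """Forwards reads to a remote source and caches fixed-size pages as they pass through."""

    def __init__(self, read_remote, page_size=4):
        self.read_remote = read_remote  # callable(offset, length) -> bytes
        self.page_size = page_size
        self.pages = {}                 # page index -> cached bytes
        self.remote_reads = 0

    def _page(self, index):
        if index not in self.pages:
            self.pages[index] = self.read_remote(index * self.page_size, self.page_size)
            self.remote_reads += 1
        return self.pages[index]

    def read(self, offset, length):
        if length <= 0:
            return b""
        first = offset // self.page_size
        last = (offset + length - 1) // self.page_size
        data = b"".join(self._page(i) for i in range(first, last + 1))
        start = offset - first * self.page_size
        return data[start:start + length]


remote_file = b"0123456789abcdef"
proxy = ChunkCachingProxy(lambda off, n: remote_file[off:off + n])
print(proxy.read(2, 5), proxy.remote_reads)  # touches pages 0 and 1
print(proxy.read(4, 4), proxy.remote_reads)  # page 1 is already cached
```

Unlike whole-file proxying, the client gets its bytes as soon as the pages it asked for arrive, and only the parts of the file that are actually read ever cross the WAN.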
The eXtreme Copy • Let’s suppose that we have to get a (big) file • And that there are several replicas in different sites • Big question: where to fetch it from? • The closest one? • How can we tell if it’s the closest? Closest to what? Will it be faster as well? • The best connected one? • It can always be overloaded or weak or broken • Whatever we choose, the situation can change over time • Instead, we always want maximum efficiency
The eXtreme Copy • [Diagram: the copy program xrdcp -x wants to get ‘myfile’ from the repository; it locates ‘myfile’ through a globalized cluster (ALICE global redirector, xrootd + cmsd) that spans Xrootd site A, Xrootd site B and any other Xrootd site]
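One way to avoid choosing a single “best” replica is to pull chunks from all of them at once and let the faster sources serve more chunks. The sketch below illustrates that idea with threads and a shared work queue; it is not the xrdcp implementation, and `extreme_copy`, the `sources` callbacks and the chunk size are hypothetical.

```python
import queue
import threading


def extreme_copy(sources, size, chunk_size):
    """Fetch `size` bytes by pulling chunks from several replicas in parallel.

    `sources` maps a label to a callable(offset, length) -> bytes that may raise OSError.
    Returns the assembled data and the number of chunks each source served.
    """
    todo = queue.Queue()
    for offset in range(0, size, chunk_size):
        todo.put((offset, min(chunk_size, size - offset)))
    result = bytearray(size)
    served = {label: 0 for label in sources}
    alive = dict(sources)
    lock = threading.Lock()

    def worker(label, read):
        while True:
            try:
                offset, length = todo.get_nowait()
            except queue.Empty:
                return
            try:
                data = read(offset, length)
            except OSError:
                todo.put((offset, length))  # give the chunk back to the others
                with lock:
                    alive.pop(label, None)  # stop using this source
                return
            with lock:
                result[offset:offset + length] = data
                served[label] += 1

    # Repeat until every chunk is fetched or no working source is left.
    while not todo.empty() and alive:
        threads = [threading.Thread(target=worker, args=item) for item in list(alive.items())]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    if not todo.empty():
        raise OSError("all sources failed before the copy completed")
    return bytes(result), served


payload = bytes(range(256)) * 4


def good_site(offset, length):
    return payload[offset:offset + length]


def broken_site(offset, length):
    raise OSError("site is down")


copied, served = extreme_copy({"A": good_site, "B": good_site, "C": broken_site}, len(payload), 100)
print(copied == payload, served)
```

Because every worker takes the next chunk as soon as it finishes the previous one, an overloaded or broken site simply contributes less, and the split adapts if conditions change during the transfer.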
Open items (1/2) • Hot: putting clients into servers (e.g. to make efficient proxies) • Or: different criteria to fully differentiate clients in the same app • E.g. how to instantiate together: • a client tuned for WAN TTreeCache-based random access • one optimized for blasting non-TTreeCache LAN traffic • one optimized for large file transfers • The Extreme Copy: Torrent-like dynamic multiserver file fetching • Needs the previous item to be really strong • Components for site cooperation • Proxies and caching proxies (proofs of concept right now) • Bandwidth/queuing manager (early alpha) • ‘Personal’ persistent caching proxy, caching chunks on a local disk • A full-featured ‘xrd’ command line interface • The current one is quite a rough tool
Open items (2/2) • Client-side data management functions (e.g. ‘recursive ls’ or ‘df’): good level, but still incomplete • WAN performance: huge breakthrough, but still room to gain • Both for file transfer and data access (TTreeCache and not) • A robust and complete server-to-server file copy • ROOT integration: very good quality, but still room to gain • Only partially asynchronous (XrdClient can be fully async instead) • This will become more evident with the parallelization of computing: I/O will likely become the bottleneck again • More “intelligent” readahead • A homogeneous, top-class support structure
Greedier data consumers • In the data access frameworks (e.g. ROOT) many things evolve • Applications tend to become more efficient (=greedier) • Applications exploiting multicore CPUs will be even more so • An opportunity for interactive data access (e.g. from a laptop) • A challenge for the data access providers (the sites) • The massive deployment of newer technologies could be the real challenge for the next years
Questions? Thank you!