CASTOR status and plan for future developments Presentation to LCG PEB 12/08/2003 Olof Bärring, CERN-IT
Outline • CASTOR status • Plan for future developments • Conclusions
CASTOR status • Usage at CERN • ~1.5PB of data • ~9 million files • Operation • Repack utility close to production quality • 10–20TB repacked so far • Repack of 9940A to 9940B media possible, but requires man-power and tape drives • TMS being phased out • CDR running smoothly • COMPASS recently peaked at 6TB/day (70MB/s) • Support • User support in Remedy since April • Second level handled by the operation team • Third level handled by the development team • Support to external sites is being discussed in HEPCCC
CASTOR@CERN evolution (usage plot) • Drop due to deletion of 0.75 million ALICE MDC4 files (1GB each)
Plan for future developments • The vision • Some problems with today’s system • Proposal • Ideas • Architecture • Request registration and scheduling • Catalogues • Disk server access and physical file ownership • Some interesting features • Project planning and progress monitoring
Vision... • With clusters of hundreds of disk and tape servers, automated storage management increasingly faces the same problems as CPU cluster management • (Storage) Resource management • (Storage) Resource sharing • (Storage) Request scheduling • Configuration • Monitoring • The stager is the main gateway to all resources managed by CASTOR • Vision: a Storage Resource Sharing Facility
... and a caveat • The vision is to provide a scalable Storage Resource Sharing Facility • The hope is to achieve an efficiency of storage resource utilization similar to what LSF provides for CPU resources today • However: nothing in the proposed design enforces a single shared stager instance • Today’s configuration with some 40 independent stagers is still OK
Some problems with today’s stager • A lot of code for supporting direct tape access • No true request scheduling • Throttling, load-balancing • Fair-share • Resource sharing not supported • Stagers are either dedicated or public • Dedicated resources: some disk servers are 100% full/loaded while others are idle • Public resources: no control over who gets how much of the resources; invites abuse • Operational issues • No unique request identifiers • Problem tracing is difficult, e.g.:
stagein -V P01234 -v EK4432 -q u -f MYHIGGSTODAY \
  -g 994BR5 -b 8000 -F FB -L 80 -C ebcdic,block -E skip
Proposal: Ideas for the new stager • Pluggable framework rather than a total solution • True request scheduling: delegate the scheduling to a pluggable black-box scheduler, possibly using third-party schedulers, e.g. Maui or LSF (see the sketch below) • Policy attributes: externalize the policy engines governing the resource matchmaking. Start with today’s policies for file system selection, GC, migration, ...; could move toward full-fledged policy languages, e.g. implemented using “GUILE” • Restricted access to storage resources to achieve predictable load • No random rfiod eating up the resources behind the back of the scheduling system • Disk server autonomy as far as possible • In charge of local resources: file system selection and execution of garbage collection • Losing a server should not affect the rest of the system
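A minimal sketch, in C, of what a pluggable black-box scheduler hook could look like: the stager framework only sees a small table of function pointers, so Maui, LSF or a home-grown policy engine could sit behind it. All names, structures and the trivial stand-in policy are illustrative assumptions, not the actual CASTOR, Maui or LSF interfaces.

/* Hypothetical plugin interface for a black-box request scheduler. */
#include <stdio.h>

struct file_request {
    unsigned long reqid;        /* unique request identifier */
    unsigned int  uid, gid;     /* authenticated requester */
    const char   *castor_path;  /* /castor/... name, never a physical path */
};

struct scheduler_plugin {
    const char *name;
    int (*submit)(const struct file_request *req);            /* queue a request */
    int (*dispatch)(unsigned long reqid, char *server, size_t len); /* pick a disk server */
};

/* Trivial stand-in policy: everything goes to one server. */
static int demo_submit(const struct file_request *req)
{
    printf("queued request %lu for %s\n", req->reqid, req->castor_path);
    return 0;
}
static int demo_dispatch(unsigned long reqid, char *server, size_t len)
{
    (void)reqid;
    snprintf(server, len, "pub003d"); /* a real plugin would apply policies here */
    return 0;
}

static struct scheduler_plugin demo = { "demo", demo_submit, demo_dispatch };

int main(void)
{
    struct file_request r = { 1001, 14029, 1307,
                              "/castor/cern.ch/user/c/castor/TastyTrees" };
    char server[64];
    demo.submit(&r);
    demo.dispatch(r.reqid, server, sizeof server);
    printf("request %lu scheduled on %s\n", r.reqid, server);
    return 0;
}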
Proposed architecture (diagram) • A physics application accesses /castor/… through the RFIO and stage APIs; the RequestHandler registers the request in a request queue • The master stager (mstaged), with its pluggable request scheduler and scheduling policies, consults the global catalogue and the existing services (Cns name server, VMGR, VDQM) • Scheduled requests run on disk servers, where a slave stager (sstaged) with a local request scheduler, local policies and a local catalogue resolves the physical path and controls rfiod • The common RTCOPY client (rtcpclientd) starts tape requests via rtcpd • Legend: existing module, new module, existing but modified, external
Proposal: Request scheduling (1) • A “master stager” (mstaged) receives all CASTOR file access requests • Authenticates the client and registers the request • Queues the request • The request registration is independent of the scheduling; it has to be designed to cope with high request load peaks • A pluggable scheduler manages the queue and applies the configured policies • E.g. requests from gid=1307 should only run on atlas001d, ... (see the sketch below)
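A small sketch of the kind of matchmaking rule quoted above (“requests from gid=1307 should only run on atlas001d”). The rule table and function names are invented for illustration and are not the proposed configuration format.

/* Illustrative placement rule check for the pluggable scheduler. */
#include <stdio.h>
#include <string.h>

struct placement_rule {
    unsigned int gid;      /* group the rule applies to */
    const char  *server;   /* only disk server allowed for that group */
};

static const struct placement_rule rules[] = {
    { 1307, "atlas001d" },
};

/* Return 1 if a request from 'gid' may be dispatched to 'server'. */
static int rule_allows(unsigned int gid, const char *server)
{
    size_t i;
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (rules[i].gid == gid)
            return strcmp(rules[i].server, server) == 0;
    return 1;   /* no rule for this group: any server is acceptable */
}

int main(void)
{
    printf("%d\n", rule_allows(1307, "atlas001d")); /* 1: allowed */
    printf("%d\n", rule_allows(1307, "pub003d"));   /* 0: refused */
    return 0;
}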
Proposal: Request handling & scheduling (diagram) • A typical file request (DN=castor, read /castor/cern.ch/user/c/castor/TastyTrees) is authenticated against the fabric authentication service, e.g. a Kerberos-V server • The RequestRegister thread pool stores the request in a request repository (Oracle, MySQL); request registration must keep up with high request rate peaks • The scheduler (dispatcher, catalogue, scheduling policies) then matches the request against disk server load and policies (e.g. user “castor” has priority), checks whether the file is already staged, and runs the request on a disk server such as pub003d; request scheduling only has to keep up with average request rates • A sketch of this registration/scheduling split follows
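A minimal sketch of why registration and scheduling are decoupled: the registrar only appends the request to a queue/repository and returns immediately, so it can absorb peaks, while a separate dispatcher drains the queue at its own pace. The queue layout, rates and single dispatcher thread are assumptions for illustration.

/* Registration absorbs bursts; dispatching runs at its own (slower) pace. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define QSIZE 128

static unsigned long queue[QSIZE];
static int head, tail, count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

/* Registrar: store the request and return to the client immediately. */
static void register_request(unsigned long reqid)
{
    pthread_mutex_lock(&lock);
    if (count < QSIZE) {
        queue[tail] = reqid;
        tail = (tail + 1) % QSIZE;
        count++;
        pthread_cond_signal(&nonempty);
    }
    pthread_mutex_unlock(&lock);
}

/* Dispatcher: runs independently, only needs to keep up with average rates. */
static void *dispatcher(void *arg)
{
    (void)arg;
    for (;;) {
        unsigned long reqid;
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&nonempty, &lock);
        reqid = queue[head];
        head = (head + 1) % QSIZE;
        count--;
        pthread_mutex_unlock(&lock);
        printf("scheduling request %lu on a disk server\n", reqid);
        usleep(10000);   /* scheduling work is slower than registration */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    unsigned long r;
    pthread_create(&t, NULL, dispatcher, NULL);
    for (r = 1; r <= 20; r++)   /* simulate a burst of registrations */
        register_request(r);
    sleep(1);                   /* let the dispatcher drain the queue */
    return 0;
}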
Proposal: Request scheduling (2) • A “slave stager” (sstaged) runs on each disk server • Executes and controls all requests scheduled to it by the mstaged • Takes care of local resource scheduling, such as file system selection and execution of the garbage collector (see the sketch below) • The sstaged also gathers relevant local load information for the central scheduler
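A hedged sketch of the kind of local decision sstaged could make on its disk server: pick the file system with the most free space for a new file. The mount points and the selection criterion are assumptions; the real slave stager would apply the locally configured policies.

/* Illustrative local file system selection on a disk server. */
#include <sys/statvfs.h>
#include <stdio.h>

static const char *filesystems[] = { "/srv/castor01", "/srv/castor02" };

/* Return the index of the file system with the most available bytes. */
static int select_filesystem(void)
{
    unsigned long long best_avail = 0;
    int best = -1;
    int i;

    for (i = 0; i < 2; i++) {
        struct statvfs st;
        unsigned long long avail;

        if (statvfs(filesystems[i], &st) != 0)
            continue;                       /* skip unavailable file systems */
        avail = (unsigned long long)st.f_bavail * st.f_frsize;
        if (avail > best_avail) {
            best_avail = avail;
            best = i;
        }
    }
    return best;
}

int main(void)
{
    int fs = select_filesystem();
    if (fs >= 0)
        printf("place new file on %s\n", filesystems[fs]);
    else
        printf("no usable file system; report this server as unavailable\n");
    return 0;
}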
Proposal: Catalogues • Request catalogues • Central repository of all running requests + request history • Predictable load facilitates load balancing • Usage accounting from the request history • Fair-share • File catalogues • A central CASTOR file → disk server mapping allows files to be found • A local CASTOR file → physical filename catalogue on the disk servers • A sketch of the two catalogue levels follows
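An illustrative sketch of the two catalogue levels: a global CASTOR file → disk server mapping used to find a file, and a per-server CASTOR file → physical path mapping that never leaves the disk server. The entries, paths and lookup code are invented examples, not the actual catalogue schema.

/* Two-level catalogue lookup: global (find the server), local (find the path). */
#include <stdio.h>
#include <string.h>

struct global_entry { const char *castor_path, *disk_server; };
struct local_entry  { const char *castor_path, *physical_path; };

static const struct global_entry global_cat[] = {
    { "/castor/cern.ch/user/c/castor/TastyTrees", "pub003d" },
};
static const struct local_entry local_cat[] = {   /* lives on pub003d only */
    { "/castor/cern.ch/user/c/castor/TastyTrees", "/srv/castor01/0001/f42" },
};

static const char *find_server(const char *castor_path)
{
    size_t i;
    for (i = 0; i < sizeof global_cat / sizeof global_cat[0]; i++)
        if (strcmp(global_cat[i].castor_path, castor_path) == 0)
            return global_cat[i].disk_server;
    return NULL;
}

int main(void)
{
    const char *name = "/castor/cern.ch/user/c/castor/TastyTrees";
    const char *server = find_server(name);
    if (server)
        printf("%s is staged on %s; only that server resolves the physical path (%s)\n",
               name, server, local_cat[0].physical_path);
    return 0;
}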
Proposal: Disk server access • Today a user can access files on disk servers either by • The CASTOR file name, /castor/cern.ch/... • The physical file name, /shift/lhcb003d/... • With the new stager we restrict this: • Access is only allowed by CASTOR file name • All physical files are owned by a generic account (stage,st) and their paths are hidden from direct RFIO access • Why?
Proposal: Disk server access (continued) • Avoid two databases for file permissions & ownership • The CASTOR name server • The file system holding the physical file • Facilitate migration/recall of user files • Files with different owners are normally grouped together on tapes owned by a generic account (stage,st) • Would like to avoid setuid/setgid for every file • Avoid backdoors: all disk server access must be scheduled (see the sketch below) • A useful analogy: forbid interactive login access to the batch nodes of an LSF cluster
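A sketch (with invented names) of the access rule this implies on the disk server: rfiod only opens files for requests the scheduler has dispatched, and only by CASTOR name; direct physical-path access is refused. This is illustrative, not the actual rfiod code path.

/* Scheduled-access check on the disk server side. */
#include <stdio.h>
#include <string.h>

/* Request as handed over by the slave stager after scheduling. */
struct scheduled_access {
    unsigned long reqid;        /* scheduler-issued identifier */
    const char   *castor_path;
};

static int is_scheduled(unsigned long reqid)
{
    return reqid == 1001;       /* stand-in for a lookup in sstaged's tables */
}

static int serve_open(const struct scheduled_access *acc)
{
    if (strncmp(acc->castor_path, "/castor/", 8) != 0) {
        fprintf(stderr, "refused: physical paths are not exposed\n");
        return -1;
    }
    if (!is_scheduled(acc->reqid)) {
        fprintf(stderr, "refused: request %lu was not scheduled\n", acc->reqid);
        return -1;
    }
    printf("open %s on behalf of request %lu (file owned by stage,st)\n",
           acc->castor_path, acc->reqid);
    return 0;
}

int main(void)
{
    struct scheduled_access ok  = { 1001, "/castor/cern.ch/user/c/castor/TastyTrees" };
    struct scheduled_access bad = { 9999, "/shift/lhcb003d/somefile" };
    serve_open(&ok);
    serve_open(&bad);
    return 0;
}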
Proposal: Some interesting features • Modifications to the tape mover allow files to be added to running tape requests • Migration (and recall) controlled by a new central component called rtcpclientd • Initiates the tape requests • Schedules the file copies just-in-time when the tape is positioned • Dynamically expanding migration streams (see the sketch below) • Better load-balancing is possible since the file copies are scheduled according to the load • Allow for seeks in RFIO v3 (streaming) mode
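An illustrative sketch of a dynamically expanding migration stream: files can still be appended while the tape request is running, and each copy is taken just-in-time once the tape is positioned. The data structure and function names are assumptions, not the rtcpclientd interface.

/* A migration stream that can grow while the tape request is running. */
#include <stdio.h>
#include <stdlib.h>

struct migration_file {
    const char            *castor_path;
    struct migration_file *next;
};

struct migration_stream {
    struct migration_file *head, *tail;
};

/* Append a file to a stream that may already be mounted on a drive. */
static void stream_append(struct migration_stream *s, const char *path)
{
    struct migration_file *f = malloc(sizeof *f);
    if (!f)
        return;                 /* a real implementation would report this */
    f->castor_path = path;
    f->next = NULL;
    if (s->tail) s->tail->next = f; else s->head = f;
    s->tail = f;
}

/* Called when the tape is positioned: take the next file just-in-time. */
static const char *stream_next(struct migration_stream *s)
{
    struct migration_file *f = s->head;
    const char *path;
    if (!f) return NULL;
    path = f->castor_path;
    s->head = f->next;
    if (!s->head) s->tail = NULL;
    free(f);
    return path;
}

int main(void)
{
    struct migration_stream s = { NULL, NULL };
    const char *p;
    stream_append(&s, "/castor/cern.ch/compass/run1.raw");
    stream_append(&s, "/castor/cern.ch/compass/run2.raw");
    /* the stream can keep growing while earlier copies are being written */
    stream_append(&s, "/castor/cern.ch/compass/run3.raw");
    while ((p = stream_next(&s)) != NULL)
        printf("copy %s to the mounted tape\n", p);
    return 0;
}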
Project planning and monitoring • Detailed plan in the proposal document: http://cern.ch/castor/DOCUMENTATION/ARCHITECTURE/NEW • Three milestones: • October 2003: demonstrate the concept of a pluggable scheduler and high-rate request handling • February 2004: integrated prototype of the whole system • April 2004: production system ready for deployment • Progress monitoring • Using the Project/Task manager provided by the LCG Savannah portal for the CASTOR project: http://savannah.cern.ch/projects/castor/ • Progress reviews at each milestone? Are the experiments interested in providing effort to help with the reviews?
Progress up to now • Logging facility • Designed and implemented • Being tested (test-suite ready) • The Cupv (CASTOR user privilege validation) daemon is the first candidate for using the new logging • mstaged/sstaged design • Prototype with the standard Maui scheduler • Good collaboration with the Maui developers, ~weekly phone-confs • Also looking at the LSF scheduler interface • Catalogue design • Delayed due to problems with important tasks outside the plan (CASTOR-GridFTP service) • To be re-planned • rtcpd modifications, design • Design proposal ready • Prototype being developed • Security design • Design document draft • Recommendation: GSS-API for authentication • Timing tests between Kerberos-V and GSI: the former is >100 times faster • Final decision depends on choices for the CERN security infrastructure • Prototype with GSS-API for RFIO being developed • GSS-API plug-in for gSOAP developed; being used for the CASTOR SRM v1.0 server
Conclusions • CASTOR@CERN status is OK; good progress on operational improvements (Repack, TMS phase-out) • New CASTOR stager: the proposal aims for • A pluggable framework for intelligent and policy-controlled file access scheduling • An evolvable storage resource sharing facility framework rather than a total solution • File access request execution/control and local resource allocation delegated to the disk servers • Progress is on track except for the design of the new catalogue; the milestones are still OK