SRM V2.1: Additional Design Issues Participants: meeting held at CERN, December 2002 JLAB: Bryan Hess, Andy Kowalski Fermi: Don Petravick, Timur Perelmutov LBNL: Alex Sim, Arie Shoshani WP2: Peter Kunszt, Heinz Stockinger, Kurt Stockinger, Erwin Laure WP5: Jean-Philippe Baud, Stefano Occhetti, Jens Jensen, Emil Knezo, Owen Synge
Agenda • Goal • Agree on open SRM issues • Main issues • Space reservations • Directory functions • Access control • Lifetime negotiations (space and file) • Other issues • Meaning of lifetime • Renew lifetime • Revisit srmCopy (push mode) • Revisit call-backs with WSDL handles • Architecture • Where do SRMs fit?
High Level View of Demo Setup [Figure: a client (user/applications) reaches CASTOR, JASMine, and Enstore through SRMs that present a uniform interface; the labels SRM V1.1 and SRM V2.0 appear in the diagram]
How Was It Done? Only the MSS-specific module was modified [Figure: two HRM stacks, each with a disk cache, a Disk Resource Manager (DRM), and a Tape Resource Manager (TRM); one uses an HPSS-specific access module, the other an NCAR-specific access module specialized for NCAR-MSS]
SRMs in Action: PPDG [Figure: from anywhere, the HRM-Client command-line interface issues HRM-COPY (thousands of files), which is carried out as HRM-GET requests (one file at a time) between the HRMs at BNL and LBNL; one HRM performs reads (stages files), the other performs writes (archives files); each has a disk cache; files cross the network by GridFTP GET (pull mode)]
Web-Based File Monitoring Tool • Shows: • Files already transferred • Files during transfer • Files to be transferred • Also shows for each file: • Source URL • Target URL • Transfer rate
Recent Measurements of large multi-file replication • Shows that the network is the bottleneck
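HRMs and GridFTP [Figure: two client paths. Using HRM protocol: the client calls the HRM through the SRM-API, and the HRM uses the GridFTP-API for the transfer. New: GridFTP-HPSS through HRM: the client uses the GridFTP-API, and the GridFTP entry stage calls the HRM through the SRM-API before the GridFTP move]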
GridFTP-HRM-Layer implementation detail [Figure: the client calls the GridFTP-API; GridFTP has entry, move, and exit stages; an FTP-HRM layer sits between GridFTP and the HRM; other labels: shared memory, CORBA] • Steps: • 1a: stor/retv • 1b: hrm_get/hrm_put • 2a: unblock semaphore • 2b: call_back • 3a: success_code • 3b: hrm_release
SRM V2.1: Additional Design Issues Current version proposal by Alex Sim, Junmin Gu, Arie Shoshani
Request Manager and SRMs [Figure: at the client's site, clients submit a logical query; a Request Interpreter uses a property-file index to turn it into logical files; request planning consults the replica catalog and the Network Weather Service to select site-specific files; the Request Executer sends pinning and file transfer requests over the network to DRMs and an HRM, each with a disk cache, with the HRM in front of a tape system]
How to do direct file access • Now: • use srmGet with desired protocol, e.g. rfio, dcap (DESY), file:, globus-xio • Expect to call srmRelease when done • Explore: how to use this with “redirect” • Agreed to have at least one common protocol: • GridFTP for file transfer • ??? for direct access (should be POSIX compliant)
Space Reservation • Space types • Option 1: volatile, durable, permanent • Option 2: volatile, permanent • Why support “durable” space? • Reminder: durable files are temporary files that can only be removed by owner/administrator • For systems that do not intend to have permanent space (e.g. DRMs), but wish to support durable files • Recommendation • Support space reservations for all three types • Size of reservation is negotiable • Request for additional space possible • Lifetime of reservation is negotiable • Request for lifetime extension possible
File assignment to spaces • Implications • Need to acquire durable/permanent space to store durable files • Need to acquire permanent space to store permanent files • Can change file types within a space according to the figure [Figure: file types (volatile, durable, permanent) mapped onto space types (volatile, durable, permanent)]
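The assignment rule can be sketched as a simple ordering check. This is only an illustration: the `Kind` enum, the numeric ordering, and `can_store` are hypothetical names, and the assumption that a volatile file may live in any space type is inferred from the implications listed, not stated on the slide.

```python
from enum import IntEnum


class Kind(IntEnum):
    """Hypothetical shared ordering for file types and space types."""
    VOLATILE = 1
    DURABLE = 2
    PERMANENT = 3


def can_store(file_type: Kind, space_type: Kind) -> bool:
    """Return True if a file of file_type may be placed in a space of space_type.

    Durable files need durable or permanent space, permanent files need
    permanent space; volatile files are assumed to fit anywhere.
    """
    return space_type >= file_type
```

Under these assumptions, `can_store(Kind.DURABLE, Kind.PERMANENT)` is allowed while `can_store(Kind.PERMANENT, Kind.DURABLE)` is not.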
Semantics of spaces • File assignment • File lifetime cannot exceed space lifetime • Reservation guarantees • Volatile: soft guarantee – space can be reclaimed, minimum is guaranteed • Durable, permanent – hard guarantee • Space removal/release • Permanent – by client • ReleaseSpace – requires all files in space released • ForceReleaseSpace – remove all files as well • Release pins, reclaim space if needed • Durable – by client or administrator if lifetime expires • ReleaseSpace – requires all files in space released (files with lifetime expired are not considered “released”) • ForceReleaseSpace – remove all files as well • Release pins, reclaim space if needed • Volatile • Assign minimum quota if request is still active • Otherwise, reclaim space
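The difference between ReleaseSpace and ForceReleaseSpace can be sketched as follows. The `Space` class and its fields are hypothetical, and pins and quotas are not modelled; the sketch only shows that a plain release is refused while any file is unreleased, whereas a forced release removes the files first.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """Toy space: maps file name to a 'released by client' flag (hypothetical model)."""
    kind: str
    files: dict = field(default_factory=dict)
    reclaimed: bool = False


def release_space(space: Space, force: bool = False) -> bool:
    """Return True if the space was reclaimed.

    A file whose lifetime merely expired keeps released=False here, so it
    still blocks a plain release, as the slide notes for durable space.
    """
    if force:
        space.files.clear()          # ForceReleaseSpace: remove all files as well
    elif not all(space.files.values()):
        return False                 # ReleaseSpace: some file is not yet released
    space.reclaimed = True
    return True
```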
Guaranteeing quotas • Add the concept of Durable – “best effort” • In addition to guaranteed • Volatile is always “best effort” • Managing quotas – is an SRM choice • At first, start with best effort (BE), ignore quota • SRM adjusts quotas and other policies • VO responsibility • SRM should have reporting capabilities
Parameters of space reservation request / negotiations Function: srmReserveSpace • In parameters • User_ID • Type of space • Size (granularity: MBs) • Lifetime (granularity: minutes) • Return parameters • Space_size_approved (MBs) • Lifetime_approved (minutes) • Request_ID • Hold lifetime (minutes) Function: srmConfirmSpace • In parameters • User ID • Request ID • Return parameters • OK, refuse
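The two-step negotiation (reserve, then confirm) might look like the toy model below. `ToySRM`, `Offer`, and the capping policy are assumptions made for illustration; this is not the SRM V2.1 interface itself, and expiry of the hold lifetime is not modelled.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Offer:
    """What the SRM is willing to grant (field names are hypothetical)."""
    request_id: int
    space_type: str
    size_mb: int
    lifetime_min: int
    hold_min: int


class ToySRM:
    def __init__(self, free_mb: int, max_lifetime_min: int, hold_min: int = 10):
        self.free_mb = free_mb
        self.max_lifetime_min = max_lifetime_min
        self.hold_min = hold_min
        self._ids = itertools.count(1)
        self.pending = {}      # (user_id, request_id) -> Offer awaiting confirmation
        self.confirmed = {}    # request_id -> Offer

    def reserve_space(self, user_id, space_type, size_mb, lifetime_min) -> Offer:
        # Assumed policy: approve as much of the request as currently fits.
        offer = Offer(next(self._ids), space_type,
                      min(size_mb, self.free_mb),
                      min(lifetime_min, self.max_lifetime_min),
                      self.hold_min)
        self.pending[(user_id, offer.request_id)] = offer
        return offer

    def confirm_space(self, user_id, request_id) -> str:
        offer = self.pending.pop((user_id, request_id), None)
        if offer is None or offer.size_mb > self.free_mb:
            return "refuse"
        self.free_mb -= offer.size_mb
        self.confirmed[request_id] = offer
        return "OK"
```

A client that asks for more than is available receives a smaller approved size and lifetime, and can then either confirm the offer or let it lapse.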
Space request functions • srmReserveSpace() • srmReleaseSpace() option:force • srmResizeSpace() • Total space desired • srmCompactSpace() option:dynamic • Reduce space to size of active files (non-released files) • Reduce space as soon as files get released • If srmResizeSpace • srmGetCurrentSpace() • Comments: • Do not use /durable etc. as a way to address the space. • E.g. srmPut (UserID=john, dir=xyz file=foo, space=durable) • UserID not needed if security system is used (optional) • StorageAuthenInfo can be optionally provided (optional)
Space Usage Reporting Reporting granularity: per period? Per event? Per file? Per request? • Space used (in MB-Minutes) • Space reserved (in MB-Minutes) • Usage level (in MBs) - optional • Total MBs of data transferred on behalf of user associated with space • MBs are binary: 1024 x 1024 bytes
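As a small worked example of the MB-Minutes unit, the sketch below sums usage over intervals using binary megabytes. The interval representation (minutes, bytes in use) is an assumption chosen for illustration; the slide leaves the reporting granularity open.

```python
MB = 1024 * 1024  # binary megabyte, as specified on the slide


def mb_minutes(samples):
    """Sum space usage in MB-Minutes.

    samples: iterable of (minutes, bytes_in_use) pairs, one per interval
    during which the usage level was constant.
    """
    return sum(minutes * (nbytes / MB) for minutes, nbytes in samples)
```

For instance, 5 MB held for 10 minutes followed by 2 MB held for 30 minutes gives 50 + 60 = 110 MB-Minutes.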
File Sharing in Spaces • Files (owned) in any spaces can be shared by requesters • Provided that requester has permission to “read” • In volatile space • File size is counted against requester’s quota • In durable / permanent space • Requester does not have to have durable / permanent space • File size not counted against requester’s quota • File is not moved to requester’s space • If owner removes / releases file, wait till second client releases file or lifetime expires • And no new clients can get access to it • srmReassignToUser() • Owner gives permission to reassign directory/file to another user • Lifetime should be specified • Space continues to be charged to owner till srmMoveToSpace happens • srmMoveToSpace() • Move from a directory in one space to a directory in the requester’s space • Can designate a file type different from original • Recommendation for implementation: do not copy files
Directory Semantics • Client can establish directories for any of their spaces • Including “volatile” spaces! • File sharing in “volatile spaces” must be supported (implementation choice – e.g. links) • Only one space type per client! • Clients refer to spaces as: • volatile, durable, permanent (reserved terms) • Top-level directory naming is a local DRM choice • E.g. /home/srm/john/durable • TFNs (in TURLs) returned to clients include 4 levels + user-established directories • E.g. /home/srm/john/durable/abc/xyz/file.foo (here abc, xyz are directories john set up)
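The TFN layout in the example can be sketched as a path join. The function name `build_tfn` and its parameters are hypothetical; the root `/home/srm` is taken from the slide's example and would be a local DRM choice.

```python
import posixpath

SPACE_NAMES = ("volatile", "durable", "permanent")  # reserved terms from the slide


def build_tfn(root, user, space, *user_dirs, filename):
    """Join the DRM-chosen root, the user, the space name, any
    user-established directories, and the file name into one path."""
    if space not in SPACE_NAMES:
        raise ValueError(f"unknown space name: {space!r}")
    return posixpath.join(root, user, space, *user_dirs, filename)
```

With the slide's values, `build_tfn("/home/srm", "john", "durable", "abc", "xyz", filename="file.foo")` yields `/home/srm/john/durable/abc/xyz/file.foo`.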
Directory Operations • ls, mkdir, mv, cd, rm, rmdir, pwd • (Prefix with srm?) • Relative to space name: /volatile, /durable, /permanent • Default is /volatile • Allow –r (recursive) in rm, rmdir, ls • Allow for ls: max_list, offset (Chip Watson) • rm • Hard delete in /durable or /permanent space • Soft delete (advisory delete) in /volatile • Access control • In future: all access through CAS • In current version: • Defaults • Volatile: read-world • Durable / permanent: read-owner • User can use chmod • Volatile – chmod may be refused by SRM
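Two of the points above lend themselves to a short sketch: the `max_list` and `offset` options for ls, and the default read permissions per space. The function `srm_ls`, the sorted ordering, and the `DEFAULT_READ` table are illustrative assumptions, not part of the agreed interface.

```python
# Default read permission per space, as listed on the slide.
DEFAULT_READ = {"volatile": "world", "durable": "owner", "permanent": "owner"}


def srm_ls(entries, offset=0, max_list=None):
    """Return at most max_list entries, starting at position offset.

    Entries are sorted first so that successive calls page through a
    stable order (an assumption; the slide does not specify ordering).
    """
    ordered = sorted(entries)
    end = None if max_list is None else offset + max_list
    return ordered[offset:end]
```

A client could page through a large directory by calling `srm_ls` repeatedly, advancing `offset` by `max_list` until an empty list comes back.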
Lifetime Issues • Option 1: Relative time since response to request is made? • e.g. 30 min from now • what if communication is slow? • Agreed to have relative time • Option 2: Absolute time • how to synchronize clocks • Agreed to have creation time with GMT (best if synchronized with NTP or atomic clock) • Lifetime granularity – keep it in seconds to simplify implementation • Should a lifetime of a file be negotiable? Yes • Per request? Yes-optional (default applies) • Per file? Yes-optional (default applies) • Per space? Yes-optional (default applies) • If so, need to add confirmation – no confirmation, client can release • Alternative: not negotiable (covered by above) • Volatile files: SRM’s choice • Durable: client’s choice (not exceeding space lifetime)
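Combining a GMT creation time with a relative lifetime in seconds, and capping a file's lifetime by what remains of its space, could be sketched like this. The function names and the capping policy are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def expiry(created_utc: datetime, lifetime_s: int) -> datetime:
    """Absolute expiry derived from a GMT creation time and a relative lifetime."""
    return created_utc + timedelta(seconds=lifetime_s)


def grant_file_lifetime(requested_s, file_created, space_created, space_lifetime_s):
    """Grant the requested file lifetime, but never past the end of the space."""
    space_end = expiry(space_created, space_lifetime_s)
    remaining_s = int((space_end - file_created).total_seconds())
    return max(0, min(requested_s, remaining_s))


# Example: a one-hour space, and a file created 10 minutes into it.
space_created = datetime(2002, 12, 1, 12, 0, tzinfo=timezone.utc)
file_created = space_created + timedelta(minutes=10)
granted = grant_file_lifetime(7200, file_created, space_created, 3600)
```

In the example, the client asks for two hours but only 50 minutes (3000 seconds) of the space remain, so that is what is granted.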
Security Issues • How to support multiple security models • GSI, Kerberos, SSL, etc… over http • Need a common ground (GSI) • GSI over HTTP for communication - OK • GridFTP (gsiftp) for data transfer – OK • currently you need to be a user of the system to put files in that system • Can also be done by group ID, and SRM manages quotas • E.g. experience with GSI–Kerberos map (at Fermi?) • How to deal with systems that have their own internal security: optional StorageSystemID • E.g. HPSS at NERSC • Who is enforcing file access permissions in a Virtual Organization (VO)? • If CAS in the future, what mechanism? • How do SRMs report usage to CAS?
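Revisit Features: srmGet, srmPut, srmCopy [Figure: two panels. srmGet/srmPut: the client moves files to or from an SRM using Client-FTP-put (push) and Client-FTP-get (pull). srmCopy: requested by the client, an SRM moves files to or from another SRM or a non-SRM site using SRM-FTP-get (pull) and SRM-FTP-put (push)]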
Revisit Features: get, put • srmGet, srmPut: relative to client • srmGet: (Client wants to pull a file from the SRM) • If file already in SRM space => pin, return location • If file not in SRM space => allocate space, get file from its source location, pin, call-back when file arrives • New: Can specify location in a directory, file-type, space-type • Followed by srmRelease • srmPut: (Client wants to push a file into SRM) • Allocate space, return location • New: Can specify location in a directory, file-type, space-type • Followed by srmPutDone
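The srmGet branches above (cache hit versus miss, then srmRelease) can be illustrated with a toy cache. `ToyCache`, the `fetch` callable, and the `cache:` location prefix are hypothetical stand-ins; space allocation and real transfers are not modelled.

```python
class ToyCache:
    """Minimal stand-in for an SRM disk cache (names are hypothetical)."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable: source URL -> file contents
        self.files = {}      # url -> contents held in SRM space
        self.pins = {}       # url -> pin count

    def srm_get(self, url, on_arrival=None):
        hit = url in self.files
        if not hit:
            # Miss: bring the file in from its source location.
            self.files[url] = self.fetch(url)
        self.pins[url] = self.pins.get(url, 0) + 1   # pin in both cases
        if not hit and on_arrival is not None:
            on_arrival(url)                          # call-back when file arrives
        return "cache:" + url                        # location handed to the client

    def srm_release(self, url):
        self.pins[url] -= 1                          # client is done with the file
```

The second request for the same file is served from the cache, so no fetch and no call-back occur; only the pin count grows.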
Revisit Features: copy • srmCopy: relative to SRM • srmCopyPull: Copy from remote location to SRM • Call-back when done • Option: notify remote source to release file when done • srmCopyPush: Copy from SRM to remote location (add?) • Call-back when done • Option: Release file when transfer done • Example scenario: ask an HRM to push a file to a site that has only GridFTP • Discussion: Don Petravick
Revisit Features: directory functionality • Should we allow specification of a directory for srmGet, srmPut, srmCopy? • srmGet • Now: srmGet {LFN, SURL} • Add: srmGet (Dir=path/directory_name) • Recursive? No • srmPut • Now: srmPut {LFN, StURL} • Add: srmPut [{LFN}, StDir=path/directory_name] • Recursive? No • srmCopy • Now: srmCopy {source-URL} • Add: srmCopy (source-URL, target-StDir) • Add: srmCopy (source-StDir, target-StDir) • Recursive? srmCopy (source-StDir, target-StDir, -r)
Revisit Features: LFNs • LFN can be very long • Long path name, long filename • Any ideas • SRMs could assign unique IDs internally • Should IDs be visible externally, like RDBMSs?
Revisit Features: others • “Call_backs” through WSDL – handle • Since supported by WSDL, why not include in Basic Version? – OK, but optional • File and system status functionality • Add: which files in file-set are currently in SRM cache – Yes • What should be reported that is useful for planning? • What is useful for advertising to RepCat and Info-service? • SRM should register itself with Info-service • SRM should notify RepCat on deletes • Flag for registering in RepCat • Should status be active or passive? • Add max-file-size to get/put/copy requests? • Expect the system to allocate max-file-length and adjust space after file is transferred • Expect system to kill transfer if file size exceeds max-file-length - OK, optional • Remove “srm” from all commands except get, put, copy? srmPrepareGet, srmPreparePut, srmCopy
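The max-file-size behaviour (allocate the maximum, kill the transfer if it is exceeded, give back the unused space afterwards) can be sketched as below. The `receive` function, the chunk-based interface, and `TransferTooLarge` are hypothetical.

```python
class TransferTooLarge(Exception):
    """Raised when incoming data exceeds the declared max-file-size."""


def receive(chunks, max_file_size):
    """Accumulate chunks, aborting as soon as max_file_size would be exceeded.

    Returns (data, spare_bytes), where spare_bytes is the part of the
    initial allocation that can be returned once the transfer completes.
    """
    received = bytearray()
    for chunk in chunks:
        if len(received) + len(chunk) > max_file_size:
            raise TransferTooLarge(
                f"transfer exceeds max-file-size of {max_file_size} bytes")
        received.extend(chunk)
    return bytes(received), max_file_size - len(received)
```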
New feature: include Replica Catalog connectivity • Replica Catalogs • RLS, Jlab-RepCat, … • Private catalogs: STAR-catalog, Babar-SRB, … • Replica management • Globus: grid-ftp + RLS • Desirable: SRM + some catalog • File Replication Service • Robust copying of large number of files (1000’s) • Automatic recovery from transient failures • Logs, dynamic tracking and monitoring • Automatic registration to replica catalogs • Add: post-hook? • After each file gets transferred, call post-hook service • (Also add: pre-hook?) • Call filtering function before file is transferred
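A file replication service with recovery from transient failures and optional pre/post hooks could be sketched as follows. The `replicate` function, the retry count, and the hook signatures are assumptions for illustration; the `copy` callable stands in for whatever transfer mechanism is used.

```python
def replicate(urls, copy, pre_hook=None, post_hook=None, max_attempts=3):
    """Copy each URL, retrying transient failures.

    pre_hook(url) -> bool acts as a filter before the transfer;
    post_hook(url) runs after each successful transfer (for example,
    registration in a replica catalog). Returns (done, failed) lists.
    """
    done, failed = [], []
    for url in urls:
        if pre_hook is not None and not pre_hook(url):
            continue                      # filtered out before transfer
        for _attempt in range(max_attempts):
            try:
                copy(url)
            except IOError:
                continue                  # treated as transient: try again
            done.append(url)
            if post_hook is not None:
                post_hook(url)
            break
        else:
            failed.append(url)            # every attempt failed
    return done, failed
```

Passing a catalog-registration function as `post_hook` gives the automatic registration behaviour listed above, while `failed` provides the material for logs and monitoring.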
SOAP Interoperability issue • SOAP implementations are not compatible in complex data types, especially with string arrays and struct arrays. • Most array test failures are due to the lack of support for SOAP sparse array and partially transmitted array encodings. Array tests with 'regular' arrays are mostly okay and interoperable • Need a common ground: OGSA/GT3 • In the meantime, use SOAP over https-with-GSI • Implies: GSI using minimum security model
What else? • Issues we do not want to discuss now, but may be important in the future • Interacting with RLS and other catalogs • Pre/post processing hooks • Advertising SRM capabilities • Providing soft guarantees for planning • …
SOAP InterOp test Test according to http://www.whitemesa.com/interop/proposal2.html and http://www.whitemesa.com/interop/proposalB.html
Where do SRMs belong in the Grid architecture? (slides 1 to 3 show the same figure) [Figure, based on the Grid Architecture paper (new version), layers from top to bottom. Collective 2, services specific to application domain or virtual organization: Request Interpretation and Planning Services; Workflow or Request Management Services; Application-Specific Data Discovery Services; Community Authorization Services; Consistency Services (e.g., Update Subscription, Versioning, Master Copies). Collective 1, general services for coordinating multiple resources: Data Filtering or Transformation Services; Data Transport Services; Data Federation Services; General Data Discovery Services; Storage Management (Brokering); Compute Scheduling (Brokering); Monitoring/Auditing Services. Resource, sharing single resources: Resource Monitoring/Auditing; Storage Resource Manager; File Transfer Service (GridFTP); Data Filtering or Transformation Services; Database Management Services; Compute Resource Management. Connectivity: Communication Protocols (e.g., TCP/IP stack); Authentication and Authorization Protocols (e.g., GSI). Fabric: Other Storage systems; Mass Storage System (HPSS); Compute Systems; Networks]
GGF – standards • GLUE • Common minimal grounds • Security • Misc. • GLUE
Basic functionality • Space types • Volatile at minimum • Permanent, durable are optional • Durable: best effort, guaranteed • Space reservations • Space reservation allowed • Directory support • Yes • srmMoveDir • Yes • Security • GSI over HTTP • Transfer protocols • GridFTP • srmChangeFileType • Provided that space of the requested type was acquired • srmCopy • Both push and pull
Action Items (1) • Canonical specification by Arie, everyone check • WSDL v.2.1: generated by Timur, Alex, Bryan, … • Java translation by Timur • C/C++ translation by Alex • Simple Java client/server based on WSDL v.2.1 compatible with httpg/OGSA/GT3 by Timur • Simple C++ client/server based on WSDL v.2.1 compatible with httpg/OGSA/GT3 by Alex • LBNL (Alex) will provide/maintain a repository for documentation, WSDL and sample packages • Also LBNL will run test servers for both Java and C/C++ • http://sdm.lbl.gov/srm-wg will contain information • Interface with SRM from the EDG Replica Manager (Reptor) – Heinz and Kurt
Action Items (2) • SRM mailing list to be maintained : by Timur • Participating in GGF as a BoF/RG/WG – Peter to clarify with Peter Clarke • Annual/bi-annual SRM workshop • Next meeting ?