
CASTOR / GridFTP


Presentation Transcript


  1. CASTOR / GridFTP
  Emil Knezo, PPARC-LCG Fellow, CERN IT-ADC
  GridPP 7th Collaboration Meeting, Oxford, UK, July 1st 2003

  2. Outline of this talk
  • Introduction to CASTOR HSM
  • CASTOR/GridFTP approach
  • GridFTP problems
  • CASTOR/GridFTP test service
  • Configuration issues
  • Usage examples
  • Plan for CASTOR/GridFTP service

  3. CASTOR
  • The CASTOR Mass Storage System evolved from SHIFT (CERN's tape management system of the 1990s)
  • CASTOR is an HSM (Hierarchical Storage Manager)
  • Today @ CERN: 2066.37 TB of data in 10.51 M files stored in CASTOR
  • CASTOR provides to users:
    • A name space; file names are of the form:
      /castor/domain_name/experiment_name/…  for example: /castor/cern.ch/cms/
      /castor/domain_name/user/…             for example: /castor/cern.ch/user/k/knezo
    • POSIX-compliant I/O via RFIO: + 64-bit support and streaming mode; - no security (a short client sketch follows this slide)
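As a brief illustration of the RFIO and name-space access described above, a hedged sketch of typical CASTOR client commands; the /castor/cern.ch/user/k/knezo path is taken from this slide, myfile.dat is a hypothetical local file, and the exact client commands available depend on the installed CASTOR release:

  # list a CASTOR directory through the name server
  nsls -l /castor/cern.ch/user/k/knezo
  # copy a local file into CASTOR over RFIO, then back out
  rfcp myfile.dat /castor/cern.ch/user/k/knezo/myfile.dat
  rfcp /castor/cern.ch/user/k/knezo/myfile.dat ./myfile.copy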

  4. CASTOR current layout
  [Architecture diagram: RFIO client → RFIOD (disk mover) and disk pool, stager, name server, volume manager, VDQM server, TPDAEMON (PVR), RTCOPY client, RTCPD (tape mover), MSGD → tapes]

  5. GridFTP for CASTOR
  • Motivation for a GridFTP interface to CASTOR
    • LCG
      • Data-movement protocol to couple the different HSM systems of Tier-1 centers
      • Used by the Replica Management System
    • Experiments
      • Offer experiments a secure alternative to rfio and FTP
      • Support CMS world-wide production starting in July
        • Mid-July 2003: 1 TB per day to CASTOR from 12 regional centers
        • February 2004: several TB per day from/to CASTOR
  • Approach for the GridFTP interface to CASTOR
    • Modification of an external GridFTP server to act as an rfio client to CASTOR
      • Solution already proven for FTP servers
    • Not enough man-power to develop and maintain our own server
    • Development time restriction

  6. Selected GridFTP server
  [Diagram: GridFTP control (port 2811) and data connections → GridFTP server process → RFIO → CASTOR stager → tapes]
  • Globus Toolkit GridFTP-1.5 server
    • Based on wu-ftpd 2.6.2
    • Widely used, so good support expected
  • Supported GridFTP extensions (a hedged client example follows this slide):
    • EBLOCK mode
    • PARALLEL transfer
    • REST STREAM
    • DCAU
    • ERET, ESTO
  • Also supported:
    • Third-party transfer
    • PBSZ, PROT
    • MDTM
  • Not supported GridFTP extensions:
    • STRIPING (SPAS, SPOR)
    • ABUF, SBUF
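To illustrate the parallel-transfer and data-channel protection features listed above from the client side, a hedged globus-url-copy sketch; the server name and CASTOR path are reused from the usage examples on slide 13, the local path is a placeholder, and the exact options available depend on the installed Globus/VDT client version:

  # 10 parallel streams, 64 kB TCP buffers, data-channel integrity protection (PROT)
  globus-url-copy -p 10 -tcp-bs 65536 -dcsafe \
      gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name \
      file:///home/knezo/file.name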

  7. GridFTP problems
  • Firewalls
    • Bi-directional data transfer in EBLOCK mode
      • Cannot open the data connection – blocked by the firewall
    • Firewalls with NAT
      • GSI mutual-authentication errors
  • HSM
    • Data existing in the HSM name space are not always readily accessible:
      • Possible disconnection of an idle control-channel socket by some firewalls
      • Third-party transfer from the HSM suffers from a data-connection accept timeout at the data-receiving end
  • Solution
    • HSM:
      • Always pre-stage your data in the HSM before transfer (see the sketch after this slide)
      • Currently with the CASTOR "stagein" command; via the SRM interface when available
    • Firewall:
      • Do not use firewalls with NAT
      • Do not block data connections in the firewall
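A minimal pre-staging sketch, reusing the stagein/stageqry commands shown in the usage examples on slide 13; the stager host wacdr002d and the CASTOR path are the ones used elsewhere in this talk, the file name itself is a placeholder:

  # ask the stager to recall the file from tape to a disk pool
  stagein -h wacdr002d -M /castor/cern.ch/atlas/subdirectory/file.name
  # verify the file is staged on disk before starting the GridFTP transfer
  stageqry -h wacdr002d -M /castor/cern.ch/atlas/subdirectory/file.name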

  8. External network connection
  [Diagram: GridFTP server at 1 Gb/s; control path via PIX firewall at 350 Mb/s half-duplex; data path via HTAR router at 1 Gb/s; external links: GEANT 1 Gb/s, US-link 622 Mb/s, DataTAG 2.5 Gb/s]
  • Data connections of the CASTOR GridFTP server are routed via the 1 Gb/s High Throughput Access Route (HTAR)
    • Control connections are routed via the PIX firewall
    • TCP window size is fixed to 64 kB if the data connection goes via the PIX
  • Only connections to/from the CASTOR GridFTP server on high port numbers are routed via HTAR
    • Configuration issue
    • Port-number interval currently applicable: 50000–51000
    • External GridFTP clients or servers must also select data-connection port numbers from the interval of HTAR-routed ports, otherwise the data channel will go via the PIX! (A client-side sketch follows this slide.)
    • LCG guidelines for data-connection port numbers could solve this kind of configuration issue
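One way for an external client to keep its data connections inside the HTAR-routed interval is the standard Globus port-range environment variable; a minimal sketch, assuming the client host can use the 50000–51000 range quoted above (server name and paths are reused from the usage examples on slide 13):

  # restrict the ports this client listens on for incoming data connections
  export GLOBUS_TCP_PORT_RANGE=50000,51000
  globus-url-copy -p 10 \
      gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name \
      file:///home/knezo/file.name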

  9. CASTOR/GridFTP test service at CERN
  • Test service in operation since mid-January 2003
  • Installation based on
    • EDG Globus, rel. 24 (January – mid-June)
    • VDT 1.1.8 (since mid-June)
  • Supports
    • All EDG GridFTP clients, globus-url-copy
  • Still on the server-code TO-DO list
    • 64-bit file support (currently no files > 2 GB)
    • CWD, CDUP fail on the CASTOR name space (".." problem); in the meantime, clients must use the full path for CASTOR files (see the sketch after this slide)
    • Internal "ls"; currently a patched version of CASTOR's nsls client is used
    • Test some currently unused GridFTP commands (ESTO, ERET)
  [Diagram: GridFTP clients → 1 Gbit/s GEANT link → GridFTP server wacdr002d at 1 Gbit/s (via HTAR since mid-May) → rfio → CASTOR]
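Because CWD and CDUP cannot be used, clients have to give the full CASTOR path in every request; a hedged sketch using the EDG GridFTP client tools mentioned above (the exact tool names depend on the installed edg-gridftp-clients package, and the ATLAS path is reused from the usage examples on slide 13):

  # list a CASTOR directory over GridFTP, full path required
  edg-gridftp-ls gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory
  # check that a single file is visible before transferring it
  edg-gridftp-exists gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name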

  10. Evolution of CASTOR/GridFTP service
  [Diagram: GridFTP via HTAR → DNS load-balancing across servers Serv_1 … Serv_n, each running gridftpd → rfio → CASTOR stagers (stageatlas, stagepublic, cms001d, …) selected by UID–stager mapping]
  • Set of configurations extended by UID–stager mapping (a client-side sketch follows this slide)
  • DNS load-balancing (still to be verified)
  • Stager-response logging
  • Increased data-connection accept timeout (20 min)
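The server-side UID–stager mapping above decides which stager a given GridFTP user is directed to; on the CASTOR client side the same choice can be made explicitly. A hedged sketch, assuming the installed stage commands honour the -h option (shown on slide 13) and the STAGE_HOST environment variable; stageatlas and stagepublic are the stager names from the diagram, the paths are placeholders:

  # direct all stage commands of this session to the ATLAS stager
  export STAGE_HOST=stageatlas
  stagein -M /castor/cern.ch/atlas/subdirectory/file.name
  # or select the stager per command
  stageqry -h stagepublic -M /castor/cern.ch/user/k/knezo/file.name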

  11. Performance and statistics
  • Performance
    • CERN-internal transfer was 5 MB/s in/out; now 7 MB/s in/out
    • Transfer from NIKHEF was 3 MB/s in/out; current figure not available yet
    • Standard CERN TCP configuration (64 kB TCP buffer size)
    • Not via HTAR
    • 10 parallel streams
  • Statistics
    • Not properly kept
      • ftp xferlog file – broken file size for outbound traffic
      • GridFTP xferlog – repeated file record for every parallel stream of a transfer (a rough de-duplication sketch follows this slide)
    • Example: two weeks of statistics, May 26 – June 9:
      • 1480 files transferred (1217 inbound, 263 outbound)
      • 627.425 GB stored to CASTOR via the GridFTP wacdr002d service
      • Main user: ATLAS
        • gppui04.gridpp.rl.ac.uk, aftpexp.bnl.gov, lscf.nbi.dk
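Because the GridFTP xferlog repeats one record per parallel stream, a naive sum over-counts files and bytes. Below is a minimal de-duplicating summary sketch, assuming the standard wu-ftpd xferlog layout (field 8 = byte count, field 9 = file name, field 12 = direction i/o) and a hypothetical log path; counting each file name only once is an approximation if the same file is transferred more than once:

  awk '!seen[$9]++ { bytes[$12] += $8; files[$12]++ }
       END { for (d in files) printf "%s: %d files, %.1f GB\n", d, files[d], bytes[d]/1e9 }' \
      /var/log/gridftp-xferlog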

  12. DN – User mapping
  • EDG mechanisms used
    • grid-mapfile at the moment (a sample entry is sketched after this slide)
      • Mapping granularity at VO level (LDAP URL)
      • Currently unmaintainable to have user-level granularity
      • No dynamic pool accounts
    • edg-gridmap.conf:
      group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org  alice001
      group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org  atlas001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org  cms001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org  lhcb001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=biomedical,dc=eu-datagrid,dc=org  biome001
      group ldap://grid-vo.nikhef.nl/ou=tb1users,o=earthob,dc=eu-datagrid,dc=org  ob001
      group ldap://marianne.in2p3.fr/ou=ITeam,o=testbed,dc=eu-datagrid,dc=org  iteam001
      group ldap://marianne.in2p3.fr/ou=wp6,o=testbed,dc=eu-datagrid,dc=org  wpsix001
    • Up to the VO admin to create subsets of users for other UIDs
  • One DN – one user restriction
    • Hard to sell to experiments
  • VOMS should solve the problem
    • VOMS provides <DN + role>-based UID mapping
    • VOMS to be tested with the CASTOR GridFTP server (configuration issue)
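For illustration, the VO-level mapping above ends up as grid-mapfile entries of the usual "DN" local-account form; the two DNs below are entirely hypothetical, only the atlas001/alice001 accounts come from the edg-gridmap.conf shown on this slide:

  "/O=Grid/O=CERN/OU=cern.ch/CN=Jane Doe" atlas001
  "/C=NL/O=NIKHEF/OU=Personal Certificate/CN=John Roe" alice001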

  13. Umask and usage examples
  • Umask 002 => "rw-rw-r--" permissions on CASTOR
    • Per-server umask configuration
    • CASTOR at the moment still requires world-readable files
  • Usage examples (a store-to-CASTOR sketch follows this slide)
    • Prestage a file:
      stagein  [-h wacdr002d] -M /castor/cern.ch/atlas/subdirectory/file.name
      stageqry [-h wacdr002d] -M /castor/cern.ch/atlas/subdirectory/file.name
      • Will be replaced by the SRM prepareToGet call
    • Retrieve a file from CASTOR:
      globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name file:///home/knezo/file.name
    • Third-party transfer from CASTOR:
      globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name gsiftp://spider.usatlas.bnl.gov/usatlas/workarea/knezo/file.name
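The slide covers pre-staging, retrieval and third-party transfer; for completeness, a hedged sketch of the opposite direction, storing a local file into CASTOR through the same server (paths are placeholders patterned on the examples above; the umask/world-readable caveat from this slide applies to the created file):

  globus-url-copy [-p 10] file:///home/knezo/file.name gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name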

  14. Plan for CASTOR/GridFTP service
  • One-year horizon
    • Support for CMS world-wide production
      • This is now a high-priority task
      • Performance challenge for the server
        • Requires TCP tuning, likely a dedicated stager, maybe NAPI
        • DNS load-balanced server cluster
    • Sufficient for users with no strict throughput requirements for the coming year (ATLAS, LHCb, EDG)
    • Service to-do list
      • Performance tuning
      • DNS load-balancing configuration tests
      • Integrate with CERN monitoring, plus scripts to create server-usage statistics
      • VOMS to improve DN – User mapping
      • Still to improve logging
      • Synchronisation of package upgrades with EDG
      • Prepare user & admin documentation, plus RPMs
    • Interest shown by external institutes: INFN, IFAE, IFIC
  • Beyond one year
    • Need to understand what the evolution of the Globus GridFTP server will be

  15. Conclusions
  • A GridFTP interface to CASTOR already exists
  • A ready-to-use service still requires solving:
    • Configuration issues
    • Performance issues
    • Admin issues
  • The service has the potential to satisfy CASTOR users for the next year
