
HEPCAL view on File Access




  1. HEPCAL view on File Access Jeff Templon NIKHEF templon@nikhef.nl

  2. Interesting

  3. Outline • HEPCAL file access • What I don’t like about SRM • How SRM got into EDG SE (WP5) (personal view)

  4. HEPCAL on file access • A dataset (DS) can be any sort of collection of information • Datasets are Write-Once-Read-Many • The DMS (data management system) must be able to associate a default remote access protocol with each dataset; the DMS is expected to make sure the dataset lands only on SEs that can support that protocol • Root daemon • AMS • Files belonging to a dataset should be made available for opening via a POSIX call or an application-specific remote access protocol. The Grid should provide a mechanism whereby a user can present an LDN and receive in return a list of physical file names (and possibly the protocols by which they can be opened) that can be mapped to the original files that were uploaded to the Grid.
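The placement rule on this slide (a dataset should land only on SEs that support its default access protocol) can be sketched as a simple filter. This is a minimal illustration, not anything defined by HEPCAL or EDG: the `StorageElement` class, the `eligible_ses` function, the host names, and the protocol labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageElement:
    """Hypothetical description of an SE and the access protocols it offers."""
    name: str
    protocols: frozenset


def eligible_ses(default_protocol, storage_elements):
    """Return the SEs that support the dataset's default access protocol."""
    return [se for se in storage_elements if default_protocol in se.protocols]


# Hypothetical SEs; the protocol labels are placeholders, not official names.
ses = [
    StorageElement("se-a.example.org", frozenset({"posix", "rootd"})),
    StorageElement("se-b.example.org", frozenset({"posix"})),
]

# A dataset whose default protocol is "rootd" may only be placed on se-a.
rootd_ses = eligible_ses("rootd", ses)
```

In this sketch, a dataset whose default protocol no SE supports gets an empty list, which a real DMS would presumably have to report as a placement failure.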

  5. More HEPCAL on file access • Regardless of which method (optimization in making the file available to the user) the Grid chooses, the user accesses the DS by providing an LDN and passing the returned file identifier to an open call. • … to write a dataset directly onto an SE. We consider this a special case of the “DS upload” use case. In this case the Grid provides a dataset staging area where files can be created via standard POSIX calls. This will be either a suitable area on the local machine or on the SE, or even a different area as long as it optimises the subsequent upload of the dataset to the SE. • The user opens the files for reading with a POSIX open or using the syntax of the specified access protocol; (from uc#dsaccess use case)

  6. Conclusions, HEPCAL file access • Present LDN, pass returned object to “open”, get the bytes • Multi-file datasets are possibly in conflict with this model: • Files belonging to a dataset should be made available for opening via a POSIX call or an application-specific remote access protocol. The Grid should provide a mechanism whereby a user can present an LDN and receive in return a list of physical file names (and possibly the protocols by which they can be opened) that can be mapped to the original files that were uploaded to the Grid. • POSIX access was discussed yesterday in the EDG ATF • Obvious that it falls within either WP2 or WP5 • Neither feels they have time to do it
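The "present LDN, open, get the bytes" model can be sketched in a few lines. This is an illustration under stated assumptions, not a Grid API: the `catalog` dictionary, the `resolve` and `read_dataset` functions, and the LDN string are hypothetical, and a local temporary file stands in for a replica on an SE.

```python
import os
import tempfile


def resolve(catalog, ldn):
    """Map an LDN to a list of (physical file name, protocols) pairs."""
    return catalog[ldn]


def read_dataset(catalog, ldn):
    """Present an LDN, pass the returned name to open(), get the bytes."""
    pfn, protocols = resolve(catalog, ldn)[0]
    if "posix" not in protocols:
        raise ValueError("this sketch only handles POSIX-openable replicas")
    with open(pfn, "rb") as f:
        return f.read()


# Stand-in for a file that was uploaded to the Grid (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"event data")
    path = tmp.name

# Hypothetical catalog with a single-file dataset.
catalog = {"ldn:demo/run-1": [(path, ("posix",))]}
data = read_dataset(catalog, "ldn:demo/run-1")
os.remove(path)
```

The sketch deliberately takes the first entry of the returned list. For a multi-file dataset that choice is exactly what is not well defined, which is the conflict the slide points out.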

  7. Multi-File Datasets: Unresolved Issue • Events for a given run might be partially resident in several files. If a physicist requests the dataset Omega-20070312 and wants to read vertex events, he wants to be sure to open the file “Vertex” and not something else. This means it must be possible to “label” the various components for identification later. One could take a Unix-like approach where the DS name is like a “directory”, and the component files like the “files” in that directory. We weren’t able to decide if this approach was good. The problem of files/directories is not exactly equivalent to DS/components.
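The Unix-like labelling idea can be sketched by treating the dataset name as a directory and each component label as a file name inside it. This is only an illustration of the naming scheme discussed on the slide, which HEPCAL left undecided: the functions `component_path` and `list_components` and the "Tracks" component are hypothetical.

```python
import posixpath


def component_path(dataset_name, component):
    """Build a directory-style name: <dataset>/<component label>."""
    if "/" in component or component in ("", ".", ".."):
        raise ValueError("component label must be a plain name")
    return posixpath.join(dataset_name, component)


def list_components(paths, dataset_name):
    """List the component labels registered under one dataset name."""
    prefix = dataset_name + "/"
    return sorted(p[len(prefix):] for p in paths if p.startswith(prefix))


# "Vertex" comes from the slide; "Tracks" is a made-up second component.
registered = [
    component_path("Omega-20070312", "Vertex"),
    component_path("Omega-20070312", "Tracks"),
]
vertex_name = component_path("Omega-20070312", "Vertex")
labels = list_components(registered, "Omega-20070312")
```

The sketch covers naming only. It says nothing about whether a dataset should behave like a directory in other respects, which is where the slide notes the analogy is not exact.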

  8. What I don’t like about SRM • Files don’t stay put. • We get an SFN which really isn’t a file name. There is no guarantee that if I ssh onto the SE, there will be a file with that name • The actual files on disk may have a different name each time they show up on disk • Maybe this isn’t so bad, but one cannot just open the file! Opening the file becomes a two-call sequence • Maybe after seeing HEPCAL again I should not worry … just don’t ever look at an SE again
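The complaint about the two-call sequence can be made concrete with a toy model. This is not the SRM interface and uses no SRM calls: `ToyStore`, its `put` and `prepare` methods, and the name "sfn-demo" are hypothetical, and a temporary directory stands in for the SE's disk.

```python
import os
import tempfile
import uuid


class ToyStore:
    """Toy storage manager: files get a fresh on-disk name at each staging."""

    def __init__(self, root):
        self.root = root
        self.contents = {}  # SFN -> bytes held "off disk"

    def put(self, sfn, data):
        self.contents[sfn] = data

    def prepare(self, sfn):
        """Call 1: stage the file and return a transient on-disk path."""
        path = os.path.join(self.root, uuid.uuid4().hex)
        with open(path, "wb") as f:
            f.write(self.contents[sfn])
        return path


with tempfile.TemporaryDirectory() as root:
    store = ToyStore(root)
    store.put("sfn-demo", b"payload")

    first = store.prepare("sfn-demo")
    second = store.prepare("sfn-demo")

    with open(first, "rb") as f:  # Call 2: only now can the file be opened
        data = f.read()

    names_differ = first != second
    sfn_on_disk = os.path.exists(os.path.join(root, "sfn-demo"))
```

In this model the SFN never appears as a file on disk and each staging yields a different path, so a plain open on the SFN cannot work; the caller has to ask the store first and open the returned path second.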

  9. How SRM got into EDG • Not a party line view • SRM was presented at the December 2002 ATF meeting • It was not generally realized that the SE would be based on SRM • People “woke up” during the Feb 2003 ATF meeting when WP5 expressed surprise that we thought “get” actually got the bytes • This may all be irrelevant: unless the SE converges within the next N hours, it may not be in LCG-1.
