Amsterdam, DAaM 18th June 2010
Final summary
Ian Bird
Potential Demonstrators
• Bockelman
  • xrootd demonstrator (with filesystems): should collab with IT-DSS xrootd ideas?
• Massimo:
• Stewart: +DIRAC
  • Panda dynamic data placement?
  • CHIRP (== afs-like fs) – similar to xrootd proposals?
• Behrman:
  • ARC caching – general use?
• Baud/McCance:
  • Use MSG as weak coupling to address data consistency (storage, catalogues, apps)? Or as DM info system?
• CMS catalogues – Simon Metson
• Catalogues: Oscar
  • DHT, Bloom filters, MQ, Alien FC ???
• Pablo: LFC vs AlienFC
Ian.Bird@cern.ch
• Jeff:
  • CoralCDN
• Dirk/Rene: proxy caches
• NFS: J-P/Gerd
• File access: Jens
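The "weak coupling" idea in the demonstrator list (messaging between storage, catalogues and applications) can be illustrated with a toy publish/subscribe sketch. This is only an assumption about what such coupling might look like: `MessageBus`, `Catalogue`, the topic names, the site names and the file path are all hypothetical, and nothing here is the actual MSG interface.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-process stand-in for a messaging system (hypothetical, not the MSG API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher does not know who is listening: that is the weak coupling.
        for handler in self._subscribers[topic]:
            handler(message)


class Catalogue:
    """Hypothetical replica catalogue kept in step with storage by messages."""

    def __init__(self):
        self.replicas = defaultdict(set)  # logical file name -> set of sites

    def on_file_added(self, msg):
        self.replicas[msg["lfn"]].add(msg["site"])

    def on_file_removed(self, msg):
        self.replicas[msg["lfn"]].discard(msg["site"])


bus = MessageBus()
catalogue = Catalogue()
bus.subscribe("storage.file_added", catalogue.on_file_added)
bus.subscribe("storage.file_removed", catalogue.on_file_removed)

# Storage elements announce changes; the catalogue follows without being called directly.
bus.publish("storage.file_added", {"lfn": "/grid/data/run1.root", "site": "SITE_A"})
bus.publish("storage.file_added", {"lfn": "/grid/data/run1.root", "site": "SITE_B"})
bus.publish("storage.file_removed", {"lfn": "/grid/data/run1.root", "site": "SITE_A"})
```

In this sketch the storage side never calls the catalogue; a lost message would leave the catalogue stale, which is why consumers still have to treat catalogue contents as a best guess.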
And...
• Need network planning group (David F.)
Discussion items
• Data transfer...
  • Many suggestions – both here and in contributed docs
  • Not all coherent...
  • Probably need a group to address this:
    • Can “FTS” + suggestions/fixes do what is needed?
    • Do we need an (other) asynchronous data transfer mechanism (job finishes, here is output, deliver it to archive)?
• SRM:
  • Is it dead?
  • Separate archives from caches – archive interface is simpler (subset of SRM?); cache interface is fs (-like)
  • Not dead but only use limited pieces
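The asynchronous pattern raised above (job finishes, here is output, deliver it to archive) can be sketched as a queue with retries. This is a minimal illustration under stated assumptions, not FTS or any existing service: `TransferQueue`, `flaky_copy`, the retry limit and the example paths are all hypothetical.

```python
import queue


class TransferQueue:
    """Hypothetical hand-off point: jobs submit and exit, delivery happens later."""

    def __init__(self, copy_func, max_attempts=3):
        self._pending = queue.Queue()
        self._copy = copy_func          # function(source, destination), raises OSError on failure
        self._max_attempts = max_attempts
        self.done = []
        self.failed = []

    def submit(self, source, destination):
        """Called by the job on exit; returns immediately without copying."""
        self._pending.put((source, destination, 1))

    def drain(self):
        """Process queued requests; a real service would run this in the background."""
        while not self._pending.empty():
            source, destination, attempt = self._pending.get()
            try:
                self._copy(source, destination)
                self.done.append((source, destination))
            except OSError:
                if attempt < self._max_attempts:
                    # Re-queue for another try instead of failing the job.
                    self._pending.put((source, destination, attempt + 1))
                else:
                    self.failed.append((source, destination))


# Simulated copy that fails once per source, then succeeds (assumption for the demo).
attempts = {}


def flaky_copy(source, destination):
    attempts[source] = attempts.get(source, 0) + 1
    if attempts[source] == 1:
        raise OSError("simulated transient failure")


transfers = TransferQueue(flaky_copy)
transfers.submit("wn:/scratch/job42/output.root", "archive:/store/job42/output.root")
transfers.drain()
```

The point of the design is that the job's success does not depend on the archive being reachable at the moment the job ends; only the queue has to accept the request.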
Discussion items
• Access protocol
  • Can we have a common data access protocol?
  • Is it xrootd?
• What is it we are trying to optimise?
  • Not CPU! But this is (all) we measure today...
  • Should better specify metrics for success
• Monitoring now
  • We need to measure what we are doing so that we have something to compare with!
  • Real information from MSS and other data management components is missing. We need it now.
And ...
• Can we have a simulation??
  • What are the metrics??
• Security issues
  • WN access to WAN
  • Policies
• Demonstrators
  • Follow up in GDBs
My personal summary
• Storage:
  • Separate archive (tape) and cache systems
  • Simplifies interfaces to both
  • Allows industrial (standard) solutions for archives
  • Never read tapes
• Data Access layer
  • Need combination of data placement and caching
  • Effective caches can reduce (or optimise existing) space usage (separate tools from policies)
  • Several potential caching mechanisms
  • Can’t assume that jobs find all of the files needed at a site – get 90%, remote access (cache) the rest
  • Can’t assume that catalogues are fully up to date
  • Model of access is filesystem (-like)
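The combination of local files plus remote access for the rest can be sketched as a read-through cache. This is a minimal illustration, assuming a least-recently-used eviction policy that the slides do not specify; `ReadThroughCache`, its `fetch_remote` callback and the file names are hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Hypothetical site cache: serve local copies, fetch the rest remotely."""

    def __init__(self, fetch_remote, capacity):
        self._fetch_remote = fetch_remote  # function(name) -> file contents
        self._capacity = capacity          # maximum number of cached files
        self._files = OrderedDict()        # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def open(self, name):
        if name in self._files:
            self.hits += 1
            self._files.move_to_end(name)  # mark as most recently used
            return self._files[name]
        # Not at the site: go remote, then keep a copy for later jobs.
        self.misses += 1
        data = self._fetch_remote(name)
        self._files[name] = data
        if len(self._files) > self._capacity:
            self._files.popitem(last=False)  # evict the least recently used file
        return data


# Stand-in for remote storage (assumption for the demo).
remote = {"a": b"A", "b": b"B", "c": b"C"}
cache = ReadThroughCache(remote.__getitem__, capacity=2)
for name in ["a", "b", "a", "c", "b"]:
    cache.open(name)
```

Counting hits and misses also gives a metric other than CPU, and the eviction policy is kept apart from the fetch mechanism, in the spirit of separating tools from policies.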
Summary - 2
• Data transfers
  • Need a reliable way to move data from a job to an archive (or point to point)
  • Need data placement mechanism
  • Need transport for caching
  • Need remote access mechanism
• Namespaces, catalogues, authz, quotas etc.
  • Want dynamic catalogues that reflect changing contents of storage
  • Could be LFC + MQ, DHT, Bloom filters, Alien FC (?)
  • Computing models should recognise that information is best guess (not 100% reliable)
• Grid-wide home directory
  • Is needed
  • Technologies? How to do this?
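Of the catalogue options listed, a Bloom filter shows most directly what "best guess" means: it can say "definitely not here" or "probably here", never "certainly here". The sketch below is a generic Bloom filter, not a proposal from the workshop; the class name, the bit and hash counts, and the file names are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Compact, approximate summary of which files a site holds (illustrative sizes)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self._num_bits = num_bits
        self._num_hashes = num_hashes
        self._bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions from salted SHA-256 digests of the item.
        for i in range(self._num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self._num_bits

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means present or a false positive.
        return all(
            self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


site_summary = BloomFilter()
site_summary.add("/grid/data/run1.root")
site_summary.add("/grid/data/run2.root")

# A file that was added is always reported as possibly present.
present = site_summary.might_contain("/grid/data/run1.root")
```

A plain Bloom filter cannot remove entries, so reflecting deletions would need a periodic rebuild or a counting variant; a job that gets a positive answer must still be ready to fall back to remote access.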