POSIX-like OGSA/SOAP Services
Arun Jagatheesan, Architect & Team Lead, SDSC Matrix
San Diego Supercomputer Center
GFS, Global Grid Forum-9, October 7, 2003, Chicago

Talk Outline
• Grid File System
• The small big picture
• Need for Schema
• Need for Operation definitions
• Data Transport

Grid File System
[Diagram: layered stack]
• Applications (Astronomy, Physics, Life Science, business apps, ...)
• Grid File System Service (POSIX-like Interface), alongside NFS/CIFS …; hierarchical logical name space, ACL, metadata
• Virtual Directory Service (Management of virtualization)
• Data Services (coordinated with other groups)
• Data Sources

The small big picture
[Diagram: annotated layered stack]
• NFS/CIFS …: NFS or other standard interface over the virtualized schema
• Grid File System Service (POSIX-like Interface): OGSA/SOAP based interfaces for file operations
• Virtual Directory Service (Management of virtualization): XML Schema for Collections, Data Sets
• Data Services
• Data Sources

Grid Collection Schema
• XML Schema based Description for
  • Collections or Virtual Directories
  • Data Sets
  • File System Meta-data (file size, date created, …)
  • Application Specific Meta-data
  • Access Permissions
  • …
• Logical Name space
  • Extensible
  • Scalable (more federations)
  • Dynamic Composition of the name space
  • Import and Export

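To make the schema idea concrete, here is a minimal sketch of an XML description of a collection with its data sets, file system metadata, application metadata and access permissions. The element and attribute names (`collection`, `dataSet`, `fileSystemMetadata`, `grant`, and so on) are assumptions made up for illustration; the slides do not define a schema.

```python
import xml.etree.ElementTree as ET


def build_collection(name, datasets):
    """Build an XML description of a collection (virtual directory).

    All element names here are hypothetical, not a standard schema.
    """
    coll = ET.Element("collection", {"logicalName": name})
    for ds in datasets:
        node = ET.SubElement(coll, "dataSet", {"logicalName": ds["name"]})
        # File system metadata: size and creation date.
        fs = ET.SubElement(node, "fileSystemMetadata")
        ET.SubElement(fs, "size").text = str(ds["size"])
        ET.SubElement(fs, "created").text = ds["created"]
        # Application specific metadata: free-form name/value pairs.
        app = ET.SubElement(node, "applicationMetadata")
        for key, value in ds.get("app_metadata", {}).items():
            ET.SubElement(app, "attribute", {"name": key}).text = value
        # Access permissions: one grant per user.
        acl = ET.SubElement(node, "accessPermissions")
        for user, permission in ds.get("acl", {}).items():
            ET.SubElement(acl, "grant", {"user": user, "permission": permission})
    return coll


def list_dataset_names(coll):
    """Return the logical names of the data sets directly in a collection."""
    return [node.get("logicalName") for node in coll.findall("dataSet")]


example = build_collection(
    "patientRecordsCollection",
    [{
        "name": "image.jpg",
        "size": 2048,
        "created": "2003-10-07",
        "app_metadata": {"modality": "MRI"},
        "acl": {"alice": "read"},
    }],
)
xml_text = ET.tostring(example, encoding="unicode")
```

Because the description is plain XML, exporting a collection and importing it into another federation amounts to serializing with `ET.tostring` and parsing with `ET.fromstring`.
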
Operations on Logical Namespace
• OGSA/SOAP based interfaces
• Grid File System operations
  • Similar to traditional file systems operations / POSIX
  • Open (= Get a GSR?), Read, Seek’n’Read, Seek’n’Write, …
• Simple Control (Context) Operations
  • Management of Logical Namespace
  • SOAP based bindings
• Bulk (Content) Operations
  • Only SOAP bindings for data transport ??? (NOPE)
  • Alternative mechanisms needed in standard

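The operation list above can be sketched as a small in-memory service. This is only an illustration of the POSIX-like shape of the interface: the class name `InMemoryGridFileService`, its method names, and the use of an integer handle in place of a GSR are all assumptions, and no SOAP binding or data transport is modelled.

```python
import itertools


class InMemoryGridFileService:
    """Hypothetical stand-in for a Grid File System Service."""

    def __init__(self):
        self._files = {}    # logical path -> bytearray with the content
        self._handles = {}  # handle -> [logical path, current position]
        self._ids = itertools.count(1)

    # Control (context) operations on the logical namespace.
    def create(self, path, content=b""):
        self._files[path] = bytearray(content)

    def list_collection(self, collection):
        prefix = collection.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))

    # Content operations.
    def open(self, path):
        if path not in self._files:
            raise FileNotFoundError(path)
        handle = next(self._ids)
        self._handles[handle] = [path, 0]
        return handle

    def read(self, handle, size):
        path, position = self._handles[handle]
        data = bytes(self._files[path][position:position + size])
        self._handles[handle][1] = position + len(data)
        return data

    def seek_and_read(self, handle, offset, size):
        # One call instead of two avoids an extra network round trip.
        self._handles[handle][1] = offset
        return self.read(handle, size)

    def seek_and_write(self, handle, offset, data):
        path, _ = self._handles[handle]
        buffer = self._files[path]
        if offset > len(buffer):
            buffer.extend(b"\x00" * (offset - len(buffer)))  # zero-fill the gap
        buffer[offset:offset + len(data)] = data
        self._handles[handle][1] = offset + len(data)
        return len(data)

    def close(self, handle):
        del self._handles[handle]
```

Combining seek and read (or seek and write) into one operation matters more for a remote service than for a local file system, since each separate call would otherwise cost a round trip.
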
Logical Layers (bits, data, information, ...)
[Diagram: transparency layers with example names]
• Collections or Virtual Directories
• Virtual Data Transparency
• Data Replica Transparency
• Data Identifier Transparency
• Storage Location Transparency
• Storage Resource Transparency
Example names shown in the diagram: myActiveNeuroCollection, patientRecordsCollection; image.cgi, image.wsdl, image.sql; image_0.jpg…image_100.jpg; E:\srbVault\image.jpg, /users/srbVault/image.jpg; Select … from srb.mdas.td where...

Storage Resource Transparency (1)
• Storage repository abstraction
  • Archival systems, file systems, databases, FTP sites, …
• Logical resources
  • Combine physical resources into a logical set of resources
  • Hide the type and protocol of physical storage system
  • Load balancing – based on access patterns
  • Unlike DBMS, user is aware of logical resources
• Flexibility to changes in mass storage technology

Storage Resource Transparency (2)
• Standard operations at storage repositories
  • POSIX like operations on all resources
• Storage specific operations
  • Databases - bulk metadata access
  • Object ring buffers - object based access
  • Hierarchical resource managers - status and staging requests

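The two slides on storage resource transparency can be sketched as a common driver interface plus a logical resource that groups several physical ones. Everything named here (`StorageDriver`, `DictFileSystem`, `DictArchive`, `LogicalResource`) is hypothetical, and the "least accessed first" rule is just one simple stand-in for load balancing based on access patterns.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Standard operations every physical storage system must offer."""

    @abstractmethod
    def put(self, path, data): ...

    @abstractmethod
    def get(self, path): ...


class DictFileSystem(StorageDriver):
    """Stand-in for an ordinary file system."""

    def __init__(self):
        self._blobs = {}

    def put(self, path, data):
        self._blobs[path] = bytes(data)

    def get(self, path):
        return self._blobs[path]


class DictArchive(StorageDriver):
    """Stand-in for an archival system that must stage data before reading."""

    def __init__(self):
        self._tape = {}
        self._staged = {}

    def put(self, path, data):
        self._tape[path] = bytes(data)

    def stage(self, path):
        # Storage specific operation: copy from "tape" to the staging area.
        self._staged[path] = self._tape[path]

    def get(self, path):
        if path not in self._staged:
            self.stage(path)
        return self._staged[path]


class LogicalResource:
    """Combines physical resources and hides their type from the caller."""

    def __init__(self, name, drivers):
        self.name = name
        self._drivers = list(drivers)
        self._access_counts = [0] * len(self._drivers)

    def put(self, path, data):
        for driver in self._drivers:
            driver.put(path, data)

    def get(self, path):
        # Naive load balancing: read from the least-accessed driver.
        index = min(range(len(self._drivers)),
                    key=self._access_counts.__getitem__)
        self._access_counts[index] += 1
        return self._drivers[index].get(path)
```

The caller only ever sees `put` and `get` on the logical resource; staging on the archive happens behind the interface, which is the sense in which the type and protocol of the physical system are hidden.
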
Storage Location Transparency
• Support replication of data for performance
• Transparent access to physical location and physical resource
• Virtualization of distributed data resources
  • Data naming managed by the data grid
• Redundancy for preservation
  • Resource redundancy – “m of n” resources in list
  • Location redundancy – replicate at multiple locations

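Location transparency can be illustrated with a small replica catalog: callers ask for a logical name, and the catalog picks a physical copy at a reachable location. The `ReplicaCatalog` class, its methods, and the site names in the usage note are assumptions for illustration only.

```python
class ReplicaCatalog:
    """Hypothetical catalog mapping logical names to physical copies."""

    def __init__(self):
        # logical name -> list of (location, physical path), in preference order
        self._replicas = {}

    def register(self, logical_name, location, physical_path):
        self._replicas.setdefault(logical_name, []).append(
            (location, physical_path))

    def resolve(self, logical_name, reachable_locations):
        """Return the first replica whose location is currently reachable."""
        for location, physical_path in self._replicas.get(logical_name, []):
            if location in reachable_locations:
                return location, physical_path
        raise LookupError(f"no reachable replica for {logical_name}")
```

For example, after registering the same logical name at `site-a` and `site-b` (made-up names), a call to `resolve` with only `site-b` reachable returns the `site-b` copy without the caller changing the name it asked for.
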
Data Identifier Transparency
• Four Types of Data Identifiers:
  • Unique name
    • OID or handle
  • Descriptive name
    • Descriptive attributes – meta data
    • Semantic access to data
  • Collective name
    • Logical name space of a collection of data sets
    • Location independent
  • Physical name
    • Physical location of resource and physical path of data

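The four identifier types can be pictured as four fields on one record, each supporting its own kind of lookup. This is a sketch under assumptions: `DataSetRecord`, `IdentifierRegistry`, the handle value and the attribute names in the usage note are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataSetRecord:
    handle: str            # unique name: OID or handle
    attributes: dict       # descriptive name: metadata attributes
    logical_name: str      # collective name: location independent
    physical_names: list = field(default_factory=list)  # physical paths


class IdentifierRegistry:
    """Hypothetical registry that resolves any identifier type to a record."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def by_handle(self, handle):
        return next((r for r in self._records if r.handle == handle), None)

    def by_attributes(self, **wanted):
        # Semantic access: match on descriptive attributes only.
        return [r for r in self._records
                if all(r.attributes.get(k) == v for k, v in wanted.items())]

    def by_logical_name(self, logical_name):
        return next((r for r in self._records
                     if r.logical_name == logical_name), None)
```

A record for an image might carry the handle `oid-0001`, the attributes `{"modality": "MRI"}`, a logical name under `myActiveNeuroCollection`, and both physical paths `E:\srbVault\image.jpg` and `/users/srbVault/image.jpg`; only the last field changes when the data moves.
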
Data Replica Transparency
• Replication
  • Improve access time
  • Improve reliability
  • Provide disaster backup and preservation
  • Physically or Semantically equivalent replicas
• Replica consistency
  • Synchronization across replicas on writes
  • Updates might use “m of n” or any other policy
  • Distributed locking across multiple sites
• Versions of files
  • Time-annotated snapshots of data

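One way to read the “m of n” update policy is: a write counts as successful once at least m of the n replicas have accepted it, and the rest are recorded as stale for later synchronization. The sketch below assumes that reading; `Replica`, `write_m_of_n` and the caller-supplied timestamp are hypothetical, and distributed locking and rollback of partial writes are deliberately left out.

```python
class Replica:
    """Hypothetical replica that keeps time-annotated versions per key."""

    def __init__(self, site, online=True):
        self.site = site
        self.online = online
        self.versions = {}  # key -> list of (timestamp, data) snapshots

    def store(self, key, data, timestamp):
        if not self.online:
            raise ConnectionError(self.site)
        self.versions.setdefault(key, []).append((timestamp, data))


def write_m_of_n(replicas, key, data, m, timestamp):
    """Try every replica; succeed if at least m of them accept the write.

    Returns (success, stale_sites), where stale_sites lists the replicas
    that missed this update and need synchronization later.
    """
    acknowledged, stale = [], []
    for replica in replicas:
        try:
            replica.store(key, data, timestamp)
            acknowledged.append(replica.site)
        except ConnectionError:
            stale.append(replica.site)
    return len(acknowledged) >= m, stale
```

Note that a write that falls short of m still leaves the new version on the replicas that did accept it; a real service would need locking or a rollback step to keep replicas consistent in that case.
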
Conclusion
• Lot of possibilities
• Need for a Standard Grid File Schema and Global Logical Namespace for virtualization
• Need for Standard description of Operations or Grid File System Service
• Call for
  • Users, Projects
  • Developers, Vendors
• It’s a stone’s throw away – together, we will do it.