Data Grid Web Services
Chip Watson
Jie Chen, Ying Chen, Bryan Hess, Walt Akers
A Three Tier Web Services Architecture
[Diagram labels: Web Browser; Application; Authenticated connections; XML to HTML servlet; Web Server (Portal); Web Service (several); Remote Web Server; Grid Service; Grid resources, e.g. Condor; Local Backend Services (batch, file, etc.); Storage system; Batch system]
Why Web Services?
• Strong industry support & growing adoption
• Self-describing interfaces & protocol
• Support in all languages
• Easy addition of input or output parameters
• Interface evolution without breaking what works
Data Grid Web Services Architecture
[Diagram labels: Single Site; Web Services; Meta Data Catalog; Replica Catalog; Replication Service; File Client; HRM++ Service; File Server(s); HRM Listener; Storage Resource]
Components: Replica Catalog
• Get replicas: GFN -> SURLs
• Get best replica?
• Create replica
  • Input: GFN, SURL
  • Specify <meta data> for new GFN
• Remove replica
  • Input: GFN, SURL
• Make / delete directory (recursive)
• Directory listing
  • terse or verbose, optionally more than 1 level deep
  • optionally matching a pattern (regexp?)
• Create / delete link (soft) to another file or directory
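The catalog operations listed above can be sketched as a small in-memory model. This is only an illustration of the GFN -> SURLs mapping, not the JLab service: the class name, method names, and the choice to forget a GFN once its last replica is removed are all assumptions made for the example.

```python
import re


class ReplicaCatalog:
    """Hypothetical in-memory model of a replica catalog (illustration only)."""

    def __init__(self):
        self._replicas = {}   # GFN -> list of SURLs, in registration order
        self._metadata = {}   # GFN -> metadata dict supplied when the GFN is new

    def create_replica(self, gfn, surl, metadata=None):
        """Register surl as a replica of gfn; metadata is stored only for a new GFN."""
        if gfn not in self._replicas:
            self._replicas[gfn] = []
            self._metadata[gfn] = dict(metadata or {})
        if surl not in self._replicas[gfn]:
            self._replicas[gfn].append(surl)

    def get_replicas(self, gfn):
        """Return the SURLs registered for gfn (empty list if the GFN is unknown)."""
        return list(self._replicas.get(gfn, []))

    def remove_replica(self, gfn, surl):
        """Drop one replica. Assumption: a GFN with no replicas left is forgotten."""
        surls = self._replicas.get(gfn, [])
        if surl in surls:
            surls.remove(surl)
        if gfn in self._replicas and not surls:
            del self._replicas[gfn]
            self._metadata.pop(gfn, None)

    def list_directory(self, prefix, pattern=None):
        """List GFNs under prefix, optionally filtered by a regular expression."""
        regex = re.compile(pattern) if pattern else None
        return sorted(
            gfn for gfn in self._replicas
            if gfn.startswith(prefix) and (regex is None or regex.search(gfn))
        )
```

A client would call create_replica once per copy of a file and get_replicas to resolve a GFN before asking a storage service for a transfer URL.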
Components: HRM Listener
This component serves as the link between the grid-unaware HRM and the replica system. The HRM / storage resource generates 2 possible types of events.
• Advice request: proposed deletion of file X. Listener responds with advice as a number in the range of 0.0 (please don’t) to 1.0 (OK). The listener could base this advice upon interaction with the replica catalog to discover if this is the last disk resident copy, for example.
• State change notification: File X is added, or deleted, or cache state is changed. In this case the listener updates the replica catalog.
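A minimal sketch of the two event types follows, assuming a plain dictionary stands in for the replica catalog. The class, the event names ("added", "deleted", "cache_state_changed"), and the state labels ("disk", "tape") are hypothetical. The advice rule is deliberately crude: it returns only the two extremes, 0.0 when the file is the last disk-resident copy and 1.0 otherwise, whereas the slide allows any value in between.

```python
class HrmListener:
    """Hypothetical listener sitting between an HRM and a replica catalog."""

    def __init__(self):
        # Stand-in for the replica catalog: GFN -> {SURL: cache state}
        self.catalog = {}

    def advise_deletion(self, gfn, surl):
        """Return advice in [0.0, 1.0] on deleting the copy at surl."""
        other_disk_copy = any(
            state == "disk"
            for other_surl, state in self.catalog.get(gfn, {}).items()
            if other_surl != surl
        )
        # 0.0 = please don't (last disk-resident copy), 1.0 = OK
        return 1.0 if other_disk_copy else 0.0

    def notify(self, gfn, surl, event, cache_state=None):
        """Apply a state change notification to the catalog stand-in."""
        if event == "added":
            self.catalog.setdefault(gfn, {})[surl] = cache_state or "disk"
        elif event == "deleted":
            self.catalog.get(gfn, {}).pop(surl, None)
        elif event == "cache_state_changed":
            self.catalog.setdefault(gfn, {})[surl] = cache_state
        else:
            raise ValueError(f"unknown event: {event}")
```

A fuller policy could return intermediate values, for example lowering the advice for recently accessed files, but the slides do not specify how the number is computed.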
Components: Replication Service
This component acts as an agent for the client to make replicas and manipulate replica policy.
Web services:
• Copy a replica of GFN / SURL to site X.
• Get status of replication operation.
• Add / edit / remove a local replication policy (push, maybe pull)
To implement a replication policy, it may register as a listener with the HRM.
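The copy and status operations above suggest an asynchronous request queue. The sketch below models that idea in memory only; the class, method names, and status strings ("queued", "done", "failed") are assumptions, and the actual transfer is passed in as a callable rather than implemented.

```python
import itertools


class ReplicationService:
    """Hypothetical request queue for replica copy operations."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._requests = {}  # request id -> request record, oldest first

    def copy_replica(self, gfn, source_surl, target_site):
        """Queue a copy of gfn from source_surl to target_site; return a request id."""
        request_id = next(self._ids)
        self._requests[request_id] = {
            "gfn": gfn,
            "source_surl": source_surl,
            "target_site": target_site,
            "status": "queued",
        }
        return request_id

    def get_status(self, request_id):
        """Return the status string of a previously submitted request."""
        return self._requests[request_id]["status"]

    def run_next(self, transfer):
        """Run the oldest queued request using the supplied transfer callable."""
        for request_id, request in self._requests.items():
            if request["status"] == "queued":
                try:
                    transfer(request["gfn"], request["source_surl"], request["target_site"])
                    request["status"] = "done"
                except Exception:
                    # Any transfer error marks the request as failed in this sketch.
                    request["status"] = "failed"
                return request_id
        return None
```

Separating submission from execution is what lets a client poll for status, and it is one natural place to add the request persistence a real service would need.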
Components: HRM++ Service
HRM Web Services:
• File status (cached, pinned, permanent, size, owner, etc.)
• File status changes (e.g. stage a file, pin a file, make permanent)
• Mapping from SURL to TURL for file get, including protocol negotiation
• Space allocations for put, including protocol negotiation to yield TURL
Extended functions:
• Directory listings, search (like replica catalog)
• Reliable (as much as possible) third party file transfers to/from another Data Grid Site (reliable), or to/from a site with a supported protocol (e.g. ftp site)
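The SURL to TURL mapping with protocol negotiation can be illustrated as follows. This is a sketch under stated assumptions: the function name, the rule that the client's preference order wins, and the example host names are all hypothetical, and a real HRM would also handle staging and pinning before returning a TURL.

```python
from urllib.parse import urlparse


def negotiate_turl(surl, client_protocols, server_endpoints):
    """Map a SURL to a TURL using the first client protocol the server supports.

    client_protocols: protocol names in the client's order of preference.
    server_endpoints: protocol name -> "host:port" served by the storage site.
    """
    path = urlparse(surl).path
    for protocol in client_protocols:
        if protocol in server_endpoints:
            return f"{protocol}://{server_endpoints[protocol]}{path}"
    raise ValueError("no transfer protocol in common")
```

For example, a client offering ["ftp", "http"] to a site that serves only "gsiftp" and "http" would receive an http TURL for the same file path.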
Technologies Employed
• Apache web server
• Tomcat servlet engine
• JAXM for SOAP Messages
• XML data format
Implementation
• Replica Catalog
  • SOAP servlet + MySQL back end
  • (future) global replication policy, client to replication service
• HRM++ Service
  • HRM: SOAP servlet wrapping JASMine
  • Extensions to HRM:
    • reliable file transfer (wrap GridFTP, etc.), queuing
    • directory listings, tree search
• Replication Service
  • SOAP servlet + MySQL for request persistence & queues
  • (future) listener for new files + policy for replication (push)
• HRM Listener
  • SOAP servlet, client to Replica Catalog
Status
• Year-old, limited prototypes using raw XML:
  • Replica catalog
    • Read-only listings, GFN -> SURL
    • Loaded with silo info (>100,000 files)
  • Pre-HRM service
    • Read-only listings, SURL -> TURL (multi-protocol)
• New SOAP components currently in development
  • Replica catalog
    • full capabilities except ACLs and user-defined meta-data (deferred)
  • HRM++ service
    • Recursive file transfer client <-> unmanaged storage (jparss)
    • 3rd party reliable file transfers
WSDL
• Web Services Description Language (equivalent to CORBA IDL)
http://lqcd.jlab.org/grid/gridService_wsdl.xml
Capabilities (prototype)
• Browse contents of file system
  • Managed disk cache on data grid node
  • Unmanaged local or remote file system
  • Tertiary storage (eventually HRM)
• Move files between managed and unmanaged storage
  • Within a single data grid node
  • Between local file system and data grid node
  • 1Q02: Between data grid nodes (3rd party transfer)
• Status: displays whether a file is currently in disk cache
• Migrate from tape to disk (not released)
Standardization Activities
PPDG Activity: JLab is working with the SRB group to standardize web services (WSDL) for managing a data grid
• Common interface for JASMine and SRB
• Web services client to inter-operate between dissimilar back ends
• Extend to additional systems once operational