A Web-Based Data Grid
Chip Watson, Ian Bird, Jie Chen, Ying Chen, Bryan Hess, Andy Kowalski
Thomas Jefferson National Accelerator Facility
Outline
• Overview of a prototype JLAB data grid architecture
• Status of the development
• Expected future milestones
• Lessons learned so far
JLAB Prototype Architecture Summary
• The prototype data grid consists of
  • Web services for information management and control
  • File daemons (like ftpd) for bulk data transfer
  • Back-end services used by the web services
• Communication w/ web services is via HTTP and XML (HTTPS w/ X.509 certificate for privileged operations)
• Communication w/ file daemons is via a daemon-specific protocol
• Communication w/ back-end services is site specific
In picture form…
[Architecture diagram not reproduced. Box labels: Client Program, Agent, Replica Catalog, Data Grid Server, File Server, R C Host, File Host]
Web Services
• Replica Catalog
  • Holds global file namespace
  • May itself be replicated for redundancy or performance
  • For a given file, references the data grid nodes that hold it (but not the physical path)
• Data Grid Server (aka Replica Host)
  • Holds and serves files
  • May be a disk cache; may include tertiary storage
  • Translates global name to URL for retrieval, if cache resident (pull by client)
  • Accepts new files (push by client)
  • Supports queuing of file transfer requests between nodes (3rd party)
  • Supports policy-based file movement
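The split of responsibilities above can be illustrated with a minimal Python sketch. It is not the JLAB implementation (the slides describe Java servlets); the class names, the `example.org` host, and the cache path are all hypothetical. It only shows the idea that the catalog answers "which nodes hold this file?" while each node answers "what URL serves it?".

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Hypothetical model: global file name -> set of node names holding a copy."""
    replicas: dict = field(default_factory=dict)

    def register(self, global_name, node):
        self.replicas.setdefault(global_name, set()).add(node)

    def nodes_for(self, global_name):
        # The catalog returns node names only, never a physical path.
        return sorted(self.replicas.get(global_name, set()))


@dataclass
class DataGridServer:
    """Hypothetical model of one node: global file name -> local cache path."""
    host: str
    cache: dict = field(default_factory=dict)

    def url_for(self, global_name, protocol="http"):
        # Translate a global name to a retrieval URL only if the file is cache resident.
        local_path = self.cache.get(global_name)
        if local_path is None:
            return None
        return f"{protocol}://{self.host}{local_path}"
```

A client would first ask the catalog for candidate nodes, then ask one of those nodes for a URL; a `None` result here stands in for "not cache resident".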
Replica Catalog Components
• Relational database
  • Global directory name, file name, owner, size, etc.
  • Set of Data Grid Nodes holding copies of the file, and the last reported state of each replica (online, offline)
• XML servlet
  • Directory-level services per invocation, returning rich info from the database as an XML document
  • Catalog updates
• HTTP servlet
  • Applies style sheet(s) to the XML document, allowing easy browsing and simple interactions with just a web browser
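As a rough illustration of the database-plus-XML-servlet pairing, the sketch below uses Python's built-in `sqlite3` and `xml.etree.ElementTree`. The table names, column names, and XML element names are assumptions made for this example; the slides do not give the actual schema or document format.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per file, one row per replica of that file.
SCHEMA = """
CREATE TABLE file (
    id    INTEGER PRIMARY KEY,
    dname TEXT NOT NULL,
    fname TEXT NOT NULL,
    owner TEXT,
    size  INTEGER,
    UNIQUE (dname, fname)
);
CREATE TABLE replica (
    file_id INTEGER NOT NULL REFERENCES file(id),
    node    TEXT NOT NULL,
    state   TEXT NOT NULL CHECK (state IN ('online', 'offline'))
);
"""


def directory_listing_xml(conn, dname):
    """Return one directory's files and their replicas as an XML string."""
    root = ET.Element("directory", name=dname)
    files = conn.execute(
        "SELECT id, fname, owner, size FROM file WHERE dname = ? ORDER BY fname",
        (dname,),
    ).fetchall()
    for file_id, fname, owner, size in files:
        attrs = {"name": fname}
        # Owner and size may be missing, as in the prototype's initial load.
        if owner is not None:
            attrs["owner"] = owner
        if size is not None:
            attrs["size"] = str(size)
        file_el = ET.SubElement(root, "file", attrs)
        replicas = conn.execute(
            "SELECT node, state FROM replica WHERE file_id = ? ORDER BY node",
            (file_id,),
        )
        for node, state in replicas:
            ET.SubElement(file_el, "replica", node=node, state=state)
    return ET.tostring(root, encoding="unicode")
```

A browser-facing layer would then apply a style sheet to this document rather than query the database itself.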
Current Status of Replica Catalog
• A prototype exists with the following functionality
  • Database populated with ALL files from the Jefferson Lab silo (no owner, group, or file size info loaded for now)
  • XML servlet for browsing
  • HTTP servlet for browsing: http://129.57.41.138/servlet/dg.HttpReplicaCatalog?dname=/
• Missing functionality in this prototype
  • Authentication: easy, already done for another (batch system) prototype
  • Edit catalog: in principle easy, just need to finalize scenarios
  • Extensible file properties: moderately easy, just need to add a name-value table to the db and expand the XML document for a single file to include this info
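The name-value table mentioned for extensible file properties could look roughly like the following Python sketch. The table name `file_property`, the `<property>` element, and the sample property names are illustrative assumptions, not part of the prototype.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_property_table(conn):
    # Hypothetical name-value table: any number of properties per file.
    conn.execute(
        """CREATE TABLE file_property (
               file_id INTEGER NOT NULL,
               name    TEXT NOT NULL,
               value   TEXT NOT NULL,
               PRIMARY KEY (file_id, name)
           )"""
    )


def set_property(conn, file_id, name, value):
    # Values are stored as text; setting an existing name overwrites it.
    conn.execute(
        "INSERT OR REPLACE INTO file_property (file_id, name, value) VALUES (?, ?, ?)",
        (file_id, name, str(value)),
    )


def file_xml(conn, file_id, fname):
    """Return the XML document for a single file, including its properties."""
    file_el = ET.Element("file", name=fname)
    rows = conn.execute(
        "SELECT name, value FROM file_property WHERE file_id = ? ORDER BY name",
        (file_id,),
    )
    for name, value in rows:
        prop = ET.SubElement(file_el, "property", name=name)
        prop.text = value
    return ET.tostring(file_el, encoding="unicode")
```

Because properties live in their own table, adding a new kind of metadata needs no schema change, which is presumably why the slide rates this as only moderately hard.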
Status (cont.)
• Observations
  • Web browsing into directories w/ thousands of files is slow (produces an ENORMOUS web page), but works
  • Plan to segment the listing into pages, with a “Next Page” link
  • Probably need to allow the client to specify the number of files to retrieve, and the offset for the next retrieval
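The count-and-offset idea can be sketched in a few lines of Python. The function name and the default page size of 100 are assumptions for illustration; the slides do not specify either.

```python
def page_of_files(names, offset=0, count=100):
    """Return one page of a sorted directory listing plus the next offset.

    The next offset is None when there are no further pages, which is the
    signal a web front end could use to omit the "Next Page" link.
    """
    if offset < 0 or count < 1:
        raise ValueError("offset must be >= 0 and count must be >= 1")
    ordered = sorted(names)
    page = ordered[offset:offset + count]
    next_offset = offset + count if offset + count < len(ordered) else None
    return page, next_offset
```

In a database-backed catalog the same effect would normally come from limiting the query itself, so that the full listing is never built in memory.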
Data Grid Node Components
• XML (and HTTP) servlets
  • File Catalog Servlet (Replica Host)
    • Translates file I/O requests to a specific URL (including protocol negotiation or selection)
    • Provides offline / online status of file
  • Transfer Request Servlet
    • Queues file transfer requests, reports status
    • Edits transfer policy for specified directory
  • Disk Cache Manager Servlet
    • Edits policy of disk cache manager
• File Server(s)
  • ftp, bbftp, gridftp, …
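One simple way to read "protocol negotiation or selection" is: the node keeps an ordered preference list, the client states what it supports, and the first match wins. The Python sketch below shows that reading only; the slides do not describe the actual negotiation, and using the protocol name as the URL scheme (for example `bbftp://`) is an assumption of this example.

```python
def select_protocol(server_protocols, client_protocols):
    """Pick the first server-preferred protocol that the client also supports."""
    client = {p.lower() for p in client_protocols}
    for protocol in server_protocols:
        if protocol.lower() in client:
            return protocol
    return None


def transfer_url(host, path, server_protocols, client_protocols):
    """Build a retrieval URL for a cached file using the negotiated protocol."""
    protocol = select_protocol(server_protocols, client_protocols)
    if protocol is None:
        raise LookupError("no file transfer protocol in common")
    return f"{protocol}://{host}{path}"
```

Keeping the preference order on the server side lets a site favor its bulk-transfer daemon while still falling back to plain http for simple clients.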
Data Grid Server Components (Implementation)
• Disk Cache Manager (back end service)
  • Java application
  • Manages disk pool -- NFS mounted read-only to local users
  • SQL database to track cached files, pending transfers
  • Migrates files to / from tape (if requested and if it has a reference to a Tape Manager)
  • Interacts with a Disk Policy Agent (planned)
• Tape Manager (back end service)
  • Separate Java application & db (running on a different host)
  • Stages files to or from silo (has its own small disk cache)
  • NFS exports stub file system
Data Grid Node Components (Implementation)
• Disk Policy Agent (back end service)
  • Runs in Disk Cache Manager’s VM
  • Keeps replica catalog up to date
  • Advises cache manager as to which files to delete (deleting the last globally disk-resident copy is expensive)
  • Propagates transfer policy from Replica Catalog
• Grid Transfer Agent (back end service)
  • Operates on queued transfer requests
  • Uses remote File Servers (e.g. is or spawns an xxftp client)
  • Runs (probably) in Disk Cache Manager’s VM
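The policy agent's deletion advice can be pictured as a reordering step: the cache manager proposes candidates in its own order, and the agent pushes back any file whose only disk-resident copy is the local one. The Python sketch below is one possible reading under that assumption; the function name and the shape of the `online_copies` input are hypothetical.

```python
def deletion_order(candidates, online_copies):
    """Reorder eviction candidates so last-copy files are deleted last.

    candidates:    global file names in the cache manager's preferred
                   eviction order (for example, least recently used first).
    online_copies: global file name -> number of nodes the replica catalog
                   reports as holding an online copy, this node included.
                   Files missing from the mapping are treated as last copies.
    """
    def is_last_copy(name):
        return online_copies.get(name, 1) <= 1

    # sorted() is stable, so the original order is kept within each group:
    # files with other online copies first, last-copy files afterwards.
    return sorted(candidates, key=is_last_copy)
```

Treating unknown files as last copies is the conservative choice here, since the catalog is not yet kept in sync automatically.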
Current Status of Data Grid Node
• Data Grid Servlets
  • Translation from global name to URL is hard coded
  • Supports browsing of disk cache
  • Newest prototype allows browsing of the unmanaged node-local file system, including /home, /data, …, and copying of files within a single data node (authentication to be added soon)
• File Servers
  • bbftp in production use at Jlab; waiting for gridftp
Back End Status
• Disk Cache Manager
  • Simple LRU policy (pluggable), no user quotas
  • No use of policy agent yet (to sync with replica catalog)
  • Automatic migration of specified files to tape is guaranteed before deletion
  • Only 1 node operating in this mode (variant of other disk cache managers at Jlab)
• Tape Manager
  • Fully operational, in production use at Jlab
• File Transfer Agent
  • Just starting development
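To make the LRU-plus-migration behavior concrete, here is a small Python sketch built on `collections.OrderedDict`. It is an assumption-laden illustration, not the Java cache manager: the class name, the byte-based capacity, and the `migrate_to_tape` callback (standing in for a call to the Tape Manager) are all hypothetical, and the eviction policy is fixed rather than pluggable.

```python
from collections import OrderedDict


class LruDiskCache:
    """Least-recently-used eviction with tape migration before deletion."""

    def __init__(self, capacity_bytes, migrate_to_tape):
        self.capacity_bytes = capacity_bytes
        self.migrate_to_tape = migrate_to_tape  # hypothetical hook: callable(name)
        self.files = OrderedDict()  # name -> (size, must_migrate), LRU first
        self.used_bytes = 0

    def touch(self, name):
        # Record an access: the file becomes the most recently used.
        self.files.move_to_end(name)

    def add(self, name, size, must_migrate=False):
        """Add a file, evicting LRU files as needed; return the evicted names."""
        if size > self.capacity_bytes:
            raise ValueError("file is larger than the whole cache")
        if name in self.files:
            # Replacing an existing entry: release its space first.
            self.used_bytes -= self.files.pop(name)[0]
        evicted = []
        while self.used_bytes + size > self.capacity_bytes:
            victim, (victim_size, victim_migrate) = self.files.popitem(last=False)
            if victim_migrate:
                # Files marked for migration go to tape before they are deleted.
                self.migrate_to_tape(victim)
            self.used_bytes -= victim_size
            evicted.append(victim)
        self.files[name] = (size, must_migrate)
        self.used_bytes += size
        return evicted
```

A policy agent, once available, could replace the plain LRU order by choosing victims itself instead of always taking the oldest entry.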
Status Summary
• Missing Functionality
  • A lot!
  • Transfer queuing
  • Advance reservation & quotas
  • Policy-based operations
  • Automatic updates of replica catalog
All of these are planned or in progress…
Data Grid Applications: File Manager
• File Manager Design
  • Uses Replica Catalog (XML)
  • Uses Data Grid Node (XML)
  • GUI to browse files
  • GUI to copy files (and view queues)
• Status
  • XML communications and file GUI done
  • 3rd party transfer operations awaiting additional functionality in the data grid node
  • Currently a standalone application, but plan to make it into an applet
Deployment / Development
• 2Q 01
  • 2 data grid servers running at Jlab & MIT for LQCD
  • Grid browsing (replica catalog and data grid server)
  • Retrieve file: http, bbftp & gridftp
  • Command line utility and web interface to “publish” a file (insert into grid node from co-located machine / local file system)
• 3Q 01
  • 2nd grid running between Jlab & FSU for CLAS (Hall D prototype)
  • “Push” file into a data grid server from offsite
  • 3rd party file transfers on demand (queued)
• 1Q 02
  • Policy-based file migration
  • Asynchronous event notification (HTTP based)