Applying Data Grids to Support Distributed Data Management
Storage Resource Broker
Reagan W. Moore, Ian Fisk, Bing Zhu
University of California, San Diego
moore@sdsc.edu
http://www.npaci.edu/DICE/
Data Management Systems
- Data sharing - data grids
  - Federation across administration domains
  - Latency management
  - Sustained data transfers
- Data publication - digital libraries
  - Discovery
  - Organization
- Data preservation - persistent archives
  - Technology management
  - Authenticity
Consistent Data Environments
- The Storage Resource Broker combines the functionality of data grids, digital libraries, and persistent archives within a single data environment.
- SRB provides:
  - Metadata consistency
  - Latency management functions
  - Technology evolution management
Metadata Consistency
- The Storage Resource Broker uses a logical name space to assign global identifiers to digital entities:
  - files, SQL command strings, database tables, URLs
- State information that characterizes the result of operations on the digital entities is mapped onto the logical name space.
- Consistency of the state information is managed as update constraints on the mapping:
  - write locks, synchronization flags, schema extension
- SRB state information is managed in the MCAT metadata catalog. (A small sketch of the logical-to-physical mapping idea follows this slide.)
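To make the logical-name-space idea concrete, here is a minimal sketch assuming a toy in-memory catalog: a logical identifier maps to a set of physical replicas plus state information such as a write lock and per-replica synchronization flags. The class and function names are hypothetical illustrations, not the SRB/MCAT implementation.

```python
# Toy stand-in for a metadata catalog: logical names map to physical replicas
# plus state information. Illustrative only; not the SRB/MCAT data model.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Replica:
    resource: str           # storage system name, e.g. "hpss-sdsc" (hypothetical)
    physical_path: str      # path understood by that storage system
    is_dirty: bool = False  # synchronization flag: replica needs an update

@dataclass
class LogicalEntry:
    logical_name: str                    # global identifier in the logical name space
    replicas: List[Replica] = field(default_factory=list)
    write_locked: bool = False           # update constraint on the mapping

catalog: Dict[str, LogicalEntry] = {}    # stand-in for the MCAT catalog

def register(logical_name: str, resource: str, physical_path: str) -> None:
    """Map one more physical replica onto the logical identifier."""
    entry = catalog.setdefault(logical_name, LogicalEntry(logical_name))
    entry.replicas.append(Replica(resource, physical_path))

def write(logical_name: str, target_resource: str) -> None:
    """Update one replica; flag the others as out of sync until refreshed."""
    entry = catalog[logical_name]
    if entry.write_locked:
        raise RuntimeError(f"{logical_name} is write-locked")
    entry.write_locked = True
    try:
        for replica in entry.replicas:
            replica.is_dirty = (replica.resource != target_resource)
    finally:
        entry.write_locked = False
```

The point of the sketch is that clients operate on the logical name, while the catalog enforces the update constraints on the mapping to the physical copies.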
SRB Latency Management
[Diagram: data flowing from a source, across the network, to a destination, annotated with the latency-management mechanisms listed below.]
- Remote proxies, staging
- Data aggregation (containers)
- Prefetch
- Caching
- Client-initiated I/O, streaming
- Parallel I/O
- Replication
- Server-initiated I/O
SRB 2.0 - Parallel I/O
- Client-directed parallel I/O (client/server)
  - Thread-safe client
  - The client decides the number of threads to use
  - Each thread is responsible for one data segment and connects to the server independently
  - Utilities: srbpput and srbpget
- Sustains 80% to 90% of available bandwidth using 4 parallel I/O streams and a window size of 800 kBytes
(A minimal sketch of the segment-per-thread approach follows this slide.)
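The segment-per-thread decomposition behind srbpput/srbpget can be sketched as follows. This is only an illustration under simplifying assumptions: local file copies stand in for the independent server connections each thread would open, and the 800 kByte read size loosely mirrors the window size quoted above.

```python
# Minimal sketch of client-directed parallel I/O: the client splits the file
# into one segment per thread and each thread transfers its own segment.
# Local file copies stand in for the per-thread server connections that the
# real srbpput/srbpget utilities would open; this is NOT the SRB client code.
import os
import threading

def copy_segment(src_path, dst_path, offset, length):
    """Transfer one segment; in SRB each such worker has its own connection."""
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        dst.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = src.read(min(remaining, 800 * 1024))  # ~800 kByte window
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)

def parallel_get(src_path, dst_path, num_threads=4):
    size = os.path.getsize(src_path)
    # Pre-size the destination so every thread can write at its own offset.
    with open(dst_path, "wb") as dst:
        dst.truncate(size)
    segment = (size + num_threads - 1) // num_threads
    threads = []
    for i in range(num_threads):
        offset = i * segment
        length = min(segment, size - offset)
        if length <= 0:
            break
        t = threading.Thread(target=copy_segment,
                             args=(src_path, dst_path, offset, length))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
```

The benefit on a real wide-area link comes from each stream keeping its own TCP window full, which is why a handful of streams recovers most of the available bandwidth.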
SRB 2.0 - Parallel I/O (cont. 1)
- Server-directed parallel I/O (client/server)
  - The server plans the transfer and decides the number of threads to use
  - Separate "control" and "data transfer" sockets
  - The client listens on the "control" socket and spawns threads to handle the data transfer
  - Always a one-hop data transfer between client and server
  - Similar to HPSS; works seamlessly with the HPSS Mover protocol
  - Also works for other file systems
(A toy sketch of the control/data-socket split follows this slide.)
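A rough, self-contained sketch of the control/data-socket split: a toy in-process "server" decides the stream count, sends its plan over the control connection, and the client spawns one thread per data stream. The message format, port handling, and connection directions are simplifications for illustration only; they are not the SRB or HPSS Mover wire protocol.

```python
# Toy demo of server-directed parallel I/O with separate control and data
# sockets, all on localhost in one process. Illustrative protocol only.
import json
import socket
import threading

PAYLOAD = b"x" * (1 << 20)  # 1 MiB of demo data

def toy_server(control_port: int, num_streams: int = 4) -> None:
    # Data socket: the server listens here and serves one segment per connection.
    data_srv = socket.socket()
    data_srv.bind(("127.0.0.1", 0))
    data_srv.listen(num_streams)
    data_port = data_srv.getsockname()[1]

    # Control socket: the server decides the plan and sends it to the client.
    ctrl = socket.create_connection(("127.0.0.1", control_port))
    plan = {"data_port": data_port, "streams": num_streams, "size": len(PAYLOAD)}
    ctrl.sendall(json.dumps(plan).encode() + b"\n")

    for _ in range(num_streams):
        conn, _ = data_srv.accept()
        offset, length = json.loads(conn.makefile().readline())
        conn.sendall(PAYLOAD[offset:offset + length])
        conn.close()
    ctrl.close()
    data_srv.close()

def parallel_get() -> bytes:
    # The client listens on the control socket and waits for the server's plan.
    ctrl_srv = socket.socket()
    ctrl_srv.bind(("127.0.0.1", 0))
    ctrl_srv.listen(1)
    threading.Thread(target=toy_server,
                     args=(ctrl_srv.getsockname()[1],), daemon=True).start()

    ctrl, _ = ctrl_srv.accept()
    plan = json.loads(ctrl.makefile().readline())
    size, streams = plan["size"], plan["streams"]
    segment = (size + streams - 1) // streams
    result = bytearray(size)

    def fetch(offset: int, length: int) -> None:
        # One thread per data stream; a single hop between client and server.
        s = socket.create_connection(("127.0.0.1", plan["data_port"]))
        s.sendall(json.dumps([offset, length]).encode() + b"\n")
        view, got = memoryview(result)[offset:offset + length], 0
        while got < length:
            n = s.recv_into(view[got:])
            if n == 0:
                break
            got += n
        s.close()

    threads = [threading.Thread(target=fetch,
               args=(i * segment, min(segment, size - i * segment)))
               for i in range(streams)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(result)

if __name__ == "__main__":
    assert parallel_get() == PAYLOAD
```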
SRB 2.0 - Parallel I/O (cont. 2)
- Parallel I/O, server/server
  - Copy, replicate, and staging operations
  - Always used in third-party transfer operations
  - Server-to-server data transfer; the client is not involved
  - Uses up to 4 threads, depending on file size (a sketch of such a policy follows this slide)
  - 7-10 times improvement for large files across the country
  - Up to 39 MB/sec across campus (PC RAID disk, gigabit Ethernet)
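A file-size-based stream-count policy could look like the sketch below; the 32 MB-per-stream threshold is an invented illustration, not SRB's actual heuristic.

```python
# Hypothetical policy: pick the number of parallel streams from the file size,
# capped at 4. The per-stream threshold is illustrative, not SRB's value.
def streams_for(file_size_bytes: int, max_streams: int = 4) -> int:
    per_stream = 32 * 1024 * 1024  # assume one stream per ~32 MB of data
    return max(1, min(max_streams, file_size_bytes // per_stream))
```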
Federated SRB Server Model - Peer-to-Peer Brokering
[Diagram: an application issues a read by logical name or attribute condition to an SRB server; the SRB agent consults the MCAT for (1) logical-to-physical mapping, (2) identification of replicas, and (3) access and audit control, then spawns or contacts the server(s) holding replicas R1 and R2 and returns parallel data access to the application.]
SRB 2.0 - Bulk Operations
- Uploading and downloading large numbers of small files
  - Multi-threaded
  - Bulk registration: 500 files in one call
  - Fill an 8 MB buffer before sending
  - Use of containers
  - New Sbload and Sbunload utilities
- Over 100 file registrations per second
- 3-10+ times speedup
(A sketch of the buffer-aggregation idea follows this slide.)
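The aggregation idea can be sketched as below: pack many small files into an 8 MB buffer and hand off each full buffer as a single bulk call, rather than issuing one request per file. The send() function is a hypothetical stand-in for a bulk SRB transfer/registration call, not the Sbload implementation.

```python
# Illustrative sketch of bulk upload: aggregate small files into an 8 MB
# buffer and flush each full buffer as one call. Not the Sbload utility.
BUFFER_LIMIT = 8 * 1024 * 1024  # 8 MB aggregation buffer

def send(batch):
    """Placeholder for one bulk transfer/registration call."""
    print(f"sending {len(batch)} files, {sum(len(d) for _, d in batch)} bytes")

def bulk_upload(paths):
    batch, batch_bytes = [], 0
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        if batch and batch_bytes + len(data) > BUFFER_LIMIT:
            send(batch)                 # flush the filled buffer as one call
            batch, batch_bytes = [], 0
        batch.append((path, data))
        batch_bytes += len(data)
    if batch:
        send(batch)                     # flush the final partial buffer
```

Amortizing per-file round trips over one buffered call is what makes registration rates in the hundreds of files per second plausible.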
Technology Management
[Architecture diagram: applications access the SDSC Storage Resource Broker & Metadata Catalog, which brokers across heterogeneous catalogs and storage systems.]
- Access APIs: C, C++ libraries; Unix shell; Linux I/O; DLL / Python; Java; NT browsers; OAI; WSDL; GridFTP
- Core services: consistency management / authorization-authentication, prime server, logical name space, latency management, data transport, metadata transport
- Catalog abstraction: databases (DB2, Oracle, Sybase, SQLServer)
- Storage abstraction: file systems (Unix, NT, Mac OS X); archives (HPSS, ADSM, UniTree, DMF); databases (DB2, Oracle, Postgres); servers (HRM)
SRB Archival Tape Library System
- An SRB archival storage system, in addition to HPSS, UniTree, and ADSM
- A distributed pool of disk caches as the front end
- A tape library system as the back end
  - STK silo for tape storage and tape mounts
  - 3590 tape drives
- I/O is always performed on the disk cache; data is always staged to cache first
(A sketch of this cache-in-front-of-tape policy follows this slide.)
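Here is a minimal sketch of the cache-in-front-of-tape policy, assuming hypothetical cache and tape directories and a stand-in staging function; the real system would drive tape mounts through the library rather than a local file copy.

```python
# Sketch of the policy "I/O always on disk cache, always stage to cache".
# CACHE_DIR, TAPE_DIR, and stage_from_tape() are hypothetical stand-ins.
import os
import shutil

CACHE_DIR = "/srb/cache"   # hypothetical disk-cache location
TAPE_DIR = "/srb/tape"     # stand-in for the tape library back end

def stage_from_tape(name: str) -> str:
    """Stand-in for a tape mount plus copy into the disk cache."""
    cached = os.path.join(CACHE_DIR, name)
    shutil.copyfile(os.path.join(TAPE_DIR, name), cached)
    return cached

def open_for_read(name: str):
    cached = os.path.join(CACHE_DIR, name)
    if not os.path.exists(cached):   # cache miss: stage to cache first
        cached = stage_from_tape(name)
    return open(cached, "rb")        # I/O is always against the cache copy
```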
CMS Experiment
- Ian Fisk - user-level application
- Installed SRB servers at CERN, Fermilab, and UCSD under a user account
- Remotely invoked data replication
  - From UCSD, invoked data replication from CERN to Fermilab and to UCSD
  - Data transfers automatically used four parallel I/O streams with the default 800 kByte window size
- Observed
  - Sustained data transfer at 80% to 90% of available bandwidth
  - Transferred over 1 TB of data per day using multiple sessions
Future Plans
- SRB 2.1 - grid-oriented features, SRB-G (5/31/03)
  - Add a GridFTP driver - access data through GridFTP servers
  - Upgrade to GSI 2.2 (GSI 1.1 in the current version)
  - Provide an encrypted data transfer facility, using GSI encryption, between servers and between server and client
  - Explore network encryption as a digital-entity property
  - WSDL services interface for SRB, including data movement, replication, access control, metadata ingestion and retrieval, and container support
- SRB 2.2 - federated MCATs (8/30/03)
  - Peer-to-peer MCATs
  - Mount-point-like interface: /sdsc/…, /caltech/…
Next CMS Experiments
- Sustained transfer
  - Use a 4 MB window size
- Bulk data registration
  - In tests with the DOE ASCI project, sustained registration of 400 files per second
- Peer-to-peer federation
  - Prototype of the ability to initiate data and metadata exchanges between MCAT catalogs
For More Information
Reagan W. Moore
San Diego Supercomputer Center
moore@sdsc.edu
http://www.npaci.edu/DICE
http://www.npaci.edu/DICE/SRB/index.html
http://www.npaci.edu/dice/srb/mySRB/mySRB.html