This presentation explores on-demand grid storage built by harnessing idle storage space on user desktop workstations. It discusses the benefits, challenges, and design choices involved in implementing this approach.
On-demand Grid Storage Using Scavenging
Sudharshan Vazhkudai
Network and Cluster Computing, CSMD, Oak Ridge National Laboratory
http://www.csm.ornl.gov/~vazhkuda
vazhkudaiss@ornl.gov
Acknowledgments: ORNL Collaborators: Dr. Xiaosong Ma and Dr. Vincent Freeh (NCSU)
PDPTA, June 21st, 2004
Outline
• Grid Storage Fabric Background
• The Evolving Computing Landscape—An Analogy
• Storage Scavenging of User Desktop Workstations
• Use Cases
• Related Work and Design Choices
• Architecture
  • Storage Layer
  • Management Layer
• Current Status
Grid Storage Fabric Background
• Scientific discoveries are driven by analyses of massively distributed, bulk data.
• Proliferation of high-end mass storage systems, SANs, and datacenters
  • Providers such as IBM, HP, Panasas, etc.
• Merits:
  • Excellent price/performance ratio
  • Good storage speeds and access control
  • Support for intelligent parallel file systems
  • Optimized for wide-area, bulk transfers
  • Reliability!
  • Successfully demonstrated in production Grids: DOE Science Grid, Earth System Grid, TeraGrid, etc.
• Drawbacks:
  • Increasing deployment/maintenance/administrative costs
  • Specialized software and central points of failure
  • Costs and specialized features prohibit wider adoption, limiting these systems to a select few research labs and organizations
  • The aforementioned production Grids span hardly half a dozen sites!
Meta-Message: If grids are to become prevalent and grow beyond the confines of a few organizations, exploiting commodity fabric features is absolutely essential!
The Evolving HPC Landscape
[Diagram: two parallel timelines. Computing: tightly coupled supercomputers gave way to loosely coupled, Beowulf-style clusters, and now to aggregating idle CPU cycles from commodity PCs. Storage: tightly coupled datacenters and RAID-like aggregation point toward aggregating idle storage space from commodity PCs. Axes of concern: volatility, trust, performance.]
• We have a commodity computing fabric for the Grid; what about a storage fabric?
Meta-Message: Proprietary systems are being replaced with commodity clusters, delivering new levels of performance and availability at dramatically lower price points.
Storage Scavenging of User Desktop Workstations
• Harnessing the collective storage potential of individual workstations ~ harnessing idle CPU cycles
• Why storage scavenging is viable:
  • Gigabytes of storage are increasingly affordable
  • The ratio of used space to available storage is typically low
  • Increasing numbers of workstations are online most of the time
  • Even a modest contribution (contribution << available space) can amass a staggering collective aggregate of storage (see the back-of-the-envelope sketch below)
• Concerns:
  • Vagaries of volatility…
  • Question of trust: datasets reside on arbitrary user workstations
  • Performance of such aggregate storage
Meta-Message: Despite the high maintenance and administrative costs, a factor that attracts the Grid community to high-end storage and datacenters is their ability to deliver sustained high throughput for data operations.
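To make the aggregation claim concrete, here is a back-of-the-envelope sketch. The workstation count, per-machine contribution, availability, and replication factor are hypothetical illustrative assumptions, not figures from the talk.

```python
# Back-of-the-envelope estimate of scavenged aggregate storage.
# All inputs below are hypothetical, illustrative assumptions.

num_workstations = 1000   # desktop PCs on a campus LAN (assumed)
contribution_gb = 10      # modest donation per machine, in GB (assumed)
availability = 0.8        # fraction of machines online at any moment (assumed)
replication_factor = 3    # copies kept per morsel to mask volatility (assumed)

raw_gb = num_workstations * contribution_gb
online_gb = raw_gb * availability
usable_gb = online_gb / replication_factor  # capacity after replication overhead

print(f"Raw donated capacity : {raw_gb / 1024:.1f} TB")
print(f"Online at any moment : {online_gb / 1024:.1f} TB")
print(f"Usable after {replication_factor}x replication: {usable_gb / 1024:.1f} TB")
```

Even with these modest assumptions, a thousand desktops yield roughly 10 TB of raw donated space, and a few usable terabytes after accounting for downtime and replication.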
Use Cases
• Storage cloud as a:
  • Cache
  • Intermediate hop
  • Local, client-side scratch
  • Grid replica
• RAS for Terascale Supercomputers
Related Work and Design Choices
• Related Work:
  • Network/Distributed File Systems (NFS, LOCUS)
  • Parallel File Systems (PVFS, XFS)
  • Serverless File Systems (FARSITE, xFS, GFS)
  • Peer-to-Peer Storage (OceanStore, PAST, CFS)
  • Grid Storage Services (LegionFS, SRB, IBP, SRM, GASS)
• Design Choices & Assumptions:
  • Scalability: O(100) or O(1000) nodes
  • Commodity components: quality & quantity
  • User autonomy
  • Well connected & secure
  • Heterogeneity
  • Large, "write once, read many" datasets
  • Transparent
  • Grid aware
Architecture
[Diagram: Grid data access tools sit atop a Management Layer (data placement, replication, Grid awareness, metadata management), which oversees a Storage Layer of benefactor pools (Pool A … Pool n, Pool m) providing morsel access, data integrity, and non-invasiveness. Pools register with the manager; benefactors register with pools.]
Meta-Message: Imagine "Condor" for storage.
Storage Layer
[Diagram: morsels of File 1 (morsels 1, 2, 3) and File n (morsels 1a, 2a, 3a, 4a) scattered and replicated across benefactor workstations.]
• Benefactors:
  • Morsels as the unit of contribution
  • Basic morsel operations as RPC services [new(), free(), get(), put()…] (see the sketch after this slide)
  • Space reclaim:
    • User withdrawal
    • Which morsels to relocate/evict?
    • Which benefactor workstations to relocate to?
  • Data integrity through checksums
  • Performance traces
• Pools:
  • Benefactor registrations (soft state)
  • Dataset distributions
  • Metadata
  • Selection heuristics
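The slide names the benefactor-side morsel operations only as RPC signatures. Below is a minimal local sketch of what such a service might keep per morsel; the class, the fixed morsel size, and MD5 checksumming are illustrative assumptions, not the system's actual implementation.

```python
import hashlib
import uuid

MORSEL_SIZE = 1 << 20  # 1 MiB unit of contribution (assumed size)

class BenefactorStore:
    """Illustrative in-memory stand-in for a benefactor's morsel service.

    The real system exposes new()/free()/get()/put() as RPCs; this sketch
    models only their local effects plus checksum-based integrity.
    """

    def __init__(self, donated_bytes):
        self.capacity = donated_bytes // MORSEL_SIZE  # morsels donated
        self.morsels = {}    # morsel_id -> bytes
        self.checksums = {}  # morsel_id -> hex digest

    def new(self):
        """Allocate an empty morsel; return its id, or None if full."""
        if len(self.morsels) >= self.capacity:
            return None
        mid = uuid.uuid4().hex
        self.morsels[mid] = b""
        return mid

    def put(self, mid, data):
        """Write morsel contents and record a checksum for later audits."""
        assert len(data) <= MORSEL_SIZE
        self.morsels[mid] = data
        self.checksums[mid] = hashlib.md5(data).hexdigest()

    def get(self, mid):
        """Read a morsel, verifying integrity before returning it."""
        data = self.morsels[mid]
        if hashlib.md5(data).hexdigest() != self.checksums.get(mid):
            raise IOError(f"checksum mismatch for morsel {mid}")
        return data

    def free(self, mid):
        """Reclaim a morsel (e.g., on user withdrawal of donated space)."""
        self.morsels.pop(mid, None)
        self.checksums.pop(mid, None)
```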
Management Layer
• Manager:
  • Pool registrations
  • Metadata: datasets-to-pools, pools-to-benefactors, etc.
• Availability:
  • Redundant Array of Replicated Morsels
  • Minimum replication factor for morsels
  • Where to replicate?
  • Which morsel replica to serve in response to user file fetches? (a sketch of one possible heuristic follows this slide)
• Grid Awareness:
  • Information providers
  • Space reservations
  • Transfer-protocol agnostic
• Transparent Access:
  • Namespace
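The slide poses replica placement and replica selection as open questions. The greedy heuristic below (spread replicas across the benefactors with the most free space, serve reads from the live holder with the best observed bandwidth) is one plausible answer sketched for illustration; it is not claimed to be the system's algorithm, and the stats dictionary is a hypothetical view a manager might build from registrations and performance traces.

```python
def place_replicas(morsel_id, benefactors, k=3):
    """Pick k distinct benefactors to hold a morsel's replicas.

    Greedy illustrative policy: prefer benefactors with the most free
    morsel slots so load stays balanced across the pool.
    """
    ranked = sorted(benefactors,
                    key=lambda b: benefactors[b]["free_slots"],
                    reverse=True)
    return ranked[:k]

def choose_replica(replica_holders, benefactors):
    """Serve a fetch from the live holder with the best observed bandwidth."""
    live = [b for b in replica_holders if benefactors[b]["online"]]
    if not live:
        raise RuntimeError("all replicas unavailable")
    return max(live, key=lambda b: benefactors[b]["bandwidth_mbps"])

# Hypothetical manager-side view of three benefactor workstations:
benefactors = {
    "pc-a": {"free_slots": 500, "bandwidth_mbps": 90, "online": True},
    "pc-b": {"free_slots": 200, "bandwidth_mbps": 40, "online": False},
    "pc-c": {"free_slots": 800, "bandwidth_mbps": 70, "online": True},
}
holders = place_replicas("m1", benefactors, k=2)  # -> ['pc-c', 'pc-a']
print(choose_replica(holders, benefactors))       # -> 'pc-a'
```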
Current Status
• rpc (A) services:
  • Create/delete files
  • Reserve…
• rpc (B) services:
  • File fetches
  • Hints…
• rpc (C) services:
  • Control
  • Dataset distributions
  • Benefactor alerts, warnings, alarms to the manager
  • …
• rpc (D) services:
  • Morsel relocations
  • Status info
  • Load balancing
  • …
• rpc (E) services:
  • Morsel relocations to different pools
  • Under direction of the manager
  • …
[Diagram, reconstructed from the slide's figure: an Application (fronted by an ftp/GridFTP proxy) invokes reserve()/cancel(), store(), retrieve(), and delete(), reaching the Manager via rpc (A) and benefactors via rpc (B); store() expands to open(); benefactorID.put() and retrieve() to open(); benefactorID.get(). Benefactors expose new()/free()/get()/put() over their host OS, report to the Manager via rpc (C), and exchange morsels via rpc (D) within a pool and rpc (E) across pools.]
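The slide's call expansions (store() becomes open(); benefactorID.put(), and retrieve() becomes open(); benefactorID.get()) suggest a client-side flow like the sketch below. The manager and per-benefactor stubs, and every method name beyond those on the slide (open, allocate, morsel_map, close), are hypothetical placeholders for the rpc (A)/(B) services.

```python
# Illustrative client-side flow over the rpc (A)/(B) services.
# `manager` and the per-benefactor stubs are hypothetical RPC proxies;
# only reserve/store/retrieve/delete and new/put/get come from the slide.

def store(manager, benefactors, path, data, morsel_size=1 << 20):
    """store(): open(); benefactorID.put() -- scatter a file as morsels."""
    handle = manager.open(path, mode="w")       # rpc (A): create the file
    for offset in range(0, len(data), morsel_size):
        chunk = data[offset:offset + morsel_size]
        bid, mid = manager.allocate(handle)     # manager picks a benefactor
        benefactors[bid].put(mid, chunk)        # rpc (B): push the morsel
    manager.close(handle)

def retrieve(manager, benefactors, path):
    """retrieve(): open(); benefactorID.get() -- gather morsels back."""
    handle = manager.open(path, mode="r")       # rpc (A): look up metadata
    chunks = []
    for bid, mid in manager.morsel_map(handle): # ordered (benefactor, morsel)
        chunks.append(benefactors[bid].get(mid))  # rpc (B): fetch each morsel
    manager.close(handle)
    return b"".join(chunks)
```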
Philosophical Musings…
• It's all about commoditizing…
  • Quality
  • Trust
  • Performance
• What scavenged storage "is not":
  • Not a replacement for high-end storage
• What it "is":
  • A low-cost, fault-tolerant alternative to be used in conjunction with high-end storage
Further Information
• My Website:
  • http://www.csm.ornl.gov/~vazhkuda