OSD: Storage Substrate for the Enterprise and … the Grid. Feng Wang Department of Computer Science University of Minnesota. Requirements for the Enterprise and the Grid. To provide a scalable, ubiquitous, robust storage infrastructure … data objects must be replicated and migrated
OSD: Storage Substrate for the Enterprise and … the Grid Feng Wang Department of Computer Science University of Minnesota
Requirements for the Enterprise and the Grid • To provide a scalable, ubiquitous, robust storage infrastructure … • data objects must be replicated and migrated • replication: increased availability, greater performance • migration: lower latency, always available • Research Issues • When and where to create or delete replicas? • When and where to migrate? • How to select a replica? • How to replicate an object made up of other objects? • How to keep objects consistent?
Solutions: State-of-the-Art • Coda • Globus • GDMP • OceanStore
Coda File System • Server replication - static replication - unit: volume - read-write replicas • No support for migration OSD: - dynamic replication - object migration - object replication
Coda File System • Scalability - client caching - client selects the replica (acceptable for a few servers) - client propagates updates to all members of the AVSG OSD: - provides multi-level services for clients with different storage resources and processing power - uses distributed intelligent storage devices to relieve the burden on both the client and the server
Globus • Replica catalog – provides a location service - based on LDAP • Replica management - replica creation, deletion, selection • Reliable replication – GridFTP • replication of large scientific data sets • Centralized, hierarchical -> decentralized, p2p OSD: - a device is like a site in Globus - Globus has no active objects
A Model Architecture for Data Grids [Figure; component labels: Application, Attribute Specification, Metadata Catalog, Logical Collection and Logical File Name, Replica Catalog, Multiple Locations, Replica Selection, Selected Replica, MDS, NWS, Performance Information and Predictions, Reliable Replication, Reliable Transport, Disk Cache, Tape Library, Disk Array, Replica Location 1, Replica Location 2, Replica Location 3]
GDMP • GDMP: Grid Data Management Pilot - asynchronous replication with a subscription model: producer -> export catalog -> import catalog -> consumer - partial-replication model with filter criteria, e.g. only replicate files with “Muon” in the name or owned by user “Roy” - centralized replica catalog
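
The partial-replication filter on this slide can be pictured as a predicate applied to the producer's export catalog. The sketch below is only an illustration under assumptions: `CatalogEntry`, `matches_filter`, and `import_catalog` are hypothetical names rather than GDMP's actual API, and the file names and the owner "Ann" are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One file advertised in a producer's export catalog (hypothetical shape)."""
    name: str
    owner: str


def matches_filter(entry: CatalogEntry) -> bool:
    # Filter criteria from the slide: "Muon" in the name, or owned by "Roy".
    return "Muon" in entry.name or entry.owner == "Roy"


def import_catalog(export_catalog: list[CatalogEntry]) -> list[CatalogEntry]:
    """Partial replication: keep only the entries the consumer subscribed to."""
    return [entry for entry in export_catalog if matches_filter(entry)]


# Made-up export catalog for illustration.
exported = [
    CatalogEntry("run12_Muon.dat", "Ann"),
    CatalogEntry("run12_Electron.dat", "Roy"),
    CatalogEntry("run12_Tau.dat", "Ann"),
]
selected = import_catalog(exported)
print([entry.name for entry in selected])
```

Only the first two entries pass the filter, so a consumer with this subscription would replicate two of the three files.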
GDMP • static producer/consumer relationship • file replication (one file may include multiple objects) • based on the Globus replica catalog, GridFTP, and GSI • a central replica catalog for the Grid OSD: object replication, dynamic replication, p2p environment for the replica catalog.
OceanStore • cluster recognition - identifies and groups closely related files - helps prefetching • replica management - dynamic replication: the system decides the number and location of floating replicas by monitoring client requests and system load; it forwards a request to the object’s parent node, and the parent can create additional floating replicas on nearby nodes - analyzes global usage trends, e.g., detects periodic migration of clusters from site to site and prefetches data based on these cycles OSD: - devices can participate in and simplify this introspection process - lets the owner or application explicitly specify policies - two-tier vs. p2p
Limitations • Server-based solutions • Ad-hoc (static) - lack flexibility - need more management • Rely upon knowledge maintained solely at the server - do not scale well in performance and functionality • e.g. server-only schemes may become bottlenecks - server modification required for new data objects • e.g. multimedia objects require QoS; what will be next? - point solutions • e.g. one kind of consistency • Offload too much work to the client (like xFS) • OceanStore is not suitable for an enterprise environment • Globus does not consider active objects and targets read-only data
Our Solution: OSD • Data encapsulated as active objects - coupling data with object-based meta-data is more scalable along both axes (functionality and performance) • flexibility in functionality is captured by object-based meta-data • e.g. an object can define its own consistency or replication policy • greater dynamic performance is possible • e.g. an object can request replication when demand grows, without server intervention • Migration and dynamic replication - support mobility of clients, objects, devices - ease of management • Differentiated service for different clients - e.g., PDA client vs. laptop client • Helps survive failures • Read-write consistency - the access semantic of the object affects the number of replicas of the object.
Three-layer approach • Object initiated: based on the metadata associated with the object, e.g., the object specifies that a replica is created when accesses to it reach the “replicate_when” threshold, or that it requires 5 replicas in the system • Device initiated: based on local policy, only inside the region; the device has better knowledge of its own load than the regional manager. If accesses to a device reach the local threshold, the device may initiate replication of popular objects to balance load. • Regional manager initiated: based on global policy, e.g., backup, mirroring
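
One way to picture the three layers is as three independent checks, any of which may ask for a new replica. The sketch below is a minimal illustration under assumptions, not the authors' design: `ObjectPolicy`, `replication_triggers`, the parameter names, and all numeric values are hypothetical, and "reaching" a threshold is read as greater than or equal.

```python
from dataclasses import dataclass


@dataclass
class ObjectPolicy:
    """Object-level metadata (hypothetical field names)."""
    replicate_when: float   # access threshold set by the object itself
    required_replicas: int  # e.g. "object requires 5 replicas in the system"


def replication_triggers(policy, object_accesses, current_replicas,
                         device_accesses, device_threshold, global_policies):
    """Return the layers that would ask for a new replica, in layer order."""
    triggers = []
    # Layer 1: the object's own metadata.
    if (object_accesses >= policy.replicate_when
            or current_replicas < policy.required_replicas):
        triggers.append("object")
    # Layer 2: the device knows its own load better than the regional manager.
    if device_accesses >= device_threshold:
        triggers.append("device")
    # Layer 3: the regional manager applies global policy.
    if global_policies & {"backup", "mirroring"}:
        triggers.append("regional_manager")
    return triggers


policy = ObjectPolicy(replicate_when=100, required_replicas=5)
print(replication_triggers(policy, object_accesses=40, current_replicas=5,
                           device_accesses=900, device_threshold=500,
                           global_policies=set()))
```

In this example only the device layer fires: the object is below its own threshold and already has its five replicas, but the device holding it is overloaded.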
The Problem • We will address the following in the context of OSD • How to explore this three-layer architecture? • What meta-data is needed? • Which meta-data is local to the object, which is on the device, and which must be distributed to servers? • How to keep policies coherent and avoid bad behaviors? • What mechanisms and policies take advantage of these three layers?
Meta-data Strawman • Meta-data (mandatory part) • replica: (x152338) • mobile: yes/no • consistency: Unix, Session, … • demand: requests/time • replication_when: demand > b • replication_where: region_list • time_to_live: 24 hours • access_semantic: read-only/read-most/single-writer/multiple-writer • contained objects: x343687, x458034 • Preferred_transfer_speed: 10Mbps
Meta-data Strawman • version: 1.0 • location: device_ip_list • itinerary: (region, time), (region, time), … • access_pattern: <access_frequency>, <coaccess object lists> • security-related attributes: • other_constraints: e.g., processing power of the device • storage_layout: • other data object specific attributes (non-mandatory) • allow content-based searches and customized delivery • A user profile can help set the values of these metadata entries, e.g., the mobility pattern of a client can determine the itinerary entry.
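
The mandatory part of the strawman could be held in a small typed record. The sketch below is one possible encoding under assumptions: the class name `MandatoryMeta`, the Python types, the unit suffixes, and the sample values (demand of 120, threshold b of 100, regions "R1" and "R2") are all hypothetical; only the field names and the object ids come from the strawman slides.

```python
from dataclasses import dataclass, field

# Access semantics listed in the strawman.
ACCESS_SEMANTICS = {"read-only", "read-most", "single-writer", "multiple-writer"}


@dataclass
class MandatoryMeta:
    replica: list[str]
    mobile: bool
    consistency: str                 # e.g. "Unix", "Session"
    demand: float                    # requests per unit time
    replication_when: float          # the threshold b in "demand > b"
    replication_where: list[str]     # region list
    time_to_live_hours: int
    access_semantic: str
    contained_objects: list[str] = field(default_factory=list)
    preferred_transfer_mbps: float = 10.0

    def __post_init__(self) -> None:
        # Reject access semantics that the strawman does not list.
        if self.access_semantic not in ACCESS_SEMANTICS:
            raise ValueError(f"unknown access_semantic: {self.access_semantic!r}")

    def wants_replica(self) -> bool:
        """Evaluate the strawman rule 'replication_when: demand > b'."""
        return self.demand > self.replication_when


meta = MandatoryMeta(
    replica=["x152338"], mobile=True, consistency="Session",
    demand=120.0, replication_when=100.0, replication_where=["R1", "R2"],
    time_to_live_hours=24, access_semantic="read-most",
    contained_objects=["x343687", "x458034"],
)
print(meta.wants_replica())
```

Keeping the replication rule as a method on the record mirrors the idea that the object, not the server, carries its own policy.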
Metadata on a Device • device description: bandwidth, available space, … • the objects and their metadata • the soft-state locations of all objects stored on it - obtained from the regional manager - recorded when the device creates a replica • soft-state information about its neighbor devices - found by searching for nearby neighbors itself - learned from the regional manager • Thus the device can help while the regional manager is down
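
Soft state is usually taken to mean information that is dropped unless it is refreshed. Assuming that reading, a device's replica-location table might look like the sketch below; the class `SoftStateLocations`, the 60-second lifetime, and the device names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class SoftStateLocations:
    """Replica locations that expire unless refreshed (the TTL is an assumption)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        # object id -> device -> time of last refresh
        self._entries: dict[str, dict[str, float]] = {}

    def record(self, obj_id: str, device: str, now: float) -> None:
        """Store or refresh a location, e.g. after creating a replica."""
        self._entries.setdefault(obj_id, {})[device] = now

    def lookup(self, obj_id: str, now: float) -> list[str]:
        """Return devices still believed to hold obj_id, dropping stale ones."""
        devices = self._entries.get(obj_id, {})
        live = {d: t for d, t in devices.items() if now - t <= self.ttl}
        self._entries[obj_id] = live
        return sorted(live)


table = SoftStateLocations(ttl_seconds=60)
table.record("x123", "dev-a", now=0)
table.record("x123", "dev-b", now=50)
print(table.lookup("x123", now=70))
```

At time 70 the entry for "dev-a" is older than the lifetime and is dropped, so only "dev-b" is returned.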
Metadata on Regional Manager • Metadata of active objects, including replica location info - supports attribute filtering • Device status
Metadata on Client • Replica location: for read-only data, the client can cache this information and access the object without contacting the regional manager • If the client has enough storage space, it can cache the object.
Scenarios: Migration • Grid computing • I have a reservation for a supercomputer in remote region R1 from time T1 to T2, and the computation requires data object x123. The output will be a data object x934 that I want delivered back to my home region Rhome: • itinerary for x123 has an entry (R1, T1) • itinerary for x934 has an entry (Rhome, T2) when x934 is created at time T2 • Personal computing • I have a local object x123 (Rhome), and I am going to a conference on Tues-Thurs in region R1 and want to operate on the object there. • itinerary for x123 has entries (R1, Tues), (Rhome, Thurs) • if it is going on my laptop, add more entries • Support user-directed migration such as these • Research questions • develop object-directed migration policies that exploit meta-data, e.g. an object that observes many requests coming from a particular region migrates itself there • space reservation • security
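
The itinerary entries in these scenarios can be read as "from this time on, the object belongs in this region". Under that assumption, a lookup might be sketched as follows; the function `region_at` and the encoding of weekdays as integers are hypothetical choices for illustration.

```python
def region_at(itinerary, home, t):
    """Region where the object should be at time t.

    itinerary is a list of (region, time) entries. The latest entry whose
    time is <= t wins; before the first entry the object stays at home.
    """
    region = home
    for entry_region, entry_time in sorted(itinerary, key=lambda e: e[1]):
        if entry_time <= t:
            region = entry_region
    return region


# Weekdays as integers (an assumption): Mon=1, Tue=2, Wed=3, Thu=4.
# Personal-computing scenario: (R1, Tues), (Rhome, Thurs).
itinerary_x123 = [("R1", 2), ("Rhome", 4)]
for day in (1, 3, 4):
    print(day, region_at(itinerary_x123, "Rhome", day))
```

The object is at home on Monday, in R1 on Wednesday, and back home from Thursday on.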
Scenarios: Replica Creation • Collaboration of object, device and regional manager A company publishes a new video that every employee needs to view. The secretary stores this object and asks the system to distribute it to all regions; the devices holding the object should be able to support at least 10 employees reading it simultaneously at 1Mbps. object: - QoS: 1Mbps - replication_where: all regions - estimated_traffic: 10Mbps regional manager: - according to the estimated traffic, finds a suitable device in its region to store the object - informs other regional managers to store the object inside their regions device: - if the device cannot sustain 1Mbps delivery of this object, it can create replicas of the object on nearby devices, or it can create replicas of other heavily loaded objects on nearby devices, based on local policy.
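
The numbers in this scenario fit together as 10 readers at 1Mbps each, giving 10Mbps of estimated traffic. A regional manager's placement step might then look like the greedy sketch below. This is an assumption-laden illustration, not the authors' algorithm: `place_replicas`, the spare-bandwidth figures, and the device names are hypothetical.

```python
import math


def place_replicas(estimated_mbps, qos_mbps, spare_mbps_by_device):
    """Pick devices whose spare bandwidth covers the estimated traffic.

    Greedy sketch: take the devices with the most spare bandwidth first and
    count how many QoS-rate streams each one can serve.
    """
    streams_needed = math.ceil(estimated_mbps / qos_mbps)
    chosen, streams = [], 0
    ranked = sorted(spare_mbps_by_device.items(),
                    key=lambda kv: kv[1], reverse=True)
    for device, spare in ranked:
        capacity = int(spare // qos_mbps)
        if capacity == 0:
            continue  # cannot sustain even one stream at the QoS rate
        chosen.append(device)
        streams += capacity
        if streams >= streams_needed:
            return chosen
    raise ValueError("region cannot carry the estimated traffic")


# 10 employees x 1 Mbps each = 10 Mbps estimated traffic, as in the scenario.
print(place_replicas(10, 1, {"d1": 6, "d2": 5, "d3": 0.5}))
```

Here no single device has 10Mbps to spare, so the object is placed on "d1" and "d2", which together can serve 11 streams; "d3" is skipped because it cannot sustain one stream at the QoS rate.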
Scenarios: Replica Creation • Research questions: - define the nearby neighbor - metrics that will affect replication: security restrictions, device configuration, access_semantic of the object - define a way to register and locate replicas, i.e., a location service - define a way to discover the available storage resources and their attributes, i.e., resource discovery - monitors for local traffic
Scenarios: Replica selection The application wants an object that can be transferred at 10Mbps or more with the lowest latency. Combination of several ways: - the client gets all replica information and makes the decision by itself - the regional manager makes the decision for the client - the device that serves the request can redirect it to other devices if it cannot satisfy the requirement, or ask the regional manager to choose another replica - whichever entity does replica selection can choose the “best” two replicas, so the client can turn to the second in case of failure Research issues: - define a way to describe replicas and application requirements in order to find a suitable replica
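
Whichever entity makes the decision, the selection rule in this scenario amounts to a bandwidth filter followed by a sort on latency, keeping the best two. A minimal sketch, assuming each replica is described by a device name, a bandwidth, and a latency; `Replica`, `select_replicas`, and the candidate values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    device: str
    bandwidth_mbps: float
    latency_ms: float


def select_replicas(replicas, min_mbps, k=2):
    """Return up to k replicas meeting the bandwidth floor, lowest latency first."""
    eligible = [r for r in replicas if r.bandwidth_mbps >= min_mbps]
    return sorted(eligible, key=lambda r: r.latency_ms)[:k]


candidates = [
    Replica("dev-a", 100, 40),
    Replica("dev-b", 8, 5),    # lowest latency, but below the 10 Mbps floor
    Replica("dev-c", 20, 12),
    Replica("dev-d", 10, 25),
]
best, backup = select_replicas(candidates, min_mbps=10)
print(best.device, backup.device)
```

The second result is the fallback the client can turn to if the first replica fails.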
Scenarios: Failure • Failure of a regional manager - the client has cached the location of the object it needs - the client may have cached the object it needs - the client may choose a device and ask it for the object; the device can search itself or ask its neighbors, which works like peer-to-peer lookup among devices • Failure of a device - the client can choose another device based on the cached object location - the client may have cached the object - the client can ask the regional manager for another replica These fallbacks raise further issues for security and consistency
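
The regional-manager failure case can be pictured as a chain of fallbacks. The sketch below is one possible ordering under assumptions and covers only that case, not device failure; `locate`, its parameters, and the dictionary shapes are hypothetical.

```python
def locate(obj_id, manager, cached_locations, device_contents,
           neighbors, entry_device):
    """Return (source, device) for obj_id, or None if it cannot be found.

    manager maps object id -> device, or is None when the regional manager
    has failed. device_contents maps device -> set of object ids it stores.
    """
    # Normal path: the regional manager knows the replica location.
    if manager is not None and obj_id in manager:
        return ("regional_manager", manager[obj_id])
    # Fallback 1: a location the client cached earlier.
    if obj_id in cached_locations:
        return ("client_cache", cached_locations[obj_id])
    # Fallback 2: ask one device; it checks itself, then its neighbors.
    for device in [entry_device, *neighbors.get(entry_device, [])]:
        if obj_id in device_contents.get(device, set()):
            return ("device_search", device)
    return None


device_contents = {"dev-a": {"x1"}, "dev-b": {"x2"}}
neighbors = {"dev-a": ["dev-b"]}
print(locate("x2", None, {}, device_contents, neighbors, "dev-a"))
```

With the manager down and nothing cached, the request goes to "dev-a", which finds the object on its neighbor "dev-b".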