
An End-to-End Approach to Globally Scalable Network Storage

An End-to-End Approach to Globally Scalable Network Storage. Presented in cs294-4 P2P Systems by Sailesh Krishnamurthy 15 October 2003. Logistical Networking. Models sync/async aspects of communication Single “fabric” that unifies: Data Storage Data Transportation


Presentation Transcript


  1. An End-to-End Approach to Globally Scalable Network Storage Presented in cs294-4 P2P Systems by Sailesh Krishnamurthy 15 October 2003

  2. Logistical Networking • Models sync/async aspects of communication • Single “fabric” that unifies: • Data Storage • Data Transportation • Internet scalability goals • Claim: end-to-end design principles vital

  3. Background: SAN vs NAS • Current trends in storage networking • SAN - Storage Area Network • NAS - Network Attached Storage

  4. More on NAS v. SAN • NAS • Wires: TCP/IP • Protocol: NFS, CIFS • SAN • Wires: Fibre Channel • Protocol: Encapsulated SCSI • Where is the file system?

  5. Traditional networks • General goals • Minimize delay • Minimize probability of corruption • Maximize probability of delivery • Assumptions in traditional storage nets • When storage is closely coupled, delay and probability of corruption can be low while availability is high.

  6. SANs cannot scale to SWANs • In the SWAN, resources can be intermittently unavailable • So we need e2e strategies • Simple retries • Redundant data accesses spread across nw • High-latency archival backups
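
The first two strategies on this slide, simple retries and redundant accesses spread across the network, can be sketched as a client-side read loop. This is only an illustration under stated assumptions, not an actual storage client API: `FlakyDepot`, `DepotUnavailable`, `read_with_retries`, and the failure model are hypothetical names invented for the example.

```python
import random


class DepotUnavailable(Exception):
    """Raised when a (simulated) storage depot cannot be reached."""


class FlakyDepot:
    """Hypothetical depot that fails each read with a fixed probability."""

    def __init__(self, data, failure_rate, rng):
        self.data = data
        self.failure_rate = failure_rate
        self.rng = rng

    def read(self):
        if self.rng.random() < self.failure_rate:
            raise DepotUnavailable("depot temporarily unreachable")
        return self.data


def read_with_retries(depots, attempts_per_depot=3):
    """Try each replica in turn, retrying a few times before moving on."""
    for depot in depots:
        for _ in range(attempts_per_depot):
            try:
                return depot.read()
            except DepotUnavailable:
                continue  # simple retry against the same depot
    raise DepotUnavailable("all replicas failed")


rng = random.Random(0)
replicas = [
    FlakyDepot(b"payload", failure_rate=1.0, rng=rng),  # always down
    FlakyDepot(b"payload", failure_rate=0.0, rng=rng),  # always up
]
result = read_with_retries(replicas)
```

The retry and failover logic lives entirely in the client, which is the end-to-end point the slide makes: the depots themselves promise nothing about availability.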

  7. Correctness in the wild • SANs are “controlled” environments and correctness is not an issue • In the SWAN, data storage may not be reliable. • Data accuracy must be checked by producers and consumers - at endpoints

  8. SWAN Security • SWANs are not physically localized • SAN security assumptions don’t hold • Again, e2e approaches are required • DoS is a gotcha for SWANs • Can’t be prevented with e2e strategies • Imitate techniques for handling DoS in IP

  9. Unbounded Size/Duration • Since a single store may not have all the resources all the time, the endpoint has to manage distribution of data • Unbounded duration allocation hurts resource sharing • Should this be managed at endpoints?

  10. Logistical Networking • Storage Networking • IP networks: interconnection fabric of storage pool • Logistical Networking • Storage part of the networking infrastructure • Shared resource fabric exposing storage resources • Similar to how internet exposes bandwidth resources • Storage Stack • Bottom-up, layered e2e design approach • Internet Backplane Protocol (IBP)

  11. Storage Stack

  12. IBP - Internet Backplane Protocol • First layer of stack that’s globally accessible • Abstracts access layer resources (file/block storage services) • Expose underlying storage resources to maximize freedom at higher levels • Implement only indispensable & common functions • Enable scalable internet style resource sharing • Mask peculiarities of access layer resource • Abstract service based on data blocks that are managed as “byte arrays”

  13. IP vs IBP • IP vs link layer • Agg. of link layer packets masks packet size limits • Simple fault detection - faulty datagrams dropped • Global addressing masks diffs b/w LANs • IP Property: Any participant of a routed IP n/w can use any link layer connection • IBP “byte array” indep. • Agg. of access layer blocks masks fixed block size • Simple fault detection - drop faulty byte arrays • Global addressing (IP) masks diffs b/w access layer resources • IBP Property: Any participant of an IBP n/w can use any access layer storage resource

  14. Issues with IBP • DoS vulnerability is much worse • In IP: • DoS attacks require constant sending of data • Does not profit the attacker in any way • In IBP: • Once data block is allocated it remains used • Using remote storage does benefit the attacker • Strong semantics (reliability) of traditional storage/SAN are difficult to implement in the SWAN

  15. IBP Solutions • Time-limited storage allocations • When a lease expires, the storage can be reused for some other user • Soft storage semantics in IBP • IBP is a “best-effort” service • Allocated storage can be revoked at any time
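
The time-limited allocation idea on this slide can be sketched as a small lease table. This is a toy model under stated assumptions, not the IBP protocol or its API: `LeasedDepot`, its method names, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
class LeasedDepot:
    """Hypothetical depot whose allocations expire after a fixed lease."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.allocations = {}  # handle -> (size, expiry_time)
        self.next_handle = 0

    def _reclaim(self, now):
        """Drop every allocation whose lease has run out."""
        expired = [h for h, (_, expiry) in self.allocations.items()
                   if expiry <= now]
        for handle in expired:
            del self.allocations[handle]

    def used(self, now):
        self._reclaim(now)
        return sum(size for size, _ in self.allocations.values())

    def allocate(self, size, duration, now):
        """Best effort: return a handle, or None if there is no room."""
        if self.used(now) + size > self.capacity:
            return None
        handle = self.next_handle
        self.next_handle += 1
        self.allocations[handle] = (size, now + duration)
        return handle

    def is_valid(self, handle, now):
        self._reclaim(now)
        return handle in self.allocations


depot = LeasedDepot(capacity=100)
first = depot.allocate(80, duration=10, now=0)    # fits
refused = depot.allocate(50, duration=10, now=5)  # no room while lease is live
later = depot.allocate(50, duration=10, now=10)   # first lease has expired
```

Because every allocation carries an expiry, storage held by an idle or malicious user returns to the pool without any action by that user, which is how leases bound the damage of the allocation-based DoS described on the previous slide.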

  16. exNode - Flexible Aggregation of Network Storage • Implement abstractions w/ strong properties • Higher layer construct • Aggregates primitive IBP byte-arrays • Need to maintain state that represents the agg. • exNode aggregates IBP byte-arrays as the Unix inode aggs. disk blocks
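
The inode analogy can be made concrete with a minimal sketch in which an exNode-like structure maps logical byte ranges onto separate byte arrays. The class names, the `array_id` field (a stand-in for whatever reference identifies a remote byte array), and the in-memory `store` dictionary are assumptions made for illustration, not the real exNode format.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """Maps one logical byte range onto one (hypothetical) byte array."""
    offset: int    # logical offset within the aggregated file
    length: int
    array_id: str  # stand-in for a reference to a remote byte array


class ExNode:
    """Toy aggregate: an ordered list of extents, like an inode's block list."""

    def __init__(self):
        self.extents = []

    def size(self):
        return sum(e.length for e in self.extents)

    def append(self, array_id, length):
        self.extents.append(Extent(self.size(), length, array_id))

    def read(self, store, start, length):
        """Assemble bytes [start, start + length) from the underlying arrays."""
        end = start + length
        out = bytearray()
        for e in self.extents:
            lo = max(start, e.offset)
            hi = min(end, e.offset + e.length)
            if lo < hi:  # this extent overlaps the requested range
                out += store[e.array_id][lo - e.offset:hi - e.offset]
        return bytes(out)


# The dictionary plays the role of three separate storage depots.
store = {"a": b"hello ", "b": b"wide ", "c": b"world"}
node = ExNode()
for key in ("a", "b", "c"):
    node.append(key, len(store[key]))
```

Here `node.read(store, 4, 6)` spans the boundary between the first two byte arrays, which is the aggregation state the slide says a higher layer has to maintain.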

  17. e2e services for storage • exNode can hold additional metadata for services: • Redundancy • Framing of data into segments w/ checksums • exNode is analogous to the state of a TCP connection; the data on disk is analogous to a TCP stream
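
Framing data into checksummed segments can be sketched as follows. The slides do not name a checksum algorithm or a metadata layout, so the use of SHA-256, the function names, and the list of `(chunk, digest)` pairs are assumptions chosen for the example.

```python
import hashlib


def frame(data, segment_size):
    """Producer side: split data into segments, each with its own digest."""
    segments = []
    for i in range(0, len(data), segment_size):
        chunk = data[i:i + segment_size]
        segments.append((chunk, hashlib.sha256(chunk).hexdigest()))
    return segments


def verify_and_join(segments):
    """Consumer side: recheck every digest before trusting the data."""
    out = bytearray()
    for index, (chunk, digest) in enumerate(segments):
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError(f"segment {index} failed its checksum")
        out += chunk
    return bytes(out)


original = b"logistical networking"
framed = frame(original, segment_size=8)
restored = verify_and_join(framed)

# Simulate a depot returning corrupted bytes for the middle segment.
tampered = list(framed)
tampered[1] = (b"XXXXXXXX", framed[1][1])
```

Per-segment digests let the consumer pinpoint which segment is bad, so only that segment needs to be fetched again, for example from a redundant copy.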

  18. Relation to p2p systems • Paper compares with Napster/Gnutella • In file sharing all allocations are at endpoints .. leads to large data transfers • Appropriate comparison is Oceanstore ? • My view • exNode infrastructure is a way to create storage services from smaller blocks • Can be useful in an Oceanstore-like setting • Can alleviate some SAN shortcomings ?
