
Optimizing End-User Data Delivery Using Storage Virtualization

This seminar discusses the problem of client-side caching in data delivery and explores the use of storage virtualization, specifically the FreeLoader desktop storage cache, as a solution. It addresses the challenges of wide-area data movement, latency tolerance, and limited storage options, and proposes a virtual-cache approach based on desktop storage scavenging. The seminar also compares FreeLoader with other storage systems and explores client access-pattern-aware striping for optimizing cache access.


Presentation Transcript


  1. Optimizing End-User Data Delivery Using Storage Virtualization Sudharshan Vazhkudai Oak Ridge National Laboratory Ohio State University Systems Group Seminar October 20th, 2006 Columbus, Ohio

  2. Outline • Problem space: Client-side caching • Storage Virtualization: • FreeLoader Desktop Storage Cache • A Virtual cache: Prefix caching • End on a funny note!!

  3. Problem Domain • Data Deluge • Experimental facilities: SNS, LHC (PBs/yr) • Observatories: sky surveys, world-wide telescopes • Simulations from NLCF end-stations • Internet archives: NIH GenBank (serves 100 gigabases of sequence data) • Typical user access traits on large scientific data • Download remote datasets using favorite tools • FTP, GridFTP, hsi, wget • Shared interest among groups of researchers • A bioinformatics group collectively analyzes and visualizes a sequence database for a few days: locality of interest! • Original datasets are often discarded after interest dissipates

  4. So, what’s the problem with this story? • Wide-area data movement is full of pitfalls • Server bottlenecks, BW/latency fluctuations • GridFTP-like tuned tools not widely available • Popular Internet repositories still served through modest transfer tools! • User applications are often latency intolerant • e.g., real-time viz rendering of a TerraServer map from Microsoft on ORNL’s tiled display! • Why can’t we address this with the current storage landscape? • Shared storage: limited quotas • Dedicated storage: SAN storage is a non-trivial expense! (4TB disk array ~ $40K) • Local storage: usually not enough for such large datasets • Archive in mass storage for future accesses: high latency • Upshot • Retrieval rates significantly lower than local I/O or LAN throughput

  5. Is there a silver lining at all? (Desktop Traits) • Desktop capabilities better than ever before • The ratio of used space to available storage is strikingly low in academic and industry settings • Increasing numbers of workstations are online most of the time • At ORNL-CSMD, ~ 600 machines are estimated to be online at any given time • At NCSU, > 90% availability of 500 machines • Well-connected, secure LAN settings • A high-speed LAN connection can stream data faster than local disk I/O

  6. Storage Virtualization? • Can we use novel storage abstractions to provide: • More storage than locally available • Better performance than local or remote I/O • A seamless architecture for accessing and storing transient data

  7. Desktop Storage Scavenging as a means to virtualize I/O access • FreeLoader • Imagine Condor for storage • Harness the collective storage potential of desktop workstations ~ harnessing idle CPU cycles • Increased throughput due to striping • Split large datasets into pieces (“morsels”) and stripe them across desktops (sketched below) • Scientific data trends • Usually write-once-read-many • Remote copy held elsewhere • Primarily sequential accesses • Data trends + LAN/desktop traits + user access patterns make collaborative caches using storage scavenging a viable alternative!
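A minimal sketch of the round-robin morsel striping idea, assuming 1 MB morsels (the chunk size slide 20 cites for client access); the function and donor names are illustrative, not FreeLoader’s actual API:

```python
# Minimal sketch: split a dataset into fixed-size morsels and assign them
# round-robin to donor workstations. Names are illustrative assumptions.

MORSEL_SIZE = 1 << 20  # 1 MB morsels, matching the chunk size on slide 20

def stripe(dataset_size, donors):
    """Return a map donor -> list of morsel indices (round-robin)."""
    n_morsels = -(-dataset_size // MORSEL_SIZE)  # ceiling division
    placement = {d: [] for d in donors}
    for i in range(n_morsels):
        placement[donors[i % len(donors)]].append(i)
    return placement

# Example: a 10 MB dataset striped across three donor machines.
print(stripe(10 * MORSEL_SIZE, ["node-a", "node-b", "node-c"]))
```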

  8. Old wine in a new bottle…? • Key strategies derived from “best practices” across a broad range of storage paradigms… • Desktop Storage Scavenging from P2P systems • Striping, parallel I/O from parallel file systems • Caching from cooperative Web caching • And, applied to scientific data management for • Access locality, aggregating I/O, network bandwidth and data sharing • Posing new challenges and opportunities: heterogeneity, striping, volatility, donor impact, cache management and availability

  9. FreeLoader Environment

  10. FreeLoader Architecture • Lightweight UDP • Scavenger device: metadata bitmaps, morsel organization • Morsel service layer • Monitoring and impact control • Global free space management • Metadata management • Soft-state registrations (a sketch follows) • Data placement • Cache management • Profiling
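One common way to realize soft-state registration is periodic re-registration with lazy expiry; the sketch below is an assumption about that pattern, not FreeLoader’s actual manager code, and the 30-second TTL is invented for illustration:

```python
# Sketch of soft-state registration: donors re-register periodically and
# the manager lazily drops any donor whose registration has expired.
# A crashed donor simply stops renewing and falls out of the registry.
import time

TTL = 30.0  # seconds a registration stays valid without renewal (assumed)

class DonorRegistry:
    def __init__(self):
        self.last_seen = {}  # donor id -> timestamp of last registration

    def register(self, donor_id):
        self.last_seen[donor_id] = time.time()  # renew on every heartbeat

    def live_donors(self):
        cutoff = time.time() - TTL
        # Expire stale entries lazily on lookup.
        self.last_seen = {d: t for d, t in self.last_seen.items()
                          if t >= cutoff}
        return list(self.last_seen)
```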

  11. FreeLoader installed in a user’s HPC setting (testbed and experiment setup) • GridFTP access to NFS • GridFTP access to PVFS • hsi access to HPSS (cold data from tapes, hot data from disk caches) • wget access to Internet archive

  12. Comparing FreeLoader with other storage systems

  13. Optimizing access to the cache: Client Access-pattern Aware Striping • The uploading client is likely to access the data most frequently • So, let’s try to optimize data placement for that client! • Overlap network I/O with local I/O • What is the optimal local:remote data ratio? • Model (one plausible formulation sketched below)
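The slide leaves the model to a figure. One plausible formulation, stated here as an assumption rather than the authors’ published model: keep a fraction x of the dataset on the uploading client’s own disk so that reading the local portion and streaming the striped remainder over the LAN finish at the same time.

```python
# Hypothetical overlap model (an assumption, not necessarily the authors'
# exact model): choose the local fraction x so that the local and network
# streams complete together:
#   x*S / R_local = (1 - x)*S / R_remote
#   =>  x = R_local / (R_local + R_remote)

def local_fraction(r_local, r_remote):
    """Fraction of the dataset to keep on the client's own disk."""
    return r_local / (r_local + r_remote)

# Example: a 50 MB/s local disk against 100 MB/s aggregate striped LAN
# bandwidth suggests keeping about one third of the data locally.
print(local_fraction(50.0, 100.0))  # ~0.333
```

Under this formulation, the faster the aggregate remote striping is relative to the local disk, the smaller the locally held share should be.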

  14. Philosophizing… • What the scavenged storage “is not”: • Not a file system, not a replacement for high-end storage • Not intended for wide-area resource integration • What it “is”: • A low-cost, best-effort storage cache for scientific data sources • Intended to facilitate • Transient access to large, read-only datasets • Data sharing within an administrative domain • To be used in conjunction with higher-end storage systems

  15. Towards a “virtual cache” • Scientific data caches typically host complete datasets • Not always feasible in our environment since: • Desktop workstations can fail or space contributions can be withdrawn, leaving partial datasets • There may not be enough space in the cache to host a new dataset in its entirety • Cache evictions can leave partial copies of datasets • Can we host partial copies of datasets and yet serve client accesses to the entire dataset? • Analogy: file-system buffer cache : disk :: FreeLoader : remote data source

  16. The Prefix Caching Problem: Impedance Matching on Steroids!! • HTTP prefix caching • Multimedia, streaming data delivery • BitTorrent P2P system: leechers can download and yet serve • Benefits • Bootstrapping the download process • Store more datasets • Allows for efficient cache management • Ah, those scientific data trends again (how convenient…) • Immutable data, remote source copy, primarily sequential accesses • Challenges • Clients should be oblivious to the dataset being only partially available • Performance hit? • How much of a dataset’s prefix to cache? • So client accesses can progress seamlessly • Online patching issues • Mismatch between client access rates and remote patching I/O • Wide-area download vagaries

  17. Virtual Cache Architecture • Capability-based resource aggregation • Persistent-storage and BW-only donors • Client serving: parallel get • Remote patching using URIs • Better cache management • Stripe entirely when space is available • When eviction is needed, only stripe a prefix of the dataset • Victims chosen by LRU (sketched below): • Evict chunks from the tail until only a prefix remains • Entire datasets are evicted only after all such tails are evicted
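A sketch of that tail-first eviction policy. The data structures and the `prefix_chunks` floor are illustrative assumptions (in FreeLoader the retained prefix size would come from the prediction model on the next slide), not the cache manager’s actual code:

```python
# Tail-first eviction sketch: trim tail chunks from LRU datasets first,
# keeping a prefix per dataset; whole datasets (their prefixes) go only
# if trimming every tail was not enough.

def evict(datasets, bytes_needed, prefix_chunks=8, chunk_size=1 << 20):
    freed = 0
    victims = sorted(datasets, key=lambda d: d["last_access"])
    # Pass 1: trim tails down to a prefix, LRU datasets first.
    for ds in victims:
        while len(ds["chunks"]) > prefix_chunks and freed < bytes_needed:
            ds["chunks"].pop()        # drop the tail chunk
            freed += chunk_size
    # Pass 2: only if still short, evict the remaining prefixes too.
    for ds in victims:
        while ds["chunks"] and freed < bytes_needed:
            ds["chunks"].pop()
            freed += chunk_size
    return freed

# Example: the older dataset loses 10 MB of tail chunks; its prefix stays.
cache = [
    {"name": "seqdb", "chunks": list(range(20)), "last_access": 100.0},
    {"name": "skymap", "chunks": list(range(12)), "last_access": 200.0},
]
evict(cache, bytes_needed=10 << 20)
print([len(d["chunks"]) for d in cache])  # [10, 12]
```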

  18. Prefix Size Prediction • Goal: eliminate client-perceived delay in data access • What is an optimal prefix size to hide the cost of suffix patching? • Prefix size depends on: • Dataset size, S • In-cache data access rate by the client, R_client • Suffix patching rate, R_patch • Initial latency in suffix patching, L • The client’s in-cache read time must cover the patch: S/R_client = L + (S - S_prefix)/R_patch • Thus, S_prefix = S(1 - R_patch/R_client) + L·R_patch (worked example below)
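A direct transcription of that formula; the rates and latency in the example are invented for illustration, not measured values from the talk:

```python
# Prefix-size model from the slide:
#   S / R_client = L + (S - S_prefix) / R_patch
#   S_prefix     = S * (1 - R_patch / R_client) + L * R_patch

def prefix_size(S, r_client, r_patch, latency):
    """All sizes in MB, rates in MB/s, latency in seconds."""
    return S * (1 - r_patch / r_client) + latency * r_patch

# Example (illustrative numbers): a 10 GB dataset, clients reading the
# cache at 100 MB/s, patching from the remote source at 60 MB/s with
# 5 s of start-up latency: cache about 4.3 GB up front.
print(prefix_size(10_000, 100, 60, 5))  # 4300.0 MB
```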

  19. Collective Download • Why? • Wide-area transfer reasons: • Storage systems and protocols for HEC are tuned for bulk transfers (GridFTP, HSI) • Wide-area transfer pitfalls: high latency, connection establishment cost • Client’s local-area cache access reasons: • Client accesses to the cache use a smaller stripe size (e.g., 1MB chunks in FreeLoader) • Finer granularity for better client access rates • Can we borrow from collective I/O in parallel file systems?

  20. Collective Download Implementation • Patching nodes perform bulk, remote I/O; ~ 256MB per request • Reducing multiple authentication costs per dataset • Automated interactive session with “Expect” for single sign-on • FreeLoader patching framework instrumented with Expect • Protocol needs to allow sessions (GridFTP, HSI) • Need to reconcile the mismatch between the client access stripe size and the bulk, remote I/O request size • Shuffling (sketched below) • The p patching nodes redistribute the downloaded chunks among themselves according to the client’s striping policy • Redistribution enables round-robin client access • Each patching node redistributes (p - 1)/p of the downloaded data • Shuffling is done in memory, which is what makes BW-only donors viable • Thus, client serving, collective download and shuffling are all overlapped
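A sketch of the shuffle bookkeeping under the numbers above (256 MB bulk requests, 1 MB chunks, round-robin striping); the function and layout are illustrative assumptions, not the patching framework’s API:

```python
# Shuffle sketch: a patching node downloads one contiguous 256 MB region,
# then re-deals its 1 MB chunks round-robin across the p patching nodes so
# the cache layout matches the client's striping policy. Each node keeps
# every p-th chunk and forwards the other (p - 1)/p.

BULK = 256  # chunks per bulk remote request (256 x 1 MB = 256 MB)

def shuffle_plan(p, region_start_chunk):
    """Return {destination node rank: [global chunk indices]} for one
    bulk region downloaded by a single patching node."""
    plan = {}
    for off in range(BULK):
        g = region_start_chunk + off   # global chunk index
        dest = g % p                   # round-robin striping policy
        plan.setdefault(dest, []).append(g)
    return plan

# With p = 3, the node that fetched chunks [0, 256) keeps about a third
# and ships roughly (p - 1)/p = 2/3 to the other two nodes.
plan = shuffle_plan(3, 0)
print({d: len(ixs) for d, ixs in plan.items()})  # {0: 86, 1: 85, 2: 85}
```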

  21. Testbed and Experiment setup • UberFTP stateful client to GridFTP servers at TeraGrid-PSC and TeraGrid-ORNL • HSI access to HPSS • Cold data from tapes • FreeLoader patching framework deployed in this setting

  22. Collective Download Performance

  23. Prefix Size Model Verification

  24. Impact of Prefix Caching on Cache Hit Rate • TeraGrid-ORNL sees improvements along the 0.2 and 0.4 curves (308% and 176% hit-rate improvement for 20% and 40% prefix ratios) • TeraGrid-PSC sees up to a 76% improvement in hit rate with an 80% prefix ratio

  25. Let me philosophize again… • Novel storage abstractions as a means to: • Provide performance impedance matching • Overlap remote I/O, cache I/O and local I/O into a seamless “data pathway” (an intermediate data cache exploits this area) • Provide rich resource aggregation models • Provide a low-cost, best-effort architecture for “transient” data • A combination of best practices from: parallel I/O, P2P scavenging, cooperative caching, HTTP multimedia streaming; brought to bear on “scientific data caching”

  26. Let me advertise… • http://www.csm.ornl.gov/~vazhkuda/Storage.html • Email: vazhkudaiss@ornl.gov • Collaborator: Xiaosong Ma (NCSU) • Funding: DOE ORNL LDRD (Terascale & Petascale initiatives) • Interested in joining our team? • Full time positions and summer internships available

  27. More slides • Some performance numbers • Impact studies

  28. Striping Parameters

  29. Client-side Filters

  30. Computation Impact

  31. Network Activity Test

  32. Disk-intensive Task

  33. Impact Control
