
Don’t Give Up on Distributed File Systems

A proposal from MIT CSAIL and NYU introducing WheelFS, a distributed wide-area storage layer that gives applications control through explicit semantic cues and gets good performance by reading globally and writing locally, with the goal of simplifying newly written distributed applications.

Presentation Transcript


  1. Don’t Give Up on Distributed File Systems Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris MIT CSAIL and NYU

  2. Reinventing the Storage Wheel • New apps tend to use new storage layers • Examples: BLAST • Can we invent this layer once?

  3. What About a File System? • A FS enables quick prototyping of apps • A familiar interface • Language-independent usage model • Hierarchical namespace useful for apps • Write distributed apps in shell scripts:

     if [ -f /fs/cwc/$URL ]; then
       if notexpired /fs/cwc/$URL; then
         cat /fs/cwc/$URL
         exit
       fi
     fi
     wget $URL -O - | tee /fs/cwc/$URL

  4. Why Won’t That Work Today? • Needs of distributed apps: • Control over consistency and delays • Efficient data sharing between peers • Current systems focus on FS transparency • Hide faults with long timeouts • Centralized file servers

  5. Example: Cooperative Web Cache • Would rather fail and refetch than wait • Perfect consistency isn’t crucial • Avoid hotspots

     if [ -f /fs/cwc/$URL ]; then
       if notexpired /fs/cwc/$URL; then
         cat /fs/cwc/$URL
         exit
       fi
     fi
     wget $URL -O - | tee /fs/cwc/$URL

  6. Our Proposal: WheelFS • A distributed wide-area FS to simplify apps • Main contributions: 1) Give apps control with semantic cues 2) Provide good performance by following the principle Read Globally, Write Locally

  7. Basic Design: Reading and Writing [Slide diagram: a client creates foo/bar and reads File 135; directory 209 (foo) maps bar to File 550, and versions of File 135 (v1–v3) are stored on some nodes (076, 150, 257, 402, 554, 653) and cached on others.]

  8. Explicit Semantic Cues • Allow direct control over system behavior • Meta-data that attach to files, dirs, or refs • Apply recursively down dir tree • Possible impl: intra-path component • /wfs/cwc/.cue/foo/bar
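
     As a rough sketch of the intra-path form (cue names are taken from the later slides; the exact composition syntax shown here is illustrative), a script would embed cues directly in the paths it already uses:

     # Read through a cue-qualified path; the cue applies recursively
     # to everything below that point in the reference.
     cat /wfs/cwc/.anyversion/foo/bar

     # Several cues could be combined in one path component.
     ls /wfs/cwc/.bestversion,maxtime=250/foo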

  9. Semantic Cues: Writability • Applies to files • WriteMany (default) • WriteOnce [Slide diagram: nodes storing and caching successive versions (v1–v3) of File 135.]
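
     A minimal sketch of the writability cues from the shell (the /wfs/results path and file names are hypothetical, for illustration only):

     # WriteOnce: the file's contents never change after creation,
     # so any copy of it can be used safely.
     echo "done" > /wfs/results/.writeonce/job-42.out

     # WriteMany (the default): the file may be updated later,
     # producing new versions.
     echo "rerun" > /wfs/results/.writemany/job-42.out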

  10. Semantic Cues: Freshness • Applies to file references • LatestVersion (default) • AnyVersion • BestVersion [Slide diagram: a read of File 135 that could be satisfied by the stored copy or by a nearby cached copy.]
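
     A sketch of the freshness cues on read paths (the cooperative web cache paths mirror the example on the following slides; combining with maxtime is illustrative):

     cat /wfs/cwc/.latestversion/$URL           # default: fetch the newest version, wherever it is
     cat /wfs/cwc/.anyversion/$URL              # any reachable version, e.g. a nearby cached copy
     cat /wfs/cwc/.bestversion,maxtime=250/$URL # best version found within the time bound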

  11. Semantic Cues: Write Consistency • Applies to files or directories • Strict (default) • Lax [Slide diagram: two concurrent writes to File 135 producing conflicting versions on different nodes.]
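
     A sketch of the write-consistency cues (the page.html source file is hypothetical; the Lax usage mirrors the web-cache example on the next slide):

     # Strict (the default): conflicting updates to the same file are
     # kept consistent.
     cp page.html /wfs/cwc/.strict/$URL

     # Lax: conflicting concurrent writes are acceptable, as in the
     # cooperative web cache, where any fetched copy of a page will do.
     wget $URL -O - > /wfs/cwc/.lax,writemany/$URL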

  12. Example: Cooperative Web Cache • Reading an older version is ok: • cat /wfs/cwc/.bestversion,maxtime=250/foo • Writing conflicting versions is ok: • wget http://foo > /wfs/cwc/.lax,writemany/foo

  13. Example: Cooperative Web Cache [Slide diagram: a client resolves $URL in Dir 070 (/wfs/cwc) and fetches the file's chunks from the nodes that store or cache it.]

     if [ -f /wfs/cwc/.maxtime=250,bestversion/$URL ]; then
       if notexpired /wfs/cwc/.maxtime=250,bestversion/$URL; then
         cat /wfs/cwc/.maxtime=250,bestversion/$URL
         exit
       fi
     fi
     wget $URL -O - | tee /wfs/cwc/.lax,writemany/$URL

  14. Discussion • Current set of cues enough for many apps • All-sites-pings • Grid computations • OverCite • Stuff we swept under the rug: • Security • Atomic renames across dirs • Storage load-balancing • Unreferenced files

  15. Related Work • Every FS paper ever written • Specifically: • Cluster FS: Farsite, GFS, xFS, Ceph • Wide-area FS: JetFile, CFS, Shark • Grid: LegionFS, GridFTP, IBP • POSIX I/O High Performance Computing Extensions

  16. Conclusion • WheelFS: distributed storage layer for newly-written applications • Control through explicit semantic cues • Performance by reading globally and writing locally
