This presentation examines two problems in distributed filesystems, network latency and resource contention, and proposes public caching as a solution. Public caching combines whole-file caching with AFS-like consistency semantics and publishes the location of each cached copy in the file's metadata, so that replication follows actual usage. The implementation, Multi-FS, is built on Parrot and Chirp, which provide transparent remote filesystem access, and includes a cache cleaner.
Public Caching
Michael Albrecht, Rory Carmichael
Motivation
• Distributed filesystems are useful in a variety of circumstances
  • Active Storage
  • Co-locality of some files is important
• Two big problems encountered
  • Network Latency
  • Resource Contention
Potential Solutions
• Whole-File Caching
  • Copies only help one person/machine
• Replication
  • Wasteful
  • Unnecessary copies of unused files
  • Not enough copies of heavily used files
  • Maintaining consistency causes performance problems
• Caching + Replication
  • Combines the problems of both
Solution – Public Caches
• Use Whole File Caching as normal
• Use AFS-like consistency semantics
• Publish location of your copy in the file’s “inode”
• Provides usage-based replication
• Reduction in unnecessary copies
• Semantics
  • Copy On Read
  • Clobber On Close
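The two semantics on this slide can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the Multi-FS code: the class and method names (`PublicCacheFS`, `open_read`, `close_after_write`), the `/multicache/` cache path, and the reading of "Clobber On Close" as "the writer's copy becomes the only published copy" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical metadata record: every location holding a current copy."""
    locations: list = field(default_factory=list)  # [(host, path), ...]


class PublicCacheFS:
    """Toy in-memory model; hosts and paths are plain strings."""

    def __init__(self):
        self.inodes = {}   # logical file name -> Inode
        self.storage = {}  # (host, path) -> file contents

    def create(self, name, host, path, data):
        self.storage[(host, path)] = data
        self.inodes[name] = Inode([(host, path)])

    def open_read(self, name, client):
        """Copy on read: fetch the whole file, cache it, publish the copy."""
        inode = self.inodes[name]
        cache_loc = (client, "/multicache/" + name)  # assumed cache layout
        if cache_loc not in inode.locations:
            source = inode.locations[0]  # assumption: any published copy is current
            self.storage[cache_loc] = self.storage[source]
            inode.locations.append(cache_loc)  # later readers may use this copy
        return self.storage[cache_loc]

    def close_after_write(self, name, client, data):
        """Clobber on close (assumed meaning): only the writer's copy stays published."""
        cache_loc = (client, "/multicache/" + name)
        self.storage[cache_loc] = data
        # Older copies are merely delisted here; deleting them is left to a cleaner.
        self.inodes[name].locations = [cache_loc]
```

In this model a read by a second client adds a second published location, which is the usage-based replication the slide describes, while a write collapses the list back to a single copy.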
Implementation – Multi-FS
• Parrot
  • Type II VM
  • Transparent Remote Filesystem access
  • Home of Multi-FS
• Chirp
  • Distributed Filesystem used by Multi-FS
• Log-style Metadata
• Cache Cleaner
  • Removes out-of-date versions
  • Reduces log to minimum necessary
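One way to picture what the cache cleaner does with log-style metadata is a compaction pass. The slides do not give the log format, so the entry shape used here, `(op, name, location)` with `op` being `"write"` or `"cache"`, is purely an assumption, as is the function name `compact`.

```python
def compact(log):
    """Reduce a metadata log to the minimum that describes the current state.

    Assumed entry format: (op, name, location), where
      "write" means location now holds the only current copy of name, and
      "cache" means location holds an additional current copy.
    Returns (compacted_log, stale), where stale is the set of
    (name, location) pairs whose copies are out of date.
    """
    current = {}   # name -> locations holding the current version
    seen = set()   # every (name, location) that ever appeared in the log
    for op, name, location in log:
        seen.add((name, location))
        if op == "write":
            current[name] = [location]          # earlier copies become stale
        elif op == "cache" and name in current:
            if location not in current[name]:
                current[name].append(location)

    compacted = []
    for name, locations in current.items():
        compacted.append(("write", name, locations[0]))
        compacted.extend(("cache", name, loc) for loc in locations[1:])

    live = {(name, loc) for name, locs in current.items() for loc in locs}
    return compacted, seen - live
```

Replaying the compacted log yields the same set of current copies as replaying the full log, and the stale set lists the out-of-date versions a cleaner could then remove.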
Multi-FS
[Diagram: file lookup through Parrot]
• application: open A@ex/foo, then read into memory
• parrot
  • open A/ex/root/foo → return “B, /ex/data/abc”
  • open B/ex/data/abc → return “Hello World 2”
• A (metadata & data)
  • A/ex/root/
    • foo: B, /ex/data/abc
    • bar: A, /ex/data/xyz
  • A/ex/data/
    • xyz: “Hello World”
• B (data only)
  • B/ex/data/
    • abc: “Hello World 2”
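The two-step lookup in this diagram, metadata first and data second, can be sketched as follows. The dictionary `SERVERS` stands in for the two servers and reuses the paths and contents shown on the slide; the function names and the way the logical name `A@ex/foo` is split are assumptions made for illustration, not Parrot's or Chirp's actual interface.

```python
# Toy stand-ins for the servers in the diagram: host -> {path: contents}.
SERVERS = {
    "A": {
        "/ex/root/foo": "B, /ex/data/abc",   # metadata entry pointing at B
        "/ex/root/bar": "A, /ex/data/xyz",   # metadata entry pointing at A itself
        "/ex/data/xyz": "Hello World",
    },
    "B": {
        "/ex/data/abc": "Hello World 2",
    },
}


def parse_pointer(text):
    """Split a metadata entry such as 'B, /ex/data/abc' into (host, path)."""
    host, path = (part.strip() for part in text.split(",", 1))
    return host, path


def multifs_read(logical_name):
    """Resolve a logical name such as 'A@ex/foo' and return the file contents."""
    meta_host, rest = logical_name.split("@", 1)
    volume, name = rest.split("/", 1)
    # Step 1: read the metadata entry from the metadata server.
    pointer = SERVERS[meta_host]["/" + volume + "/root/" + name]
    # Step 2: follow the pointer to whichever server holds the data.
    data_host, data_path = parse_pointer(pointer)
    return SERVERS[data_host][data_path]
```

Here `multifs_read("A@ex/foo")` follows the pointer to server B, while `multifs_read("A@ex/bar")` stays on server A, which both stores metadata and holds data.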
Multi-FS with Public Caches
[Diagram: file lookup through Parrot, with a client-side public cache]
• client
  • application: open A@ex/foo, then read into memory
  • parrot
    • open A/ex/root/foo → return “B, /ex/data/abc”
    • open B/ex/data/abc → return “Hello World 2”
  • client/multicache/
    • mno: “Hello World 2”
• A (metadata & data)
  • A/ex/root/
    • foo: B, /ex/data/abc
      client, /multicache/mno
    • bar: A, /ex/data/xyz
  • A/ex/data/
    • xyz: “Hello World”
• B (data only)
  • B/ex/data/
    • abc: “Hello World 2”
Evaluation
• Overhead
  • Not substantially different
• Clustered File Requests
• Impact of File Write
Conclusions
• Public Caching Works
• Public Caching provides performance benefits
• Public Caching overhead is not excessive